from bs4 import BeautifulSoup
from urllib.request import urlopen
fout = open('words_list2.txt','w')
url = 'http://endic.naver.com/?sLn=kr'
doc = urlopen(url)
web_page = BeautifulSoup(doc, 'html.parser')
word = web_page.find(attrs={'class':"tit"})
definition = web_page.find(attrs={'class':"align_line"})
fout.write(word.get_text()+':'+ definition.get_text().replace('\u200b',''))
fout.close()
- Why do you think you have a valid result in the previous line? – Ignacio Vazquez-Abrams, May 3, 2016 at 12:22
- I don't know what you mean... – jinho park, May 3, 2016 at 12:37
- Have you read your code? – Ignacio Vazquez-Abrams, May 3, 2016 at 12:38
- Yes. I just think that 'class':"tit" is not the right kind of selector, so I want to know how to crawl this site. – jinho park, May 3, 2016 at 12:40
- There are dozens of questions and answers with this exact or near-exact error message. Please do a little research before asking a new question. If your question is truly unique, cite what research you've done and why the other answers aren't applicable. – Bryan Oakley, May 3, 2016 at 13:35
1 Answer
At the URL http://endic.naver.com/?sLn=kr there is no element with the class align_line, so web_page.find(attrs={'class':"align_line"}) returns None. That means definition is None, and calling definition.get_text() on it raises an AttributeError.
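
One way to guard against this is to check the result of find() before calling get_text() on it. The following is only a minimal sketch: the helper name text_of, the SAMPLE_HTML string, and the fallback text are made up for illustration, and the inline markup stands in for the real page so the example runs without a network request.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the downloaded page:
# it contains a "tit" element but no "align_line" element.
SAMPLE_HTML = '<div><span class="tit">apple</span></div>'


def text_of(soup, css_class, default=''):
    """Return the text of the first element with css_class, or default if there is none."""
    node = soup.find(attrs={'class': css_class})
    if node is None:
        # find() returns None when nothing matches, so do not call get_text() on it.
        return default
    return node.get_text().replace('\u200b', '')


web_page = BeautifulSoup(SAMPLE_HTML, 'html.parser')
word = text_of(web_page, 'tit')
definition = text_of(web_page, 'align_line', default='(no definition found)')
entry = word + ':' + definition
print(entry)
```

With the real page you would pass the urlopen() result to BeautifulSoup as in the question. The class names to look up depend on the page's actual markup, so inspect the HTML first to see which elements hold the word and its definition.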