I am following a tutorial to build a simple web scraper for a static website, but I get the following TypeError:

TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
Here is my code thus far:

from bs4 import BeautifulSoup
import requests
import json

url = 'http://ethans_fake_twitter_site.surge.sh/'
response = requests.get(url, timeout=5)
content = BeautifulSoup(response.content, "html.parser")

tweetArr = []
for tweet in content.findAll('div', attrs = {'class': 'tweetcontainer'}):
    tweetObject = {
        "author": tweet.find('h2', attrs= {'class': 'author'}).text.encode('utf-8'),
        "date": tweet.find('h5', attrs= {'class': 'dateTime'}).text.encode('utf-8'),
        "content": tweet.find('p', attrs= {'class': 'content'}).text.encode('utf-8'),
        "likes": tweet.find('p', attrs= {'class': 'likes'}).text.encode('utf-8'),
        "shares": tweet.find('p', attrs= {'class': 'shares'}).text.encode('utf-8')
    }
    tweetArr.append(tweetObject)

with open('twitterData.json', 'w') as outfile:
    json.dump(tweetArr, outfile)
The only thing I can assume is wrong is that the article is using an earlier version of Python, but the article is quite recent, so that shouldn't be the case. The code runs and the JSON file is created, but the only data in it is "author:". Sorry if the answer is obvious to some of you, but I'm just starting to learn.
Here's the entire error log:
(tutorial-env) C:\Users\afaal\Desktop\python\webscraper>python webscraper.py
Traceback (most recent call last):
  File "webscraper.py", line 20, in <module>
    json.dump(tweetArr, outfile)
  File "C:\Users\afaal\AppData\Local\Programs\Python\Python38\lib\json\__init__.py", line 179, in dump
    for chunk in iterable:
  File "C:\Users\afaal\AppData\Local\Programs\Python\Python38\lib\json\encoder.py", line 429, in _iterencode
    yield from _iterencode_list(o, _current_indent_level)
  File "C:\Users\afaal\AppData\Local\Programs\Python\Python38\lib\json\encoder.py", line 325, in _iterencode_list
    yield from chunks
  File "C:\Users\afaal\AppData\Local\Programs\Python\Python38\lib\json\encoder.py", line 405, in _iterencode_dict
    yield from chunks
  File "C:\Users\afaal\AppData\Local\Programs\Python\Python38\lib\json\encoder.py", line 438, in _iterencode
    o = _default(o)
  File "C:\Users\afaal\AppData\Local\Programs\Python\Python38\lib\json\encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
Why are you calling .text.encode('utf-8')? Just use the .text part. It requires a str object, not bytes or whatever custom type your library is using. Honestly, you really need to do some basic research on JSON serialization in Python. This sort of cargo-cult programming is not an effective way to learn anything.
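
To make the point about str versus bytes concrete, here is a minimal sketch that uses only the standard library. The value "Ethan" is a made-up stand-in for whatever tweet.find(...).text returns on the real page, and an in-memory io.StringIO stands in for the output file. It reproduces the error with bytes, shows that json.dump had already written part of the output before failing (which would explain the nearly empty file), and shows that the plain str serializes fine.

```python
import io
import json

# Hypothetical stand-in for the string returned by tweet.find(...).text.
author_text = "Ethan"

# The question's version: .encode('utf-8') turns the str into bytes.
broken = [{"author": author_text.encode("utf-8")}]
# The suggested version: keep the str that .text already provides.
fixed = [{"author": author_text}]

# json.dump writes its output piece by piece, so some text reaches the
# file object before the bytes value triggers the TypeError.
partial = io.StringIO()
try:
    json.dump(broken, partial)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With str values the same structure serializes without error.
fixed_json = json.dumps(fixed)

print(error_message)
print(repr(partial.getvalue()))
print(fixed_json)
```

Applied to the scraper, this amounts to dropping .encode('utf-8') from each of the five fields. By default json.dump escapes non-ASCII characters, so the output file stays plain ASCII; if you would rather keep the characters readable, passing ensure_ascii=False to json.dump and encoding='utf-8' to open should work.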