I am trying to write data I 'scraped' from a site to a JSON output file with the following code:

from bs4 import BeautifulSoup
import requests
import json

path = ["https://www.test.be?page=", "https://www.test2.be?page="]

adresArr = []
for i in path:
    pagina = 0
    for x in range(0, 4):
        url = i + str(pagina)
        response = requests.get(url, timeout=5)
        content = BeautifulSoup(response.content, "html.parser")
        for adres in content.findAll('tr', attrs={"class": "odd clickable-row"}):
            adresObject = {
                "postcode": adres.find('td', attrs={"class": "views-field views-field-field-locatie-postal-code"}).text.encode('utf-8'),
                "naam": adres.find('td', attrs={"class": "views-field views-field-field-locatie-thoroughfare"}).text.encode('utf-8'),
                "plaats": adres.find('td', attrs={"class": "views-field views-field-field-locatie-locality"}).text.encode('utf-8')
            }
            adresArr.append(adresObject)


        for adres in content.findAll('tr', attrs={"class": "odd clickable-row active"}):
            adresObject = {
                "postcode": adres.find('td', attrs={"class": "views-field views-field-field-locatie-postal-code"}).text.encode('utf-8'),
                "naam": adres.find('td', attrs={"class": "views-field views-field-field-locatie-thoroughfare"}).text.encode('utf-8'),
                "plaats": adres.find('td', attrs={"class": "views-field views-field-field-locatie-locality"}).text.encode('utf-8')
            }
            adresArr.append(adresObject)

            pagina = x

    with open('adresData.json', 'w') as outfile:
         json.dump(adresArr, outfile)

I am getting the following error: TypeError: Object of type bytes is not JSON serializable

If I print the array itself, it looks OK. But I'm stuck at writing it to a JSON file. What am I doing wrong?

It's my first time coding in Python (and I don't have a lot of coding experience), so please make your answer clear to understand :)

Thanks in advance

2 Answers

To resolve this problem, you have to convert the data type of the elements you are serializing: the json library cannot write bytes objects, so the values need to be strings. A previously answered question covers the same error:

TypeError: Object of type 'bytes' is not JSON serializable

This might help.

In the lines like this:

"postcode": adres.find('td', attrs={"class": "views-field views-field-field-locatie-postal-code"}).text.encode('utf-8')

The .text result should already be a string; .encode('utf-8') makes it the bytes object that the json library is complaining about. So just leave that off: adres.find('td', attrs={"class": "views-field views-field-field-locatie-postal-code"}).text.
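To see the difference in isolation, here is a minimal sketch that uses only the standard json module. The value "9000" is made up for illustration and stands in for whatever .text returns on the real page:

```python
import json

# Hypothetical stand-in for the value returned by `.text` (already a str).
postcode_text = "9000"

# What the question's code does: .encode() turns the str into bytes.
encoded = postcode_text.encode("utf-8")

try:
    json.dumps({"postcode": encoded})
    error_message = None
except TypeError as exc:
    # json refuses bytes values and raises TypeError.
    error_message = str(exc)

# Leaving off .encode() keeps the value a str, which json can serialize.
serialized = json.dumps({"postcode": postcode_text})

print(error_message)
print(serialized)
```

The first print shows the same kind of TypeError as in the question; the second shows the dictionary serialized without problems.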

Background info: bytes are the raw units of information; strings are how we represent text. We encode a string to make the bytes that are used for storage; we decode bytes to get a string back. But JSON is already designed to work with strings - the library will handle the file encoding for you when it actually writes to the disk.
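The round trip described above can be sketched as follows. The sample record and the temporary output directory are assumptions made for the example; passing encoding="utf-8" to open() and ensure_ascii=False to json.dump is one reasonable choice for keeping accented characters readable in the file, not the only one:

```python
import json
import os
import tempfile

text = "Caf\u00e9"               # a str with one non-ASCII character (e with acute accent)
raw = text.encode("utf-8")       # encode: str -> bytes, the form used for storage
restored = raw.decode("utf-8")   # decode: bytes -> str

# Hypothetical record, shaped like the dictionaries in the question.
records = [{"naam": text, "plaats": "Gent"}]

# Write to a temporary directory so the example does not touch real files.
out_path = os.path.join(tempfile.mkdtemp(), "adresData.json")

# The file object does the encoding; json only ever sees str values.
with open(out_path, "w", encoding="utf-8") as outfile:
    json.dump(records, outfile, ensure_ascii=False)

# Reading the file back gives str values again.
with open(out_path, encoding="utf-8") as infile:
    loaded = json.load(infile)

print(len(text), len(raw))
print(loaded == records)
```

The two lengths differ because the accented character takes two bytes in UTF-8 while it is a single character in the string.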

1 Comment

Thanks, this solved my problem, and I understand what I was doing wrong now :)
