1

I have multiple files to load as JSON. They are all formatted the same way, but one of them raises an exception when I try to load it. This is where you can find the file:

File

I wrote the following code:

import json

def from_seed_data_extract_summoners():
    summonerIds = set()
    for i in range(1, 11):
        file_name = 'data/matches%s.json' % i
        print file_name
        with open(file_name) as data_file:
            data = json.load(data_file)
        for match in data['matches']:
            for summoner in match['participantIdentities']:
                summonerIds.add(summoner['player']['summonerId'])
    return summonerIds

The error occurs on the json.load(data_file) call. I suppose the file contains a special character, but I can't find it and don't know how to replace it. The error is:

UnicodeDecodeError: 'utf8' codec can't decode byte 0xeb in position 6: invalid continuation byte

Do you know how I can get rid of it?

4 Answers

2

json.load decodes the file's contents into unicode objects, not plain byte strings. Your file contains an embedded character (probably something not very noticeable, such as an accented letter) whose bytes are not valid UTF-8, so the decoding fails.

How to get string objects instead of Unicode ones from JSON in Python?

That is a great thread about making JSON objects more manageable in Python.
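
One way to check this is to read the raw bytes and try a second encoding when UTF-8 fails. The following is only a minimal sketch, assuming the file is really in a single-byte encoding such as Latin-1 (a guess: byte 0xeb from the traceback is 'ë' in Latin-1). load_json_with_fallback is a hypothetical helper, not part of the json module:

```python
import io
import json


def load_json_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Hypothetical helper: parse JSON using the first encoding that decodes."""
    with io.open(path, "rb") as f:
        raw = f.read()
    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError as exc:
            # exc.start is the offset of the first byte that could not be decoded.
            bad = raw[exc.start:exc.start + 1]
            print("%s failed at byte %d: %r" % (encoding, exc.start, bad))
            continue
        return json.loads(text)
    raise ValueError("none of %r could decode %s" % (encodings, path))
```

Latin-1 can decode any byte sequence, so it never raises. If the resulting names look garbled, the real encoding is something else and the second entry in encodings is the thing to change.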

2
  1. replace file_name = 'data/matches%s.json' % i with file_name = 'data/matches%i.json' % i
  2. the right syntax is data = json.load(file_name) and not -

    with open(file_name) as data_file: data = json.load(data_file)

EDIT:

import json

def from_seed_data_extract_summoners():
    summonerIds = set()
    for i in range(1, 11):
        file_name = 'data/matches%i.json' % i
        # json.load expects an open file object, not a path
        with open(file_name) as f:
            data = json.load(f, encoding='utf-8')
        for match in data['matches']:
            for summoner in match['participantIdentities']:
                summonerIds.add(summoner['player']['summonerId'])
    return summonerIds

3 Comments

I did the change and get the following error: AttributeError: 'str' object has no attribute 'read'
I will rewrite my answer for you
1

Try:

json.loads(unicode(data_file.read(), errors='ignore'))

or:

json.loads(unidecode.unidecode(unicode(data_file.read(), errors='ignore')))

(For the second one, you need to install the unidecode package.)
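
Keep in mind that errors='ignore' silently drops the bytes it cannot decode, so the loaded data may no longer match the file. A small sketch of the difference between 'ignore' and 'replace', using a made-up sample string and a hypothetical load_lossy helper (written with bytes.decode so it also runs on Python 3):

```python
import json

# Made-up sample: the name contains byte 0xeb, which is not valid UTF-8 here.
RAW = b'{"name": "Zo\xebl"}'


def load_lossy(raw, errors):
    """Hypothetical helper: decode as UTF-8 with an error handler, then parse."""
    return json.loads(raw.decode("utf-8", errors))


ignored = load_lossy(RAW, "ignore")    # the bad byte is dropped silently
replaced = load_lossy(RAW, "replace")  # the bad byte becomes U+FFFD
print("ignore: %r / replace: %r" % (ignored["name"], replaced["name"]))
```

With 'replace' the damaged spots at least stay visible in the result, which makes them easier to find later.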

1

Try:

json.loads(data_file.read(), encoding='utf-8')

1 Comment

I got the following: 'ascii' codec can't decode byte 0xc3 in position 16260798: ordinal not in range(128)
