I get an encoding error when I run the script below. Here is the error: UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 9: ordinal not in range(128)

Here is my script:

import logging
import urllib
import csv
import json
import io
import codecs

with open('/home/local/apple.csv',
          'rb') as csvinput:
    reader = csv.reader(csvinput, delimiter=',')
    firstline = True
    for row in reader:
        if firstline:
            firstline = False
            continue

        address1 = row[0]
        print row[0]
        locality = row[1]
        admin_area = row[2]
        query = ' '.join(str(x) for x in (address1, locality, admin_area))
        normalized = query.replace(" ", "+")
        BaseURL = 'http://localhost:8080/verify?country=JP&freeform='
        URL = BaseURL + normalized
        print URL
        data = urllib.urlopen(URL)
        response = data.getcode()
        print response

        if response == 200:
            file = json.load(data)
            print file
            output_f = open('output.csv', 'wb')
            csvwriter = csv.writer(output_f)
            count = 0
            for f in file:
                if count == 0:
                    header = f.keys()
                    csvwriter.writerow(header)
                    count += 1
                csvwriter.writerow(f.values())
            output_f.close()
        else:
            print 'error'

Can anyone help me fix this? It's getting really annoying. I need to encode to UTF-8.

1 Answer

It looks like you are using Python 2.x. Instead of Python's built-in open, use codecs.open, which lets you optionally pass an encoding and say what to do when there are errors. This gets a little less confusing in Python 3, where the built-in open can do this itself.

So in your two lines where you are opening, do:

with codecs.open('/home/local/apple.csv',
      'rb', 'utf-8') as csvinput:

output_f = codecs.open('output.csv','wb', 'utf-8')
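For comparison, here is a minimal sketch of the same two opens in Python 3, where the built-in open takes the encoding directly. It assumes Python 3. The helper names, temporary path, and sample rows are made up for illustration, and newline='' is what the csv documentation recommends for files handed to csv.reader and csv.writer.

```python
import csv
import os
import tempfile


def write_rows(path, rows):
    # Text mode with an explicit encoding; newline='' lets csv manage line endings.
    with open(path, 'w', encoding='utf-8', newline='') as f:
        csv.writer(f).writerows(rows)


def read_rows(path):
    # Decode the file as UTF-8 while reading, so each cell is a str.
    with open(path, 'r', encoding='utf-8', newline='') as f:
        return list(csv.reader(f))


# Hypothetical sample data; '\u00e9' is the character from the error message.
sample = [['address1', 'locality'], ['1 Rue Caf\u00e9', 'Tokyo']]
path = os.path.join(tempfile.mkdtemp(), 'output.csv')

write_rows(path, sample)
rows = read_rows(path)
```

If this runs as expected, rows should equal sample, with the accented character intact.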

The optional errors parameter defaults to 'strict', which raises an exception if the data can't be converted using the given encoding. In some contexts you may want to use 'ignore' or 'replace' instead.
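To make the three error modes concrete, here is a small sketch. It uses io.open, which accepts the same encoding and errors arguments. The read_text helper and the sample file are hypothetical, and the file deliberately contains a byte that is not valid UTF-8.

```python
import io
import os
import tempfile


def read_text(path, errors='strict'):
    # errors controls what happens when a byte cannot be decoded as UTF-8.
    with io.open(path, 'r', encoding='utf-8', errors=errors) as f:
        return f.read()


# Hypothetical file: 0xE9 is "é" in Latin-1, but on its own it is invalid UTF-8.
path = os.path.join(tempfile.mkdtemp(), 'latin1.txt')
with open(path, 'wb') as f:
    f.write(b'caf\xe9!')

try:
    read_text(path)  # 'strict' (the default) raises on the bad byte
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

ignored = read_text(path, 'ignore')    # the bad byte is silently dropped
replaced = read_text(path, 'replace')  # the bad byte becomes U+FFFD
```

Note that 'ignore' and 'replace' both lose data, so they are best kept for cases where the input really is damaged rather than just in a different encoding.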

See the Python documentation for a bit more info.

2 Comments

So I am getting this error now: UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 0: ordinal not in range(128)

File "Python48.py", line 16, in <module>
    for row in reader:
UnicodeEncodeError: 'ascii' codec can't encode character u'\ufeff' in position 0: ordinal not in range(128)
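The u'\ufeff' in that error is a byte order mark (BOM) at the start of the file. The Python 2 csv module also does not support Unicode input, which is probably why reading through codecs.open still fails there. As a hedged sketch, assuming Python 3 and a UTF-8 file saved with a BOM, the 'utf-8-sig' codec strips the mark on read. The file name and contents below are made up.

```python
import codecs
import csv
import os
import tempfile

# Hypothetical input: a UTF-8 CSV that starts with a BOM, as some editors write it.
path = os.path.join(tempfile.mkdtemp(), 'apple_bom.csv')
with open(path, 'wb') as f:
    f.write(codecs.BOM_UTF8 + 'address1,locality\nCaf\u00e9,Tokyo\n'.encode('utf-8'))


def read_rows(path, encoding):
    # newline='' is the csv module's recommended setting for input files.
    with open(path, 'r', encoding=encoding, newline='') as f:
        return list(csv.reader(f))


with_bom = read_rows(path, 'utf-8')         # first cell still starts with '\ufeff'
without_bom = read_rows(path, 'utf-8-sig')  # BOM is removed while decoding
```

With plain 'utf-8' the BOM ends up glued to the first header cell, while 'utf-8-sig' returns clean values and also works on files that have no BOM.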
