Getting encoding error while writing data to csv file

Question

from tweetpy import *
import re
import json
from pprint import pprint
import csv

# Import the necessary methods from "twitter" library
from twitter import Twitter, OAuth, TwitterHTTPError, TwitterStream

# Variables that contains the user credentials to access Twitter API
ACCESS_TOKEN =  ''
ACCESS_SECRET = ''
CONSUMER_KEY = ''
CONSUMER_SECRET = ''

oauth = OAuth(ACCESS_TOKEN, ACCESS_SECRET, CONSUMER_KEY, CONSUMER_SECRET)

# Initiate the connection to Twitter Streaming API
twitter_stream = TwitterStream(auth=oauth)

# Get a sample of the public data following through Twitter
iterator = twitter_stream.statuses.filter(track="#kindle",language="en",replies="all")
 # Print each tweet in the stream to the screen

 # Here we set it to stop after getting 10000000 tweets.
 # You don't have to set it to stop, but can continue running
 # the Twitter API to collect data for days or even longer.

tweet_count = 10000000

file = "C:\\Users\\WELCOME\\Desktop\\twitterfeeds.csv"
with open(file,"w") as csvfile:
    fieldnames=['Username','Tweet','Timezone','Timestamp','Location']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for tweet in iterator:
        #pprint(tweet)
        username = str(tweet['user']['screen_name'])
        tweet_text = str(tweet['text'])
        user_timezone = str(tweet['user']['time_zone'])
        tweet_timestamp=str(tweet['created_at'])
        user_location = str(tweet['user']['location'])
        print tweet
        tweet_count -= 1
        writer.writerow({'Username':username,'Tweet':tweet_text,'Timezone':user_timezone,'Location':user_location,'Timestamp':tweet_timestamp})

        if tweet_count <= 0:
            break

I am trying to write tweets to a CSV file with the columns 'Username', 'Tweet', 'Timezone', 'Location', and 'Timestamp'.

But I am getting the following error:

tweet_text = str(tweet['text'])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2026' in position 139: ordinal not in range(128).

I know it is an encoding issue, but I don't know exactly where in the code the value should be encoded.

What do you want to do with the offending character(s)? Omit them? Convert them to the closest ASCII equivalent? Convert to a fixed character such as a question mark?
– John Gordon, Commented Jun 4, 2017 at 16:14
The answer may very well be different for Python 2 vs Python 3. Regardless, you're not opening the csv file correctly. Suggest you read the documentation (in both versions) where how to do so correctly is shown.
– martineau, Commented Jun 4, 2017 at 16:48

Mark Tolonen · Accepted Answer · 2017-06-04 19:03:29Z

1

Use Python 3, because the Python 2 csv module doesn't handle encodings well.
Use open with the encoding and newline options.
Remove the str conversions (in Python 3, str is already a Unicode string).

Result:

with open(file,"w",encoding='utf8',newline='') as csvfile:
    fieldnames=['Username','Tweet','Timezone','Timestamp','Location']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    writer.writeheader()
    for tweet in iterator:
        username = tweet['user']['screen_name']
        tweet_text = tweet['text']
        user_timezone = tweet['user']['time_zone']
        tweet_timestamp = tweet['created_at']
        user_location = tweet['user']['location']
            .
            .
            .
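
The snippet above elides the rest of the loop. A self-contained sketch of the same approach, assuming Python 3 and using hypothetical sample records (`SAMPLE_TWEETS`) and helper names (`write_tweets`, `read_tweets`) in place of the live Twitter stream, might look like this:

```python
import csv
import os
import tempfile

# Hypothetical sample records standing in for tweets from the stream.
SAMPLE_TWEETS = [
    {
        "user": {"screen_name": "reader1", "time_zone": None, "location": "Zürich"},
        "text": "Loving my new #kindle\u2026",  # contains the character from the error
        "created_at": "Sun Jun 04 16:00:00 +0000 2017",
    },
]

FIELDNAMES = ["Username", "Tweet", "Timezone", "Timestamp", "Location"]


def write_tweets(path, tweets):
    """Write tweet dicts to a UTF-8 encoded CSV file."""
    # encoding='utf8' lets non-ASCII text through; newline='' is what the
    # csv module documentation asks for when opening files for csv.
    with open(path, "w", encoding="utf8", newline="") as csvfile:
        writer = csv.DictWriter(csvfile, fieldnames=FIELDNAMES)
        writer.writeheader()
        for tweet in tweets:
            # No str() calls: the values are already Unicode strings.
            # A None value is written by csv as an empty field.
            writer.writerow({
                "Username": tweet["user"]["screen_name"],
                "Tweet": tweet["text"],
                "Timezone": tweet["user"]["time_zone"],
                "Timestamp": tweet["created_at"],
                "Location": tweet["user"]["location"],
            })


def read_tweets(path):
    """Read the CSV back as a list of dicts, to check the round trip."""
    with open(path, encoding="utf8", newline="") as csvfile:
        return list(csv.DictReader(csvfile))


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "twitterfeeds.csv")
    write_tweets(demo_path, SAMPLE_TWEETS)
    print(read_tweets(demo_path))
```

Reading the file back with the same encoding is a quick way to confirm that the non-ASCII characters survived the write.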

If using Python 2, get the 3rd party unicodecsv module to overcome csv shortcomings.



Ptank · Answer · 2017-06-04 16:27:14Z

0

If you really want to convert all your Unicode data to ASCII, use:

tweet['text'].encode("ascii", "replace")  # substitute "?" for each non-ASCII character
or
tweet['text'].encode("ascii", "ignore")  # skip non-ASCII characters
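
To make the difference between the two error handlers concrete, here is a small sketch, assuming Python 3 (where `encode` returns `bytes`, so the result is decoded back to `str`). The helper name `to_ascii` and the sample text are made up for illustration:

```python
def to_ascii(text, errors="replace"):
    """Return an ASCII-only copy of text.

    errors="replace" substitutes "?" for each non-ASCII character;
    errors="ignore" drops those characters entirely.
    """
    return text.encode("ascii", errors).decode("ascii")


if __name__ == "__main__":
    sample = "Loving my new #kindle\u2026"  # ends with U+2026, as in the error
    print(to_ascii(sample))                    # Loving my new #kindle?
    print(to_ascii(sample, errors="ignore"))   # Loving my new #kindle
```

Both variants lose information, so they only make sense when the output really must be plain ASCII.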

