I am loading Twitter data into a pandas DataFrame. After preprocessing, I store the results in a CSV file. When I do this, the lists are stored as strings, which makes the CSV file difficult to process further. I want the lists to be stored as lists in the CSV, not as strings. How can I do this?

Before storing as CSV, cleanedData.head(3).to_dict() returns:

{'id': {0: 1042616899408945154, 1: 1042592536769044487, 2: 1042587702040903680}, 'month': {0: 9, 1: 9, 2: 9}, 'hour': {0: 3, 1: 1, 2: 1}, 'text': {0: [['are', 'red', 'violets', 'are', 'blue', 'if', 'you', 'want', 'to', 'buy', 'us', 'here', 'is', 'a', 'clue', 'our', 'eye', 'amp', 'cheek', 'palette', 'is', 'al']], 1: [['is', 'it', 'too', 'late', 'now', 'to', 'say', 'sorry']], 2: [['oh', 'no'], ['please', 'email', 'your', 'order', 'to', 'social', 'amp', 'we', 'can', 'help'], ['this', 'is', 'a', 'newest', 'offer'], []]}, 'hasMedia': {0: 0, 1: 1, 2: 0}, 'hasHashtag': {0: 1, 1: 1, 2: 0}, 'followers_count': {0: 801745, 1: 801745, 2: 801745}, 'retweet_count': {0: 17, 1: 94, 2: 0}, 'favourite_count': {0: 181, 1: 408, 2: 0}, 'sentiments': {0: {'neg': 0.0, 'neu': 0.949, 'pos': 0.051, 'compound': 0.0772}, 1: {'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}, 2: {'neg': 0.1, 'neu': 0.634, 'pos': 0.266, 'compound': 0.5684}}, 'text_posTagged': {0: [[('are', 'VBP'), ('red', 'JJ'), ('violets', 'NNS'), ('are', 'VBP'), ('blue', 'JJ'), ('if', 'IN'), ('you', 'PRP'), ('want', 'VBP'), ('to', 'TO'), ('buy', 'VB'), ('us', 'PRP'), ('here', 'RB'), ('is', 'VBZ'), ('a', 'DT'), ('clue', 'JJ'), ('our', 'PRP$'), ('eye', 'NN'), ('amp', 'NN'), ('cheek', 'NN'), ('palette', 'NN'), ('is', 'VBZ'), ('al', 'JJ')]], 1: [[('is', 'VBZ'), ('it', 'PRP'), ('too', 'RB'), ('late', 'RB'), ('now', 'RB'), ('to', 'TO'), ('say', 'VB'), ('sorry', 'NN')]], 2: [[('oh', 'UH'), ('no', 'DT')], [('please', 'VB'), ('email', 'VB'), ('your', 'PRP$'), ('order', 'NN'), ('to', 'TO'), ('social', 'JJ'), ('amp', 'IN'), ('we', 'PRP'), ('can', 'MD'), ('help', 'VB')], [('this', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('newest', 'NN'), ('offer', 'NN')], []]}}

Storing the data as CSV:

cleanedData.to_csv('preprocessed_data.csv', sep=',')

A few rows from preprocessed_data.csv:

1,1042592536769044487,9,1,"[['is', 'it', 'too', 'late', 'now', 'to', 'say', 'sorry']]",1,1,801745,94,408,"{'neg': 0.0, 'neu': 1.0, 'pos': 0.0, 'compound': 0.0}","[[('is', 'VBZ'), ('it', 'PRP'), ('too', 'RB'), ('late', 'RB'), ('now', 'RB'), ('to', 'TO'), ('say', 'VB'), ('sorry', 'NN')]]"
2,1042587702040903680,9,1,"[['oh', 'no'], ['please', 'email', 'your', 'order', 'to', 'social', 'amp', 'we', 'can', 'help'], ['this', 'is', 'a', 'newest', 'offer'], []]",0,0,801745,0,0,"{'neg': 0.1, 'neu': 0.634, 'pos': 0.266, 'compound': 0.5684}","[[('oh', 'UH'), ('no', 'DT')], [('please', 'VB'), ('email', 'VB'), ('your', 'PRP$'), ('order', 'NN'), ('to', 'TO'), ('social', 'JJ'), ('amp', 'IN'), ('we', 'PRP'), ('can', 'MD'), ('help', 'VB')], [('this', 'DT'), ('is', 'VBZ'), ('a', 'DT'), ('newest', 'NN'), ('offer', 'NN')], []]"
3,1042587263643930626,9,1,"[['its', 'best', 'applied', 'with', 'our', 'buffer', 'brush']]",0,0,801745,0,0,"{'neg': 0.0, 'neu': 0.64, 'pos': 0.36, 'compound': 0.6696}","[[('its', 'PRP$'), ('best', 'JJS'), ('applied', 'VBN'), ('with', 'IN'), ('our', 'PRP$'), ('buffer', 'NN'), ('brush', 'NN')]]"
4,1042586780292276230,9,1,[['dead']],0,0,801745,0,14,"{'neg': 0.834, 'neu': 0.166, 'pos': 0.0, 'compound': -0.7213}","[[('dead', 'JJ')]]"

In the CSV file above, the lists and dictionaries are stored as strings. I want to avoid this.

  • What is the expected output? Commented Nov 19, 2018 at 16:32
  • A list without double quotes around it. It must be ['is', 'it', 'too', 'late'] and NOT "['is', 'it', 'too', 'late']" Commented Nov 19, 2018 at 16:34

1 Answer

Something like this?

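import csv

# QUOTE_NONE writes no quote characters; the escapechar (a space here) is put before each comma inside a field
df.to_csv("preprocess.csv", quoting=csv.QUOTE_NONE, escapechar=' ')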
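
To see what this setting does to the file, here is a minimal sketch that uses only Python's csv module (which, as far as I know, is what to_csv hands these options to). The sample row and the write_row helper are made up for illustration. It writes the same row with default quoting and with QUOTE_NONE, then reads both back with a default reader:

```python
import csv
import io

# Hypothetical sample row: an id, a list rendered as text, and a count.
row = [1, str([["is", "it", "too", "late"]]), 94]


def write_row(**writer_options):
    """Write the sample row to an in-memory CSV and return the text."""
    buffer = io.StringIO()
    csv.writer(buffer, **writer_options).writerow(row)
    return buffer.getvalue()


# Default behaviour (QUOTE_MINIMAL): the list cell is wrapped in double quotes.
quoted = write_row()
# Same options as the to_csv call above: no quotes, a space as the escape character.
unquoted = write_row(quoting=csv.QUOTE_NONE, escapechar=" ")

print(quoted)
print(unquoted)

# Read both lines back with a default reader and compare the column counts.
fields_quoted = next(csv.reader(io.StringIO(quoted)))
fields_unquoted = next(csv.reader(io.StringIO(unquoted)))
print(len(fields_quoted), len(fields_unquoted))
```

The double quotes disappear, but a default reader then splits the list on its inner commas, so whatever reads the file later may need matching escapechar settings.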

1 Comment

Exactly what I needed
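
If the real goal is to get lists back when the file is loaded again, another option is to leave the quoting alone and parse the cells on read with ast.literal_eval. A rough sketch with the standard csv module, using a made-up two-row sample shaped like the data in the question:

```python
import ast
import csv
import io

# Hypothetical sample shaped like the question's data (values are illustrative).
rows = [
    {"id": 1, "text": [["is", "it", "too", "late"]], "sentiments": {"neg": 0.0, "pos": 0.0}},
    {"id": 2, "text": [["oh", "no"], []], "sentiments": {"neg": 0.1, "pos": 0.266}},
]

# Write with default quoting; lists and dicts are stored as their text form.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["id", "text", "sentiments"])
writer.writeheader()
writer.writerows(rows)

# Read back: every cell is a string, so convert the structured columns explicitly.
buffer.seek(0)
restored = []
for record in csv.DictReader(buffer):
    restored.append(
        {
            "id": int(record["id"]),
            "text": ast.literal_eval(record["text"]),
            "sentiments": ast.literal_eval(record["sentiments"]),
        }
    )

print(restored[1]["text"])
```

With pandas, the same idea can be expressed by passing converters={'text': ast.literal_eval} to pd.read_csv. literal_eval only evaluates Python literals, so it should also handle the tuples in text_posTagged.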
