Extract list from a string

Question

I am extracting data from the Google Adwords Reporting API via Python. I can successfully pull the data and then hold it in a variable data.

data = get_report_data_from_google()

type(data)
str

Here is a sample:

data = 'ID,Labels,Date,Year\n3179799191,"[""SKWS"",""Exact""]",2016-05-16,2016\n3179461237,"[""SKWS"",""Broad""]",2016-05-16,2016\n3282565342,"[""SKWS"",""Broad""]",2016-05-16,2016\n'

I need to process this data further and ultimately output a processed flat file. (The Google Adwords API can return a CSV, but I need to pre-process the data before loading it into a database.)

If I try to turn data into a csv object, and try to print each line, I get one character per line like:

c = csv.reader(data, delimiter=',')

for i in c:
    print(i)

    ['I']
    ['D']
    ['', '']
    ['L']
    ['a']
    ['b']
    ['e']
    ['l']
    ['s']
    ['', '']
    ['D']
    ['a']
    ['t']
    ['e']

So, my idea was to process each column of each line into a list, then add that to a csv object. Trying that:

for line in data.splitlines():
    print(line)

3179799191,"[""SKWS"",""Exact""]",2016-05-16,2016

What I actually find is that inside of the str, there is a list: "[""SKWS"",""Exact""]"

This value is a "label" (see the documentation).

This list is formatted a bit oddly: the value contains numerous doubled quotation marks, so using a quote char such as " returns something like this: [ SKWS Exact ]. If I could get to [""SKWS"",""Exact""], that would be acceptable.

Is there a good way to extract a list object within a str? Is there a better way to process and output this data to a csv?

Generally web services return JSON or XML for this exact reason, because those formats can easily be converted to a Python dictionary. Have you tried parsing the API response as JSON? Do you need help with that?
– Charlie, Commented May 18, 2016 at 21:33

TigerhawkT3 · Accepted Answer · 2016-05-18 21:35:33Z


You need to split the string first. csv.reader expects something that provides a single line on each iteration, like a standard file object does. If you have a string with newlines in it, split it on the newline character with splitlines():

>>> import csv
>>> data = 'ID,Labels,Date,Year\n3179799191,"[""SKWS"",""Exact""]",2016-05-16,2016\n3179461237,"[""SKWS"",""Broad""]",2016-05-16,2016\n3282565342,"[""SKWS"",""Broad""]",2016-05-16,2016\n'
>>> c = csv.reader(data.splitlines(), delimiter=',')
>>> for line in c:
...     print(line)
...
['ID', 'Labels', 'Date', 'Year']
['3179799191', '["SKWS","Exact"]', '2016-05-16', '2016']
['3179461237', '["SKWS","Broad"]', '2016-05-16', '2016']
['3282565342', '["SKWS","Broad"]', '2016-05-16', '2016']
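As an alternative sketch (not part of the original answer), the string can be wrapped in io.StringIO so that csv.reader sees a file-like object. This assumes Python 3; the shortened data sample is for illustration only. One possible advantage over splitlines() is that a quoted field containing an embedded newline should stay intact, because the csv module does the line handling itself.

```python
import csv
import io

# Shortened sample of the report string from the question.
data = 'ID,Labels,Date,Year\n3179799191,"[""SKWS"",""Exact""]",2016-05-16,2016\n'

# newline='' leaves line endings untouched so the csv module can interpret them.
rows = list(csv.reader(io.StringIO(data, newline='')))

for row in rows:
    print(row)
```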


1 Comment

Alex Hall Over a year ago

And to add to this, it looks like you should then have labels = json.loads(line[1]).

njzk2 · Accepted Answer · 2016-05-18 21:39:48Z

This has to do with how csv.reader works.

According to the documentation:

csvfile can be any object which supports the iterator protocol and returns a string each time its next() method is called

The issue here is that if you pass a string, it supports the iterator protocol, and returns a single character for each call to next. The csv reader will then consider each character as a line.
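A small illustration of that behaviour, assuming Python 3 and a made-up two-line sample: iterating over the string directly yields characters, while iterating over splitlines() yields whole lines.

```python
# Hypothetical miniature sample, for illustration only.
data = 'ID,Labels\n1,"[""a""]"\n'

# Iterating a str yields one character per step; csv.reader would
# treat each of these as a separate "line".
chars = list(data)[:3]

# splitlines() yields one complete line per step instead.
lines = data.splitlines()

print(chars)
print(lines)
```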

You need to provide a list of lines, one for each line of your csv. Use splitlines() rather than split(), since split() also breaks on spaces inside a field. For example:

c = csv.reader(data.splitlines(), delimiter=',')
for i in c:
    print(i)

# ['ID', 'Labels', 'Date', 'Year']
# ['3179799191', '["SKWS","Exact"]', '2016-05-16', '2016']
# ['3179461237', '["SKWS","Broad"]', '2016-05-16', '2016']
# ['3282565342', '["SKWS","Broad"]', '2016-05-16', '2016']

Now, your list looks like a JSON list. You can use the json module to read it.
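A minimal sketch of that idea, assuming Python 3 and that the Labels column always holds a valid JSON array. Joining the labels with '|' is a hypothetical flattening choice, not something from the question; substitute whatever format the database load expects.

```python
import csv
import io
import json

data = ('ID,Labels,Date,Year\n'
        '3179799191,"[""SKWS"",""Exact""]",2016-05-16,2016\n'
        '3179461237,"[""SKWS"",""Broad""]",2016-05-16,2016\n')

reader = csv.reader(data.splitlines())
header = next(reader)
label_idx = header.index('Labels')

# Write to an in-memory buffer; a real file opened with newline='' works the same way.
out = io.StringIO()
writer = csv.writer(out, lineterminator='\n')
writer.writerow(header)

for row in reader:
    labels = json.loads(row[label_idx])  # e.g. ['SKWS', 'Exact']
    row[label_idx] = '|'.join(labels)    # hypothetical flattening: SKWS|Exact
    writer.writerow(row)

flat = out.getvalue()
print(flat)
```

Once csv.reader has undone the doubled quotes, the field is plain JSON text, so json.loads returns a real Python list that can be reshaped before writing.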
