python csv writer if row key does not exist

Question

The following script is erroring out:

import csv,time,string,os,requests, datetime

test = "\\\\network\\Shared\\test.csv"

fields = ["id", "Expiration Date", "Cost", "Resale" ]

with open(test) as infile, open("c:\\upload\\tested.csv", "wb") as outfile:
    r = csv.DictReader(infile)
    w = csv.DictWriter(outfile, fields, extrasaction="ignore")
    r = (dict((k, v.strip()) for k, v in row.items() if v) for row in r)

    wtr = csv.writer( outfile )                                                                                                                                                                                                                                                                                                                                         
    wtr.writerow(["id", "upload_date", "cost", "resale"])
    for i, row in enumerate(r, start=1):
        row['id'] = i
        print(row['Expiration Date']
        row['Expiration Date'] = datetime.datetime.strptime(row['Expiration Date'][:10], "%m/%d/%Y").strftime("%Y-%m-%d")
        w.writerow(row)


D:\Python\Scripts>python test.py
Traceback (most recent call last):
  File "test.py", line 18, in <module>
    print(row['Expiration Date'])
KeyError: 'Expiration Date'

So I think I understand what's going on - something like this from the original file:

Expiration Date     Cost     Resale
2016-01-01          1.00     2.00
                    1.42     2.42
2016-05-02          1.45     9.00

From what I can gather, there is a row where the expiration date column is NOT populated. How do I force DictWriter to skip over blanks - assuming that is the cause of my error?

Community · Accepted Answer · 2017-05-23 11:44:20Z

1

Actually, the dict produced by the csv.DictReader just puts None into a field it does not find and thus you should not get that error. You are not using the functionality of the DictReader to produce a proper dict! As far as I can tell, you try to do the parsing yourself by use of the line r = (dict((k, v.strip()) for k, v in row.items() if v) for row in r). That does not actually work, though. If you print the rows afterwards you get:

{'Expiration Date     Cost     Resale': '2016-01-01          1.00     2.00'}
{'Expiration Date     Cost     Resale': '1.42     2.42'}
{'Expiration Date     Cost     Resale': '2016-05-02          1.45     9.00'}

So every dict contains only one key. A problem with your file is, that you don't have a valid delimiter between lines. It looks like you mean to use a whitespace, but you have a whitespace in Expiration Date, as well. You will have to get rid of that. If you do that, then you can use the DictReader like this:

import csv,time,string,os,requests, datetime

test = "test.csv"

with open(test) as infile:
    r = csv.DictReader(infile, delimiter=" ", skipinitialspace=True)
for row in r:
    print(row)

will now give you:

{'Resale': '2.00', 'Cost': '1.00', 'ExpirationDate': '2016-01-01'}
{'Resale': None, 'Cost': '2.42', 'ExpirationDate': '1.42'}
{'Resale': '9.00', 'Cost': '1.45', 'ExpirationDate': '2016-05-02'}

which is a proper dict (Notice that the reader has no way of telling, that the first element is the one missing, though). Now you only have to exclude lines that are not complete from writing. A nice way to do that is described here:

import csv,time,string,os,requests, datetime

test = "test.csv"

with open(test) as infile:
    r = csv.DictReader(infile, delimiter=" ", skipinitialspace=True)

    for row in r:
        if not any(val in (None, "") for val in row.itervalues()):
            print(row)

Finally, this will give you all valid lines as dicts:

{'Resale': '2.00', 'Cost': '1.00', 'ExpirationDate': '2016-01-01'}
{'Resale': '9.00', 'Cost': '1.45', 'ExpirationDate': '2016-05-02'}

edited May 23, 2017 at 11:44

CommunityBot

11 silver badge

answered Aug 2, 2016 at 15:55

jotasi

5,2072 gold badges31 silver badges53 bronze badges

Sign up to request clarification or add additional context in comments.

7 Comments

Brian Powell Over a year ago

I didn't really understand the r = (dict((k, v.strip()) for k, v in row.items() if v) for row in r) line - the code itself came from an SO post I made about trimming white space on each value before it was written, but you're right, when I remove that line, it solves the problem of this particular thing happening - but the code WAS stripping white space, so now I need to find a new way to trim white space from my output file...

jotasi Over a year ago

@BrianPowell The skipinitialspace=Trueparameter that I used in my sample strips all leading whitespaces. Isn't that what you need?

Brian Powell Over a year ago

I have several values in the sheet (which is like 50 columns, not the 3 I showed here) which have white space at the beginning, or at the end: ` hello ` - with the code from my comments above, it did trim that white space out, but I guess it also screwed up my dictionary. My understanding is that skipinitialspace only trims white space from the beginning of a value.

jotasi Over a year ago

The problem is not that line but that you don't have commas in you comma-separated-values file and you didn't specify any other delimiter. If you specify delimiter=" " then he is able to correctly produce your dictionary (as long as none of the keys/values conatain whitespaces). And then skipinitialspace=True skips all whitespaces, since there will be none after an entry (as the first whitespace after an entry is treated as delimiter).

jotasi Over a year ago

Ok then you can actually use something like the line you had. First you will have to change delimiter="," and afterwards you can insert r = [{k.strip(): v.strip() for k, v in row.items()} for row in r]. That will work for python 2.7.2 and newer. Then you can use the if clause later to excempt all rows with a missing entry.

|

maxpolk · Accepted Answer · 2016-08-02 15:37:40Z

0

You got a KeyError accessing something not in the dict at x['Expiration Date'] so you could say x.get('Expiration Date') or possibly 'Expiration Date' in x instead to discover if it exists and conditionally discard that row.

answered Aug 2, 2016 at 15:37

maxpolk

2,2371 gold badge20 silver badges25 bronze badges

1 Comment

jotasi Over a year ago

Actually, his read in is flawed and he never has anything else but "Expiration Date Cost Resale" as a key in his dict.

Collectives™ on Stack Overflow

python csv writer if row key does not exist

2 Answers 2

7 Comments

1 Comment

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

7 Comments

1 Comment

Your Answer

Sign up or log in

Post as a guest

Linked

Related