3

So I am parsing data contained in many files, looping through them and storing certain elements in a list, and appending each resulting list to a dataframe with Pandas using Python.

It works, except I can't figure out how to keep the header row while appending. It either disappears or is duplicated with each append.

The below code serves as an example of the latest code:

import pandas as pd

for i in range(1,4):
    data = [{'name': 'Company'+str(i), 'city': 'New York'}]

    stuff = []
    for element in data:
        stuff.append(element)

    df = pd.DataFrame(columns=["name",
                               "city"])

    for record in stuff:
        df = df.append(record, ignore_index=True)

    df.to_csv('test.csv', mode='a', header=False, index=False)

With this code the output (csv file) is:

Company1    New York
Company2    New York
Company3    New York

But I am looking for the output to be:

name        city
Company1    New York
Company2    New York
Company3    New York

Thanks!

2
  • Also i don't understand your code, is this your real code or mock up code because if it was working like you wanted, it'd just create a csv 4 times with the first one overwritten 3 times Commented Feb 7, 2015 at 20:47
  • The code serves as an example, it's a mock up. Commented Feb 7, 2015 at 20:49

1 Answer 1

10

But you've set header=False:

df.to_csv('test.csv', mode='a', header=False, index=False)

you should do:

df.to_csv('test.csv', mode='a', header=True, index=False)

You just need to set it to True for the first iteration and then False for the subsequent iterations

Basically you just do something like the following:

# add this outside your for loop
writeHeader = True

# use the following where you write to csv
if writeHeader is True:
    df.to_csv('test.csv', mode='a', header=True, index=False)
    writeHeader = False
else:
    df.to_csv('test.csv', mode='a', header=False, index=False)

or similar

So the complete thing looks like:

import pandas as pd
writeHeader = True

for i in range(1,4):
    data = [{'name': 'Company'+str(i), 'city': 'New York'}]

    stuff = []
    for element in data:
        stuff.append(element)

    df = pd.DataFrame(columns=["name",
                               "city"])

    for record in stuff:
        df = df.append(record, ignore_index=True)

    if writeHeader is True:    
        df.to_csv('test.csv', mode='a', header=True, index=False)
        writeHeader = False
    else:
        df.to_csv('test.csv', mode='a', header=False, index=False)
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.