Python - adding multiple tables into a single CSV with pandas

Question

I'm wondering how to get the tables parsed by pandas into a single CSV. I have managed to write each table to its own CSV file, but I would like them all in one file. This is my current code, which produces multiple CSVs:

import pandas as pd
import csv

url = "https://fasttrack.grv.org.au/RaceField/ViewRaces/228697009?raceId=318809897"

data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )

for i, datas in enumerate(data):

    datas.to_csv("new{}.csv".format(i), header = False, index = False)

Is the schema the same for all tables?

– Sam Chats, Commented May 9, 2018 at 3:48
Yes, the schema is the same.

– user3170725, Commented May 9, 2018 at 4:33

jezrael · Accepted Answer · 2018-05-09 06:45:52Z

4

I think you only need concat, because data is a list of DataFrames:

df = pd.concat(data, ignore_index=True)
df.to_csv(file, header=False, index=False)
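As a minimal sketch of the concat approach, here are two small made-up DataFrames standing in for the tables that read_html returns. The row values and the in-memory buffer are assumptions for illustration only; with real data you would pass an output path to to_csv instead.

```python
import io
import pandas as pd

# Hypothetical stand-ins for the list of tables returned by pd.read_html.
tables = [
    pd.DataFrame([["Dog A", 1], ["Dog B", 2]]),
    pd.DataFrame([["Dog C", 3]]),
]

# Stack the tables on top of each other and renumber the rows from 0.
combined = pd.concat(tables, ignore_index=True)

# Write to an in-memory buffer so the sketch leaves no file behind;
# pass a path such as "combined.csv" to write a real file.
buffer = io.StringIO()
combined.to_csv(buffer, header=False, index=False)
csv_text = buffer.getvalue()
print(csv_text)
```

Because the tables share one schema, the result is a single block of rows with no gaps between the original tables.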

edited May 9, 2018 at 6:45

answered May 9, 2018 at 6:34

jezrael


1 Comment

Benares Over a year ago

You can use axis=1 in concat to put the dataframes side-by-side instead of one after the other (not sure which one you want).
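To illustrate the difference this comment describes, here is a small sketch with two made-up tables of the same shape (the values are assumptions, not data from the question):

```python
import pandas as pd

# Two hypothetical tables with the same schema (2 rows x 2 columns each).
first = pd.DataFrame([["Dog A", 1], ["Dog B", 2]])
second = pd.DataFrame([["Dog C", 3], ["Dog D", 4]])

# Default axis=0: rows of the second table follow the rows of the first.
stacked = pd.concat([first, second], ignore_index=True)

# axis=1: the tables sit next to each other, aligned on the row index.
side_by_side = pd.concat([first, second], axis=1)

print(stacked.shape)
print(side_by_side.shape)
```

Stacking keeps the column count and adds rows, while axis=1 keeps the row count and adds columns, so for tables with the same schema the default is usually what a single combined CSV needs.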

klvmungai · Accepted Answer · 2018-05-09 12:21:35Z

3

You have two options:

1. You can tell pandas to append data while writing to the CSV file.

data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )
for datas in data:
    datas.to_csv("new.csv", header=False, index=False, mode='a')

2. Merge all the tables into one DataFrame and then write that into the CSV file.

data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )
df = pd.concat(data, ignore_index=True)
df.to_csv("new.csv", header=False, index=False)

Edit

To keep the DataFrames separated in the CSV file, we have to stick with option #1, with a few additions:

data = pd.read_html(url, attrs = {'class': 'ReportRaceDogFormDetails'} )
with open('new.csv', 'a', newline='') as csv_stream:
    for datas in data:
        datas.to_csv(csv_stream, header=False, index=False)
        csv_stream.write('\n')
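The blank-line separator can be checked with a self-contained sketch. The tables and the temporary directory are made up for illustration. Unlike the snippet above, this version opens the file with mode "w", on the assumption that you want a fresh file on each run; with "a", running the script again adds the same rows to the end of the existing file.

```python
import os
import tempfile
import pandas as pd

# Hypothetical stand-ins for the tables returned by pd.read_html.
tables = [
    pd.DataFrame([["Dog A", 1], ["Dog B", 2]]),
    pd.DataFrame([["Dog C", 3]]),
]

# Write into a temporary directory so the sketch does not touch real files.
path = os.path.join(tempfile.mkdtemp(), "new.csv")

# newline="" leaves line endings to the CSV writer, which avoids
# doubled line breaks on Windows.
with open(path, "w", newline="") as csv_stream:
    for table in tables:
        table.to_csv(csv_stream, header=False, index=False)
        csv_stream.write("\n")  # one empty row after each table

with open(path) as f:
    lines = f.read().splitlines()
print(lines)
```

Note that this also leaves an empty row after the last table. If the file is later read back with pd.read_csv, blank lines are skipped by default, so the separators are mainly useful for people viewing the file.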

edited May 9, 2018 at 12:21

answered May 9, 2018 at 6:42

klvmungai


1 Comment

user3170725 Over a year ago

Thank you! Would you know how to still separate the tables during the concat, so they aren't straight after one another? For example, with one empty row between them.

Sam Chats · Accepted Answer · 2018-05-09 03:50:57Z

0

all_dfs = []

for i, datas in enumerate(data):
    all_dfs.append(datas.to_csv("new{}.csv".format(i), header = False, index = False))

result = pd.concat(all_dfs)

answered May 9, 2018 at 3:50

Sam Chats


2 Comments

Sam Chats Over a year ago

This can be a one-liner with list comprehension, but I chose the form above for clarity.

user3170725 Over a year ago

Thanks for your reply. I'm getting an error with that code: ValueError: All objects passed were None
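The error reported in this comment can be reproduced in a short sketch. When to_csv is given a path it writes the file and returns None, so the list handed to concat contains only None values. The tables and the temporary directory here are made up for illustration.

```python
import os
import tempfile
import pandas as pd

# Hypothetical stand-ins for the tables returned by pd.read_html.
tables = [pd.DataFrame([["Dog A", 1]]), pd.DataFrame([["Dog B", 2]])]
out_dir = tempfile.mkdtemp()

# to_csv returns None when it is given a path, so this list holds only None.
returned = [
    t.to_csv(os.path.join(out_dir, "new{}.csv".format(i)), header=False, index=False)
    for i, t in enumerate(tables)
]

# Concatenating a list of None values raises the ValueError from the comment.
try:
    pd.concat(returned)
    error_message = ""
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# Concatenating the DataFrames themselves works as intended.
result = pd.concat(tables, ignore_index=True)
print(result.shape)
```

One way to avoid the error is to pass the DataFrames themselves to concat, as the other answers do, and call to_csv once on the combined result.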
