I have many thousands of CSVs, and each CSV has over 10,000 records. I'm looking for the most efficient way to dump this data into tables in a Postgres DB with minimal time and effort.
-
What methods have you found on your searches? How did they work out? – roganjosh, Aug 23, 2018 at 14:23
-
Similar questions have been answered already. Please look at stackoverflow.com/questions/30050097/… and also at stackoverflow.com/questions/12646305/… – sulabh chaturvedi, Aug 23, 2018 at 14:32
-
It's not the same as the previously linked questions: those show how to import a SINGLE csv file into Postgres. In my case, I want to automate the import of a very large number of files, where there are 2 manual steps involved: 1. Create a new table. 2. Import the csv into this new table. I want to accomplish these 2 steps in one procedure for many thousands of files, through automation. – Priya Sreetharan, Aug 23, 2018 at 15:12
-
In other words, I want to create the tables on the fly, name each table after its source file, and then import the data from each source file into its newly created table, for a whole batch of files. – Priya Sreetharan, Aug 23, 2018 at 15:16
-
@sulabhchaturvedi the links you have provided answer how to import a single file into a new table created manually. My question is different. – Priya Sreetharan, Aug 23, 2018 at 15:17
2 Answers
COPY is usually the best solution; it depends on your constraints.
COPY table_name FROM 'path_readable_by_postgres/file.csv';
You can also cat your files into one big file to import the data quickly.
Look at https://www.postgresql.org/docs/current/static/sql-copy.html for more details.
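To avoid naming thousands of files by hand, both manual steps (create the table, then import the file) can be wrapped in a small driver script. Here is a rough sketch, not taken from this answer, using psycopg2: it derives each table name from the file name, creates the table from the CSV header with plain text columns, and bulk-loads the rest of the file with COPY ... FROM STDIN. The connection string, the directory and the all-text column types are assumptions to adapt.
import csv
import glob
import os
import psycopg2

# Placeholder connection details -- adapt to your setup
conn = psycopg2.connect("dbname=mydb user=myuser password=mypassword host=localhost")

for path in glob.glob("/data/csvs/*.csv"):  # hypothetical directory
    table = os.path.splitext(os.path.basename(path))[0]
    with open(path, newline="") as f:
        header = next(csv.reader(f))  # first line = column names
        columns = ", ".join('"{}" text'.format(col) for col in header)
        with conn.cursor() as cur:
            cur.execute('CREATE TABLE IF NOT EXISTS "{}" ({})'.format(table, columns))
            # The header has already been consumed, so COPY only sees data rows
            cur.copy_expert('COPY "{}" FROM STDIN WITH (FORMAT csv)'.format(table), f)
    conn.commit()

conn.close()
Creating every column as text sidesteps type inference; you can alter or cast the columns afterwards if you need proper types.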
3 Comments
zeevb
Answered also here: stackoverflow.com/questions/2987433/…
Inder
As @zeevb pointed out, this is the same answer as the one provided by a different user in that link.
Priya Sreetharan
I have many thousands of files, so I don't want to add all the file names manually. Also, for this command to work I already need to have the table created, which I don't. If I were to create them manually, I'd have to create many thousands of tables, which would be extremely time consuming.
You can use the pandas library to read and transform the data (if needed), SQLAlchemy to create a Postgres engine, and psycopg2 to load the data into PostgreSQL. I assume that you've already created the tables in the Postgres DB. Try something like the code below:
import pandas as pd
import psycopg2
from sqlalchemy import create_engine

# Read the csv; drop "Unnamed: 0", as it often causes problems when writing to a table
pd_table = pd.read_csv('path/to/file.csv', index_col='index_column').drop(["Unnamed: 0"], axis=1)

# Now simply load your data into the database
engine = create_engine('postgresql://user:password@host:port/database')
try:
    pd_table.to_sql('name_of_table_in_postgres_db', engine, if_exists='append')
except (Exception, psycopg2.DatabaseError) as error:
    print(error)
finally:
    print('Closed connection to the database')
5 Comments
Priya Sreetharan
So, no, I haven't created the tables yet. The code above reads one csv and dumps it into a new table, and I can loop it over all the files?
Priya Sreetharan
The above code throws an error in line engine = try:
roganjosh
engine = try: why are you using assignment here?
enoted
engine = try: --- sorry, I have lost some code, now it should be okay
enoted
The code above adds one csv into a postgresql table created earlier. You can loop it to add all the csv files.
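For illustration, that loop could look like the rough sketch below. The directory, connection string and chunksize are placeholders; to_sql creates each table (named after its file) if it doesn't already exist and appends otherwise.
import glob
import os
import pandas as pd
from sqlalchemy import create_engine

# Placeholder connection string and directory -- adapt to your setup
engine = create_engine('postgresql://user:password@host:port/database')

for path in glob.glob('/data/csvs/*.csv'):
    # Use the file name (without extension) as the table name
    table_name = os.path.splitext(os.path.basename(path))[0]
    df = pd.read_csv(path)
    # to_sql creates the table if it does not exist, appends if it does
    df.to_sql(table_name, engine, if_exists='append', index=False, chunksize=10000)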