8

I'm trying to fill a table from CSV files in a python script.

The SQL statement, which follows, runs without error:

COPY registro
FROM '/home/pablo/Escritorio/puntos/20140227.csv'
DELIMITER ','
CSV header;

CSV has headers, and using header parameter, it imports without error.

The problem comes when I execute it from my python script. The only way I've found not to try to import the headers is with copy_expert() method. I get no error message but the table is still empty after I run the Python script below.

Any possible clue? Or maybe any other way to copy a table from CSV with headers?

Thanks.

#/usr/bin/env python
# -*- coding: utf-8 -*-
import psycopg2
import os
import glob
DSN = "dbname=gps user=postgres host=localhost"
con = psycopg2.connect(DSN)
cur = con.cursor()
my_file = open('/home/pablo/Escritorio/puntos/20140227.csv')
#This is only a test file, not all the directory
sql = "COPY registro FROM stdin DELIMITER \',\' CSV header;"
cur.copy_expert(sql, my_file)
cur.close()
con.close()
3
  • possible duplicate of Recreating Postgres COPY directly in Python? Commented Aug 29, 2014 at 10:32
  • 1
    I don't think this is a duplicate, it not talks about copy_expert() neither CSV headers Commented Aug 29, 2014 at 10:41
  • It was more like a "That might also work" forgot to write this as a comment. Commented Aug 29, 2014 at 11:54

2 Answers 2

20

I'd try con.commit() after cur.copy_expert().

Also I would avoid preprocessing and uploading the file row by row as Sam P. pointed out above if the dataset is large. cur.copy_expert() is significantly faster.

conn = psycopg2.connect('postgresql://scott:tiger@localhost:5432/database')
cur = conn.cursor()
copy_sql = """
           COPY table_name FROM stdin WITH CSV HEADER
           DELIMITER as ','
           """
with open(path, 'r') as f:
    cur.copy_expert(sql=copy_sql, file=f)
    conn.commit()
    cur.close()
Sign up to request clarification or add additional context in comments.

3 Comments

should you not also close the connection?
Does the copy_from(file, table_name, feature=value) method have an option for header=true?
@aydow, closing the connection is not a bad idea. I should've included that in my original post outside of the with block
2

I would recommend dealing with the csv file in python first. It will be best to structure the data pulled from the csv file into rows/columns (in python this will nested lists, or a list of a tuples) then you can construct & execute SQL commands based on that data iteratively.

Use the csv library to interact with the csv file, take a look at the documentation here: https://docs.python.org/2/library/csv.html. It's very user friendly and will help you with a lot of your problems.

Here's a way to do it without csv (as I can't remember all the functions off the top of my head), however it would be best not to use this approach:

#/usr/bin/env python
# -*- coding: utf-8 -*-
import psycopg2
import os
import glob
DSN = "dbname=gps user=postgres host=localhost"
con = psycopg2.connect(DSN)
cur = con.cursor()

# 'rb' used as I don't know the encoding of your file
# just use r if it's in utf-8 or a known/consistent charset
with open(file,'rb') as open_file:
    my_file = open_file.read().decode('utf-8','ignore')

data = my_file.splitlines()
data = [r.split(delimiter) for r in data]

data = data[1:] # get rid of headers

for r in data:
     # create command
     # cur.execute(command)

2 Comments

This line helps me while reading data from csv/json and copy into flex in vertica. Thanks -:) my_file = open_file.read().decode('utf-8','ignore')

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.