In the example below, the file appears to be empty after the `copy_expert` call, and `pd.read_csv` raises an error saying the file is empty. The file also appears empty when I perform the operations in reverse order (`read_csv` before `copy_expert`). Why is the file empty, and how can I handle this scenario?

    file = ...  # download file from S3

    copy_sql = """
                   COPY %s FROM stdin WITH CSV HEADER
                   DELIMITER as ','
                   """
    cursor = self.connection.cursor()
    cursor.copy_expert(sql=copy_sql % table_name, file=file)
    cursor.close()

    # Fails here: pandas reports that the file is empty.
    df = pd.read_csv(file, dtype={...})  # dtype mapping elided

EDIT

I was able to solve it with the following. However, it would be helpful to understand why the file object is emptied, and whether there is a more efficient way of doing this.

    file = ...  # download file from S3
    file_clone = copy.deepcopy(file)  # second, independent copy of the file object
    copy_sql = """
                   COPY %s FROM stdin WITH CSV HEADER
                   DELIMITER as ','
                   """
    cursor = self.connection.cursor()
    cursor.copy_expert(sql=copy_sql % table_name, file=file)
    cursor.close()

    df = pd.read_csv(file_clone, dtype={...})  # dtype mapping elided

1 Answer

The file is not actually emptied. After the first read, the file pointer is at the end of the file, so any subsequent read returns no data. This is also why the order of the two calls does not matter: whichever runs second finds nothing left to read.
To read the data again, move the pointer back to the beginning of the file with `seek(0)`:

    cursor.copy_expert(sql=copy_sql % table_name, file=file)
    file.seek(0)  # rewind to the start before reading again
    df = pd.read_csv(file, dtype={...})  # dtype mapping elided
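
The pointer behaviour can be reproduced without a database or pandas. The sketch below is only an illustration under stated assumptions: an in-memory `io.StringIO` stands in for the file downloaded from S3, the hypothetical `consume` function stands in for `cursor.copy_expert`, and the standard-library `csv.reader` stands in for `pd.read_csv`.

```python
import csv
import io

# Assumption: an in-memory buffer stands in for the file downloaded from S3.
file = io.StringIO("id,name\n1,alice\n2,bob\n")


def consume(stream):
    """Hypothetical stand-in for cursor.copy_expert: reads the stream to its end."""
    return stream.read()


first_pass = consume(file)   # the whole content is read; pointer moves to the end
second_pass = file.read()    # nothing left to read, so this is an empty string

file.seek(0)                 # move the pointer back to the beginning
rows = list(csv.reader(file))  # stand-in for pd.read_csv; sees the full content again
```

Rewinding avoids duplicating the whole buffer as `copy.deepcopy` does, but it only works when the file object is seekable. An in-memory buffer or a local file is; a raw network stream may not be, in which case the data would need to be buffered first.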