I have a CSV file, events.csv:

"PATIENT ID,PATIENT NAME,EVENT TYPE,EVENT VALUE,EVENT UNIT,EVENT TIME"
"1,Jane,HR,82,beats/minute,2021-07-07T02:27:00Z"
"1,Jane,RR,5,breaths/minute,2021-07-07T02:27:00Z"

Then I use Python's csv module to read it:

import csv
with open(r'/Users/williaml/Downloads/events.csv') as csvfile: 
    spamreader = csv.DictReader(csvfile, delimiter=',' ,quotechar=' ')
    for row in spamreader:            
        print(row)

Output:

{'"PATIENT ID': '"1', 'PATIENT NAME': 'Jane', 'EVENT TYPE': 'HR', 'EVENT VALUE': '82', 'EVENT UNIT': 'beats/minute', 'EVENT TIME"': '2021-07-07T02:27:00Z"'}

{'"PATIENT ID': '"1', 'PATIENT NAME': 'Jane', 'EVENT TYPE': 'RR', 'EVENT VALUE': '5', 'EVENT UNIT': 'breaths/minute', 'EVENT TIME"': '2021-07-07T02:27:00Z"'}

Then I tried to insert these rows into the database:

import psycopg2
conn = psycopg2.connect(host='localhost', dbname='patientdb',user='username',password='password',port='')
cur = conn.cursor()
import csv
with open(r'apps/patients/management/commands/events.csv') as csvfile:
        spamreader = csv.DictReader(csvfile, delimiter=',' ,quotechar=' ')
        for row in spamreader:
                cur.execute(f"""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES
  ({row['"PATIENT ID']},{row['EVENT TYPE']},{row['EVENT VALUE']},
   {row['EVENT UNIT']},{row['EVENT TIME"']})""")

Error:

psycopg2.errors.UndefinedColumn: column "1,HR,82,
   beats/minute,2021-07-07T02:27:00Z" does not exist
LINE 2:   ("1,HR,82,
           ^

However, if I run the following SQL directly in the database terminal, it works:

INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES('1','HR','82','beats/minute','2021-07-07T02:27:00Z');

So I think this part of the code is incorrect:

cur.execute(f"""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES
      ({row['"PATIENT ID']},{row['EVENT TYPE']},{row['EVENT VALUE']},
       {row['EVENT UNIT']},{row['EVENT TIME"']})""")

Can anyone help?

  • Can you see anything in {'"PATIENT ID': '"1', that might be the reason for this? It comes from the output you posted above, which is what you are writing into the db. Commented Nov 11, 2021 at 17:35
  • Why not use COPY? Commented Nov 11, 2021 at 17:48
  • @balderman Not every column is needed. For example, PATIENT NAME is not needed. Can I still use COPY? Commented Nov 11, 2021 at 17:51
  • @William I think COPY knows how to handle this situation. Commented Nov 11, 2021 at 17:53
  • Does the CSV file really have double quotes at the beginning and end of each line, and nowhere else? That would make each line a single field, which seems highly suspicious. That should be fixed upstream, if possible. And once you have sorted that out, please please please use proper parameter passing and not string formatting to pass values to SQL queries. Commented Nov 11, 2021 at 18:16

2 Answers

Use this:

cur.execute("""INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES ({0},{1},{2},{3},{4})"""
            .format(row['"PATIENT ID'][1:], row['EVENT TYPE'], row['EVENT VALUE'], row['EVENT UNIT'], row['EVENT TIME"'][:-1]))

This strips the extra quote characters from the first and last values of the output dict, which I pointed out in my comment and which are causing this issue.

That is also why

INSERT INTO patients_event (patient_id, event_type_id , event_value ,event_unit, event_time) VALUES('1','HR','82','beats/minute','2021-07-07T02:27:00Z');

works in the db terminal: compare the values inserted there with the ones your code builds.

UPDATE: Avoid using Python's string formatting for queries, as it can produce malformed queries or open the door to SQL injection. See the Parameters section of the psycopg docs for the correct way to do this, as Adrian has mentioned in the comments below.
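
For illustration, here is a minimal sketch of the parameterized form, assuming the row dict printed in the question. The helper names `event_params` and `insert_event` are made up for this example, and `cur` is assumed to be a psycopg2 cursor. The `%s` placeholders leave the quoting to the driver, so the values need no manual quote marks:

```python
# Hypothetical helpers for illustration; table and column names come from the question.
INSERT_SQL = """
    INSERT INTO patients_event
        (patient_id, event_type_id, event_value, event_unit, event_time)
    VALUES (%s, %s, %s, %s, %s)
"""


def event_params(row):
    """Build the parameter tuple for one CSV row, trimming the stray quotes."""
    return (
        row['"PATIENT ID'].lstrip('"'),   # key and value both start with a stray "
        row['EVENT TYPE'],
        row['EVENT VALUE'],
        row['EVENT UNIT'],
        row['EVENT TIME"'].rstrip('"'),   # key and value both end with a stray "
    )


def insert_event(cur, row):
    """Pass the values separately from the SQL text so the driver escapes them."""
    cur.execute(INSERT_SQL, event_params(row))
```

Inside the loop from the question this would be called as `insert_event(cur, row)`, followed by a single `conn.commit()` after the loop so the inserts are actually saved.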


7 Comments

Thank you for your answer. After using your code, the error changed to: psycopg2.errors.SyntaxError: syntax error at or near "T02" LINE 3: beats/minute,2021-07-07T02:27:00Z)
Do not use formatted strings to do this. Read the Parameters section of the docs for the correct way to do this. This should not be an accepted answer as it promotes bad coding.
@AdrianKlaver The psycopg docs nowhere explicitly say that one should avoid using string formatting for queries. Also, they use .format in the last code block of the reference you mentioned.
That is the format provided by psycopg2.sql and is a method of sql.SQL(). It is a different thing, in particular because it does correct escaping. f-strings are just a form of string interpolation and are not suitable for building SQL queries.
Ok, my bad! But, I think the question is more about where OP was wrong and what a solution in his perspective would be. And I answered accordingly. I'll update my answer.

One problem with the CSV is the " at the beginning and end of each line. The way you are parsing it makes that quote part of the SQL expression.

           here 
LINE 2:   ("1,HR,82

This causes the error: the quote is never closed, and it was never meant to appear in the generated SQL in the first place.
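
One possible way to deal with this on the reading side is to drop the quote that wraps each line before the csv module sees it. This is only a sketch and assumes the file really looks like the sample in the question (one pair of double quotes around each whole line, none inside the fields). The function name `read_events` is made up for this example:

```python
import csv
import io


def read_events(lines):
    """Return a DictReader over lines whose wrapping double quotes were removed."""
    # Strip the newline first, then the quote at each end of the line.
    cleaned = (line.strip().strip('"') for line in lines)
    # Skip lines that are empty after stripping.
    non_empty = (line for line in cleaned if line)
    return csv.DictReader(non_empty)


# Sample data copied from the question, held in memory instead of a file.
sample = io.StringIO(
    '"PATIENT ID,PATIENT NAME,EVENT TYPE,EVENT VALUE,EVENT UNIT,EVENT TIME"\n'
    '"1,Jane,HR,82,beats/minute,2021-07-07T02:27:00Z"\n'
    '"1,Jane,RR,5,breaths/minute,2021-07-07T02:27:00Z"\n'
)
rows = list(read_events(sample))
```

The keys then come out as plain `PATIENT ID` and `EVENT TIME`, without the stray quotes. With a real file, pass the open file object instead of `sample`. If the file ever contains properly quoted fields, this stripping would damage them, so fixing the export upstream is still the better option.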
