I'm trying to append new rows to an existing BigQuery table from a CSV file. The CSV is:

"sprotocol";"w5q53";"insertingdate";"closeddate";"sollectidate";"company";"companyid";"contact"
"20-22553";"DELETED";"2020-01-26;0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"

And this is my Python function:

job_config = bigquery.LoadJobConfig()
    job_config.source_format = 'text/csv'
    job_config.write_disposition = bigquery.WriteDisposition.WRITE_APPEND
    job_config.source_format = bigquery.SourceFormat.CSV
    job_config.skip_leading_rows = 1
    job_config.autodetect = False
    job_config.schema = [
        bigquery.SchemaField("sprotocol", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("w5q53", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("insertingdate", "TIMESTAMP", mode="NULLABLE"),
        bigquery.SchemaField("closeddate", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("sollectidate", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("company", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("companyid", "STRING", mode="NULLABLE"),
        bigquery.SchemaField("contact", "STRING", mode="NULLABLE")
    ]
    job_config.fieldDelimiter = ';'
    job_config.allow_quoted_newlines = True

    with open(file_path, "rb") as file:
        load_job = _connection.load_table_from_file(
            file,
            table_ref,
            job_config=job_config
        )  # API request
        print("Starting job {}".format(load_job.job_id))

        load_job.result()  # Waits for table load to complete.
        print("Job finished.")
    file.close()

I receive the following error:

[{'reason': 'invalid', 'message': 'Error while reading data, error message: CSV table encountered too many errors, giving up. Rows: 1; errors: 1. Please look into the errors[] collection for more details.'}, {'reason': 'invalid', 'message': 'Error while reading data, error message: CSV table references column position 55, but line starting at position:743 contains only 1 columns.'}]

I've also tried removing the schema definition, but I get the same error. Can someone help me?

  • No need for file.close() at the end. with does it automatically :-) Commented Apr 7, 2022 at 5:28

1 Answer

There are three issues in the code above:

  1. Use field_delimiter instead of fieldDelimiter (the Python property name is snake_case):

    job_config.field_delimiter = ';'

  2. Use DATE instead of TIMESTAMP, because the input contains only a date:

    bigquery.SchemaField("insertingdate", "DATE", mode="NULLABLE"),

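  3. The double quotes in the data row are malformed: the closing quote after 2020-01-26 and the opening quote before the next value are missing, so two fields merge into one. The row should be: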

    "20-22553";"DELETED";"2020-01-26";"0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"
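A quoting problem like the one in point 3 can be caught locally before the load job is submitted. The following is a minimal sketch using only Python's standard csv module; the helper name find_bad_rows is hypothetical and not part of the BigQuery client. It flags any row whose field count differs from the header's.

```python
import csv
import io

HEADER = '"sprotocol";"w5q53";"insertingdate";"closeddate";"sollectidate";"company";"companyid";"contact"'
# Row as posted in the question: the quotes around the two dates are merged.
BROKEN = '"20-22553";"DELETED";"2020-01-26;0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"'
# Row as corrected in the answer.
FIXED = '"20-22553";"DELETED";"2020-01-26";"0000-01-01 00:00";"0000-01-01 00:00";"";"";"this is a ticket"'


def find_bad_rows(text, delimiter=";"):
    """Return (line_number, field_count) for each non-empty row whose
    field count differs from the header's (hypothetical helper)."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quotechar='"')
    header = next(reader)
    bad = []
    for row in reader:
        # Skip blank lines; report rows that are too short or too long.
        if row and len(row) != len(header):
            bad.append((reader.line_num, len(row)))
    return bad


if __name__ == "__main__":
    print(find_bad_rows(HEADER + "\n" + BROKEN + "\n"))
    print(find_bad_rows(HEADER + "\n" + FIXED + "\n"))
```

With the row from the question, the merged quotes leave seven fields against an eight-column header, so the row is reported; the corrected row passes. For a real file, the text could be read with open(file_path, newline="") and passed to the same helper.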

2 Comments

Hi, thanks for your answer. About point 2: I have to follow the BigQuery table schema (the column is a TIMESTAMP). About point 3: I've fixed the quoted fields. The problem was related to this point.
@br1 Happy that solved your issue. Can you mark this as the answer?