My use case involves creating an external table in BigQuery using PySpark code. The data source is a Google Cloud Storage bucket containing JSON data. I am reading the JSON data into a DataFrame and want to create an external BigQuery table. As of now, the table is getting created, but it is not an external one.

df_view.write\
    .format("com.google.cloud.spark.bigquery")\
    .option("table", "xyz-abc-abc:xyz_zone.test_table_yyyy")\
    .option("temporaryGcsBucket", "abcd-xml-abc-warehouse")\
    .save(mode="append", path="gs://xxxxxxxxx/")

P.S. - I am using the spark-bigquery connector to achieve my goal.

Please let me know if anyone has faced the same issue.

1 Answer

At the moment the spark-bigquery-connector does not support writing to an external table. Please create an issue and we will try to add it soon.

You can of course do it in two steps, as sketched below:

  • Write the JSON files to GCS.
  • Use the BigQuery API in order to create the external table.
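
Here is a minimal sketch of both steps using the google-cloud-bigquery Python client. It reuses the placeholder project, dataset, table, and bucket names from the question; the wildcard source URI and schema autodetection are assumptions you may need to adjust for your file layout:

    from google.cloud import bigquery

    # Step 1: write the DataFrame out as newline-delimited JSON files on GCS.
    df_view.write.mode("append").json("gs://xxxxxxxxx/")

    # Step 2: define an external table over those files.
    client = bigquery.Client(project="xyz-abc-abc")

    # NEWLINE_DELIMITED_JSON matches the format Spark's JSON writer produces.
    external_config = bigquery.ExternalConfig("NEWLINE_DELIMITED_JSON")
    external_config.source_uris = ["gs://xxxxxxxxx/*.json"]  # assumed file pattern
    external_config.autodetect = True  # assumption: let BigQuery infer the schema

    table = bigquery.Table("xyz-abc-abc.xyz_zone.test_table_yyyy")
    table.external_data_configuration = external_config

    # create_table registers the table as EXTERNAL; the data stays in GCS.
    client.create_table(table)

Once created, queries against the table read the JSON directly from GCS, so no separate load job is needed.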

1 Comment

Thanks David. I will raise a ticket with the Google Cloud support team.
