2

Given a query example like

import uuid

from google.cloud import bigquery


def query_shakespeare():
    client = bigquery.Client()
    query_job = client.run_async_query(str(uuid.uuid4()), """
        #standardSQL
        SELECT corpus AS title, COUNT(*) AS unique_words
        FROM `publicdata.samples.shakespeare`
        GROUP BY title
        ORDER BY unique_words DESC
        LIMIT 10""")

    query_job.begin()
    query_job.result()  # Wait for job to complete.

    destination_table = query_job.destination
    destination_table.reload()
    for row in destination_table.fetch_data():
        print(row)


if __name__ == '__main__':
    query_shakespeare()

how can I get the schema of the table? row, in the previous example has the form

Row(('august', -1, 'aaa', 333), {'col1': 0, 'col2': 1, 'col3': 2})

but I can't find, for the package google-cloud-bigquery==0.28.0 the way to extract the header JSON. Of course, also the extraction of the table schema would be fine for me, but the current Google documentation looks not to work for the last version...

2 Answers 2

5

If you need the schema of the table you just queried, you can get it from the result method from the QueryJob:

client = bq.Client()
query = """
         #standardSQL
         SELECT corpus AS title, COUNT(*) AS unique_words
         FROM `publicdata.samples.shakespeare`
         GROUP BY title
         ORDER BY unique_words DESC
         LIMIT 10"""
query_job = client.query(query)
result = query_job.result()

schema = result.schema

Result:

[SchemaField(u'title', u'string', u'NULLABLE', None, ()),
 SchemaField(u'unique_words', u'integer', u'NULLABLE', None, ())]

(The code you presented in your question is related to version 0.27).

As for the question of getting the header JSON, not sure if I understood it correctly but as it seems you need the schema to find where the json is located (I'm guessing here).

Sign up to request clarification or add additional context in comments.

1 Comment

Very interesting, because the documentation says table.schema
1
table_ref = client.dataset(dataset_id).table(table_id)
table = client.get_table(table_ref)
original_schema = table.schema

this code will print table schema. In my case, it was following output

[SchemaField('guidislink', 'STRING', 'NULLABLE', None, ()),
 SchemaField('id', 'STRING', 'NULLABLE', None, ()),
 SchemaField('link', 'STRING', 'NULLABLE', None, ())]

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.