I need a Python equivalent of this BigQuery command: bq show --format=prettyjson myproject:mydataset.mytable

Is there a way to do it with the BigQuery API in Python?

I tried this in Python:

view_ref = self._client.dataset(dataset.dataset_id).table(table.table_id)
table_obj = self._client.get_table(view_ref)

dict_schema = []
for schema_field in table_obj.schema:
    dict_schema.append({
        'name': schema_field.name,
        'mode': schema_field.mode,
        'type': schema_field.field_type
    })

It almost works; the only thing missing is the nested schema fields.

Thanks for replies and have a nice day.

1 Answer

You can convert your table schema to JSON with the client's schema_to_json() method. It takes two arguments, schema_list and destination, in that order.

I illustrated your case with a table from a public dataset and wrote to an io.StringIO() buffer just to show what the schema looks like.

from google.cloud import bigquery
import io

client = bigquery.Client()

project = 'bigquery-public-data'
dataset_id = 'samples'
table_id = 'shakespeare'

# get_table() also accepts a fully qualified "project.dataset.table" string,
# which avoids the client.dataset() helper (deprecated in newer library versions).
table = client.get_table(f"{project}.{dataset_id}.{table_id}")


f = io.StringIO("")
client.schema_to_json(table.schema, f)
print(f.getvalue())

And the output:

[
  {
    "description": "A single unique word (where whitespace is the delimiter) extracted from a corpus.",
    "mode": "REQUIRED",
    "name": "word",
    "type": "STRING"
  },
  {
    "description": "The number of times this word appears in this corpus.",
    "mode": "REQUIRED",
    "name": "word_count",
    "type": "INTEGER"
  },
  {
    "description": "The work from which this word was extracted.",
    "mode": "REQUIRED",
    "name": "corpus",
    "type": "STRING"
  },
  {
    "description": "The year in which this corpus was published.",
    "mode": "REQUIRED",
    "name": "corpus_date",
    "type": "INTEGER"
  }
]

This is the same as the output of the command bq show --format=prettyjson bigquery-public-data:samples.shakespeare | jq '.schema.fields'
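
If you would rather keep the hand-built list of dicts from the question, the loop can be made recursive so that nested (RECORD) fields are included. The following is only a sketch: schema_to_dicts is a name I made up, and StandInField is a hypothetical stand-in that lets the example run without a BigQuery connection. It assumes each field exposes name, mode, field_type and a fields tuple of sub-fields, which is how bigquery.SchemaField objects are laid out.

```python
import json
from dataclasses import dataclass
from typing import Any, Iterable, Tuple


@dataclass
class StandInField:
    """Hypothetical stand-in for bigquery.SchemaField so the sketch runs offline."""
    name: str
    field_type: str
    mode: str = "NULLABLE"
    fields: Tuple["StandInField", ...] = ()


def schema_to_dicts(schema: Iterable[Any]) -> list:
    """Recursively convert schema fields to dicts, keeping nested sub-fields."""
    result = []
    for field in schema:
        entry = {
            "name": field.name,
            "mode": field.mode,
            "type": field.field_type,
        }
        # RECORD columns carry their children in .fields; recurse into them.
        if field.fields:
            entry["fields"] = schema_to_dicts(field.fields)
        result.append(entry)
    return result


# Made-up schema with one nested column, for illustration only.
example_schema = [
    StandInField("id", "INTEGER", "REQUIRED"),
    StandInField("address", "RECORD", "NULLABLE", (
        StandInField("city", "STRING"),
        StandInField("zip", "STRING"),
    )),
]

nested = schema_to_dicts(example_schema)
print(json.dumps(nested, indent=2))
```

With a real table you would pass table_obj.schema instead of example_schema.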


2 Comments

Getting a mypy error: Argument 2 to "schema_to_json" of "Client" has incompatible type "StringIO"; expected "Union[str, bytes, PathLike[str], PathLike[bytes]]" with this approach. Any suggestion how to fix it?
client.schema_to_json(table.schema, f) raises AttributeError: 'TableListItem' object has no attribute 'schema'
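
A possible way around both comments, offered as a sketch rather than a confirmed fix. For the AttributeError: the items returned by client.list_tables() are TableListItem objects, which do not carry the schema, so each one would need to go through client.get_table() first to get a full Table. For the mypy complaint: one option is to skip the file-like destination entirely and build the JSON string from each field's to_api_repr(). The helper name schema_as_json_text and the StandInField class below are hypothetical; the stand-in only mimics the to_api_repr() method of bigquery.SchemaField so the example runs offline.

```python
import json


class StandInField:
    """Hypothetical stand-in exposing to_api_repr(), like bigquery.SchemaField."""

    def __init__(self, name, field_type, mode="NULLABLE"):
        self._repr = {"name": name, "type": field_type, "mode": mode}

    def to_api_repr(self):
        # Return a copy so callers cannot mutate the stored representation.
        return dict(self._repr)


def schema_as_json_text(schema):
    """Serialize a list of schema fields to a pretty-printed JSON string."""
    return json.dumps(
        [field.to_api_repr() for field in schema],
        indent=2,
        sort_keys=True,
    )


schema_text = schema_as_json_text([StandInField("word", "STRING", "REQUIRED")])
print(schema_text)
```

With a real client this would be called as schema_as_json_text(client.get_table(item).schema), where item is one of the entries from client.list_tables().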
