When I try to get the size of every table using the listTables API of the BigQuery Java client, getNumBytes() returns null. If I call getTable for each table individually, I get the correct value. Is this a known issue, or am I doing something wrong? The following code prints null for numBytes:

Page<Dataset> datasetPage = getAllDatasets("projectId");
if (datasetPage != null) {
    for (Dataset dataset : datasetPage.iterateAll()) {
        for (Table table : dataset.list().iterateAll()) {
            System.out.println(table.getNumBytes());  // Returns null
        }
    }
}

2 Answers

As discussed in this Public Issue Tracker thread, getting null for numBytes and numRows from listTables is the expected behaviour. The BigQuery API treats retrieving numBytes and numRows as an expensive operation, so listTables returns only partial information about each table and leaves these fields null.

As a workaround, call getTable() for each table in a loop to retrieve its full information. I tested the code snippet below and was able to get the size in bytes of every table.

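import com.google.api.gax.paging.Page;
import com.google.cloud.bigquery.BigQuery;
import com.google.cloud.bigquery.BigQueryException;
import com.google.cloud.bigquery.BigQueryOptions;
import com.google.cloud.bigquery.Dataset;
import com.google.cloud.bigquery.Table;

public static void getAllTableSize(String projectId) {
    try {
        BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
        Page<Dataset> datasetPage = bigquery.listDatasets(projectId);
        if (datasetPage != null) {
            for (Dataset dataset : datasetPage.iterateAll()) {
                // dataset.list() returns only partial table metadata (numBytes is null).
                for (Table table : dataset.list().iterateAll()) {
                    // Fetch the full metadata for this table with an individual call.
                    Table fullTable = bigquery.getTable(table.getTableId());
                    if (fullTable == null) {
                        continue;  // The table no longer exists.
                    }
                    String tableName = fullTable.getTableId().getTable();
                    Long tableSize = fullTable.getNumBytes();
                    System.out.println("Table Name: " + tableName + "  Table Size: " + tableSize);
                }
            }
        }
    } catch (BigQueryException e) {
        System.out.println("Error occurred: " + e.toString());
    }
}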
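
Because this workaround makes one getTable() request per table, it can be slow for projects with many tables. One possible mitigation is to issue the lookups concurrently. The following is only a minimal sketch using the standard java.util.concurrent classes: the class ParallelTableSizes, the method fetchSizes, and the fetchSize parameter are hypothetical names introduced here for illustration and are not part of the BigQuery client. fetchSize stands in for whatever per-table lookup you use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

// Hypothetical helper, not part of the BigQuery client library.
final class ParallelTableSizes {

    // Runs fetchSize for every id on a fixed-size thread pool and returns
    // the results in the same order as the input list.
    // fetchSize is an assumed stand-in for a per-table lookup, for example
    // id -> bigquery.getTable(id).getNumBytes().
    static <K> Map<K, Long> fetchSizes(List<K> tableIds,
                                       Function<K, Long> fetchSize,
                                       int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            // Submit all lookups first so they can run concurrently.
            List<CompletableFuture<Long>> futures = new ArrayList<>();
            for (K id : tableIds) {
                futures.add(CompletableFuture.supplyAsync(() -> fetchSize.apply(id), pool));
            }
            // Collect the results; join() rethrows a lookup failure
            // wrapped in an unchecked CompletionException.
            Map<K, Long> sizes = new LinkedHashMap<>();
            for (int i = 0; i < tableIds.size(); i++) {
                sizes.put(tableIds.get(i), futures.get(i).join());
            }
            return sizes;
        } finally {
            pool.shutdown();
        }
    }
}
```

With the real client, the ids could be the TableId values collected from dataset.list(), and fetchSize could wrap bigquery.getTable(id).getNumBytes() with a null check for tables that no longer exist. This is an untested assumption about how the pieces fit together. The total number of requests stays the same, so API quotas and rate limits still apply, and the thread count should be kept modest.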

3 Comments

Yes, an individual getTable call returns all the information, as I mentioned in the question itself, but my main concern is why it is not available from listTables. @Vishal K
The reason I am asking is that calling getTable individually for thousands of tables is time-consuming.
Hi @SachinJanani, please check my updated answer. This is the same issue as the one discussed in the Google Issue Tracker. You can also star the issue to receive automatic updates and give it traction by referring to this link.

To answer my own question: the listTables API is designed to return only partial information. This is mentioned in the code documentation: https://github.com/googleapis/java-bigquery/blob/dfa15e5ca08a3227f015a389c4c08732178a73e7/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/spi/v2/BigQueryRpc.java#L155
