BigQuery - can't list tables


#1

Hi,

I’ve upgraded to version 4, and suddenly can’t see the tables.
The user is an admin.

Thanks,
Gal


#2

Can you make sure that the Load Schema option is checked in the data source settings?

If it is, click on the schema refresh button (next to search) and check the API logs at the same time for errors.


#3

@arikfr, from the logs I see it reads the tables, and then:
[2018-03-15 10:47:54 +0000] [14036] [CRITICAL] WORKER TIMEOUT (pid:8181)
[2018-03-15 10:47:54 +0000] [8181] [INFO] Worker exiting (pid: 8181)
[2018-03-15 10:47:54 +0000] [11472] [INFO] Booting worker with pid: 11472

We have hundreds of tables.


#4

How can we make it work?


#5

Can you check how many tables/columns you have?


#6

OK, the issue is definitely on our side.
I see we have 14,000 tables, and we need to drop 99% of them.
There's a process where someone creates a "temp" table but never drops it…

Thanks for your help!


#7

@arikfr Shouldn't it work anyway, even with 14k tables? Via https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list listing works without any timeout issue.
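A back-of-envelope sketch may explain the difference: `tables.list` returns many tables per request, while Redash's `get_schema` also issues one `tables().get` per table, so the request count grows with the table count. The figures below are assumptions for illustration only (14,000 tables, 1,000 entries per list page, a 50 ms round-trip per request), not measurements:

```python
import math

# Assumed figures, for illustration only.
TABLES = 14_000       # table count reported in this thread
PAGE_SIZE = 1_000     # assumed tables.list page size (maxResults)
RTT_SECONDS = 0.05    # assumed HTTP round-trip time per request

# tables.list alone: one request per page of results.
list_requests = math.ceil(TABLES / PAGE_SIZE)

# get_schema: one tables().get per table, on top of the listing.
get_requests = TABLES + list_requests

print(list_requests, list_requests * RTT_SECONDS)  # a handful of requests, well under a second
print(get_requests, get_requests * RTT_SECONDS)    # thousands of requests, minutes of wall time
```

Under these assumptions the per-table `get` calls alone would take several hundred seconds, far beyond a typical gunicorn worker timeout of 30 seconds, which would match the `WORKER TIMEOUT` lines in the log above.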


#8

We dropped half of the tables but it still doesn’t work.


#9

@arikfr, I even changed big_query.py to skip this problematic dataset, and I still get the timeout.
I believe it's a bug: the logs still show the timeout, and now it's no longer happening while reading that dataset.

def get_schema(self, get_stats=False):
    if not self.configuration.get('loadSchema', False):
        return []

    service = self._get_bigquery_service()
    project_id = self._get_project_id()
    datasets = service.datasets().list(projectId=project_id).execute()
    schema = []
    for dataset in datasets.get('datasets', []):
        dataset_id = dataset['datasetReference']['datasetId']
        if dataset_id == 'backend_tmp_tables':
            continue
        tables = service.tables().list(projectId=project_id, datasetId=dataset_id).execute()
        for table in tables.get('tables', []):
            table_data = service.tables().get(projectId=project_id, datasetId=dataset_id, tableId=table['tableReference']['tableId']).execute()

            columns = []
            for column in table_data['schema']['fields']:
                if column['type'] == 'RECORD':
                    for field in column['fields']:
                        columns.append(u"{}.{}".format(column['name'], field['name']))
                else:
                    columns.append(column['name'])
            schema.append({'name': table_data['id'], 'columns': columns})

    return schema
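Two things stand out in the loop above: it calls `tables().list` once per dataset without following `nextPageToken` (so only the first page of tables is seen), and it makes a separate `tables().get` round-trip for every table. A paginated listing helper might look like the sketch below; it assumes the same google-api-python-client BigQuery v2 `service` object built in `big_query.py`, and the helper name `list_all_tables` is hypothetical:

```python
def list_all_tables(service, project_id, dataset_id):
    """Yield every table in a dataset, following nextPageToken.

    `service` is assumed to be a google-api-python-client BigQuery v2
    service object, as built elsewhere in big_query.py.
    """
    request = service.tables().list(projectId=project_id, datasetId=dataset_id)
    while request is not None:
        response = request.execute()
        for table in response.get('tables', []):
            yield table
        # list_next() builds the request for the next page, and returns
        # None once the last page has been consumed.
        request = service.tables().list_next(request, response)
```

This only addresses the listing side; fetching column schemas still needs a `tables().get` (or an equivalent bulk source), so it is a sketch of the pagination fix, not a drop-in replacement for `get_schema`.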

The log: