BigQuery - can't list tables


#1

Hi,

I’ve upgraded to version 4, and suddenly can’t see the tables.
The user is an admin.

Thanks,
Gal


#2

Can you make sure that the Load Schema option is checked in the data source settings?

If it is, click on the schema refresh button (next to search) and check the API logs at the same time for errors.


#3

@arikfr, from the logs I see it reads the tables, and then:
[2018-03-15 10:47:54 +0000] [14036] [CRITICAL] WORKER TIMEOUT (pid:8181)
[2018-03-15 10:47:54 +0000] [8181] [INFO] Worker exiting (pid: 8181)
[2018-03-15 10:47:54 +0000] [11472] [INFO] Booting worker with pid: 11472

We have hundreds of tables.


#4

How can we make it work?


#5

Can you check how many tables/columns you have?


#6

OK, the issue is definitely on our side.
I see we have 14,000 tables, and we need to drop 99% of them.
There's a process where someone creates a "temp" table but never drops it…

Thanks for your help!


#7

@arikfr Shouldn't it work anyway, even with 14k tables? Via https://cloud.google.com/bigquery/docs/reference/rest/v2/tables/list listing works without any timeout issue.
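A back-of-envelope sketch may explain the difference: `tables.list` returns many tables per request, while Redash's `get_schema` also issues one `tables().get` per table, so the request count grows with the table count. The figures below are assumptions for illustration only (14,000 tables, 1,000 entries per list page, a 50 ms round-trip per request), not measurements:

```python
import math

# Assumed figures, for illustration only.
TABLES = 14_000       # table count reported in this thread
PAGE_SIZE = 1_000     # assumed tables.list page size (maxResults)
RTT_SECONDS = 0.05    # assumed HTTP round-trip time per request

# tables.list alone: one request per page of results.
list_requests = math.ceil(TABLES / PAGE_SIZE)

# get_schema: one tables().get per table, on top of the listing.
get_requests = TABLES + list_requests

print(list_requests, list_requests * RTT_SECONDS)  # a handful of requests, well under a second
print(get_requests, get_requests * RTT_SECONDS)    # thousands of requests, minutes of wall time
```

Under these assumptions the per-table `get` calls alone would take several hundred seconds, far beyond a typical gunicorn worker timeout of 30 seconds, which would match the `WORKER TIMEOUT` lines in the log above.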


#8

We dropped half of the tables but it still doesn’t work.


#9

@arikfr, I even changed big_query.py to skip this problematic dataset, and I still get the timeout.
I believe it's a bug: the logs still show the timeout, and now it's no longer happening while reading that dataset.

def get_schema(self, get_stats=False):
    if not self.configuration.get('loadSchema', False):
        return []

    service = self._get_bigquery_service()
    project_id = self._get_project_id()
    datasets = service.datasets().list(projectId=project_id).execute()
    schema = []
    for dataset in datasets.get('datasets', []):
        dataset_id = dataset['datasetReference']['datasetId']
        if dataset_id == 'backend_tmp_tables':
            continue
        tables = service.tables().list(projectId=project_id, datasetId=dataset_id).execute()
        for table in tables.get('tables', []):
            table_data = service.tables().get(projectId=project_id, datasetId=dataset_id, tableId=table['tableReference']['tableId']).execute()

            columns = []
            for column in table_data['schema']['fields']:
                if column['type'] == 'RECORD':
                    for field in column['fields']:
                        columns.append(u"{}.{}".format(column['name'], field['name']))
                else:
                    columns.append(column['name'])
            schema.append({'name': table_data['id'], 'columns': columns})

    return schema
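Two things stand out in the loop above: it calls `tables().list` once per dataset without following `nextPageToken` (so only the first page of tables is seen), and it makes a separate `tables().get` round-trip for every table. A paginated listing helper might look like the sketch below; it assumes the same google-api-python-client BigQuery v2 `service` object built in `big_query.py`, and the helper name `list_all_tables` is hypothetical:

```python
def list_all_tables(service, project_id, dataset_id):
    """Yield every table in a dataset, following nextPageToken.

    `service` is assumed to be a google-api-python-client BigQuery v2
    service object, as built elsewhere in big_query.py.
    """
    request = service.tables().list(projectId=project_id, datasetId=dataset_id)
    while request is not None:
        response = request.execute()
        for table in response.get('tables', []):
            yield table
        # list_next() builds the request for the next page, and returns
        # None once the last page has been consumed.
        request = service.tables().list_next(request, response)
```

This only addresses the listing side; fetching column schemas still needs a `tables().get` (or an equivalent bulk source), so it is a sketch of the pagination fix, not a drop-in replacement for `get_schema`.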

The log: