Scheduled workers in self-hosted Redash is not working

postgres_1          | ERROR:  column dashboards.options does not exist at character 614
postgres_1          | STATEMENT:  SELECT count(*) AS count_1
postgres_1          |   FROM (SELECT dashboards.updated_at AS dashboards_updated_at, dashboards.created_at AS dashboards_created_at, dashboards.id AS dashboards_id, dashboards.version AS dashboards_version, dashboards.org_id AS dashboards_org_id, dashboards.slug AS dashboards_slug, dashboards.name AS dashboards_name, dashboards.user_id AS dashboards_user_id, dashboards.layout AS dashboards_layout, dashboards.dashboard_filters_enabled AS dashboards_dashboard_filters_enabled, dashboards.is_archived AS dashboards_is_archived, dashboards.is_draft AS dashboards_is_draft, dashboards.tags AS dashboards_tags, dashboards.options AS dashboards_options
nginx_1             | 10.0.1.97 - - [07/Dec/2021:12:38:18 +0000] "GET /static/app.101c1284c4a7520ca61d.css.map HTTP/1.1" 200 173089 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.45 Safari/537.36" "37.228.230.71"
postgres_1          |   FROM dashboards
postgres_1          |   WHERE 1 = dashboards.org_id AND dashboards.is_archived = false) AS anon_1
server_1            | [2021-12-07 12:38:18,498][PID:10][ERROR][redash.app] Exception on /api/organization/status [GET]
server_1            | Traceback (most recent call last):
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
server_1            |     cursor, statement, parameters, context
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 580, in do_execute
server_1            |     cursor.execute(statement, parameters)
server_1            | psycopg2.errors.UndefinedColumn: column dashboards.options does not exist
server_1            | LINE 2: ...rds_is_draft, dashboards.tags AS dashboards_tags, dashboards...
server_1            |                                                              ^
server_1            |
server_1            |
server_1            | The above exception was the direct cause of the following exception:
server_1            |
server_1            | Traceback (most recent call last):
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
server_1            |     response = self.full_dispatch_request()
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
server_1            |     rv = self.handle_user_exception(e)
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask_restful/__init__.py", line 269, in error_router
server_1            |     return original_handler(e)
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
server_1            |     reraise(exc_type, exc_value, tb)
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
server_1            |     raise value
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
server_1            |     rv = self.dispatch_request()
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
server_1            |     return self.view_functions[rule.endpoint](**req.view_args)
server_1            |   File "/usr/local/lib/python3.7/site-packages/flask_login/utils.py", line 261, in decorated_view
server_1            |     return func(*args, **kwargs)
server_1            |   File "/app/redash/handlers/organization.py", line 22, in organization_status
server_1            |     models.Dashboard.org == current_org, models.Dashboard.is_archived == False
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3605, in count
server_1            |     return self.from_self(col).scalar()
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3330, in scalar
server_1            |     ret = self.one()
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3300, in one
server_1            |     ret = self.one_or_none()
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3269, in one_or_none
server_1            |     ret = list(self)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3342, in __iter__
server_1            |     return self._execute_and_instances(context)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3367, in _execute_and_instances
server_1            |     result = conn.execute(querycontext.statement, self._params)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
server_1            |     return meth(self, multiparams, params)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
server_1            |     return connection._execute_clauseelement(self, multiparams, params)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
server_1            |     distilled_params,
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1253, in _execute_context
server_1            |     e, statement, parameters, cursor, context
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1473, in _handle_dbapi_exception
server_1            |     util.raise_from_cause(sqlalchemy_exception, exc_info)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 398, in raise_from_cause
server_1            |     reraise(type(exception), exception, tb=exc_tb, cause=cause)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 152, in reraise
server_1            |     raise value.with_traceback(tb)
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1249, in _execute_context
server_1            |     cursor, statement, parameters, context
server_1            |   File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 580, in do_execute
server_1            |     cursor.execute(statement, parameters)
server_1            | sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column dashboards.options does not exist
server_1            | LINE 2: ...rds_is_draft, dashboards.tags AS dashboards_tags, dashboards...
server_1            |                                                              ^
server_1            |
server_1            | [SQL: SELECT count(*) AS count_1
server_1            | FROM (SELECT dashboards.updated_at AS dashboards_updated_at, dashboards.created_at AS dashboards_created_at, dashboards.id AS dashboards_id, dashboards.version AS dashboards_version, dashboards.org_id AS dashboards_org_id, dashboards.slug AS dashboards_slug, dashboards.name AS dashboards_name, dashboards.user_id AS dashboards_user_id, dashboards.layout AS dashboards_layout, dashboards.dashboard_filters_enabled AS dashboards_dashboard_filters_enabled, dashboards.is_archived AS dashboards_is_archived, dashboards.is_draft AS dashboards_is_draft, dashboards.tags AS dashboards_tags, dashboards.options AS dashboards_options
server_1            | FROM dashboards
server_1            | WHERE %(param_1)s = dashboards.org_id AND dashboards.is_archived = false) AS anon_1]
server_1            | [parameters: {'param_1': 1}]
server_1            | (Background on this error at: http://sqlalche.me/e/f405)

Found the issue I think

I bumped into a user that had a similar issue and I managed to solve from his solution.

Still getting an issue with scheduled queries not executing

Please drop a link to the solution that worked for you (for future searchers).

Awesome, this is headway.

For debugging, here’s the flow for scheduled queries:

  1. You add a schedule for a query
  2. Every minute, the scheduler service in your Docker compose should run a subroutine where it checks which queries need to be executed based on their schedules. Those that need to be executed now are enqueued as new jobs on the scheduled queue in Redis.
  3. The scheduled_worker service in your Docker compose should be monitoring the scheduled queue in Redis for new execution jobs. If a new job appears, one of the threads of scheduled_worker will pick up the job and execute it. If no threads are available to execute the job presently, it waits in queue until a thread picks it up or it’s purged automatically by Redis (defaults to 1000 seconds from the time the job was enqueued).
  4. When the scheduled_worker thread finishes executing the task, it stores the query result in the postgres metadata database. The database returns a query_result_id to the worker thread. The worker thread then updates the job status code in Redis to indicate the query result is ready and includes its query_result_id. The worker thread also updates the latest_query_data_id on the query object.

Debugging scheduled executions is a pretty straightforward process of going through each of these steps to figure out which one isn’t happening.

Note: for ease of use it’s best to have a very simple debugging query (like SELECT 1) that you set up with a 1 minute refresh schedule. If you can, also disable the execution schedules for all other queries.

Debugging Step 1:

Make sure you actually added a schedule to the query (I know it’s obvious, but hey, we all make mistakes)

Debugging Step 2:

Confirm that your scheduler service is running with docker-compose ps. Once per minute you should see it enqueuing a new job for your debug query. It will emit a log message visible with docker-compose logs -f --tail 100 scheduler whenever it enqueues a new job.

Then check Redis to see that the new job actually exists in the scheduled queue.

Debugging Step 3:

Confirm that your scheduled_worker is running with docker-compose ps and that it’s monitoring the scheduled queue in Redis (this is controlled by the environment portion of your compose file).

When a thread of scheduled_worker picks up a job from the scheduled queue, it will emit a log that you can see with docker-compose logs -f --tail 100 scheduled_worker.

Check redis to see that the job status code is changed from 1 (in queue) to 2 (executing).


95% of the time the problem is evident at this point. I won’t go into debugging step 4 right now as it’s unlikely this is a problem.

Give this a try and let us know which step isn’t happening.

1 Like

Here is the solution link that worked for to solve this issue that I was encountering: Solution

1 Like

How do I check Redis to see of there’s a new job there, like what are the commands to address that ?


Can’t seem to execute this command

image
Not sure if celery id suppose to be there ? Isn’t it RQ now ?

After doing some other research I noticed that we don’t need that celery queue. How can I replace it with RQ ?

My mistake: I put the CLI commands in the wrong order. I’ve updated the post above. The -f --tail 100 must go directly after the logs command. The name of the service comes last.

1 Like

Did you follow the upgrade instructions on the releases page? It should cover the changes to make to your docker-compose.yml file.

Yes. I got rid of celery and removed:

 #  environment:
  #    QUEUES: "scheduled_queries,schemas"
  #    WORKERS_COUNT: 2

That seemed to have decreased the amount of queues that was in the queued jobs in Redash. My question is do I have to add in the RQ scheduler ? I know there was a user that showed his docker file and under scheduler he only had:

  <<: *redash-service
    command: scheduler

That’s right. With the move to RQ we didn’t need the QUEUES or WORKERS_COUNT environment variables. This service just runs the scheduler command in a loop :slight_smile:


Is this necessary to in my dockerfile ? The contents of the docker file is as follows:

version: "2"
x-redash-service: &redash-service
  image: redash/redash:10.1.0.b50633
  depends_on:
    - postgres
    - redis
  env_file: /opt/redash/env
  restart: always
services:
  server:
    <<: *redash-service
    command: server
    ports:
      - "5000:5000"
    environment:
      REDASH_WEB_WORKERS: 4
  scheduler:
    <<: *redash-service
    command: scheduler
  scheduled_worker:
    <<: *redash-service
    command: worker
  #  environment:
  #    QUEUES: "scheduled_queries,schemas"
  #    WORKERS_COUNT: 2
  adhoc_worker:
    <<: *redash-service
    command: worker
    environment:
      QUEUES: "queries"
      WORKERS_COUNT: 2
  redis:
    image: redis:5.0-alpine
    restart: always
  postgres:
    image: postgres:9.6-alpine
    env_file: /opt/redash/env
    volumes:
      - /opt/redash/postgres-data:/var/lib/postgresql/data
    restart: always
  nginx:
    image: redash/nginx:latest
    ports:
      - "80:80"
    depends_on:
      - server
    links:
      - server:redash
    restart: always

Is this necessary to in my dockerfile ?

Aye, it’s in the instructions because without it the general jobs, such as cleaning up old query results, will never run.

Also :eyes: You commented out the QUEUES and WORKER COUNT for the scheduled worker. This means your scheduled queries may not execute as expected.

I’d strongly recommend using the default docker-compose.yml file unless you have a strong reason to alter it.

I thought for V10 we didn’t need to use that ? When I commented that out I was looking in the scheduler_worker logs and it seemed to be allocating jobs for the threads to execute. Shall I put it back in ?

I thought for V10 we didn’t need to use that ?

Did you find this somewhere in the docs?

The default guidance is to include it. If you don’t, your scheduled_worker is not actually restricted to scheduled tasks. It will pick up adhoc executions too. Which is fine if that’s what you want, but it’s not obvious.

The reason for the default is to separate the performance of scheduled executions from adhoc executions. So that if your adhoc queue grows it won’t affect scheduled executions or vice versa.

Found it here

Could be wrong

Those instructions don’t mention removing the QUEUES for the scheduled_worker but for the scheduler.

[edit] FWIW these same instructions are duplicated on the github release page with screenshots showing exactly what to add and remove :100: