Redash docker services questions

ramziyassine · October 18, 2021, 10:27pm

Issue Summary

We are working on deploying a production-level Redash v10 in AWS ECS. Many topics on this site have helped (for example: Redash on AWS ECS).

My assumption

First, I wanted to understand the purpose of each service listed in the docker compose file. I created this diagram by looking at the code in Redash’s GitHub repository.

My questions

Is my diagram correct? Any help on this is appreciated.
The role of the scheduler is still not clear to me:
- Is it creating the rq-scheduler?
- Is it scheduling scheduled queries, or is the server doing that?
- Does it hit the RDS database?
We want to understand this so that we can reason about how many ECS tasks, CPU and memory limits, multiple AZs,…

Any comments are appreciated.

jesse · October 19, 2021, 3:58pm

Great question. We should probably publish something like this in the docs.

You can tell a lot about the services by looking at the command they run once started. Here I’m running through the docker-compose.yml file in getredash/redash but the same would apply if you are modeling off of setup.sh.

`server`

Handles HTTP requests to the API and the front-end. It spends most of its time waiting for HTTP traffic.

`scheduler`

Enqueues periodic jobs in Redis. It checks for scheduled queries that are due, data source schemas that need refreshing, old query results that need cleaning up, and any other periodic job specified here. The scheduler runs this routine twice a minute, indefinitely.
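
To make the scheduler's role concrete, here is a minimal sketch of that kind of loop. It is not Redash's actual code: the `PeriodicJob` class, the `tick` function, the job names, and the intervals are all hypothetical, and a plain `deque` stands in for Redis. The point is only that the scheduler decides what is due and enqueues it; it does not run the jobs itself.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PeriodicJob:
    name: str              # hypothetical job name
    interval: float        # seconds between runs (illustrative values)
    next_run: float = 0.0  # time at which the job is next due


def tick(jobs, queue, now):
    """Enqueue every job that is due at `now`, then reschedule it."""
    enqueued = []
    for job in jobs:
        if job.next_run <= now:
            queue.append(job.name)  # stand-in for enqueuing a job in Redis
            job.next_run = now + job.interval
            enqueued.append(job.name)
    return enqueued


jobs = [
    PeriodicJob("refresh_queries", interval=30),
    PeriodicJob("refresh_schemas", interval=1800),
    PeriodicJob("cleanup_query_results", interval=300),
]
queue = deque()

# Simulate three passes of the routine, 30 seconds apart (twice a minute).
for now in (0, 30, 60):
    tick(jobs, queue, now)
```

In this simulation every job is due on the first pass, and only the 30-second job is due again on the next two. The workers, described next, are what actually take the entries off the queue.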

`worker`

Pulls job definitions from Redis and executes them. These can be query executions, schema refreshes, clearing out old results, or anything else. If you see adhoc_worker or scheduled_worker, these are actually duplicates of the worker spec. They all run the same command. The difference is that adhoc_worker only looks for jobs in the adhoc queue in Redis, while scheduled_worker only looks for jobs in the scheduled queue. This is helpful because if your scheduled tasks begin to bottleneck, it won’t affect users running queries.
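
The queue separation can be illustrated with a small simulation. This is a sketch under assumptions, not Redash's worker code: the `take_next` helper and the queue names `adhoc` and `scheduled` are illustrative, and in-memory deques stand in for the Redis queues.

```python
from collections import deque

# One user query waiting behind nothing, and a large scheduled backlog.
queues = {
    "adhoc": deque(["user_query_1"]),
    "scheduled": deque(f"scheduled_query_{i}" for i in range(100)),
}


def take_next(listen_to, queues):
    """Return the next job from the first non-empty queue this worker listens to."""
    for name in listen_to:
        if queues[name]:
            return queues[name].popleft()
    return None  # nothing to do on any of this worker's queues


# Same function for both workers; only the list of queues differs.
adhoc_job = take_next(["adhoc"], queues)
scheduled_job = take_next(["scheduled"], queues)
```

Because the ad hoc worker never looks at the scheduled queue, the user's query is picked up immediately even though 100 scheduled jobs are waiting. For sizing, this suggests the two worker types can be scaled independently.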

`redis`

Runs the Redis instance that Redash uses for message queuing.

`postgres`

Runs the Postgres database where Redash preserves its state.

`email`

An image used for sending emails (if configured).

`nginx`

We use this as a proxy in front of server because it’s an easy way to configure HTTPS.