Sequence diagrams for SQL queries

KeithYJohnson · August 23, 2019, 6:47pm

Hello,

I was wondering if there are any architecture diagrams or sequence diagrams that exist for when a user types in a SQL query into the redash UI. In this case lets say there’s the redis cache on its own host, as well as one datasource on its own host. Has anyone created something like this? I’m curious exactly whats sent over the wire and in which order between all the components: nginx, web servers, redis cache, works, datastore.

Thanks!

jesse · August 27, 2019, 7:45pm

We don’t have anything formal, no. But if we did, it would probably help bring new developers up to speed. The architecture is pretty straightforward. Are you asking strictly out of curiosity? Or do you need a more rigorous write-up for internal security purposes?

One item to note: Redash doesn’t use Redis for caching. Just as a message queue. The query result cache is held in Postgres.

KeithYJohnson · August 28, 2019, 3:23pm

Hey Jesse,

Thanks for the clarification about Redis! And your intuition is correct! I’m looking for something very-detailed for an internal security review from a use case perspective. For example, if someone runs an ad-hoc query through the Redash UI, for each network hop amongst the Redash stack, what gets sent to which component, in what order, and ensures its encrypted in transit/rest. I’ll soon stand up a local 7.x stack and follow the trace so if there’s anything informal you could share that you think would be helpful, please do! Thanks!

arikfr · August 28, 2019, 3:45pm

The flow is something along the lines of:

UI sends query request to the API (can be in form of the query text in case of adhoc, or saved query id).
A job gets enqueued.
A Celery worker picks up the job, send the query to the database, gets back results and persists them in our Postgres database.
The UI will request the result and receive it from the API, which fetches it from Postgres.

So at transit it can be:

Between the worker and the database (if you use SSL, it’s encrypted).
Between worker and Postgres, and between Postgres and the API (if you use SSL, it’s encrypted).
Between the API and the browser (SSL…).

At rest:

In the database.
If you use any proxy in front of Redash it might cache the results, but most configurations don’t have something like this.

If you end up producing a diagram out of this, feel free to share

KeithYJohnson · August 28, 2019, 3:57pm

This is very helpful, thank you Arik! I’ll share the diagram if/when I get it done

▼Categories

▼Tags

Sequence diagrams for SQL queries