Hey everyone,

I’m trying to connect my self-hosted Redash instance (on AWS EC2) to a MySQL Amazon RDS database that sits in a private VPC behind a bastion host (the RDS database and the EC2 instance running Redash are in separate networks).

I’ve read the instructions here:

The part that was confusing is that it asks you to download the Redash public key and put it in the home directory of your bastion user. For self-hosted Redash, should I instead generate my own private/public key pair, SSH into the EC2 server hosting Redash, and set that private key as the ssh_tunnel_auth value described here: Run queries through ad-hoc SSH tunnels by rauchy · Pull Request #4797 · getredash/redash · GitHub?
What’s the best way to do this?

Then I take the corresponding public key I generated (as opposed to the Redash public key) and put it in the home of my user for the bastion?

Thank you!


You have the right idea.

The doc you linked is specific to customers of app.redash.io. For a self-hosted instance you need your own public/private key pair: add the public key to the authorized keys on your bastion, point the Python file you linked at the private key, and configure an ssh_tunnel object on the data source using the REST API.
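For the last step, something like the following might work. This is only a sketch: the `ssh_tunnel` field names are my reading of PR #4797, and the endpoint/auth-header shape follows the general Redash REST API, so verify both against your Redash version before relying on it.

```python
import json
import urllib.request


def build_options(existing_options, ssh_host, ssh_username, ssh_port=22):
    """Return the data source options with an ssh_tunnel block added.

    Field names ("ssh_host", "ssh_port", "ssh_username") are assumed
    from PR #4797 -- double-check them for your Redash version.
    """
    options = dict(existing_options)
    options["ssh_tunnel"] = {
        "ssh_host": ssh_host,
        "ssh_port": ssh_port,
        "ssh_username": ssh_username,
    }
    return options


def update_data_source(base_url, api_key, ds_id, payload):
    """POST an updated data source definition to the Redash REST API.

    The payload should be the full definition (name/type/options),
    i.e. what a GET of the same endpoint returns, with options rebuilt
    via build_options above.
    """
    req = urllib.request.Request(
        f"{base_url}/api/data_sources/{ds_id}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    return urllib.request.urlopen(req)


# Example of the options you'd send (hypothetical hosts, nothing sent here):
opts = build_options(
    {"host": "my-rds-endpoint", "port": 3306},
    ssh_host="your_bastion_host",
    ssh_username="ec2-user",
)
```

The usual workflow would be: GET the data source, pass its `options` through `build_options`, then POST the whole definition back with `update_data_source`.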

For a DigitalOcean droplet using the Redash Marketplace app, is it possible to modify the file redash/settings/dynamic_settings.py on the droplet itself? Where would that file be located? I’d prefer not to build my own Redash just to get the SSH tunnel feature. Thanks

Any comment on using SSH tunnel with an image?

Is there a way to do this with the AWS image or Digital Ocean’s marketplace App for Redash?

Hey, sorry for my late response here! Yes, you can totally edit the file on DigitalOcean, although it is probably more hassle than it should be (this is an area for improvement). I’ve been using sed, since that’s the only built-in utility within the container itself.

I’ll put together a guide of how to do it in the next couple days. Until then, you are welcome to message me directly through the forum.

That’s great to hear. I’m eager to get it set up to avoid a last minute rush. Which container would that change be part of?

I haven’t done this before, but I think only the worker containers really need the change. They are the containers that actually connect and run queries. The others (server, scheduler, nginx, redis, postgres) never communicate outside the local network.

Possibly a script that copies the Python code into the container. Although it would be better to have that as a mapped volume on the container, so it doesn’t get removed if the container is re-created. Even better: could the source be modified so that a default key location on a mapped volume is used? That’d be best, I think.

You can of course modify anything on the image itself. We’ll need to consider how we can update the defaults going forward (we’re getting ready to build the V10 images, so this is topical :smiley: )

I recall seeing this issue. [Feature] Tunnel support in default Docker image · Issue #2013 · getredash/redash · GitHub. Perhaps that can be re-opened?

We won’t reopen that issue because we’re not going to make ssh tunnels the default behavior. But we could certainly use some documentation for setting one up. I’d love to review a PR adding those docs (along with many others :sweat_smile:)


One working approach for doing this with Docker-based installs is to add the keys and customised Python into the containers using volumes:

    volumes:
      - /some/path/to/the/ssh/keys:/keys:ro
      - /opt/redash/overrides/dynamic_settings.py:/app/redash/settings/dynamic_settings.py:ro

That /opt/redash/overrides/ directory is something you’d need to manually create, then put the modified dynamic_settings.py in. The modified dynamic_settings.py has an updated ssh_tunnel_auth() function:

def ssh_tunnel_auth():
    """
    To enable data source connections via SSH tunnels, provide your SSH
    authentication key here. The 'ssh_pkey' entry should be either a string
    path to your **private** key file (the public key is extracted from it),
    or a `paramiko.pkey.PKey` instance holding the key itself.
    """
    return {
        'ssh_pkey': '/keys/id_rsa'
    }

Note that the /keys/ path there matches up with the /keys directory given in the volume clause above. So, the /keys/id_rsa file is really just an id_rsa file that needs to exist in your actual keys directory. The file needs to be readable by the ubuntu user inside the container too (uid 1000), which is probably easiest to do by chown-ing it, e.g. chown 1000: id_rsa.
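If you want to sanity-check that from inside the container, a small helper like this (hypothetical, not part of Redash) can tell you whether uid 1000 would be able to read the key file:

```python
import os
import stat

CONTAINER_UID = 1000  # the "ubuntu" user inside the Redash containers


def readable_by(path, uid):
    """Rough check: can the given uid read this file?

    Only the owner and "other" permission bits are inspected (group
    membership is ignored for simplicity), which is enough for a quick
    sanity check on an SSH key mounted into a container.
    """
    st = os.stat(path)
    if st.st_uid == uid:
        return bool(st.st_mode & stat.S_IRUSR)
    return bool(st.st_mode & stat.S_IROTH)


# e.g. run inside the container: readable_by("/keys/id_rsa", CONTAINER_UID)
```

SSH itself also refuses keys whose permissions are too open, so a key that is `chown`ed to uid 1000 with mode 0600 is the safe combination.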

As Jesse mentions above, only the scheduled_worker, adhoc_worker, and worker containers need the volume piece added, and they can all use the exact same keys.

eg:

  scheduled_worker:
    <<: *redash-service
    command: worker
    environment:
      QUEUES: "scheduled_queries,schemas"
      WORKERS_COUNT: some_number_here
    volumes:
      - /opt/redash/keys:/keys:ro
      - /opt/redash/overrides/dynamic_settings.py:/app/redash/settings/dynamic_settings.py:ro
  adhoc_worker:
    <<: *redash-service
    command: worker
    environment:
      QUEUES: "queries"
      WORKERS_COUNT: some_number_here
    volumes:
      - /opt/redash/keys:/keys:ro
      - /opt/redash/overrides/dynamic_settings.py:/app/redash/settings/dynamic_settings.py:ro
  worker:
    <<: *redash-service
    command: worker
    environment:
      QUEUES: "periodic emails default"
      WORKERS_COUNT: some_number_here
    volumes:
      - /opt/redash/keys:/keys:ro
      - /opt/redash/overrides/dynamic_settings.py:/app/redash/settings/dynamic_settings.py:ro

There’s also another approach - using persistent SSH tunnels - which doesn’t require modified Python scripts; instead the SSH tunnel is set up externally to Redash, e.g. using a container to manage the tunnel.
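For reference, a persistent tunnel can be run as a sidecar service next to the Redash containers. This is only a sketch: the image name, key path, and host names are all placeholders you’d swap for your own, and the flags shown are plain ssh options that an autossh-style wrapper image would typically pass through.

```yaml
  ssh_tunnel:
    image: some-autossh-image   # placeholder: any image wrapping ssh/autossh
    restart: always
    command: >
      -N
      -o ServerAliveInterval=30
      -o StrictHostKeyChecking=accept-new
      -i /keys/id_rsa
      -L 0.0.0.0:3306:your-rds-endpoint:3306
      someuser@your_bastion_host
    volumes:
      - /opt/redash/keys:/keys:ro
```

The Redash data source then points at the `ssh_tunnel` container on port 3306 instead of the RDS host directly, and no dynamic_settings.py changes are needed.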

Both ways seem to work fine, but have different strengths and weaknesses:

  • Redash managed SSH tunnel

    • Slow to run queries due to tunnel creation each time
      • For long running queries, this extra time isn’t really noticeable
    • But, doesn’t really need separate monitoring
  • Persistent SSH tunnel

    • Quick to run queries, as the tunnel is already existing and ready to go
      • Better for fast queries, where faster GUI responsiveness is noticeable
    • Needs separate monitoring
    • Each SSH tunnel needs manually setting up/configuring

Hopefully that helps. :smile:


Beauty! Setting up the volume mapping looks great! I am not familiar enough with open source Redash, but it appears the overrides folder is a mechanism built into Redash to allow settings customization, right?

Almost. Docker (which Redash uses for management) allows sharing files and folders from the host server with its containers. So, in this case, it’s a way of both making SSH keys available to the worker containers and persistently replacing specific files inside those containers.

Without an override like that, people would need to build their own custom Redash Docker images (possible, but a bunch of effort). Or they’d need to manually log into their Docker containers and update files inside them, which would then lose the changes any time the container is rebuilt (which can be pretty often, depending on what’s happening).

Does that help? :slight_smile:

That helps tremendously! Is there a way to see that the key is getting picked up correctly? A log file or similar?

I configured the volume mapping with the key and placed the public key on the SSH tunnel host in the same manner as the hosted Redash configuration. I’m getting an SSH negotiation error, most likely not picking up the key or something along those lines. Any idea how to troubleshoot on the self-hosted Redash with Docker implementation?

    {
      "message": "could not send SSL negotiation packet: Resource temporarily unavailable\n",
      "ok": false
    }

Hmmm, if you manually run SSH (using that key) from the host your Redash is on, does the connection succeed? eg:

$ ssh -i path_to_key someuser@your_bastion_host

Note that a simple ssh like above will try creating a remote login session for your user (eg in order to run commands remotely). That capability can be disabled on the bastion server, and isn’t needed for tunnels. So, it’s very possible you’ll connect successfully when testing, then ssh will just close the connection without further message.

The thing to look for is whether the attempted connection times out, generates an error, or something similar. A timeout or “No route to host” will generally mean there’s a network layer problem that needs fixing (maybe a firewall needs updating?), whereas other things are more obvious. E.g. if ssh prompts for acceptance of a host key, then the connection is getting to the server, and it might be a public key problem after all.
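One other quick way to separate network problems from key problems is a raw TCP check against the bastion’s SSH port. A minimal sketch (the host name is a placeholder for your bastion):

```python
import socket


def port_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout.

    This only proves network reachability; it says nothing about whether
    SSH authentication will succeed.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# e.g. port_open("your_bastion_host", 22)
```

If this returns False, fix routing/security groups first; if True, the problem is likely in the SSH layer (keys, users, permissions).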

So, try the connection, and let us know what happens with it… :slight_smile:

Yes, the SSH connection works.