I am not quite sure how to troubleshoot this, but let me give a bit of background.
We recently rolled out Redash 1.0.3 in ECS, using RDS for the Redash database. Unfortunately, the database was accidentally wiped and we lost all of our snapshots. We hadn’t implemented a manual backup strategy yet. Here is what we did to recover:
- Provision RDS again
- Once provisioned, we created the redash and redash_reader users and a blank redash database
- We used a clean SQL file to create the tables (we ran into issues with the scripts). The SQL file was just a dump from a clean install of the Redash AMI that had our Google organization configured. Only the default admin login had been created.
- We restarted the ECS containers and were up and running again. We then added the data sources and groups again, had users log in, and mapped them to groups.
It was working, but we noticed that every now and then someone logs in using Google OAuth and ends up logged in as another user. They log out, try the Google sign-in again, and are logged in as the correct user. While they are logged in as another user, they get that user’s permissions. You can see why this is bad, and it reflects poorly on how we manage the system. So…
- I remembered that Redis holds some metadata for caching content, so I flushed the entire Redis database and restarted the frontend, but the issue still comes up randomly.
Before the RDS instance was destroyed, this was never an issue. The only thing I haven’t tried is stopping Flask and all the Celery workers and then flushing Redis again. Could Celery be involved in logins at all? My guess is that logins are handled entirely by Flask.
I’m trying to poke through the logs and the database, but I’m not sure what to look for. Logs aren’t very helpful at this time.
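One possible starting point in the database is to check whether more than one user row shares the same email address, since that could plausibly confuse a lookup by email during sign-in. This is only a sketch and not a confirmed cause: it assumes a `users` table with `id` and `email` columns, and the function name and sample rows are made up for illustration.

```python
from collections import defaultdict


def find_shared_emails(rows):
    """Return {normalized_email: [user ids]} for emails that appear on more than one row.

    `rows` is an iterable of (id, email) pairs, for example the result of
    `SELECT id, email FROM users;` (table and column names are assumptions).
    """
    ids_by_email = defaultdict(list)
    for user_id, email in rows:
        # Compare case-insensitively and ignore stray whitespace.
        ids_by_email[email.strip().lower()].append(user_id)
    # Keep only emails that map to two or more user ids.
    return {email: sorted(ids) for email, ids in ids_by_email.items() if len(ids) > 1}


# Hypothetical sample rows, not real data.
sample = [
    (1, "admin@example.com"),
    (2, "Alice@example.com"),
    (3, "alice@example.com "),
]
print(find_shared_emails(sample))
```

An empty result would rule out duplicate emails, which would point the search elsewhere (for example, at how sessions are stored or shared between the containers).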