I wonder if this is the right solution to that problem.
Data visibility is the entire point of Redash. There is an obvious need to completely restrict users from accessing some data (financials, payroll files, health info etc.). But apart from performance concerns, restricting the download size (without also restricting the query result size) is just odd. It sounds like a software solution to a training problem.
Like you mentioned, it won’t stop a determined person from stealing data. It will, however, overcomplicate legitimate use cases, especially since Redash queries that return large numbers of records (10k+) are exactly the ones that should be downloaded to Excel! Results that size are simply too big to be visualised in Redash.
Here are a few other ways to reach the same target:
- Create two data source connections + groups: one for visualising queries and one for making “extracts”. Trusted users have Full Access to both groups. But untrusted users have “View Only” access to the “visualising” group. This way they can view dashboards and small amounts of data but can’t write their own queries.
- If you don’t need extracts for anything, modify your query runner to append a `LIMIT 500` or `LIMIT 2000` at the end of each query.
- Add an option to disable downloads completely, either by query or by group, and potentially add logic that obscures API requests to make it harder to bypass this restriction.
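
For the `LIMIT` option, here is a rough sketch of what the rewriting step could look like. It is not Redash's actual query runner code: `enforce_limit` and `max_rows` are names I made up for illustration, and you would still need to call it from wherever your query runner receives the query text. It assumes a single SQL statement in a dialect that supports a trailing `LIMIT n [OFFSET m]`, so it won't help with SQL Server's `TOP`, multi-statement queries, or queries that end in a comment.

```python
import re

# Matches a trailing "LIMIT n", optionally followed by "OFFSET m", at the end of the query.
_TRAILING_LIMIT = re.compile(r"\blimit\s+(\d+)(\s+offset\s+\d+)?\s*$", re.IGNORECASE)


def enforce_limit(query: str, max_rows: int = 500) -> str:
    """Return the query with its row count capped at max_rows (hypothetical helper)."""
    # Drop surrounding whitespace and a trailing semicolon so LIMIT can be appended.
    stripped = query.strip().rstrip(";").rstrip()

    match = _TRAILING_LIMIT.search(stripped)
    if match is None:
        # No LIMIT at the end of the query: append one.
        return f"{stripped} LIMIT {max_rows}"

    if int(match.group(1)) <= max_rows:
        # The user's own LIMIT is already within the cap: leave it alone.
        return stripped

    # The user's LIMIT is too large: replace it but keep any OFFSET.
    offset = match.group(2) or ""
    return f"{stripped[:match.start()]}LIMIT {max_rows}{offset}"


print(enforce_limit("SELECT * FROM orders;"))
print(enforce_limit("SELECT * FROM orders LIMIT 100000"))
```

Capping an existing `LIMIT` instead of blindly appending a second one matters here, because otherwise anyone could get around the restriction just by writing their own `LIMIT 1000000`. A regex like this is still easy to fool, so treat it as a guard rail rather than a security boundary.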