Why not add a feature for SQL query row limits and download row limits?

Hello everyone, why not add a feature to limit the number of rows a user can download and the number of rows a SQL query returns?

I think this feature would be a great benefit to the query performance and data security of Redash. Thanks!

You can implement a limit on the number of rows returned by a query by including the TOP or LIMIT keyword (depending on your data source) in the query. If you want to make this configurable, you can add a parameter to set the number of rows as well, e.g.

SELECT col1, col2 FROM myTable LIMIT {{ numrows }};

Hi @griffinator76, thanks for your reply. Your advice is good, but it would be better to consider adding a built-in feature for limiting rows.

Can you elaborate on what kind of “function” you’re imagining? A data source level setting for admins? Something users can set themselves?
Different data sources handle query result limiting in different ways, and some don’t support it at all, so what data sources did you have in mind?

The data sources are mainly PostgreSQL and Presto. People could manually set the number of rows displayed without writing a SQL LIMIT themselves. If you know Superset, this feature is like that.

Agree with @griffinator76 that this would be complex to implement across all data sources. But it's an easy change for Postgres and Presto: just modify your pg.py file to append LIMIT 1000 (or whichever limit you prefer) to every query.
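
A minimal sketch of the idea (enforce_row_limit is a hypothetical name for illustration; in practice the change goes inside run_query in redash/query_runner/pg.py, and the exact method body varies by Redash version):

def enforce_row_limit(query, limit=1000):
    # Naive sketch: assumes the query has no trailing semicolon and
    # no LIMIT clause of its own.
    return "%s LIMIT %d" % (query, limit)

enforce_row_limit("SELECT col1, col2 FROM myTable")
# -> 'SELECT col1, col2 FROM myTable LIMIT 1000'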

Hello K4s1m,
Greetings for the day!!!

Your answer is very interesting to me! We have a Redash instance running in AWS and our primary data source is Redshift. I would like to set a default query limit for all user queries. With reference to your answer, I found a pg.py file at /app/redash/query_runner/pg.py in the Docker container with this code:
def run_query(self, query, user):
    connection = self._get_connection()
    _wait(connection, timeout=10)

    cursor = connection.cursor()

    try:
        cursor.execute(query)  # ----> Shall I just concat LIMIT 100 (e.g.) here? Would that work?

Thanks in advance,

Kylash

Yes, that would do it. I like to check for a limit clause first:

if "LIMIT" not in query.upper():
    query = query + " LIMIT 1000"

This way I can override the default in a pinch.
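
Put together, the change would sit just before cursor.execute in run_query. A rough sketch against the pg.py excerpt above (the trailing-semicolon handling is an assumption about your query text, and note the substring check can false-positive if the word LIMIT appears elsewhere in the query, e.g. in a comment):

# Inside run_query, just before the query is executed.
if "LIMIT" not in query.upper():
    # Strip trailing whitespace/semicolons so the appended LIMIT parses.
    query = query.rstrip().rstrip(";") + " LIMIT 1000"
cursor.execute(query)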


Hello K4s1m,
Thanks a ton for your valuable information. I have a ticket in our backlog to test this in the next sprint. It would be great if it works for us.

Maybe one more question, at the next level, on the same topic: is there a way to increase the Postgres DB memory and CPUs in Docker? The reason I'm asking is that people tend to query so much that we get I/O errors, as if there is a memory leak. I assume that because of the huge result datasets from queries, Redash needs to spill the data to disk, which ends up crashing the whole application, and we need to restart the server itself.

Are there any guidelines for fixing this? It would be much appreciated if you could redirect me or share some steps for fixing it.

Last but not least, your timely help and information are a great :relieved: help for us. Thanks a ton once again, and wishing you a great day!