Thoughts on adding support for queries with parameters in query results data source

@ariarijp As you suggested, I’m using the re:dash DB as a data source. So I just added its Postgres DB to my data sources.

@leshik I see. Can you run your second query on the Query Results data source?

@ariarijp omg, I’m so sorry for wasting your time, I just forgot to change the data source in the query editor. It works now with your provided example as well as with my case with GA. I will experiment more and will let you know if I hit some edge case. Thank you!


@leshik Great! Happy Redash-ing! :redash:

@ariarijp I figured out what was breaking my queries. After the PoC patch, it’s not possible to use queries like this one anymore:

select total.school
from query_7 as total;

This query succeeds on current master, but after applying the patch it fails with Error running query: no such column: total.school.

My queries rely on joins that depend on table aliases, so they stopped working.

@leshik Thank you for your feedback!

I understand what is bothering you.

For now, this is one of the limitations of the PoC.

Of course, I always welcome any patch for solving it :slight_smile:

Is it possible to do things like this?

    FROM query_254('{"mindate":'{{ mindate }}', "maxdate":'{{ maxdate }}'}');

This doesn’t work because of the brackets. Am I doing this wrong, or are parameters inside parameters not supported?

Hi Arikfr,

First, let me start by saying that Redash is awesome; thank you for open-sourcing it and, most of all, thank you for engaging so actively with the community that has sprung up around it!

As for adding support for queries with parameters, I believe that the best way to implement it would be to lean on sqlite’s native support for user-defined table-returning functions.

These can be implemented in Python with the help of either the sqlite-vtfunc module or Peewee’s TableFunction class.

(For the rest of this discussion, I’m going to call the queries pulled from other data sources “source queries” and the query that runs on sqlite over their results the “sink query”.)

Implementing query results as table functions brings ancillary benefits:

  • No need to invent new syntax. Passing parameters declared in the sink query to a source query is simply a matter of doing “select * from query_175({{parameter}})”, whereas passing a correlated value from the sink query into the source query can be achieved with something like:

    select record.title, (select count(*) from query_175(record.id)) from record

  • The source queries wouldn’t need to be executed completely before running the sink query. On the contrary, the sink query actually runs first, and the source queries only need to produce as much data as the sink query consumes, allowing the queries to run in parallel.

    • This solves a pet peeve of mine, where the source queries are executed to completion only for the sink query to fail with a syntax error over something as simple as a missing comma.

Since these functions need to be pre-declared before running the query in sqlite, this would require parsing the query, as is currently done, to locate which existing queries should be declared as table functions.

Alternatively, the data source could require configuration of which source queries should be available to sink queries (and perhaps even their names!), which would also help to serve as a sort of access control. This would allow removing all code that currently parses sink queries to locate the source queries.

Using JSON for argument passing, as suggested by @ariarijp, is also a possibility; it could help map arguments passed in the sink query to parameters declared in the source queries, which would otherwise require some convention about parameter order.
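As a rough illustration of why JSON helps here, named keys can bind directly onto the source query’s placeholders with no ordering convention. The helper name and the {{ name }} placeholder handling below are assumptions for the sketch, not Redash’s actual implementation:

```python
# Sketch only: mapping a JSON argument string onto named placeholders in a
# source query so arguments bind by name rather than by position. The
# helper name and placeholder handling are illustrative assumptions.
import json


def apply_named_params(query_text, param_json):
    params = json.loads(param_json or '{}')
    for name, value in params.items():
        # Each JSON key maps directly onto the matching placeholder.
        query_text = query_text.replace('{{ %s }}' % name, str(value))
    return query_text
```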

These are my thoughts on the matter.

Again, thank you for engaging with the community on this matter!

Hey all,
I’ve been following this thread with great interest. Is there any progress or status update on implementing this feature in an upcoming Redash (beta) version, @arikfr?

I’m afraid I will run into the same problems as @leshik if I apply the patch that @ariarijp posted…

I explored this option in the past, but at the time I think it needed Cython to run. I gave it another try now, and while it didn’t work with Ubuntu’s Docker image, it did work with the python:2 Docker image (which @jezdez started porting our base image to anyway).

Creating a table function will definitely be much better than what we do now.

Unless I’m missing something, we can create a generic query_result table function that takes a query id (and optional parameters) instead of creating a query-specific function. We will probably need to support the query_<id> format for some time for backward compatibility, but we can implement that using our current method.

Taking another look, it seems TableFunction needs to declare its parameters at definition time. So maybe we should create a specific function for each query? :thinking:


Either you declare each source query, perhaps according to a configuration on the data source as I suggested earlier, or you declare a single generic query_result() as you suggested. That function would accept two parameters:

  • The query id
  • A json string parameter as suggested by @ariarijp

In either case, for backward compatibility, you can do a simple regex replace on old queries, trading query_xx for query_result(xx, {}) (or query_xx()) before executing the query with sqlite.
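The regex replace mentioned above might look something like this minimal sketch, which trades legacy query_<id> references for calls to the proposed generic query_result() function. Passing the empty JSON object as a quoted string literal is just one possible convention:

```python
# Sketch only: backward-compatibility rewrite trading legacy query_<id>
# references for calls to the proposed generic query_result() function.
import re


def rewrite_legacy_refs(sql):
    # \b guards avoid touching identifiers such as query_result itself.
    return re.sub(r'\bquery_(\d+)\b', r"query_result(\1, '{}')", sql)
```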

Hi there,

BTW, here is a new version of my PoC code, based on v7.0.0.

It supports nested queries and aliases.

Now you can run a complex query like this one:

select
    *
from (
    select
        1, current_timestamp, *
    from
        query_1('{"sheet_index": {{ sheet_index }}}') a
    limit 1
)
union all select
    2, *
from (
    select * from (
        select current_timestamp, * from query_1('{"sheet_index": {{ sheet_index }}}') b limit 2 offset 10
    )
)
union all select
    * 
from (
    select
        3, *
    from (
        select * from (
            select current_timestamp, * from query_1('{"sheet_index": {{ sheet_index }}}') c limit 3 offset 20
        )
    )
);

Enjoy!


Hi @ariarijp
I just deployed the redash/redash:7.0.0.b18042 image, but there is no ‘Query Results with parameters (PoC)’ data source in my data sources list. Should I do anything else?

Hi @Nima

Thank you for asking.

You should clone my forked repository from GitHub and check out the poc-query-results-with-params branch.

Then you can build and run a Docker image that includes my PoC code.

Thanks,


Thanks a million @ariarijp.
It works great!


Hi @ariarijp, I want to pass a parameter that comes from another query’s result, e.g.:
Query 1

SELECT action as a1, obj_type as ot1
FROM query_1;

Query 2

SELECT *
FROM query_1('{"action":"$a1", "object_type":"$ot1"}');

In Query 2’s SQL, $a1 and $ot1 represent values returned by Query 1. How could I write Query 2 with your JSON input parameter?

Hello,

Was this feature ever deployed to Redash?

Thanks


I would love to use this feature. Guys, is there any news about pushing it into master?


Ping. What’s the status of this today? Currently I get Error running query: 'query_<id>' is not a function when running SELECT * FROM query_<id>('{"some":"json"}').

However, I am not an admin for my organization’s Redash account, and I don’t know which version we are using.

Is this feature stable and working in master, and are we just behind on pulling the latest?
