Sum/Average X axis data

tekne · April 11, 2018, 4:55pm

This would ease the need for separated queries from the same data source.

tonyjiangh · April 11, 2018, 10:57pm

If you are talking about accumulation.
I thought about this before. Then I realized that for many cases,
unless you bring all the data to the front-end first, accumulation of just some of the data is meaningless.

It could be helpful for small sized dataset still, I just worry more about the confusion it might create.

tekne · April 11, 2018, 11:20pm

It’s my fault I didn’t develop too much on the proposal.

Take the following example: You have a query that returns 3 Columns: Month, City, Revenue. Each city has an entry per Month, which means that January will have as much entries as the number of cities.

Now, if you plot a Chart with bars with X values with the Month and Y Values with the Revenue, I should expect the option to SUM or AVG the values that appear on each Month. Instead, it looks as Redash is just picking up the first one or in some cases plotting an unpredictable Bar height and a non-corresponding legend.

The only workaround this is to do the calculation at the Query level, but now I need to do a Query for each Chart. If this was possible, I could instead run 1x Query for several visualizations.

I hope I explained it clear enough.

tonyjiangh · April 11, 2018, 11:35pm

Thanks for the explanation, I strongly agree with you.
I had the same problem, but I decided to work around this with query.

The visualization library (plotly) we’re using actually support this kind of aggregation.
I tried to fix once but I just could not manage to turn it on.

adamb0mb · October 15, 2018, 6:08pm

Bump on this.

It’d be nice to have one query drive several visualizations (e.g. revenue by month and revenue by city) in the above example.

justinmulvs · October 18, 2018, 1:09pm

+1 on this as well. Even more valuable if you can combine this with the group by functionality.

For example, imagine the above query but you also had revenue by product type. So the query returns month, city, product type, revenue.

You could generate two visualizations, one for revenue by city over time and another for revenue by product type over time.

Would vastly reduce total query count.

arikfr · October 28, 2018, 8:12am

This issue comes up from time to time, and I can see how it can be useful and save time. The main reason we never introduced such functionality was to protect the user from generating the wrong data. The example with revenues by city is a good case for this functionality. But allow me to offer a counter example:

SELECT created_at::date, action, count(distinct user_id)
FROM events
GROUP by 1, 2
ORDER by 1 DESC

The above query returns number of unique users per day / action type. Now most of you will know that you can’t sum the number of users per action to get the total number of unique users in each day (as the same user might do multiple actions), but your CMO? He might just remove the group by action from the chart settings to get this value.

Although I have to admit that a similar mistake can happen in the query as well. It just requires more effort to type it out. Maybe our approach should be less “protective” of the user?

Previous discussion on the same topic with @vladis:

justinmulvs · October 29, 2018, 1:34pm

Mmm I see what you’re saying. I suppose it depends on a company’s analytics workflow.

Realistically at my org, myself or another technical contributor will be the only person manipulating visualizations. All other stakeholders simply get their data from dashboards that these contributors build. Therefore the risk of the CMO changing a dashboard is low.

That being said, I don’t know what the best approach is here. I know some other tools offer this functionality (mode, for example), but idk what the right answer is. The risk you pose is certainly a real one.

Yannick · January 17, 2020, 3:53pm

Hi,
I see this discussion is quite old and still open.
I am currently evaluating Redash, it’s a great solution, congrats!
So far it looks like it’s answering most of my needs except…
the issue mentioned on this thread (which will maybe be a reason for a no go).
I think not being allowed to sum different rows is limiting Redash (you can do in all other solutions, and I have worked with most of them) and makes the filters (ie. ::filter / ::multiFilter) not as useful as they could be.
To add another concrete example to the ones mentioned in this thread:
I created a funnel, with different dimensions (country::multiFilter, cohort::multFilter, month::multiFilter) and I wanted the user to be able to filter by any of these 3 dimensions.
Not being allowed to sum duplicates the funnel steps when more than one value is selected for one of the dimensions.

This is the structure of the query output:
country | cohort | month | step | count
UK | 2019-12 | 2019-12 | step 1 | 100
FR | 2019-12 | 2019-12 | step 1 | 80
UK | 2019-12 | 2019-12 | step 2 | 70
FR | 2019-12 | 2019-12 | step 2 | 50
UK | 2019-12 | 2020-01 | step 1 | 90
FR | 2019-12 | 2020-01 | step 1 | 50
UK | 2019-12 | 2020-01 | step 2 | 30
FR | 2019-12 | 2020-01 | step 2 | 10

Do you have any plan to work on this feature?
Thanks!

jesse · January 23, 2020, 11:32am

For this use case you can likely use window functions to pull the SUM and AVERAGES separately. This is not being actively developed in-house but we would certainly review a pull request

eddiethinhvuong · March 25, 2022, 7:18am

I wonder if this issue will have a fix or become an feature in the near future because this indeed gonna be very useful in many cases.

▼Categories

▼Tags

Sum/Average X axis data