Visualizing JSON records from Redshift

Hi everyone, happy holidays!

I’m using Amazon Redshift as a data source, and one of the columns in the table returns the record as a JSON string. The JSON returned from the query has a format similar to:

{"array_1":[0,128,256,512,1024],"array_2":[0,1,2,3],"version":"v1"}

I would like to be able to visualize array_1 on the x-axis and array_2 on the y-axis as a bar chart, but I’m unable to do so through regular query statements since my column is not a SUPER type. Using json_extract_path_text (docs), I’m able to get the value [0,128,256,512,1024] for array_1, but since it’s a single array rather than multiple rows under the same column, the visualization does not work.
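Just to illustrate the gap (this is only an illustration, not anything that exists in Redash today): the extracted cell is a single JSON string, while the chart needs one row per array element:

```python
import json

# What json_extract_path_text gives me back for array_1: one string cell.
extracted = "[0,128,256,512,1024]"
values = json.loads(extracted)  # -> [0, 128, 256, 512, 1024]

# What the bar chart actually needs: one row per element.
rows = [{"array_1": v} for v in values]
print(rows)
# [{'array_1': 0}, {'array_1': 128}, {'array_1': 256}, {'array_1': 512}, {'array_1': 1024}]
```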

Ideally, we don’t want to rely on another data source like QRDS for this, and would rather update the existing code for the query runner and visualizations to be able to parse JSON. I’ve gone through the docs on creating a new query runner in Redash and how to create new visualization types in Redash, but they seem to be a bit outdated since the migration to React.

Just to double check I’m on the right track, at a high level I would need to figure out how to:

  1. Update the query runner for Redshift to be able to parse JSON data (will try to reference the JSON query runner for this).
  2. Once the JSON array data is parsed into rows under a column, extend the existing chart visualizations to display the parsed array_1 and array_2.

Would anyone know if there is a better approach or have any references? Any tips or guidance will be greatly appreciated, thank you!


Welcome to the forum and thanks for an excellent post. While we on the core team don’t use Redshift, I’m happy to help design / test a revision to the query runner that would support this kind of visualisation.

Approach

I would not update the visualisation to work with a new column type. Partly because we don’t have updated docs covering how to do this (I’m working on that), but also because the visualisations are all designed to take a similar data format input. For example: a pivot table accepts the same kind of input as the stock table visualisation, but displays it differently. Visualisations do not and should not care which data source provided the data.

Instead, we can update the data source to recognise this JSON data and return it to the front-end in the standard format (an array of row objects).
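For reference, the shape I mean is roughly the following (written from memory, so double-check field names against the current query runner code before relying on it):

```python
# Sketch of the standard result structure every visualisation consumes:
# a list of column descriptors plus an array of row objects.
result = {
    "columns": [
        {"name": "value_1", "friendly_name": "value_1", "type": "integer"},
        {"name": "value_2", "friendly_name": "value_2", "type": "integer"},
    ],
    "rows": [
        {"value_1": 0, "value_2": 0},
        {"value_1": 128, "value_2": 1},
    ],
}
```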

The easiest way I can imagine to do this is like so:

  1. Query author uses json_extract_path_text to pull array_1 into a column and array_2 into a second column.
| array_1 | array_2 |
| --- | --- |
| [0,128,256,512,1024] | [0,1,2,3] |
  2. The query_runner somehow monitors for this and pivots the data. One way it could sense that this pivot is needed would be if the query result has exactly one row and each column in that row contains JSON-formatted array data. Regardless, it would pivot the data like this (a Python sketch of this pivot follows the list):
| value_1 | value_2 |
| --- | --- |
| 0 | 0 |
| 128 | 1 |
| 256 | 2 |
| 512 | 3 |
| 1024 | None |
  3. The visualisation can then be configured to use the value_1 column for the x-axis and the value_2 column for the y-axis.
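To make step 2 concrete, here is a minimal, standalone Python sketch of the pivot, assuming the runner has already decided the pivot is needed; the value_N naming and the pivot_json_arrays helper are placeholders I made up, not existing Redash code:

```python
import json
from itertools import zip_longest

def pivot_json_arrays(row):
    """Pivot one row of JSON array strings into many rows.

    `row` maps column names to JSON strings, e.g.
    {"array_1": "[0,128,256,512,1024]", "array_2": "[0,1,2,3]"}.
    Shorter arrays are padded with None, as in the table above.
    """
    names = list(row.keys())
    arrays = [json.loads(row[name]) for name in names]
    new_names = ["value_%d" % (i + 1) for i in range(len(names))]
    return [dict(zip(new_names, values)) for values in zip_longest(*arrays)]

# Example with the data from this thread:
print(pivot_json_arrays({"array_1": "[0,128,256,512,1024]", "array_2": "[0,1,2,3]"}))
# [{'value_1': 0, 'value_2': 0}, {'value_1': 128, 'value_2': 1},
#  {'value_1': 256, 'value_2': 2}, {'value_1': 512, 'value_2': 3},
#  {'value_1': 1024, 'value_2': None}]
```

Inside the Redshift query runner this would slot in after the cursor results are fetched, guarded by the detection check so normal queries are untouched.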