Graphistry chart integration for big data tables/graphs (server-generated)

Any pointers on integrating Graphistry into Redash? And whether/how that would carry over to Databricks dashboards? We’ve been getting asked more and more, and some active projects could really use it, so I thought it was time to ask :slight_smile:

For background, Graphistry visualizations use client+server GPU acceleration, so a typical Jupyter or Streamlit flow splits into viz generation vs. viz loading, roughly like the below (a minimal code sketch follows the list):

  1. Chart generation: (SQL engine) --[1GB Arrow dataframe]--> (Python kernel) --[200MB Arrow dataframe]--> (Graphistry server) --> (iframe URL)

  2. Chart viewing: (Python kernel) --[iframe URL]--> (browser) <--[1MB/s JS/Arrow stream]-- (Graphistry server)
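For concreteness, here is a minimal sketch of the generation step with PyGraphistry, assuming an account on `hub.graphistry.com` (or a self-hosted server) and an edge table with `src`/`dst` columns; `plot(render=False)` returns the embeddable iframe URL instead of rendering inline:

```python
import graphistry
import pandas as pd

# Assumption: credentials for a Graphistry server (Hub or self-hosted)
graphistry.register(api=3, protocol="https", server="hub.graphistry.com",
                    username="...", password="...")

# Step 1 (chart generation): the big dataframe stays server-side --
# PyGraphistry uploads it to the Graphistry server, and only a short
# iframe URL comes back to the Python kernel.
edges_df = pd.DataFrame({"src": ["a", "b", "c"], "dst": ["b", "c", "a"]})
g = graphistry.edges(edges_df).bind(source="src", destination="dst")
url = g.plot(render=False)  # returns the iframe URL rather than rendering
print(url)
```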

The main point is that the dashboard server <> Graphistry server link can handle bigger datasets than we’d want the Graphistry server <> browser iframe to carry. So while Graphistry does have a React component, we don’t want to round-trip big data through the browser; we want to keep browser traffic to symbolic things like filter controls.
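So at dashboard render time, the browser only ever receives the URL plus small symbolic params. A sketch of what that looks like in Streamlit today (not Redash-specific; the filter query param is hypothetical and the URL is a placeholder for the one returned by the generation step):

```python
import streamlit as st
import streamlit.components.v1 as components

# Placeholder: the iframe URL returned by the generation step above
graphistry_url = "https://hub.graphistry.com/graph/graph.html?dataset=..."

# Step 2 (chart viewing): only the URL crosses to the browser; the big
# dataset stays between the dashboard server and the Graphistry server.
node_type = st.selectbox("Node type", ["all", "account", "device"])

# Hypothetical filter param -- the point is that browser round-trips stay
# small and symbolic (a short string), never the data itself.
components.iframe(f"{graphistry_url}&filter_node_type={node_type}", height=600)
```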

This would be similar to apps doing GIS, Bokeh/Datashader, and other modern non-tiny charting. And I’m asking about Databricks dashboards too because I suspect we may be able to carry the benefits over to both communities in one go :slight_smile:


This is a fantastic question. I’ll noodle on this over the weekend and get back to you. I do like the idea of writing to Arrow so that we can stream the results out, though.

Awesome, thanks. We were actually the ones who created the Arrow JS libs, and explicitly for these purposes, so happy to (try to) answer questions on those aspects. But for the same reasons, we don’t want to send 100MB–1GB of data to a browser (browser-side JS VMs actually run out of memory!). So the best currently viable experience is to work with the viz server for the < 50ms latency-tier work, and then let the browser’s WebGL handle the < 20ms work on slices.
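On the Arrow point: the server-to-server hop can stay a compact Arrow IPC stream, e.g. with pyarrow. This is an illustrative sketch of "write to Arrow and stream the results out", not the exact wire format Graphistry's uploader uses:

```python
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({"src": range(1_000_000), "dst": range(1_000_000)})
table = pa.Table.from_pandas(df)

# Serialize to an Arrow IPC stream for the dashboard server -> Graphistry
# server hop; only the resulting iframe URL would go on to the browser.
sink = pa.BufferOutputStream()
with pa.ipc.new_stream(sink, table.schema) as writer:
    writer.write_table(table)
buf = sink.getvalue()

print(f"{len(buf) / 1e6:.1f} MB stays server-side")

# Round-trip check: a downstream service can read the stream back directly.
table2 = pa.ipc.open_stream(buf).read_all()
assert table2.num_rows == table.num_rows
```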

If it helps, one of the current prompts for this is a project working with a DB extension that already returns a dataframe to Redash. They’re already using Graphistry for viz in Streamlit/Plotly/etc., and it already runs interactively on big datasets via the architecture I described, so we’re trying to figure out how to recreate that here. But my ideal would be to enable this for all Redash users, including Spark (as we have customers wanting exactly that for sec/fraud/misinfo/genomics/etc.), vs. just for that one DB :slight_smile:
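A hedged sketch of what a Redash-side adapter could look like: `to_graphistry_url` and where it would hook in are hypothetical, but the input is the standard Redash query-result shape (`columns` + `rows`) and the output is just an iframe URL for the dashboard to embed.

```python
import pandas as pd
import graphistry

def to_graphistry_url(query_result: dict, source_col: str, dest_col: str) -> str:
    """Hypothetical Redash visualization hook: turn a finished query result
    into a Graphistry iframe URL. The rows go server-to-server; the browser
    only ever sees the returned URL. Assumes graphistry.register() was
    already called with server credentials."""
    # Redash query results arrive as {"columns": [...], "rows": [{...}, ...]}
    df = pd.DataFrame(query_result["rows"])
    g = graphistry.edges(df).bind(source=source_col, destination=dest_col)
    return g.plot(render=False)
```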

Hi @jesse ! Any thoughts or tips?