Redash 10 Excel data source

SunlinkHealthTech · October 6, 2021, 3:16pm

Is there any info out there on how the new Excel data source works?

jesse · October 6, 2021, 7:36pm

Great question

This merged as part of V10 but isn’t included in the docs yet. I haven’t used it before but I just loaded it up and have a working version. I’ll open a PR to the documentation website soon. I figured this out just by looking at the query runner source code

But for now here you go:

Excel Data Source

The Excel data source introduced in V10 accepts queries in yaml format. Your yaml may contain two different tags:

url - the URL that will be fetched
ua - the user agent string that will be passed in with the request

An example query that uses a publicly available data set:

url: https://www.dol.gov/sites/dolgov/files/ETA/naws/pdfs/NAWS_A2E191.xlsx

It produces this table:

vucu · October 19, 2021, 3:39pm

Not sure if I should create a new topic, or if i am allowed to continue here, but i was wondering if the excel data source allows picking up multiple files from the location ( something like location*.xlsx, or location test%.xlsx)

jesse · October 19, 2021, 3:45pm

Not as far as I know. But this could be a good addition. How would you expect this to behave? Since most queries return at most one set of tabular data, would you expect multiple table visualisations to appear? Would the tables be concatenated?

vucu · October 19, 2021, 3:53pm

i was thinking of a scenario where the files are similar in structure (eg: different day logs , etc), so the expected behavior in this case would be indeed concatenation.

jesse · October 19, 2021, 4:00pm

That makes sense. Can you create a feature request thread for this here on the forum? Just specify the expected behavior. This way the community can get visibility on outstanding pull requests.

Alternatively, if you want to implement this yourself I’m happy to answer questions, review, merge etc.

ilyaminati · October 19, 2021, 7:43pm

How does the Excel data source handle spreadsheets that contain multiple sheets / tabs?

jesse · October 19, 2021, 7:49pm

If you look at the query runner code it appears to use the default behavior of read_excel in Pandas. Which is to pull the first sheet. It would be a trivial adjustment to make this configurable in the query, though.

See the query runner code here.

willy-dauphin · May 12, 2022, 6:51am

Hi !
I rode the query runner again and documentation too. Actually, it’s already possible to query a sheet number with XLS file

The solution is here (excel.py, line 57) :

workbook = pd.read_excel(response.content, **args)

As you can see, the list called args is depacted as function parameters. So, we just have to use the correct parameter name to use it, according to the read_excel function !

For example, this is my nextcloud spreadsheet. First share the link (yellow rectangle) :

Then, copy the download link (yellow rectangle) :

Create a new query with excel datasource and fill parameters “url” with the link copied and “sheet_name” with the sheet number. First sheet = 0 (by default) / second sheet = 1 / etc.

You can easily change the sheetnumber like this :

And of course others parameters like nrows :

see you

Rash1d · July 19, 2022, 5:02pm

could someone help with authenticating to excel datasource, e.g. if we have an excel hosted on sharepoint, it will require authentication.

jesse · July 19, 2022, 5:22pm

What kind of authentication is needed? HTTP Basic auth? Oauth?

Rash1d · July 19, 2022, 6:49pm

yes basic auth should work

jesse · July 19, 2022, 6:57pm

Alright the existing excel query runner doesn’t support basic auth. This means there are a couple options:

Write a custom sharepoint query runner so you enter the username and password in secret fields on the data source configuration screen (I can help you with this).
Update the excel query runner to accept HTTP basic auth key-value-pairs
Use the existing Python query runner so you can access requests directly and also transform the data however you need.

angelochu7 · October 12, 2022, 4:48am

May I know whether it will work with Microsoft OneDrive hyperlink? It seems it’s not working.

Another question is, I got the following error when trying to load the sample excel file (url: https://www.dol.gov/sites/dolgov/files/ETA/naws/pdfs/NAWS_A2E191.xlsx):

Thanks a million for your comment.

willy-dauphin · January 15, 2023, 6:44am

Hey @jesse !
Is it possible to read an excel data source and provide a private token ? I’m working with KoboToolbox and I would like to access to private data. Any idea ?

▼Categories

▼Tags

Redash 10 Excel data source

Excel Data Source