Would it be possible to add scan size/cost estimation for Athena queries in Redash, similar to what Google BigQuery offers? The idea is that when you type a query, you see the estimated scan size and cost before you hit run. Because Athena charges by the amount of data scanned, this would give an idea of how much a query will cost and whether it could be optimized further.
> When you enter a query in the Cloud Console, the query validator verifies the query syntax and provides an estimate of the number of bytes read. You can use this estimate to calculate query cost in the Pricing Calculator.
This seems feasible for Athena when the data is partitioned in S3 and registered in the Glue Data Catalog:
- if the table is maintained by a Glue crawler, we can fetch the scan size from the partition information
- otherwise, we know the S3 paths to scan and can call the S3 API to sum the object sizes
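
To make the two options above concrete, here is a rough Python sketch. It is not an existing Redash or Athena feature, and several things in it are assumptions: the function names are made up for illustration, `sizeKey` is the partition parameter that Glue crawlers typically write (it may be missing or stale), and the price of 5 USD per TB with a 10 MB minimum per query should be checked against current Athena pricing for your region. The clients are passed in, and are expected to behave like `boto3.client("glue")` and `boto3.client("s3")`.

```python
from urllib.parse import urlparse

BYTES_PER_TB = 1024 ** 4
MIN_BILLED_BYTES = 10 * 1024 ** 2  # assumption: 10 MB minimum billed per query


def sum_s3_prefix_bytes(s3_client, s3_uri):
    """Sum the sizes of all objects under an s3://bucket/prefix location."""
    parsed = urlparse(s3_uri)
    bucket, prefix = parsed.netloc, parsed.path.lstrip("/")
    # A trailing slash keeps "dt=2020-01-01" from also matching "dt=2020-01-011".
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    total = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            total += obj["Size"]
    return total


def estimate_scan_bytes(glue_client, s3_client, database, table, expression=None):
    """Estimate bytes scanned for the partitions matching a Glue filter expression."""
    kwargs = {"DatabaseName": database, "TableName": table}
    if expression:
        kwargs["Expression"] = expression  # e.g. "dt >= '2020-01-01'"
    total = 0
    paginator = glue_client.get_paginator("get_partitions")
    for page in paginator.paginate(**kwargs):
        for part in page["Partitions"]:
            # Assumption: crawler-managed partitions carry a "sizeKey" parameter.
            size_key = part.get("Parameters", {}).get("sizeKey")
            if size_key is not None:
                total += int(size_key)
            else:
                # Fallback: list the partition's S3 location and sum object sizes.
                location = part["StorageDescriptor"]["Location"]
                total += sum_s3_prefix_bytes(s3_client, location)
    return total


def estimate_cost_usd(scanned_bytes, usd_per_tb=5.0):
    """Convert scanned bytes to an approximate cost (price is an assumption)."""
    if scanned_bytes <= 0:
        return 0.0
    billed = max(scanned_bytes, MIN_BILLED_BYTES)
    return billed / BYTES_PER_TB * usd_per_tb
```

With real clients this could be called as `estimate_cost_usd(estimate_scan_bytes(glue, s3, "my_db", "my_table", "dt = '2020-01-01'"))`, where the database, table and expression are placeholders. The result is only an upper bound: with columnar formats such as Parquet or ORC, Athena reads only the referenced columns, so the actual scan can be much smaller. Mapping the query's `WHERE` clause to a partition filter expression is also left out here.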