Why am I suggesting this?
We at Hudl self-host the OSS version of Redash (v3). We consistently see a single bad query that returns a large data set use all of the server's memory, which affects other workers and queries. Celery v3.x does not allow setting a memory usage limit (the --max-memory-per-child flag), but Celery v4.x does. We would like to use it to isolate Celery workers and limit contention between them.
The same issue exists in Redash v4, since it uses the same major version of Celery.
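For reference, here is a minimal sketch of what the limit could look like once we are on Celery 4. It relies on the Celery 4 settings `worker_max_memory_per_child` (in kilobytes) and `worker_max_tasks_per_child`. The helper name `memory_limit_settings` and the 512 MB figure are hypothetical, for illustration only, and none of this is taken from Redash's actual configuration.

```python
# Sketch only: builds Celery 4-style settings as a plain dict.
# The helper name and the numbers used below are hypothetical.

def memory_limit_settings(limit_mb, max_tasks_per_child=None):
    """Return Celery 4 settings that recycle a pool process above limit_mb."""
    if limit_mb <= 0:
        raise ValueError("limit_mb must be positive")
    settings = {
        # Celery 4 expects this value in kilobytes of resident memory.
        "worker_max_memory_per_child": int(limit_mb * 1024),
    }
    if max_tasks_per_child is not None:
        # Optionally also recycle a process after a fixed number of tasks.
        settings["worker_max_tasks_per_child"] = int(max_tasks_per_child)
    return settings


if __name__ == "__main__":
    # With Celery 4 installed, this would be applied as: app.conf.update(settings)
    print(memory_limit_settings(512, max_tasks_per_child=100))
```

The command-line equivalent would be passing --max-memory-per-child=524288 to the worker. One caveat, as far as I understand the Celery docs: the limit is checked after a task finishes, so a single query that blows past it still runs to completion before the process is replaced. It may reduce contention over time rather than stop one huge query outright.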
What needs to be done?
One way to limit resource contention between Celery workers is to limit how much memory each worker uses, and one way to do that is to upgrade Celery to v4.x. This is not a trivial change, however. Here’s what changed in Celery between v3 and v4: http://docs.celeryproject.org/en/master/whatsnew-4.0.html
There are changes to interfaces, routing, and Redis events, dropped support for a few old frameworks, etc. This is probably a large update to the backend and could be phased. The lazy upgrade is to update the Python package along with any decorators and function signatures that break. That touches only a small number of places, but testing will be important.
Beyond that, there will be a lot more changes to make, such as renaming settings and updating function signatures.
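To give a sense of the settings rename, here is a rough sketch that maps a handful of Celery 3 setting names to their Celery 4 lowercase equivalents and flags anything it does not recognise. The map is deliberately partial, the helper `rename_settings` is hypothetical, and the sample input is made up rather than copied from Redash.

```python
# Partial, illustrative map of Celery 3 setting names to Celery 4 names.
CELERY_3_TO_4 = {
    "BROKER_URL": "broker_url",
    "CELERY_RESULT_BACKEND": "result_backend",
    "CELERY_TIMEZONE": "timezone",
    "CELERYBEAT_SCHEDULE": "beat_schedule",
    "CELERY_TASK_RESULT_EXPIRES": "result_expires",
    "CELERYD_MAX_TASKS_PER_CHILD": "worker_max_tasks_per_child",
}


def rename_settings(old_settings):
    """Return (renamed, unknown): converted settings plus names left to review."""
    renamed = {}
    unknown = []
    for name, value in old_settings.items():
        new_name = CELERY_3_TO_4.get(name)
        if new_name is None:
            unknown.append(name)  # not in the partial map; needs manual review
        else:
            renamed[new_name] = value
    return renamed, sorted(unknown)


if __name__ == "__main__":
    # Hypothetical Celery 3-style settings, for illustration only.
    sample = {"BROKER_URL": "redis://localhost:6379/0", "SOME_OTHER_SETTING": 1}
    print(rename_settings(sample))
```

Celery 4 also documents a `celery upgrade settings` command that rewrites a settings module, which may make a hand-rolled map like this unnecessary. I have not tried it against the Redash codebase.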
Next Steps
The main aim is to use this as a discussion thread. If there are ways to limit the impact of the upgrade, I’d like to hear them. If there are ways that don’t require upgrading Celery, or if the owners and community don’t want to upgrade Celery, I’d like to find other options.
I know parts of the codebase but not all of it, so I’m looking for help identifying the list of changes that need to be made.