Hello,

I’ve seen several workers die with the following error output:

[2018-01-29 22:42:09,116][PID:1][ERROR][MainProcess] Process 'Worker-35' pid:45 exited with 'signal 9 (SIGKILL)'
[2018-01-29 22:42:09,129][PID:1][ERROR][MainProcess] Task redash.tasks.execute_query[c82c4cab-c1cc-4d46-b5cd-d515fa8bc8ba] raised unexpected: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL).',)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/billiard/pool.py", line 1175, in mark_as_worker_lost
    human_status(exitcode)),
WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).

It seems they are being killed by a SIGKILL signal. Any idea how to get more information about those crashes?
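One way to check whether the kernel's OOM killer is behind a SIGKILL like this is to search the kernel log on the Docker host for "Killed process" entries. The snippet below is only a rough sketch: `find_oom_kills` is a hypothetical helper (not part of Redash or Celery), it assumes `dmesg` is readable on the host, and the exact wording of the kernel message can vary between kernel versions.

```python
import re

# Assumption: the kernel reports an OOM kill with a line containing
# "Killed process <pid> (<name>)". The wording may differ between kernels.
OOM_KILL_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")


def find_oom_kills(log_lines):
    """Return a list of (pid, process_name) for each OOM kill found in log_lines."""
    kills = []
    for line in log_lines:
        match = OOM_KILL_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    import subprocess

    # Assumption: run on the Docker host, where `dmesg` is available.
    output = subprocess.check_output(["dmesg"]).decode("utf-8", "replace")
    for pid, name in find_oom_kills(output.splitlines()):
        print("OOM killer terminated pid %d (%s)" % (pid, name))
```

Note that the PID in the kernel log is the PID as seen from the host, so it will usually not match the PID the worker reports from inside the container.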

As explained in Redash GitHub issue 812 (the platform does not allow me to post the link), we were running out of memory. Someone on our team ran a SELECT * FROM ... kind of query. We increased the container memory from 256 to 512, but it still happens. Is there a way to avoid this, or to die more gracefully?

This indeed happens when the OOM killer kills the process. Unfortunately there is no elegant way… for now you have to ask your users to put a LIMIT on their queries :slight_smile:
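If asking users is not enough, the LIMIT could in principle be added before the query is sent to the data source. The following is a minimal sketch of that idea, not something Redash provides: `apply_default_limit` and `max_rows` are hypothetical names, and the check is deliberately naive (it only looks at plain SELECT statements with a trailing LIMIT, and ignores CTEs, comments, subqueries and dialects that use a different syntax).

```python
import re

# Assumption: a trailing "LIMIT <n>" (optionally followed by "OFFSET <m>")
# means the query is already bounded.
LIMIT_RE = re.compile(r"\blimit\s+\d+(\s+offset\s+\d+)?\s*$", re.IGNORECASE)


def apply_default_limit(query, max_rows=1000):
    """Append a LIMIT clause to a SELECT query that does not already end with one."""
    stripped = query.strip().rstrip(";").rstrip()
    # Leave anything that is not a plain SELECT untouched.
    if not stripped.lower().startswith("select"):
        return query
    # Leave queries that already end with a LIMIT untouched.
    if LIMIT_RE.search(stripped):
        return query
    return "%s LIMIT %d" % (stripped, max_rows)


if __name__ == "__main__":
    print(apply_default_limit("SELECT * FROM events"))
```

This only bounds the number of rows returned; a single very wide row set can still use a lot of memory, so it reduces the risk rather than removing it.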

That is what I suggested to them. Thanks.