Worker exited prematurely: signal 9 (SIGKILL)

javier-sanz · January 30, 2018, 12:02am

Hello,

I’ve seen several workers die with the following error output:

[2018-01-29 22:42:09,116][PID:1][ERROR][MainProcess] Process 'Worker-35' pid:45 exited with 'signal 9 (SIGKILL)'
[2018-01-29 22:42:09,129][PID:1][ERROR][MainProcess] Task redash.tasks.execute_query[c82c4cab-c1cc-4d46-b5cd-d515fa8bc8ba] raised unexpected: WorkerLostError('Worker exited prematurely: signal 9 (SIGKILL).',)
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/billiard/pool.py", line 1175, in mark_as_worker_lost
    human_status(exitcode)),
WorkerLostError: Worker exited prematurely: signal 9 (SIGKILL).

It seems they are being killed by a SIGKILL signal. Any idea how to get more information about those crashes?
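For readers unfamiliar with the message, here is a minimal sketch of what "exited with signal 9 (SIGKILL)" looks like from a parent process. It is not Redash or billiard code. The helper names `describe_exit` and `run_and_kill` are hypothetical, and the sketch assumes Python 3 on a POSIX system. SIGKILL cannot be caught by the process that receives it, which is why the worker leaves no traceback of its own and only the parent reports the loss.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Return a readable description of a child's exit status (hypothetical helper)."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as a negative return code.
        sig = signal.Signals(-returncode)
        return "signal %d (%s)" % (sig.value, sig.name)
    return "exit code %d" % returncode


def run_and_kill():
    """Start a sleeping child, kill it with SIGKILL, and describe how it ended."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    child.kill()  # sends SIGKILL on POSIX; the child cannot intercept it
    child.wait()
    return describe_exit(child.returncode)


if __name__ == "__main__":
    print(run_and_kill())
```

The child gets no chance to log anything, so any further detail has to come from whatever sent the signal, not from the worker itself.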

javier-sanz · January 30, 2018, 10:01am

As explained in Redash GitHub issue 812 (the platform does not allow me to post the link), we were running out of memory. Someone on our team ran a SELECT * FROM ... kind of query. We increased the container memory from 256 to 512, but it still happens. Is there a way to avoid this, or to die more elegantly?

arikfr · January 30, 2018, 10:19am

This indeed happens when the OOM killer kills the process. Unfortunately, there is no elegant way… for now, you have to ask your users to put a LIMIT on their queries.
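One way to confirm that the OOM killer was responsible is to look for its messages in the kernel log of the host, for example in the output of `dmesg`. The following is a rough sketch under stated assumptions, not part of Redash. The function name `find_oom_kills` and the sample text are illustrative only, and the exact wording of kernel messages varies between kernel versions. For a container, the PID in the kernel log is usually the host-side PID, so it may not match the `pid:45` shown in the worker log.

```python
import re

# Assumed message shape: "... Killed process <pid> (<name>) ...".
# The wording differs between kernel versions, so treat this pattern as a sketch.
KILLED_RE = re.compile(r"Killed process (\d+) \(([^)]+)\)")

# Illustrative sample only; real kernel log lines will differ in detail.
SAMPLE_LOG = (
    "[1000.000001] Out of memory: Kill process 45 (python) score 901 or sacrifice child\n"
    "[1000.000002] Killed process 45 (python) total-vm:1024kB, anon-rss:512kB, file-rss:0kB\n"
)


def find_oom_kills(kernel_log_text):
    """Return (pid, process_name) pairs for kill messages found in the log text."""
    kills = []
    for line in kernel_log_text.splitlines():
        match = KILLED_RE.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    for pid, name in find_oom_kills(SAMPLE_LOG):
        print(pid, name)
```

To use it on a real machine, pass it the text of the kernel log instead of `SAMPLE_LOG`. Reading that log may require elevated privileges.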

javier-sanz · January 30, 2018, 10:38am

That is what I suggested to them. Thanks.
