I may have multiple problems.
So my server (4GB ram, 2CPU) just goes dead every six weeks or so. I have to reset the entire system. Then redash gives me a 504 error. I fiddle around a bit - no luck and give up. A day to two later its working and I have no idea how or why. I’m not sure if I did something to fix it or it just settles down and then is ok. Then six weeks later it crashes again and I’m stumped.
It looks to me (I know very little) like its related to a high CPU usages and that it is caused by the Nginx being in some sort of loop. But I’m not sure what is usual for a resting redash server. If I shut down all the containers the server is fine (almost no CPU usage). When bring it up all the docker containers start but the CPU usages load average is over 4.
I shut it all down and did a docker-compose up --force-recreate --build. I can then see what it is doing but it looks like it gets stuck in a loop and then I press Ctrl-C and it all shuts down.
statsadmin@safdashboard:/opt/redash$ sudo docker-compose up --force-recreate --build
Recreating redash_postgres_1 ... done
Recreating redash_redis_1 ... done
Recreating redash_scheduled_worker_1 ... done
Recreating redash_scheduler_1 ... done
Recreating redash_worker_1 ... done
Recreating redash_server_1 ... done
Recreating redash_adhoc_worker_1 ... done
Recreating redash_nginx_1 ... done
Attaching to redash_redis_1, redash_postgres_1, redash_scheduler_1, redash_scheduled_worker_1, redash_adhoc_worker_1, redash_server_1, redash_worker_1, redash_nginx_1
redis_1 | 1:C 10 Jun 2022 10:11:38.074 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
redis_1 | 1:C 10 Jun 2022 10:11:38.083 # Redis version=5.0.7, bits=64, commit=00000000, modified=0, pid=1, just started
redis_1 | 1:C 10 Jun 2022 10:11:38.083 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
redis_1 | 1:M 10 Jun 2022 10:11:38.114 * Running mode=standalone, port=6379.
redis_1 | 1:M 10 Jun 2022 10:11:38.114 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
redis_1 | 1:M 10 Jun 2022 10:11:38.114 # Server initialized
redis_1 | 1:M 10 Jun 2022 10:11:38.115 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
redis_1 | 1:M 10 Jun 2022 10:11:38.125 * DB loaded from disk: 0.010 seconds
redis_1 | 1:M 10 Jun 2022 10:11:38.125 * Ready to accept connections
postgres_1 |
postgres_1 | PostgreSQL Database directory appears to contain a database; Skipping initialization
postgres_1 |
postgres_1 | LOG: database system was shut down at 2022-06-10 10:03:05 UTC
postgres_1 | LOG: MultiXact member wraparound protections are now enabled
postgres_1 | LOG: database system is ready to accept connections
postgres_1 | LOG: autovacuum launcher started
scheduler_1 | Starting RQ scheduler...
scheduled_worker_1 | Starting RQ worker...
adhoc_worker_1 | Starting RQ worker...
worker_1 | Starting RQ worker...
scheduled_worker_1 | 2022-06-10 10:11:58,933 INFO RPC interface 'supervisor' initialized
worker_1 | 2022-06-10 10:11:58,937 INFO RPC interface 'supervisor' initialized
worker_1 | 2022-06-10 10:11:58,947 CRIT Server 'unix_http_server' running without any HTTP authentication checking
worker_1 | 2022-06-10 10:11:58,950 INFO supervisord started with pid 1
scheduled_worker_1 | 2022-06-10 10:11:58,965 CRIT Server 'unix_http_server' running without any HTTP authentication checking
scheduled_worker_1 | 2022-06-10 10:11:58,968 INFO supervisord started with pid 1
server_1 | [2022-06-10 10:11:58 +0000] [1] [INFO] Starting gunicorn 20.0.4
server_1 | [2022-06-10 10:11:58 +0000] [1] [INFO] Listening at: http://0.0.0.0:5000 (1)
server_1 | [2022-06-10 10:11:59 +0000] [1] [INFO] Using worker: sync
adhoc_worker_1 | 2022-06-10 10:11:59,025 INFO RPC interface 'supervisor' initialized
adhoc_worker_1 | 2022-06-10 10:11:59,025 CRIT Server 'unix_http_server' running without any HTTP authentication checking
adhoc_worker_1 | 2022-06-10 10:11:59,027 INFO supervisord started with pid 1
server_1 | [2022-06-10 10:11:59 +0000] [8] [INFO] Booting worker with pid: 8
server_1 | [2022-06-10 10:11:59 +0000] [9] [INFO] Booting worker with pid: 9
server_1 | [2022-06-10 10:11:59 +0000] [10] [INFO] Booting worker with pid: 10
server_1 | [2022-06-10 10:11:59 +0000] [11] [INFO] Booting worker with pid: 11
scheduled_worker_1 | 2022-06-10 10:11:59,985 INFO spawned: 'worker_healthcheck' with pid 8
worker_1 | 2022-06-10 10:12:00,003 INFO spawned: 'worker_healthcheck' with pid 8
adhoc_worker_1 | 2022-06-10 10:12:00,077 INFO spawned: 'worker_healthcheck' with pid 8
scheduled_worker_1 | 2022-06-10 10:12:00,105 INFO spawned: 'worker-0' with pid 9
adhoc_worker_1 | 2022-06-10 10:12:00,161 INFO spawned: 'worker-0' with pid 9
worker_1 | 2022-06-10 10:12:00,166 INFO spawned: 'worker-0' with pid 9
adhoc_worker_1 | 2022-06-10 10:12:00,294 INFO spawned: 'worker-1' with pid 10
scheduled_worker_1 | 2022-06-10 10:12:01,164 INFO success: worker_healthcheck entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
worker_1 | 2022-06-10 10:12:01,263 INFO success: worker_healthcheck entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
adhoc_worker_1 | 2022-06-10 10:12:01,388 INFO success: worker_healthcheck entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
nginx_1 | 102.39.50.201 - - [10/Jun/2022:10:12:04 +0000] "OPTIONS /Microsoft-Server-ActiveSync HTTP/1.1" 301 185 "-" "Apple-iPhone12C8/1906.77" "-"
server_1 | [2022-06-10 10:12:29 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:8)
server_1 | [2022-06-10 10:12:29 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:9)
server_1 | [2022-06-10 10:12:29 +0000] [8] [INFO] Worker exiting (pid: 8)
server_1 | [2022-06-10 10:12:30 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:10)
server_1 | [2022-06-10 10:12:30 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:11)
server_1 | [2022-06-10 10:12:29 +0000] [9] [INFO] Worker exiting (pid: 9)
server_1 | [2022-06-10 10:12:30 +0000] [11] [INFO] Worker exiting (pid: 11)
server_1 | [2022-06-10 10:12:30 +0000] [10] [INFO] Worker exiting (pid: 10)
server_1 | [2022-06-10 10:12:31 +0000] [12] [INFO] Booting worker with pid: 12
server_1 | [2022-06-10 10:12:31 +0000] [13] [INFO] Booting worker with pid: 13
server_1 | [2022-06-10 10:12:31 +0000] [14] [INFO] Booting worker with pid: 14
server_1 | [2022-06-10 10:12:31 +0000] [15] [INFO] Booting worker with pid: 15
server_1 | [2022-06-10 10:13:01 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:12)
server_1 | [2022-06-10 10:13:01 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:13)
server_1 | [2022-06-10 10:13:01 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:14)
server_1 | [2022-06-10 10:13:01 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:15)
server_1 | [2022-06-10 10:13:01 +0000] [14] [INFO] Worker exiting (pid: 14)
server_1 | [2022-06-10 10:13:01 +0000] [15] [INFO] Worker exiting (pid: 15)
server_1 | [2022-06-10 10:13:01 +0000] [12] [INFO] Worker exiting (pid: 12)
server_1 | [2022-06-10 10:13:01 +0000] [13] [INFO] Worker exiting (pid: 13)
server_1 | [2022-06-10 10:13:02 +0000] [16] [INFO] Booting worker with pid: 16
server_1 | [2022-06-10 10:13:03 +0000] [17] [INFO] Booting worker with pid: 17
server_1 | [2022-06-10 10:13:03 +0000] [18] [INFO] Booting worker with pid: 18
server_1 | [2022-06-10 10:13:03 +0000] [19] [INFO] Booting worker with pid: 19
nginx_1 | 2022/06/10 10:13:05 [error] 6#6: *2 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 102.39.50.201, server: safdashboard.dedicated.co.za, request: "OPTIONS /Microsoft-Server-ActiveSync HTTP/1.1", upstream: "http://172.18.0.8:5000/Microsoft-Server-ActiveSync", host: "154.0.162.233"
nginx_1 | 102.39.50.201 - - [10/Jun/2022:10:13:05 +0000] "OPTIONS /Microsoft-Server-ActiveSync HTTP/1.1" 504 183 "-" "Apple-iPhone12C8/1906.77"
nginx_1 | 2022/06/10 10:13:05 [info] 6#6: *2 client 102.39.50.201 closed keepalive connection
nginx_1 | 102.39.50.201 - - [10/Jun/2022:10:13:05 +0000] "OPTIONS /Microsoft-Server-ActiveSync HTTP/1.1" 301 185 "-" "Apple-iPhone12C8/1906.77" "-"
http://soluscloud.co.zahttp://soluscloud.co.zaserver_1 | [2022-06-10 10:13:33 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:16)
server_1 | [2022-06-10 10:13:33 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:17)
server_1 | [2022-06-10 10:13:33 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:18)
server_1 | [2022-06-10 10:13:33 +0000] [16] [INFO] Worker exiting (pid: 16)
server_1 | [2022-06-10 10:13:33 +0000] [18] [INFO] Worker exiting (pid: 18)
server_1 | [2022-06-10 10:13:33 +0000] [17] [INFO] Worker exiting (pid: 17)
server_1 | [2022-06-10 10:13:34 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:19)
server_1 | [2022-06-10 10:13:34 +0000] [19] [INFO] Worker exiting (pid: 19)
server_1 | [2022-06-10 10:13:34 +0000] [20] [INFO] Booting worker with pid: 20
server_1 | [2022-06-10 10:13:34 +0000] [21] [INFO] Booting worker with pid: 21
server_1 | [2022-06-10 10:13:34 +0000] [22] [INFO] Booting worker with pid: 22
server_1 | [2022-06-10 10:13:34 +0000] [23] [INFO] Booting worker with pid: 23
server_1 | [2022-06-10 10:14:04 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:20)
server_1 | [2022-06-10 10:14:04 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:21)
server_1 | [2022-06-10 10:14:04 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:22)
server_1 | [2022-06-10 10:14:04 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:23)
server_1 | [2022-06-10 10:14:04 +0000] [21] [INFO] Worker exiting (pid: 21)
server_1 | [2022-06-10 10:14:04 +0000] [22] [INFO] Worker exiting (pid: 22)
server_1 | [2022-06-10 10:14:04 +0000] [20] [INFO] Worker exiting (pid: 20)
server_1 | [2022-06-10 10:14:04 +0000] [23] [INFO] Worker exiting (pid: 23)
scheduler_1 | [2022-06-10 10:14:05,506][PID:1][INFO][redash.tasks.schedule] Scheduling 305c0cae0a196ae96915fd2b6f81001c435aad65 (refresh_queries) with interval 30.
scheduler_1 | [2022-06-10 10:14:05,525][PID:1][INFO][redash.tasks.schedule] Scheduling 59a84668e68687338646f735965213c58e814b32 (remove_ghost_locks) with interval 60.
scheduler_1 | [2022-06-10 10:14:05,537][PID:1][INFO][redash.tasks.schedule] Scheduling 18281a865ed3d1a60f366aeb8596fe2283aa421f (sync_user_details) with interval 60.
scheduler_1 | [2022-06-10 10:14:05,574][PID:1][INFO][redash.tasks.schedule] Scheduling e27209059575fcc17c527c47d0957cb21756e551 (cleanup_query_results) with interval 300.
scheduler_1 | [2022-06-10 10:14:05,577][PID:1][INFO][rq_scheduler.scheduler] Registering birth
nginx_1 | 2022/06/10 10:14:05 [error] 6#6: *7 upstream timed out (110: Connection timed out) while reading response header from upstream, client: 102.39.50.201, server: safdashboard.dedicated.co.za, request: "OPTIONS /Microsoft-Server-ActiveSync HTTP/1.1", upstream: "http://172.18.0.8:5000/Microsoft-Server-ActiveSync", host: "154.0.162.233"
nginx_1 | 102.39.50.201 - - [10/Jun/2022:10:14:05 +0000] "OPTIONS /Microsoft-Server-ActiveSync HTTP/1.1" 504 183 "-" "Apple-iPhone12C8/1906.77"
server_1 | [2022-06-10 10:14:05 +0000] [24] [INFO] Booting worker with pid: 24
nginx_1 | 2022/06/10 10:14:05 [info] 6#6: *7 client 102.39.50.201 closed keepalive connection
server_1 | [2022-06-10 10:14:05 +0000] [26] [INFO] Booting worker with pid: 26
server_1 | [2022-06-10 10:14:06 +0000] [25] [INFO] Booting worker with pid: 25
server_1 | [2022-06-10 10:14:06 +0000] [27] [INFO] Booting worker with pid: 27
nginx_1 | 167.248.133.47 - - [10/Jun/2022:10:14:36 +0000] "GET / HTTP/1.1" 301 185 "-" "-" "-"
server_1 | [2022-06-10 10:14:36 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:24)
server_1 | [2022-06-10 10:14:36 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:25)
server_1 | [2022-06-10 10:14:36 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:26)
server_1 | [2022-06-10 10:14:36 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:27)
server_1 | [2022-06-10 10:14:36 +0000] [24] [INFO] Worker exiting (pid: 24)
server_1 | [2022-06-10 10:14:36 +0000] [25] [INFO] Worker exiting (pid: 25)
server_1 | [2022-06-10 10:14:36 +0000] [26] [INFO] Worker exiting (pid: 26)
server_1 | [2022-06-10 10:14:36 +0000] [27] [INFO] Worker exiting (pid: 27)
nginx_1 | 167.248.133.47 - - [10/Jun/2022:10:14:37 +0000] "GET / HTTP/1.1" 301 185 "-" "Mozilla/5.0 (compatible; CensysInspect/1.1; +https://about.censys.io/)" "-"
server_1 | [2022-06-10 10:14:37 +0000] [28] [INFO] Booting worker with pid: 28
nginx_1 | 167.248.133.47 - - [10/Jun/2022:10:14:37 +0000] "PRI * HTTP/2.0" 400 173 "-" "-" "-"
server_1 | [2022-06-10 10:14:37 +0000] [29] [INFO] Booting worker with pid: 29
server_1 | [2022-06-10 10:14:37 +0000] [30] [INFO] Booting worker with pid: 30
server_1 | [2022-06-10 10:14:38 +0000] [31] [INFO] Booting worker with pid: 31
nginx_1 | 2022/06/10 10:14:52 [info] 6#6: *13 epoll_wait() reported that client prematurely closed connection, so upstream connection is closed too while sending request to upstream, client: 167.248.133.47, server: safdashboard.dedicated.co.za, request: "GET / HTTP/1.1", upstream: "http://172.18.0.8:5000/", host: "154.0.162.233"
nginx_1 | 167.248.133.47 - - [10/Jun/2022:10:14:52 +0000] "GET / HTTP/1.1" 499 0 "-" "Mozilla/5.0 (compatible; CensysInspect/1.1; +https://about.censys.io/)"
scheduled_worker_1 | 2022/06/10 10:15:03 [worker_healthcheck] Starting the health check for worker process Checks config: [(<class 'redash.cli.rq.WorkerHealthcheck'>, {})]
scheduled_worker_1 | 2022/06/10 10:15:03 [worker_healthcheck] Installing signal handlers.
scheduled_worker_1 | READY
scheduled_worker_1 | 2022/06/10 10:15:03 [worker_healthcheck] Received TICK_60 event from supervisor
scheduled_worker_1 | RESULT 2
scheduled_worker_1 | OKREADY
scheduled_worker_1 | 2022/06/10 10:15:03 [worker_healthcheck] No processes in state RUNNING found for process worker
scheduled_worker_1 | 2022/06/10 10:15:03 [worker_healthcheck] Received TICK_60 event from supervisor
scheduled_worker_1 | RESULT 2
scheduled_worker_1 | OKREADY
scheduled_worker_1 | 2022/06/10 10:15:03 [worker_healthcheck] No processes in state RUNNING found for process worker
scheduled_worker_1 | 2022/06/10 10:15:03 [worker_healthcheck] Received TICK_60 event from supervisor
scheduled_worker_1 | RESULT 2
scheduled_worker_1 | OKREADY
scheduled_worker_1 | 2022/06/10 10:15:04 [worker_healthcheck] No processes in state RUNNING found for process worker
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] Starting the health check for worker process Checks config: [(<class 'redash.cli.rq.WorkerHealthcheck'>, {})]
worker_1 | READY
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] Installing signal handlers.
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] Received TICK_60 event from supervisor
worker_1 | RESULT 2
worker_1 | OKREADY
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] No processes in state RUNNING found for process worker
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] Received TICK_60 event from supervisor
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] No processes in state RUNNING found for process worker
worker_1 | RESULT 2
worker_1 | OKREADY
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] Received TICK_60 event from supervisor
worker_1 | 2022/06/10 10:15:06 [worker_healthcheck] No processes in state RUNNING found for process worker
worker_1 | RESULT 2
worker_1 | OKREADY
scheduled_worker_1 | [2022-06-10 10:15:06,865][PID:9][INFO][rq.worker] Worker rq:worker:ae9df11b731841e7b678d7d8f0b2500d: started, version 1.5.0
scheduled_worker_1 | [2022-06-10 10:15:06,920][PID:9][INFO][rq.worker] *** Listening on scheduled_queries, schemas...
scheduled_worker_1 | [2022-06-10 10:15:06,961][PID:9][INFO][rq.worker] Cleaning registries for queue: scheduled_queries
scheduled_worker_1 | [2022-06-10 10:15:06,986][PID:9][INFO][rq.worker] Cleaning registries for queue: schemas
server_1 | [2022-06-10 10:15:07 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:28)
server_1 | [2022-06-10 10:15:07 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:29)
server_1 | [2022-06-10 10:15:07 +0000] [28] [INFO] Worker exiting (pid: 28)
server_1 | [2022-06-10 10:15:07 +0000] [29] [INFO] Worker exiting (pid: 29)
worker_1 | [2022-06-10 10:15:08,243][PID:9][INFO][rq.worker] Worker rq:worker:b06d0d843b8347b285ac3381b5f3e2be: started, version 1.5.0
worker_1 | [2022-06-10 10:15:08,270][PID:9][INFO][rq.worker] *** Listening on periodic, emails, default...
worker_1 | [2022-06-10 10:15:08,295][PID:9][INFO][rq.worker] Cleaning registries for queue: periodic
worker_1 | [2022-06-10 10:15:08,304][PID:9][INFO][rq.worker] Cleaning registries for queue: emails
worker_1 | [2022-06-10 10:15:08,341][PID:9][INFO][rq.worker] Cleaning registries for queue: default
worker_1 | [2022-06-10 10:15:08,411][PID:9][INFO][rq.worker] periodic: 18281a865ed3d1a60f366aeb8596fe2283aa421f
worker_1 | [2022-06-10 10:15:08,804][PID:20][INFO][rq.worker] periodic: Job OK (18281a865ed3d1a60f366aeb8596fe2283aa421f)
worker_1 | [2022-06-10 10:15:08,807][PID:20][INFO][rq.worker] Result is kept for 600 seconds
server_1 | [2022-06-10 10:15:08 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:30)
server_1 | [2022-06-10 10:15:08 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:31)
worker_1 | [2022-06-10 10:15:08,949][PID:9][INFO][rq.worker] periodic: 305c0cae0a196ae96915fd2b6f81001c435aad65
server_1 | [2022-06-10 10:15:08 +0000] [32] [INFO] Booting worker with pid: 32
server_1 | [2022-06-10 10:15:09 +0000] [33] [INFO] Booting worker with pid: 33
worker_1 | [2022-06-10 10:15:09,121][PID:21][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Refreshing queries...
server_1 | [2022-06-10 10:15:09 +0000] [34] [INFO] Booting worker with pid: 34
server_1 | [2022-06-10 10:15:09 +0000] [35] [INFO] Booting worker with pid: 35
worker_1 | [2022-06-10 10:15:10,704][PID:21][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Done refreshing queries: {'outdated_queries_count': 0, 'last_refresh_at': 1654856110.6998672, 'query_ids': '[]'}
worker_1 | [2022-06-10 10:15:10,716][PID:21][INFO][rq.worker] periodic: Job OK (305c0cae0a196ae96915fd2b6f81001c435aad65)
worker_1 | [2022-06-10 10:15:10,716][PID:21][INFO][rq.worker] Result is kept for 600 seconds
postgres_1 | LOG: unexpected EOF on client connection with an open transaction
worker_1 | [2022-06-10 10:15:10,864][PID:9][INFO][rq.worker] periodic: 59a84668e68687338646f735965213c58e814b32
worker_1 | [2022-06-10 10:15:11,180][PID:22][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.remove_ghost_locks job.id=59a84668e68687338646f735965213c58e814b32 Locks found: 0, Locks removed: 0
worker_1 | [2022-06-10 10:15:11,259][PID:22][INFO][rq.worker] periodic: Job OK (59a84668e68687338646f735965213c58e814b32)
worker_1 | [2022-06-10 10:15:11,260][PID:22][INFO][rq.worker] Result is kept for 600 seconds
worker_1 | [2022-06-10 10:15:11,464][PID:9][INFO][rq.worker] periodic: e27209059575fcc17c527c47d0957cb21756e551
worker_1 | [2022-06-10 10:15:11,545][PID:23][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.cleanup_query_results job.id=e27209059575fcc17c527c47d0957cb21756e551 Running query results clean up (removing maximum of 100 unused results, that are 7 days old or more)
worker_1 | [2022-06-10 10:15:12,004][PID:23][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.cleanup_query_results job.id=e27209059575fcc17c527c47d0957cb21756e551 Deleted 0 unused query results.
worker_1 | [2022-06-10 10:15:12,018][PID:23][INFO][rq.worker] periodic: Job OK (e27209059575fcc17c527c47d0957cb21756e551)
worker_1 | [2022-06-10 10:15:12,030][PID:23][INFO][rq.worker] Result is kept for 600 seconds
worker_1 | [2022-06-10 10:15:12,166][PID:9][INFO][rq.worker] periodic: 305c0cae0a196ae96915fd2b6f81001c435aad65
worker_1 | [2022-06-10 10:15:12,307][PID:24][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Refreshing queries...
worker_1 | [2022-06-10 10:15:12,895][PID:24][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Done refreshing queries: {'outdated_queries_count': 0, 'last_refresh_at': 1654856112.8887806, 'query_ids': '[]'}
worker_1 | [2022-06-10 10:15:12,990][PID:24][INFO][rq.worker] periodic: Job OK (305c0cae0a196ae96915fd2b6f81001c435aad65)
worker_1 | [2022-06-10 10:15:12,994][PID:24][INFO][rq.worker] Result is kept for 600 seconds
postgres_1 | LOG: unexpected EOF on client connection with an open transaction
worker_1 | [2022-06-10 10:15:13,154][PID:9][INFO][rq.worker] periodic: 18281a865ed3d1a60f366aeb8596fe2283aa421f
worker_1 | [2022-06-10 10:15:13,485][PID:25][INFO][rq.worker] periodic: Job OK (18281a865ed3d1a60f366aeb8596fe2283aa421f)
worker_1 | [2022-06-10 10:15:13,486][PID:25][INFO][rq.worker] Result is kept for 600 seconds
worker_1 | [2022-06-10 10:15:13,611][PID:9][INFO][rq.worker] periodic: 305c0cae0a196ae96915fd2b6f81001c435aad65
worker_1 | [2022-06-10 10:15:13,861][PID:26][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Refreshing queries...
worker_1 | [2022-06-10 10:15:14,410][PID:26][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Done refreshing queries: {'outdated_queries_count': 0, 'last_refresh_at': 1654856114.406925, 'query_ids': '[]'}
worker_1 | [2022-06-10 10:15:14,420][PID:26][INFO][rq.worker] periodic: Job OK (305c0cae0a196ae96915fd2b6f81001c435aad65)
worker_1 | [2022-06-10 10:15:14,421][PID:26][INFO][rq.worker] Result is kept for 600 seconds
postgres_1 | LOG: unexpected EOF on client connection with an open transaction
worker_1 | [2022-06-10 10:15:14,579][PID:9][INFO][rq.worker] periodic: 59a84668e68687338646f735965213c58e814b32
worker_1 | [2022-06-10 10:15:14,742][PID:27][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.remove_ghost_locks job.id=59a84668e68687338646f735965213c58e814b32 Locks found: 0, Locks removed: 0
worker_1 | [2022-06-10 10:15:14,773][PID:27][INFO][rq.worker] periodic: Job OK (59a84668e68687338646f735965213c58e814b32)
worker_1 | [2022-06-10 10:15:14,773][PID:27][INFO][rq.worker] Result is kept for 600 seconds
worker_1 | [2022-06-10 10:15:35,715][PID:9][INFO][rq.worker] periodic: 305c0cae0a196ae96915fd2b6f81001c435aad65
worker_1 | [2022-06-10 10:15:35,859][PID:28][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Refreshing queries...
worker_1 | [2022-06-10 10:15:36,384][PID:28][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Done refreshing queries: {'outdated_queries_count': 0, 'last_refresh_at': 1654856136.3767474, 'query_ids': '[]'}
worker_1 | [2022-06-10 10:15:36,419][PID:28][INFO][rq.worker] periodic: Job OK (305c0cae0a196ae96915fd2b6f81001c435aad65)
worker_1 | [2022-06-10 10:15:36,421][PID:28][INFO][rq.worker] Result is kept for 600 seconds
postgres_1 | LOG: unexpected EOF on client connection with an open transaction
server_1 | [2022-06-10 10:15:39 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:32)
server_1 | [2022-06-10 10:15:39 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:33)
server_1 | [2022-06-10 10:15:39 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:34)
server_1 | [2022-06-10 10:15:39 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:35)
server_1 | [2022-06-10 10:15:39 +0000] [33] [INFO] Worker exiting (pid: 33)
server_1 | [2022-06-10 10:15:39 +0000] [35] [INFO] Worker exiting (pid: 35)
server_1 | [2022-06-10 10:15:39 +0000] [34] [INFO] Worker exiting (pid: 34)
server_1 | [2022-06-10 10:15:40 +0000] [36] [INFO] Booting worker with pid: 36
server_1 | [2022-06-10 10:15:40 +0000] [37] [INFO] Booting worker with pid: 37
server_1 | [2022-06-10 10:15:40 +0000] [38] [INFO] Booting worker with pid: 38
server_1 | [2022-06-10 10:15:41 +0000] [39] [INFO] Booting worker with pid: 39
adhoc_worker_1 | 2022/06/10 10:15:47 [worker_healthcheck] Starting the health check for worker process Checks config: [(<class 'redash.cli.rq.WorkerHealthcheck'>, {})]
adhoc_worker_1 | 2022/06/10 10:15:47 [worker_healthcheck] Installing signal handlers.
adhoc_worker_1 | READY
adhoc_worker_1 | 2022/06/10 10:15:47 [worker_healthcheck] Received TICK_60 event from supervisor
adhoc_worker_1 | RESULT 2
adhoc_worker_1 | OKREADY
adhoc_worker_1 | 2022/06/10 10:15:47 [worker_healthcheck] No processes in state RUNNING found for process worker
adhoc_worker_1 | 2022/06/10 10:15:47 [worker_healthcheck] Received TICK_60 event from supervisor
adhoc_worker_1 | RESULT 2
adhoc_worker_1 | OK2022/06/10 10:15:47 [worker_healthcheck] No processes in state RUNNING found for process worker
adhoc_worker_1 | READY
adhoc_worker_1 | 2022/06/10 10:15:47 [worker_healthcheck] Received TICK_60 event from supervisor
adhoc_worker_1 | RESULT 2
adhoc_worker_1 | OKREADY
adhoc_worker_1 | 2022/06/10 10:15:47 [worker_healthcheck] No processes in state RUNNING found for process worker
adhoc_worker_1 | [2022-06-10 10:15:49,259][PID:10][INFO][rq.worker] Worker rq:worker:5dee4f381a8346fa95e1396058091017: started, version 1.5.0
adhoc_worker_1 | [2022-06-10 10:15:49,266][PID:10][INFO][rq.worker] *** Listening on queries...
adhoc_worker_1 | [2022-06-10 10:15:49,284][PID:10][INFO][rq.worker] Cleaning registries for queue: queries
adhoc_worker_1 | [2022-06-10 10:15:49,603][PID:9][INFO][rq.worker] Worker rq:worker:2029bfe94f62490db896b79e53af6756: started, version 1.5.0
adhoc_worker_1 | [2022-06-10 10:15:49,609][PID:9][INFO][rq.worker] *** Listening on queries...
scheduled_worker_1 | 2022/06/10 10:16:01 [worker_healthcheck] Received TICK_60 event from supervisor
scheduled_worker_1 | 2022/06/10 10:16:01 [worker_healthcheck] No processes in state RUNNING found for process worker
scheduled_worker_1 | RESULT 2
scheduled_worker_1 | OKREADY
worker_1 | 2022/06/10 10:16:01 [worker_healthcheck] Received TICK_60 event from supervisor
worker_1 | 2022/06/10 10:16:01 [worker_healthcheck] No processes in state RUNNING found for process worker
worker_1 | RESULT 2
worker_1 | OKREADY
adhoc_worker_1 | 2022/06/10 10:16:01 [worker_healthcheck] Received TICK_60 event from supervisor
adhoc_worker_1 | 2022/06/10 10:16:01 [worker_healthcheck] No processes in state RUNNING found for process worker
adhoc_worker_1 | RESULT 2
adhoc_worker_1 | OKREADY
worker_1 | [2022-06-10 10:16:05,755][PID:9][INFO][rq.worker] periodic: 18281a865ed3d1a60f366aeb8596fe2283aa421f
worker_1 | [2022-06-10 10:16:06,047][PID:29][INFO][rq.worker] periodic: Job OK (18281a865ed3d1a60f366aeb8596fe2283aa421f)
worker_1 | [2022-06-10 10:16:06,047][PID:29][INFO][rq.worker] Result is kept for 600 seconds
worker_1 | [2022-06-10 10:16:06,207][PID:9][INFO][rq.worker] periodic: 305c0cae0a196ae96915fd2b6f81001c435aad65
worker_1 | [2022-06-10 10:16:06,352][PID:30][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Refreshing queries...
worker_1 | [2022-06-10 10:16:06,838][PID:30][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.refresh_queries job.id=305c0cae0a196ae96915fd2b6f81001c435aad65 Done refreshing queries: {'outdated_queries_count': 0, 'last_refresh_at': 1654856166.8278556, 'query_ids': '[]'}
worker_1 | [2022-06-10 10:16:06,873][PID:30][INFO][rq.worker] periodic: Job OK (305c0cae0a196ae96915fd2b6f81001c435aad65)
worker_1 | [2022-06-10 10:16:06,873][PID:30][INFO][rq.worker] Result is kept for 600 seconds
postgres_1 | LOG: unexpected EOF on client connection with an open transaction
worker_1 | [2022-06-10 10:16:06,981][PID:9][INFO][rq.worker] periodic: 59a84668e68687338646f735965213c58e814b32
worker_1 | [2022-06-10 10:16:07,186][PID:31][INFO][rq.job.redash.tasks.queries.maintenance] job.func_name=redash.tasks.queries.maintenance.remove_ghost_locks job.id=59a84668e68687338646f735965213c58e814b32 Locks found: 0, Locks removed: 0
worker_1 | [2022-06-10 10:16:07,195][PID:31][INFO][rq.worker] periodic: Job OK (59a84668e68687338646f735965213c58e814b32)
worker_1 | [2022-06-10 10:16:07,208][PID:31][INFO][rq.worker] Result is kept for 600 seconds
server_1 | [2022-06-10 10:16:11 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:36)
server_1 | [2022-06-10 10:16:11 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:37)
server_1 | [2022-06-10 10:16:11 +0000] [36] [INFO] Worker exiting (pid: 36)
server_1 | [2022-06-10 10:16:11 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:38)
server_1 | [2022-06-10 10:16:11 +0000] [1] [CRITICAL] WORKER TIMEOUT (pid:39)
I have a backup machine with exactly the same env and docker-compose.yml (except for the HTTPS stuff and Host name) which works perfectly.