Workers keep crashing and the queue keeps growing

Issue Summary

Docker Error logs of workers:
[2021-04-27 05:48:44,075][PID:7][ERROR][MainProcess] Control command error: OperationalError(u"\nCannot route message for exchange ‘reply.celery.pidbox’: Table empty or key no longer exists.\nProbably the key (u’_kombu.binding.reply.celery.pidbox’) has been removed from the Redis database.\n",)
Traceback (most recent call last):
File “/usr/local/lib/python2.7/site-packages/celery/worker/pidbox.py”, line 46, in on_message
self.node.handle_message(body, message)
File “/usr/local/lib/python2.7/site-packages/kombu/pidbox.py”, line 145, in handle_message
return self.dispatch(**body)
File “/usr/local/lib/python2.7/site-packages/kombu/pidbox.py”, line 115, in dispatch
ticket=ticket)
File “/usr/local/lib/python2.7/site-packages/kombu/pidbox.py”, line 151, in reply
serializer=self.mailbox.serializer)
File “/usr/local/lib/python2.7/site-packages/kombu/pidbox.py”, line 285, in _publish_reply
**opts
File “/usr/local/lib/python2.7/site-packages/kombu/messaging.py”, line 181, in publish
exchange_name, declare,
File “/usr/local/lib/python2.7/site-packages/kombu/connection.py”, line 543, in _ensured
errback and errback(exc, 0)
File “/usr/local/lib/python2.7/contextlib.py”, line 35, in __exit__
self.gen.throw(type, value, traceback)
File “/usr/local/lib/python2.7/site-packages/kombu/connection.py”, line 436, in _reraise_as_library_errors
sys.exc_info()[2])
File “/usr/local/lib/python2.7/site-packages/kombu/connection.py”, line 431, in _reraise_as_library_errors
yield
File “/usr/local/lib/python2.7/site-packages/kombu/connection.py”, line 510, in _ensured
return fun(*args, **kwargs)
File “/usr/local/lib/python2.7/site-packages/kombu/messaging.py”, line 203, in _publish
mandatory=mandatory, immediate=immediate,
File “/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/base.py”, line 605, in basic_publish
message, exchange, routing_key, **kwargs
File “/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/exchange.py”, line 70, in deliver
for queue in _lookup(exchange, routing_key):
File “/usr/local/lib/python2.7/site-packages/kombu/transport/virtual/base.py”, line 714, in _lookup
self.get_table(exchange),
File “/usr/local/lib/python2.7/site-packages/kombu/transport/redis.py”, line 829, in get_table
raise InconsistencyError(NO_ROUTE_ERROR.format(exchange, key))
OperationalError:
Cannot route message for exchange ‘reply.celery.pidbox’: Table empty or key no longer exists.
Probably the key (u’_kombu.binding.reply.celery.pidbox’) has been removed from the Redis database.
[2021-04-27 05:48:44,080][PID:7][INFO][MainProcess] Received task: redash.tasks.execute_query[c68082ca-d0b9-4fb6-b0ad-2ae875394ba4]
[2021-04-27 05:48:44,081][PID:7][CRITICAL][MainProcess] Unrecoverable error: RuntimeError(u’pubsub connection not set: did you forget to call subscribe() or psubscribe()?’,)
Traceback (most recent call last):
File “/usr/local/lib/python2.7/site-packages/celery/worker/worker.py”, line 205, in start
self.blueprint.start(self)
File “/usr/local/lib/python2.7/site-packages/celery/bootsteps.py”, line 119, in start
step.start(parent)
File “/usr/local/lib/python2.7/site-packages/celery/bootsteps.py”, line 369, in start
return self.obj.start()
File “/usr/local/lib/python2.7/site-packages/celery/worker/consumer/consumer.py”, line 318, in start
blueprint.start(self)
File “/usr/local/lib/python2.7/site-packages/celery/bootsteps.py”, line 119, in start
step.start(parent)
File “/usr/local/lib/python2.7/site-packages/celery/worker/consumer/consumer.py”, line 596, in start
c.loop(*c.loop_args())
File “/usr/local/lib/python2.7/site-packages/celery/worker/loops.py”, line 91, in asynloop
next(loop)
File “/usr/local/lib/python2.7/site-packages/kombu/asynchronous/hub.py”, line 362, in create_loop
cb(*cbargs)
File “/usr/local/lib/python2.7/site-packages/kombu/transport/redis.py”, line 1052, in on_readable
self.cycle.on_readable(fileno)
File “/usr/local/lib/python2.7/site-packages/kombu/transport/redis.py”, line 348, in on_readable
chan.handlers[type]()
File “/usr/local/lib/python2.7/site-packages/kombu/transport/redis.py”, line 679, in _receive
ret.append(self._receive_one(c))
File “/usr/local/lib/python2.7/site-packages/kombu/transport/redis.py”, line 690, in _receive_one
response = c.parse_response()
File “/usr/local/lib/python2.7/site-packages/redis/client.py”, line 3032, in parse_response
'pubsub connection not set: '
RuntimeError: pubsub connection not set: did you forget to call subscribe() or psubscribe()?
[2021-04-27 05:48:44,090][PID:7][INFO][MainProcess] beat: Shutting down…
[2021-04-27 05:48:44,855][PID:7][WARNING][MainProcess] Restoring 2 unacknowledged message(s)

Requirements.txt file:

Flask==0.12.4
Werkzeug==0.11.11
Jinja2==2.8
itsdangerous==0.24
click==6.6
MarkupSafe==0.23
pyOpenSSL==17.5.0
httplib2==0.10.3
wtforms==2.2.1
Flask-RESTful==0.3.5
Flask-Login==0.4.0
Flask-OAuthLib==0.9.5
# pin this until https://github.com/lepture/flask-oauthlib/pull/388 is released
requests-oauthlib>=0.6.2,<1.2.0
Flask-SQLAlchemy==2.3.2
Flask-Migrate==2.0.1
flask-mail==0.9.1
flask-talisman==0.6.0
Flask-Limiter==0.9.3
passlib==1.6.2
aniso8601==1.1.0
blinker==1.3
psycopg2==2.7.3.2
python-dateutil==2.8.0
pytz==2016.7
PyYAML==3.12
redis==3.2.1
requests==2.21.0
six==1.12.0
SQLAlchemy==1.2.12
# We can't upgrade SQLAlchemy-Searchable version as newer versions require PostgreSQL > 9.6, but we target older versions at the moment.
SQLAlchemy-Searchable==0.10.6
# We need to pin the version of pyparsing, as newer versions break SQLAlchemy-Searchable-10.0.6 (newer versions no longer depend on it)
pyparsing==2.3.0
SQLAlchemy-Utils==0.33.11
sqlparse==0.2.4
statsd==2.1.2
gunicorn==19.7.1
celery==4.3.0
kombu==4.6.3
jsonschema==2.4.0
RestrictedPython==3.6.0
pysaml2==4.5.0
pycrypto==2.6.1
funcy==1.7.1
sentry-sdk==0.11.2
semver==2.2.1
xlsxwriter==0.9.3
pystache==0.5.4
parsedatetime==2.1
PyJWT==1.6.4
cryptography==2.3
simplejson==3.10.0
ua-parser==0.7.3
user-agents==1.1.0
python-geoip-geolite2==2015.303
chromelogger==0.4.3
pypd==1.1.0
disposable-email-domains>=0.0.52
gevent==1.4.0
# Install the dependencies of the bin/bundle-extensions script here.
# It has its own requirements file to simplify the frontend client build process
-r requirements_bundles.txt
# Uncomment the requirement for ldap3 if using ldap.
# It is not included by default because of the GPL license conflict.
# ldap3==2.2.4

Technical details:

  • Redash Version: 8.0.2
  • Browser/OS: Ubuntu 16
  • How did you install Redash: GitHub Redash repo (Docker builds)

What happens when you restart Redis and the workers?

When this happens, the only option is to restart, after which things work fine again. I would like to understand what I could be missing that causes this in the first place.

Issues with Celery are what prompted the migration to RQ in V9+. It is probably not worth your time to learn all the failure modes. I would upgrade if I were you.

Redash version 9.0.0 is still in beta, right? The latest stable version is 8, which is what I have been using.

Also, we recently moved Redash's Postgres database off the shared server to a separate, independent RDS instance. Could this be responsible for the error?

Version 9 has been in stable beta for more than a year. The ultimate release will be identical to the beta.

It could be. Up to you if it’s worth the time to chase this down versus upgrading.
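
If you do decide to chase it down, one place to start is the key named in the log. Below is a minimal sketch, not an official Redash or Celery tool, that checks whether `_kombu.binding.reply.celery.pidbox` still exists and which `maxmemory-policy` Redis is running with. The assumption here is that an `allkeys-*` eviction policy is one possible way such a key could disappear; a flush or a Redis restart without persistence are others. The function name `diagnose` and the connection settings in `main` are hypothetical, and `CONFIG GET` may be disabled on managed Redis services.

```python
# Key name copied from the worker log above.
BINDING_KEY = "_kombu.binding.reply.celery.pidbox"

# Eviction policies under which Redis may evict keys that have no TTL.
EVICTS_ANY_KEY = {"allkeys-lru", "allkeys-lfu", "allkeys-random"}


def diagnose(client, key=BINDING_KEY):
    """Report whether the binding key exists and whether eviction could remove it.

    `client` is assumed to expose redis-py's `exists` and `config_get` methods.
    """
    policy = client.config_get("maxmemory-policy").get("maxmemory-policy")
    return {
        "key": key,
        "key_present": bool(client.exists(key)),
        "maxmemory_policy": policy,
        "eviction_possible": policy in EVICTS_ANY_KEY,
    }


def main():
    # Hypothetical connection settings; adjust host, port and db to your setup.
    import redis

    client = redis.Redis(host="redis", port=6379, db=0)
    print(diagnose(client))
```

Calling `main()` from inside the worker container right after a crash would show whether the key is gone at that moment. If `eviction_possible` comes back true, Redis memory limits are worth a look before anything else.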