Celery not connecting to Redis broker: Connection to broker lost - redis

I'm trying to get Redis to work as a broker for my Celery 3.0.19 install on Django. I can see that redis-server is running on port 6379. When I run a simple Celery test, I get the stack trace below.
Environment: Ubuntu Lucid 10.04, Celery 3.0.19
Command: celery -A tasks worker --loglevel=info
[2013-05-02 18:56:27,835: INFO/MainProcess] consumer: Connected to redis://127.0.0.1:6379/0.
[2013-05-02 18:56:27,835: ERROR/MainProcess] consumer: Connection to broker lost. Trying to re-establish the connection...
Traceback (most recent call last):
File "/usr/local/lib/python2.6/dist-packages/celery/worker/consumer.py", line 394, in start
self.reset_connection()
File "/usr/local/lib/python2.6/dist-packages/celery/worker/consumer.py", line 744, in reset_connection
self.connection, on_decode_error=self.on_decode_error,
File "/usr/local/lib/python2.6/dist-packages/celery/app/amqp.py", line 311, in __init__
**kw
File "/usr/local/lib/python2.6/dist-packages/kombu/messaging.py", line 355, in __init__
self.revive(self.channel)
File "/usr/local/lib/python2.6/dist-packages/kombu/messaging.py", line 367, in revive
self.declare()
File "/usr/local/lib/python2.6/dist-packages/kombu/messaging.py", line 377, in declare
queue.declare()
File "/usr/local/lib/python2.6/dist-packages/kombu/entity.py", line 490, in declare
self.queue_declare(nowait, passive=False)
File "/usr/local/lib/python2.6/dist-packages/kombu/entity.py", line 516, in queue_declare
nowait=nowait)
File "/usr/local/lib/python2.6/dist-packages/kombu/transport/virtual/__init__.py", line 404, in queue_declare
return queue, self._size(queue), 0
File "/usr/local/lib/python2.6/dist-packages/kombu/transport/redis.py", line 516, in _size
sizes = cmds.execute()
File "/usr/local/lib/python2.6/dist-packages/redis/client.py", line 1919, in execute
return execute(conn, stack, raise_on_error)
File "/usr/local/lib/python2.6/dist-packages/redis/client.py", line 1811, in _execute_transaction
self.parse_response(connection, '_')
File "/usr/local/lib/python2.6/dist-packages/redis/client.py", line 1882, in parse_response
self, connection, command_name, **options)
File "/usr/local/lib/python2.6/dist-packages/redis/client.py", line 387, in parse_response
response = connection.read_response()
File "/usr/local/lib/python2.6/dist-packages/redis/connection.py", line 312, in read_response
raise response
ResponseError: unknown command 'MULTI'

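The Redis server is too old: it rejects the MULTI command that kombu's Redis transport sends when it checks queue sizes. You need Redis server version >= 2.2.0.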
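One way to confirm this is to ask the server for its version and compare it with the minimum. The sketch below is only an illustration: it assumes redis-py is installed, that the server reports a purely numeric `redis_version` field in its INFO reply, and the helper names `parse_version`, `meets_minimum` and `server_version` are made up for this example.

```python
def parse_version(text):
    """Turn a numeric version string such as '2.2.0' into a 3-tuple of ints."""
    parts = [int(part) for part in text.strip().split(".")]
    # Pad short versions such as '2.2' so tuple comparison behaves as expected.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def meets_minimum(text, minimum=(2, 2, 0)):
    """Return True if the version string is at least the given minimum."""
    return parse_version(text) >= minimum


def server_version(host="127.0.0.1", port=6379):
    """Ask a running server for its version (needs redis-py and a live server)."""
    import redis  # imported lazily so the helpers above work without redis-py

    return redis.Redis(host=host, port=port).info()["redis_version"]
```

With a server running, `meets_minimum(server_version())` returning False would point to an outdated redis-server package rather than a Celery misconfiguration.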

Redis: RedisInsight: Error while connecting: "Something went wrong adding the database. Please try again"

I have a Redis server that I am trying to connect to for Mule 4 applications.
My objectives are:
Connect to Redis from a Mule 4 app: success.
Connect to Redis from RedisInsight to visualise the data: this is where the problem is.
To connect with RedisInsight, I do the following:
Launch RedisInsight. It starts at http://localhost:8001/
Click on "I already have a database".
Click on "Connect to a Redis Database".
Provide the host, port, name (per the documentation this can be anything, say redis_test) and password.
I then get the error message: "Something went wrong adding the database. Please try again"
Interestingly, when connecting from Mule I only need to provide the host, port and password, and it works.
Please help. Thanks in advance.
From the RedisInsight logs:
ERROR 2021-02-09 14:14:20,123 django.request Internal Server Error: /api/instance/
Traceback (most recent call last):
File "django\core\handlers\exception.py", line 34, in inner
File "django\core\handlers\base.py", line 115, in _get_response
File "django\core\handlers\base.py", line 113, in _get_response
File "django\views\decorators\csrf.py", line 54, in wrapped_view
File "django\views\generic\base.py", line 71, in view
File "rest_framework\views.py", line 495, in dispatch
File "rest_framework\views.py", line 455, in handle_exception
File "rest_framework\views.py", line 492, in dispatch
File "redisinsight\core\views\instance.py", line 208, in post
File "redisinsight\core\views\instance.py", line 147, in _save_redis_instance
File "redisinsight\core\services\database\_routines.py", line 80, in _wrapped_add_db_func
File "redisinsight\core\services\database\_routines.py", line 765, in add_redis_database
File "redisinsight\core\services\database\_routines.py", line 809, in add_standalone_db
File "redisinsight\core\services\database\_routines.py", line 576, in _add_standalone_db
File "redisinsight\core\services\database\_routines.py", line 190, in _assert_db_type
File "redisinsight\core\services\database\_routines.py", line 175, in _probe_db_type
File "redis\client.py", line 1281, in info
File "redis\client.py", line 878, in execute_command
File "redis\client.py", line 892, in parse_response
File "redis\connection.py", line 752, in read_response
redis.exceptions.ResponseError: unknown command `INFO`, with args beginning with:
It looks like the INFO command is disabled on your Redis server. RedisInsight needs basic commands like INFO and PING to be enabled.
To enable the INFO command, edit the Redis config file:
sudo nano /etc/redis/redis.conf
Search for a line that renames the INFO command, something like:
rename-command INFO ""
Comment out that line and restart Redis:
systemctl restart redis
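To see at a glance which commands a config file renames or disables, the search above can be scripted. This is a minimal sketch, not part of RedisInsight or Redis: the function name `renamed_commands` is hypothetical, and it assumes the config uses the usual one-directive-per-line layout with `#` comments.

```python
import shlex


def renamed_commands(conf_text):
    """Map each command in an active rename-command line to its new name.

    An empty new name means the command has been disabled outright.
    """
    renamed = {}
    for line in conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and commented-out directives
        try:
            parts = shlex.split(line)
        except ValueError:
            continue  # ignore lines with unbalanced quotes
        if len(parts) == 3 and parts[0].lower() == "rename-command":
            renamed[parts[1].upper()] = parts[2]
    return renamed
```

For example, passing the contents of /etc/redis/redis.conf and checking whether "INFO" is a key in the result shows whether the line from the answer above is still active.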

Ambari cluster restart error: Timeline Service V2.0 Reader not restarting

Attempting to restart an Ambari-managed cluster and getting errors related to the Timeline Service V2.0 Reader service starting:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 108, in <module>
ApplicationTimelineReader().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 353, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/timelinereader.py", line 51, in start
hbase(action='start')
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 80, in hbase
createTables()
File "/var/lib/ambari-agent/cache/stacks/HDP/3.0/services/YARN/package/scripts/hbase_service.py", line 147, in createTables
logoutput=True)
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/system.py", line 263, in action_run
returns=self.resource.returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 308, in _call
raise ExecuteTimeoutException(err_msg)
resource_management.core.exceptions.ExecuteTimeoutException: Execution of 'ambari-sudo.sh su yarn-ats -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/texlive/2016/bin/x86_64-linux:/usr/lib64/qt-3.3/bin:/usr/local/texlive/2016/bin/x86_64-linux:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/opt/maven/bin:/root/bin:/opt/maven/bin:/opt/maven/bin:/var/lib/ambari-agent'"'"' ; sleep 10;export HBASE_CLASSPATH_PREFIX=/usr/hdp/3.0.0.0-1634/hadoop-yarn/timelineservice/*; /usr/hdp/3.0.0.0-1634/hbase/bin/hbase --config /usr/hdp/3.0.0.0-1634/hadoop/conf/embedded-yarn-ats-hbase org.apache.hadoop.yarn.server.timelineservice.storage.TimelineSchemaCreator -Dhbase.client.retries.number=35 -create -s'' was killed due timeout after 300 seconds
I have not changed any configs or installed anything new before the restart attempt; I simply stopped the cluster services and attempted to restart them. I'm not sure what this error message means. Any debugging tips or fixes?
Found the solution on another community post.
Navigate to the host where the Timeline Reader is installed and install the HBase Client on that host.
Here is how I installed the HBase Client via the Ambari UI:
In the Ambari UI, go to Hosts, then click the host you want to install the HBase Client component on.
In the list of components, you will have the option to add more.
From there I installed the HBase Client.
Then I stopped and restarted the cluster via the Ambari UI. (I got a notification of stale configs, though I'm not sure whether that was my problem all along or whether installing the HBase Client raised the stale configs alert.)

Workflow Disconnection

When running a workflow from the repo like:
snakemake --use-conda -p --cluster-config cluster.yaml --cluster "qsub -l {cluster.l} -m {cluster.m} -N {cluster.N} -r {cluster.r} -V" --jobs 1
My job starts but runs into a urllib-related error (see below). I'm running v3.13.3 on a compute server. Any tips on how to avoid this? Thanks in advance.
File "miniconda3/lib/python3.5/site-packages/snakemake/__init__.py", line 469, in snakemake
force_use_threads=use_threads)
File "miniconda3/lib/python3.5/site-packages/snakemake/workflow.py", line 450, in execute
dag.create_conda_envs(dryrun=dryrun)
File "miniconda3/lib/python3.5/site-packages/snakemake/dag.py", line 166, in create_conda_envs
hash = env.hash
File "miniconda3/lib/python3.5/site-packages/snakemake/conda.py", line 55, in hash
md5hash.update(self.content)
File "miniconda3/lib/python3.5/site-packages/snakemake/conda.py", line 38, in content
content = urlopen(env_file).read()
File "miniconda3/lib/python3.5/urllib/request.py", line 163, in urlopen
return opener.open(url, data, timeout)
File "miniconda3/lib/python3.5/urllib/request.py", line 466, in open
response = self._open(req, data) ...
raise URLError(err)
urllib.error.URLError: <urlopen error [Errno 101] Network is unreachable>

Airflow CROSSSLOT Keys in request don't hash to the same slot error using AWS ElastiCache

I am running apache-airflow 1.8.1 on AWS ECS and I have an AWS ElastiCache cluster (redis 3.2.4) running 2 shards / 2 nodes with multi-AZ enabled (clustered redis engine). I've verified that airflow can access the host/port of the cluster without any problem.
Here are the logs:
Thu Jul 20 01:39:21 UTC 2017 - Checking for redis (endpoint: redis://xxxxxx.xxxxxx.clustercfg.usw2.cache.amazonaws.com:6379) connectivity
Thu Jul 20 01:39:21 UTC 2017 - Connected to redis (endpoint: redis://xxxxxx.xxxxxx.clustercfg.usw2.cache.amazonaws.com:6379)
logging to s3://xxxx-xxxx-xxxx/logs/airflow
Starting worker
[2017-07-20 01:39:44,020] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-07-20 01:39:45,960] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-07-20 01:39:45,989] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
[2017-07-20 01:39:53,352] {__init__.py:57} INFO - Using executor CeleryExecutor
[2017-07-20 01:39:55,187] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/Grammar.txt
[2017-07-20 01:39:55,210] {driver.py:120} INFO - Generating grammar tables from /usr/lib/python2.7/lib2to3/PatternGrammar.txt
[2017-07-20 01:53:09,536: ERROR/MainProcess] Unrecoverable error: ResponseError("CROSSSLOT Keys in request don't hash to the same slot",)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/celery/worker/__init__.py", line 206, in start
self.blueprint.start(self)
File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 374, in start
return self.obj.start()
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 278, in start
blueprint.start(self)
File "/usr/local/lib/python2.7/dist-packages/celery/bootsteps.py", line 123, in start
step.start(parent)
File "/usr/local/lib/python2.7/dist-packages/celery/worker/consumer.py", line 569, in start
replies = I.hello(c.hostname, revoked._data) or {}
File "/usr/local/lib/python2.7/dist-packages/celery/app/control.py", line 112, in hello
return self._request('hello', from_node=from_node, revoked=revoked)
File "/usr/local/lib/python2.7/dist-packages/celery/app/control.py", line 71, in _request
timeout=self.timeout, reply=True,
File "/usr/local/lib/python2.7/dist-packages/celery/app/control.py", line 307, in broadcast
limit, callback, channel=channel,
File "/usr/local/lib/python2.7/dist-packages/kombu/pidbox.py", line 294, in _broadcast
serializer=serializer)
File "/usr/local/lib/python2.7/dist-packages/kombu/pidbox.py", line 259, in _publish
maybe_declare(self.reply_queue(channel))
File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 120, in maybe_declare
return _maybe_declare(entity, declared, ident, channel)
File "/usr/local/lib/python2.7/dist-packages/kombu/common.py", line 127, in _maybe_declare
entity.declare()
File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 522, in declare
self.queue_declare(nowait, passive=False)
File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 548, in queue_declare
nowait=nowait)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/virtual/__init__.py", line 447, in queue_declare
return queue_declare_ok_t(queue, self._size(queue), 0)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/redis.py", line 690, in _size
sizes = pipe.execute()
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2626, in execute
return execute(conn, stack, raise_on_error)
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2518, in _execute_transaction
response = self.parse_response(connection, '_')
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 2584, in parse_response
self, connection, command_name, **options)
File "/usr/local/lib/python2.7/dist-packages/redis/client.py", line 585, in parse_response
response = connection.read_response()
File "/usr/local/lib/python2.7/dist-packages/redis/connection.py", line 582, in read_response
raise response
ResponseError: CROSSSLOT Keys in request don't hash to the same slot
I had the exact same issue and solved it by not using a clustered setup with ElastiCache. Perhaps Celery workers don't support clustered Redis; I was unable to find any information that definitively confirmed this.
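For background on the error itself: a clustered Redis assigns every key to one of 16384 hash slots (CRC16 of the key, modulo 16384) and rejects a multi-key operation whose keys land in different slots. The sketch below illustrates that slot calculation, including the `{...}` hash tag rule. It is an illustration only, the function name `key_slot` and the example key names are made up, and it presumably corresponds to what happens when the transaction in kombu's `_size` touches more than one key.

```python
import binascii


def key_slot(key):
    """Compute the Redis Cluster hash slot (0..16383) for a key given as bytes."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty {tag} restricts hashing to the tag contents.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16 variant (XMODEM) used here.
    return binascii.crc_hqx(key, 0) % 16384


# Hypothetical key names: two unrelated keys usually land in different slots,
# while keys sharing a {tag} always land in the same one.
slots_plain = (key_slot(b"queue_a"), key_slot(b"queue_b"))
slots_tagged = (key_slot(b"{celery}queue_a"), key_slot(b"{celery}queue_b"))
```

Running against a non-clustered Redis, as in the answer above, sidesteps the problem because there is only one node and no slot check.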

Ambari shows zeppelin server not started but the server is actually up and running

I am using HDP 2.4.2 and had previously installed the Zeppelin server. It was working fine, but today when I restarted the cluster (the AWS nodes were restarted), Ambari shows that the Zeppelin server is not running and fails to start it with the following error:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stacks/HDP/2.4/services/ZEPPELIN/package/scripts/master.py", line 235, in <module>
Master().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 219, in execute
method(env)
File "/var/lib/ambari-agent/cache/stacks/HDP/2.4/services/ZEPPELIN/package/scripts/master.py", line 169, in start
+ params.zeppelin_log_file, user=params.zeppelin_user)
File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 154, in __init__
self.env.run()
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 158, in run
self.run_action(resource, action)
File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 121, in run_action
provider_action()
File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 238, in action_run
tries=self.resource.tries, try_sleep=self.resource.try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 70, in inner
result = function(command, **kwargs)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 92, in checked_call
tries=tries, try_sleep=try_sleep)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 140, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 291, in _call
raise Fail(err_msg)
resource_management.core.exceptions.Fail: Execution of '/usr/hdp/current/zeppelin-server/lib/bin/zeppelin-daemon.sh start >> /var/log/zeppelin/zeppelin-setup.log' returned 1. /usr/hdp/current/zeppelin-server/lib/bin/zeppelin-daemon.sh: line 187: /var/run/zeppelin-notebook/zeppelin-zeppelin-ip-10-0-0-11.eu-west-1.compute.internal.pid: Permission denied
cat: /var/run/zeppelin-notebook/zeppelin-zeppelin-ip-10-0-0-11.eu-west-1.compute.internal.pid: No such file or directory
In the zeppelin logs:
ERROR [2016-06-06 03:20:36,714] ({main} VFSNotebookRepo.java[list]:140) - Can't read note file:///usr/hdp/current/zeppelin-server/lib/notebook/screenshots java.io.IOException: file:///usr/hdp/current/zeppelin-server/lib/notebook/screenshots/note.json not found
ERROR [2016-06-06 03:34:12,795] ({main} Notebook.java[loadNoteFromRepo]:330) - Failed to load 2BHU1G67J java.io.IOException: file:///usr/hdp/current/zeppelin-server/lib/notebook/2BHU1G67J is not a directory
But for some reason the Zeppelin port is listening and, despite these errors, the Zeppelin server is running fine and executing all the queries. Please advise on how to correct the issue in Ambari and start the service from Ambari without errors.
The problem is with the PID file for the Zeppelin service. It is either owned by the wrong user or has the wrong permissions. Manually stop the Zeppelin service, then delete the PID file located at /var/run/zeppelin-notebook/zeppelin-zeppelin-ip-10-0-0-11.eu-west-1.compute.internal.pid. Double-check the owner and permissions on the /var/run/zeppelin-notebook folder as well. You should then be able to restart the service in the Ambari UI.
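The owner and permission check can also be done from Python, which is handy when comparing several hosts. This is a minimal sketch under the assumption of a Unix-like system; the helper names `describe_path` and `owner_name` are hypothetical and not part of Ambari or Zeppelin.

```python
import os
import stat


def describe_path(path):
    """Return the owning uid/gid and the ls-style permission string of a path."""
    info = os.stat(path)
    return {
        "uid": info.st_uid,
        "gid": info.st_gid,
        "mode": stat.filemode(info.st_mode),
    }


def owner_name(uid):
    """Resolve a uid to a user name, falling back to the number itself."""
    try:
        import pwd  # only available on Unix-like systems

        return pwd.getpwuid(uid).pw_name
    except (ImportError, KeyError):
        return str(uid)
```

Calling `describe_path("/var/run/zeppelin-notebook")` and passing the resulting uid to `owner_name` shows whether the folder belongs to the user that Ambari starts Zeppelin as.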