Celery worker not publishing message to RabbitMQ?

I have a setup where CELERY_RESULT_BACKEND has been configured to 'amqp'. I can see my tasks getting executed by the worker in the logs, and a queue is created with the task id, but its status is expired. I am not getting the result (result = AsyncResult(taskid); result.get() hangs). I tried all the supported backends:
1) MySQL: it is not putting data into the celery-created tables.
2) Redis: it is not putting data into the db.
I have two CentOS machines:
1) On the first machine I call the delay method to send the task to the proper RabbitMQ queue. Here I call result.get(), and it hangs.
2) On the second machine the worker listens to that queue, picks up the task and processes it. I can see the task in the queue and see it getting executed by the worker, but I think it is not able to put the result into the backend.
Settings:
RABBITMQ_BROKER_HOST = '10.213.166.133'
RABBITMQ_BROKER_PORT = dqms_settings.RABBITMQ_BROKER_PORT
RABBITMQ_BROKER_VHOST = dqms_settings.RABBITMQ_BROKER_VHOST
RABBITMQ_BROKER_USERNAME = dqms_settings.RABBITMQ_BROKER_USERNAME
RABBITMQ_BROKER_PASSWORD = dqms_settings.RABBITMQ_BROKER_PASSWORD
BROKER_URL = 'amqp://%s:%s@%s:%s/%s' % (RABBITMQ_BROKER_USERNAME,
                                        RABBITMQ_BROKER_PASSWORD,
                                        RABBITMQ_BROKER_HOST,
                                        RABBITMQ_BROKER_PORT,
                                        RABBITMQ_BROKER_VHOST)
#CELERY_TASK_RESULT_EXPIRES = 18000
#CELERY_IGNORE_RESULT = True
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
#CELERY_RESULT_BACKEND = 'db+mysql://svcacct-dqms:s3cretP@ssw0rd@10.213.166.202:3306/dqms'
#CELERY_RESULT_BACKEND = 'amqp'
#CELERY_AMQP_TASK_RESULT_EXPIRES = 1000
#CELERY_RESULT_SERIALIZER = 'json'
CELERY_TIMEZONE = TIME_ZONE
CELERYD_PREFETCH_MULTIPLIER = dqms_settings.CELERYD_PREFETCH_MULTIPLIER
CELERY_DEFAULT_QUEUE = dqms_settings.CELERY_DEFAULT_QUEUE
CELERY_DEFAULT_EXCHANGE_TYPE = dqms_settings.CELERY_DEFAULT_EXCHANGE_TYPE
CELERY_DEFAULT_ROUTING_KEY = dqms_settings.CELERY_DEFAULT_ROUTING_KEY
CELERY_QUEUES = dqms_settings.CELERY_QUEUES
CELERY_ROUTES = dqms_settings.CELERY_ROUTES
CELERYD_HIJACK_ROOT_LOGGER = dqms_settings.CELERYD_HIJACK_ROOT_LOGGER
CELERY_ACKS_LATE = dqms_settings.CELERY_ACKS_LATE
CELERY_RESULT_BACKEND = 'redis://:s3cretP@ssw0rd@10.213.166.204:6379/5' #'djcelery.backends.database.DatabaseBackend'
#CELERY_REDIS_MAX_CONNECTIONS = 6
#CELERY_ALWAYS_EAGER = False
Can someone help me figure out why it is not putting the result in the queue?

This is an issue that is happening quite commonly now.
Setting CELERY_ALWAYS_EAGER to True will do the work.
However, this is not the best solution in a production scenario.
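If eager execution is not acceptable, a minimal sketch of a configuration that usually makes result.get() return (assuming a Redis result backend reachable from both machines; the module name, credentials, vhost and task below are hypothetical, not the asker's code):
# tasks.py -- hypothetical minimal example; the same module must be used by producer and worker.
from celery import Celery

app = Celery(
    'tasks',
    broker='amqp://user:password@10.213.166.133:5672/myvhost',  # placeholder credentials/vhost
    backend='redis://:password@10.213.166.204:6379/5',          # same backend on both machines
)
app.conf.update(
    CELERY_ACCEPT_CONTENT=['json'],
    CELERY_TASK_SERIALIZER='json',
    CELERY_RESULT_SERIALIZER='json',
    CELERY_IGNORE_RESULT=False,        # results must not be ignored anywhere
    CELERY_TASK_RESULT_EXPIRES=3600,   # keep results long enough to fetch them
)

@app.task
def add(x, y):
    return x + y

# Caller side: a timeout makes a misconfigured backend fail fast instead of hanging.
# result = add.delay(2, 3)
# print(result.get(timeout=10))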

Related

ActiveMQ/STOMP Clear Schedule Messages Pointed To Destination

I would like to remove messages that are scheduled to be delivered to a specific queue, but I'm finding the process to be unnecessarily burdensome.
Here I am sending a test message to a queue with a delay:
self._connection.send(body="test", destination="/queue/my-queue", headers={
    "AMQ_SCHEDULED_DELAY": 100_000_000,
    "foo": "bar"
})
And here I would like to clear the scheduled messages for that queue:
self._connection.send(destination="ActiveMQ.Scheduler.Management", headers={
    "AMQ_SCHEDULER_ACTION": "REMOVEALL",
}, body="")
Of course the "destination" here needs to be ActiveMQ.Scheduler.Management instead of my actual queue. But I can't find any way to delete only the scheduled messages that are destined for /queue/my-queue. I tried using the selector header, but that doesn't seem to work for AMQ_SCHEDULER_ACTION type messages.
The only suggestion I've seen is to write a consumer to browse all of the scheduled messages, inspect each one for its destination, and delete each schedule by its ID. This seems insane to me, as I don't have just a handful of messages but many millions of messages that I'd like to delete.
Is there a way I could send a command to ActiveMQ to clear scheduled messages with a custom header value?
Maybe I can define a custom scheduled messages location for each queue?
Edit:
I've written a wrapper around the stomp.py connection to handle purging schedules destined for a queue. The MQStompFacade takes an existing stomp.Connection and the name of the queue you are working with, and provides enqueue, enqueue_many, receive, purge, and move.
When receiving from a queue, if include_delayed is True, it will subscribe to both the queue and a topic that consumes the schedules. Assuming the messages were enqueued with this class and have the name of the original destination queue as a custom header, scheduled messages that aren't destined for the receiving queue will be filtered out.
Not yet tested in production. Probably a lot of optimizations possible here.
Usage:
stomp = MQStompFacade(connection, "my-queue")
stomp.enqueue_many([
    EnqueueRequest(message="hello"),
    EnqueueRequest(message="goodbye", delay=100_000)
])
stomp.purge()  # <- removes queued and scheduled messages destined for "/queue/my-queue"
class MQStompFacade(ConnectionListener):
    def __init__(self, connection: Connection, queue: str):
        self._connection = connection
        self._queue = queue
        self._messages: List[Message] = []
        self._connection_id = rand_string(6)
        self._connection.set_listener(self._connection_id, self)

    def __del__(self):
        self._connection.remove_listener(self._connection_id)

    def enqueue_many(self, requests: List[EnqueueRequest]):
        txid = self._connection.begin()
        for request in requests:
            headers = request.headers or {}
            # Used in scheduled message selectors
            headers["queue"] = self._queue
            if request.delay_millis:
                headers['AMQ_SCHEDULED_DELAY'] = request.delay_millis
            if request.priority is not None:
                headers['priority'] = request.priority
            self._connection.send(body=request.message,
                                  destination=f"/queue/{self._queue}",
                                  txid=txid,
                                  headers=headers)
        self._connection.commit(txid)

    def enqueue(self, request: EnqueueRequest):
        self.enqueue_many([request])

    def purge(self, selector: Optional[str] = None):
        num_purged = 0
        for _ in self.receive(idle_timeout=5, selector=selector):
            num_purged += 1
        return num_purged

    def move(self, destination_queue: AbstractQueueFacade,
             selector: Optional[str] = None):
        buffer_size = 500
        move_buffer = []
        for message in self.receive(idle_timeout=5, selector=selector):
            move_buffer.append(EnqueueRequest(
                message=message.body
            ))
            if len(move_buffer) >= buffer_size:
                destination_queue.enqueue_many(move_buffer)
                move_buffer = []
        if move_buffer:
            destination_queue.enqueue_many(move_buffer)

    def receive(self,
                max: Optional[int] = None,
                timeout: Optional[int] = None,
                idle_timeout: Optional[int] = None,
                selector: Optional[str] = None,
                peek: Optional[bool] = False,
                include_delayed: Optional[bool] = False):
        """
        Receive messages until one of the following conditions is met.

        Args:
            max: Receive messages until the [max] number of messages are received
            timeout: Receive messages until this timeout is reached
            idle_timeout (seconds): Receive messages until the queue is idle for this amount of time
            selector: JMS selector that can be applied to message headers. See https://activemq.apache.org/selector
            peek: Set to True to disable automatic ack on matched criteria. Peeked messages will remain in the queue
            include_delayed: Set to True to return messages scheduled for delivery in the future
        """
        self._connection.subscribe(f"/queue/{self._queue}",
                                   id=self._connection_id,
                                   ack="client",
                                   selector=selector)

        if include_delayed:
            browse_topic = f"topic/scheduled_{self._queue}_{rand_string(6)}"
            schedule_selector = f"queue = '{self._queue}'"
            if selector:
                schedule_selector = f"{schedule_selector} AND ({selector})"
            self._connection.subscribe(browse_topic,
                                       id=self._connection_id,
                                       ack="auto",
                                       selector=schedule_selector)
            self._connection.send(
                destination="ActiveMQ.Scheduler.Management",
                headers={
                    "AMQ_SCHEDULER_ACTION": "BROWSE",
                    "JMSReplyTo": browse_topic
                },
                id=self._connection_id,
                body=""
            )

        listen_start = time.time()
        last_receive = time.time()
        messages_received = 0
        scanning = True
        empty_receive = False
        while scanning:
            try:
                message = self._messages.pop()
                last_receive = time.time()
                if not peek:
                    self._ack(message)
                messages_received += 1
                yield message
            except IndexError:
                empty_receive = True
                time.sleep(0.1)

            if max and messages_received >= max:
                scanning = False
            elif timeout and time.time() > listen_start + timeout:
                scanning = False
            elif empty_receive and idle_timeout and time.time() > last_receive + idle_timeout:
                scanning = False
            else:
                scanning = True

        self._connection.unsubscribe(id=self._connection_id)

    def on_message(self, frame):
        destination = frame.headers.get("original-destination", frame.headers.get("destination"))
        schedule_id = frame.headers.get("scheduledJobId")
        message = Message(
            attributes=MessageAttributes(
                id=frame.headers["message-id"],
                schedule_id=schedule_id,
                timestamp=frame.headers["timestamp"],
                queue=destination.replace("/queue/", "")
            ),
            body=frame.body
        )
        self._messages.append(message)

    def _ack(self, message: Message):
        """
        Deletes the message from the queue.
        If the message has a schedule_id, also removes the associated scheduled job.
        """
        if message.attributes.schedule_id:
            self._connection.send(
                destination="ActiveMQ.Scheduler.Management",
                headers={
                    "AMQ_SCHEDULER_ACTION": "REMOVE",
                    "scheduledJobId": message.attributes.schedule_id
                },
                id=self._connection_id,
                body=""
            )
        self._connection.ack(message.attributes.id, subscription=self._connection_id)
In order to remove specific messages you need to know the ID, which you can get via a browse of the scheduled messages. The only other option available is to use the start and stop time options in the remove operations to remove all messages inside a given time range.
MessageProducer producer = session.createProducer(management);
Message request = session.createMessage();
request.setStringProperty(ScheduledMessage.AMQ_SCHEDULER_ACTION, ScheduledMessage.AMQ_SCHEDULER_ACTION_REMOVEALL);
request.setStringProperty(ScheduledMessage.AMQ_SCHEDULER_ACTION_START_TIME, Long.toString(start));
request.setStringProperty(ScheduledMessage.AMQ_SCHEDULER_ACTION_END_TIME, Long.toString(end));
producer.send(request);
If that doesn't suit your needs, I'm sure the project would welcome contributions.
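With a stomp.py client like the one in the question, the same time-ranged removal would look roughly like the sketch below. The ACTION_START_TIME / ACTION_END_TIME header names are assumed from the string values of the ScheduledMessage constants used above; this is an untested sketch, so verify them against your ActiveMQ version:
import time
import stomp

# Untested sketch of the Java snippet above, translated to stomp.py.
conn = stomp.Connection([("localhost", 61613)])   # placeholder broker address
conn.connect("admin", "admin", wait=True)         # placeholder credentials

now_ms = int(time.time() * 1000)
conn.send(
    destination="ActiveMQ.Scheduler.Management",
    body="",
    headers={
        "AMQ_SCHEDULER_ACTION": "REMOVEALL",
        "ACTION_START_TIME": str(now_ms),              # start of range, ms since epoch
        "ACTION_END_TIME": str(now_ms + 100_000_000),  # end of range
    },
)
conn.disconnect()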

How can I create reliable Flask-SQLAlchemy interactions with server-sent events?

I have a flask app that is functioning to expectations, and I am now trying to add a message notification section to my page. The difficulty I am having is that the database changes I am trying to rely upon do not seem to be updating in a timely fashion.
The HTML code is elementary:
<ul id="out" cols="85" rows="14">
</ul><br><br>
<script type="text/javascript">
    var ul = document.getElementById("out");
    var eventSource = new EventSource("/stream_game_channel");
    eventSource.onmessage = function(e) {
        ul.innerHTML += e.data + '<br>';
    };
</script>
Here is the message-write code that the second user is executing. I know the code block runs because the redis trigger is properly invoked:
msg_join = Messages(game_id=game_id[0],
                    type="gameStart",
                    msg_from=current_user.username,
                    msg_to="Everyone",
                    message=f'{current_user.username} has requested to join.')
db.session.add(msg_join)
db.session.commit()

channel = str(game_id[0]).zfill(5) + 'startGame'
session['channel'] = channel
date_time = datetime.utcnow().strftime("%Y/%m/%d %H:%M:%S")
redisChannel.set(channel, date_time)
Here is the flask stream code, which is correctly triggered by a new redis time, but when I pull the list of messages, the new message that the second user has added is not yet accessible:
@games.route('/stream_game_channel')
def stream_game_channel():
    @stream_with_context
    def eventStream():
        channel = session.get('channel')
        game_id = int(left(channel, 5))
        cnt = 0
        while cnt < 1000:
            print(f'cnt = 0 process running from: {current_user.username}')
            time.sleep(1)
            ntime = redisChannel.get(channel)
            if cnt == 0:
                msgs = db.session.query(Messages).filter(Messages.game_id == game_id)
                msg_list = [i.message for i in msgs]
                cnt += 1
                ltime = ntime
                lmsg_list = msg_list
                for i in msg_list:
                    yield "data: {}\n\n".format(i)
            elif ntime != ltime:
                print(f'cnt > 0 process running from: {current_user.username}')
                time.sleep(3)
                msgs = db.session.query(Messages).filter(Messages.game_id == game_id)
                msg_list = [i.message for i in msgs]
                new_messages = # need to write this code still
                ltime = ntime
                cnt += 1
                yield "data: {}\n\n".format(msg_list[len(msg_list)-len(lmsg_list)])
    return Response(eventStream(), mimetype="text/event-stream")
The problem I am running into is that msg_list ends up exactly the same length (i.e. the newly pushed message does not get read when I expect it to). Strangely, the second user's session appears to be accessing this information, because its stream correctly reflects the addition.
I am using an Amazon RDS MySQL database.
The solution was to issue a db.session.commit() before my db.session.query(Messages).filter(...), even where no writes were pending. This enabled an immediate read of data committed from a different user session, and my code then reacted to the change in message-list length properly.
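A minimal sketch of that change inside the eventStream() loop (Messages, redisChannel and game_id come from the code above; the commit before the query is the whole fix):
# Ending the session's current transaction before re-querying lets the next
# SELECT see rows committed by other sessions; with MySQL's default
# REPEATABLE READ isolation the old snapshot would otherwise be reused.
db.session.commit()   # db.session.rollback() would also end the read transaction
msgs = db.session.query(Messages).filter(Messages.game_id == game_id)
msg_list = [i.message for i in msgs]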

Apache Flume with 2 different interceptors on same source

I am trying to add 2 different interceptors on the same source and send the intercepted data to 2 different channels.
But I was not able to configure this, and couldn't find any documentation about it. I am also having some issues with the channel selectors: I am not sure how to select a channel for the different interceptors.
Here is my code so far:
a1.sources = syslog_udp
a1.channels = chan1 chan2
# sink1 and sink2 are different Kafka sinks
a1.sinks = sink1 sink2
a1.sources.syslog_udp.type = syslogudp
a1.sources.syslog_udp.port = 514
a1.sources.syslog_udp.host = 0.0.0.0
a1.sources.syslog_udp.keepFields = true
a1.sources.syslog_udp.interceptors = i1 i2
a1.sources.syslog_udp.interceptors.i1.type = regex_filter
a1.sources.syslog_udp.interceptors.i1.regex = '<regex_string1>'
a1.sources.syslog_udp.interceptors.i1.excludeEvents = false
a1.sources.syslog_udp.interceptors.i2.type = regex_filter
a1.sources.syslog_udp.interceptors.i2.regex = '<regex_string1>'|'<regex_string2>'
a1.sources.syslog_udp.interceptors.i2.excludeEvents = false
a1.sources.syslog_udp.selector.type = multiplexing
a1.sources.syslog_udp.channels = chan1 chan2
a1.channels.chan1.type = memory
a1.channels.chan1.capacity = 200
a1.channels.chan2.type = memory
a1.channels.chan2.capacity = 200
It seems there is no straightforward setup for this.
A work-around for this kind of layout is to have a single, wider interceptor in one agent, pipe its output to an Avro sink, and set up a second agent with an Avro source and the second interceptor on it, as sketched below.
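A rough sketch of that two-agent layout, assuming Avro forwarding on port 4545 (hosts, ports, the regex placeholders and the Kafka sink details are assumptions, not a tested configuration):
# Agent 1: original source, wider filter, forwards matching events to agent 2 over Avro
a1.sources = syslog_udp
a1.channels = chan1
a1.sinks = avro_fwd
a1.sources.syslog_udp.type = syslogudp
a1.sources.syslog_udp.port = 514
a1.sources.syslog_udp.channels = chan1
a1.sources.syslog_udp.interceptors = i1
a1.sources.syslog_udp.interceptors.i1.type = regex_filter
a1.sources.syslog_udp.interceptors.i1.regex = <regex_string1>
a1.sources.syslog_udp.interceptors.i1.excludeEvents = false
a1.channels.chan1.type = memory
a1.sinks.avro_fwd.type = avro
a1.sinks.avro_fwd.hostname = 127.0.0.1
a1.sinks.avro_fwd.port = 4545
a1.sinks.avro_fwd.channel = chan1

# Agent 2: Avro source, second interceptor, its own channel and Kafka sink
a2.sources = avro_in
a2.channels = chan2
a2.sinks = kafka_sink2
a2.sources.avro_in.type = avro
a2.sources.avro_in.bind = 0.0.0.0
a2.sources.avro_in.port = 4545
a2.sources.avro_in.channels = chan2
a2.sources.avro_in.interceptors = i2
a2.sources.avro_in.interceptors.i2.type = regex_filter
a2.sources.avro_in.interceptors.i2.regex = <regex_string2>
a2.sources.avro_in.interceptors.i2.excludeEvents = false
a2.channels.chan2.type = memory
# a2.sinks.kafka_sink2.* = ... (Kafka sink configured as in the original setup)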

How do you configure UDPInput to work with heka-flood udp test

I am trying to test sending data to heka's UdpInput with no success. I decided to try to use the heka-flood tool to mimic UDP traffic, also with no success. I am using version 0.10 of heka. My heka.toml:
[UdpInput]
address = "127.0.0.1:4880"
net = "udp"
splitter = "udp_splitter"
decoder = "ProtobufDecoder"
set_hostname = true
# I have also tried not setting this as well
[udp_splitter]
type = "HekaFramingSplitter"
[ProtobufDecoder]
[LogOutput]
type = "LogOutput"
message_matcher = "Logger == 'UdpInput'"
encoder = "PayloadEncoder"
and my flood.toml:
[udp_proto]
ip_address = "127.0.0.1:4880"
sender = "udp"
pprof_file = ""
encoder = "protobuf"
num_messages = 1000
corrupt_percentage = 0.0001
signed_percentage = 0.00011
variable_size_messages = false
ascii_only = true
max_message_size = 32000
If I add another input, say a log tailer, and add it to the message matcher for the LogOutput, those messages end up being logged out. I never see anything from the UdpInput. What am I doing wrong?

How to use regex_extractor selector and multiplexing interceptor together in flume?

I am testing Flume to load data into HBase and thinking about parallel data loading using Flume's selector and interceptor, because of the speed gap between source and sink.
So, what I want to do with Flume is:
create the Event's header with the interceptor's regex_extractor type,
multiplex Events by their header to two or more channels with the selector's multiplexing type,
in one source-channel-sink flow.
I tried the configuration below.
agent.sources = tailsrc
agent.channels = mem1 mem2
agent.sinks = std1 std2
agent.sources.tailsrc.type = exec
agent.sources.tailsrc.command = tail -F /home/flumeuser/test/in.txt
agent.sources.tailsrc.batchSize = 1
agent.sources.tailsrc.interceptors = i1
agent.sources.tailsrc.interceptors.i1.type = regex_extractor
agent.sources.tailsrc.interceptors.i1.regex = ^(\\d)
agent.sources.tailsrc.interceptors.i1.serializers = t1
agent.sources.tailsrc.interceptors.i1.serializers.t1.name = type
agent.sources.tailsrc.selector.type = multiplexing
agent.sources.tailsrc.selector.header = type
agent.sources.tailsrc.selector.mapping.1 = mem1
agent.sources.tailsrc.selector.mapping.2 = mem2
agent.sinks.std1.type = file_roll
agent.sinks.std1.channel = mem1
agent.sinks.std1.batchSize = 1
agent.sinks.std1.sink.directory = /var/log/flumeout/1
agent.sinks.std1.rollInterval = 0
agent.sinks.std2.type = file_roll
agent.sinks.std2.channel = mem2
agent.sinks.std2.batchSize = 1
agent.sinks.std2.sink.directory = /var/log/flumeout/2
agent.sinks.std2.rollInterval = 0
agent.channels.mem1.type = memory
agent.channels.mem1.capacity = 100
agent.channels.mem2.type = memory
agent.channels.mem2.capacity = 100
But it doesn't work!
When the selector part is removed, there are some interceptor debugging messages in Flume's log,
but when the selector and interceptor are configured together, there is nothing.
Is there a wrong expression or something I missed?
Thanks for reading. :)
I found it.
In the Flume log, there is a warning message as below:
2013-10-10 16:34:20,514 (conf-file-poller-0) [WARN - org.apache.flume.conf.FlumeConfiguration$AgentConfiguration.validateSources(FlumeConfiguration.java:571)] Removed tailsrc due to Failed to configure component!
So I added the line below:
agent.sources.tailsrc.channels = mem1 mem2
and then it works!
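For completeness, the corrected source section (the configuration above plus the added channels line) looks like:
agent.sources.tailsrc.type = exec
agent.sources.tailsrc.command = tail -F /home/flumeuser/test/in.txt
agent.sources.tailsrc.batchSize = 1
agent.sources.tailsrc.channels = mem1 mem2
agent.sources.tailsrc.interceptors = i1
agent.sources.tailsrc.interceptors.i1.type = regex_extractor
agent.sources.tailsrc.interceptors.i1.regex = ^(\\d)
agent.sources.tailsrc.interceptors.i1.serializers = t1
agent.sources.tailsrc.interceptors.i1.serializers.t1.name = type
agent.sources.tailsrc.selector.type = multiplexing
agent.sources.tailsrc.selector.header = type
agent.sources.tailsrc.selector.mapping.1 = mem1
agent.sources.tailsrc.selector.mapping.2 = mem2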