Redis how to match roundtrip request response - redis

I have an HTTP server application that receives requests from HTTP clients and puts them on a Redis List for processing. Another process listening on this list picks up the requests, processes them, and finally puts the responses on another Redis queue to be consumed by the HTTP server.
The sequence is like this:
(1) Http Client ==> Web app
(2) Web App ==> Redis Request Queue (List data structure)
(3) Processor ==> consumes requests using multiple threads and processes them
(4) Processor ==> puts to a Redis Response Queue (List data structure)
(5) Web App ==> has to pick the response from the response queue and deliver it to the HTTP clients
Given the above scenario, if multiple threads on the HTTP server are queueing messages to Redis, is there an established pattern to ensure that responses are picked up correctly and sent back to the HTTP clients on the right session?
I am planning to use Redis (or maybe RabbitMQ or ZeroMQ) for the producer/consumer, because I want to scale horizontally and configure many consumers spread across several nodes.
Thanks for pointing me a reasonable approach on this.
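One common way to match responses to requests in a setup like this is a correlation ID: the web app tags each request with a unique ID and then blocks on a reply list named after that ID, so each HTTP thread only ever sees its own response. The sketch below is only an illustration under stated assumptions. `FakeRedis` is a hypothetical in-memory stand-in that mimics the shape of the Redis LPUSH and BRPOP commands so the example runs without a server; the key names `requests` and `response:<id>`, and the upper-casing "work", are made up for the example. With a real Redis client the tuple would have to be serialized (for example as JSON) and the reply keys given an expiry.

```python
import queue
import threading
import uuid
from collections import defaultdict


class FakeRedis:
    """Hypothetical in-memory stand-in for the two list commands used here."""

    def __init__(self):
        self._lists = defaultdict(queue.Queue)
        self._lock = threading.Lock()

    def _get(self, key):
        with self._lock:  # guard creation of a new list
            return self._lists[key]

    def lpush(self, key, value):
        self._get(key).put(value)

    def brpop(self, key, timeout):
        # Like BRPOP: returns (key, value), or None when the timeout expires.
        try:
            return key, self._get(key).get(timeout=timeout)
        except queue.Empty:
            return None


def handle_http_request(r, body, timeout=5):
    """Web app side: enqueue the request, then wait on a private reply list."""
    correlation_id = uuid.uuid4().hex
    r.lpush("requests", (correlation_id, body))
    reply = r.brpop("response:" + correlation_id, timeout)
    if reply is None:
        raise TimeoutError("no response for " + correlation_id)
    return reply[1]


def processor_loop(r, stop):
    """Processor side: reply on the list named after the request's ID."""
    while not stop.is_set():
        item = r.brpop("requests", timeout=0.1)
        if item is None:
            continue
        correlation_id, body = item[1]
        r.lpush("response:" + correlation_id, body.upper())  # placeholder work
```

Because every request gets its own reply list, any number of processors on any number of nodes can consume from `requests` without the web threads ever receiving each other's responses.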

Related

Can RabbitMQ cluster be used as a single endpoint by application?

There are three nodes in a RabbitMQ cluster as below.
Within RabbitMQ, there are two queues, q1 and q2.
The master replicas of q1 and q2 are located on different nodes. Both queues are mirrored to the other nodes.
There is a load balancer in front of three nodes.
AMQP (node port 5672) and the Management HTTP API (node port 15672) are exposed by the load balancer.
When the application establishes a connection through the load balancer, it may reach any of the RabbitMQ nodes behind it, and this is invisible to the application.
Question:
Is it ok for the application to consume both queues in a single AMQP channel over a single connection, no matter which RabbitMQ node it reaches?
Is it ok for the application to call the Management HTTP API no matter which RabbitMQ node its request hits?
When RabbitMQ is set up as a cluster and your queues are mirrored across the nodes, it doesn't matter which node you are connected to, because RabbitMQ internally routes operations on a queue to the node hosting that queue's master. So a request to publish to or consume from queue q1 will be routed to Node #1.
Answers to your questions:
It is not advisable to consume more than one queue over a single AMQP connection. An exception in one consumer may cause the connection to close, which would interrupt the other one.
It is ok for the application to call the Management HTTP API no matter which RabbitMQ node its request hits. Once the management plugin is enabled in a RabbitMQ cluster, all the nodes will accept Management HTTP API requests.
Reference: https://www.rabbitmq.com/clustering.html

Configuring Flume with multiple HTTP sources in a cluster

How can I configure Apache Flume to listen to multiple HTTP sources in a cluster with multiple flume agents?
My flume agent is configured as follows:
agent1.sources.httpSource_1.type = http
...
agent1.sources.httpSource_1.port = 8081
agent1.sources.httpSource_2.type = http
...
agent1.sources.httpSource_2.port = 8082
agent1.sources.httpSource_3.type = http
...
agent1.sources.httpSource_3.port = 8083
Let's assume I have 5 servers in my cluster. Which address should I send my HTTP POST message to in order to reach all 5 servers?
For example, if I send an HTTP POST message to <server_dns_1>:8081, then only agent1 will process it, if I understand correctly.
How can I use all of my cluster servers, and which address should I send my HTTP requests to?
Cantroid, the way you have configured Flume, only one agent (agent1) will be run. This agent will internally run 5 listening threads.
That said, there is no way for a single HTTP POST to send a message to all 5 listening threads (or 5 agents, if you eventually split your single agent into 5), unless you use some load balancing software or some "broadcasting" magic at the network level (I'm not an expert on that).
Nevertheless, if the reason for having 5 listening ports is that you want to perform 5 different data treatments, then you can create a single agent listening on a single HTTP port and then create 5 different channels, each read by a different sink. The key point of this architecture is that the default channel selector is the replicating one, i.e. the single listening source puts a copy of the same event into each of the 5 channels.

How can I use RabbitMQ between two applications when I can't change one of them?

I have an existing system consisting of two nodes, a client/server model.
I want to exchange messages between them using RabbitMQ, i.e. the client would send all its requests to RabbitMQ, and the server would listen to the queue indefinitely, consume any messages that arrive, and then act upon them.
I can change the server as needed, but my problem is that I cannot change the client's behavior. How can I send the response back to the client?
The client node only understands HTTP request/response. What should I do after configuring the other application (the server) to talk to RabbitMQ instead of to my app directly?
You can use the RPC model or some internal convention, like storing the result in a database (or cache) under a known id and polling your storage for that result in a loop.
You will have to put a proxy server in between that looks to node 1 (the client you cannot change) like the actual server, while it just injects requests into the queuing server. You will also have to use 2 queues.
For clarity, let's enumerate the system players:
The client
The proxy server, a server that offers the same API as the actual server (but doesn't do any of the work)
The actual server, the server that does the actual work
The input queue, the queue where client requests go (the proxy server puts them there)
The output queue, the queue where server responses go (the actual server puts them there)
A possible working scenario:
A client sends a request to the proxy server
The proxy server puts the request in the input queue
The actual server (listening to the input queue) will fetch the request
The actual server processes the message
The actual server sends the response to the output queue
The proxy server (listening to the output queue) will fetch the response
The proxy server returns the response to the client
This might work, but a few problems could arise. For example, because the proxy server doesn't know when the actual server will respond, and cannot be sure of the order of responses in the output queue, it may have to re-inject the messages it finds irrelevant into the output queue until it finds the correct one.
Or the proxy server might need to deliver the response to the client later via an HTTP request to the client. That is, the client would expect no response to the request it sent, knowing that it will get the answer later via a request from the proxy server.
I'm not aware of the situation at your end, but this might work!
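The re-injection problem described above can be avoided if every request carries a correlation ID and the proxy keeps a table of waiting requests. The following is a minimal sketch, not a RabbitMQ implementation: the two queues are plain in-process `queue.Queue` objects standing in for the input and output queues, and the names `Proxy`, `handle` and `actual_server` (which simply reverses the payload as placeholder work) are hypothetical. A single dispatcher thread reads the output queue and hands each response to whichever request is waiting for that ID.

```python
import queue
import threading
import uuid
from concurrent.futures import Future


class Proxy:
    """Looks like the real server to the client; only forwards via queues."""

    def __init__(self, input_queue, output_queue):
        self.input_queue = input_queue
        self.output_queue = output_queue
        self._pending = {}  # correlation ID -> Future awaiting the response
        self._lock = threading.Lock()
        threading.Thread(target=self._dispatch, daemon=True).start()

    def handle(self, payload, timeout=5):
        """Called once per client request; blocks until the matching reply."""
        correlation_id = uuid.uuid4().hex
        future = Future()
        with self._lock:
            self._pending[correlation_id] = future
        self.input_queue.put((correlation_id, payload))
        try:
            return future.result(timeout=timeout)
        finally:
            with self._lock:
                self._pending.pop(correlation_id, None)

    def _dispatch(self):
        """Single reader of the output queue: route each reply by its ID."""
        while True:
            correlation_id, result = self.output_queue.get()
            with self._lock:
                future = self._pending.get(correlation_id)
            if future is not None:  # replies to timed-out requests are dropped
                future.set_result(result)


def actual_server(input_queue, output_queue):
    """Does the real work and echoes the correlation ID back with the result."""
    while True:
        correlation_id, payload = input_queue.get()
        output_queue.put((correlation_id, payload[::-1]))  # placeholder work
```

With this arrangement the order of responses in the output queue no longer matters, since nothing is matched by position. RabbitMQ's RPC pattern follows the same idea using the `correlation_id` and `reply_to` message properties.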

TCP server test

I have a TCP server (listener) written in C#. Many devices (approximately 5000) will connect to the server asynchronously and send/receive messages to/from it. I have 2 questions.
I have to send a reply to every received message. Which approach should I use: asynchronous (replying as soon as a message is received) or synchronous (sending replies from a separate reply task)?
How can I stress test my server? I can communicate with 1-2 computers successfully, but I don't know whether my software will work for 5000 devices.
Judging from what you're saying, your server or listener is expected to be available to respond to multiple requests at any given time. The key is how it has been implemented. Can the server fulfill the requests of multiple clients at the same time, for example by using multiple threads? Does it use a queue to keep track of all requests and then serve them in order? Or does it use some other method to serve requests?
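For the load-testing question, one option is a small script that opens many simulated device connections at once and checks that every message gets its reply. The sketch below is written in Python rather than C# and makes several assumptions: a line-based protocol, an `ACK ` reply prefix, and a built-in stand-in server (`echo_handler`) so the script runs on its own. To test a real server, the stand-in would be dropped and `simulated_device` pointed at the real host and port, with the reply check adapted to the real protocol.

```python
import asyncio


async def echo_handler(reader, writer):
    """Stand-in server: replies to each line as soon as it is received."""
    while True:
        line = await reader.readline()
        if not line:  # client closed the connection
            break
        writer.write(b"ACK " + line)
        await writer.drain()
    writer.close()


async def simulated_device(host, port, device_id, messages):
    """One fake device: sends messages and counts the correct replies."""
    reader, writer = await asyncio.open_connection(host, port)
    ok = 0
    for n in range(messages):
        payload = f"{device_id}:{n}\n".encode()
        writer.write(payload)
        await writer.drain()
        reply = await reader.readline()
        if reply == b"ACK " + payload:
            ok += 1
    writer.close()
    await writer.wait_closed()
    return ok


async def run_load_test(devices=200, messages=5):
    """Starts the stand-in server on a free port and connects all devices."""
    server = await asyncio.start_server(echo_handler, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        counts = await asyncio.gather(
            *(simulated_device("127.0.0.1", port, i, messages)
              for i in range(devices))
        )
    return sum(counts)


if __name__ == "__main__":
    total = asyncio.run(run_load_test())
    print("correct replies:", total)
```

Reaching 5000 concurrent connections from one machine usually requires raising the per-process open file limit, or spreading the simulated devices across several test machines.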

Getting result of a long running task with RabbitMQ

I have a scenario where a client sends an HTTP request to download a file. The file needs to be generated dynamically, which typically takes 5-15 seconds. Therefore I am looking into a solution that splits this operation into 3 HTTP requests.
First request triggers the generation of the file.
The client polls the server every 5 seconds to check whether the file is ready to download.
When the response to the poll request is positive, the client starts downloading the file.
To implement this I am looking into message queue solutions like RabbitMQ. They seem to provide a reliable framework for running long tasks asynchronously. However, after reading the tutorials on RabbitMQ, I am not sure how I will receive the result of the operation.
Here is what I've in mind:
A front end server receives requests from clients and posts messages to RabbitMQ as required. This front end server will have 3 endpoints:
/generate
/poll
/download
When the client invokes /generate with a GET parameter, say request_uid=AAA, the front end server will post a message to RabbitMQ with the request_uid in the payload. Any free worker will subsequently receive this message and start generating the file corresponding to AAA.
The client will keep polling /poll with request_uid=AAA to check whether the task is complete.
When the task is complete, the client will call /download with request_uid=AAA, expecting to download the file.
The question is: how will the /poll and /download handlers of the front end server learn the status of the file generation job? How can RabbitMQ communicate the result of the task back to the producer? Or do I have to implement such a mechanism outside RabbitMQ (e.g. the consumer putting its results in a file /var/completed/AAA)?
The easiest way to get started with AMQP is to use a topic exchange and to create queues that carry control messages. For instance, you could have a file.ready queue to which you send a message with the file pathname when the file is ready to pick up, and a file.error queue to report when you were unable to create a file for some reason. The client could then use a file.generate queue to send the GET information to the server.
You hit the nail on the head with your last line:
(Consumer putting its results in a file /var/completed/AAA)
Your server has to coordinate multiple jobs and the results of their work. Therefore you will need some form of "master repository" which contains an authoritative record of what has been finished already. Copying completed files into a special directory is a reasonable and simple way of doing exactly that.
This doesn't necessarily require RabbitMQ or any messaging solution, either. Your server can farm out jobs to the workers any way it wishes: by spawning processes, by using a thread pool, or indeed by producing AMQP messages which end up in a broker and are picked up by "worker" queue consumers. It's up to your application and what is most appropriate for it.
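As a rough illustration of the "master repository" idea with a thread pool instead of a broker, the sketch below keeps an in-memory record of each job's state that all three endpoints consult. Everything here is an assumption made for the example: the `JobStore` class, the plain functions standing in for the /generate, /poll and /download handlers, and the short `time.sleep` that replaces the real 5-15 second file generation. In a multi-process deployment the record would have to live somewhere shared, such as a database, a cache, or the completed-files directory mentioned above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class JobStore:
    """Authoritative record of job state, shared by all three endpoints."""

    def __init__(self):
        self._lock = threading.Lock()
        self._status = {}  # request_uid -> "pending" | "ready" | "error"
        self._result = {}  # request_uid -> generated file content

    def set(self, uid, status, result=None):
        with self._lock:
            self._status[uid] = status
            if result is not None:
                self._result[uid] = result

    def status(self, uid):
        with self._lock:
            return self._status.get(uid, "unknown")

    def result(self, uid):
        with self._lock:
            return self._result[uid]


store = JobStore()
workers = ThreadPoolExecutor(max_workers=4)


def build_file(uid):
    """Worker: generates the file and records the outcome in the store."""
    try:
        time.sleep(0.1)  # placeholder for the real file generation
        store.set(uid, "ready", ("contents for " + uid).encode())
    except Exception:
        store.set(uid, "error")


def generate(uid):
    """/generate: record the job as pending and hand it to a worker."""
    store.set(uid, "pending")
    workers.submit(build_file, uid)


def poll(uid):
    """/poll: report the recorded state of the job."""
    return store.status(uid)


def download(uid):
    """/download: return the file only once the job is marked ready."""
    if store.status(uid) != "ready":
        raise LookupError(uid + " is not ready")
    return store.result(uid)
```

The handlers never talk to the workers directly; they only read the store, which is why the same structure still works if the thread pool is later replaced by queue consumers on other machines.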