Do Redis streams scale when creating a new stream per client?

I am attempting to build a microservice system in which clients connect to a service A over TCP. User interaction or other events trigger a variety of actions on other services in the system (say B, C, D, etc.), and I need to propagate the results from B, C, and D back to service A so that it can return them to the client.
Since many of these services perform long-lived actions, does it make sense to use Redis streams as a buffer that stores results from B, C, and D until A propagates them to the client? Given that a separate Redis key is used for each client, will this scale to thousands of connections? Is Redis the right choice for 1:1 event propagation like this?
Kafka seems like a bad choice because every consumer is delivered every single message. Would it make sense to use something like ActiveMQ instead?

Related

Can AMQP messages be sent both to a topic and with a TTL/expiration?

I'm using RabbitMQ and the amiquip Rust crate to build out several services that will be processing some data in multiple steps. Roughly, this might look like:
Service A ingests data from external source, publishes its results to Topic A
Service B subscribes to Topic A, does some processing, publishes results to Topic B
Service C subscribes to Topic B, does some processing, publishes results to Topic C
At each step along the way, the data are further refined. I will need to be able to shut down individual services for maintenance without missing the messages they read (e.g., Service B may be taken down briefly, but the messages published by Service A to Topic A must remain in the queue until Service B comes back online). I am okay with setting some TTL/expiration (not sure what the right terminology is for AMQP); for example, if Service B doesn't come back online within 5 minutes, it's okay if messages published to the topic are lost.
Additionally, there may be another service that should also be able to subscribe to a topic without interfering with another service reading it. For example, Service C2 gets a copy of all messages in Topic B and does something with them; every message read by Service C2 is also read by Service C (no stepping on each other's feet).
I don't know the right terminology used here, so I'm at a bit of a loss for what I should be looking for. Is this possible with AMQP & RabbitMQ?

How to scale Redis Queue

We are shifting from a monolithic to a microservice architecture for our e-commerce marketplace application. We chose Redis pub/sub for microservice-to-microservice communication and also for some push notifications. The push notification strategy is as follows:
Whenever an order is created (i.e., a customer creates an order), the backend publishes an event to the respective channel (queue), and the push-notification microservice consumes this event (a JSON message) and sends a push notification to the seller's mobile.
For the time being we are running redis-server on our Ubuntu machine without any hassle. The headache comes in the future, when millions of orders may be generated at the same time: how can we handle that situation? That means we need to scale the Redis queue, right?
My exact question (regardless of the above scenario) is:
How can I scale a Redis queue horizontally instead of adding RAM to the same machine?
"Whenever an order is created (i.e., a customer creates an order), the backend publishes an event to the respective channel (queue), and the push-notification microservice consumes this event (a JSON message) and sends a push notification to the seller's mobile."
IIUC, you're sending messages over Redis Pub/Sub, which is not durable: a message is delivered only to the subscribers connected when it is published. If the producer is up while a consumer is down, that consumer loses every message sent during its downtime.
Now let's assume you're using a Redis LIST, combined with other data structures, to solve the missing-events issue.
Scaling a Redis queue is a little tricky because the entire queue is stored in one list, which resides on a single Redis host. What you can do is create your own partitioning scheme and design your Redis keys around it, much as Redis Cluster distributes keys internally when a new master is added. Implementing consistent hashing yourself would take some effort.
A very simple approach is to distribute the load by userId: for example, if userId is between 0 and 999 use queue_0, for 1000-1999 use queue_1, and so on. This is a manual process that you can automate with a script. Whenever a new queue is added to the set, all consumers have to be notified and the publisher has to be updated as well.
Dividing by number ranges is a range partitioning scheme; you can use a hash partitioning scheme as well. With either scheme, consumers must be notified whenever a new queue is added to the queue set so that they can spawn a new worker for it. Removing a queue can be tricky, because all consumers must first drain that queue.
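As a rough illustration of the two partitioning schemes, here is a minimal Python sketch. The function names, the range size of 1000, and the choice of CRC32 as the hash are assumptions made for the example, not something the setup above prescribes. Each range is treated as half-open, so userId 1000 falls into queue_1.

```python
import zlib


def range_queue(user_id: int, range_size: int = 1000) -> str:
    """Range partitioning: IDs 0-999 -> queue_0, 1000-1999 -> queue_1, ..."""
    if user_id < 0:
        raise ValueError("user_id must be non-negative")
    return f"queue_{user_id // range_size}"


def hash_queue(user_id: int, num_queues: int) -> str:
    """Hash partitioning: a stable hash of the ID modulo the queue count."""
    if num_queues < 1:
        raise ValueError("num_queues must be at least 1")
    # CRC32 is used because it is stable across processes, unlike hash().
    digest = zlib.crc32(str(user_id).encode("utf-8"))
    return f"queue_{digest % num_queues}"
```

Note that with plain modulo hashing, changing num_queues remaps most IDs to a different queue, which is one reason adding or removing a queue needs the coordination described above.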
You might consider using Rqueue

Redis Stale Data

I'm new to Redis. I'm designing a pub/sub mechanism in which there is a specific channel for every client (business client) that has at least one user (browser) connected. Those users then receive the information of the client to which they belong.
I need Redis because I have a distributed system: there is a backend that pushes data to the corresponding client channels, and there is a webapp with its own server (multiple instances) that holds the users' connections (websockets).
To summarize:
The backend is the publisher and webapp server is the subscriber
A Client has multiple Users
One channel per Client with at least 1 User connected
If Client doesn't have connected Users, then no channel exists
Backend pushes data to every existing Client channel
Webapp Server consumes data only from the Client channels that correspond to the Users connected to itself.
So, in order to reduce work, I don't want my Backend to push data for Clients that have no Users connected. It seems that I need a way to share the list of connected Users from my Webapp with my Backend, so that the Backend can decide which Clients' data to push to Redis. The obvious place to share that piece of data would be the same Redis instance.
My approach is to have a key in Redis with something like this:
[USERS: User1/ClientA/WebappServer1, User2/ClientB/WebappServer1, User3/ClientA/WebappServer2]
So here comes my question...
How can I avoid stale data if, for example, one of my Webapp nodes crashes and doesn't get the chance to remove its list of connected Users from Redis?
Thanks a lot!
Firstly, good luck with the overall project - sounds challenging and fun :)
I'd use a slightly different design to keep track of my users - have each Client/Webapp maintain a set (possibly sorted with login time as score) of their users. Set a TTL for the set and have the client/webapp reset it periodically, or it will expire if the owning process crashes.
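To make the idea concrete, here is a minimal in-memory Python sketch of the "set with a TTL that the owner keeps refreshing" pattern. It does not talk to Redis: the class ExpiringUserSets, its method names, and the 30-second TTL are hypothetical stand-ins for one Redis set per webapp node that is refreshed with SADD plus EXPIRE. The clock is injectable so the expiry behaviour can be followed without waiting.

```python
import time
from typing import Callable, Dict, Set, Tuple


class ExpiringUserSets:
    """In-memory stand-in for one Redis set per webapp node, each with a TTL."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        # node name -> (expiry timestamp, users connected to that node)
        self._sets: Dict[str, Tuple[float, Set[str]]] = {}

    def heartbeat(self, node: str, users: Set[str]) -> None:
        """Rewrite the node's set and reset its TTL (like SADD then EXPIRE)."""
        self._sets[node] = (self._clock() + self._ttl, set(users))

    def connected_users(self) -> Set[str]:
        """Union of users over all nodes whose TTL has not lapsed."""
        now = self._clock()
        # Drop sets whose owner stopped refreshing them, as key expiry would.
        self._sets = {n: (exp, u) for n, (exp, u) in self._sets.items() if exp > now}
        result: Set[str] = set()
        for _, users in self._sets.values():
            result |= users
        return result


# Usage: web1 "crashes" (stops sending heartbeats) while web2 keeps refreshing.
now = [0.0]
registry = ExpiringUserSets(ttl_seconds=30, clock=lambda: now[0])
registry.heartbeat("web1", {"User1/ClientA", "User2/ClientB"})
registry.heartbeat("web2", {"User3/ClientA"})
now[0] = 20.0
registry.heartbeat("web2", {"User3/ClientA"})
now[0] = 40.0
live_users = registry.connected_users()  # only web2's users remain
```

The TTL has to be comfortably longer than the heartbeat interval, otherwise a healthy node's set can expire between two refreshes.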

REST, WCF and Queues

I created a RESTful service using WCF which calculates some value and then returns a response to the client.
I am expecting a lot of traffic, so I am not sure whether I need to implement queues manually or whether that is unnecessary for processing all client requests.
Specifically, I am receiving measurements from clients that have to be stored in the database. Each client sends a measurement every 200 ms, so with multiple clients there could be a lot of requests.
The other operation is a calculation on the received data. For example, a client could send the instruction "give me the average of the last 200 measurements"; calculating this value could take some time, and in the meantime the same request could come from another client.
I would be very thankful if anyone could give any advice on how to create a reliable service using WCF.
Thanks!
You could use the MsmqBinding and utilize the method implemented by eedsi9n. However, from what I gather from this post, you're looking for something along the lines of a pub/sub architecture.
This can be implemented with the WSDualHttpBinding which allows subscribers to subscribe to events. The publisher will then notify the user when the action is completed.
Therefore you could have MSMQ running behind the scenes. The client subscribes to certain events, then perhaps publishes a message that needs to be processed. The client carries on with its own work (because it's all async), and when the publisher is done working on the message it can publish an event (the event your client subscribed to) letting you know that it's done. That way you don't have to implement a polling strategy.
There are pre-canned solutions for this as well, such as NServiceBus, MassTransit, and Rhino Service Bus.
If you are using a web service, the Transmission Control Protocol (TCP) will act as a queue to a certain degree.
"TCP provides reliable, ordered delivery of a stream of bytes from one program on one computer to another program on another computer."
This guarantees that if the client sends packets A, B, then C over the same connection, the server will receive them in that order: A, B, then C. If you must reply to the client in the same order as the requests, then you might need a queue.
By default, the maximum number of ASP.NET worker threads is set to 12 per CPU core, so a dual-core machine can serve 24 connections at a time. Depending on how long the calculation takes and what you mean by "a lot of traffic", you could try different strategies.
The simplest one is to use serviceTimeouts and serviceThrottling and only handle what you can handle, and reject the ones you can't.
If that's not an option, increase hardware. That's the second option.
Finally, you could make the service completely asynchronous. Implement two methods:
string PostCalc(...) and double GetCalc(string id). PostCalc accepts the parameters, stuffs them into a queue (or a database), and returns a GUID immediately (I like using string instead of Guid). The client can use the returned GUID as a claim ticket and call GetCalc(string id) every few seconds; if the calculation has not finished yet, you can return 404 for REST. The calculation must now be done by a separate process that monitors the queue.
The third option is the most complicated, but its outcome is similar to that of the first option: it puts a cap on incoming requests.
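The claim-ticket idea behind PostCalc and GetCalc can be sketched in a few lines. The sketch below is in Python rather than WCF/C#, uses an in-process queue and a worker thread in place of a separate process, and picks "average of the measurements" as the calculation; all of these, along with the names post_calc, get_calc, and worker, are assumptions made for illustration. Returning None from get_calc corresponds to the "not finished yet" (404) case.

```python
import queue
import threading
import uuid
from typing import Dict, List, Optional

jobs: queue.Queue = queue.Queue()      # pending (ticket, measurements) pairs
results: Dict[str, float] = {}         # finished calculations by ticket
results_lock = threading.Lock()


def post_calc(measurements: List[float]) -> str:
    """Enqueue the work and return a claim ticket immediately."""
    if not measurements:
        raise ValueError("at least one measurement is required")
    ticket = str(uuid.uuid4())
    jobs.put((ticket, list(measurements)))
    return ticket


def get_calc(ticket: str) -> Optional[float]:
    """Return the result, or None if it is unknown or not finished yet."""
    with results_lock:
        return results.get(ticket)


def worker() -> None:
    """Drain the queue; stands in for the separate calculating process."""
    while True:
        ticket, measurements = jobs.get()
        average = sum(measurements) / len(measurements)
        with results_lock:
            results[ticket] = average
        jobs.task_done()


threading.Thread(target=worker, daemon=True).start()
```

A client would call post_calc once, keep the ticket, and poll get_calc until it returns a value. In a real service the results would also need an expiry policy so that unclaimed tickets do not accumulate.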
It will depend on what you mean by "calculates some value" and "a lot of traffic". You could do some load testing and see how the #requests/second evolves with the traffic.
There's nothing WCF-specific here if you are RESTful:
the GET for an average would return a URI where the answer will be available once the server finishes calculating (if it is indeed a long operation).
Regarding the measurements, you didn't specify the freshness needed (i.e., when you get a request for an average, how fresh do the results need to be?). You also did not specify the relative frequency of queries vs. new measurements.
In any event, you can (and IMHO should) use a queue behind the endpoint, assuming that measuring your performance shows it is needed. If you change the WCF binding you might still be RESTful, but you will not benefit from the standards-based approach of REST over HTTP.

How does WCF Reliable Sessions affect message ordering?

One of the things that the Microsoft documentation says about enabling reliable sessions is that the service will be able to process messages in the order that they were received.
Does this mean that messages within a single session are processed in order? Or does it mean that all messages for all sessions within the service are processed in order?
I know that netTcpBinding is reliable already, without enabling reliable sessions. However, say you use something like WsDualHttpBinding without reliable sessions enabled... is it possible that if the client sends request A and then sends request B that the service might receive B before A? Or does it mean that if client A sends message A and client B sends message B, that I might process B before A?
The service might receive B before A, but reliable sessions will place the messages in a buffer and only process them in the order they were sent within the session. It will not guarantee order between different sessions, only within the same session that is created by the client.
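The buffering behaviour described here can be illustrated with a small per-session reorder buffer. This is a conceptual Python sketch, not WCF's actual implementation: the class name SessionReorderBuffer, the receive method, and the convention that sequence numbers start at 1 are assumptions for the example.

```python
from typing import Dict, List


class SessionReorderBuffer:
    """Releases messages in sequence order, independently for each session."""

    def __init__(self) -> None:
        self._next_seq: Dict[str, int] = {}           # session -> next expected number
        self._pending: Dict[str, Dict[int, str]] = {} # session -> buffered messages

    def receive(self, session_id: str, seq: int, payload: str) -> List[str]:
        """Buffer one message; return the payloads that are now ready, in order."""
        expected = self._next_seq.setdefault(session_id, 1)
        pending = self._pending.setdefault(session_id, {})
        if seq >= expected:  # ignore duplicates of already released messages
            pending[seq] = payload
        ready: List[str] = []
        while expected in pending:
            ready.append(pending.pop(expected))
            expected += 1
        self._next_seq[session_id] = expected
        return ready


buffer = SessionReorderBuffer()
first = buffer.receive("session-1", 2, "B")   # arrives early, is held back
second = buffer.receive("session-1", 1, "A")  # releases A, then B
other = buffer.receive("session-2", 1, "X")   # a different session is unaffected
```

Because each session has its own counter and buffer, a gap in one session never delays another, which matches the point that ordering holds only within a session.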