I have a class which listens to events coming from a socket at a very fast pace. I would like to feed these events into a coroutine Channel. The following code is used:
class MyClass(channel: Channel<String>) : ... {
...
override onMessageReceived(message: String) {
MyScope.launch {
channel.send(message)
}
}
}
This does not work since sometimes the events come in so fast that they end up getting posted out of order due to the launch spawning a new coroutine and everything happening in parallel. How can I ensure the order of the send is synchronous?
I tried newSingleThreadContext which did work however it is considered experimental and has a note saying it will be removed eventually. I am looking for a more solution that is more correct and complete.
Instead of launching the sends in parallel, you should use a Channel with a capacity of Channel.UNLIMITED, and have onMessageReceived use offer instead of send.
This is a lot cheaper than launching a new job for each send, and the channel will preserve the order
Related
I'm learning kotlin coroutines and flows and one thing is a little bit obscure to me. In case I have a long running loop for the regular coroutines I can use isActive or ensureActive to handle cancelation. However those are not defined for a flow but nevertheless the following code properly finishes the flow:
import kotlinx.coroutines.Dispatchers
import kotlinx.coroutines.flow.*
import kotlinx.coroutines.runBlocking
import org.slf4j.LoggerFactory
private val logger = LoggerFactory.getLogger("Main")
fun main() {
val producer = FlowProducer()
runBlocking {
producer
.produce()
.take(10)
.collect {
logger.info("Received $it")
}
}
logger.info("done")
}
class FlowProducer {
fun produce() = flow {
try {
var counter = 1
while (true) {
logger.info("Before emit")
emit(counter++)
logger.info("After emit")
}
}finally {
logger.info("Producer has finished")
}
}.flowOn(Dispatchers.IO)
}
Why is that a case? Is it because the emit is a suspendable function that handles cancelation for me? What to do in case the emit is called conditionally? For example that loop actually polls records from Kafka and calls emit only when the received records are not empty. Then we can have the situation that:
We want 10 messages(take 10)
Actually there are only 10 messages on the kafka topic
Since there are no more messages the emit won't be called again and therefore even though we received all messages we want, the loop will continue to waste resources on unnecessary polling.
Not sure if my understanding is correct. Should I call yield() on each loop in such case?
The important thing to remember here is that flows are "cold", at least in their simple form. What that means is that a flow isn't capable of doing any work except while you are actively consuming data from it. A cold flow doesn't have a coroutine associated with it. You can learn a little more from this blog post by Roman Elizarov.
When you call collect on a flow, control is tranferred from the collector to the flow. This is what enables the flow to do work. The collector is effectively executing the code inside the flow. When the flow calls emit, control transfers back to the collector. If you're familiar with Kotlin's sequence builder, you can think of flows very similarly.
By definition, this means that if you stop collecting the flow, the flow stops doing any work. In your case, because you used take(10), the collector will stop executing the flow once it has received ten items. Because the collector is the thing that's actually executing the loop inside the flow, the loop doesn't continue to run when the collector is no longer collecting. Once you stop using the flow, it's just like an iterator that's no longer being iterated over. It can be garbage collected like any other object.
You asked whether you should call yield() inside your flow. There are some situations where this could be useful, and you can read more about flow cancellation checks in the docs. In your case, it's not necessary, because:
The cancellation checks are only needed to detect when something has cancelled the coroutine that is executing the flow. When the flow aborts itself, such as when take(10) has emitted 10 items, it simply terminates normally, without cancelling any coroutines.
The flow is built using emit, which already checks for cancellation.
Even when cancellation checks aren't required, it's still possible to create a flow that runs forever. As mentioned above, control only transfers back to the collector each time the flow calls emit. So if your flow runs indefinitely without calling emit, it will never return control back to the collector. This is the same as writing an infinite loop in normal code, and isn't particularly special to flows.
Note that it is possible to create a hot flow that has a coroutine doing work in the background. In that case, you would need to make sure that the coroutine responds correctly to cancellation of the flow.
Yes, emit will throw CancellationException when take cancels the flow.
The Kafka example you give will actually work, because take will cancel the flow at the end of the 10th emit, not at the start of the 11th.
Im using Spring Webflux (with spring-reactor-netty) 2.1.0.RC1 and Lettuce 5.1.1.RELEASE.
When I invoke any Redis operation using the Reactive Lettuce API the execution always switches to the same individual thread (lettuce-nioEventLoop-4-1).
That is leading to poor performance since all the execution is getting bottlenecked in that single thread.
I know I could use publishOn every time I call Redis to switch to another thread, but that is error prone and still not optimal.
Is there any way to improve that? I see that Lettuce provides the ClientResources class to customize the Thread allocation but I could not find any way to integrate that with Spring webflux.
Besides, wouldn't the current behaviour be dangerous for a careless developer? Maybe the defaults should be tuned a little. I suppose the ideal scenario would be if Lettuce could just reuse the same event loop from webflux.
I'm adding this spring boot single class snippet that can be used to reproduce what I'm describing:
#SpringBootApplication
public class ReactiveApplication {
public static void main(String[] args) {
SpringApplication.run(ReactiveApplication.class, args);
}
}
#Controller
class TestController {
private final RedisReactiveCommands<String, String> redis = RedisClient.create("redis://localhost:6379").connect().reactive();
#RequestMapping("/test")
public Mono<Void> test() {
return redis.exists("key")
.doOnSubscribe(subscription -> System.out.println("\nonSubscribe called on thread " + Thread.currentThread().getName()))
.doOnNext(aLong -> System.out.println("onNext called on thread " + Thread.currentThread().getName()))
.then();
}
}
If I keep calling the /test endpoint I get the following output:
onSubscribe called on thread reactor-http-nio-2
onNext called on thread lettuce-nioEventLoop-4-1
onSubscribe called on thread reactor-http-nio-3
onNext called on thread lettuce-nioEventLoop-4-1
onSubscribe called on thread reactor-http-nio-4
onNext called on thread lettuce-nioEventLoop-4-1
That's an excellent question!
The TL;DR;
Lettuce always publishes using the I/O thread that is bound to the netty channel. This may or may not be suitable for your workload.
The Longer Read
Redis is single-threaded, so it makes sense to keep a single TCP connection. Netty's threading model is that all I/O work is handled by the EventLoop thread that is bound to the channel. Because of this constellation, you receive all reactive signals on the same thread. It makes sense to benchmark the impact using various reactive sequences with various options.
A different usage scheme (i.e. using pooled connections) is something that changes directly the observed results as pooling uses different connections and so notifications are received on different threads.
Another alternative could be to provide an ExecutorService just for response signals (data, error, completion). In some scenarios, the cost of context switching can be neglected because of the removing congestion in the I/O thread. In other scenarios, the context switching cost might be more notable.
You can already observe the same behavior with WebFlux: Every incoming connection is a new connection, and so it's handled by a different inbound EventLoop thread. Reusing the same EventLoop thread for outbound notification (that one, that was used for inbound notifications) happens quite late when writing the HTTP response to the channel.
This duality of responsibilities (completing a command, performing I/O) can experience some gravity towards a more computation-heavy workload which drags performance out of I/O.
Additional resources:
Investigate on response thread switching #905.
I am fairly new to RabbitMQ, and starting on a project that is using RabbitMQ in a fairly old-fashioned "RPC" pattern. So I'm trying something like this on the "server" side:
ConnectionFactory factory = new ConnectionFactory();
factory.setUri(uri);
Connection connection = factory.newConnection();
Channel channel = connection.createChannel();
channel.queueDeclare(queueName, false, false, false, null);
while (!shutdown) {
GetResponse gr = channel.basicGet(queueName, true);
... build reply ...
channel.basicPublish("", gr.getProps().getReplyTo(), replyProps, response);
}
My question is: can the thread waiting on basicGet() be interrupted? If so, what happens (InterruptedException is not declared). It realize this is not a great pattern, but I just want some way to cleanly shutdown a service.
UPDATE: one comment indicates that basicGet() does not block at all, and returns immediately if the queue is empty. If that is the case, let me revise my question to be more precise: How do I wait for a message on a queue and retrieve it, with a timeout?
UPDATE2: After experimenting and asking questions on the rabbitmq mailing list, I conclude that this cannot be done directly. It is simply not The Way That You Do Things in RabbitMQ. Instead, you launch a consumer thread pool using Channel.basicConsume() and wait for your handler method to be called. It can be done indirectly by having your consumer post to a SynchronizedQueue or something similar and having your foreground thread(s) wait on that, but be warned that this defeats the automatic scaling offered by basicConsume() and also makes it harder to properly ACK all requests, and also creates additional message buffering that makes it difficult to honor QOS semantics set by the basicQos() call.
It should also be noted that, once you go down the basicConsume() route, the consumer can be interrupted. This is done something like:
// This starts a background thread pool
String consumerTag = channel.basicConsume(consumer);
...
// Shutdown the consumer thread pool
channel.basicCancel(consumerTag);
UPDATE3: See last answer. RabbitMQ comes with an RpcClient class that works splendidly.
basicGet doesn't block, it returns immediately (well, just after a network roundtrip) and returns null if there's no messages on the queue. So it's not necessary to interrupt the thread.
RabbitMQ java API comes with a very nice implementation of an RPCClient called, (unsurprisingly) RPCClient. Use it! I made an example in github
So I am running into a race condition and I have a few solutions on how to fix the issue. I am new to threading so obviously, my opinion and research is limited. I have a large amount of asynchronization calls that can happen if a user receives certain messages from server. Thus, my design is poor due to the dependent nature of my objects.
Lets say I have a function called
adduser:(NSString s){
does some asynchronize activity
}
Messageuser:(NSString s)
{
Does some more asychronize activity
}
if a user were to recieve a message telling it to addUser "Ryan". he would than create a thread and proceed with looking up Ryan and storing him. However, if the user has the application in suspended mode, and in the buffered of messages waiting to be recieved there is a addUser request and a MessageUser request, a race condition occures because it takes longer to complete Adduser than it does to complete MessageUser. Thus, If messageUser is called , and (in our example) "Ryan" has not been fully added, it throws an error.
What would be a possible solution to this issue. I looked into locks and semaphores, and what I am trying to do is, when MessageUser recieves a call, check to make sure there is no thread currently proccessing addUser. If there is none, proceed. Else wait, than proceed after it has finished.
Well it depends on how the messages are being issued in the first place and what the async response events are.
If the operations have dependencies (ordering requirements) then perhaps a background serial queue would be appropriate? That is a simple way to ensure the messages are processed in order.
If the async operations take completion blocks, then you could have the completion block issue the request for the next operation to be performed, though you may not know about that ahead of time.
If you need to solve this in a more general way then you need some kind of system for tracking prerequisites so you can skip work items that don't have their prerequisites met yet. That probably means your own background thread that monitors a list of waiting tasks and receives notification of all task completions so it can scan for items waiting on that completion and issue them.
It seems really complicated though... I suspect you don't really have such strong async parallel processing requirements and a much simpler design would be just as effective. Given your situation where you are receiving messages from a server, I think a serial queue would be the best option. Then you can process messages in the order the server sent them and keep things simple.
//do this once at app startup
dispatch_queue_t queue = dispatch_queue_create("com.example.myapp", NULL);
//handle server responses
dispatch_async(queue, ^{
//handle server message here, one at a time
});
In reality, depending on how you connect to your server you might be able to just move the entire connection handling to the background queue and communicate with it via messages from the UI, and update the UI by dispatching to the dispatch_get_main_queue() which will be the UI thread.
I have a producer sending durable messages to a RabbitMQ exchange. If the RabbitMQ memory or disk exceeds the watermark threshold, RabbitMQ will block my producer. The documentation says that it stops reading from the socket, and also pauses heartbeats.
What I would like is a way to know in my producer code that I have been blocked. Currently, even with a heartbeat enabled, everything just pauses forever. I'd like to receive some sort of exception so that I know I've been blocked and I can warn the user and/or take some other action, but I can't find any way to do this. I am using both the Java and C# clients and would need this functionality in both. Any advice? Thanks.
Sorry to tell you but with RabbitMQ (at least with 2.8.6) this isn't possible :-(
had a similar problem, which centred around trying to establish a channel when the connection was blocked. The result was the same as what you're experiencing.
I did some investigation into the actual core of the RabbitMQ C# .Net Library and discovered the root cause of the problem is that it goes into an infinite blocking state.
You can see more details on the RabbitMQ mailing list here:
http://rabbitmq.1065348.n5.nabble.com/Net-Client-locks-trying-to-create-a-channel-on-a-blocked-connection-td21588.html
One suggestion (which we didn't implement) was to do the work inside of a thread and have some other component manage the timeout and kill the thread if it is exceeded. We just accepted the risk :-(
The Rabbitmq uses a blocking rpc call that listens for a reply indefinitely.
If you look the Java client api, what it does is:
AMQChannel.BlockingRpcContinuation k = new AMQChannel.SimpleBlockingRpcContinuation();
k.getReply(-1);
Now -1 passed in the argument blocks until a reply is received.
The good thing is you could pass in your timeout in order to make it return.
The bad thing is you will have to update the client jars.
If you are OK with doing that, you could pass in a timeout wherever a blocking call like above is made.
The code would look something like:
try {
return k.getReply(200);
} catch (TimeoutException e) {
throw new MyCustomRuntimeorTimeoutException("RabbitTimeout ex",e);
}
And in your code you could handle this exception and perform your logic in this event.
Some related classes that might require this fix would be:
com.rabbitmq.client.impl.AMQChannel
com.rabbitmq.client.impl.ChannelN
com.rabbitmq.client.impl.AMQConnection
FYI: I have tried this and it works.