From time to time queries gets canceled without a reason. I don't know if the query gets canceled by the coordinator or by the app who initiated the queries.
Anyone had this problem before or knows how to solve it?
coordinator.cc started execution on 9 backends for query_id=xxx
client-request-state.cc:545] xxx] Cancelled right after starting the coordinator query id=xxx
coordinator.cc:488] xxx] ExecState: query id=xxx execution cancelled
coordinator-backend-state.cc:445] xxx] Sending CancelQueryFInstances rpc for query_id=xxx
coordinator-backend-state.cc:445] xxx] Sending CancelQueryFInstances rpc for query_id=xxx
control-service.cc:169] CancelQueryFInstances(): query_id=xxx
query-exec-mgr.cc:97] QueryState: query_id=xxx refcnt=5
query-state.cc:667] Cancel: query_id=xxx
krpc-data-stream-mgr.cc:326] cancelling all streams for fragment_instance_id=xxx
krpc-data-stream-mgr.cc:326] cancelling all streams for fragment_instance_id=xxx
Related
I'm quite new into the reactive world and using Spring Webflux + reactor Kafka.
kafkaReceiver
.receive()
// .publishOn(Schedulers.boundedElastic())
.doOnNext(a -> log.info("Reading message: {}", a.value()))
.concatMap(kafkaRecord ->
//perform DB operation
//kafkaRecord.receiverOffset.ackwnowledge
)
.doOnError(e -> log.error("Error", e))
.retry()
.subscribe();
I understood that in order to parallelise message consumption, I have to instantiate one KafkaReceiver for each partition but is it possible/recommended for a partition to read messages in a synchronous manner and process them async (including the manual acknowledge)?
So that this is the desired output:
Reading message:1
Reading message:2
Reading message:3
Reading message:4
Stored message 1 in DB + ack
Reading message:5
Stored message 2 in DB + ack
Stored message 5 in DB + ack
Stored message 3 in DB + ack
Stored message 4 in DB + ack
In case of errors, I'm thinking of publishing the record to a DLT.
I've tried with flatMap too, but it seems that the entire processing happens sequentially on a single thread. Also if I'm publishing to a new scheduler, the processing happens on a new single Thread.
If what I'm asking is possible, can someone please help me with a code snippet?
What's the output of your current code log ?
When using StreamMessageListenerContainer a subscription for a consumer group can be created by calling:
receive(consumer, readOffset, streamListener)
Is there a way to configure the container/subscription so that it will always attempt to re-process any PENDING messages before moving on to polling for new messages?
The goal would be to keep retrying any message that wasn't acknowledged until it succeeds, to ensure that the stream of events is always processed in exactly the order it was produced.
My understanding is if we specify the readOffset as '>' then on every poll it will use '>' and it will never see any messages from the PENDING list.
If we provide a specific message id, then it can see messages from the PENDING list, but the way the subscription updates the lastMessageId is like this:
pollState.updateReadOffset(raw.getId().getValue());
V record = convertRecord(raw);
listener.onMessage(record);
So even if the listener throws an exception, or just doesn't acknowledge the message id, the lastMessageId in pollState is still updated to this message id and won't be seen again on the next poll.
We have a spring-webflux application running on spring-boot-starter-webflux:jar:2.1.5.RELEASE and reactor-netty:jar:0.8.8.RELEASE. When a reactive client goes away (k8s pod killed or the client subscription is disposed off) before the request is completed, we see the server simply stops processing the request and no more application logs related to that request are seen. However, reactive-netty prints a trace log indicating that the channel is inactive and will be terminated.
Is there anything we can do to handle this termination gracefully? We would ideally like to respond to a cancellation signal from reactor.
2020-03-18T16:46:40.372Z TRACE --- [reactor-http-epoll-2] r.n.c.ChannelOperations : [id: 0x4cad974b, L:/<SERVER_IP_ADDRESS>:8080 ! R:/<SOME_OTHER_IP_ADDRESS>:55436] Disposing ChannelOperation from a channel
java.lang.Exception: ChannelOperation terminal stack
at reactor.netty.channel.ChannelOperations.terminate(ChannelOperations.java:391)
at reactor.netty.channel.ChannelOperations.onInboundClose(ChannelOperations.java:360)
at reactor.netty.channel.ChannelOperationsHandler.channelInactive(ChannelOperationsHandler.java:72)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireChannelInactive(CombinedChannelDuplexHandler.java:420)
at io.netty.handler.codec.ByteToMessageDecoder.channelInputClosed(ByteToMessageDecoder.java:393)
at io.netty.handler.codec.ByteToMessageDecoder.channelInactive(ByteToMessageDecoder.java:358)
at io.netty.channel.CombinedChannelDuplexHandler.channelInactive(CombinedChannelDuplexHandler.java:223)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:236)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1416)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:257)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:243)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:912)
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:816)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:416)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:331)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:918)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.lang.Thread.run(Thread.java:834)
The doOnCancel operator allows for a callback to be provided when a subscription is cancelled. This is different from the doOnEach operator which is run only when the subscription completes or errors out.
https://projectreactor.io/docs/core/release/api/reactor/core/publisher/Mono.html#doOnCancel-java.lang.Runnable-
I created a small application (Spring Boot and camunda) to process an order process. The Order-Service receives the new order via Rest and calls the Start Event of the BPMN Order workflow. The order process contains two asynchronous JMS calls (Customer check and Warehouse Stock check). If both checks return the order process should continue.
The Start event is called within a Spring Rest Controller:
ProcessInstance processInstance =
runtimeService.startProcessInstanceByKey("orderService", String.valueOf(order.getId()));
The Send Task (e.g. the customer check) sends the JMS message into a asynchronous queue.
The answer of this service is catched by a another Spring component which then trys to send an intermediate message:
runtimeService.createMessageCorrelation("msgReceiveCheckCustomerCredibility")
.processInstanceBusinessKey(response.getOrder().getBpmnBusinessKey())
.setVariable("resultOrderCheckCustomterCredibility", response)
.correlate();
I deactivated the warehouse service to see if the order process waits for the arrival of the second call, but instead I get this exception:
1115 06:33:08.564 WARN [o.c.b.e.jobexecutor] ENGINE-14006 Exception while executing job 67d2cc24-0769-11ea-933a-d89ef3425300:
org.springframework.messaging.MessageHandlingException: nested exception is org.camunda.bpm.engine.MismatchingMessageCorrelationException: ENGINE-13031 Cannot correlate a message with name 'msgReceiveCheckCustomerCredibility' to a single execution. 4 executions match the correlation keys: CorrelationSet [businessKey=1, processInstanceId=null, processDefinitionId=null, correlationKeys=null, localCorrelationKeys=null, tenantId=null, isTenantIdSet=false]
This is my process. I cannot see a way to post my bpmn file :-(
What can't it not correlate with the message name and the business key? The JMS queues are empty, there are other messages with the same businessKey waiting.
Thanks!
Just to narrow the problem: Do a runtimeService eventSubscription query before you try to correlate and check what subscriptions are actually waiting .. maybe you have a duplicate message name? Maybe you (accidentally) have another instance of the same process running? Once you identified the subscriptions, you could just notify the execution directly without using the correlation builder ...
Basically after every IN, OUT or SETUP transaction we have an ACK/NAK packet at the end of the transaction. If a handshake packet is part of every transfer as it comes after the data packet which is preceded by token packet, then why do we need a status stage? This seems to be present in the control transfer.
In the protocol endpoints are in a status: ACTIVE, HALT, STALL,...
in the status phase this status is determined (GET_STATUS(0x00) request (http://www.beyondlogic.org/usbnutshell/usb6.shtml) )
The status phase check is a bit like a CRC checksum over the entire request not over each single packet.
http://www.beyondlogic.org/usbnutshell/usb4.shtml:
"
Status Stage reports the status of the overall request and this once again varies due to direction of transfer. Status reporting is always performed by the function.
IN: If the host sent IN token(s) during the data stage to receive data, then the host must acknowledge the successful recept of this data. This is done by the host sending an OUT token followed by a zero length data packet. The function can now report its status in the handshaking stage. An ACK indicates the function has completed the command is now ready to accept another command. If an error occurred during the processing of this command, then the function will issue a STALL. However if the function is still processing, it returns a NAK indicating to the host to repeat the status stage later.
OUT: If the host sent OUT token(s) during the data stage to transmit data, the function will acknowledge the successful recept of data by sending a zero length packet in response to an IN token. However if an error occurred, it should issue a STALL or if it is still busy processing data, it should issue a NAK asking the host to retry the status phase later.
"
or see http://wiki.osdev.org/Universal_Serial_Bus
"
Finally, a STATUS transaction from the function to the host indicates whether the [control] transfer was successful.
"