Task queue using RabbitMQ - rabbitmq

We have a system that receives payment information for invoices from the bank system. It works like this:
An invoice is created in our system.
The payment for the invoice is made through the bank system. The bank system requests the invoice details from our system, and our system returns them.
The bank system goes through the payment process, sends the payment details to our system, and waits up to 30 seconds for a confirmation message. If the bank system does not receive the confirmation within 30 seconds, the bank cancels the payment but does not inform our system about the cancellation.
Our system receives the payment info and saves the payment. Then it sends the confirmation message to the bank. But sometimes, because of a network or system issue, the confirmation message is not delivered within 30 seconds, and we remain unaware that the payment was cancelled.
So the problem is that our system saves the payment but sometimes cannot respond to the payment confirmation request in time (within 30 seconds). In that case the bank cancels the payment, and our system doesn't know the payment was cancelled.
I have developed a solution that checks whether each payment was successful (30 seconds after receiving the payment) by sending a request to the check_payment method the bank provides. The tasks (send the payment id to the check_payment method of the bank system, which returns the payment status) are executed in separate threads using the thread pool of the Spring framework. But I am afraid that this is not the best solution, as there is a risk of the thread pool filling up when a network failure happens.
What solutions would you recommend? Can we use RabbitMQ for this issue?

You are essentially implementing stateful orchestration. I would recommend looking into Cadence Workflow, which is capable of supporting your use case with minimal effort.
Cadence offers a lot of other advantages over using queues for task processing:
Built-in exponential retries with an unlimited expiration interval.
Failure handling. For example, it allows you to execute a task that notifies another service if both updates couldn't succeed during a configured interval.
Support for long-running heartbeating operations.
Ability to implement complex task dependencies, for example chaining of calls or compensation logic in case of unrecoverable failures (SAGA).
Complete visibility into the current state of the update. For example, when using queues all you know is whether there are some messages in a queue, and you need an additional DB to track the overall progress. With Cadence every event is recorded.
Ability to cancel an update in flight.
See the presentation that goes over the Cadence programming model.
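To make the retry idea concrete without tying it to Cadence or RabbitMQ, here is a minimal in-process sketch of a delayed status check with exponential backoff. It is not Cadence's API. The class PaymentChecker, its parameters, the status strings, and the check_payment callable (standing in for the bank's check_payment method) are all assumptions made for illustration. The point is that pending checks sit in a time-ordered queue instead of each occupying a pool thread.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class CheckTask:
    due_at: float                                   # only field used for ordering
    payment_id: str = field(compare=False)
    attempt: int = field(compare=False, default=0)


class PaymentChecker:
    """Re-checks saved payments from a time-ordered queue (hypothetical helper)."""

    def __init__(self, check_payment, first_delay=30.0, base_backoff=5.0, max_attempts=5):
        self.check_payment = check_payment  # assumed wrapper around the bank's check_payment
        self.first_delay = first_delay      # wait for the bank's 30 second window to pass
        self.base_backoff = base_backoff
        self.max_attempts = max_attempts
        self.tasks = []                     # min-heap ordered by due time
        self.results = {}                   # payment_id -> status

    def schedule(self, payment_id, now):
        heapq.heappush(self.tasks, CheckTask(now + self.first_delay, payment_id))

    def run_due(self, now):
        """Run every task that is due; a failed call is re-queued with a doubled delay."""
        while self.tasks and self.tasks[0].due_at <= now:
            task = heapq.heappop(self.tasks)
            try:
                self.results[task.payment_id] = self.check_payment(task.payment_id)
            except ConnectionError:
                attempt = task.attempt + 1
                if attempt >= self.max_attempts:
                    self.results[task.payment_id] = "NEEDS_MANUAL_REVIEW"
                    continue
                delay = self.base_backoff * 2 ** (attempt - 1)
                heapq.heappush(self.tasks, CheckTask(now + delay, task.payment_id, attempt))
```

A single scheduler thread (or a consumer of a delayed message queue) could call run_due periodically. During a network outage the work then accumulates as small queue entries rather than as blocked threads.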

Related

Is it possible to return qRFC (SMQ2 transaction) errors to the caller?

In my landscape, the ERP system is creating deliveries in EWM via qRFC. The setup is standard and works via a distribution model in the ERP BD64 transaction. A BAPI is called that creates a delivery replica in EWM.
Sometimes the ERP deliveries are not properly validated and don't fulfill the requirements to be distributed to EWM, but they are still sent. In that case the error stays in the EWM SMQ2 queue.
I want to prevent this from happening, because if the issue needs to be solved on the ERP side, the error shouldn't stay in EWM. The obvious option is to implement a BADI in the ERP that calls an EWM validation API before the delivery is distributed. If the API rejects the delivery, the distribution should be prevented.
However, if for whatever reason the validation API is not called, the ERP could still send erroneous deliveries to EWM that would get stuck in the SMQ2 queue.
Is there a way to prevent this from happening? In case of an error in EWM qRFC processing (when validating the ERP delivery), I would like to remove the faulty record from the queue, return the error message to the ERP, and mark the ERP delivery as not distributed or as having a distribution error.
Can this be done in a more or less standard way?
This asynchronous failure pattern is well known in SAP ABAP stacks.
Sometimes there is a synchronous check call before the posting call.
When this isn't present, you might consider a synchronous call instead.
Without knowing why an asynchronous inbound queue is used in the first place, it is hard to advise on workarounds or process improvements.
Asynchronous inbound queues have key weaknesses. Monitoring options are very weak.
There are no APIs to read the data waiting to be posted, and no options to "fix" the data and retry.
Using a wrapper function on the EWM side to react to the error and send feedback is an option.

How does Two Phase Commit really work at low level?

There are tons of articles on 2 phase commit on the internet.
They are all saying the same thing and I am not getting it. I need a low-level understanding of it.
The orders and Payment service example is the most popular example on the internet.
Let's say we have Orders service and Payments service. When an order is placed, the Order service writes it to its database but the payment service must also write it to its own database in order for the transaction to be complete.
Here is my inadequate understanding:
User sends a place order request to Orchestrator
Orchestrator invokes the Orders service as well as the Payment service at the same time. Now according to what I have read, the Order and Payment services are supposed to respond to the Orchestrator by telling it whether or not they are ready. What does that mean? What does it mean to be ready here?
Order and Payment service respond back, telling Orchestrator that they are "ready" (whatever that means).
Orchestrator sends another request to both the services (commit request).
Order writes the record to its database. The Payment service writes the record to its own database. They both respond back with status 200 to Orchestrator.
Orchestrator checks if both of the participants have returned status code 200. If yes, then it does nothing. If no then it asks them to ABORT?? How? One of the participants already wrote the transaction to its database.
Two-phase commit is all about handling failures and what they mean.
In phase 1, the orchestrator tells the order and payment services to prepare to commit. If everything goes well, they both respond with "prepared", which means:
The transaction is recorded durably, and marked prepared
There is no possibility that it will need to be rolled back due to conflicts or for any other reason, except the failure of some other service to prepare. Once the transaction is prepared, only the orchestrator can normally roll it back.
If both the order and payment processors successfully prepare, then the orchestrator will tell them to finalize the commit. Finalized means:
The transaction is recorded durably and marked committed
It cannot be rolled back.
If anything goes wrong during this process, it's possible to check the durably recorded states of the transaction in both the payment and order services in order to determine whether it "really happened" and how to recover:
If not all of the services prepare successfully, then the transaction did not happen. The orchestrator will roll back the transaction in all the services that did prepare it. If things are broken, this may require manual intervention, or the orchestrator may not be able to complete this operation until things come back up.
If all of the services did prepare successfully, then the transaction did happen. The orchestrator will tell all the services that haven't finalized it to go ahead and do that. Again, this might have to wait until systems that are down come back up.
Also, sometimes it's not the orchestrator's job to recover. If the orchestrator gives up, then the individual services can check with each other to see if the transaction happened or not.
The important point is that once the two-phase commit starts, no matter what happens you can return the system to a consistent state by checking the durable transaction records.
In practice, two-phase commit is not used that often, because when transactions are in the prepared-but-not-finalized state, any other transactions that use their data cannot themselves commit, because they might need to be rolled back. This kind of contention, where transactions need to wait for each other, slows the whole system down.
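The prepare and finalize phases described above can be sketched in a few lines. This is a toy model under stated assumptions: the Participant class, its can_prepare flag, and the in-memory log dict (standing in for a durable transaction record on disk) are hypothetical, and crash recovery is left out.

```python
class Participant:
    """A service with a transaction log; the dict stands in for durable storage."""

    def __init__(self, name, can_prepare=True):
        self.name = name
        self.can_prepare = can_prepare   # hypothetical switch to simulate a failed prepare
        self.log = {}                    # txn_id -> "prepared" | "committed" | "aborted"

    def prepare(self, txn_id):
        # Validate, take locks, and record everything needed to commit later.
        if not self.can_prepare:
            self.log[txn_id] = "aborted"
            return False
        self.log[txn_id] = "prepared"
        return True

    def commit(self, txn_id):
        if self.log.get(txn_id) not in ("prepared", "committed"):
            raise RuntimeError(f"{self.name}: cannot commit {txn_id} before prepare")
        self.log[txn_id] = "committed"

    def rollback(self, txn_id):
        if self.log.get(txn_id) == "committed":
            raise RuntimeError(f"{self.name}: {txn_id} is already committed")
        self.log[txn_id] = "aborted"


def two_phase_commit(txn_id, participants):
    # Phase 1: every participant must durably record "prepared".
    votes = [p.prepare(txn_id) for p in participants]
    if all(votes):
        # Phase 2: all prepared, so the transaction happened; finalize everywhere.
        for p in participants:
            p.commit(txn_id)
        return "committed"
    # At least one participant could not prepare, so nobody commits.
    for p in participants:
        p.rollback(txn_id)
    return "aborted"
```

Nothing is written as final data during phase 1. A participant that answered "prepared" has only promised that a later commit will succeed, which is why an abort is still possible at that point.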
I will explain the steps of a successful purchase. The customer places an order, goes to the payment gateway, makes the payment there, and returns to the store site. The payment is then recorded, and the operator can track the steps.
Order table
Id int,
CustomerName nvarchar(150),
TotalPrice int,
SendState tinyint,
InsertDate datetime
Payment table
Id int,
OrderID int,
PayState smallint, -- smallint because a status such as 500 does not fit in tinyint (0-255)
PayPrice int
The customer now registers an order in the database ([Order] is bracketed because ORDER is a reserved word)
Insert into [Order] values(1,'bill',30000,1,'5/1/2021 8:30:52 AM')
Now the customer connects to the payment gateway, makes the payment successfully, and returns to the store site.
Insert into Payment values(1,1,200,30000)
The results are now displayed for the operator
select o.*,p.PayState,p.PayPrice
from [Order] o join Payment p on o.Id = p.OrderID
If the payment fails, a status of 500 is recorded in the database instead, and the operator, seeing the status of 500, will understand that the payment was not successful.
Insert into Payment values(1,1,500,0)

Understanding why you would want to process Message Queues at a future time

So I'm trying to understand what practical problems Queues solve. By reading all the information from Google, I get the high-level.
Push message to Queue for processing at a later time
So I'm looking at an architecture from Company A and they have different use cases for Job Queueing like for example
chat messages
file conversion
searching
Heavy sql queries
Why process it at a later time?
Here's my best guess...
Let's say I have an application that can process 10 "things" at a time.
My application then maxes out its processing capacity.
An 11th request comes in, so the app puts it in the Queue for later processing.
Assuming this is a valid Use Case, wouldn't adding more servers to process more "things" make sense? Is it because it's more costly to add more servers than employ a Queue and sacrifice response time a little bit?
Given my Use Case examples, what other problems would Queues solve for them?
Have you ever lined up at a bank when it is busy? You would have waited in a queue.
"But," you could say, "wouldn't adding more staff to process more customers make sense? Is it because it's more costly to add more staff than employ a Queue and sacrifice response time a little bit?"
That would be correct. It can be quite costly to staff a bank based on the peak number of customers who would arrive each day. It is cheaper to staff below this level and have some customers wait in a queue.
Also, the number of customers each day are not 100% predictable. A queue allows excess demand to wait without breaking the system.
Queues enable decoupling.
For example, imagine an online store where customers purchase an item. They select the item, provide a credit card number and click 'Purchase'. If the credit card is declined, the online store can immediately prompt them to re-enter the number. This interaction has to take place immediately while the customer is still online.
However, there is no need to have the customer wait while an invoice is generated, a record is added to the accounting system and inventory is pulled off the shelf. This can be decoupled from the ordering process. A good way to do it is to push the order into a queue, which can be handled by the next system.
If that 'next system' happens to be offline at the moment, there is no reason to cancel the whole sale. The transaction can be processed when the 'next system' comes back online. This is much better than failing the whole process just because one component (which is not required immediately) has a failure.
Bottom line: Queues are excellent. They enable better handling of failures. They make things more resilient (just wait a few minutes and try again!). They should be used whenever the process is compatible with a queuing architecture.
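The staffing argument can be shown with a small simulation. This is only a sketch: the function name simulate and the arrival numbers are made up for illustration. A system that handles 10 "things" per tick receives a burst of 25, and the queue holds the excess until capacity frees up.

```python
from collections import deque


def simulate(arrivals_per_tick, capacity_per_tick):
    """Return the queue length at the end of each tick."""
    queue = deque()
    backlog = []
    for tick, arrivals in enumerate(arrivals_per_tick):
        # New requests join the back of the queue.
        queue.extend((tick, i) for i in range(arrivals))
        # The system processes at most capacity_per_tick requests per tick.
        for _ in range(min(capacity_per_tick, len(queue))):
            queue.popleft()
        backlog.append(len(queue))
    return backlog


print(simulate([4, 25, 3, 0, 0], capacity_per_tick=10))  # prints [0, 15, 8, 0, 0]
```

No request is rejected during the burst. The cost is a delay of a couple of ticks for the requests at the back of the queue, which is the trade-off the bank analogy describes.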
Let's do scenarios
Scenario 1 without queue:
you request an endpoint /blabla/do-eveything/
this request does the following:
download an image from a very slow FTP server
e.g. 1.5 sec (can error; retry? add +X sec)
attach the image to an email
send an email (3 sec)
e.g. 1 sec (can error; retry? add +X sec)
confirmation received > store the confirmation in a third-party tracking service
e.g. 1.5 sec (can error; retry? add +X sec)
when tracking confirms, update your data at another third-party company for big data purposes
e.g. 2 sec (can error; retry? add +X sec)
... you get the idea
return the response e.g. 11 sec later (this is too slow), or later still, or time out when everything fails
The end user says the internet was faster 20 years ago: "maybe I need to change my internet connection or change my 16 threads"
Scenario 2 queue everything you can:
you request an endpoint /blabla/do-eveything/
this request does the following:
Queue job "DO_EVERYTHING"
e.g. 0.02 sec
Return the response in less than 0.250 sec
The end user says this website/app is so fast, "I can keep my 56K internet connection"
In a queue/event system, one failed job can be retried later without affecting the end user
you can pause a job, or add an unlimited number of tasks/steps after the original message
better fault tolerance
Working with queues allows a better micro/nano service architecture and better testing, because you can test a single job instead of a full controller that does everything...
Yes, it is maybe more work and more thinking, but in the end you do not need to think about work during the holidays
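Scenario 2 might look roughly like the sketch below. It uses Python's standard queue module in place of a real broker, and the names handle_request, drain, and max_attempts are assumptions for illustration. The handler only enqueues the job, and a failed job is put back on the queue instead of failing the user's request.

```python
import queue


def handle_request(jobs, payload):
    """Endpoint handler: only enqueues the job, so it can return right away."""
    jobs.put({"name": "DO_EVERYTHING", "payload": payload, "attempts": 0})
    return {"status": "accepted"}


def drain(jobs, process, max_attempts=3):
    """Run queued jobs; a failing job is re-queued until max_attempts is reached."""
    done, dead = [], []
    while not jobs.empty():
        job = jobs.get()
        try:
            process(job["payload"])
            done.append(job["payload"])
        except Exception:
            job["attempts"] += 1
            if job["attempts"] < max_attempts:
                jobs.put(job)                # retry later; the user already has a response
            else:
                dead.append(job["payload"])  # give up and keep the job for inspection
    return done, dead
```

In a real deployment drain would run in a separate worker process fed by the broker, and the dead list would typically be a dead-letter queue. Because process is passed in, a single job can be tested on its own, which matches the testing point made above.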

ISO 8583 Authorization Message explanation

While learning payment technologies, I have reviewed some issuer's documentation about their implementation of ISO 8583, even though I have seen how this kind of messaging works, I have not completely understood how the Authorization Message (MTI x1xx) really works.
The general definition I have found is that this message 'determines if funds are available, get an approval but do not post to account for reconciliation', but I want to understand the general lifecycle of this message.
If the amount requested in the authorization is approved, does it mean that the funds are held until another message is sent? If the funds are not held, why do reversal messages (MTI x4xx) offer the possibility of reversing the authorization? If another request is not sent, what does 'not posting it for reconciliation' mean in practice? Do issuers have to follow a standard expiration time for cancelling the authorization request?
I know that these questions may depend on each issuer's specifications, but every time I search for the definition of the authorization message I always get the same one or two lines of description (like the one I wrote before) and no more.
I want to get a full explanation of this message and some examples. I really want to master this subject, because I do not want to use something that I do not understand.
Instead of using the terms issuer or acquirer, I usually prefer to use the term "payment processor" to refer to the institution or computer system that you communicate with in order to process payments. As you know, different payment processors do things differently, so I can only give you a general idea of how ISO 8583 is usually used.
When an authorization request or an authorization advice is approved, a temporary hold is usually put on the authorized funds. The authorization response message, that indicates approval, will usually contain an authorization number. I do not know how long the temporary hold on the funds lasts before it expires (or whether that time varies by payment processor).
The next step is to either:
Do nothing and let the hold expire.
Send a reversal message to reverse the authorization (and release the hold immediately).
Send a financial advice message, that contains the authorization number from the authorization response, to complete the transaction initiated by the authorization request/advice.
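The three outcomes listed above can be pictured as a small state machine. This is only a sketch of the general idea, not any processor's actual behaviour. The class names, the hold_seconds parameter, and the one-week value in the usage note are assumptions, since the real hold duration is processor-specific.

```python
from enum import Enum


class HoldState(Enum):
    HELD = "held"
    EXPIRED = "expired"
    REVERSED = "reversed"
    COMPLETED = "completed"


class Authorization:
    """Tracks the temporary hold created by an approved authorization."""

    def __init__(self, auth_number, amount, approved_at, hold_seconds):
        self.auth_number = auth_number
        self.amount = amount
        self.expires_at = approved_at + hold_seconds  # hold_seconds is an assumed value
        self.state = HoldState.HELD

    def _refresh(self, now):
        # Option 1: do nothing, and the hold lapses on its own.
        if self.state is HoldState.HELD and now >= self.expires_at:
            self.state = HoldState.EXPIRED

    def reverse(self, now):
        # Option 2: a reversal releases the hold immediately.
        self._refresh(now)
        if self.state is not HoldState.HELD:
            raise ValueError(f"cannot reverse a hold that is {self.state.value}")
        self.state = HoldState.REVERSED

    def complete(self, auth_number, now):
        # Option 3: a financial advice quoting the authorization number completes it.
        self._refresh(now)
        if auth_number != self.auth_number:
            raise ValueError("authorization number does not match")
        if self.state is not HoldState.HELD:
            raise ValueError(f"cannot complete a hold that is {self.state.value}")
        self.state = HoldState.COMPLETED
```

For example, Authorization("123456", 5000, approved_at=0, hold_seconds=7 * 24 * 3600) followed by complete("123456", now=3600) ends in the COMPLETED state, while the same call after the expiry time raises an error because the hold has already lapsed.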
See the ISO 8583 Wikipedia page for background information
As far as my experience in the payment sector is concerned here is my explanation, I hope it could help you to some extent.
Most switches or payment systems use DMS (Dual Message System) for transactions, which means that in each transaction two request messages are sent from the acquirer (ATM) to the issuer (i.e. CBS). Both messages are of type x100; only some fields differ, which differentiates them.
The first one is the Authorization Request, which is used to authorize the cardholder (i.e. whether his/her PIN is correct or not; all the basic validations and verifications are done here). It is called the precheck. In this case, no amount is held on the CBS and no reversal message is required if the transaction fails.
The second one is the actual transaction request message (i.e. balance inquiry, cash withdrawal, etc.). In the case of a cash withdrawal, the acquirer requests the cash withdrawal from the issuer (the routing is done through a switch or a payment system).
Suppose the user is already authorized, but there is no response from the issuer, or the request times out. There can be many reasons why the transaction failed or no response was received:
The amount is debited from the customer in the CBS, but due to a network issue the response is not received by the acquirer (ATM).
The amount is debited from the customer, but due to processing load in the CBS the acquirer receives the response late (there is a time limit within which the switch should receive a response from the CBS, called the timeout, e.g. 10 seconds or 15 seconds; each switch has its own rules and settings for the timeout).
In the above scenarios, the switch (SV, CSC, etc.) sends a reversal advice (MTI x420) to the CBS, or a reversal advice repeat (MTI x421) after 5 seconds if no response is received for the reversal advice.
Then the issuer (CBS) sends a reversal response (MTI x430), which means the transaction was reverted successfully (the amount is credited back to the account/card).
This is the end. Both parties (issuer and acquirer) will be happy and there will be no money loss or fraud.
NOTE:
- x in MTI determines the ISO 8583 version.
- MTI stands for Message Type Identifier
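As a small illustration of how the four MTI digits are read, here is a decoding sketch. The lookup tables follow the commonly documented ISO 8583 digit meanings (version, message class, message function, message origin) and are deliberately incomplete. The function name decode_mti is hypothetical.

```python
VERSIONS = {"0": "ISO 8583:1987", "1": "ISO 8583:1993", "2": "ISO 8583:2003"}
CLASSES = {
    "1": "authorization",
    "2": "financial",
    "3": "file action",
    "4": "reversal/chargeback",
    "5": "reconciliation",
    "6": "administrative",
    "7": "fee collection",
    "8": "network management",
}
FUNCTIONS = {
    "0": "request",
    "1": "request response",
    "2": "advice",
    "3": "advice response",
    "4": "notification",
}
ORIGINS = {
    "0": "acquirer",
    "1": "acquirer repeat",
    "2": "issuer",
    "3": "issuer repeat",
    "4": "other",
    "5": "other repeat",
}


def decode_mti(mti):
    """Split a 4-digit MTI into its four parts; unlisted digits map to 'unknown'."""
    if len(mti) != 4 or not mti.isdigit():
        raise ValueError("an MTI is exactly four digits")
    version, msg_class, function, origin = mti
    return {
        "version": VERSIONS.get(version, "unknown"),
        "class": CLASSES.get(msg_class, "unknown"),
        "function": FUNCTIONS.get(function, "unknown"),
        "origin": ORIGINS.get(origin, "unknown"),
    }


print(decode_mti("0421"))
```

With these tables, 0420 decodes as a reversal advice from the acquirer, 0421 as the same advice repeated by the acquirer, and 0430 as the advice response, which matches the reversal flow described above.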

PayPal API transaction final status

I am using PayPalAPIInterfaceClient (SOAP service) to get information about a transaction (method GetTransactionDetails()), and I need to be absolutely sure about the transaction status (that is, whether money has actually been sent, no matter in which direction).
When is the transaction really completed, and when is it still "on the road"?
For example: I assume Processed will be followed by InProgress and finally changed to Completed, or something like this. On the other hand, Denied or (I don't know) Voided will not change in the future.
Can you please help me decide which statuses can be accepted as final (like Completed, though maybe even Completed does not necessarily mean a final money transfer) and which ones are still in an intermediate state?
I would expect a simple "Money finally transferred" or "Money finally not transferred" result, but reality is different.
In short, I need to know this in order to mirror the transaction result into a database and manage automatic transactions (from and to the client).
I am using the PaymentStatusCodeType enumeration values and my service iterates transaction history to check if the money was transferred or not.
Completed means it's done. You may also want to look into Instant Payment Notification (IPN). It sends real-time updates when transactions hit your PayPal account so you can automate post-transaction tasks accordingly. This includes handling e-checks or other pending payments which won't complete for a few days, refunds, disputes, etc.