How to avoid deadlocks in PessimisticLockScope.EXTENDED? - sql

I’m creating a Java transfer money app that basically transfers money from one account to another.
In a nutshell I have a Transfer entity with 3 properties: a @ManyToOne OriginAccount, a @ManyToOne TargetAccount, and an Amount.
Account contains a balance to be adjusted as part of the transfer.
I'm about to use LockModeType.PESSIMISTIC_WRITE on the Account entities, but I have to consider the possibility of a deadlock.
One option is to always load the two accounts in the same order (sorted by id), so that the locks are always acquired in the same order.
I have also heard about PessimisticLockScope.EXTENDED, but in what order will locks be acquired on the joined records? Is there any way to enforce that order, e.g. via some kind of comparator? How can I reliably rule out the possibility of a deadlock?
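For what it's worth, a minimal sketch of the id-ordering approach, assuming a plain EntityManager-based service and a hypothetical Account entity (jakarta.persistence here; older stacks use javax.persistence):

import jakarta.persistence.Entity;
import jakarta.persistence.EntityManager;
import jakarta.persistence.Id;
import jakarta.persistence.LockModeType;
import java.math.BigDecimal;

@Entity
class Account {
    @Id
    Long id;
    BigDecimal balance;
}

public class TransferService {

    private final EntityManager em;

    public TransferService(EntityManager em) {
        this.em = em;
    }

    // Lock both accounts in ascending id order so that two concurrent transfers
    // (A -> B and B -> A) always request the row locks in the same order and
    // therefore cannot deadlock each other.
    public void transfer(Long originId, Long targetId, BigDecimal amount) {
        Long firstId = originId < targetId ? originId : targetId;
        Long secondId = originId < targetId ? targetId : originId;

        Account first = em.find(Account.class, firstId, LockModeType.PESSIMISTIC_WRITE);
        Account second = em.find(Account.class, secondId, LockModeType.PESSIMISTIC_WRITE);

        Account origin = firstId.equals(originId) ? first : second;
        Account target = (origin == first) ? second : first;

        origin.balance = origin.balance.subtract(amount);
        target.balance = target.balance.add(amount);
    }
}

The sort guarantees a deterministic lock order regardless of which direction the money flows between the two accounts.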

Related

How to express pagination in attribute based access control?

Based on my coarse reading, ABAC, i.e. attribute-based access control, boils down to attaching attributes to subjects, resources and other related entities (such as actions to be performed on the resources), and then evaluating a set of boolean-valued functions to grant or deny access.
To be concrete, let's consider XACML.
This is fine when the resource to be accessed is known before it hits the decision engine (the PDP, in the case of XACML), e.g. viewing the mobile number of some account, in which case the attributes of the resource to be accessed can probably be retrieved easily with a single select SQL.
However, consider the function of listing one's bank account transaction history, 10 entries per page. Let's assume that only the account owner can view this history, and that transactions are stored in the database in a table transaction like:
transaction_id, from_account_id, to_account_id, amount, time_of_transaction
This function, without access control, is usually written with a SQL like this:
select to_account_id, amount, time_of_transaction
from transaction
where from_account_id = $current_user_account_id
The question: How can one express this in XACML? Obviously, the following approach is not practical (due to performance reasons):
Attach each transaction in the transaction table with the from_account_id attribute
Attach the request (of listing transaction history) with the account_id attribute
The decision rule, R, is if from_account_id == account_id then grant else deny
The decision engine then loops over the transaction table, evaluates each row according to R, emits the row if access is granted, and stops once 10 rows have been emitted.
I assume there will be some preprocessing step that fetches the transactions first (without consulting the decision engine) and then consults the decision engine for each fetched transaction to see whether access is granted?
What you are referring to is known as 'open-ended' or data-centric authorization, i.e. access control on an unknown (or large) number of items such as a bank account's transaction history. Typically ABAC (and XACML or ALFA) has a decision model that is transactional (i.e. can Alice view record #123?)
It's worth noting the policy in XACML/ALFA doesn't change in either scenario. You'd still write something along the lines of:
A user can view a transaction history item if the owner is XXX and the date is less than YYY...
What you need to consider is how to ask the question (that goes from the PEP to the PDP). There are 2 ways to do this:
Use the Multiple Decision Profile to bundle your request e.g. Can Alice view items #1, #2, #3...
Use an open-ended request. This is known as partial evaluation or reverse querying. Axiomatics has a product (ARQ) that addresses this use case.
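Purely as an illustration of the first option, here is a rough Java sketch of a PEP that pages the rows itself and then asks one bundled question. The Pdp and TransactionDao interfaces are hypothetical stand-ins, not a real XACML client API:

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class TransactionHistoryPep {

    // Hypothetical PDP abstraction; a real Multiple Decision Profile request
    // would be built with your XACML vendor's client library instead.
    public interface Pdp {
        Map<String, Boolean> canView(String subjectAccountId, List<String> transactionIds);
    }

    // Hypothetical DAO that pages the transaction table without any access control.
    public interface TransactionDao {
        List<TransactionRow> findByFromAccount(String accountId, int page, int pageSize);
    }

    public record TransactionRow(String id, String toAccountId, long amount) {}

    private final Pdp pdp;
    private final TransactionDao dao;

    public TransactionHistoryPep(Pdp pdp, TransactionDao dao) {
        this.pdp = pdp;
        this.dao = dao;
    }

    public List<TransactionRow> listPage(String accountId, int page) {
        // 1. Fetch a candidate page first, without consulting the PDP.
        List<TransactionRow> candidates = dao.findByFromAccount(accountId, page, 10);

        // 2. Bundle the ids into one decision request instead of one call per row.
        List<String> ids = candidates.stream().map(TransactionRow::id).collect(Collectors.toList());
        Map<String, Boolean> decisions = pdp.canView(accountId, ids);

        // 3. Keep only the rows the PDP permitted.
        return candidates.stream()
                .filter(row -> decisions.getOrDefault(row.id(), false))
                .collect(Collectors.toList());
    }
}

The second option (reverse querying / partial evaluation) would instead turn the policy into a filter that is pushed down into the SQL itself, so no candidate rows are fetched only to be discarded.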
I actually wrote about a similar use case in this SO post.
HTH,
David

Geode transaction to generate ID and insert object

Let's say I have 3 PARTITIONED_REDUNDANT regions:
/Orders - keys are Longs (an ID allocated from /Sequences) and values are instances of Order
/OrderLineItems - keys are Longs (an ID allocated from /Sequences) and values are instances of OrderLineItem
/Sequences - keys are Strings (name of a sequence), values are Longs
The /Sequences region will have many entries, each of which is the ID sequence for some persistent type that is stored in another region (e.g., /Orders, /OrderLineItems, /Products, etc.)
I want to run a Geode transaction that persists one Order and a collection of OrderLineItems together.
And, I want to allocate IDs for the Order and OrderLineItems from the entries in the /Sequences region whose keys are "Orders" and "OrderLineItems", respectively. This operates like an "auto increment" column would in a relational database - the ID is allocated/assigned at insertion time as part of the transaction.
The insertion of Orders and OrderLineItems and the allocation of IDs from the /Sequences region need to be transactionally consistent - they all succeed or fail together.
I understand that Geode requires the data being operated on in a transaction to be co-located if the region is partitioned.
The obvious thing is to co-locate OrderLineItems with the owning Order, which can be done with a PartitionResolver that returns the Order's ID as the routing object.
However, there's still the /Sequences region that is involved in the transaction, and I'm not clear on how to co-locate that data with the Order and OrderLineItems.
The "Orders" entry of the /Sequences reqion would need to be co-located with every Order for which an ID is generated...wouldn't it? Obviously that's not possible.
Or is there another / better way to do this (e.g., change region type for /Sequences)?
Thanks for any suggestions.
Depending on how much data is in your /Sequences region, you could make that region a replicated region. A replicated region is considered co-located with all other regions because it's available on all members.
https://geode.apache.org/docs/guide/15/developing/transactions/data_location_cache_transactions.html
This pattern is potentially expensive though if you are creating a lot of entries concurrently. Every create will go through these shared global sequences. You may end up with a lot of transaction conflicts, especially if you are getting the next sequence number by incrementing the last used sequence number.
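For illustration, the shared-sequence pattern looks roughly like this with Geode's transaction API; this is a sketch only, with region wiring omitted and conflict handling reduced to a retry:

import org.apache.geode.cache.CacheTransactionManager;
import org.apache.geode.cache.CommitConflictException;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;

public class SequenceAllocator {

    private final ClientCache cache;
    private final Region<String, Long> sequences; // the (replicated) /Sequences region

    public SequenceAllocator(ClientCache cache, Region<String, Long> sequences) {
        this.cache = cache;
        this.sequences = sequences;
    }

    // Increments the named sequence entry inside a Geode transaction.
    // Concurrent callers updating the same entry will hit CommitConflictException
    // on commit - exactly the contention described above - so the call retries.
    public long next(String sequenceName) {
        CacheTransactionManager txMgr = cache.getCacheTransactionManager();
        while (true) {
            txMgr.begin();
            try {
                Long current = sequences.get(sequenceName);
                long next = (current == null) ? 1L : current + 1;
                sequences.put(sequenceName, next);
                txMgr.commit();
                return next;
            } catch (CommitConflictException conflict) {
                // another transaction bumped the same sequence entry first; try again
            } catch (RuntimeException other) {
                if (txMgr.exists()) {
                    txMgr.rollback();
                }
                throw other;
            }
        }
    }
}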
As an alternative you might want to consider UUIDs as the keys for your Orders and OrderLineItems, etc. A UUID takes twice as much space as a long, but you can allocate a random UUID without needing any coordination between concurrent creates.
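And a sketch of the UUID alternative, writing the Order and its line items in one transaction with no sequence region involved at all (class names follow the question; the PartitionResolver that co-locates line items with their Order is assumed to be configured separately):

import java.util.List;
import java.util.UUID;
import org.apache.geode.cache.CacheTransactionManager;
import org.apache.geode.cache.Region;
import org.apache.geode.cache.client.ClientCache;

// Stand-ins for the domain classes from the question.
class Order { }
class OrderLineItem { }

public class OrderRepository {

    private final ClientCache cache;
    private final Region<UUID, Order> orders;                 // /Orders
    private final Region<UUID, OrderLineItem> orderLineItems; // /OrderLineItems

    public OrderRepository(ClientCache cache,
                           Region<UUID, Order> orders,
                           Region<UUID, OrderLineItem> orderLineItems) {
        this.cache = cache;
        this.orders = orders;
        this.orderLineItems = orderLineItems;
    }

    // Random UUID keys need no coordination between concurrent creates, so the
    // /Sequences region drops out of the transaction; only the co-located
    // Order and its OrderLineItems remain.
    public UUID saveOrder(Order order, List<OrderLineItem> lineItems) {
        CacheTransactionManager txMgr = cache.getCacheTransactionManager();
        txMgr.begin();
        try {
            UUID orderId = UUID.randomUUID();
            orders.put(orderId, order);
            for (OrderLineItem item : lineItems) {
                orderLineItems.put(UUID.randomUUID(), item);
            }
            txMgr.commit();
            return orderId;
        } catch (RuntimeException e) {
            if (txMgr.exists()) {
                txMgr.rollback();
            }
            throw e;
        }
    }
}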

POS (Shopping Cart) - Multiple items in a single Transaction No

I'm creating a POS-like system and I'm not really sure how to do the Shopping Cart part, wherein after the cashier enters all of the customer's items (from the Inventory table), the entered items will share a single Transaction #, just like what we see on receipts.
Should I put a Trans_No column in the Cart table? If yes, how will I handle assigning a single Trans_No to multiple items? I'm thinking of getting the last Trans_No, incrementing it by 1, and then assigning it to all the items in the cashier's shopping cart. But there's a huge possibility that if 2 cashiers are using the system simultaneously, they will both retrieve the same latest transaction #, both increment it by 1, and end up merging 2 customers' orders into a single transaction/receipt.
What's the best way to handle this?
The data object on which your transaction id goes depends on the functional requirements of your application. If whatever is in a cart should share a transaction id, then the cart table is the right place for the transaction id.
Database systems offer a variety of features to prevent the concurrent-increment problem you describe. The easiest way to avoid it is to use a serial data type as offered e.g. by PostgreSQL. If you declare a column as serial, the database will take care of generating a fresh value for each record you insert.
If no such data type is available, there might still be a mechanism for generating a unique primary key for a record. An example is the auto_increment directive for MySQL.
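For example, with an auto_increment (or serial) primary key a JDBC-based application never reads the "last" number at all; it inserts the row and asks the driver for the key the database just generated. Table and column names below are made up for the example:

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public final class TransactionNumbers {

    // Inserts a cart-transaction row and returns the trans_no the database generated.
    // Two cashiers inserting at the same time are guaranteed distinct values,
    // because the database itself hands out the auto_increment value.
    public static long createTransaction(Connection conn, long cashierId) throws SQLException {
        String sql = "insert into cart_transaction (cashier_id) values (?)";
        try (PreparedStatement ps = conn.prepareStatement(sql, Statement.RETURN_GENERATED_KEYS)) {
            ps.setLong(1, cashierId);
            ps.executeUpdate();
            try (ResultSet keys = ps.getGeneratedKeys()) {
                keys.next();
                return keys.getLong(1);
            }
        }
    }
}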
If none of these are viable for you, e.g. because you want some fancy logic for generating your transaction ids, then the logic of reading, incrementing, and storing the value needs to be enclosed in a database transaction, with the counter row locked while you hold it. Statements like
start transaction;
select key from current_key for update;
update current_key set key = key + 1;
commit;
will prevent collisions on the key value. However, make sure that your transactions are short, in particular that you don't leave a transaction open during a wait for user input. Otherwise, other users' transactions may be blocked too long.

One entity, several models in different bounded contexts with different databases

Let's say we have an Order entity that will be modeled in 2 different BCs in an e-commerce application.
The first BC is Order Placement. This BC takes care of collecting all orders placed by our customers on our different websites, validates them, and populates its database with Orders whose state is either Placed or Rejected.
The 2nd BC is Shipment. This allows the employees in the warehouses to mark an Order as Shipped in its database once it leaves the warehouse.
Now since both BCs use different databases which are empty at first, there will be a need to inform the Shipment BC of the orders that were Placed, so that when a scanner wants to Ship an Order it will be there in the Shipment BC.
My initial approach was to create a domain event once an Order is placed in the Order placement BC and have the Shipment BC subscribe to that event and create a corresponding Order entity in its database for every order placed.
However, I can't stop that feeling that I'm duplicating data across different databases.
My second approach is to ask the Order Placement BC for the Order entity each time an order is being shipped, but I would still need to maintain the state of the Order in case the shipment fails.
Is there a better approach to all this from a DDD POV?
Your first approach is perfectly fine in my opinion. You are not duplicating data, because as you already noticed, that data is from another context. Same data in different contexts means different things.
As Vaughn Vernon pointed out in his book «Implementing Domain-Driven Design»: "A greater degree of autonomy can be achieved when dependent state is already in place in our local system. Some may think of this as a cache of whole dependent objects, but that's not usually the case when using DDD. Instead we create local domain objects translated from the foreign model, maintaining only the minimal amount of state needed by the local model."
So copying data is okay as long as it is the only data other BCs need.
But he also mentions that if you use exact copies, it might be a sign of a modeling problem.
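To make the first approach concrete, here is a minimal sketch with hypothetical names: the Order Placement BC publishes an OrderPlaced event, and a subscriber in the Shipment BC translates it into its own, much smaller Order model:

import java.time.Instant;

// Published by the Order Placement BC when an Order reaches the Placed state.
record OrderPlaced(String orderId, String shippingAddress, Instant placedAt) {}

// Shipment BC: keeps only the state it needs to mark an order as shipped.
class ShipmentOrder {
    enum Status { AWAITING_SHIPMENT, SHIPPED }

    final String orderId;
    final String shippingAddress;
    Status status = Status.AWAITING_SHIPMENT;

    ShipmentOrder(String orderId, String shippingAddress) {
        this.orderId = orderId;
        this.shippingAddress = shippingAddress;
    }
}

// Subscriber in the Shipment BC; the repository is a hypothetical local store.
class OrderPlacedHandler {

    interface ShipmentOrderRepository {
        void save(ShipmentOrder order);
    }

    private final ShipmentOrderRepository repository;

    OrderPlacedHandler(ShipmentOrderRepository repository) {
        this.repository = repository;
    }

    void on(OrderPlaced event) {
        // Translate the foreign model into the local one; no full Order copy is kept.
        repository.save(new ShipmentOrder(event.orderId(), event.shippingAddress()));
    }
}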

SQL/MySQL structure (Denormalize or keep relational)

I have a question about best practices related to de-normalization or table hierarchy relationships.
For a simple example, let's say I have an app that allows a user to make a payment for an order. I save the order information in the orders table, and I have another table for the payment called payments. Payments has a foreign key to the orders table.
Let's assume that I can pay with a credit card, check, or paypal, and I want to save the information about the payment.
My question is: what is the best way to handle the relationship between the different payment data and the payments table? The types of payment all have different data associated with them. So do I denormalize the payments table, putting credit card, check, and paypal information fields in there and then just use the fields as necessary. Alternatively, I could specify a payment type and store the information in their own tables, but then I would have to use logic on an application level to get the data out of the correct credit card, check or paypal information tables...
I would choose to keep the database normalized.
but then I would have to use logic on an application level to get the data out of the correct credit card, check or paypal information tables...
You have to use logic (or at least mapping) in either case, whether it's which table to pull the data from or which fields in the table to access.
What about keeping it denormalized and then making a view to put the data back together again? You get the best of both worlds. IIRC, MySQL introduced views in version 5.
So do I denormalize the payments table, putting credit card, check, and paypal information fields in there and then just use the fields as necessary.
Yes. But this is not "denormalizing". If you stored order information in the client table, that would be denormalizing. Adding nullable columns to accurately describe a payment in the payments table is not.
You can use the idea of table per subclass as the ORM tools do. This would require a join for each query against the payment table but...
Create tables for each payment type, so you will have a creditcardpayment and a checkpayment table. The common fields go in the payment table; the specific fields go in the sub tables. The sub tables' primary keys are foreign keys to the payment table's id.
To add a new payment you have to first insert the common fields into the payment table, get the id generated, then insert the specific fields into the specific sub table.
To query you have to join the subtables with the payment table. You could use a view to make that easier.
This way the database is still normalized and you have no null columns.
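Since this is described as what the ORM tools do, the same table-per-subclass layout mapped with JPA joined inheritance looks roughly like this (entity and field names are illustrative; jakarta.persistence may be javax.persistence in older setups):

import jakarta.persistence.Entity;
import jakarta.persistence.GeneratedValue;
import jakarta.persistence.GenerationType;
import jakarta.persistence.Id;
import jakarta.persistence.Inheritance;
import jakarta.persistence.InheritanceType;
import java.math.BigDecimal;

// Common fields live in the payment table.
@Entity
@Inheritance(strategy = InheritanceType.JOINED)
public class Payment {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    Long id;
    BigDecimal amount;
}

// Each subclass gets its own table whose primary key is also a foreign key
// back to payment.id; loading a CreditCardPayment joins the two tables.
@Entity
class CreditCardPayment extends Payment {
    String cardNumber;
    String expiryMonth;
}

@Entity
class CheckPayment extends Payment {
    String checkNumber;
    String bankRoutingNumber;
}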
It partially depends on the framework (if any) that you are using. For instance: the Ruby on Rails way would generally be to store the type of the payment in the payments table and then have different, separate tables for each payment type (PayPal, Credit Card, etc).
Alternatively, if you notice that you are repeating the same data in many of the tables, Rails has a way to store all of the data in the same table, using only the fields you need, but still allowing you to have separate objects. For instance, you would have an AbstractPayment object with an abstract_payments table, but you would also have PayPalPayment and CreditCardPayment objects that both inherit from AbstractPayment and use the abstract_payments table. All you need to determine the payment type is a column in abstract_payments that tells you which type it is (probably a string, but could be an integer if you so choose). This is called STI.
No matter what framework/language you use, the same ideas can definitely apply and I think the right solution will depend on how many different types of payments you have, compared with how simple you want your database to be.
Keep it as normalized as possible. Only de-normalize when the performance of a fully normalized schema requires denormalization to improve response time, and do that only on a case-by-case basis to deal with specific performance issues associated with individual queries within your application.
These are complex problems. Database normalization requires intimate domain knowledge, and a skilled analysis of how that domain model will be manipulated and utilized within your application. Denormalizing for performance requires that you understand your application's usage patterns well enough to predict performance issues before they occur (waiting until they actually occur in production is too late - by then, making fundamental schema changes in the database is very expensive) and that you know which denormalization techniques to use to address them.
You need to weigh the following factors:
How much space you will waste if you put all data into a single table.
How complex the SQL queries will become in either case.
If you use different tables, you'll have to use joins. If you put everything into a single table, you'll need to find some magic to "ignore" the rows which don't matter (say, when you want to find all credit card payments, your query must ignore everything else).
The latter part gets easier when you move the special data into special tables, at the cost of more complex joins.