MS-SQL content-aware unique constraint for multiple columns possible? - sql

I'm building a process engine and want to ensure that only a single process instance is pending or active at any given time for a given case. However, an identical process can be started once the previous instance finishes. I would like to use a database constraint, for maximum reliability in the face of unreliable networks and lost responses.
I have a table with the following columns:
process_id (PRIMARY KEY)
case_id (FOREIGN KEY)
state (PENDING | ACTIVE | COMPLETED)
Multiple rows with the same case_id and state COMPLETED are allowed, to support restarts. However, I would like to prevent duplicate rows with state PENDING or ACTIVE, to enforce my requirement of a single process instance at a time.
Here is an example of an allowed scenario, with identical cases having state COMPLETED:
process_id  case_id  state
------------------------------
1           1        COMPLETED
2           1        COMPLETED  <-- case_id duplicate allowed for multiple COMPLETED
3           1        PENDING
Here is an example of a disallowed scenario, with identical cases having state PENDING:
process_id  case_id  state
------------------------------
1           1        COMPLETED
2           1        COMPLETED
3           1        PENDING
4           1        PENDING    <-- case_id duplicate not allowed for multiple PENDING!
And here is an example of a disallowed scenario, with identical cases having a combination of the states PENDING and ACTIVE:
process_id  case_id  state
------------------------------
1           1        COMPLETED
2           1        COMPLETED
3           1        ACTIVE
4           1        PENDING    <-- case_id duplicate not allowed for a combination of ACTIVE and PENDING!
Is this kind of content-aware unique constraint possible in MS-SQL?

Based on the sample, it seems that you just need a filtered UNIQUE INDEX on case_id where state does not have a value of 'COMPLETED':
CREATE UNIQUE INDEX UQ_IncompleteCases ON dbo.YourTable (case_id)
WHERE state != 'COMPLETED';
This will allow any number of rows for a single case_id where state is 'COMPLETED', but only one row per case_id for all other values of state.
Note that if state is NULLable, you may need to add logic to the WHERE clause above to include or exclude NULLs.
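For completeness, the same idea can be reproduced with SQLite's partial indexes via Python's bundled sqlite3 module; this is a runnable sketch of the filtered-unique-index concept, not SQL Server itself, and the table name is illustrative:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE process (
    process_id INTEGER PRIMARY KEY,
    case_id    INTEGER NOT NULL,
    state      TEXT NOT NULL CHECK (state IN ('PENDING', 'ACTIVE', 'COMPLETED'))
);
-- At most one non-COMPLETED row per case_id (the filtered unique index).
CREATE UNIQUE INDEX UQ_IncompleteCases ON process (case_id)
WHERE state <> 'COMPLETED';
""")

conn.executemany("INSERT INTO process VALUES (?, ?, ?)",
                 [(1, 1, 'COMPLETED'), (2, 1, 'COMPLETED'), (3, 1, 'PENDING')])

# A second PENDING/ACTIVE row for case 1 violates the partial unique index.
try:
    conn.execute("INSERT INTO process VALUES (4, 1, 'ACTIVE')")
    blocked = False
except sqlite3.IntegrityError:
    blocked = True

# Another COMPLETED row for the same case is still allowed.
conn.execute("INSERT INTO process VALUES (5, 1, 'COMPLETED')")
```

Because the constraint lives in the database, it holds even when a retry after a lost response tries to start the process a second time.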

Related

Handling CustomField Insert - Which option is efficient and easier to maintain

We have the following Table structure:
User Table
UserId  CompanyId  FullName  Email
1       1          Alex      alex@alex.com
2       1          Sam       sam@sam.com
3       2          Rohit     rohit@rohit.com
CustomField Table
CustomFieldId  CompanyId  Name         Type
1              1          DOB          Datetime
2              1          CompanySize  Number
3              2          LandingPage  Text
CustomFieldValue Table
UserId  CustomFieldId  DatetimeValue  NumberValue  TextValue
1       1              01-01-2020
1       2                             10
1       3                                          Home
2       1
2       2                             20
2       3                                          Product
Please consider the following facts:
There are millions of users in a particular CompanyId
When displaying a particular user in the UI we need to show all the Custom Fields that an end customer can fill up.
How should we handle the CustomFieldValue table in this case? We are considering the following options:
Option 1: When a new CustomField row is created for a particular CompanyId, use an AFTER INSERT trigger to create all corresponding rows in the CustomFieldValue table for all users.
I think this would have an initial cost of creating many rows in the CustomFieldValue table for each Custom Field. (This may also lock up the table, and users of the application would have to wait until all the inserts are done.)
The same issue applies to deleting all CustomFieldValue rows when a CustomField row is deleted from a Company.
But it is easier for UI and backend developers, as they don't need to worry about a CustomFieldValue entry missing for a Custom Field that has been created for a Company.
Option 2: Don't create CustomFieldValue rows when a CustomField is added to the Company. Create the CustomFieldValue whenever the user fills in the relevant input field in the UI.
This would have negligible insert cost, and users would not have to wait for inserts or deletes to complete in the CustomFieldValue table for all the users in a particular company.
The downside is that developers would have to make sure the relevant CustomFields are displayed in the frontend even though no corresponding records exist yet in the CustomFieldValue table.
On each Custom Field input update by the end user, the developers would have to first check whether a corresponding CustomFieldValue row exists; if so, store the updated value, if not, create the CustomFieldValue row.
Kindly suggest a solution which is efficient and easier to maintain.
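The check-then-write step in the second option can be collapsed into a single upsert statement, which also removes the race between the existence check and the write. A minimal sketch using Python's sqlite3 module (illustrative names; the three typed value columns are flattened into one here, and SQL Server would express the same thing with MERGE or an UPDATE/INSERT pair):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE custom_field_value (
    user_id         INTEGER NOT NULL,
    custom_field_id INTEGER NOT NULL,
    value           TEXT,
    PRIMARY KEY (user_id, custom_field_id)
)""")

def save_value(user_id, field_id, value):
    # One statement instead of SELECT-then-INSERT/UPDATE: the database
    # either inserts a new row or updates the existing one atomically.
    conn.execute("""
        INSERT INTO custom_field_value (user_id, custom_field_id, value)
        VALUES (?, ?, ?)
        ON CONFLICT (user_id, custom_field_id) DO UPDATE SET value = excluded.value
    """, (user_id, field_id, value))

save_value(1, 1, "01-01-2020")
save_value(1, 1, "02-02-2020")   # second call updates the same row in place
```

With an upsert like this, the missing-row case costs developers nothing extra, which removes most of the maintenance argument for pre-creating millions of empty rows.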

How safe is it to delete rows with a query in the WHERE clause

I have an object Head (table HEAD) which can have 0 to n positions (table POSITION).
HEAD table
HEAD_ID  NAME
1        A
2        B
POSITION table
HEAD_ID  POS_ID  VALUE
1        1       X
1        2       Y
2        1       Z
3        1       DELETE ME
Unfortunately it is not possible to create foreign keys to maintain the data integrity. Therefore I want to create a delete script to delete positions which do not have a corresponding head.
My delete script:
DELETE FROM POSITION
WHERE HEAD_ID NOT IN (SELECT HEAD_ID FROM HEAD);
Question: How does the command behave if rows are inserted into the tables while the delete script is executing? In my scenario both tables have several tens of thousands of rows and the search may take some time.
If I understand it correctly, the list of HEAD_IDs from HEAD is built once at the beginning of the command. Therefore newly added rows will not be in the list and will be deleted. Is that correct?
The command would delete the position with HEAD_ID = 3 and POS_ID = 1 in my example, since the head is missing.
But how does it work if, after the SELECT and before the DELETE, new entries are added to both tables:
HEAD table
HEAD_ID  NAME
1        A
2        B
4        NEW HEAD
POSITION table
HEAD_ID  POS_ID  VALUE
1        1       X
1        2       Y
2        1       Z
3        1       DELETE ME
4        1       WILL I BE DELETED?
Will the new position with HEAD_ID = 4 and POS_ID = 1 be deleted, since its head was not in the SELECT?
Is there any way to perform the task more safely?
You can use MySQL's table locking feature for your delete operation; then no new data can be inserted until your delete operation finishes.
First, lock the table:
LOCK TABLES table_name WRITE;
Then do your delete operation, and afterwards release the table lock:
UNLOCK TABLES;
MySQL allows a client session to explicitly acquire a table lock to prevent other sessions from accessing the same table during a specific period.
A client session can acquire or release table locks only for itself; it cannot acquire or release table locks for other client sessions.
Ref : https://www.mysqltutorial.org/mysql-table-locking/
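An alternative to locking the whole table is to let one DELETE statement do all the work with a correlated NOT EXISTS; a single statement executes atomically, closing the gap between a separate SELECT and DELETE (exact visibility of concurrent inserts still depends on the engine and isolation level), and NOT EXISTS also sidesteps the NULL pitfalls of NOT IN. A sketch of the query shape using Python's sqlite3, with the sample data from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE head (head_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE position (head_id INTEGER, pos_id INTEGER, value TEXT);
INSERT INTO head VALUES (1, 'A'), (2, 'B');
INSERT INTO position VALUES (1, 1, 'X'), (1, 2, 'Y'), (2, 1, 'Z'),
                            (3, 1, 'DELETE ME');
""")

# One atomic statement: delete positions whose head does not exist.
conn.execute("""
DELETE FROM position
WHERE NOT EXISTS (SELECT 1 FROM head WHERE head.head_id = position.head_id)
""")

remaining = [r[0] for r in
             conn.execute("SELECT head_id FROM position ORDER BY head_id")]
```

Only the orphaned HEAD_ID = 3 row is removed; the positions with existing heads survive.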

How can I disable unique constraints during SaveChanges in Entity Framework

I have come across an interesting problem in Entity Framework 6 and SQL Server.
We have a table with a Composite Key. Here is an example;
ID Col1 Col2
-- ---- ----
1 1 1
2 1 2
3 2 1
4 2 2
5 2 3
So, Col2 is unique for each Col1. I have a requirement to swap 2 values to produce this desired result...
ID Col1 Col2
-- ---- ----
1 1 2
2 1 1
I am using Entity Framework, load the object from the database, make my change, and call SaveChanges.
I Receive the exception: "Violation of UNIQUE KEY constraint 'UQ_TableA_Constraint1'. Cannot insert duplicate key in object 'dbo.TableA'."
Supposedly, SaveChanges is called in a transaction. The EF source seems to indicate it is, and the fact that a failed update is atomic suggests this is working. However, it also appears that updates are performed row by row, even inside the transaction, so EF first performs an update to record 1, which temporarily produces a duplicate unique key.
Is there a way to mitigate this? I would rather not update to a temp value, call SaveChanges, and then update to the correct value, as this could potentially fail and leave the data in an incorrect state.
Are there any options?
I hope I understand your question.
Constraints must be satisfied inside a transaction too; this is SQL Server (and other DBMSs) behaviour, not EF behaviour.
You can use a temporary value inside a transaction to be sure that everything goes well.
If you need to run update queries (on multiple records), you can use an external library, https://github.com/loresoft/EntityFramework.Extended (but if I understand your question correctly, it won't solve the constraint issue).
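The temporary-value approach is safe as long as all the steps run in one transaction: the parked value is never visible to other sessions, and any failure rolls the whole swap back. A minimal sketch using Python's sqlite3 (illustrative table and column names; EF would issue the equivalent UPDATE statements):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table_a (
    id   INTEGER PRIMARY KEY,
    col1 INTEGER,
    col2 INTEGER,
    UNIQUE (col1, col2)          -- the composite unique key from the question
);
INSERT INTO table_a VALUES (1, 1, 1), (2, 1, 2);
""")

def swap_col2(id_a, id_b):
    # "with conn" wraps the statements in one transaction:
    # committed on success, rolled back on any exception.
    with conn:
        a = conn.execute("SELECT col2 FROM table_a WHERE id = ?",
                         (id_a,)).fetchone()[0]
        b = conn.execute("SELECT col2 FROM table_a WHERE id = ?",
                         (id_b,)).fetchone()[0]
        # Park row A on a value that can never collide with real data,
        # so no intermediate statement ever creates a duplicate key.
        conn.execute("UPDATE table_a SET col2 = -1 WHERE id = ?", (id_a,))
        conn.execute("UPDATE table_a SET col2 = ? WHERE id = ?", (a, id_b))
        conn.execute("UPDATE table_a SET col2 = ? WHERE id = ?", (b, id_a))

swap_col2(1, 2)
```

As a side note, a single set-based UPDATE that swaps both rows at once also works in SQL Server, because constraints there are checked at the end of each statement; the exception in the question arises only because EF sends one UPDATE per entity.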

How do you constrain a composite key that has a large number of non-unique combinations?

So given a table structure that looks something like this:
Order_date DATE
Order_id NUMBER
State VARCHAR2(16)
...
other properties/attributes
Keep in mind that I could use a sequence of integers here and generate a PK; however, that does not interest me because of how I use this table in the main application.
So the composite key is made of Order_date, Order_id and State. The problem with this combination is that it is not necessarily unique, but it is constrained in a way.
Ex:
Order_date | Order_id | State
------------------------------
21-09-2014 | 7218821  | Pending
22-09-2014 | 2771272  | Pending
20-09-2014 | 3277127  | Approved
13-08-2014 | 2218765  | Done
13-08-2014 | 2218765  | Cancelled
Constraints:
A combination of the same order_date and order_id with state Done must not be duplicated in this table.
There can be any number of rows with the same order_date and order_id with any state other than Done.
You cannot add a record with state DONE or ERROR.
You cannot skip from one state to another by bypassing their natural sequence (REGISTERED -> PENDING -> APPROVED -> DONE | CANCELLED | ERROR).
What would be the best way for me to implement these constraints in an Oracle database?
The first is handled by a primary key or unique constraint.
The second is trickier. It can be handled with a function-based unique index, because Oracle does not create an index entry for a row when every indexed expression is NULL, so rows in states other than DONE stay out of the index entirely:
create unique index unq_order_date_id_done on
  orders ((case when state = 'DONE' then order_date end),
          (case when state = 'DONE' then order_id end));
I think the third and fourth need a trigger to prevent the value from being added.
Bullet by bullet:
This is most likely true with no monitoring needed. Although you don't show it, the DATE field contains the time down to the second. In order to have a duplicate, the state for the same order will have to be changed twice within the same second.
Doubtful. Unless your processing allows for multiple state changes for the same order within a second of each other.
Your example data shows a state of DONE. How did that get there?
Your description states that after APPROVED, the only allowed states are DONE, CANCELLED or ERROR. Your example data shows an order going from DONE to CANCELLED, which does not seem to be allowed. Actually, your third bullet suggests that a status of ERROR is not allowed under any circumstances.
The only way you can have duplicate (order_id, order_date) values is if status changes occur very quickly, within the same second, or if you truncate the time values from the date fields. The latter doesn't seem likely, as there is no reason to discard information as valuable as the time at which a status change was recorded: you get no benefit and processing becomes more difficult, lose/lose.
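The "only one DONE row, any number of other states" rule from the answer can also be expressed with a partial unique index where the engine supports one; this runnable sketch uses SQLite through Python purely to illustrate the idea behind the function-based index (table and index names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    order_date TEXT,
    order_id   INTEGER,
    state      TEXT
);
-- At most one DONE row per (order_date, order_id); other states unrestricted.
CREATE UNIQUE INDEX unq_order_date_id_done ON orders (order_date, order_id)
WHERE state = 'DONE';
INSERT INTO orders VALUES ('13-08-2014', 2218765, 'PENDING');
INSERT INTO orders VALUES ('13-08-2014', 2218765, 'APPROVED');
INSERT INTO orders VALUES ('13-08-2014', 2218765, 'DONE');
""")

# A second DONE row for the same (order_date, order_id) is rejected.
try:
    conn.execute("INSERT INTO orders VALUES ('13-08-2014', 2218765, 'DONE')")
    duplicate_done_allowed = True
except sqlite3.IntegrityError:
    duplicate_done_allowed = False
```

The state-sequence rules (bullets 3 and 4) still need a trigger or application logic, as the answer says; an index can only constrain what rows may coexist, not how they transition.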

Database Insert Mechanism

I have a question about the insert mechanism in different databases. Supposing a table with a single-column primary key that is automatically generated (like identity columns), will the entire table become locked when inserting a new record? And if the insert takes too much time, will other transactions have to wait longer?
By default, Oracle uses row-level locks.
These locks block only writers (UPDATE, DELETE, INSERT, etc.). That means SELECT keeps working even while a table is being heavily updated or deleted from.
For example, consider tableA(col1 number, col2 number) with this data in it:
col1 | col2
1 | 10
2 | 20
3 | 30
If user John issues at time1:
update tableA set col2=11 where col1=1;
it will lock row 1.
At time2, user Mark issues:
update tableA set col2=22 where col1=2;
The update will work, because row 2 is not locked.
Now the table looks like this in the database:
col1 | col2
1 | 11 --locked by john
2 | 22 --locked by mark
3 | 30
For Mark, the table is (he does not see John's uncommitted changes):
col1 | col2
1 | 10
2 | 22
3 | 30
For John, the table is (he does not see Mark's uncommitted changes):
col1 | col2
1 | 11
2 | 20
3 | 30
If Mark tries at time3:
update tableA set col2=12 where col1=1;
his session will hang until time4, when John issues a commit. (A rollback would also unlock the rows, but the changes would be lost.)
The table in the database at time4 is:
col1 | col2
1 | 11
2 | 22 --locked by mark
3 | 30
Immediately after John's commit, row 1 is unlocked and Mark's update will do its job:
col1 | col2
1 | 12 --locked by mark
2 | 22 --locked by mark
3 | 30
Let's say Mark issues a rollback at time5:
col1 | col2
1 | 11
2 | 20
3 | 30
The insert case is simpler, because inserted rows are locked but are also not seen by other users, since they are not committed. When the user commits, he also releases the locks, so other users can see these rows, update them, or delete them.
EDIT: As Jeffrey Kemp explained, when you have a PK (implemented in Oracle with a unique index), if two sessions try to insert the same value (which would create a duplicate), the locking happens in the index. The second session will be blocked until the first session ends, because it tries to write in the same place. If the first session commits, the second will throw a "primary key violated" exception and fail to change the database. If the first session rolls back, the second will succeed (if no other problem appears).
(NB: In this explanation by user John I mean a session started by user John.)
Inserting will not lock the table. The inserted records will not be visible to other sessions until you commit.
Your question is relevant to any case where you are inserting into a table with a unique constraint. If there were no index and you inserted a row into the table, you'd expect the database to need to lock the entire table; otherwise duplicates might be inserted in a multi-user system.
However, Oracle always polices unique constraints with an index. This means that the data for the column is always sorted, and it can quickly and easily determine whether a conflicting row already exists. To protect against multiple sessions trying to insert the same value at the same time, Oracle will just lock the block in the index for that value - in this way, you won't get contention for the whole table, only for the particular value you're inserting. And since an index lookup is typically very fast, the lock will only need to be held for a very small period of time.
(But now, you might ask, what if a session inserts a value but doesn't commit straight away? What if another session tries to insert the same value? The answer is, the second session will wait. This is because it will request a lock on the same index block, but since the first session hasn't committed yet, the block will still be locked. It must wait because it cannot know if the first session will commit or rollback.)
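The wait-then-fail-or-succeed protocol described in the last two paragraphs can be sketched as a toy simulation; this is an illustration of the locking logic, not Oracle's actual implementation, and all class and method names are invented for the sketch. Each value guards its index slot with a lock, a second inserter blocks on it, and the first session's commit or rollback decides what the blocked session sees:

```python
import threading

class UniqueIndex:
    """Toy model: per-value locks serialize concurrent inserts of the same key."""
    def __init__(self):
        self._locks = {}          # value -> lock held by the inserting session
        self._committed = set()   # values visible to everyone
        self._guard = threading.Lock()

    def _lock_for(self, value):
        with self._guard:
            return self._locks.setdefault(value, threading.Lock())

    def insert(self, value):
        lock = self._lock_for(value)
        lock.acquire()            # a second session blocks here
        if value in self._committed:
            lock.release()
            raise ValueError("unique constraint violated")
        return lock               # held until commit/rollback

    def commit(self, value, lock):
        self._committed.add(value)
        lock.release()

    def rollback(self, value, lock):
        lock.release()

idx = UniqueIndex()
lock1 = idx.insert(42)            # session 1 inserts but does not commit yet

results = []
def session2():
    try:
        l = idx.insert(42)        # blocks until session 1 finishes
        idx.rollback(42, l)
        results.append("inserted")
    except ValueError:
        results.append("duplicate")

t = threading.Thread(target=session2)
t.start()
idx.commit(42, lock1)             # session 1 commits -> session 2 gets the error
t.join()
```

Had session 1 called rollback instead of commit, session 2's insert would have succeeded, matching the behaviour described above.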