Update WHERE (SELECT COUNT(*)) atomicity and race conditions. Suggestions? [closed] - sql

I have a table for a booking system and I want to enforce a certain constraint atomically. Simply put, I just want to conditionally insert a row into that table. I don't want to read-prepare-write because that would cause race conditions, so I decided to insert an initial row, then update it with a sub-query condition and check the affected row count.
Under concurrent requests, affectedRowsCount is always 1 for every request, which indicates a race condition. I know that the SERIALIZABLE isolation level and locking mechanisms would help, but I want to discuss other, less strict approaches.
Pseudo Code
Start transaction
Insert a single row into table Reservations (let's call it Row)
affectedRowsCount = Update Reservations where ID = Row.id AND (SELECT COUNT(*) FROM Reservations WHERE ...) < some integer
if (affectedRowsCount === 0) throw Already Reserved Error
Commit transaction
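For concreteness, here is a sketch of that pseudocode in PostgreSQL-flavoured SQL; the table, column names and the limit of 5 are assumptions for illustration only:

BEGIN;
-- insert the tentative reservation first
INSERT INTO reservations (id, room_id, reserved_at, confirmed)
VALUES (1001, 42, now(), false);
-- confirm it only if the room is not already fully booked; under READ COMMITTED
-- two concurrent transactions can both pass this check, which is the race described above
UPDATE reservations
SET confirmed = true
WHERE id = 1001
  AND (SELECT COUNT(*) FROM reservations WHERE room_id = 42 AND confirmed) < 5;
-- the application reads the affected row count here and throws "Already Reserved" if it is 0
COMMIT;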

There is no way to do this except
using SERIALIZABLE transaction isolation
locking everything in sight to serialize operations
It is unclear what exactly you are trying to do, but perhaps an exclusion constraint on timestamp ranges can help.
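For example, a sketch of such an exclusion constraint in PostgreSQL (table and column names are assumed; the btree_gist extension is needed to combine equality on room_id with range overlap):

CREATE EXTENSION IF NOT EXISTS btree_gist;

CREATE TABLE reservation (
    room_id int NOT NULL,
    during  tsrange NOT NULL,
    -- reject any two rows for the same room whose time ranges overlap
    EXCLUDE USING gist (room_id WITH =, during WITH &&)
);

-- the second insert fails with a constraint violation instead of silently double-booking
INSERT INTO reservation VALUES (42, '[2024-01-01 10:00, 2024-01-01 11:00)');
INSERT INTO reservation VALUES (42, '[2024-01-01 10:30, 2024-01-01 11:30)');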

In general, the way to prevent other queries from having access to a row (or rows) for locking purposes is to use SELECT FOR UPDATE. I'm not sure if you're using PostgreSQL or SQLite, but you can read about the PostgreSQL functionality here. SQLite does not support it.
The idea is that you can lock the row in which you are interested, do whatever operations you need to without worrying about other queries updating that row, and then commit your transaction.
A common scenario for this would be when you're trying to book a reservation, as it looks like your example may be doing something along those lines. We would do a SELECT FOR UPDATE on the row containing the resource we want to book, then check the available dates the user is wanting to book, and once we have ensured that the dates are available for that resource, go ahead and book it. The SELECT FOR UPDATE prevents the possibility of other people trying to book the same resource at the same time we are.
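A rough sketch of that flow, assuming PostgreSQL and made-up table names:

BEGIN;
-- lock the resource row so no concurrent transaction can book it at the same time;
-- a second transaction running this statement blocks until we commit or roll back
SELECT * FROM rooms WHERE id = 42 FOR UPDATE;

-- now it is safe to check availability for the requested dates
SELECT COUNT(*) FROM reservations
WHERE room_id = 42
  AND starts_at < '2024-01-01 11:00'
  AND ends_at   > '2024-01-01 10:00';

-- if the count is 0, the slot is free: record the booking and release the lock
INSERT INTO reservations (room_id, starts_at, ends_at)
VALUES (42, '2024-01-01 10:00', '2024-01-01 11:00');
COMMIT;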

Related

EndDate on Dimension Table - Should we go with NULL or 99991231 Date Value [closed]

I am building a Data Warehouse on SQL Server and I was wondering what is the best approach in handling the current record in a dimension table (SCD type 2) with respect to the 'end_date' attribute.
For the current record, we have the option of using a date literal such as '12/31/9999' or specify it as NULL. The dimension tables also have an additional 'current_flag' attribute in addition to 'start_date' and 'end_date'.
It is probably a minor design decision, but I just wanted to see if there are any advantages of using one over the other, in terms of query performance or in any other way.
I have seen systems written both ways. Personally, I go for the infinite end date (but not NULL), and the reason is simple: it is easier to validate that the type-2 records are properly tiled, with no gaps or overlaps. I prefer only one validation to two -- the other being the validation of the is_current flag. There is also only one correct way of accessing the data.
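For example, with a sentinel end date and the common convention that each row's end_date equals the next row's start_date, the tiling check is a single query (dim_customer and its columns are assumed names):

SELECT *
FROM (
    SELECT customer_key, start_date, end_date,
           LEAD(start_date) OVER (PARTITION BY customer_key ORDER BY start_date) AS next_start
    FROM dim_customer
) t
WHERE next_start IS NOT NULL
  AND end_date <> next_start;   -- zero rows back means no gaps and no overlaps

With NULL end dates, the same check needs extra handling for the open-ended row, which is the second validation alluded to above.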
That said, a system that I'm currently working on also publishes a view with only the current records. That is handy.
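Such a view is trivial to publish when the open-ended rows carry the sentinel date (names assumed):

CREATE VIEW dim_customer_current AS
SELECT customer_key, customer_name, start_date, end_date
FROM dim_customer
WHERE end_date = '9999-12-31';   -- or WHERE end_date IS NULL under the other convention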
That system is not in SQL Server. One optimization that you can attempt is clustering so that the current records are all colocated -- assuming they are much more commonly accessed. You can do this using either method. Using a clustered index like this makes updates more expensive, but it can be handy for optimizing memory.

Can converting a SQL query to PL/SQL improve performance in Oracle 12c? [closed]

I have been given an 800-line SQL query which is taking around 20 hours to fetch around 400 million records.
There are 13 tables which are partitioned by month.
The tables have records ranging from 10k to 400 million in each partition.
The tables are indexed on primary keys.
The query uses many inline views and outer joins and a few group by functions.
DBAs say we cannot add more indexes as it would slow down the performance since it is an OLTP system.
I have been asked to convert the query logic to PL/SQL and then populate a table in chunks. Then do a select * from that table.
My end result should be a query which can be fed to my application.
So even after I use PL/SQL to populate a table in chunks, ultimately I need to fetch the data from that table with a query.
My question is: since the PL/SQL approach would require both the select and the insert, is there any chance that PL/SQL can be faster than SQL?
Are there any cases where PL/SQL is faster for a result that is also achievable with plain SQL?
I will be happy to provide more information if the given info doesn't suffice.
Implementing it as a stored procedure could be faster because the SQL will already be parsed and compiled when the procedure is created. However, given the volume of data you are describing, it's unclear if this will make a significant difference. All you can do is try it and see.
I think you really need to identify where the performance problem is, i.e. where the time is being spent. For example (and I have seen examples of this many times), the majority of the time might be in fetching the 400M rows to whatever the "client" is. In that case, re-writing the query, or re-writing it as PL/SQL, will make no difference.
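As a first step in that direction, one option (a sketch; it assumes you can identify the statement's SQL_ID) is to look at the cumulative execution statistics Oracle keeps in V$SQL:

-- elapsed_time and cpu_time are reported in microseconds
SELECT sql_id,
       executions,
       rows_processed,
       elapsed_time / 1000000 AS elapsed_seconds,
       cpu_time     / 1000000 AS cpu_seconds,
       disk_reads,
       buffer_gets
FROM v$sql
WHERE sql_id = '&your_sql_id';   -- substitute the SQL_ID of the long-running query

A large gap between cpu_seconds and elapsed_seconds usually points at waits (typically I/O) rather than at the query logic itself.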
Anyway, once you can enumerate the problem, you have a better chance of getting sound answers, rather than guesses...

Oracle: How to identify data and schema changes [closed]

I have a requirement to gather the data changes or schema changes that occurred in the database after executing a nightly batch. For example, there is a table employee which has two records. Suppose that after the nightly batch one record is inserted and one record is updated. I want to capture which record was updated and which record was inserted. I am using an Oracle database. I am looking for a script to do this, as we have some issues getting licenses for new tools that do this task. So can anyone advise how this can be done programmatically, or using Oracle 11g built-in functions? Any sample code is greatly appreciated. As we have a large number of tables, I am looking for a generic way to do this.
Thanks
I would suggest using triggers on the changes you want to capture and inserting that information into another table that captures those changes.
There's some info right here on Stack Overflow: the best way to track data changes in oracle
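As a minimal sketch of the trigger approach (the table and column names are made up):

CREATE TABLE employee_audit (
    emp_id      NUMBER,
    change_type VARCHAR2(1),              -- 'I' = inserted, 'U' = updated
    changed_at  DATE DEFAULT SYSDATE
);

CREATE OR REPLACE TRIGGER trg_employee_audit
AFTER INSERT OR UPDATE ON employee
FOR EACH ROW
BEGIN
    INSERT INTO employee_audit (emp_id, change_type, changed_at)
    VALUES (:NEW.emp_id,
            CASE WHEN INSERTING THEN 'I' ELSE 'U' END,
            SYSDATE);
END;
/

Since you want something generic across many tables, a trigger like this could be generated per table, for example from the column list in USER_TAB_COLUMNS.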
If triggers are not a viable option, look into INSERTing into 2 tables at once, one being your target table and one being your logging/change capture table.
Here is an example on stackoverflow
Oracle INSERT into two tables in one query
A third option would be table auditing. See the following on stackoverflow
Auditing in Oracle
In OLTP systems, you can add audit columns to the table: create_date and update_date, or last_modified_time and transaction_type.
With create_date and update_date, you can default create_date to SYSDATE and then modify the application logic to maintain update_date. A trigger will also work instead of changing code, at a small cost in performance.
With last_modified_time and transaction_type, you need to update those two fields on insert or update, either in your application logic or with a trigger.
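A sketch of what adding those columns might look like (employee is an assumed table name):

ALTER TABLE employee ADD (
    create_date        DATE DEFAULT SYSDATE,   -- filled automatically on insert
    update_date        DATE,                   -- maintained by the application or a trigger
    last_modified_time DATE,
    transaction_type   VARCHAR2(10)            -- e.g. 'INSERT' or 'UPDATE'
);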

Why would I update a value in a view? [closed]

Just a quick question on my MCSA SQL Server 2012 course.
As part of an exercise I've been asked to create a trigger to stop any updates on a view. I first checked that I could indeed change the value of a column through the view.
Having worked with views before, I know what they are, but I don't know why a view wouldn't stop a change by default in the first place.
1) Why might I want to change a value in the view table?
2) Also, if I updated a value on the view, would anything change it back to reflect what is in the base tables, and if so, when might that happen? E.g. an overnight refresh, or any change to the base table?
Thanks :)
Your question seems to be more concerned with "why" rather than "how." Why would DML be executed against a view instead of directly against a table? The answers are almost too numerable to list here, but here are just a couple of the bigger ones.
For starters, when I design a database, almost every table has at least one view defined for it. If more than one, one is normally the DML view and the others are read-only (trigger that does nothing). No outward-facing and very few inward-facing apps have direct access to the tables. Everything must go through the views.
Why? Because this builds a wall of abstraction between the apps and the underlying tables. I can then constantly tune and tweak the tables and rarely have to make any changes to the apps. I can add or drop columns, change the data type of columns, merge tables together or split a table into two or three separate tables, but the views bring everything back together to how the apps expect to see it. They don't even have to know any changes were made. The apps can write data to what they see as a single "table" and the view trigger assigns the data to the correct tables. The view triggers know how the actual data is stored, the apps don't have to know.
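As a rough illustration of that last point (the view, tables and columns here are invented), an INSTEAD OF trigger can route a write against the view into the real tables:

CREATE VIEW dbo.customer_v AS
SELECT c.customer_id, c.name, a.city
FROM dbo.customer_core c
JOIN dbo.customer_address a ON a.customer_id = c.customer_id;
GO
CREATE TRIGGER dbo.customer_v_ins ON dbo.customer_v
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    -- the trigger, not the app, knows how the data is really stored
    INSERT INTO dbo.customer_core (customer_id, name)
    SELECT customer_id, name FROM inserted;
    INSERT INTO dbo.customer_address (customer_id, city)
    SELECT customer_id, city FROM inserted;
END
GO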
Here's one advantage that is unbeatable. There are many, many useful functions that require the use of a before trigger. Sometimes you just really want to manipulate the data in some way before it goes to the table. Some DBMSs, even major players like SQL Server, have only after triggers. Such a pain.
But front the table with a view, write a trigger on the view, et voila, instant before trigger. This is even handy with other DBMSs like Oracle in that the trigger can update other associated rows in the same table without running afoul of a mutating table condition. I am at this very moment writing a trigger on a view that must update values in several rows every time any Insert or Delete is executed and sometimes for an Update. Almost impossible task without views -- it would have to be done with stored procedures rather than "direct" DML. Handy for me, handy for the apps. Life is good!
Also, take the loop condition caused by trying to keep two tables synched with each other. Any changes made in either table should be sent to the other table. But a trigger on both tables that just mirrors the operation to the other table will create an infinite loop as that trigger turns around and sends it right back. But a trigger on a facing view can perform the DML on its underlying table and on the mirrored table. The view on the mirrored table does the same thing. But since both views operate directly on the tables, there is no infinite loop condition to fall into.
So my question would be more like: why would we ever want users to directly access tables rather than have them go through views?
To answer the question of how to accomplish this with a trigger, you can use the following:
CREATE TRIGGER dbo.demo_trigger ON dbo.demo_view
INSTEAD OF UPDATE --The "INSTEAD OF" is probably why the question is in the course; outside of a view or procedure this tends to be "BEFORE"
AS
BEGIN
SET NOCOUNT ON
ROLLBACK
RAISERROR (50005,10,1,N'abcde')
END
GO
I added a custom error with this:
sp_addmessage @msgnum = 50005
, @severity = 10
, @msgtext = N'Not allowed';
GO
It can be removed again with this:
sp_dropmessage @msgnum = 50005;
Trying to update the view now gives this result:
Not allowed
Msg 3609, Level 16, State 1, Line 1
The transaction ended in the trigger. The batch has been aborted.

Add surrogate id key to keyless tables in legacy database; how might it affect business applications that talk to the database? [closed]

I am asking this because we work with legacy databases all the time, and it can be a problem when you try to put an ORM on a table without primary keys. In this forum, many respondents to this problem have said (quite nonchalantly, in my opinion): just add a primary key to the table. No big deal.
Now I am asking: if you suddenly add primary keys to tables in an existing database, what possible negative effects can you imagine, especially for client applications? If the question is too broad, that pretty much already answers my question.
Before you start, make sure there's not some other mechanism in place to enforce uniqueness. For example, a not null unique declaration is behaviorally equivalent to a primary key declaration. There might be triggers; there might be privileges that require database updates to go through a stored procedure; there might be other mechanisms I haven't thought of this morning.
You seem to be talking about something like adding a column that's an autoincrementing integer, and declaring that to be the primary key.
Assuming that currently there really is no primary key, and assuming there's no other equivalent to a primary key, the main issues involve application code that relies implicitly on either the order of columns or on the number of columns.
For example, this is a valid SQL statement. You might find something like this embedded in application code.
select *
from your_table
order by 1 desc;
That statement will sort the result set in descending order by the values in the first column.
If you add a column to that table, and you position that column as the first column in the table definition, that SQL statement will start returning data in a different order than the client expects.
Some platforms don't allow an alter table statement to change the order of columns; new columns go only at the end of the table definition. But even on those platforms, the DBA can dump data, rewrite the create table statement with the new column first, and reload the data.
This kind of issue--changing the number of columns in a table--can break some data imports and exports.
Insert statements written without column names, like insert into foo values (...) might fail.
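For instance, with a made-up three-column table:

-- works today: foo has exactly three columns
insert into foo values (1, 'widget', 9.99);
-- after an id column is added, this statement can fail with a
-- value-count mismatch, or, if the new column was positioned first,
-- put each value into the wrong column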
Statements like these might also appear in triggers, but that's more a problem with database code than with application code.
There's some chance of performance issues. Using the new column as a clustered index--an option on some platforms--will change the physical order of rows. That will certainly change performance, but I can't predict whether that will certainly be a bad thing. The table will be wider--but not much wider--which means slightly fewer rows will fit on a page on disk.
A resilient solution
Change the name of the existing table. (This is simple, but might not be easy.)
Create an updatable view having the same structure and name as the original table.
Change the structure of the original table.
All application code that used the name of the original table will now use the name of the updatable view instead. Since the view's structure and behavior are identical to the original table, all of the application code should just continue to work. (I'd be surprised if application code needed to know whether it was dealing with a table or a view, but that's a possible problem.)
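A sketch of those three steps (hypothetical names; the rename and identity syntax vary by platform):

-- 1. rename the existing table out of the way
ALTER TABLE customer RENAME TO customer_base;   -- SQL Server uses sp_rename instead

-- 2. re-create the original name as a view with the original column list and order
CREATE VIEW customer AS
SELECT name, city, phone
FROM customer_base;

-- 3. change the underlying table freely, e.g. add the surrogate key
ALTER TABLE customer_base ADD id INTEGER;       -- autoincrement/identity per platform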
Precautions
In any case, you shouldn't make this change in production first. You should make the change locally first, then in your test environment, and finally--when you've exhausted our collective imaginations--move to production.
And move to production a little at a time. Add the column, and wait a while. Populate the column, and wait a while. Repeat until done.