How can I do a real LOCK of a table when inserting in Redshift? I think it's something like the following, but I'm not sure, and the AWS documentation, as always, gives zero input:
begin;
lock table sku_stocks;
insert into sku_stocks
select facility_alias_id, facility_name, CAST(item_name AS bigint), description,
       CAST(prod_type AS smallint), total_available, total_allocated
from tp_sku_stocks;
LOCK has a documented behavior: it obtains an exclusive lock on the table, so that no other session or transaction can do anything to the table.
If you wish to verify this behavior, here's how you can do it (a sketch of the actual commands follows these steps):
Open a connection to the database and invoke the begin and lock commands against your test table.
Open a second connection to the database, and attempt to do a select against that table.
Wait until either the select returns or you are convinced that LOCK behaves as documented.
Execute a rollback in the first session, so that you're not permanently locking the table.
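In terms of actual commands, the sketch would be something like this (using the table from the question):
-- session 1
begin;
lock table sku_stocks;

-- session 2: this select should hang until session 1 ends its transaction
select count(*) from sku_stocks;

-- session 1: release the lock without changing anything
rollback;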
Update
Based on your comments, I think you have a misunderstanding of how transactions work (a short two-session sketch follows these points):
When you start a transaction, Redshift assigns a transaction ID, and marks every row that was changed by that transaction.
SELECTs that read a table while it is being updated in a transaction will see a "snapshot of the data that has already been committed" (quote from above link), not the rows that are being updated inside the transaction.
INSERT/UPDATE/DELETE that try to update a table that is being updated by a transaction will block until the transaction completes (see doc, and note that this behavior is somewhat different than you would see from, say, MySQL).
When you commit/rollback a transaction, any new SELECTs will use the updated data. Any SELECTs that were started during the transaction will continue to use the old data.
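To make these rules concrete, here's a rough two-session sketch against the table from the question (untested; the column names are taken from the original insert):
-- session 1
begin;
update sku_stocks set total_allocated = 0;   -- takes a write lock on sku_stocks

-- session 2, while session 1 is still open
select count(*) from sku_stocks;             -- returns immediately, using the committed snapshot
update sku_stocks set total_available = 0;   -- blocks until session 1 commits or rolls back

-- session 1
commit;                                      -- session 2's update now proceeds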
Given these rules, there's almost no reason for an explicit LOCK statement. Your example update, without the LOCK, will put a write-lock on the table being updated (so guaranteeing that no other query can update it at the same time), and will use a snapshot of the table that it's reading.
If you do use a LOCK, you will block any query that tries to SELECT from the table during the update. You may think this is what you want, to ensure that your users only see the latest data, but consider this: any SELECTs that were started prior to the update will still see the old data.
The only reason that I would use a LOCK statement is if you need to maintain consistency between a group of tables (well, also if you're getting into deadlocks, but you don't indicate that):
begin;
lock TABLE1;
lock TABLE2;
lock TABLE3;
copy TABLE1 from ...
update TABLE2 select ... from TABLE1
update TABLE3 select ... from TABLE2
commit;
In this case, you ensure that TABLE1, TABLE2, and TABLE3 will always remain consistent: queries against any of them will show the same information. Beware, however, that SELECTs that started before the lock will succeed, and show data prior to any of the updates. And SELECTs that start during the transaction won't actually execute until the transaction completes. Your users may not like that if it happens in the middle of their workday.
Related
First of all, sorry if my English is not good.
I'm facing a problem with the transaction isolation level. My current isolation level is READ COMMITTED, but it sometimes leads the table into a deadlock.
For example
create table tmp(id int,name varchar(20))
insert into tmp(id,name)
values(1,'Binesh')
,(2,'Bijesh')
,(3,'Bibesh')
begin transaction
update tmp set name ='Harish' where id=2
And I'm trying to run this in another query window:
select * from tmp where id=1
It locks the table, so it doesn't give any records until I roll back or commit the first one.
I tried
ALTER DATABASE db
SET READ_COMMITTED_SNAPSHOT On
ALTER DATABASE db
SET ALLOW_SNAPSHOT_ISOLATION on
It no longer locks the table, but it gives the old value for id = 2:
select * from tmp where id=2
returns Bijesh, where I'm expecting it to block.
I'm expecting behavior where, if id = 1, it will work fine, but if id = 2, it will wait until the other transaction is over.
Hoping for your help.
Thanks in advance
Binesh Nambiar C
This is not a deadlock you're experiencing.
Your second query is blocked because it has to scan the entire table. During the scan, it encounters the row being updated (exclusively locked), so it waits until the lock is released (i.e. the transaction ends).
If the engine knew that the id column was unique (primary key or unique constraint), you would not get blocking here, because it wouldn't have to scan the entire table but could stop at the first match.
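For example, with the tmp table from the question, making id the clustered primary key should let the second window seek straight to id = 1 instead of scanning past the locked row (untested sketch; the constraint name is made up):
alter table tmp alter column id int not null;
alter table tmp add constraint pk_tmp primary key clustered (id);

-- with the update on id = 2 still uncommitted in the other window:
select * from tmp where id = 1;   -- clustered index seek, no longer blocked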
Keep your transactions short, provide alternative access paths to the data (indexes), and try not to use "select *".
Also, think carefully whether you really want to use the READ UNCOMMITTED isolation level.
I have a situation where I need to select some records from a table, store the primary keys of these records in a temporary table, and apply an exclusive lock to the records in order to ensure that no other sessions process these records. I accomplish this with locking hints:
begin tran
insert into #temp
select pk from myTable with(xlock)
inner join otherTables, etc
(Do something with records in #temp, after which they won't be candidates for selection any more)
commit
The problem is that many more records are being locked than necessary. I'd like to only lock the records that are actually inserted into the temporary table. I was initially setting a flag on the table to indicate the record was in use (as opposed to using lock hints), but this had problems because the database would be left in an invalid state if a situation prevented one or more records being processed.
Maybe setting a different isolation level and using the rowlock table hint is what you're looking for?
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
and
WITH (ROWLOCK)
Or maybe combining XLOCK and ROWLOCK might do the trick?
The documentation says:
XLOCK
Specifies that exclusive locks are to be taken and held until the transaction completes. If specified with ROWLOCK, PAGLOCK, or TABLOCK, the exclusive locks apply to the appropriate level of granularity.
I haven't tried this myself though.
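Applied to the query from the question, the combination might look roughly like this (equally untested; the join is a placeholder, since the real one is elided in the question):
set transaction isolation level serializable;

begin tran
insert into #temp
select m.pk
from myTable m with (xlock, rowlock)
inner join otherTable o on o.myTableId = m.pk   -- placeholder join

-- do something with the records in #temp

commit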
I have a number of tables that get updated through my app which return a lot of data or are difficult to query for changes. To get around this problem, I have created a "LastUpdated" table with a single row and have a trigger on these complex tables which just sets GetDate() against the appropriate column in the LastUpdated table:
CREATE TRIGGER [dbo].[trg_ListItem_LastUpdated] ON [dbo].[tblListItem]
FOR INSERT, UPDATE, DELETE
AS
UPDATE LastUpdated SET ListItems = GetDate()
GO
This way, the clients only have to query this table for the last updated value and then can decided whether or not they need to refresh their data from the complex tables. The complex tables are using snapshot isolation to prevent dirty reads.
In busy systems, around once a day we are getting errors writing or updating data in the complex tables due to update conflicts in "LastUpdated". Because this occurs in the statement executed by the trigger, the affected complex table fails to save data. The following error is logged:
Snapshot isolation transaction aborted due to update conflict. You
cannot use snapshot isolation to access table 'dbo.tblLastUpdated'
directly or indirectly in database 'devDB' to update, delete, or
insert the row that has been modified or deleted by another
transaction. Retry the transaction or change the isolation level for
the update/delete statement.
What should I be doing here in the trigger to prevent this failure? Can I use some kind of query hints on the trigger to avoid this - or can I just ignore errors in the trigger? Updating the data in LastUpdated is not critical, but saving the data correctly into the complex tables is.
This is probably something very simple that I have overlooked or am not aware of. As always, thanks for any info.
I would say that you should look into using Change Tracking (http://msdn.microsoft.com/en-gb/library/cc280462%28v=sql.100%29.aspx), which is lightweight builtin SQL Server functionality that you can use to monitor the fact that a table has changed, as opposed to logging each individual change (which you can also do with Change Data Capture). It needs Snapshot Isolation, which you are already using.
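If you go that route, enabling it is fairly lightweight; roughly like this (the retention settings are just an example, and change tracking assumes tblListItem has a primary key):
ALTER DATABASE devDB
SET CHANGE_TRACKING = ON (CHANGE_RETENTION = 2 DAYS, AUTO_CLEANUP = ON);

ALTER TABLE dbo.tblListItem ENABLE CHANGE_TRACKING;

-- each client remembers the version it last synced to, then asks only for what changed since
DECLARE @last_sync_version bigint = 0;   -- stored per client in practice
SELECT ct.*
FROM CHANGETABLE(CHANGES dbo.tblListItem, @last_sync_version) AS ct;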
Because your trigger is running in your parent transaction, and your snapshot has become out of date, your whole transaction would need to start again. If this is a complex workload, maintaining this last updated data in this way would be costly.
Short answer: don't do that! Making the updating transactions dependent on one single shared row makes them prone to deadlocks, update conflicts, and a whole gamut of nasty things.
You can either use views to determine last update, e.g.:
SELECT
t.name
,user_seeks
,user_scans
,user_lookups
,user_updates
,last_user_seek
,last_user_scan
,last_user_lookup
,last_user_update
FROM sys.dm_db_index_usage_stats i JOIN sys.tables t
ON (t.object_id = i.object_id)
WHERE database_id = db_id()
Or, if you really insist on the solution with LastUpdated, you can implement its update from the trigger in an autonomous transaction. Even though SQL Server doesn't support autonomous transactions, it can be done using linked servers: How to create an autonomous transaction in SQL Server 2008
The schema needs to change. If you have to keep your update table, make a row for every table. That would greatly reduce your locking, because each table would update its very own row instead of competing for the sole row in the table (a concrete sketch follows the layout below).
LastUpdated
table_name (varchar(whatever)) pk
modified_date (datetime)
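Spelled out, that might look something like this (the varchar length is just a guess):
CREATE TABLE LastUpdated (
    table_name    varchar(128) NOT NULL PRIMARY KEY,
    modified_date datetime     NOT NULL
);

-- one row per complex table
INSERT INTO LastUpdated (table_name, modified_date)
VALUES ('tblListItem', GetDate());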
New Trigger for tblListItem
CREATE TRIGGER [dbo].[trg_ListItem_LastUpdated] ON [dbo].[tblListItem]
FOR INSERT, UPDATE, DELETE
AS
UPDATE LastUpdated SET modified_date = GetDate() WHERE table_name = 'tblListItem'
GO
Another option that I use a lot is having a modified_date column in every table. Then people know exactly which records to update/insert to sync with your data rather than dropping and reloading everything in the table each time one record changes or is inserted.
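That approach is essentially just a column plus a sync query, roughly like this (the constraint name and @last_sync value are placeholders, and you'd still need the application or a trigger to touch modified_date on updates):
ALTER TABLE dbo.tblListItem
ADD modified_date datetime NOT NULL
    CONSTRAINT df_tblListItem_modified DEFAULT GetDate();

-- clients pull only what changed since their last sync
DECLARE @last_sync datetime = '2000-01-01';
SELECT *
FROM dbo.tblListItem
WHERE modified_date > @last_sync;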
Alternatively, you can update the log table inside the same transaction that you use to update your complex tables in your application, and avoid the trigger altogether.
Update
You can also opt for inserting a new row instead of updating the same row in the LastUpdated table. You can then query the max timestamp for the latest update. However, with this approach your LastUpdated table would grow each day, which you need to take care of if the volume of transactions is high.
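In that variant the trigger inserts rather than updates, and readers take the max; a rough sketch (this assumes LastUpdated no longer has table_name as its sole primary key):
-- in the trigger body
INSERT INTO LastUpdated (table_name, modified_date)
VALUES ('tblListItem', GetDate());

-- on the client side
SELECT MAX(modified_date)
FROM LastUpdated
WHERE table_name = 'tblListItem';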
There is an Oracle database schema (very small in data, but still about 10-15 tables). It contains a sort of configuration (routing tables).
There is an application that has to poll this schema from time to time. Notifications are not to be used.
If no data in the schema were updated, the application should use its current in-memory version.
If any table had any update, the application should reload all the tables into memory.
What would be the most effective way to check the whole schema for update since a given key point (time or transaction id)?
I imagined Oracle keeps a transaction ID per schema. Then there should be a way to query such an ID and keep it to compare against at the next poll.
I've found this question, where such a pseudo-column exists at the row level:
How to find out when an Oracle table was updated the last time
I would think something similar exists on a schema level.
Can someone please point me in the right direction?
I'm not aware of any such functionality in Oracle. See below.
The best solution I can come up with is to create a trigger on each of your tables that updates a one-row table or context with the current date/time. Such triggers could be at the table-level (as opposed to row-level), so they wouldn't carry as much overhead as most triggers.
Incidentally, Oracle can't keep a transaction ID per schema, as one transaction could affect multiple schemas. It might be possible to use V$ views to track a transaction back to the objects it affected, but it wouldn't be easy, and it would almost certainly perform worse than the trigger scheme.
It turns out, if you have 10g, you can use Oracle's flashback functionality to get this information. However, you'd need to enable flashback (which carries some overhead of its own) and the query is ridiculously slow (presumably because it's not really intended for this use):
select max(commit_timestamp)
from FLASHBACK_TRANSACTION_QUERY
where table_owner = 'YOUR_SCHEMA'
and operation in ('INSERT','UPDATE','DELETE','MERGE')
In order to avoid locking issues in the "last updated" table, you'd probably want to put that update into a procedure that uses an autonomous transaction, such as:
create or replace procedure log_last_update as
pragma autonomous_transaction;
begin
update last_update set update_date = greatest(sysdate,update_date);
commit;
end log_last_update;
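A statement-level trigger that calls it might look like this (sketch; your_table is a placeholder, and you'd need one such trigger per tracked table):
create or replace trigger trg_your_table_last_update
after insert or update or delete on your_table
begin
  log_last_update;
end;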
This will cause your application to serialize to some degree: each statement that needs to call this procedure will need to wait until the previous one finishes. The "last updated" table may also get out of sync, because the update on it will persist even if the update that activated the trigger is rolled back. Finally, if you have a particularly long transaction, the application could pick up the new date/time before the transaction is completed, defeating the purpose. The more I think about this, the more it seems like a bad idea.
The better solution to avoid these issues is just to insert a row from the triggers. This would not lock the table, so there wouldn't be any serialization and the inserts wouldn't need to be made asynchronously, so they could be rolled back along with the actual data (and wouldn't be visible to your application until the data is visible as well). The application would get the max, which should be very fast if the table is indexed (in fact, this table would be an ideal candidate for an index-organized table). The only downside is that you'd want a job that runs periodically to clean out old values, so it didn't grow too large.
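Concretely, that could look something like this (sketch; the log table, trigger, and your_table names are all made up):
create table table_update_log (
  table_name   varchar2(30) not null,
  update_date  timestamp    not null
);

create index ix_table_update_log on table_update_log (table_name, update_date);

create or replace trigger trg_your_table_update_log
after insert or update or delete on your_table
begin
  insert into table_update_log (table_name, update_date)
  values ('YOUR_TABLE', systimestamp);
end;

-- the application polls with
select max(update_date) from table_update_log;

-- and a periodic job keeps the log small
delete from table_update_log
where update_date < systimestamp - interval '7' day;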
dbms_stats.gather_table_stats might also help: http://forums.oracle.com/forums/thread.jspa?threadID=607610
4. Statistics are considered stale when the change is over 10% of the current rows.
(As of 11g, this value can be customized per object. Cool feature)
...
exec dbms_stats.gather_table_stats(user, 'T_STAT');
select * from sys.dba_tab_modifications where table_name = 'T_STAT';
No row selected
select stale_stats from sys.dba_tab_statistics where table_name = 'T_STAT';
NO
insert into t_stat select rownum from all_objects where rownum <= 20;
select * from sys.dba_tab_modifications where table_name = 'T_STAT';
No rows selected <-- Oops
select stale_stats from sys.dba_tab_statistics where table_name = 'T_STAT';
NO <-- Oops
exec dbms_stats.flush_database_monitoring_info;
select * from sys.dba_tab_modifications where table_name = 'T_STAT';
TABLE_OWNER TABLE_NAME PARTITION_NAME SUBPARTITION_NAME INSERTS UPDATES DELETES TIMESTAMP TRUNCATED DROP_SEGMENTS
UKJA T_STAT 20 0 0 2008-01-18 PM 11:30:19 NO 0
select stale_stats from sys.dba_tab_statistics where table_name = 'T_STAT';
YES
I'm inserting a row using an Oracle stored procedure which is configured to use an autonomous transaction. I'd like to insert this record, commit that transaction, and then lock the newly-inserted record so that nobody else can modify it except my current session (in another transaction, obviously, since the one that inserted it is autonomous).
How can I ensure that nobody else gets a lock on this new record before I have a chance to SELECT...FOR UPDATE it?
Using Oracle 10g.
No, you can never maintain a lock between transactions. The question you should be asking yourself is why you need to issue the commit between the insert and the select ... for update. The row is locked by the insert; if you finish whatever it is that you're doing with that row before you commit, then you don't have to worry about re-locking the row.
My first preference would be to remove the COMMIT until the data is ready for use by other sessions.
Alternatively, you could design your application in such a way as to make sure the new row only gets locked by that transaction (a sketch follows the steps below):
Add a flag to the table, e.g. VISIBLE, default 'N'.
When you insert a row, insert it with the default value.
After the commit, select it for update, and update the flag to 'Y' (so that when your code commits the second time, it will update the flag as well).
Modify your application so that other sessions will ignore rows where VISIBLE='N'.
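Put together, the flow would be roughly (sketch; the table, column, and key value are placeholders):
-- inside the autonomous transaction
insert into my_table (id, visible) values (42, 'N');
commit;

-- back in the main transaction
select * from my_table where id = 42 for update;
update my_table set visible = 'Y' where id = 42;
-- ... do the rest of the work ...
commit;   -- only now do other sessions treat the row as visible

-- other sessions query with
select * from my_table where visible = 'Y';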