Db2 for i: How to select rows while deleting?

I've found this article that explains how to get the deleted record with the OLD TABLE keywords.
https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/apsg/src/tpc/db2z_selectvaluesdelete.html
However, this syntax doesn't seem to work in Db2 for i (version 7.2).
Do you know any alternatives to get the same result?
Thanks

As you have discovered, this syntax is not valid for Db2 for i. But I can think of a couple of ways to do what you want.
You can use two statements: one to retrieve the records to be deleted into a temporary table, then one to perform the delete (just use the same WHERE clause for both); a sketch of this follows after these options. Unfortunately, this leaves a window, however small, in which you may delete more than you read: if additional records matching your WHERE clause are inserted between the time you select and the time you delete, then your log will not be accurate.
You can use a delete trigger to insert records into the log as they are deleted from the table. This might be the best option, since it logs deletes no matter how the records are deleted. On the other hand, because it always logs deletes, if you only want records logged within certain processes you will need to build dependencies between your trigger and those processes, making both more complex.
You can use a stored procedure with a cursor and a positioned delete, as mentioned by Mark Bairinstein in the comments above. This will let you delete records with logging, and also avoids the issue with the first option. But it leaves users the opportunity to delete records in a way that is not logged, which may be good or bad depending on your requirements.
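For illustration, here is a minimal sketch of the first option in Db2 for i SQL. The table ORDERS, the log table ORDERS_DELETED_LOG, and the WHERE condition are all hypothetical; substitute your own.

    -- Hypothetical names throughout; the point is that both
    -- statements share exactly the same WHERE clause.

    -- 1. Copy the rows that are about to be deleted into the log table.
    INSERT INTO orders_deleted_log
        SELECT * FROM orders
        WHERE order_status = 'CANCELLED';

    -- 2. Delete them, reusing the identical WHERE clause.
    DELETE FROM orders
        WHERE order_status = 'CANCELLED';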

Related

Can I prevent duplicate data in BigQuery?

I'm playing with BigQuery: I created a table and inserted some data. Then I reinserted it and it created duplicates. I'm sure I'm missing something, but is there anything I can do to ignore new data if it already exists in the table?
My use case is that I get a stream of data from various clients, and sometimes their data will include some data they have already sent (I have no control over what they submit).
Is there a way to prevent duplicates when certain conditions are met? The easy case is when the entire row is identical, but what about when only certain columns match?
It's difficult to answer your question without a clear idea of the table structure, but it sounds like you could be interested in the MERGE statement: ref here.
With this DML statement you can perform a mix of INSERT, UPDATE, and DELETE operations in a single statement, hence do exactly what you are describing.
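As an illustration, here is a minimal insert-only-if-absent MERGE in BigQuery Standard SQL. The dataset, table, and column names (mydataset.events, events_staging, event_id, client_id, payload) are all hypothetical:

    MERGE mydataset.events t
    USING mydataset.events_staging s
    ON t.event_id = s.event_id
    -- Only insert rows whose key is not already present; rows that
    -- match an existing event_id are silently skipped.
    WHEN NOT MATCHED THEN
        INSERT (event_id, client_id, payload)
        VALUES (s.event_id, s.client_id, s.payload);

To match on the whole row instead of a single key, extend the ON clause to compare every column.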

How Can I Maintain a Unique Identifier Amongst Multiple Database Tables?

I have been tasked with creating history tables for an Oracle 11g database. I have proposed something very much like the record-based solution in the first answer of this post: What is the best way to keep changes history to database fields?
Then my boss suggested that, because some tables are clustered, i.e. some data in table 1 is related to table 2 (think of this as the format the tables were in before they were normalised), he would like a version number that is maintained across all the tables at the cluster level. The suggested way to generate the version number is with SYS_GUID: http://docs.oracle.com/cd/B12037_01/server.101/b10759/functions153.htm
I thought about doing this with triggers, so that when one of these tables is updated, the other tables' version numbers are updated as well, but I can see some issues with this, such as the following:
How can I stop the trigger on one table from, in turn, firing the trigger on the other table? (We would end up firing triggers forever here.)
How can I prevent race conditions? (I.e. when tables 1 and 2 are updated at the same time, how do I know which is the latest version number?)
I am pretty new to Oracle database development, so some suggestions about whether this is a good idea, or whether there is a better way of doing it, would be great.
I think the thing you're looking for is a sequence: http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_6015.htm#SQLRF01314
The tables can take numbers from the defined sequence independently, so no race conditions should occur and no triggers are needed on your side.
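For example, a minimal sketch; the sequence name, tables, and columns (cluster_version_seq, table1, table2, version_id, cluster_id) are all hypothetical:

    CREATE SEQUENCE cluster_version_seq;

    -- Draw one new version number and stamp every table in the cluster
    -- with it; concurrent sessions each get a distinct NEXTVAL, so
    -- there is no race over which number is latest.
    DECLARE
        v_version NUMBER;
    BEGIN
        SELECT cluster_version_seq.NEXTVAL INTO v_version FROM dual;
        UPDATE table1 SET version_id = v_version WHERE cluster_id = 42;
        UPDATE table2 SET version_id = v_version WHERE cluster_id = 42;
    END;
    /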
The short answer to your first question is "No, you cannot." There is no way to stop a trigger that has fired from firing other triggers in turn. The only method I can imagine is some kind of locking table: you create an intermediate table and SELECT ... FOR UPDATE the same row in it from your clustered tables. But this is a really bad approach, as you have already anticipated in your second question: it will cause dreadful concurrency issues.
For your second question, you are quite right: different triggers on different original tables updating the same audit table will cause serious contention. It's worth bearing in mind how triggers work: their changes are committed when the rest of the transaction commits. So if all the related tables update the same audit table, especially the same row, simultaneously, the relational design buys you nothing. One benefit of normalization is the performance gain from updates to different tables not contending with each other; if you funnel every table's operations through one audit table, it will end up behaving like a flat file. So my suggestion would be to try your best to persuade your boss to accept your original proposal.
If, however, your application always updates these clustered tables in a single transaction and writes one audit record for it, you could write a stored procedure that updates the entities first and writes the audit row at the end of the transaction. You can then use a sequence to generate the id of the audit row, and there won't be any contention.
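A minimal sketch of that procedure; every table, column, sequence, and parameter name here is hypothetical:

    CREATE OR REPLACE PROCEDURE update_cluster (
        p_id     IN NUMBER,
        p_value1 IN VARCHAR2,
        p_value2 IN VARCHAR2
    ) AS
    BEGIN
        -- Update the clustered tables first, inside one transaction.
        UPDATE table1 SET col1 = p_value1 WHERE cluster_id = p_id;
        UPDATE table2 SET col2 = p_value2 WHERE cluster_id = p_id;

        -- One audit row per transaction, keyed by a sequence, so the
        -- audit table sees a single insert rather than competing triggers.
        INSERT INTO audit_table (audit_id, cluster_id, changed_at)
        VALUES (audit_seq.NEXTVAL, p_id, SYSTIMESTAMP);

        -- The caller commits, so the entity updates and the audit row
        -- succeed or fail together.
    END;
    /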

Logging the results of a MERGE statement

I have 2 tables: a temporary table with raw data, whose rows may repeat (more than once), and a target table with the actual data (every row is unique).
I'm transferring rows using a cursor, and inside the cursor I use a MERGE statement. How can I print to the console, using DBMS_OUTPUT.PUT_LINE, which rows were updated and which were deleted?
According to the official documentation, there is no such feature for this statement.
Is there any workaround?
I don't understand why you would want to do this. The output of DBMS_OUTPUT requires someone to be there to look at it, and to look through all of it, otherwise it's pointless. If there are more than, say, 20 rows, no one will bother. And if no one verifies the whole output but you actually need a log, then you are actively harming yourself by doing it this way.
If you really need to log which rows are updated or deleted, there are a couple of options, though both involve performance hits.
You could switch to BULK COLLECT, which enables you to create a cursor containing the ROWID of the temporary table; you BULK COLLECT a join of your two tables into it. You then update or delete from the target table by ROWID according to your business logic, and flag the corresponding row in the temporary table to indicate which operation was performed.
You create a trigger on your target table which logs what's happening to another table.
In reality, unless the number of updates/deletes is genuinely important, you should not do anything. Write your MERGE statement in a manner that ensures it errors if anything goes wrong, and use the error logging clause to capture any errors you receive. Those are more likely to be the things you should be paying attention to.
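For reference, a minimal sketch of the error logging clause in Oracle. The names target_table and temp_table are hypothetical; ERR$_TARGET_TABLE is the default log table name that DBMS_ERRLOG generates:

    -- One-time setup: create the error log table for the target.
    BEGIN
        DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'TARGET_TABLE');
    END;
    /

    MERGE INTO target_table t
    USING (SELECT DISTINCT id, val FROM temp_table) s
        ON (t.id = s.id)
    WHEN MATCHED THEN
        UPDATE SET t.val = s.val
    WHEN NOT MATCHED THEN
        INSERT (id, val) VALUES (s.id, s.val)
    -- Failing rows land in the log table instead of aborting the MERGE.
    LOG ERRORS INTO err$_target_table ('nightly merge') REJECT LIMIT UNLIMITED;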
Previous posters have already said that this approach is suspect, both because of the cursor/loop and because of producing an output log for manual review.
On SQL Server, there is an OUTPUT clause in the MERGE statement that lets you insert a row into another table with the $action taken (INSERT, UPDATE, or DELETE) and any columns you want from the inserted or deleted/overwritten data. This lets you summarize exactly as you asked.
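A minimal sketch on SQL Server; target_table, source_table, and merge_log are all hypothetical:

    MERGE target_table AS t
    USING source_table AS s
        ON t.id = s.id
    WHEN MATCHED THEN
        UPDATE SET t.val = s.val
    WHEN NOT MATCHED THEN
        INSERT (id, val) VALUES (s.id, s.val)
    WHEN NOT MATCHED BY SOURCE THEN
        DELETE
    -- $action is 'INSERT', 'UPDATE', or 'DELETE' for each affected row.
    OUTPUT $action, inserted.id, deleted.id
        INTO merge_log (action_taken, new_id, old_id);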
The equivalent Oracle RETURNING clause may not work for MERGE, but it does work for UPDATE and DELETE.

PLSQL: Get number of records updated vs inserted when a merge statement is used

MERGE will always give you, via SQL%ROWCOUNT, the total number of records merged, regardless of how many were inserted versus updated.
But how do I find out the number of records that were actually inserted versus the number that were actually updated?
I tried the options from this post, but they don't seem to work:
https://asktom.oracle.com/pls/asktom/f?p=100:11:0::NO::P11_QUESTION_ID:122741200346595110
Any help?
You cannot, in general, differentiate how a row affected by a MERGE statement was affected in order to get separate counts for inserted, updated, and deleted rows.
If you really need separate figures, you could issue separate INSERT and UPDATE statements, though that is likely to be less efficient. There are non-general solutions that depend on particular query plans, but those are rather brittle and generally wouldn't be recommended.
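A minimal sketch of the separate-statement approach; target_table, source_table, and their columns are hypothetical:

    DECLARE
        l_updated  PLS_INTEGER;
        l_inserted PLS_INTEGER;
    BEGIN
        -- Update rows that already exist in the target.
        UPDATE target_table t
           SET t.val = (SELECT s.val FROM source_table s WHERE s.id = t.id)
         WHERE EXISTS (SELECT 1 FROM source_table s WHERE s.id = t.id);
        l_updated := SQL%ROWCOUNT;

        -- Insert rows that do not exist yet.
        INSERT INTO target_table (id, val)
        SELECT s.id, s.val
          FROM source_table s
         WHERE NOT EXISTS (SELECT 1 FROM target_table t WHERE t.id = s.id);
        l_inserted := SQL%ROWCOUNT;

        DBMS_OUTPUT.PUT_LINE('updated: ' || l_updated || ', inserted: ' || l_inserted);
    END;
    /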

sql: DELETE + INSERT vs UPDATE + INSERT

A similar question has been asked, but since it always depends, I'm asking for my specific situation separately.
I have a page on a website that shows data coming from a database, and to generate that data I have to run some fairly complex queries with multiple joins.
The data is being updated once a day (nightly).
I would like to pre-generate the data for that page to speed up access.
For that, I am creating a table that contains exactly the data I need.
Question: in my situation, is it reasonable to do a complete table wipe followed by an insert, or should I do update + insert?
SQL-wise, DELETE + INSERT seems easier (the INSERT part is a single SQL statement).
EDIT: RDBMS: MS SQL Server 2008 Ent
TRUNCATE will be faster than DELETE, so if you need to empty a table, do that instead.
You didn't specify your RDBMS vendor, but some of them also have MERGE/UPSERT commands. These enable you to update the table if the data exists and insert it if it doesn't.
It partly depends on how the data is accessed. If you have a period of time with no (or very few) users accessing it, then there won't be much impact from the data disappearing for a short while (between the DELETE and the completion of the INSERT).
Have you considered using a materialized view (SQL Server calls them indexed views) instead of doing it manually? This could also have other performance benefits, as an indexed view gives the query optimizer more choices when it's constructing execution plans for other queries that reference the table(s) in the view.
It depends on the size of the table and the recovery model of the database. If you are deleting many hundreds of thousands of records and reinstating them, versus updating a small batch of a few hundred and inserting tens of rows, it will add unnecessary size to your transaction log. However, you could use TRUNCATE to get around this, as it is only minimally logged.
Do you have the option of a MERGE/UPSERT? If you're using MS SQL and you don't, you can use CROSS APPLY to do something similar.
One approach to this type of problem is to insert into a new table and then do a table rename. This ensures that all the new data becomes visible at the same time.
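A minimal sketch of that swap on SQL Server; report_data and the source tables/columns are all hypothetical:

    -- Build the replacement off to the side while readers keep
    -- using the current table.
    SELECT c.customer_id, SUM(o.total) AS total_spend
    INTO report_data_new
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id;

    -- Swap inside a transaction so readers never see an empty table.
    BEGIN TRANSACTION;
        EXEC sp_rename 'report_data', 'report_data_old';
        EXEC sp_rename 'report_data_new', 'report_data';
    COMMIT TRANSACTION;

    DROP TABLE report_data_old;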
What if some data that was present yesterday is not there anymore? DELETE may be safer, since you could end up needing to delete some records anyway.
And in the end it doesn't really matter much which way you go, except in the case kevinw mentioned.
Although I fully agree with SQLMenace's answer, I would like to point out that MERGE does NOT remove unneeded records! If you're sure that your new data will be a superset of the existing data, then MERGE is great; otherwise you'll either need to make sure that you delete any superfluous records later on, or use the TRUNCATE + INSERT method...
(Personally I'm still a fan of the latter, as it usually is quite fast; just make sure to drop all indexes/unique constraints upfront and rebuild them one by one afterwards. This has the benefit of keeping the INSERT transaction smaller and doing the index additions in (smaller) transactions later on; a sketch follows below.) (**)
(**: yes, this might be tricky on a live system, but then again he already mentioned this is done overnight; I'm extrapolating that there is no user access at that time)
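A minimal sketch of the TRUNCATE + INSERT variant with the index rebuild, reusing the same hypothetical names as above (the index name ix_report_data_customer is also made up):

    -- Drop indexes first so the INSERT doesn't maintain them row by row.
    DROP INDEX ix_report_data_customer ON report_data;

    TRUNCATE TABLE report_data;

    INSERT INTO report_data (customer_id, total_spend)
    SELECT c.customer_id, SUM(o.total)
    FROM customers c
    JOIN orders o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id;

    -- Rebuild each index in its own (smaller) transaction.
    CREATE INDEX ix_report_data_customer ON report_data (customer_id);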