Logging the results of a MERGE statement


I have two tables. The first is a temporary table with raw data, in which rows may be repeated (appear more than once). The second is the target table with the actual data, where every row is unique.
I'm transferring rows using a cursor. Inside the cursor loop I use a MERGE statement. How can I print to the console, using DBMS_OUTPUT.PUT_LINE, which rows are updated and which are deleted?
According to the official documentation, there is no such feature for this statement.
Is there any workaround?

I don't understand why you would want to do this. The output of dbms_output requires someone to be there to look at it. Not only that, it requires someone to look through all of the output; otherwise it's pointless. If there are more than, say, 20 rows, no one will bother. If you actually need a log but no one reads through all the output to verify it, then you are actively harming yourself by doing it this way.
If you really need to log which rows are updated or deleted, there are a couple of options; both involve performance hits, though.
You could switch to BULK COLLECT, which lets you work from a cursor that includes the ROWID of the temporary table. You BULK COLLECT a JOIN of your two tables into a collection, update or delete from the target table based on ROWID according to your business logic, and then update the temporary table with a flag of some kind to indicate the operation performed.
Alternatively, you could create a trigger on your target table that logs what's happening to another table.
In reality, unless it is important that the number of updates and deletes is known, you should not do anything. Write your MERGE statement so that it raises an error if anything goes wrong, and use the error logging clause to log any errors you receive. Those are more likely to be the things you should be paying attention to.
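As a rough sketch of the error-logging approach (Oracle syntax), assuming hypothetical tables temp_table and target_table keyed on an id column with a single val column (none of these names come from the question), the MERGE could be run once for the whole set with a LOG ERRORS clause instead of row by row inside a cursor:

```sql
-- One-time setup: creates the error log table ERR$_TARGET_TABLE
-- (default name, derived from the hypothetical table name).
BEGIN
  DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'TARGET_TABLE');
END;
/

-- Single set-based MERGE. DISTINCT only collapses exact duplicates,
-- so this assumes repeated rows in temp_table are identical.
MERGE INTO target_table t
USING (SELECT DISTINCT id, val FROM temp_table) s
ON (t.id = s.id)
WHEN MATCHED THEN
  UPDATE SET t.val = s.val
WHEN NOT MATCHED THEN
  INSERT (id, val) VALUES (s.id, s.val)
LOG ERRORS INTO err$_target_table ('merge from temp_table')
REJECT LIMIT UNLIMITED;

-- Rows that failed (for example on a constraint) can be reviewed afterwards.
SELECT ora_err_number$, ora_err_mesg$, ora_err_tag$, id, val
FROM   err$_target_table;
```

Inside PL/SQL, SQL%ROWCOUNT read immediately after the MERGE gives the total number of rows merged, but it does not split that total into inserts and updates.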

Previous posters already said that this approach is suspicious, both because of the cursor/loop and the output log for review.
On SQL Server, there is an OUTPUT clause in the MERGE statement that allows you to insert a row in another table with the $action taken (insert,update,delete) and any columns from the inserted or deleted/overwritten data you want. This lets you summarize exactly as you asked.
The equivalent Oracle RETURNING clause may not work for MERGE but does for UPDATE and DELETE.
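For illustration, the SQL Server OUTPUT clause described above might be used roughly as follows. This is a T-SQL sketch only; dbo.temp_table, dbo.target_table, dbo.merge_log and their columns are hypothetical names, not taken from the question.

```sql
-- Hypothetical log table for the actions taken by the MERGE.
CREATE TABLE dbo.merge_log (
    merge_action NVARCHAR(10) NOT NULL,  -- 'INSERT', 'UPDATE' or 'DELETE'
    id           INT          NULL,
    old_val      VARCHAR(100) NULL,
    new_val      VARCHAR(100) NULL
);

MERGE INTO dbo.target_table AS t
USING (SELECT DISTINCT id, val FROM dbo.temp_table) AS s
   ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.val = s.val
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, val) VALUES (s.id, s.val)
-- A DELETE branch, if present, would appear as 'DELETE' in $action.
OUTPUT $action, inserted.id, deleted.val, inserted.val
  INTO dbo.merge_log (merge_action, id, old_val, new_val);

-- Summarize what happened.
SELECT merge_action, COUNT(*) AS row_count
FROM   dbo.merge_log
GROUP  BY merge_action;
```

For inserted rows the deleted.val column is NULL, since there is no previous version of the row.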

Related

Can I prevent duplicate data in bigquery?

I'm playing with BigQuery: I created a table and inserted some data. I then reinserted the same data and it created duplicates. I'm sure I'm missing something, but is there something I can do to ignore the data if it already exists in the table?
My use case is that I get a stream of data from various clients, and sometimes their data will include rows they have already sent (I have no control over what they submit).
Is there a way to prevent duplicates when certain conditions are met? The easy case is when the entire row is the same, but can it also be done when only certain columns are the same?

It's difficult to answer your question without a clear idea of the table structure, but it sounds like you could be interested in the MERGE statement.
With this DML statement you can perform a mix of INSERT, UPDATE, and DELETE operations in one pass, and hence do exactly what you are describing.
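A minimal BigQuery sketch of that idea, assuming new data is first loaded into a staging table and that a row counts as a duplicate when client_id and event_id both match (the dataset, table, and column names are all hypothetical):

```sql
-- Insert only those staging rows whose key is not already in the target.
MERGE `my_dataset.events` AS t
USING (
  -- Collapse exact duplicates within the incoming batch itself.
  SELECT DISTINCT client_id, event_id, payload
  FROM `my_dataset.events_staging`
) AS s
ON t.client_id = s.client_id AND t.event_id = s.event_id
WHEN NOT MATCHED THEN
  INSERT (client_id, event_id, payload)
  VALUES (s.client_id, s.event_id, s.payload);
```

To treat only fully identical rows as duplicates, the ON condition would have to compare every column instead of just the key columns.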

Db2 for i: How to select rows while deleting?

I've found this article that explains how to get the deleted record with the OLD TABLE keywords.
https://www.ibm.com/support/knowledgecenter/en/SSEPEK_10.0.0/apsg/src/tpc/db2z_selectvaluesdelete.html
However, the statement doesn't seem to work in Db2 for i (version 7.2).
Do you know any alternatives to get the same result?
Thanks

As you have discovered, this syntax is not valid for DB2 for i. But I can think of a few ways to do what you want.
You can use two statements: one to retrieve the records to be deleted into a temporary table, then one to perform the delete (just use the same WHERE clause for both). Unfortunately, this leaves a chance, however small, that you will delete more than you read. If additional records that match your WHERE clause are inserted between the time you select and the time you delete, then your log will not be accurate.
You can use a delete trigger to insert records into the log as they are deleted from the table. This might be the best way, as it will always log deletes, no matter how the records are deleted. But it will always log deletes, no matter how the records are deleted, and if you only want those records logged within certain processes, then you will need to build dependencies between your trigger and those processes, making both more complex.
You can use a stored procedure with a cursor and a positioned delete, as mentioned by Mark Bairinstein in the comments above. This will allow you to delete records with logging, and it also avoids the issue with the first option. But it leaves users the opportunity to delete records in a way that is not logged, which may be good or bad depending on your requirements.

Fastest way to 'ignore' a row

I am writing a PL/SQL function that processes table rows individually. I pass it a key. What is the fastest way to check whether or not that row has been processed, and if so ignore it? It may sound stupid but please assume that it always tries to process all the rows in the table (mainly because it does other things too).
One solution I had was to create a flag column on that table (the fastest I can think of); another was to insert a record into a second table and check whether the row is absent from that table (probably slower).

Assuming you need to be using a PL/SQL function, you should only pass into it the rowset that it needs to handle. That means using plain SQL to select the rows from the table you need and pass that to the function. In any case though, you should look very carefully at what you're doing whenever you end up having to use a cursor in a database environment, because that's not really what databases are optimized for.

How Can I Maintain a Unique Identifier Amongst Multiple Database Tables?

I have been tasked with creating history tables for an Oracle 11g database. I have proposed something very much like the record based solution in the first answer of this post What is the best way to keep changes history to database fields?
Then my boss suggested that, because some tables are clustered, i.e. some data from table 1 is related to table 2 (think of this as the format the tables were in before they were normalised), he would like a version number that is maintained across all the tables at this cluster level. The suggested way to generate the version number is by using SYS_GUID: http://docs.oracle.com/cd/B12037_01/server.101/b10759/functions153.htm.
I thought about doing this with triggers, so that when one of these tables is updated, the other tables' version numbers are subsequently updated, but I can see some issues with this, such as the following:
How can I stop the trigger on one table from in turn firing the trigger on the other table? (We would end up calling triggers forever here.)
How can I avoid race conditions? (I.e. when tables 1 and 2 are updated at the same time, how do I know which is the latest version number?)
I am pretty new to Oracle database development so some suggestions about whether or not this is a good idea/if there is a better way of doing this would be great.

I think the thing you're looking for is a sequence: http://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_6015.htm#SQLRF01314
The tables can each take numbers from one defined sequence independently, so no race conditions should occur and no triggers are needed on your side.
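A small Oracle sketch of the sequence idea, assuming two hypothetical history tables that share one version column (cluster_version_seq, table1_history, table2_history and their columns are illustrative names only):

```sql
-- One sequence shared by every table in the cluster.
CREATE SEQUENCE cluster_version_seq START WITH 1 INCREMENT BY 1;

-- Within one session, NEXTVAL draws a new number and CURRVAL repeats it,
-- so related rows written together can carry the same version number.
INSERT INTO table1_history (version_no, id, payload)
VALUES (cluster_version_seq.NEXTVAL, 1, 'change to table 1');

INSERT INTO table2_history (version_no, id, payload)
VALUES (cluster_version_seq.CURRVAL, 1, 'related change to table 2');

COMMIT;
```

Because each call to NEXTVAL returns a distinct value even across concurrent sessions, two sessions never receive the same version number. Sequence values can have gaps, so they order versions but do not count them.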

The short answer to your first question is "No, you cannot." The reason is that there is no way for users to stop a trigger that has been defined from firing. The only method I can imagine is some sort of locking table: for example, you create an intermediate table and select the same row FOR UPDATE from each of your clustered tables. But this is a really bad approach, as you have already hinted in your second question. It will cause dreadful concurrency issues.
For your second question, you are quite right. Different triggers on different source tables that update the same audit table will cause serious contention. Bear in mind how triggers work: their changes are committed when the rest of the transaction commits. So if all the related tables update the same audit table simultaneously, especially the same row, the benefit of the relational design is lost. One benefit of normalization is performance, because updates to different tables do not contend with each other. But if you synchronize the operations of the different tables through one audit table, it will end up behaving like a flat file. So my suggestion is to try your best to persuade your boss to use your original proposal.
However, if your application always updates these clustered tables in a single transaction and writes one audit record to the audit table, you could write a stored procedure that updates the entities first and writes the audit record at the end of the transaction. Then you can use a sequence to generate the ID for the audit table, and there won't be any contention.

Using Trigger to get ID on Insert - SQL 2005

I have a table (table_a) that, upon insert, needs to retrieve the next available id from the available_id field in another table (table_b) to use as the primary key in table_a, and then increment the available_id field in table_b by 1. While doing this via stored procedures is easy, I need to be able to have this occur on any insert into the table.
I know I need to use triggers, but I am unsure how to code this. Any advice?
Basically, this is my dilemma:
I need to ensure that two different tables have unique IDs across both of them. What would be the best way to do this without using GUIDs? (Some of this code cannot be controlled on our end and requires ints as IDs.)

My advice is DON'T! Use an identity field instead.
In the first place, a single insert can contain multiple records, so a trigger that does this properly would have to account for that, which makes it rather tricky to write. It would also have to be an INSTEAD OF trigger, which is tricky too, as you wouldn't have one of the required values (I assume your ID field is required) in the initial insert. In the second place, two inserts running at the same time could try to pick the same number, or a large data import on one connection could lock out the second connection for a good while.
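As a minimal T-SQL sketch of the identity-column suggestion (the name column is a hypothetical stand-in for the real columns of table_a), the database assigns the key itself, including for multi-row inserts. Note that this only shows the identity column; it does not by itself guarantee uniqueness across two tables.

```sql
-- The id is generated by SQL Server; no trigger or lookup table is involved.
CREATE TABLE dbo.table_a (
    id   INT IDENTITY(1, 1) NOT NULL PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

-- Single-row insert: SCOPE_IDENTITY() returns the id just generated
-- in the current scope.
INSERT INTO dbo.table_a (name) VALUES ('first row');
SELECT SCOPE_IDENTITY() AS new_id;

-- Multi-row insert: the OUTPUT clause returns every generated id.
INSERT INTO dbo.table_a (name)
OUTPUT inserted.id, inserted.name
SELECT 'second row'
UNION ALL
SELECT 'third row';
```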

You could use an Oracle-style sequence, described here, calling it either via a trigger or from your application (providing the resulting value to your insert routine):
http://www.sqlteam.com/article/custom-auto-generated-sequences-with-sql-server
The author mentions these issues to consider:
• What if two processes attempt to add a row to the table at the exact same time? Can you ensure that the same value is not generated for both processes?
• There can be overhead querying the existing data each time you'd like to insert new data.
• Unless this is implemented as a trigger, this means that all inserts to your data must always go through the same stored procedure that calculates these sequences. This means that bulk imports, or moving data from production to testing and so on, might not be possible or might be very inefficient.
• If it is implemented as a trigger, will it work for a set-based multi-row INSERT statement? If so, how efficient will it be? This function wouldn't work if called for each row in a single set-based INSERT -- each NextCustomerNumber() returned would be the same value.
