BEFORE and AFTER triggers, FOR EACH ROW and FOR EACH STATEMENT triggers - sql

I'm new to sql and in particular to postgresql, and I'm studying it for university, but I'm having trouble understanding when I should use AFTER TRIGGERS instead of BEFORE TRIGGERS and when I should make my trigger a FOR EACH ROW TRIGGER or a FOR EACH STATEMENT TRIGGER.
From what I understand, every time the constraint involves a count, a sum, an average, or depends on a property of the whole table, I should use an AFTER trigger with FOR EACH STATEMENT, but I'm not sure and honestly I'm pretty confused.
Do you have any tips for when I should use each type of trigger, or how to understand when I should choose one over the others?
Thank you!

You use a BEFORE trigger FOR EACH ROW if you want to modify the data before they get written to the database.
You use an AFTER trigger if you need the data modifications to be already done, for example if you want to insert a row that references these data via a foreign key constraint.
You use a FOR EACH ROW trigger if you need to deal with each row on its own, and FOR EACH STATEMENT if the actual rows processed don't concern you (e.g., you want to write an audit log entry for the statement) or you want to access the modified data as a whole (e.g., throw an error if someone tries to delete more than 10 rows with a single SQL statement).
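For illustration, here is roughly what those two patterns look like in PostgreSQL (the table, column, and function names are made up; transition tables need PostgreSQL 10+, and EXECUTE FUNCTION needs 11+):

-- BEFORE ... FOR EACH ROW: modify the row before it is written
CREATE FUNCTION normalize_email() RETURNS trigger AS $$
BEGIN
    NEW.email := lower(NEW.email);
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_normalize_email
BEFORE INSERT OR UPDATE ON users
FOR EACH ROW EXECUTE FUNCTION normalize_email();

-- AFTER ... FOR EACH STATEMENT: look at all affected rows at once
CREATE FUNCTION limit_bulk_delete() RETURNS trigger AS $$
BEGIN
    IF (SELECT count(*) FROM old_rows) > 10 THEN
        RAISE EXCEPTION 'cannot delete more than 10 rows in one statement';
    END IF;
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER users_limit_delete
AFTER DELETE ON users
REFERENCING OLD TABLE AS old_rows
FOR EACH STATEMENT EXECUTE FUNCTION limit_bulk_delete();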

Related

Can I prevent duplicate data in bigquery?

I'm playing with BQ; I created a table and inserted some data. I reinserted it and it created duplicates. I'm sure I'm missing something, but is there something I can do to ignore the insert if the data already exists in the table?
My use case is that I get a stream of data from various clients, and sometimes their data will include some data they have already sent (I have no control over what they submit).
Is there a way to prevent duplicates when certain conditions are met? The easy case is when the entire row is the same, but what about when only certain columns match?
It's difficult to answer your question without a clear idea of the table structure, but it feels like you could be interested in the MERGE statement.
With this DML statement you can perform a mix of INSERT, UPDATE, and DELETE operations in one pass, and hence do exactly what you are describing.
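As a rough sketch (the project, dataset, table, and column names here are invented), a MERGE that inserts only the staged rows whose key is not already present could look like:

MERGE `myproject.mydataset.events` T
USING `myproject.mydataset.events_staging` S
ON T.event_id = S.event_id
WHEN NOT MATCHED THEN
  INSERT (event_id, client_id, payload)
  VALUES (S.event_id, S.client_id, S.payload);

You could add a WHEN MATCHED THEN UPDATE branch for the case where only certain columns should be refreshed.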

How to manage initial data in PostgreSQL?

I have a project with a PostgreSQL database. I'm handling migrations with Flyway. Now I have some initial data that I want to add to the database when the application starts. It's data that should always be there from the beginning. How could I handle this data initialization properly?
I've been thinking about using Flyway's repeatable migrations, which are re-run whenever the hash of the SQL file changes. The problem is that I would then need to construct the file from SQL insert statements, and what if the object already exists? Ideally, I could specify the data in the SQL, and the migration would insert it into the table if it doesn't exist, or update it if it does. It should compare every field, not just the primary key, because if I change something in one row, I want that change applied to the database. Of course, I could always drop the whole contents of the table and then run the migration, but isn't that a little cumbersome in the long run? After every little edit I would need to drop the table and run the migration... I just wonder if there is a better way to handle the initial data?
You can specify the primary key value with INSERT or COPY by including the column like any other. With the former, you could add an ON CONFLICT DO UPDATE clause to make any possible changes. If you're using 9.4 or below, ON CONFLICT isn't available so you're stuck with DELETE and a plain INSERT or COPY, although knowing the primary keys means you don't have to delete the entire table.
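A minimal sketch of such a repeatable seed migration, assuming PostgreSQL 9.5+ and an invented roles table:

INSERT INTO roles (id, name, description)
VALUES (1, 'admin', 'Full access'),
       (2, 'user',  'Regular access')
ON CONFLICT (id) DO UPDATE
SET name        = EXCLUDED.name,
    description = EXCLUDED.description;

Re-running this is harmless: existing rows are updated in place and missing rows are inserted, so the file stays the single source of truth for the seed data.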

Logging the results of a MERGE statement

I have 2 tables: a temporary table with raw data, where rows may repeat (more than once), and the target table with the actual data (every row is unique).
I'm transferring rows using a cursor. Inside the cursor I use a MERGE statement. How can I print to the console, using DBMS_OUTPUT.PUT_LINE, which rows are updated and which are deleted?
According to the official documentation there is no such feature for this statement.
Is there any workaround?
I don't understand why you would want to do this. The output of dbms_output requires someone to be there to look at it. Not only that, it requires someone to look through all of the output; otherwise it's pointless. If there are more than, say, 20 rows then no one will be bothered to do so. If no one looks through all the output to verify it, but you need to actually log it, then you are actively harming yourself by doing it this way.
If you really need to log which rows are updated or deleted there are a couple of options; both involve performance hits though.
You could switch to a BULK COLLECT, which enables you to create a cursor with the ROWID of the temporary table. You BULK COLLECT a JOIN of your two tables into this. Update / delete from the target table based on rowid and according to your business logic, then update the temporary table with a flag of some kind to indicate the operation performed.
You create a trigger on your target table which logs what's happening to another table.
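A minimal sketch of that trigger option (the table and column names are assumptions):

CREATE OR REPLACE TRIGGER target_audit_trg
AFTER UPDATE OR DELETE ON target_table
FOR EACH ROW
BEGIN
    INSERT INTO target_audit (changed_at, operation, row_key)
    VALUES (SYSTIMESTAMP,
            CASE WHEN UPDATING THEN 'UPDATE' ELSE 'DELETE' END,
            :old.id);
END;
/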
In reality unless it is important that the number of updates / deletes is known then you should not do anything. Create your MERGE statement in a manner that ensures that it errors if anything goes wrong and use the error logging clause to log any errors that you receive. Those are more likely to be the things you should be paying attention to.
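For reference, the error logging clause looks roughly like this (all names here are invented; the err$_ table is created once with DBMS_ERRLOG.CREATE_ERROR_LOG):

-- one-off setup: EXEC DBMS_ERRLOG.CREATE_ERROR_LOG('TARGET_TABLE');
MERGE INTO target_table t
USING (SELECT DISTINCT id, col1 FROM temp_table) s
ON (t.id = s.id)
WHEN MATCHED THEN UPDATE SET t.col1 = s.col1
WHEN NOT MATCHED THEN INSERT (id, col1) VALUES (s.id, s.col1)
LOG ERRORS INTO err$_target_table ('nightly load') REJECT LIMIT UNLIMITED;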
Previous posters already said that this approach is suspicious, both because of the cursor/loop and the output log for review.
On SQL Server, there is an OUTPUT clause in the MERGE statement that allows you to insert a row in another table with the $action taken (insert, update, delete) and any columns you want from the inserted or deleted/overwritten data. This lets you summarize exactly as you asked.
The equivalent Oracle RETURNING clause may not work for MERGE but does for UPDATE and DELETE.
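To illustrate the SQL Server version (all table and column names here are placeholders):

MERGE INTO target_table AS t
USING source_table AS s
    ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.col1 = s.col1
WHEN NOT MATCHED BY TARGET THEN
    INSERT (id, col1) VALUES (s.id, s.col1)
WHEN NOT MATCHED BY SOURCE THEN
    DELETE
OUTPUT $action, inserted.id, deleted.id
INTO merge_log (action_taken, new_id, old_id);

Each row the MERGE touches produces one row in merge_log, which you can then query or print.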

How to prevent a derived value (SUM) from being manually updated to an incorrect value

I am learning SQL and DB design for a college class. One assignment that was given to us is to create a table with a derived attribute which is the SUM of some child attributes. For example:
ORDERS
orderID {PK}
/orderTotal /* derived from SUM of child itemTotals */
ITEMS
itemNo {PK}
orderID {FK}
itemTotal
Now, I am not even sure this is good practice. From some reading I've done on the web, derived values should not be stored, but rather calculated by user applications. I can understand that perspective, but in this instance my assignment is to store derived values and more importantly to maintain their integrity via triggers, which are relatively new to me so I am enjoying using them. I'd also imagine in some more complex cases that it really would be worth the saved processing time to store derived values. Here are the safeguards I've put in place which are NOT giving me problems:
A trigger which updates parent /orderTotal when new child item is inserted.
A trigger which updates parent /orderTotal when child item is deleted.
A trigger which updates parent /orderTotal when child itemTotal is modified.
However, there is another safeguard I want which I cannot figure out how to accomplish. Since the parent attribute /orderTotal is derived, it should never be manually modified. If somebody does attempt to manually modify it (to an erroneous value which is not actually the correct SUM), I want to either (a) prevent them from doing this or (b) revert it to its old value as soon as they are done.
Which is the better approach, and which is possible (and how)? I am not sure how to accomplish the former, and I tried to accomplish the latter via either a trigger or a constraint, but neither one seemed appropriate. The trigger method kept giving me an ORA-04091 error for attempting to mutate the table which fired the trigger. As for the constraint method, I don't think it is appropriate either, since I'm not sure how to do such a specific thing inside a constraint check.
I am using Oracle SQL by the way, in SQL Developer.
Thanks!
"Now, I am not even sure thise is good practice."
Your intuition is right: this is bad practice. For some reason, a lot of college professors set their students the task of writing poor code; this wouldn't be so bad if they at least explained that it is bad practice and should never be used in the real world. But then I guess most professors have only a limited grasp on what matters in the real world. sigh.
Anyhoo, to answer your question. There are two approaches to this. One would be to use a trigger to "correct" i.e. swallow the change. This would be wrong because the user trying to modify the value would probably waste a lot of time trying to discover why their change wasn't sticking, without realising they were breaking a business rule. So, it's much better to hurl an exception.
This example uses Oracle syntax, because I'm guessing that's what you're using.
create or replace trigger order_header_trg
    before update of order_total
    on order_header
    for each row
begin
    -- NULL-safe comparison: a plain != never fires when either side is NULL,
    -- which is also why a BEFORE INSERT version would be dead code (:old is NULL)
    if :new.order_total != :old.order_total
       or (:new.order_total is null and :old.order_total is not null)
       or (:new.order_total is not null and :old.order_total is null)
    then
        raise_application_error(
            -20000, 'You are not allowed to modify the value of ORDER_TOTAL');
    end if;
end;
/
The only problem with this approach is that it will prevent you from inserting rows into ORDER_LINES and then deriving a new total for ORDER_HEADER.
This is one reason why denormalised totals are Bad Practice.
The error you're getting - ORA-04091 - says "mutating table". This happens when we attempt to write a trigger which selects from the table which owns the trigger. It almost always points to a poor data model, one which is insufficiently normalised. This is obviously the case here.
Given that you are stuck with the data model, the only workaround is a clunky implementation using multiple triggers and a package. The internet offers various slightly different solutions: here is one.
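On Oracle 11g and later, a compound trigger packages that multi-trigger-plus-package workaround into a single object. A rough, untested sketch using the question's table names: the row section only records which orders were touched, and the statement section queries ITEMS once it is no longer mutating, which is what avoids ORA-04091.

CREATE OR REPLACE TRIGGER items_total_trg
FOR INSERT OR UPDATE OR DELETE ON items
COMPOUND TRIGGER
    TYPE t_ids IS TABLE OF orders.orderID%TYPE;
    g_ids t_ids := t_ids();

    AFTER EACH ROW IS
    BEGIN
        -- remember which orders were touched; don't query ITEMS here
        g_ids.EXTEND;
        g_ids(g_ids.LAST) := NVL(:new.orderID, :old.orderID);
    END AFTER EACH ROW;

    AFTER STATEMENT IS
    BEGIN
        -- ITEMS is no longer mutating, so it is safe to query it now
        FOR i IN 1 .. g_ids.COUNT LOOP
            UPDATE orders o
            SET    o.orderTotal = (SELECT NVL(SUM(itemTotal), 0)
                                   FROM   items
                                   WHERE  orderID = g_ids(i))
            WHERE  o.orderID = g_ids(i);
        END LOOP;
    END AFTER STATEMENT;
END items_total_trg;
/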
One solution might be to move all the logic for maintaining the orderTotal into an INSTEAD OF UPDATE trigger on the ORDERS table.
The triggers you already have on ITEMS can be simplified to update ORDERS without performing the calculation, setting orderTotal to 0 or something like that. The INSTEAD OF trigger will run in place of the update statement.
If an attempt is made to manually update the order total, the INSTEAD OF trigger will fire and re-calculate the existing value.
PS - I don't have an Oracle DB to test on, but from what I can tell, an INSTEAD OF trigger will get around the ORA-04091 error - apologies if this is wrong
EDIT
Whilst this solution would work in some other RDBMS systems, Oracle only supports INSTEAD OF triggers on views.
EDIT 2
If the assignment allows it, a view could be a suitable solution to this problem, since this would enable the order value column to be calculated.
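For instance, a sketch assuming the ORDERS/ITEMS tables from the question, where the stored column is replaced by a view that derives the total on demand:

CREATE OR REPLACE VIEW orders_with_total AS
SELECT o.orderID,
       NVL(SUM(i.itemTotal), 0) AS orderTotal
FROM   orders o
LEFT JOIN items i ON i.orderID = o.orderID
GROUP BY o.orderID;

Nothing is stored, so there is nothing to protect from manual updates.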

Using Trigger to get ID on Insert - SQL 2005

I have a table (table_a) that, upon insert, needs to retrieve the next available id from the available_id field in another table (table_b) to use as the primary key in table_a, and then increment the available_id field in table_b by 1. While doing this via stored procedures is easy, I need to be able to have this occur on any insert into the table.
I know I need to use triggers, but I am unsure how to code this. Any advice?
Basically this is my dilemma:
I need to ensure 2 different tables have unique IDs throughout. What would be the best way to do this without using GUIDs? (Some of this code cannot be controlled on our end and requires ints as IDs.)
My advice is DON'T! Use an identity field instead.
In the first place, inserts can affect multiple records, and so a trigger to do this properly would have to account for that, making it rather tricky to write. It would have to be an INSTEAD OF trigger, which is also tricky, as you wouldn't have one of the required values (I assume your ID field is required) in the initial insert. In the second place, two inserts going on at the same time could try to pick the same number, or one could lock the second connection for a good while if you are doing a large import of data in one connection.
You could use an Oracle-style sequence, described here, calling it either via a trigger or from your application (providing the resulting value to your insert routine):
http://www.sqlteam.com/article/custom-auto-generated-sequences-with-sql-server
He mentions these issues to consider:
• What if two processes attempt to add a row to the table at the exact same time? Can you ensure that the same value is not generated for both processes?
• There can be overhead querying the existing data each time you'd like to insert new data.
• Unless this is implemented as a trigger, this means that all inserts to your data must always go through the same stored procedure that calculates these sequences. This means that bulk imports, or moving data from production to testing and so on, might not be possible or might be very inefficient.
• If it is implemented as a trigger, will it work for a set-based multi-row INSERT statement? If so, how efficient will it be? This function wouldn't work if called for each row in a single set-based INSERT -- each NextCustomerNumber() returned would be the same value.
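A sketch of that key-table approach for SQL Server 2005 (the procedure name is an assumption; table_b is taken from the question):

CREATE PROCEDURE dbo.GetNextId
    @next_id INT OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    -- single atomic statement: increments the counter and captures the
    -- new value in one step, so concurrent callers cannot get the same id
    UPDATE table_b
    SET    @next_id = available_id = available_id + 1;
END

Every insert into table_a (and the other table that shares the id space) would call dbo.GetNextId first and use the returned value as the key, which runs into exactly the bulk-import caveats listed above.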