I'm new to SQL, and to PostgreSQL in particular, and I'm studying it for university. I'm having trouble understanding when I should use AFTER triggers instead of BEFORE triggers, and when I should make my trigger a FOR EACH ROW trigger or a FOR EACH STATEMENT trigger.
From what I understood, every time the constraint has a count, a sum, an average or depends on a property related to the whole table I should use an AFTER TRIGGER with FOR EACH STATEMENT but I'm not sure and honestly I'm pretty confused.
Do you have any tips for when I should use each type of trigger, or how to understand when I should choose one over the others?
Thank you!
You use a BEFORE trigger FOR EACH ROW if you want to modify the data before they get written to the database.
You use an AFTER trigger if you need the data modifications to be already done, for example if you want to insert a row that references these data via a foreign key constraint.
You use a FOR EACH ROW trigger if you need to deal with each row on its own, and FOR EACH STATEMENT if the actual rows processed don't concern you (e.g., you want to write an audit log entry for the statement) or you want to access the modified data as a whole (e.g., throw an error if someone tries to delete more than 10 rows with a single SQL statement).
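To make the AFTER case concrete, here is a minimal sketch of a row-level AFTER INSERT trigger that writes an audit row pointing back at the new row through a foreign key. It uses SQLite through Python's sqlite3 module rather than PostgreSQL, only so that it runs without a server; SQLite supports FOR EACH ROW triggers only, and the table and trigger names (orders, order_audit, orders_audit_ai) are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT NOT NULL);
CREATE TABLE order_audit (
    order_id INTEGER NOT NULL REFERENCES orders(id),
    note     TEXT NOT NULL
);
-- AFTER trigger: the new orders row already exists when this runs,
-- so the audit row can reference it through the foreign key.
CREATE TRIGGER orders_audit_ai AFTER INSERT ON orders
FOR EACH ROW
BEGIN
    INSERT INTO order_audit (order_id, note)
    VALUES (NEW.id, 'created ' || NEW.item);
END;
""")

conn.execute("INSERT INTO orders (item) VALUES ('book')")
conn.execute("INSERT INTO orders (item) VALUES ('pen')")
conn.commit()

audit_rows = conn.execute(
    "SELECT order_id, note FROM order_audit ORDER BY order_id"
).fetchall()
print(audit_rows)
```

In PostgreSQL the trigger body would live in a separate trigger function, but the timing argument is the same: the referencing insert belongs in an AFTER trigger.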
I have a project with a PostgreSQL database, and I'm handling migrations with Flyway. I have some initial data that I want to add to the database when the application starts. This is data that should always be there from the beginning. How can I handle this data initialization properly?
I've been thinking about using Flyway's repeatable migrations, which are re-run whenever the checksum of the SQL file changes. The problem is that I would then need to write the data as SQL INSERT statements, and what happens if a row already exists? Ideally, I would specify the data in the SQL file and the migration would insert each row if it doesn't exist, or update it if it does. It should compare every field, not just the primary key, because if I change something in one row I want that change applied to the database. Of course I could always delete the whole contents of the table and then run the migration, but isn't that a little cumbersome in the long run? After every small edit I would need to empty the table and run the migration again. Is there a better way to handle the initial data?
You can specify the primary key value with INSERT or COPY by including the column like any other. With INSERT, you can add an ON CONFLICT DO UPDATE clause to apply any changes to rows that already exist. If you're on PostgreSQL 9.4 or below, ON CONFLICT isn't available, so you're stuck with DELETE and a plain INSERT or COPY, although knowing the primary keys means you don't have to delete the entire table.
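As a rough sketch of the upsert idea, the snippet below seeds a table twice with explicit primary keys. The second run fixes a typo in one row and adds a new row without deleting anything. It runs against SQLite (3.24 or newer) through Python's sqlite3 module, which accepts the same ON CONFLICT ... DO UPDATE clause shown here; the country table and the seed helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# Insert the row, or overwrite the non-key columns if the key already exists.
UPSERT = """
INSERT INTO country (id, name) VALUES (?, ?)
ON CONFLICT (id) DO UPDATE SET name = excluded.name
"""

def seed(rows):
    """Apply the seed data; safe to run any number of times."""
    conn.executemany(UPSERT, rows)
    conn.commit()

seed([(1, "Finland"), (2, "Sweeden")])                 # first version, with a typo
seed([(1, "Finland"), (2, "Sweden"), (3, "Norway")])   # edited seed file, re-run

countries = conn.execute("SELECT id, name FROM country ORDER BY id").fetchall()
print(countries)
```

In a Flyway repeatable migration the same statement would simply be written out once per seed row in the SQL file.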
I am writing a PL/SQL function that processes table rows individually. I pass it a key. What is the fastest way to check whether or not that row has been processed, and if so ignore it? It may sound stupid but please assume that it always tries to process all the rows in the table (mainly because it does other things too).
One solution I had was to create a flag column on that table (the fastest I can think of). Another was to insert a record into another table and check whether the row is missing from that table (probably slower).
Assuming you need to be using a PL/SQL function, you should only pass into it the rowset that it needs to handle. That means using plain SQL to select the rows from the table you need and pass that to the function. In any case though, you should look very carefully at what you're doing whenever you end up having to use a cursor in a database environment, because that's not really what databases are optimized for.
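A small sketch of that advice, combined with the flag column from the question: let a plain SQL query pick the unprocessed rows, hand only those to the processing routine, and mark them in one batch. It uses SQLite via Python's sqlite3 instead of PL/SQL, and the jobs table, the processed column and the process function are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs ("
    " id INTEGER PRIMARY KEY,"
    " payload TEXT NOT NULL,"
    " processed INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany(
    "INSERT INTO jobs (id, payload, processed) VALUES (?, ?, ?)",
    [(1, "a", 1), (2, "b", 0), (3, "c", 0)],
)

def process(payload):
    """Stand-in for the real per-row work."""
    return payload.upper()

# The WHERE clause does the filtering, so processed rows never reach process().
pending = conn.execute(
    "SELECT id, payload FROM jobs WHERE processed = 0 ORDER BY id"
).fetchall()
results = [(job_id, process(payload)) for job_id, payload in pending]

# Mark the handled rows in one batch.
conn.executemany(
    "UPDATE jobs SET processed = 1 WHERE id = ?",
    [(job_id,) for job_id, _ in pending],
)
conn.commit()

remaining = conn.execute(
    "SELECT COUNT(*) FROM jobs WHERE processed = 0"
).fetchone()[0]
print(results, remaining)
```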
In case I need to change the PK of a single row from 1 to 10, for example, is there any way to trace every proc, view and function that might reference the old value?
I mean, a simple select in a proc like: select * from table where FK = 1 would break, and I'd have to look for every reference to the value 1 in every proc and view and change it to 10 to get the system to work.
Is there any automatic way of doing this? I use SQL Server.
I suspect that the only way to do this correctly involves querying the database metadata - to identify all the places that use your PK as a FK, in a proc, or in a view. This is likely to be complex; fragile; and prone to error.
This is one of the (many) reasons to avoid having the PK as anything other than a system-derived, meaningless value, which is not accessible to manipulation by (even) the creator/administrator. Also, under what circumstances would you have a PK hard-coded in a proc or function? That is again a potential source of fragility in your system.
If a PK is created that is incorrect (by whatever criteria) or which needs to be changed, create a new record and copy the existing values into it. While this does not answer your query, your routines to delete or modify values in the table need to know how and where it is used; and so a routine to copy a row should be able to access this information.
What is the best, DBMS-independent way of generating an ID number that will be used immediately in an INSERT statement, keeping the IDs roughly in sequence?
DBMS independent? That's a problem. The two most common methods are auto incrementing columns, and sequences, and most DBMSes do one or the other but not both. So the database independent way is to have another table with one column with one value that you lock, select, update, and unlock.
Usually I say "to hell with DBMS independence" and do it with sequences in PostgreSQL, or autoincrement columns in MySQL. For my purposes, supporting both is better than trying to find one way that works everywhere.
If you can create a Globally Unique Identifier (GUID) in your chosen programming language - consider that as your id.
They are harder to work with when troubleshooting (it is much easier to type in a where condition that is an INT) but there are also some advantages. By assigning the GUID as your key locally, you can easily build parent-child record relationships without first having to save the parent to the database and retrieve the id. And since the GUID, by definition, is unique, you don't have to worry about incrementing your key on the server.
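For example, here is a minimal sketch in Python using the standard uuid module. The order and order-line dictionaries are hypothetical stand-ins for rows you would insert later; the point is only that the child rows can reference the parent before anything has been sent to the database.

```python
import uuid

def new_id():
    """Generate a random (version 4) GUID as a string."""
    return str(uuid.uuid4())

# The parent gets its key locally, with no round trip to the database.
order = {"id": new_id(), "customer": "alice"}

# Children can point at the parent straight away.
order_lines = [
    {"id": new_id(), "order_id": order["id"], "sku": sku}
    for sku in ("A-1", "B-2")
]

print(order)
print(order_lines)
```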
There is auto increment or sequence
What is the point of this? ID generation is the least of your worries.
How will you handle SQL itself?
MySQL has Limit,
SQL Server has Top,
Oracle has Rank
Then there are a million other things like triggers, alter table syntax etc etc
Yep, the obvious ways in raw SQL (and in my order of preference) are a) sequences b) auto-increment fields. The better, more modern, more DBMS-independent way is to not touch SQL at all, but to use a (good) ORM.
There's no universal way to do this. If there were, everyone would use it. SQL by definition abhors the idea - it's an antipattern for set-based logic (although a useful one, in many real-world cases).
The biggest problem you'd have trying to interpose an identity value from elsewhere is when a SQL statement involves several records, and several values must be generated simultaneously.
If you need it, then make it part of your selection requirements for a database to use with your application. Any serious DBMS product will provide its own mechanism to use, and it's easy enough to code around the differences in DML. The variations are pretty much all in the DDL.
I'd always go for the DB-specific solution, but if you really have to, the usual way of doing this is to implement your own sequence. Your RDBMS has to support transactions.
You create a sequence table that contains an int column and seed it with the first number. Your transaction logic then looks something like this:
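begin transaction
-- bump the counter first; this takes the write lock on tblSeq
update tblSeq set intID = intID + 1
select @myID = intID from tblSeq
-- use the freshly reserved value for the new row
insert into tblData (intID, ...) values (@myID, ...)
commit transaction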
The transaction forces a write lock, so the next queued insert cannot update the tblSeq value before the record has been inserted into tblData. As long as all inserts go through this transaction, your generated IDs stay in sequence.
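Here is a runnable sketch of the same pattern, using SQLite through Python's sqlite3 module as a stand-in for whatever RDBMS you actually have. The table and column names follow the pseudocode above; the payload column and the insert_with_next_id helper are assumptions. BEGIN IMMEDIATE is SQLite's way of taking the write lock up front, and other systems spell this differently.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/COMMIT statements below are the only transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE tblSeq (intID INTEGER NOT NULL)")
conn.execute("INSERT INTO tblSeq (intID) VALUES (0)")  # seed value
conn.execute("CREATE TABLE tblData (intID INTEGER PRIMARY KEY, payload TEXT)")

def insert_with_next_id(payload):
    """Reserve the next ID and insert the row in a single transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before touching tblSeq
    try:
        conn.execute("UPDATE tblSeq SET intID = intID + 1")
        (my_id,) = conn.execute("SELECT intID FROM tblSeq").fetchone()
        conn.execute(
            "INSERT INTO tblData (intID, payload) VALUES (?, ?)",
            (my_id, payload),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # the counter bump is undone with the insert
        raise
    return my_id

ids = [insert_with_next_id(p) for p in ("first", "second", "third")]
print(ids)
```

Because the counter update and the insert commit or roll back together, a failed insert does not leave a gap in this sketch.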
Use an auto-incrementing id column.
Is there really a reason that they have to be in sequence? If you're just using it as an ID, then you should just be able to use part of a UUID or the first couple digits of md5(now()).
You could take the time and massage it. It'd be the equivalent of something like
DateTime.Now.Ticks
So it would be something like YYYYMMDDHHMMSSSS
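A rough Python sketch of that idea, reading the trailing SS as hundredths of a second (that reading, and the function name time_based_id, are my assumptions). Two calls within the same hundredth of a second return the same value, so on its own this gives IDs that are roughly in order but not guaranteed unique.

```python
from datetime import datetime, timezone

def time_based_id(moment=None):
    """Build a 16-digit YYYYMMDDHHMMSSss integer from a timestamp."""
    if moment is None:
        moment = datetime.now(timezone.utc)
    hundredths = moment.microsecond // 10_000  # 0..99
    return int(moment.strftime("%Y%m%d%H%M%S") + f"{hundredths:02d}")

# Fixed timestamp so the result is reproducible.
example_id = time_based_id(datetime(2024, 1, 2, 3, 4, 5, 678900))
print(example_id)
```

Sixteen digits still fit in a signed 64-bit integer column.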
It may be a bit of a lateral approach, but a good ORM-type library will probably be able to at least hide the differences. For example, in Ruby there is ActiveRecord (commonly used in, but not exclusively tied to, the Ruby on Rails web framework), which has Migrations. Within a table definition, which is declared in platform-agnostic code, implementation details such as datatypes, sequential id generation and index creation are pushed down below your vision.
I have transparently developed a schema on SQLite, then implemented it on MS SQL Server and later ported it to Oracle, without ever changing the code that generates my schema definition.
As I say, it may not be what you're looking for, but the easiest way to encapsulate what varies is to use a library that has already done the encapsulation for you.
With only SQL, the following could be one of the approaches:
1. Create a table to contain the starting id for your needs
2. When the application is deployed for the first time, the application should read the value in its context.
3. Thereafter, increment id (in thread-safe fashion) as required
3.1 Write the id to the database (in thread-safe fashion) which always keeps updated value
3.2 Don't write it to the database, just keep incrementing in the memory (thread-safe manner)
4. If for any reason server is going down, write the current id value to the database
5. When the server is up again it will pick from where it left, the last time.
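The steps above, in the in-memory variant (3.2), could be sketched like this in Python. The IdAllocator class, the id_state table and the use of SQLite are all assumptions for illustration. Note that if the process dies without calling save(), the stored value is stale and IDs would be handed out again after a restart, which is the main risk of this approach.

```python
import sqlite3
import threading

class IdAllocator:
    """Hands out IDs from memory; persists the counter only on save()."""

    def __init__(self, conn):
        self._conn = conn
        self._lock = threading.Lock()
        conn.execute("CREATE TABLE IF NOT EXISTS id_state (next_id INTEGER NOT NULL)")
        row = conn.execute("SELECT next_id FROM id_state").fetchone()
        if row is None:  # first deployment: seed the starting id
            conn.execute("INSERT INTO id_state (next_id) VALUES (1)")
            conn.commit()
            row = (1,)
        self._next = row[0]

    def next_id(self):
        with self._lock:  # thread-safe increment in memory
            value = self._next
            self._next += 1
            return value

    def save(self):
        """Call on shutdown so the next start resumes where this one stopped."""
        with self._lock:
            self._conn.execute("UPDATE id_state SET next_id = ?", (self._next,))
            self._conn.commit()

conn = sqlite3.connect(":memory:")
first_run = IdAllocator(conn)
issued = []

def worker():
    for _ in range(100):
        issued.append(first_run.next_id())

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

first_run.save()                # server going down
second_run = IdAllocator(conn)  # server coming back up
resumed = second_run.next_id()
print(len(issued), resumed)
```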