PostgreSQL turns UPDATE into INSERT and creates duplicate record

I'm not quite sure how to ask this question.
The table stores its main data in a JSONB column. The other columns are an integer primary key, a unique text secondary key, an application generated integer transaction id, and the type of operation last performed (insert, update, delete).
There are 5 triggers.
On before insert and update, set the new.operation column to TG_OP (more on this later)
On before insert, generate a unique 6-character alphameric code for use in URLs
On before insert, generate a unique, random 6-digit numeric code, avoiding the German Tank Problem.
On before insert, add the numeric and alphameric codes to the JSONB object.
On after update and delete, insert the old record, appended with the new tranid and operation, into an unindexed archive table.
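For context, here is a minimal sketch of the kind of setup described above; the constraint names come from the question, but the table, column, and trigger names are hypothetical:

CREATE TABLE np (  -- hypothetical table name
    id        integer,
    id_txt    text,
    tranid    integer,
    operation text,
    data      jsonb,
    CONSTRAINT npprimarykey_id PRIMARY KEY (id),
    CONSTRAINT npid_txt_unique UNIQUE (id_txt)
);

CREATE FUNCTION np_set_operation() RETURNS trigger AS $$
BEGIN
    NEW.operation := TG_OP;  -- 'INSERT' or 'UPDATE'
    RETURN NEW;              -- a BEFORE ROW trigger must return NEW for the operation to proceed
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER np_operation_trg
    BEFORE INSERT OR UPDATE ON np
    FOR EACH ROW EXECUTE PROCEDURE np_set_operation();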
All of the triggers seem to work and the records get created with the new ids and the ids in the JSONB column.
However, on an update the new operation gets set to update from the TG_OP variable, but the record gets inserted into the table, creating duplicate keys. Subsequent operations on that record fail because of the duplicate records.
I've stepped through it in the pgAdmin debugger. It seems to go through each trigger correctly. It completes with a record from the insert (e.g. tranid=254, operation=insert) and another from the update (e.g. tranid=256, operation=update). The archive table has one record added which shows the original info was 254/insert and it was replaced by 256/update.
But there are two records in the main table!!!
This is a violation of two uniqueness constraints which should have caused it to fail:
CONSTRAINT npprimarykey_id PRIMARY KEY (id),
CONSTRAINT npid_txt_unique UNIQUE (id_txt)
Beyond that, the command being executed was an UPDATE.
I'm not clear where to look or on what forum to ask the question. Which forum do the people building PostgreSQL frequent?
Thanks,
David

Related

Postgres unique index says duplicate exists on freshly deleted row

I have a postgres database in which I'm refreshing data periodically. Most of the time it works, but sometimes I have issues with a unique index.
Minimal example
create table test_table (
id int
);
create unique index test_table_unique on test_table(id);
(I know, in this case it should be a primary key, but for the sake of example, please bear with me.)
Now, every hour, I do something like this:
begin;
delete from test_table;
insert into test_table (id) values (1), (2), (3)...
commit;
As I said, most of the time it will just work fine. However, sometimes postgres complains about a duplicate entry in the unique index.
error: duplicate key value violates unique constraint "test_table_unique"
detail: "Key (id)=(2) already exists."
My real database
In my actual table, I'm using JSON payloads, and the unique index is made on fields of that JSON payload. In particular, the error detail is as follows:
create table if not exists source (
id serial primary key,
payload jsonb not null
);
create unique index if not exists source_index_and_id on source ((payload->>'_index'), (payload->>'_id'));
error details: "Key ((payload ->> '_index'::text), (payload ->> '_id'::text))=(companies, AC9860) already exists."
I'm confident there is no actual duplicate data. I'm deleting everything for a particular ->>_index, and the ->>_id is unique in my source data.
My understanding is that if I delete rows from a table, the indices will be updated before the next statements are executed. But it doesn't seem to be the case. I've found that it helps (not sure if it actually solves the issue) to commit the changes after the delete, and before the inserts.
begin;
delete...
commit;
begin;
insert...
commit;
What's happening here?
The only ways this could happen are:
the deleting transaction rolled back
concurrent transactions inserted new rows after you deleted the original ones
the inserting transaction inserts the same key twice
the inserting transaction is accidentally run before the deleting one
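If the second case (concurrent writers) is the culprit, one way to rule it out is to lock the table for the duration of the refresh; a sketch against the minimal example above:

begin;
lock table test_table in exclusive mode;  -- blocks concurrent writers until commit; plain readers are unaffected
delete from test_table;
insert into test_table (id) values (1), (2), (3);
commit;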
PostgreSQL is not a truly relational DBMS and does not satisfy rule 7 of Codd's rules, which concerns set-level insert, update, and delete operations.
Unlike some other RDBMSs, PostgreSQL deletes rows one by one, and this can occasionally produce phantom key violations.
In my paper comparing PostgreSQL to MS SQL Server, I ran a test that demonstrates this (§ 7 – The hard way to update unique values).

What a BEFORE trigger can do that an AFTER trigger can't and conversely

I've read a lot about these types of triggers, and what I've understood so far is that they can do exactly the same things; the difference is more about "taste" than about what they are really able to do.
So what I'm asking is: is there something that a BEFORE trigger can't do compared to an AFTER trigger and conversely? Am I missing something?
A BEFORE trigger allows you to act before any other action is done.
Just imagine a table with a foreign key column which is not nullable. You might insert the row it depends on before you insert the row which needs this FK to be set.
You could prohibit any action at all (no changes allowed to key tables...)
You could check for existence and do an update instead of an insert
and many more...
To expand a little on @Shnugo's answer: note that my comments are specific to SQL Server, but I believe the principles hold true for other RDBMSs that have these triggers.
What can BEFORE (or instead of) do that AFTER cannot
Say you have a BEFORE trigger on a blank table with an identity column, and the trigger does nothing, i.e. performs no insert. To get a similar result with an AFTER trigger you would have to delete the inserted records.
Let's walk through inserting records with the different triggers. If the BEFORE trigger is enabled and you do 100 inserts, but the trigger doesn't actually insert them, then when you disable the trigger and do an insert you will be at identity 1.
Do the same thing with an AFTER trigger, and when you insert afterwards you will be at 101, because the records were actually inserted, but then deleted.
So a BEFORE trigger can stop an action completely, whereas an AFTER trigger has to try to undo the action to get a similar result in the data. Complex validation? Or, as in Shnugo's example, inserting a parent record before inserting a child of that parent so a foreign key constraint error doesn't occur.
What can an AFTER trigger do that a BEFORE cannot.
Use the identity column in an insert statement. In SQL Server, the special table inserted in the same BEFORE trigger as above will report Identity = 0 instead of Identity = 1, whereas an AFTER trigger will see Identity = 1. So in the BEFORE trigger you could avoid a foreign key constraint error by inserting a parent, and in the AFTER trigger you can do the opposite: insert a child record with the proper foreign key.
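In PostgreSQL terms (the examples above are SQL Server), the same asymmetry shows up in the return value: a BEFORE row trigger can cancel the operation by returning NULL, while an AFTER trigger's return value is ignored. A minimal sketch with invented names:

CREATE FUNCTION skip_all_inserts() RETURNS trigger AS $$
BEGIN
    RETURN NULL;  -- returning NULL from a BEFORE ROW trigger silently skips the row; nothing is inserted
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER demo_skip
    BEFORE INSERT ON demo_table
    FOR EACH ROW EXECUTE PROCEDURE skip_all_inserts();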

Creating a Trigger so that a member is not allowed to rent more than 5 movies at a given time [duplicate]

How do I start a trigger so that this allows nobody to be able to rent a movie if their unpaid balance exceeds 50 dollars?
What you have here is a cross-row table constraint - i.e. you can't just put a single Oracle CONSTRAINT on a column, as these can only look at data within a single row at a time.
Oracle has support for only two cross-row constraint types - uniqueness (e.g. primary keys and unique constraints) and referential integrity (foreign keys).
In your case, you'll have to hand-code the constraint yourself - and with that comes the responsibility to ensure that the constraint is not violated in the presence of multiple sessions, each of which cannot see data inserted/updated by other concurrent sessions (at least, until they commit).
A simplistic approach is to add a trigger that issues a query to count how many records conflict with the new record; but this won't work because the trigger cannot see rows that have been inserted/updated by other sessions but not committed yet; so the trigger will sometimes allow members to rent 6 videos, as long as (for example) they get two cashiers to enter the data in separate terminals.
One way to get around this problem is to put some element of serialization in - e.g. the trigger would first request a lock on the member record (e.g. with a SELECT FOR UPDATE) before it's allowed to check the rentals; that way, if a 2nd session tries to insert rentals, it will wait until the first session does a commit or rollback.
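A Postgres-flavored sketch of that lock-then-check idea (the answer above is written for Oracle; all table and column names here are invented for illustration):

CREATE FUNCTION check_rental_limit() RETURNS trigger AS $$
DECLARE
    open_rentals integer;
BEGIN
    -- lock the member row so concurrent rentals for the same member serialize
    PERFORM 1 FROM members WHERE member_id = NEW.member_id FOR UPDATE;

    SELECT count(*) INTO open_rentals
    FROM rentals
    WHERE member_id = NEW.member_id AND returned_at IS NULL;

    IF open_rentals >= 5 THEN
        RAISE EXCEPTION 'member % already has 5 open rentals', NEW.member_id;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER rentals_limit_trg
    BEFORE INSERT ON rentals
    FOR EACH ROW EXECUTE PROCEDURE check_rental_limit();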
Another way around this problem is to use an aggregating Materialized View, which would be based on a query that is designed to find any rows that fail the test; the expectation is that the MV will be empty, and you put a table constraint on the MV such that if a row was ever to appear in the MV, the constraint would be violated. The effect of this is that any statement that tries to insert rows that violate the constraint will cause a constraint violation when the MV is refreshed.
Writing the query for this based on your design is left as an exercise for the reader :)
If you want to restrict something about your table data, then you should have a look at constraints, not triggers.
Constraints ensure that certain conditions hold for your table data, as in your example.
Triggers are fired when some action (i.e. INSERT, UPDATE, DELETE) takes place, and you can then do some work as a reaction to that action.

Guarantee primary key is present in update trigger

I am writing an update trigger and accessing the 'inserted' table to see which rows have been modified.
I have two related questions :
Does the inserted table always contain all the columns of the real table?
If the inserted table contains only the columns that have changed, will there always at least be the primary key columns in the inserted table?
Yes, it includes all columns from the original table, except:
SQL Server 2012 does not allow for text, ntext, or image column references in the inserted and deleted tables for AFTER triggers.
(Similar language, with different version numbers, exists for older versions of SQL Server)
Ask yourself how useful they would be if only a single (non-key) column was updated. You could tell that an update had occurred but you'd be unable to do any further useful processing.
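To illustrate the point about key columns, a hedged T-SQL sketch (table and column names invented) that joins inserted to deleted on the primary key to find what actually changed:

CREATE TRIGGER trg_orders_status_audit ON orders
AFTER UPDATE
AS
BEGIN
    -- join the two pseudo-tables on the primary key to pair old and new versions of each row
    INSERT INTO orders_audit (order_id, old_status, new_status)
    SELECT d.order_id, d.status, i.status
    FROM inserted i
    JOIN deleted d ON d.order_id = i.order_id
    WHERE i.status <> d.status;  -- only rows whose status actually changed
END;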

Define One to Many Relationships with SQL

I'm looking for a way to set up a one-to-many relationship between 2 tables. The table structures are explained below, but I've tried to leave out everything that has nothing to do with the problem.
Table objects has 1 column called uuid.
Table contents has 3 columns called content, object_uuid and timestamp.
The basic idea is to insert a row into objects and get a new uuid from the database. This uuid is then stored with every row in contents to associate contents with objects.
Now I'm trying to use the database to enforce that:
Each row in contents references a row in objects (a foreign key should do)
No row in objects exists without at least a row in contents
These constraints should be enforced on commit of transactions.
Ordinary triggers probably can't help, because when a row in the objects table is written, there can't be a row in contents yet. Postgres does have so-called constraint triggers that can be deferred until the end of the transaction. It would be possible to use those, but they seem to be some sort of internal construct not intended for everyday use.
Ideas or solutions should be standard SQL (preferred) or work with Postgres (version does not matter). Thanks for any input.
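For reference, the deferred constraint trigger mentioned in the question could look roughly like this in Postgres (the check function and trigger names are invented):

CREATE FUNCTION check_object_has_contents() RETURNS trigger AS $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM contents WHERE object_uuid = NEW.uuid) THEN
        RAISE EXCEPTION 'object % has no contents rows', NEW.uuid;
    END IF;
    RETURN NULL;  -- return value is ignored for AFTER triggers
END;
$$ LANGUAGE plpgsql;

CREATE CONSTRAINT TRIGGER objects_need_contents
    AFTER INSERT ON objects
    DEFERRABLE INITIALLY DEFERRED  -- the check runs at COMMIT, by which time contents rows can exist
    FOR EACH ROW EXECUTE PROCEDURE check_object_has_contents();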
Your main problem is that, other than foreign key constraints, no constraint can reference another table.
Your best bet is to denormalize this a little and have a column on object containing the count of contents that reference it. You can create a trigger to keep this up to date.
contents_count INTEGER NOT NULL DEFAULT 0
This won't be unbreakable unless you put some user security over who can update this column. But if you keep it up to date with a trigger, and all you're looking to avoid is accidental corruption, this should be sufficient.
EDIT: As per the comment, CHECK constraints are not deferrable. This solution would raise an error if all the contents are removed even if the intention is to add more in the same transaction.
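A sketch of the counter idea above (trigger and function names are invented; it assumes the uuid and object_uuid columns from the question):

ALTER TABLE objects ADD COLUMN contents_count integer NOT NULL DEFAULT 0;

CREATE FUNCTION maintain_contents_count() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        UPDATE objects SET contents_count = contents_count + 1
        WHERE uuid = NEW.object_uuid;
    ELSIF TG_OP = 'DELETE' THEN
        UPDATE objects SET contents_count = contents_count - 1
        WHERE uuid = OLD.object_uuid;
    END IF;
    RETURN NULL;  -- AFTER trigger; return value is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER contents_count_trg
    AFTER INSERT OR DELETE ON contents
    FOR EACH ROW EXECUTE PROCEDURE maintain_contents_count();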
Maybe what you want to do is normalize a little bit more. You need a third table that references elements of the other two tables. Table objects should have its own uuid, and table contents should also have its own uuid and no reference to the table objects. The third table should have only the references to the other two tables, and its primary key is the combination of both references.
So, for example, if you have a uuid from the table objects and you want all the contents for that uuid, assuming the third table has columns object_uuid and content_uuid, and the table contents has its own serial column named uuid, your query would look like this:
SELECT * FROM thirdtable
JOIN contents ON thirdtable.content_uuid = contents.uuid
WHERE thirdtable.object_uuid = 34;
Then you can use an AFTER INSERT OR UPDATE trigger on each table, for example:
CREATE TRIGGER my_insert_trigger AFTER INSERT OR UPDATE ON contents
FOR EACH ROW EXECUTE PROCEDURE my_check_function();
and then, in the function my_check_function(), delete every row in objects that is not present in the third table. Somebody else answered first while I was answering; if you like my solution, I can help you write the my_check_function() function.
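A sketch of what my_check_function() might look like under that interpretation (deleting objects rows no longer referenced by the third table):

CREATE FUNCTION my_check_function() RETURNS trigger AS $$
BEGIN
    -- remove objects rows that no row in the third table references
    DELETE FROM objects o
    WHERE NOT EXISTS (
        SELECT 1 FROM thirdtable t WHERE t.object_uuid = o.uuid
    );
    RETURN NULL;  -- AFTER trigger; return value is ignored
END;
$$ LANGUAGE plpgsql;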