Doing UPSERT when row is referenced by a FK - sql

Let's say that I have a table of items, and for each item, there can be additional information stored for it, which goes into a second table. The additional information is referenced by a FK in the first table, which can be NULL (if the item doesn't have additional info).
TABLE item (
...
item_addtl_info_id INTEGER
)
CONSTRAINT fk_item_addtl_info FOREIGN KEY (item_addtl_info)
REFERENCES addtl_info (addtl_info_id)
TABLE addtl_info (
addtl_info_id INTEGER NOT NULL
GENERATED BY DEFAULT
AS IDENTITY (
INCREMENT BY 1
NO CACHE
),
addtl_info_text VARCHAR(100)
...
CONSTRAINT pk_addtl_info PRIMARY KEY (addtl_info_id)
)
What is the "best practice" to update an item's additional info (in IBM DB2 SQL, preferably)?
It should be an UPSERT operation, meaning that if additional info does not yet exist then a new record is created in the second table, but if it does, then it is only updated, and the FK in the first table does not change.
So imperatively, this is the logic:
UPSERT(item, item_info):
CASE WHEN item.item_addtl_info_id IS NULL THEN
INSERT INTO addtl_info (item_info)
UPDATE item.item_addtl_info_id (addtl_info.addtl_info_id)
^^^^^^^^^^^^^
ELSE
UPDATE addtl_info (item_info)
END
My main problem is how to get the newly inserted addtl_info row's id (underlined above). In a stored proc I can request the id from a sequence and store it in a variable, but maybe there is a more straightforward way. Isn't it something that comes up all the time when programming databases?
I mean, I'm really not interested in what the id of the addtl_info record is as long as it remains unique and is referenced properly. So using sequences seems a bit of an overkill to me in this case.
As a matter of fact, this UPSERT operation should be part of the SQL language as a standard operation (maybe it is, and I just don't know about it?)...

The syntax I was looking for is:
SELECT * FROM NEW TABLE ( INSERT INTO phone_book VALUES ( 'Peter Doe','555-2323' ) )
from Wikipedia (http://en.wikipedia.org/wiki/Insert_%28SQL%29)
This is how to refer to the record that was just inserted in the table.
My colleague called this construct an "in-place trigger", which what it really is...
Here is the first version that I put together as a compound SQL statement:
begin atomic
declare addtl_id integer;
set addtl_id = (select item_addtl_info_id from item where item.item_id = XXX);
if addtl_id is null
then
set addtl_id = (select addtl_info_id from new table
(insert into addtl_info
(addtl_info_text)
values ('My brand new additional info')
)
);
update item set item.item_addtl_info_id = addtl_id
where item.item_id = XXX;
else
update addtl_info set addtl_info_text = 'My updated additional info'
where addtl_info.addtl_info_id = addtl_id;
end if;
end
XXX being equal to the item id to be updated - this code can now be easily inserted into a sproc, and XXX can be converted to an input parameter.
I also tried using MERGE INTO, but I couldn't figure out a syntax for updating a table different from what was specified as the target.

Related

Insert strategy for tables with one-to-one relationships in Teradata

In our data model, which is derived from the Teradata industry models, we observe a common pattern, where the superclass and subclass relationships in the logical data model are transformed into one-to-one relationships between the parent and the child table.
I know you can roll-up or roll-down the attributes to end up with a single table but we are not using this option overall. At the end what we have is a model like this:
Where City Id references a Geographical Area Id.
I am struggling with a good strategy to load the records in these tables.
Option 1: I could select the max(Geographical Area Id) and calculate the next Ids for a batch insert and reuse them for the City Table.
Option 2: I could use an Identity column in the Geographical Area Table and retrieve it after I insert every record in order to use it for the City table.
Any other options?
I need to assess the solution in terms of performance, reliability and maintenance.
Any comment will be appreciated.
Kind regards,
Paul
When you say "load the records into these tables", are you talking about a one-time data migration or a function that creates records for new Geographical Area/City?
If you are looking for a surrogate key and are OK with gaps in your ID values, then use an IDENTITY column and specify the NO CYCLE clause, so it doesn't repeat any numbers. Then just pass NULL for the value and let TD handle it.
If you do need sequential IDs, then you can just maintain a separate "NextId" table and use that to generate ID values. This is the most flexible way and would make it easier for you to manage your BATCH operations. It requires more code/maintenance on your part, but is more efficient than doing a MAX() + 1 on your data table to get your next ID value. Here's the basic idea:
BEGIN TRANSACTION
Get the "next" ID from a lookup table
Use that value to generate new ID values for your next record(s)
Create your new records
Update the "next" ID value in the lookup table and increment it by the # rows newly inserted (you can capture this by storing the value in the ACTIVITY_COUNT value variable directly after executing your INSERT/MERGE statement)
Make sure to LOCK the lookup table at the beginning of your transaction so it can't be modified until your transaction completes
END TRANSACTION
Here is an example from Postgres, that you can adapt to TD:
CREATE TABLE NextId (
IDType VARCHAR(50) NOT NULL,
NextValue INTEGER NOT NULL,
PRIMARY KEY (IDType)
);
INSERT INTO Users(UserId, UserType)
SELECT
COALESCE(
src.UserId, -- Use UserId if provided (i.e. update existing user)
ROW_NUMBER() OVER(ORDER BY CASE WHEN src.UserId IS NULL THEN 0 ELSE 1 END ASC) +
(id.NextValue - 1) -- Use newly generated UserId (i.e. create new user)
)
AS UserIdFinal,
src.UserType
FROM (
-- Bulk Upsert (get source rows from JSON parameter)
SELECT src.FirstName, src.UserId, src.UserType
FROM JSONB_TO_RECORDSET(pUserDataJSON->'users') AS src(FirstName VARCHAR(100), UserId INTEGER, UserType CHAR(1))
) src
CROSS JOIN (
-- Get next ID value to use
SELECT NextValue
FROM NextId
WHERE IdType = 'User'
FOR UPDATE -- Use "Update" row-lock so it is not read by any other queries also using "Update" row-lock
) id
ON CONFLICT(UserId) DO UPDATE SET
UserType = EXCLUDED.UserType;
-- Increment UserId value
UPDATE NextId
SET NextValue = NextValue + COALESCE(NewUserCount,0)
WHERE IdType = 'User'
;
Just change the locking statement to Teradata syntax (LOCK TABLE NextId FOR WRITE) and add an ACTIVITY_COUNT variable after your INSERT/MERGE to capture the # rows affected. This assumes you're doing all this inside a stored procedure.
Let me know how it goes...

Relating two tables

I have created tables T1 with columns( id as Primary key and name) and T2 with columns( id as primary key, name, t_id as foreign key references T1(id)) . I Inserted some values from inputs from a Windows form. After querying SELECT * FROM T2; using isql, all the values in the foreign key column are null instead of duplicating values in T1(id) because of the relationship created. Is they anything I have left out or need to add? The primary key of both tables are autoincremented.
You are confusing auto-incremented keys and relationship uses.
Auto-incremented keys (or generally talking, fields) just help you when you are inserting a new record on the table of the key. But when you are inserting a new record that makes a reference to a record in another table, then you must specify that record, using the foreign key field. Or in your case, the user that is inserting the "name" in T2 must say which one record on T1 that "name" in T2 is making a reference.
Your confusion on the relationship is that you are thinking that an established relationship will enforce the use of that values automatically. But the relationship just enforce the validation of the values. So, the field t_id in T2 will not use the value of the last record of T1 automatically. But if you try to insert a value that do not exist in T1 in the field t_id, the relationship will not let you do.
So, answering your question, what you left out and need to add?
You left out the part of the code that insert the value on the t_id field of T2 table.
Let me try to explain using an example that is more common.
The most common case of this is that the application insert first the T1 record and then when the user is inserting T2, the application provide a way to the user to choose which one T1 record his T2 record is referencing.
Suppose T1 is a publishers table and T2 is a book table. User insert a publisher, and when it is inserting a book it can choose which one publisher publish that book.
Field "ID" of Customers will be AUTOINCREMENT by default in table create using Event BeforeInsert on table CUSTOMERS. LOOK AT
CREATE TRIGGER nametrigger FOR nametable
ACTIVE BEFORE INSERT POSITION 0
AS
BEGIN
IF (NEW.ID IS NULL) THEN BEGIN
NEW.ID = GEN_ID(GEN_PK_ID, 1);
END
END
Now one new record in Customers
INSERT INTO Customers (CustomerName, ContactName, Address, City, PostalCode, Country)
VALUES ('Cardinal','Tom B. Erichsen','Skagen 21','Stavanger','4006','Norway');
Then ID will be automaticaly one sequencial number from 1 up to last integer or smallint or bigint as you defined in your create table (pay attencion that ID field is not include in FIELDS and VALUES) because TRIGGER
now you can use the dataset (obj) options to link the table MATER and DETAIL see in help delphi
or in SQL you can to use PARAMS FIELDS
later insert one new record in table MASTER try...
INSERT INTO xTable2 (IDcustomersField, ..., ..., ...., ....)
VALUES ( :IDcustomersField, ..., ..., ...., ....);
xTable2 may using one field ID (Primary Key) autoincrement too. this help when DELETING or UPDATING fileds in this table
Then you can say the value to :IDcustomersField in table detail using
xQuery.PARAM( 0 ).value or xQuery.PARAMBYNAME( IDcustomersField).value (here im using Query obj as example )
you can to use example with DATASOURCE in code to say the value for IDcustomersField
can to use
Events in SQL
can to use
PROCEDURE IN SQL
DONT FORGOT
you have to create Relationship between two table ( REFERENCIAL INTEGRITY and PRIMARY KEY in mater table ) NOT NULL FOR TWO FIELDS ON TABLES
I believe that understand me about my poor explanation (i dont speak english
You need to insert the values for t_id manually, after you get the ID's value from the main table T1.
Depending on your logic in the database you also can use a trigger or a stored procedure. Give us more information about what values you expect to have in NAME field in T2 after the insert? Are they duplicates from T1 or independent from T1?
If T1.NAME=T2.NAME, you can automate the process with a trigger
CREATE OR ALTER TRIGGER TR_T1_AI0 FOR T1
ACTIVE AFTER INSERT POSITION 0
AS
BEGIN
INSERT INTO T2(NAME, T_ID)
VALUES (NEW.NAME, NEW.ID);
END
If T2.NAME's value is different from T1.NAME you can use a stored procedure with parameters both names:
CREATE ORA ALTER PROCEDURE XXXX(
P_NAME_T1 TYPE OF T1.NAME,
P_NAME_T2 TYPE OF T2.NAME)
AS
DECLARE VARIABLE L_ID TYPE OF T1.ID;
BEGIN
INSERT INTO T1(NAME)
VALUES (:p_NAME_T1)
RETURNING ID INTO:L_ID;
INSERT INTO T2(NAME, T_ID)
VALUES (:P_NAME_T2, :l_ID);
END
You can use both statements from the stored procedure directly in your program if it supports the returning syntax. If not, you need an additional query with SELECT NEXT VALUE FOR GENERATOR_FOR_T1 FROM RDB$DATABASE; and use the value returned from it in both INSERT statements.

SQL Server Unique Composite Key of Two Field With Second Field Auto-Increment

I have the following problem, I want to have Composite Primary Key like:
PRIMARY KEY (`base`, `id`);
for which when I insert a base the id to be auto-incremented based on the previous id for the same base
Example:
base id
A 1
A 2
B 1
C 1
Is there a way when I say:
INSERT INTO table(base) VALUES ('A')
to insert a new record with id 3 because that is the next id for base 'A'?
The resulting table should be:
base id
A 1
A 2
B 1
C 1
A 3
Is it possible to do it on the DB exactly since if done programmatically it could cause racing conditions.
EDIT
The base currently represents a company, the id represents invoice number. There should be auto-incrementing invoice numbers for each company but there could be cases where two companies have invoices with the same number. Users logged with a company should be able to sort, filter and search by those invoice numbers.
Ever since someone posted a similar question, I've been pondering this. The first problem is that DBs don't provide "partitionable" sequences (that would restart/remember based on different keys). The second is that the SEQUENCE objects that are provided are geared around fast access, and can't be rolled back (ie, you will get gaps). This essentially this rules out using a built-in utility... meaning we have to roll our own.
The first thing we're going to need is a table to store our sequence numbers. This can be fairly simple:
CREATE TABLE Invoice_Sequence (base CHAR(1) PRIMARY KEY CLUSTERED,
invoiceNumber INTEGER);
In reality the base column should be a foreign-key reference to whatever table/id defines the business(es)/entities you're issuing invoices for. In this table, you want entries to be unique per issued-entity.
Next, you want a stored proc that will take a key (base) and spit out the next number in the sequence (invoiceNumber). The set of keys necessary will vary (ie, some invoice numbers must contain the year or full date of issue), but the base form for this situation is as follows:
CREATE PROCEDURE Next_Invoice_Number #baseKey CHAR(1),
#invoiceNumber INTEGER OUTPUT
AS MERGE INTO Invoice_Sequence Stored
USING (VALUES (#baseKey)) Incoming(base)
ON Incoming.base = Stored.base
WHEN MATCHED THEN UPDATE SET Stored.invoiceNumber = Stored.invoiceNumber + 1
WHEN NOT MATCHED BY TARGET THEN INSERT (base) VALUES(#baseKey)
OUTPUT INSERTED.invoiceNumber ;;
Note that:
You must run this in a serialized transaction
The transaction must be the same one that's inserting into the destination (invoice) table.
That's right, you'll still get blocking per-business when issuing invoice numbers. You can't avoid this if invoice numbers must be sequential, with no gaps - until the row is actually committed, it might be rolled back, meaning that the invoice number wouldn't have been issued.
Now, since you don't want to have to remember to call the procedure for the entry, wrap it up in a trigger:
CREATE TRIGGER Populate_Invoice_Number ON Invoice INSTEAD OF INSERT
AS
DECLARE #invoiceNumber INTEGER
BEGIN
EXEC Next_Invoice_Number Inserted.base, #invoiceNumber OUTPUT
INSERT INTO Invoice (base, invoiceNumber)
VALUES (Inserted.base, #invoiceNumber)
END
(obviously, you have more columns, including others that should be auto-populated - you'll need to fill them in)
...which you can then use by simply saying:
INSERT INTO Invoice (base) VALUES('A');
So what have we done? Mostly, all this work was about shrinking the number of rows locked by a transaction. Until this INSERT is committed, there are only two rows locked:
The row in Invoice_Sequence maintaining the sequence number
The row in Invoice for the new invoice.
All other rows for a particular base are free - they can be updated or queried at will (deleting information out of this kind of system tends to make accountants nervous). You probably need to decide what should happen when queries would normally include the pending invoice...
you can use the trigger for before insert and assign the next value by taking the max(id) with "base" filter which is "A" in this case.
That will give you the max(id) value as 2 and than increment it by max(id)+1. now push the new value to the "id" field. before insert.
I think this may help you
MSSQL Triggers: http://msdn.microsoft.com/en-in/library/ms189799.aspx
Test Table
CREATE TABLE MyTable
( base CHAR(1),
id INT
)
GO
Trigger Definition
CREATE TRIGGER dbo.tr_Populate_ID
ON dbo.MyTable
INSTEAD OF INSERT
AS
BEGIN
SET NOCOUNT ON;
INSERT INTO MyTable (base,id)
SELECT i.base, ISNULL(MAX(mt.id),0) +1 AS NextValue
FROM inserted i left join MyTable mt
on i.base = mt.base
GROUP BY i.base
END
Test
Execute the following statement multiple times and you will see the next values available in that group will be assigned to ID.
INSERT INTO MyTable VALUES
('A'),
('B'),
('C')
GO
SELECT * FROM MyTable
GO

Primay Key conflicts on insertion of new records

In a database application, I want to insert, update and delete records in a table of database.
Table is as below:
In this table, Ga1_ID is Primary Key.
Suppose, I insert 5 records as show currently.
In second attempt, if I want to insert 5 other records and if any of these new records contains a primary key attribute which is already present in table it show error. Its fine.
But, when I insert new 5 records... how I can verify these new records's primary key value is not present. I mean, how to match or calculate the already present primary key attributes and then insert new records.
What is the best approach to manage this sort of situation ?
use following query in dataadapter:
da=new SqlDataAdapter("select Ga1_ID from table where Ga1_ID=#pkVal",conn);
DataSet=new DataSet();
da.fill(ds);
//pass parameter for #pkVal
da.SelectCommand.Parameters(1).Value = pkValue;
if(ds.Tables[0].Rows.Count>0) //If number of rows >0 then record exists
BEGIN
messagebox.show("Primary key present");
END
Hope its helpful.
Do not check existing records in advance, i.e. do not SELECT and then INSERT. A better (and pretty common) approach is to try to INSERT and handle exceptions, in particular, catch a primary key violation if any and handle it.
Do the insert in a try/catch block, with different handling in case of a primary key violation exception and other sql exception types.
If there was no exception, then job's done, record was inserted.
If you caught a primary key violation exception, then handle it appropriately (your post does not specify what you want to do in this case, and it's completely up to you)
If you want to perform 5 inserts at once and want to make sure they all succeed or else roll back if any of them failed, then do the inserts within a transaction.
you can do a lookup first before inserting.
IF EXISTS (SELECT * FROM tableName WHERE GA1_id=#newId)
BEGIN
UPDATE tableName SET Ga1_docid = #newdocID, GA1_fieldNAme = #newName, Ga1_fieldValue = #newVal where GA1_id=#newId
END
ELSE
BEGIN
INSERT INTO tableName(GA1_ID, Ga1_docid, GA1_fieldNAme Ga1_fieldValue) VALUES (value1,val2,value3,value4)
END
If you're using SQL Server 2012, use a sequence object - CREATE SEQUENCE.
This way you can get the next value using NEXT VALUE FOR.
With an older SQL Server version, you need to create the primary key field as an IDENTITY field and use the SCOPE_IDENTITY function to get the last identity value and then increment it manually.
Normally, you would like to have a surrogate key wich is generally an identity column that will automatically increment when you are inserting rows so that you don't have to care about knowing which id already exists.
However, if you have to manually insert the id there's a few alternatives for that and knowing wich SQL database you are using would help, but in most SQL implementations, you should be able to do something like:
IF NOT EXISTS
IF NOT EXISTS(
SELECT 1
FROM your_table
WHERE Ga1_ID = 1
)
INSERT INTO ...
SELECT WHERE NOT EXISTS
INSERT INTO your_table (col_1, col_2)
SELECT col_1, col_2
FROM (
SELECT 1 AS col_1, 2 AS col_2
UNION ALL
SELECT 3, 4
) q
WHERE NOT EXISTS (
SELECT 1
FROM your_table
WHERE col_1 = q.col_1
)
For MS SQL Server, you can also look at the MERGE statement and for MySQL, you can use the INSERT IGNORE statement.

Get all the rows referencing (via foreign keys) a particular row in a table

This seems so simple, but I haven't been able to find an answer to this question.
What do I want? A master table with rows that delete themselves whenever they are not referenced (via foreign keys) anymore. The solution may or may not be specific to PostgreSql.
How? One of my approaches to solving this problem (actually, the only approach so far) involves the following: For every table that references this master table, on UPDATE or DELETE of a row, to check for the referenced row in master, how many other other rows still refer to the referenced row. If it drops down to zero, then I delete that row in master as well.
(If you have a better idea, I'd like to know!)
In detail:
I have one master table referenced by many others
CREATE TABLE master (
id serial primary key,
name text unique not null
);
All the other tables have the same format generally:
CREATE TABLE other (
...
master_id integer references master (id)
...
);
If one of these are not NULL, they refer to a row in master. If I go to this and try to delete it, I will get an error message, because it is already referred to:
ERROR: update or delete on table "master" violates foreign key constraint "other_master_id_fkey" on table "other"
DETAIL: Key (id)=(1) is still referenced from table "other".
Time: 42.972 ms
Note that it doesn't take too long to figure this out even if I have many tables referencing master. How do I find this information out without having to raise an error?
You can do one of the following:
1) Add reference_count field to master table. Using triggers on detail tables increase the reference count whenever a row with this master_id is added. Decrease the count, when row gets deleted. When reference_count reaches 0 - delete the record.
2) Use pg_constraint table (details here) to get the list of referencing tables and create a dynamic SQL query.
3) Create triggers on every detail table, that deletes master_id in main table. Silence error messages with BEGIN ... EXCEPTION ... END.
In case someone wants a real count of rows in all other tables that reference a given master row, here is some PL/pgSQL. Note that this works in plain case with single column constraints. It gets more involved for multi-column constraints.
CREATE OR REPLACE FUNCTION count_references(master regclass, pkey_value integer,
OUT "table" regclass, OUT count integer)
RETURNS SETOF record
LANGUAGE 'plpgsql'
VOLATILE
AS $BODY$
declare
x record; -- constraint info for each table in question that references master
sql text; -- temporary buffer
begin
for x in
select conrelid, attname
from pg_constraint
join pg_attribute on conrelid=attrelid and attnum=conkey[1]
where contype='f' and confrelid=master
and confkey=( -- here we assume that FK references master's PK
select conkey
from pg_constraint
where conrelid=master and contype='p'
)
loop
"table" = x.conrelid;
sql = format('select count(*) from only %s where %I=$1', "table", x.attname);
execute sql into "count" using pkey_value;
return next;
end loop;
end
$BODY$;
Then use it like
select * from count_references('master', 1) where count>0
This will return a list of tables that have references to master table with id=1.
SELECT *
FROM master ma
WHERE EXISTS (
SELECT *
FROM other ot
WHERE ot.master_id = ma.id
);
Or, the other way round:
SELECT *
FROM other ot
WHERE EXISTS (
SELECT *
FROM master ma
WHERE ot.master_id = ma.id
);
SO if you want to update (or delete) only the rows in master that are not referenced by other, you could:
UPDATE master ma
SET id = 1000+id
, name = 'blue'
WHERE name = 'green'
AND NOT EXISTS (
SELECT *
FROM other ot
WHERE ot.master_id = ma.id
);