SQL MERGE: output both matched and unmatched results

I have two keys for my data: pk is generated by the database when a row is inserted, while fk is a complex key supplied by another system. I would like to produce a pk for each fk.
CREATE TABLE test_target (
[pk] [INT] IDENTITY(1,1),
[fk] [varchar](20) NOT NULL)
And I can use MERGE to ensure that a new pk is produced whenever no corresponding fk exists in the table, and I know I can output the newly created ids.
CREATE TABLE test_source (
[fk] [varchar](20) NOT NULL)
INSERT INTO test_source VALUES('abc123'),('def456'),('ghi789')
MERGE test_target WITH (SERIALIZABLE) AS T
USING test_source AS U
ON U.fk = T.fk
WHEN NOT MATCHED THEN
INSERT (fk) VALUES(U.fk)
OUTPUT inserted.pk, inserted.fk;
However, what I really want is the pk associated with every fk in the test_source table. I can get them all by joining the two tables.
SELECT test_target.* FROM test_target
INNER JOIN test_source ON test_target.fk = test_source.fk
But I feel like the associated pk is already found in the MATCHED case of the MERGE statement, so doing another search on the target table is duplicated effort. My question is: is there a way to output the MATCHED pk in the same MERGE statement?

Yes there is. At first I thought I had to touch the row and update it in some form, but I realized we can just trick it. The OUTPUT clause will output any row the statement touches, not just the rows you did not match on, so you can include a WHEN MATCHED clause; the problem is making it a no-op.
create table foo
(
id int
,bar varchar(30)
)
insert into foo (id, bar) values (1,'test1');
insert into foo (id, bar) values (2,'test2');
insert into foo (id, bar) values (3,'test3');
declare @temp int;
merge foo as dest
using
(
    values (2, 'an updated value')
         , (4, 'a new value')
) as src(id, bar) on (dest.id = src.id)
when matched then
    update set @temp = 1
when not matched then
    insert (id, bar)
    values (src.id, src.bar)
output $action, src.id;
You can see that in the WHEN MATCHED clause I set a declared variable to 1. Oddly enough, that is enough for the row to count as touched, so the OUTPUT clause picks it up. You can distinguish which operation (insert vs. update) occurred, if you need to, with $action in the output.
This gives the following results:
$action id
UPDATE 2
INSERT 4
Performance-wise, I'd want to test how this operates at scale, and whether the variable assignment causes a throttling effect.
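Applied back to the original test_target/test_source tables, the same trick might look like this (an untested sketch; @dummy is an arbitrary variable name):
DECLARE @dummy INT;

MERGE test_target WITH (SERIALIZABLE) AS T
USING test_source AS U
    ON U.fk = T.fk
WHEN MATCHED THEN
    -- no-op variable assignment so MATCHED rows reach the OUTPUT clause
    UPDATE SET @dummy = 1
WHEN NOT MATCHED THEN
    INSERT (fk) VALUES (U.fk)
OUTPUT $action, inserted.pk, inserted.fk;
For MATCHED rows, inserted.pk simply reflects the (unchanged) existing row, so the pre-existing and the newly generated pk values come back in one pass.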


SQL Server trigger can't insert

I'm beginning to learn how to write triggers with this basic database.
It's also my very first database.
Schema
Team:
TeamID int PK (TeamID int IDENTITY(0,1) CONSTRAINT TeamID_PK PRIMARY KEY)
TeamName nvarchar(100)
History:
HistoryID int PK (HistoryID int IDENTITY(0,1) CONSTRAINT HistoryID_PK PRIMARY KEY)
TeamID int FK REF Team(TeamID)
WinCount int
LoseCount int
My trigger: when a new team is inserted, it should insert a new history row with that team id
CREATE TRIGGER after_insert_Player
ON Team
FOR INSERT
AS
BEGIN
INSERT INTO History (TeamID, WinCount, LoseCount)
SELECT DISTINCT i.TeamID
FROM Inserted i
LEFT JOIN History h ON h.TeamID = i.TeamID
AND h.WinCount = 0 AND h.LoseCount = 0
END
Executing it returns:
The select list for the INSERT statement contains fewer items than the insert list. The number of SELECT values must match the number of INSERT columns.
Please help, thanks. I'm using SQL Server.
The error text is the best guide; it is quite clear.
You are trying to insert one value (i.TeamID) into three columns (TeamID, WinCount, LoseCount).
Include values for WinCount and LoseCount while inserting.
Note: I think the structure of the History table needs revisiting; you should probably compute WinCount and LoseCount as expressions rather than store them as actual columns.
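For example, a minimal corrected version of the trigger might look like this (assuming new teams should start at zero wins and losses; the guard against existing History rows is kept from the original attempt):
CREATE TRIGGER after_insert_Player
ON Team
FOR INSERT
AS
BEGIN
    -- Supply a value for every column named in the insert list
    INSERT INTO History (TeamID, WinCount, LoseCount)
    SELECT i.TeamID, 0, 0
    FROM Inserted i
    WHERE NOT EXISTS (SELECT 1 FROM History h WHERE h.TeamID = i.TeamID);
END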
When you specify insert columns, you say which columns you will be filling. But in your case, right after the insert you select only one column (the team id).
You either have to modify the insert list to contain only one column, or the select to retrieve three values matching the insert list.
If you name the columns to be inserted into (using INSERT ... SELECT), the SELECT statement has to contain the same number of columns as the insert list. Also, ensure they are of the same data types (you might face some issues otherwise).

Duplicating parent, child and grandchild records

I have a parent table that represents a document of sorts, with each record in the table having n child records in a child table. Each child record can have n grandchild records. These records are in a published state. When the user wants to modify a published document, we need to clone the parent and all of its children and grandchildren.
The table structure looks like this:
Parent
CREATE TABLE [ql].[Quantlist] (
[QuantlistId] INT IDENTITY (1, 1) NOT NULL,
[StateId] INT NOT NULL,
[Title] VARCHAR (500) NOT NULL,
CONSTRAINT [PK_Quantlist] PRIMARY KEY CLUSTERED ([QuantlistId] ASC),
CONSTRAINT [FK_Quantlist_State] FOREIGN KEY ([StateId]) REFERENCES [ql].[State] ([StateId])
);
Child
CREATE TABLE [ql].[QuantlistAttribute]
(
[QuantlistAttributeId] INT IDENTITY (1, 1),
[QuantlistId] INT NOT NULL,
[Narrative] VARCHAR (500) NOT NULL,
CONSTRAINT [PK_QuantlistAttribute] PRIMARY KEY ([QuantlistAttributeId]),
CONSTRAINT [FK_QuantlistAttribute_QuantlistId] FOREIGN KEY ([QuantlistId]) REFERENCES [ql].[Quantlist]([QuantlistId])
)
Grandchild
CREATE TABLE [ql].[AttributeReference]
(
[AttributeReferenceId] INT IDENTITY (1, 1),
[QuantlistAttributeId] INT NOT NULL,
[Reference] VARCHAR (250) NOT NULL,
CONSTRAINT [PK_QuantlistReference] PRIMARY KEY ([AttributeReferenceId]),
CONSTRAINT [FK_QuantlistReference_QuantlistAttribute] FOREIGN KEY ([QuantlistAttributeId]) REFERENCES [ql].[QuantlistAttribute]([QuantlistAttributeId])
)
In my stored procedure, I pass in the QuantlistId I want to clone as @QuantlistId. Since the QuantlistAttribute table has a foreign key, I can easily clone that as well.
INSERT INTO [ql].[Quantlist] (
    [StateId],
    [Title]
) SELECT
    1,
    Title
FROM [ql].[Quantlist]
WHERE QuantlistId = @QuantlistId

SET @ClonedId = SCOPE_IDENTITY()

INSERT INTO ql.QuantlistAttribute(
    QuantlistId
    ,Narrative)
SELECT
    @ClonedId,
    Narrative
FROM ql.QuantlistAttribute
WHERE QuantlistId = @QuantlistId
The trouble comes down to the AttributeReference. If I cloned 30 QuantlistAttribute records, how do I clone the records in the reference table and match them up with the new records I just inserted in to the QuantlistAttribute table?
INSERT INTO ql.AttributeReference(
    QuantlistAttributeId,
    Reference)
SELECT
    QuantlistAttributeId,
    Reference
FROM ql.AttributeReference
WHERE ??? -- I don't have a key to go off of for this.
I thought I could do this with a temporary linking table that holds the old attribute ids along with the new attribute ids, but I don't know how to go about inserting the old attribute ids into a temp table along with their new ones. Inserting the existing attributes, by QuantlistId, is easy enough, but I can't figure out how to link the correct new and old ids together in some way, so that the AttributeReference table can be cloned correctly. If I could link the new and old QuantlistAttribute ids, I could join on that temp table and figure out how to restore the relationship of the newly cloned references to the newly cloned attributes.
Any help on this would be awesome. I've spent the last day and a half trying to figure this out with no luck :/
Please excuse some of the SQL inconsistencies. I rewrote the SQL quickly, trimming out a lot of additional columns, related tables and constraints that weren't needed for this question.
Edit
After doing a little digging around, I found that OUTPUT might be useful for this. Is there a way to use OUTPUT to map the QuantlistAttributeId records I just inserted to the QuantlistAttributeId they originated from?
You can use OUTPUT to get the inserted rows:
1. Insert the data into QuantlistAttribute ordered by c.QuantlistAttributeId ASC.
2. Have a temp table/table variable with three columns: an identity id column, the new QuantlistAttributeId and the old QuantlistAttributeId.
3. Use OUTPUT to insert the new identity values of QuantlistAttribute into the temp table/table variable. The new ids are generated in the same order as c.QuantlistAttributeId.
4. Use ROW_NUMBER() ordered by QuantlistAttributeId to match the old and new QuantlistAttributeIds, joining on the row number and the id of the table variable, and update the old QuantlistAttributeId values in the table variable.
5. Join the temp table with AttributeReference and insert the records in one go.
Note: the ORDER BY during the INSERT INTO ... SELECT, and the ROW_NUMBER() used to find the matching old QuantlistAttributeId, are required because, looking at your question, there seems to be no other logical key to map old and new records together.
Query for the above steps:
DECLARE @ClonedId INT, @QuantlistId INT = 0

INSERT INTO [ql].[Quantlist] (
    [StateId],
    [Title]
) SELECT
    1,
    Title
FROM [ql].[Quantlist]
WHERE QuantlistId = @QuantlistId

SET @ClonedId = SCOPE_IDENTITY()

--Define a table variable to store the new QuantlistAttributeIds and use it to map to the old QuantlistAttributeIds
DECLARE @temp TABLE(id int identity(1,1), newAttrID INT, oldAttrID INT)

INSERT INTO ql.QuantlistAttribute(
    QuantlistId
    ,Narrative)
--New QuantlistAttributeIds are created in the same order as the old ones because of the ORDER BY
OUTPUT inserted.QuantlistAttributeId, NULL INTO @temp(newAttrID, oldAttrID)
SELECT
    @ClonedId,
    Narrative
FROM ql.QuantlistAttribute c
WHERE QuantlistId = @QuantlistId
--This is required to keep the new ids generated in the same order as the old
ORDER BY c.QuantlistAttributeId ASC

;WITH CTE AS
(
    SELECT c.QuantlistAttributeId,
    --Use ROW_NUMBER to get a matching id, the same as the one generated in @temp
    ROW_NUMBER() OVER(ORDER BY c.QuantlistAttributeId ASC) id
    FROM ql.QuantlistAttribute c
    WHERE QuantlistId = @QuantlistId
)
--Update the old value in @temp
UPDATE T
SET oldAttrID = CTE.QuantlistAttributeId
FROM @temp T
INNER JOIN CTE ON T.id = CTE.id

INSERT INTO ql.AttributeReference(
    QuantlistAttributeId,
    Reference)
SELECT
    T.newAttrID,
    Reference
FROM ql.AttributeReference R
--Use oldAttrID to join with ql.AttributeReference and insert newAttrID
INNER JOIN @temp T
    ON T.oldAttrID = R.QuantlistAttributeId
Hope this helps.
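As an aside, an alternative that avoids relying on insertion order entirely: the OUTPUT clause of a MERGE statement (unlike that of a plain INSERT) can reference source columns, so it can capture the old-to-new id mapping directly. A hedged sketch, untested against your schema, reusing @QuantlistId and @ClonedId from the query above:
DECLARE @map TABLE (oldAttrID INT, newAttrID INT);

MERGE ql.QuantlistAttribute AS dest
USING (SELECT QuantlistAttributeId, Narrative
       FROM ql.QuantlistAttribute
       WHERE QuantlistId = @QuantlistId) AS src
ON 1 = 0 -- never matches, so every source row is inserted
WHEN NOT MATCHED THEN
    INSERT (QuantlistId, Narrative)
    VALUES (@ClonedId, src.Narrative)
OUTPUT src.QuantlistAttributeId, inserted.QuantlistAttributeId INTO @map(oldAttrID, newAttrID);

INSERT INTO ql.AttributeReference (QuantlistAttributeId, Reference)
SELECT m.newAttrID, r.Reference
FROM ql.AttributeReference r
INNER JOIN @map m ON m.oldAttrID = r.QuantlistAttributeId;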

How to efficiently update postgres using a tuple of the PK and a value?

My SCHEMA is the following and I have ~ 4m existing posts in the DB that I need to update. I am adding an integer which points to a text location.
CREATE TABLE app_post (
id integer NOT NULL,
text_location integer,
title character varying(140)
);
I want to update existing records with a long (1000-5000) list of tuples that represent (id, text_location):
[(1, 123), (2,3), (9, 10)....]
What is the most efficient way to do this?
If you are generating the values on the fly using Python, you could:
1. Build a buffer containing a single multi-row INSERT statement.
2. Start a transaction.
3. Create a temporary table and execute the INSERT statement from your buffer.
4. Perform an UPDATE ... FROM.
5. Commit the transaction, discarding the temporary table.
The UPDATE statement will look like this (assuming there is a table new_values containing those new values you need to update):
UPDATE app_post AS a SET text_location = n.text_location
FROM new_values AS n WHERE a.id = n.id
Don't forget to define the id columns as PRIMARY KEY, or to create an index on them.
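Putting those steps together, a minimal sketch (assuming the tuples come from your Python list) might look like:
BEGIN;

CREATE TEMPORARY TABLE new_values (
    id integer PRIMARY KEY,
    text_location integer
) ON COMMIT DROP;

-- A single multi-row INSERT built from the Python list of tuples
INSERT INTO new_values (id, text_location) VALUES
    (1, 123), (2, 3), (9, 10) /* ... up to ~5000 rows */ ;

UPDATE app_post AS a
SET text_location = n.text_location
FROM new_values AS n
WHERE a.id = n.id;

COMMIT;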
EDIT: Since you are experiencing very slow performance, another workaround could be to recreate the whole table. The following idea assumes you don't have any FOREIGN KEY constraints applied to the app_post table, as shown in your initial post.
-- Begin the Transaction
BEGIN;
-- Create a temporary table to hold the new values
CREATE TEMPORARY TABLE temp_update_values (
id integer PRIMARY KEY,
text_location integer
) ON COMMIT DROP;
-- Populate it
INSERT INTO temp_update_values (id, text_location) VALUES (1, 123), (2, 456) /* ... ~5000 total */ ;
-- Create a temporary table merging the existing "app_post" and "temp_update_values"
CREATE TEMPORARY TABLE temp_new_app_post ON COMMIT DROP AS
SELECT a.id, COALESCE(n.text_location, a.text_location) AS text_location, a.title
FROM app_post AS a LEFT JOIN temp_update_values AS n ON a.id = n.id;
-- Empty the existing "app_post"
TRUNCATE TABLE app_post;
-- Repopulate "app_post" table
INSERT INTO app_post (id, text_location, title)
SELECT id, text_location, title FROM temp_new_app_post;
-- Commit the Transaction
COMMIT;
If there are any FOREIGN KEY constraints, you should take care of them: drop them before truncating the app_post table, and re-create them after it has been repopulated.
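For instance, if a hypothetical app_comment table referenced app_post, the surrounding statements might look like this (the table and constraint names are placeholders):
ALTER TABLE app_comment DROP CONSTRAINT app_comment_post_id_fkey;

-- ... TRUNCATE and repopulate app_post as above ...

ALTER TABLE app_comment
    ADD CONSTRAINT app_comment_post_id_fkey
    FOREIGN KEY (post_id) REFERENCES app_post (id);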

SQL can I have a "conditionally unique" constraint on a table?

I've had this come up a couple of times in my career, and none of my local peers seems to be able to answer it. Say I have a table that has a "Description" field which is a candidate key, except that sometimes a user will stop halfway through the process. So for maybe 25% of the records this value is null, but for all records where it is not null, it must be unique.
Another example might be a table which must maintain multiple "versions" of a record, and a bit value indicates which one is the "active" one. So the "candidate key" is always populated, but there may be three versions that are identical (with 0 in the active bit) and only one that is active (1 in the active bit).
I have alternative methods to solve these problems (in the first case, enforce the rule in code, either in the stored procedure or the business layer; and in the second, populate an archive table with a trigger and UNION the tables when I need the history). I don't want alternatives (unless there are demonstrably better solutions); I'm just wondering if any flavor of SQL can express "conditional uniqueness" in this way. I'm using MS SQL, so if there's a way to do it in that, great. I'm mostly just academically interested in the problem.
If you are using SQL Server 2008, a filtered index may be your solution:
http://msdn.microsoft.com/en-us/library/ms188783.aspx
This is how I enforce a unique index with multiple NULL values:
CREATE UNIQUE INDEX [IDX_Blah] ON [tblBlah] ([MyCol]) WHERE [MyCol] IS NOT NULL
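The same feature covers the active-version scenario from the question. A hedged sketch, using the mytable/id_column/active_flag names from the Oracle answer below:
-- Only rows with active_flag = 1 participate in the index, so any
-- number of inactive versions may share the same id_column.
CREATE UNIQUE INDEX IDX_OneActivePerId ON mytable (id_column) WHERE active_flag = 1;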
In the case of descriptions which are not yet completed, I wouldn't have those in the same table as the finalized descriptions. The final table would then have a unique index or primary key on the description.
In the case of the active/inactive, again I might have separate tables as you did with an "archive" or "history" table, but another possible way to do it in MS SQL Server at least is through the use of an indexed view:
CREATE TABLE Test_Conditionally_Unique
(
my_id INT NOT NULL,
active BIT NOT NULL DEFAULT 0
)
GO
CREATE VIEW dbo.Test_Conditionally_Unique_View
WITH SCHEMABINDING
AS
SELECT
my_id
FROM
dbo.Test_Conditionally_Unique
WHERE
active = 1
GO
CREATE UNIQUE CLUSTERED INDEX IDX1 ON Test_Conditionally_Unique_View (my_id)
GO
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (1, 1)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 0)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 1)
INSERT INTO dbo.Test_Conditionally_Unique (my_id, active)
VALUES (2, 1) -- This insert will fail
You could use this same method for the NULL/Valued descriptions as well.
Thanks for the comments, the initial version of this answer was wrong.
Here's a trick using a computed column that effectively allows a nullable unique constraint in SQL Server:
create table NullAndUnique
(
id int identity,
name varchar(50),
uniqueName as case
when name is null then cast(id as varchar(51))
else name + '_' end,
unique(uniqueName)
)
insert into NullAndUnique default values
insert into NullAndUnique default values -- Works
insert into NullAndUnique default values -- not accidentally :)
insert into NullAndUnique (name) values ('Joel')
insert into NullAndUnique (name) values ('Joel') -- Boom!
It basically uses the id when the name is null. The + '_' is to avoid cases where name might be numeric, like 1, which could collide with the id.
I'm not entirely aware of your intended use or your tables, but you could try using a one-to-one relationship: split this "sometimes unique" column out into a new table, create the UNIQUE index on that column in the new table, and FK back to the original table using the original table's PK. Only have a row in this new table when the "unique" data is supposed to exist.
OLD tables:
TableA
ID pk
Col1 sometimes unique
Col...
NEW tables:
TableA
ID
Col...
TableB
ID PK, FK to TableA.ID
Col1 unique index
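Concretely, a hedged DDL sketch of that layout (all names are placeholders):
CREATE TABLE TableA (
    ID INT IDENTITY PRIMARY KEY,
    Col2 VARCHAR(100) NULL  -- placeholder for the other columns
);

CREATE TABLE TableB (
    ID INT PRIMARY KEY
        REFERENCES TableA (ID),           -- PK and FK back to TableA
    Col1 VARCHAR(100) NOT NULL UNIQUE     -- unique whenever a row exists
);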
Oracle does. A fully null key is not indexed by a B-tree index in Oracle, and Oracle uses B-tree indexes to enforce unique constraints.
Assuming one wished to version ID_COLUMN based on the ACTIVE_FLAG being set to 1:
CREATE UNIQUE INDEX idx_versioning_id ON mytable
(CASE active_flag WHEN 0 THEN NULL ELSE active_flag END,
CASE active_flag WHEN 0 THEN NULL ELSE id_column END);

How to add data to two tables linked via a foreign key?

If I were to have 2 tables, call them TableA and TableB. TableB contains a foreign key which refers to TableA. I now need to add data to both TableA and TableB for a given scenario. To do this I first have to insert data into TableA, then find and retrieve TableA's last inserted primary key and use it as the foreign key value in TableB. I then insert values into TableB. This seems like a bit too much work just to insert one set of data. How else can I achieve this? If possible please provide me with SQL statements for SQL Server 2005.
That sounds about right. Note that you can use SCOPE_IDENTITY() on a per-row basis, or you can do set-based operations if you use the INSERT/OUTPUT syntax and then join to the set of output from the first insert. For example, here we only have one INSERT (each) into the "real" tables:
/*DROP TABLE STAGE_A
DROP TABLE STAGE_B
DROP TABLE B
DROP TABLE A*/
SET NOCOUNT ON
CREATE TABLE STAGE_A (
CustomerKey varchar(10),
Name varchar(100))
CREATE TABLE STAGE_B (
CustomerKey varchar(10),
OrderNumber varchar(100))
CREATE TABLE A (
Id int NOT NULL IDENTITY(51,1) PRIMARY KEY,
CustomerKey varchar(10),
Name varchar(100))
CREATE TABLE B (
Id int NOT NULL IDENTITY(1123,1) PRIMARY KEY,
CustomerId int,
OrderNumber varchar(100))
ALTER TABLE B ADD FOREIGN KEY (CustomerId) REFERENCES A(Id);
INSERT STAGE_A VALUES ('foo', 'Foo Corp')
INSERT STAGE_A VALUES ('bar', 'Bar Industries')
INSERT STAGE_B VALUES ('foo', '12345')
INSERT STAGE_B VALUES ('foo', '23456')
INSERT STAGE_B VALUES ('bar', '34567')
DECLARE @CustMap TABLE (CustomerKey varchar(10), Id int NOT NULL)
INSERT A (CustomerKey, Name)
OUTPUT INSERTED.CustomerKey, INSERTED.Id INTO @CustMap
SELECT CustomerKey, Name
FROM STAGE_A
INSERT B (CustomerId, OrderNumber)
SELECT map.Id, b.OrderNumber
FROM STAGE_B b
INNER JOIN @CustMap map ON map.CustomerKey = b.CustomerKey
SELECT * FROM A
SELECT * FROM B
If you work directly with SQL you have the right solution.
In case you're performing the insert from code, you may have higher-level structures that help you achieve this (LINQ, Django models, etc.).
If you are going to do this in direct SQL, I suggest creating a stored procedure that takes all of the data as parameters, then performs the insert/select identity/insert steps inside a transaction. Even though the process is still the same as your manual inserts, using the stored procedure will allow you to more easily use it from your code. As #Rax mentions, you may also be able to use an ORM to get similar functionality.