SQL Server MERGE, access to source table after OUTPUT

I am working on a copy functionality in my application. I have campaigns, and campaign packages, stored in different tables.
I have a table for outputs, and several variables:
DECLARE @NewIds TABLE (NewCampaignId int, NewCampaignPackageId int, OriginalCampaignId int, OriginalCampaignPackageId int)
DECLARE @NewCampaignId [int]
DECLARE @NewCampaignPackageId [int]
To copy the campaign, I do a MERGE:
MERGE INTO Campaign AS CNew USING
(SELECT * FROM Campaign WHERE Id = @CampaignId) AS COld ON 0 = 1 WHEN NOT MATCHED BY TARGET THEN
INSERT (Status, Country, Culture, Title, DateFrom, DateTo, SalesRepEmail, SalesAgronomistEmail, TermsAndConditions, SalesForceCampaignId, CampaignType, Description, CampaignExpiredTitle, CampaignExpiredMessage, CampaignExpiredCTAText, CampaignExpiredCTALink)
VALUES (Status, Country, Culture, 'Copy of ' + Title, DateFrom, DateTo, SalesRepEmail, SalesAgronomistEmail, TermsAndConditions, null, CampaignType, Description, CampaignExpiredTitle, CampaignExpiredMessage, CampaignExpiredCTAText, CampaignExpiredCTALink)
OUTPUT inserted.Id, 0, @CampaignId, 0 INTO @NewIds;
SELECT @NewCampaignId = NewCampaignId FROM @NewIds WHERE OriginalCampaignId = @CampaignId
This works. However, I also need to update the campaign package table, and for that to work I need to preserve the original CampaignPackageId. I use MERGE like so:
MERGE INTO CampaignPackage AS CPackageNew USING
(SELECT * FROM CampaignPackage WHERE Id = @CampaignId) AS CPackageOld ON 0 = 1
WHEN NOT MATCHED BY TARGET THEN
INSERT(CampaignId, CanEditQuantity, Benefits, Description, Name)
VALUES(@NewCampaignId, CanEditQuantity, Benefits, Description, Name)
OUTPUT 0, inserted.Id, @CampaignId, CPackageOld.Id INTO @NewIds;
SELECT @NewCampaignPackageId = NewCampaignPackageId FROM @NewIds WHERE OriginalCampaignPackageId = CPackageOld.Id AND OriginalCampaignId = @CampaignId
This is where the problem is - in the last line I'm trying to assign the NewCampaignPackageId based on my outputs table, but here I have no access to CPackageOld.Id. I need this old id, to make sure I'm hitting the correct record in the table.
Is there any way to still have access to this variable, after OUTPUT?
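One way to sidestep this (a sketch, assuming the schema above): the OUTPUT clause has already recorded CPackageOld.Id into @NewIds as OriginalCampaignPackageId, so the follow-up SELECT can read the pairing back from @NewIds instead of referencing CPackageOld, which is out of scope after the MERGE ends:

```sql
-- Sketch: @NewIds already pairs each new package id with the original one,
-- so the old id is available without touching CPackageOld again.
SELECT NewCampaignPackageId, OriginalCampaignPackageId
FROM @NewIds
WHERE OriginalCampaignId = @CampaignId
  AND OriginalCampaignPackageId <> 0;  -- 0 is the sentinel used for campaign rows
```

If multiple packages are copied in one MERGE, this returns the full old-to-new id mapping, which is usually what the subsequent UPDATE of the package table needs anyway.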

Related

When Merge null columns are not merged just inserted

My tables and their data are below:
CREATE TABLE Sales (Id int IDENTITY(1,1) NOT NULL, stateid int, district int, Sitecount int)
CREATE TABLE Sales1 (stateid int, district int, Sitecount int)
insert into Sales values (1,2,12)
insert into Sales values (1,3,20)
insert into Sales values (1, NULL, 10)
insert into Sales1 values (1,2,10)
insert into Sales1 values (1,2,100)
insert into Sales1 values (1, NULL, 18)
I have used the below query to merge
MERGE Sales AS T
USING (Select stateid, district, Sitecount from Sales1 group by stateid,district) as S
ON(S.stateid = T.stateid and S.district = T.district)
WHEN MATCHED
Then UPDATE SET
T.Sitecount=T.Sitecount+S.Sitecount
WHEN NOT MATCHED BY TARGET THEN INSERT (stateid,district,Sitecount) VALUES(S.stateid, S.district, S.Sitecount);
Whenever I run the query, rows are merged only when all of the matched columns are non-NULL; otherwise the data is inserted as a new row.
When district is NULL, I need the Sitecount added based on stateid alone. How can I achieve this?
You can match on stateid by using ISNULL in the join, like below:
MERGE Sales AS T
USING (Select stateid, district, max(Sitecount) Sitecount from Sales1 group by stateid,district) as S
ON(S.stateid = T.stateid and ISNULL(S.district, -1) = ISNULL(T.district, -1))
WHEN MATCHED
Then UPDATE SET
T.Sitecount=T.Sitecount+S.Sitecount
WHEN NOT MATCHED BY TARGET THEN INSERT (stateid,district,Sitecount) VALUES(S.stateid, S.district, S.Sitecount);
select * from Sales
Also, please note that I have used max(Sitecount) to avoid the duplicates in join. Please change as per your requirement.
Please find the db<>fiddle here.
Try this:
MERGE Sales AS T
USING (Select stateid, district, max(Sitecount) Sitecount from Sales1 group by stateid, district) as S
ON(ISNULL(S.stateid, -1) = ISNULL(T.stateid, -1) and ISNULL(S.district, -1) = ISNULL(T.district, -1))
WHEN MATCHED
Then UPDATE SET
T.Sitecount = T.Sitecount + S.Sitecount
WHEN NOT MATCHED BY TARGET THEN INSERT (stateid,district,Sitecount) VALUES(S.stateid, S.district, S.Sitecount);
The ON clause was adjusted to wrap the comparison fields in ISNULL, because NULL values, as you observed, never compare as equal and therefore incorrectly fall through to the INSERT branch:
ON(ISNULL(S.stateid, -1) = ISNULL(T.stateid, -1) and ISNULL(S.district, -1) = ISNULL(T.district, -1))

Insert into a Informix table or update if exists

I want to add a row to an Informix database table, but when a row exists with the same unique key I want to update the row.
I have found a solution for MySQL, which is as follows, but I need the equivalent for Informix:
INSERT INTO table (id, name, age) VALUES(1, "A", 19) ON DUPLICATE KEY UPDATE name="A", age=19
You probably should use the MERGE statement.
Given a suitable table:
create table table (id serial not null primary key, name varchar(20) not null, age integer not null);
this SQL works:
MERGE INTO table AS dst
USING (SELECT 1 AS id, 'A' AS name, 19 AS age
FROM sysmaster:'informix'.sysdual
) AS src
ON dst.id = src.id
WHEN NOT MATCHED THEN INSERT (dst.id, dst.name, dst.age)
VALUES (src.id, src.name, src.age)
WHEN MATCHED THEN UPDATE SET dst.name = src.name, dst.age = src.age
Informix has interesting rules allowing the use of keywords as identifiers without needing double quotes (indeed, unless you have DELIMIDENT set in the environment, double quotes are simply an alternative to single quotes around strings).
You can try the same behavior using the MERGE statement:
Example, creation of the target table:
CREATE TABLE target
(
id SERIAL PRIMARY KEY CONSTRAINT pk_tst,
name CHAR(1),
age SMALLINT
);
Create a temporary source table and insert the record you want:
CREATE TEMP TABLE source
(
id INT,
name CHAR(1),
age SMALLINT
) WITH NO LOG;
INSERT INTO source (id, name, age) VALUES (1, 'A', 19);
The MERGE would be:
MERGE INTO target AS t
USING source AS s ON t.id = s.id
WHEN MATCHED THEN
UPDATE
SET t.name = s.name, t.age = s.age
WHEN NOT MATCHED THEN
INSERT (id, name, age)
VALUES (s.id, s.name, s.age);
You'll see that the record was inserted. Then you can:
UPDATE source
SET age = 20
WHERE id = 1;
And test the MERGE again.
Another way to do it is to create a stored procedure: basically you do the INSERT statement and check the SQL error code, and if it's -100 you go for the UPDATE.
Something like:
CREATE PROCEDURE sp_insrt_target(v_id INT, v_name CHAR(1), v_age SMALLINT)
ON EXCEPTION IN (-100)
UPDATE target
SET name = v_name, age = v_age
WHERE id = v_id;
END EXCEPTION
INSERT INTO target VALUES (v_id, v_name, v_age);
END PROCEDURE;
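A call might look like this (values are illustrative):

```sql
-- First call inserts; a second call with the same id raises -100
-- inside the INSERT, and the exception handler performs the UPDATE.
EXECUTE PROCEDURE sp_insrt_target(1, 'A', 19);
EXECUTE PROCEDURE sp_insrt_target(1, 'A', 20);
```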

Speed up performance on UPDATE of temp table

I have a SQL Server 2012 stored procedure. I'm filling a temp table below, and that's fairly straightforward. However, after that I'm doing some UPDATE on it.
Here's my T-SQL for declaring the temp table, #SourceTable, filling it, then doing some updates on it. After all of this, I simply take this temp table and insert it into a new table we are filling with a MERGE statement which joins on DOI. DOI is a main column here, and you'll see below that my UPDATE statements get MAX/MIN on several columns based on this column as the table can have multiple rows with the same DOI.
My question is...how can I speed up filling #SourceTable or doing my updates on it? Are there any indexes I can create? I'm decent at SQL, but not the best at performance issues. I'm dealing with maybe 60,000,000 records here in the temp table. It's been running for almost 4 hours now. This is a one-time deal here for a script I'm running once.
CREATE TABLE #SourceTable
(
DOI VARCHAR(72),
FullName NVARCHAR(128), LastName NVARCHAR(64),
FirstName NVARCHAR(64), FirstInitial NVARCHAR(10),
JournalId INT, JournalVolume VARCHAR(16),
JournalIssue VARCHAR(16), JournalFirstPage VARCHAR(16),
JournalLastPage VARCHAR(16), ArticleTitle NVARCHAR(1024),
PubYear SMALLINT, CreatedDate SMALLDATETIME,
UpdatedDate SMALLDATETIME,
ISSN_e VARCHAR(16), ISSN_p VARCHAR(16),
Citations INT, LastCitationRefresh SMALLDATETIME,
LastCitationRefreshValue SMALLINT, IsInSearch BIT,
BatchUpdatedDate SMALLDATETIME, LastIndexUpdate SMALLDATETIME,
ArticleClassificationId INT, ArticleClassificationUpdatedBy INT,
ArticleClassificationUpdatedDate SMALLDATETIME,
Affiliations VARCHAR(8000),
--Calculated columns for use in importing...
RowNum SMALLINT, MinCreatedDatePerDOI SMALLDATETIME,
MaxUpdatedDatePerDOI SMALLDATETIME,
MaxBatchUpdatedDatePerDOI SMALLDATETIME,
MaxArticleClassificationUpdatedByPerDOI INT,
MaxArticleClassificationUpdatedDatePerDOI SMALLDATETIME,
AffiliationsSameForAllDOI BIT, NewArticleId INT
)
--***************************************
--CROSSREF_ARTICLES
--***************************************
--GET RAW DATA INTO SOURCE TABLE TEMP TABLE..
INSERT INTO #SourceTable
SELECT
DOI, FullName, LastName, FirstName, FirstInitial,
JournalId, LEFT(JournalVolume,16) AS JournalVolume,
LEFT(JournalIssue,16) AS JournalIssue,
LEFT(JournalFirstPage,16) AS JournalFirstPage,
LEFT(JournalLastPage,16) AS JournalLastPage,
ArticleTitle, PubYear, CreatedDate, UpdatedDate,
ISSN_e, ISSN_p,
ISNULL(Citations,0) AS Citations, LastCitationRefresh,
LastCitationRefreshValue, IsInSearch, BatchUpdatedDate,
LastIndexUpdate, ArticleClassificationId,
ArticleClassificationUpdatedBy,
ArticleClassificationUpdatedDate, Affiliations,
ROW_NUMBER() OVER(PARTITION BY DOI ORDER BY UpdatedDate DESC, CreatedDate ASC) AS RowNum,
NULL AS MinCreatedDatePerDOI, NULL AS MaxUpdatedDatePerDOI,
NULL AS MaxBatchUpdatedDatePerDOI,
NULL AS MaxArticleClassificationUpdatedByPerDOI,
NULL AS ArticleClassificationUpdatedDatePerDOI,
0 AS AffiliationsSameForAllDOI, NULL AS NewArticleId
FROM
CrossRef_Articles WITH (NOLOCK)
--UPDATE SOURCETABLE WITH MAX/MIN/CALCULATED VALUES PER DOI...
UPDATE S
SET MaxUpdatedDatePerDOI = T.MaxUpdatedDatePerDOI, MaxBatchUpdatedDatePerDOI = T.MaxBatchUpdatedDatePerDOI, MinCreatedDatePerDOI = T.MinCreatedDatePerDOI, MaxArticleClassificationUpdatedByPerDOI = T.MaxArticleClassificationUpdatedByPerDOI, MaxArticleClassificationUpdatedDatePerDOI = T.MaxArticleClassificationUpdatedDatePerDOI
FROM #SourceTable S
INNER JOIN (SELECT MAX(UpdatedDate) AS MaxUpdatedDatePerDOI, MIN(CreatedDate) AS MinCreatedDatePerDOI, MAX(BatchUpdatedDate) AS MaxBatchUpdatedDatePerDOI, MAX(ArticleClassificationUpdatedBy) AS MaxArticleClassificationUpdatedByPerDOI, MAX(ArticleClassificationUpdatedDate) AS MaxArticleClassificationUpdatedDatePerDOI, DOI from #SourceTable GROUP BY DOI) AS T ON S.DOI = T.DOI
UPDATE S
SET AffiliationsSameForAllDOI = 1
FROM #SourceTable S
WHERE NOT EXISTS (SELECT 1 FROM #SourceTable S2 WHERE S2.DOI = S.DOI AND S2.Affiliations <> S.Affiliations)
After
This will probably be a faster way to do the update -- hard to say without seeing the execution plan, but the original may be re-running the GROUP BY for every row.
with doigrouped AS
(
SELECT
MAX(UpdatedDate) AS MaxUpdatedDatePerDOI,
MIN(CreatedDate) AS MinCreatedDatePerDOI,
MAX(BatchUpdatedDate) AS MaxBatchUpdatedDatePerDOI,
MAX(ArticleClassificationUpdatedBy) AS MaxArticleClassificationUpdatedByPerDOI,
MAX(ArticleClassificationUpdatedDate) AS MaxArticleClassificationUpdatedDatePerDOI,
DOI
FROM #SourceTable
GROUP BY DOI
)
UPDATE S
SET MaxUpdatedDatePerDOI = T.MaxUpdatedDatePerDOI,
MaxBatchUpdatedDatePerDOI = T.MaxBatchUpdatedDatePerDOI,
MinCreatedDatePerDOI = T.MinCreatedDatePerDOI,
MaxArticleClassificationUpdatedByPerDOI = T.MaxArticleClassificationUpdatedByPerDOI,
MaxArticleClassificationUpdatedDatePerDOI = T.MaxArticleClassificationUpdatedDatePerDOI
FROM #SourceTable S
INNER JOIN doigrouped T ON S.DOI = T.DOI
If it is faster it will be a couple of orders of magnitude faster -- but that does not mean your machine will be able to process 60 million records in any period of time... if you didn't test on 100k first there is no way to know how long it will take to finish.
I suppose you can try:
Replace INSERT with SELECT INTO
You don't have any indexes on your #SourceTable anyway.
SELECT INTO is minimally logged, so you should see some speedup here.
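As a sketch of that first suggestion (column list abbreviated; the full list is as in the original INSERT):

```sql
-- Let SELECT ... INTO create and fill #SourceTable in one minimally logged pass.
SELECT DOI, FullName, LastName, FirstName, FirstInitial,
       -- ...remaining source columns as in the original INSERT...
       ROW_NUMBER() OVER (PARTITION BY DOI
                          ORDER BY UpdatedDate DESC, CreatedDate ASC) AS RowNum,
       CAST(NULL AS SMALLDATETIME) AS MinCreatedDatePerDOI,
       -- ...remaining calculated placeholders, typed with CAST...
       CAST(NULL AS INT) AS NewArticleId
INTO #SourceTable
FROM CrossRef_Articles WITH (NOLOCK);
```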
Replace UPDATE with SELECT INTO another table
Instead of updating #SourceTable you can create #SourceTable_Updates with SELECT INTO (modified Hogan query):
with doigrouped AS
(
SELECT
MAX(UpdatedDate) AS MaxUpdatedDatePerDOI,
MIN(CreatedDate) AS MinCreatedDatePerDOI,
MAX(BatchUpdatedDate) AS MaxBatchUpdatedDatePerDOI,
MAX(ArticleClassificationUpdatedBy) AS MaxArticleClassificationUpdatedByPerDOI,
MAX(ArticleClassificationUpdatedDate) AS MaxArticleClassificationUpdatedDatePerDOI,
DOI
FROM #SourceTable
GROUP BY DOI
)
SELECT
S.DOI,
MaxUpdatedDatePerDOI = T.MaxUpdatedDatePerDOI,
MaxBatchUpdatedDatePerDOI = T.MaxBatchUpdatedDatePerDOI,
MinCreatedDatePerDOI = T.MinCreatedDatePerDOI,
MaxArticleClassificationUpdatedByPerDOI = T.MaxArticleClassificationUpdatedByPerDOI,
MaxArticleClassificationUpdatedDatePerDOI = T.MaxArticleClassificationUpdatedDatePerDOI
INTO #SourceTable_Updates
FROM #SourceTable S
INNER JOIN doigrouped T ON S.DOI = T.DOI
Use JOIN-ed #SourceTable and #SourceTable_Updates
Hope this helps
Here are a couple of things that may help the performance of your insert statement:
Does the CrossRef_Articles table have a primary key? If it does, insert the primary key (be sure it is indexed) into your temp table and include only the fields you need for your calculations. Once the calculations are done, do a select and join your temp table to the original table on the Id field. It takes time to write all that data to disk.
Look at your tempdb. If you have run this query multiple times, the database or log file size may be out of control.
Check whether the join fields between the two original tables are indexed.
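On the temp-table side, one plausible index (assuming DOI remains the only join/grouping key, as in the queries above):

```sql
-- A clustered index on DOI supports both the GROUP BY and the self-joins;
-- create it after the initial load so the insert itself stays fast.
CREATE CLUSTERED INDEX IX_SourceTable_DOI ON #SourceTable (DOI);
```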

Can I use the output clause in the Update command to generate 2 rows

I can use OUTPUT to generate the before and after row values when doing an update to a table. And I can put these values into a table.
But they go into the table in 1 single row.
I need the before values as a row, and the after values as a new row.
So this snippet isn't working for me at the moment, but it shows what I am trying to do. Is there any way to do this using OUTPUT?
update dbo.table1
set employee = employee + '_updated'
OUTPUT 'd',DELETED.* INTO dbo.table2,
OUTPUT 'i',INSERTED.* INTO dbo.table2,
WHERE id = 4 OR id = 2;
This snippet below works, but only creates a single row:
update dbo.table1
set employee = employee + '_updated'
OUTPUT 'd', DELETED.*, 'i', INSERTED.* INTO dbo.table2
WHERE id = 4 OR id = 2;
I can do it using triggers, but that's not allowed in this case.
And I can do it manually (selecting what I'm going to update into table2, then doing the update...)
Any tips or hints appreciated on how to do it using just OUTPUT in the update?
Rgds, Dave
To elaborate on an answer given in the comments:
declare #TempTable table (
d_id int,
d_employee varchar(50),
d_other varchar(50),
u_id int,
u_employee varchar(50),
u_other varchar(50)
)
update Table1
set employee = employee + '_updated'
output deleted.id d_id, deleted.employee d_employee, deleted.other d_other,
inserted.id u_id, inserted.employee u_employee, inserted.other u_other
into #TempTable
where id = 4 or id = 2;
insert Table2 (change_type, employee_id, employee, other)
select
'd',
d_id,
d_employee,
d_other
from #TempTable
union all
select
'i',
u_id,
u_employee,
u_other
from #TempTable
I made some assumptions about your schema as it wasn't given, but this should get you started.

SQL Insert multiple rows to 1:n table association

I have 3 tables (simplified for the question) like the following:
Candidate
- ID (primary key auto-numbered)
- ProcessedOn
- CandidateState
CandidateData
- CandidateID
- FieldKey
- FieldValue
CandidateTransform
- ID (primary key auto-numbered)
- FieldKey
- FieldValue
CandidateTransform contains data that is used to populate CandidateData (i.e., FieldKey -> FieldKey and FieldValue -> FieldValue). FieldValue in CandidateTransform contains a string of SQL that selects the FieldValue from another database/table (personnel information) like so:
(SELECT firstName FROM Personnel WHERE id = @CandidateID)
I was able to do this previously as we would only process one candidate at a time and we would create a stored procedure with the INSERT statement like this:
DECLARE @NewCandidate TABLE (id bigint);
INSERT CloudRecruiting.Candidate (CandidateState, ActHdrKey, ActionProcessedOn, JobPostingID, Deleted)
OUTPUT INSERTED.CandidateKey INTO @NewCandidate
VALUES (0, 0, GETDATE(), @PlacementID, 0);
DECLARE @ID bigint;
SET @ID = (SELECT id FROM @NewCandidate);
DECLARE @CandidateID bigint;
SET @CandidateID = (SELECT candidate FROM Personnel WHERE id = @PlacementID);
INSERT INTO CandidateData (CandidateKey, FieldKey, FieldValue)
VALUES (@ID, 1128, (SELECT ISNULL(LEFT(CONVERT(VARCHAR, dateOfBirth, 120), 10), '') FROM Personnel WHERE id = @CandidateID))
, (@ID, 1159, (SELECT email FROM Personnel WHERE id = @CandidateID))
, (@ID, 1284, (SELECT ssn FROM Personnel WHERE id = @CandidateID))
, (@ID, 1303, CAST((SELECT payRate FROM Personnel WHERE id = @PlacementID) as nvarchar(MAX)))
, (@ID, 1169, (SELECT firstName FROM Personnel WHERE id = @CandidateID))
, (@ID, 1229, (SELECT middleName FROM Personnel WHERE id = @CandidateID))
, (@ID, 1219, (SELECT lastName FROM Personnel WHERE id = @CandidateID))
(etc.)
Now we want to make the process more robust by storing the SELECT statements in a table (CandidateTransform) that the customer can modify, and to process multiple candidates in one batch via a stored procedure like this (we already have SQL that captures the CandidateIDs we need to process and stores them in a temp table):
for each (candidateID in tempTable)
create entry in Candidate
capture new id number from Candidate
for each (sql in CandidateTransform)
execute sql script
store value in CandidateData with the new id, FieldKey and FieldValue
end
end
I have not used the WHILE feature of SQL and am not sure if this is the best approach, or if there is some other SQL I can use to do what I need. Any solution needs to support SQL Server 2008 or later.
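One SQL Server 2008-compatible sketch of the pseudocode above, using cursors plus sp_executesql. Table and column names here are assumptions based on the simplified schema in the question, and each stored FieldValue fragment is assumed to be a scalar SELECT parameterized on @CandidateID:

```sql
-- Hedged sketch: loop over captured candidate ids, create the Candidate row,
-- then run each stored SELECT fragment via dynamic SQL to build CandidateData.
DECLARE @PlacementID bigint, @NewId bigint, @FieldKey int,
        @SelectSql nvarchar(max), @Value nvarchar(max);

DECLARE cand_cur CURSOR LOCAL FAST_FORWARD FOR
    SELECT CandidateID FROM #tempTable;          -- your existing capture table
OPEN cand_cur;
FETCH NEXT FROM cand_cur INTO @PlacementID;
WHILE @@FETCH_STATUS = 0
BEGIN
    INSERT Candidate (CandidateState, ProcessedOn) VALUES (0, GETDATE());
    SET @NewId = SCOPE_IDENTITY();               -- new auto-numbered id

    DECLARE fld_cur CURSOR LOCAL FAST_FORWARD FOR
        SELECT FieldKey, FieldValue FROM CandidateTransform;
    OPEN fld_cur;
    FETCH NEXT FROM fld_cur INTO @FieldKey, @SelectSql;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- Wrap the customer-supplied scalar SELECT and bind @CandidateID safely.
        SET @SelectSql = N'SELECT @v = (' + @SelectSql + N')';
        EXEC sp_executesql @SelectSql,
             N'@CandidateID bigint, @v nvarchar(max) OUTPUT',
             @CandidateID = @PlacementID, @v = @Value OUTPUT;

        INSERT CandidateData (CandidateID, FieldKey, FieldValue)
        VALUES (@NewId, @FieldKey, @Value);
        FETCH NEXT FROM fld_cur INTO @FieldKey, @SelectSql;
    END
    CLOSE fld_cur; DEALLOCATE fld_cur;
    FETCH NEXT FROM cand_cur INTO @PlacementID;
END
CLOSE cand_cur; DEALLOCATE cand_cur;
```

Since the fragments come from a customer-editable table, this is effectively executing arbitrary SQL; restricting what can be stored there (or who can edit it) is worth considering alongside any cursor/WHILE mechanics.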