SQL Merge n rows from source table to two targets - sql

We have an old database that we maintain, and a new one that we have started using. We need to periodically transfer data from the old db to the new one. At the moment, we need to transfer (or merge) data from one table in the old database, Student, to two tables (i.e. two targets) in the new one, Person and Student. The catch is that each row from the old source table has to be split between the two tables in the new database. For example (just for the sake of this post),
Old table 'Student'
------------------------------
IdNo | FirstName | LastName |
578 | John | Doe |
645 | Sara | Doe |
New table 'Person'
-----------
Id | IdNo |
11 | 578 |
23 | 645 |
New table 'Student'
--------------------------------------
Id | PersonId | FirstName | LastName |
101| 11 | John | Doe |
102| 23 | Sara | Doe |
And the procedure should take a parameter of the number of rows to merge.
How can this be accomplished?
Update
Perhaps it would be easier for you guys to know what I mean by pseudo code:
MERGE [NewDB].[dbo].[Person] p, [NewDB].[dbo].[Student] ns -- 2 targets, this does not work
USING [OldDB].[dbo].[student] os -- source table, old student
ON p.IdNo = os.IdNo
WHEN MATCHED THEN -- Update existing rows
UPDATE p
SET p.SomeColumn1 = os.SomeColumn1 -- works. os (old student) is known here
UPDATE ns
SET ns.SomeColumn2 = os.SomeColumn2 -- Does not work. os is not known here
WHEN NOT MATCHED BY TARGET THEN -- Add new rows
INSERT INTO p (IdNo, SomeColumn1)
VALUES (os.IdNo, os.SomeColumn1); -- os (old student) is known here
INSERT INTO ns (SomeColumn2)
VALUES (os.SomeColumn2); -- Does not work. os is not known here
I hope that makes it somewhat clearer.

May we assume the reason you want to do this in one statement instead of two is that one of the fields in the first table you are inserting into is an identity field (Id in the Person table in your example) that needs to be inserted into the second table?
If so, add an OUTPUT clause in the first merge statement so that you have the relationship and fields you require for the second merge statement.
-- table variables stand in for the real tables
declare @OldStudent table (IdNo int, FirstName varchar(30), LastName varchar(30))
declare @Person table (Id int identity, IdNo int)
declare @NewStudent table (Id int identity, PersonId int, FirstName varchar(30), LastName varchar(30))
insert @OldStudent (IdNo, FirstName, LastName)
select 578, 'John', 'Doe'
union all select 645, 'Sara', 'Doe'
-- receives the Person Id generated (or matched) for each IdNo
declare @output table ([Action] varchar(20), PersonId int, IdNo int)
-- first merge: old student -> Person, capturing Id/IdNo pairs
MERGE @Person p
USING @OldStudent os
ON p.IdNo = os.IdNo
WHEN MATCHED THEN -- Update existing rows
UPDATE SET IdNo = os.IdNo
WHEN NOT MATCHED BY TARGET THEN -- Add new rows
INSERT (IdNo) VALUES (os.IdNo)
OUTPUT $action, inserted.Id, inserted.IdNo into @output;
-- second merge: join the captured ids back to the source -> new Student
WITH src AS
(
select
o.IdNo, o.PersonId, os.FirstName, os.LastName
from
@output o
inner join @OldStudent os on os.IdNo = o.IdNo
)
MERGE INTO @NewStudent as ns
USING src
ON src.PersonID = ns.PersonID
WHEN MATCHED THEN
UPDATE SET FirstName = src.FirstName, LastName = src.LastName
WHEN NOT MATCHED BY TARGET THEN -- Add new rows
INSERT (PersonID, FirstName, LastName) VALUES (src.PersonID, src.FirstName, src.LastName);
select * from @Person
select * from @NewStudent
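The question also asks for a parameter that caps the number of rows transferred per run, which the answer above does not cover. One possible sketch, assuming only rows not yet present in Person need to be moved (updates of existing rows are left out), limits the source with TOP. The variable @RowsToMerge and the third sample row are made up for illustration; the table variables again stand in for the real tables.

```sql
-- Stand-ins for the real tables (shapes taken from the example in the question)
DECLARE @OldStudent TABLE (IdNo INT, FirstName VARCHAR(30), LastName VARCHAR(30));
DECLARE @Person     TABLE (Id INT IDENTITY, IdNo INT);
DECLARE @NewStudent TABLE (Id INT IDENTITY, PersonId INT, FirstName VARCHAR(30), LastName VARCHAR(30));

INSERT @OldStudent (IdNo, FirstName, LastName)
VALUES (578, 'John', 'Doe'), (645, 'Sara', 'Doe'), (702, 'Alex', 'Roe'); -- third row is invented sample data

DECLARE @RowsToMerge INT = 2;  -- hypothetical procedure parameter
DECLARE @output TABLE (PersonId INT, IdNo INT);

-- Step 1: insert the next @RowsToMerge students that have no Person row yet
-- and capture the generated identity values.
INSERT @Person (IdNo)
OUTPUT inserted.Id, inserted.IdNo INTO @output (PersonId, IdNo)
SELECT TOP (@RowsToMerge) os.IdNo
FROM @OldStudent AS os
WHERE NOT EXISTS (SELECT 1 FROM @Person AS p WHERE p.IdNo = os.IdNo)
ORDER BY os.IdNo;

-- Step 2: use the captured ids to fill the second target.
INSERT @NewStudent (PersonId, FirstName, LastName)
SELECT o.PersonId, os.FirstName, os.LastName
FROM @output AS o
INNER JOIN @OldStudent AS os ON os.IdNo = o.IdNo;

SELECT * FROM @Person;      -- 2 rows after the first run
SELECT * FROM @NewStudent;  -- 2 rows after the first run
```

Running the two steps again would pick up the remaining row, because the NOT EXISTS filter skips students already transferred. In a real procedure both steps would probably belong in one transaction.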

Related

How to limit rows in the table ? How to insert row on the top? SQL Server

Firstly - I want to limit the number of rows inserted into the table, i.e. I want a maximum of 100 rows in my table (I don't want a version that runs count(*) and deletes if > 100...)
Secondly - I would like something like pushing (on insert) into the table, i.e.:
ID | Name | City
0 | Mike | New York
ID | Name | City
1 | David | Pekin -> as insert on the top!
0 | Mike | New York
ID | Name | City
2 | Marcus | Warsaw -> as next
1 | David | Pekin
0 | Mike | New York
Best regards
David
Both of your conditions can be met for a 100 row table, or even a 1,000 row table. But the approach in the example below is not recommended for a production table.
RE: "have max of 100 rows"
Create table, then add an INSTEAD OF INSERT trigger that checks size of table before allowing an insert.
RE: "something like pushing (on insert) into the table."
Create the table with a clustered primary key (recommended) but with a DESC key (not recommended for anything but small, seldom-used tables). A clustered index stores rows in key order. Therefore, adding the next larger number (as in an IDENTITY sequence) pushes the new entry onto the top of the index. For a 100 row table, a SELECT without ORDER BY will usually return rows in that clustered-index order, but SQL Server does not guarantee any order unless the query has an ORDER BY clause.
Code:
NOTE: the code below limits max_size to 3 for demonstration. Set @max_size = 100 for your example case.
-->-- create table with descending order for primary key. will act like a push down stack
IF OBJECT_ID('dbo.so_limit_size') IS NOT NULL DROP TABLE dbo.so_limit_size
CREATE TABLE dbo.so_limit_size (
id TINYINT IDENTITY(1,1)
, name VARCHAR(100)
, city VARCHAR(100)
, CONSTRAINT pk_so_limit_size PRIMARY KEY CLUSTERED (id DESC) -- DESC makes it work like a push down stack
)
INSERT INTO dbo.so_limit_size (name, city)
VALUES
( 'Mike', 'New York City' )
, ( 'David', 'Pekin' )
, ( 'Marcus', 'Warsaw' )
SELECT * FROM dbo.so_limit_size -- 3 rows
GO
CREATE TRIGGER dbo.limit_size ON dbo.so_limit_size
INSTEAD OF INSERT AS
SET NOCOUNT ON
-- purpose: limit size of table to @max_size. an insert batch of 1 or more rows that would exceed @max_size is not allowed
DECLARE @max_size TINYINT = 3 -- size limit of table
DECLARE @existing_count TINYINT = (SELECT COUNT(*) FROM dbo.so_limit_size)
, @insert_count TINYINT = (SELECT COUNT(*) FROM Inserted )
PRINT 'existing_count = ' + CAST(@existing_count AS VARCHAR(3)) + ' new insert count = ' + CAST(@insert_count AS VARCHAR(3))
IF @existing_count + @insert_count > @max_size -- compare against the limit, not a hard-coded 3
BEGIN
PRINT 'insert will cause table count to exceed max_size. insert aborted. max_size = ' + CAST(@max_size AS VARCHAR(3))
END
ELSE
BEGIN
PRINT 'table count will not exceed max_size. insert allowed. max_size = ' + CAST(@max_size AS VARCHAR(3)) --<<-- demonstration, print is not a recommended practice for a trigger
INSERT INTO dbo.so_limit_size (name, city)
SELECT name, city FROM inserted
END
GO
INSERT INTO dbo.so_limit_size (name, city)
VALUES
( 'Zorba', 'Athens' ) -- will not be allowed if @max_size = 3
SELECT * FROM dbo.so_limit_size
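Because row order is only guaranteed by ORDER BY, the "newest on top" listing from the question can also be requested explicitly at read time, whatever the physical layout. A minimal sketch, assuming a table variable named @stack (a made-up name) with the columns from the question and an identity that starts at 0 like the asker's IDs:

```sql
-- @stack is a hypothetical stand-in for the real table
DECLARE @stack TABLE (
    id   INT IDENTITY(0,1) PRIMARY KEY,
    name VARCHAR(100),
    city VARCHAR(100)
);

-- one row per statement so identity values follow insertion order
INSERT INTO @stack (name, city) VALUES ('Mike', 'New York');
INSERT INTO @stack (name, city) VALUES ('David', 'Pekin');
INSERT INTO @stack (name, city) VALUES ('Marcus', 'Warsaw');

-- newest row first: 2 Marcus, 1 David, 0 Mike
SELECT id, name, city
FROM @stack
ORDER BY id DESC;
```

With this, the descending clustered key becomes a storage detail rather than something the result order depends on.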
@Hans Kesting noted that in his interpretation the 100 row limit "should push out the oldest one(s)". That is a valid interpretation, so I created an answer that does that. The following is identical to my previous answer except that the trigger now deletes the oldest rows before the insert, so the table's maximum size is maintained. Please see my other answer for explanations of the clustered index with descending key and the INSTEAD OF trigger.
CODE:
-->-- create table with descending order for primary key. will act like a push down stack
IF OBJECT_ID('dbo.so_limit_size') IS NOT NULL DROP TABLE dbo.so_limit_size
CREATE TABLE dbo.so_limit_size (
id TINYINT IDENTITY(1,1)
, name VARCHAR(100)
, city VARCHAR(100)
, CONSTRAINT pk_so_limit_size PRIMARY KEY CLUSTERED (id DESC) -- DESC makes it work like a push down stack
)
INSERT INTO dbo.so_limit_size (name, city)
VALUES
( 'Mike', 'New York City' )
, ( 'David', 'Pekin' )
, ( 'Marcus', 'Warsaw' )
SELECT * FROM dbo.so_limit_size -- 3 rows
GO
CREATE TRIGGER dbo.limit_size ON dbo.so_limit_size
INSTEAD OF INSERT AS
SET NOCOUNT ON
-- purpose: limit size of table to @max_size. if an insert batch would exceed @max_size, the oldest rows are deleted first
-- note: a single batch larger than @max_size is not handled here
DECLARE @max_size TINYINT = 3 -- size limit of table
DECLARE @existing_count TINYINT = (SELECT COUNT(*) FROM dbo.so_limit_size)
, @insert_count TINYINT = (SELECT COUNT(*) FROM Inserted )
PRINT 'existing_count = ' + CAST(@existing_count AS VARCHAR(3)) + ' new insert count = ' + CAST(@insert_count AS VARCHAR(3))
IF @existing_count + @insert_count > @max_size -- compare against the limit, not a hard-coded 3
BEGIN
-- for a FIFO queue, delete oldest rows before insert if @max_size would be exceeded
PRINT 'insert will cause table count to exceed max_size. oldest row(s) deleted before insert. max_size = ' + CAST(@max_size AS VARCHAR(3))
-- delete only as many rows as needed to make room
DELETE FROM dbo.so_limit_size
WHERE Id IN ( SELECT TOP (@existing_count + @insert_count - @max_size) Id FROM dbo.so_limit_size ORDER BY Id ASC )
INSERT INTO dbo.so_limit_size (name, city)
SELECT name, city FROM inserted
END
ELSE
BEGIN
PRINT 'table count will not exceed max_size. insert allowed. max_size = ' + CAST(@max_size AS VARCHAR(3)) --<<-- demonstration, print is not a recommended practice for a trigger
INSERT INTO dbo.so_limit_size (name, city)
SELECT name, city FROM inserted
END
GO
INSERT INTO dbo.so_limit_size (name, city)
VALUES
( 'Zorba', 'Athens' ) -- with @max_size = 3, the two oldest rows are pushed out
, ('Sriram', 'Chennai')
SELECT * FROM dbo.so_limit_size

How do I flag the unselected records in a top selection

I have a requirement where I need to choose only a set number of records from a group. However, I also need to flag the records that were not chosen in the event that they need to be referred to at a later date.
I have over 80K records in Segment 1. The requirement is to select 50000 records
I've tried this:
UPDATE mytable
SET [SuppressionReason] = 'REC LIMIT REACHED - S1'
WHERE
[ID] NOT IN
(
SELECT TOP 50000 [ID] FROM mytable
WHERE segment = '1'
);
However, this results in 0 records getting labeled in the SuppressionReason field as 'REC LIMIT REACHED - S1'. What am I missing or doing wrong?
Based on testing with the following code, are you absolutely certain that you have more than 50,000 records?
DROP TABLE IF EXISTS #TEMP
CREATE TABLE #TEMP
(
ID INT IDENTITY(1,1),
FIRSTNAME VARCHAR(10),
LASTNAME VARCHAR(10),
SEGMENT INT,
SUPPRESSION VARCHAR(10)
)
INSERT INTO #TEMP
(FIRSTNAME, LASTNAME, SEGMENT)
VALUES
('JOHN', 'KRAMER',1),
('MATT','GEORGE',1),
('PHILIP','MCCAIN',1),
('ANDREW','THOMAS',1)
UPDATE #TEMP
SET SUPPRESSION = 'YEP'
WHERE ID NOT IN
(SELECT TOP(2) ID FROM #TEMP WHERE SEGMENT = 1)
SELECT * FROM #TEMP
This produces the following output, which I suspect is exactly what you are expecting to get.
1 JOHN KRAMER 1 NULL
2 MATT GEORGE 1 NULL
3 PHILIP MCCAIN 1 YEP
4 ANDREW THOMAS 1 YEP
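Two things in the original UPDATE may be worth a second look: TOP without ORDER BY picks an arbitrary set of rows, and the outer WHERE has no segment filter, so rows from other segments would be flagged too. A hedged alternative, assuming ID order is an acceptable way to decide which rows are kept, numbers the rows of the segment and flags everything past the limit. The table #demo, the variable @Limit and the sample rows are made up for illustration.

```sql
DROP TABLE IF EXISTS #demo;
CREATE TABLE #demo (
    ID INT IDENTITY(1,1) PRIMARY KEY,
    segment VARCHAR(10),
    SuppressionReason VARCHAR(50) NULL
);
-- four rows in segment 1, one row in segment 2
INSERT INTO #demo (segment) VALUES ('1'), ('1'), ('1'), ('1'), ('2');

DECLARE @Limit INT = 2;  -- would be 50000 in the question

-- Number the rows of segment 1 in a fixed order, then flag everything past the limit.
WITH ranked AS (
    SELECT SuppressionReason,
           ROW_NUMBER() OVER (ORDER BY ID) AS rn
    FROM #demo
    WHERE segment = '1'
)
UPDATE ranked
SET SuppressionReason = 'REC LIMIT REACHED - S1'
WHERE rn > @Limit;

-- IDs 3 and 4 are flagged; IDs 1, 2 and the segment 2 row stay NULL
SELECT ID, segment, SuppressionReason FROM #demo ORDER BY ID;
```

Updating through the CTE touches only the segment 1 rows, so no NOT IN subquery is needed.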

Can I use OUTPUT INTO to add data to a relational table with additional values?

I have two tables. One holds common data for articles, and the other holds translations for text. Something like this:
Articles Table
id | key | date
Translations Table
id | article_key | lang | title | content
key is a string and is the primary key.
article_key is a foreign key relating it to articles on the key column.
When I add a new row to the Articles, I'd like to be able to use the key that was just inserted and add a new row to the Translations Table.
I've read about OUTPUT INTO but it doesn't seem like I can add other values to the Translations table. Also I get an error about it being on either side of a relationship.
Is my only course of action to INSERT into Articles followed by an INSERT with a SELECT subquery to get the key?
Edit: Expected output would be something like:
Articles
id | key | date
---------------
1 | somekey | 2018-05-31
Article Translations
id | article_key | lang | title | content
-----------------------------------------
1 | somekey | en | lorem | ipsum
Well this could work based on your description:
SET NOCOUNT ON;
-- table variables stand in for the real tables
DECLARE @Articles TABLE (id INT NOT NULL
, [key] VARCHAR(50) NOT NULL
, [date] DATE NOT NULL);
DECLARE @ArticleTranslations TABLE (id INT NOT NULL
, article_key VARCHAR(50) NOT NULL
, lang VARCHAR(50) NOT NULL
, title VARCHAR(50) NOT NULL
, content VARCHAR(50) NOT NULL);
INSERT @Articles (id, [key], [date]) -- This is insert into @Articles
OUTPUT INSERTED.id, INSERTED.[key], 'en', 'lorem', 'ipsum' -- This is insert into @ArticleTranslations
INTO @ArticleTranslations (id, article_key, lang, title, content) -- This is insert into @ArticleTranslations
VALUES (1, 'somekey', GETDATE()); -- This is insert into @Articles
SELECT *
FROM @Articles;
SELECT *
FROM @ArticleTranslations;
Try this out Stack Exchange: https://data.stackexchange.com/stackoverflow/query/857925
It may not be as simple as this in your real setup, so let me know whether this works or not.
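The error about "either side of a relationship" mentioned in the question is most likely the documented restriction that the target of OUTPUT ... INTO cannot take part in a FOREIGN KEY constraint. A possible workaround, sketched here under that assumption, is to capture the new keys in a table variable and insert the translations in a second statement. The names dbo.demo_articles, dbo.demo_translations and @new_keys are hypothetical.

```sql
-- Hypothetical tables with a real foreign key between them
CREATE TABLE dbo.demo_articles (
    [key]  VARCHAR(50) NOT NULL PRIMARY KEY,
    [date] DATE NOT NULL
);
CREATE TABLE dbo.demo_translations (
    id          INT IDENTITY(1,1) PRIMARY KEY,
    article_key VARCHAR(50) NOT NULL REFERENCES dbo.demo_articles ([key]),
    lang        VARCHAR(50) NOT NULL,
    title       VARCHAR(50) NOT NULL,
    content     VARCHAR(50) NOT NULL
);

-- A table variable has no foreign keys, so it is a valid OUTPUT INTO target.
DECLARE @new_keys TABLE (article_key VARCHAR(50) NOT NULL);

INSERT dbo.demo_articles ([key], [date])
OUTPUT inserted.[key] INTO @new_keys (article_key)
VALUES ('somekey', GETDATE());

-- Second statement: add the extra translation values alongside the captured key.
INSERT dbo.demo_translations (article_key, lang, title, content)
SELECT article_key, 'en', 'lorem', 'ipsum'
FROM @new_keys;

SELECT * FROM dbo.demo_articles;
SELECT * FROM dbo.demo_translations;

-- clean up the demo tables (child table first because of the foreign key)
DROP TABLE dbo.demo_translations;
DROP TABLE dbo.demo_articles;
```

Wrapping both inserts in one transaction would keep the two tables consistent if the second insert fails.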

Inserting an auto generated value into a column with specific pattern

I have a table named tblSample which has columns ID, PID etc. I want to auto generate those two columns with a specific pattern.
For example:
ID PID
------ ------
ABC001 PAB001
ABC002 PAB002
ABC003 PAB003
ABC004 PAB004
| |
| |
ABC999 PAB999
As you can see, the pattern 'ABC' in ID and 'PAB' in PID is the same. How can I insert those records into a table automatically and the range between those three digits after 'ABC' or 'PAB' is 001-999?
My suggestion is to create a table structure like the one below, with an identity column testID and the patterned column computed from it (the example shows ID; PID would follow the same pattern):
CREATE TABLE #tmpOne(testID INT IDENTITY (1,1),
ID AS ('ABC'+ (CASE WHEN len(testID) <=3 THEN CAST(RIGHT(0.001*testID, 3) AS VARCHAR) ELSE CAST(testID AS VARCHAR) END)),
Ename VARCHAR(20))
INSERT INTO #tmpOne(Ename)
SELECT 'Test'
SELECT * FROM #tmpOne
CREATE TABLE #tt(ID VARCHAR(100),PID VARCHAR(100))
GO
INSERT INTO #tt(ID,PID)
SELECT 'ABC'+RIGHT('000'+LTRIM(a.ID),3),'PAB'+RIGHT('000'+LTRIM(a.ID),3) FROM (
SELECT ISNULL(MAX(CASE WHEN SUBSTRING(t.id,4,LEN(ID))> SUBSTRING(t.id,4,LEN(PID)) THEN SUBSTRING(t.id,4,LEN(ID)) ELSE SUBSTRING(t.id,4,LEN(PID)) END )+1,1) AS id
FROM #tt AS t
) AS a
GO 999
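A more compact variant of the computed-column idea, offered only as a sketch: zero-pad the identity value with RIGHT('000' + ...) and derive both ID and PID from it. The column name seq and the CHECK constraint that keeps values inside 001-999 are assumptions added for illustration.

```sql
DROP TABLE IF EXISTS #tblSample;
CREATE TABLE #tblSample (
    seq   INT IDENTITY(1,1) NOT NULL CHECK (seq BETWEEN 1 AND 999), -- hypothetical helper column
    ID    AS ('ABC' + RIGHT('000' + CAST(seq AS VARCHAR(3)), 3)),   -- ABC001 .. ABC999
    PID   AS ('PAB' + RIGHT('000' + CAST(seq AS VARCHAR(3)), 3)),   -- PAB001 .. PAB999
    Ename VARCHAR(20)
);

INSERT INTO #tblSample (Ename) VALUES ('Test');

-- first row: ABC001, PAB001, Test
SELECT ID, PID, Ename FROM #tblSample;
```

Once the identity passes 999, the CHECK constraint makes further inserts fail instead of producing values outside the pattern.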

Splitting sql pipe separated

I have tables as follows:
Discese
ID | DisceseNAme
1 | Heart
2 | Lungs
3 | ENT
Registration
PatienID | NAME | Discease
1 | abc | 1
2 | asa | 2|3
3 | asd | 1|2|3
I have a function to split |-separated data. Now I want result as:
PatientID | Name | DisceseNAme
1 | abc | heart
2 |asa | Lungs,ENT
3 |asd | heart,Lungs,ENT
My split function is
ALTER FUNCTION [dbo].[fnSplit](
@sInputList VARCHAR(8000) -- List of delimited items
, @sDelimiter VARCHAR(8000) = '|' -- delimiter that separates items
) RETURNS @List TABLE (item VARCHAR(8000))
BEGIN
DECLARE @sItem VARCHAR(8000)
WHILE CHARINDEX(@sDelimiter,@sInputList,0) <> 0
BEGIN
SELECT
@sItem=RTRIM(LTRIM(SUBSTRING(@sInputList,1,CHARINDEX(@sDelimiter,@sInputList,0)-1))),
@sInputList=RTRIM(LTRIM(SUBSTRING(@sInputList,CHARINDEX(@sDelimiter,@sInputList,0)+LEN(@sDelimiter),LEN(@sInputList))))
IF LEN(@sItem) > 0
INSERT INTO @List SELECT @sItem
END
IF LEN(@sInputList) > 0
INSERT INTO @List SELECT @sInputList -- Put the last item in
RETURN
END
I am not sure how I can get that result, though.
As already mentioned in the comments, it is better to normalize your table structure. This means you should not store a patient's diseases in one VARCHAR column with disease IDs separated by some character. Instead, you should store each of a patient's diseases in a separate row.
If you keep using the setup you have now, your queries will become really cumbersome and performance will suffer. Also, you will not be able to rely on foreign keys for database consistency.
I've written this example script, which ends by selecting the output you require. The example uses temporary tables. If you choose to use this way of working (and you should), just use this setup with regular tables (i.e. names not starting with #).
The tables:
#disease: Defines diseases
#patients: Defines patients
#registration: Defines patients' diseases; foreign keys to #disease and #patients for data consistency (make sure the patients and diseases actually exist in the database)
If you're wondering how the FOR XML PATH('') construct in the final query results in a |-separated VARCHAR, read this answer I gave a while ago on this subject.
-- Diseases
CREATE TABLE #disease(
ID INT,
DiseaseName VARCHAR(256),
CONSTRAINT PK_disease PRIMARY KEY(ID)
);
INSERT INTO #disease(ID,DiseaseName)VALUES
(1,'Heart'),(2,'Lungs'),(3,'ENT');
-- Patients
CREATE TABLE #patients(
PatientID INT,
Name VARCHAR(256),
CONSTRAINT PK_patients PRIMARY KEY(PatientID)
);
INSERT INTO #patients(PatientID,Name)VALUES
(1,'abc'),(2,'asa'),(3,'asd'),(4,'zldkzld');
-- Registration for patient's diseases
CREATE TABLE #registration(
PatientID INT,
Disease INT,
CONSTRAINT FK_registration_to_patient FOREIGN KEY(PatientID) REFERENCES #patients(PatientID),
CONSTRAINT FK_registration_to_disease FOREIGN KEY(Disease) REFERENCES #disease(ID),
);
INSERT INTO #registration(PatientID,Disease)VALUES
(1,1), -- patient with ID 1 has one disease: Heart
(2,2),(2,3), -- patient with ID 2 has two diseases: Lungs and ENT
(3,1),(3,2),(3,3); -- patient with ID 3 has three diseases: Heart, Lungs and ENT
-- Select diseases for patients in one |-separated column
SELECT
p.PatientID,p.Name,Diseases=STUFF(dn.diseases,1,1,'')
FROM
#patients AS p
CROSS APPLY ( -- construct a |-separated column with all diseases for the client
SELECT
'|'+d.DiseaseName
FROM
#registration AS r
INNER JOIN #disease AS d ON
d.ID=r.Disease
WHERE
r.PatientID=p.PatientID
FOR
XML PATH('')
) AS dn(diseases)
WHERE
EXISTS(SELECT 1 FROM #registration AS r WHERE r.PatientID=p.PatientID)
ORDER BY
p.PatientID;
DROP TABLE #disease;DROP TABLE #registration;DROP TABLE #patients;
Results:
+-----------+------+-----------------+
| PatientID | Name | Diseases |
+-----------+------+-----------------+
| 1 | abc | Heart |
| 2 | asa | Lungs|ENT |
| 3 | asd | Heart|Lungs|ENT |
+-----------+------+-----------------+
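If the pipe-separated column has to stay for now, the comma-separated result from the question can still be produced by splitting the column per row and re-aggregating. The sketch below keeps the asker's table and column spellings but uses temp tables and the built-in STRING_SPLIT (available from SQL Server 2016 with compatibility level 130) so it runs on its own; this is an assumption about the environment, and the asker's dbo.fnSplit, which returns a column named item, could be substituted.

```sql
DROP TABLE IF EXISTS #Registration;
DROP TABLE IF EXISTS #Discese;
CREATE TABLE #Discese (ID INT PRIMARY KEY, DisceseNAme VARCHAR(50));
CREATE TABLE #Registration (PatienID INT PRIMARY KEY, NAME VARCHAR(50), Discease VARCHAR(100));

INSERT INTO #Discese (ID, DisceseNAme) VALUES (1, 'Heart'), (2, 'Lungs'), (3, 'ENT');
INSERT INTO #Registration (PatienID, NAME, Discease)
VALUES (1, 'abc', '1'), (2, 'asa', '2|3'), (3, 'asd', '1|2|3');

SELECT r.PatienID,
       r.NAME,
       -- split the id list, look up each name, glue the names back together with commas
       DisceseNAme = STUFF((
           SELECT ',' + d.DisceseNAme
           FROM STRING_SPLIT(r.Discease, '|') AS s
           INNER JOIN #Discese AS d ON d.ID = CAST(s.value AS INT)
           ORDER BY d.ID
           FOR XML PATH('')
       ), 1, 1, '')
FROM #Registration AS r
ORDER BY r.PatienID;
-- 1 abc Heart / 2 asa Lungs,ENT / 3 asd Heart,Lungs,ENT
```

The normalized layout shown above remains the better long-term fix; this query only avoids changing the schema right away.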