Each Column in Separate Row - sql

How can I display each column in a separate row and add an additional field at the end?
For example I have this result:
ID ArticleName Brand1 Brand2 Brand3
== =========== ======== ======== ========
1 TestArticle 10001 20002 30003
I want to achieve this:
ID ArticleName BrandNo BrandName
== =========== ======= =========
1 TestArticle 10001 if column name = Brand1 Then Nike
1 TestArticle 20002 if column name = Brand2 Then Adidas
1 TestArticle 30003 if column name = Brand3 Then Mercedes
I can show each column in a separate row, but how can I add an additional column, BrandName, to the end of the result?
Here is what I've done:
DECLARE @temTable TABLE
(
    Id INT,
    ArticleName VARCHAR(20),
    Brand1 VARCHAR(20),
    Brand2 VARCHAR(20),
    Brand3 VARCHAR(20)
);
INSERT INTO @temTable
(
    Id,
    ArticleName,
    Brand1,
    Brand2,
    Brand3
)
VALUES
(1, 'TestArticle', '10001', '20002', '30003');
SELECT Id,
       ArticleName,
       b.*
FROM @temTable a
    CROSS APPLY
(
    VALUES
        (Brand1),
        (Brand2),
        (Brand3)
) b (Brand)
WHERE b.Brand IS NOT NULL;

You could use CROSS APPLY and pair each column with its brand name:
SELECT Id, ArticleName, Br AS BrandNo, Val AS BrandName
FROM @temTable TT
CROSS APPLY (
    VALUES
        (Brand1, 'Nike'),
        (Brand2, 'Adidas'),
        (Brand3, 'Mercedes')
) T(Br, Val);

I assume the brand is stored in another table, so you just need to add another column in your VALUES constructor and then join to the Brand table:
SELECT Id,
       ArticleName,
       V.Brand AS BrandNo,
       B.BrandName -- assumes dbo.Brand has a BrandName column
FROM @temTable a
    CROSS APPLY (VALUES (1, Brand1),
                        (2, Brand2),
                        (3, Brand3)) V (BrandID, Brand)
    JOIN dbo.Brand B ON V.BrandID = B.BrandID
WHERE V.Brand IS NOT NULL;

You can use UNPIVOT to achieve this. To map column names to brand names you can use either a CASE expression or another table variable. I would prefer a table variable with a join, as it makes adding a new column a bit easier.
DECLARE @d TABLE (ColNames VARCHAR(128), BrandName VARCHAR(100));
INSERT INTO @d VALUES ('Brand1', 'Nike'), ('Brand2', 'Adidas'), ('Brand3', 'Mercedes');
SELECT up.Id
     , up.ArticleName
     , up.BrandNo
     , d.BrandName
FROM @temTable
UNPIVOT (BrandNo FOR ColNames IN (Brand1, Brand2, Brand3)) up
INNER JOIN @d d ON d.ColNames = up.ColNames;
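For engines that have neither CROSS APPLY nor UNPIVOT, the same reshaping can be written with UNION ALL. The sketch below is only an illustration: it runs the query through Python's built-in sqlite3 module so it can be executed anywhere, and the table name `articles` and the in-memory database are assumptions, not part of the question.

```python
import sqlite3

# In-memory SQLite database standing in for the table variable (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (Id INTEGER, ArticleName TEXT,"
    " Brand1 TEXT, Brand2 TEXT, Brand3 TEXT)"
)
conn.execute(
    "INSERT INTO articles VALUES (1, 'TestArticle', '10001', '20002', '30003')"
)

# One SELECT per brand column; each branch carries its own fixed BrandName.
UNPIVOT_SQL = """
SELECT Id, ArticleName, Brand1 AS BrandNo, 'Nike' AS BrandName
FROM articles WHERE Brand1 IS NOT NULL
UNION ALL
SELECT Id, ArticleName, Brand2, 'Adidas'
FROM articles WHERE Brand2 IS NOT NULL
UNION ALL
SELECT Id, ArticleName, Brand3, 'Mercedes'
FROM articles WHERE Brand3 IS NOT NULL
ORDER BY Id, BrandNo
"""

rows = conn.execute(UNPIVOT_SQL).fetchall()
for row in rows:
    print(row)
```

Each branch reads the table once, so on a wide table the CROSS APPLY or UNPIVOT forms above are likely the better choice where they are available.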

Related

Compare two rows (both with different ID) & check if their column values are exactly the same. All rows & columns are in the same table

I have a table named "ROSTER" and in this table I have 22 columns.
I want to query and compare any 2 rows of that table to check whether every column has exactly the same value in both rows. The ID column always differs between rows, so I will not include it in the comparison; I will just use it to refer to the rows being compared.
If all column values are the same: either display nothing (I prefer this one) or just return the 2 rows as they are.
If some column values are not the same: either display those column names only, or display both the column name and its value (I prefer this one).
Example:
ROSTER Table:
ID NAME TIME
== ==== ====
1  N1   0900
2  N1   0801
Output:
ID TIME
== ====
1  0900
2  0801
OR
Display "TIME"
Note: Actually I'm okay with whatever result or way of output as long as I can know in any way that the 2 rows are not the same.
What are the possible ways to do this in SQL Server?
I am using Microsoft SQL Server Management Studio 18, Microsoft SQL Server 2019-15.0.2080.9
Please try the following solution based on the ideas of John Cappelletti. All credit goes to him.
SQL
-- DDL and sample data population, start
DECLARE @roster TABLE (ID INT PRIMARY KEY, NAME VARCHAR(10), TIME CHAR(4));
INSERT INTO @roster (ID, NAME, TIME) VALUES
(1,'N1','0900'),
(2,'N1','0801');
-- DDL and sample data population, end
DECLARE @source INT = 1
    , @target INT = 2;
SELECT id AS source_id, @target AS target_id
    ,[key] AS [column]
    ,source_Value = MAX( CASE WHEN Src=1 THEN Value END)
    ,target_Value = MAX( CASE WHEN Src=2 THEN Value END)
FROM (
    SELECT Src=1
        ,id
        ,B.*
    FROM @roster AS A
    CROSS APPLY ( SELECT [Key]
                        ,Value
                  FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
                ) AS B
    WHERE id=@source
    UNION ALL
    SELECT Src=2
        ,id = @source
        ,B.*
    FROM @roster AS A
    CROSS APPLY ( SELECT [Key]
                        ,Value
                  FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
                ) AS B
    WHERE id=@target
) AS A
GROUP BY id, [key]
HAVING MAX(CASE WHEN Src=1 THEN Value END)
    <> MAX(CASE WHEN Src=2 THEN Value END)
    AND [key] <> 'ID' -- exclude this PK column
ORDER BY id, [key];
Output
+-----------+-----------+--------+--------------+--------------+
| source_id | target_id | column | source_Value | target_Value |
+-----------+-----------+--------+--------------+--------------+
| 1 | 2 | TIME | 0900 | 0801 |
+-----------+-----------+--------+--------------+--------------+
A general approach here might be to just aggregate over the entire table (or over the two rows of interest, by adding a WHERE clause) and report, per column, whether only one distinct value occurs:
SELECT
    -- a column is "the same" when it holds exactly one distinct value
    CASE WHEN COUNT(DISTINCT ID) = 1 THEN 'Yes' ELSE 'No' END AS [ID same],
    CASE WHEN COUNT(DISTINCT NAME) = 1 THEN 'Yes' ELSE 'No' END AS [NAME same],
    CASE WHEN COUNT(DISTINCT TIME) = 1 THEN 'Yes' ELSE 'No' END AS [TIME same]
FROM yourTable;
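The column-by-column comparison that both answers perform can be sketched outside the database as well. The following Python is only an illustration of the logic, assuming each row has already been fetched as a dictionary keyed by column name; `diff_rows` and `roster` are hypothetical names.

```python
def diff_rows(source, target, ignore=("ID",)):
    """Return {column: (source_value, target_value)} for columns that differ."""
    return {
        col: (source[col], target.get(col))
        for col in source
        if col not in ignore and source[col] != target.get(col)
    }


# Sample rows from the question, keyed by ID.
roster = {
    1: {"ID": 1, "NAME": "N1", "TIME": "0900"},
    2: {"ID": 2, "NAME": "N1", "TIME": "0801"},
}

differences = diff_rows(roster[1], roster[2])
print(differences)
```

An empty dictionary means the two rows match on every compared column. Note one difference from the SQL: Python treats two None values as equal and None versus a value as different, whereas the `<>` in the HAVING clause above skips any column where either side is NULL.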

Use SSRS to move child rows to repeating columns for exporting?

I'm trying to take an ugly SQL output and use SSRS to make it suitable for export to a mail house.
What would be the right approach to group data from this:
order_no type   name  item   price
======== ====== ===== ====== =====
1        header sally NULL   NULL
1        data   NULL  book   12.50
1        data   NULL  dvd    39.00
2        header bob   NULL   NULL
2        data   NULL  shirt  50.00
2        data   NULL  shorts 65.00
Into this?
order_no type   name  item_1 price_1 item_2 price_2
======== ====== ===== ====== ======= ====== =======
1        header sally book   12.50   dvd    39.00
2        header bob   shirt  50.00   shorts 65.00
Should this be a Matrix? I'm having trouble making progress.
There may be a much cleaner way of doing this but this is the approach I took...
First I replicated your sample data
DECLARE @t TABLE(order_no int, [type] varchar(20), [name] varchar(20), [item] varchar(20), price decimal (10,2))
INSERT INTO @t VALUES
(1,'header', 'sally' , NULL, NULL ),
(1,'data', NULL , 'book', 12.50),
(1,'data', NULL , 'dvd', 39.00),
(2,'header', 'bob' , NULL, NULL ),
(2,'data', NULL , 'shirt', 50.00),
(2,'data', NULL , 'shorts', 65.00)
;
WITH o (order_no, [type], [name], [item], [price], [ItemNumber]) AS
(
    SELECT *, ROW_NUMBER() OVER(PARTITION BY order_no ORDER BY item) AS ItemNumber FROM @t WHERE [type] != 'header'
)
SELECT
    h.order_no, h.[type], h.name
    , d.ItemNumber, d.ItemCaption, d.ItemValue
FROM (SELECT DISTINCT order_no, [Type], [name] FROM @t WHERE [type] = 'header') h
JOIN
(
    SELECT order_no, ItemNumber, 'Item_' + CAST(ItemNumber as varchar(10)) as ItemCaption, Item as ItemValue from o
    UNION
    SELECT order_no, ItemNumber, 'Price_' + CAST(ItemNumber as varchar(10)) as ItemCaption, CAST(Price as varchar(20)) as ItemValue from o
) d ON h.order_no = d.order_no
I created a CTE just to clean up the query a little and included a ROW_NUMBER for each item; we'll use this to create column captions that we can use in the matrix.
This gives us the following output
We now have everything in place for a simple matrix.
Note: As we had to convert everything to strings, the prices are no longer numbers so bear this in mind if you plan on doing anything else with the data later - they would have to be converted back
So, create a new report, add a new dataset and use the above query as the dataset query.
Add a matrix control, drag order_no to the row placeholder, ItemCaption to the column placeholder and ItemValue to the data placeholder.
Next, right-click the order_no column and choose "insert column - inside group right", then set the new column's value to your type field. Repeat for the header field.
Your design will look like this.
Finally, in the column group sort properties, sort by ItemNumber, then ItemCaption.
The final report looks like this...
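If the export does not have to go through SSRS, the same reshaping (one output record per order, with numbered item/price columns) can be sketched in a few lines of Python. This is an illustration only: `flatten_orders` is a hypothetical helper, and the rows are assumed to arrive as tuples in the column order of the question.

```python
from collections import defaultdict

# Sample data from the question: (order_no, type, name, item, price).
rows = [
    (1, "header", "sally", None, None),
    (1, "data", None, "book", 12.50),
    (1, "data", None, "dvd", 39.00),
    (2, "header", "bob", None, None),
    (2, "data", None, "shirt", 50.00),
    (2, "data", None, "shorts", 65.00),
]


def flatten_orders(rows):
    """Merge each order's data rows into its header row as item_N / price_N."""
    headers = {}
    items = defaultdict(list)
    for order_no, row_type, name, item, price in rows:
        if row_type == "header":
            headers[order_no] = {"order_no": order_no, "type": row_type, "name": name}
        else:
            items[order_no].append((item, price))

    result = []
    for order_no in sorted(headers):
        record = dict(headers[order_no])
        # Number the items per order, sorted by item name, mirroring
        # ROW_NUMBER() OVER (PARTITION BY order_no ORDER BY item).
        ordered = sorted(items[order_no], key=lambda pair: pair[0])
        for n, (item, price) in enumerate(ordered, start=1):
            record[f"item_{n}"] = item
            record[f"price_{n}"] = price
        result.append(record)
    return result


flattened = flatten_orders(rows)
for record in flattened:
    print(record)
```

Unlike the matrix dataset, the prices stay numeric here because items and prices are never merged into one string column.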

SQL Auto-populate ID column based on another column

I have a workflow where a source table is used to populate the destination table.
I have tried to simulate this workflow in the code below.
-- creating/populating source table
CREATE TABLE #SourceTable
(
CampaignName VARCHAR(50),
CustomerNumber INT
)
INSERT INTO #SourceTable
VALUES ('Campaign1', 1111), ('Campaign1', 2222), ('Campaign1', 3333),
('Campaign2', 4444), ('Campaign2', 2222), ('Campaign2', 1111)
-- create/populate destination table
CREATE TABLE #DestinationTable
(
CampaignID INT,
CampaignName VARCHAR(50),
CustomerNumber INT
)
-- Simulating populating the #DestinationTable
INSERT INTO #DestinationTable (CampaignName, CustomerNumber)
SELECT CampaignName, CustomerNumber
FROM #SourceTable
The source table will get created in some way, but then it is used to populate the destination table in the same way as my sample code.
The destination table is at CustomerNumber level. I want to autopopulate an ID field (without the user having to code it in) that will give a new number at CampaignName level.
So for example, I want the output of the #DestinationTable to be:
CampaignID CampaignName CustomerNumber
------------------------------------------
1 Campaign1 1111
1 Campaign1 2222
1 Campaign1 3333
2 Campaign2 4444
2 Campaign2 2222
2 Campaign2 1111
But I need the CampaignID column to be auto-populated whenever new rows are being inserted, like an IDENTITY column, but instead of giving each row a number, I need it to give each CampaignName a new number.
Is that possible?
Thanks
This one can be achieved using dense_rank().
SELECT dense_rank() over (order by CampaignName) as rn, CampaignName, CustomerNumber
FROM #SourceTable
To check whether a customer number and campaign name already exist in the destination table, use NOT EXISTS:
SELECT dense_rank() over (order by t1.CampaignName) as rn, t1.CampaignName, t1.CustomerNumber
FROM #SourceTable t1
WHERE not exists (select 1 from #DestinationTable t2
where t2.CustomerNumber = t1.CustomerNumber and t2.CampaignName = t1.CampaignName)
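What DENSE_RANK does here is hand out one number per distinct CampaignName, in sorted order. A point the query above leaves open is that the ranking restarts at 1 on every run, so later loads would need to continue from the IDs already in the destination. The Python sketch below illustrates that idea under stated assumptions: `assign_campaign_ids` and its `existing` parameter are hypothetical and not part of the answer.

```python
def assign_campaign_ids(rows, existing=None):
    """Give every CampaignName one ID, reusing IDs that already exist.

    rows     -- iterable of (CampaignName, CustomerNumber) tuples
    existing -- optional {CampaignName: CampaignID} already in the destination
    """
    ids = dict(existing or {})
    next_id = max(ids.values(), default=0) + 1
    # Sorted order mirrors DENSE_RANK() OVER (ORDER BY CampaignName).
    for name in sorted({name for name, _ in rows}):
        if name not in ids:
            ids[name] = next_id
            next_id += 1
    return [(ids[name], name, customer) for name, customer in rows]


source = [
    ("Campaign1", 1111), ("Campaign1", 2222), ("Campaign1", 3333),
    ("Campaign2", 4444), ("Campaign2", 2222), ("Campaign2", 1111),
]

destination = assign_campaign_ids(source)
for row in destination:
    print(row)
```

On a second load, passing the name-to-ID pairs already stored in the destination as `existing` keeps old campaigns on their numbers and gives new campaigns the next free ones.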

How to synthesize attribute for joined tables

I have a view defined like this:
CREATE VIEW [dbo].[PossiblyMatchingContracts] AS
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts
FROM [dbo].AllContracts AS C
INNER JOIN [dbo].AllContracts AS CC
ON C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB
OR C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB
OR C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB
OR C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB
OR C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE C.UniqueID NOT IN
(
SELECT UniqueID FROM [dbo].DefinitiveMatches
)
AND C.AssociatedUser IS NULL
AND C.UniqueID <> CC.UniqueID
This basically finds contracts where, for example, the first name and the birthday match. This works great. Now I want to add a synthetic attribute to each row with the value from only one source row.
Let me give you an example to make it clearer. Suppose I have the following table:
UniqueID | FirstName | LastName | Birthday
1 | Peter | Smith | 1980-11-04
2 | Peter | Gray | 1980-11-04
3 | Peter | Gray-Smith| 1980-11-04
4 | Frank | May | 1985-06-09
5 | Frank-Paul| May | 1985-06-09
6 | Gina | Ericson | 1950-11-04
The resulting view should look like this:
UniqueID | PossiblyMatchingContracts | SyntheticID
1 | 2 | PeterSmith1980-11-04
1 | 3 | PeterSmith1980-11-04
2 | 1 | PeterSmith1980-11-04
2 | 3 | PeterSmith1980-11-04
3 | 1 | PeterSmith1980-11-04
3 | 2 | PeterSmith1980-11-04
4 | 5 | FrankMay1985-06-09
5 | 4 | FrankMay1985-06-09
6 | NULL | NULL [or] GinaEricson1950-11-04
Notice that the SyntheticID column uses ONLY values from one of the matching source rows. It doesn't matter which one. I am exporting this view to another application and need to be able to identify each "match group" afterwards.
Is it clear what I mean? Any ideas how this could be done in sql?
Maybe it helps to elaborate a bit on the actual use case:
I am importing contracts from different systems. To account for the possibility of typos or people that have married but the last name was only updated in one system, I need to find so called 'possible matches'. Two or more contracts are considered a possible match if they contain the same birthday plus the same first, last or birth name. That implies, that if contract A matches contract B, contract B also matches contract A.
The target system uses multivalue reference attributes to store these relationships. The ultimate goal is to create user objects for these contracts. The first catch is that there shall only be one user object for multiple matching contracts. Thus I'm creating these matches in the view. The second catch is that the creation of user objects happens by workflows, which run in parallel for each contract. To avoid creating multiple user objects for matching contracts, each workflow needs to check whether there is already a matching user object or another workflow that is about to create said user object. Because the workflow engine is extremely slow compared to SQL, the workflows should not repeat the whole matching test. So the idea is to let the workflow check only for the 'syntheticID'.
I have solved it with a multi-step approach:
1. Create the list of possible 1st level matches for each contract
2. Create the base groups list, assigning a different group to each contract (as if they were not related to anybody)
3. Iterate the matches list, updating the group list when more contracts need to be added to a group
4. Recursively build up the SyntheticID from the final group list
5. Output results
First of all, let me explain what I have understood, so you can tell whether my approach is correct or not.
1) Matching propagates in "cascade"
I mean, if "Peter Smith" is grouped with "Peter Gray", it means that all Smiths and all Grays are related (if they have the same birth date), so Luke Smith can be in the same group as John Gray.
2) I have not understood what you mean by "Birth Name"
You say contracts match on "first, last or birth name". Sorry, I'm Italian, and I thought birth name and first name were the same; also, there is no such column in your data. Maybe it is related to the dash between names?
When FirstName is Frank-Paul, does it mean it should match both Frank and Paul?
When LastName is Gray-Smith, does it mean it should match both Gray and Smith?
In the following code I have simply ignored this problem, but it could be handled if needed (I already made an attempt, splitting the names, unpivoting them and treating each part as a separate match).
Step Zero: some declaration and prepare base data
declare @cli as table (UniqueID int primary key, FirstName varchar(20), LastName varchar(20), Birthday varchar(20))
declare @comb as table (id1 int, id2 int, done bit)
declare @grp as table (ix int identity primary key, grp int, id int, unique (grp,ix))
declare @str_id as table (grp int primary key, SyntheticID varchar(1000))
declare @id1 as int, @g int
;with
t as (
    select *
    from (values
        (1 , 'Peter' , 'Smith' , '1980-11-04'),
        (2 , 'Peter' , 'Gray' , '1980-11-04'),
        (3 , 'Peter' , 'Gray-Smith', '1980-11-04'),
        (4 , 'Frank' , 'May' , '1985-06-09'),
        (5 , 'Frank-Paul', 'May' , '1985-06-09'),
        (6 , 'Gina' , 'Ericson' , '1950-11-04')
    ) x (UniqueID , FirstName , LastName , Birthday)
)
insert into @cli
select * from t
Step One: Create the list of possible 1st level matches for each contract
;with
p as(select UniqueID, Birthday, FirstName, LastName from @cli),
m as (
    select p.UniqueID UniqueID1, p.FirstName FirstName1, p.LastName LastName1, p.Birthday Birthday1, pp.UniqueID UniqueID2, pp.FirstName FirstName2, pp.LastName LastName2, pp.Birthday Birthday2
    from p
    join p pp on (pp.Birthday=p.Birthday) and (pp.FirstName = p.FirstName or pp.LastName = p.LastName)
    where p.UniqueID<=pp.UniqueID
)
insert into @comb
select UniqueID1,UniqueID2,0
from m
Step Two: Create the base groups list
insert into @grp
select ROW_NUMBER() over(order by id1), id1 from @comb where id1=id2
Step Three: Iterate the matches list updating the group list
Only loop on contracts that have possible matches and updates only if needed
set @id1 = 0
while not(@id1 is null) begin
    set @id1 = (select top 1 id1 from @comb where id1<>id2 and done=0)
    if not(@id1 is null) begin
        set @g = (select grp from @grp where id=@id1)
        update g set grp= @g
        from @grp g
        inner join @comb c on g.id = c.id2
        where c.id2<>@id1 and c.id1=@id1
        and grp<>@g
        update @comb set done=1 where id1=@id1
    end
end
Step Four: Build up the SyntheticID
Recursively add ALL (distinct) first and last names of group to SyntheticID.
I used '_' as separator for birth date, first names and last names, and ',' as separator for the list of names to avoid conflicts.
;with
c as(
    select c.*, g.grp
    from @cli c
    join @grp g on g.id = c.UniqueID
),
d as (
    select *, row_number() over (partition by g order by t,s) n1, row_number() over (partition by g order by t desc,s desc) n2
    from (
        select distinct c.grp g, 1 t, FirstName s from c
        union
        select distinct c.grp, 2, LastName from c
    ) l
),
r as (
    select d.*, cast(CONVERT(VARCHAR(10), t.Birthday, 112) + '_' + s as varchar(1000)) Names, cast(0 as bigint) i1, cast(0 as bigint) i2
    from d
    join @cli t on t.UniqueID=d.g
    where n1=1
    union all
    select d.*, cast(r.names + IIF(r.t<>d.t,'_',',') + d.s as varchar(1000)), r.n1, r.n2
    from d
    join r on r.g = d.g and r.n1=d.n1-1
)
insert into @str_id
select g, Names
from r
where n2=1
Step Five: Output results
select c.UniqueID, case when id2=UniqueID then id1 else id2 end PossibleMatchingContract, s.SyntheticID
from @cli c
left join @comb cb on c.UniqueID in(id1,id2) and id1<>id2
left join @grp g on c.UniqueID = g.id
left join @str_id s on s.grp = g.grp
Here are the results:
UniqueID PossibleMatchingContract SyntheticID
1 2 1980-11-04_Peter_Gray,Gray-Smith,Smith
1 3 1980-11-04_Peter_Gray,Gray-Smith,Smith
2 1 1980-11-04_Peter_Gray,Gray-Smith,Smith
2 3 1980-11-04_Peter_Gray,Gray-Smith,Smith
3 1 1980-11-04_Peter_Gray,Gray-Smith,Smith
3 2 1980-11-04_Peter_Gray,Gray-Smith,Smith
4 5 1985-06-09_Frank,Frank-Paul_May
5 4 1985-06-09_Frank,Frank-Paul_May
6 NULL 1950-11-04_Gina_Ericson
I think that in this way the resulting SyntheticID should also be "unique" for each group
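Steps one to three above amount to finding connected components: every contract starts in its own group, and each matching pair merges two groups. As an illustration of that idea only (it is a different technique from the WHILE loop above, namely a union-find structure), here is a short Python sketch. `group_contracts` is a hypothetical name, and the matching rule is the simplified one used in this answer: same birthday plus same first or last name.

```python
def group_contracts(ids, pairs):
    """Union-find: map every contract ID to the smallest ID in its match group."""
    parent = {i: i for i in ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            # Keep the smaller ID as the group's representative.
            parent[max(root_a, root_b)] = min(root_a, root_b)

    return {i: find(i) for i in ids}


# Sample data from the question: UniqueID -> (FirstName, LastName, Birthday).
contracts = {
    1: ("Peter", "Smith", "1980-11-04"),
    2: ("Peter", "Gray", "1980-11-04"),
    3: ("Peter", "Gray-Smith", "1980-11-04"),
    4: ("Frank", "May", "1985-06-09"),
    5: ("Frank-Paul", "May", "1985-06-09"),
    6: ("Gina", "Ericson", "1950-11-04"),
}

# First-level matches: same birthday and same first or last name.
pairs = [
    (a, b)
    for a, (first_a, last_a, born_a) in contracts.items()
    for b, (first_b, last_b, born_b) in contracts.items()
    if a < b and born_a == born_b and (first_a == first_b or last_a == last_b)
]

groups = group_contracts(contracts, pairs)
# Build the SyntheticID from the representative row of each group.
synthetic_ids = {uid: "".join(contracts[rep]) for uid, rep in groups.items()}
for uid in sorted(synthetic_ids):
    print(uid, synthetic_ids[uid])
```

Because the representative is always the smallest UniqueID in the group, the SyntheticID stays stable as long as that contract's data does not change.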
This creates a synthetic value and is easy to change to suit your needs.
DECLARE @T TABLE (
    UniqueID INT
    ,FirstName VARCHAR(200)
    ,LastName VARCHAR(200)
    ,Birthday DATE
)
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 1,'Peter','Smith','1980-11-04'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 2,'Peter','Gray','1980-11-04'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 3,'Peter','Gray-Smith','1980-11-04'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 4,'Frank','May','1985-06-09'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 5,'Frank-Paul','May','1985-06-09'
INSERT INTO @T(UniqueID,FirstName,LastName,Birthday) SELECT 6,'Gina','Ericson','1950-11-04'
DECLARE @PossibleMatches TABLE (UniqueID INT,[PossibleMatch] INT,SynKey VARCHAR(2000))
-- matches on last name, first name and birthday
INSERT INTO @PossibleMatches
SELECT t1.UniqueID [UniqueID],t2.UniqueID [Possible Matches],'Ln=' + t1.LastName + ' Fn=' + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
INNER JOIN @T t2 ON t1.Birthday=t2.Birthday
    AND t1.FirstName=t2.FirstName
    AND t1.LastName=t2.LastName
    AND t1.UniqueID<>t2.UniqueID
-- matches on first name and birthday
INSERT INTO @PossibleMatches
SELECT t1.UniqueID [UniqueID],t2.UniqueID [Possible Matches],'Fn=' + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
INNER JOIN @T t2 ON t1.Birthday=t2.Birthday
    AND t1.FirstName=t2.FirstName
    AND t1.UniqueID<>t2.UniqueID
-- matches on last name and birthday
INSERT INTO @PossibleMatches
SELECT t1.UniqueID,t2.UniqueID,'Ln=' + t1.LastName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
INNER JOIN @T t2 ON t1.Birthday=t2.Birthday
    AND t1.LastName=t2.LastName
    AND t1.UniqueID<>t2.UniqueID
-- contracts without any match
INSERT INTO @PossibleMatches
SELECT t1.UniqueID,pm.UniqueID,'Ln=' + t1.LastName + ' Fn=' + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM @T t1
LEFT JOIN @PossibleMatches pm on pm.UniqueID=t1.UniqueID
WHERE pm.UniqueID IS NULL
SELECT *
FROM @PossibleMatches
ORDER BY UniqueID,[PossibleMatch]
I think this will work for you
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
FIRST_VALUE(CC.FirstName+CC.LastName+CC.Birthday)
OVER (PARTITION BY C.UniqueID ORDER BY CC.UniqueID) as SyntheticID
FROM
[dbo].AllContracts AS C INNER JOIN
[dbo].AllContracts AS CC ON
C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE
C.UniqueID NOT IN(
SELECT UniqueID FROM [dbo].DefinitiveMatches)
AND C.AssociatedUser IS NULL
You can try this:
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
FIRST_VALUE(CC.FirstName+CC.LastName+CC.Birthday)
OVER (PARTITION BY C.UniqueID ORDER BY CC.UniqueID) as SyntheticID
FROM
[dbo].AllContracts AS C
INNER JOIN
[dbo].AllContracts AS CC
ON
C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB
OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB
OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB
OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB
OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE
C.UniqueID NOT IN
(
SELECT UniqueID FROM [dbo].DefinitiveMatches
)
AND
C.AssociatedUser IS NULL
This will generate one extra row (because we left out C.UniqueID <> CC.UniqueID) but will give you the right solution.
Here is an example using some sample data extracted from your original post. The idea: generate all SyntheticIDs in a CTE, query all records with a "PossibleMatch", and UNION that with all records which are not yet included:
DECLARE @t TABLE(
    UniqueID int
    ,FirstName nvarchar(20)
    ,LastName nvarchar(20)
    ,Birthday datetime
)
INSERT INTO @t VALUES (1, 'Peter', 'Smith', '1980-11-04');
INSERT INTO @t VALUES (2, 'Peter', 'Gray', '1980-11-04');
INSERT INTO @t VALUES (3, 'Peter', 'Gray-Smith', '1980-11-04');
INSERT INTO @t VALUES (4, 'Frank', 'May', '1985-06-09');
INSERT INTO @t VALUES (5, 'Frank-Paul', 'May', '1985-06-09');
INSERT INTO @t VALUES (6, 'Gina', 'Ericson', '1950-11-04');
WITH ctePrep AS(
    SELECT UniqueID, FirstName, LastName, BirthDay,
        ROW_NUMBER() OVER (PARTITION BY FirstName, BirthDay ORDER BY FirstName, BirthDay) AS k,
        FirstName+LastName+CONVERT(nvarchar(10), Birthday, 126) AS SyntheticID
    FROM @t
),
cteKeys AS(
    SELECT FirstName, BirthDay, SyntheticID
    FROM ctePrep
    WHERE k = 1
),
cteFiltered AS(
    SELECT
        C.UniqueID,
        CC.UniqueID AS PossiblyMatchingContracts,
        keys.SyntheticID
    FROM @t AS C
    JOIN @t AS CC ON C.FirstName = CC.FirstName
        AND C.Birthday = CC.Birthday
    JOIN cteKeys AS keys ON keys.FirstName = c.FirstName
        AND keys.Birthday = c.Birthday
    WHERE C.UniqueID <> CC.UniqueID
)
SELECT UniqueID, PossiblyMatchingContracts, SyntheticID
FROM cteFiltered
UNION ALL
SELECT UniqueID, NULL, FirstName+LastName+CONVERT(nvarchar(10), Birthday, 126) AS SyntheticID
FROM @t
WHERE UniqueID NOT IN (SELECT UniqueID FROM cteFiltered)
Hope this helps. The result looked OK to me:
UniqueID PossiblyMatchingContracts SyntheticID
---------------------------------------------------------------
2 1 PeterSmith1980-11-04
3 1 PeterSmith1980-11-04
1 2 PeterSmith1980-11-04
3 2 PeterSmith1980-11-04
1 3 PeterSmith1980-11-04
2 3 PeterSmith1980-11-04
4 NULL FrankMay1985-06-09
5 NULL Frank-PaulMay1985-06-09
6 NULL GinaEricson1950-11-04
Tested in SSMS, it works perfect. :)
--create table structure
create table #temp
(
uniqueID int,
firstname varchar(15),
lastname varchar(15),
birthday date
)
--insert data into the table
insert #temp
select 1, 'peter','smith','1980-11-04'
union all
select 2, 'peter','gray','1980-11-04'
union all
select 3, 'peter','gray-smith','1980-11-04'
union all
select 4, 'frank','may','1985-06-09'
union all
select 5, 'frank-paul','may','1985-06-09'
union all
select 6, 'gina','ericson','1950-11-04'
select * from #temp
--solution is as below
select ab.uniqueID
, PossiblyMatchingContracts
, c.firstname+c.lastname+cast(c.birthday as varchar) as synID
from
(
select a.uniqueID
, case
when a.uniqueID < min(b.uniqueID)over(partition by a.uniqueid)
then a.uniqueID
else min(b.uniqueID)over(partition by a.uniqueid)
end as SmallestID
, b.uniqueID as PossiblyMatchingContracts
from #temp a
left join #temp b
on (a.firstname = b.firstname OR a.lastname = b.lastname) AND a.birthday = b.birthday AND a.uniqueid <> b.uniqueID
) as ab
left join #temp c
on ab.SmallestID = c.uniqueID
Result capture is attached below:
Say we have following table (a VIEW in your case):
UniqueID PossiblyMatchingContracts SyntheticID
1 2 G1
1 3 G2
2 1 G3
2 3 G4
3 1 G4
3 4 G6
4 5 G7
5 4 G8
6 NULL G9
In your case you can set the initial SyntheticID to a string like PeterSmith1980-11-04, built from the contract with that UniqueID on each line. Here is a recursive CTE query: it divides all lines into unconnected groups and selects MAX(SyntheticID) in the current group as the new SyntheticID for all lines in this group.
WITH CTE AS
(
SELECT CAST(','+CAST(UniqueID AS Varchar(100)) +','+ CAST(PossiblyMatchingContracts as Varchar(100))+',' as Varchar(MAX)) as GroupCont,
SyntheticID
FROM PossiblyMatchingContracts
UNION ALL
SELECT CAST(GroupCont+CAST(UniqueID AS Varchar(100)) +','+ CAST(PossiblyMatchingContracts as Varchar(100))+',' AS Varchar(MAX)) as GroupCont,
pm.SyntheticID
FROM CTE
JOIN PossiblyMatchingContracts as pm
ON
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
OR
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
)
AND NOT
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
AND
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
)
)
SELECT pm.UniqueID,
pm.PossiblyMatchingContracts,
ISNULL(
(SELECT MAX(SyntheticID) FROM CTE WHERE
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
OR
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
))
,pm.SyntheticID) as SyntheticID
FROM PossiblyMatchingContracts pm

Select first Row if configuration exists, otherwise select NULL Row

Given an instance of SQL Server 2008, imagine there's a table named @Configuration, which has three columns: ID, Code, and SubCode. There should be no duplicate rows for Code and SubCode.
Now imagine a detail-level table, @ConfigurationDetails, which can hold several rows per Code: one for each SubCode that has its own settings, plus one row where SubCode is NULL.
If the SubCode is available in the details table, pick Amt and Data directly from that row; if it is not available, pick Amt and Data from the row where SubCode is NULL.
(NOTE: a SubCode = NULL entry is always available for every Configuration row.)
Any ideas on where to start?
e.g.
a simple example...
table
declare @Configuration TABLE (
    ID INTEGER IDENTITY PRIMARY KEY,
    Code VARCHAR(50),
    SubCode VARCHAR(50)
);
declare @ConfigurationDetails TABLE
(
    Code VARCHAR(50),
    SubCode VARCHAR(50),
    Amt MONEY,
    Data VARCHAR(123)
);
INSERT INTO @Configuration VALUES
('BR1','Sub1'),
('BR1','Sub2'),
('BR1','Sub3'),
('BR1','Sub4'),
('BR2','Sub1'),
('BR2','Sub2');
INSERT INTO @ConfigurationDetails VALUES
('BR1','Sub1',500,'BR1 Sub1 Data'),
('BR1','Sub2',600,'BR1 Sub2 Data'),
('BR1',NULL,700,'BR1 Data'),
('BR2','Sub1',500,'BR2 Sub1 Data'),
('BR2',NULL,700,'BR2 Data');
INPUT:
@SubCode = 'Sub1', @Code = 'BR1'
OUTPUT:
Code SubCode Amt Data
==== ======= === ====
BR1 Sub1 500 BR1 Sub1 Data
INPUT:
@SubCode = 'Sub4', @Code = 'BR1'
OUTPUT:
Code SubCode Amt Data
==== ======= === ====
BR1 NULL 700 BR1 Data
You should be able to use something like
SELECT *
FROM @Configuration c
CROSS APPLY (SELECT TOP 1 *
             FROM @ConfigurationDetails cd
             WHERE c.Code = cd.Code
               AND ( c.SubCode = cd.SubCode
                     OR cd.SubCode IS NULL ) -- fall back to the detail row whose SubCode is NULL
             ORDER BY cd.SubCode DESC --Order the not null match first if it exists
            ) CA
WITH cte
AS (
    SELECT a.ID,
           a.Code,
           a.SubCode,
           b.SubCode as bSubCode,
           b.Amt,
           b.Data,
           ROW_NUMBER() OVER(PARTITION BY a.CODE, a.SubCode ORDER BY B.SUBCODE desc) as RN
    FROM @Configuration as a
    LEFT JOIN @ConfigurationDetails as b ON b.Code = a.Code
        AND (a.SubCode = b.SubCode OR b.SubCode IS NULL)
)
SELECT * FROM cte
where rn=1
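Both answers rely on the same idea: collect the exact SubCode match together with the NULL row, then order them so the exact match wins. A minimal sketch of that idea for a single (@Code, @SubCode) lookup follows. It uses Python's built-in sqlite3 module only so that it can be run anywhere; the `lookup` helper, the in-memory database and the `ORDER BY ... IS NULL` / `LIMIT 1` syntax are SQLite-specific assumptions, not T-SQL.

```python
import sqlite3

# In-memory SQLite table standing in for @ConfigurationDetails (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ConfigurationDetails (Code TEXT, SubCode TEXT, Amt REAL, Data TEXT)"
)
conn.executemany(
    "INSERT INTO ConfigurationDetails VALUES (?, ?, ?, ?)",
    [
        ("BR1", "Sub1", 500, "BR1 Sub1 Data"),
        ("BR1", "Sub2", 600, "BR1 Sub2 Data"),
        ("BR1", None, 700, "BR1 Data"),
        ("BR2", "Sub1", 500, "BR2 Sub1 Data"),
        ("BR2", None, 700, "BR2 Data"),
    ],
)

LOOKUP_SQL = """
SELECT Code, SubCode, Amt, Data
FROM ConfigurationDetails
WHERE Code = ?
  AND (SubCode = ? OR SubCode IS NULL)
ORDER BY SubCode IS NULL   -- 0 for the exact match, 1 for the NULL fallback
LIMIT 1
"""


def lookup(code, sub_code):
    """Return the detail row for (code, sub_code), or the NULL-SubCode row."""
    return conn.execute(LOOKUP_SQL, (code, sub_code)).fetchone()


print(lookup("BR1", "Sub1"))
print(lookup("BR1", "Sub4"))
```

In T-SQL the same ordering is what `ORDER BY cd.SubCode DESC` with `TOP 1` achieves, because SQL Server sorts NULL before any other value, so descending order puts the exact match first.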