How to split a row in multiple rows SQL Server? - sql

I want to convert a row in sql table to multiple rows in other table.
example: say if i'm having a table 'UserDetail' and he has 2 address's (home, office...) then the table looks like...
I wand the result to be in the second table as shown in the image

We can use Cross Apply
;WITH CTE(UseriD,Address1Line,Address1City,Address1State,Address2Line,Address2City,Address2State )
AS
(
SELECT 1,'Line1','City1','State1','Line2','City2','State2'
)
SELECT UseriD,[Address],City,[State]
FROM CTE
CROSS APPLY ( VALUES (Address1Line,Address1City,Address1State ),
(Address2Line,Address2City,Address2State )
)AS Dt([Address],City,[State])
Result
UseriD Address City State
-----------------------------
1 Line1 City1 State1
1 Line2 City2 State2
Demo:http://rextester.com/KHFUM28227

You could use "union all" to do that like:
select * into newTable
from
(
select UserId, Address1Line as Address, Address1City as City, Address1State as State
from myTable
union all
select UserId, Address2Line as Address, Address2City as City, Address2State as State
from myTable
) tmp
If you use just UNION instead of UNION ALL you would also be removing the duplicates where Address1 and Address2 is same.

You can CROSS JOIN or CROSS APPLY the table to a list of numbers.
Then use IIF or CASE to get the correspondent numbered fields.
Or CROSS APPLY on the address values directly on the table.
Example snippet:
declare #UserTable table (UserId int, Address1Line varchar(30), Address1City varchar(30), Address1State varchar(30), Address2Line varchar(30), Address2City varchar(30), Address2State varchar(30));
insert into #UserTable (UserId, Address1Line, Address1City, Address1State, Address2Line, Address2City, Address2State) values
(1,'Wonder Lane 42','WonderTown', 'WonderState', 'Somewhere 1 B', 'Nowhere', 'Anywhere'),
(2,'Backstreet 69','Los Libros', 'Everland', 'Immortal Cave 777', 'Ghost City', 'The Wild Lands');
-- Cross Join on numbers
select UserId,
case n when 1 then Address1Line when 2 then Address2Line end as [Address],
case n when 1 then Address1City when 2 then Address2City end as [City],
case n when 1 then Address1State when 2 then Address2State end as [State]
from #UserTable u
cross join (values (1),(2)) as nums(n);
-- Cross Apply on Adress values
select UserId, [Address], [City], [State]
from #UserTable Usr
cross apply (values
(1, Address1Line, Address1City, Address1State),
(2, Address2Line, Address2City, Address2State)
) AS Addr(n, [Address], [City], [State]);
Both return:
UserId Address City State
------ ----------------- ---------- --------------
1 Wonder Lane 42 WonderTown WonderState
1 Somewhere 1 B Nowhere Anywhere
2 Backstreet 69 Los Libros Everland
2 Immortal Cave 777 Ghost City The Wild Lands

Related

Each Column in Separate Row

How can I display each column in separate row and at the end add additional field.
For example I have this result:
ID ArticleName Brend1 Brend2 Brend3
== =========== ======== ======== ========
1 TestArticle 10001 20002 30003
I want to achieve this:
ID ArticleName BrandNo BrandName
== =========== ======= =========
1 TestArticle 10001 if column name = Brand1 Then Nike
1 TestArticle 20002 if column name = Brand2 Then Adidas
1 TestArticle 30003 if column name = Brand3 Then Mercedes
I can show each column in separate row, but how can I add additional column to the end of the result BrandName
Here is what I've done:
DECLARE #temTable TABLE
(
Id INT,
ArticleName VARCHAR(20),
Brand1 VARCHAR(20),
Brand2 VARCHAR(20),
Brand3 VARCHAR(20)
);
INSERT INTO #temTable
(
Id,
ArticleName,
Brand1,
Brand2,
Brand3
)
VALUES
(1, 'TestArticle', '10001', '20002', '30003');
SELECT Id,
ArticleName,
b.*
FROM #temTable a
CROSS APPLY
(
VALUES
(Brand1),
(Brand2),
(Brand3)
) b (Brand)
WHERE b.Brand IS NOT NULL;
You could use CROSS APPLY as
SELECT Id, ArticleName, Br BrandNo, Val BrandName
FROM #TemTable TT
CROSS APPLY(
VALUES
(Brand1, 'Nike'),
(Brand2, 'Adidas'),
(Brand3, 'Mercedes')
) T(Br, Val)
db-fiddle
I assume the brand is stored in another table, so you just need to add another column in your VALUES operator, and then join to the Brand Table:
SELECT Id,
ArticleName,
V.Brand
FROM #temTable a
CROSS APPLY (VALUES (1,Brand1),
(2,Brand2),
(3,Brand3)) V (BrandID,Brand)
JOIN dbo.Brand B ON V.BrandID = B.BrandID
WHERE V.Brand IS NOT NULL;
You can use UNPIVOT to achieve this. You can use either a case statement or another table variable to switch column names with brand names, I would prefer a table variable with a join it would make adding new column a bit easier.
DECLARE #d TABLE (ColNames VARCHAR(128) , BrandName VARCHAR(100))
INSERT INTO #d VALUES ('Brand1', 'Nike'),('Brand2', 'Adidas'),('Brand3', 'Mercedes')
SELECT up.Id
, up.ArticleName
, up.BrandNo
, d.BrandName
FROM #temTable
UNPIVOT (BrandNo FOR ColNames IN (Brand1,Brand2,Brand3)) up
INNER JOIN #d d ON d.ColNames = up.ColNames

SQL group three columns into one

I have a table with three columns:
[ID] [name] [link]
1 sample_name_1 sample_link_1
2 sample_name_2 sample_link_2
3 sample_name_3 sample_link_3
I need to somehow group them into one column, so the ideal result is this:
[one_column]
1
sample_name_1
sample_name_1
2
sample_name_2
sample_link_2
3
sample_name_3
sample_link_3
Does anyone have any suggestions on where to look and how to get it done in SQL Server?
You may try to use VALUES table value constructor with CROSS APPLY:
Table:
CREATE TABLE MyTable (
ID int,
name varchar(50),
link varchar(50)
)
INSERT INTO MyTable (ID, name, link)
VALUES
(1, 'sample_name_1', 'sample_link_1'),
(2, 'sample_name_2', 'sample_link_2'),
(3, 'sample_name_3', 'sample_link_3')
Statement:
SELECT v.one_column
FROM MyTable t
CROSS APPLY (VALUES
(1, CONVERT(varchar(50), ID)),
(2, CONVERT(varchar(50), name)),
(3, CONVERT(varchar(50), link))
) v (rn, one_column)
ORDER BY t.ID, v.rn
Result:
one_column
1
sample_name_1
sample_link_1
2
sample_name_2
sample_link_2
3
sample_name_3
sample_link_3
While this is something you should do in your presentation layer (i.e. your app or Website) you can do this in SQL:
select one column
from
(
select cast(id as varchar(10)) as one column, id as sortkey1, 1 as sortkey2 from mytable
union all
select name as one column, id as sortkey1, 2 as sortkey2 from mytable
union all
select link as one column, id as sortkey1, 3 as sortkey2 from mytable
) unioned
order by sortkey1, sortkey2;

How can I use SQL Pivot Table to turn my rows of data into columns

In SQL 2008, I need to flatten a table and show extra rows as columns. All I can find are queries with calculations. I just want to show the raw data. The data is like as below (simplified):
ID# Name Name_Type
1 Mary Jane Legal
1 MJ Nickname
1 Smith Maiden
2 John Legal
3 Suzanne Legal
3 Susie Nickname
I want the data to show as:
ID# Legal Nickname Maiden
1 Mary Jane MJ Smith
2 John
3 Suzanne Susie
where nothing shows in the column if there is not a row existing for that column. I'm thinking the Pivot Table method should work.
PIVOT requires you to use an aggregate. See this post for a better explanation of how it works.
CREATE TABLE #MyTable
(
ID# INT
, Name VARCHAR(50)
, Name_Type VARCHAR(50)
);
INSERT INTO #MyTable VALUES
(1, 'Mary Jane', 'Legal')
, (1, 'MJ', 'Nickname')
, (1, 'Smith', 'Maiden')
, (2, 'John', 'Legal')
, (3, 'Suzanne', 'Legal')
, (3, 'Susie', 'Nickname');
SELECT *
FROM
(
SELECT * FROM #MyTable
) AS Names
PIVOT (MAX(NAME)
FOR Name_Type IN ([Legal], [Nickname], [Maiden]))
AS PVT;
DROP TABLE #MyTable;
Try this (replace "new_Table" with your table - name and "ID_" with your id - column):
SELECT ID_ AS rootID, (
SELECT Name
FROM new_table
WHERE Name_type = 'legal'
AND new_table.ID_ = rootID
) AS legal,
(
SELECT Name
FROM new_table
WHERE Name_type = 'Nickname'
AND ID_ = rootID
) AS Nickname,
(
SELECT Name
FROM new_table
WHERE Name_type = 'Maiden'
AND ID_ = rootID
) AS Maiden
FROM new_table
GROUP BY rootID;

How to synthesize attribute for joined tables

I have a view defined like this:
CREATE VIEW [dbo].[PossiblyMatchingContracts] AS
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts
FROM [dbo].AllContracts AS C
INNER JOIN [dbo].AllContracts AS CC
ON C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB
OR C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB
OR C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB
OR C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB
OR C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE C.UniqueID NOT IN
(
SELECT UniqueID FROM [dbo].DefinitiveMatches
)
AND C.AssociatedUser IS NULL
AND C.UniqueID <> CC.UniqueID
Which is basically finding contracts where f.e. the first name and the birthday are matching. This works great. Now I want to add a synthetic attribute to each row with the value from only one source row.
Let me give you an example to make it clearer. Suppose I have the following table:
UniqueID | FirstName | LastName | Birthday
1 | Peter | Smith | 1980-11-04
2 | Peter | Gray | 1980-11-04
3 | Peter | Gray-Smith| 1980-11-04
4 | Frank | May | 1985-06-09
5 | Frank-Paul| May | 1985-06-09
6 | Gina | Ericson | 1950-11-04
The resulting view should look like this:
UniqueID | PossiblyMatchingContracts | SyntheticID
1 | 2 | PeterSmith1980-11-04
1 | 3 | PeterSmith1980-11-04
2 | 1 | PeterSmith1980-11-04
2 | 3 | PeterSmith1980-11-04
3 | 1 | PeterSmith1980-11-04
3 | 2 | PeterSmith1980-11-04
4 | 5 | FrankMay1985-06-09
5 | 4 | FrankMay1985-06-09
6 | NULL | NULL [or] GinaEricson1950-11-04
Notice that the SyntheticID column uses ONLY values from one of the matching source rows. It doesn't matter which one. I am exporting this view to another application and need to be able to identify each "match group" afterwards.
Is it clear what I mean? Any ideas how this could be done in sql?
Maybe it helps to elaborate a bit on the actual use case:
I am importing contracts from different systems. To account for the possibility of typos or people that have married but the last name was only updated in one system, I need to find so called 'possible matches'. Two or more contracts are considered a possible match if they contain the same birthday plus the same first, last or birth name. That implies, that if contract A matches contract B, contract B also matches contract A.
The target system uses multivalue reference attributes to store these relationships. The ultimate goal is to create user objects for these contracts. The catch first is, that the shall only be one user object for multiple matching contracts. Thus I'm creating these matches in the view. The second catch is, that the creation of user objects happens by workflows, which run parallel for each contract. To avoid creating multiple user objects for matching contracts, each workflow needs to check, if there is already a matching user object or another workflow, which is about to create said user object. Because the workflow engine is extremely slow compared to sql, the workflows should not repeat the whole matching test. So the idea is, to let the workflow check only for the 'syntheticID'.
I have solved it with a multi step approach:
Create the list of possible 1st level matches for each contract
Create the base groups list, assigning a different group for for
each contract (as if they were not related to anybody)
Iterate the matches list updating the group list when more contracts need to
be added to a group
Recursively build up the SyntheticID from final group list
Output results
First of all, let me explain what I have understood, so you can tell if my approach is correct or not.
1) matching propagates in "cascade"
I mean, if "Peter Smith" is grouped up with "Peter Gray", it means that all Smith and all Gray are related (if they have the same birth date) so Luke Smith can be in the same group of John Gray
2) I have not understood what you mean with "Birth Name"
You say contracts matches on "first, last or birth name", sorry, I'm italian, I thought birth name and first were the same, also in your data there is not such column. Maybe it is related to that dash symbol between names?
When FirstName is Frank-Paul it means it should match both Frank and Paul?
When LastName is Gray-Smith it means it should match both Gray and Smith?
In following code I have simply ignored this problem, but it could be handled if needed (I already did a try, breaking names, unpivoting them and treating as double match).
Step Zero: some declaration and prepare base data
declare #cli as table (UniqueID int primary key, FirstName varchar(20), LastName varchar(20), Birthday varchar(20))
declare #comb as table (id1 int, id2 int, done bit)
declare #grp as table (ix int identity primary key, grp int, id int, unique (grp,ix))
declare #str_id as table (grp int primary key, SyntheticID varchar(1000))
declare #id1 as int, #g int
;with
t as (
select *
from (values
(1 , 'Peter' , 'Smith' , '1980-11-04'),
(2 , 'Peter' , 'Gray' , '1980-11-04'),
(3 , 'Peter' , 'Gray-Smith', '1980-11-04'),
(4 , 'Frank' , 'May' , '1985-06-09'),
(5 , 'Frank-Paul', 'May' , '1985-06-09'),
(6 , 'Gina' , 'Ericson' , '1950-11-04')
) x (UniqueID , FirstName , LastName , Birthday)
)
insert into #cli
select * from t
Step One: Create the list of possible 1st level matches for each contract
;with
p as(select UniqueID, Birthday, FirstName, LastName from #cli),
m as (
select p.UniqueID UniqueID1, p.FirstName FirstName1, p.LastName LastName1, p.Birthday Birthday1, pp.UniqueID UniqueID2, pp.FirstName FirstName2, pp.LastName LastName2, pp.Birthday Birthday2
from p
join p pp on (pp.Birthday=p.Birthday) and (pp.FirstName = p.FirstName or pp.LastName = p.LastName)
where p.UniqueID<=pp.UniqueID
)
insert into #comb
select UniqueID1,UniqueID2,0
from m
Step Two: Create the base groups list
insert into #grp
select ROW_NUMBER() over(order by id1), id1 from #comb where id1=id2
Step Three: Iterate the matches list updating the group list
Only loop on contracts that have possible matches and updates only if needed
set #id1 = 0
while not(#id1 is null) begin
set #id1 = (select top 1 id1 from #comb where id1<>id2 and done=0)
if not(#id1 is null) begin
set #g = (select grp from #grp where id=#id1)
update g set grp= #g
from #grp g
inner join #comb c on g.id = c.id2
where c.id2<>#id1 and c.id1=#id1
and grp<>#g
update #comb set done=1 where id1=#id1
end
end
Step Four: Build up the SyntheticID
Recursively add ALL (distinct) first and last names of group to SyntheticID.
I used '_' as separator for birth date, first names and last names, and ',' as separator for the list of names to avoid conflicts.
;with
c as(
select c.*, g.grp
from #cli c
join #grp g on g.id = c.UniqueID
),
d as (
select *, row_number() over (partition by g order by t,s) n1, row_number() over (partition by g order by t desc,s desc) n2
from (
select distinct c.grp g, 1 t, FirstName s from c
union
select distinct c.grp, 2, LastName from c
) l
),
r as (
select d.*, cast(CONVERT(VARCHAR(10), t.Birthday, 112) + '_' + s as varchar(1000)) Names, cast(0 as bigint) i1, cast(0 as bigint) i2
from d
join #cli t on t.UniqueID=d.g
where n1=1
union all
select d.*, cast(r.names + IIF(r.t<>d.t,'_',',') + d.s as varchar(1000)), r.n1, r.n2
from d
join r on r.g = d.g and r.n1=d.n1-1
)
insert into #str_id
select g, Names
from r
where n2=1
Step Five: Output results
select c.UniqueID, case when id2=UniqueID then id1 else id2 end PossibleMatchingContract, s.SyntheticID
from #cli c
left join #comb cb on c.UniqueID in(id1,id2) and id1<>id2
left join #grp g on c.UniqueID = g.id
left join #str_id s on s.grp = g.grp
Here is the results
UniqueID PossibleMatchingContract SyntheticID
1 2 1980-11-04_Peter_Gray,Gray-Smith,Smith
1 3 1980-11-04_Peter_Gray,Gray-Smith,Smith
2 1 1980-11-04_Peter_Gray,Gray-Smith,Smith
2 3 1980-11-04_Peter_Gray,Gray-Smith,Smith
3 1 1980-11-04_Peter_Gray,Gray-Smith,Smith
3 2 1980-11-04_Peter_Gray,Gray-Smith,Smith
4 5 1985-06-09_Frank,Frank-Paul_May
5 4 1985-06-09_Frank,Frank-Paul_May
6 NULL 1950-11-04_Gina_Ericson
I think that in this way the resulting SyntheticID should also be "unique" for each group
This creates a synthetic value and is easy to change to suit your needs.
DECLARE #T TABLE (
UniqueID INT
,FirstName VARCHAR(200)
,LastName VARCHAR(200)
,Birthday DATE
)
INSERT INTO #T(UniqueID,FirstName,LastName,Birthday) SELECT 1,'Peter','Smith','1980-11-04'
INSERT INTO #T(UniqueID,FirstName,LastName,Birthday) SELECT 2,'Peter','Gray','1980-11-04'
INSERT INTO #T(UniqueID,FirstName,LastName,Birthday) SELECT 3,'Peter','Gray-Smith','1980-11-04'
INSERT INTO #T(UniqueID,FirstName,LastName,Birthday) SELECT 4,'Frank','May','1985-06-09'
INSERT INTO #T(UniqueID,FirstName,LastName,Birthday) SELECT 5,'Frank-Paul','May','1985-06-09'
INSERT INTO #T(UniqueID,FirstName,LastName,Birthday) SELECT 6,'Gina','Ericson','1950-11-04'
DECLARE #PossibleMatches TABLE (UniqueID INT,[PossibleMatch] INT,SynKey VARCHAR(2000)
)
INSERT INTO #PossibleMatches
SELECT t1.UniqueID [UniqueID],t2.UniqueID [Possible Matches],'Ln=' + t1.LastName + ' Fn=' + + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM #T t1
INNER JOIN #T t2 ON t1.Birthday=t2.Birthday
AND t1.FirstName=t2.FirstName
AND t1.LastName=t2.LastName
AND t1.UniqueID<>t2.UniqueID
INSERT INTO #PossibleMatches
SELECT t1.UniqueID [UniqueID],t2.UniqueID [Possible Matches],'Fn=' + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM #T t1
INNER JOIN #T t2 ON t1.Birthday=t2.Birthday
AND t1.FirstName=t2.FirstName
AND t1.UniqueID<>t2.UniqueID
INSERT INTO #PossibleMatches
SELECT t1.UniqueID,t2.UniqueID,'Ln=' + t1.LastName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM #T t1
INNER JOIN #T t2 ON t1.Birthday=t2.Birthday
AND t1.LastName=t2.LastName
AND t1.UniqueID<>t2.UniqueID
INSERT INTO #PossibleMatches
SELECT t1.UniqueID,pm.UniqueID,'Ln=' + t1.LastName + ' Fn=' + + t1.FirstName + ' DoB=' + CONVERT(VARCHAR,t1.Birthday,102) [SynKey]
FROM #T t1
LEFT JOIN #PossibleMatches pm on pm.UniqueID=t1.UniqueID
WHERE pm.UniqueID IS NULL
SELECT *
FROM #PossibleMatches
ORDER BY UniqueID,[PossibleMatch]
I think this will work for you
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
FIRST_VALUE(CC.FirstName+CC.LastName+CC.Birthday)
OVER (PARTITION BY C.UniqueID ORDER BY CC.UniqueID) as SyntheticID
FROM
[dbo].AllContracts AS C INNER JOIN
[dbo].AllContracts AS CC ON
C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE
C.UniqueID NOT IN(
SELECT UniqueID FROM [dbo].DefinitiveMatches)
AND C.AssociatedUser IS NULL
You can try this:
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
FIRST_VALUE(CC.FirstName+CC.LastName+CC.Birthday)
OVER (PARTITION BY C.UniqueID ORDER BY CC.UniqueID) as SyntheticID
FROM
[dbo].AllContracts AS C
INNER JOIN
[dbo].AllContracts AS CC
ON
C.SecondaryMatchCodeFB = CC.SecondaryMatchCodeFB
OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeLB
OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeBB
OR
C.SecondaryMatchCodeLB = CC.SecondaryMatchCodeBB
OR
C.SecondaryMatchCodeBB = CC.SecondaryMatchCodeLB
WHERE
C.UniqueID NOT IN
(
SELECT UniqueID FROM [dbo].DefinitiveMatches
)
AND
C.AssociatedUser IS NULL
This will generate one extra row (because we left out C.UniqueID <> CC.UniqueID) but will give you the good souluton.
Following an example with some example data extracted from your original post. The idea: Generate all SyntheticID in a CTE, query all records with a "PossibleMatch" and Union it with all records which are not yet included:
DECLARE #t TABLE(
UniqueID int
,FirstName nvarchar(20)
,LastName nvarchar(20)
,Birthday datetime
)
INSERT INTO #t VALUES (1, 'Peter', 'Smith', '1980-11-04');
INSERT INTO #t VALUES (2, 'Peter', 'Gray', '1980-11-04');
INSERT INTO #t VALUES (3, 'Peter', 'Gray-Smith', '1980-11-04');
INSERT INTO #t VALUES (4, 'Frank', 'May', '1985-06-09');
INSERT INTO #t VALUES (5, 'Frank-Paul', 'May', '1985-06-09');
INSERT INTO #t VALUES (6, 'Gina', 'Ericson', '1950-11-04');
WITH ctePrep AS(
SELECT UniqueID, FirstName, LastName, BirthDay,
ROW_NUMBER() OVER (PARTITION BY FirstName, BirthDay ORDER BY FirstName, BirthDay) AS k,
FirstName+LastName+CONVERT(nvarchar(10), Birthday, 126) AS SyntheticID
FROM #t
),
cteKeys AS(
SELECT FirstName, BirthDay, SyntheticID
FROM ctePrep
WHERE k = 1
),
cteFiltered AS(
SELECT
C.UniqueID,
CC.UniqueID AS PossiblyMatchingContracts,
keys.SyntheticID
FROM #t AS C
JOIN #t AS CC ON C.FirstName = CC.FirstName
AND C.Birthday = CC.Birthday
JOIN cteKeys AS keys ON keys.FirstName = c.FirstName
AND keys.Birthday = c.Birthday
WHERE C.UniqueID <> CC.UniqueID
)
SELECT UniqueID, PossiblyMatchingContracts, SyntheticID
FROM cteFiltered
UNION ALL
SELECT UniqueID, NULL, FirstName+LastName+CONVERT(nvarchar(10), Birthday, 126) AS SyntheticID
FROM #t
WHERE UniqueID NOT IN (SELECT UniqueID FROM cteFiltered)
Hope this helps. The result looked OK to me:
UniqueID PossiblyMatchingContracts SyntheticID
---------------------------------------------------------------
2 1 PeterSmith1980-11-04
3 1 PeterSmith1980-11-04
1 2 PeterSmith1980-11-04
3 2 PeterSmith1980-11-04
1 3 PeterSmith1980-11-04
2 3 PeterSmith1980-11-04
4 NULL FrankMay1985-06-09
5 NULL Frank-PaulMay1985-06-09
6 NULL GinaEricson1950-11-04
Tested in SSMS, it works perfect. :)
--create table structure
create table #temp
(
uniqueID int,
firstname varchar(15),
lastname varchar(15),
birthday date
)
--insert data into the table
insert #temp
select 1, 'peter','smith','1980-11-04'
union all
select 2, 'peter','gray','1980-11-04'
union all
select 3, 'peter','gray-smith','1980-11-04'
union all
select 4, 'frank','may','1985-06-09'
union all
select 5, 'frank-paul','may','1985-06-09'
union all
select 6, 'gina','ericson','1950-11-04'
select * from #temp
--solution is as below
select ab.uniqueID
, PossiblyMatchingContracts
, c.firstname+c.lastname+cast(c.birthday as varchar) as synID
from
(
select a.uniqueID
, case
when a.uniqueID < min(b.uniqueID)over(partition by a.uniqueid)
then a.uniqueID
else min(b.uniqueID)over(partition by a.uniqueid)
end as SmallestID
, b.uniqueID as PossiblyMatchingContracts
from #temp a
left join #temp b
on (a.firstname = b.firstname OR a.lastname = b.lastname) AND a.birthday = b.birthday AND a.uniqueid <> b.uniqueID
) as ab
left join #temp c
on ab.SmallestID = c.uniqueID
Result capture is attached below:
Say we have following table (a VIEW in your case):
UniqueID PossiblyMatchingContracts SyntheticID
1 2 G1
1 3 G2
2 1 G3
2 3 G4
3 1 G4
3 4 G6
4 5 G7
5 4 G8
6 NULL G9
In your case you can set initial SyntheticID as a string like PeterSmith1980-11-04 using UniqueID for each line. Here is a recursive CTE query it divides all lines to unconnected groups and select MAX(SyntheticId) in the current group as a new SyntheticID for all lines in this group.
WITH CTE AS
(
SELECT CAST(','+CAST(UniqueID AS Varchar(100)) +','+ CAST(PossiblyMatchingContracts as Varchar(100))+',' as Varchar(MAX)) as GroupCont,
SyntheticID
FROM PossiblyMatchingContracts
UNION ALL
SELECT CAST(GroupCont+CAST(UniqueID AS Varchar(100)) +','+ CAST(PossiblyMatchingContracts as Varchar(100))+',' AS Varchar(MAX)) as GroupCont,
pm.SyntheticID
FROM CTE
JOIN PossiblyMatchingContracts as pm
ON
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
OR
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
)
AND NOT
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
AND
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
)
)
SELECT pm.UniqueID,
pm.PossiblyMatchingContracts,
ISNULL(
(SELECT MAX(SyntheticID) FROM CTE WHERE
(
CTE.GroupCont LIKE '%,'+CAST(pm.UniqueID AS Varchar(100))+',%'
OR
CTE.GroupCont LIKE '%,'+CAST(pm.PossiblyMatchingContracts AS Varchar(100))+',%'
))
,pm.SyntheticID) as SyntheticID
FROM PossiblyMatchingContracts pm

How to Select data from SQL Server 2000 using distinct clause on ONLY 1 column

I have a query where I'm trying to select some customer information: name, address, city, state, and zip. I'd like to pull all information and only pull of the records if there is a dupe.
Example of data:
Invoice_Date First Last Addr City State Zip
11/11/14 Jim Jones 12 Cedar alkdjf TN 29430
11/11/15 Ralph Jones 12 Cedar alkdjf TN 29430
11/11/14 Robert Smith 15 block slkjdd TX 10932
What I want to return:
Invoice_Date First Last Addr City State Zip
11/11/15 Ralph Jones 12 Cedar alkdjf TN 29430 (newest Record)
11/11/14 Robert Smith 15 block slkjdd TX 10932
This is my query that is able to pull ALL customers for the specified dates:
SELECT
Invoice_Tb.Invoice_Date, Invoice_Tb.Customer_First_Name,
Invoice_Tb.Customer_Last_Name,
Invoice_Tb.Customer_Address, Invoice_Tb.City,
Invoice_Tb.Customer_State, Invoice_Tb.ZIP_Code
FROM
Invoice_Tb
LEFT OUTER JOIN
Invoice_Detail_Tb ON Invoice_Tb.Store_Number = Invoice_Detail_Tb.Store_Number
AND Invoice_Tb.Invoice_Number = Invoice_Detail_Tb.Invoice_Number
AND Invoice_Tb.Invoice_Date = Invoice_Detail_Tb.Invoice_Date
WHERE
(Invoice_Tb.Invoice_Date IN ('11/11/14', '11/11/15'))
AND (Invoice_Detail_Tb.Invoice_Detail_Code = 'FSV')
AND (LEN(Invoice_Tb.Customer_Address) > 4)
ORDER BY
Invoice_Tb.Customer_Address
Now, obviously I can't use Row_Number here, because it's not an option in SQL Server 2000, so that's out.
I've tried Select Distinct - but I'm in need of the other information (first name, last name, etc), and when using select Distinct, it also returns distinct records for First and Last name. I only want 1 record, per address.
How can I return 1 row, for each distinct Address while including first name last name of the MOST recent visit, in this case - 11/11/15.
create table #SomeTest (Invoice_Number Int, InvoiceDt DateTime, FName Varchar(24), LName Varchar(24), Addr Varchar(24), City Varchar(24), St Varchar(2), Zip Varchar(12) )
insert into #SomeTest (Invoice_Number, InvoiceDt, FName, LName, Addr, City, St, Zip) values (1, '11/11/14', 'Jim','Jones', '12 Cedar', 'alkdjf', 'TN', '29430')
insert into #SomeTest (Invoice_Number, InvoiceDt, FName, LName, Addr, City, St, Zip) values (2, '11/11/15', 'Ralph','Jones', '12 Cedar', 'alkdjf', 'TN', '29430')
insert into #SomeTest (Invoice_Number, InvoiceDt, FName, LName, Addr, City, St, Zip) values (3, '11/11/14', 'Robert','Smith', '15 block', 'slkjdd', 'TX', '10932')
select * from #SomeTest
where Invoice_Number in
(
select Invoice_Number from
(select Invoice_Number = max(Invoice_Number), SupperAddy = Addr + '#' + City + '#' + St + '#' + Zip from #SomeTest
group by Addr + '#' + City + '#' + St + '#' + Zip) X
)
If you don't want to use any analytic function (like row_number) then you could do something like this:
--This gives you the most recent date for an address
Select max(invoice_Date) as Invoice_date
Invoice_Tb.Customer_Address
, Invoice_Tb.City
, Invoice_Tb.Customer_State
, Invoice_Tb.ZIP_Code
into #tmp1
from Invoice_Tb
group by Invoice_Tb.Customer_Address
, Invoice_Tb.City
, Invoice_Tb.Customer_State
, Invoice_Tb.ZIP_Code
GO
--link back to the name for the most recent address
Select a.Invoice_date
,b.Customer_First_Name as [First]
,b.Customer_Last_Name as [Last]
, a.Invoice_Tb.Customer_Address as [Addr]
, a.Invoice_Tb.City
, a.Invoice_Tb.Customer_State as [State]
, a.Invoice_Tb.ZIP_Code as [Zip]
from #tmp1 a
left join Invoice_T b on
a.Invoicedate = b.Invoicedate
a.Customer_Address = b.Customer_Address
a.City = b.City
a.Customer_State = b.Customer_State
a.ZIP_Code = b.ZIP_Code
GO
Here's the "standard" way I always did this in my MsSql 2000 days. In your case, a subquery in the WHERE clause would also work, but I will use the INNER JOIN version of this solution. I employed a little bit of pseudo-code to cut down on my typing. You should be able to figure it out:
SELECT
t1.Invoice_Date, t1.Customer_First_Name,
t1.Customer_Last_Name,
t1.Customer_Address, t1.City,
t1.Customer_State, t1.ZIP_Code
FROM
Invoice_Tb t1
INNER JOIN Invoice_Tb t2
ON t1.Invoice_Number = (
SELECT TOP 1 t2.Invoice_Number
WHERE t2.Customer_Address=t1.Customer_Address
AND {t2.City,State,Zip=t1.City,State,Zip} --psuedocode
ORDER BY t2.Invoice_Date DESC
)
LEFT OUTER JOIN
Invoice_Detail_Tb ON ...
WHERE
...
ORDER BY
Invoice_Tb.Customer_Address
Also note that if any of the "address" fields can be NULL, you will have to handle that possibility when comparing them in the subquery.
you can do something like this query:
SELECT
i.last,
MAX(invoice_date) AS InvoiceDAte,
MAX(first) AS First,
MAX(addr) AS addr,
MAX(zip) AS zip
FROM invoice i
INNER JOIN (SELECT
last,
MAX(invoice_date) AS MaxInvoiceDate
FROM invoice
GROUP BY last) a
ON a.last = i.last
AND a.MaxInvoiceDate = i.Invoice_date
GROUP BY i.last
Here max(invoice_date) can be multiple in that case we require to take top 1 on that