Merge a two way relation in the same table in SQL Server - sql

Current Data
ID | Name1 | Name2
<guid1> | XMind | MindNode
<guid2> | MindNode | XMind
<guid3> | avast | Hitman Pro
<guid4> | Hitman Pro | avast
<guid5> | PPLive | Hola!
<guid6> | ZenMate | Hola!
<guid7> | Hola! | PPLive
<guid8> | Hola! | ZenMate
Required Output
ID1 | ID2 | Name1 | Name2
<guid1> | <guid2> | XMind | MindNode
<guid3> | <guid4> | avast | Hitman Pro
<guid5> | <guid7> | PPLive | Hola!
<guid6> | <guid8> | Hola! | ZenMate
These are relations between apps. I want to show that Avast and Hitman has a relation but in this view i do not need to show in what "direction" they have an relation. It's a given in this view that the relation goes both ways.
EDIT: Seems like my example was to simple. The solution doesn't work with more data.
DECLARE #a TABLE (ID INT, Name1 VARCHAR(50), Name2 VARCHAR(50))
INSERT INTO #a VALUES ( 1, 'XMind', 'MindNode' )
INSERT INTO #a VALUES ( 2, 'MindNode', 'XMind' )
INSERT INTO #a VALUES ( 3, 'avast', 'Hitman Pro' )
INSERT INTO #a VALUES ( 4, 'Hitman Pro', 'avast' )
INSERT INTO #a VALUES ( 5, 'PPLive Video Accelerator', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( 6, 'ZenMate', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( 7, 'Hola! Better Internet', 'PPLive Video Accelerator' )
INSERT INTO #a VALUES ( 8, 'Hola! Better Internet', 'ZenMate' )
SELECT a1.ID AS ID1 ,
a2.ID AS ID2 ,
a1.Name1 ,
a2.Name1 AS Name2
FROM #a a1
JOIN #a a2 ON a1.Name1 = a2.Name2
AND a1.ID < a2.ID -- avoid duplicates
This works however so i guess it's the Guid that is messing with me.
EDIT AGAIN:
I haven't looked at this for a while and i thought it worked but i just realized it does not. I've struggled all morning with this but i must admit that SQL is not really my strong suite. The thing is this.
DECLARE #a TABLE (ID int, Name1 VARCHAR(50), Name2 VARCHAR(50))
INSERT INTO #a VALUES ( 1, 'XMind', 'MindNode' )
INSERT INTO #a VALUES ( 2, 'MindNode', 'XMind' )
INSERT INTO #a VALUES ( 3, 'avast', 'Hitman Pro' )
INSERT INTO #a VALUES ( 4, 'PPLive Video Accelerator', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( 5, 'ZenMate', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( 6, 'Hitman Pro', 'avast' )
INSERT INTO #a VALUES ( 7, 'Hola! Better Internet', 'PPLive Video Accelerator' )
INSERT INTO #a VALUES ( 8, 'Hola! Better Internet', 'ZenMate' )
INSERT INTO #a VALUES ( 9, 'XX', 'A' )
INSERT INTO #a VALUES ( 10, 'XX', 'BB' )
INSERT INTO #a VALUES ( 11, 'BB', 'XX' )
INSERT INTO #a VALUES ( 12, 'A', 'XX' )
INSERT INTO #a VALUES ( 13, 'XX', 'CC' )
INSERT INTO #a VALUES ( 14, 'CC', 'XX' )
;With CTE as
(
SELECT a1.ID AS ID1 ,
a2.ID AS ID2 ,
a1.Name1 ,
a2.Name1 AS Name2,
CheckSum(Case when a1.Name1>a2.Name1 then a2.Name1+a1.Name1 else a1.Name1+a2.Name1 end) ck, -- just for display
Row_Number() over (Partition by CheckSum(Case when a1.Name1>a2.Name1 then a2.Name1+a1.Name1 else a1.Name1+a2.Name1 end)
order by CheckSum(Case when a1.Name1>a2.Name1 then a2.Name1+a1.Name1 else a1.Name1+a2.Name1 end)) as rn
FROM #a a1
JOIN #a a2 ON a1.Name1 = a2.Name2
)
Select ID1, ID2,Name1, Name2
from CTE C1
where rn=1
When i use this code it sure works fine with the names but it doesn't match the ID's correctly.
The result is
ID1 | ID2 | Name1 | Name2
12 | 9 | A | X (Correct)
7 | 5 | Hola! | ZenMate (Not Correct)
[..]
I've pulled my hair all morning but i can't figure this out. I still use Guid's as ID's and just use Int's here to make it a bit more readable.

DECLARE #a TABLE (ID INT, Name1 VARCHAR(50), Name2 VARCHAR(50))
INSERT INTO #a VALUES ( 1, 'XMind', 'MindNode' )
INSERT INTO #a VALUES ( 2, 'MindNode', 'XMind' )
INSERT INTO #a VALUES ( 3, 'avast', 'Hitman Pro' )
INSERT INTO #a VALUES ( 4, 'Hitman Pro', 'avast' )
SELECT a1.ID AS ID1 ,
a2.ID AS ID2 ,
a1.Name1 ,
a2.Name1 AS Name2
FROM #a a1
JOIN #a a2 ON a1.Name1 = a2.Name2
AND a1.ID < a2.ID -- avoid duplicates
Referring to the amendment and extension of your question, a more complicated solution is required.
We form a CHECKSUM on a1.Name1,a2.Name (to get an identical we exchanged on size).
Using this we generate with ROW_NUMBER (Transact-SQL) a number and use only rows from the result with number 1.
DECLARE #a TABLE (ID uniqueIdentifier, Name1 VARCHAR(50), Name2 VARCHAR(50))
INSERT INTO #a VALUES ( NewID(), 'XMind', 'MindNode' )
INSERT INTO #a VALUES ( NewID(), 'MindNode', 'XMind' )
INSERT INTO #a VALUES ( NewID(), 'avast', 'Hitman Pro' )
INSERT INTO #a VALUES ( NewID(), 'Hitman Pro', 'avast' )
INSERT INTO #a VALUES ( NewID(), 'PPLive Video Accelerator', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( NewID(), 'ZenMate', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( NewID(), 'Hola! Better Internet', 'PPLive Video Accelerator' )
INSERT INTO #a VALUES ( NewID(), 'Hola! Better Internet', 'ZenMate' )
INSERT INTO #a VALUES ( NewID(), 'XX', 'A' )
INSERT INTO #a VALUES ( NewID(), 'A', 'XX' )
INSERT INTO #a VALUES ( NewID(), 'XX', 'BB' )
INSERT INTO #a VALUES ( NewID(), 'BB', 'XX' )
INSERT INTO #a VALUES ( NewID(), 'XX', 'CC' )
INSERT INTO #a VALUES ( NewID(), 'CC', 'XX' )
;With CTE as
(
SELECT a1.ID AS ID1 ,
a2.ID AS ID2 ,
a1.Name1 ,
a2.Name1 AS Name2,
CheckSum(Case when a1.Name1>a2.Name1 then a2.Name1+a1.Name1 else a1.Name1+a2.Name1 end) ck, -- just for display
Row_Number() over (Partition by CheckSum(Case when a1.Name1>a2.Name1 then a2.Name1+a1.Name1 else a1.Name1+a2.Name1 end)
order by CheckSum(Case when a1.Name1>a2.Name1 then a2.Name1+a1.Name1 else a1.Name1+a2.Name1 end)) as rn
FROM #a a1
JOIN #a a2 ON a1.Name1 = a2.Name2
)
Select *
from CTE C1
where rn=1
Edit:
If you only want to get those where both fields are fitting the needed query would simply be:
SELECT a1.ID AS ID1 , a2.ID AS ID2 , a1.Name1 , a2.Name1 AS Name2
FROM #a a1
JOIN #a a2 ON a1.Name1 = a2.Name2 and a1.Name2 = a2.Name1 AND a1.ID < a2.ID

If the output should contain only two-way relations ('XX' + 'A') AND ('A' + 'XX'), try this:
;
WITH m (ID1, ID2, Name1, Name2) AS (
SELECT ID1, ID2, Name1, Name2
FROM (
SELECT a1.ID AS ID1
,a2.ID AS ID2
,a1.Name1 AS Name1
,a2.Name1 AS Name2
,ROW_NUMBER() OVER (PARTITION BY a1.Name1, a2.Name1 ORDER BY (SELECT 1)) AS n
FROM #a AS a1
JOIN #a AS a2
ON a1.Name1 = a2.Name2
AND a1.Name2 = a2.Name1
) AS T
WHERE n = 1
)
SELECT DISTINCT *
FROM (
SELECT ID1, ID2, Name1, Name2
FROM m
WHERE ID1 <= ID2
UNION ALL
SELECT ID2, ID1, Name2, Name1
FROM m
WHERE ID1 > ID2
) AS dm
It produces the output as follows:
+------+-----+--------------------------+-----------------------+
| ID1 | ID2 | Name1 | Name2 |
+------+-----+--------------------------+-----------------------+
| 1 | 2 | XMind | MindNode |
| 3 | 6 | avast | Hitman Pro |
| 4 | 7 | PPLive Video Accelerator | Hola! Better Internet |
| 5 | 8 | ZenMate | Hola! Better Internet |
| 9 | 12 | XX | A |
| 10 | 11 | XX | BB |
| 13 | 14 | XX | CC |
+------+-----+--------------------------+-----------------------+

Just rank your rows with ROW_NUMBER function and use this rank in join instead of original ID column:
DECLARE #a TABLE (ID UNIQUEIDENTIFIER, Name1 VARCHAR(50), Name2 VARCHAR(50))
INSERT INTO #a VALUES ( NEWID(), 'XMind', 'MindNode' )
INSERT INTO #a VALUES ( NEWID(), 'MindNode', 'XMind' )
INSERT INTO #a VALUES ( NEWID(), 'avast', 'Hitman Pro' )
INSERT INTO #a VALUES ( NEWID(), 'Hitman Pro', 'avast' )
INSERT INTO #a VALUES ( NEWID(), 'PPLive Video Accelerator', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( NEWID(), 'ZenMate', 'Hola! Better Internet' )
INSERT INTO #a VALUES ( NEWID(), 'Hola! Better Internet', 'PPLive Video Accelerator' )
INSERT INTO #a VALUES ( NEWID(), 'Hola! Better Internet', 'ZenMate' )
;WITH cte AS(SELECT *, ROW_NUMBER() OVER (ORDER BY (SELECT 1)) rn FROM #a)
SELECT a1.ID AS ID1 ,
a2.ID AS ID2 ,
a1.Name1 ,
a2.Name1 AS Name2
FROM cte a1
JOIN cte a2 ON a1.Name1 = a2.Name2 AND
a2.Name1 = a1.Name2 AND
a1.rn < a2.rn
Output:
ID1 ID2 Name1 Name2
Guid Guid XMind MindNode
Guid Guid avast Hitman Pro
Guid Guid PPLive Video Accelerator Hola! Better Internet
Guid Guid ZenMate Hola! Better Internet

I suggest you to use this simple way:
SELECT
t2.ID, t3.ID ID2,
t1.Name1,t1.Name2
FROM (
SELECT DISTINCT
CASE WHEN Name1 <= Name2 THEN Name1 ELSE Name2 END AS Name1,
CASE WHEN Name1 <= Name2 THEN Name2 ELSE Name1 END AS Name2
FROM
#a) t1
JOIN
#a t2 ON t1.Name1+t1.Name2 = t2.Name1+t2.Name2
JOIN
#a t3 ON t1.Name1+t1.Name2 = t3.Name2+t3.Name1
For this:
ID | ID2 | Name1 | Name2
----+-----+-----------------------+---------------------------
12 | 9 | A | XX
3 | 4 | avast | Hitman Pro
11 | 10 | BB | XX
14 | 13 | CC | XX
7 | 5 | Hola! Better Internet | PPLive Video Accelerator
8 | 6 | Hola! Better Internet | ZenMate
2 | 1 | MindNode | XMind

You can solve this using a CROSS APPLY
SELECT a2.ID ID_1,a1.ID ID_2, a2.Name1 , a2.Name2
FROM #a a1
CROSS APPLY
(
SELECT ID, Name2, Name1
FROM #a aa
WHERE aa.Name1 = a1.Name2 AND a1.Name1 = aa.Name2 AND a1.ID > aa.ID
) a2

You can try also:
select min(ID) ID1,
max(ID) ID2,
Name1,
Name2
from ( -- Here I get all the IDs and each couple sorted
-- Change > to < if you don't like the order
select ID,
case
when Name1 > Name2 then Name1
else Name2
end Name1,
case
when Name1 > Name2 then Name2
else Name1
end Name2
from table1
) as t
group by Name1,
Name2
You can even tansform this in a simgle query, without the inner one, but I think in this way it's more readable and you can understand better my approach.

Related

Get exact match between two table

I have a issue that is hard to explain. I have two tables
table1: (this is something like shipping table)
ID ShippingId ProductId1 ProductId2
1 100 A A1
2 100 A A2
3 100 A A3
4 100 A A4
5 100 A A5
6 200 B B1
7 200 B B2
8 300 B A1
9 300 B A2
table2: (and this is about relation between ProductId1 and ProductId2)
ID ProductId1 ProductId2
1 A A1
2 A A2
3 A A3
4 A A4
5 A A5
6 B B1
7 B B2
In the case the shipment "100" includes all items of "A" so this should "true"
and the shipments "200" and "300" does not include all parts of their main products. So expected output should be like
ShippingId ProductId1 IsIncludeAll
100 A true
200 B false
300 A true
can you guys help me?
DECLARE #table1 AS TABLE
(
ShippingID INT,
ProductId1 INT,
ProductId2 INT
)
DECLARE #table2 AS TABLE
(
ProductId1 INT,
ProductId2 INT
)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (100,111, 1119)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (100,111, 1118)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (100,111, 1117)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (100,111, 1116)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (100,111, 1115)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (200,222, 2229)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (200,222, 2228)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (300,111, 1117)
INSERT INTO #table1 (ShippingId,ProductId1,ProductId2) VALUES (300,111, 1116)
INSERT INTO #table2 (ProductId1,ProductId2) VALUES ( 111, 1119)
INSERT INTO #table2 (ProductId1,ProductId2) VALUES ( 111, 1118)
INSERT INTO #table2 (ProductId1,ProductId2) VALUES ( 111, 1117)
INSERT INTO #table2 (ProductId1,ProductId2) VALUES ( 111, 1116)
INSERT INTO #table2 (ProductId1,ProductId2) VALUES ( 111, 1115)
INSERT INTO #table2 (ProductId1,ProductId2) VALUES ( 222, 2229)
INSERT INTO #table2 (ProductId1,ProductId2) VALUES ( 222, 2228)
ShippingId ProductId1 IsIncludeAll
100 A true
200 B false
300 A false
A total guess, as the sample DDL and DML don't match the sample data, but perhaps this?
SELECT S.ShippingID,
T2.ProductId1,
CASE COUNT(CASE WHEN T1.ProductId2 IS NULL THEN 1 END) WHEN 0 THEN 'true' ELSE 'false' END AS IsIncludeAll
FROM #table2 T2
CROSS APPLY (SELECT DISTINCT
sq.ShippingID,
sq.ProductId1
FROM #table1 sq
WHERE sq.ProductId1 = T2.ProductId1) S
LEFT JOIN #table1 T1 ON T2.ProductId1 = T2.ProductId1
AND T1.ProductId2 = T2.ProductId2
AND S.ShippingID = T1.ShippingID
GROUP BY S.ShippingID,
T2.ProductId1;
Little confused on your sample data and output. From my thought Check this query and output.
------Step 1. concatenate product1 and product2 with ShippingID wise,product1 wise order by ShippingID,ProductId1,ProductId2-------------------
declare #Shipping as table
(
ShippingID INT,
ProductId1 INT,
ProductDesc varchar(max)
)
insert #Shipping
SELECT Shipping.ShippingID,Shipping.ProductId1,
LEFT(Shipping.prod_desc,Len(Shipping.prod_desc)-1) As prod_desc
FROM
(
SELECT DISTINCT T2.ShippingID, T2.ProductId1,
(
SELECT cast(T1.ProductId1 as varchar(10))+'-' +cast(T1.ProductId2 as varchar(10))+ '|' AS [text()]
FROM dbo.table1 T1
WHERE T1.ShippingID = T2.ShippingID and T1.ProductId1=T2.ProductId1
ORDER BY T1.ShippingID,ProductId1,ProductId2
FOR XML PATH ('')
) prod_desc
FROM dbo.table1 T2
) Shipping
------Step 2. concatenate product1 and product2 with product1 wise order by ProductId1,ProductId2-------------------
declare #relation as table
(
ProductId1 INT,
ProductDesc varchar(max)
)
insert #relation
SELECT relation.ProductId1,LEFT(relation.prod_desc,Len(relation.prod_desc)-1) As prod_desc
FROM
(
SELECT DISTINCT T2.ProductId1,
(
SELECT cast(T1.ProductId1 as varchar(10))+'-' +cast(T1.ProductId2 as varchar(10))+ '|' AS [text()]
FROM dbo.table2 T1
WHERE T1.ProductId1=T2.ProductId1
ORDER BY ProductId1,ProductId2
FOR XML PATH ('')
) prod_desc
FROM dbo.table2 T2
) relation
------Step 1. use left join to match with concatinated string of every product1. if matches return True otherwise False-------------------
select a.ShippingID,a.ProductId1,case when b.ProductDesc is null then 'False' else 'True' end as IsIncludeAll
from #Shipping a
left join #relation b
on a.ProductId1=b.ProductId1 and a.ProductDesc=b.ProductDesc
-----Result-----------
------------+---------------+--------------
ShippingID | ProductId1 | IsIncludeAll
------------+---------------+--------------
100 | 111 | True
200 | 222 | True
300 | 111 | False

Join with dynamic Table and Column names

I'm trying to join from a table where the tables and fields are defined within the data instead of keys. So here is what I have
Table Root:
ID | Table | Field
---+---------+-----------
1 | Tab1 | Field1
2 | Tab2 | Field2
3 | Tab1 | Field2
4 | Tab3 | Field4
5 | Tab1 | Field1
Tab1
ID | Field1
---+---------
1 | A
2 | B
3 | C
4 | D
Tab2
ID | Field1 |Field2
---+--------+-----------
1 | X | Bla
2 | Y | 123
3 | Z | 456
Tab3 does not exist
I'd like to have a result like that one:
ID | Value
---+---------
1 | A
2 | 123
3 | NULL -- Field does not match
4 | NULL -- Tables does not exist
5 | NULL -- ID does not exist
Basicly trying to join using the the ID trageting a dynamic table and field.
My Starting Point is somehwere around Here, but this is just for a single specific table. I can't figure out how to join dynamicly or if it even possible without dynamic sql like exec.
you could solve this with a case expression and subqueries, like this example
declare #root table (id int, [table] varchar(10), Field varchar(10))
declare #tab1 table (id int, Field1 varchar(10))
declare #tab2 table (id int, Field1 varchar(10), Field2 varchar(10))
insert into #root (id, [table], Field)
values (1, 'Tab1', 'Field1'), (2, 'Tab2', 'Field2'), (3, 'Tab1', 'Field2'), (4, 'Tab3', 'Field4'), (5, 'Tab1', 'Field1')
insert into #tab1 (id, Field1)
values (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D')
insert into #tab2 (id, Field1, Field2)
values (1, 'X', 'Bla'), (2, 'Y', '123'), (3, 'Z', '456')
select r.id,
case when r.[Table] = 'Tab1' and r.Field = 'Field1' then (select t1.Field1 from #tab1 t1 where t1.ID = r.ID)
when r.[Table] = 'Tab2' and r.Field = 'Field1' then (select t2.Field1 from #tab2 t2 where t2.id = r.id)
when r.[Table] = 'Tab2' and r.Field = 'Field2' then (select t2.Field2 from #tab2 t2 where t2.id = r.id)
end as Value
from #root r
the result is
id Value
-- -------
1 A
2 123
3 null
4 null
5 null

Combining select and select union in SQL Server

I have 2 tables
Table A with 3 columns : ID, NAME, VALUE 1
Table B with 3 columns : ID, NAME, VALUE 2
Table A :
ID : A1, A2
NAME: AAA, BBB
VALUE 1 : 1000,2000
Table B :
ID : B1, B2
NAME: CCC,DDD
VALUE 1 : 3000,4000
And I want to show the result like this :
ID, NAME, VALUE 1, VALUE 2
A1 AAA 1000
A2 BBB 2000
A3 CCC 3000
A4 DDD 4000
I have tried union and it works for id, name column. What about regular select + union select, is it possible ?
You can use union all:
select id, name, value1, null as value2
from a
union all
select id, name, null as value1, value2
from b;
online demonstration
MS SQL Server 2017 Schema Setup:
CREATE TABLE Table1
([ID] varchar(2), [NAME] varchar(3), [VALUE1] int)
;
INSERT INTO Table1
([ID], [NAME], [VALUE1])
VALUES
('A1', 'AAA', 1000),
('A2', 'BBB', 2000)
;
CREATE TABLE Table2
([ID] varchar(2), [NAME] varchar(3), [VALUE2] int)
;
INSERT INTO Table2
([ID], [NAME], [VALUE2])
VALUES
('B1', 'CCC', 3000),
('B2', 'DDD', 4000)
;
Query 1:
select id, name, value1, null as value2
from table1
union all
select id, name, null as value1, value2
from table2
Results:
| id | name | value1 | value2 |
|----|------|--------|--------|
| A1 | AAA | 1000 | (null) |
| A2 | BBB | 2000 | (null) |
| B1 | CCC | (null) | 3000 |
| B2 | DDD | (null) | 4000 |
You can also use Full Outer Join
SELECT a.id, a.name, a.value1, b.value2
FROM tableA as a
FULL OUTER JOIN tableB AS b ON
a.id = b.id

SQL: Deleting row which values already exist

I have a table that look like this:
ID | DATE | NAME | VALUE_1 | VALUE_2
1 | 27.11.2015 | Homer | A | B
2 | 27.11.2015 | Bart | C | B
3 | 28.11.2015 | Homer | A | C
4 | 28.11.2015 | Maggie | C | B
5 | 28.11.2015 | Bart | C | B
I currently delete duplicate rows (thank to this thread) using this code :
WITH cte AS
(SELECT ROW_NUMBER() OVER (PARTITION BY [VALUE_1], [VALUE_2]
ORDER BY [DATE] DESC) RN
FROM [MY_TABLE])
DELETE FROM cte
WHERE RN > 1
But this code don't delete exactly the lines I want. I would like to delete only rows which values already exist so in my example I would like to delete only line 5 because line 2 have the same values and is older.
Code to create my table and insert values:
CREATE TABLE [t_diff_values]
([id] INT IDENTITY NOT NULL PRIMARY KEY,
[date] DATETIME NOT NULL,
[name] VARCHAR(255) NOT NULL DEFAULT '',
[val1] CHAR(1) NOT NULL DEFAULT '',
[val2] CHAR(1) NOT NULL DEFAULT '');
INSERT INTO [t_diff_values] ([date], [name], [val1], [val2]) VALUES
('2015-11-27','Homer', 'A','B'),
('2015-11-27','Bart', 'C','B'),
('2015-11-28','Homer', 'A','C'),
('2015-11-28','Maggie', 'C','B'),
('2015-11-28','Bart', 'C','B');
You need to add one more CTE where you will index all islands and then apply your duplicate logic in second CTE:
DECLARE #t TABLE
(
ID INT ,
DATE DATE ,
VALUE_1 CHAR(1) ,
VALUE_2 CHAR(1)
)
INSERT INTO #t
VALUES ( 1, '20151127', 'A', 'B' ),
( 2, '20151128', 'C', 'B' ),
( 3, '20151129', 'A', 'B' ),
( 4, '20151130', 'A', 'B' );
WITH cte1
AS ( SELECT * ,
ROW_NUMBER() OVER ( ORDER BY date)
- ROW_NUMBER() OVER ( PARTITION BY VALUE_1, VALUE_2 ORDER BY DATE) AS gr
FROM #t
),
cte2
AS ( SELECT * ,
ROW_NUMBER() OVER ( PARTITION BY VALUE_1, VALUE_2, gr ORDER BY date) AS rn
FROM cte1
)
DELETE FROM cte2
WHERE rn > 1
SELECT *
FROM #t
Try this
CREATE TABLE [dbo].[Employee](
[ID] INT NOT NULL,
[Date] DateTime NOT NULL,
[VAL1] varchar(20) NOT NULL,
[VAL2] varchar(20) NOT NULL
)
INSERT INTO [dbo].[Employee] VALUES
(1,'2015-11-27 10:44:33.087','A','B')
INSERT INTO [dbo].[Employee] VALUES
(2,'2015-11-28 10:44:33.087','C','B')
INSERT INTO [dbo].[Employee] VALUES
(3,'2015-11-29 10:44:33.087','A','B')
INSERT INTO [dbo].[Employee] VALUES
(4,'2015-11-30 10:44:33.087','A','B')
with cte as(
select
*,
rn = row_number() over(partition by [VAL1], [VAL2]
ORDER BY [DATE] DESC),
cc = count(*) over(partition by [VAL1], [VAL2])
from [Employee]
)
delete
from cte
where
rn > 1 and rn < cc
select * from [Employee]
You could use this query:
WITH cte AS
(
SELECT RN = ROW_NUMBER() OVER (ORDER BY ID)
, *
FROM #data
)
DELETE FROM c1
--SELECT *
FROM CTE c1
INNER JOIN CTE c2 ON c1.RN +1 = c2.RN AND c1.VALUE_1 = c2.VALUE_1 AND c1.VALUE_2 = c2.VALUE_2
Here I order them by ID. If the next one (RN+1) has similar V1 and V2, it is deleted.
Output:
ID DATE VALUE_1 VALUE_2
1 2015-11-27 A B
2 2015-11-28 C B
4 2015-11-30 A B
Data:
declare #data table(ID int, [DATE] date, VALUE_1 char(1), VALUE_2 char(1));
insert into #data(ID, [DATE], VALUE_1, VALUE_2) values
(1, '20151127', 'A', 'B'),
(2, '20151128', 'C', 'B'),
(3, '20151129', 'A', 'B'),
(4, '20151130', 'A', 'B');

How to merge ranges from different tables

Giving the following 2 tables:
T1
------------------
From | To | Value
------------------
10 | 20 | XXX
20 | 30 | YYY
30 | 40 | ZZZ
T2
------------------
From | To | Value
------------------
10 | 15 | AAA
15 | 19 | BBB
19 | 39 | CCC
39 | 40 | DDD
What is the best way to get the result below, using T-SQL on SQL Server 2008?
The From/To ranges are sequential (there are no gaps) and the next From always has the same value as the previous To
Desired result
-------------------------------
From | To | Value1 | Value2
-------------------------------
10 | 15 | XXX | AAA
15 | 19 | XXX | BBB
19 | 20 | XXX | CCC
20 | 30 | YYY | CCC
30 | 39 | ZZZ | CCC
39 | 40 | ZZZ | DDD
First I declare data that looks like the data you posted. Please correct me if any assumptions I have made are wrong. Better would be to post your own declaration in the question so we are all working with the same data.
DECLARE #T1 TABLE (
[From] INT,
[To] INT,
[Value] CHAR(3)
);
INSERT INTO #T1 (
[From],
[To],
[Value]
)
VALUES
(10, 20, 'XXX'),
(20, 30, 'YYY'),
(30, 40, 'ZZZ');
DECLARE #T2 TABLE (
[From] INT,
[To] INT,
[Value] CHAR(3)
);
INSERT INTO #T2 (
[From],
[To],
[Value]
)
VALUES
(10, 15, 'AAA'),
(15, 19, 'BBB'),
(19, 39, 'CCC'),
(39, 40, 'DDD');
Here is my select query to generate your expected result:
SELECT
CASE
WHEN [#T1].[From] > [#T2].[From]
THEN [#T1].[From]
ELSE [#T2].[From]
END AS [From],
CASE
WHEN [#T1].[To] < [#T2].[To]
THEN [#T1].[To]
ELSE [#T2].[To]
END AS [To],
[#T1].[Value],
[#T2].[Value]
FROM #T1
INNER JOIN #T2 ON
(
[#T1].[From] <= [#T2].[From] AND
[#T1].[To] > [#T2].[From]
) OR
(
[#T2].[From] <= [#T1].[From] AND
[#T2].[To] > [#T1].[From]
);
Stealing #isme's data setup, I wrote the following:
;With EPs as (
select [From] as EP from #T1
union
select [To] from #T1
union
select [From] from #T2
union
select [To] from #T2
), OrderedEndpoints as (
select EP,ROW_NUMBER() OVER (ORDER BY EP) as rn from EPs
)
select
oe1.EP,
oe2.EP,
t1.Value,
t2.Value
from
OrderedEndpoints oe1
inner join
OrderedEndpoints oe2
on
oe1.rn = oe2.rn - 1
inner join
#T1 t1
on
oe1.EP < t1.[To] and
oe2.EP > t1.[From]
inner join
#T2 t2
on
oe1.EP < t2.[To] and
oe2.EP > t2.[From]
That is, you create a set containing all of the possible end points of periods (EPs), then you "sort" those and assign each one a row number (OrderedEPs).
Then the final query assembles each "adjacent" pair of rows together, and joins back to the original tables to find which rows from each one overlap the selected range.
The below query finds the smallest ranges, then picks the values back out the tables again:
SELECT ranges.from, ranges.to, T1.Value, T2.Value
FROM (SELECT all_from.from, min(all_to.to) as to
FROM (SELECT T1.FROM
FROM T1
UNION
SELECT T2.FROM
FROM T2) all_from
JOIN (SELECT T1.TO
FROM T1
UNION
SELECT T2.FROM
FROM T2) all_to ON all_from.from < all_to.to
GROUP BY all_from.from) ranges
JOIN T1 ON ranges.from >= T1.from AND ranges.to <= T1.to
JOIN T2 ON ranges.from >= T2.from AND ranges.to <= T2.to
ORDER BY ranges.from
Thanks for the answers, but I ended using a CTE, wgich I think is cleaner.
DECLARE #T1 TABLE ([From] INT, [To] INT, [Value] CHAR(3));
DECLARE #T2 TABLE ([From] INT, [To] INT, [Value] CHAR(3));
INSERT INTO #T1 ( [From], [To], [Value]) VALUES (10, 20, 'XXX'), (20, 30, 'YYY'), (30, 40, 'ZZZ');
INSERT INTO #T2 ( [From], [To], [Value]) VALUES (10, 15, 'AAA'), (15, 19, 'BBB'), (19, 39, 'CCC'), (39, 40, 'DDD');
;with merged1 as
(
select
t1.[From] as from1,
t1.[to] as to1,
t1.Value as Value1,
t2.[From] as from2,
t2.[to] as to2,
t2.Value as Value2
from #t1 t1
inner join #T2 t2
on t1.[From] < t2.[To]
and t1.[To] >= t2.[From]
)
,merged2 as
(
select
case when from2>=from1 then from2 else from1 end as [From]
,case when to2<=to1 then to2 else to1 end as [To]
,value1
,value2
from merged1
)
select * from merged2