concatenate recursive cross join - sql

I need to concatenate the name in a recursive cross join way. I don't know how to do this, I have tried a CTE using WITH RECURSIVE but no success.
I have a table like this:
group_id | name
---------------
13 | A
13 | B
19 | C
19 | D
31 | E
31 | F
31 | G
Desired output:
combinations
------------
ACE
ACF
ACG
ADE
ADF
ADG
BCE
BCF
BCG
BDE
BDF
BDG
Of course, the results should multiply if I add a 4th (or more) group.

Native Postgresql Syntax:
SqlFiddleDemo
WITH RECURSIVE cte1 AS
(
SELECT *, DENSE_RANK() OVER (ORDER BY group_id) AS rn
FROM mytable
),cte2 AS
(
SELECT
CAST(name AS VARCHAR(4000)) AS name,
rn
FROM cte1
WHERE rn = 1
UNION ALL
SELECT
CAST(CONCAT(c2.name,c1.name) AS VARCHAR(4000)) AS name
,c1.rn
FROM cte1 c1
JOIN cte2 c2
ON c1.rn = c2.rn + 1
)
SELECT name as combinations
FROM cte2
WHERE LENGTH(name) = (SELECT MAX(rn) FROM cte1)
ORDER BY name;
Before:
I hope if you don't mind that I use SQL Server Syntax:
Sample:
CREATE TABLE #mytable(
ID INTEGER NOT NULL
,TYPE VARCHAR(MAX) NOT NULL
);
INSERT INTO #mytable(ID,TYPE) VALUES (13,'A');
INSERT INTO #mytable(ID,TYPE) VALUES (13,'B');
INSERT INTO #mytable(ID,TYPE) VALUES (19,'C');
INSERT INTO #mytable(ID,TYPE) VALUES (19,'D');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'E');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'F');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'G');
Main query:
WITH cte1 AS
(
SELECT *, rn = DENSE_RANK() OVER (ORDER BY ID)
FROM #mytable
),cte2 AS
(
SELECT
TYPE = CAST(TYPE AS VARCHAR(MAX)),
rn
FROM cte1
WHERE rn = 1
UNION ALL
SELECT
[Type] = CAST(CONCAT(c2.TYPE,c1.TYPE) AS VARCHAR(MAX))
,c1.rn
FROM cte1 c1
JOIN cte2 c2
ON c1.rn = c2.rn + 1
)
SELECT *
FROM cte2
WHERE LEN(Type) = (SELECT MAX(rn) FROM cte1)
ORDER BY Type;
LiveDemo
I've assumed that the order of "cross join" is dependent on ascending ID.
cte1 generate DENSE_RANK() because your IDs contain gaps
cte2 recursive part with CONCAT
main query just filter out required length and sort string

The recursive query is a bit simpler in Postgres:
WITH RECURSIVE t AS ( -- to produce gapless group numbers
SELECT dense_rank() OVER (ORDER BY group_id) AS grp, name
FROM tbl
)
, cte AS (
SELECT grp, name
FROM t
WHERE grp = 1
UNION ALL
SELECT t.grp, c.name || t.name
FROM cte c
JOIN t ON t.grp = c.grp + 1
)
SELECT name AS combi
FROM cte
WHERE grp = (SELECT max(grp) FROM t)
ORDER BY 1;
The basic logic is the same as in the SQL Server version provided by #lad2025, I added a couple of minor improvements.
Or you can use a simple version if your maximum number of groups is not too big (can't be very big, really, since the result set grows exponentially). For a maximum of 5 groups:
WITH t AS ( -- to produce gapless group numbers
SELECT dense_rank() OVER (ORDER BY group_id) AS grp, name AS n
FROM tbl
)
SELECT concat(t1.n, t2.n, t3.n, t4.n, t5.n) AS combi
FROM (SELECT n FROM t WHERE grp = 1) t1
LEFT JOIN (SELECT n FROM t WHERE grp = 2) t2 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 3) t3 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 4) t4 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 5) t5 ON true
ORDER BY 1;
Probably faster for few groups. LEFT JOIN .. ON true makes this work even if higher levels are missing. concat() ignores NULL values. Test with EXPLAIN ANALYZE to be sure.
SQL Fiddle showing both.

Related

Frequency of all combinations of values for certain column

I have a dataset in SQL Server 2012 with a column for id and value, like this:
[id] [value]
--------------
A 15
A 11
A 11
B 13
B 15
B 12
C 12
C 13
D 13
D 12
My goal is to get a frequency count of all combinations of [value], with two caveats:
Order doesn't matter, so [11,12,15] is not counted separately from [12,11,15]
Repeated values are counted separately, so [11,11,12,15] is counted separately from [11,12,15]
I'm interested in all combinations, of any length (not just pairs)
So the outcome would look like:
[combo] [frequency]
---------------------
11,11,15 1
12,13,15 1
12,13 2
I've seen answers here involving recursion that answer similar questions but where order counts, and answers here involving self-joins that yield pair-wise combinations. These come close but I'm not quite sure how to adapt for my specific needs.
You can use string_agg():
select vals, count(*) as frequency
from (select string_agg(value, ',') within group (order by value) as vals, id
from t
group by id
) i
group by vals;
SQL Server 2012 doesn't support string_agg() but you can use the XML hack:
select vals, count(*) as frequency
from (select id,
stuff( (select concat(',', value)
from t t2
where t2.id = i.id
for xml path ('')
), 1, 1, ''
) as vals
from (select distinct id from t) i
) i
group by vals;
Your number string is just all the values with the same id in increasing order. So I'm treating the lowest id as a canonical name for the full sequence and all its matches. This spares all the string manipulations though you can expand as necessary.
Just tag each duplicate value with a counter and then look for groups that pair up completely.
with data as (
select id, value,
row_number() over (partition by id, value) as rn
), matches as (
select l.id, r.id as match
from data l full outer join data r on
l.value = r.value and l.rn = r.rn and l.id <= r.id
group by l.id
having count(l.id) = count(*) and count(r.id) = count(*)
)
select id, count(match) as frequency
from matches
group by id;
The logic in the middle query is also easily adaptable for finding subset of values in common.
You can achieve this using CTEs and row_number functions.
DECLARE #table table(id CHAR(1), val int)
insert into #table VALUES
('A',15),
('A',11),
('A',11),
('B',13),
('B',15),
('B',12),
('C',12),
('C',13),
('D',13),
('D',12);
;WITH CTE_rnk as
(
SELECT id,val, row_number() over (partition by id order by val) as rnk
from #table
),
CTE_concat as
(
SELECT id, cast(val as varchar(100)) as val, rnk
from CTE_rnk
where rnk =1
union all
SELECT r.id, cast(concat(c.val,',',r.val) as varchar(100)) as val,r.rnk
from CTE_rnk as r
inner join CTE_concat as c
on r.rnk = c.rnk+1
and r.id = c.id
),
CTE_maxcombo as
(
SELECT id,val, row_number() over(partition by id order by rnk desc) as rnk
from CTE_concat
)
select val as combo, count(*) as frequency
from CTE_maxcombo where rnk = 1
group by val
+----------+-----------+
| combo | frequency |
+----------+-----------+
| 11,11,15 | 1 |
| 12,13 | 2 |
| 12,13,15 | 1 |
+----------+-----------+

unable to swap rows in sql

I have the below table where names will be inserted first. Then I will get IDs where I need to map with names
ID NAME
null Test1
null Test2
1 null
2 null
I need the result like
ID NAME
1 Test1
2 Test2
I tried below query but it doesn't work for me
select t1.ID , t2.Name from table1 T1 join table1 t2 on T1.id = t2.id
According to screen short you are working with SQL Server , you could try cte expression which may help you
;with cte as
(
select max(id) id, row_number() over (order by (select 1)) rn from
(
select *, rank() over(order by id) rnk from table
) a
group by a.rnk
having max(id) is not null
), cte1 as
(
select max(name) name, row_number() over (order by (select 1)) rn from
(
select *, rank() over(order by name) rnk from table
) a
group by a.rnk
having max(name) is not null
)
select c.id, c1.name from cte c
join cte1 c1 on c1.rn = c.rn
Result :
id name
1 test1
2 test2

SQL Self Recursive Join with Grouping

In SQL Server, I have the following table:
Name New_Name
---------------
A B
B C
C D
G H
H I
Z B
I want to create a new table that links all the names that are related to a single new groupID
GroupID Name
-------------
1 A
1 B
1 C
1 D
1 Z
2 G
2 H
2 I
I'm a bit stuck on this can can't figure out how to do it apart from a bunch of joins. But I would like to do it properly.
Edited the question to allow grouping from two different start points, A and Z into one group.
Since you've changed the question, I'm updating the answer. Please note that answer is same in sense of logical structure. All it does differently is that instead of going from G to I in calculating levels, answer now calculates from I to G.
Working demo link
;with cte as
(
select
t1.Name as Name, row_number() over (order by t1.Name) r,
t1.New_Name as New_Name,
1 as level
from yt t1 left join yt t2
on t1.New_Name=t2.name
where t2.name is null
union all
select
yt.Name as Name, r,
yt.New_Name as New_Name,
c.level+1 as level
from cte c join yt
on yt.New_Name=c.Name
),
cte2 as
(
select r as group_id, Name from cte
union
select c1.r as group_id, c1.New_name as Name from cte c1
where level = (select min(level) from cte c2 where c2.r=c1.r)
)
select * from cte2;
Below is old answer.
You can try below CTE based query:
create table yt (Name varchar(10), New_Name varchar(10));
insert into yt values
('A','B'),
('B','C'),
('C','D'),
('G','H'),
('H','I');
;with cte as
(
select
t1.Name as Name, row_number() over (order by t1.Name) r,
t1.New_Name as New_Name,
1 as level
from yt t1 left join yt t2
on t1.Name=t2.New_name
where t2.new_name is null
union all
select
yt.Name as Name, r,
yt.New_Name as New_Name,
c.level+1 as level
from cte c join yt
on yt.Name=c.New_Name
),
cte2 as
(
select r as group_id, Name from cte
union
select c1.r as group_id, c1.New_name as Name from cte c1
where level = (select max(level) from cte c2 where c2.r=c1.r)
)
select * from cte2;
see working demo
Little bit complex but working.
DECLARE #T TABLE (Name VARCHAR(2), New_Name VARCHAR(2))
INSERT INTO #T
VALUES
('A','B'),
('B','C'),
('C','D'),
('G','H'),
('H','I'),
('Z','B')
;WITH CTE AS
(
SELECT * , RN = ROW_NUMBER() OVER(ORDER BY Name) FROM #T
)
,CTE2 AS (SELECT T1.RN, T1.Name Name1, T1.New_Name New_Name1,
X.Name Name2, X.New_Name New_Name2,
FLAG = CASE WHEN X.Name IS NULL THEN 1 ELSE 0 END
FROM CTE T1
OUTER APPLY (SELECT * FROM CTE T2 WHERE T2.RN > T1.RN
AND (T2.Name IN (T1.Name , T1.New_Name)
OR T2.New_Name IN (T1.Name , T1.New_Name)
)) AS X
)
,CTE3 AS (SELECT *,
GroupID = ROW_NUMBER() OVER (ORDER BY RN) -
ROW_NUMBER() OVER (PARTITION BY Flag ORDER BY RN) +1
FROM CTE2
)
SELECT
DISTINCT GroupID, Name
FROM
(SELECT * FROM CTE3 WHERE Name2 IS NOT NULL) SRC
UNPIVOT ( Name FOR COL IN ([Name1], [New_Name1], [Name2], [New_Name2])) UNPVT
Result
GroupID Name
-------------------- ----
1 A
1 B
1 C
1 D
1 Z
2 G
2 H
2 I

SQL Joining table with Min and Sec Min row

I want to join table 1 with table2 twice becuase I need to get the first minimum record and the second minimum. However, I can only think of using a cte to get the second minimum record. Is there a better way to do it?
Here is the table table:
I want to join Member with output table FirstRunID whose Output value is 1 and second RunID whose Output value is 0
current code I am using:
select memid, a.runid as aRunid,b.runid as bRunid
into #temp
from FirstTable m inner join
(select min(RunID), MemID [SecondTable] where ouput=1 group by memid)a on m.memid=a.memid
inner join (select RunID, MemID [SecondTable] where ouput=0 )b on m.memid=a.memid and b.runid>a.runid
with cte as
(
select row_number() over(partition by memid, arunid order by brunid ),* from #temp
)
select * from cte where n=1
You can use outer apply operator for this:
select * from t1
outer apply(select top 1 t2.runid from t2
where t1.memid = t2.memid and t2.output = 1 order by t2.runid) as oa1
outer apply(select top 1 t2.runid from t2
where t1.memid = t2.memid and t2.output = 0 order by t2.runid) as oa2
You can do this with conditional aggregation. Based on your results, you don't need the first table:
select t2.memid,
max(case when output = 1 and seqnum = 1 then runid end) as OutputValue1,
max(case when output = 0 and seqnum = 2 then runid end) as OutputValue2
from (select t2.*,
row_number() over (partition by memid, output order by runid) a seqnum
from t2
) t2
group by t2.memid;
declare #FirstTable table
(memid int, name varchar(20))
insert into #firsttable
values
(1,'John'),
(2,'Victor')
declare #secondtable table
(runid int,memid int,output int)
insert into #secondtable
values
(1,1,0),(1,2,1),(2,1,1),(2,2,1),(3,1,1),(3,2,0),(4,1,0),(4,2,0)
;with cte as
(
SELECT *, row_number() over (partition by memid order by runid) seq --sequence
FROM #SECONDTABLE T
where t.output = 1
union all
SELECT *, row_number() over (partition by memid order by runid) seq --sequence
FROM #SECONDTABLE T
where t.output = 0 and
t.runid > (select min(x.runid) from #secondtable x where x.memid = t.memid and x.output = 1 group by x.memid) --lose any O output record where there is no prior 1 output record
)
select cte1.memid,cte1.runid,cte2.runid from cte cte1
join cte cte2 on cte2.memid = cte1.memid and cte2.seq = cte1.seq
where cte1.seq = 1 --remove this test if you want matched pairs
and cte1.output = 1 and cte2.output = 0

SQL group by if values are close

Class| Value
-------------
A | 1
A | 2
A | 3
A | 10
B | 1
I am not sure whether it is practical to achieve this using SQL.
If the difference of values are less than 5 (or x), then group the rows (of course with the same Class)
Expected result
Class| ValueMin | ValueMax
---------------------------
A | 1 | 3
A | 10 | 10
B | 1 | 1
For fixed intervals, we can easily use "GROUP BY". But now the grouping is based on nearby row's value. So if the values are consecutive or very close, they will be "chained together".
Thank you very much
Assuming MSSQL
You are trying to group things by gaps between values. The easiest way to do this is to use the lag() function to find the gaps:
select class, min(value) as minvalue, max(value) as maxvalue
from (select class, value,
sum(IsNewGroup) over (partition by class order by value) as GroupId
from (select class, value,
(case when lag(value) over (partition by class order by value) > value - 5
then 0 else 1
end) as IsNewGroup
from t
) t
) t
group by class, groupid;
Note that this assumes SQL Server 2012 for the use of lag() and cumulative sum.
Update:
*This answer is incorrect*
Assuming the table you gave is called sd_test, the following query will give you the output you are expecting
In short, we need a way to find what was the value on the previous row. This is determined using a join on row ids. Then create a group to see if the difference is less than 5. and then it is just regular 'Group By'.
If your version of SQL Server supports windowing functions with partitioning the code would be much more readable.
SELECT
A.CLASS
,MIN(A.VALUE) AS MIN_VALUE
,MAX(A.VALUE) AS MAX_VALUE
FROM
(SELECT
ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
,CLASS
,VALUE
FROM SD_TEST) AS A
LEFT JOIN
(SELECT
ROW_NUMBER()OVER(PARTITION BY CLASS ORDER BY VALUE) AS ROW_ID
,CLASS
,VALUE
FROM SD_TEST) AS B
ON A.CLASS = B.CLASS AND A.ROW_ID=B.ROW_ID+1
GROUP BY A.CLASS,CASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END
ORDER BY A.CLASS,cASE WHEN ABS(COALESCE(B.VALUE,0)-A.VALUE)<5 THEN 1 ELSE 0 END DESC
ps: I think the above is ANSI compliant. So should run in most SQL variants. Someone can correct me if it is not.
These give the correct result, using the fact that you must have the same number of group starts as ends and that they will both be in ascending order.
if object_id('tempdb..#temp') is not null drop table #temp
create table #temp (class char(1),Value int);
insert into #temp values ('A',1);
insert into #temp values ('A',2);
insert into #temp values ('A',3);
insert into #temp values ('A',10);
insert into #temp values ('A',13);
insert into #temp values ('A',14);
insert into #temp values ('b',7);
insert into #temp values ('b',8);
insert into #temp values ('b',9);
insert into #temp values ('b',12);
insert into #temp values ('b',22);
insert into #temp values ('b',26);
insert into #temp values ('b',67);
Method 1 Using CTE and row offsets
with cte as
(select distinct class,value,ROW_NUMBER() over ( partition by class order by value ) as R from #temp),
cte2 as
(
select
c1.class
,c1.value
,c2.R as PreviousRec
,c3.r as NextRec
from
cte c1
left join cte c2 on (c1.class = c2.class and c1.R= c2.R+1 and c1.Value < c2.value + 5)
left join cte c3 on (c1.class = c3.class and c1.R= c3.R-1 and c1.Value > c3.value - 5)
)
select
Starts.Class
,Starts.Value as StartValue
,Ends.Value as EndValue
from
(
select
class
,value
,row_number() over ( partition by class order by value ) as GroupNumber
from cte2
where PreviousRec is null) as Starts join
(
select
class
,value
,row_number() over ( partition by class order by value ) as GroupNumber
from cte2
where NextRec is null) as Ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber
** Method 2 Inline views using not exists **
select
Starts.Class
,Starts.Value as StartValue
,Ends.Value as EndValue
from
(
select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
from
(select distinct class,value from #temp) as T
where not exists (select 1 from #temp where class=t.class and Value < t.Value and Value > t.Value -5 )
) Starts join
(
select class,Value ,row_number() over ( partition by class order by value ) as GroupNumber
from
(select distinct class,value from #temp) as T
where not exists (select 1 from #temp where class=t.class and Value > t.Value and Value < t.Value +5 )
) ends on starts.class=ends.class and starts.GroupNumber = ends.GroupNumber
In both methods I use a select distinct to begin because if you have a dulpicate entry at a group start or end things go awry without it.
Here is one way of getting the information you are after:
SELECT Under5.Class,
(
SELECT MIN(m2.Value)
FROM MyTable AS m2
WHERE m2.Value < 5
AND m2.Class = Under5.Class
) AS ValueMin,
(
SELECT MAX(m3.Value)
FROM MyTable AS m3
WHERE m3.Value < 5
AND m3.Class = Under5.Class
) AS ValueMax
FROM
(
SELECT DISTINCT m1.Class
FROM MyTable AS m1
WHERE m1.Value < 5
) AS Under5
UNION
SELECT Over4.Class,
(
SELECT MIN(m4.Value)
FROM MyTable AS m4
WHERE m4.Value >= 5
AND m4.Class = Over4.Class
) AS ValueMin,
(
SELECT Max(m5.Value)
FROM MyTable AS m5
WHERE m5.Value >= 5
AND m5.Class = Over4.Class
) AS ValueMax
FROM
(
SELECT DISTINCT m6.Class
FROM MyTable AS m6
WHERE m6.Value >= 5
) AS Over4