SQL LOGIC using 3conditions - sql

course_completions CC
id coursemodid userid state timemodified
370 23 2 1 1433582890
329 24 89 1 1427771915
333 30 39 1 1428309816
332 32 39 1 1428303307
327 33 40 1 1427689703
328 34 89 1 1427710711
303 35 41 1 1410258482
358 36 99 1 1432020067
365 25 2 1 1433142455
304 26 69 1 1410717866
353 37 95 1 1430387005
416 38 2 1 1438972465
300 27 70 1 1409824001
302 29 74 1 1412055704
297 30 2 1 1409582123
301 133 41 1 1410255923
336 133 91 1 1428398435
364 133 40 1 1433142348
312 133 85 1 1425863621
course_modules CM
id course
23 6
24 6
25 6
26 6
27 6
28 6
29 8
30 8
31 8
32 8
33 8
34 5
35 5
36 5
37 5
38 5
39 9
40 9
41 9
course_mod_settings CMS
id course modinstance
27 8 30
28 8 31
29 8 32
30 8 33
31 6 23
32 6 24
33 6 25
34 6 26
35 6 27
36 6 28
37 9 39
38 9 40
39 9 41
I need the count of each user has Completed modules, Inprocess modules and Notstarted modules for each course, where getting the count of userids from table CC by taking courseia from table CM, get number of modules that an user has completed from each course.
(A course can have morethan one module and a course can have number of users attempted all modules, few modules or not attempted at all).
So, I need number of users - has done number of modules - in a course. (3 logics)
Completed.Users means : If number of modules attempted is equal to number of modinstance from table CMS (ex: no. of modules attempted by a user per course= 9, no.modinstance = 9. Because 7 is not equal to 9, They are completed.)
Inprocess.Users means : Number of modules attempted should be >0, but not equal to [count(modinstance) per course] (ex: no. of modules attempted by a user per course= 7 , no.modinstance = 9. Because 7 is not equal to 9, They are Inprocess.)
Notstarted.Users means : Number of modules attempted should be equal to 0, (ex: no. of modules attempted by a user per course= 0. They are Notstarted).
OUTPUT :
Course No.Completed.Users No.Inprocess.Users No.Notstarted.Users
5 65 32 6
6 40 12 15
8 43 56 0
9 0 7 9
Sir, this is a very critical logic that I was trying, I couldn't get a solution. I hope stackoverflow developers could help me out. I tried with my query :
SELECT cm.course AS "Course",
(CASE WHEN
(SELECT count(cms.id) FROM course_mod_settings cms) =
(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 )
THEN COUNT(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 ) END) AS "No.Completed.Users",
(CASE WHEN
(SELECT count(cms.id) FROM course_mod_settings cms) > 0 AND
(SELECT count(cms.id) FROM course_mod_settings cms) !=
(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 )
THEN COUNT(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 ) END) AS "No.Inprocess.Users",
(CASE WHEN
(SELECT count(cms.id) FROM course_mod_settings cms) = 0
THEN COUNT(SELECT count(cmc.coursemodid) FROM course_completions cc JOIN course_modules cm ON cmc.coursemodid = cm.id WHERE cmc.state=1 ) END) AS "No.Notstarted.Users"
FROM
mdl_course c
GROUP BY c.id

SQL Fiddle
SELECT course AS "Course",
SUM(CASE WHEN completion_count = module_count THEN 1 ELSE 0 END) AS "No.Completed.Users",
SUM(CASE WHEN completion_count > 0 AND completion_count < module_count THEN 1 ELSE 0 END) AS "No.Inprocess.Users",
SUM(CASE WHEN completion_count = 0 THEN 1 ELSE 0 END) AS "No.Notstarted.Users"
FROM (SELECT course, COUNT(*) AS module_count
FROM course_modules cm
GROUP BY course) course_module_counts JOIN
(SELECT cm.course AS courseid, users.id AS userid, SUM(CASE WHEN cc.state = 1 THEN 1 ELSE 0 END) completion_count
FROM ((SELECT DISTINCT userid AS id FROM course_completions) users CROSS JOIN course_modules cm) LEFT JOIN course_completions cc ON users.id = cc.userid AND cc.coursemodid = cm.id
GROUP BY cm.course, users.id) course_completion_counts
ON course_module_counts.course = course_completion_counts.courseid
GROUP BY course
gives this output, which matches the limited dataset that you provided in your question.
| course | No.Completed.Users | No.Inprocess.Users | No.Notstarted.Users |
|--------|--------------------|--------------------|---------------------|
| 5 | 0 | 5 | 7 |
| 6 | 0 | 4 | 8 |
| 8 | 0 | 4 | 8 |
| 9 | 0 | 0 | 12 |

Related

Find duplicates on multiple columns in a SQL

I want to find the batchID who are having subjectID(0,1,2) only, pls help me, thanks in advance
I need answer BatchID=12 and BatchID=51, what can I do in sql
MyTable
uid BatchID SubjectID
6 2 0
7 2 1
8 2 2
9 2 4
10 3 0
11 3 1
12 4 5
13 4 0
14 5 5
15 6 5
17 7 0
18 7 1
19 7 2
26 12 0
27 12 1
28 12 2
29 1 0
30 1 1
31 1 4
62 45 5
63 46 0
64 46 1
65 46 4
107 49 6
108 49 2
109 49 4
110 50 1
111 50 3
116 0 1
117 0 4
118 51 0
119 51 1
120 51 2
You can use conditional aggregation for this:
select batchId
from your_table
group by batchId
having count(distinct case when subjectID in (0,1,2) then subjectID end) = 3
and count(case when subjectID not in (0,1,2) then 1 end) = 0
Explanation:
Group by batchId - Aggregate on batchId
count(distinct case when subjectID in (0,1,2) then subjectID end) - Produces 3 only if all three of them are present for this batchId
count(case when subjectID not in (0,1,2) then 1 end) - Produces 0 if there is no other subjectID except 0,1,2 assuming nulls are not allowed in the subjectId column.
You can use Row_Number()
;with cte as (
select *, RowN = row_number() over (partition by batchid order by uid) from #yourbatch where subjectid in (0,1,2)
) select distinct BatchId from cte where RowN > 1

Oracle - Group By Creating Duplicate Rows

I have a query that looks like this:
select nvl(trim(a.code), 'Blanks') as Ward, count(b.apcasekey) as UNSP, count(c.apcasekey) as GRAPH,
count(d.apcasekey) as "ANI/PIG",
(count(b.apcasekey) + count(c.apcasekey) + count(d.apcasekey)) as "TOTAL ACTIVE",
count(a.apcasekey) as "TOTAL OPEN" from (etc...)
group by a.code
order by Ward
The reason I have nvl(trim(a.code), 'Blanks') as Ward is that sometimes a.code is a blank string, sometimes it's a null.
The problem is that when I use the Group By statement, I can't use Ward or I get the error
Ward: Invalid Identifier
I can only use a.code so I get 2 rows for 'Blanks', as per below
1 Blanks 7 0 0 7 7
2 Blanks 23 1 1 25 30
3 W01 75 4 0 79 91
4 W02 62 1 0 63 72
5 W03 140 2 0 142 162
6 W04 6 1 0 7 7
7 W05 46 0 1 47 48
8 W06 322 46 1 369 425
9 W07 91 0 1 92 108
10 W08 93 2 0 95 104
11 W09 28 1 0 29 30
12 W10 25 0 0 25 28
What I need, is for the row with 'Blanks' to combined into 1 row. Little help?
Thanks.
You can not use the alias in the GROUP BY, but you can use the expression that builds the value:
GROUP BY nvl(trim(a.code), 'Blanks')

How to get sum of the same Unit's

I have a table which contains a college's departments and their units and sub-units.
OrganizationID ParentUnit Unit ChildUnit UnitName
10 1 0 0 Education
12 1 1 0 Sports
24 1 2 0 Mathmatics
28 1 3 0 Science
35 1 3 1 Physics
51 1 4 0 Arts
66 1 4 1 Music
69 1 4 2 Painting
84 8 0 0 Business & Administration
88 8 1 0 Administration
96 8 1 1 Public Administration
107 8 1 2 Local Managements
110 8 2 0 Finance
119 8 2 1 Accounting
124 8 2 2 Marketing
I have another table which contains the student information of that college.
StudentID OrganizationID
1 12
2 12
3 24
5 28
6 35
8 51
9 66
31 69
34 96
45 88
57 96
66 107
69 110
72 69
74 124
I want to get student counts for each unit. If a studutent's Organization is a ChildUnit it should be added to current Unit. If ChildUnit is greater than0 corresponding student count should be added to same Unit For example Physics is a child of Science. Then Science student count should return 2.
My target data table should look like as the following
ParentUnit UnitName StudentCount
------------------------------------------------------
Education Sports 2
Education Mathmatics 1
Education Science 2
Education Arts 4
Business & Administration Administration 4
Business & Administration Finance 2
I have done it in programmatic way. There are many for and if loops. Then I started to think whether it could be done with a smarter sql query.
That doesn't look so difficult. You are looking for the student count per ParentUnit + Unit. Then the name for such a group is the record where the level (the ChildUnit) is zero. You get that record with a CASE construct, then use MIN or MAX, because you need an aggregate function here (there should be exactly one record per group anyhow, so MIN = MAX).
select
min(case when o.childunit = 0 then o.unitname end) as unitname,
count(*) as studentcount
from organization o
inner join student s on s.organizationid = o.organizationid
group by o.parentunit, o.unit;
To include the parent unit name:
select
(
select unitname
from organization po
where po.parentunit = o.parentunit
and po.unit =0
and po.childunit = 0
) as parentunitname,
min(case when o.childunit = 0 then o.unitname end) as unitname,
count(*) as studentcount
from organization o
inner join student s on s.organizationid = o.organizationid
group by o.parentunit, o.unit;
Or:
select
min(po.unitname) as parentunitname,
min(case when o.childunit = 0 then o.unitname end) as unitname,
count(*) as studentcount
from organization o
inner join student s on s.organizationid = o.organizationid
inner join
(
select parentunit, unitname
from organization
where unit = 0 and childunit = 0
) po on po.parentunit = o.parentunit
group by o.parentunit, o.unit;

SQL How to group a table with special condition and join to another table

i have Table PRICE
CodePrice CodeClient TheZone Tempo W_Begin W_End Price
A_0_49 88989 1 1 0 49 20
A_50_99 60000 1 1 50 99 10
B_0_49 88989 2 1 0 49 30
C_0_49 50000 3 4 0 49 40
Table Ordre
NoID CodeClient Cp Agence TheZone Tempo Weight
01 88989 44 bidon 1 1 12
02 60000 49 toto 2 2 10
03 60000 49 bigoi 1 1 56
04 88989 49 titi 3 3 8
05 50000 44 bidon 1 1 5
How can i show the result like this:
CodePrice TheZone Tempo W_Begin W_End SUm_Weight SUM_Price CountIDOrdre CP Agence
A_0_49 1 1 0 49 17 40 2 44 bidon
A_50_99 1 1 50 99 56 10 1 49 bigoi
B_0_49 2 1 0 49 0 0 0 null null
C_0_49 3 4 0 49 0 0 0 null null
in 1 SQL Syntax (1 Step).
Nowdays i make it in 2 step
SELECT CodePrice
,CodeClient
,TheZone
,Tempo
,W_Begin
,W_End
,Price
FROM PRICE
LEFT JOIN T_TARIF_ZONE ON PRICE.ZONE = T_TARIF_ZONE.NO_ID
LEFT JOIN CLIENT ON PRICE.CodeClient= CLIENT.CODE_CL
WHERE CodeClient= #NO_CLIENT AND TheZone = #ZONE AND UNITE = #UNITE
AND (Tempo IN (SELECT ParsedString From dbo.ParseStringList(#ModLiv)))
ORDER BY MONTANT
after i select the price, now i can group the ordre on each price depends on Weight, zone, and another variable
SELECT count(NoID) as LVId
,sum(Weight)as Tot_Poids
,sum(Price) as Tot_Prix
FROM [Ordre]
WHERE Weight>= #TR_DEB AND Weight<= #TR_FIN AND LE_ZONE = #LE_ZONE AND ENLEV_UNITE = #ENLEV_UNITE
AND DATE_CLOTURE >= #Date_Deb AND DATE_CLOTURE <= #Date_Fin AND STATUT_LV = 2 AND FACTURATION = #FACTURATION
HAVING count([NOID]) > 0

Generate all combinations in SQL

I need to generate all combinations of size #k in a given set of size #n. Can someone please review the following SQL and determine first if the following logic is returning the expected results, and second if is there a better way?
/*CREATE FUNCTION dbo.Factorial ( #x int )
RETURNS int
AS
BEGIN
DECLARE #value int
IF #x <= 1
SET #value = 1
ELSE
SET #value = #x * dbo.Factorial( #x - 1 )
RETURN #value
END
GO*/
SET NOCOUNT ON;
DECLARE #k int = 5, #n int;
DECLARE #set table ( [value] varchar(24) );
DECLARE #com table ( [index] int );
INSERT #set VALUES ('1'),('2'),('3'),('4'),('5'),('6');
SELECT #n = COUNT(*) FROM #set;
DECLARE #combinations int = dbo.Factorial(#n) / (dbo.Factorial(#k) * dbo.Factorial(#n - #k));
PRINT CAST(#combinations as varchar(max)) + ' combinations';
DECLARE #index int = 1;
WHILE #index <= #combinations
BEGIN
INSERT #com VALUES (#index)
SET #index = #index + 1
END;
WITH [set] as (
SELECT
[value],
ROW_NUMBER() OVER ( ORDER BY [value] ) as [index]
FROM #set
)
SELECT
[values].[value],
[index].[index] as [combination]
FROM [set] [values]
CROSS JOIN #com [index]
WHERE ([index].[index] + [values].[index] - 1) % (#n) BETWEEN 1 AND #k
ORDER BY
[index].[index];
Returning Combinations
Using a numbers table or number-generating CTE, select 0 through 2^n - 1. Using the bit positions containing 1s in these numbers to indicate the presence or absence of the relative members in the combination, and eliminating those that don't have the correct number of values, you should be able to return a result set with all the combinations you desire.
WITH Nums (Num) AS (
SELECT Num
FROM Numbers
WHERE Num BETWEEN 0 AND POWER(2, #n) - 1
), BaseSet AS (
SELECT ind = Power(2, Row_Number() OVER (ORDER BY Value) - 1), *
FROM #set
), Combos AS (
SELECT
ComboID = N.Num,
S.Value,
Cnt = Count(*) OVER (PARTITION BY N.Num)
FROM
Nums N
INNER JOIN BaseSet S ON N.Num & S.ind <> 0
)
SELECT
ComboID,
Value
FROM Combos
WHERE Cnt = #k
ORDER BY ComboID, Value;
This query performs pretty well, but I thought of a way to optimize it, cribbing from the Nifty Parallel Bit Count to first get the right number of items taken at a time. This performs 3 to 3.5 times faster (both CPU and time):
WITH Nums AS (
SELECT Num, P1 = (Num & 0x55555555) + ((Num / 2) & 0x55555555)
FROM dbo.Numbers
WHERE Num BETWEEN 0 AND POWER(2, #n) - 1
), Nums2 AS (
SELECT Num, P2 = (P1 & 0x33333333) + ((P1 / 4) & 0x33333333)
FROM Nums
), Nums3 AS (
SELECT Num, P3 = (P2 & 0x0f0f0f0f) + ((P2 / 16) & 0x0f0f0f0f)
FROM Nums2
), BaseSet AS (
SELECT ind = Power(2, Row_Number() OVER (ORDER BY Value) - 1), *
FROM #set
)
SELECT
ComboID = N.Num,
S.Value
FROM
Nums3 N
INNER JOIN BaseSet S ON N.Num & S.ind <> 0
WHERE P3 % 255 = #k
ORDER BY ComboID, Value;
I went and read the bit-counting page and think that this could perform better if I don't do the % 255 but go all the way with bit arithmetic. When I get a chance I'll try that and see how it stacks up.
My performance claims are based on the queries run without the ORDER BY clause. For clarity, what this code is doing is counting the number of set 1-bits in Num from the Numbers table. That's because the number is being used as a sort of indexer to choose which elements of the set are in the current combination, so the number of 1-bits will be the same.
I hope you like it!
For the record, this technique of using the bit pattern of integers to select members of a set is what I've coined the "Vertical Cross Join." It effectively results in the cross join of multiple sets of data, where the number of sets & cross joins is arbitrary. Here, the number of sets is the number of items taken at a time.
Actually cross joining in the usual horizontal sense (of adding more columns to the existing list of columns with each join) would look something like this:
SELECT
A.Value,
B.Value,
C.Value
FROM
#Set A
CROSS JOIN #Set B
CROSS JOIN #Set C
WHERE
A.Value = 'A'
AND B.Value = 'B'
AND C.Value = 'C'
My queries above effectively "cross join" as many times as necessary with only one join. The results are unpivoted compared to actual cross joins, sure, but that's a minor matter.
Critique of Your Code
First, may I suggest this change to your Factorial UDF:
ALTER FUNCTION dbo.Factorial (
#x bigint
)
RETURNS bigint
AS
BEGIN
IF #x <= 1 RETURN 1
RETURN #x * dbo.Factorial(#x - 1)
END
Now you can calculate much larger sets of combinations, plus it's more efficient. You might even consider using decimal(38, 0) to allow larger intermediate calculations in your combination calculations.
Second, your given query does not return the correct results. For example, using my test data from the performance testing below, set 1 is the same as set 18. It looks like your query takes a sliding stripe that wraps around: each set is always 5 adjacent members, looking something like this (I pivoted to make it easier to see):
1 ABCDE
2 ABCD Q
3 ABC PQ
4 AB OPQ
5 A NOPQ
6 MNOPQ
7 LMNOP
8 KLMNO
9 JKLMN
10 IJKLM
11 HIJKL
12 GHIJK
13 FGHIJ
14 EFGHI
15 DEFGH
16 CDEFG
17 BCDEF
18 ABCDE
19 ABCD Q
Compare the pattern from my queries:
31 ABCDE
47 ABCD F
55 ABC EF
59 AB DEF
61 A CDEF
62 BCDEF
79 ABCD G
87 ABC E G
91 AB DE G
93 A CDE G
94 BCDE G
103 ABC FG
107 AB D FG
109 A CD FG
110 BCD FG
115 AB EFG
117 A C EFG
118 BC EFG
121 A DEFG
...
Just to drive the bit-pattern -> index of combination thing home for anyone interested, notice that 31 in binary = 11111 and the pattern is ABCDE. 121 in binary is 1111001 and the pattern is A__DEFG (backwards mapped).
Performance Results With A Real Numbers Table
I ran some performance testing with big sets on my second query above. I do not have a record at this time of the server version used. Here's my test data:
DECLARE
#k int,
#n int;
DECLARE #set TABLE (value varchar(24));
INSERT #set VALUES ('A'),('B'),('C'),('D'),('E'),('F'),('G'),('H'),('I'),('J'),('K'),('L'),('M'),('N'),('O'),('P'),('Q');
SET #n = ##RowCount;
SET #k = 5;
DECLARE #combinations bigint = dbo.Factorial(#n) / (dbo.Factorial(#k) * dbo.Factorial(#n - #k));
SELECT CAST(#combinations as varchar(max)) + ' combinations', MaxNumUsedFromNumbersTable = POWER(2, #n);
Peter showed that this "vertical cross join" doesn't perform as well as simply writing dynamic SQL to actually do the CROSS JOINs it avoids. At the trivial cost of a few more reads, his solution has metrics between 10 and 17 times better. The performance of his query decreases faster than mine as the amount of work increases, but not fast enough to stop anyone from using it.
The second set of numbers below is the factor as divided by the first row in the table, just to show how it scales.
Erik
Items CPU Writes Reads Duration | CPU Writes Reads Duration
----- ------ ------ ------- -------- | ----- ------ ------ --------
17•5 7344 0 3861 8531 |
18•9 17141 0 7748 18536 | 2.3 2.0 2.2
20•10 76657 0 34078 84614 | 10.4 8.8 9.9
21•11 163859 0 73426 176969 | 22.3 19.0 20.7
21•20 142172 0 71198 154441 | 19.4 18.4 18.1
Peter
Items CPU Writes Reads Duration | CPU Writes Reads Duration
----- ------ ------ ------- -------- | ----- ------ ------ --------
17•5 422 70 10263 794 |
18•9 6046 980 219180 11148 | 14.3 14.0 21.4 14.0
20•10 24422 4126 901172 46106 | 57.9 58.9 87.8 58.1
21•11 58266 8560 2295116 104210 | 138.1 122.3 223.6 131.3
21•20 51391 5 6291273 55169 | 121.8 0.1 613.0 69.5
Extrapolating, eventually my query will be cheaper (though it is from the start in reads), but not for a long time. To use 21 items in the set already requires a numbers table going up to 2097152...
Here is a comment I originally made before realizing that my solution would perform drastically better with an on-the-fly numbers table:
I love single-query solutions to problems like this, but if you're looking for the best performance, an actual cross-join is best, unless you start dealing with seriously huge numbers of combination. But what does anyone want with hundreds of thousands or even millions of rows? Even the growing number of reads don't seem too much of a problem, though 6 million is a lot and it's getting bigger fast...
Anyway. Dynamic SQL wins. I still had a beautiful query. :)
Performance Results with an On-The-Fly Numbers Table
When I originally wrote this answer, I said:
Note that you could use an on-the-fly numbers table, but I haven't tried it.
Well, I tried it, and the results were that it performed much better! Here is the query I used:
DECLARE #N int = 16, #K int = 12;
CREATE TABLE #Set (Value char(1) PRIMARY KEY CLUSTERED);
CREATE TABLE #Items (Num int);
INSERT #Items VALUES (#K);
INSERT #Set
SELECT TOP (#N) V
FROM
(VALUES ('A'),('B'),('C'),('D'),('E'),('F'),('G'),('H'),('I'),('J'),('K'),('L'),('M'),('N'),('O'),('P'),('Q'),('R'),('S'),('T'),('U'),('V'),('W'),('X'),('Y'),('Z')) X (V);
GO
DECLARE
#N int = (SELECT Count(*) FROM #Set),
#K int = (SELECT TOP 1 Num FROM #Items);
DECLARE #combination int, #value char(1);
WITH L0 AS (SELECT 1 N UNION ALL SELECT 1),
L1 AS (SELECT 1 N FROM L0, L0 B),
L2 AS (SELECT 1 N FROM L1, L1 B),
L3 AS (SELECT 1 N FROM L2, L2 B),
L4 AS (SELECT 1 N FROM L3, L3 B),
L5 AS (SELECT 1 N FROM L4, L4 B),
Nums AS (SELECT Row_Number() OVER(ORDER BY (SELECT 1)) Num FROM L5),
Nums1 AS (
SELECT Num, P1 = (Num & 0x55555555) + ((Num / 2) & 0x55555555)
FROM Nums
WHERE Num BETWEEN 0 AND Power(2, #N) - 1
), Nums2 AS (
SELECT Num, P2 = (P1 & 0x33333333) + ((P1 / 4) & 0x33333333)
FROM Nums1
), Nums3 AS (
SELECT Num, P3 = (P2 & 0x0F0F0F0F) + ((P2 / 16) & 0x0F0F0F0F)
FROM Nums2
), BaseSet AS (
SELECT Ind = Power(2, Row_Number() OVER (ORDER BY Value) - 1), *
FROM #Set
)
SELECT
#Combination = N.Num,
#Value = S.Value
FROM
Nums3 N
INNER JOIN BaseSet S
ON N.Num & S.Ind <> 0
WHERE P3 % 255 = #K;
Note that I selected the values into variables to reduce the time and memory needed to test everything. The server still does all the same work. I modified Peter's version to be similar, and removed unnecessary extras so they were both as lean as possible. The server version used for these tests is Microsoft SQL Server 2008 (RTM) - 10.0.1600.22 (Intel X86) Standard Edition on Windows NT 5.2 <X86> (Build 3790: Service Pack 2) (VM) running on a VM.
Below are charts showing the performance curves for values of N and K up to 21. The base data for them is in another answer on this page. The values are the result of 5 runs of each query at each K and N value, followed by throwing out the best and worst values for each metric and averaging the remaining 3.
Basically, my version has a "shoulder" (in the leftmost corner of the chart) at high values of N and low values of K that make it perform worse there than the dynamic SQL version. However, this stays fairly low and constant, and the central peak around N = 21 and K = 11 is much lower for Duration, CPU, and Reads than the dynamic SQL version.
I included a chart of the number of rows each item is expected to return so you can see how the query performs stacked up against how big a job it has to do.
Please see my additional answer on this page for the complete performance results. I hit the post character limit and could not include it here. (Any ideas where else to put it?) To put things in perspective against my first version's performance results, here's the same format as before:
Erik
Items CPU Duration Reads Writes | CPU Duration Reads
----- ----- -------- ------- ------ | ----- -------- -------
17•5 354 378 12382 0 |
18•9 1849 1893 97246 0 | 5.2 5.0 7.9
20•10 7119 7357 369518 0 | 20.1 19.5 29.8
21•11 13531 13807 705438 0 | 38.2 36.5 57.0
21•20 3234 3295 48 0 | 9.1 8.7 0.0
Peter
Items CPU Duration Reads Writes | CPU Duration Reads
----- ----- -------- ------- ------ | ----- -------- -------
17•5 41 45 6433 0 |
18•9 2051 1522 214021 0 | 50.0 33.8 33.3
20•10 8271 6685 864455 0 | 201.7 148.6 134.4
21•11 18823 15502 2097909 0 | 459.1 344.5 326.1
21•20 25688 17653 4195863 0 | 626.5 392.3 652.2
Conclusions
On-the-fly numbers tables are better than a real table containing rows, since reading one at huge rowcounts requires a lot of I/O. It is better to use a little CPU.
My initial tests weren't broad enough to really show the performance characteristics of the two versions.
Peter's version could be improved by making each JOIN not only be greater than the prior item, but also restrict the maximum value based on how many more items have to be fit into the set. For example, at 21 items taken 21 at a time, there is only one answer of 21 rows (all 21 items, one time), but the intermediate rowsets in the dynamic SQL version, early in the execution plan, contain combinations such as "AU" at step 2 even though this will be discarded at the next join since there is no value higher than "U" available. Similarly, an intermediate rowset at step 5 will contain "ARSTU" but the only valid combo at this point is "ABCDE". This improved version would not have a lower peak at the center, so possibly not improving it enough to become the clear winner, but it would at least become symmetrical so that the charts would not stay maxed past the middle of the region but would fall back to near 0 as my version does (see the top corner of the peaks for each query).
Duration Analysis
There is no really significant difference between the versions in duration (>100ms) until 14 items taken 12 at a time. Up to this point, my version wins 30 times and the dynamic SQL version wins 43 times.
Starting at 14•12, my version was faster 65 times (59 >100ms), the dynamic SQL version 64 times (60 >100ms). However, all the times my version was faster, it saved a total averaged duration of 256.5 seconds, and when the dynamic SQL version was faster, it saved 80.2 seconds.
The total averaged duration for all trials was Erik 270.3 seconds, Peter 446.2 seconds.
If a lookup table were created to determine which version to use (picking the faster one for the inputs), all the results could be performed in 188.7 seconds. Using the slowest one each time would take 527.7 seconds.
Reads Analysis
The duration analysis showed my query winning by significant but not overly large amount. When the metric is switched to reads, a very different picture emerges--my query uses on average 1/10th the reads.
There is no really significant difference between the versions in reads (>1000) until 9 items taken 9 at a time. Up to this point, my version wins 30 times and the dynamic SQL version wins 17 times.
Starting at 9•9, my version used fewer reads 118 times (113 >1000), the dynamic SQL version 69 times (31 >1000). However, all the times my version used fewer reads, it saved a total averaged 75.9M reads, and when the dynamic SQL version was faster, it saved 380K reads.
The total averaged reads for all trials was Erik 8.4M, Peter 84M.
If a lookup table were created to determine which version to use (picking the best one for the inputs), all the results could be performed in 8M reads. Using the worst one each time would take 84.3M reads.
I would be very interested to see the results of an updated dynamic SQL version that puts the extra upper limit on the items chosen at each step as I described above.
Addendum
The following version of my query achieves an improvement of about 2.25% over the performance results listed above. I used MIT's HAKMEM bit-counting method, and added a Convert(int) on the result of row_number() since it returns a bigint. Of course I wish this is the version I had used with for all the performance testing and charts and data above, but it is unlikely I will ever redo it as it was labor-intensive.
WITH L0 AS (SELECT 1 N UNION ALL SELECT 1),
L1 AS (SELECT 1 N FROM L0, L0 B),
L2 AS (SELECT 1 N FROM L1, L1 B),
L3 AS (SELECT 1 N FROM L2, L2 B),
L4 AS (SELECT 1 N FROM L3, L3 B),
L5 AS (SELECT 1 N FROM L4, L4 B),
Nums AS (SELECT Row_Number() OVER(ORDER BY (SELECT 1)) Num FROM L5),
Nums1 AS (
SELECT Convert(int, Num) Num
FROM Nums
WHERE Num BETWEEN 1 AND Power(2, #N) - 1
), Nums2 AS (
SELECT
Num,
P1 = Num - ((Num / 2) & 0xDB6DB6DB) - ((Num / 4) & 0x49249249)
FROM Nums1
),
Nums3 AS (SELECT Num, Bits = ((P1 + P1 / 8) & 0xC71C71C7) % 63 FROM Nums2),
BaseSet AS (SELECT Ind = Power(2, Row_Number() OVER (ORDER BY Value) - 1), * FROM #Set)
SELECT
N.Num,
S.Value
FROM
Nums3 N
INNER JOIN BaseSet S
ON N.Num & S.Ind <> 0
WHERE
Bits = #K
And I could not resist showing one more version that does a lookup to get the count of bits. It may even be faster than other versions:
DECLARE #BitCounts binary(255) =
0x01010201020203010202030203030401020203020303040203030403040405
+ 0x0102020302030304020303040304040502030304030404050304040504050506
+ 0x0102020302030304020303040304040502030304030404050304040504050506
+ 0x0203030403040405030404050405050603040405040505060405050605060607
+ 0x0102020302030304020303040304040502030304030404050304040504050506
+ 0x0203030403040405030404050405050603040405040505060405050605060607
+ 0x0203030403040405030404050405050603040405040505060405050605060607
+ 0x0304040504050506040505060506060704050506050606070506060706070708;
WITH L0 AS (SELECT 1 N UNION ALL SELECT 1),
L1 AS (SELECT 1 N FROM L0, L0 B),
L2 AS (SELECT 1 N FROM L1, L1 B),
L3 AS (SELECT 1 N FROM L2, L2 B),
L4 AS (SELECT 1 N FROM L3, L3 B),
L5 AS (SELECT 1 N FROM L4, L4 B),
Nums AS (SELECT Row_Number() OVER(ORDER BY (SELECT 1)) Num FROM L5),
Nums1 AS (SELECT Convert(int, Num) Num FROM Nums WHERE Num BETWEEN 1 AND Power(2, #N) - 1),
BaseSet AS (SELECT Ind = Power(2, Row_Number() OVER (ORDER BY Value) - 1), * FROM ComboSet)
SELECT
#Combination = N.Num,
#Value = S.Value
FROM
Nums1 N
INNER JOIN BaseSet S
ON N.Num & S.Ind <> 0
WHERE
#K =
Convert(int, Substring(#BitCounts, N.Num & 0xFF, 1))
+ Convert(int, Substring(#BitCounts, N.Num / 256 & 0xFF, 1))
+ Convert(int, Substring(#BitCounts, N.Num / 65536 & 0xFF, 1))
+ Convert(int, Substring(#BitCounts, N.Num / 16777216, 1))
Please forgive this extra answer. I ran into the post character limit in my original answer.
Here are the complete averaged numeric performance results for the charts in my answer.
| Erik | Peter
N K | CPU Duration Reads Writes | CPU Duration Reads Writes
-- -- - ----- -------- ------ ------ - ----- -------- ------- ------
1 1 | 0 0 7 0 | 0 0 7 0
2 1 | 0 0 10 0 | 0 0 7 0
2 2 | 0 0 7 0 | 0 0 11 0
3 1 | 0 0 12 0 | 0 0 7 0
3 2 | 0 0 12 0 | 0 0 13 0
3 3 | 5 0 7 0 | 0 0 19 0
4 1 | 0 0 14 0 | 0 0 7 0
4 2 | 0 0 18 0 | 0 0 15 0
4 3 | 0 0 14 0 | 5 0 27 0
4 4 | 0 0 7 0 | 0 0 35 0
5 1 | 5 0 16 0 | 5 0 7 0
5 2 | 0 0 26 0 | 0 0 17 0
5 3 | 0 0 26 0 | 0 0 37 0
5 4 | 0 0 16 0 | 0 0 57 0
5 5 | 0 0 7 0 | 0 0 67 0
6 1 | 0 0 18 0 | 0 0 7 0
6 2 | 5 0 36 0 | 0 0 19 0
6 3 | 0 0 46 0 | 0 0 49 0
6 4 | 0 0 36 0 | 0 0 89 0
6 5 | 5 0 18 0 | 5 0 119 0
6 6 | 0 0 7 0 | 0 0 131 0
7 1 | 5 0 20 0 | 0 0 7 0
7 2 | 0 0 48 0 | 0 0 21 0
7 3 | 0 0 76 0 | 0 0 63 0
7 4 | 0 0 76 0 | 0 0 133 0
7 5 | 0 1 48 0 | 0 1 203 0
7 6 | 5 0 20 0 | 0 1 245 0
7 7 | 5 0 7 0 | 0 3 259 0
8 1 | 5 2 22 0 | 0 4 7 0
8 2 | 0 1 62 0 | 0 0 23 0
8 3 | 0 1 118 0 | 0 0 79 0
8 4 | 0 1 146 0 | 0 1 191 0
8 5 | 5 3 118 0 | 0 1 331 0
8 6 | 5 1 62 0 | 5 2 443 0
8 7 | 0 0 22 0 | 5 3 499 0
8 8 | 0 0 7 0 | 5 3 515 0
9 1 | 0 2 24 0 | 0 0 7 0
9 2 | 5 3 78 0 | 0 0 25 0
9 3 | 5 3 174 0 | 0 1 97 0
9 4 | 5 5 258 0 | 0 2 265 0
9 5 | 5 7 258 0 | 10 4 517 0
9 6 | 5 5 174 0 | 5 5 769 0
9 7 | 0 3 78 0 | 10 4 937 0
9 8 | 0 0 24 0 | 0 3 1009 0
9 9 | 0 1 7 0 | 0 4 1027 0
10 1 | 10 4 26 0 | 0 0 7 0
10 2 | 5 5 96 0 | 0 0 27 0
10 3 | 5 2 246 0 | 0 0 117 0
10 4 | 10 10 426 0 | 10 4 357 0
10 5 | 15 12 510 0 | 5 8 777 0
10 6 | 15 16 426 0 | 10 9 1281 0
10 7 | 10 4 246 0 | 10 9 1701 0
10 8 | 10 5 96 0 | 10 5 1941 0
10 9 | 5 4 26 0 | 10 7 2031 0
10 10 | 5 0 7 0 | 10 7 2051 0
11 1 | 10 8 28 0 | 0 0 7 0
11 2 | 15 11 116 0 | 0 0 29 0
11 3 | 21 24 336 0 | 10 3 139 0
11 4 | 21 18 666 0 | 5 2 469 0
11 5 | 21 20 930 0 | 5 3 1129 0
11 6 | 26 35 930 0 | 15 12 2053 0
11 7 | 20 14 666 0 | 5 25 2977 0
11 8 | 15 9 336 0 | 20 14 3637 0
11 9 | 10 7 116 0 | 21 27 3967 0
11 10 | 10 8 28 0 | 36 34 4086 0
11 11 | 5 8 7 0 | 15 15 4109 0
12 1 | 16 18 30 0 | 5 0 7 0
12 2 | 31 32 138 0 | 0 0 31 0
12 3 | 31 26 446 0 | 10 2 163 0
12 4 | 47 40 996 0 | 10 7 603 0
12 5 | 47 46 1590 0 | 21 17 1593 0
12 6 | 57 53 1854 0 | 31 30 3177 0
12 7 | 41 39 1590 0 | 31 30 5025 0
12 8 | 41 42 996 0 | 42 43 6609 0
12 9 | 31 26 446 0 | 52 52 7607 0
12 10 | 20 19 138 0 | 57 62 8048 0
12 11 | 15 17 30 0 | 72 64 8181 0
12 12 | 15 10 7 0 | 67 38 8217 0
13 1 | 31 32 32 0 | 0 0 7 0
13 2 | 21 25 162 0 | 0 0 33 0
13 3 | 36 34 578 0 | 5 2 189 0
13 4 | 57 65 1436 0 | 10 5 761 0
13 5 | 41 40 2580 0 | 10 10 2191 0
13 6 | 62 56 3438 0 | 31 32 4765 0
13 7 | 62 62 3438 0 | 57 53 8251 0
13 8 | 52 64 2580 0 | 52 47 11710 0
13 9 | 26 28 1436 0 | 93 96 14311 0
13 10 | 31 29 578 0 | 161 104 15891 0
13 11 | 36 35 162 0 | 129 99 16525 0
13 12 | 21 22 32 0 | 156 96 16383 0
13 13 | 26 30 7 0 | 166 98 16411 0
14 1 | 57 53 34 0 | 0 0 7 0
14 2 | 52 50 188 0 | 0 0 35 0
14 3 | 57 60 734 0 | 10 4 217 0
14 4 | 78 76 2008 0 | 15 8 945 0
14 5 | 99 97 4010 0 | 36 34 2947 0
14 6 | 120 125 6012 0 | 41 47 6951 0
14 7 | 125 119 6870 0 | 93 94 12957 0
14 8 | 135 138 6012 0 | 88 98 19821 0
14 9 | 78 153 4010 0 | 234 156 26099 0
14 10 | 94 92 2008 0 | 229 133 30169 0
14 11 | 83 90 734 0 | 239 136 32237 0
14 12 | 47 46 188 0 | 281 176 33031 0
14 13 | 52 53 34 0 | 260 167 32767 0
14 14 | 46 47 7 0 | 203 149 32797 0
15 1 | 83 83 36 0 | 0 0 7 0
15 2 | 145 139 216 0 | 0 2 37 0
15 3 | 104 98 916 0 | 0 2 247 0
15 4 | 135 135 2736 0 | 15 17 1157 0
15 5 | 94 97 6012 0 | 26 27 3887 0
15 6 | 192 188 10016 0 | 57 53 9893 0
15 7 | 187 192 12876 0 | 73 73 19903 0
15 8 | 286 296 12876 0 | 338 230 33123 0
15 9 | 208 207 10016 0 | 354 223 46063 0
15 10 | 140 143 6012 0 | 443 334 56143 0
15 11 | 88 86 2736 0 | 391 273 62219 0
15 12 | 73 72 916 0 | 432 269 65019 0
15 13 | 109 117 216 0 | 317 210 65999 0
15 14 | 156 187 36 0 | 411 277 66279 0
15 15 | 140 142 7 0 | 354 209 65567 0
16 1 | 281 281 38 0 | 0 0 7 0
16 2 | 141 146 246 0 | 0 0 39 0
16 3 | 208 206 1126 0 | 10 4 279 0
16 4 | 187 189 3646 0 | 15 13 1399 0
16 5 | 234 234 8742 0 | 42 42 5039 0
16 6 | 333 337 16022 0 | 83 85 13775 0
16 7 | 672 742 22886 0 | 395 235 30087 0
16 8 | 510 510 25746 0 | 479 305 53041 0
16 9 | 672 675 22886 0 | 671 489 78855 0
16 10 | 489 492 16022 0 | 859 578 101809 0
16 11 | 250 258 8742 0 | 719 487 117899 0
16 12 | 198 202 3646 0 | 745 483 126709 0
16 13 | 119 119 1126 0 | 770 506 130423 0
16 14 | 291 327 246 0 | 770 531 131617 0
16 15 | 156 156 38 0 | 713 451 131931 0
16 16 | 125 139 7 0 | 895 631 132037 0
17 1 | 406 437 40 0 | 0 0 7 0
17 2 | 307 320 278 0 | 0 0 41 0
17 3 | 281 290 1366 0 | 0 3 313 0
17 4 | 307 317 4766 0 | 31 28 1673 0
17 5 | 354 378 12382 0 | 41 45 6433 0
17 6 | 583 582 24758 0 | 130 127 18809 0
17 7 | 839 859 38902 0 | 693 449 43873 0
17 8 | 1177 1183 48626 0 | 916 679 82847 0
17 9 | 1031 1054 48626 0 | 1270 944 131545 0
17 10 | 828 832 38902 0 | 1469 1105 180243 0
17 11 | 672 668 24758 0 | 1535 1114 219217 0
17 12 | 422 422 12382 0 | 1494 991 244047 0
17 13 | 474 482 4766 0 | 1615 1165 256501 0
17 14 | 599 607 1366 0 | 1500 1042 261339 0
17 15 | 223 218 278 0 | 1401 1065 262777 0
17 16 | 229 228 40 0 | 1390 918 263127 0
17 17 | 541 554 7 0 | 1562 1045 263239 0
18 1 | 401 405 42 0 | 0 0 7 0
18 2 | 401 397 312 0 | 0 0 43 0
18 3 | 458 493 1638 0 | 5 6 349 0
18 4 | 583 581 6126 0 | 16 13 1981 0
18 5 | 697 700 17142 0 | 83 130 8101 0
18 6 | 792 799 37134 0 | 156 162 25237 0
18 7 | 1672 1727 63654 0 | 1098 751 62693 0
18 8 | 1598 1601 87522 0 | 1416 1007 126423 0
18 9 | 1849 1893 97246 0 | 2051 1522 214021 0
18 10 | 1963 2083 87522 0 | 2734 2103 311343 0
18 11 | 1411 1428 63654 0 | 2849 2352 398941 0
18 12 | 1042 1048 37134 0 | 3021 2332 462671 0
18 13 | 942 985 17142 0 | 3036 2314 499881 0
18 14 | 656 666 6126 0 | 3052 2177 517099 0
18 15 | 526 532 1638 0 | 2910 2021 523301 0
18 16 | 614 621 312 0 | 3083 2108 525015 0
18 17 | 536 551 42 0 | 2921 2031 525403 0
18 18 | 682 680 7 0 | 3141 2098 525521 0
19 1 | 885 909 44 0 | 0 0 7 0
19 2 | 1411 1498 348 0 | 0 0 45 0
19 3 | 880 887 1944 0 | 5 4 387 0
19 4 | 1119 1139 7758 0 | 26 25 2325 0
19 5 | 1120 1127 23262 0 | 73 72 10077 0
19 6 | 1395 1462 54270 0 | 453 387 33591 0
19 7 | 1875 1929 100782 0 | 1197 838 87941 0
19 8 | 2656 2723 151170 0 | 2255 1616 188803 0
19 9 | 3046 3092 184762 0 | 3317 2568 340053 0
19 10 | 3635 3803 184762 0 | 5171 4041 524895 0
19 11 | 2739 2774 151170 0 | 5577 4574 709737 0
19 12 | 3203 3348 100782 0 | 6182 5194 860987 0
19 13 | 1672 1750 54270 0 | 6458 5561 961849 0
19 14 | 1760 1835 23262 0 | 6177 4964 1016199 0
19 15 | 968 1006 7758 0 | 6266 4331 1039541 0
19 16 | 1099 1134 1944 0 | 6208 4254 1047379 0
19 17 | 995 1037 348 0 | 6385 4366 1049403 0
19 18 | 916 964 44 0 | 6036 4268 1049831 0
19 19 | 1135 1138 7 0 | 6234 4320 1049955 0
20 1 | 1797 1821 46 0 | 0 0 7 0
20 2 | 2000 2029 386 0 | 0 0 47 0
20 3 | 2031 2071 2286 0 | 10 6 427 0
20 4 | 1942 2036 9696 0 | 31 34 2707 0
20 5 | 2104 2161 31014 0 | 88 85 12397 0
20 6 | 2880 2958 77526 0 | 860 554 43675 0
20 7 | 3791 3940 155046 0 | 2026 1405 121285 0
20 8 | 5130 5307 251946 0 | 3823 2731 276415 0
20 9 | 6547 6845 335926 0 | 5380 4148 528445 0
20 10 | 7119 7357 369518 0 | 8271 6685 864455 0
20 11 | 5692 5803 335926 0 | 9557 8029 1234057 0
20 12 | 4734 4850 251946 0 | 11114 9504 1570067 0
20 13 | 3604 3641 155046 0 | 11551 10434 1822097 0
20 14 | 2911 2999 77526 0 | 12317 10822 1977227 0
20 15 | 2115 2134 31014 0 | 12806 10679 2054837 0
20 16 | 2041 2095 9696 0 | 13062 9115 2085935 0
20 17 | 2390 2465 2286 0 | 12807 9002 2095715 0
20 18 | 1765 1788 386 0 | 12598 8601 2098085 0
20 19 | 2067 2143 46 0 | 12578 8626 2098555 0
20 20 | 1640 1663 7 0 | 12932 9064 2098685 0
21 1 | 3374 3425 48 0 | 0 0 7 0
21 2 | 4031 4157 426 0 | 0 1 49 0
21 3 | 3218 3250 2666 0 | 10 5 469 0
21 4 | 3687 3734 11976 0 | 21 25 3129 0
21 5 | 3692 3735 40704 0 | 115 114 15099 0
21 6 | 4859 4943 108534 0 | 963 661 56079 0
21 7 | 6114 6218 232566 0 | 2620 1880 164701 0
21 8 | 8573 8745 406986 0 | 4999 3693 397355 0
21 9 | 11880 12186 587866 0 | 9047 6863 804429 0
21 10 | 13255 13582 705438 0 | 14358 11436 1392383 0
21 11 | 13531 13807 705438 0 | 18823 15502 2097909 0
21 12 | 12244 12400 587866 0 | 21834 18760 2803435 0
21 13 | 9406 9528 406986 0 | 23771 21274 3391389 0
21 14 | 7114 7180 232566 0 | 26677 24296 3798463 0
21 15 | 4869 4961 108534 0 | 26479 23998 4031117 0
21 16 | 4416 4521 40704 0 | 26536 22976 4139739 0
21 17 | 4380 4443 11976 0 | 26490 19107 4180531 0
21 18 | 3265 3334 2666 0 | 25979 17995 4192595 0
21 19 | 3640 3768 426 0 | 26186 17891 4195349 0
21 20 | 3234 3295 48 0 | 25688 17653 4195863 0
21 21 | 3156 3219 7 0 | 26140 17838 4195999 0
The CUBE extension to a group by clause represents all combinations of the given list. E.g, the following will give all 3-combinations of a 4-element set.
select concat(a,b,c,d)
from (select 'a','b','c','d') as t(a,b,c,d)
group by cube(a,b,c,d)
having len(concat(a,b,c,d)) = 3
How about some dynamic SQL?
DECLARE #k int = 5, #n INT
IF OBJECT_ID('tempdb..#set') IS NOT NULL DROP TABLE #set
CREATE TABLE #set ( [value] varchar(24) )
INSERT #set VALUES ('1'),('2'),('3'),('4'),('5'),('6')
SET #n = ##ROWCOUNT
SELECT dbo.Factorial(#n) / (dbo.Factorial(#k) * dbo.Factorial(#n - #k)) AS [expected combinations]
-- let's generate some sql.
DECLARE
#crlf NCHAR(2) = NCHAR(13)+NCHAR(10)
, #sql NVARCHAR(MAX)
, #select NVARCHAR(MAX)
, #from NVARCHAR(MAX)
, #order NVARCHAR(MAX)
, #in NVARCHAR(MAX)
DECLARE #j INT = 0
WHILE #j < #k BEGIN
SET #j += 1
IF #j = 1 BEGIN
SET #select = 'SELECT'+#crlf+' _1.value AS [1]'
SET #from = #crlf+'FROM #set AS _1'
SET #order = 'ORDER BY _1.value'
SET #in = '[1]'
END
ELSE BEGIN
SET #select += #crlf+', _'+CONVERT(VARCHAR,#j)+'.value AS ['+CONVERT(VARCHAR,#j)+']'
SET #from += #crlf+'INNER JOIN #set AS _'+CONVERT(VARCHAR,#j)+' ON _'+CONVERT(VARCHAR,#j)+'.value > _'+CONVERT(VARCHAR,#j-1)+'.value'
SET #order += ', _'+CONVERT(VARCHAR,#j)+'.value'
SET #in += ', ['+CONVERT(VARCHAR,#j)+']'
END
END
SET #select += #crlf+', ROW_NUMBER() OVER ('+#order+') AS combination'
SET #sql = #select + #from
-- let's see how it looks
PRINT #sql
EXEC (#sql)
-- ok, now dump pivot and dump into a table for later use
IF OBJECT_ID('tempdb..#combinations') IS NOT NULL DROP TABLE #combinations
CREATE TABLE #combinations (
combination INT
, value VARCHAR(24)
, PRIMARY KEY (combination, value)
)
SET #sql
= 'WITH CTE AS ('+#crlf+#sql+#crlf+')'+#crlf
+ 'INSERT #combinations (combination, value)'+#crlf
+ 'SELECT combination, value FROM CTE a'+#crlf
+ 'UNPIVOT (value FOR position IN ('+#in+')) AS b'
PRINT #sql
EXEC (#sql)
SELECT COUNT(DISTINCT combination) AS [returned combinations] FROM #combinations
SELECT * FROM #combinations
Generates the following query for #k = 5:
SELECT
_1.value AS [1]
, _2.value AS [2]
, _3.value AS [3]
, _4.value AS [4]
, _5.value AS [5]
, ROW_NUMBER() OVER (ORDER BY _1.value, _2.value, _3.value, _4.value, _5.value) AS combination
FROM #set AS _1
INNER JOIN #set AS _2 ON _2.value > _1.value
INNER JOIN #set AS _3 ON _3.value > _2.value
INNER JOIN #set AS _4 ON _4.value > _3.value
INNER JOIN #set AS _5 ON _5.value > _4.value
Which it then unpivots and dumps into a table.
The dynamic SQL is ugly, and you can't wrap it in a UDF, but the query produced is very efficient.
First create this UDF...
CREATE FUNCTION [dbo].[_ex_fn_SplitToTable] (#str varchar(5000), #sep char(1) = null)
RETURNS #ReturnVal table (n int, s varchar(5000))
AS
/*
Alpha Test
-----------
select * from [dbo].[_ex_fn_SplitToTable_test01]('abcde','')
*/
BEGIN
declare #str2 varchar(5000)
declare #sep2 char(1)
if LEN(ISNULL(#sep,'')) = 0
begin
declare #i int
set #i = 0
set #str2 = ''
declare #char varchar(1)
startloop:
set #i += 1
--print #i
set #char = substring(#str,#i,1)
set #str2 = #str2 + #char + ','
if LEN(#str) <= #i
goto exitloop
goto startloop
exitloop:
set #str2 = left(#str2,LEN(#str2) - 1)
set #sep2 = ','
--print #str2
end
else
begin
set #str2 = #str
set #sep2 = #sep
end
;WITH Pieces(n, start, stop) AS (
SELECT 1, 1, CHARINDEX(#sep2, #str2)
UNION ALL
SELECT n + 1, stop + 1, CHARINDEX(#sep2, #str2, stop + 1)
FROM Pieces
WHERE stop > 0
)
insert into #ReturnVal(n,s)
SELECT n,
SUBSTRING(#str2, start, CASE WHEN stop > 0 THEN stop-start ELSE 5000 END) AS s
FROM Pieces option (maxrecursion 32767)
RETURN
END
GO
Then create this stored proc...
CREATE proc [CombinationsOfString]
(
#mystring varchar(max) = '0,5,10,15,20,25'
)
as
/*
ALPHA TEST
---------
exec CombinationsOfString '-20,-10,0,10,20'
*/
if object_id('tempdb..#_201606070947_myorig') is not null drop table #_201606070947_myorig
CREATE TABLE #_201606070947_myorig
(
SourceId int not null identity(1,1)
,Element varchar(100) not null
)
insert into #_201606070947_myorig
select s from dbo._ex_fn_SplitToTable(#mystring,',')
--select SourceId, Element from #_201606070947_myorig
declare #mynumerics varchar(max)
set #mynumerics = (
select STUFF(REPLACE((SELECT '#!' + LTRIM(RTRIM(SourceId)) AS 'data()'
FROM #_201606070947_myorig
FOR XML PATH('')),' #!',', '), 1, 2, '') as Brands
)
set #mynumerics = REPLACE(#mynumerics,' ','')
print #mynumerics
if object_id('tempdb..#_201606070947_source') is not null drop table #_201606070947_source
if object_id('tempdb..#_201606070947_numbers') is not null drop table #_201606070947_numbers
if object_id('tempdb..#_201606070947_results') is not null drop table #_201606070947_results
if object_id('tempdb..#_201606070947_processed') is not null drop table #_201606070947_processed
CREATE TABLE #_201606070947_source
(
SourceId int not null identity(1,1)
,Element char(1) not null
)
--declare #mynumerics varchar(max)
--set #mynumerics = '1,2,3,4,5'
insert into #_201606070947_source
select s from dbo._ex_fn_SplitToTable(#mynumerics,',')
-- select * from #_201606070947_source
declare #Length int
set #Length = (select max(SourceId) from #_201606070947_source)
declare #columnstring varchar(max) = (SELECT REPLICATE('c.',#Length))
print #columnstring
declare #subs varchar(max) = (SELECT REPLICATE('substring.',#Length))
print #subs
if object_id('tempdb..#_201606070947_columns') is not null drop table #_201606070947_columns
select s+CONVERT(varchar,dbo.PadLeft(convert(varchar,n),'0',3)) cols
into #_201606070947_columns
from [dbo].[_ex_fn_SplitToTable](#columnstring,'.') where LEN(s) > 0
if object_id('tempdb..#_201606070947_subcolumns') is not null drop table #_201606070947_subcolumns
select s+'(Combo,'+CONVERT(varchar,n)+',1) ' + 'c'+CONVERT(varchar,dbo.PadLeft(convert(varchar,n),'0',3)) cols
into #_201606070947_subcolumns
from [dbo].[_ex_fn_SplitToTable](#subs,'.') where LEN(s) > 0
-- select * from #_201606070947_subcolumns
-- select * from #_201606070947_columns
declare #columns_sql varchar(max)
set #columns_sql =
(
select distinct
stuff((SELECT distinct + cast(cols as varchar(50)) + ' VARCHAR(1), '
FROM (
select cols
from #_201606070947_columns
) t2
--where t2.n = t1.n
FOR XML PATH('')),3,0,'')
from (
select cols
from #_201606070947_columns
) t1
)
declare #substring_sql varchar(max)
set #substring_sql =
(
select distinct
stuff((SELECT distinct + cast(cols as varchar(100)) + ', '
FROM (
select cols
from #_201606070947_subcolumns
) t2
--where t2.n = t1.n
FOR XML PATH('')),3,0,'')
from (
select cols
from #_201606070947_subcolumns
) t1
)
set #substring_sql = left(#substring_sql,LEN(#substring_sql) - 1)
print #substring_sql
set #columns_sql = LEFT(#columns_sql,LEN(#columns_sql) - 1)
--SELECT #columns_sql
declare #sql varchar(max)
set #sql = 'if object_id(''tempdb..##_201606070947_01'') is not null drop table ##_201606070947_01 create table ##_201606070947_01 (rowid int,' + #columns_sql + ')'
print #sql
execute(#sql)
CREATE TABLE #_201606070947_numbers (Number int not null)
insert into #_201606070947_numbers
select SourceId from #_201606070947_source
CREATE TABLE #_201606070947_results
(
Combo varchar(10) not null
,Length int not null
)
SET NOCOUNT on
DECLARE
#Loop int
,#MaxLoop int
-- How many elements there are to process
SELECT #MaxLoop = max(SourceId)
from #_201606070947_source
-- Initialize first value
TRUNCATE TABLE #_201606070947_results
INSERT #_201606070947_results (Combo, Length)
select Element, 1
from #_201606070947_source
where SourceId = 1
SET #Loop = 2
-- Iterate to add each Element after the first
WHILE #Loop <= #MaxLoop
BEGIN
INSERT #_201606070947_results (Combo, Length)
select distinct
left(re.Combo, #Loop - nm.Number)
+ so.Element
+ RIGHT(re.Combo, nm.Number - 1)
,#Loop
from #_201606070947_results re
inner join #_201606070947_numbers nm
on nm.Number <= #Loop
inner join #_201606070947_source so
on so.SourceId = #Loop
where re.Length = #Loop - 1
SET #Loop = #Loop + 1
END
-- select * from #_201606070947_results
-- Show #_201606070947_results
SELECT *
into #_201606070947_processed
from #_201606070947_results
where Length = #MaxLoop
order by Combo
-- select * from #_201606070947_processed
set #sql = 'if object_id(''tempdb..##_201606070947_02'') is not null drop table ##_201606070947_02 '
print #sql
execute(#sql)
set #sql = ' ' +
' SELECT ROW_NUMBER() OVER(ORDER BY Combo Asc) AS RowID,' + #substring_sql +
' into ##_201606070947_02 ' +
' FROM #_201606070947_processed ' +
' '
PRINT #sql
execute(#sql)
declare #columns_sql_new varchar(max)
set #columns_sql_new = REPLACE(#columns_sql,'(1)','(100)')
set #sql = 'if object_id(''tempdb..##_201606070947_03'') is not null drop table ##_201606070947_03 create table ##_201606070947_03 (RowId int,' + #columns_sql_new + ')'
PRINT #sql
execute(#sql)
insert into ##_201606070947_03 (RowId)
select RowId from ##_201606070947_02
--select * from ##_201606070947_03
DECLARE #ColumnId varchar(10)
DECLARE #getColumnId CURSOR
SET #getColumnId = CURSOR FOR
select cols ColumnId from #_201606070947_columns
OPEN #getColumnId
FETCH NEXT
FROM #getColumnId INTO #ColumnId
WHILE ##FETCH_STATUS = 0
BEGIN
PRINT #ColumnId
set #sql = ' ' +
' update ##_201606070947_03
set ' + #ColumnId + ' = B.Element
from ##_201606070947_03 A
, (
select A.RowID, B.*
from
(
select * from ##_201606070947_02
) A
,
(
select * from #_201606070947_myorig
) B
where A.' + #ColumnId + ' = B.SourceId
) B
where A.RowId = B.RowId
'
execute(#sql)
print #sql
FETCH NEXT
FROM #getColumnId INTO #ColumnId
END
CLOSE #getColumnId
DEALLOCATE #getColumnId
select * from ##_201606070947_03
I ran across another technique for calculating combinations, and it is so simple. In a public SQL challenge, it also soundly beat my own attempt based on the same bit-pattern technique as in my (currently) accepted answer. Though I admit I didn't play with it super long, or grab my best solution from here and adapt, but I wrote it over again.
Anyway, I was sure I had taken myself down a peg--here I was thinking I'd hit on something very cool but later found out my solution was easily surpassed. But to my surprise, when I tried the technique out, it was worse than the best method above. It is here for reference due to its easiness to implement and its reasonable performance for small jobs (thus making it superior except when the job may demand the added complexity of something like the bit-pattern technique).
Given a table #Set with the same number of rows as the items being selected from, and variable #K for the number of items taken at a time, here is that method:
WITH Chains AS (
SELECT
Generation = 1,
Chain = Convert(varchar(1000),'|' + Value + '|'),
Value
FROM
#Set
WHERE
#K > 0
AND value <= ALL (
SELECT TOP (#K) Value
FROM #Set
ORDER BY Value DESC
)
UNION ALL
SELECT
C.Generation + 1,
Convert(varchar(1000), C.Chain + S.Value + '|'),
S.Value
FROM
Chains C
INNER JOIN #Set S
ON C.Value < S.Value
WHERE
C.Generation <= #K
)
SELECT
C.Chain,
S.Value
FROM
Chains C
INNER JOIN #Set S
ON C.Chain LIKE '%|' + S.Value + '|%'
WHERE
Generation = #K;