SQL Server CTE doesn't join properly

The data for this project contains two columns with semicolon-delimited strings. These are actually ordered pairs. So, for example, in: "a;b;c", "x;y;z", 'a' is paired with 'x'. The goal for our query is to create a table where this relationship is clearly represented one row at a time.
Here is a script to re-create the sample data:
DROP TABLE IF EXISTS dbo.sampleData;
DROP TABLE IF EXISTS dbo.lookupCPT;
GO
CREATE TABLE sampleData
(
numRow bigint IDENTITY(1,1) NOT NULL CONSTRAINT PK_numRow PRIMARY KEY,
sDelimQty varchar(MAX) NULL,
sDelimCPT varchar(MAX) NULL
)
CREATE TABLE lookupCPT
(
numRow bigint IDENTITY(1,1) NOT NULL CONSTRAINT PK_numRowCPT PRIMARY KEY,
sCPTCode varchar(10) NULL,
decCPTRate decimal(16,2) NULL
)
-- numRow is an IDENTITY column, so it is omitted here and generated in insert order
INSERT [dbo].[lookupCPT] ([sCPTCode], [decCPTRate])
VALUES (N'123', CAST(4.00 AS Decimal(16, 2)))
INSERT [dbo].[lookupCPT] ([sCPTCode], [decCPTRate])
VALUES (N'456', CAST(5.00 AS Decimal(16, 2)))
INSERT [dbo].[lookupCPT] ([sCPTCode], [decCPTRate])
VALUES (N'789', CAST(7.00 AS Decimal(16, 2)))
INSERT [dbo].[sampleData] ([sDelimQty], [sDelimCPT])
VALUES (N'1;2', N'123;789')
INSERT [dbo].[sampleData] ([sDelimQty], [sDelimCPT])
VALUES (N'3', N'456')
We attempted to accomplish this using common table expressions:
WITH Qty_CTE (numRowQ, Qty) AS
(
SELECT numRow, value
FROM sampleData
CROSS APPLY STRING_SPLIT(sDelimQty, ';')
),
CPT_CTE (numRowC, CPT) AS
(
SELECT numRow, value
FROM sampleData
CROSS APPLY STRING_SPLIT(sDelimCPT, ';')
)
SELECT *
FROM sampleData
JOIN CPT_CTE c on c.numRowC = sampleData.numRow
JOIN Qty_CTE q on q.numRowQ = sampleData.numRow
However, doing this doubles the number of rows in our output:
q1
But, if we remove either one of the two joins, it returns correctly:
q2
Any ideas? Thanks very much
After all the helpful answers, below is the final solution. Cheers!
WITH Qty_CTE (numRowQ, Qty, RN) AS
(
SELECT
numRow, value,
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN
FROM
sampleData
CROSS APPLY
STRING_SPLIT(sDelimQty, ';')
),
CPT_CTE (numRowC, CPT, CPTRate, RN) AS
(
SELECT
s.numRow, value as CPT, l.decCPTRate as CPTRate,
ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN
FROM
sampleData s
CROSS APPLY
STRING_SPLIT(sDelimCPT, ';')
JOIN
lookupCPT l ON value = l.sCPTCode
)
SELECT
numRow, sDelimCPT, sDelimQty, CPT, CPTRate, Qty, CPTRate * Qty as Total
FROM
sampleData
JOIN
CPT_CTE c on c.numRowC = sampleData.numRow
JOIN
Qty_CTE q on q.numRowQ = sampleData.numRow AND c.RN = q.RN

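If your STRING_SPLIT function preserves order, then this will work. Note that STRING_SPLIT does not guarantee the order of its output rows; on SQL Server 2022 and later, the optional enable_ordinal argument returns an ordinal column that can be used instead of ROW_NUMBER().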
WITH Qty_CTE (numRowQ, Qty, RN) AS
(
SELECT numRow, value, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN
FROM sampleData
CROSS APPLY STRING_SPLIT(sDelimQty, ';')
),
CPT_CTE (numRowC, CPT, RN) AS
(
SELECT numRow, value, ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS RN
FROM sampleData
CROSS APPLY STRING_SPLIT(sDelimCPT, ';')
)
SELECT * FROM sampleData
JOIN CPT_CTE c on c.numRowC = sampleData.numRow
JOIN Qty_CTE q on q.numRowQ = sampleData.numRow AND c.RN = q.RN
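To make the intent of the c.RN = q.RN condition concrete, here is a minimal sketch of the same idea in Python rather than T-SQL: give every part of each delimited string an explicit position, then pair parts that share a position. The function name pair_by_position and the in-memory rows are assumptions for illustration only.

```python
def pair_by_position(rows):
    """Pair the i-th quantity with the i-th CPT code of each source row.

    rows holds (num_row, delim_qty, delim_cpt) tuples that mirror the
    sampleData columns. Hypothetical helper, not part of the answer above.
    """
    paired = []
    for num_row, delim_qty, delim_cpt in rows:
        qtys = delim_qty.split(";")
        cpts = delim_cpt.split(";")
        # enumerate() plays the role of ROW_NUMBER(): an explicit ordinal.
        for ordinal, (qty, cpt) in enumerate(zip(qtys, cpts), start=1):
            paired.append((num_row, ordinal, int(qty), cpt))
    return paired


sample_data = [(1, "1;2", "123;789"), (2, "3", "456")]
result = pair_by_position(sample_data)
for row in result:
    print(row)
```

In this sketch the ordinal restarts for every source row, which corresponds to adding PARTITION BY numRow to the ROW_NUMBER() calls. The SQL above numbers all split rows globally instead, so it relies on both CTEs producing their rows in the same order.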

Both of your examples, q1 and q2, work as expected. Qty_CTE has a cardinality of 3 (record 1 appears twice, record 2 appears once), and so does CPT_CTE.
sampleData has a cardinality of 2 (each row appears only once).
Since you're joining on the PK, sampleData x Qty_CTE x CPT_CTE should return 5 records, which it does (1x2x2 records for numRow 1 and 1x1x1 record for numRow 2). If you remove either Qty_CTE or CPT_CTE, it should return 3 records, which it does (1x2 records for numRow 1 and 1x1 record for numRow 2).
We could propose a solution if you told us the expected result.
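The row counts in this explanation can be checked with a small sketch, written in Python rather than T-SQL. It assumes the two sample rows from the question and a hypothetical helper joined_row_count that multiplies the number of split parts per source row, which is what joining on numRow alone does.

```python
from itertools import product

# The two sampleData rows from the question: numRow -> (sDelimQty, sDelimCPT)
sample_data = {1: ("1;2", "123;789"), 2: ("3", "456")}


def joined_row_count(data, use_qty=True, use_cpt=True):
    """Count the rows produced when the split CTEs are joined on numRow only."""
    total = 0
    for delim_qty, delim_cpt in data.values():
        # A CTE that is left out of the join contributes a factor of 1.
        qtys = delim_qty.split(";") if use_qty else [None]
        cpts = delim_cpt.split(";") if use_cpt else [None]
        total += len(list(product(qtys, cpts)))
    return total


both_joins = joined_row_count(sample_data)
one_join = joined_row_count(sample_data, use_cpt=False)
print(both_joins, one_join)
```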

Related

Query to Find maximum possible combinations between two columns

The goal is to create all possible combinations of articles and suppliers from the two columns. Every article in the first column (100, 101, 102, 103) must appear in each combination.
Sample Code
create table basis
(article Integer,
supplier VarChar(10) );
Insert into basis Values (100, 'A');
Insert into basis Values (101, 'A');
Insert into basis Values (101, 'B');
Insert into basis Values (101, 'C');
Insert into basis Values (102, 'D');
Insert into basis Values (103, 'B');
Result set
combination_nr;article;supplier
1;100;'A'
1;101;'A'
1;102;'D'
1;103;'B'
2;100;'A'
2;101;'B'
2;102;'D'
2;103;'B'
3;100;'A'
3;101;'C'
3;102;'D'
3;103;'B'
Suppose we add one more row for article 102 with supplier 'A'. Our result set would then look like this.
According to the calculation given below, it now has 24 rows (6 combinations x 4 articles):
1;100;'A'
1;101;'A'
1;102;'A'
1;103;'B'
2;100;'A'
2;101;'A'
2;102;'D'
2;103;'B'
3;100;'A'
3;101;'B'
3;102;'A'
3;103;'B'
4;100;'A'
4;101;'B'
4;102;'D'
4;103;'B'
5;100;'A'
5;101;'C'
5;102;'A'
5;103;'B'
6;100;'A'
6;101;'C'
6;102;'D'
6;103;'B'
Already tried code
I have tried different cross joins, but they always return more rows than my expected result set.
SELECT article, supplier
FROM (SELECT DISTINCT supplier FROM basis) AS t1
CROSS JOIN (SELECT DISTINCT article FROM basis) AS t2;
Calculations:
article 100: 1 supplier ('A')
article 101: 3 suppliers ('A','B','C')
article 102: 1 supplier ('D')
article 103: 1 supplier ('B')
unique articles: 4 (100,101,102,103)
1x3x1x1 x 4 = 12 (combination rows)
You can do what you want using a recursive CTE. It is easier to put the combinations in single rows rather than across multiple rows:
with b as (
select b.*, dense_rank() over (order by article) as seqnum
from basis b
),
cte as (
select convert(varchar(max), concat(article, ':', supplier)) as suppliers, seqnum
from b
where seqnum = 1
union all
select concat(cte.suppliers, ',', concat(article, ':', supplier)), b.seqnum
from cte join
b
on b.seqnum = cte.seqnum + 1
)
select row_number() over (order by suppliers), suppliers
from (select cte.*, max(seqnum) over () as max_seqnum
from cte
) cte
where seqnum = max_seqnum;
For your particular result set, you can unroll the string:
with b as (
select b.*, dense_rank() over (order by article) as seqnum
from basis b
),
cte as (
select convert(varchar(max), concat(article, ':', supplier)) as suppliers, seqnum
from b
where seqnum = 1
union all
select concat(cte.suppliers, ',', concat(article, ':', supplier)), b.seqnum
from cte join
b
on b.seqnum = cte.seqnum + 1
)
select seqnum,
left(s.value, charindex(':', s.value) - 1) as article,
stuff(s.value, 1, charindex(':', s.value), '') as supplier
from (select row_number() over (order by suppliers) as seqnum, suppliers
from (select cte.*, max(seqnum) over () as max_seqnum
from cte
) cte
where seqnum = max_seqnum
) cte cross apply
string_split(suppliers, ',') s;
Here is a db<>fiddle.
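As a cross-check of the expected row counts, the same enumeration can be sketched outside the database. This is a minimal Python sketch, not the recursive CTE itself: it assumes the basis rows from the question and uses itertools.product to pick one supplier per article. The function name combinations is hypothetical.

```python
from itertools import product


def combinations(basis):
    """Return one tuple of (article, supplier) pairs per combination."""
    suppliers_by_article = {}
    for article, supplier in basis:
        suppliers_by_article.setdefault(article, []).append(supplier)
    # One list of choices per article, in ascending article order.
    choices = [
        [(article, supplier) for supplier in sorted(suppliers_by_article[article])]
        for article in sorted(suppliers_by_article)
    ]
    return list(product(*choices))


basis = [(100, "A"), (101, "A"), (101, "B"), (101, "C"), (102, "D"), (103, "B")]
combos = combinations(basis)
# Unroll each combination into (combination_nr, article, supplier) rows.
rows = [
    (nr, article, supplier)
    for nr, combo in enumerate(combos, start=1)
    for article, supplier in combo
]
print(len(combos), len(rows))
```

With the extra row (102, 'A') added, the same function yields 6 combinations and therefore 24 rows.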

Find all records within x units of each other

I have a table like this:
CREATE TABLE t(idx integer primary key, value integer);
INSERT INTO t(idx, value)
VALUES
(1, 1),
(2, 2),
(3, 3),
(4, 6),
(5, 7),
(6, 12)
I would like to return all the groups of records where the values are within 2 of each other, with an associated group label as a new column by which to identify them.
I thought perhaps a recursive query might be suitable...but my sql-fu is lacking.
You can use a recursive CTE:
with recursive tt as (
select t.*, row_number() over (order by idx) as seqnum
from t
),
cte as (
select idx, value, value as grp,
seqnum, 1 as lev
from tt
where seqnum = 1
union all
select tt.idx, tt.value,
(case when tt.value > grp + 2 then tt.value else cte.grp end),
tt.seqnum, 1 + lev
from cte join
tt
on tt.seqnum = cte.seqnum + 1
)
select *
from cte;
Here is a db<>fiddle. Note that this added a row with the value of "4" to show that the first four rows are split into two groups.
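The rule inside the CASE expression is easier to see as a plain loop. Below is a minimal Python sketch of that rule, assuming the values arrive already ordered by idx and are non-decreasing, as in the sample data. The function name label_groups and the width parameter are illustrative assumptions.

```python
def label_groups(values, width=2):
    """Label each value with the first value of its group.

    A new group starts whenever a value is more than `width` above the
    first value of the current group, mirroring `tt.value > grp + 2`.
    """
    labels = []
    group_start = None
    for value in values:
        if group_start is None or value > group_start + width:
            group_start = value
        labels.append((value, group_start))
    return labels


labels = label_groups([1, 2, 3, 6, 7, 12])
print(labels)
```

Adding the value 4, as the fiddle does, splits the first four rows into two groups: 1, 2, 3 stay in group 1, while 4 starts a new group.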
I assume you want to group rows so that any two values in a group differ by at most 2. In that case you are right, a recursive query is the solution. Each level of recursion computes the bounds of the next group. The groups are disjoint, so the final step joins the original table to the computed bounds and groups by the group number. Db fiddle here.
with recursive r (minv,maxv,level) as (
select min(t.value), min(t.value) + 2, 1
from t
union all
select minv, maxv, level from (
select t.value as minv, t.value + 2 as maxv, r.level + 1 as level, row_number() over (order by t.value) rn
from r
join t on t.value > r.maxv
) x where x.rn = 1
)
select r.level
, format('ids from %s to %s', min(t.idx), max(t.idx)) as id_label
, format('values from %s to %s', min(t.value), max(t.value)) as value_label
from t join r on t.value between r.minv and r.maxv
group by r.level
order by r.level
(The inner query in the recursive part only limits the newly added rows to one. The simpler clause select min(t.value), min(t.value) + 2 is not possible because aggregate functions are not allowed in the recursive part; the window function is a workaround.)
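The "compute the bounds first, then join" structure of this query can be sketched as follows. This is a Python illustration under the same at-most-2 assumption, not a translation of the SQL. The names group_bounds and members are hypothetical.

```python
def group_bounds(values, width=2):
    """Return (level, minv, maxv) for each group, lowest values first."""
    bounds = []
    remaining = sorted(values)
    level = 0
    while remaining:
        level += 1
        minv = remaining[0]  # smallest value not yet covered by a group
        maxv = minv + width
        bounds.append((level, minv, maxv))
        # Like `join t on t.value > r.maxv`: keep only values above the bound.
        remaining = [v for v in remaining if v > maxv]
    return bounds


values = [1, 2, 3, 6, 7, 12]
bounds = group_bounds(values)
# Final step: attach every value to the group whose bounds contain it.
members = {
    level: [v for v in values if minv <= v <= maxv]
    for level, minv, maxv in bounds
}
print(bounds)
print(members)
```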

How can I transform my N little queries into one query?

I have a query that gives me the first available value for a given date and key.
SELECT
TOP 1 value
FROM
my_table
WHERE
date >= 'myinputdate'
AND key = 'myinputkey'
ORDER BY date
I have N pairs of keys and dates, and I am trying to avoid querying each pair one by one. The table is rather big, and so is N, so this is currently heavy and slow.
How can I query all the pairs in one query?
A solution is to use APPLY like a "function" created on the fly with one or many columns from another set:
DECLARE @inputs TABLE (
myinputdate DATE,
myinputkey INT)
INSERT INTO @inputs(
myinputdate,
myinputkey)
VALUES
('2019-06-05', 1),
('2019-06-01', 2)
SELECT
I.myinputdate,
I.myinputkey,
R.value
FROM
@inputs AS I
CROSS APPLY (
SELECT TOP 1
T.value
FROM
my_table AS T
WHERE
T.date >= I.myinputdate AND
T.key = I.myinputkey
ORDER BY
T.date ) AS R
Use OUTER APPLY instead if inputs without a matching row should still be returned, with NULL as the value. The applied subquery can return multiple columns, and TOP with ORDER BY controls how many rows come back per input.
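To illustrate what APPLY does per input row, here is a minimal Python sketch. It assumes a small hypothetical my_table of (date, key, value) tuples; the function name first_value_on_or_after and the outer flag are illustrative, with outer=True standing in for OUTER APPLY.

```python
from datetime import date


def first_value_on_or_after(table, inputs, outer=False):
    """For each (input_date, input_key), pick the earliest matching value."""
    results = []
    for input_date, input_key in inputs:
        candidates = [
            (row_date, value)
            for row_date, key, value in table
            if key == input_key and row_date >= input_date
        ]
        if candidates:
            # TOP 1 ... ORDER BY date: the candidate with the smallest date.
            _, value = min(candidates, key=lambda c: c[0])
            results.append((input_date, input_key, value))
        elif outer:
            # OUTER APPLY keeps the input row and returns NULL for the value.
            results.append((input_date, input_key, None))
    return results


# Hypothetical sample rows: (date, key, value)
my_table = [
    (date(2019, 6, 1), 1, "a"),
    (date(2019, 6, 6), 1, "b"),
    (date(2019, 6, 10), 1, "c"),
    (date(2019, 5, 30), 2, "x"),
    (date(2019, 6, 2), 2, "y"),
]
inputs = [(date(2019, 6, 5), 1), (date(2019, 6, 1), 2), (date(2019, 6, 1), 3)]
print(first_value_on_or_after(my_table, inputs))
print(first_value_on_or_after(my_table, inputs, outer=True))
```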
This solution works without variables. You control N by setting the appropriate value in the row_num predicate.
There are plenty of ways to do what you want, and the right one depends on your specific needs. As already answered, you can store the conditions in a temp table or table variable and join on them using the same predicates. You can also create a user-defined table type and pass it as a parameter to a function or procedure. Or you can use CROSS APPLY with a VALUES clause to supply the list and join on it.
DROP TABLE IF EXISTS #temp;
CREATE TABLE #temp ( d DATE, k VARCHAR(100) );
GO
INSERT INTO #temp
VALUES ( '20180101', 'a' ),
( '20180102', 'b' ),
( '20180103', 'c' ),
( '20180104', 'd' ),
( '20190101', 'a' ),
( '20190102', 'b' ),
( '20180402', 'c' ),
( '20190103', 'c' ),
( '20190104', 'd' );
SELECT a.d ,
a.k
FROM ( SELECT d ,
k ,
ROW_NUMBER() OVER ( PARTITION BY k ORDER BY d DESC ) row_num
FROM #temp
WHERE (d >= '20180401'
AND k = 'a')
OR (d > '20180401'
AND k = 'b')
OR (d > '20180401'
AND k = 'c')
) a
WHERE a.row_num <= 1;
-- VALUES way
SELECT a.d ,
a.k
FROM ( SELECT t.d ,
t.k ,
ROW_NUMBER() OVER ( PARTITION BY t.k ORDER BY t.d DESC ) row_num
FROM #temp t
CROSS APPLY (VALUES('20180401','a'), ('20180401', 'b'), ('20180401', 'c')) f(d,k)
WHERE t.d >= f.d AND f.k = t.k
) a
WHERE a.row_num <= 1;
If all the keys are using the same date, then use window functions:
SELECT key, value
FROM (SELECT t.*, ROW_NUMBER() OVER (PARTITION BY key ORDER BY date) as seqnum
FROM my_table t
WHERE date >= @input_date AND
key IN ( . . . )
) t
WHERE seqnum = 1;
SELECT key, date,value
FROM (SELECT ROW_NUMBER() OVER (PARTITION BY key,date ORDER BY date) as rownum,key,date,value
FROM my_table
WHERE
date >= 'myinputdate'
) as d
WHERE d.rownum = 1;

How To Create Duplicate Records depending on Column which indicates on Repetition

I've got a table consisting of aggregated records, and I need to split them according to a specific column ('Shares Bought' in the example below), as follows:
Original Table:
Requested Table:
Needless to say, there are more records like that in the table, so I need an automated query (not manual insertions).
There are also some more attributes that I will need to duplicate (like the field 'Date').
You first need to generate a sequence of row numbers (1, 2, 3, ...) and then join it to your table so that each row is repeated shares_bought times.
E.g.:
create table t(rowid int, name varchar(100), shares_bought int, date_val date);
insert into t
select *
from (values (1,'Dan',2,'2018-08-23')
,(2,'Mirko',1,'2018-08-25')
,(3,'Shuli',3,'2018-05-14')
,(4,'Regina',1,'2018-01-19')
)t(x,y,z,a);
with generate_data
as (select top (select max(shares_bought) from t)
row_number() over(order by (select null)) as rnk /* This would generate rows starting from 1,2,3 etc*/
from sys.objects a
cross join sys.objects b
)
select row_number() over(order by t.rowid) as rowid,t.name,1 as shares_bought,t.date_val
from t
join generate_data gd
on gd.rnk <=t.shares_bought /* generate rows up and until the number of shares bought*/
order by 1
Here is a db fiddle link
https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=5736255585c3ab2c2964c655bec9e08b
declare @t table (rowid int, name varchar(100), sb int, dt date);
insert into @t values
(1, 'Dan', 2, '20180823'),
(2, 'Mirco', 1, '20180825'),
(3, 'Shuli', 3, '20180514'),
(4, 'Regina', 1, '20180119');
with nums as
(
select n
from (values(1), (2), (3), (4)) v(n)
)
select t.*
from @t t
cross apply (select top (t.sb) *
from nums) a;
Use a table of numbers instead of the nums CTE, or extend nums so it holds at least as many values as the largest Shares Bought value.
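The effect of joining to a numbers table is simply "repeat each row shares_bought times". Here is a minimal Python sketch of that expansion, using the sample rows from the first answer. The function name expand_rows is an assumption for illustration.

```python
def expand_rows(rows):
    """Repeat each (name, shares_bought, date_val) row once per share."""
    expanded = []
    for name, shares_bought, date_val in rows:
        # range() stands in for the numbers table: 1..shares_bought.
        for _ in range(shares_bought):
            expanded.append((name, 1, date_val))
    # Renumber the output rows from 1, like the outer ROW_NUMBER().
    return [(row_id, *row) for row_id, row in enumerate(expanded, start=1)]


source = [
    ("Dan", 2, "2018-08-23"),
    ("Mirko", 1, "2018-08-25"),
    ("Shuli", 3, "2018-05-14"),
    ("Regina", 1, "2018-01-19"),
]
expanded_rows = expand_rows(source)
for row in expanded_rows:
    print(row)
```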
Another option is to use a recursive CTE:
with t as (
select 1 as RowId, Name, ShareBought, Date
from table
union all
select RowId+1, Name, ShareBought, Date
from t
where RowId < ShareBought
)
select row_number() over (order by name) as RowId,
Name, 1 as ShareBought, Date
from t;
If ShareBought is not limited to small values such as 2 or 3, you will need the OPTION (MAXRECURSION 0) query hint, because by default a recursive CTE is limited to 100 levels of recursion.

concatenate recursive cross join

I need to concatenate the name in a recursive cross join way. I don't know how to do this, I have tried a CTE using WITH RECURSIVE but no success.
I have a table like this:
group_id | name
---------------
13 | A
13 | B
19 | C
19 | D
31 | E
31 | F
31 | G
Desired output:
combinations
------------
ACE
ACF
ACG
ADE
ADF
ADG
BCE
BCF
BCG
BDE
BDF
BDG
Of course, the results should multiply if I add a 4th (or more) group.
Native PostgreSQL syntax:
SqlFiddleDemo
WITH RECURSIVE cte1 AS
(
SELECT *, DENSE_RANK() OVER (ORDER BY group_id) AS rn
FROM mytable
),cte2 AS
(
SELECT
CAST(name AS VARCHAR(4000)) AS name,
rn
FROM cte1
WHERE rn = 1
UNION ALL
SELECT
CAST(CONCAT(c2.name,c1.name) AS VARCHAR(4000)) AS name
,c1.rn
FROM cte1 c1
JOIN cte2 c2
ON c1.rn = c2.rn + 1
)
SELECT name as combinations
FROM cte2
WHERE LENGTH(name) = (SELECT MAX(rn) FROM cte1)
ORDER BY name;
Before:
I hope you don't mind that I use SQL Server syntax:
Sample:
CREATE TABLE #mytable(
ID INTEGER NOT NULL
,TYPE VARCHAR(MAX) NOT NULL
);
INSERT INTO #mytable(ID,TYPE) VALUES (13,'A');
INSERT INTO #mytable(ID,TYPE) VALUES (13,'B');
INSERT INTO #mytable(ID,TYPE) VALUES (19,'C');
INSERT INTO #mytable(ID,TYPE) VALUES (19,'D');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'E');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'F');
INSERT INTO #mytable(ID,TYPE) VALUES (31,'G');
Main query:
WITH cte1 AS
(
SELECT *, rn = DENSE_RANK() OVER (ORDER BY ID)
FROM #mytable
),cte2 AS
(
SELECT
TYPE = CAST(TYPE AS VARCHAR(MAX)),
rn
FROM cte1
WHERE rn = 1
UNION ALL
SELECT
[Type] = CAST(CONCAT(c2.TYPE,c1.TYPE) AS VARCHAR(MAX))
,c1.rn
FROM cte1 c1
JOIN cte2 c2
ON c1.rn = c2.rn + 1
)
SELECT *
FROM cte2
WHERE LEN(Type) = (SELECT MAX(rn) FROM cte1)
ORDER BY Type;
LiveDemo
I've assumed that the order of the "cross join" follows ascending ID.
cte1 generates DENSE_RANK() because your IDs contain gaps.
cte2 is the recursive part, which uses CONCAT.
The main query just filters on the required length and sorts the strings.
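The three steps described above can be sketched as a loop, here in Python rather than SQL. This is an illustration of the logic only; the function name concat_combinations is hypothetical.

```python
def concat_combinations(rows):
    """Concatenate one name per group, groups taken in ascending id order."""
    # Step 1 (cte1): gapless group numbers, like DENSE_RANK().
    ids = sorted({group_id for group_id, _ in rows})
    rank = {group_id: i for i, group_id in enumerate(ids, start=1)}
    by_rank = {}
    for group_id, name in rows:
        by_rank.setdefault(rank[group_id], []).append(name)
    # Step 2 (cte2): start with group 1, then extend one group per level.
    partial = list(by_rank.get(1, []))
    for level in range(2, len(ids) + 1):
        partial = [prefix + name for prefix in partial for name in by_rank[level]]
    # Step 3 (main query): only complete strings remain; sort them.
    return sorted(partial)


mytable = [(13, "A"), (13, "B"), (19, "C"), (19, "D"),
           (31, "E"), (31, "F"), (31, "G")]
result = concat_combinations(mytable)
print(result)
```

Unlike the recursive CTE, which keeps the partial strings of every level and filters them by length at the end, this sketch keeps only the latest level, so no length filter is needed.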
The recursive query is a bit simpler in Postgres:
WITH RECURSIVE t AS ( -- to produce gapless group numbers
SELECT dense_rank() OVER (ORDER BY group_id) AS grp, name
FROM tbl
)
, cte AS (
SELECT grp, name
FROM t
WHERE grp = 1
UNION ALL
SELECT t.grp, c.name || t.name
FROM cte c
JOIN t ON t.grp = c.grp + 1
)
SELECT name AS combi
FROM cte
WHERE grp = (SELECT max(grp) FROM t)
ORDER BY 1;
The basic logic is the same as in the SQL Server version provided by @lad2025; I added a couple of minor improvements.
Or you can use a simple version if your maximum number of groups is not too big (can't be very big, really, since the result set grows exponentially). For a maximum of 5 groups:
WITH t AS ( -- to produce gapless group numbers
SELECT dense_rank() OVER (ORDER BY group_id) AS grp, name AS n
FROM tbl
)
SELECT concat(t1.n, t2.n, t3.n, t4.n, t5.n) AS combi
FROM (SELECT n FROM t WHERE grp = 1) t1
LEFT JOIN (SELECT n FROM t WHERE grp = 2) t2 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 3) t3 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 4) t4 ON true
LEFT JOIN (SELECT n FROM t WHERE grp = 5) t5 ON true
ORDER BY 1;
Probably faster for few groups. LEFT JOIN .. ON true makes this work even if higher levels are missing. concat() ignores NULL values. Test with EXPLAIN ANALYZE to be sure.
SQL Fiddle showing both.
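The fixed-size variant with LEFT JOIN ... ON true can be sketched like this. It is a Python illustration, assuming at most max_groups groups: a missing group contributes a single None, which is then skipped the way concat() skips NULL. The function name combine_up_to is hypothetical.

```python
from itertools import product


def combine_up_to(rows, max_groups=5):
    """Combine one name per group for at most `max_groups` groups."""
    ids = sorted({group_id for group_id, _ in rows})
    names_by_grp = {
        grp: [name for group_id, name in rows if group_id == gid]
        for grp, gid in enumerate(ids, start=1)
    }
    # Group 1 is the FROM table: if it is empty, there is no output at all.
    first = names_by_grp.get(1, [])
    # A missing later group yields one None, like LEFT JOIN ... ON true.
    later = [names_by_grp.get(grp) or [None] for grp in range(2, max_groups + 1)]
    combos = product(first, *later)
    # Skip None values, as concat() ignores NULL.
    return sorted("".join(n for n in combo if n is not None) for combo in combos)


tbl = [(13, "A"), (13, "B"), (19, "C"), (19, "D"),
       (31, "E"), (31, "F"), (31, "G")]
combis = combine_up_to(tbl)
print(combis)
```

As in the SQL version, groups beyond max_groups are silently ignored, so the recursive form is the safer choice when the number of groups is not known in advance.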