TSQL Cascade / Waterfall value from current row into the next [closed] - sql

Closed 10 years ago.
My earlier edits were a little muddled. Hopefully this clears it up ...
TL;DR: just copy and execute the two script blocks and it will become apparent.
I have a question on cascading data. Essentially, I am trying to move data down in a waterfall effect according to some predefined conditions (below). I've solved 15 of the 18 scenarios and need help with the remaining 3: the scenarios with GIDs 9, 10 and 18.
For a bit of perspective: data is continually imported into the system I'm working on. The data is sparse, and I'm working to reconstitute a full set of data to complete the import process. I have little control over the shape of the data in the system, or of the data that is provided to me :-/
Ultimately the question is: how do I satisfy the 5 cascading rules below or, alternatively, how do I solve test case #18 that I've provided in the script below?
The Cascade Rules
In this simplified scenario the 'rules' for cascading are as follows:
1. Data will be cascaded only within the same group (GID)
2. A group of data will be ordered starting at 1 (Seq)
3. The IsLive column will be either 1 or 0
4. If IsLive = 1 then move data down the rows until you encounter another IsLive = 1, or an IsLive = 0 which has a non-null value
5. If IsLive = 0 then move data down the rows until you hit another IsLive = 0 with a value
Note: My script is a simplified example, but in the full scenario there are N columns on which I need to cascade.
Solution Notes
If you run the SQL below you will see 4 columns of interest: Input, Output (the result of the CTE), Expected (the expected result) and Result (Pass/Fail). I have included a script that both creates the sample table and illustrates the test cases simply by executing it.
The test case script below has the sample data.
The test case script also has a column I appended for the correct expected value. (Look for GID=18 in the INSERT script.)
I hope someone can help; if not, I might have to resort to a SQL CLR SP solution. Also, I'm not tied to this solution: you may completely discard it and come up with something new.
Test Case
CREATE TABLE #Test (GID int, Seq int, IsLive bit,
Eff date,
Name varchar(50),
Expected varchar(50)) -- expected val should help debug!
INSERT INTO #Test VALUES (1, 1, 1, '01-08-2012', 'RTS', 'RTS')
INSERT INTO #Test VALUES (1, 2, 0, '01-09-2012', 'RTA', 'RTA')
INSERT INTO #Test VALUES (1, 3, 1, '01-10-2012', 'FSA', 'RTA')
INSERT INTO #Test VALUES (1, 4, 0, '01-11-2012', NULL, 'RTA')
INSERT INTO #Test VALUES (1, 5, 1, '01-12-2012', 'FSA', 'RTA')
INSERT INTO #Test VALUES (2, 1, 1, '01-08-2012', 'RTS', 'RTS')
INSERT INTO #Test VALUES (2, 2, 0, '01-09-2012', 'RTA', 'RTA')
INSERT INTO #Test VALUES (2, 3, 1, '01-10-2012', 'FSA', 'RTA')
INSERT INTO #Test VALUES (2, 4, 0, '01-11-2012', 'GSM', 'GSM')
INSERT INTO #Test VALUES (2, 5, 1, '01-12-2012', 'FSA', 'GSM')
INSERT INTO #Test VALUES (3, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (3, 2, 0, '01-02-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (4, 1, 1, '01-01-2012', NULL, NULL)
INSERT INTO #Test VALUES (4, 2, 0, '01-02-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (4, 3, 0, '01-03-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (5, 1, 0, '01-01-2012', NULL, NULL)
INSERT INTO #Test VALUES (5, 2, 1, '01-02-2012', 'LSI', 'LSI')
INSERT INTO #Test VALUES (5, 3, 0, '01-03-2012', NULL, 'LSI')
INSERT INTO #Test VALUES (6, 1, 1, '01-01-2012', NULL, NULL)
INSERT INTO #Test VALUES (6, 2, 0, '01-02-2012', 'LSI', 'LSI')
INSERT INTO #Test VALUES (6, 3, 1, '01-03-2012', NULL, 'LSI')
INSERT INTO #Test VALUES (7, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (7, 2, 0, '01-02-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (7, 3, 1, '01-03-2012', 'RTA', 'RTA')
INSERT INTO #Test VALUES (8, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (8, 2, 0, '01-02-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (8, 3, 1, '01-03-2012', NULL, NULL)
INSERT INTO #Test VALUES (9, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (9, 2, 1, '01-02-2012', NULL, NULL)
INSERT INTO #Test VALUES (9, 3, 1, '01-03-2012', 'RTS', 'RTS')
INSERT INTO #Test VALUES (10, 1, 1, '01-01-2012', 'FSA','FSA')
INSERT INTO #Test VALUES (10, 2, 1, '01-02-2012', 'GSM','GSM')
INSERT INTO #Test VALUES (10, 3, 1, '01-03-2012', 'RTS','RTS')
INSERT INTO #Test VALUES (11, 1, 0, '01-01-2012', 'NOP','NOP')
INSERT INTO #Test VALUES (11, 2, 1, '01-02-2012', 'TAP','NOP')
INSERT INTO #Test VALUES (11, 3, 1, '01-03-2012', 'STG','NOP')
INSERT INTO #Test VALUES (12, 1, 1, '01-01-2012', 'RTS','RTS')
INSERT INTO #Test VALUES (12, 2, 0, '01-02-2012', 'RTM','RTM')
INSERT INTO #Test VALUES (12, 3, 1, '01-03-2012', 'LSA','RTM')
INSERT INTO #Test VALUES (12, 4, 1, '01-03-2012', 'LSA','RTM')
INSERT INTO #Test VALUES (12, 5, 1, '01-03-2012', 'GSM','RTM')
INSERT INTO #Test VALUES (13, 1, 1, '01-08-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (13, 2, 0, '01-09-2012', NULL, 'BAR')
INSERT INTO #Test VALUES (13, 3, 1, '01-10-2012', 'TST','TST')
INSERT INTO #Test VALUES (14, 1, 1, '01-08-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (14, 2, 0, '01-09-2012', 'GIP','GIP')
INSERT INTO #Test VALUES (14, 3, 1, '01-10-2012', 'TST','GIP')
INSERT INTO #Test VALUES (15, 1, 1, '01-01-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (15, 2, 0, '01-02-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (15, 3, 1, '01-02-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (15, 4, 1, '01-02-2012', 'GYM','BAR')
INSERT INTO #Test VALUES (16, 1, 1, '01-02-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (16, 2, 0, '01-03-2012', NULL, 'BAR')
INSERT INTO #Test VALUES (16, 3, 1, '01-03-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (16, 4, 1, '01-03-2012', 'GYM','GYM')
INSERT INTO #Test VALUES (17, 1, 1, '01-02-2012', 'BAR', 'BAR')
INSERT INTO #Test VALUES (17, 2, 0, '01-03-2012', 'GIP', 'GIP')
INSERT INTO #Test VALUES (17, 3, 0, '01-03-2012', NULL, 'GIP')
INSERT INTO #Test VALUES (17, 4, 1, '01-03-2012', 'TST', 'GIP')
-- -------------------------------------------
-- Following is the GID=18 test case that fails
-- -------------------------------------------
INSERT INTO #Test VALUES (18, 1, 1, '01-02-2012', 'BAR', 'BAR')
INSERT INTO #Test VALUES (18, 2, 0, '01-03-2012', 'BAR', 'BAR')
INSERT INTO #Test VALUES (18, 3, 0, '01-03-2012', NULL, 'BAR')
INSERT INTO #Test VALUES (18, 4, 1, '01-03-2012', 'TST', 'BAR')
Solution
DECLARE @PrevNonLiveSeq int = NULL
;WITH CTE AS (
SELECT T.GID, T.SEQ, T.IsLive, Expected
, Name AS Name
, CASE WHEN T.IsLive = 0 THEN T.SEQ ELSE NULL END As PrevNonLiveSeq
, CASE WHEN T.IsLive = 1 THEN T.SEQ ELSE NULL END As PrevLiveSeq
, NULL AS PerNonLiveSeqCalc
, NULL AS PerLiveSeqCalc
, 0 PrevSeq
, CAST(NULL AS varchar(50)) PrevName
FROM #Test T
WHERE T.Seq = 1
UNION ALL
SELECT Curr.GID, Curr.SEQ, Curr.IsLive, Curr.Expected
,CASE WHEN Curr.IsLive = 0 THEN ISNULL(Curr.Name, Prev.Name)
ELSE CASE WHEN PrevNonLive.Name IS NULL THEN
CASE WHEN Prev.Name <> PrevLive.Name THEN Prev.Name ELSE Curr.Name END
ELSE Prev.Name END
END
,CASE WHEN Curr.IsLive = 0 THEN Curr.SEQ ELSE Prev.PrevNonLiveSeq END As PrevNonLiveSeq
,CASE WHEN Curr.IsLive = 1 THEN Curr.SEQ ELSE Prev.PrevLiveSeq END As PrevLiveSeq
, ISNULL(Prev.PrevNonLiveSeq, Curr.SEQ) AS PerNonLiveSeqCalc
, ISNULL(Prev.PrevLiveSeq, Curr.SEQ) AS PerLiveSeqCalc
, Prev.Seq PrevSeq, Prev.Name PrevName
FROM CTE Prev
JOIN #Test Curr ON Curr.GID = Prev.GID AND Curr.SEQ = Prev.SEQ+1
JOIN #Test PrevNonLive ON Prev.GID = PrevNonLive.GID AND PrevNonLive.SEQ = ISNULL(Prev.PrevNonLiveSeq, Curr.SEQ)
JOIN #Test PrevLive ON Prev.GID = PrevLive.GID AND PrevLive.SEQ = ISNULL(Prev.PrevLiveSeq, Curr.SEQ)
)
SELECT CTE.GID, CTE.Seq, T.IsLive
, T.Name Input, CTE.Name [Output]
, CASE WHEN CTE.Name = CTE.Expected OR (CTE.Name IS NULL AND CTE.Expected IS NULL) THEN 'Pass' ELSE 'FAIL' END AS Result
, CTE.Expected
FROM CTE
INNER JOIN #Test T on CTE.GID = T.GID AND CTE.Seq = T.Seq
ORDER BY CTE.GID, CTE.Seq
Results
For results please copy and run in SSMS
Thanks!

This should work and does not require the recursive CTE. You would just need to do the COALESCE for each of the actual fields you wanted to "cascade".
SELECT crrnt.*, COALESCE(cscd.Name, crrnt.Name) AS [Output]
FROM #Test crrnt
OUTER APPLY (
SELECT TOP 1 *
FROM #Test prir
WHERE prir.GID = crrnt.GID
AND prir.Seq < crrnt.Seq
AND (
(
crrnt.IsLive = 1
AND prir.IsLive = 0
AND prir.Name IS NOT NULL
)
OR (
crrnt.IsLive = 0
AND crrnt.Name IS NULL
AND (
(
prir.IsLive = 0
AND prir.Name IS NOT NULL
)
OR (
prir.IsLive = 1
AND NOT EXISTS(
SELECT *
FROM #Test confirm
WHERE confirm.GID = prir.GID
AND confirm.Seq < prir.Seq
AND confirm.IsLive = 0
AND confirm.Name IS NOT NULL
)
)
)
)
)
ORDER BY prir.Seq DESC
) cscd
Edit:
It is generally a good idea to test the performance of your queries, so the following does just that. The test consists of:
1. Start with the originally posted query and sample data
2. Use a Temp Table rather than a Table Variable (in practice the query will end up hitting a real User Table)
3. Create a Clustered Index on the Temp Table on (GID, Seq)
4. Duplicate the data, but with higher GID values (turning the 63 sample rows in 18 groups into 6,300,063 rows)
5. Ensure an equal environment with DBCC FREEPROCCACHE and DBCC DROPCLEANBUFFERS
6. Use STATISTICS IO and STATISTICS TIME
SET NOCOUNT ON
-- DROP TABLE #Test
IF (OBJECT_ID('tempdb.dbo.#Test') IS NULL)
BEGIN
CREATE TABLE #Test (GID INT NOT NULL, Seq INT NOT NULL, IsLive BIT NOT NULL,
Eff date,
Name varchar(50),
Expected varchar(50), -- expected val should help debug!
PRIMARY KEY(GID, Seq)
)
INSERT INTO #Test VALUES (1, 1, 1, '01-08-2012', 'RTS', 'RTS')
INSERT INTO #Test VALUES (1, 2, 0, '01-09-2012', 'RTA', 'RTA')
INSERT INTO #Test VALUES (1, 3, 1, '01-10-2012', 'FSA', 'RTA')
INSERT INTO #Test VALUES (1, 4, 0, '01-11-2012', NULL, 'RTA')
INSERT INTO #Test VALUES (1, 5, 1, '01-12-2012', 'FSA', 'RTA')
INSERT INTO #Test VALUES (2, 1, 1, '01-08-2012', 'RTS', 'RTS')
INSERT INTO #Test VALUES (2, 2, 0, '01-09-2012', 'RTA', 'RTA')
INSERT INTO #Test VALUES (2, 3, 1, '01-10-2012', 'FSA', 'RTA')
INSERT INTO #Test VALUES (2, 4, 0, '01-11-2012', 'GSM', 'GSM')
INSERT INTO #Test VALUES (2, 5, 1, '01-12-2012', 'FSA', 'GSM')
INSERT INTO #Test VALUES (3, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (3, 2, 0, '01-02-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (4, 1, 1, '01-01-2012', NULL, NULL)
INSERT INTO #Test VALUES (4, 2, 0, '01-02-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (4, 3, 0, '01-03-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (5, 1, 0, '01-01-2012', NULL, NULL)
INSERT INTO #Test VALUES (5, 2, 1, '01-02-2012', 'LSI', 'LSI')
INSERT INTO #Test VALUES (5, 3, 0, '01-03-2012', NULL, 'LSI')
INSERT INTO #Test VALUES (6, 1, 1, '01-01-2012', NULL, NULL)
INSERT INTO #Test VALUES (6, 2, 0, '01-02-2012', 'LSI', 'LSI')
INSERT INTO #Test VALUES (6, 3, 1, '01-03-2012', NULL, 'LSI')
INSERT INTO #Test VALUES (7, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (7, 2, 0, '01-02-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (7, 3, 1, '01-03-2012', 'RTA', 'RTA')
INSERT INTO #Test VALUES (8, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (8, 2, 0, '01-02-2012', NULL, 'FSA')
INSERT INTO #Test VALUES (8, 3, 1, '01-03-2012', NULL, NULL)
INSERT INTO #Test VALUES (9, 1, 1, '01-01-2012', 'FSA', 'FSA')
INSERT INTO #Test VALUES (9, 2, 1, '01-02-2012', NULL, NULL)
INSERT INTO #Test VALUES (9, 3, 1, '01-03-2012', 'RTS', 'RTS')
INSERT INTO #Test VALUES (10, 1, 1, '01-01-2012', 'FSA','FSA')
INSERT INTO #Test VALUES (10, 2, 1, '01-02-2012', 'GSM','GSM')
INSERT INTO #Test VALUES (10, 3, 1, '01-03-2012', 'RTS','RTS')
INSERT INTO #Test VALUES (11, 1, 0, '01-01-2012', 'NOP','NOP')
INSERT INTO #Test VALUES (11, 2, 1, '01-02-2012', 'TAP','NOP')
INSERT INTO #Test VALUES (11, 3, 1, '01-03-2012', 'STG','NOP')
INSERT INTO #Test VALUES (12, 1, 1, '01-01-2012', 'RTS','RTS')
INSERT INTO #Test VALUES (12, 2, 0, '01-02-2012', 'RTM','RTM')
INSERT INTO #Test VALUES (12, 3, 1, '01-03-2012', 'LSA','RTM')
INSERT INTO #Test VALUES (12, 4, 1, '01-03-2012', 'LSA','RTM')
INSERT INTO #Test VALUES (12, 5, 1, '01-03-2012', 'GSM','RTM')
INSERT INTO #Test VALUES (13, 1, 1, '01-08-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (13, 2, 0, '01-09-2012', NULL, 'BAR')
INSERT INTO #Test VALUES (13, 3, 1, '01-10-2012', 'TST','TST')
INSERT INTO #Test VALUES (14, 1, 1, '01-08-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (14, 2, 0, '01-09-2012', 'GIP','GIP')
INSERT INTO #Test VALUES (14, 3, 1, '01-10-2012', 'TST','GIP')
INSERT INTO #Test VALUES (15, 1, 1, '01-01-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (15, 2, 0, '01-02-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (15, 3, 1, '01-02-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (15, 4, 1, '01-02-2012', 'GYM','BAR')
INSERT INTO #Test VALUES (16, 1, 1, '01-02-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (16, 2, 0, '01-03-2012', NULL, 'BAR')
INSERT INTO #Test VALUES (16, 3, 1, '01-03-2012', 'BAR','BAR')
INSERT INTO #Test VALUES (16, 4, 1, '01-03-2012', 'GYM','GYM')
INSERT INTO #Test VALUES (17, 1, 1, '01-02-2012', 'BAR', 'BAR')
INSERT INTO #Test VALUES (17, 2, 0, '01-03-2012', 'GIP', 'GIP')
INSERT INTO #Test VALUES (17, 3, 0, '01-03-2012', NULL, 'GIP')
INSERT INTO #Test VALUES (17, 4, 1, '01-03-2012', 'TST', 'GIP')
-- -------------------------------------------
-- Following is the GID=18 test case that fails
-- -------------------------------------------
INSERT INTO #Test VALUES (18, 1, 1, '01-02-2012', 'BAR', 'BAR')
INSERT INTO #Test VALUES (18, 2, 0, '01-03-2012', 'BAR', 'BAR')
INSERT INTO #Test VALUES (18, 3, 0, '01-03-2012', NULL, 'BAR')
INSERT INTO #Test VALUES (18, 4, 1, '01-03-2012', 'TST', 'BAR')
CHECKPOINT
INSERT INTO #Test (GID, Seq, IsLive, Eff, Name, Expected)
SELECT tmp.GID + (multiplier.Num * 20) AS [GID], tmp.Seq, tmp.IsLive, tmp.Eff, tmp.Name, tmp.Expected
FROM #Test tmp
CROSS JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS [Num]
FROM master.sys.objects so1
CROSS JOIN master.sys.objects so2
CROSS JOIN master.sys.objects so3
) multiplier
WHERE multiplier.Num <= 100000
CHECKPOINT
SELECT COUNT(*) FROM #Test
ALTER INDEX ALL ON #Test REBUILD
-- SELECT TOP 1000 * FROM #Test ORDER BY GID, Seq
END /* IF (OBJECT_ID('tempdb.dbo.#Test') IS NULL) */
-----------------------------------------------------------------------------
DBCC FREEPROCCACHE WITH NO_INFOMSGS
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS
PRINT '-- Original solution (Recursive CTE):'
PRINT ''
SET STATISTICS IO ON
SET STATISTICS TIME ON
;WITH CTE AS (
SELECT T.GID, T.SEQ, T.IsLive, Expected
, Name AS Name
, CASE WHEN T.IsLive = 0 THEN T.SEQ ELSE NULL END As PrevNonLiveSeq
, CASE WHEN T.IsLive = 1 THEN T.SEQ ELSE NULL END As PrevLiveSeq
, NULL AS PerNonLiveSeqCalc
, NULL AS PerLiveSeqCalc
, 0 PrevSeq
, CAST(NULL AS varchar(50)) PrevName
FROM #Test T
WHERE T.Seq = 1
UNION ALL
SELECT Curr.GID, Curr.SEQ, Curr.IsLive, Curr.Expected
,CASE WHEN Curr.IsLive = 0 THEN ISNULL(Curr.Name, Prev.Name)
ELSE CASE WHEN PrevNonLive.Name IS NULL THEN
CASE WHEN Prev.Name <> PrevLive.Name THEN Prev.Name ELSE Curr.Name END
ELSE Prev.Name END
END
,CASE WHEN Curr.IsLive = 0 THEN Curr.SEQ ELSE Prev.PrevNonLiveSeq END As PrevNonLiveSeq
,CASE WHEN Curr.IsLive = 1 THEN Curr.SEQ ELSE Prev.PrevLiveSeq END As PrevLiveSeq
, ISNULL(Prev.PrevNonLiveSeq, Curr.SEQ) AS PerNonLiveSeqCalc
, ISNULL(Prev.PrevLiveSeq, Curr.SEQ) AS PerLiveSeqCalc
, Prev.Seq PrevSeq, Prev.Name PrevName
FROM CTE Prev
JOIN #Test Curr ON Curr.GID = Prev.GID AND Curr.SEQ = Prev.SEQ+1
JOIN #Test PrevNonLive ON Prev.GID = PrevNonLive.GID AND PrevNonLive.SEQ = ISNULL(Prev.PrevNonLiveSeq, Curr.SEQ)
JOIN #Test PrevLive ON Prev.GID = PrevLive.GID AND PrevLive.SEQ = ISNULL(Prev.PrevLiveSeq, Curr.SEQ)
)
SELECT CTE.GID, CTE.Seq, T.IsLive
, T.Name Input, CTE.Name [Output]
, CASE WHEN CTE.Name = CTE.Expected OR (CTE.Name IS NULL AND CTE.Expected IS NULL) THEN 'Pass' ELSE 'FAIL' END AS Result
, CTE.Expected
FROM CTE
INNER JOIN #Test T on CTE.GID = T.GID AND CTE.Seq = T.Seq
ORDER BY CTE.GID, CTE.Seq
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
PRINT '=================================================='
------------------------------------------------------
DBCC FREEPROCCACHE WITH NO_INFOMSGS
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS
PRINT '-- Proposed solution (OUTER APPLY):'
PRINT ''
SET STATISTICS IO ON
SET STATISTICS TIME ON
SELECT crrnt.GID, crrnt.Seq, crrnt.IsLive,
COALESCE(cscd.Name, crrnt.Name) AS [Output],
CASE
WHEN COALESCE(COALESCE(cscd.Name, crrnt.Name), '~~~') = COALESCE(crrnt.Expected, '~~~') THEN 'Pass'
ELSE 'FAIL'
END AS [Result],
crrnt.Expected
FROM #Test crrnt
OUTER APPLY (
SELECT TOP 1 *
FROM #Test prir
WHERE prir.GID = crrnt.GID
AND prir.Seq < crrnt.Seq
AND (
(
crrnt.IsLive = 1
AND prir.IsLive = 0
AND prir.Name IS NOT NULL
)
OR (
crrnt.IsLive = 0
AND crrnt.Name IS NULL
AND (
(
prir.IsLive = 0
AND prir.Name IS NOT NULL
)
OR (
prir.IsLive = 1
AND NOT EXISTS(
SELECT *
FROM #Test confirm
WHERE confirm.GID = prir.GID
AND confirm.Seq < prir.Seq
AND confirm.IsLive = 0
AND confirm.Name IS NOT NULL
)
)
)
)
)
ORDER BY prir.Seq DESC
) cscd
SET STATISTICS TIME OFF
SET STATISTICS IO OFF
-----------------------------------
My execution of the above test shows:
Original Query: CPU time = 173031 ms, elapsed time = 252708 ms, logical reads = 97,538,739
Proposed Query: CPU time = 49125 ms, elapsed time = 74003 ms, logical reads = 17,747,775
Hence, the original query is about 3.5 times slower in both CPU and elapsed time, and performs about 5.5 times as many logical reads as my proposed query. Be careful with Recursive CTEs ;-).
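A further alternative, not part of the answer above and not benchmarked here, is to find the "source" row with window functions instead of a correlated TOP 1 lookup. The following is only a minimal sketch, assuming SQL Server 2012 or later (for the ROWS frame) and that (GID, Seq) is unique. The names @T, Marked, NonLiveSeq and LiveSeq are made up for illustration, and the sample rows are just GIDs 9 and 18 from the test data.

```sql
-- Illustrative table variable holding two of the test groups (GID 9 and 18).
DECLARE @T TABLE (GID int, Seq int, IsLive bit, Name varchar(50),
                  PRIMARY KEY (GID, Seq));
INSERT INTO @T VALUES
    (9, 1, 1, 'FSA'), (9, 2, 1, NULL), (9, 3, 1, 'RTS'),
    (18, 1, 1, 'BAR'), (18, 2, 0, 'BAR'), (18, 3, 0, NULL), (18, 4, 1, 'TST');

WITH Marked AS (
    SELECT t.GID, t.Seq, t.IsLive, t.Name,
           -- Seq of the latest PRIOR non-live row that has a value
           MAX(CASE WHEN t.IsLive = 0 AND t.Name IS NOT NULL THEN t.Seq END)
               OVER (PARTITION BY t.GID ORDER BY t.Seq
                     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS NonLiveSeq,
           -- Seq of the latest PRIOR live row (with or without a value)
           MAX(CASE WHEN t.IsLive = 1 THEN t.Seq END)
               OVER (PARTITION BY t.GID ORDER BY t.Seq
                     ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS LiveSeq
    FROM @T t
)
SELECT m.GID, m.Seq, m.IsLive, m.Name AS Input,
       CASE WHEN m.IsLive = 1       THEN COALESCE(nl.Name, m.Name)  -- live row: a prior non-live value wins
            WHEN m.Name IS NOT NULL THEN m.Name                     -- non-live row with its own value
            ELSE COALESCE(nl.Name, lv.Name)                         -- empty non-live row: inherit
       END AS [Output]
FROM Marked m
LEFT JOIN @T nl ON nl.GID = m.GID AND nl.Seq = m.NonLiveSeq
LEFT JOIN @T lv ON lv.GID = m.GID AND lv.Seq = m.LiveSeq
ORDER BY m.GID, m.Seq;
-- Output column: GID 9  -> FSA, NULL, RTS
--                GID 18 -> BAR, BAR, BAR, BAR
```

The CASE expression is intended to mirror the three branches of the OUTER APPLY predicate, so it should be checked against the full Expected column before relying on it, and timed with the same STATISTICS IO / STATISTICS TIME harness.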

Related

summing by rows sql

I attempted to do it using the analytical function, but it appears that I did so improperly...
How can I receive the output from the table I've been given?
CREATE TABLE rides (
ride_id INT,
driver_id INT,
ride_in_kms INT,
ride_fare FLOAT,
ride_date DATE
);
INSERT INTO rides VALUES (1, 1, 3, 4.45, "2016-05-16");
INSERT INTO rides VALUES (2, 1, 4, 8.46, "2016-05-16");
INSERT INTO rides VALUES (3, 2, 6, 11.9, "2016-05-16");
INSERT INTO rides VALUES (4, 3, 3, 6.76, "2016-05-16");
INSERT INTO rides VALUES (5, 2, 6, 13.55, "2016-05-16");
INSERT INTO rides VALUES (6, 4, 3, 4.91, "2016-05-20");
INSERT INTO rides VALUES (7, 1, 7, 16.77, "2016-05-20");
INSERT INTO rides VALUES (8, 3, 9, 16.18, "2016-05-20");
INSERT INTO rides VALUES (9, 2, 3, 6.07, "2016-05-20");
INSERT INTO rides VALUES (10, 4, 4, 6.25, "2016-05-20");
Output result
Thanks in advance
The general gist is to use an expression within the sum() to operate on the correct rows:
select
driver_id,
sum(case when ride_date = "2016-05-16" then ride_in_kms else 0 end) `KMS_MAY_16`,
sum(case when ride_date = "2016-05-20" then ride_in_kms else 0 end) `KMS_MAY_20`
from rides
group by driver_id;
The particular syntax available, and how to express the column label, depend on which database you are using.
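For comparison, here is a sketch of the same conditional aggregation written with single-quoted string literals and plain column aliases, which SQL Server and PostgreSQL accept. To keep it self-contained, the rides data is supplied inline through a VALUES list (an assumption made for this example; ride_fare is left out because it is not used).

```sql
SELECT r.driver_id,
       -- each SUM only counts kilometres from rows on its own date
       SUM(CASE WHEN r.ride_date = '2016-05-16' THEN r.ride_in_kms ELSE 0 END) AS KMS_MAY_16,
       SUM(CASE WHEN r.ride_date = '2016-05-20' THEN r.ride_in_kms ELSE 0 END) AS KMS_MAY_20
FROM (VALUES
        (1, 1, 3, '2016-05-16'), (2, 1, 4, '2016-05-16'),
        (3, 2, 6, '2016-05-16'), (4, 3, 3, '2016-05-16'),
        (5, 2, 6, '2016-05-16'), (6, 4, 3, '2016-05-20'),
        (7, 1, 7, '2016-05-20'), (8, 3, 9, '2016-05-20'),
        (9, 2, 3, '2016-05-20'), (10, 4, 4, '2016-05-20')
     ) AS r (ride_id, driver_id, ride_in_kms, ride_date)
GROUP BY r.driver_id
ORDER BY r.driver_id;
-- driver_id  KMS_MAY_16  KMS_MAY_20
-- 1          7           7
-- 2          12          3
-- 3          3           9
-- 4          0           7
```

The ELSE 0 branch is what makes driver 4 show 0 rather than NULL for the date on which that driver had no rides.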

Can I improve this query for use in large tables?

How can I improve this query for use in large tables....?
I use a table ('DataValues') to store a collection of values ('Value') per visit ('Visit_id'), i.e. it records certain values for each visit.
I use a table ('MatchItems') to store dynamic match sets ('MatchSet') of values ('Value'); a set can contain any number of values. The table also has an IsNeg field to indicate that the match requires the value to be absent from the visit's collection.
This allows me to dynamically match visits that conform to certain criteria, such as:
Must contain values A, B and C and NOT D, OR C and B and NOT A.
i.e. (Value = A and Value = B and Value = C and Value /= D)
or (Value = C and Value = B and Value /= A)
I have a query that delivers a reasonable solution (fiddle):
CREATE TABLE DataValues (
id NUMBER(5) CONSTRAINT DataValues_pk PRIMARY KEY,
Visit_id Number(5) ,
Value varchar(5)
);
INSERT INTO DataValues VALUES (1, 1, 'M');
INSERT INTO DataValues VALUES (2, 1, 'I');
INSERT INTO DataValues VALUES (3, 1, 'C');
INSERT INTO DataValues VALUES (4, 1, 'K');
INSERT INTO DataValues VALUES (5, 1, 'E');
INSERT INTO DataValues VALUES (6, 1, 'Y');
INSERT INTO DataValues VALUES (7, 2, 'M');
INSERT INTO DataValues VALUES (8, 2, 'O');
INSERT INTO DataValues VALUES (9, 2, 'U');
INSERT INTO DataValues VALUES (10, 2, 'S');
INSERT INTO DataValues VALUES (11, 2, 'E');
INSERT INTO DataValues VALUES (12, 3, 'C');
INSERT INTO DataValues VALUES (13, 3, 'A');
INSERT INTO DataValues VALUES (14, 3, 'T');
INSERT INTO DataValues VALUES (15, 4, 'S');
INSERT INTO DataValues VALUES (16, 4, 'A');
INSERT INTO DataValues VALUES (17, 4, 'T');
INSERT INTO DataValues VALUES (18, 5, 'M');
INSERT INTO DataValues VALUES (19, 5, 'A');
INSERT INTO DataValues VALUES (20, 5, 'T');
CREATE TABLE MatchItems (
id NUMBER(5) CONSTRAINT MatchItems_pk PRIMARY KEY,
MatchSet Number(5),
Value VARCHAR(5),
IsNeg NUMBER(1) NOT NULL CHECK (IsNeg in (0,1))
);
INSERT INTO MatchItems VALUES (1, 1, 'M', 0);
INSERT INTO MatchItems VALUES (2, 1, 'I', 0);
INSERT INTO MatchItems VALUES (3, 1, 'C', 0);
INSERT INTO MatchItems VALUES (4, 1, 'K', 0);
INSERT INTO MatchItems VALUES (5, 1, 'E', 0);
INSERT INTO MatchItems VALUES (6, 1, 'Y', 0);
INSERT INTO MatchItems VALUES (7, 2, 'C', 0);
INSERT INTO MatchItems VALUES (8, 2, 'A', 0);
INSERT INTO MatchItems VALUES (9, 3, 'A', 0);
INSERT INTO MatchItems VALUES (10, 3, 'T', 0);
INSERT INTO MatchItems VALUES (11, 4, 'S', 1);
INSERT INTO MatchItems VALUES (12, 4, 'A', 0);
INSERT INTO MatchItems VALUES (13, 4, 'K', 1);
INSERT INTO MatchItems VALUES (14, 5, 'A', 0);
INSERT INTO MatchItems VALUES (15, 5, 'T', 0);
SELECT
MatchItems.MatchSet,
DataValues.Visit_id,
GpMatchItems.Count TgtCount,
Count(MatchItems.Id),
sum(MatchItems.IsNeg)
FROM DataValues
LEFT JOIN MatchItems ON MatchItems.Value = DataValues.Value
--AND MatchItems.MatchSet = 4
LEFT JOIN (SELECT
MatchItems.MatchSet,
count(*) Count
FROM MatchItems
WHERE
MatchItems.IsNeg = 0
GROUP BY
MatchItems.MatchSet) GpMatchItems ON GpMatchItems.MatchSet = MatchItems.MatchSet
HAVING
Count(MatchItems.Id) = GpMatchItems.Count
AND sum(MatchItems.IsNeg) = 0
GROUP BY
MatchItems.MatchSet,
DataValues.Visit_id,
GpMatchItems.Count
How can I improve the performance of this query where the DataValues table contains 100m records, and MatchItems may include a collection of 50 sets each of 2 - 20 values?
You can try this version using Analytic functions and see if it performs any better. This query removes the subquery GpMatchItems that you are joining with.
SELECT DISTINCT matchset,
visit_id,
tgtcount,
match_visit_count,
isneg_sum
FROM (SELECT MatchItems.MatchSet,
DataValues.Visit_id,
COUNT (DISTINCT CASE MatchItems.IsNeg WHEN 0 THEN MatchItems.id ELSE NULL END)
OVER (PARTITION BY MatchItems.MatchSet)
AS tgtcount,
COUNT (*) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS match_visit_count,
SUM (MatchItems.IsNeg) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS isneg_sum
FROM DataValues LEFT JOIN MatchItems ON MatchItems.VALUE = DataValues.VALUE)
WHERE tgtcount = match_visit_count AND isneg_sum = 0;
I have adjusted EJ's suggestion to include a LEFT JOIN to collect the tgtCount to identify the total number of good matches required in each MatchSet:
SELECT DISTINCT matchset,
visit_id,
tgtcount,
match_visit_count,
isneg_sum
FROM (SELECT MatchItems.MatchSet,
DataValues.Visit_id,
GpMatchItems.Count AS tgtcount,
COUNT (*) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS match_visit_count,
SUM (MatchItems.IsNeg) OVER (PARTITION BY MatchItems.MatchSet, DataValues.Visit_id)
AS isneg_sum
FROM DataValues
LEFT JOIN MatchItems ON MatchItems.VALUE = DataValues.VALUE
LEFT JOIN ( SELECT
MatchItems.MatchSet,
count(*) Count
FROM MatchItems
WHERE MatchItems.IsNeg = 0
GROUP BY
MatchItems.MatchSet) GpMatchItems
ON GpMatchItems.MatchSet = MatchItems.MatchSet
)
WHERE
tgtcount = match_visit_count
AND isneg_sum = 0;

MINIMUM on second column, take first and third

DECLARE @Foo TABLE (Id INT, PozId INT, Val INT)
INSERT @Foo (Id, PozId, Val)
VALUES
(1, 1, 34),
(1, 2, 976),
(2, 1, 235),
(2, 2, 792),
(3, 2, 456),
(3, 3, 123)
How do I get results like this from the data above?
(1, 1, 34)
(2, 1, 235)
(3, 2, 456)
This gives you the desired result. The query partitions the rows by Id and picks the row with the lowest PozId in each partition.
DECLARE @Foo TABLE
(
Id INT, PozId INT, Val INT
);
INSERT @Foo
(Id, PozId, Val)
VALUES
(1, 1, 34)
, (1, 2, 976)
, (2, 1, 235)
, (2, 2, 792)
, (3, 2, 456)
, (3, 3, 123);
SELECT Id, PozId, Val
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Id ORDER BY PozId) AS RowNo, *
FROM @Foo
) AS T
WHERE RowNo = 1;
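An equivalent formulation, shown here only as a sketch, computes MIN(PozId) per Id and joins back to the rows. The data is inlined through a CTE called Foo (a name chosen for this example) so the statement runs on its own.

```sql
WITH Foo (Id, PozId, Val) AS (
    SELECT v.Id, v.PozId, v.Val
    FROM (VALUES (1, 1, 34), (1, 2, 976), (2, 1, 235),
                 (2, 2, 792), (3, 2, 456), (3, 3, 123)) AS v (Id, PozId, Val)
)
SELECT f.Id, f.PozId, f.Val
FROM Foo AS f
JOIN (
    -- lowest PozId for each Id
    SELECT Id, MIN(PozId) AS MinPozId
    FROM Foo
    GROUP BY Id
) AS m
  ON m.Id = f.Id
 AND m.MinPozId = f.PozId
ORDER BY f.Id;
-- returns (1, 1, 34), (2, 1, 235), (3, 2, 456)
```

One difference in behaviour: if two rows of the same Id share the lowest PozId, this version returns both of them, whereas the ROW_NUMBER() version keeps exactly one (chosen arbitrarily unless the ORDER BY is extended with a tie-breaker).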

Populate Ordinal column sequentially

I would like to populate ordinal column but don't want to loop through records. Is there any way to do it in single update?
CREATE TABLE #Sample
(PrimaryKey Int NOT NULL,
ParentKey Int NOT NULL,
Ordinal Int NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (1, 1, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (2, 1, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (3, 1, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (4, 2, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (5, 2, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (6, 3, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (7, 4, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (8, 4, NULL)
INSERT #Sample (PrimaryKey, ParentKey, Ordinal) VALUES (9, 5, NULL)
SELECT * FROM #Sample
DROP TABLE #Sample
Values in the Ordinal column would be 1, 2, 3, 1, 2, 1, 1, 2, 1
I want to number within each group. A group is defined by "ParentKey", and Ordinal should follow the sort order of "PrimaryKey".
Important! I can't rely on the values in PrimaryKey and ParentKey. They have "holes" and do not necessarily increment by 1 as shown in my sample.
Assuming SQL Server 2005+:
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY ParentKey ORDER BY PrimaryKey )
FROM #Sample
)
UPDATE CTE
SET Ordinal = RN
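To illustrate that this does not depend on the key values being contiguous, here is a small self-contained sketch with deliberate "holes" in both keys. The table name #Gappy and its values are invented for the example.

```sql
CREATE TABLE #Gappy
(PrimaryKey Int NOT NULL,
 ParentKey  Int NOT NULL,
 Ordinal    Int NULL);

INSERT INTO #Gappy (PrimaryKey, ParentKey)
VALUES (10, 100), (12, 100), (17, 100), (20, 200), (31, 200), (40, 300);

WITH Numbered AS
(
    -- RN restarts at 1 for every ParentKey and follows PrimaryKey order
    SELECT Ordinal,
           RN = ROW_NUMBER() OVER (PARTITION BY ParentKey ORDER BY PrimaryKey)
    FROM #Gappy
)
UPDATE Numbered
SET Ordinal = RN;   -- the update flows through the CTE to #Gappy

SELECT PrimaryKey, ParentKey, Ordinal
FROM #Gappy
ORDER BY PrimaryKey;
-- Ordinal comes back as 1, 2, 3, 1, 2, 1

DROP TABLE #Gappy;
```

Because ROW_NUMBER() only looks at the relative order of PrimaryKey inside each ParentKey partition, gaps in either column have no effect on the result.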

SQL View union from joiner table

I have the following situation and I am not sure how best to address it. Any guidance on how to prepare the needed view would be greatly appreciated.
I have 4 tables:
users (userid int, username varchar)
roles (roleid int, rolename varchar)
businessunit (buid int, buname varchar)
user_role_map (userid, roleid, buid)
In the roles table I have a role with the id of 0 which is the "system admin" role and in the businessunit table I have an IT business unit. Any users resulting from the below query would be considered system admins and should have full access to every business unit.
SELECT userid FROM user_role_map WHERE roleid = 0 AND buid = 0
I need to build a view that shows all "non system admins" union'd to a list of every business unit and every "system admin" user. The first part is easy with the below query, but the second part is what I am struggling with.
SELECT userid, roleid, buid FROM user_role_map WHERE roleid > 0 AND buid > 0
I will give some example data to help illustrate what I am trying to accomplish:
users
---------------
1, "sysAdmin"
2, "salesUser1"
3, "serviceUser1"
4, "manager1"
5, "salesUser2"
6, "serviceUser2"
7, "manager2"
roles
---------------
0, "SystemAdmin"
1, "Full"
2, "Update"
3, "Read"
businessunit
---------------
0, "IT"
1, "fooSales"
2, "fooService"
3, "barSales"
4, "barService"
user_role_map
---------------
1, 0, 0
2, 1, 1
2, 3, 3
3, 1, 2
3, 3, 4
4, 1, 1
4, 1, 2
5, 1, 3
5, 3, 1
6, 1, 4
6, 3, 3
7, 1, 2
7, 1, 4
Finally, I need the view to provide the following for the above sample data (note the last 4 rows):
new view
---------------
2, 1, 1
2, 3, 3
3, 1, 2
3, 3, 4
4, 1, 1
4, 1, 2
5, 1, 3
5, 3, 1
6, 1, 4
6, 3, 3
7, 1, 2
7, 1, 4
1, 1, 1
1, 1, 2
1, 1, 3
1, 1, 4
NOTE: the example data here only has one "System Admin" user but there could be any number of users of this type.
You should be able to do something like this:
declare @users table(userid int, username varchar(255));
insert into @users values (1, 'sysAdmin');
insert into @users values (2, 'salesUser1');
insert into @users values (3, 'serviceUser1');
insert into @users values (4, 'manager1');
insert into @users values (5, 'salesUser2');
insert into @users values (6, 'serviceUser2');
insert into @users values (7, 'manager2');
declare @roles table(roleid int, rolename varchar(255));
INSERT INTO @roles VALUES (0, 'SystemAdmin');
INSERT INTO @roles VALUES (1, 'Full');
INSERT INTO @roles VALUES (2, 'Update');
INSERT INTO @roles VALUES (3, 'Read');
DECLARE @user_role_map TABLE(userid INT, roleid INT, buid int)
INSERT INTO @user_role_map values (1, 0, 0);
INSERT INTO @user_role_map values (2, 1, 1);
INSERT INTO @user_role_map values (2, 3, 3);
INSERT INTO @user_role_map values (3, 1, 2);
INSERT INTO @user_role_map values (3, 3, 4);
INSERT INTO @user_role_map values (4, 1, 1);
INSERT INTO @user_role_map values (4, 1, 2);
INSERT INTO @user_role_map values (5, 1, 3);
INSERT INTO @user_role_map values (5, 3, 1);
INSERT INTO @user_role_map values (6, 1, 4);
INSERT INTO @user_role_map values (6, 3, 3);
INSERT INTO @user_role_map values (7, 1, 2);
INSERT INTO @user_role_map values (7, 1, 4);
DECLARE @businessunit TABLE(buid int, buidname VARCHAR(255));
INSERT INTO @businessunit VALUES (0, 'IT')
INSERT INTO @businessunit VALUES (1, 'fooSales')
INSERT INTO @businessunit VALUES (2, 'fooService')
INSERT INTO @businessunit VALUES (3, 'barSales')
INSERT INTO @businessunit VALUES (4, 'barService')
--non-admin users
SELECT userid, roleid, buid
FROM @user_role_map
WHERE
roleid > 0 AND buid > 0
UNION ALL
--get admin users and add a full control entry
SELECT userid, 1, BusinessUnits.buid
FROM @user_role_map m
CROSS JOIN(
--use this if you have a businessunit table you can leverage; otherwise,
--you can select distinct buid on role_map where buid > 0
SELECT buid
FROM @businessunit
WHERE buid > 0
) AS BusinessUnits
WHERE
roleid = 0 AND m.buid = 0
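Since the question asks for a view, the same query could be wrapped in one against the real tables. This is only a sketch: the view name v_user_role_map_expanded and the dbo schema are assumptions, and user_role_map and businessunit are taken to be permanent tables shaped as described in the question.

```sql
CREATE VIEW dbo.v_user_role_map_expanded
AS
-- non-admin users: pass their mappings through unchanged
SELECT userid, roleid, buid
FROM dbo.user_role_map
WHERE roleid > 0 AND buid > 0
UNION ALL
-- system admins: one "Full" (roleid = 1) row per non-IT business unit
SELECT m.userid, 1 AS roleid, bu.buid
FROM dbo.user_role_map AS m
CROSS JOIN dbo.businessunit AS bu
WHERE m.roleid = 0 AND m.buid = 0
  AND bu.buid > 0;
```

Selecting from the view with the sample data should give the 12 non-admin rows followed by (1, 1, 1) through (1, 1, 4); any additional system admin in user_role_map would pick up its own four rows through the CROSS JOIN.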
You could append the business units with a UNION ALL to your view.
SELECT
userid,
roleid,
buid
FROM
user_role_map
WHERE
roleid > 0
AND buid > 0
UNION ALL
/* append full control for system admins to all business units */
SELECT
CAST(1 AS INT) AS userid,
CAST(1 as INT) AS roleid,
BU.buid
FROM businessunit BU