Recursive CTE increases time - sql

I just wrote this code. My_View has about 9,000 rows and the CTE result has about 14,000. The first iteration of the CTE took about 0.5 s (with the code written out by hand), but with the recursion it takes about 5 minutes. The problem seems to be in the recursive part, but I don't see why it should be so slow.
The goal of the code is as follows. Given this data:
{ID} [Primary ID] [Secondary ID]
All Primary IDs begin with C and all Secondary IDs begin with K. The problem is that some Secondary IDs are actually a link to a Primary ID, as follows:
{ID} [C010] [K011]
{ID} [C020] [C010]
{ID} [C020] [K020]
What I want to end up with is:
{ID} [C010] [K011]
{ID} [C020] [K011]
{ID} [C020] [K020]
where {ID} = {[Cod_ 1], [First year], [First month]}. This is my query:
WITH CTE AS (
SELECT DISTINCT [Cod_ 1], [First year], [First month], [Primary ID], [Secondary ID]
FROM My_View
WHERE [Secondary ID] LIKE 'K%'
UNION ALL
SELECT m1.[Cod_ 1], m1.[First year], m1.[First month], m1.[Primary ID], m2.[Secondary ID]
FROM My_View m1
INNER JOIN CTE m2 ON m1.[Cod_ 1] = m2.[Cod_ 1]
AND m1.[First year] = m2.[First year]
AND m1.[First month] = m2.[First month]
AND m1.[Secondary ID] = m2.[Primary ID]
)
SELECT DISTINCT *
FROM CTE
ORDER BY [Cod_ 1], [Primary ID], [Secondary ID]

I believe you need to add a WHERE condition to the recursive part, such as
WHERE m1.[Primary ID] NOT LIKE 'K%'
to avoid a recursion-depth error or any similar case that may slow it down.
So this may help you:
WITH CTE
AS (
SELECT DISTINCT [Cod_ 1]
,[First year]
,[First month]
,[Primary ID]
,[Secondary ID]
FROM My_View
WHERE [Secondary ID] LIKE 'K%'
UNION ALL
SELECT m1.[Cod_ 1]
,m1.[First year]
,m1.[First month]
,m1.[Primary ID]
,m2.[Secondary ID]
FROM My_View m1
INNER JOIN CTE m2 ON m1.[Cod_ 1] = m2.[Cod_ 1]
AND m1.[First year] = m2.[First year]
AND m1.[First month] = m2.[First month]
AND m1.[Secondary ID] = m2.[Primary ID]
WHERE m1.[Primary ID] NOT LIKE 'K%'
)
SELECT DISTINCT *
FROM CTE
ORDER BY [Cod_ 1]
,[Primary ID]
,[Secondary ID]
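To see what the recursion is meant to produce, here is a minimal sketch that runs the same idea on the three sample rows from the question. It uses Python's built-in sqlite3 module rather than SQL Server, so the syntax differs slightly (WITH RECURSIVE, no DISTINCT in the anchor), and the table and column names (my_view, cod, primary_id, secondary_id) are simplified stand-ins, not the real schema. It says nothing about performance on the real view.

```python
import sqlite3


def resolve_links():
    """Follow C... links in secondary_id until every row ends in a K... ID."""
    con = sqlite3.connect(":memory:")
    # Hypothetical, simplified version of My_View: one key column instead of three.
    con.execute(
        "CREATE TABLE my_view (cod INTEGER, primary_id TEXT, secondary_id TEXT)"
    )
    con.executemany(
        "INSERT INTO my_view VALUES (?, ?, ?)",
        [(1, "C010", "K011"), (1, "C020", "C010"), (1, "C020", "K020")],
    )
    rows = con.execute(
        """
        WITH RECURSIVE cte (cod, primary_id, secondary_id) AS (
            -- Anchor: rows that already point at a K... ID.
            SELECT cod, primary_id, secondary_id
            FROM my_view
            WHERE secondary_id LIKE 'K%'
            UNION ALL
            -- Recursive step: a row whose secondary_id is another row's
            -- primary_id inherits that row's resolved K... ID.
            SELECT m1.cod, m1.primary_id, m2.secondary_id
            FROM my_view AS m1
            JOIN cte AS m2
              ON m1.cod = m2.cod
             AND m1.secondary_id = m2.primary_id
        )
        SELECT DISTINCT cod, primary_id, secondary_id
        FROM cte
        ORDER BY cod, primary_id, secondary_id
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    for row in resolve_links():
        print(row)
```

On the sample data this returns the three rows the question asks for: C010 with K011, C020 with K011, and C020 with K020.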

SQL WHERE/HAVING Condition

I'm currently working on a forecasting project to estimate cash flow. This is what the SQL query looks like:
SELECT [Date] AS ds, SUM([Sales Amount]) AS y, [Item ID]
FROM dbo.[Table]
GROUP BY [Date], [Item ID]
ORDER BY ds;
To forecast sales I use an R package that strictly requires at least 2 instances of the forecast value (Sales) per item.
However, there are some cases in my query where an item has been transacted just once.
Could you help me with a HAVING or WHERE condition that excludes all the items that were transacted just once?
Thanks!
I would add a count and use that:
SELECT ds, y, [Item ID]
FROM (SELECT [Date] AS ds, SUM([Sales Amount]) AS y, [Item ID],
COUNT(*) OVER (PARTITION BY [Item ID]) as cnt
FROM dbo.[Table]
GROUP BY [Date], [Item ID]
) t
WHERE cnt >= 2
ORDER BY ds;
You can use an extra filtering condition in a WHERE clause:
SELECT
[Date] AS ds
,SUM([Sales Amount]) AS y
,[Item ID]
FROM dbo.[Table]
WHERE [Item ID] in ( -- filters out the items with less than 2 samples
select [Item ID]
from dbo.[Table]
group by [Item ID] having count(*) > 1
)
GROUP BY [Date]
,[Item ID]
ORDER BY ds
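As a rough check of the IN-subquery idea, the sketch below runs it against a tiny made-up table using Python's built-in sqlite3 module. The table and column names (sales, sale_date, item_id, amount) and the rows are hypothetical, and the subquery here groups by item only, so it counts transaction rows per item.

```python
import sqlite3


def items_with_history():
    """Return daily totals only for items that have more than one transaction row."""
    con = sqlite3.connect(":memory:")
    # Hypothetical stand-in for dbo.[Table].
    con.execute("CREATE TABLE sales (sale_date TEXT, item_id TEXT, amount INTEGER)")
    con.executemany(
        "INSERT INTO sales VALUES (?, ?, ?)",
        [
            ("2020-01-01", "A", 10),
            ("2020-01-02", "A", 5),
            ("2020-01-01", "B", 7),  # item B was transacted only once
        ],
    )
    rows = con.execute(
        """
        SELECT sale_date AS ds, SUM(amount) AS y, item_id
        FROM sales
        WHERE item_id IN (
            -- keep only items that appear in more than one row
            SELECT item_id
            FROM sales
            GROUP BY item_id
            HAVING COUNT(*) > 1
        )
        GROUP BY sale_date, item_id
        ORDER BY ds
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(items_with_history())
```

Item B drops out because it has a single row. If the requirement is really "at least two distinct dates", COUNT(DISTINCT sale_date) in the HAVING clause would be the closer match.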

CASE returns more than one value

I have this query
select a.Tag,a.Type, a.[Starting Date], a.[Time From 1], a.[Time To 1],
DATEPART(dw,CBCal.Date) as [TagderWoche], CBCal.Date, CBCal.[Customer No_],
Description, [POS Holiday]
from [ReplicationLayer].[BackPro].[CustomerBPCal] as CBCal
CROSS APPLY
(
select Type,[Starting Date],[Time From 1],[Time To 1] ,
(case
WHEN DATEPART(dw,CBCal.Date)=1 then (select [Time From 1] from MyTable
where [Valid at Monday]=1
and [Customer No_]=CBCal.[Customer No_] )
WHEN DATEPART(dw,CBCal.Date)=3 then (select [Time From 1] from MyTable
where [Valid at Wednesday]=1
and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=4 then (select [Time From 1] from MyTable
where [Valid at Thursday]=1
and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=5 then (select [Time From 1] from MyTable
where [Valid at Friday]=1
and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=6 then (select [Time From 1] from MyTable
where [Valid at Saturday]=1
and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=7 then (select [Time From 1] from MyTable
where [Valid at Sunday]=1
and [Customer No_]=CBCal.[Customer No_])
end) as Tag
from [CustomerShopAndArrivalTime]
) as a
where CBCal.[Customer No_]=1 and CBCal.[POS Holiday]=0 and Date='2015-04-15'
If I run this query I am getting this error:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
What should I do to solve this problem?
You can use MAX, MIN or TOP 1 so that each subquery returns a single value, like this:
select a.Tag,
a.Type,
a.[Starting Date],
a.[Time From 1],
a.[Time To 1],
DATEPART(dw,CBCal.Date) as [TagderWoche],
CBCal.Date,
CBCal.[Customer No_],
Description,[POS Holiday]
from [ReplicationLayer].[BackPro].[CustomerBPCal] as CBCal
CROSS APPLY
(
select Type,[Starting Date],[Time From 1],[Time To 1] ,
(case
WHEN DATEPART(dw,CBCal.Date)=1 then
(select top 1 [Time From 1] from MyTable where [Valid at Monday]=1 and [Customer No_]=CBCal.[Customer No_] )
WHEN DATEPART(dw,CBCal.Date)=3 then
(select top 1 [Time From 1] from MyTable where [Valid at Wednesday]=1 and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=4 then
(select top 1 [Time From 1] from MyTable where [Valid at Thursday]=1 and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=5 then
(select top 1 [Time From 1] from MyTable where [Valid at Friday]=1 and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=6 then
(select top 1 [Time From 1] from MyTable where [Valid at Saturday]=1 and [Customer No_]=CBCal.[Customer No_])
WHEN DATEPART(dw,CBCal.Date)=7 then
(select top 1 [Time From 1] from MyTable where [Valid at Sunday]=1 and [Customer No_]=CBCal.[Customer No_])
end) as Tag
from [CustomerShopAndArrivalTime]
) as a
where CBCal.[Customer No_]=1 and CBCal.[POS Holiday]=0 and Date='2015-04-15'
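The answer mentions MAX and MIN but only shows TOP 1. TOP 1 without an ORDER BY returns whichever matching row the engine happens to find first, so an aggregate may be the safer choice when several rows can match. The sketch below shows the MIN form on made-up data with Python's built-in sqlite3 module. The tables and columns (customers, my_table, customer_no, time_from, valid_monday) are hypothetical simplifications, and SQLite itself does not raise the "more than 1 value" error, so this only illustrates the single-value subquery, not the failure.

```python
import sqlite3


def earliest_monday_time():
    """Pick one deterministic [Time From 1]-style value per customer with MIN()."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE customers (customer_no INTEGER)")
    con.execute(
        "CREATE TABLE my_table (customer_no INTEGER, time_from TEXT, valid_monday INTEGER)"
    )
    con.executemany("INSERT INTO customers VALUES (?)", [(1,), (2,), (3,)])
    con.executemany(
        "INSERT INTO my_table VALUES (?, ?, ?)",
        [
            (1, "08:00", 1),
            (1, "06:30", 1),  # second Monday row for customer 1
            (2, "09:00", 1),
        ],
    )
    rows = con.execute(
        """
        SELECT c.customer_no,
               -- MIN() collapses however many rows match into one value
               (SELECT MIN(t.time_from)
                FROM my_table AS t
                WHERE t.valid_monday = 1
                  AND t.customer_no = c.customer_no) AS tag
        FROM customers AS c
        ORDER BY c.customer_no
        """
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(earliest_monday_time())
```

Customer 1 has two Monday rows and gets the earlier time, while customer 3 has none and gets NULL (None in Python).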

Turning results in comma separated list

I'm using this code to get the following results:
Select [Doc #],
[Production Number]
From vwLiveDocuments
Where [production number] IN
(
SELECT [production number]
FROM vwLiveDocuments
where [tags] LIKE N'%name of tag%'
Group by [production number]
Having Count (*) > 1
)
I get the following results:
'Doc #' 'Production number'
117611 CGI00069441
47864 CGI00069441
47865 CGI00069457
117901 CGI00069457
47866 CGI00069460
117904 CGI00069460
121479 CGI00071490
53934 CGI00071490
You can see duplicate values in Production number. What I would like is to convert this list into the following results:
'Production number' 'Doc #'
CGI00069441 117611,47864
CGI00069457 47865,117901
CGI00069460 47866,117904
CGI00071490 121479,53934
That is, for every duplicated production number I would like a comma-separated list of the doc #s that share it.
Use the FOR XML PATH('') trick to do this:
;WITH cte
AS (SELECT [Doc #],
[Production Number]
FROM vwLiveDocuments
WHERE [production number] IN (SELECT [production number]
FROM vwLiveDocuments
WHERE [tags] LIKE N'%20150126-Appendix B%'
GROUP BY [production number]
HAVING Count (*) > 1))
SELECT [Production number],
Stuff((SELECT ',' + CONVERT(VARCHAR(10), [doc #])
FROM cte b
WHERE b.[Production number] = a.[Production number]
FOR xml path('')), 1, 1, '') [Doc #]
FROM cte a
GROUP BY [Production number]
ORDER BY [Production Number] ASC
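For comparison, here is the same grouping done with a built-in string aggregate instead of the FOR XML PATH trick. This is a swapped-in technique, not the answer's method: the sketch uses SQLite's group_concat through Python's sqlite3 module, and on SQL Server 2017 or later STRING_AGG plays a similar role. The table name live_documents and the extra single-document row are hypothetical. The other rows are the ones from the question.

```python
import sqlite3


def docs_per_production_number():
    """Map each duplicated production number to the list of its doc numbers."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE live_documents (doc_no INTEGER, production_number TEXT)")
    con.executemany(
        "INSERT INTO live_documents VALUES (?, ?)",
        [
            (117611, "CGI00069441"), (47864, "CGI00069441"),
            (47865, "CGI00069457"), (117901, "CGI00069457"),
            (47866, "CGI00069460"), (117904, "CGI00069460"),
            (121479, "CGI00071490"), (53934, "CGI00071490"),
            (50000, "CGI00099999"),  # hypothetical non-duplicate, filtered out below
        ],
    )
    rows = con.execute(
        """
        SELECT production_number, group_concat(doc_no, ',') AS docs
        FROM live_documents
        GROUP BY production_number
        HAVING COUNT(*) > 1
        ORDER BY production_number
        """
    ).fetchall()
    con.close()
    # group_concat does not promise an order inside each list, so sort in Python.
    return {pn: sorted(int(d) for d in docs.split(",")) for pn, docs in rows}


if __name__ == "__main__":
    for production_number, docs in docs_per_production_number().items():
        print(production_number, docs)
```

Each duplicated production number comes back once with both of its doc numbers, and the single-document row is dropped by the HAVING clause.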

SQL query with multiple time based criteria

I have a database for managing blood lead levels, where we have various reporting/exclusion requirements.
For mandatory removal of someone from my area they need a test over 34, and before they are allowed to return they require 3 consecutive tests under 27. I have been using the query below to find people who should be removed; it looks at their last 3 tests for any result over 34. This works to say a person must be excluded based on their last 3 tests, but it does not generate a complete list of people who should currently be excluded.
SELECT x1.[Employee ID]
FROM [Lead Results] AS x1 INNER JOIN Employees ON x1.[Employee ID] = Employees.[Employee ID]
WHERE (((x1.[Test Result])>34) AND (((select count(*)
from [Lead Results] x2
where x2.[Employee ID] = x1.[Employee ID]
and x2.[Date of Test] >= x1.[Date of Test]
))<=3))
GROUP BY x1.[Employee ID]
HAVING ((Count(x1.[Date of Test]))>=1);
I have also made a query to find people whose last 3 tests are all under 27. This works fine to say someone can now be returned:
SELECT x1.[Employee ID]
FROM [Lead Results] AS x1 INNER JOIN Employees ON x1.[Employee ID] = Employees.[Employee ID]
WHERE (((x1.[Test Result])<27) AND (((select count(*)
from [Lead Results] x2
where x2.[Employee ID] = x1.[Employee ID]
and x2.[Date of Test] >= x1.[Date of Test]
))<=3))
GROUP BY x1.[Employee ID], Employees.[Current Employee]
HAVING ((Count(x1.[Date of Test]))>=3);
These two queries work, but only for the last 3 tests.
How can I get all the employee IDs for people who have tested over 34 but have not yet had 3 consecutive tests under 27 since their high result?
Start with the most recent date where you have a test over 34:
select [Employee ID], max([Date of Test]) as max34
from [Lead Results]
where [Test Result] > 34
group by [Employee ID];
Next, calculate the most recent date where there are three tests all below 27. This is a bit harder, but it can be done.
The following actually finds tests where the previous, next, and current are all less than 27. It then aggregates these to get the latest value.
select [Employee ID], max([Date of Test]) as MiddleTest27
from [Lead Results] as lr
where [Test Result] < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] < lr.[Date of Test]
order by [Date of Test] desc, id desc
) < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] > lr.[Date of Test]
order by [Date of Test] asc, id asc
) < 27
group by [Employee ID]
Next, we can combine these to get what you want: employees who have a high test (first query) and no later run of three low tests (second query):
select e34.*
from (select [Employee ID], max([Date of Test]) as max34
from [Lead Results]
where [Test Result] > 34
group by [Employee ID]
) as e34 left join
(select [Employee ID], max([Date of Test]) as MiddleTest27
from [Lead Results] as lr
where [Test Result] < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] < lr.[Date of Test]
order by [Date of Test] desc, id desc
) < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] > lr.[Date of Test]
order by [Date of Test] asc, id asc
) < 27
group by [Employee ID]
) as e27
on e34.[Employee ID] = e27.[Employee ID]
where e27.[Employee ID] is NULL or e27.MiddleTest27 < e34.max34
That is, there is either no more recent "27" sequence or the last one predated the "34" sequence.
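To check the logic on concrete numbers, here is a small sketch of the combined query run on made-up test histories with Python's built-in sqlite3 module. The table and column names (lead_results, employee_id, test_date, result) and all the data are hypothetical, and SQLite's LIMIT 1 stands in for TOP 1. It assumes at most one test per employee per date.

```python
import sqlite3


def currently_excluded():
    """Employees with a test over 34 and no later run of three tests under 27."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE lead_results ("
        "id INTEGER PRIMARY KEY, employee_id INTEGER, test_date TEXT, result INTEGER)"
    )
    histories = {
        1: [40, 20, 20, 20],      # high, then three low tests: may return
        2: [20, 20, 20, 40, 20],  # three low tests came before the high one
        3: [40, 20],              # only one low test since the high one
        4: [20, 20, 20],          # never tested high
    }
    for emp, results in histories.items():
        for month, value in enumerate(results, start=1):
            con.execute(
                "INSERT INTO lead_results (employee_id, test_date, result) VALUES (?, ?, ?)",
                (emp, f"2020-{month:02d}-01", value),
            )
    rows = con.execute(
        """
        SELECT e34.employee_id
        FROM (SELECT employee_id, MAX(test_date) AS max34
              FROM lead_results
              WHERE result > 34
              GROUP BY employee_id) AS e34
        LEFT JOIN
             (SELECT lr.employee_id, MAX(lr.test_date) AS middle27
              FROM lead_results AS lr
              WHERE lr.result < 27
                -- previous test was also under 27
                AND (SELECT lr2.result FROM lead_results AS lr2
                     WHERE lr2.employee_id = lr.employee_id
                       AND lr2.test_date < lr.test_date
                     ORDER BY lr2.test_date DESC, lr2.id DESC LIMIT 1) < 27
                -- next test was also under 27
                AND (SELECT lr2.result FROM lead_results AS lr2
                     WHERE lr2.employee_id = lr.employee_id
                       AND lr2.test_date > lr.test_date
                     ORDER BY lr2.test_date ASC, lr2.id ASC LIMIT 1) < 27
              GROUP BY lr.employee_id) AS e27
          ON e34.employee_id = e27.employee_id
        WHERE e27.employee_id IS NULL OR e27.middle27 < e34.max34
        ORDER BY e34.employee_id
        """
    ).fetchall()
    con.close()
    return [employee_id for (employee_id,) in rows]


if __name__ == "__main__":
    print(currently_excluded())
```

Employees 2 and 3 come back as still excluded. Employee 1 has a complete low run after the high test, and employee 4 never appears in the first subquery.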

Column invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause

We have a table that captures the swipe records of each employee. I am trying to write a query that fetches one record per employee: their first swipe of today.
We save the swipe date in a datetime column. Here is my query, which throws an exception:
select distinct
[employee number], [Employee First Name]
,[Employee Last Name]
,min([DateTime])
,[Card Number]
,[Reader Name]
,[Status]
,[Location]
from
[Interface].[dbo].[VwEmpSwipeDetail]
group by
[employee number]
where
[datetime] = CURDATE();
Getting error:
Column 'Interface.dbo.VwEmpSwipeDetail.Employee First Name' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Any help please?
Thanks in advance.
The error says it all:
...Employee First Name' is invalid in the select list because it is not contained
in either an aggregate function or the GROUP BY clause
That said, there are other columns that need attention too.
Either reduce the columns returned to only those needed or include the columns in your GROUP BY clause or add aggregate functions (MIN/MAX). Also, your WHERE clause should be placed before the GROUP BY.
Try:
select distinct [employee number]
,[Employee First Name]
,[Employee Last Name]
,min([DateTime])
,[Card Number]
,min([Reader Name])
from [Interface].[dbo].[VwEmpSwipeDetail]
where CAST([datetime] AS DATE)=CAST(GETDATE() AS DATE)
group by [employee number], [Employee First Name], [Employee Last Name], [Card Number]
I've removed status and location as this is likely to return non-distinct values. In order to return this data, you may need a subquery (or CTE) that first gets the unique IDs of the SwipeDetails table, and from this list you can join on to the other data, something like:
SELECT [employee number],[Employee First Name],[Employee Last Name].. -- other columns
FROM [YOUR_TABLE]
WHERE SwipeDetailID IN (SELECT MIN(SwipeDetailsId) as SwipeId
FROM SwipeDetailTable
WHERE CAST([datetime] AS DATE)=CAST(GETDATE() AS DATE)
GROUP BY [employee number])
Please try the query below:
select distinct [employee number]
,[Employee First Name]
,[Employee Last Name]
,min([DateTime])
,[Card Number]
,[Reader Name]
,[Status]
,[Location]
from [Interface].[dbo].[VwEmpSwipeDetail]
where CAST([datetime] AS DATE) = CAST(GETDATE() AS DATE)
group by [employee number]
,[Employee First Name]
,[Employee Last Name]
,[Card Number]
,[Reader Name]
,[Status]
,[Location];
First find the first timestamp for each employee on the given day (today), then join back to the main table to get all the details:
WITH x AS (
SELECT [employee number], MIN([datetime]) AS minDate
FROM [Interface].[dbo].[VwEmpSwipeDetail]
WHERE CAST([datetime] AS DATE) = CAST(GETDATE() AS DATE)
GROUP BY [employee number]
)
select y.[employee number]
,[Employee First Name]
,[Employee Last Name]
,[DateTime]
,[Card Number]
,[Reader Name]
,[Status]
,[Location]
from [Interface].[dbo].[VwEmpSwipeDetail] y
JOIN x ON (x.[employee number] = y.[employee number] AND x.[minDate] = y.[datetime])
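Here is a minimal sketch of that "find the minimum, then join back" pattern on made-up swipe rows, using Python's built-in sqlite3 module. The table and column names (swipes, employee_number, swipe_time, reader) and the data are hypothetical, the day is passed in as a parameter instead of using today's date so the result is repeatable, and SQLite's date() stands in for the CAST to DATE.

```python
import sqlite3


def first_swipes(day):
    """Return each employee's first swipe row on the given day (YYYY-MM-DD)."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE swipes (employee_number INTEGER, swipe_time TEXT, reader TEXT)"
    )
    con.executemany(
        "INSERT INTO swipes VALUES (?, ?, ?)",
        [
            (1, "2015-04-15 08:05:00", "Gate A"),
            (1, "2015-04-15 12:30:00", "Gate B"),
            (2, "2015-04-15 09:10:00", "Gate B"),
            (2, "2015-04-14 07:00:00", "Gate A"),  # previous day, ignored
        ],
    )
    rows = con.execute(
        """
        WITH x AS (
            -- earliest swipe per employee on the requested day
            SELECT employee_number, MIN(swipe_time) AS min_time
            FROM swipes
            WHERE date(swipe_time) = ?
            GROUP BY employee_number
        )
        -- join back to pick up the other columns of that exact row
        SELECT y.employee_number, y.swipe_time, y.reader
        FROM swipes AS y
        JOIN x
          ON x.employee_number = y.employee_number
         AND x.min_time = y.swipe_time
        ORDER BY y.employee_number
        """,
        (day,),
    ).fetchall()
    con.close()
    return rows


if __name__ == "__main__":
    print(first_swipes("2015-04-15"))
```

Because the non-grouped columns come from the joined row rather than from the GROUP BY query, nothing has to be wrapped in an aggregate. If two swipes for the same employee share the exact same timestamp, both rows would come back.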
This should not be tagged mysql, as this would not happen in MySQL.
SQL Server does not know which of the grouped [Employee First Name] values to return, so you need to add an aggregate (even if you only actually expect one result). MIN or MAX will both work in that case. The same applies to all the other columns that are neither in the GROUP BY nor wrapped in an aggregate function (e.g. MIN).