SQL query with multiple time based criteria - sql

I have a database for managing blood lead levels, with various reporting and exclusion requirements.
For mandatory removal from my area, a person needs a test over 34, and before they are allowed to return they need 3 consecutive tests under 27. I have been using the query below to find people who should be removed. It looks at their last 3 tests for any result over 34. This works to say a person must be excluded based on their last 3 tests, but it does not generate a complete list of people who should currently be excluded.
SELECT x1.[Employee ID]
FROM [Lead Results] AS x1 INNER JOIN Employees ON x1.[Employee ID] = Employees.[Employee ID]
WHERE (((x1.[Test Result])>34) AND (((select count(*)
from [Lead Results] x2
where x2.[Employee ID] = x1.[Employee ID]
and x2.[Date of Test] >= x1.[Date of Test]
))<=3))
GROUP BY x1.[Employee ID]
HAVING ((Count(x1.[Date of Test]))>=1);
I have also made a query to find people whose last 3 tests are all under 27. This works fine to say someone can now return:
SELECT x1.[Employee ID]
FROM [Lead Results] AS x1 INNER JOIN Employees ON x1.[Employee ID] = Employees.[Employee ID]
WHERE (((x1.[Test Result])<27) AND (((select count(*)
from [Lead Results] x2
where x2.[Employee ID] = x1.[Employee ID]
and x2.[Date of Test] >= x1.[Date of Test]
))<=3))
GROUP BY x1.[Employee ID], Employees.[Current Employee]
HAVING ((Count(x1.[Date of Test]))>=3);
Obviously these two queries work, but only for the last 3 tests.
How can I get all the employee IDs for people who have tested over 34 but have not yet had 3 consecutive tests under 27 since their high result?

Start with the most recent date where you have a test over 34:
select [Employee ID], max([Date of Test]) as max34
from [Lead Results]
where [Test Result] > 34
group by [Employee ID];
Next, calculate the most recent date where there are three tests all below 27. This is a bit harder, but it can be done.
The following actually finds tests where the previous, next, and current are all less than 27. It then aggregates these to get the latest value.
select [Employee ID], max([Date of Test]) as MiddleTest27
from [Lead Results] as lr
where [Test Result] < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] < lr.[Date of Test]
order by [Date of Test] desc, id desc
) < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] > lr.[Date of Test]
order by [Date of Test] asc, id asc
) < 27
group by [Employee ID]
Next, we can combine these to get what you want: employees who appear in the first query and have no later match in the second:
select e34.*
from (select [Employee ID], max([Date of Test]) as max34
from [Lead Results]
where [Test Result] > 34
group by [Employee ID]
) as e34 left join
(select [Employee ID], max([Date of Test]) as MiddleTest27
from [Lead Results] as lr
where [Test Result] < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] < lr.[Date of Test]
order by [Date of Test] desc, id desc
) < 27 and
(select top 1 [Test Result]
from [Lead Results] as lr2
where lr2.[Employee ID] = lr.[Employee ID] and lr2.[Date of Test] > lr.[Date of Test]
order by [Date of Test] asc, id asc
) < 27
group by [Employee ID]
) as e27
on e34.[Employee ID] = e27.[Employee ID]
where e27.[Employee ID] is NULL or e27.MiddleTest27 < e34.max34
That is, there is either no "27" sequence at all, or the last one predates the most recent test over 34.
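To check the logic above on a few rows, here is a minimal sketch that runs the combined query through Python's sqlite3 module. This is an assumption for testing only: the original targets Access/SQL Server, so TOP 1 is swapped for LIMIT 1, the id tie-breaker is dropped (each employee has distinct test dates here), and the sample data is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE [Lead Results] ("
    "id INTEGER PRIMARY KEY, [Employee ID] INTEGER, "
    "[Date of Test] TEXT, [Test Result] REAL)"
)
rows = [
    # Employee 1: high result, then three consecutive low results (may return)
    (1, "2020-01-01", 40), (1, "2020-02-01", 20),
    (1, "2020-03-01", 25), (1, "2020-04-01", 26),
    # Employee 2: high result, then only two low results (still excluded)
    (2, "2020-01-01", 36), (2, "2020-02-01", 20), (2, "2020-03-01", 21),
    # Employee 3: three low results, then a new high result (excluded)
    (3, "2020-01-01", 10), (3, "2020-02-01", 11),
    (3, "2020-03-01", 12), (3, "2020-04-01", 35),
    # Employee 4: never over 34 (not in the list at all)
    (4, "2020-01-01", 30), (4, "2020-02-01", 33),
]
conn.executemany(
    "INSERT INTO [Lead Results] ([Employee ID], [Date of Test], [Test Result]) "
    "VALUES (?, ?, ?)",
    rows,
)

QUERY = """
SELECT e34.[Employee ID]
FROM (SELECT [Employee ID], MAX([Date of Test]) AS max34
      FROM [Lead Results]
      WHERE [Test Result] > 34
      GROUP BY [Employee ID]) AS e34
LEFT JOIN
     (SELECT lr.[Employee ID], MAX(lr.[Date of Test]) AS MiddleTest27
      FROM [Lead Results] AS lr
      WHERE lr.[Test Result] < 27
        -- previous test for the same employee
        AND (SELECT lr2.[Test Result] FROM [Lead Results] AS lr2
             WHERE lr2.[Employee ID] = lr.[Employee ID]
               AND lr2.[Date of Test] < lr.[Date of Test]
             ORDER BY lr2.[Date of Test] DESC LIMIT 1) < 27
        -- next test for the same employee
        AND (SELECT lr2.[Test Result] FROM [Lead Results] AS lr2
             WHERE lr2.[Employee ID] = lr.[Employee ID]
               AND lr2.[Date of Test] > lr.[Date of Test]
             ORDER BY lr2.[Date of Test] ASC LIMIT 1) < 27
      GROUP BY lr.[Employee ID]) AS e27
  ON e34.[Employee ID] = e27.[Employee ID]
WHERE e27.[Employee ID] IS NULL OR e27.MiddleTest27 < e34.max34
ORDER BY e34.[Employee ID]
"""

excluded = [r[0] for r in conn.execute(QUERY)]
print(excluded)  # employees 2 and 3
```

Employee 1 drops out because the middle test of the low run (2020-03-01) is later than the high result, which is the condition the WHERE clause checks.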

Related

Why does my record count increase as my denominator for a division performed in a nested query increases?

In my query I'm trying to calculate the total overtime hours worked by employees. I divide the basic salary by 173 (hours in a month) to get the hourly rate, then divide the employee's total overtime amount by that hourly rate. To my surprise, the record count increases as the number of hours in a month increases. Should it not decrease?
Here's my script:
select 'Employees that worked more overtime' [CAAT],*
from ( select H.[Month]
,H.[Employee Code]
,H.[Department]
,H.[Job title]
,H.[Surname]
,H.[Full Names]
,H.[Basic Salary]
,R.[Overtime]
,H.[Hourly Rate]
,round(R.[Overtime] / H.[Hourly Rate],2) [Overtime Hours]
from (select [Month]
,[Employee Code]
,Department
,[Job title]
,[Surname]
,[Full Names]
,nullif(convert(money,[Amount]),0.00) [Basic Salary]
,nullif(round(convert(money,[Amount]) / 173,2),0.00) [Hourly Rate]
from [Salary DB]
where [Field Desc] = 'ED01-Basic Salary') H
left join
(select [Month]
,[Employee Code]
,nullif(sum(convert(money,[Amount])),0.00) [Overtime]
from [Salary DB]
where [Field Desc] in ('ED02-O/Time 1.5','ED02-O/Time 2.0','ED42-Sunday Pay')
group by [Month]
,[Employee Code]) R
on H.[Employee Code] = R.[Employee Code]
and H.[Month] = R.[Month]) [Data]
where [Overtime Hours] > '40'
Order by [Employee Code], [Month] Desc
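The arithmetic in the script can be worked through by hand. A larger divisor gives a smaller hourly rate, and a smaller hourly rate turns the same overtime amount into more overtime hours, so more rows pass the > 40 filter. Here is a minimal sketch with made-up figures (the salary and overtime values and the function name are assumptions, and Python floats stand in for SQL's money type):

```python
def overtime_hours(basic_salary, overtime_amount, hours_in_month):
    # Mirrors the script: the hourly rate is rounded first, then the hours.
    hourly_rate = round(basic_salary / hours_in_month, 2)
    return round(overtime_amount / hourly_rate, 2)

# Same employee, same pay, two different divisors.
hours_at_160 = overtime_hours(20000, 5000, 160)  # rate 125.00
hours_at_173 = overtime_hours(20000, 5000, 173)  # rate 115.61

print(hours_at_160, hours_at_160 > 40)
print(hours_at_173, hours_at_173 > 40)
```

With 160 the employee lands at exactly 40 hours and is filtered out. With 173 the same overtime amount comes to more than 40 hours and the row is returned.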

Recursive CTE increases time

I just wrote this code. My_View has about 9,000 rows and the CTE result has about 14,000. The first iteration of the CTE took about 0.5 s when I wrote it out by hand, but with the recursion it takes about 5 minutes. The problem seems to be in the recursive part, but I don't see why it should be.
The objective of the code is as follows. Given data of the form:
{ID} [Primary ID] [Secondary ID]
where all the Primary IDs begin with C and all the Secondary IDs begin with K. The problem is that some Secondary IDs are actually links to a Primary ID, as follows:
{ID} [C010] [K011]
{ID} [C020] [C010]
{ID} [C020] [K020]
So what I want is for it to end up like:
{ID} [C010] [K011]
{ID} [C020] [K011]
{ID} [C020] [K020]
{ID} = {[Cod_ 1], [First year], [First month]}
WITH CTE AS ( SELECT DISTINCT [Cod_ 1], [First year], [First month], [Primary ID], [Secondary ID] FROM My_View WHERE [Secondary ID] LIKE 'K%'
UNION ALL
SELECT m1.[Cod_ 1], m1.[First year], m1.[First month], m1.[Primary ID], m2.[Secondary ID] FROM My_View m1 INNER JOIN CTE m2 ON m1.[Cod_ 1] = m2.[Cod_ 1] AND m1.[First year] = m2.[First year] AND m1.[First month] = m2.[First month] AND m1.[Secondary ID] = m2.[Primary ID]
)
SELECT DISTINCT *
FROM CTE
ORDER BY [Cod_ 1], [Primary ID], [Secondary ID]
I believe you need to add a WHERE condition such as
WHERE m1.[Primary ID] NOT LIKE 'K%'
to avoid a recursion-depth error or similar cases that may slow it down.
So this may help you:
WITH CTE
AS (
SELECT DISTINCT [Cod_ 1]
,[First year]
,[First month]
,[Primary ID]
,[Secondary ID]
FROM My_View
WHERE [Secondary ID] LIKE 'K%'
UNION ALL
SELECT m1.[Cod_ 1]
,m1.[First year]
,m1.[First month]
,m1.[Primary ID]
,m2.[Secondary ID]
FROM My_View m1
INNER JOIN CTE m2 ON m1.[Cod_ 1] = m2.[Cod_ 1]
AND m1.[First year] = m2.[First year]
AND m1.[First month] = m2.[First month]
AND m1.[Secondary ID] = m2.[Primary ID]
WHERE m1.[Primary ID] NOT LIKE 'K%'
)
SELECT DISTINCT *
FROM CTE
ORDER BY [Cod_ 1]
,[Primary ID]
,[Secondary ID]
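To confirm that the recursive part resolves the links as intended, here is a minimal sketch on the three sample rows from the question, run through Python's sqlite3 module. These are assumptions for testing only: SQLite needs the RECURSIVE keyword, which SQL Server does not use, and the three {ID} columns are collapsed into a single hypothetical grp column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE My_View (grp TEXT, [Primary ID] TEXT, [Secondary ID] TEXT)"
)
conn.executemany(
    "INSERT INTO My_View VALUES (?, ?, ?)",
    [
        ("A", "C010", "K011"),
        ("A", "C020", "C010"),  # a link to another Primary ID
        ("A", "C020", "K020"),
    ],
)

QUERY = """
WITH RECURSIVE CTE AS (
    -- anchor: rows that already point at a real Secondary ID
    SELECT grp, [Primary ID], [Secondary ID]
    FROM My_View
    WHERE [Secondary ID] LIKE 'K%'
    UNION ALL
    -- recursive step: follow links whose Secondary ID is a Primary ID
    SELECT m1.grp, m1.[Primary ID], m2.[Secondary ID]
    FROM My_View m1
    INNER JOIN CTE m2
        ON m1.grp = m2.grp
       AND m1.[Secondary ID] = m2.[Primary ID]
)
SELECT DISTINCT [Primary ID], [Secondary ID]
FROM CTE
ORDER BY [Primary ID], [Secondary ID]
"""

resolved = conn.execute(QUERY).fetchall()
for row in resolved:
    print(row)
```

The C020 -> C010 link is replaced by C020 -> K011, which matches the desired output in the question.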

Get results with MAX([Action Date]) per Cases

I would like to get your help in the following case:
I have three columns [Case], [Action Date], [Person ID] and I would like to get the maximum [Action Date] for each Case with the related [Person ID]
As you can see the results for the Case are identical and we have the [Action Date] and [Person ID] which are not, I would like to get back the MAX([Action Date]) (17-04-2020) and the related [Person ID].
select
[Case],
MAX([Action Date]) as 'Last Parking Date',
[Person ID]
from SOURCE_TABLE
group by
[Case],
MAX([Action Date]) as 'Last Parking Date',
[Person ID]
I tried to write it with subselects, but the code became totally confusing.
Thank you so much for your help!
Use ROW_NUMBER() window function:
SELECT [Case], [Person ID], [Action Date] AS [Last Parking Date]
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY [Case] ORDER BY [Action Date] DESC) rn
FROM SOURCE_TABLE
) t
WHERE rn = 1
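Here is a minimal sketch of that query on made-up rows, run through Python's sqlite3 module. It assumes SQLite 3.25 or later for window functions, and the sample cases, dates and person IDs are invented. Dates are stored as ISO strings so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SOURCE_TABLE ([Case] TEXT, [Action Date] TEXT, [Person ID] INTEGER)"
)
conn.executemany(
    "INSERT INTO SOURCE_TABLE VALUES (?, ?, ?)",
    [
        ("A", "2020-04-15", 1),
        ("A", "2020-04-17", 2),  # latest action for case A
        ("B", "2020-03-01", 3),
    ],
)

QUERY = """
SELECT [Case], [Person ID], [Action Date] AS [Last Parking Date]
FROM (
    SELECT *, ROW_NUMBER() OVER (
                  PARTITION BY [Case] ORDER BY [Action Date] DESC) AS rn
    FROM SOURCE_TABLE
) t
WHERE rn = 1
ORDER BY [Case]
"""

latest = conn.execute(QUERY).fetchall()
for row in latest:
    print(row)
```

Each case keeps only the row numbered 1, so the [Person ID] comes from the same row as the maximum [Action Date].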

SQL WHERE/HAVING Condition

Currently I'm working on a forecasting project to estimate cash flow. This is what the SQL query looks like:
SELECT [Date] AS ds, SUM([Sales Amount]) AS y, [Item ID]
FROM dbo.[Table]
GROUP BY [Date], [Item ID]
ORDER BY ds;
To forecast sales I use an R package that strictly requires at least 2 instances where the forecast value (Sales) appears.
However, there are some instances in my query where an item has been transacted just once.
Could you help me with a HAVING or WHERE condition that excludes all the items that were transacted just once?
Thanks!
I would add a count and use that:
SELECT ds, y, [Item ID]
FROM (SELECT [Date] AS ds, SUM([Sales Amount]) AS y, [Item ID],
COUNT(*) OVER (PARTITION BY [Item ID]) as cnt
FROM dbo.[Table]
GROUP BY [Date], [Item ID]
) t
WHERE cnt >= 2
ORDER BY ds;
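Here is a minimal sketch of the windowed count on made-up rows, run through Python's sqlite3 module. The table name Sales, the column types and the data are assumptions, and SQLite 3.25 or later is needed for window functions. Because COUNT(*) OVER runs after the GROUP BY, cnt is the number of aggregated (date, item) rows per item, not the number of raw transactions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales ([Date] TEXT, [Item ID] INTEGER, [Sales Amount] INTEGER)"
)
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?, ?)",
    [
        ("2021-01-01", 1, 10), ("2021-01-01", 1, 5), ("2021-01-02", 1, 7),
        ("2021-01-01", 2, 3),                          # single transaction
        ("2021-01-03", 3, 4), ("2021-01-03", 3, 6),    # two sales, one date
    ],
)

QUERY = """
SELECT ds, y, [Item ID]
FROM (SELECT [Date] AS ds, SUM([Sales Amount]) AS y, [Item ID],
             COUNT(*) OVER (PARTITION BY [Item ID]) AS cnt
      FROM Sales
      GROUP BY [Date], [Item ID]
     ) t
WHERE cnt >= 2
ORDER BY ds
"""

kept = conn.execute(QUERY).fetchall()
for row in kept:
    print(row)
```

Item 3 is dropped even though it has two transactions, because both fall on the same date and collapse into one forecast row.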
You can use an extra filtering condition in a WHERE clause:
SELECT
[Date] AS ds
,SUM([Sales Amount]) AS y
,[Item ID]
FROM dbo.[Table]
WHERE [Item ID] in ( -- keeps only the items with at least 2 distinct dates (samples)
select [Item ID]
from dbo.[Table]
group by [Item ID] having count(distinct [Date]) > 1
)
GROUP BY [Date]
,[Item ID]
ORDER BY ds

Script help - count , total and avg

I have the following script. I am trying to count how many distinct customers there are, how many distinct orders, the total of all orders under £15 and its average, and the total of orders above £20 and its average.
with consignments as
(select
[Sell-to Customer No_],
[Convert-to Document No_],
ic.[Shipping Agent Service Code],
[Pick Completed DateTime] as [Shipped DateTime],
ROUND((ic.[Amount Including VAT] + ic.Postage + ic.[Gift Wrap Price] +
ic.[Handling Fee] + ic.[Personalisation Fee]),2) as [Document Amount]
from dbo.[Temp$Consignment] ic inner join [dbo].[Temp$Order] oh
on ic.[Owner Header GuID]=oh.[Order Guid]
where ic.[Shipping Agent Service Code]='secstan' and
ic.[Pick Completed DateTime] >= '2016-11-01T00:00:00.000' AND
ic.[Pick Completed DateTime] <= '2016-11-30T23:59:55.000' ),
summary as
(select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from consignments )
select * from summary
I have a working script like the one below, but as I am new to SQL I am a bit confused about how to convert the script above into the one below.
select amountclass,[Shipping Agent Service Code],
count(distinct [Sell-to Customer No_]) as Total_customers,
count(*) as Total_orders,
sum([Amount]) as total_revenue,
avg([Amount] * 1.0) as AOV
from
(
select [Sell-to Customer No_], oh.[Original Order No_], [Amount],
ic.[Shipping Agent Service Code],
case when [Amount] <= 20 then 'Under_20'
else 'Over_20'
end as amountclass
from [TBW_BI].[dbo].[Temp$Order] oh INNER JOIN [TBW_BI].[dbo].[Temp$Consignment] IC
ON IC.[Owner Header GuID]=OH.[Order Guid]
where [order date] >= '2016-09-01' AND
[order date] <= '2016-09-30' AND [COUNTRY]='UNITED KINGDOM' and
[document type] like 'ord%' and
ic.[Shipping Agent Service Code]='secstan'
) dt
group by amountclass,[Shipping Agent Service Code]
order by amountclass,[Shipping Agent Service Code]
It looks like you have a query with two Common Table Expressions (CTEs) and you want to convert the CTEs to derived tables. First, we can eliminate one of the CTEs.
summary as
(select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from consignments )
select * from summary
The CTE for summary is unnecessary. We can convert that code to this:
select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from consignments
Now, instead of using the consignments CTE, we can copy the SQL code within it to create a derived table.
select *,CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END as 'Over15'
from (
select [Sell-to Customer No_],
[Convert-to Document No_],
ic.[Shipping Agent Service Code],
[Pick Completed DateTime] as [Shipped DateTime],
ROUND((ic.[Amount Including VAT] + ic.Postage + ic.[Gift Wrap Price] +
ic.[Handling Fee] + ic.[Personalisation Fee]),2) as [Document Amount]
from dbo.[Temp$Consignment] ic
inner join [dbo].[Temp$Order] oh on ic.[Owner Header GuID]=oh.[Order Guid]
where ic.[Shipping Agent Service Code]='secstan'
and ic.[Pick Completed DateTime] >= '2016-11-01T00:00:00.000'
and ic.[Pick Completed DateTime] <= '2016-11-30T23:59:55.000' ) as t
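To see that the CTE form and the derived-table form return the same rows, here is a minimal sketch on a cut-down, made-up table, run through Python's sqlite3 module. The table Consignment and its columns doc_no, amount and postage are hypothetical stand-ins for the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Consignment (doc_no TEXT, amount REAL, postage REAL)")
conn.executemany(
    "INSERT INTO Consignment VALUES (?, ?, ?)",
    [("D1", 10.0, 2.5), ("D2", 14.0, 3.0)],
)

# Version 1: the inner query is named up front as a CTE.
CTE_SQL = """
WITH consignments AS (
    SELECT doc_no, ROUND(amount + postage, 2) AS [Document Amount]
    FROM Consignment
)
SELECT *, CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END AS Over15
FROM consignments
ORDER BY doc_no
"""

# Version 2: the same inner query pasted into the FROM clause as a derived table.
DERIVED_SQL = """
SELECT *, CASE WHEN [Document Amount] > 15 THEN 1 ELSE 0 END AS Over15
FROM (
    SELECT doc_no, ROUND(amount + postage, 2) AS [Document Amount]
    FROM Consignment
) AS t
ORDER BY doc_no
"""

cte_rows = conn.execute(CTE_SQL).fetchall()
derived_rows = conn.execute(DERIVED_SQL).fetchall()
print(cte_rows == derived_rows)
for row in derived_rows:
    print(row)
```

The only structural change is where the inner SELECT lives. The derived table needs an alias (t here), just as it does in the answer above.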