First date when certain condition was met - sql

I'm trying to find the first date on which a condition was met. My current logic is below:
use
[AdventureWorksDW2012]
go
;WITH sales AS (
select d.OrderDateKey,
SalesAmount = SUM(d.SalesAmount)
from [dbo].[FactInternetSales] d
group by d.OrderDateKey
having SUM(d.SalesAmount)>10000
)
select FirstOrderDateKey = MIN(OrderDateKey)
from sales
The only problem is that my data is too complex and too large to calculate the value for each date and then choose the minimum date on which the condition is met. Is there a quick way of finding the first date when the Internet sales amount exceeded 10000? Is some kind of loop required?

You can also do it in a single statement. Performance should improve if you have a suitable index on OrderDateKey.
select top 1 d.OrderDateKey as FirstOrderDateKey
from [dbo].[FactInternetSales] d
group by d.OrderDateKey
having SUM(d.SalesAmount)>10000
order by d.OrderDateKey

SELECT TOP 1
d.OrderDateKey
,S.RunningTotal
FROM [dbo].[FactInternetSales] d
CROSS APPLY ( SELECT SUM(SalesAmount) AS RunningTotal
FROM [dbo].[FactInternetSales]
WHERE OrderDateKey <= d.OrderDateKey
) S
WHERE S.RunningTotal < -- Condition
ORDER BY d.OrderDateKey DESC
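The two answers read the question differently: one looks for the first date whose own daily total exceeds the threshold, the other for the first date at which the running total exceeds it. Below is a minimal sketch of both readings, run against SQLite from Python so it can be tried without AdventureWorksDW2012. The sample rows and the two function names are made up for illustration, `LIMIT 1` stands in for `TOP 1`, and the windowed `SUM` assumes SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical miniature of FactInternetSales: (OrderDateKey, SalesAmount).
rows = [
    (20050701, 4000.0), (20050701, 3000.0),  # daily total  7000, running  7000
    (20050702, 4000.0),                      # daily total  4000, running 11000
    (20050703, 2000.0),                      # daily total  2000, running 13000
    (20050704, 12000.0),                     # daily total 12000, running 25000
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE FactInternetSales (OrderDateKey INTEGER, SalesAmount REAL)")
conn.executemany("INSERT INTO FactInternetSales VALUES (?, ?)", rows)


def first_date_daily_total_exceeds(threshold):
    """Earliest OrderDateKey whose own daily total is above the threshold."""
    sql = """
        SELECT OrderDateKey
        FROM FactInternetSales
        GROUP BY OrderDateKey
        HAVING SUM(SalesAmount) > ?
        ORDER BY OrderDateKey
        LIMIT 1            -- SQLite spelling of SQL Server's TOP 1
    """
    row = conn.execute(sql, (threshold,)).fetchone()
    return row[0] if row else None


def first_date_running_total_exceeds(threshold):
    """Earliest OrderDateKey at which the cumulative total is above the threshold."""
    sql = """
        WITH daily AS (
            SELECT OrderDateKey, SUM(SalesAmount) AS DayTotal
            FROM FactInternetSales
            GROUP BY OrderDateKey
        ),
        running AS (
            SELECT OrderDateKey,
                   SUM(DayTotal) OVER (ORDER BY OrderDateKey) AS RunningTotal
            FROM daily
        )
        SELECT MIN(OrderDateKey) FROM running WHERE RunningTotal > ?
    """
    return conn.execute(sql, (threshold,)).fetchone()[0]


print(first_date_daily_total_exceeds(10000))    # 20050704
print(first_date_running_total_exceeds(10000))  # 20050702
```

Either way the engine still has to aggregate per date; no loop is needed, and an index on OrderDateKey is what is likely to help most.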

Calculate time span between two specific statuses on the database for each ID

I have a database table that records the status updates for each of my vehicles. I want to calculate how many days each vehicle spends between two specific statuses, 'Maintenance' and 'Ready'.
My table looks something like this
and I want the result to look like this, showing only the number of days a vehicle spends in maintenance before becoming ready on a specific day
The code I have written looks like this
drop table if exists #temps1
select
VehicleId,
json_value(VehiclesHistoryStatusID.text,'$.en') as VehiclesHistoryStatus,
VehiclesHistory.CreationTime,
datediff(day, VehiclesHistory.CreationTime ,
lead(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ) ) as days,
lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) as PrevStatus,
case
when (lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) <> json_value(VehiclesHistoryStatusID.text,'$.en')) THEN datediff(day, VehiclesHistory.CreationTime , (lag(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ))) else 0 end as testing
into #temps1
from fleet.VehicleHistory VehiclesHistory
left join Fleet.Lookups as VehiclesHistoryStatusID on VehiclesHistoryStatusID.Id = VehiclesHistory.StatusId
where (year(VehiclesHistory.CreationTime) > 2021 and (VehiclesHistory.StatusId = 140 Or VehiclesHistory.StatusId = 144) )
group by VehiclesHistory.VehicleId ,VehiclesHistory.CreationTime , VehiclesHistoryStatusID.text
order by VehicleId desc
drop table if exists #temps2
select * into #temps2 from #temps1 where testing <> 0
select * from #temps2
Try this
SELECT innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
,SUM(DATEDIFF(DAY,innerQ.PrevMaintenance,innerQ.CreationDate)) AS DayDuration
FROM
(
SELECT t1.VehichleID,t1.CreationDate,t1.Status,
(SELECT top(1) t2.CreationDate FROM dbo.Test t2
WHERE t1.VehichleID=t2.VehichleID
AND t2.CreationDate<t1.CreationDate
AND t2.Status='Maintenance'
ORDER BY t2.CreationDate Desc) AS PrevMaintenance
FROM
dbo.Test t1 WHERE t1.Status='Ready'
) innerQ
WHERE innerQ.PrevMaintenance IS NOT NULL
GROUP BY innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
In this query, the innermost subquery first finds the most recent 'Maintenance' date before each 'Ready' date (if one exists). The outer query then calculates the time span with DATEDIFF and sums these spans for each vehicle and 'Ready' date.
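To make the idea concrete, here is a small sketch of the same lookup (latest 'Maintenance' row before each 'Ready' row) run against SQLite from Python. The table name `VehicleStatus`, its columns, and the sample rows are assumptions for illustration only; `julianday` differences stand in for SQL Server's `DATEDIFF(DAY, ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data; dates are ISO strings so they compare correctly as text.
conn.execute("CREATE TABLE VehicleStatus (VehicleId INTEGER, CreationDate TEXT, Status TEXT)")
conn.executemany(
    "INSERT INTO VehicleStatus VALUES (?, ?, ?)",
    [
        (1, "2022-01-01", "Maintenance"),
        (1, "2022-01-05", "Ready"),
        (1, "2022-02-10", "Maintenance"),
        (1, "2022-02-12", "Ready"),
        (2, "2022-03-01", "Ready"),        # no earlier maintenance, so it is excluded
        (2, "2022-03-03", "Maintenance"),
        (2, "2022-03-10", "Ready"),
    ],
)

sql = """
    SELECT VehicleId,
           ReadyDate,
           CAST(julianday(ReadyDate) - julianday(PrevMaintenance) AS INTEGER) AS DayDuration
    FROM (
        SELECT r.VehicleId,
               r.CreationDate AS ReadyDate,
               -- most recent 'Maintenance' row before this 'Ready' row
               (SELECT MAX(m.CreationDate)
                FROM VehicleStatus m
                WHERE m.VehicleId = r.VehicleId
                  AND m.Status = 'Maintenance'
                  AND m.CreationDate < r.CreationDate) AS PrevMaintenance
        FROM VehicleStatus r
        WHERE r.Status = 'Ready'
    ) q
    WHERE PrevMaintenance IS NOT NULL
    ORDER BY VehicleId, ReadyDate
"""

result = conn.execute(sql).fetchall()
for row in result:
    print(row)
# (1, '2022-01-05', 4)
# (1, '2022-02-12', 2)
# (2, '2022-03-10', 7)
```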

Teradata spool space issue on running a sub query with Count

I am using the query below to calculate the business days between two dates for all order numbers. The business days are already available in the Teradata table Common_WorkingCalendar. However, I'm facing a spool space issue when I execute the query, even though I have ample space available in my data lab. I need to optimize the query and would appreciate any input.
SELECT
tx."OrderNumber",
(SELECT COUNT(1) FROM Common_WorkingCalendar
WHERE CalDate between Cast(tx."TimeStamp" as date) and Cast(mf.ShipDate as date)) as BusDays
from StoreFulfillment ff
inner join StoreTransmission tx
on tx.OrderNumber = ff.OrderNumber
inner join StoreMerchandiseFulfillment mf
on mf.OrderNumber = ff.OrderNumber
This is a very inefficient way to get the count, and it results in a product join.
The recommended approach is to add a sequential number to your calendar that increases only on business days (calculated using SUM(CASE WHEN businessDay THEN 1 ELSE 0 END) OVER (ORDER BY CalDate ROWS UNBOUNDED PRECEDING)). The lookup then needs just two joins, one for the start date and one for the end date.
If this calculation is needed often, you had better add a new column; otherwise you can do it on the fly:
WITH cte AS
(
SELECT CalDate,
-- as this table only contains business days you can use this instead
Row_Number() Over (ORDER BY CalDate) AS DayNo
FROM Common_WorkingCalendar
)
SELECT
tx."OrderNumber",
to_dt.DayNo - from_dt.DayNo AS BusDays
FROM StoreFulfillment ff
INNER JOIN StoreTransmission tx
ON tx.OrderNumber = ff.OrderNumber
INNER JOIN StoreMerchandiseFulfillment mf
ON mf.OrderNumber = ff.OrderNumber
JOIN cte AS from_dt
ON from_dt.CalDate = Cast(tx."TimeStamp" AS DATE)
JOIN cte AS to_dt
ON to_dt.CalDate = Cast(mf.ShipDate AS DATE)
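For the variant where the calendar holds every day plus a business-day flag, the running-sum numbering can be sketched as follows. This runs on SQLite from Python rather than Teradata; the `Calendar` and `Orders` tables, the `IsBusinessDay` column, and the sample dates are assumptions made for illustration, and the window function needs SQLite 3.25 or later.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")

# Hypothetical calendar: two weeks starting Monday 2024-01-01, weekends flagged 0.
conn.execute("CREATE TABLE Calendar (CalDate TEXT PRIMARY KEY, IsBusinessDay INTEGER)")
start = date(2024, 1, 1)
for i in range(14):
    d = start + timedelta(days=i)
    conn.execute("INSERT INTO Calendar VALUES (?, ?)", (d.isoformat(), int(d.weekday() < 5)))

# Hypothetical orders with a start date and a ship date.
conn.execute("CREATE TABLE Orders (OrderNumber INTEGER, OrderDate TEXT, ShipDate TEXT)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (1, "2024-01-01", "2024-01-05"),  # Mon -> Fri
        (2, "2024-01-05", "2024-01-08"),  # Fri -> Mon
        (3, "2024-01-06", "2024-01-09"),  # Sat -> Tue
    ],
)

sql = """
    WITH numbered AS (
        SELECT CalDate,
               -- DayNo only increases on business days
               SUM(IsBusinessDay) OVER (ORDER BY CalDate ROWS UNBOUNDED PRECEDING) AS DayNo
        FROM Calendar
    )
    SELECT o.OrderNumber,
           to_dt.DayNo - from_dt.DayNo AS BusDays
    FROM Orders o
    JOIN numbered from_dt ON from_dt.CalDate = o.OrderDate
    JOIN numbered to_dt   ON to_dt.CalDate   = o.ShipDate
    ORDER BY o.OrderNumber
"""

bus_days = conn.execute(sql).fetchall()
print(bus_days)  # [(1, 4), (2, 1), (3, 2)]
```

Note that the difference counts the business days after the start date up to and including the end date, so it is one less than an inclusive BETWEEN count when the start date is itself a business day.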

Using a date field for matching SQL Query

I'm having a bit of an issue wrapping my head around the logic of this changing dimension. I would like to associate these two tables below. I need to match the Cost - Period fact table to the cost dimension based on the Id and the effective date.
As you can see - if the month and year field is greater than the effective date of its associated Cost dimension, it should adopt that value. Once a new Effective Date is entered into the dimension, it should use that value for any period greater than said date going forward.
EDIT: I apologize for the lack of detail but the Cost Dimension will actually have a unique Index value and the changing fields to reference for the matching would be Resource, Project, Cost. I tried to match the query you provided with my fields, but I'm getting the incorrect output.
FYI: Naming convention change: EngagementId is Id, Resource is ConsultantId, and Project is ProjectId
I've changed the images below and here is my query
,_cte(HoursWorked, HoursBilled, Month, Year, EngagementId, ConsultantId, ConsultantName, ProjectId, ProjectName, ProjectRetainer, RoleId, Role, Rate, ConsultantRetainer, Salary, amount, EffectiveDate)
as
(
select sum(t.Duration), 0, Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantId, c.ConsultantName, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, 0, c.EffectiveDate
from timesheet t
left join Engagement c on t.EngagementId = c.EngagementId and Month(c.EffectiveDate) = Month(t.EndDate) and Year(c.EffectiveDate) = Year(t.EndDate)
group by Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantName, c.ConsultantId, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, c.EffectiveDate
)
select * from _cte where EffectiveDate is not null
union
select _cte.HoursWorked, _cte.HoursBilled, _cte.Month, _cte.Year, _cte.EngagementId, _cte.ConsultantId, _cte.ConsultantName, _cte.ProjectId, _Cte.ProjectName, _cte.ProjectRetainer, _cte.RoleId, _cte.Role, sub.Rate, _cte.ConsultantRetainer,_cte.Salary, _cte.amount, sub.EffectiveDate
from _cte
outer apply (
select top 1 EffectiveDate, Rate
from Engagement e
where e.ConsultantId = _cte.ConsultantId and e.ProjectId = _cte.ProjectId and e.RoleId = _cte.RoleId
and Month(e.EffectiveDate) < _cte.Month and Year(e.EffectiveDate) < _cte.Year
order by EffectiveDate desc
) sub
where _cte.EffectiveDate is null
Example:
I'm struggling with writing the query that goes along with this. At first I attempted to partition by greatest date. However, when I executed the join I got the highest effective date for every single period (even those prior to the effective date).
Is this something that can be accomplished in a query or should I be focusing on incremental updates of the destination table so that any effective date / time period in the past is left alone?
Any tips would be great!
Thanks,
Channing
Try this one:
; with _CTE as(
select p.* , c.EffectiveDate, c.Cost
from period p
left join CostDimension c on p.id = c.id and p.Month = DATEPART(month, c.EffectiveDate) and p.year = DATEPART (year, EffectiveDate)
)
select * from _CTE Where EffectiveDate is not null
Union
select _CTE.id, _CTE.Month, _CTE.Year, sub.EffectiveDate, sub.Cost
from _CTE
outer apply (select top 1 EffectiveDate, Cost
from CostDimension as cd
where cd.Id = _CTE.id and cd.EffectiveDate < DATETIMEFROMPARTS(_CTE.Year, _CTE.Month, 1, 0, 0, 0, 0)
order by EffectiveDate desc
) sub
where _Cte.EffectiveDate is null
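The same "latest effective row on or before the period" lookup can also be written as a single correlated subquery. The sketch below uses SQLite from Python; the `Period` and `CostDimension` layouts and the sample rows are assumptions, and it simplifies the rule to "the most recent effective date before the first day of the following month", which may not match every edge case of the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables modelled on the question's description.
conn.execute("CREATE TABLE CostDimension (Id INTEGER, EffectiveDate TEXT, Cost REAL)")
conn.execute("CREATE TABLE Period (Id INTEGER, Year INTEGER, Month INTEGER)")
conn.executemany(
    "INSERT INTO CostDimension VALUES (?, ?, ?)",
    [(1, "2014-01-15", 100.0), (1, "2014-03-01", 150.0)],
)
conn.executemany(
    "INSERT INTO Period VALUES (?, ?, ?)",
    [(1, 2013, 12), (1, 2014, 1), (1, 2014, 2), (1, 2014, 3), (1, 2014, 4)],
)

sql = """
    SELECT p.Id, p.Year, p.Month,
           (SELECT c.Cost
            FROM CostDimension c
            WHERE c.Id = p.Id
              -- effective any time before the first day of the next month
              AND c.EffectiveDate < date(printf('%04d-%02d-01', p.Year, p.Month), '+1 month')
            ORDER BY c.EffectiveDate DESC
            LIMIT 1) AS Cost
    FROM Period p
    ORDER BY p.Id, p.Year, p.Month
"""

costs = conn.execute(sql).fetchall()
for row in costs:
    print(row)
# (1, 2013, 12, None)
# (1, 2014, 1, 100.0)
# (1, 2014, 2, 100.0)
# (1, 2014, 3, 150.0)
# (1, 2014, 4, 150.0)
```

Periods earlier than the first effective date come back with no cost, which leaves past periods untouched when a new effective date is added later.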

MIN MAX query with a twist

I need to get the MIN and MAX dates for a volume, but grouped by each consecutive run of that volume rather than across all rows with the same amount.
Basically, I have daily volumes and the dates for those daily volumes. I need to be able to get the MIN date as "from" and the MAX date as "to" for a set of volume.
Note that the volume can traverse dates and then break and then have a new set of dates for the same volume.
Hopefully the screenshots below do a better job explaining than I can. I know how to do this via code.. but was wondering if the same was possible with SQL.
Please note that the SQL will be called from within an application and I can't insert into a temp table to get my end result data set...
Here is the raw data that I am querying using select *:
Here is what I ultimately want:
The query that I am running gives me the MIN and MAX for all the occurrences of the volume 1100. I want it split based on the break between dates as shown in the End result screenshot....
Here is my SQL:
SELECT daily_volume, MIN(volume_date) AS min_date, MAX(volume_date) AS max_date, ins_num
FROM daily_volume
WHERE ins_num = 3854439
GROUP BY daily_volume, ins_num
The following was written for SQL Server, but it should be adaptable to other databases (sqlfiddle):
with DatesMinMax as
(
select
volume_date,
daily_volume,
isnull
(
(
select top 1 d2.volume_date
from daily_volume d2
where d.volume_date > d2.volume_date and d.daily_volume <> d2.daily_volume
order by d2.volume_date desc
)
, '1753-01-01'
) as min_date,
isnull
(
(
select top 1 d2.volume_date
from daily_volume d2
where d.volume_date < d2.volume_date and d.daily_volume <> d2.daily_volume
order by d2.volume_date
)
, '9999-12-31'
) as max_date
from daily_volume d
),
DatesFromTo as
(
select d1.daily_volume as qty,
(select min(d2.volume_date)
from DatesMinMax d2
where d2.volume_date > d1.min_date and d2.volume_date < d1.max_date
) as [from],
(select max(d2.volume_date)
from DatesMinMax d2
where d2.volume_date > d1.min_date and d2.volume_date < d1.max_date
) as [to]
from DatesMinMax d1
)
select distinct
qty,
[from],
[to]
from DatesFromTo
order by [from]
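DatesMinMax finds, for each row, the closest earlier date and the closest later date that had a different volume than the current row.
DatesFromTo then takes the first and last dates strictly between those two boundaries, which gives the date range for each row.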
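On databases with window functions, a different technique (not the one used above) is the "gaps and islands" trick: the difference between two ROW_NUMBER sequences stays constant within each consecutive run of the same volume. Below is a minimal sketch on SQLite 3.25 or later, driven from Python; the sample rows are invented, and a "break" here means any change of volume in date order, which is an assumption about what the screenshots show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_volume (ins_num INTEGER, volume_date TEXT, daily_volume INTEGER)")
# Hypothetical data: 1100 for three days, 1200 for two days, then 1100 again.
conn.executemany(
    "INSERT INTO daily_volume VALUES (3854439, ?, ?)",
    [
        ("2014-01-01", 1100), ("2014-01-02", 1100), ("2014-01-03", 1100),
        ("2014-01-04", 1200), ("2014-01-05", 1200),
        ("2014-01-06", 1100), ("2014-01-07", 1100),
    ],
)

sql = """
    WITH numbered AS (
        SELECT volume_date,
               daily_volume,
               -- constant within each consecutive run of the same volume
               ROW_NUMBER() OVER (ORDER BY volume_date)
             - ROW_NUMBER() OVER (PARTITION BY daily_volume ORDER BY volume_date) AS grp
        FROM daily_volume
        WHERE ins_num = ?
    )
    SELECT daily_volume AS qty,
           MIN(volume_date) AS from_date,
           MAX(volume_date) AS to_date
    FROM numbered
    GROUP BY daily_volume, grp
    ORDER BY from_date
"""

ranges = conn.execute(sql, (3854439,)).fetchall()
for row in ranges:
    print(row)
# (1100, '2014-01-01', '2014-01-03')
# (1200, '2014-01-04', '2014-01-05')
# (1100, '2014-01-06', '2014-01-07')
```

This is a single statement with no temp table, so it can be called directly from an application.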

Create weighted average in SQL using dates

I have a SQL query that lists details about a certain item. Everything works as it should except for the last column. I want the weight of transaction column to report back a difference in days.
So for example the 4th row in the txdate column is 05/21/2014 and the 3rd row is 05/12/2014. The weight of transaction column in the 4th row should say 9.
I read about the Lag and Lead functions, but I'm not sure how to implement those with dates (if it's even possible). If it isn't possible is there a way to accomplish this?
Select t.txNumber,
t.item,
t.txCode,
t.txdate,
(t.onhandlocold + t.stockQty) as 'Ending Quantity',
tmax.maxtnumber 'Latest Transaction Code',
tmax.maxdate 'Latest Transaction Date',
tmin.mindate 'First Transaction Date',
(t.txdate - tmin.mindate) 'weight of transaction'
From tbliminvtxhistory t
Left outer join
(Select t.item, max(t.txnumber) as maxtnumber, max(t.txdate) as maxdate
From tbliminvtxHistory t
Where t.txCode != 'PAWAY'
Group By Item) tmax
on t.item = tmax.item
Left Outer Join
(Select t.item, min(t.txdate) as mindate
From tbliminvtxHistory t
WHere t.txCode != 'PAWAY'
and t.txdate > DateAdd(Year, -1, GetDate())
Group By Item) tmin
on t.item = tmin.item
where t.item = 'LR50M'
and t.txCode != 'PAWAY'
and t.txdate > DateAdd(Year, -1, GetDate())
Check out the DATEDIFF function, which will return the difference between two dates.
I think this is what you are looking for:
DATEDIFF(dd,tmin.mindate,t.txdate)
UPDATE:
Now that I understand your question a little better, here is an update. As mentioned in a comment on the above post, the LAG function is only supported in SQL Server 2012 and up. An alternative is to use ROW_NUMBER and store the results in a temp table. You can then left join the table back to itself on the adjacent ROW_NUMBER and use DATEDIFF to compare the dates. This does the same thing as the LAG function.
Example:
SELECT ROW_NUMBER() OVER (ORDER BY txdate) AS RowNumber,*
INTO #Rows
FROM tbliminvtxhistory
SELECT DATEDIFF(dd,r2.txdate,r.txdate),*
FROM #Rows r
LEFT JOIN #Rows r2 ON r.RowNumber=r2.RowNumber+1
DROP TABLE #Rows
I think you are looking for this expression:
Select . . . ,
datediff(day, lag(txdate) over (order by txNumber), txdate)
This assumes that the rows are ordered by the first column, which seems reasonable given your explanation and the sample data.
EDIT:
Without lag() you can use outer apply. For simplicity, let me assume that your query is defined as a CTE:
with cte as (<your query here>)
select . . . ,
datediff(day, prev.txdate , cte.txdate)
from cte outer apply
(select top 1 cte2.*
from cte cte2
where cte2.txNumber < cte.txNumber
order by cte2.txNumber desc
) prev
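As a quick check that the LAG form and the ROW_NUMBER self-join give the same day differences, here is a small sketch on SQLite 3.25 or later, run from Python. The table name `tx` and the sample rows are assumptions; `julianday` subtraction stands in for `DATEDIFF(day, ...)`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down transaction history.
conn.execute("CREATE TABLE tx (txNumber INTEGER, txdate TEXT)")
conn.executemany(
    "INSERT INTO tx VALUES (?, ?)",
    [(1, "2014-04-28"), (2, "2014-05-02"), (3, "2014-05-12"), (4, "2014-05-21")],
)

# Version 1: LAG looks at the previous row directly.
lag_sql = """
    SELECT txNumber,
           CAST(julianday(txdate)
                - julianday(LAG(txdate) OVER (ORDER BY txNumber)) AS INTEGER) AS days_since_prev
    FROM tx
    ORDER BY txNumber
"""

# Version 2: number the rows, then left join each row to the one before it.
join_sql = """
    WITH numbered AS (
        SELECT ROW_NUMBER() OVER (ORDER BY txNumber) AS rn, txNumber, txdate
        FROM tx
    )
    SELECT cur.txNumber,
           CAST(julianday(cur.txdate) - julianday(prev.txdate) AS INTEGER) AS days_since_prev
    FROM numbered cur
    LEFT JOIN numbered prev ON prev.rn = cur.rn - 1
    ORDER BY cur.txNumber
"""

lag_rows = conn.execute(lag_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
print(lag_rows)   # [(1, None), (2, 4), (3, 10), (4, 9)]
print(join_rows)  # [(1, None), (2, 4), (3, 10), (4, 9)]
```

The first row has no predecessor, so its difference is NULL in both versions; the 4th row shows 9, matching the example in the question.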