Using a date field for matching SQL Query - sql

I'm having a bit of an issue wrapping my head around the logic of this changing dimension. I would like to associate these two tables below. I need to match the Cost - Period fact table to the cost dimension based on the Id and the effective date.
As you can see - if the month and year field is greater than the effective date of its associated Cost dimension, it should adopt that value. Once a new Effective Date is entered into the dimension, it should use that value for any period greater than said date going forward.
EDIT: I apologize for the lack of detail but the Cost Dimension will actually have a unique Index value and the changing fields to reference for the matching would be Resource, Project, Cost. I tried to match the query you provided with my fields, but I'm getting the incorrect output.
FYI: Naming convention change: EngagementId is Id, Resource is ConsultantId, and Project is ProjectId
I've changed the images below and here is my query
,_cte(HoursWorked, HoursBilled, Month, Year, EngagementId, ConsultantId, ConsultantName, ProjectId, ProjectName, ProjectRetainer, RoleId, Role, Rate, ConsultantRetainer, Salary, amount, EffectiveDate)
as
(
select sum(t.Duration), 0, Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantId, c.ConsultantName, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, 0, c.EffectiveDate
from timesheet t
left join Engagement c on t.EngagementId = c.EngagementId and Month(c.EffectiveDate) = Month(t.EndDate) and Year(c.EffectiveDate) = Year(t.EndDate)
group by Month(t.StartDate), Year(t.StartDate), t.EngagementId, c.ConsultantName, c.ConsultantId, c.ProjectId, c.ProjectName, c.ProjectRetainer, c.RoleId, c.Role, c.Rate, c.ConsultantRetainer,
c.Salary, c.EffectiveDate
)
select * from _cte where EffectiveDate is not null
union
select _cte.HoursWorked, _cte.HoursBilled, _cte.Month, _cte.Year, _cte.EngagementId, _cte.ConsultantId, _cte.ConsultantName, _cte.ProjectId, _Cte.ProjectName, _cte.ProjectRetainer, _cte.RoleId, _cte.Role, sub.Rate, _cte.ConsultantRetainer,_cte.Salary, _cte.amount, sub.EffectiveDate
from _cte
outer apply (
select top 1 EffectiveDate, Rate
from Engagement e
where e.ConsultantId = _cte.ConsultantId and e.ProjectId = _cte.ProjectId and e.RoleId = _cte.RoleId
and Month(e.EffectiveDate) < _cte.Month and Year(e.EffectiveDate) < _cte.Year
order by EffectiveDate desc
) sub
where _cte.EffectiveDate is null
Example:
I'm struggling with writing the query that goes along with this. At first I attempted to partition by greatest date. However, when I executed the join I got the highest effective date for every single period (even those prior to the effective date).
Is this something that can be accomplished in a query or should I be focusing on incremental updates of the destination table so that any effective date / time period in the past is left alone?
Any tips would be great!
Thanks,
Channing

Try this one:
; with _CTE as(
select p.* , c.EffectiveDate, c.Cost
from period p
left join CostDimension c on p.id = c.id and p.Month = DATEPART(month, c.EffectiveDate) and p.year = DATEPART (year, EffectiveDate)
)
select * from _CTE Where EffectiveDate is not null
Union
select _CTE.id, _CTE.Month, _CTE.Year, sub.EffectiveDate, sub.Cost
from _CTE
outer apply (select top 1 EffectiveDate, Cost
from CostDimension as cd
where cd.Id = _CTE.id and cd.EffectiveDate < DATETIMEFROMPARTS(_CTE.Year, _CTE.Month, 1, 0, 0, 0, 0)
order by EffectiveDate desc
) sub
where _Cte.EffectiveDate is null

Related

Calculate time span between two specific statuses on the database for each ID

I have a table on the database that contains statuses updated on each vehicle I have, I want to calculate how many days each vehicle spends time between two specific statuses 'Maintenance' and 'Read'.
My table looks something like this
and I want to result to be like this, only show the number of days a vehicle spends in maintenance before becoming ready on a specific day
The code I written looks like this
drop table if exists #temps1
select
VehicleId,
json_value(VehiclesHistoryStatusID.text,'$.en') as VehiclesHistoryStatus,
VehiclesHistory.CreationTime,
datediff(day, VehiclesHistory.CreationTime ,
lead(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ) ) as days,
lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) as PrevStatus,
case
when (lag(json_value(VehiclesHistoryStatusID.text,'$.en')) over (order by VehiclesHistory.CreationTime) <> json_value(VehiclesHistoryStatusID.text,'$.en')) THEN datediff(day, VehiclesHistory.CreationTime , (lag(VehiclesHistory.CreationTime ) over (order by VehiclesHistory.CreationTime ))) else 0 end as testing
into #temps1
from fleet.VehicleHistory VehiclesHistory
left join Fleet.Lookups as VehiclesHistoryStatusID on VehiclesHistoryStatusID.Id = VehiclesHistory.StatusId
where (year(VehiclesHistory.CreationTime) > 2021 and (VehiclesHistory.StatusId = 140 Or VehiclesHistory.StatusId = 144) )
group by VehiclesHistory.VehicleId ,VehiclesHistory.CreationTime , VehiclesHistoryStatusID.text
order by VehicleId desc
drop table if exists #temps2
select * into #temps2 from #temps1 where testing <> 0
select * from #temps2
Try this
SELECT innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
,SUM(DATEDIFF(DAY,innerQ.PrevMaintenance,innerQ.CreationDate)) AS DayDuration
FROM
(
SELECT t1.VehichleID,t1.CreationDate,t1.Status,
(SELECT top(1) t2.CreationDate FROM dbo.Test t2
WHERE t1.VehichleID=t2.VehichleID
AND t2.CreationDate<t1.CreationDate
AND t2.Status='Maintenance'
ORDER BY t2.CreationDate Desc) AS PrevMaintenance
FROM
dbo.Test t1 WHERE t1.Status='Ready'
) innerQ
WHERE innerQ.PrevMaintenance IS NOT NULL
GROUP BY innerQ.VehichleID,innerQ.CreationDate,innerQ.Status
In this query first we are finding the most recent 'maintenance' date before each 'ready' date in the inner most query (if exists). Then calculate the time span with DATEDIFF and sum all this spans for each vehicle.

Teradata spool space issue on running a sub query with Count

I am using below query to calculate business days between two dates for all the order numbers. Business days are already available in the teradata table Common_WorkingCalendar. But, i'm also facing spool space issue while i execute the query. I have ample space available in my data lab. Need to optimize the query. Appreciate any inputs.
SELECT
tx."OrderNumber",
(SELECT COUNT(1) FROM Common_WorkingCalendar
WHERE CalDate between Cast(tx."TimeStamp" as date) and Cast(mf.ShipDate as date)) as BusDays
from StoreFulfillment ff
inner join StoreTransmission tx
on tx.OrderNumber = ff.OrderNumber
inner join StoreMerchandiseFulfillment mf
on mf.OrderNumber = ff.OrderNumber
This is a very inefficient way to get this count which results in a product join.
The recommended approach is adding a sequential number to your calendar which increases only on business days (calculated using SUM(CASE WHEN businessDay THEN 1 ELSE 0 END) OVER (ORDER BY CalDate ROWS UNBOUNDED PRECEDING)), then it's two joins, for the start date and the end date.
If this calculation is needed a lot you better add a new column, otherwise you can do it on the fly:
WITH cte AS
(
SELECT CalDate,
-- as this table only contains business days you can use this instead
row_number(*) Over (ORDER BY CalDate) AS DayNo
FROM Common_WorkingCalendar
)
SELECT
tx."OrderNumber",
to_dt.DayNo - from_dt.DayNo AS BusDays
FROM StoreFulfillment ff
INNER JOIN StoreTransmission tx
ON tx.OrderNumber = ff.OrderNumber
INNER JOIN StoreMerchandiseFulfillment mf
ON mf.OrderNumber = ff.OrderNumber
JOIN cte AS from_dt
ON from_dt.CalDate = Cast(tx."TimeStamp" AS DATE)
JOIN cte AS to_dt
ON to_dt.CalDate = Cast(mf.ShipDate AS DATE)

SQL Query to show order of work orders

First off sorry for the poor subject line.
EDIT: The Query here duplicates OrderNumbers I am needing the query to NOT duplicate OrderNumbers
EDIT: Shortened the question and provided a much cleaner question
I have a table that has a record of all of the work orders that have been performed. there are two types of orders. Installs and Trouble Calls. My query is to find all of the trouble calls that have taken place within 30 days of an install and match that trouble call (TC) to the proper Install (IN). So the Trouble Call date has to happen after the install but no more than 30 days after. Additionally if there are two installs and two trouble calls for the same account all within 30 days and they happen in order the results have to reflect that. The problem I am having is I am getting an Install order matching to two different Trouble Calls (TC) and a Trouble Call(TC) that is matching to two different Installs(IN)
In the example on SQL Fiddle pay close attention to the install order number 1234567810 and the Trouble Call order number 1234567890 and you will see the issue I am having.
http://sqlfiddle.com/#!3/811df/8
select b.accountnumber,
MAX(b.scheduleddate) as OriginalDate,
b.workordernumber as OriginalOrder,
b.jobtype as OriginalType,
MIN(a.scheduleddate) as NewDate,
a.workordernumber as NewOrder,
a.jobtype as NewType
from (
select workordernumber,accountnumber,jobtype,scheduleddate
from workorders
where jobtype = 'TC'
) a join
(
select workordernumber,accountnumber,jobtype,scheduleddate
from workorders
where jobtype = 'IN'
) b
on a.accountnumber = b.accountnumber
group by b.accountnumber,
b.scheduleddate,
b.workordernumber,
b.jobtype,
a.accountnumber,
a.scheduleddate,
a.workordernumber,
a.jobtype
having MIN(a.scheduleddate) > MAX(b.scheduleddate) and
DATEDIFF(day,MAX(b.scheduleddate),MIN(a.scheduleddate)) < 31
Example of what I am looking for the results to look like.
Thank you for any assistance you can provide in setting me on the correct path.
You were actually very close. I realized that what you really want is the MIN() TC date that is greater than each install date for that account number so long as they are 30 days or less apart.
So really you need to group by the install dates from your result set excluding WorkOrderNumbers still. Something like:
SELECT a.AccountNumber, MIN(a.scheduleddate) TCDate, b.scheduleddate INDate
FROM
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'TC'
) a
INNER JOIN
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'IN'
) b
ON a.AccountNumber = b.AccountNumber
WHERE b.ScheduledDate < a.ScheduledDate
AND DATEDIFF(DAY, b.ScheduledDate, a.ScheduledDate) <= 30
GROUP BY a.AccountNumber, b.AccountNumber, b.ScheduledDate
This takes care of the dates and AccountNumbers, but you still need the WorkOrderNumbers, so I joined the workorders table back twice, once for each type.
NOTE: I assume that each workorder has a unique date for each account number. So, if you have workorder 1 ('TC') for account 1 done on '1/1/2015' and you also have workorder 2 ('TC') for account 1 done on '1/1/2015' then I can't guarantee that you will have the correct WorkOrderNumber in your result set.
My final query looked like this:
SELECT
aggdata.AccountNumber, inst.workordernumber OriginalWorkOrderNumber, inst.JobType OriginalJobType, inst.ScheduledDate OriginalScheduledDate,
tc.WorkOrderNumber NewWorkOrderNumber, tc.JobType NewJobType, tc.ScheduledDate NewScheduledDate
FROM (
SELECT a.AccountNumber, MIN(a.scheduleddate) TCDate, b.scheduleddate INDate
FROM
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'TC'
) a
INNER JOIN
(
SELECT WorkOrderNumber, ScheduledDate, JobType, AccountNumber
FROM workorders
WHERE JobType = 'IN'
) b
ON a.AccountNumber = b.AccountNumber
WHERE b.ScheduledDate < a.ScheduledDate
AND DATEDIFF(DAY, b.ScheduledDate, a.ScheduledDate) <= 30
GROUP BY a.AccountNumber, b.AccountNumber, b.ScheduledDate
) aggdata
LEFT OUTER JOIN workorders tc
ON aggdata.TCDate = tc.ScheduledDate
AND aggdata.AccountNumber = tc.AccountNumber
AND tc.JobType = 'TC'
LEFT OUTER JOIN workorders inst
ON aggdata.INDate = inst.ScheduledDate
AND aggdata.AccountNumber = inst.AccountNumber
AND inst.JobType = 'IN'
select in1.accountnumber,
in1.scheduleddate as OriginalDate,
in1.workordernumber as OriginalOrder,
'IN' as OriginalType,
tc.scheduleddate as NewDate,
tc.workordernumber as NewOrder,
'TC' as NewType
from
workorders in1
out apply (Select min(in2.scheduleddate) as scheduleddate from workorders in2 Where in2.jobtype = 'IN' and in1.accountnumber=in2.accountnumber and in2.scheduleddate>in1.scheduleddate) ins
join workorders tc on tc.jobtype = 'TC' and tc.accountnumber=in1.accountnumber and tc.scheduleddate>in1.scheduleddate and (ins.scheduleddate is null or tc.scheduleddate<ins.scheduleddate) and DATEDIFF(day,in1.scheduleddate,tc.scheduleddate) < 31
Where in1.jobtype = 'IN'

First date when certain condition was met

I'm trying to find a first date when a condition was met. So the logic is below:
use
[AdventureWorksDW2012]
go
;WITH sales AS (
select d.OrderDateKey,
SalesAmount = SUM(d.SalesAmount)
from [dbo].[FactInternetSales] d
group by d.OrderDateKey
having SUM(d.SalesAmount)>10000
)
select FirstOrderDateKey = MIN(OrderDateKey)
from sales
The only problem is that in my data is too complex and too huge to calculate value for each date and then choose the min date when the condition is met. Is there any quick way of finding first date when Internet sales amount exceeded 10000? Is there some kind of loop required?
You can do it in single statement also. The performance can be improved with this if you have proper index on orderdatekey.
select MIN(s.OrderDateKey) as FirstOrderDateKey
from [dbo].[FactInternetSales] d
group by d.OrderDateKey
having SUM(d.SalesAmount)>10000
SELECT TOP 1
d.OrderDateKey
,S.RunningTotal
FROM [dbo].[FactInternetSales] d
CROSS APPLY ( SELECT SUM(SalesAmount) AS RunningTotal
FROM [dbo].[FactInternetSales]
WHERE OrderDateKey <= d.OrderDateKey
) S
WHERE S.RunningTotal < -- Condition
ORDER BY d.OrderDateKey DESC

Selecting only if at least one row matches condition

I have a select statement and want to return all values only if at least one of them has a date with 60 days of difference from today.
The problem is that i have an outer apply which returns the column i want to compare to, and they come from different tables (one belongs to cash items, and the other to card items).
Considering I have the following:
OUTER APPLY (
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1 --Cash
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2 --Card
) AS items
I want to return all the rows only when DATEDIFF(DAY, MIN(items.item_date), GETDATE()) >= 60, but I want them all even if only one matches this condition.
What would be the best approach to do this?
EDIT
To make it clearer, I'll explain the use case:
I need to show the items of every loan, only if the client is late for more than 60 days of the due date on any of it
I am also not sure, what do you expect, but how about that:
WITH items
AS (SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT Count(*) AS quantity,
Min(date) AS item_date
FROM dbo.Get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2)
SELECT a.*
FROM items AS a,
(SELECT TOP 1 *
FROM items AS b
WHERE Datediff(day, b.item_date, Getdate()) >= 60) AS c
It's a sort of CROSS JOIN, where table C will have one or zero rows depending on that if the condition is met - it will than join to every row in other table.
Have you tried something like this?
SELECT a.quantity, a.item_date
FROM
(SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_cash_items(loans.id_cash) AS cash_item
HAVING loans.id_product_type = 1
UNION
SELECT COUNT(*) AS quantity, MIN(date) AS item_date
FROM dbo.get_card_items(loans.id_card) AS card_item
HAVING loans.id_product_type = 2) a
WHERE DATEDIFF(day, a.item_date, GETDATE()) >= 60
Typically I do this using a CTE to select the key for the records I want to select and then join on that. Below is an attempt at an example:
with LateClients as
(
SELECT LoadId FROM Payment Where /*payment date later than 60 days*/
)
SELECT p.LoanId,
p.UserId
FROM Payment as p
INNER JOIN LateClients as LC
ON p.LoanId = lc.LoanId
OrderBy p.LoanId, p.UserId
I know it's a bit different from the code you posted, but this is a simplified example that should explain the concept. Good luck!