SQL adding missing dates to query - sql

I'm trying to add missing dates to a SQL query, but it does not work.
Can you please tell me what I'm doing wrong?
I only have read-only rights to the database.
SQL query:
With cteDateGen AS
(
SELECT 0 as Offset, CAST(DATEADD(dd, 0, '2015-11-01') AS DATE) AS WorkDate
UNION ALL
SELECT Offset + 1, CAST(DATEADD(dd, Offset, '2015-11-05') AS DATE)
FROM cteDateGen
WHERE Offset < 100
), -- generate date from to --
cte AS (
SELECT COUNT(*) OVER() AS 'total' ,ROW_NUMBER()OVER (ORDER BY c.dt DESC) as row
, c.*
FROM clockL c
RIGHT JOIN cteDateGen d ON CAST(c.dt AS DATE) = d.WorkDate
WHERE
c.dt between '2015-11-01' AND '2015-11-05' and
--d.WorkDate BETWEEN '2015-11-01' AND '2015-11-05'
and c.id =10
) -- select user log and add missing dates --
SELECT *
FROM cte
--WHERE row BETWEEN 0 AND 15
--option (maxrecursion 0)

I think your problem is simply the dates in the CTE. You can also simplify it a bit:
With cteDateGen AS (
SELECT 0 as Offset, CAST('2015-11-01' AS DATE) AS WorkDate
UNION ALL
SELECT Offset + 1, DATEADD(day, 1, WorkDate)
-----------------------------------^
FROM cteDateGen
WHERE Offset < 100
), -- generate date from to --
cte AS
(SELECT COUNT(*) OVER () AS total,
ROW_NUMBER() OVER (ORDER BY c.dt DESC) as row,
c.*
FROM cteDateGen d LEFT JOIN
clockL c
ON CAST(c.dt AS DATE) = d.WorkDate AND c.id = 10
-----------------------------------------------^
WHERE d.WorkDate between '2015-11-01' AND '2015-11-05'
) -- select user log and add missing dates --
SELECT *
FROM cte
Notes:
Your query used a constant for the second date in the CTE. The constant was different from the first constant. Hence, it was missing some days.
I think that LEFT JOIN is much easier to follow than RIGHT JOIN. LEFT JOIN is basically "keep all rows in the first table".
The WHERE clause was undoing the outer join in any case. The c.id logic needs to move to the ON clause.
The date arithmetic in the first CTE was unnecessarily complex.
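The point about the WHERE clause undoing the outer join can be checked with a small, self-contained sketch. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server, so date() replaces CAST(... AS DATE) and the sample rows in clockL are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clockL (id INTEGER, dt TEXT);
    -- hypothetical sample rows, not taken from the question
    INSERT INTO clockL VALUES
        (10, '2015-11-01 08:00:00'),
        (10, '2015-11-03 08:05:00'),
        (11, '2015-11-02 09:00:00');
""")

# Recursive calendar from 2015-11-01 to 2015-11-05 (SQLite syntax).
CALENDAR = """
WITH RECURSIVE cteDateGen(WorkDate) AS (
    SELECT date('2015-11-01')
    UNION ALL
    SELECT date(WorkDate, '+1 day') FROM cteDateGen
    WHERE WorkDate < '2015-11-05'
)
"""

# Filter in WHERE: dates without a match have c.id NULL, so they are
# discarded and the LEFT JOIN behaves like an inner join.
filter_in_where = conn.execute(CALENDAR + """
SELECT d.WorkDate, c.dt
FROM cteDateGen d
LEFT JOIN clockL c ON date(c.dt) = d.WorkDate
WHERE c.id = 10
ORDER BY d.WorkDate
""").fetchall()

# Filter in ON: every generated date is kept; unmatched dates get NULL.
filter_in_on = conn.execute(CALENDAR + """
SELECT d.WorkDate, c.dt
FROM cteDateGen d
LEFT JOIN clockL c ON date(c.dt) = d.WorkDate AND c.id = 10
ORDER BY d.WorkDate
""").fetchall()

print(len(filter_in_where), len(filter_in_on))
```

With the filter in WHERE only the two days that have a row for id 10 come back; with the filter in ON all five calendar days come back, and the missing days carry NULL in c.dt.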

Related

Query to pick value depending on date

I have a table of exchange rates that is updated only when a new rate arrives; that is, only the date on which the new rate was entered is recorded. The system, however, has logic so that any date falling within a particular period picks up the corresponding exchange rate.
I would like a query that picks the applicable exchange rate for any supplied date, i.e., the rate for the period that contains it.
WITH ListDates(AllDates) AS
( SELECT cast('2015-11-01' as date) AS DATE
UNION ALL
SELECT DATEADD(DAY,1,AllDates)
FROM ListDates
WHERE AllDates < getdate())
SELECT ld.AllDates,cr.effective_from,cr.rate_against_base
FROM ListDates ld
left join CurrencyRatetable cr on cr.effective_from between cr.effective_from and ld.alldates
option (maxrecursion 0)
I guess you can achieve the required result using the window function LEAD. Here is an example:
DECLARE @t TABLE(effective_from date, rate_against_base decimal(19,4))
INSERT INTO @t VALUES
('2000-01-01', 1.6)
,('2016-10-26', 1)
,('2020-07-13', 65.8765);
DECLARE @searchDate DATE = '2023-01-17';
WITH cte AS(
SELECT effective_from
,ISNULL(LEAD(effective_from) OVER (ORDER BY effective_from), CAST('2049-12-31' AS DATE)) AS effective_to
,rate_against_base
FROM @t
)
SELECT rate_against_base
FROM cte
WHERE @searchDate >= effective_from
AND @searchDate < effective_to
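The LEAD idea can also be tried end to end outside SQL Server. The following is a minimal sketch, assuming SQLite 3.25 or later (through Python's sqlite3 module) as a stand-in; the table name rates, the helper rate_on and the open-ended sentinel '9999-12-31' are hypothetical names chosen for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE rates (effective_from TEXT, rate_against_base REAL);
    INSERT INTO rates VALUES
        ('2000-01-01', 1.6),
        ('2016-10-26', 1.0),
        ('2020-07-13', 65.8765);
""")

RATE_ON = """
WITH periods AS (
    SELECT effective_from,
           -- the next row's start date closes this period;
           -- the last period stays open via the sentinel default
           LEAD(effective_from, 1, '9999-12-31')
               OVER (ORDER BY effective_from) AS effective_to,
           rate_against_base
    FROM rates
)
SELECT rate_against_base
FROM periods
WHERE :d >= effective_from AND :d < effective_to
"""

def rate_on(day):
    """Return the rate whose period [effective_from, effective_to) contains day."""
    row = conn.execute(RATE_ON, {"d": day}).fetchone()
    return row[0] if row else None

print(rate_on("2023-01-17"), rate_on("2016-10-25"), rate_on("1999-12-31"))
```

Each period is treated as half-open, so a date equal to a new effective_from already picks up the new rate, and a date before the first rate returns nothing.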
You can use a CROSS APPLY or OUTER APPLY together with a TOP 1 subselect.
Something like:
WITH ListDates(AllDates) AS (
SELECT cast('2015-11-01' as date) AS DATE
UNION ALL
SELECT DATEADD(DAY,1,AllDates)
FROM ListDates
WHERE AllDates < getdate()
)
SELECT ld.AllDates, cr.effective_from, cr.rate_against_base
FROM ListDates ld
OUTER APPLY (
SELECT TOP 1 *
FROM CurrencyRatetable cr
WHERE cr.effective_from <= ld.alldates
ORDER BY cr.effective_from DESC
) cr
ORDER BY ld.AllDates
option (maxrecursion 0)
Both CROSS APPLY and OUTER APPLY are like a join to a subselect. The difference is that CROSS APPLY is like an inner join and OUTER APPLY is like a left join.
Make sure that CurrencyRatetable has an index on effective_from for efficient access.
See this db<>fiddle.

How to extrapolate dates in SQL Server to calculate the daily counts?

This is what the data looks like. It's a long table.
I need to calculate the number of people employed by day.
How do I write SQL Server logic to get this result? I tried to create a DATES table and then join, but this caused an error because the table is too big. Do I need recursive logic?
For future questions, don't post images of data. Instead, use a service like dbfiddle. I'll add a sketch of an answer anyway; with a better-prepared question you could have gotten a complete one. Here it goes:
-- extrema is the least and the greatest date in staff table
with extrema(mn, mx) as (
select least(min(hired),min(retired)) as mn
, greatest(max(hired),max(retired)) as mx
from staff
), calendar (dt) as (
-- we construct a calendar with every date between extreme values
select mn from extrema
union all
select dateadd(day, 1, dt)
from calendar
where dt < (select mx from extrema)
)
-- finally we can count the number of employed people for each such date
select dt, count(1)
from calendar c
join staff s
on c.dt between s.hired and s.retired
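group by dt
-- SQL Server stops a recursive CTE after 100 levels by default; lift the limit for longer ranges
option (maxrecursion 0);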
If you find yourself doing this kind of calculation often, it is a good idea to create a calendar table. You can add other attributes to it, such as whether a date is a day off in the middle of the week.
With a constraint as:
CHECK(hired <= retired)
the first part can be simplified to:
with extrema(mn, mx) as (
select min(hired) as mn
, max(retired) as mx
from staff
),
Assuming Current Employees have a NULL retirement date
Declare @Date1 date = '2015-01-01'
Declare @Date2 date = getdate()
Select A.Date
,HeadCount = count(B.name)
From ( Select Top (DateDiff(DAY,@Date1,@Date2)+1)
Date=DateAdd(DAY,-1+Row_Number() Over (Order By (Select Null)),@Date1)
From master..spt_values n1,master..spt_values n2
) A
Left Join YourTable B on A.Date >= B.Hired and A.Date <= coalesce(B.Retired,getdate())
Group BY A.Date
You need a calendar table for this. You start with the calendar, and LEFT JOIN everything else, using BETWEEN logic.
You can use a real table. Or you can generate it on the fly, like this:
WITH
L0 AS ( SELECT c = 1
FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1),(1),(1),(1)) AS D(c) ),
L1 AS ( SELECT c = 1 FROM L0 A, L0 B, L0 C, L0 D ),
Nums AS ( SELECT rownum = ROW_NUMBER() OVER(ORDER BY (SELECT 1))
FROM L1 ),
Dates AS (
SELECT TOP (DATEDIFF(day, '20141231', GETDATE()))
Date = DATEADD(day, rownum, '20141231')
FROM Nums
)
SELECT
d.Date,
NumEmployed = COUNT(*)
FROM Dates d
JOIN YourTable t ON d.Date BETWEEN t.Hired AND t.Retired
GROUP BY
d.Date;
If your dates have a time component, then you need to use >= AND < logic instead of BETWEEN.
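To see why a time component matters, here is a small sketch of one way to read the >= AND < advice. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server; the staff and calendar tables and the single sample row are hypothetical. In SQLite the values are compared as ISO-8601 text, which orders the same way as the dates they represent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE staff (name TEXT, hired TEXT, retired TEXT);
    -- hypothetical employee hired at 09:00 and leaving at 17:00
    INSERT INTO staff VALUES ('Ann', '2020-01-01 09:00:00', '2020-01-03 17:00:00');
    CREATE TABLE calendar (dt TEXT);
    INSERT INTO calendar VALUES
        ('2020-01-01'), ('2020-01-02'), ('2020-01-03'), ('2020-01-04');
""")

# BETWEEN compares the start of each day with the timestamps, so the hire
# day is missed: midnight of 2020-01-01 is earlier than 09:00 that day.
with_between = [r[0] for r in conn.execute("""
    SELECT c.dt FROM calendar c
    JOIN staff s ON c.dt BETWEEN s.hired AND s.retired
    ORDER BY c.dt
""")]

# Half-open test: the employment overlaps the day [dt, dt + 1 day).
half_open = [r[0] for r in conn.execute("""
    SELECT c.dt FROM calendar c
    JOIN staff s ON s.hired < date(c.dt, '+1 day') AND s.retired >= c.dt
    ORDER BY c.dt
""")]

print(with_between)
print(half_open)
```

The BETWEEN version counts only 2020-01-02 and 2020-01-03, while the half-open version also counts the hire day. In SQL Server the upper bound of the day would be built with DATEADD(day, 1, d.Date) rather than date(c.dt, '+1 day').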
Try limiting the scope of your date table. In this example I have a table of dates named TallyStickDT.
SELECT dt, COUNT(name)
FROM (
SELECT dt
FROM tallystickdt
WHERE dt >= (SELECT MIN(hired) FROM #employees)
AND dt <= GETDATE()
) A
LEFT OUTER JOIN #employees E ON A.dt >= E.Hired AND A.dt <= e.retired
GROUP BY dt
ORDER BY dt

How to loop through dates, 30 times to find historical information at points in time?

I am trying to find out the transaction details of customers at different points in time in the past. I came up with a query but I have to change the date every single time in my declare statement. Is there a way to loop through the dates to get data back x amount of days in the past?
DECLARE @StartDate AS Date
SET @StartDate = '2018-10-30'
SELECT u.member_id AS Member_id
,CAST(MAX(pe.executed_time) AS Date) AS Max_Date
,DATEDIFF(dd,@StartDate,MAX(pe.executed_time))*-1+1 AS R
,COUNT(*) AS F
,SUM(pi.Price) AS M
FROM [user] AS u
LEFT JOIN purchase_entry AS pe
ON u.member_id=pe.member_id
LEFT JOIN purchase_item AS pi
ON pe.order_id=pi.order_id
WHERE 1=1
AND pe.[status] = 'purchase'
AND pe.executed_time <= @StartDate -- Change the declared Date to have historical information.
GROUP BY u.id, u.member_id
ORDER BY Recency
I need a way to have my query loop through @StartDate for 30 days in the past, for instance. For example: @StartDate = '2018-11-30', then @StartDate = '2018-11-29', and so on ...
You can list the dates that you want:
WITH dates as (
SELECT v.*
FROM (VALUES (CONVERT(DATE, '2018-11-30')), (CONVERT(DATE, '2018-11-29'))
) v(dte)
)
SELECT dates.dte, u.member_id, . . .
FROM dates CROSS JOIN
[user] u JOIN
purchase_entry pe
ON u.member_id = pe.member_id LEFT JOIN
purchase_item pi
ON pe.order_id = pi.order_id
WHERE pe.[status] = 'purchase' AND
pe.executed_time <= dates.dte -- Each listed date acts as the historical cut-off.
GROUP BY dates.dte, u.id, u.member_id
ORDER BY Recency;
For a specific period of dates, you can construct them using a recursive CTE:
with dates as (
select convert(date, '2018-11-30') as dte, 1 as n
union all
select dateadd(day, -1, dte), n + 1
from dates
where n < 30
)
. . .
Use a tally table to generate the date and then query:
DECLARE @StartDate AS Date
SET @StartDate = '2018-10-30';
WITH N AS (
SELECT N
FROM (VALUES(NULL),(NULL),(NULL),(NULL),(NULL)) N(N)),
Tally AS(
SELECT TOP 30 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) -1 AS I
FROM N N1
CROSS JOIN N N2
CROSS JOIN N N3),
Dates AS(
SELECT DATEADD(DAY, -I, @StartDate) AS [Date] -- step back I days from @StartDate
FROM Tally
)
SELECT D.[Date] AS As_Of_Date
,u.member_id AS Member_id
,CAST(MAX(pe.executed_time) AS Date) AS Max_Date
,DATEDIFF(dd,@StartDate,MAX(pe.executed_time))*-1+1 AS R
,COUNT(*) AS F
,SUM(pi.Price) AS M
FROM [user] AS u
CROSS JOIN Dates D
LEFT JOIN purchase_entry AS pe
ON u.member_id=pe.member_id
LEFT JOIN purchase_item AS pi
ON pe.order_id=pi.order_id
WHERE pe.[status] = 'purchase'
AND pe.executed_time <= D.[Date]
-- one result row per cut-off date and member
GROUP BY D.[Date], u.id, u.member_id
ORDER BY D.[Date], R;

Find multiple most recent dates before a given date efficiently?

The following query takes 1.5s and because I need to run it several thousands times, I would like to optimize it. Basically I try to find the first date less than or equal to an array of provided dates (e.g. ['2016-01-01', '2017-01-01', '2018-01-01']). Now I'm doing each date individually:
SELECT date FROM date_history
WHERE ticker = 'APPL' AND date <= %(date)
ORDER BY date DESC LIMIT 1;
I feel as though it might be faster if I could reuse the date sorting or something under those lines but I can't think of a good way to do this. Any suggestions on how to make this faster would be appreciated!
You could use ROW_NUMBER:
WITH cte(d) AS (
VALUES ('2016-01-01'::date)
,('2017-01-01'::date)
,('2018-01-01'::date)
--Or unnest array_variable WITH ORDINALITY
), cte2 AS (
SELECT d.date, c.d,
ROW_NUMBER() OVER(PARTITION BY c.d ORDER BY d.date DESC) AS rn
FROM cte c
LEFT JOIN date_history d
ON d.date <= c.d
AND d.ticker = 'APPL' -- kept in ON so the LEFT JOIN is not turned into an inner join
)
SELECT d, date AS max_date_before
FROM cte2
WHERE rn = 1
ORDER BY d ASC;
Alternatively LEFT JOIN LATERAL and correlated subquery:
WITH cte(d) AS (
VALUES ('2016-01-01'::date)
,('2017-01-01'::date)
,('2018-01-01'::date)
--Or unnest array_variable WITH ORDINALITY
)
SELECT *
FROM cte c
LEFT JOIN LATERAL (SELECT MAX(date) AS max_date_before
FROM date_history d
WHERE d.ticker = 'APPL'
AND d.date <= c.d) s ON true;

SQL How to group by but with special conditions

I have a SQL Server 2008 table that holds a list of employees with timestamps.
I have a script that groups the dates by employee.
What I need is to group by employee while excluding timestamps that fall on the same day and are less than 8 hours apart.
Here is a table that explains it better:
I created a SQL Fiddle with the table and sample data.
http://sqlfiddle.com/#!3/3b956/1
Any clue?
What you really want is lag(), which is in SQL Server 2012+. With lag(), you would do:
select t.*
from (select t.*, lag(date) over (partition by EmployeeId order by date) as prev_date
from t
) t
where not (cast(prev_date as date) = cast(date as date) and
date <= dateadd(hour, 8, prev_date)
) or
prev_date is null;
In SQL Server 2008, you can do something similar with outer apply:
select t.*
from t outer apply
(select top 1 prev.*
from t prev
where prev.EmployeeId = t.EmployeeId and
prev.date < t.date and
cast(prev.date as date) = cast(t.date as date)
order by prev.date desc
) prev
where prev.date is null or
t.date > dateadd(hour, 8, prev.date);
You may need an order by to maintain the same ordering.
This should also work, by excluding rows for which there exists a previous row with a difference of less than 8 hours:
select p1.employeeid, count(*) as [count]
from punch p1
where not exists(select * from punch p2
where p2.employeeid = p1.employeeid and p2.id < p1.id and
dateadd(hour, 8, p2.date) > p1.date)
group by p1.employeeid