How to look-up dates and find closest match

How to look-up dates and find closest match - sql

I have this problem using SQL where I want to find the closest matched date between two columns. Imagine this is my data-set:
'LY Date' is just last years date of 'Date' column I want to put into the new column 'Closest Date' which 'Date' that the 'LY Date', for each row, is closest to.
Example: the 22-02-2022 has last years date of 21-02-2021. This date is closest to the 15-02-2021 from 'Date' column so I put it in 'Closest Date'

I’ve referred to the columns as ‘DATE.DATE’ and ‘LYDATE.DATE’.
You can start from the LYDATE and obtain the top 1 record from DATE when DATE is sorted descending and only contains items less than ‘LYDATE’. You can then flip the DATE table and obtain the top 1 after filtering out those items less than LYDATE. This will give you the closest date that’s less than what we are looking for and the closest date that’s greater than what we are looking for.
I then calculate the date difference between those two limits to see which has smaller difference, and return the appropriate limit value as ClosestDate.
SELECT
CASE WHEN DATEDIFF(ns,lowerlimit.date, LYDate.date) < DATEDIFF(ns, upperlimit.date, LYDATE.date) THEN lowerlimit.date ELSE upperlimit.date END as ClosestDate
FROM LYDate
OUTER APPLY (SELECT TOP 1 DATE FROM DATE WHERE DATE.DATE < LYDATE.DATE ORDER BY DATE.DATE DESC) lowerlimit
OUTER APPLY (SELECT TOP 1 DATE FROM DATE WHERE DATE.DATE > LYDATE.DATE ORDER BY DATE.DATE ASC) upperlimit

Related

adjust recursive sql query to exclude holidays and weekends

I have a dataset like this called data_per_day
instructional_day
points
2023-01-24
2
2023-01-23
2
2023-01-20
1
2023-01-19
0
and so on. the table shows weekdays (days minus holidays and weekends) and the number of points someone has earned. 1 is the start of a streak and 0 is the end of a streak. 2 is max points after a streak has started.
I need to find how long is the latest streak. so in this case the result should be 3
I created a recursive cte but the query returns 2 as the streak count because i'm using lag mechanism with days. instead I need to adjust so that the instructional days are used rather than all dates.
RECURSIVE cte AS (
SELECT
student_unique_id,
instructional_day,
points,
1 AS cnt
FROM
`data_per_day`
WHERE
instructional_day = DATE_ADD(CURRENT_DATE('America/Chicago'), INTERVAL -1 DAY)
UNION ALL
SELECT
a.student_unique_id,
a.instructional_day,
a.points,
c.cnt+1
FROM (
SELECT
*
FROM
`data_per_day`
WHERE
points > 0 ) a
INNER JOIN
cte c
ON
a.student_unique_id = c.student_unique_id
AND a.instructional_day = c.instructional_day - INTERVAL '1' day )
SELECT
student_unique_id,
MAX(cnt) AS streak
FROM
cte --
WHERE
student_unique_id = "419"
GROUP BY
student_unique_id
How do I adjust the query?

This is not a trivial coding exercise, so I won't actually write the code and provide it.
What you have here is a gaps and islands question. You want to identify the largest "island" of days with points within a date range. Depending upon what dates are contained in your data, you may need to generate a list of sequential dates that meet your criteria.
One problem I see is that you are trying to combine the steps to generate the date range (the recursive CTE) with the points. You'll need to separate those steps.
Define the date range.
Generate the dates within the range.
Filter the dates with isweekday = 'no' and isholiday = 'no'. You will probably want to add a row number during this step.
[left] join the dates to your data, including coalesce(points, 0)
Filter the data to points > 0.
Identify the islands.
Identify the largest island per student.

Query to return the matching or nearest previous record by a list of dates

I have a table of records ordered by date. There is a maximum of 1 record per day, but some days there is no record (weekends and bank holidays).
When I query a record by date, if no record exists for that day I am interested in the previous record by date. Eg:
SELECT * FROM rates WHERE date <= $mydate ORDER BY date DESC LIMIT 1;
Given a list of dates, how would I construct a query to return multiple records matching the exact or closest previous record for each date? Is this possible to achieve in a single query?
The array of dates may be spread over a large time frame but I wouldn't necessarily want every record in the entire time span (eg query 20 dates spread over a year long time span).

You can construct the dates as a derived table and then use SQL logic. A lateral join is convenient:
select v.dte, r.*
from (values ($date1), ($date2), ($date3)
) v(dte) left join lateral
(select r.*
from rates r
where r.date <= v.dte
order by r.date desc
limit 1
) r
on 1=1;
You might find it useful to use an array to pass in the dates using an array and using unnest() on that array.

SQL: select datetime values prior to that date based on it's value

I want to select rows for a field MRD which is declared as date where it is prior for that date only.
So
(case when sum (transPoints) > 4 and MRD is that same date then 4
So if a row has a date of today, I want the case when to be triggered when the transaction points are bigger than 4 against all columns with the same date.
As you can imagine the date field will be different against many rows.

Based on what I can understand from your question, it seems that the GROUP BY clause may be what you're looking for. If your date column is in the correct format then you may have to use something like:
SELECT CAST(DateColumn as DATE)
FROM YourTable
GROUP BY CAST(DateColumn as DATE)

How to calculate difference between two rows in a date interval?

I'm trying to compare data from an Access 2010 database based on a date interval. Example I have items from various purchase orders and I want to maintain the history of these item's delivery to a warehouse. So my purchase order has a request for a quantity of 10 of a material, for example, and it can be partially delivered in many deliveries and I want to know how this delivery varied in a date interval. To fill the date field the criteria used is the following: if the item had an update in the QtyPending field, I copy the current row deactivating it with a booelan field, create a new entry with the current update date updating the QtyPending field, so the active record is the actual state of the item. So I have a table that holds informations about these items like that
PO POItem QtyPending Date Active
4500000123 10 10 01/09/2014 FALSE
4500000123 10 8 05/09/2014 TRUE
4500000122 30 5 03/09/2014 FALSE
4500000122 30 1 04/09/2014 TRUE
With this example, for the first item, it means that from date 01/09 to 04/09 the QtyPending field didn't suffer a variation, meaning that the supplier didn't make any delivery to me, but from 01/09 to 05/08 he delivered me a qty of 2 of a material. For the second one, from date 03/09 to 04/09 the supplier delivered me a qty of 4 of a material. So, if I were to be making a report query from 02/09/2014 to 04/09/2014, the expected output is like this:
PO POItem QtyDelivered
4500000123 10 0
4500000122 30 4
And a report from 31/08/2014 to 10/09/2014, would have this output
PO POItem QtyDelivered
4500000123 10 2
4500000122 30 4
I'm not coming up with a query to make this report. Can anyone help me?

There are many ways of solving this. The easiest one would be to simply make a query of all the necessary records between two dates, loop over them and insert into a temporary table the result. This temporary table can then be the source of your report. A lot of people will scream at you for not using a big query instead but getting the result that you want in the fastest and simplest way should be your priority.
Your problem with your schema is that you don't have the QtyDelivered stored for each record. If you would have it, it would be an easy thing to sum over it in order to get needed result. By not storing this value, you have transformed a simple and fast query into a much harder and slower one because you need to recalculate this value in some way or other and you must do this without forgetting the fact that it's possible to have more than two records.
For calculating this value, you can either use a sub-query to retrieve the value from the previous row or a Left join do to the same. Once you have this value, you can subtract these two to get the needed difference; allowing for the possibility of Null value if there is no previous row. Once you have these values, you can now sum over them to get the final result with a Group By. Notice that in order to perform these calculations, you need to have one or two more levels of subquery. The first query should be something like:
Select PO, POItem, QtyPending, (Select Top 1 QtyPending from MyTable T2 where T1.PO = T2.PO and T2.Date < T1.Date And (T2.Date between #Date1 and #Date2) Order by T2.Date Desc) as QtyPending2 from MyTable T1 Where T1.Date between #Date1 and #Date2) ...
With this as either another subquery or as a View, you can then compute the desired difference by comparing the values of QtyPending and QtyPending2; without forgetting that QtyPendin2 may be Null. The remaining steps are easy to do.
Notice that the above example is for SQL-Server, you might have to change it a little for Access. In any case, you can find here many examples on how to compare two rows under Access. As noted earlier, you can also use a Left Join instead of a subquery to compare your rows.

I came up with this query that solved the problem, it wasn't that simple
SELECT
ItmDtIni.PO
,ItmDtIni.POItem AS [PO Item]
,ROUND(ItmDtIni.QtyPending - ItmDtEnd.QtyPending, 3) AS [Qty Delivered]
,ROUND((ItmDtIni.QtyPending - ItmDtEnd.QtyPending) * ItmDtEnd.Price, 2) AS [Value delivered(US$)]
//Filtering subqueries to bring only the items in the date interval to make a self join
FROM (((SELECT
PO
,POItem
,QtyPending
,MIN(Date) AS MinDate
FROM Item
WHERE Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy')
GROUP BY
PO
,POItem
,QtyPending) AS ItmDtIni
//Self join filtering to bring only items in the date interval with the previously filtered table
INNER JOIN (SELECT
PO
,POItem
,QtyPending
,Price
,MAX(Date) AS MaxDate
FROM Item
WHERE Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy')
GROUP BY
PO
,POItem
,QtyPending
,Price) AS ItmDtEnd
ON ItmDtIni.PO = ItmDtEnd.PO
AND ItmDtIni.POItem = ItmDtEnd.POItem)
INNER JOIN PO
ON ItmDtEnd.PO = PO.Numero)
WHERE
//Showing only items that had a variation in the date interval
ROUND(ItmDtIni.QtyPending - ItmDtEnd.QtyPending, 3) <> 0
//Anchoring min date in the interval for each item found by the first subquery
AND ItmDtIni.MinDate = (SELECT MIN(Item.Date)
FROM Item
WHERE
ItmDtIni.PO = Item.PO
AND ItmDtIni.POItem = Item.POItem
AND Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy'))
//Anchoring max date in the interval for each item found by the second subquery
AND ItmDtEnd.MaxDate = (SELECT MAX(Item.Date)
FROM Item
WHERE
ItmDtEnd.PO = Item.PO
AND ItmDtEnd.POItem = Item.POItem
AND Date BETWEEN FORMAT(begin_date, 'dd/mm/yyyy') AND FORMAT(end_date, 'dd/mm/yyyy'))

Find closest date in SQL Server

I have a table dbo.X with DateTime column Y which may have hundreds of records.
My Stored Procedure has parameter #CurrentDate, I want to find out the date in the column Y in above table dbo.X which is less than and closest to #CurrentDate.
How to find it?

The where clause will match all rows with date less than #CurrentDate and, since they are ordered descendantly, the TOP 1 will be the closest date to the current date.
SELECT TOP 1 *
FROM x
WHERE x.date < #CurrentDate
ORDER BY x.date DESC

Use DateDiff and order your result by how many days or seconds are between that date and what the Input was
Something like this
select top 1 rowId, dateCol, datediff(second, #CurrentDate, dateCol) as SecondsBetweenDates
from myTable
where dateCol < #currentDate
order by datediff(second, #CurrentDate, dateCol)

I have a better solution for this problem i think.
I will show a few images to support and explain the final solution.
Background
In my solution I have a table of FX Rates. These represent market rates for different currencies. However, our service provider has had a problem with the rate feed and as such some rates have zero values. I want to fill the missing data with rates for that same currency that as closest in time to the missing rate. Basically I want to get the RateId for the nearest non zero rate which I will then substitute. (This is not shown here in my example.)
1) So to start off lets identify the missing rates information:
Query showing my missing rates i.e. have a rate value of zero
2) Next lets identify rates that are not missing.
Query showing rates that are not missing
3) This query is where the magic happens. I have made an assumption here which can be removed but was added to improve the efficiency/performance of the query. The assumption on line 26 is that I expect to find a substitute transaction on the same day as that of the missing / zero transaction.
The magic happens is line 23: The Row_Number function adds an auto number starting at 1 for the shortest time difference between the missing and non missing transaction. The next closest transaction has a rownum of 2 etc.
Please note that in line 25 I must join the currencies so that I do not mismatch the currency types. That is I don't want to substitute a AUD currency with CHF values. I want the closest matching currencies.
Combining the two data sets with a row_number to identify nearest transaction
4) Finally, lets get data where the RowNum is 1
The final query
The query full query is as follows;
; with cte_zero_rates as
(
Select *
from fxrates
where (spot_exp = 0 or spot_exp = 0)
),
cte_non_zero_rates as
(
Select *
from fxrates
where (spot_exp > 0 and spot_exp > 0)
)
,cte_Nearest_Transaction as
(
select z.FXRatesID as Zero_FXRatesID
,z.importDate as Zero_importDate
,z.currency as Zero_Currency
,nz.currency as NonZero_Currency
,nz.FXRatesID as NonZero_FXRatesID
,nz.spot_imp
,nz.importDate as NonZero_importDate
,DATEDIFF(ss, z.importDate, nz.importDate) as TimeDifferece
,ROW_NUMBER() Over(partition by z.FXRatesID order by abs(DATEDIFF(ss, z.importDate, nz.importDate)) asc) as RowNum
from cte_zero_rates z
left join cte_non_zero_rates nz on nz.currency = z.currency
and cast(nz.importDate as date) = cast(z.importDate as date)
--order by z.currency desc, z.importDate desc
)
select n.Zero_FXRatesID
,n.Zero_Currency
,n.Zero_importDate
,n.NonZero_importDate
,DATEDIFF(s, n.NonZero_importDate,n.Zero_importDate) as Delay_In_Seconds
,n.NonZero_Currency
,n.NonZero_FXRatesID
from cte_Nearest_Transaction n
where n.RowNum = 1
and n.NonZero_FXRatesID is not null
order by n.Zero_Currency, n.NonZero_importDate

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas