Query to return the matching or nearest previous record by a list of dates - sql

I have a table of records ordered by date. There is a maximum of 1 record per day, but some days there is no record (weekends and bank holidays).
When I query a record by date, if no record exists for that day I am interested in the previous record by date. Eg:
SELECT * FROM rates WHERE date <= $mydate ORDER BY date DESC LIMIT 1;
Given a list of dates, how would I construct a query to return multiple records matching the exact or closest previous record for each date? Is this possible to achieve in a single query?
The array of dates may be spread over a large time frame but I wouldn't necessarily want every record in the entire time span (eg query 20 dates spread over a year long time span).

You can construct the dates as a derived table and then use SQL logic. A lateral join is convenient:
select v.dte, r.*
from (values ($date1), ($date2), ($date3)
) v(dte) left join lateral
(select r.*
from rates r
where r.date <= v.dte
order by r.date desc
limit 1
) r
on 1=1;
You might find it useful to use an array to pass in the dates using an array and using unnest() on that array.

Related

adjust recursive sql query to exclude holidays and weekends

I have a dataset like this called data_per_day
instructional_day
points
2023-01-24
2
2023-01-23
2
2023-01-20
1
2023-01-19
0
and so on. the table shows weekdays (days minus holidays and weekends) and the number of points someone has earned. 1 is the start of a streak and 0 is the end of a streak. 2 is max points after a streak has started.
I need to find how long is the latest streak. so in this case the result should be 3
I created a recursive cte but the query returns 2 as the streak count because i'm using lag mechanism with days. instead I need to adjust so that the instructional days are used rather than all dates.
RECURSIVE cte AS (
SELECT
student_unique_id,
instructional_day,
points,
1 AS cnt
FROM
`data_per_day`
WHERE
instructional_day = DATE_ADD(CURRENT_DATE('America/Chicago'), INTERVAL -1 DAY)
UNION ALL
SELECT
a.student_unique_id,
a.instructional_day,
a.points,
c.cnt+1
FROM (
SELECT
*
FROM
`data_per_day`
WHERE
points > 0 ) a
INNER JOIN
cte c
ON
a.student_unique_id = c.student_unique_id
AND a.instructional_day = c.instructional_day - INTERVAL '1' day )
SELECT
student_unique_id,
MAX(cnt) AS streak
FROM
cte --
WHERE
student_unique_id = "419"
GROUP BY
student_unique_id
How do I adjust the query?
This is not a trivial coding exercise, so I won't actually write the code and provide it.
What you have here is a gaps and islands question. You want to identify the largest "island" of days with points within a date range. Depending upon what dates are contained in your data, you may need to generate a list of sequential dates that meet your criteria.
One problem I see is that you are trying to combine the steps to generate the date range (the recursive CTE) with the points. You'll need to separate those steps.
Define the date range.
Generate the dates within the range.
Filter the dates with isweekday = 'no' and isholiday = 'no'. You will probably want to add a row number during this step.
[left] join the dates to your data, including coalesce(points, 0)
Filter the data to points > 0.
Identify the islands.
Identify the largest island per student.

Get a Row if within certain time period of other row

I have a SQL statement that I am currently using to return a number of rows from a database:
SELECT
as1.AssetTagID, as1.TagID, as1.CategoryID,
as1.Description, as1.HomeLocationID, as1.ParentAssetTagID
FROM Assets AS as1
INNER JOIN AssetsReads AS ar ON as1.AssetTagID = ar.AssetTagID
WHERE
(ar.ReadPointLocationID='Readpoint1' OR ar.ReadPointLocationID='Readpoint2')
AND (ar.DateScanned between 'LastScan' AND 'Now')
AND as1.TagID!='000000000000000000000000'
I am wanting to do a query that will get the row with the oldest DateScanned from this query and also get another row from the database if there was one that was within a certain period of time from this row (say 5 seconds for an example). The oldest record would be relatively simple by selecting the first record in a descending sort, but how would I also get the second record if it was within a certain time period of the first?
I know I could do this process with multiple queries, but is there any way to combine this process into one query?
The database that I am using is SQL Server 2008 R2.
Also please note that the DateScanned times are just placeholders and I am taking care of that in the application that will be using this query.
Here is a fairly general way to approach it. Get the oldest scan date using min() as a window function, then use date arithmetic to get any rows you want:
select t.* -- or whatever fields you want
from (SELECT as1.AssetTagID, as1.TagID, as1.CategoryID,
as1.Description, as1.HomeLocationID, as1.ParentAssetTagID,
min(DateScanned) over () as minDateScanned, DateScanned
FROM Assets AS as1
INNER JOIN AssetsReads AS ar ON as1.AssetTagID = ar.AssetTagID
WHERE (ar.ReadPointLocationID='Readpoint1' OR ar.ReadPointLocationID='Readpoint2')
AND (ar.DateScanned between 'LastScan' AND 'Now')
AND as1.TagID!='000000000000000000000000'
) t
where datediff(second, minDateScanned, DateScanned) <= 5;
I am not really sure of sql server syntax, but you can do something like this
SELECT * FROM (
SELECT
TOP 2
as1.AssetTagID,
as1.TagID,
as1.CategoryID,
as1.Description,
as1.HomeLocationID,
as1.ParentAssetTagID ,
ar.DateScanned,
LAG(ar.DateScanned) OVER (order by ar.DateScanned desc) AS lagging
FROM
Assets AS as1
INNER JOIN AssetsReads AS ar
ON as1.AssetTagID = ar.AssetTagID
WHERE (ar.ReadPointLocationID='Readpoint1' OR ar.ReadPointLocationID='Readpoint2')
AND (ar.DateScanned between 'LastScan' AND 'Now')
AND as1.TagID!='000000000000000000000000'
ORDER BY
ar.DateScanned DESC
)
WHERE
lagging IS NULL or DateScanned - lagging < '5 SECONDS'
I have tried to sort the results by DateScanned desc and then just the top most 2 rows. I have then used the lag() function on DateScanned field, to get the DateScanned value for the previous row. For the topmost row the DateScanned shall be null as its the first record, but for the second one it shall be value of the first row. You can then compare both of these values to determine whether you wish to display the second row or not
more info on the lagging function: http://blog.sqlauthority.com/2011/11/15/sql-server-introduction-to-lead-and-lag-analytic-functions-introduced-in-sql-server-2012/

How to have GROUP BY and COUNT include zero sums?

I have SQL like this (where $ytoday is 5 days ago):
$sql = 'SELECT Count(*), created_at FROM People WHERE created_at >= "'. $ytoday .'" AND GROUP BY DATE(created_at)';
I want this to return a value for every day, so it would return 5 results in this case (5 days ago until today).
But say Count(*) is 0 for yesterday, instead of returning a zero it doesn't return any data at all for that date.
How can I change that SQLite query so it also returns data that has a count of 0?
Without convoluted (in my opinion) queries, your output data-set won't include dates that don't exist in your input data-set. This means that you need a data-set with the 5 days to join on to.
The simple version would be to create a table with the 5 dates, and join on that. I typically create and keep (effectively caching) a calendar table with every date I could ever need. (Such as from 1900-01-01 to 2099-12-31.)
SELECT
calendar.calendar_date,
Count(People.created_at)
FROM
Calendar
LEFT JOIN
People
ON Calendar.calendar_date = People.created_at
WHERE
Calendar.calendar_date >= '2012-05-01'
GROUP BY
Calendar.calendar_date
You'll need to left join against a list of dates. You can either create a table with the dates you need in it, or you can take the dynamic approach I outlined here:
generate days from date range

SQL Difference between using two dates

My database has a table called 'clientordermas'. Two columns of that table are as follows.
execid and filledqty.
Three records of those two fields are as follows.
E02011/03/12-05:57_24384 : 1000
E02011/03/12-05:57_24384 : 800
E02011/03/09-05:57_24384 : 600
What i need to do is get the filledqty diffrence btween latest date and 1 before latest date which is 400(1000-400).
I have extracted the date from the execid as follows:
(SUBSTR (execid, 3, 10)
I tried so hard but but I was unable to write the sql query to get 400. Can someone please help me to do this???
P.S I need to select maximum filled quantity from the same date. That is 1000 not, 800.
You can use window functions to access "nearby" rows, so if you first clean up the data in a subquery and then use window functions to access the next row, you should get the right results. But unless you have an index on substr(execid, 3, 10), this is going to be be slow.
WITH datevalues AS
(
SELECT max(filledqty) maxfilledqty, substr(execid, 3, 10) execiddate
FROM clientordermas
GROUP BY substr(execid, 3, 10)
)
SELECT
execiddate,
maxfilledqty -
last_value(maxfilledqty) over(ORDER BY execiddate DESC ROWS BETWEEN 0 PRECEDING AND 1 FOLLOWING)
FROM datevalues
ORDER BY execiddate DESC;
WITH maxqtys AS (
SELECT substr(a.execid,3,10) AS date, MAX(a.filledqty) AS maxqty
FROM clientordermas a
GROUP BY substr(a.execid,3,10)
)
SELECT a.maxqty-b.maxqty
FROM maxqtys a, maxqtys b
WHERE a.date <> b.date AND ROWNUM=1
ORDER BY a.date DESC, b.date DESC
This first creates a subquery (maxqty) which contains the max filledqty for each unique date, then cross join this subquery to itself, excluding the same rows. This results in a table containing pairs of dates (excluding the same dates).
Sort these pairs by date descending, and the top row till contain the last and 2nd-to-last date, with the appropriate quantities.

MySQL to get the count of rows that fall on a date for each day of a month

I have a table that contains a list of community events with columns for the days the event starts and ends. If the end date is 0 then the event occurs only on the start day. I have a query that returns the number of events happening on any given day:
SELECT COUNT(*) FROM p_community e WHERE
(TO_DAYS(e.date_ends)=0 AND DATE(e.date_starts)=DATE('2009-05-13')) OR
(DATE('2009-05-13')>=DATE(e.date_starts) AND DATE('2009-05-13')<=DATE(e.date_ends))
I just sub in any date I want to test for "2009-05-13".
I need to be be able to fetch this data for every day in an entire month. I could just run the query against each day one at a time, but I'd rather run one query that can give me the entire month at once. Does anyone have any suggestions on how I might do that?
And no, I can't use a stored procedure.
Try:
SELECT COUNT(*), DATE(date) FROM table WHERE DATE(dtCreatedAt) >= DATE('2009-03-01') AND DATE(dtCreatedAt) <= DATE('2009-03-10') GROUP BY DATE(date);
This would get the amount for each day in may 2009.
UPDATED: Now works on a range of dates spanning months/years.
Unfortunately, MySQL lacks a way to generate a rowset of given number of rows.
You can create a helper table:
CREATE TABLE t_day (day INT NOT NULL PRIMARY KEY)
INSERT
INTO t_day (day)
VALUES (1),
(2),
…,
(31)
and use it in a JOIN:
SELECT day, COUNT(*)
FROM t_day
JOIN p_community e
ON day BETWEEN DATE(e.start) AND IF(DATE(e.end), DATE(e.end), DATE(e.start))
GROUP BY
day
Or you may use an ugly subquery:
SELECT day, COUNT(*)
FROM (
SELECT 1 AS day
UNION ALL
SELECT 2 AS day
…
UNION ALL
SELECT 31 AS day
) t_day
JOIN p_community e
ON day BETWEEN DATE(e.start) AND IF(DATE(e.end), DATE(e.end), DATE(e.start))
GROUP BY
day