Find the latest record - sql

In MS Access I have the DateList table, which holds the due date of different orders. Thus, the table has two columns: OrderNo and DueDate. For some order numbers, there could be multiple DueDates. The table could look like below:
OrderNo DueDate
100 12/9/2021
101 20/9/2021
102 30/9/2021
100 7/10/2021
102 11/10/2021
103 15/10/2021
…
My goal is to write a query that fetches the latest DueDate of each OrderNo.
I created two queries:
the first one, qry1, to generate a list of OrderNo values without duplicates:
SELECT
DateList.OrderNo AS UniqOrderNo
FROM DateList
GROUP BY DateList.OrderNo;
in the second query, qry2, I used the DMax function in order to search through DueDates of each order for the maximum value.
SELECT
qry1.UniqOrderNo
,DMax("[DueDate]","[DateList]","[OrderNo]='[qry1]![UniqOrderNo]'") AS LatDuDate
FROM qry1
INNER JOIN DateList
ON qry1.UniqOrderNo = DateList.OrderNo;
LatDuDate represents the latest DueDate of the Order.
Unfortunately, the query does not work and returns nothing.
Now my questions:
Is there something wrong with my approach / queries?
Is there a better way to accomplish this task in MS Access?

You almost figured it out yourself. Max returns the largest value in each group.
SELECT Max(DueDate) DueDate, OrderNo
FROM DateList
GROUP BY OrderNo
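As a rough check of the GROUP BY / Max approach, here is a minimal sketch that runs the same query against the sample rows from the question. It uses Python's sqlite3 module as a stand-in for Access, and it assumes the dates are stored as ISO text (YYYY-MM-DD) so that MAX compares them chronologically; both are assumptions made for illustration only.

```python
import sqlite3

# In-memory stand-in for the Access table. Dates are stored as ISO text
# (an assumption) so that MAX() orders them chronologically.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE DateList (OrderNo INTEGER, DueDate TEXT)")
conn.executemany(
    "INSERT INTO DateList VALUES (?, ?)",
    [
        (100, "2021-09-12"),
        (101, "2021-09-20"),
        (102, "2021-09-30"),
        (100, "2021-10-07"),
        (102, "2021-10-11"),
        (103, "2021-10-15"),
    ],
)

# One row per order, carrying that order's latest due date.
latest = conn.execute(
    """
    SELECT OrderNo, MAX(DueDate) AS DueDate
    FROM DateList
    GROUP BY OrderNo
    ORDER BY OrderNo
    """
).fetchall()

for order_no, due_date in latest:
    print(order_no, due_date)
```

Orders 100 and 102 each appear twice in the input but only once in the result, with their later date.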

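Similar to Christian's answer. For columns that hold a single value per group, you can select First() instead of adding them to GROUP BY - it performs better. **
Of course it depends on the number of records the table holds. Note that OrderNo is not unique in DateList, so without a GROUP BY the query below returns a single row (one OrderNo together with the overall latest DueDate) rather than one row per order.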
SELECT First(OrderNo) AS OrderNo, Max(DueDate) AS DueDate
FROM DateList;
** Source: Allen Browne - Optimizing queries

Related

Sum Column for Running Total where Overlapping Date

I have a table with about 3 million rows of Customer Sales by Date.
For each CustomerID row I need to get the sum of the Spend_Value
WHERE Order_Date BETWEEN Order_Date_m365 AND Order_Date
Order_Date_m365 = Order_Date minus 365 days.
I just tried a self join but of course, this gave the wrong results due to rows overlapping dates.
If there is a way with Window Functions this would be ideal but I tried and can't do the between dates in the function, unless I missed a way.
The only way I can think of now is to loop: process all rank 1 rows into a table, then all rank 2 rows, etc., but this will be really inefficient on 3 million rows.
Any ideas on how this is usually handled in SQL?
SELECT CustomerID,Order_Date_m365,Order_Date,Spend_Value
FROM dbo.CustomerSales
Window functions likely won't help you here, so you are going to need to reference the table again. I would suggest that you use an APPLY with a subquery to do this. Provided you have relevant indexes then this will likely be the more efficient approach:
SELECT CS.CustomerID,
CS.Order_Date_m365,
CS.Order_Date,
CS.Spend_Value,
o.output
FROM dbo.CustomerSales CS
CROSS APPLY (SELECT SUM(Spend_Value) AS output
FROM dbo.CustomerSales ca
WHERE ca.CustomerID = CS.CustomerID
AND ca.Order_Date >= CS.Order_Date_m365 -- Or should this be >?
AND ca.Order_Date <= CS.Order_Date) o;
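To see the trailing-365-day sum on a few rows, here is a small sketch using Python's sqlite3 module. SQLite has no CROSS APPLY, so the sketch swaps in an equivalent correlated scalar subquery; the sample rows, the index name, and the output column name rolling_spend are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE CustomerSales (
        CustomerID INTEGER,
        Order_Date_m365 TEXT,
        Order_Date TEXT,
        Spend_Value INTEGER
    )
    """
)
# Hypothetical index supporting the per-customer date-range lookup.
conn.execute(
    "CREATE INDEX ix_customer_date ON CustomerSales (CustomerID, Order_Date)"
)

# Hypothetical sample rows; Order_Date_m365 is derived as Order_Date - 365 days.
sales = [
    (1, "2020-01-10", 100),
    (1, "2020-06-01", 50),
    (1, "2021-03-01", 25),
    (2, "2020-02-01", 10),
]
conn.executemany(
    "INSERT INTO CustomerSales VALUES (?, date(?, '-365 days'), ?, ?)",
    [(cid, d, d, value) for cid, d, value in sales],
)

# Correlated subquery standing in for CROSS APPLY: for each row, sum the
# same customer's spend within [Order_Date_m365, Order_Date].
rows = conn.execute(
    """
    SELECT cs.CustomerID, cs.Order_Date, cs.Spend_Value,
           (SELECT SUM(ca.Spend_Value)
            FROM CustomerSales AS ca
            WHERE ca.CustomerID = cs.CustomerID
              AND ca.Order_Date >= cs.Order_Date_m365
              AND ca.Order_Date <= cs.Order_Date) AS rolling_spend
    FROM CustomerSales AS cs
    ORDER BY cs.CustomerID, cs.Order_Date
    """
).fetchall()

for row in rows:
    print(row)
```

The third row for customer 1 no longer includes the 2020-01-10 order, because that order falls outside its 365-day window.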

Ignore duplicates in results of a select statement based upon secondary column

Sorry if the title is a bit confusing, this is my first time posting.
Essentially, I have a table called roombooking, where a room has a room number(r_no), a bookingref (b_ref) and a checkin and checkout date (checkin and checkout respectively). Due to multiple different b_refs, an r_no appears in the table multiple times, with varying checkin and checkout dates.
What I want is to select all r_nos where checkin != "dateX", displaying only rooms for which neither the row itself nor any of its duplicates contains "dateX" in the checkin column.
To provide an example data:
R_NO B_REF CHECKIN
101 999 2019-09
101 998 2019-08
102 997 2019-07
What I essentially want to see when I run my SQL statement (where dateX = 2019-09) is only 102: although 101 (b_ref 998) has a different checkin date, its duplicate has 2019-09 in the checkin column, so neither row for 101 should appear in the result.
For those wondering, my current SQL is:
SELECT DISTINCT r_no
from roombooking
where checkin != '2019-09';
However (using the example data) this would return both 101 and 102 as results, which I don't want.
Hopefully, this is clear, and again I apologize if not, it's my first time posting.
Break it down into 2 conditions to apply as your filters -
checkin should not equal the specified date
r_no should not be the same r_no in rows where checkin is equal to the specified date
For example,
SELECT DISTINCT r_no FROM roombooking
WHERE
checkin != '2019-09' AND
r_no NOT IN (SELECT DISTINCT r_no FROM roombooking WHERE checkin = '2019-09')
There are multiple ways to achieve this, depending on your use-case and data size. A few options are
Use a sub-query to select duplicate rooms and eliminate them in your main query, as shown above
Use a CTE to select duplicate rooms and eliminate them in your main query by joining with the CTE
Self join on the same table to eliminate duplicate rooms
As far as I understand your requirements, an easy way to go is
select distinct r_no from roombooking r1
where not exists (
select * from roombooking r2
where r1.r_no = r2.r_no
and r2.checkin = '2019-09'
)
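For comparison, here is a small sketch that runs both the NOT IN and the NOT EXISTS variants against the example rows from the question, using Python's sqlite3 module as a convenient test bed (the choice of SQLite and the variable names are assumptions for illustration).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE roombooking (r_no INTEGER, b_ref INTEGER, checkin TEXT)"
)
conn.executemany(
    "INSERT INTO roombooking VALUES (?, ?, ?)",
    [
        (101, 999, "2019-09"),
        (101, 998, "2019-08"),
        (102, 997, "2019-07"),
    ],
)

date_x = "2019-09"

# Variant 1: exclude every room that has any booking on date_x.
not_in_rows = conn.execute(
    """
    SELECT DISTINCT r_no
    FROM roombooking
    WHERE r_no NOT IN (SELECT r_no FROM roombooking WHERE checkin = ?)
    ORDER BY r_no
    """,
    (date_x,),
).fetchall()

# Variant 2: keep a room only if no booking for it exists on date_x.
not_exists_rows = conn.execute(
    """
    SELECT DISTINCT r1.r_no
    FROM roombooking AS r1
    WHERE NOT EXISTS (
        SELECT 1
        FROM roombooking AS r2
        WHERE r2.r_no = r1.r_no
          AND r2.checkin = ?
    )
    ORDER BY r1.r_no
    """,
    (date_x,),
).fetchall()

print(not_in_rows)
print(not_exists_rows)
```

Both variants return only room 102 here. One difference worth knowing: if the NOT IN subquery can yield a NULL r_no, NOT IN returns no rows at all, whereas NOT EXISTS is unaffected.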

Given a single column of effective dates, is there a SQL statement that can transform that into date ranges?

Similar to another question I've posted, given the following table...
Promo EffectiveDate
------ -------------
PromoA 1/1/2016
PromoB 4/1/2016
PromoC 7/1/2016
PromoD 10/1/2016
PromoE 1/1/2017
What is the easiest way to transform it into start and end dates, like so...
Promo StartDate EndDate
------ --------- ---------
PromoA 1/1/2016 4/1/2016
PromoB 4/1/2016 7/1/2016
PromoC 7/1/2016 10/1/2016
PromoD 10/1/2016 1/1/2017
PromoE 1/1/2017 null (ongoing until a new Effective Date is added)
Update
Correlated queries seem to be the simplest solution, but as I understand it, they are extremely inefficient since the subquery has to run once per row of the outer select.
What I was thinking as a potential solution was something along the lines of selecting the values from the table a second time, but eliminating the first result, then pairing them up with the first select by ordinal index with a simple outer left join.
As an example, substituting letters for dates above, the first select would be like A,B,C,D,E and second would be B,C,D,E (which is the first select minus the first record 'A') then pairing them up by ordinal index with a simple outer left join, resulting in A-B, B-C, C-D, D-E, E-null. However I couldn't figure out the syntax to make that work.
A correlated sub-query can lookup the additional field you need.
SELECT
yourTable.*,
(
SELECT MIN(lookup.EffectiveDate)
FROM yourTable AS lookup
WHERE lookup.EffectiveDate > yourTable.EffectiveDate
)
FROM
yourTable
EDIT
The notion of "has to run once per row" is a misunderstanding of how SQL generates the execution plan that actually runs. The same could be said of joining one table to another: the join has to be run at least once per row... There is indeed a larger cost to a correlated sub-query, but with appropriate indexes it won't be "extremely high", and the functionality described does warrant it.
If you had another field that was guaranteed to be sequential, then it would be trivial, but do not try to re-use the existing Promo field for that additional purpose.
SELECT
this.*,
next.EffectiveDate AS EndDate
FROM
yourTable this
LEFT JOIN
yourTable next
ON next.sequential_id = this.sequential_id + 1
Yes, you can use a correlated query with LIMIT:
SELECT t.promo,t.effectiveDate as start_date,
(SELECT s.effectiveDate FROM YourTable s
WHERE s.effectiveDate > t.effectiveDate
ORDER BY s.effectiveDate
LIMIT 1) as end_date
FROM YourTable t
EDIT: Here is a solution with a join :
SELECT t.promo,t.effectiveDate as start_date,
MIN(s.effectiveDate) as end_date
FROM YourTable t
LEFT JOIN YourTable s
ON(t.effectiveDate < s.effectiveDate)
GROUP BY t.promo,t.effectiveDate
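Here is a small sketch of the join-plus-MIN idea on the sample promos, run through Python's sqlite3 module. The table name Promos and the ISO date format are assumptions for illustration. For comparison it also runs a LEAD() window-function query, a different technique that expresses the "pair each row with the next one" idea directly; it needs an engine with window functions (SQLite 3.25 or later in this sketch).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; dates stored as ISO text so they sort correctly.
conn.execute("CREATE TABLE Promos (Promo TEXT, EffectiveDate TEXT)")
conn.executemany(
    "INSERT INTO Promos VALUES (?, ?)",
    [
        ("PromoA", "2016-01-01"),
        ("PromoB", "2016-04-01"),
        ("PromoC", "2016-07-01"),
        ("PromoD", "2016-10-01"),
        ("PromoE", "2017-01-01"),
    ],
)

# Self-join: the end date is the earliest effective date after this one.
join_rows = conn.execute(
    """
    SELECT t.Promo, t.EffectiveDate AS StartDate,
           MIN(s.EffectiveDate) AS EndDate
    FROM Promos AS t
    LEFT JOIN Promos AS s ON s.EffectiveDate > t.EffectiveDate
    GROUP BY t.Promo, t.EffectiveDate
    ORDER BY t.EffectiveDate
    """
).fetchall()

# Window-function alternative: LEAD() reads the next row's date.
lead_rows = conn.execute(
    """
    SELECT Promo, EffectiveDate AS StartDate,
           LEAD(EffectiveDate) OVER (ORDER BY EffectiveDate) AS EndDate
    FROM Promos
    ORDER BY EffectiveDate
    """
).fetchall()

for row in join_rows:
    print(row)
```

Both queries give the same pairs, with a NULL (None in Python) end date for PromoE, the promo that is still ongoing.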
Try this, using a subquery:
select
p.promo,
p.EffectiveDate as "Start",
(select n.EffectiveDate from table_promo n where n.EffectiveDate >
p.EffectiveDate order by n.EffectiveDate limit 1) as "End"
from table_promo p

SQL statement to match dates that are the closest?

I have the following table, let's call it Names:
Name Id Date
Dirk 1 27-01-2015
Jan 2 31-01-2015
Thomas 3 21-02-2015
Next I have another table called Consumption:
Id Date Consumption
1 26-01-2015 30
1 01-01-2015 20
2 01-01-2015 10
2 05-05-2015 20
The table contains about 1.5 million rows, so I think doing this in SQL will be the fastest.
The problem is as follows: I would like to match each Id from the Names table with the row in the Consumption table whose date is closest, i.e. where the difference between the two dates is the smallest. For example, Dirk's date is 27-01-2015 and the closest consumption date is 26-01-2015, so his consumption is 30. If two dates have the same difference, I would like to calculate the average consumption on those two dates.
While I know how to join, I do not know how to code the difference part.
Thanks.
DBMS is Microsoft SQL Server 2012.
I believe that my question differs from the one mentioned in the comments, because it is much more complicated since it involves comparison of dates between two tables rather than having one date and comparing it with the rest of the dates in the table.
This is how you could do it in SQL Server:
SELECT Id, Name, AVG(Consumption)
FROM (
SELECT n.Id, Name, Consumption,
RANK() OVER (PARTITION BY n.Id
ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date]))) AS rnk
FROM Names AS n
INNER JOIN Consumption AS c ON n.Id = c.Id ) t
WHERE t.rnk = 1
GROUP BY Id, Name
Using RANK with PARTITION BY n.Id and ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date])) you can locate all matching records per Id: all records with the smallest difference in days are going to have rnk = 1.
Then, using AVG in the outer query, you are calculating the average value of Consumption between all matching records.
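The RANK-then-AVG idea can be tried on the sample rows with the sketch below, which uses Python's sqlite3 module. SQLite has no DATEDIFF, so the day difference is computed with julianday() instead; the dates are stored as ISO text, and the two Consumption rows for Id 3 are hypothetical additions whose only purpose is to show a tie being averaged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Names (Name TEXT, Id INTEGER, Date TEXT)")
conn.execute("CREATE TABLE Consumption (Id INTEGER, Date TEXT, Consumption INTEGER)")
conn.executemany(
    "INSERT INTO Names VALUES (?, ?, ?)",
    [
        ("Dirk", 1, "2015-01-27"),
        ("Jan", 2, "2015-01-31"),
        ("Thomas", 3, "2015-02-21"),
    ],
)
conn.executemany(
    "INSERT INTO Consumption VALUES (?, ?, ?)",
    [
        (1, "2015-01-26", 30),
        (1, "2015-01-01", 20),
        (2, "2015-01-01", 10),
        (2, "2015-05-05", 20),
        # Hypothetical rows: both are one day away from Thomas's date (a tie).
        (3, "2015-02-20", 40),
        (3, "2015-02-22", 60),
    ],
)

# Rank each candidate row by absolute day difference per Id, keep the
# closest ones (rnk = 1), and average them in case of ties.
closest = conn.execute(
    """
    SELECT Id, Name, AVG(Consumption) AS AvgConsumption
    FROM (
        SELECT n.Id, n.Name, c.Consumption,
               RANK() OVER (
                   PARTITION BY n.Id
                   ORDER BY ABS(julianday(n.Date) - julianday(c.Date))
               ) AS rnk
        FROM Names AS n
        INNER JOIN Consumption AS c ON n.Id = c.Id
    ) AS t
    WHERE rnk = 1
    GROUP BY Id, Name
    ORDER BY Id
    """
).fetchall()

for row in closest:
    print(row)
```

Dirk and Jan each get a single closest row, while the tie for Thomas is averaged. In SQL Server itself, AVG over an integer column returns an integer, so casting Consumption to a decimal type first may be needed if fractional averages matter.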

How to get duplicate values in all rows filtering by one column

Here is what my table looks like.
Person Date Entry
Person1 05-20-14 142
Person2 05-20-14 443
Person1 05-21-14 248
Person1 05-21-14 142
I need two things.
First the number of times a Person made an entry for the first time.
I tried doing it with these queries. But the problem is I need this information per day.
That is if I query for 05/21, I need to see output
"Person1 1"
142 won't be included because it already exists.
In my query, I am filtering by date already, so I am not sure how to go out and search in the rest of the dates values. Here is what I have.
SELECT PERSON, Count(distinct Entry)
from [table]
where date >= '05/21/2014'
and date < '05/22/2014'
group by person
order by person
This gives me
Person1 2
Both 248 and 142 are counted here. How do I check whether 142 was already entered on a previous date? I am not very good at nested queries.
Thanks for looking.
Will this solve your problem, or at least give you an idea of how the inner query should look?
SELECT PERSON, Count(distinct Entry)
from [table]
where date >= '05/21/2014'
and date < '05/22/2014'
and Entry not in (select distinct entry from [table] where date < '05/21/2014')
group by person
order by person
In the query above I have just added an inner query that gets the distinct entries from earlier dates:
select distinct entry from [table] where date < '05/21/2014'
and I have added a where condition so that the current result does not count those entries:
and Entry not in (select distinct entry from [table] where date < '05/21/2014')
Hope this helps you.
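Here is a small sketch of the NOT IN idea on the sample rows from the question, using Python's sqlite3 module. The table name entries, the column name EntryDate, the ISO date format, and the helper function are all assumptions for illustration. This version compares against strictly earlier dates (<), so only entries seen before the queried day are excluded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and column names; dates stored as ISO text.
conn.execute("CREATE TABLE entries (Person TEXT, EntryDate TEXT, Entry INTEGER)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [
        ("Person1", "2014-05-20", 142),
        ("Person2", "2014-05-20", 443),
        ("Person1", "2014-05-21", 248),
        ("Person1", "2014-05-21", 142),
    ],
)


def first_time_entries(day):
    """Count, per person, the entries on `day` that never appeared earlier."""
    return conn.execute(
        """
        SELECT Person, COUNT(DISTINCT Entry)
        FROM entries
        WHERE EntryDate = ?
          AND Entry NOT IN (SELECT Entry FROM entries WHERE EntryDate < ?)
        GROUP BY Person
        ORDER BY Person
        """,
        (day, day),
    ).fetchall()


may_21 = first_time_entries("2014-05-21")
may_20 = first_time_entries("2014-05-20")
print(may_21)
print(may_20)
```

For 2014-05-21 only entry 248 counts for Person1, because 142 was already entered the day before.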
For the first query, it sounds like you need something like this:
SELECT Person
FROM sample
GROUP BY date, person
HAVING date = '05-21-2014'
See http://sqlfiddle.com/#!3/4653d/1
This might also help:
SELECT Person, date
FROM sample
GROUP BY date, person
ORDER BY date
Hopefully that helps; let me know if I am misunderstanding something.