Retrieve last N records from a table - sql

I have searched but not found an answer for my question.
I have a table orders that consists of
id (primary key autonumber)
client_id : identifies each client (unique)
date: order dates for each client
I want to retrieve the last N order dates for each client in a single view
Of course I could use SELECT TOP N date FROM orders WHERE client = 'xx' ORDER DESC and then use UNION for the different values for client. The problem is that with changes in client base the statement would require revision and that the UNION statement is impractical with a large client base.
As an additional requirement this needs to work in Access SQL.

Step 1: Create a query that yields a rank order by date per client for every row. Since Access SQL does not have ROW_NUMBER() OVER (...) like SQL Server, you can simulate this by using the technique described in the following question:
Access query producing results like ROW_NUMBER() in T-SQL
If you have done step 1 correctly, your result should be as follows:
id client_id date rank
----------------------------------
1 2014-12-01 7
1 2014-12-02 6
1 2014-12-05 5
1 2014-12-07 4
1 2014-12-11 3
1 2014-12-14 2
1 2014-12-15 1
2 2014-12-01 2
2 2014-12-02 1
...
Step 2: Use the result from step 1 as a subquery and filter the result such that only records with rank <= N are returned.

I think the following will work in MS Access:
select t.*
from table as t
where t.date in (select top N t2.date
from table as t2
where t2.client_id = t.client_id
order by t2.date desc
);
One problem with MS Access is that top N will retrieve more than N records if there are ties. If you want exactly "N", then you can use order by date, id in the subquery.

Related

SQL State to Find Record that meets certain date criterion

I am use ADO SQL in VBA to try and need the solution to the following example:
Table
ID Effective Date of Documentation
1 1/1/2015
2 6/1/2015
3 1/1/2016
4 6/1/2016
In the example above the documentation for the ID 1 is in effective from 1/1/2015 through 5/31/2015, for ID 2 its in effect from 6/1/2015 through 12/31/2015, and so forth. So if I have a date say 8/1/2015, then I need to return the record which the documentation is in effect for. So in this example the record for ID 2 would be returned. I need some SQL to accomplish this and I cant figure out how! Any ideas? Is this possible to do with this structure or do I need to create some artificial column first for the end date and query that?
You can do this by doing:
select t.*
from table t
where effdate < #date
order by effdate desc
fetch first 1 row only;
This is the SQL standard syntax. Different databases might have different syntax for selecting one row.
EDIT:
In MS Access you would do:
select TOP 1 t.*
from t
where effdate < #date
order by effdate desc;

How can I return rows which match on some columns and fulfil a DateTime comparison between two other columns using SQL?

I have a table which contains rows for jobs, example below, where 01/01/1980 is used rather than null in the ClosedDate column for jobs which are not finished:
JobNumber JobCategory CustomerID CreatedDate ClosedDate
1 Small 1 01/01/2016 03/01/2016
2 Small 2 03/01/2016 07/01/2016
3 Large 2 06/01/2016 07/01/2016
4 Medium 1 08/01/2016 10/01/2016
5 Small 3 10/01/2016 01/01/1980
6 Medium 3 15/01/2016 01/01/1980
7 Large 2 16/01/2016 17/01/2016
8 Large 2 19/01/2016 20/01/2016
9 Small 1 19/01/2016 01/01/1980
10 Medium 2 19/01/2016 01/01/1980
I need to return a list of any jobs where the same customer has had a job of the same category created within 3 days of the previous job being closed.
So, I would want to return:
7 Large 2 16/01/2016 17/01/2016
8 Large 2 19/01/2016 20/01/2016
because Customer 2 had a Large job closed on 17/01/2016 and another Large job opened on 19/01/2016, which is within 3 days.
In order to do this, I assume I need to compare each record in the table with each subsequent record, looking for a match on JobCategory and comparing CreatedDate with ClosedDate between rows.
Can anyone advise my best option for this using SQL? I'm using SQL Server 2012.
The first thing that you should do is get rid of "magic dates" in your system. If the job hasn't been closed yet then the ClosedDate is not known. SQL has a value for exactly that - NULL. That prevents anyone in the future from having to know the magic date of 1/1/1980 or from that having to be hard-coded throughout your system.
Next, you don't have to compare each row with each one after it. Define what you're looking for and find matches that meet those qualifications. You didn't specify which type of SQL Server you're using (you should tag your question with Oracle or MySQL or SQL Server), so the below query is written for SQL Server. Your version might have different date functions.
SELECT
J1.JobNumber,
J1.JobCategory,
J1.CustomerID,
J1.CreatedDate,
J1.ClosedDate,
J2.JobNumber,
J2.CreatedDate,
J2.ClosedDate
FROM
Jobs J1
INNER JOIN Jobs J2 ON
J2.CustomerID = J1.CustomerID AND
J2.JobCategory = J1.JobCategory AND
DATEDIFF(DAY, J1.ClosedDate, J2.CreatedDate) BETWEEN 0 AND 3 AND
J2.JobNumber <> J1.JobNumber
This will return the jobs in a single row instead of two rows. If that's a problem then the query could be altered slightly to do so. This can also be done a little more easily with windowed functions, but again, since you didn't specify your SQL vendor I didn't want to use those.
Since you're using SQL Server, you should be able to use windowed functions like so:
;WITH CTE_JobsWithDates AS -- Probably a poor name for the CTE
(
SELECT
JobNumber,
JobCategory,
CustomerID,
CreatedDate,
ClosedDate,
LEAD(CreatedDate, 1) OVER (PARTITION BY JobCategory, CustomerID ORDER BY CreatedDate) AS NextCreatedDate,
LAG(ClosedDate, 1) OVER (PARTITION BY JobCategory, CustomerID ORDER BY CreatedDate) AS PreviousClosedDate
FROM
Jobs
)
SELECT
JobNumber,
JobCategory,
CustomerID,
CreatedDate,
ClosedDate
FROM
CTE_JobsWithDates
WHERE
DATEDIFF(DAY, ClosedDate, NextCreatedDate) BETWEEN 0 AND 3 OR
DATEDIFF(DAY, LastClosedDate, CreatedDate) BETWEEN 0 AND 3
That was off the cuff, so please test and let me know if anything isn't quite right.
Try:
SELECT a.*
FROM
Job AS a
JOIN
Job AS b ON
a.CustomerID = b.CustomerID AND a.JobCategory = b.JobCategory
WHERE
a.JobNumber != b.JobNumber
AND (
b.CreatedDate - a.ClosedDate BETWEEN 0 AND 3
OR
a.CreatedDate - b.ClosedDate BETWEEN 0 AND 3)

Compare two numeric columns between tables in sql

Hi I would like to create a ms sql query two numeric columns (Days Consumed and vs limit day) between tables in ms sql here are my tables...
Table 1
Days consumed case_id case_deficiency
1 101 Debris
5 102 Graffiti
1 103 Not Functioning
on Second table I have
Table 1
id limit day case_deficiency
1 1 Debris
2 3 Graffiti
3 1 Not Functioning
I would like to view/show all case_deficiency if already exceeded the limit day.. in this case Graffiti already exceeded the DAY Consumed > Limit day. Hoping for your helping hand on this situation. Thanks
It looks like you want to link the tables based on case_deficiency. The rest is just an appropriate join. So:
select t1.*
from table1 t1 join
table2 t2
on t1.case_deficiency = t2.case_deficiency
where t2.limitday < t1.daysconsumed;

SQL statement to match dates that are the closest?

I have the following table, let's call it Names:
Name Id Date
Dirk 1 27-01-2015
Jan 2 31-01-2015
Thomas 3 21-02-2015
Next I have the another table called Consumption:
Id Date Consumption
1 26-01-2015 30
1 01-01-2015 20
2 01-01-2015 10
2 05-05-2015 20
Now the problem is, that I think that doing this using SQL is the fastest, since the table contains about 1.5 million rows.
So the problem is as follows, I would like to match each Id from the Names table with the Consumption table provided that the difference between the dates are the lowest, so we have: Dirk consumes on 27-01-2015 about 30. In case there are two dates that have the same "difference", I would like to calculate the average consumption on those two dates.
While I know how to join, I do not know how to code the difference part.
Thanks.
DBMS is Microsoft SQL Server 2012.
I believe that my question differs from the one mentioned in the comments, because it is much more complicated since it involves comparison of dates between two tables rather than having one date and comparing it with the rest of the dates in the table.
This is how you could it in SQL Server:
SELECT Id, Name, AVG(Consumption)
FROM (
SELECT n.Id, Name, Consumption,
RANK() OVER (PARTITION BY n.Id
ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date]))) AS rnk
FROM Names AS n
INNER JOIN Consumption AS c ON n.Id = c.Id ) t
WHERE t.rnk = 1
GROUP BY Id, Name
Using RANK with PARTITION BY n.Id and ORDER BY ABS(DATEDIFF(d, n.[Date], c.[Date])) you can locate all matching records per Id: all records with the smallest difference in days are going to have rnk = 1.
Then, using AVG in the outer query, you are calculating the average value of Consumption between all matching records.
SQL Fiddle Demo

Exponential decay in SQL for different dates page views

I have a different dates with the amount of products viewed on a webpage over a 30 day time frame. I am trying to create a exponential decay model in SQL. I am using exponential decay because I want to highlight the latest events over older ones. I not sure how to write this in SQL without getting an error. I have never done this before with this type of model so want to make sure I am doing it correctly too.
=================================
Data looks like this
product views date
a 1 2014-05-15
a 2 2014-05-01
b 2 2014-05-10
c 4 2014-05-02
c 1 2014-05-12
d 3 2014-05-11
================================
Code:
create table decay model as
select product,views,date
case when......
from table abc
group by product;
not sure what to write to do the model
I want to penalize products that were viewed that were older vs products that were viewed more recently
Thank you for your help
You can do it like this:
Choose the partition in which you want to apply exponential decay, then order descending by date within such a group.
use the function ROW_NUMBER() with ascendent ordering to get the row numbering within each subgroup.
calculate pow(your_variable_in_[0,1], rownum) and apply it to your result.
Code might look like this (might work in Oracle SQL or db2):
SELECT <your_partitioning>, date, <whatever>*power(<your_variable>,rownum-1)
FROM (SELECT a.*
, ROW_NUMBER() OVER (PARTITION BY <your_partitioning> ORDER BY a.date DESC) AS rownum
FROM YOUR_TABLE a)
ORDER BY <your_partitioning>, date DESC
EDIT: I read again over your problem and think I understood now what you asked for, so here is a solution which might work (decay factor is 0.9 here):
SELECT product, sum(adjusted_views) // (i)
FROM (SELECT product, views*power(0.9, rownum-1) AS adjusted_views, date, rownum // (ii)
FROM (SELECT product, views, date // (iii)
, ROW_NUMBER() OVER (PARTITION BY product ORDER BY a.date DESC) AS rownum
FROM YOUR_TABLE a)
ORDER BY product, date DESC)
GROUP BY product
The inner select statement (iii) creates a temporary table that might look like this
product views date rownum
--------------------------------------------------
a 1 2014-05-15 1
a 2 2014-05-14 2
a 2 2014-05-13 3
b 2 2014-05-10 1
b 3 2014-05-09 2
b 2 2014-05-08 3
b 1 2014-05-07 4
The next query (ii) then uses the rownumber to construct an exponentially decaying factor 0.9^(rownum-1) and applies it to views. The result is
product adjusted_views date rownum
--------------------------------------------------
a 1 * 0.9^0 2014-05-15 1
a 2 * 0.9^1 2014-05-14 2
a 2 * 0.9^2 2014-05-13 3
b 2 * 0.9^0 2014-05-10 1
b 3 * 0.9^1 2014-05-09 2
b 2 * 0.9^2 2014-05-08 3
b 1 * 0.9^3 2014-05-07 4
In a last step (the outer query) the adjusted views are summed up, as this seems to be the quantity you are interested in.
Note, however, that in order to be consistent there should be regular distances between the dates, e.g., always on day (--not one day here and a month there, because these will be weighted in a similar fashion although they shouldn't).