PL/SQL select records with the latest two dates

I have a table where I store the customer ID and the date they logged in.
Cust_ID REC_DATE
773209 11/5/2013 4:30:52 PM
817265 11/5/2013 4:31:19 PM
And so on
How can I see only the latest two records by date for each customer?

You can use the analytic function row_number():
select t.*
from (select t.*,
row_number() over (partition by cust_id order by rec_date desc) as seqnum
from yourtable t
) t
where seqnum <= 2;
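As a quick way to try the query above without an Oracle instance, here is a minimal sketch that runs the same row_number() pattern through Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions). The table name logins and the extra rows are made up for illustration, and the dates are stored as ISO text so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns mirror the question.
conn.execute("CREATE TABLE logins (cust_id INTEGER, rec_date TEXT)")
conn.executemany(
    "INSERT INTO logins VALUES (?, ?)",
    [
        (773209, "2013-11-05 16:30:52"),
        (773209, "2013-11-04 09:00:00"),  # made-up extra login
        (773209, "2013-11-03 08:15:00"),  # made-up extra login
        (817265, "2013-11-05 16:31:19"),
    ],
)

# Number each customer's logins from newest to oldest, keep the first two.
latest_two = conn.execute(
    """
    SELECT cust_id, rec_date
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY cust_id
                                    ORDER BY rec_date DESC) AS seqnum
          FROM logins t) t
    WHERE seqnum <= 2
    ORDER BY cust_id, rec_date DESC
    """
).fetchall()
print(latest_two)
```

The customer with three logins keeps only the two newest, and the customer with a single login keeps that one row.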

Related

How to get min value at max date in sql?

I have a table with snapshot data. It has productid and date and quantity columns. I need to find min value in the max date. Let's say, we have product X: X had the last snapshot at Y date but it has two snapshots at Y with 9 and 8 quantity values. I need to get
product_id | date | quantity
X Y 8
So far I came up with this.
select
productid
, max(snapshot_date) max_date
, min(quantity) min_quantity
from snapshot_table
group by 1
It works, but I don't know why. Why does this not return the min value for each date?
I would use RANK here along with a scalar subquery:
WITH cte AS (
SELECT *, RANK() OVER (ORDER BY quantity) rnk
FROM snapshot_table
WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM snapshot_table)
)
SELECT productid, snapshot_date, quantity
FROM cte
WHERE rnk = 1;
Note that this solution caters to the possibility that two or more records are tied for the lowest quantity among those most recent records.
Edit: We could simplify by doing away with the CTE and instead using the QUALIFY clause (available in BigQuery, Snowflake and Teradata, among others) for the restriction on the RANK:
SELECT productid, snapshot_date, quantity
FROM snapshot_table
WHERE snapshot_date = (SELECT MAX(snapshot_date) FROM snapshot_table)
QUALIFY RANK() OVER (ORDER BY quantity) = 1;
Consider also the approach below:
select distinct product_id,
max(snapshot_date) over product as max_date,
first_value(quantity) over(product order by snapshot_date desc, quantity) as min_quantity
from your_table
window product as (partition by product_id)
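Use row_number(), ordering by date descending and then by quantity, so that the smallest quantity on the latest date is ranked first:
with cte as (select *,
row_number() over(partition by product_id order by date desc, quantity) rn
from table_name) select * from cte where rn=1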
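On the "it works but I don't know why" part of the question: max() and min() in a grouped query are computed independently over the whole group, so the GROUP BY version only looks right when the overall minimum quantity happens to fall on the latest date. A minimal sketch of the difference, using Python's sqlite3 module (SQLite 3.25+) and made-up snapshot rows; the dates and quantities are illustrative assumptions, not data from the question:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE snapshot_table (productid TEXT, snapshot_date TEXT, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO snapshot_table VALUES (?, ?, ?)",
    [
        ("X", "2024-01-01", 3),  # older snapshot with the overall minimum
        ("X", "2024-01-02", 9),
        ("X", "2024-01-02", 8),
        ("Z", "2024-01-01", 5),
        ("Z", "2024-01-03", 7),
    ],
)

# Query from the question: max date and min quantity are unrelated aggregates.
naive = conn.execute(
    """
    SELECT productid, MAX(snapshot_date), MIN(quantity)
    FROM snapshot_table
    GROUP BY productid
    ORDER BY productid
    """
).fetchall()

# Per-product ranking: latest date first, then smallest quantity on that date.
ranked = conn.execute(
    """
    SELECT productid, snapshot_date, quantity
    FROM (SELECT s.*,
                 ROW_NUMBER() OVER (PARTITION BY productid
                                    ORDER BY snapshot_date DESC, quantity) AS rn
          FROM snapshot_table s) r
    WHERE rn = 1
    ORDER BY productid
    """
).fetchall()

print(naive)   # [('X', '2024-01-02', 3), ('Z', '2024-01-03', 5)]
print(ranked)  # [('X', '2024-01-02', 8), ('Z', '2024-01-03', 7)]
```

For product X the grouped query reports quantity 3, which belongs to the older snapshot, while the ranked query returns 8, the minimum on the latest date.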

Select Must Return only one row against every id

I'm using the query below to get the latest sales_amount for every customer. When a customer has two or more sales in the given date range, the query returns all of those records. How can I get only the latest record for each id? The result should contain only one row per id.
SELECT id,
Max(date),
sales_amount
FROM customer
WHERE date BETWEEN '2020-08-01' AND '2020-08-15'
AND id = 1001
GROUP BY id,
sales_amount;
You can use row number in a sub query to give you an ordering and then just pick the first one.
SELECT *
FROM (
SELECT id, date, sales_amount,
ROW_NUMBER() OVER (ORDER BY date DESC) as RN
FROM customer
WHERE date BETWEEN '2020-08-01' AND '2020-08-15'
AND id = 1001
) sub
WHERE RN = 1
Note if you want to do it for all customers then this is the query
SELECT *
FROM (
SELECT id, date, sales_amount,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) as RN
FROM customer
WHERE date BETWEEN '2020-08-01' AND '2020-08-15'
) sub
WHERE RN = 1
That will give you the most recent row for each customer.
This is because sales_amount is included in the GROUP BY clause, so one row is returned for each distinct sales_amount.
Have you tried adding ORDER BY and LIMIT?
SELECT id,
Max(date),
sales_amount
FROM customer
WHERE date BETWEEN '2020-08-01' AND '2020-08-15'
AND id = 1001
GROUP BY id,
sales_amount
ORDER BY Max(date) DESC
LIMIT 1;
select * from (
SELECT ROW_NUMBER() OVER (ORDER BY date DESC) as Rn,
id,
date,
sales_amount
FROM customer
WHERE date BETWEEN '2020-08-01' AND '2020-08-15'
AND id = 1001
) sub
where Rn = 1
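To make the difference between the grouped query and the ROW_NUMBER() approach concrete, here is a small sketch run through Python's sqlite3 module (SQLite 3.25+). The rows are made up for illustration; the table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER, date TEXT, sales_amount INTEGER)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [
        (1001, "2020-08-03", 50),
        (1001, "2020-08-10", 75),
        (1001, "2020-08-20", 90),  # outside the date range
        (1002, "2020-08-05", 20),
    ],
)

# Grouping by sales_amount yields one row per distinct amount.
grouped = conn.execute(
    """
    SELECT id, MAX(date), sales_amount
    FROM customer
    WHERE date BETWEEN '2020-08-01' AND '2020-08-15'
      AND id = 1001
    GROUP BY id, sales_amount
    ORDER BY MAX(date)
    """
).fetchall()

# Numbering rows per customer and keeping the first yields one row per id.
latest = conn.execute(
    """
    SELECT id, date, sales_amount
    FROM (SELECT id, date, sales_amount,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
          FROM customer
          WHERE date BETWEEN '2020-08-01' AND '2020-08-15') sub
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

print(grouped)  # [(1001, '2020-08-03', 50), (1001, '2020-08-10', 75)]
print(latest)   # [(1001, '2020-08-10', 75), (1002, '2020-08-05', 20)]
```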

Running count distinct

I am trying to see how the cumulative number of subscribers changed over time based on unique email addresses and date they were created. Below is an example of a table I am working with.
I am trying to turn it into the table below. Email 1#gmail.com was created twice and I would like to count it once. I cannot figure out how to generate the Running count distinct column.
Thanks for the help.
I would usually do this using row_number():
select date, count(*),
sum(count(*)) over (order by date),
sum(sum(case when seqnum = 1 then 1 else 0 end)) over (order by date)
from (select t.*,
row_number() over (partition by email order by date) as seqnum
from t
) t
group by date
order by date;
This is similar to the version using lag(). However, I get nervous using lag if the same email appears multiple times on the same date.
Getting the total count and cumulative count is straightforward. To get the cumulative distinct count, use lag to check whether the email already has a row with an earlier date, and if so set the flag to 0 so that the row is ignored by the running sum.
select distinct dt
,count(*) over(partition by dt) as day_total
,count(*) over(order by dt) as cumsum
,sum(flag) over(order by dt) as cumdist
from (select t.*
,case when lag(dt) over(partition by email order by dt) is not null then 0 else 1 end as flag
from tbl t
) t
Here is a solution that uses neither sum() over nor lag(), and still produces the correct results.
It may therefore be simpler to read and maintain.
select
t1.date_created,
(select count(*) from my_table where date_created = t1.date_created) emails_created,
(select count(*) from my_table where date_created <= t1.date_created) cumulative_sum,
(select count( distinct email) from my_table where date_created <= t1.date_created) running_count_distinct
from
(select distinct date_created from my_table) t1
order by 1
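For anyone who wants to check the numbers, here is a hedged sketch of the row_number() idea, run through Python's sqlite3 module (SQLite 3.25+). The table name my_table and the column date_created are borrowed from the last answer; the email addresses are invented, and one of them deliberately appears twice on the same date. The nested sum(sum(...)) is split into two steps here (daily totals first, running sums second), which is a restructuring for readability rather than the exact query from the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (email TEXT, date_created TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [
        ("a@example.com", "2020-01-01"),
        ("a@example.com", "2020-01-01"),  # same email, same day
        ("b@example.com", "2020-01-01"),
        ("a@example.com", "2020-01-02"),  # repeat of an existing email
        ("c@example.com", "2020-01-02"),
        ("b@example.com", "2020-01-03"),
    ],
)

rows = conn.execute(
    """
    SELECT date_created,
           emails_created,
           SUM(emails_created) OVER (ORDER BY date_created) AS cumulative_sum,
           SUM(new_emails)     OVER (ORDER BY date_created) AS running_count_distinct
    FROM (SELECT date_created,
                 COUNT(*) AS emails_created,
                 -- seqnum = 1 marks the first time an email is ever seen
                 SUM(CASE WHEN seqnum = 1 THEN 1 ELSE 0 END) AS new_emails
          FROM (SELECT t.*,
                       ROW_NUMBER() OVER (PARTITION BY email
                                          ORDER BY date_created) AS seqnum
                FROM my_table t) numbered
          GROUP BY date_created) daily
    ORDER BY date_created
    """
).fetchall()

for row in rows:
    print(row)
# ('2020-01-01', 3, 3, 2)
# ('2020-01-02', 2, 5, 3)
# ('2020-01-03', 1, 6, 3)
```

Because exactly one row per email gets seqnum = 1, the same-day duplicate is counted once, which is the case that makes lag() feel riskier.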

Retrieve recent 5 days forecast for each cities with latest issue date

I need to retrieve the forecast info for the most recent 5 days for each city.
My table looks like the one below.
The real problem is with the issue date: a city may have several forecasts for the same date, each with a distinct issue date.
I need to retrieve the most recent 5 records for each city with the latest issue date, grouped by forecast date.
I have tried something like the query below, but it does not give the expected result:
SELECT * FROM(
SELECT
ROW_NUMBER () OVER (PARTITION BY CITY_ID ORDER BY FORECAST_DATE DESC, ISSUE_DATE DESC) AS rn,
CITY_ID, FORECAST_DATE, ISSUE_DATE
FROM
FORECAST
GROUP BY FORECAST_DATE
) WHERE rn <= 5
Any suggestion or advice will be helpful
This will get the latest issued forecast per day over the most recent 5 days for each city:
SELECT *
FROM (
SELECT f.*,
DENSE_RANK() OVER ( PARTITION BY city_id ORDER BY forecast_date DESC )
AS forecast_rank,
ROW_NUMBER() OVER ( PARTITION BY city_id, forecast_date ORDER BY issue_date DESC )
AS issue_rn
FROM Forecast f
)
WHERE forecast_rank <= 5
AND issue_rn = 1;
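The two window functions do different jobs: DENSE_RANK() numbers the distinct forecast dates per city, and ROW_NUMBER() picks the newest issue within each forecast date. A minimal sketch of that combination using Python's sqlite3 module (SQLite 3.25+); the rows are made up, and the limit is passed as a parameter set to 2 instead of 5 only to keep the sample data short:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE forecast (city_id INTEGER, forecast_date TEXT, issue_date TEXT)"
)
conn.executemany(
    "INSERT INTO forecast VALUES (?, ?, ?)",
    [
        (1, "2024-05-01", "2024-04-28"),
        (1, "2024-05-02", "2024-04-28"),
        (1, "2024-05-02", "2024-04-29"),  # re-issued forecast for the same day
        (1, "2024-05-03", "2024-04-29"),
        (1, "2024-05-03", "2024-04-30"),
        (2, "2024-05-03", "2024-04-30"),
    ],
)

days_to_keep = 2  # the question asks for 5; 2 keeps this sample small

rows = conn.execute(
    """
    SELECT city_id, forecast_date, issue_date
    FROM (SELECT f.*,
                 DENSE_RANK() OVER (PARTITION BY city_id
                                    ORDER BY forecast_date DESC) AS forecast_rank,
                 ROW_NUMBER() OVER (PARTITION BY city_id, forecast_date
                                    ORDER BY issue_date DESC) AS issue_rn
          FROM forecast f) ranked
    WHERE forecast_rank <= ?
      AND issue_rn = 1
    ORDER BY city_id, forecast_date DESC
    """,
    (days_to_keep,),
).fetchall()

for row in rows:
    print(row)
# (1, '2024-05-03', '2024-04-30')
# (1, '2024-05-02', '2024-04-29')
# (2, '2024-05-03', '2024-04-30')
```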
Partition by works like group by but for the function only.
Try
with CTE as
(
select t1.*,
row_number() over (partition by city_id, forecast_date order by issue_date desc) as r_ord
from Forecast t1
)
select CTE.*
from CTE
where r_ord <= 5
Try this
SELECT * FROM(
SELECT
ROW_NUMBER () OVER (PARTITION BY CITY_ID, FORECAST_DATE order by ISSUE_DATE DESC) AS rn,
CITY_ID, FORECAST_DATE, ISSUE_DATE
FROM
FORECAST
) WHERE rn <= 5

Group data by latest date per month

I store data on a daily basis in the following table
CREATE TABLE dbo.DemoTable
(
ReportDate DATE NOT NULL,
IdOne INT NOT NULL,
IdTwo INT NOT NULL,
NumberOfThings INT NOT NULL DEFAULT 0
CONSTRAINT PK_DemoTable PRIMARY KEY NONCLUSTERED (ReportDate, IdOne, IdTwo)
)
I'd like to report on this but only pull out data (sum of NumberOfThings) for the latest date we have for each month.
Example data
INSERT INTO DemoTable
(ReportDate, IdOne, IdTwo, NumberOfThings)
VALUES
('2016-11-02',1,2,2), ('2016-11-02',1,3,2), ('2016-11-01',1,2,20), ('2016-11-01',1,3,20),
('2016-10-31',1,2,2), ('2016-10-31',1,3,2), ('2016-10-30',1,2,20), ('2016-10-30',1,3,20), ('2016-10-29',1,2,200), ('2016-10-29',1,3,200),
('2016-09-30',1,2,5), ('2016-09-30',1,3,5), ('2016-09-29',1,2,55), ('2016-09-29',1,3,55)
So for this data I want to see:
2016-11-02 | 4
2016-10-31 | 4
2016-09-30 | 10
Thanks
You can use RANK() to find the rows with the latest date in each month, and then sum them.
SELECT s.ReportDate,SUM(s.NumberOfThings)
FROM (
SELECT t.*,
RANK() OVER(PARTITION BY YEAR(t.ReportDate), MONTH(t.ReportDate) ORDER BY t.ReportDate DESC) as rnk
FROM DemoTable t) s
WHERE s.rnk = 1
GROUP BY s.ReportDate
You can use a query like this:
select ReportDate, sum(NumberofThings) as SumNumberofThings from DemoTable where ReportDate in
(
select max(ReportDate) MaxReportDate from DemoTable
group by datepart(yy,reportdate), datepart(m,reportdate)
)
group by ReportDate
A typical method involves row_number(). The only trick is using date functions to get the year and the month:
select dt.*
from (select dt.*,
row_number() over (partition by year(ReportDate), month(ReportDate)
order by ReportDate desc
) as seqnum
from DemoTable dt
) dt
where seqnum = 1;
If there are duplicates per date, you would just do the same thing with aggregation:
select dt.ReportDate, dt.NumberOfThings
from (select dt.ReportDate, sum(NumberOfThings) as NumberOfThings,
row_number() over (partition by year(ReportDate), month(ReportDate)
order by ReportDate desc
) as seqnum
from DemoTable dt
group by ReportDate
) dt
where seqnum = 1;
Aggregate your data so as to get the sum per date. Then rank your records by date within month. Then pick the best ranked records.
SELECT
ReportDate,
SumNumberOfThings
FROM
(
SELECT
ReportDate,
ROW_NUMBER() OVER (PARTITION BY YEAR(ReportDate), MONTH(ReportDate)
ORDER BY ReportDate DESC) AS rn,
SUM(NumberOfThings) AS SumNumberOfThings
FROM DemoTable
GROUP BY ReportDate
) ranked
WHERE rn = 1
ORDER BY ReportDate;
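The aggregate-then-rank steps can be checked against the sample data from the question. This sketch runs them through Python's sqlite3 module (SQLite 3.25+) rather than SQL Server, so two things differ and are assumptions of the sketch: the table is created without the dbo schema or the primary key constraint, and strftime('%Y-%m', ...) stands in for YEAR()/MONTH(), which SQLite does not have.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DemoTable (ReportDate TEXT, IdOne INTEGER, IdTwo INTEGER, "
    "NumberOfThings INTEGER)"
)
conn.executemany(
    "INSERT INTO DemoTable VALUES (?, ?, ?, ?)",
    [
        ("2016-11-02", 1, 2, 2), ("2016-11-02", 1, 3, 2),
        ("2016-11-01", 1, 2, 20), ("2016-11-01", 1, 3, 20),
        ("2016-10-31", 1, 2, 2), ("2016-10-31", 1, 3, 2),
        ("2016-10-30", 1, 2, 20), ("2016-10-30", 1, 3, 20),
        ("2016-10-29", 1, 2, 200), ("2016-10-29", 1, 3, 200),
        ("2016-09-30", 1, 2, 5), ("2016-09-30", 1, 3, 5),
        ("2016-09-29", 1, 2, 55), ("2016-09-29", 1, 3, 55),
    ],
)

rows = conn.execute(
    """
    SELECT ReportDate, SumNumberOfThings
    FROM (SELECT daily.*,
                 -- rank the dates within each calendar month, newest first
                 ROW_NUMBER() OVER (PARTITION BY strftime('%Y-%m', ReportDate)
                                    ORDER BY ReportDate DESC) AS rn
          FROM (SELECT ReportDate,
                       SUM(NumberOfThings) AS SumNumberOfThings
                FROM DemoTable
                GROUP BY ReportDate) daily) ranked
    WHERE rn = 1
    ORDER BY ReportDate DESC
    """
).fetchall()

for row in rows:
    print(row)
# ('2016-11-02', 4)
# ('2016-10-31', 4)
# ('2016-09-30', 10)
```

The output matches the three rows the question asks for.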