Get latest record for customers by Date - sql

I want to get each customer's latest phone number by date. There are multiple entries for the same customer, but I only want the record that has the maximum date.
Sample Data,
|cust_id | phone | hist_date
| A | 1234 | 2015-10-02
| A | 4567 | 2016-10-02
| A | 7896 | 2017-10-02
| B | 6456 | 2015-10-02
| B | 8621 | 2016-10-02
| B | 6382 | 2017-10-02
| A | 1393 | 2018-10-02
Desired result is
|cust_id | phone | hist_date
| A | 1393 | 2018-10-02
| B | 6382 | 2017-10-02
Please don't hard-code it with year. I need it to be dynamic so that every time only the latest date record will show. I know this can be achieved by Sub-query and CTE using ROW NUMBER. I tried but haven't got it right. Thanks a lot for the help.

Use the row_number() analytic function:
select * from
(select *,row_number()over(partition by cust_id order by hist_date desc) rn
from logic
) t where t.rn=1
Or you can use a correlated subquery:
select t1.* from logic t1
where t1.hist_date=( select max(hist_date)
from logic t2 where t1.cust_id=t2.cust_id
)
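A minimal sketch of both approaches run against the question's sample data. It assumes SQLite (3.25 or later, for window functions) through Python's sqlite3 module as a stand-in for the real database, and it stores hist_date as ISO text so that string order matches date order. The variable names are illustrative only.

```python
import sqlite3

# In-memory SQLite is an assumption for this sketch, not the asker's actual database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE logic (cust_id TEXT, phone TEXT, hist_date TEXT)")
conn.executemany(
    "INSERT INTO logic VALUES (?, ?, ?)",
    [
        ("A", "1234", "2015-10-02"),
        ("A", "4567", "2016-10-02"),
        ("A", "7896", "2017-10-02"),
        ("B", "6456", "2015-10-02"),
        ("B", "8621", "2016-10-02"),
        ("B", "6382", "2017-10-02"),
        ("A", "1393", "2018-10-02"),
    ],
)

# Approach 1: number each customer's rows from newest to oldest, keep number 1.
by_row_number = conn.execute(
    """
    SELECT cust_id, phone, hist_date
    FROM (SELECT cust_id, phone, hist_date,
                 ROW_NUMBER() OVER (PARTITION BY cust_id
                                    ORDER BY hist_date DESC) AS rn
          FROM logic) t
    WHERE rn = 1
    ORDER BY cust_id
    """
).fetchall()

# Approach 2: keep rows whose date equals that customer's maximum date.
by_subquery = conn.execute(
    """
    SELECT t1.cust_id, t1.phone, t1.hist_date
    FROM logic t1
    WHERE t1.hist_date = (SELECT MAX(t2.hist_date)
                          FROM logic t2
                          WHERE t2.cust_id = t1.cust_id)
    ORDER BY t1.cust_id
    """
).fetchall()

print(by_row_number)
print(by_subquery)
```

One difference to keep in mind: if a customer has two rows sharing the same maximum date, the correlated subquery returns both, while row_number() keeps only one of them.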

use row_number() window function
select * from
(
select *, row_number() over(partition by cust_id order by hist_date desc) as rn
from logic
)A where rn=1

You can also try the following query.
create table temp(cust_id char(1), phone char(5), hist_date date)
insert into temp values
('A', '1234', '2015-10-02'),
('A', '4567', '2016-10-02'),
('A', '7896', '2017-10-02'),
('B', '6456', '2015-10-02'),
('B', '8621', '2016-10-02'),
('B', '6382', '2017-10-02'),
('A', '1393', '2018-10-02')
Now the actual query.
Select a.* from temp a
inner join (
Select cust_id, MAX(hist_date) as hist_date from temp
group by cust_id
)b on a.cust_id = b.cust_id and a.hist_date = b.hist_date

Related

Find first record of multiple values in single query

Table
timestamp | tracker_id | position
----------------------------------+------------+----------
2020-02-01 21:53:45.571429+05:30 | 15 | 1
2020-02-01 21:53:45.857143+05:30 | 11 | 1
2020-02-01 21:53:46.428571+05:30 | 15 | 1
2020-02-01 21:53:46.714286+05:30 | 11 | 2
2020-02-01 21:53:54.714288+05:30 | 15 | 2
2020-02-01 21:53:55+05:30 | 12 | 1
2020-02-01 21:53:55.285714+05:30 | 11 | 1
2020-02-01 21:53:55.571429+05:30 | 15 | 3
2020-02-01 21:53:55.857143+05:30 | 13 | 1
2020-02-01 21:53:56.428571+05:30 | 11 | 1
2020-02-01 21:53:56.714286+05:30 | 15 | 1
2020-02-01 21:53:57+05:30 | 13 | 2
2020-02-01 21:53:58.142857+05:30 | 12 | 2
2020-02-01 21:53:58.428571+05:30 | 20 | 1
Output
timestamp | tracker_id | position
----------------------------------+------------+----------
2020-02-01 21:53:45.571429+05:30 | 15 | 1
2020-02-01 21:53:45.857143+05:30 | 11 | 1
2020-02-01 21:53:55+05:30 | 12 | 1
How do I find the first record WHERE tracker_id IN ('15', '11', '12') in a single query?
I can find the first record by separately querying for each tracker_id:
SELECT *
FROM my_table
WHERE tracker_id = '15'
ORDER BY timestamp
LIMIT 1;
In Postgres this can be done using the DISTINCT ON () clause:
select distinct on (tracker_id) *
from the_table
where tracker_id in (11,12,15)
order by tracker_id, "timestamp";
I have named your timestamp column col1 because I do not recommend naming columns after keywords.
select * from mytable m
where m.col1 = (select min(col1)
from mytable m1
where m.tracker_id = m1.tracker_id
group by tracker_id)
and m.tracker_id in (11,15,12);
You can use first_value with the nested select query:
select mt.*
from my_table mt
where mt.timestamp in (
select first_value(imt.timestamp) over (partition by imt.tracker_id order by imt.timestamp)
from my_table imt
where imt.tracker_id in ('11', '12', '15')
)
I'm assuming timestamp is unique, like you said in the comment. You can always replace the joining column with a primary key, like id.
select distinct on (tracker_id) *
from the_table
where tracker_id in ( select distinct tracker_id from the_table)
order by tracker_id, "timestamp";
If you want the first row that matches each of your IN values, you can use a window function:
SELECT src.timestamp, src.tracker_id, src.position
FROM (
SELECT
t.timestamp, t.tracker_id, t.position,
ROW_NUMBER() OVER(PARTITION BY tracker_id ORDER BY timestamp ASC) myrownum
FROM mytable t
WHERE tracker_id IN ('15', '11', '12')
) src
WHERE myrownum = 1 -- Get first row for each "tracker_id" grouping
This will return the first row that matches for each of your IN values, ordering by timestamp.
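As a rough check of the window-function idea, here is a sketch that numbers each tracker's rows from earliest to latest and keeps row 1. It assumes SQLite 3.25+ via Python's sqlite3 module, uses a subset of the question's rows, renames the timestamp column to ts (a hypothetical name), and stores timestamps as text without the +05:30 offset so plain string ordering is chronological.

```python
import sqlite3

# In-memory SQLite stands in for Postgres here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE my_table (ts TEXT, tracker_id INTEGER, "position" INTEGER)')
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("2020-02-01 21:53:45.571429", 15, 1),
        ("2020-02-01 21:53:45.857143", 11, 1),
        ("2020-02-01 21:53:46.428571", 15, 1),
        ("2020-02-01 21:53:46.714286", 11, 2),
        ("2020-02-01 21:53:55.000000", 12, 1),
        ("2020-02-01 21:53:55.857143", 13, 1),
        ("2020-02-01 21:53:58.142857", 12, 2),
    ],
)

# Ascending order inside the window makes row number 1 the earliest row per tracker.
first_rows = conn.execute(
    """
    SELECT ts, tracker_id, "position"
    FROM (SELECT ts, tracker_id, "position",
                 ROW_NUMBER() OVER (PARTITION BY tracker_id ORDER BY ts ASC) AS rn
          FROM my_table
          WHERE tracker_id IN (15, 11, 12)) src
    WHERE rn = 1
    ORDER BY ts
    """
).fetchall()

for row in first_rows:
    print(row)
```

Sorting the window by ts DESC instead would pick the last row per tracker, so the sort direction is the part to double-check.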
Try this query.
You can uncomment the WHERE clause if you want to run the query for selected tracker_id values only.
;WITH CTE AS
(
SELECT ROW_NUMBER() OVER (PARTITION BY tracker_id ORDER BY timestamp)
duplicates, * FROM my_table -- WHERE tracker_id IN (15,11,12)
)
SELECT timestamp, tracker_id, position FROM CTE WHERE duplicates = 1
select distinct on (tracker_id) *
from my_table
where tracker_id in (11,12,15)
order by tracker_id, "timestamp" asc;
I use DISTINCT ON for this case when working with Postgres.

Redshift window function for change in column

I have a Redshift table with, among other things, a user_id and a plan_type column. I would like a window function that picks out the rows where the plan_type changes. For example, given this data:
| user_id | plan_type | created |
|---------|-----------|------------|
| 1 | A | 2019-01-01 |
| 1 | A | 2019-01-02 |
| 1 | B | 2019-01-05 |
| 2 | A | 2019-01-01 |
| 2 | A | 2019-01-05 |
I would like a result like this where I get the first date that the plan_type was "new":
| user_id | plan_type | created |
|---------|-----------|------------|
| 1 | A | 2019-01-01 |
| 1 | B | 2019-01-05 |
| 2 | A | 2019-01-01 |
Is this possible with window functions?
EDIT
Since I have some garbage in the data where plan_type can sometimes be null, and the accepted solution does not include the first row (since I can't have the OR is not null), I had to make some modifications. Hopefully this will help other people if they have similar issues. The final query is as follows:
SELECT * FROM
(
SELECT
user_id,
plan_type,
created_at,
lag(plan_type) OVER (PARTITION by user_id ORDER BY created_at) as prev_plan,
row_number() OVER (PARTITION by user_id ORDER BY created_at) as rownum
FROM tablename
WHERE plan_type IS NOT NULL
) userHistory
WHERE
userHistory.plan_type <> userHistory.prev_plan
OR userHistory.rownum = 1
ORDER BY created_at;
The plan_type IS NOT NULL filters out bad data at the source table and the outer where clause gets any changes OR the first row of data that would not be included otherwise.
ALSO BE CAREFUL about the created_at timestamp if you are working off your prev_plan field, since it will of course give you the time of the new value!
This is a gaps-and-islands problem. I think lag() is the simplest approach:
select user_id, plan_type, created
from (select t.*,
lag(plan_type) over (partition by user_id order by created) as prev_plan_type
from t
) t
where prev_plan_type is null or prev_plan_type <> plan_type;
This assumes that plan types can move back to another value and you want each one.
If not, just use aggregation:
select user_id, plan_type, min(created)
from t
group by user_id, plan_type;
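To make the difference between the two queries concrete, here is a small sketch. It assumes SQLite 3.25+ via Python's sqlite3 module, a hypothetical table name plan_history, and 2019-01-05 for the malformed date in the sample. It also adds one extra row that is not in the question (user 1 returning to plan A on 2019-01-09) so that the two approaches give different answers.

```python
import sqlite3

# In-memory SQLite stands in for Redshift here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE plan_history (user_id INTEGER, plan_type TEXT, created TEXT)")
conn.executemany(
    "INSERT INTO plan_history VALUES (?, ?, ?)",
    [
        (1, "A", "2019-01-01"),
        (1, "A", "2019-01-02"),
        (1, "B", "2019-01-05"),
        (1, "A", "2019-01-09"),  # hypothetical extra row: user 1 goes back to A
        (2, "A", "2019-01-01"),
        (2, "A", "2019-01-05"),  # assumed value for the garbled sample date
    ],
)

# lag(): keep a row when there is no previous row or the previous plan differs.
changes = conn.execute(
    """
    SELECT user_id, plan_type, created
    FROM (SELECT user_id, plan_type, created,
                 LAG(plan_type) OVER (PARTITION BY user_id
                                      ORDER BY created) AS prev_plan_type
          FROM plan_history) s
    WHERE prev_plan_type IS NULL OR prev_plan_type <> plan_type
    ORDER BY user_id, created
    """
).fetchall()

# Aggregation: earliest date per (user, plan), ignoring later returns to a plan.
first_seen = conn.execute(
    """
    SELECT user_id, plan_type, MIN(created)
    FROM plan_history
    GROUP BY user_id, plan_type
    ORDER BY user_id, plan_type
    """
).fetchall()

print(changes)
print(first_seen)
```

With the extra row, the lag() version reports the move back to A as its own change, while the aggregation collapses both A periods into the first one.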
use row_number() window function
select * from
(select *,row_number()over(partition by user_id,plan_type order by created) rn
from tablename
) a where a.rn=1
use lag()
select * from
(
select user_id, plan_type, lag(plan_type) over (partition by user_id order by created) as changes, created
from tablename
)A where changes is null or plan_type<>changes

How to select rows and nearby rows with specific conditions

I have a table (Trans) of values like
OrderID (unique) | CustID | OrderDate| TimeSinceLast|
------------------------------------------------------
123a | A01 | 20.06.18 | 20 |
123y | B05 | 20.06.18 | 31 |
113k | A01 | 18.05.18 | NULL | <------- need this
168x | C01 | 17.04.18 | 8 |
999y | B05 | 15.04.18 | NULL | <------- need this
188k | A01 | 15.04.18 | 123 |
678a | B05 | 16.03.18 | 45 |
What I need is to select the rows where TimeSinceLast is null, as well as the preceding and following rows where TimeSinceLast is not null, grouped by CustID.
I'd need my final table to look like:
OrderID (unique) | CustID | OrderDate| TimeSinceLast|
------------------------------------------------------
123a | A01 | 20.06.18 | 20 |
113k | A01 | 18.05.18 | NULL |
188k | A01 | 15.04.18 | 123 |
123y | B05 | 20.06.18 | 31 |
999y | B05 | 15.04.18 | NULL |
678a | B05 | 16.03.18 | 45 |
The main problem is that TimeSinceLast is not reliable: for whatever reason it does not correctly calculate the days since the last order, so I cannot use it in a query to find the preceding or following row.
I have looked for code and found something like this on this forum:
with dt as
(select distinct custID, OrderID,
max (case when timeSinceLast is null then OrderID end)
over(partition by custID order by OrderDate
rows between 1 preceding and 1 following) as NullID
from Trans)
select *
from dt
where request_id between NullID -1 and NullID+1
But it does not work well for my purposes. Also, it looks like the max function cannot work with missing values.
Many thanks
Use lead() and lag().
What I need is to select the rows where TimeSinceLast is null, as well as a row preceding and following where TimeSinceLast is not null.
First, the ordering is a little unclear. Your sample data and code do not match. The following assumes some combination of the date and orderid, but there may be other columns that better capture what you mean by "preceding" and "following".
This is a little tricky, because you don't want to always include the first and last rows -- unless necessary. So, look at two columns:
select t.*
from (select t.*,
lead(TimeSinceLast) over (partition by custid order by orderdate, orderid) as next_tsl,
lag(TimeSinceLast) over (partition by custid order by orderdate, orderid) as prev_tsl,
lead(orderid) over (partition by custid order by orderdate, orderid) as next_orderid,
lag(orderid) over (partition by custid order by orderdate, orderid) as prev_orderid
from t
) t
where TimeSinceLast is null or
(next_tsl is null and next_orderid is not null) or
(prev_tsl is null and prev_orderid is not null);
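Here is a sketch of the lead()/lag() idea run against the question's data, keeping each row whose own TimeSinceLast is NULL plus its immediate neighbours within the same customer. It assumes SQLite 3.25+ via Python's sqlite3 module and reads the dd.mm.yy dates as ISO text (for example 20.06.18 as 2018-06-20), which is an interpretation rather than something the question states.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Trans (OrderID TEXT, CustID TEXT, OrderDate TEXT, TimeSinceLast INTEGER)"
)
conn.executemany(
    "INSERT INTO Trans VALUES (?, ?, ?, ?)",
    [
        ("123a", "A01", "2018-06-20", 20),
        ("123y", "B05", "2018-06-20", 31),
        ("113k", "A01", "2018-05-18", None),
        ("168x", "C01", "2018-04-17", 8),
        ("999y", "B05", "2018-04-15", None),
        ("188k", "A01", "2018-04-15", 123),
        ("678a", "B05", "2018-03-16", 45),
    ],
)

# next_orderid / prev_orderid tell a real neighbour with a NULL value apart
# from "no neighbour at all", since lead()/lag() return NULL in both cases.
selected = conn.execute(
    """
    SELECT OrderID, CustID, OrderDate, TimeSinceLast
    FROM (SELECT t.OrderID, t.CustID, t.OrderDate, t.TimeSinceLast,
                 LEAD(TimeSinceLast) OVER (PARTITION BY CustID ORDER BY OrderDate, OrderID) AS next_tsl,
                 LAG(TimeSinceLast)  OVER (PARTITION BY CustID ORDER BY OrderDate, OrderID) AS prev_tsl,
                 LEAD(OrderID)       OVER (PARTITION BY CustID ORDER BY OrderDate, OrderID) AS next_orderid,
                 LAG(OrderID)        OVER (PARTITION BY CustID ORDER BY OrderDate, OrderID) AS prev_orderid
          FROM Trans t) s
    WHERE TimeSinceLast IS NULL
       OR (next_tsl IS NULL AND next_orderid IS NOT NULL)
       OR (prev_tsl IS NULL AND prev_orderid IS NOT NULL)
    ORDER BY CustID, OrderDate DESC
    """
).fetchall()

for row in selected:
    print(row)
```

Customer C01 drops out because its only row has a non-NULL TimeSinceLast and no neighbour with a NULL one.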
Use APPLY
DECLARE @TransTable TABLE (OrderID char(4), CustID char(3), OrderDate date, TimeSinceLast int)
INSERT @TransTable VALUES
('123a', 'A01', '06.20.2018', 20),
('123y', 'B05', '06.20.2018' ,31),
('113k', 'A01', '05.18.2018' ,NULL), ------- need this
('168x', 'C01', '04.17.2018' ,8),
('999y', 'B05', '04.15.2018' ,NULL), ------- need this
('188k', 'A01', '04.15.2018' ,123),
('678a', 'B05', '03.16.2018' ,45)
SELECT B.OrderID, B.CustID, B.OrderDate, B.TimeSinceLast
FROM @TransTable A
CROSS APPLY (
SELECT 0 AS rn, A.OrderID, A.CustID, A.OrderDate, A.TimeSinceLast
UNION ALL
SELECT TOP 2 ROW_NUMBER() OVER (PARTITION BY CASE WHEN T.OrderDate > A.OrderDate THEN 1 ELSE 0 END ORDER BY ABS(DATEDIFF(day, T.OrderDate, A.OrderDate))) rn,
T.OrderID, T.CustID, T.OrderDate, T.TimeSinceLast
FROM @TransTable T
WHERE T.CustID = A.CustID AND T.OrderID <> A.OrderID
ORDER BY rn
) B
WHERE A.TimeSinceLast IS NULL
ORDER BY B.CustID, B.OrderDate DESC

Returning most recent row SQL Server

I have this table
CREATE TABLE Test (
OrderID int,
Person varchar(10),
LastModified Date
);
INSERT INTO Test (OrderID, Person, LastModified)
VALUES (1, 'Sam', '2018-05-15'),
(1, 'Tim','2018-05-14'),
(1, 'Kim','2018-05-05'),
(1, 'Dave','2018-05-13'),
(1, 'James','2018-05-11'),
(1, 'Fred','2018-05-05');
select * result:
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Sam | 2018-05-15 |
| 1 | Tim | 2018-05-14 |
| 1 | Kim | 2018-05-05 |
| 1 | Dave | 2018-05-13 |
| 1 | James | 2018-05-11 |
| 1 | Fred | 2018-05-05 |
I am looking to return the most recent modified row which is the first row with 'Sam'.
Now I know I can use max to return the most recent date, but how can I aggregate the Person column to return Sam?
Looking for a result set like
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Sam | 2018-05-15 |
I ran this:
SELECT
OrderID,
max(Person) AS [Person],
max(LastModified) AS [LastModified]
FROM Test
GROUP BY
OrderID
but this returns:
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Tim | 2018-05-15 |
Can someone advise me further, please? Thanks.
*** UPDATE
INSERT INTO Test (OrderID, Person, LastModified)
VALUES (1, 'Sam', '2018-05-15'),
(1, 'Tim','2018-05-14'),
(1, 'Kim','2018-05-05'),
(1, 'Dave','2018-05-13'),
(1, 'James','2018-05-11'),
(1, 'Fred','2018-05-05'),
(2, 'Dave','2018-05-13'),
(2, 'James','2018-05-11'),
(2, 'Fred','2018-05-05');
So i would be looking for this result to be:
| OrderID | Person | LastModified |
|---------|--------|--------------|
| 1 | Sam | 2018-05-15 |
| 2 | Dave | 2018-05-13 |
If you always want just one record (the latest modified one) per OrderID then this would do it:
SELECT
t2.OrderID
, t2.Person
, t2.LastModified
FROM (
SELECT
MAX( LastModified ) AS LastModified
, OrderID
FROM
Test
GROUP BY
OrderID
) t
INNER JOIN Test t2
ON t2.LastModified = t.LastModified
AND t2.OrderID = t.OrderID
Expanding on your comment ("thanks very much, is there a way i can do this if there is more than one orderID e.g. multiple people and lastmodified for multiple orderID's?"), in xcvd's answer, I assume what you therefore want is this:
WITH CTE AS(
SELECT OrderId,
Person,
LastModified,
ROW_NUMBER() OVER (PARTITION BY OrderID ORDER BY LastModified DESC) AS RN
FROM YourTable)
SELECT OrderID,
Person,
LastModified
FROM CTE
WHERE RN = 1;
How about just using TOP (1) and ORDER BY?
SELECT TOP (1) t.*
FROM Test t
ORDER BY LastModified DESC;
If you want this for each orderid, then this is a handy method in SQL Server:
SELECT TOP (1) WITH TIES t.*
FROM Test t
ORDER BY ROW_NUMBER() OVER (PARTITION BY OrderId ORDER BY LastModified DESC);
"xcvd's" answer is perfect for this, I would just like to add another solution that can be used here for the sake of showing you a method that can be used in more complex situations than this. This solution uses a nested query (sub-query) to find the MAX(LastModified) regardless of any other field and it will use the result in the original query's WHERE clause to find any results that meet the new criteria. Cheers.
SELECT OrderID
, Person
, LastModified
FROM Test
WHERE LastModified IN (SELECT MAX(LastModified)
FROM Test)
Here is one other method :
select t.*
from Test t
where LastModified = (select max(t1.LastModified) from Test t1 where t1.OrderID = t.OrderID);
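A quick sketch comparing the per-order correlated subquery with the global MAX(LastModified) filter on the updated data from the question. It assumes SQLite through Python's sqlite3 module in place of SQL Server and stores LastModified as ISO text, so string comparison matches date order.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Test (OrderID INTEGER, Person TEXT, LastModified TEXT)")
conn.executemany(
    "INSERT INTO Test VALUES (?, ?, ?)",
    [
        (1, "Sam", "2018-05-15"),
        (1, "Tim", "2018-05-14"),
        (1, "Kim", "2018-05-05"),
        (1, "Dave", "2018-05-13"),
        (1, "James", "2018-05-11"),
        (1, "Fred", "2018-05-05"),
        (2, "Dave", "2018-05-13"),
        (2, "James", "2018-05-11"),
        (2, "Fred", "2018-05-05"),
    ],
)

# The subquery is evaluated per outer row, so each OrderID gets its own maximum.
per_order = conn.execute(
    """
    SELECT t.OrderID, t.Person, t.LastModified
    FROM Test t
    WHERE t.LastModified = (SELECT MAX(t1.LastModified)
                            FROM Test t1
                            WHERE t1.OrderID = t.OrderID)
    ORDER BY t.OrderID
    """
).fetchall()

# A single table-wide maximum only matches rows carrying the overall latest date.
global_max = conn.execute(
    """
    SELECT OrderID, Person, LastModified
    FROM Test
    WHERE LastModified IN (SELECT MAX(LastModified) FROM Test)
    ORDER BY OrderID
    """
).fetchall()

print(per_order)
print(global_max)
```

On this data the table-wide filter misses order 2 entirely, because its latest date (2018-05-13) is older than the overall maximum. That is why the multi-order case needs the correlation on OrderID (or a PARTITION BY OrderID).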

SQL Order By and "Not-So-Much Group"

Let's say I have a table:
--------------------------------------
| ID | DATE | GROUP | RESULT |
--------------------------------------
| 1 | 01/06 | Group1 | 12345 |
| 2 | 01/05 | Group2 | 54321 |
| 3 | 01/04 | Group1 | 11111 |
--------------------------------------
I want to order the result by the most recent date at the top but group the "group" column together, but still have distinct entries. The result that I want would be:
1 | 01/06 | Group1 | 12345
3 | 01/04 | Group1 | 11111
2 | 01/05 | Group2 | 54321
What would be a query to get that result?
thank you!
EDIT:
I'm using MSSQL. I'll look into translating the oracle query into MS SQL and report my results.
EDIT
SQL Server 2000, so OVER/PARTITION is not supported =[
Thank you!
You should specify what RDBMS you are using. This answer is for Oracle, may not work in other systems.
SELECT * FROM table
ORDER BY MAX(date) OVER (PARTITION BY group) DESC, group, date DESC
declare @table table (
ID int not null,
[DATE] smalldatetime not null,
[GROUP] varchar(10) not null,
[RESULT] varchar(10) not null
)
insert @table values (1, '2009-01-06', 'Group1', '12345')
insert @table values (2, '2009-01-05', 'Group2', '12345')
insert @table values (3, '2009-01-04', 'Group1', '12345')
select t.*
from @table t
inner join (
select
max([date]) as [order-date],
[GROUP]
from @table orderer
group by
[GROUP]
) x
on t.[GROUP] = x.[GROUP]
order by
x.[order-date] desc,
t.[GROUP],
t.[DATE] desc
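The join-on-aggregate ordering needs no window functions, which is what matters on SQL Server 2000. Below is a sketch that checks it against the question's three rows. It assumes SQLite via Python's sqlite3 module, hypothetical names (results, grp, dt) to avoid reserved words, and the year 2009 for the dates.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server 2000 (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE results (id INTEGER, dt TEXT, grp TEXT, result TEXT)")
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        (1, "2009-01-06", "Group1", "12345"),
        (2, "2009-01-05", "Group2", "54321"),
        (3, "2009-01-04", "Group1", "11111"),
    ],
)

# Attach each group's latest date to every row of that group, then sort by
# that date first so whole groups stay together, newest group on top.
ordered = conn.execute(
    """
    SELECT t.id, t.dt, t.grp, t.result
    FROM results t
    JOIN (SELECT grp, MAX(dt) AS order_date
          FROM results
          GROUP BY grp) x
      ON t.grp = x.grp
    ORDER BY x.order_date DESC, t.grp, t.dt DESC
    """
).fetchall()

for row in ordered:
    print(row)
```

The second sort key (t.grp) only matters when two groups share the same latest date; it keeps their rows from interleaving.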
use an order by clause with two params:
...order by group, date desc
this assumes that your date column does hold dates and not varchars
SELECT table2.myID,
table2.mydate,
table2.mygroup,
table2.myresult
FROM (SELECT DISTINCT mygroup FROM testtable as table1) as grouptable
JOIN testtable as table2
ON grouptable.mygroup = table2.mygroup
ORDER BY grouptable.mygroup, table2.mydate DESC
SORRY, could NOT bring myself to use columns that were reserved names, rename the columns to make it work :)
this is MUCH simpler than the accepted answer btw.