Display the latest modified record for each employee - sql

The emp table looks like this:
id Name Date Modified
1 Ram 2017-01-05
2 Kishore 2017-02-04
3 John 2017-04-22
1 Ram K 2017-04-25
1 Ram Kumar 2017-05-01
2 Kishore Babu 2017-05-05
3 John B 2017-06-01

Assuming you're using a reasonable rdbms that supports window functions, row_number should do the trick:
SELECT id, name, date_modified
FROM   (SELECT id, name, date_modified,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY date_modified DESC) rn
        FROM   emp) t
WHERE  rn = 1
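To try the query without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It assumes an in-memory SQLite database (version 3.25 or later, which added window functions) as a stand-in for your real RDBMS, and loads the sample rows from the question.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT, date_modified TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        (1, "Ram", "2017-01-05"),
        (2, "Kishore", "2017-02-04"),
        (3, "John", "2017-04-22"),
        (1, "Ram K", "2017-04-25"),
        (1, "Ram Kumar", "2017-05-01"),
        (2, "Kishore Babu", "2017-05-05"),
        (3, "John B", "2017-06-01"),
    ],
)

# Number each employee's rows from newest to oldest, then keep row 1.
latest = conn.execute(
    """
    SELECT id, name, date_modified
    FROM (SELECT id, name, date_modified,
                 ROW_NUMBER() OVER (PARTITION BY id
                                    ORDER BY date_modified DESC) AS rn
          FROM emp) t
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

for row in latest:
    print(row)
```

The dates are stored as ISO-8601 text here, so ordering them as strings matches chronological order; with a real DATE column that caveat does not apply.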

Related

BigQuery row_number to remove duplicates

I want to keep only the row with the latest timestamp for each ID. Is there a more efficient way to solve this problem?
This is the query I tried:
SELECT * except(row_number)
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY ID)
row_number
FROM employees
)
WHERE row_number = 1
employees table:
ID NAME DEPARTMENT UPDATED_AT
1 James IT 2019-05-21 12:13:14
1 James IT 2019-05-21 12:14:14
1 James IT 2019-05-21 12:18:14
2 Pam HR 2019-05-26 13:18:14
2 Pam HR 2019-05-26 14:18:14
3 David IT 2019-06-22 14:18:14
3 David IT 2019-06-23 12:18:14
result:
ID NAME DEPARTMENT UPDATED_AT
1 James IT 2019-05-21 12:18:14
2 Pam HR 2019-05-26 14:18:14
3 David IT 2019-06-23 12:18:14
You are just missing the ORDER BY clause in your subquery statement.
WITH
DATA AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY UPDATED_AT DESC) AS _row,
*
FROM
employees )
SELECT
* EXCEPT(_row)
FROM
DATA
WHERE
_row = 1
Alternatively, BigQuery's QUALIFY clause lets you filter on the window function directly:
SELECT *
FROM employees
WHERE TRUE
QUALIFY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY UPDATED_AT DESC) = 1

Change the result of RANK() based on conditions in other columns

I have a table in Redshift like this:
Table Project_team
Employee_ID Employee_Name Start_date Ranking Is_leader Is_Parttime_Staff
Emp001 John 2014-04-01 1 No No
Emp002 Mary 2015-02-01 2 No Yes
Emp003 Terry 2015-02-15 3 Yes No
Emp004 Peter 2016-02-05 4 No No
Emp004 Morris 2016-05-01 5 No No
Initially there is no ranking for staff.
What I do is to use the rank() function like this:
RANK() over (partition by Employee_ID,Employee_Name order by Start_date) as page_seq
However, I now want to adjust the ranking based on each employee's status. A leader should be ranked first, and part-time staff should be ranked last. The table should look something like this:
Employee_ID Employee_Name Start_date Ranking Is_leader Is_Parttime_Staff
Emp003 Terry 2015-02-15 1 Yes No
Emp001 John 2014-04-01 2 No No
Emp004 Peter 2016-02-05 3 No No
Emp004 Morris 2016-05-01 4 No No
Emp002 Mary 2015-02-01 5 No Yes
I tried to use a CASE expression, like this:
Case when Is_leader = true then Ranking = 1 else RANK() over (partition by Employee_ID,Employee_Name order by Start_date) End as page_seq.
However, it does not work.
How can I change the ranking based on conditions in other columns?
Many thanks!
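Use DENSE_RANK() and put the status flags first in the ORDER BY. In this demo the simplified table cte1 has a leader column and a parmanent column, where parmanent appears to stand in for Is_Parttime_Staff:
select *, dense_rank() over (order by case when leader='yes' then 1 else 0 end desc, case when parmanent='yes' then 1 else 0 end) as employeerank
from cte1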
output:
id name leader parmanent employeerank
1 A yes no 1
3 C no no 2
2 B no yes 3
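Applied to the Project_team data from the question, the same idea can be sketched as follows. This is an illustration under assumptions, not Redshift-tested code: it runs on an in-memory SQLite database (3.25 or later) through Python's sqlite3 module, uses lower-case column names, and adds start_date as a final sort key so that employees with the same flags still get distinct ranks.

```python
import sqlite3

# In-memory SQLite stands in for Redshift (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE project_team (
        employee_id TEXT, employee_name TEXT, start_date TEXT,
        is_leader TEXT, is_parttime_staff TEXT
    )
    """
)
conn.executemany(
    "INSERT INTO project_team VALUES (?, ?, ?, ?, ?)",
    [
        ("Emp001", "John", "2014-04-01", "No", "No"),
        ("Emp002", "Mary", "2015-02-01", "No", "Yes"),
        ("Emp003", "Terry", "2015-02-15", "Yes", "No"),
        ("Emp004", "Peter", "2016-02-05", "No", "No"),
        ("Emp004", "Morris", "2016-05-01", "No", "No"),
    ],
)

# Sort keys, in order: leaders first, part-time staff last, then start date.
ranked = conn.execute(
    """
    SELECT employee_name,
           RANK() OVER (
               ORDER BY CASE WHEN is_leader = 'Yes' THEN 0 ELSE 1 END,
                        CASE WHEN is_parttime_staff = 'Yes' THEN 1 ELSE 0 END,
                        start_date
           ) AS ranking
    FROM project_team
    ORDER BY ranking
    """
).fetchall()

for name, ranking in ranked:
    print(ranking, name)
```

Note that there is no PARTITION BY here. Partitioning by Employee_ID and Employee_Name, as in the question's attempt, puts each row of this sample in its own partition, so every row would get rank 1.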

Get latest date value based on month

I have the following records. They are broken down by username, date and testscore.
Username date testscore
mike 2016-11-30 23:41:10.143 1
mike 2016-11-27 23:41:11.143 12
mike 2016-11-24 23:41:11.143 16
john 2016-11-28 23:41:11.143 7
john 2016-11-25 23:42:11.143 12
john 2016-11-25 23:42:11.143 7
mike 2016-10-30 23:41:10.143 1
mike 2016-10-27 23:41:11.143 5
mike 2016-10-24 23:41:11.143 16
john 2016-10-28 23:41:11.143 12
john 2016-10-25 23:42:11.143 8
john 2016-10-24 23:42:11.143 2
For each user, I would like to get the last test score in each month of a given year.
For the data above, the result would be:
username date testscore
mike 2016-11-30 23:41:10.143 1
john 2016-11-28 23:41:11.143 7
mike 2016-10-30 23:41:10.143 1
john 2016-10-28 23:41:11.143 12
Perhaps using the WITH TIES clause in concert with Row_Number()
Select top 1 with ties *
from YourTable
Order by Row_Number() over (partition by UserName,year(date),month(date) order by date desc)
Returns
Username date testscore
john 2016-10-28 23:41:11.143 12
john 2016-11-28 23:41:11.143 7
mike 2016-10-30 23:41:10.143 1
mike 2016-11-30 23:41:10.143 1
You can use ROW_NUMBER():
WITH CTE AS
(
SELECT *,
RN = ROW_NUMBER() OVER(PARTITION BY username, CONVERT(VARCHAR(6),[date],112)
ORDER BY [date] DESC)
FROM dbo.YourTable
)
SELECT *
FROM CTE
WHERE RN = 1;
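What about grouping by user and month and taking the latest date? This returns only the dates, so join the result back to the table to get the scores:
select username, max([date]) as last_date from <mytable> group by username, year([date]), month([date]);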
It seems you need a primary key, but you can still get the job done.
Essentially, you want the row for each username that corresponds to the latest date for that user in each month.
SELECT username, date, testscore
FROM MyTable m1
WHERE m1.date = (SELECT MAX(m2.date)
                 FROM MyTable m2
                 WHERE m2.username = m1.username
                   AND YEAR(m2.date) = YEAR(m1.date)
                   AND MONTH(m2.date) = MONTH(m1.date))
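For anyone who wants to check the per-user, per-month logic against the sample data, here is a minimal sketch using Python's sqlite3 module. Several things in it are assumptions rather than part of the answers above: it runs on in-memory SQLite (3.25 or later) instead of SQL Server, the table is called scores, the date column is renamed test_date, and substr(test_date, 1, 7) replaces YEAR()/MONTH() as the year-month key.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (username TEXT, test_date TEXT, testscore INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?, ?)",
    [
        ("mike", "2016-11-30 23:41:10.143", 1),
        ("mike", "2016-11-27 23:41:11.143", 12),
        ("mike", "2016-11-24 23:41:11.143", 16),
        ("john", "2016-11-28 23:41:11.143", 7),
        ("john", "2016-11-25 23:42:11.143", 12),
        ("john", "2016-11-25 23:42:11.143", 7),
        ("mike", "2016-10-30 23:41:10.143", 1),
        ("mike", "2016-10-27 23:41:11.143", 5),
        ("mike", "2016-10-24 23:41:11.143", 16),
        ("john", "2016-10-28 23:41:11.143", 12),
        ("john", "2016-10-25 23:42:11.143", 8),
        ("john", "2016-10-24 23:42:11.143", 2),
    ],
)

# Partition by user and by the 'YYYY-MM' prefix of the timestamp,
# number rows newest first, and keep the first row of each partition.
last_per_month = conn.execute(
    """
    SELECT username, test_date, testscore
    FROM (SELECT username, test_date, testscore,
                 ROW_NUMBER() OVER (PARTITION BY username, substr(test_date, 1, 7)
                                    ORDER BY test_date DESC) AS rn
          FROM scores) t
    WHERE rn = 1
    ORDER BY username, test_date
    """
).fetchall()

for row in last_per_month:
    print(row)
```

If two rows for the same user share the latest timestamp in a month, ROW_NUMBER() picks one of them arbitrarily; add a tiebreaker column to the ORDER BY if that matters.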

Select records with fewer than 10 entries sql server

I want to display only the IDs that have fewer than 10 records. An ID may have several rows, as you can see in the data below.
I have tried this query, but it also selects the records for ID 2:
select ID, Name ,LastName ,PaymentDate,POSITION
From ( select ID, Name ,LastName ,PaymentDate ,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY PaymentDate DESC) AS POSITION
)
where Position < 10
any help please
ID Name LastName PaymentDate
1 John Abraham 2015-05-08
1 John Abraham 2014-05-08
1 John Abraham 2013-05-08
1 John Abraham 2012-05-08
1 John Abraham 2011-05-08
1 John Abraham 2010-05-08
------------------------------
2 Adam White 2015-05-08
2 Adam White 2014-05-08
2 Adam White 2013-05-08
2 Adam White 2012-05-08
2 Adam White 2011-05-08
2 Adam White 2010-05-08
2 Adam White 2009-05-08
2 Adam White 2008-05-08
2 Adam White 2007-05-08
2 Adam White 2006-05-08
2 Adam White 2005-05-08
2 Adam White 2004-05-08
SELECT ID, COUNT(ID)
FROM sometable
GROUP BY ID
HAVING COUNT(ID) < 10
You want count(*), not row_number():
select ID, Name, LastName, PaymentDate
from (select ID, Name, LastName, PaymentDate,
count(*) over (partition by ID) as cnt
from . . .
) t
where cnt < 10;
This displays the rows (which your question suggests is what you want). If you want only the ids, then aggregation is better:
select id
from t
group by id
having count(*) < 10;
Please try:
Select ID, Name, LastName, PaymentDate
From MyTable
Where ID in (Select ID From MyTable Group By ID Having Count(*) < 10);
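The difference between the row-level and the ID-level variants can be checked with a small sketch. It assumes an in-memory SQLite database (3.25 or later) driven from Python's sqlite3 module, a hypothetical table name payments, and snake_case column names; the rows reproduce the sample data (6 payments for ID 1, 12 for ID 2).

```python
import sqlite3

# In-memory SQLite stands in for SQL Server (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payments (id INTEGER, name TEXT, last_name TEXT, payment_date TEXT)"
)

# ID 1 has 6 payments (2010-2015); ID 2 has 12 payments (2004-2015).
payments = [(1, "John", "Abraham", f"{year}-05-08") for year in range(2010, 2016)]
payments += [(2, "Adam", "White", f"{year}-05-08") for year in range(2004, 2016)]
conn.executemany("INSERT INTO payments VALUES (?, ?, ?, ?)", payments)

# Row-level: attach the per-ID row count to every row, then filter on it.
rows = conn.execute(
    """
    SELECT id, name, last_name, payment_date
    FROM (SELECT id, name, last_name, payment_date,
                 COUNT(*) OVER (PARTITION BY id) AS cnt
          FROM payments) t
    WHERE cnt < 10
    ORDER BY payment_date DESC
    """
).fetchall()

# ID-level: plain aggregation is enough when only the IDs are needed.
ids = conn.execute(
    "SELECT id FROM payments GROUP BY id HAVING COUNT(*) < 10"
).fetchall()

print(rows)
print(ids)
```

ROW_NUMBER() fails in the original attempt because it numbers rows within each ID, so the first nine rows of ID 2 still satisfy Position < 10. COUNT(*) OVER gives every row of an ID the same total, which is what the filter needs.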

How to refine last but one?

I have the following table. I need to get the last-but-one event associate for each event.
event_id event_date event_associate
1 2/14/2014 ben
1 2/15/2014 ben
1 2/16/2014 steve
1 2/17/2014 steve // this associate is the last but one for event 1
1 2/18/2014 paul
2 2/19/2014 paul
2 2/20/2014 paul // this associate is the last but one for event 2
2 2/21/2014 ben
3 2/22/2014 paul
3 2/23/2014 paul
3 2/24/2014 ben
3 2/25/2014 steve // this associate is the last but one for event 3
3 2/26/2014 ben
That is, I need to find who was the last-but-one event_associate for each event. The result should be:
event_id event_associate
1 steve
2 paul
3 steve
I know that to do this I need to find the maximum event_date and exclude the last event_associate, so I tried:
SELECT event_id , event_associate
WHERE NOT EXISTS (
SELECT *
FROM mytable
WHERE event_date = MAX(event_date)
)
QUALIFY ROW_NUMBER() OVER ( PARTITION BY event_id ORDER BY event_date DESC) = 1
But I do not know how to use EXISTS in this case.
You are quite close; you just need the 2nd row based on ROW_NUMBER:
select t.*,
   row_number()
   over (partition by event_id
         order by event_date desc) as rn
from tab as t
qualify
   row_number()
   over (partition by event_id
         order by event_date desc) = 2
-- or simply
-- qualify rn = 2
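On engines without QUALIFY, the same filter can be written with a derived table. The sketch below is an illustration under assumptions: it uses in-memory SQLite (3.25 or later) through Python's sqlite3 module, a hypothetical table name events, and dates rewritten in ISO format (2014-02-14 instead of 2/14/2014) so that text ordering matches date ordering.

```python
import sqlite3

# In-memory SQLite stands in for the real database (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_id INTEGER, event_date TEXT, event_associate TEXT)"
)
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        (1, "2014-02-14", "ben"),
        (1, "2014-02-15", "ben"),
        (1, "2014-02-16", "steve"),
        (1, "2014-02-17", "steve"),
        (1, "2014-02-18", "paul"),
        (2, "2014-02-19", "paul"),
        (2, "2014-02-20", "paul"),
        (2, "2014-02-21", "ben"),
        (3, "2014-02-22", "paul"),
        (3, "2014-02-23", "paul"),
        (3, "2014-02-24", "ben"),
        (3, "2014-02-25", "steve"),
        (3, "2014-02-26", "ben"),
    ],
)

# Number each event's rows newest first; row 2 is the last-but-one entry.
second_last = conn.execute(
    """
    SELECT event_id, event_associate
    FROM (SELECT event_id, event_associate,
                 ROW_NUMBER() OVER (PARTITION BY event_id
                                    ORDER BY event_date DESC) AS rn
          FROM events) t
    WHERE rn = 2
    ORDER BY event_id
    """
).fetchall()

for row in second_last:
    print(row)
```

An event with only one row has no rn = 2 and simply drops out of the result.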