How to find the last but one? - sql

I have the following table. I need to get the last but one event associate for each event.
event_id event_date event_associate
1 2/14/2014 ben
1 2/15/2014 ben
1 2/16/2014 steve
1 2/17/2014 steve // this associate is the last but one for event 1
1 2/18/2014 paul
2 2/19/2014 paul
2 2/20/2014 paul // this associate is the last but one for event 2
2 2/21/2014 ben
3 2/22/2014 paul
3 2/23/2014 paul
3 2/24/2014 ben
3 2/25/2014 steve // this associate is the last but one for event 3
3 2/26/2014 ben
I need to find out who was the last but one event_associate for each event. The result should be:
event_id event_associate
1 steve
2 paul
3 steve
I know that to do this I need to take the maximum event_date and exclude the last event_associate.
So I tried:
SELECT event_id , event_associate
WHERE NOT EXISTS (
SELECT *
FROM mytable
WHERE event_date = MAX(event_date)
)
QUALIFY ROW_NUMBER() OVER ( PARTITION BY event_id ORDER BY event_date DESC) = 1
But I do not know how to use EXISTS in this case.

You are quite close, you just need the 2nd row based on ROW_NUMBER:
select t.*,
row_number()
over (partition by event_id
order by event_date desc) as rn
from tab as t
qualify
row_number()
over (partition by event_id
order by event_date desc) = 2
-- or simply
-- qualify rn = 2
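To check the idea against the sample data, here is a minimal sketch that runs the same ROW_NUMBER logic through Python's built-in sqlite3 module. It rests on a few assumptions that are not in the original: SQLite (3.25 or later, for window functions) has no QUALIFY, so the row number is filtered in an outer query instead; the dates are rewritten as ISO strings so they sort correctly as text; and the helper name last_but_one is made up for illustration.

```python
import sqlite3

# Sample rows from the question; dates converted to ISO format (assumption)
# so that text ordering matches chronological ordering.
ROWS = [
    (1, "2014-02-14", "ben"),
    (1, "2014-02-15", "ben"),
    (1, "2014-02-16", "steve"),
    (1, "2014-02-17", "steve"),
    (1, "2014-02-18", "paul"),
    (2, "2014-02-19", "paul"),
    (2, "2014-02-20", "paul"),
    (2, "2014-02-21", "ben"),
    (3, "2014-02-22", "paul"),
    (3, "2014-02-23", "paul"),
    (3, "2014-02-24", "ben"),
    (3, "2014-02-25", "steve"),
    (3, "2014-02-26", "ben"),
]

# SQLite has no QUALIFY, so the filter on the row number moves to an outer query.
QUERY = """
SELECT event_id, event_associate
FROM (
    SELECT event_id,
           event_associate,
           ROW_NUMBER() OVER (PARTITION BY event_id
                              ORDER BY event_date DESC) AS rn
    FROM tab
) AS ranked
WHERE rn = 2
ORDER BY event_id
"""


def last_but_one(rows):
    """Return (event_id, event_associate) for the second-latest row per event."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tab (event_id INTEGER, event_date TEXT, event_associate TEXT)"
    )
    conn.executemany("INSERT INTO tab VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(last_but_one(ROWS))
```

Note that an event with only one row would simply not appear in the result, since it has no row numbered 2.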

Related

How to return most recent id based on user name and date - SQL

I have the current SQL table below:
id user date
1 john 2021-08-20
3 john 2021-08-24
5 john 2021-08-25
8 will 2021-08-25
9 will 2021-08-20
6 will 2021-08-18
I need to return the ids that have the most recent date for each user, along with a count of how many times the user appears. The ids are not always numbered in ascending order by date, as in the example below.
id user count
5 john 3
8 will 3
You can use qualify to get the most recent row:
select t.*
from t
where 1=1
qualify row_number() over (partition by user order by date desc) = 1;
You can add a window function for the count:
select t.*,
count(*) over (partition by user) as cnt
from t
where 1=1
qualify row_number() over (partition by user order by date desc) = 1;
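You can also aggregate per user and join back to pick the row with the latest date:
SELECT t.id, t.user, m.cnt AS count
FROM table_name t
JOIN (SELECT user, MAX(date) AS max_date, COUNT(*) AS cnt
FROM table_name
GROUP BY user) m
ON m.user = t.user AND m.max_date = t.date;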
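As a rough check of the "latest row plus count" approach, the sketch below runs it on the sample data with Python's built-in sqlite3 module. The assumptions are mine, not the original posters': SQLite lacks QUALIFY, so the row number is filtered in an outer query; the column names are double-quoted because user and date are keywords in some engines; and the helper name latest_with_count is hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, user, date).
ROWS = [
    (1, "john", "2021-08-20"),
    (3, "john", "2021-08-24"),
    (5, "john", "2021-08-25"),
    (8, "will", "2021-08-25"),
    (9, "will", "2021-08-20"),
    (6, "will", "2021-08-18"),
]

# COUNT(*) OVER gives the per-user total on every row;
# ROW_NUMBER() marks the most recent row per user with rn = 1.
QUERY = """
SELECT id, "user", cnt
FROM (
    SELECT id,
           "user",
           COUNT(*) OVER (PARTITION BY "user") AS cnt,
           ROW_NUMBER() OVER (PARTITION BY "user"
                              ORDER BY "date" DESC) AS rn
    FROM t
) AS ranked
WHERE rn = 1
ORDER BY "user"
"""


def latest_with_count(rows):
    """Return (id, user, count) for the most recent row of each user."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE t (id INTEGER, "user" TEXT, "date" TEXT)')
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(latest_with_count(ROWS))
```

Both window functions are evaluated before the outer filter, so the count still covers all of a user's rows even though only one row per user is returned.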

BigQuery row_number to remove duplicates

I want to keep only the row with the latest timestamp for each ID in the table. Is there a more efficient way to solve this?
Here is the query I tried:
SELECT * except(row_number)
FROM (
SELECT
*,
ROW_NUMBER()
OVER (PARTITION BY ID)
row_number
FROM employees
)
WHERE row_number = 1
employees table:
ID NAME DEPARTMENT UPDATED_AT
1 James IT 2019-05-21 12:13:14
1 James IT 2019-05-21 12:14:14
1 James IT 2019-05-21 12:18:14
2 Pam HR 2019-05-26 13:18:14
2 Pam HR 2019-05-26 14:18:14
3 David IT 2019-06-22 14:18:14
3 David IT 2019-06-23 12:18:14
result:
ID NAME DEPARTMENT UPDATED_AT
1 James IT 2019-05-21 12:18:14
2 Pam HR 2019-05-26 14:18:14
3 David IT 2019-06-23 12:18:14
You are just missing the ORDER BY clause in your subquery statement.
WITH
DATA AS (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY UPDATED_AT DESC) AS _row,
*
FROM
employees )
SELECT
* EXCEPT(_row)
FROM
DATA
WHERE
_row = 1
Alternatively, BigQuery supports QUALIFY, which avoids the subquery:
SELECT *
FROM employees
WHERE TRUE
QUALIFY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY UPDATED_AT DESC) = 1
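The effect of adding ORDER BY UPDATED_AT DESC can be checked on the sample table with a small sketch. This one uses Python's built-in sqlite3 module rather than BigQuery, which is an assumption made only so that the example is self-contained; SQLite has neither QUALIFY nor EXCEPT(...) in a select list, so the columns are listed explicitly. The helper name dedupe_latest is hypothetical.

```python
import sqlite3

# Sample rows from the question: (ID, NAME, DEPARTMENT, UPDATED_AT).
ROWS = [
    (1, "James", "IT", "2019-05-21 12:13:14"),
    (1, "James", "IT", "2019-05-21 12:14:14"),
    (1, "James", "IT", "2019-05-21 12:18:14"),
    (2, "Pam", "HR", "2019-05-26 13:18:14"),
    (2, "Pam", "HR", "2019-05-26 14:18:14"),
    (3, "David", "IT", "2019-06-22 14:18:14"),
    (3, "David", "IT", "2019-06-23 12:18:14"),
]

# The ORDER BY inside the window makes row 1 the latest row for each ID.
QUERY = """
SELECT ID, NAME, DEPARTMENT, UPDATED_AT
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY ID
                              ORDER BY UPDATED_AT DESC) AS _row
    FROM employees
) AS ranked
WHERE _row = 1
ORDER BY ID
"""


def dedupe_latest(rows):
    """Keep only the most recently updated row for each ID."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employees (ID INTEGER, NAME TEXT, DEPARTMENT TEXT, UPDATED_AT TEXT)"
    )
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in dedupe_latest(ROWS):
    print(row)
```

Without the ORDER BY, the numbering within each ID is arbitrary, so the row that survives the filter is not guaranteed to be the latest one.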

SQL find and group consecutive number in rows without duplicate

So I have a table like this:
Taxi Client Time
Tom A 1
Tom A 2
Tom B 3
Tom A 4
Tom A 5
Tom A 6
Tom B 7
Tom B 8
Bob A 1
Bob A 2
Bob A 3
and the expected result will be like this:
Tom 3
Bob 1
I have used the partition function to count the consecutive values, but the result became this:
Tom A 2
Tom A 3
Tom B 2
Bob A 2
Please help, I am not good in English, thanks!
This is a variation of a gaps-and-islands problem. You can solve it using window functions:
select taxi, count(*)
from (select t.taxi, t.client, count(*) as num_times
from (select t.*,
row_number() over (partition by taxi order by time) as seqnum,
row_number() over (partition by taxi, client order by time) as seqnum_c
from t
) t
group by t.taxi, t.client, (seqnum - seqnum_c)
having count(*) >= 2
) tc
group by taxi;
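To see why the difference of the two row numbers identifies each run, the sketch below runs the same query shape on the sample data using Python's built-in sqlite3 module. The subquery aliases (numbered, islands) and the helper name count_islands are assumptions added for illustration. Within one taxi and client, seqnum - seqnum_c stays constant as long as the rows are consecutive, and it changes once another client interrupts the run.

```python
import sqlite3

# Sample rows from the question: (taxi, client, time).
ROWS = [
    ("Tom", "A", 1), ("Tom", "A", 2), ("Tom", "B", 3), ("Tom", "A", 4),
    ("Tom", "A", 5), ("Tom", "A", 6), ("Tom", "B", 7), ("Tom", "B", 8),
    ("Bob", "A", 1), ("Bob", "A", 2), ("Bob", "A", 3),
]

# Inner query: number rows per taxi and per (taxi, client).
# Middle query: one group per run (island), keeping runs of length >= 2.
# Outer query: count the surviving runs per taxi.
QUERY = """
SELECT taxi, COUNT(*) AS num_groups
FROM (
    SELECT taxi, client, COUNT(*) AS num_times
    FROM (
        SELECT taxi,
               client,
               ROW_NUMBER() OVER (PARTITION BY taxi ORDER BY time) AS seqnum,
               ROW_NUMBER() OVER (PARTITION BY taxi, client ORDER BY time) AS seqnum_c
        FROM t
    ) AS numbered
    GROUP BY taxi, client, seqnum - seqnum_c
    HAVING COUNT(*) >= 2
) AS islands
GROUP BY taxi
ORDER BY taxi
"""


def count_islands(rows):
    """Count runs of at least two consecutive rows with the same client, per taxi."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (taxi TEXT, client TEXT, time INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(count_islands(ROWS))
```

For Tom the runs are A (times 1-2), B (time 3), A (times 4-6) and B (times 7-8); the single-row B run is dropped by the HAVING clause, which leaves the 3 from the expected result.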
Use a distinct count:
select taxi, count(distinct client)
from table_name
group by taxi
It seems your expected output is wrong.
I don't see where you get the number 3 from. If you're trying to do what your question says, that is, group by client in consecutive order only and then count the number of different groups, the following query does that. Bob has 1 group and Tom has 4.
Partition by taxi, order by time, and check whether this row's client matches the previous client for the same taxi. If it does, do not count the row. If it does not, count it: this is a new group.
SELECT FEE.taxi,
SUM(FEE.clientNotSameAsPreviousInSequence)
FROM
(
SELECT taxi,
CASE
WHEN PreviousClient IS NULL THEN
1
WHEN PreviousClient <> client THEN
1
ELSE
0
END AS clientNotSameAsPreviousInSequence
FROM
(
SELECT *,
LAG(client) OVER (PARTITION BY taxi ORDER BY taxi, time) AS PreviousClient
FROM table
) taxisWithPreviousClient
) FEE
GROUP BY FEE.taxi;
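The LAG-based count can be tried on the sample data with the following sketch, again using Python's built-in sqlite3 module. Two things here are assumptions for illustration: the table is named trips, since table itself is a reserved word, and the helper name count_client_changes is hypothetical. Every row whose client differs from the previous row for the same taxi, or which has no previous row, starts a new group.

```python
import sqlite3

# Sample rows from the question: (taxi, client, time).
ROWS = [
    ("Tom", "A", 1), ("Tom", "A", 2), ("Tom", "B", 3), ("Tom", "A", 4),
    ("Tom", "A", 5), ("Tom", "A", 6), ("Tom", "B", 7), ("Tom", "B", 8),
    ("Bob", "A", 1), ("Bob", "A", 2), ("Bob", "A", 3),
]

# LAG returns NULL on the first row of each taxi, which counts as a new group.
QUERY = """
SELECT taxi, SUM(is_new_group) AS num_groups
FROM (
    SELECT taxi,
           CASE
               WHEN previous_client IS NULL THEN 1
               WHEN previous_client <> client THEN 1
               ELSE 0
           END AS is_new_group
    FROM (
        SELECT taxi,
               client,
               LAG(client) OVER (PARTITION BY taxi ORDER BY time) AS previous_client
        FROM trips
    ) AS with_previous
) AS flagged
GROUP BY taxi
ORDER BY taxi
"""


def count_client_changes(rows):
    """Count groups of consecutive rows with the same client, per taxi."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE trips (taxi TEXT, client TEXT, time INTEGER)")
    conn.executemany("INSERT INTO trips VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(count_client_changes(ROWS))
```

Unlike the row-number version, this counts single-row groups too, which is why Tom comes out as 4 rather than 3.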

Display the latest modified record for each employee

The emp table looks like this:
id Name Date Modified
1 Ram 2017-01-05
2 Kishore 2017-02-04
3 John 2017-04-22
1 Ram K 2017-04-25
1 Ram Kumar 2017-05-01
2 Kishore Babu 2017-05-05
3 John B 2017-06-01
Assuming you're using a reasonable rdbms that supports window functions, row_number should do the trick:
SELECT id, name, date_modified
FROM (SELECT id, name, date_modified,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date_modified DESC) rn
FROM emp) t
WHERE rn = 1
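A quick way to try this on the sample rows is the sketch below, which runs the query through Python's built-in sqlite3 module (SQLite 3.25 or later supports window functions). The column name date_modified follows the answer's query rather than the "Date Modified" header in the question, and the helper name latest_per_employee is an assumption for illustration.

```python
import sqlite3

# Sample rows from the question: (id, name, date_modified).
ROWS = [
    (1, "Ram", "2017-01-05"),
    (2, "Kishore", "2017-02-04"),
    (3, "John", "2017-04-22"),
    (1, "Ram K", "2017-04-25"),
    (1, "Ram Kumar", "2017-05-01"),
    (2, "Kishore Babu", "2017-05-05"),
    (3, "John B", "2017-06-01"),
]

# rn = 1 marks the most recently modified row for each id.
QUERY = """
SELECT id, name, date_modified
FROM (
    SELECT id,
           name,
           date_modified,
           ROW_NUMBER() OVER (PARTITION BY id
                              ORDER BY date_modified DESC) AS rn
    FROM emp
) AS t
WHERE rn = 1
ORDER BY id
"""


def latest_per_employee(rows):
    """Return the latest modified record for each employee id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE emp (id INTEGER, name TEXT, date_modified TEXT)")
    conn.executemany("INSERT INTO emp VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in latest_per_employee(ROWS):
    print(row)
```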

using dense_rank to find distinct

I have a table something like this
Id student_name City
4 abc Mumbai
6 xyz Delhi
4 lmn Kolkata
6 abc Mumbai
6 GHI Chennai
I am using the dense_rank() function to remove duplicate IDs from the table. That is, if ID 4 appears twice, it should appear only once in the output.
When I am using dense_rank function like:
select dense_rank() over (order by student_id desc ) as ID ,Id, student_name,city
from test
It is giving me the output something like this
ID ID student_name city
1 4 abc Mumbai
1 4 lmn kolkata
2 6 xyz Delhi
2 6 abc Mumbai
But I don't want duplicates. How can I remove them using the dense_rank() function?
First, dense_rank() in the select does not do filtering. So, I don't know why the output would have four rows when the input has five.
Second, to keep only one row per id, use row_number(), not dense_rank(), along with a subquery:
select t.*
from (select t.*,
row_number() over (partition by student_id order by student_id) as seqnum
from test t
) t
where seqnum = 1;
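The difference from dense_rank() is that row_number() restarts at 1 inside each partition, so filtering on 1 leaves exactly one row per id. The sketch below checks that on the sample data with Python's built-in sqlite3 module. It assumes the id column is called student_id, as in the queries above (the question's table header calls it Id), and the helper name one_row_per_id is hypothetical.

```python
import sqlite3

# Sample rows from the question: (student_id, student_name, city).
ROWS = [
    (4, "abc", "Mumbai"),
    (6, "xyz", "Delhi"),
    (4, "lmn", "Kolkata"),
    (6, "abc", "Mumbai"),
    (6, "GHI", "Chennai"),
]

# Ordering by the partition key leaves ties, so which row gets seqnum = 1
# within an id is arbitrary; only the one-row-per-id property is guaranteed.
QUERY = """
SELECT student_id, student_name, city
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY student_id
                              ORDER BY student_id) AS seqnum
    FROM test t
) AS numbered
WHERE seqnum = 1
ORDER BY student_id
"""


def one_row_per_id(rows):
    """Return a single (arbitrary) row for each student_id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test (student_id INTEGER, student_name TEXT, city TEXT)")
    conn.executemany("INSERT INTO test VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print([row[0] for row in one_row_per_id(ROWS)])
```

If a particular row should be kept for each id, replace the ORDER BY inside the window with a column that decides the winner.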