How can I select the first and second to last record for a given group in SQL? - sql

In Teradata I need to select the first record for a group as well as the second-to-last record for the same group, for multiple groups and with other set conditions. How can I achieve this?
Example table:
group id | records | date       | place
One      | 1       | 2022-01-12 | 1
One      | 2       | 2022-01-12 | 1
One      | 3       | 2022-01-12 | 1
One      | 4       | 2022-01-12 | 1
One      | 1       | 2022-01-12 | 2
Two      | 1       | 2022-01-12 | 1
Two      | 2       | 2022-01-12 | 1
Two      | 3       | 2022-01-12 | 1
Two      | 4       | 2022-01-12 | 1
Two      | 5       | 2022-01-12 | 1
Two      | 6       | 2022-01-12 | 1
Two      | 5       | 2022-05-12 | 1
Two      | 6       | 2022-05-12 | 1
Desired output:
group id | records | date       | place
One      | 1       | 2022-01-12 | 1
One      | 3       | 2022-01-12 | 1
Two      | 1       | 2022-01-12 | 1
Two      | 5       | 2022-01-12 | 1

I would do something like this:
select
*
from
table
qualify row_number() over (partition by groupid order by date ASC) = 1 --"first"
or row_number() over (partition by groupid order by date DESC) = 2 -- "second to last"

Not tested, just an idea.
select q.*
from
(
select t.*,
max(t.records) over (partition by t.group_id) - 1 as mxprev
from yourtable as t
) as q
where q.records=1 or q.records=q.mxprev

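This works if you're OK with specifying each group manually. Note that LIMIT/OFFSET is MySQL/PostgreSQL syntax, which Teradata does not support, so it would need adapting there: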
(SELECT * FROM extable
WHERE groupid = 'One'
ORDER BY date ASC -- or whatever you want to order by to get "first" and "second to last"
LIMIT 1)
UNION
(SELECT * FROM extable
WHERE groupid = 'One'
ORDER BY date DESC -- or whatever you want to order by to get "first" and "second to last"
LIMIT 1
OFFSET 1)
UNION
(SELECT * FROM extable
WHERE groupid = 'Two'
ORDER BY date ASC -- or whatever you want to order by to get "first" and "second to last"
LIMIT 1)
UNION
(SELECT * FROM extable
WHERE groupid = 'Two'
ORDER BY date DESC -- or whatever you want to order by to get "first" and "second to last"
LIMIT 1
OFFSET 1);
(looking into a more generic solution atm)

Should work, not tested:
select * from
(select t.*
,row_number() over(partition by group_id order by records) rn1
from table1 t
) t1 where rn1 = 1
union all
select * from
(
select t.*
,row_number() over(partition by group_id order by records desc) rn2
from table1 t
) t2 where rn2 = 2
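To sanity-check the row_number() idea against the sample data, here is a minimal sketch that runs it in an in-memory SQLite database from Python. Several things are assumptions, not part of the question: SQLite (3.25+ for window functions) stands in for Teradata, QUALIFY is replaced by a subquery because SQLite lacks it, the "other set conditions" are guessed to be place = 1 and date = '2022-01-12', and rows are ordered by records. The table name extable and column name group_id follow the answers above.

```python
import sqlite3

# Sample rows from the question: (group_id, records, date, place).
rows = [
    ("One", 1, "2022-01-12", 1),
    ("One", 2, "2022-01-12", 1),
    ("One", 3, "2022-01-12", 1),
    ("One", 4, "2022-01-12", 1),
    ("One", 1, "2022-01-12", 2),
    ("Two", 1, "2022-01-12", 1),
    ("Two", 2, "2022-01-12", 1),
    ("Two", 3, "2022-01-12", 1),
    ("Two", 4, "2022-01-12", 1),
    ("Two", 5, "2022-01-12", 1),
    ("Two", 6, "2022-01-12", 1),
    ("Two", 5, "2022-05-12", 1),
    ("Two", 6, "2022-05-12", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE extable (group_id TEXT, records INTEGER, date TEXT, place INTEGER)"
)
conn.executemany("INSERT INTO extable VALUES (?, ?, ?, ?)", rows)

# Number each group's rows from both ends, then keep the first row
# (rn_first = 1) and the second-to-last row (rn_last = 2).
# The WHERE filter inside the subquery is an assumed example condition.
query = """
SELECT group_id, records, date, place
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY records ASC)  AS rn_first,
           ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY records DESC) AS rn_last
    FROM extable AS t
    WHERE place = 1 AND date = '2022-01-12'
) AS q
WHERE rn_first = 1 OR rn_last = 2
ORDER BY group_id, records
"""

first_and_second_to_last = conn.execute(query).fetchall()
for row in first_and_second_to_last:
    print(row)
```

Under those assumed conditions the query reproduces the four rows of the desired output. In Teradata the same two ROW_NUMBER() expressions can go directly into a QUALIFY clause, as in the first answer.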

Related

SQL : Return joint most frequent values from a column

I have the following table named customerOrders.
ID user order
1 1 2
2 1 3
3 1 1
4 2 1
5 1 5
6 2 4
7 3 1
8 6 2
9 2 2
10 2 3
I want to return the users with the most orders. Currently, I have the following query:
SELECT user, COUNT(user) AS UsersWithMostOrders
FROM customerOrders
GROUP BY user
ORDER BY UsersWithMostOrders DESC;
This returns all the values grouped by total orders, like this:
user UsersWithMostOrders
1 4
2 4
3 1
6 1
I only want to return the users with the most orders. In my case that would be users 1 and 2, since both of them have 4 orders. If I use TOP 1 or LIMIT, it will only return the first user. If I use TOP 2, it only works in this scenario; it returns invalid data when the top two users have different order counts.
Required Result
user UsersWithMostOrders
1 4
2 4
You can use TOP 1 WITH TIES:
SELECT TOP 1 WITH TIES
[user], COUNT(*) AS UsersWithMostOrders
FROM customerOrders
GROUP BY [user]
ORDER BY UsersWithMostOrders DESC;
Results:
> user | UsersWithMostOrders
> ---: | ------------------:
> 1 | 4
> 2 | 4
Option 1
Should work with most versions of SQL.
select *
from (
select *,
rank() over(order by numOrders desc) as rrank
from (
select `user`, count(*) as numOrders
from customerOrders
group by `user`
) summed
) ranked
where rrank = 1
Option 2
If your version of SQL supports common table expressions (WITH), here is a much more readable solution that does the same thing:
with summed as (
select `user`, count(*) as numOrders
from customerOrders
group by `user`
),
ranked as (
select *,
rank() over(order by numOrders desc) as rrank
from summed
)
select *
from ranked
where rrank = 1
You can use a CTE to achieve this:
;WITH CTE AS(
SELECT [user], COUNT([user]) AS UsersWithMostOrders
FROM #T
GROUP BY [user])
SELECT M.* from CTE M
INNER JOIN ( SELECT
MAX(UsersWithMostOrders) AS MaximumOrders FROM CTE) S ON
M.UsersWithMostOrders=S.MaximumOrders
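The Oracle query below can help. user and order are reserved words in Oracle, so they are quoted here (assuming the columns were created with exactly these quoted names), and a column alias cannot be referenced in the same SELECT list, so the count is repeated inside DENSE_RANK():
WITH test_table AS
(
SELECT "user", COUNT("order") AS total_order,
DENSE_RANK() OVER (ORDER BY COUNT("order") DESC) AS rank_orders
FROM customerOrders
GROUP BY "user"
)
select * from test_table where rank_orders = 1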
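The rank-based answers can be checked against the sample data with a small sketch like the one below. It is an illustration under assumptions: it uses an in-memory SQLite database from Python (window functions need SQLite 3.25+), and the column names are double-quoted because order is a keyword in SQLite. The CTE names summed and ranked are borrowed from Option 2 above.

```python
import sqlite3

# Sample rows from the question: (ID, user, order).
orders = [
    (1, 1, 2), (2, 1, 3), (3, 1, 1), (4, 2, 1), (5, 1, 5),
    (6, 2, 4), (7, 3, 1), (8, 6, 2), (9, 2, 2), (10, 2, 3),
]

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE customerOrders (ID INTEGER, "user" INTEGER, "order" INTEGER)')
conn.executemany("INSERT INTO customerOrders VALUES (?, ?, ?)", orders)

# Count orders per user, rank the counts, and keep every user tied at rank 1.
query = """
WITH summed AS (
    SELECT "user", COUNT(*) AS numOrders
    FROM customerOrders
    GROUP BY "user"
),
ranked AS (
    SELECT "user", numOrders,
           RANK() OVER (ORDER BY numOrders DESC) AS rrank
    FROM summed
)
SELECT "user", numOrders
FROM ranked
WHERE rrank = 1
ORDER BY "user"
"""

top_users = conn.execute(query).fetchall()
print(top_users)
```

Because RANK() gives tied counts the same rank, both users with 4 orders come back, which is what TOP 1 or LIMIT 1 alone cannot do.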

How to select top 2 values for each id

I have a table with values
id sales date
1 5 "2015-01-04"
1 3 "2015-01-03"
1 1 "2015-01-01"
1 1 "2015-01-01"
2 7 "2015-01-05"
2 6 "2015-01-04"
2 4 "2015-01-03"
3 11 "2015-01-08"
3 10 "2015-01-07"
3 9 "2015-01-06"
3 8 "2015-01-05"
I want to select top two values of each id as shown in desired output.
Desired output:
id sales date
1 5 "2015-01-04"
1 3 "2015-01-03"
2 7 "2015-01-05"
2 6 "2015-01-04"
3 11 "2015-01-08"
3 10 "2015-01-07"
My attempt:
select transactions.salesperson_id, transactions.id, transactions.date
from transactions
ORDER BY transactions.salesperson_id ASC, transactions.date DESC;
Can someone help me with this? Thank you in advance!
This can be done using window functions:
select id, sales, "date"
from (
select id, sales, "date",
dense_rank() over (partition by id order by "date" desc) as rnk
from transactions
) t
where rnk <= 2;
If there are multiple rows on the same date, this might return more than two rows for the same ID. If you don't want that, use row_number() instead of dense_rank().
row_number() will get what you want. Order by date descending so that the latest rows are numbered first:
select id, sales, "date" from
(select row_number() over (partition by id order by "date" desc) as rn, id, sales, "date" from transactions) t1
where t1.rn <= 2
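The difference between dense_rank() and row_number() only shows up when dates tie. As a rough sketch (assuming SQLite 3.25+ driven from Python rather than whatever database the question uses, and a hypothetical helper top_n_per_id), the sample data can show it by asking for the top 3 instead of the top 2, since id 1 has two rows on 2015-01-01:

```python
import sqlite3

# Sample rows from the question: (id, sales, date).
transactions = [
    (1, 5, "2015-01-04"), (1, 3, "2015-01-03"), (1, 1, "2015-01-01"),
    (1, 1, "2015-01-01"), (2, 7, "2015-01-05"), (2, 6, "2015-01-04"),
    (2, 4, "2015-01-03"), (3, 11, "2015-01-08"), (3, 10, "2015-01-07"),
    (3, 9, "2015-01-06"), (3, 8, "2015-01-05"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (id INTEGER, sales INTEGER, date TEXT)")
conn.executemany("INSERT INTO transactions VALUES (?, ?, ?)", transactions)


def top_n_per_id(rank_fn, n):
    """Return the top n rows per id by date, using the given ranking function."""
    # rank_fn is interpolated into the SQL text, so only allow known names.
    if rank_fn not in ("dense_rank", "row_number"):
        raise ValueError("unsupported ranking function: " + rank_fn)
    query = f"""
    SELECT id, sales, date
    FROM (
        SELECT id, sales, date,
               {rank_fn}() OVER (PARTITION BY id ORDER BY date DESC) AS rnk
        FROM transactions
    ) AS t
    WHERE rnk <= ?
    ORDER BY id, date DESC, sales DESC
    """
    return conn.execute(query, (n,)).fetchall()


top2 = top_n_per_id("row_number", 2)
print(top2)

# With n = 3, the tie on 2015-01-01 makes dense_rank() return an extra row for id 1.
dense_top3_id1 = [r for r in top_n_per_id("dense_rank", 3) if r[0] == 1]
rownum_top3_id1 = [r for r in top_n_per_id("row_number", 3) if r[0] == 1]
print(len(dense_top3_id1), len(rownum_top3_id1))
```

For the top 2 there are no ties in the sample, so both functions return the desired output.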

SQL: How do I display all records per unique id, but not the first record ever recorded in SQL

Example:
id Pricemoney time/date
1 100 01/20/2017
1 10 01/21/2017
1 1000 01/21/2017
2 10 01/23/2017
2 100 01/24/2017
3 1000 01/19/2017
3 100 01/22/2017
3 10 01/24/2017
I want to run a SQL query that displays every id and its pricemoney but does NOT include the first record (based on time/date) per unique id.
Just to clarify what I do not want to be displayed:
userid Pricemoney issuedate
1 100 01/20/2017 -- not included
1 10 01/21/2017
1 1000 01/21/2017
2 10 01/23/2017 -- not included
2 100 01/24/2017
3 1000 01/19/2017 -- not included
3 100 01/22/2017
3 10 01/24/2017
Expected result:
id Pricemoney time/date
1 10 01/21/2017
1 1000 01/21/2017
2 100 01/24/2017
3 100 01/22/2017
3 10 01/24/2017
You can use row_number():
select t.*
from (select t.*,
row_number() over (partition by id order by time_date asc) as seqnum
from <tablename> t
) t
where seqnum > 1;
If you want to keep single rows, you can do:
select t.*
from (select t.*,
row_number() over (partition by id order by time_date asc) as seqnum,
count(*) over (partition by id) as cnt
from <tablename> t
) t
where seqnum > 1 or cnt = 1;
You may use EXISTS
select t1.*
from data t1
where exists (
select 1
from data t2
where t1.id = t2.id and t2.time_date < t1.time_date
)
You can try this:
select data1.id,data1.Date,data1.Pricemoney from data1
left join (
select id ,min(Date) date from data1
group by id
) as t
on data1.date= t.date and t.id = data1.id
where t.id is null
group by data1.id,data1.Date,data1.Pricemoney
The query above also drops ids that have only a single record. If you want to keep those, add HAVING COUNT(id) > 1 to the subquery in the left join, e.g.:
select data1.id,data1.Date,data1.Pricemoney from data1
left join (
select id ,min(Date) date from data1
group by id
having COUNT(id) > 1
) as t
on data1.date= t.date and t.id = data1.id
where t.id is null
group by data1.id,data1.Date,data1.Pricemoney
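Here is a small sketch of the row_number() approach run against the sample data, using an in-memory SQLite database from Python (an assumption; window functions need SQLite 3.25+). Other assumptions: the dates are rewritten in ISO format so they sort correctly as text, 01/21/20147 is treated as a typo for 01/21/2017, the table is called data with a time_date column as in the answers, and id 4 is a hypothetical single-record id added only to show the difference between the two variants.

```python
import sqlite3

# Sample rows from the question as (id, pricemoney, time_date), dates in ISO format.
records = [
    (1, 100, "2017-01-20"), (1, 10, "2017-01-21"), (1, 1000, "2017-01-21"),
    (2, 10, "2017-01-23"), (2, 100, "2017-01-24"),
    (3, 1000, "2017-01-19"), (3, 100, "2017-01-22"), (3, 10, "2017-01-24"),
    (4, 50, "2017-01-25"),  # hypothetical id with a single record
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER, pricemoney INTEGER, time_date TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?, ?)", records)

# {condition} is filled in below with one of two fixed filters.
template = """
SELECT id, pricemoney, time_date
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY time_date ASC) AS seqnum,
           COUNT(*)     OVER (PARTITION BY id)                        AS cnt
    FROM data AS t
) AS q
WHERE {condition}
ORDER BY id, time_date, pricemoney
"""

# Variant 1: drop the earliest record of every id.
without_first = conn.execute(template.format(condition="seqnum > 1")).fetchall()

# Variant 2: same, but keep ids that have only one record.
keep_single = conn.execute(
    template.format(condition="seqnum > 1 OR cnt = 1")
).fetchall()

print(without_first)
print(keep_single)
```

The first variant matches the expected result in the question; the second additionally returns the lone row for the hypothetical id 4.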

Random row from a range (1..n) within groups in SQL

I need to select a random row per group from a table, but the row's position within its group (ordered by time) must not exceed a constant (for example, const = 3).
What I mean:
id time x
1 10:20 1
1 11:21 9
1 16:14 4
1 08:13 8
2 01:20 2
2 21:13 0
For id=1 rows could be:
id time x
1 10:20 1
1 11:21 9
1 08:13 8
but not
1 16:14 4
because its position within the group, ordered by time, is greater than 3.
For id = 2, any row is acceptable.
WITH cte as (
SELECT *, ROW_NUMBER() OVER (partition by id ORDER BY RANDOM()) as rn
FROM myTable
)
SELECT *
FROM cte
WHERE rn <= 3
Something like this (PostgreSQL):
SELECT distinct on (id) *
FROM (select *,
row_number() over (partition by id order by time) as up_lim
from tab1) as a
WHERE up_lim <= 3
ORDER by id, random();
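The two-step idea (first restrict each group to its first n rows by time, then pick one of those at random) can be sketched in a portable way without DISTINCT ON. The version below is an illustration under assumptions: it runs on an in-memory SQLite database from Python (SQLite 3.25+), uses the table name tab1 from the answer above, and pick_random_rows is a hypothetical helper name.

```python
import sqlite3

# Sample rows from the question: (id, time, x).
rows = [
    (1, "10:20", 1), (1, "11:21", 9), (1, "16:14", 4), (1, "08:13", 8),
    (2, "01:20", 2), (2, "21:13", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tab1 (id INTEGER, time TEXT, x INTEGER)")
conn.executemany("INSERT INTO tab1 VALUES (?, ?, ?)", rows)


def pick_random_rows(limit):
    """Pick one random row per id among the first `limit` rows ordered by time."""
    query = """
    WITH positioned AS (
        -- Position of each row inside its group, ordered by time.
        SELECT id, time, x,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY time) AS pos
        FROM tab1
    ),
    shuffled AS (
        -- Shuffle only the eligible rows (pos <= limit) within each group.
        SELECT id, time, x,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY random()) AS pick
        FROM positioned
        WHERE pos <= ?
    )
    SELECT id, time, x
    FROM shuffled
    WHERE pick = 1
    ORDER BY id
    """
    return conn.execute(query, (limit,)).fetchall()


picked = pick_random_rows(3)
print(picked)
```

With limit = 3, id 1 can return the 08:13, 10:20 or 11:21 row but never the 16:14 row, and id 2 can return either of its rows.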

selecting set of second lowest values

I have two columns of interest ID and Deadline:
ID Deadline (DD/MM/YYYY)
1 01/01/2017
1 05/01/2017
1 04/01/2017
2 02/01/2017
2 03/01/2017
2 06/02/2017
2 08/03/2017
Each ID can have multiple (n) deadlines. I need to select all rows where the Deadline is second lowest for each individual ID.
Desired output:
ID Deadline (DD/MM/YYYY)
1 04/01/2017
2 03/01/2017
Selecting minimum can be done by:
select min(deadline) from XXX group by ID
but I am lost with the "middle" values. I am using RPostgreSQL, but any idea helps as well.
Thanks for your help
One way is to use ROW_NUMBER() window function
SELECT id, deadline
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY deadline) rn
FROM xxx
) q
WHERE rn = 2 -- get only second lowest ones
or with LATERAL
SELECT t.*
FROM (
SELECT DISTINCT id FROM xxx
) i JOIN LATERAL (
SELECT *
FROM xxx
WHERE id = i.id
ORDER BY deadline
OFFSET 1 LIMIT 1
) t ON (TRUE)
Output:
id | deadline
----+------------
1 | 2017-04-01
2 | 2017-03-01
Using ROW_NUMBER() after taking distinct records will eliminate the chance of getting the lowest date instead of second lowest if there are duplicate records.
select ID,Deadline
from (
select ID,
Deadline,
ROW_NUMBER() over(partition by ID order by Deadline) RowNum
from (select distinct ID, Deadline from SourceTable) T
) Tbl
where RowNum = 2
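To see why de-duplicating first matters, here is a minimal sketch on an in-memory SQLite database driven from Python (an assumption; the question uses PostgreSQL, and window functions need SQLite 3.25+). The deadlines are read as DD/MM/YYYY per the question's header and stored in ISO format, and one duplicate row for ID 1 is added as a hypothetical example; it is not in the original data.

```python
import sqlite3

# Sample rows from the question as (id, deadline) in ISO format.
deadlines = [
    (1, "2017-01-01"), (1, "2017-01-05"), (1, "2017-01-04"),
    (2, "2017-01-02"), (2, "2017-01-03"), (2, "2017-02-06"), (2, "2017-03-08"),
    (1, "2017-01-01"),  # hypothetical duplicate of the lowest deadline for ID 1
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE xxx (id INTEGER, deadline TEXT)")
conn.executemany("INSERT INTO xxx VALUES (?, ?)", deadlines)

# Plain ROW_NUMBER(): the duplicate takes row number 2, so ID 1 returns its
# lowest deadline again instead of the second lowest.
plain_query = """
SELECT id, deadline
FROM (
    SELECT id, deadline,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY deadline) AS rn
    FROM xxx
) AS q
WHERE rn = 2
ORDER BY id
"""

# ROW_NUMBER() over distinct (id, deadline) pairs: duplicates collapse first.
distinct_query = """
SELECT id, deadline
FROM (
    SELECT id, deadline,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY deadline) AS rn
    FROM (SELECT DISTINCT id, deadline FROM xxx) AS d
) AS q
WHERE rn = 2
ORDER BY id
"""

plain = conn.execute(plain_query).fetchall()
deduplicated = conn.execute(distinct_query).fetchall()
print(plain)
print(deduplicated)
```

Using DENSE_RANK() instead of ROW_NUMBER() on the raw table would also skip past duplicates, but it would then return every duplicate row at rank 2 rather than a single one.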