Get top N records grouped by another field

I have an Oracle table with ID, SUBJECT, and PAYLOAD (CLOB). I'd like to list the top 10 records with the biggest PAYLOAD (by LENGTH(PAYLOAD)) for each subject. So if the table has 10 distinct SUBJECT values, the query should return 100 rows (top 10 per subject).

Use row_number(). TABLE is a reserved word in Oracle, so substitute your actual table name for your_table:
select t.*
from (select t.*,
             row_number() over (partition by subject order by length(payload) desc) as seqnum
      from your_table t
     ) t
where seqnum <= 10;
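
For anyone who wants to try the pattern without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in (SQLite 3.25 or later is assumed for window functions). The table name messages, the sample rows, and TOP_N = 2 are made up for illustration; the shape of the query is the same as above.

```python
import sqlite3

# In-memory SQLite database standing in for the Oracle table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER, subject TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO messages VALUES (?, ?, ?)",
    [
        (1, "a", "x"),
        (2, "a", "xxx"),
        (3, "a", "xx"),
        (4, "b", "yyyy"),
        (5, "b", "y"),
    ],
)

TOP_N = 2  # the question uses 10; 2 keeps the sample data small

# Number the rows inside each subject by payload length (longest first),
# then keep only the first TOP_N rows of every subject.
rows = conn.execute(
    """
    SELECT id, subject, LENGTH(payload) AS payload_len
    FROM (SELECT m.*,
                 ROW_NUMBER() OVER (PARTITION BY subject
                                    ORDER BY LENGTH(payload) DESC) AS seqnum
          FROM messages m) ranked
    WHERE seqnum <= ?
    ORDER BY subject, seqnum
    """,
    (TOP_N,),
).fetchall()

for row in rows:
    print(row)
```

Subject "a" has three rows, so its shortest payload (id 1) is dropped; subject "b" has only two rows and both are returned.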

Related

How do I select 1 [oldest] row per group of rows, given multiple groups?

Let's say we have the database table below, called USER_JOBS.
I'd like to write an SQL query that reflects this algorithm:
Divide the whole table in groups of rows defined by a common USER_ID (in the example table, the 2 resulting groups are colored yellow & green)
From each group, select the oldest row (according to SCHEDULE_TIME)
From this example table, the desired SQL query would return these 2 rows:

You can use a ranking function (supported in most RDBMSs). Order ascending so that the oldest row in each group gets row number 1:
SELECT *
FROM
(
    SELECT *
          ,ROW_NUMBER() OVER (PARTITION BY USER_ID ORDER BY SCHEDULE_TIME ASC) AS RowID
    FROM [table]
) ranked
WHERE RowID = 1

WITH Ranked AS (
    SELECT
        RANK() OVER (PARTITION BY User_ID ORDER BY SCHEDULE_TIME ASC) AS Ranking,
        *
    FROM [table_name]
)
SELECT Status, Sob_Type, User_ID, TimeStamp FROM Ranked WHERE Ranking = 1;
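
A quick way to check the sort direction is to run the query against a few rows. The sketch below uses Python's sqlite3 module (SQLite 3.25+ assumed) rather than the asker's database, and the job_id column and sample dates are hypothetical, since the original example table is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# job_id is a made-up column; only USER_ID and SCHEDULE_TIME come from the question.
conn.execute(
    "CREATE TABLE user_jobs (job_id INTEGER, user_id INTEGER, schedule_time TEXT)"
)
conn.executemany(
    "INSERT INTO user_jobs VALUES (?, ?, ?)",
    [
        (1, 10, "2020-01-03"),
        (2, 10, "2020-01-01"),
        (3, 20, "2020-02-05"),
        (4, 20, "2020-02-02"),
        (5, 20, "2020-02-09"),
    ],
)

# ASC puts the oldest schedule_time first, so rn = 1 is the oldest row per user.
# ISO-formatted date strings sort correctly as text.
oldest = conn.execute(
    """
    WITH ranked AS (
        SELECT j.*,
               ROW_NUMBER() OVER (PARTITION BY user_id
                                  ORDER BY schedule_time ASC) AS rn
        FROM user_jobs j
    )
    SELECT job_id, user_id, schedule_time
    FROM ranked
    WHERE rn = 1
    ORDER BY user_id
    """
).fetchall()

print(oldest)
```

ROW_NUMBER() returns exactly one row per user even when two rows share the oldest time; RANK() would return all of the tied rows.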

Select 20 results per every column value

I prepared a query that selects data from a table. The table has the columns rank, name, and citycode. When I run something like this:
select name, citycode
from tab20
where rank <= 20
I get the first 20 rows that have rank <= 20. That would be fine, but I have to show the first 20 rows for every citycode. Is it possible to do this in one query? I tried UNION and similar approaches, but they didn't work well.
Thanks

You would use the row_number() function. Based on the rank, that would be:
select t.*
from (select t.*,
             row_number() over (partition by citycode order by rank) as seqnum
      from tab20 t
     ) t
where seqnum <= 20;

How do I create a new SQL table with custom column names and populate these columns

I currently have an SQL statement that returns the most frequently occurring value and the least frequently occurring value in a table. However, the result has 2 rows, each with a name and its frequency. I need a result with 2 custom-named columns, one for the min and one for the max, and a single row holding one value for each.
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency DESC limit 1)
UNION
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency ASC limit 1);
So for the above query I need the names with the min and max frequencies in one row. I also need to be able to define the names of the new columns in the result.
Min_Name | Max_Name
Certif_1 | Certif_2

I think this query should give you the results you want. It ranks each name according to the number of times it appears in the table, then uses conditional aggregation to select the min and max frequency names in one row:
with cte as (
    select name,
           row_number() over (order by count(*) desc) as maxr,
           row_number() over (order by count(*)) as minr
    from firefighter_certifications
    group by name
)
select max(case when minr = 1 then name end) as Min_Name,
       max(case when maxr = 1 then name end) as Max_Name
from cte
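
To see the conditional aggregation collapse two ranked rows into one, here is a small sketch run through Python's sqlite3 module (SQLite 3.25+ assumed). The certification names are invented, and the counts are computed in a separate CTE before ranking, which is a variation on the query above rather than a copy of it. The name tie-breaker in the ORDER BY is also an addition, so the result is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE firefighter_certifications (name TEXT)")
# Invented sample data: EMT appears 3 times, Rescue twice, Hazmat once.
conn.executemany(
    "INSERT INTO firefighter_certifications VALUES (?)",
    [("EMT",), ("EMT",), ("EMT",), ("Rescue",), ("Rescue",), ("Hazmat",)],
)

row = conn.execute(
    """
    WITH counts AS (
        SELECT name, COUNT(*) AS cnt
        FROM firefighter_certifications
        GROUP BY name
    ),
    ranked AS (
        SELECT name,
               ROW_NUMBER() OVER (ORDER BY cnt DESC, name) AS maxr,
               ROW_NUMBER() OVER (ORDER BY cnt ASC, name)  AS minr
        FROM counts
    )
    -- Each CASE yields a name on exactly one row and NULL elsewhere,
    -- so MAX() simply picks out that single non-NULL name.
    SELECT MAX(CASE WHEN minr = 1 THEN name END) AS min_name,
           MAX(CASE WHEN maxr = 1 THEN name END) AS max_name
    FROM ranked
    """
).fetchone()

print(row)
```

The single row returned holds the least frequent name in the first column and the most frequent name in the second.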

Postgres doesn't offer "first" and "last" aggregation functions. But there are other, similar methods:
select distinct first_value(name) over (order by cnt desc, name) as name_at_max,
       first_value(name) over (order by cnt asc, name) as name_at_min
from (select name, count(*) as cnt
      from firefighter_certifications
      group by name
     ) n;
Or without any subquery at all:
select first_value(name) over (order by count(*) desc, name) as name_at_max,
       first_value(name) over (order by count(*) asc, name) as name_at_min
from firefighter_certifications
group by name
limit 1;

db2 select x random rows for a given id

If I have two columns - an ID field and a score field that can take 10 possible values, how can I select 5 random rows per ID? I know I can select 5 random rows from a table by using the following:
select *, rand() as idx
from mytable
order by idx fetch first 5 rows only
but how about 5 rows per ID?

You can do this using row_number(), partitioning by the ID column (not by the random idx alias from your query):
select t.*
from (select t.*,
             row_number() over (partition by id order by rand()) as seqnum
      from mytable t
     ) t
where seqnum <= 5;
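
Here is a rough sketch of the same idea that can be run locally with Python's sqlite3 module (SQLite 3.25+ assumed). It is not DB2: SQLite spells the function RANDOM() rather than rand(), and the sample rows are invented. The third ID deliberately has fewer than 5 rows.

```python
import sqlite3
from collections import Counter

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, score INTEGER)")

# IDs 1 and 2 have 8 rows each; ID 3 has only 3 rows.
data = (
    [(1, s) for s in range(8)]
    + [(2, s) for s in range(8)]
    + [(3, s) for s in range(3)]
)
conn.executemany("INSERT INTO mytable VALUES (?, ?)", data)

# Shuffle the rows inside each id, then keep the first 5 of each shuffle.
sample = conn.execute(
    """
    SELECT id, score
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY RANDOM()) AS seqnum
          FROM mytable t) s
    WHERE seqnum <= 5
    """
).fetchall()

# Count how many rows were returned for every id.
per_id = dict(Counter(row[0] for row in sample))
print(per_id)
```

Which scores come back changes from run to run, but the number of rows per ID does not: 5 where at least 5 exist, and all of them otherwise.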

Aggregate function like MAX for most common cell in column?

Grouping by the highest number in a column works great with MAX(), but what if I want the value that occurs most often?
For example:
ID
100
250
250
300
200
250
So I would like to group by ID and, instead of getting the lowest (MIN) or highest (MAX) number, get the most common one (that would be 250, because it appears 3 times).
Is there an easy way to do this in SQL Server 2012, or am I forced to add a second SELECT where I COUNT(DISTINCT ID) and somehow add that to my first SELECT statement?

You can use dense_rank to return all the ids with the highest count. This also handles ties for the highest count.
select id
from (select id,
             dense_rank() over (order by count(*) desc) as rnk
      from tablename
      group by id
     ) t
where rnk = 1
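
The tie behaviour is easy to check with the sample values from the question. This sketch uses Python's sqlite3 module (SQLite 3.25+ assumed) instead of SQL Server 2012, computes the counts in an inner subquery first, and wraps the query in a helper called modes, which is a name made up for this example.

```python
import sqlite3


def modes(conn):
    """Return every id that shares the highest count, in ascending order."""
    query = """
        SELECT id
        FROM (SELECT id,
                     DENSE_RANK() OVER (ORDER BY cnt DESC) AS rnk
              FROM (SELECT id, COUNT(*) AS cnt
                    FROM t
                    GROUP BY id) counts) ranked
        WHERE rnk = 1
        ORDER BY id
    """
    return [row[0] for row in conn.execute(query)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
# The values listed in the question: 250 appears three times.
conn.executemany(
    "INSERT INTO t VALUES (?)",
    [(v,) for v in (100, 250, 250, 300, 200, 250)],
)
single_mode = modes(conn)

# Two more 100s create a tie between 100 and 250.
conn.executemany("INSERT INTO t VALUES (?)", [(100,), (100,)])
tied_modes = modes(conn)

print(single_mode, tied_modes)
```

With the original data only 250 is returned; after the extra rows are added, both 100 and 250 come back because dense_rank gives tied counts the same rank.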

A simple way to do what you want uses top and order by:
SELECT top 1 id
FROM t
GROUP BY id
ORDER BY COUNT(*) DESC;
This is a statistic called the mode. Getting the mode and max is a bit challenging in SQL Server. I would approach it as:
WITH cte AS (
    SELECT t.id, COUNT(*) AS cnt,
           row_number() OVER (ORDER BY COUNT(*) DESC) AS seqnum
    FROM t
    GROUP BY id
)
SELECT MAX(id) AS themax, MAX(CASE WHEN seqnum = 1 THEN id END) AS MODE
FROM cte;
