Aggregate function like MAX for most common cell in column? - sql

Group by the highest Number in a column worked great with MAX(), but what if I would like to get the cell that is at most common.
As example:
ID
100
250
250
300
200
250
So I would like to group by ID and instead of get the lowest (MIN) or highest (MAX) number, I would like to get the most common one (that would be 250, because there 3x).
Is there an easy way in SQL Server 2012 or am I forced to add a second SELECT where I COUNT(DISTINCT ID) and add that somehow to my first SELECT statement?

You can use dense_rank to return all the id's with the highest counts. This would handle cases when there are ties for the highest counts as well.
select id from
(select id, dense_rank() over(order by count(*) desc) as rnk from tablename group by id) t
where rnk = 1

A simple way to do what you want uses top and order by:
SELECT top 1 id
FROM t
GROUP BY id
ORDER BY COUNT(*) DESC;
This is a statistic called the mode. Getting the mode and max is a bit challenging in SQL Server. I would approach it as:
WITH cte AS (
SELECT t.id, COUNT(*) AS cnt,
row_number() OVER (ORDER BY COUNT(*) DESC) AS seqnum
FROM t
GROUP BY id
)
SELECT MAX(id) AS themax, MAX(CASE WHEN seqnum = 1 THEN id END) AS MODE
FROM cte;

Related

Select 20 results per every column value

I prepared query that select date from table. In table I got: rank, name, citycode as columns. When I am doing something like that:
select name, citycode
from tab20
where rank <= 20
I got resault of first 20 rows that gets rank <= 20. And Everything would be ok, but I have to show results of first 20 rows per every citystate. Is it possible to create in one query ? I was tryin union etc but it doesn't work well.
Thanks
You would use the row_number() function. Based on the rank that would be:
select t.*
from (select t.*,
row_number() over (partition by citycode order by rank) as seqnum
from tab20 t
) t
where seqnum <= 20;

How do I create a new SQL table with custom column names and populate these columns

So I currently have an SQL statement that generates a table with the most frequent occurring value as well as the least frequent occurring value in a table. However this table has 2 rows with the row values as well as the fields. I need to create a custom table with 2 columns with min and max. Then have one row with one value for each. The value for these columns needs to be from the same row.
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency DESC limit 1)
UNION
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency ASC limit 1);
So for the above query I would need the names of the min and max values in one row. I need to be able to define the name of new columns for the generated SQL query as well.
Min_Name | Max_Name
Certif_1 | Certif_2
I think this query should give you the results you want. It ranks each name according to the number of times it appears in the table, then uses conditional aggregation to select the min and max frequency names in one row:
with cte as (
select name,
row_number() over (order by count(*) desc) as maxr,
row_number() over (order by count(*)) as minr
from firefighter_certifications
group by name
)
select max(case when minr = 1 then name end) as Min_Name,
max(case when maxr = 1 then name end) as Max_Name
from cte
Postgres doesn't offer "first" and "last" aggregation functions. But there are other, similar methods:
select distinct first_value(name) over (order by cnt desc, name) as name_at_max,
first_value(name) over (order by cnt asc, name) as name_at_min
from (select name, count(*) as cnt
from firefighter_certifications
group by name
) n;
Or without any subquery at all:
select first_value(name) over (order by count(*) desc, name) as name_at_max,
first_value(name) over (order by count(*) asc, name) as name_at_min
from firefighter_certifications
group by name
limit 1;
Here is a db<>fiddle

SQL MAX(COUNT(*)) GROUP BY Alternatives?

I've seen many topics about this and none of them is what I'm looking for.
Say we have this simple table:
CREATE TABLE A (
id INT,
date DATETIME
);
I want to retrieve the MAX value after grouping.
So I do it as follow:
DECLARE #tmpTable TABLE(id INT, count INT);
INSERT INTO #tmpTable SELECT id, COUNT(*) FROM A GROUP BY id;
SELECT MAX(count) FROM #tmpTable;
Is there a better way of doing that?
I've seen a solution in a book that I'm reading that they do it as follows:
SELECT MAX(count) FROM (SELECT COUNT(*) AS count FROM A GROUP BY id);
But this won't work :/ Could be that it works in newer T-SQL servers? Currently I'm using 2008 R2.
You can make use of TOP
SELECT TOP 1 Id,COUNT(*) AS MAXCOUNT
FROM A
GROUP BY Id
ORDER BY MAXCOUNT DESC
If you wants the result with same max count use TOP WITH TIES
SELECT TOP 1 WITH TIES Id,COUNT(*) AS MAXCOUNT
FROM A
GROUP BY Id
ORDER BY MAXCOUNT DESC
Is there a better way of doing that?
We could try using analytic functions:
WITH cte AS (
SELECT id, COUNT(*) cnt, ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) rn
FROM A
GROUP BY id
)
SELECT cnt
FROM cte
WHERE rn = 1;
This approach is to turn out a row number, ordered descending by the count, during your original aggregation query by id. The id with the highest count then should be the first record (and this result should hold valid even if more than one id be tied for the highest count).
Regarding your original max query, see the answer by #apomene, and you are just missing an alias.
You also need to add an alias name for your sub query. Try like:
SELECT MAX(sub.count1) FROM (SELECT COUNT(*) AS count1 FROM A GROUP BY id) sub;

Selecting type(s) of account with 2nd maximum number of accounts

Suppose we have an accounts table along with the already given values
I want to find the type of account with second highest number of accounts. In this case, result should be 'FD'. In case their is a contention for second highest count I need all those types in the result.
I'm not getting any idea of how to do it. I've found numerous posts for finding second highest values, say salary, in a table. But not for second highest COUNT.
This can be done using cte's. Get the counts for each type as the first step. Then use dense_rank (to get multiple rows with same counts in case of ties) to get the rank of rows by type based on counts. Finally, select the second ranked row.
with counts as (
select type, count(*) cnt
from yourtable
group by type)
, ranks as (
select type, dense_rank() over(order by cnt desc) rnk
from counts)
select type
from ranks
where rnk = 2;
One option is to use row_number() (or dense_rank(), depending on what "second" means when there are ties):
select a.*
from (select a.type, count(*) as cnt,
row_number() over (order by count(*) desc) as seqnum
from accounta a
group by a.type
) a
where seqnum = 2;
In Oracle 12c+, you can use offset/fetch:
select a.type, count(*) as cnt
from accounta a
group by a.type
order by count(*) desc
offset 1
fetch first 1 row only

Get top N records grouped by another field

I have an Oracle table with ID, SUBJECT, and PAYLOAD (CLOB). I'd like to get a listing of the TOP 10 records who have the biggest PAYLOAD (LENGTH(PAYLOAD)) grouped by subject. So if I have 10 DISTINCT SUBJECT's in the table, the query should return 100 rows (top 10 per subject).
Use row_number():
select t.*
from (select t.*, row_number() over (partition by subject order by length(payload) desc) as seqnum
from table t
) t
where seqnum <= 10;