how to select the most frequently appearing values? [duplicate]

This question already has answers here:
Get most common value for each value of another column in SQL
(9 answers)
Closed 8 years ago.
I've seen examples where the query orders by count and takes the top row, but in this case there can be multiple "most frequent" values, so I might want to return more than just a single result.
In this case I want to find the most frequently appearing last names in a users table, here's what I have so far:
select last_name from users group by last_name having max(count(*));
Unfortunately with this query I get an error that my max function is nested too deeply.

select
x.last_name,
x.name_count
from
(select
u.last_name,
count(*) as name_count,
rank() over (order by count(*) desc) as rank
from
users u
group by
u.last_name) x
where
x.rank = 1
Use the analytic function rank. It assigns a number based on the order of count(*) desc. If two names have the same count, they get the same rank and the next number is skipped (so you might get rows with ranks 1, 1 and 3). dense_rank is an alternative that doesn't skip the next number when two rows share a rank (so you'd get 1, 1, 2), but if you only want the rows with rank 1, there is not much of a difference.
If you want only one row, you'd want each row to have a different number. In that case, use row_number. Apart from this small-but-important difference, these functions are similar and can be used in the same way.
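To see how the three functions number tied rows, here is a minimal sketch that runs the idea against an in-memory SQLite database through Python's sqlite3 module. The sample rows are made up, SQLite 3.25 or newer is assumed (for window functions), and the counts are computed in a subquery first purely to keep the example portable; it is an illustration, not the answer's exact query.

```python
import sqlite3

# Hypothetical sample data: Jones and Smith tie for the most frequent last name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (last_name TEXT)")
conn.executemany(
    "INSERT INTO users (last_name) VALUES (?)",
    [("Smith",), ("Smith",), ("Jones",), ("Jones",), ("Brown",)],
)

# Apply all three numbering functions to the same per-name counts.
# The extra sort key on last_name only makes row_number deterministic.
ranked = conn.execute(
    """
    SELECT last_name,
           name_count,
           RANK()       OVER (ORDER BY name_count DESC)            AS rnk,
           DENSE_RANK() OVER (ORDER BY name_count DESC)            AS dense_rnk,
           ROW_NUMBER() OVER (ORDER BY name_count DESC, last_name) AS row_num
    FROM (SELECT last_name, COUNT(*) AS name_count
          FROM users
          GROUP BY last_name)
    ORDER BY name_count DESC, last_name
    """
).fetchall()

# Keeping rank 1 returns every tied name; keeping row_num 1 would return one.
most_frequent = [name for name, _, rnk, _, _ in ranked if rnk == 1]

for row in ranked:
    print(row)
print(most_frequent)
```

With this data both tied names survive the rank filter, while row_number would keep only whichever name sorts first.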

select name
from
(select name, count(1)
from table
group by name
order by count(1) desc) a
where rownum = 1

Related

SQL statement for when the conditions are met, then pull that value. If not, pull the fallback value [duplicate]

This question already has answers here:
Select First Row of Every Group in sql [duplicate]
(2 answers)
Closed last year.
I have this list with multiple records for each person.
I want to select a specific record from it for each person with a certain condition.
If none of that person's records meets the condition, then we pull the fallback values.
For example:
In this case, I want to pull, for each person, the record with Active Status = Y and the max date.
If none of a person's records meets this condition, as with John, then it should pull the record with the max date regardless of its Active Status.
The result should be:

Using ROW_NUMBER we can try:
WITH cte AS (
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY Person
ORDER BY Status DESC, Date DESC) rn
FROM yourTable t
)
SELECT Person, Status, Mood, Date
FROM cte
WHERE rn = 1;
The ORDER BY clause used with ROW_NUMBER above places yes status records before no records. Within each of those two groups, the latest date records are placed first.
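The fallback behaviour can be checked with a small sketch like the one below, run against in-memory SQLite via Python's sqlite3 module (SQLite 3.25 or newer assumed). The rows are invented because the question's sample table is not available, and the approach relies on 'Y' sorting after 'N' and on dates stored in a sortable format such as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourTable (Person TEXT, Status TEXT, Mood TEXT, Date TEXT)"
)
# Hypothetical rows: Mary has active records, John has none.
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?, ?)",
    [
        ("Mary", "Y", "happy", "2023-01-01"),
        ("Mary", "Y", "sad", "2023-03-01"),
        ("Mary", "N", "calm", "2023-05-01"),
        ("John", "N", "tired", "2023-02-01"),
        ("John", "N", "glad", "2023-04-01"),
    ],
)

# Status DESC puts 'Y' rows first; Date DESC picks the latest within each status.
picked = conn.execute(
    """
    WITH cte AS (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY Person
                                  ORDER BY Status DESC, Date DESC) AS rn
        FROM yourTable t
    )
    SELECT Person, Status, Mood, Date
    FROM cte
    WHERE rn = 1
    ORDER BY Person
    """
).fetchall()

for row in picked:
    print(row)
```

Mary gets her latest active record even though an inactive one is newer, and John falls back to his latest record overall.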

What was the cost for the most expensive movie(s) in the collection? [duplicate]

This question already has answers here:
Oracle SELECT TOP 10 records [duplicate]
(6 answers)
Oracle SQL - How to Retrieve highest 5 values of a column [duplicate]
(5 answers)
Closed 2 years ago.
Hey guys I know the code to show the most expensive movie but what's the one that will show the most expensive and ones right below it. I think that's the question. This is the code I got for one movie.
SELECT *
FROM movie
WHERE purchase_price =
(SELECT MAX(purchase_price) FROM movie);

Since your description is a little ambiguous, you may have to try several approaches to find the one you prefer.
For example, you can use an ORDER BY clause, which returns the movies with the most expensive one first, together with a FETCH FIRST clause to keep only the top rows.
SELECT
*
FROM
movie
ORDER BY
purchase_price DESC
FETCH FIRST 2 ROWS ONLY;
There are other solutions you can try as well. You can RANK the movies by price in a subquery and then keep the top-ranked rows. Another option is to use BETWEEN with the maximum price and a lower bound (the minimum or any other value). More elaborate solutions exist, but they are harder to implement.

Rank them by price (in a subquery), and fetch the first two. Something like this:
select *
from (select m.*,
rank() over (order by m.purchase_price desc) rnk
from movie m
)
where rnk <= 2
Depending on data you have, you might also consider using ROW_NUMBER or DENSE_RANK analytic functions.

If you strictly want the two most expensive movies, you could order the query and use a fetch first clause:
SELECT *
FROM movie
ORDER BY purchase_price DESC
FETCH FIRST 2 ROWS ONLY
If multiple movies can have the same price, and you want to find all the movies with two most expensive prices, you could use the dense_rank window function:
SELECT *
FROM (SELECT movie.*, DENSE_RANK() OVER (ORDER BY purchase_price DESC) AS rk
FROM movie) m
WHERE rk <= 2
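Unlike the rank = 1 case, filtering on rk <= 2 gives different results for rank and dense_rank when the top price is shared. The sketch below shows this on made-up data, using in-memory SQLite through Python's sqlite3 module (SQLite 3.25 or newer assumed); the helper name top_two is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE movie (title TEXT, purchase_price INTEGER)")
# Hypothetical data: A and B share the highest price.
conn.executemany(
    "INSERT INTO movie VALUES (?, ?)",
    [("A", 30), ("B", 30), ("C", 25), ("D", 20)],
)


def top_two(window_function):
    """Return titles whose rank is at most 2 under the given ranking function.

    window_function is only ever one of the fixed names passed below,
    never user input, so interpolating it into the SQL text is safe here.
    """
    query = f"""
        SELECT title
        FROM (SELECT m.title,
                     {window_function}() OVER (ORDER BY m.purchase_price DESC) AS rk
              FROM movie m)
        WHERE rk <= 2
        ORDER BY title
    """
    return [title for (title,) in conn.execute(query)]


by_rank = top_two("RANK")              # ranks are 1, 1, 3, 4
by_dense_rank = top_two("DENSE_RANK")  # ranks are 1, 1, 2, 3

print(by_rank)
print(by_dense_rank)
```

Here rank skips straight from 1 to 3, so the second-highest price is lost, while dense_rank keeps every movie at the two highest prices.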

I would rather use FETCH FIRST 2 ROWS WITH TIES, which returns the two most expensive movies and also includes any further movies that share the purchase price of the last row returned:
SELECT *
FROM movie
ORDER BY purchase_price DESC
FETCH FIRST 2 ROWS WITH TIES;

Consult records for certain values of an attribute [duplicate]

This question already has answers here:
Grouped LIMIT in PostgreSQL: show the first N rows for each group?
(6 answers)
Closed 4 years ago.
I have a table with the following schema (idMovie, genre, title, rating).
How can I make a query that returns the ten films with the best rating for each genre?
I think it could possibly be solved using 'ORDER BY' and also 'LIMIT' to get the top 10 of a genre, but I do not know how to do it for each genre.
Disclaimer: I'm a newbie in SQL.

This is a typical problem called greatest-N-per-group. This normally isn't solved using order by + limit (unless you use LATERAL which is more complicated in my opinion), since as you've mentioned it is an answer to problem of greatest-N but not per group. In your case movie genre is the group.
You could use dense_rank window function to generate ranks based on rating for each genre in a subquery and then select those which are top 10:
select title, rating
from (
select title, rating, dense_rank() over (partition by genre order by rating desc) as rn
from yourtable
) t
where rn <= 10
This may return more than 10 titles for each genre, because there may be ties (the same rating for different movies belonging to one genre). If you only want top 10 without looking at ties, use row_number instead of dense_rank.
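A quick way to compare the two behaviours is a sketch like this, run on in-memory SQLite through Python's sqlite3 module (SQLite 3.25 or newer assumed). The rows are invented, and TOP_N is set to 2 instead of 10 only to keep the sample small; the title tie-breaker in the row_number variant is also an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yourtable ("
    "idMovie INTEGER PRIMARY KEY, genre TEXT, title TEXT, rating REAL)"
)
# Hypothetical data: D2 and D3 tie for second place in drama.
conn.executemany(
    "INSERT INTO yourtable (genre, title, rating) VALUES (?, ?, ?)",
    [
        ("drama", "D1", 9.0),
        ("drama", "D2", 8.5),
        ("drama", "D3", 8.5),
        ("drama", "D4", 7.0),
        ("comedy", "C1", 8.0),
        ("comedy", "C2", 6.0),
    ],
)

TOP_N = 2  # the question asks for 10

# dense_rank keeps every title tied at one of the top N ratings.
with_ties = conn.execute(
    """
    SELECT genre, title
    FROM (SELECT genre, title,
                 DENSE_RANK() OVER (PARTITION BY genre ORDER BY rating DESC) AS rn
          FROM yourtable) t
    WHERE rn <= ?
    ORDER BY genre, title
    """,
    (TOP_N,),
).fetchall()

# row_number keeps at most N titles per genre; title breaks ties deterministically.
exactly_n = conn.execute(
    """
    SELECT genre, title
    FROM (SELECT genre, title,
                 ROW_NUMBER() OVER (PARTITION BY genre
                                    ORDER BY rating DESC, title) AS rn
          FROM yourtable) t
    WHERE rn <= ?
    ORDER BY genre, title
    """,
    (TOP_N,),
).fetchall()

print(with_ties)
print(exactly_n)
```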

How to get the most frequent value in a column? [duplicate]

This question already has answers here:
how to select the most frequently appearing values? [duplicate]
(2 answers)
Closed 9 years ago.
I have a table with a column 'Price' and need to get the most frequent value. What would be the easiest way?

One option would be something like
SELECT price
FROM (SELECT price, rank() over (order by cnt desc) rnk
FROM (SELECT price, count(*) cnt
FROM your_table
GROUP BY price))
WHERE rnk = 1
If there are two (or more) prices that occur equally as often, both will be returned by this query. If you want to guarantee a single row, you'll need to tell us how you want to handle ties.

My algorithm is as follows:
Step one: make distinct selection as a collection;
Step two: foreach item in distinct collection count the items found in the original collection as diffcollection;
Step three: select max from diffcollection.
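The three steps above can be sketched without window functions: group the distinct prices, count each group, and keep the groups whose count equals the maximum count. This is one possible reading of the algorithm, run on made-up data in in-memory SQLite through Python's sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (price INTEGER)")
# Hypothetical data: 10 and 20 both occur twice.
conn.executemany(
    "INSERT INTO your_table VALUES (?)",
    [(10,), (10,), (20,), (20,), (30,)],
)

most_frequent = [
    price
    for (price,) in conn.execute(
        """
        SELECT price                      -- step one: one row per distinct price
        FROM your_table
        GROUP BY price
        HAVING COUNT(*) = (               -- step two: count rows per price
            SELECT MAX(cnt)               -- step three: keep the maximum count
            FROM (SELECT COUNT(*) AS cnt
                  FROM your_table
                  GROUP BY price)
        )
        ORDER BY price
        """
    )
]

print(most_frequent)
```

Because the comparison is against the maximum count rather than a single top row, every tied price is returned.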

Find row number in a sort based on row id, then find its neighbours

Say that I have some SELECT statement:
SELECT id, name FROM people
ORDER BY name ASC;
I have a few million rows in the people table and the ORDER BY clause can be much more complex than what I have shown here (possibly operating on a dozen columns).
I retrieve only a small subset of the rows (say rows 1..11) in order to display them in the UI. Now, I would like to solve following problems:
Find the number of a row with a given id.
Display the 5 items before and the 5 items after a row with a given id.
Problem 2 is easy to solve once I have solved problem 1, as I can then use something like this if I know that the item I was looking for has row number 1000 in the sorted result set (this is the Firebird SQL dialect):
SELECT id, name FROM people
ORDER BY name ASC
ROWS 995 TO 1005;
I also know that I can find the rank of a row by counting all of the rows which come before the one I am looking for, but this can lead to very long WHERE clauses with tons of OR and AND in the condition. And I have to do this repeatedly. With my test data, this takes hundreds of milliseconds, even when using properly indexed columns, which is way too slow.
Is there some means of achieving this by using some SQL:2003 features (such as row_number, supported in Firebird 3.0)? I am by no means an SQL guru and I need some pointers here. Could I create a cached view where the result would include a rank/dense rank/row index?

Firebird appears to support window functions (called analytic functions in Oracle). So you can do the following:
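To find the "row" number of a row with a given id, number the rows in a subquery and filter on id outside it. Window functions are evaluated after the WHERE clause, so filtering in the same query would always give 1:
select rownum
from (select id, row_number() over (partition by NULL order by name, id) as rownum
from t
) t
where id = <id>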
This assumes the id's are unique.
To solve the second problem:
select t.*
from (select id, row_number() over (partition by NULL order by name, id) as rownum
from t
) t join
(select id, rownum
from (select id, row_number() over (partition by NULL order by name, id) as rownum
from t
) n
where id = <id>
) tid
on t.rownum between tid.rownum - 5 and tid.rownum + 5
I might suggest something else, though, if you can modify the table structure. Most databases offer the ability to add an auto-increment column when a row is inserted. If your records are never deleted, this can serve as your counter, simplifying your queries.
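For the window-function approach, the key point is that the numbering has to be computed over the whole table before filtering on id. Below is a minimal sketch of both lookups on made-up data, using in-memory SQLite through Python's sqlite3 module (SQLite 3.25 or newer assumed, not Firebird); the helper names position_of and neighbours are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
# Hypothetical rows, inserted in an order unrelated to the sort order.
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Mia"), (2, "Al"), (3, "Zoe"), (4, "Bob"),
     (5, "Eve"), (6, "Cy"), (7, "Dan")],
)

# Number every row first; id is a tie-breaker so the order is total.
NUMBERED = (
    "SELECT id, name, ROW_NUMBER() OVER (ORDER BY name, id) AS rn FROM people"
)


def position_of(person_id):
    """Return the 1-based position of person_id in the sorted list, or None."""
    row = conn.execute(
        f"SELECT rn FROM ({NUMBERED}) WHERE id = ?", (person_id,)
    ).fetchone()
    return row[0] if row else None


def neighbours(person_id, span):
    """Return the target row plus up to `span` rows on each side of it."""
    query = f"""
        WITH numbered AS ({NUMBERED})
        SELECT n.id, n.name, n.rn
        FROM numbered n
        JOIN numbered target ON target.id = ?
        WHERE n.rn BETWEEN target.rn - ? AND target.rn + ?
        ORDER BY n.rn
    """
    return conn.execute(query, (person_id, span, span)).fetchall()


print(position_of(7))
print(neighbours(7, 1))
```

This still numbers the full table on every call, so it illustrates correctness rather than a fix for the performance concern raised in the question.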
