Get most frequent value with SQL query - sql

I'm trying to write an SQL query where I find the value that occurs the most frequently.
So far, I have this:
SELECT GENRE, COUNT(*) AS Frequency
FROM BooksRead
GROUP BY GENRE
This gives me output like this:
Anthropological 1
Biography 7
Crime 4
Essay 2
I want the returned result to be 7. I've tried using TOP 1 but my Java compiler doesn't seem to like it.

The ANSI SQL syntax would be:
SELECT GENRE, COUNT(*) AS Frequency
FROM BooksRead
GROUP BY GENRE
ORDER BY COUNT(*) DESC
FETCH FIRST 1 ROW ONLY;
Not all databases support that syntax. Many support LIMIT:
SELECT GENRE, COUNT(*) AS Frequency
FROM BooksRead
GROUP BY GENRE
ORDER BY COUNT(*) DESC
LIMIT 1;
However, the exact syntax depends on the database you are using.
You can also use ANSI standard window functions:
SELECT *
FROM (SELECT GENRE, COUNT(*) AS Frequency,
ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC) as seqnum
FROM BooksRead
GROUP BY GENRE
) g
WHERE seqnum = 1;
If you want ties then use RANK() instead of ROW_NUMBER().
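To try the LIMIT form without a database server, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory table. The rows reproduce the counts shown in the question; the helper name most_frequent_genre is made up for illustration, and SQLite is only one of the databases that accept LIMIT.

```python
import sqlite3


def most_frequent_genre(conn):
    """Return (genre, frequency) for the most common genre, or None for an empty table."""
    return conn.execute(
        """
        SELECT GENRE, COUNT(*) AS Frequency
        FROM BooksRead
        GROUP BY GENRE
        ORDER BY COUNT(*) DESC
        LIMIT 1
        """
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE BooksRead (GENRE TEXT)")

# Same frequencies as the sample output in the question.
sample = {"Anthropological": 1, "Biography": 7, "Crime": 4, "Essay": 2}
conn.executemany(
    "INSERT INTO BooksRead (GENRE) VALUES (?)",
    [(genre,) for genre, n in sample.items() for _ in range(n)],
)

top = most_frequent_genre(conn)
print(top)  # ('Biography', 7)
```

If only the number is wanted, as in the question, take top[1].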

Related

LIMITING MAX VALUES IN SQL

I am completely rewriting this question; I just can't crack it.
IBM DB2 SQL
(from a Chicago Crime Dataset)
Which community area is most crime prone?
When I use this code, it does correctly count and sort the data
select community_area_number as community_area_number, count(community_area_number) as total_area_crime
from chicago_crime_data
group by community_area_number
order by total_area_crime desc;
The problem is that it lists all the data in descending order, but no matter what MAX expression I use, in either the SELECT list or the ORDER BY clause, it won't show just the max values.
The max value is 43, so I would like it to show both community_area_number values that have 43.
Instead it shows the entire list.
Also, yes, I understand I can just use LIMIT 2, but that would be cheating since I manually checked that there are 2 max values; if the data changed or I didn't know that, it wouldn't solve anything.
Thanks in advance.
What you are looking for is the standard SQL clause FETCH FIRST ROW WITH TIES:
select community_area_number, count(*) as total_area_crime
from chicago_crime_data
group by community_area_number
order by total_area_crime desc
fetch first row with ties;
Unfortunately, though, DB2 doesn't support WITH TIES in FETCH FIRST.
The classic way (that is before we had the window functions RANK and DENSE_RANK) is to use a subquery: Get the maximum value, then get all rows with that maximum. I am using a CTE (aka WITH clause) here in order not to have to write everything twice.
with counted as
(
select community_area_number, count(*) as total_area_crime
from chicago_crime_data
group by community_area_number
)
select community_area_number, total_area_crime
from counted
where total_area_crime = (select max(total_area_crime) from counted);
(Please note that this is a mere COUNT(*), because we want to count rows per community_area_number.)
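The subquery approach can be checked on a small scale with the sketch below, which uses Python's sqlite3 module rather than DB2. The table contents are invented for illustration (two areas tied at 3 rows instead of 43), and the final ORDER BY is only there to make the printed order repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE chicago_crime_data ("
    "id INTEGER PRIMARY KEY, community_area_number INTEGER)"
)

# Hypothetical data: areas 25 and 8 tie for the maximum with 3 rows each.
areas = [25, 25, 25, 8, 8, 8, 3]
conn.executemany(
    "INSERT INTO chicago_crime_data (community_area_number) VALUES (?)",
    [(a,) for a in areas],
)

tied_max = conn.execute(
    """
    WITH counted AS (
        SELECT community_area_number, COUNT(*) AS total_area_crime
        FROM chicago_crime_data
        GROUP BY community_area_number
    )
    SELECT community_area_number, total_area_crime
    FROM counted
    WHERE total_area_crime = (SELECT MAX(total_area_crime) FROM counted)
    ORDER BY community_area_number
    """
).fetchall()

print(tied_max)  # [(8, 3), (25, 3)]
```

Both tied areas come back without anyone having to know in advance how many ties there are.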
As @topsail mentioned, you could use a rank function.
From the table you have above, you could do the following:
SELECT t.* FROM
(
SELECT table1.*,
RANK() OVER (Order by Total_Area_Crime DESC) rnk
from
table1
)t
WHERE t.rnk = 1
So your full query should look something like this:
WITH cte AS (
SELECT COMMUNITY_AREA_NUMBER,
COUNT(COMMUNITY_AREA_NUMBER) AS TOTAL_AREA_CRIME
FROM CHICAGO_CRIME_DATA
GROUP BY COMMUNITY_AREA_NUMBER
)
SELECT t.* FROM
(
SELECT cte.*,
RANK() OVER (ORDER BY TOTAL_AREA_CRIME DESC) rnk
FROM
cte
) t
WHERE t.rnk = 1;
It turns out the professor did want us to use the Limit command.
Here is the final answer:
SELECT COMMUNITY_AREA_NUMBER, COUNT(ID) AS CRIMES_RECORDED
FROM CHICAGO_CRIME_DATA
GROUP BY COMMUNITY_AREA_NUMBER
ORDER BY CRIMES_RECORDED DESC LIMIT 1;
thanks to all those who responded :D

I get an error when I run my query even though the syntax is correct

I have a table named CUSTOMERS with a column COUNTRY. I want to retrieve the country that has the most customers, in other words the most frequent COUNTRY in table CUSTOMERS.
I get the following error message:
ORA-00904: "COUNTRY": invalid identifier
My code:
SELECT
COUNTRY,
COUNT(COUNTRY) AS `value_occurrence`
FROM
CUSTOMERS
GROUP BY
COUNTRY
ORDER BY
`value_occurrence` DESC
LIMIT 1;
Your syntax is MySQL-specific (backtick-quoted identifiers and LIMIT), which is not portable. MySQL is a different DBMS from Oracle.
Here is your query in standard SQL. It works in Oracle as of version 12c.
select country, count(*) as value_occurrence
from customers
group by country
order by value_occurrence desc
fetch first row only;
In earlier Oracle versions you can use:
select country, value_occurrence
from
(
select
country,
count(*) as value_occurrence,
row_number() over (order by count(*) desc) as rn
from customers
group by country
)
where rn = 1;
If you want to allow for ties, change ONLY to WITH TIES in the first query, and ROW_NUMBER to RANK or DENSE_RANK in the second.

SQL plus, top 3 rank across two tables

I'm trying to find a way to query the top three users in a database in terms of number of listens and output their user ID and their rank.
The schema for the two tables in question is as follows :
User(user_id, email, first_name, last_name, password, created_on, last_sign_in)
PreviouslyPlayed(user_id, track_id, timestamp)
I can see how many people would pull this off with a count query, but I am wondering if there's a way to do this with a rank or dense rank.
If you just want the user id and are using Oracle 12c+, then you can do:
select pp.user_id, rank() over (order by count(*) desc) as therank
from previouslyplayed pp
group by pp.user_id
order by count(*) desc
fetch first 3 rows only;
In earlier versions, you would use a subquery:
select pp.*
from (select pp.user_id, rank() over (order by count(*) desc) as therank
from previouslyplayed pp
group by pp.user_id
) pp
where therank <= 3;
You might want to review row_number(), rank(), and dense_rank() to be sure you are getting what you really want (the difference is in how they handle ties).
You only need the join if you are concerned that something called user_id in one table is not a valid user id. That seems unlikely, in any well-designed database.
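To see how the three functions differ on ties, here is a small sketch using Python's sqlite3 module instead of Oracle. It assumes the bundled SQLite is 3.25 or newer (the first release with window functions), and the listen data is invented: users 1 and 2 tie with three listens each.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE previouslyplayed (user_id INTEGER, track_id INTEGER, timestamp TEXT)"
)

# Hypothetical listens: users 1 and 2 tie at 3, user 3 has 2, user 4 has 1.
listens = {1: 3, 2: 3, 3: 2, 4: 1}
conn.executemany(
    "INSERT INTO previouslyplayed (user_id, track_id, timestamp) VALUES (?, ?, ?)",
    [(uid, t, "2020-01-01") for uid, n in listens.items() for t in range(n)],
)

# Requires SQLite 3.25+ for window functions.
rows = conn.execute(
    """
    WITH counted AS (
        SELECT user_id, COUNT(*) AS listens
        FROM previouslyplayed
        GROUP BY user_id
    )
    SELECT user_id,
           listens,
           ROW_NUMBER() OVER (ORDER BY listens DESC, user_id) AS rn,
           RANK()       OVER (ORDER BY listens DESC)          AS rnk,
           DENSE_RANK() OVER (ORDER BY listens DESC)          AS drnk
    FROM counted
    ORDER BY listens DESC, user_id
    """
).fetchall()

for row in rows:
    print(row)
# (1, 3, 1, 1, 1)
# (2, 3, 2, 1, 1)
# (3, 2, 3, 3, 2)
# (4, 1, 4, 4, 3)
```

With this data, filtering on rnk <= 3 returns three users, drnk <= 3 returns all four, and rn <= 3 returns exactly three rows regardless of ties.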

Selecting top 10 counts in SQLite

I have a table which records questions, their answers, and their authors. The column names are as follows:
id, question, answers, author
I would like to get a list of top 10 authors who have written the most questions. So it would need to first count the number of questions each author has written then sort them by the count then return the top 10.
This is in SQLite and I'm not exactly sure how to get the list of counts. The second part should be fairly simple as it's just an ORDER BY and a LIMIT 10. How can I get the counts into a list which I can select from?
SELECT COUNT(author)
,author
FROM table_name
GROUP BY author
ORDER BY COUNT(author) DESC LIMIT 10;
You can apply an order by clause to an aggregate query:
SELECT author, COUNT(*)
FROM mytable
GROUP BY author
ORDER BY 2 DESC
LIMIT 10
You could wrap your query as a subquery and then use LIMIT like this:
SELECT *
FROM (
SELECT author
,COUNT(*) AS cnt
FROM mytable
GROUP BY author
) t
ORDER BY t.cnt DESC
LIMIT 10;
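Since the question is about SQLite, the aggregate query can be exercised directly from Python's sqlite3 module. This is a sketch only: the helper top_authors, the table name mytable, and the sample rows are assumptions for illustration, and the extra sort on author is added just to make ties come out in a repeatable order.

```python
import sqlite3


def top_authors(conn, limit=10):
    """Return up to `limit` (author, question_count) pairs, most prolific first."""
    return conn.execute(
        """
        SELECT author, COUNT(*) AS cnt
        FROM mytable
        GROUP BY author
        ORDER BY cnt DESC, author
        LIMIT ?
        """,
        (limit,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable ("
    "id INTEGER PRIMARY KEY, question TEXT, answers TEXT, author TEXT)"
)

# Hypothetical rows: alice wrote 3 questions, bob 2, carol 1.
rows = [
    ("q1", "a1", "alice"),
    ("q2", "a2", "bob"),
    ("q3", "a3", "alice"),
    ("q4", "a4", "carol"),
    ("q5", "a5", "alice"),
    ("q6", "a6", "bob"),
]
conn.executemany(
    "INSERT INTO mytable (question, answers, author) VALUES (?, ?, ?)", rows
)

print(top_authors(conn, 2))  # [('alice', 3), ('bob', 2)]
```

Calling top_authors(conn) with the default limit returns the top 10, or fewer if the table has fewer distinct authors.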

SQL ORDER BY with column number or aggregate function

I get 2 different results when I try ORDER BY with a column number and with an aggregate function. What is the difference between these 2 methods? (I thought they'd have the same output.)
List the 1978 films by order of cast list size. There are 3 tables:
movie(id, title, yr, director)
actor(id, name)
casting(movieid, actorid, ord)
Answer 1 using ORDER BY with column number:
SELECT title
,COUNT(a.id)
FROM movie m
,casting c
,actor a
WHERE m.id=movieid
AND a.id=actorid
AND yr=1978
GROUP BY title
ORDER BY 2 DESC
Answer 2 using ORDER BY with the aggregate function COUNT(a.id). Everything is the same except the last line:
...
ORDER BY COUNT(a.id) DESC
I would suggest making a subquery and ordering that for a clean result.
I think Jester has it. Here's an example for you:
select title, actor_count from (
SELECT title
,COUNT(a.id) actor_count
FROM movie m
,casting c
,actor a
WHERE m.id=movieid
AND a.id=actorid
AND yr=1978
GROUP BY title) subquery
order by actor_count desc
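The thread does not say why the two outputs differed. One possible cause (an assumption, not confirmed here) is that films with the same cast size can legally come back in any order, so two runs may disagree on tied rows. The sketch below, using Python's sqlite3 module and invented data, shows that once a tie-breaker is added the positional and aggregate forms return the same list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE movie (id INTEGER PRIMARY KEY, title TEXT, yr INTEGER, director INTEGER);
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);

    -- Hypothetical data: films A and B tie with two cast members each.
    INSERT INTO movie VALUES (1, 'A', 1978, 0), (2, 'B', 1978, 0),
                             (3, 'C', 1978, 0), (4, 'D', 1980, 0);
    INSERT INTO actor VALUES (1, 'x'), (2, 'y'), (3, 'z');
    INSERT INTO casting VALUES (1, 1, 1), (1, 2, 2),
                               (2, 1, 1), (2, 2, 2),
                               (3, 3, 1),
                               (4, 1, 1), (4, 2, 2), (4, 3, 3);
    """
)

base = """
    SELECT title, COUNT(a.id)
    FROM movie m
    JOIN casting c ON m.id = c.movieid
    JOIN actor a ON a.id = c.actorid
    WHERE yr = 1978
    GROUP BY title
"""

# title is the tie-breaker that makes the order deterministic.
by_position = conn.execute(base + " ORDER BY 2 DESC, title").fetchall()
by_aggregate = conn.execute(base + " ORDER BY COUNT(a.id) DESC, title").fetchall()

print(by_position)   # [('A', 2), ('B', 2), ('C', 1)]
print(by_aggregate)  # [('A', 2), ('B', 2), ('C', 1)]
```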