How to count occurrences in SQL

I would like to count how many times an id is present in a table, then print the title associated with this id from another table (next to the amount of occurrences). I also want to only return the top 10 in descending order.
So far I could only manage to return the total number of occurrences.

You are missing GROUP BY:
SELECT b.title, b.book_id,
COUNT(*)
FROM books b INNER JOIN
students_books sb
ON b.book_id = sb.book_id
GROUP BY b.title, b.book_id
ORDER BY COUNT(*) DESC
LIMIT 10;
I also added table aliases. These generally make the query easier to write and to read.
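To see the grouped count in action, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. The sample rows, the student_id column and the times_borrowed alias are made up for illustration; only the books and students_books table names and the book_id and title columns come from the question.

```python
import sqlite3

# Throwaway in-memory database with invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE books (book_id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE students_books (student_id INTEGER, book_id INTEGER);
    INSERT INTO books VALUES (1, 'Dune'), (2, 'Emma'), (3, 'Ulysses');
    INSERT INTO students_books VALUES
        (10, 1), (11, 1), (12, 1),  -- book 1 appears three times
        (10, 2), (11, 2),           -- book 2 appears twice
        (12, 3);                    -- book 3 appears once
""")

# One output row per book, most frequent first, at most 10 rows.
top_books = conn.execute("""
    SELECT b.title, b.book_id, COUNT(*) AS times_borrowed
    FROM books b
    INNER JOIN students_books sb ON b.book_id = sb.book_id
    GROUP BY b.title, b.book_id
    ORDER BY times_borrowed DESC
    LIMIT 10
""").fetchall()

print(top_books)
```

Without the GROUP BY, COUNT(*) collapses the whole join into a single row, which is the "total number of occurrences" the question describes.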

Related

Not getting 0 value in SQL count aggregate by inner join

I am using the basic chinook database and I am trying to get a query that will display the worst selling genres. I am mostly getting the answer, however there is one genre 'Opera' that has 0 sales, but the query result is ignoring that and moving on to the next lowest non-zero value.
I tried using left join instead of inner join but that returns different values.
This is my query currently:
create view max
as
select distinct
t1.name as genre,
count(*) as Sales
from
tracks t2
inner join
invoice_items t3 on t2.trackid == t3.trackid
left join
genres as t1 on t1.genreid == t2.genreid
group by
t1.genreid
order by
2
limit 10;
The result however skips past the opera value which is 0 sales. How can I include that? I tried using left join but it yields different results.
Any help is appreciated.
If you want to include genres with no sales then you should start the joins from genres and then do LEFT joins to the other tables.
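Also, you should not use count(*) here: it counts every row in the result set, including the NULL-extended rows that a LEFT JOIN produces when there is no match, so a genre with no sales would never show 0. COUNT(i.trackid) counts only the rows where i.trackid is not NULL.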
SELECT g.name Genre,
COUNT(i.trackid) Sales
FROM genres g
LEFT JOIN tracks t ON t.genreid = g.genreid
LEFT JOIN invoice_items i ON i.trackid = t.trackid
GROUP BY g.genreid
ORDER BY Sales LIMIT 10;
There is no need for the keyword DISTINCT, since the query returns 1 row for each genre.
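The difference between COUNT(*) and COUNT(column) after a LEFT JOIN can be checked with a small sketch. It uses an in-memory SQLite database with a cut-down, invented version of the chinook tables (two genres, three tracks, three invoice lines); the helper name worst_sellers and the invoicelineid column are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genres (genreid INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tracks (trackid INTEGER PRIMARY KEY, genreid INTEGER);
    CREATE TABLE invoice_items (invoicelineid INTEGER PRIMARY KEY, trackid INTEGER);
    INSERT INTO genres VALUES (1, 'Rock'), (2, 'Opera');
    INSERT INTO tracks VALUES (1, 1), (2, 1), (3, 2);         -- track 3 is the only Opera track
    INSERT INTO invoice_items VALUES (1, 1), (2, 1), (3, 2);  -- track 3 was never sold
""")

def worst_sellers(count_expr):
    """Run the genre query with the given counting expression (hypothetical helper)."""
    return conn.execute(f"""
        SELECT g.name, {count_expr} AS sales
        FROM genres g
        LEFT JOIN tracks t ON t.genreid = g.genreid
        LEFT JOIN invoice_items i ON i.trackid = t.trackid
        GROUP BY g.genreid, g.name
        ORDER BY sales
        LIMIT 10
    """).fetchall()

by_column = worst_sellers("COUNT(i.trackid)")  # ignores the NULL-extended row
by_star = worst_sellers("COUNT(*)")            # counts the NULL-extended row too

print(by_column)
print(by_star)
```

With COUNT(i.trackid) Opera is reported with 0 sales; with COUNT(*) the same genre is reported with 1, because the unmatched Opera track still contributes one row to the join.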
When asking for the top n one must always state how to deal with ties. If I am looking for the top 1, but there are three rows in the table, all with the same value, shall I select 3 rows? Zero rows? One row arbitrarily chosen? Most often we don't want arbitrary results, which excludes the last option. This excludes LIMIT, too, because LIMIT has no clause for ties in SQLite.
Here is an example with DENSE_RANK instead. You are looking for the worst selling genres, so we must probably look at the revenue per genre, which is the sum of price x quantity sold. In order to include genres without invoices (and maybe even without tracks?) we outer join this data to the genre table.
select total, genre_name
from
(
select
g.name as genre_name,
coalesce(sum(ii.unit_price * ii.quantity), 0) as total,
dense_rank() over (order by coalesce(sum(ii.unit_price * ii.quantity), 0)) as rnk
from genres g
left join tracks t on t.genreid = g.genreid
left join invoice_items ii on ii.trackid = t.trackid
group by g.name
) aggregated
where rnk <= 10
order by total, genre_name;
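The tie handling can be illustrated on a tiny table. This is only a sketch: genre_totals is a made-up, already aggregated table standing in for the subquery above, and window functions need SQLite 3.25 or newer, which is assumed to be the version bundled with your Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE genre_totals (name TEXT, total REAL);
    -- Opera and Jazz tie for the lowest total.
    INSERT INTO genre_totals VALUES ('Opera', 0), ('Jazz', 0), ('Rock', 5), ('Pop', 9);
""")

# LIMIT 1 returns exactly one row, so one of the tied genres is dropped.
limited = conn.execute(
    "SELECT name, total FROM genre_totals ORDER BY total LIMIT 1"
).fetchall()

# DENSE_RANK gives tied rows the same rank, so both lowest genres survive.
ranked = conn.execute("""
    SELECT name, total
    FROM (
        SELECT name, total, DENSE_RANK() OVER (ORDER BY total) AS rnk
        FROM genre_totals
    ) AS ranked_totals
    WHERE rnk <= 1
    ORDER BY total, name
""").fetchall()

print(limited)
print(ranked)
```

Which of the two tied genres LIMIT 1 returns is up to the database, whereas the ranked query always returns both.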

Select entire row on group by aggregation

I'm having some struggle with something that should be a simple SQL query.
This is my initial database schema:
Also prepared the example in SQLFiddle
The query I've ended up with is:
select
b.ad_id,
b.auction_id,
max(b.amount) as max,
max(b.created_at) created_at
from bid b
where b.user_id = '601'
group by b.ad_id, b.auction_id
But in the result, I need the whole row from the bid table:
select
b.id,
b.ad_id,
b.auction_id,
max(b.amount) as max,
max(b.created_at) created_at
from bid b
where b.user_id = '601'
group by b.ad_id, b.auction_id
Which fails with: [42803] ERROR: column "b.id" must appear in the GROUP BY clause or be used in an aggregate function Position: 16. Cannot add the id field in the GROUP BY clause, because it will add some extra rows I don't need.
What I need is to select from the bid table the highest record (amount field) grouped by auction_id and ad_id.
I think I need to make some self inner join or subselect but right now I'm not able to write the SQL.
What I need is to select from the bid table the highest record (amount field) grouped by auction_id and ad_id
Take a look at DISTINCT ON in the docs. Your desired result would be obtained by the following query.
select DISTINCT ON (b.ad_id, b.auction_id)
b.id,
b.ad_id,
b.auction_id,
b.amount,
b.created_at
from bid b
where b.user_id = '601'
ORDER BY b.ad_id, b.auction_id, b.amount DESC
If you want the most recent row for each auction for the given user, then you can use a correlated subquery to filter:
select b.*
from bid b
where b.user_id = '601' and
b.created_at = (select max(b2.created_at)
from bid b2
where b2.auction_id = b.auction_id and
b2.user_id = b.user_id
);
This seems like a sensible interpretation of what you want. I don't know if ad_id is needed.
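DISTINCT ON is specific to PostgreSQL. As a hedged sketch of the same "whole row with the highest amount per group" idea, the example below swaps in ROW_NUMBER(), which also runs on SQLite 3.25 or newer, and contrasts it with the plain GROUP BY query. The bid rows are invented; the column names are taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bid (
        id INTEGER PRIMARY KEY, ad_id INTEGER, auction_id INTEGER,
        user_id TEXT, amount INTEGER, created_at TEXT
    );
    INSERT INTO bid VALUES
        (1, 1, 100, '601', 10, '2020-01-01'),
        (2, 1, 100, '601', 30, '2020-01-02'),  -- highest bid of user 601 in this group
        (3, 1, 100, '601', 20, '2020-01-03'),  -- latest bid, but not the highest
        (4, 2, 200, '601',  5, '2020-01-04'),
        (5, 1, 100, '602', 99, '2020-01-05');  -- other user, filtered out
""")

# Plain aggregation: each MAX() is computed independently, so the
# amount and the timestamp may come from different rows.
grouped = conn.execute("""
    SELECT ad_id, auction_id, MAX(amount), MAX(created_at)
    FROM bid
    WHERE user_id = '601'
    GROUP BY ad_id, auction_id
    ORDER BY ad_id, auction_id
""").fetchall()

# Number the rows inside each (ad_id, auction_id) group by amount and keep the first one.
top_rows = conn.execute("""
    SELECT id, ad_id, auction_id, amount, created_at
    FROM (
        SELECT b.*,
               ROW_NUMBER() OVER (
                   PARTITION BY b.ad_id, b.auction_id
                   ORDER BY b.amount DESC
               ) AS rn
        FROM bid b
        WHERE b.user_id = '601'
    ) AS numbered
    WHERE rn = 1
    ORDER BY ad_id, auction_id
""").fetchall()

print(grouped)
print(top_rows)
```

For the first group the aggregated query pairs amount 30 with the timestamp of a different bid, while the ROW_NUMBER() version returns bid 2 intact.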

Subtracting values of columns from two different tables

I would like to take values from one table column and subtract those values from another column from another table.
I was able to achieve this by joining those tables and then subtracting both columns from each other.
Data from first table:
SELECT max_participants FROM courses ORDER BY id;
Data from second table:
SELECT COUNT(id) FROM participations GROUP BY course_id ORDER BY course_id;
Here is some code:
SELECT max_participants - participations AS free_places FROM
(
SELECT max_participants, COUNT(participations.id) AS participations
FROM courses
INNER JOIN participations ON participations.course_id = courses.id
GROUP BY courses.max_participants, participations.course_id
ORDER BY participations.course_id
) AS course_places;
In general it works, but I was wondering if there is some way to make it simpler, or whether my approach is incorrect and this code will fail under some conditions. Maybe it needs to be optimized.
I've read that you should not rely on the natural order of a result set in databases, and that made me doubt my approach.
If you want the values per course, I would recommend:
SELECT c.id, (c.max_participants - COUNT(p.id)) AS free_places
FROM courses c LEFT JOIN
participations p
ON p.course_id = c.id
GROUP BY c.id, c.max_participants
ORDER BY 1;
Note the LEFT JOIN to be sure all courses are included, even those with no participants.
The overall number is a little trickier. One method is to use the above as a subquery. Alternatively, you can pre-aggregate each table:
select c.max_participants - p.num_participants
from (select sum(max_participants) as max_participants from courses) c cross join
(select count(*) as num_participants from participations) p;
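Both queries can be tried on a small in-memory SQLite database. The three courses and three participations below are invented; course 3 deliberately has no participants so that the effect of the LEFT JOIN is visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE courses (id INTEGER PRIMARY KEY, max_participants INTEGER);
    CREATE TABLE participations (id INTEGER PRIMARY KEY, course_id INTEGER);
    INSERT INTO courses VALUES (1, 10), (2, 5), (3, 8);
    INSERT INTO participations VALUES (1, 1), (2, 1), (3, 2);  -- nobody in course 3
""")

# Free places per course; the LEFT JOIN keeps course 3 with a count of 0.
per_course = conn.execute("""
    SELECT c.id, (c.max_participants - COUNT(p.id)) AS free_places
    FROM courses c
    LEFT JOIN participations p ON p.course_id = c.id
    GROUP BY c.id, c.max_participants
    ORDER BY c.id
""").fetchall()

# Overall free places: aggregate each table first, then combine the two single rows.
overall = conn.execute("""
    SELECT c.max_participants - p.num_participants
    FROM (SELECT SUM(max_participants) AS max_participants FROM courses) c
    CROSS JOIN (SELECT COUNT(*) AS num_participants FROM participations) p
""").fetchone()[0]

print(per_course)
print(overall)
```

With an INNER JOIN, course 3 would disappear from the per-course result, which is the kind of condition the question was worried about.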

How to Display the name of singer who has the maximum number of songs

Hi, I am a beginner and I want to show the name of the singer who has the maximum number of songs in the Songs table, but I failed because a subquery cannot return two values at a time. How can I solve this problem? The code below produces this error: Only one expression can be specified in the select list when the subquery is not introduced with EXISTS.
Here is my code:
SELECT Singer_Name
FROM Singer
WHERE Singer_id IN(SELECT TOP 1 Singer.Singer_id,COUNT(SongTitle) TotalSounds
FROM Singer,Songs
WHERE Songs.Singer_id=Singer.Singer_id
GROUP BY Singer.Singer_id
ORDER BY TotalSounds DESC)
This should do the trick:
SELECT TOP 1 n.Singer_Name, count(*) as Song_Count
FROM Songs s
INNER JOIN Singer n on n.Singer_id = s.Singer_id
GROUP BY n.Singer_id, n.Singer_Name
ORDER BY count(*) DESC
I added n.Singer_id to the group by on the off-chance that two singers could have the same name.
I hope this helps.
You were close to a workable solution. The issue is you were doing the subquery in the where clause, trying to limit your results that way, where you could just return the full list of names and number of songs and just pick the top one after ordering by the count:
SELECT TOP 1 Singer_Name FROM (SELECT Singer.Singer_Name, count(1) as TotalSounds
FROM Singer
JOIN Songs
ON Songs.Singer_id=Singer.Singer_id
GROUP BY Singer.Singer_Name, Singer.Singer_id) ss
ORDER BY TotalSounds DESC
If you have a tie for first place and want to return both names, make it TOP 1 WITH TIES. Otherwise it will just grab one of the tied singers, and without an additional tie-breaker in the ORDER BY, which one you get is not guaranteed.
You can use group by statement like this:
SELECT TOP 1 Singer_Name, count(SongTitle) c FROM Singer
join Songs on Songs.singer_id = Singer.singer_id
group by Singer_Name
ORDER BY c desc;
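If you want to experiment with the tie case outside SQL Server, here is a rough sketch in SQLite, which has neither TOP nor WITH TIES. It swaps in a different technique: keep every singer whose song count equals the maximum count. The sample singers and songs are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Singer (Singer_id INTEGER PRIMARY KEY, Singer_Name TEXT);
    CREATE TABLE Songs (SongTitle TEXT, Singer_id INTEGER);
    INSERT INTO Singer VALUES (1, 'Ana'), (2, 'Ben'), (3, 'Cy');
    -- Ana and Ben both have two songs, Cy has one.
    INSERT INTO Songs VALUES ('A1', 1), ('A2', 1), ('B1', 2), ('B2', 2), ('C1', 3);
""")

# Keep every singer whose count equals the highest count, so ties are all returned.
top_singers = conn.execute("""
    SELECT n.Singer_Name, COUNT(*) AS Song_Count
    FROM Songs s
    INNER JOIN Singer n ON n.Singer_id = s.Singer_id
    GROUP BY n.Singer_id, n.Singer_Name
    HAVING COUNT(*) = (
        SELECT MAX(cnt)
        FROM (SELECT COUNT(*) AS cnt FROM Songs GROUP BY Singer_id) AS per_singer
    )
    ORDER BY n.Singer_Name
""").fetchall()

print(top_singers)
```

Both tied singers come back, which is the result TOP 1 WITH TIES is meant to give on SQL Server.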

Problem With DISTINCT!

Here is my query:
SELECT
DISTINCT `c`.`user_id`,
`c`.`created_at`,
`c`.`body`,
(SELECT COUNT(*) FROM profiles_comments c2 WHERE c2.user_id = c.user_id AND c2.profile_id = 1) AS `comments_count`,
`u`.`username`,
`u`.`avatar_path`
FROM `profiles_comments` AS `c` INNER JOIN `users` AS `u` ON u.id = c.user_id
WHERE (c.profile_id = 1) ORDER BY `u`.`id` DESC;
It works. The problem though is with the DISTINCT word. As I understand it, it should select only one row per c.user_id.
But what I get is even 4-5 rows with the same c.user_id column. Where is the problem?
Actually, DISTINCT does not limit itself to one column. Basically, when you say:
SELECT DISTINCT a, b
What you're saying is "give me the distinct combinations of a and b", just like a multi-column UNIQUE index.
DISTINCT ensures that the combination of ALL values in your select clause is unique, not just user_id. If you want to limit the results to individual user_ids, you should group by user_id.
Perhaps what you want is:
SELECT
`c`.`user_id`,
`u`.`username`,
`u`.`avatar_path`,
(SELECT COUNT(*) FROM profiles_comments c2 WHERE c2.user_id = c.user_id AND c2.profile_id = 1) AS `comments_count`
FROM `profiles_comments` AS `c` INNER JOIN `users` AS `u` ON u.id = c.user_id
WHERE (c.profile_id = 1)
GROUP BY `c`.`user_id`,
`u`.`username`,
`u`.`avatar_path`
ORDER BY `u`.`id` DESC;
DISTINCT works at a row level, not just a column level
If you want distinct values of only one column, then you will have to group by that column and aggregate the rest of the columns returned (MIN, MAX, SUM, AVG, etc.):
SELECT Name, MIN(ID)
FROM MyTable
GROUP BY Name
DISTINCT returns only unique rows; it will not return only 1 row per user id in your example.
http://dev.mysql.com/doc/refman/5.0/en/distinct-optimization.html
You misunderstand. The DISTINCT modifier applies to the entire row — it states that no two identical ROWS will be returned in the result set.
Looking at your SQL, what value of the several available do you expect to see returned in the created_at column (for instance)? It would be impossible to predict the results of the query as written.
Also, you're using profiles_comments twice in your SELECT. It appears that you're trying to obtain a count of how many times each user has commented. If so, what you want to do is use an AGGREGATE query, grouped on user_id and including only those columns that uniquely identify a user along with a COUNT of the comments:
SELECT user_id, COUNT(*) FROM profiles_comments WHERE profile_id = 1 GROUP BY user_id
You can add the join to users to get the user name if you want but, logically, your result set cannot include other columns from profiles_comments and still produce only a single row per user_id unless those columns are also aggregated in some way:
SELECT user_id, MIN(created_at) AS Earliest, MAX(created_at) AS Latest, COUNT(*) FROM profiles_comments WHERE profile_id = 1 GROUP BY user_id
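The row-level behaviour of DISTINCT is easy to check on a few rows. The sketch below uses SQLite from Python with an invented, cut-down profiles_comments table (no join to users) and compares DISTINCT with the grouped query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profiles_comments (
        user_id INTEGER, profile_id INTEGER, created_at TEXT, body TEXT
    );
    INSERT INTO profiles_comments VALUES
        (1, 1, '2020-01-01', 'first'),
        (1, 1, '2020-01-02', 'second'),  -- same user, different body
        (2, 1, '2020-01-03', 'hello');
""")

# DISTINCT compares whole rows: user 1 appears twice because the bodies differ.
distinct_rows = conn.execute("""
    SELECT DISTINCT user_id, body
    FROM profiles_comments
    WHERE profile_id = 1
    ORDER BY user_id, body
""").fetchall()

# GROUP BY user_id yields one row per user; the other columns must be aggregated.
grouped_rows = conn.execute("""
    SELECT user_id, MIN(created_at) AS Earliest, MAX(created_at) AS Latest, COUNT(*)
    FROM profiles_comments
    WHERE profile_id = 1
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()

print(distinct_rows)
print(grouped_rows)
```

The DISTINCT query returns three rows for two users, while the grouped query returns exactly one row per user_id.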