How to use results of one SQL query within a sub-query

I have to answer the following question
"For each year in the database, list the year and the total number of
movies that were released in that year, showing these totals in
decreasing order. That is, the year(s) with the largest number of
movies appear first. If some years have the same number of movies,
show these in increasing order of year."
Currently I am using the code below to get the movies to group together, but am unable to get them to sort:
Select YearReleased, count(*)
from Movies
group by YearReleased
I wish to use something to order this and am trying to make a sub query that uses the results of the first query, along the lines of:
(select * from results order by count(*))
but so far I have been unsuccessful. How do I achieve this or is there a better way of getting the results in that order?

"Unsuccessful" isn't very useful, as opposed to actual error text -- and you aren't telling us which vendor's database you're running against, so we can't test. That said, the following should work:
select
YearReleased,
count(*) as movie_count
from movies
group by YearReleased
order by movie_count desc, YearReleased;
No subqueries needed!
Validated against SQLite 3.5.9. SQLite follows the standard except in very explicitly documented ways, so if you are running against something less standards-compliant, your mileage may vary.
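To make the ordering easy to check, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The table contents are made up for illustration, and the Title column is an assumption (the question only mentions YearReleased).

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (Title TEXT, YearReleased INTEGER)")
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?)",
    [("A", 2001), ("B", 2001), ("C", 1999), ("D", 1999), ("E", 2005)],
)

# Largest count first; ties are broken by increasing year.
rows = conn.execute(
    """
    SELECT YearReleased, COUNT(*) AS movie_count
    FROM Movies
    GROUP BY YearReleased
    ORDER BY movie_count DESC, YearReleased
    """
).fetchall()
print(rows)  # [(1999, 2), (2001, 2), (2005, 1)]
```

1999 and 2001 both have two movies, so they come first in increasing order of year, followed by 2005 with one.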

select *
from
(
Select YearReleased, count(*) as NumReleased
from Movies
group by YearReleased
) as t
order by NumReleased desc, YearReleased

select * from (select YearReleased, count(*) counter
from Movies
group by YearReleased
) a order by counter desc, YearReleased
May need a syntax change depending on your sql flavour.
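If you do want the subquery form, a quick sketch like the one below (Python's sqlite3, hypothetical sample rows, Title column assumed) suggests that the derived table and the plain ORDER BY give the same rows, so the nesting is a matter of taste here rather than a requirement.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Movies (Title TEXT, YearReleased INTEGER)")
conn.executemany(
    "INSERT INTO Movies VALUES (?, ?)",
    [("A", 2010), ("B", 2010), ("C", 2010),
     ("D", 2012), ("E", 2011), ("F", 2011)],
)

# Ordering directly on the aggregate's alias.
direct = conn.execute(
    """
    SELECT YearReleased, COUNT(*) AS counter
    FROM Movies
    GROUP BY YearReleased
    ORDER BY counter DESC, YearReleased
    """
).fetchall()

# Same grouping wrapped in a derived table; the alias "a" is needed
# by several databases, even though SQLite does not insist on it.
nested = conn.execute(
    """
    SELECT *
    FROM (SELECT YearReleased, COUNT(*) AS counter
          FROM Movies
          GROUP BY YearReleased) AS a
    ORDER BY counter DESC, YearReleased
    """
).fetchall()

print(direct == nested)  # True
```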

In the first query: select YearReleased, count(*) as count1
In the second: order by count1

Yep, with aggregates you can just give count(*) an alias and then refer to it like a column.

Related

How to Group By column, while keeping a naming column in as well

I am trying to show the most popular TV show in each country. However, the resulting table outputs multiple shows from the same country if I include the column that has the show's name. If I don't include this column, it correctly outputs the MAX for each country, but without the show name. Can I include both?
This is the script that gets the result I want without the names.
SELECT
origin_country, MAX(popularity) as Most_popular
FROM TV_data
WHERE origin_country not like '%(%'
GROUP BY origin_country
order by Most_popular DESC
This is the script that results in multiple shows from the same country, since the name column is grouped as well.
SELECT
origin_country, name, MAX(popularity) as Most_popular
FROM TV_data
WHERE origin_country not like '%(%'
GROUP BY origin_country, name
order by Most_popular DESC
Thank you. I'm still learning SQL, so any advice is greatly appreciated.
Your idea is correct to GROUP BY origin_country and use MAX to find the highest popularity per country.
All you need to do now is put this in a subquery, build a main query that also shows the other columns, and JOIN the two:
SELECT
tv1.origin_country,
tv1.name,
tv1.popularity Most_Popular
FROM tv_data tv1
JOIN (
SELECT origin_country, MAX(popularity) popularity
FROM tv_data
GROUP BY origin_country) tv2
ON tv1.origin_country = tv2.origin_country
AND tv1.popularity = tv2.popularity
WHERE tv1.origin_country NOT LIKE '%(%'
ORDER BY tv1.popularity DESC;
The above query should run on virtually any database.
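As a rough check of the JOIN approach, the sketch below runs it with Python's sqlite3 module on a few invented rows. The show names and popularity values are hypothetical, and the "(unknown)" row is only there to show the NOT LIKE filter at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tv_data (origin_country TEXT, name TEXT, popularity REAL)"
)
conn.executemany(
    "INSERT INTO tv_data VALUES (?, ?, ?)",
    [
        ("US", "Show A", 90.0), ("US", "Show B", 75.0),
        ("KR", "Show C", 95.0), ("KR", "Show D", 60.0),
        ("GB", "Show E", 80.0), ("(unknown)", "Show F", 99.0),
    ],
)

# The subquery finds the maximum per country; the join pulls back
# the full row(s) that carry that maximum.
rows = conn.execute(
    """
    SELECT tv1.origin_country, tv1.name, tv1.popularity AS Most_Popular
    FROM tv_data tv1
    JOIN (SELECT origin_country, MAX(popularity) AS popularity
          FROM tv_data
          GROUP BY origin_country) tv2
      ON tv1.origin_country = tv2.origin_country
     AND tv1.popularity = tv2.popularity
    WHERE tv1.origin_country NOT LIKE '%(%'
    ORDER BY tv1.popularity DESC
    """
).fetchall()
print(rows)
# [('KR', 'Show C', 95.0), ('US', 'Show A', 90.0), ('GB', 'Show E', 80.0)]
```

Note that if two shows in one country share the top popularity, the join returns both of them.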
Today, most databases also provide window functions as another, perhaps easier, option. The exact syntax can depend on the database you use, since functions often differ between Oracle, MySQL and so on.
Here is an example for SQL Server using RANK:
SELECT
origin_country,
name,
popularity Most_Popular
FROM (SELECT origin_country,
name,
popularity,
RANK() OVER(PARTITION BY origin_country ORDER BY popularity DESC) dest_rank
FROM tv_data) sub
WHERE dest_rank = 1
AND origin_country NOT LIKE '%(%'
ORDER BY popularity DESC;
The PARTITION BY clause works like the GROUP BY in the first query.
If you change the condition dest_rank = 1 to, for example, dest_rank < 3, you will get the two most popular shows per country (possibly more when popularity values are tied, since RANK gives tied rows the same rank).
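How RANK treats ties is easiest to see on a tiny data set. The following is only a sketch with made-up rows, run through Python's sqlite3 module; it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tv_data (origin_country TEXT, name TEXT, popularity REAL)"
)
conn.executemany(
    "INSERT INTO tv_data VALUES (?, ?, ?)",
    [("US", "A", 90.0), ("US", "B", 90.0), ("US", "C", 70.0),
     ("KR", "D", 95.0), ("KR", "E", 60.0)],
)

# Same shape as the RANK query; only the rank condition varies.
ranked = """
    SELECT origin_country, name
    FROM (SELECT origin_country, name,
                 RANK() OVER (PARTITION BY origin_country
                              ORDER BY popularity DESC) AS dest_rank
          FROM tv_data) sub
    WHERE dest_rank {cond}
    ORDER BY origin_country, name
"""
top1 = conn.execute(ranked.format(cond="= 1")).fetchall()
top2 = conn.execute(ranked.format(cond="< 3")).fetchall()

print(top1)  # [('KR', 'D'), ('US', 'A'), ('US', 'B')]
print(top2)  # [('KR', 'D'), ('KR', 'E'), ('US', 'A'), ('US', 'B')]
```

US shows A and B are tied, so both get rank 1 and both appear for dest_rank = 1. If exactly one row per country is needed, ROW_NUMBER() with an extra tie-breaking sort column is one possible alternative.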

Query GROUP BY and COUNT

I'm new to SQL and taking COURSERA's "SQL for Data Science" course. I have the following question in a summary assignment:
Show the number of orders placed by each customer and sort the result by the number of orders in descending order.
I failed to write the correct code myself. The answer is supposed to be as follows (one of several options, of course):
SELECT *
,COUNT (InvoiceId) AS number_of_orders
FROM Invoices
GROUP BY CustomerId
ORDER BY number_of_orders DESC
I am still having trouble understanding the query logic. I would appreciate your assistance in understanding this query.
I seriously hope that Coursera isn't giving you the query you cited above as the recommended answer. It won't run on most databases, and even in cases such as MySQL where it might run, it is not completely correct. You should be using this version:
SELECT CustomerId, COUNT(InvoiceId) AS number_of_orders
FROM Invoices
GROUP BY CustomerId
ORDER BY number_of_orders DESC;
A basic rule of GROUP BY is that the only plain columns available for selection are those that appear in the GROUP BY clause. In addition to these columns, aggregates of any column(s) may also appear in the select list. The version I gave you above follows these rules and is ANSI compliant, meaning it should run on virtually any database.
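To see the query logic step by step, here is a small sketch using Python's sqlite3 module. The invoice rows and the Total column are invented for illustration, and the extra CustomerId tie-breaker in ORDER BY is my addition so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Invoices (InvoiceId INTEGER, CustomerId INTEGER, Total REAL)"
)
conn.executemany(
    "INSERT INTO Invoices VALUES (?, ?, ?)",
    [(1, 10, 5.0), (2, 10, 7.5), (3, 20, 3.0),
     (4, 10, 1.0), (5, 30, 9.0), (6, 20, 2.0)],
)

# GROUP BY collapses the rows into one row per CustomerId,
# COUNT counts the invoices inside each group,
# and ORDER BY sorts the collapsed rows by that count.
rows = conn.execute(
    """
    SELECT CustomerId, COUNT(InvoiceId) AS number_of_orders
    FROM Invoices
    GROUP BY CustomerId
    ORDER BY number_of_orders DESC, CustomerId
    """
).fetchall()
print(rows)  # [(10, 3), (20, 2), (30, 1)]
```

Customer 10 has three invoices, customer 20 has two and customer 30 has one, so that is the order they come back in.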
When you say SELECT *, it means ALL COLUMNS. But you are grouping by CustomerId only, which is not valid in standard SQL.
Specify the other columns that you want to show in the GROUP BY clause as well.
The script should be something like:
SELECT CustomerName, DateEntered
,COUNT(InvoiceId) AS number_of_orders
FROM Invoices
GROUP BY CustomerId, CustomerName, DateEntered
ORDER BY number_of_orders DESC

How does one get the total rows for a partition in postgresql

I'm using a window function to help me paginate through a list of records in the database.
For example
I have a list of dogs and they all have a breed associated with them.
I want to show 10 dogs from each breed to my users.
So that would be
select * from dogs
join (
SELECT id, row_number() OVER (PARTITION BY breed) as row_number FROM dogs
) rn on dogs.id = rn.id
where (row_number between 1 and 10)
That will give me ~ten dogs from each breed.
What I need though is a count. Is there a way to get the count of the partitions? I want to know how many Staffies I have waiting for adoption.
I do notice that there's a percentage, and all the docs I find seem to indicate there's something called total rows. But I don't see it.
Just run the window aggregate function count() over the same partition (without adding ORDER BY!) to get the total count for the partition:
SELECT *
FROM (
SELECT *
, row_number() OVER (PARTITION BY breed ORDER BY id) AS rn
, count(*) OVER (PARTITION BY breed) AS breed_count -- !
FROM dogs
) sub
WHERE rn < 11;
Also removed the unnecessary join and simplified.
See:
Run a query with a LIMIT/OFFSET and also get the total number of rows
And I added ORDER BY to the window definition of row_number() to get a deterministic result. Without it, Postgres is free to return any 10 arbitrary rows. Any write to the table (or VACUUM, etc.) can and will change the result without ORDER BY.
Aside, pagination with LIMIT / OFFSET does not scale well. Consider:
Optimize query with OFFSET on large table
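For anyone who wants to try the count-per-partition idea without a Postgres instance, here is a minimal sketch using Python's sqlite3 module instead. It assumes a bundled SQLite of version 3.25 or newer (needed for window functions), the dogs are made up, and the cut-off is 2 per breed rather than 10 to keep the data small.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dogs (id INTEGER PRIMARY KEY, breed TEXT)")
conn.executemany(
    "INSERT INTO dogs VALUES (?, ?)",
    [(1, "Staffy"), (2, "Beagle"), (3, "Staffy"), (4, "Staffy")],
)

# row_number() numbers the rows inside each breed;
# count(*) over the same partition (no ORDER BY) is the breed total.
rows = conn.execute(
    """
    SELECT id, breed, rn, breed_count
    FROM (
        SELECT id, breed,
               row_number() OVER (PARTITION BY breed ORDER BY id) AS rn,
               count(*)     OVER (PARTITION BY breed)             AS breed_count
        FROM dogs
    ) sub
    WHERE rn < 3
    ORDER BY breed, rn
    """
).fetchall()
print(rows)
# [(2, 'Beagle', 1, 1), (1, 'Staffy', 1, 3), (3, 'Staffy', 2, 3)]
```

Only two Staffies are returned, but breed_count still reports all three, because the window count is computed before the outer WHERE filters on rn.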

Count(), max(), min() functions definition with many selects

Let's say we have a view/table hotel(hotel_n, hotel_name, room_n, price). I want to find the cheapest room. I tried group by room_n, but I want the hotel's name (hotel_name) to be shown in the output without grouping by it.
So as an amateur with SQL (Oracle 11g) I began with
select hotel_n, room_n, min(price)
from hotel
group by room_n;
but it shows the error: ORA-00979: not a GROUP BY expression. I know I have to type group by room_n, hotel_n, but I want the hotel_n to be seen in the table that I make without grouping by it!
Any ideas? thank you very much!
Aggregate functions are useful to show, well, aggregate information per group of rows. If you want to get a specific row from a group of rows in relation to the other group members (e.g., the cheapest room per room_n), you'd probably need an analytic function, such as rank:
SELECT hotel_n, hotel_name, room_n, price
FROM (SELECT hotel_n, hotel_name, room_n, price,
RANK() OVER (PARTITION BY room_n ORDER BY price ASC) rk
FROM hotel) t
WHERE rk = 1
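A quick way to convince yourself that this returns the whole row, hotel name included, is the sketch below. It uses Python's sqlite3 module rather than Oracle (so it assumes SQLite 3.25 or newer for RANK), and the hotels and prices are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hotel "
    "(hotel_n INTEGER, hotel_name TEXT, room_n INTEGER, price INTEGER)"
)
conn.executemany(
    "INSERT INTO hotel VALUES (?, ?, ?, ?)",
    [(1, "Alpha", 101, 80), (2, "Beta", 101, 65),
     (1, "Alpha", 102, 120), (3, "Gamma", 102, 110)],
)

# Rank the rows inside each room_n by price, then keep rank 1.
cheapest = conn.execute(
    """
    SELECT hotel_n, hotel_name, room_n, price
    FROM (SELECT hotel_n, hotel_name, room_n, price,
                 RANK() OVER (PARTITION BY room_n ORDER BY price ASC) AS rk
          FROM hotel) t
    WHERE rk = 1
    ORDER BY room_n
    """
).fetchall()
print(cheapest)  # [(2, 'Beta', 101, 65), (3, 'Gamma', 102, 110)]
```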

How do I use T-SQL Group By

I know I need to have (although I don't know why) a GROUP BY clause on the end of a SQL query that uses any aggregate functions like count, sum, avg, etc:
SELECT count(userID), userName
FROM users
GROUP BY userName
When else would GROUP BY be useful, and what are the performance ramifications?
To retrieve the number of widgets from each widget category that has more than 5 widgets, you could do this:
SELECT WidgetCategory, count(*)
FROM Widgets
GROUP BY WidgetCategory
HAVING count(*) > 5
The "having" clause is something people often forget about, instead opting to retrieve all their data to the client and iterating through it there.
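As an illustration of letting the database do the filtering, here is a small sketch with Python's sqlite3 module. The widget categories and counts are hypothetical, and the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Widgets (WidgetCategory TEXT)")
conn.executemany(
    "INSERT INTO Widgets VALUES (?)",
    [("gears",)] * 6 + [("bolts",)] * 2 + [("nuts",)] * 7,
)

# WHERE filters rows before grouping; HAVING filters the groups afterwards.
rows = conn.execute(
    """
    SELECT WidgetCategory, COUNT(*)
    FROM Widgets
    GROUP BY WidgetCategory
    HAVING COUNT(*) > 5
    ORDER BY WidgetCategory
    """
).fetchall()
print(rows)  # [('gears', 6), ('nuts', 7)]
```

The "bolts" category has only two widgets, so it never leaves the database.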
GROUP BY is similar to DISTINCT in that it groups multiple records into one.
This example, borrowed from http://www.devguru.com/technologies/t-sql/7080.asp, lists distinct products in the Products table.
SELECT Product FROM Products GROUP BY Product
Product
-------------
Desktop
Laptop
Mouse
Network Card
Hard Drive
Software
Book
Accessory
The advantage of GROUP BY over DISTINCT is that it gives you finer control when used with a HAVING clause.
SELECT Product, count(Product) as ProdCnt
FROM Products
GROUP BY Product
HAVING count(Product) > 2
Product        ProdCnt
----------------------
Desktop        10
Laptop         5
Mouse          3
Network Card   9
Software       6
GROUP BY usually forces the entire set to be processed before any records are returned (the engine has to sort or hash the rows to form the groups).
For that reason, be cautious about using GROUP BY in a subquery over a large set.
Counting the number of times tags are used might be a good example:
SELECT TagName, COUNT(*) AS TimesUsed
FROM Tags
GROUP BY TagName
ORDER BY TimesUsed
If you simply want the distinct tag values, I would prefer to use the DISTINCT keyword.
SELECT DISTINCT TagName
FROM Tags
ORDER BY TagName ASC
GROUP BY also helps when you want to generate a report that will average or sum a bunch of data. You can GROUP BY the Department ID and then SUM all the sales revenue or AVG the sales for each department.
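A report like that could look roughly like the sketch below, run through Python's sqlite3 module. The Sales table, its DepartmentId and Amount columns, and the figures are all assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (DepartmentId INTEGER, Amount REAL)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 30.0), (2, 70.0), (2, 20.0)],
)

# One output row per department, with its total and average sale.
report = conn.execute(
    """
    SELECT DepartmentId,
           SUM(Amount) AS total_revenue,
           AVG(Amount) AS avg_sale
    FROM Sales
    GROUP BY DepartmentId
    ORDER BY DepartmentId
    """
).fetchall()
print(report)  # [(1, 150.0, 75.0), (2, 120.0, 40.0)]
```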