JOIN 2 tables ORDER BY SUM value - sql

I have 2 tables: 1st is comment, 2nd is rating
SELECT * FROM comment_table a
INNER JOIN (SELECT comment_id, SUM(rating_value) AS total_rating FROM rating_table GROUP BY comment_id) b
ON a.comment_id = b.comment_id
ORDER BY b.total_rating DESC
I tried the SQL above, but it doesn't work!
The objective is to display a list of comments ordered by each comment's total rating points.

SELECT s.* FROM (
SELECT a.*, b.total_rating FROM comment_table a
INNER JOIN (SELECT comment_id, SUM(rating_value) AS total_rating FROM rating_table GROUP BY comment_id) b
ON a.comment_id = b.comment_id
) AS s
ORDER BY s.total_rating DESC
Nest it inside another SELECT. It will then output the data in the correct order.
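To check the ordering on a small data set, here is a minimal sketch of the join-on-aggregate pattern using Python's built-in sqlite3 module. The comment_text column and the sample rows are assumptions made for illustration; only the table names, comment_id and rating_value come from the question.

```python
import sqlite3

def comments_by_total_rating():
    """Return (comment_id, comment_text, total_rating) rows, highest total first."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and sample data, for illustration only.
    conn.executescript("""
        CREATE TABLE comment_table (comment_id INTEGER PRIMARY KEY, comment_text TEXT);
        CREATE TABLE rating_table (comment_id INTEGER, rating_value INTEGER);
        INSERT INTO comment_table VALUES (1, 'first'), (2, 'second'), (3, 'third');
        INSERT INTO rating_table VALUES (1, 2), (1, 1), (2, 5), (3, 4), (3, 4);
    """)
    rows = conn.execute("""
        SELECT a.comment_id, a.comment_text, b.total_rating
        FROM comment_table a
        INNER JOIN (
            -- one row per comment with the sum of its ratings
            SELECT comment_id, SUM(rating_value) AS total_rating
            FROM rating_table
            GROUP BY comment_id
        ) b ON a.comment_id = b.comment_id
        ORDER BY b.total_rating DESC
    """).fetchall()
    conn.close()
    return rows

print(comments_by_total_rating())
```

Comments without any rating are dropped by the INNER JOIN; a LEFT JOIN with COALESCE(b.total_rating, 0) would keep them in the list.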

Related

Selecting rows with the most repeated values at specific column

The problem in general terms: I need to select the value from one table that is referenced by the most repeated values in another table.
Tables have this structure:
screenshot
screenshot2
The task is to find the country that has the most results from its sportsmen.
First, I INNER JOIN the tables to relate each result to a country:
SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id);
Then I count how many times each country appears:
SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id))
GROUP BY country
;
And got this: screenshot3
Now it feels like I'm one step away from the solution.
I guess it's possible with one more SELECT FROM (SELECT ...) and MAX(), but I can't wrap it up.
PS: I did it by duplicating the query like this, but I feel it would be very inefficient with millions of rows.
SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
)
WHERE highest_participation = (SELECT MAX(highest_participation)
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
))
I also did it with a view:
CREATE VIEW temp AS
SELECT country as country_with_most_participations, COUNT(country) as country_participate_in_#_comp
FROM(
SELECT country, competition_id FROM result
INNER JOIN sportsman USING(sportsman_id)
)
GROUP BY country;
SELECT country_with_most_participations FROM temp
WHERE country_participate_in_#_comp = (SELECT MAX(country_participate_in_#_comp) FROM temp);
But I'm not sure it's the easiest way.
If I understand this correctly you want to rank the countries per competition count and show the highest ranking country (or countries) with their count. I suggest you use RANK for the ranking.
select country, competition_count
from
(
select
s.country,
count(*) as competition_count,
rank() over (order by count(*) desc) as rn
from sportsman s
inner join result r using (sportsman_id)
group by s.country
) ranked_by_count
where rn = 1
order by country;
If the order of the result rows doesn't matter, you can shorten this to:
select s.country, count(*) as competition_count
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by count(*) desc
fetch first rows with ties;
You seem to be overcomplicating this. Starting from your existing join query, you can aggregate, order the results and keep the top row(s) only.
select s.country, count(*) cnt
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by cnt desc
fetch first 1 row with ties
Note that this allows top ties, if any.
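For engines without FETCH FIRST ... WITH TIES, the same tie-preserving result can be sketched with RANK() over counts that are computed only once. The example below runs against SQLite (3.25 or later, which is an assumption about the installed version) through Python's sqlite3 module; the sample countries and rows are made up for illustration.

```python
import sqlite3

def top_countries():
    """Return every country tied for the highest number of results."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; only the table and column names come from the question.
    conn.executescript("""
        CREATE TABLE sportsman (sportsman_id INTEGER PRIMARY KEY, country TEXT);
        CREATE TABLE result (competition_id INTEGER, sportsman_id INTEGER);
        INSERT INTO sportsman VALUES (1, 'Kenya'), (2, 'Norway'), (3, 'Kenya'), (4, 'Japan');
        INSERT INTO result VALUES (10, 1), (11, 1), (10, 2), (11, 2), (12, 4);
    """)
    rows = conn.execute("""
        WITH counts AS (
            -- aggregate once: number of results per country
            SELECT s.country, COUNT(*) AS competition_count
            FROM sportsman s
            INNER JOIN result r USING (sportsman_id)
            GROUP BY s.country
        )
        SELECT country, competition_count
        FROM (
            SELECT country, competition_count,
                   RANK() OVER (ORDER BY competition_count DESC) AS rn
            FROM counts
        ) AS ranked_by_count
        WHERE rn = 1          -- all countries sharing the top count
        ORDER BY country
    """).fetchall()
    conn.close()
    return rows

print(top_countries())
```

Kenya and Norway both have two results in this sample, so both rows come back, which is the behaviour WITH TIES gives in the queries above.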
SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
order by 2 desc
)
where rownum=1

How to find the three greatest values in each category in PostgreSQL?

I am a SQL beginner. I'm having trouble finding the top 3 values in each category. The question was:
"For order_ids in January 2006, what were the top (by revenue) 3 product_ids for each category_id?"
Table A:
(Column name)
customer_id
order_id
order_date
revenue
product_id
Table B:
product_id
category_id
I tried to combine table B and A using an Inner Join and filtered by the order_date. But then I am stuck on how to find the top 3 max values in each category_id.
Thanks.
This is what I have so far:
SELECT B.product_id, category_id FROM A
JOIN B ON B.product_id = A.product_id
WHERE order_date BETWEEN '2006-01-01' AND '2006-01-31'
ORDER BY revenue DESC
LIMIT 3;
This kind of query is typically solved using window functions
select *
from (
SELECT b.product_id,
b.category_id,
a.revenue,
dense_rank() over (partition by b.category_id order by a.revenue desc) as rnk
from A
join b ON B.product_id = A.product_id
where a.order_date between date '2006-01-01' AND date '2006-01-31'
) as t
where rnk <= 3
order by category_id, rnk, product_id;
dense_rank() also deals with ties (products with the same revenue in the same category), so you might actually get more than 3 rows per category.
If the same product can show up more than once in table a (several order rows for the same product in that period), you need to combine this with a GROUP BY to get the sum of all revenues:
select *
from (
SELECT b.product_id,
b.category_id,
sum(a.revenue) as total_revenue,
dense_rank() over (partition by b.category_id order by sum(a.revenue) desc) as rnk
from a
join b on B.product_id = A.product_id
where a.order_date between date '2006-01-01' AND date '2006-01-31'
group by b.product_id, b.category_id
) as t
where rnk <= 3
order by product_id, category_id, total_revenue desc;
When combining window functions and GROUP BY, the window function will be applied after the GROUP BY.
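To make the "aggregate first, then rank" order explicit, the sketch below sums revenue per product in a CTE and ranks the totals afterwards. It partitions by category_id alone, which is what "top 3 per category" calls for. It runs on SQLite (3.25 or later assumed) via Python's sqlite3 module rather than PostgreSQL, stores dates as ISO text, and uses invented sample rows.

```python
import sqlite3

def top3_products_per_category():
    """Return (category_id, product_id, total_revenue, rnk) for January 2006."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; the column names follow the question.
    conn.executescript("""
        CREATE TABLE A (customer_id INTEGER, order_id INTEGER, order_date TEXT,
                        revenue INTEGER, product_id INTEGER);
        CREATE TABLE B (product_id INTEGER PRIMARY KEY, category_id INTEGER);
        INSERT INTO B VALUES (1, 1), (2, 1), (3, 1), (4, 1), (5, 2);
        INSERT INTO A VALUES
            (1, 100, '2006-01-05',  50, 1),
            (1, 101, '2006-01-10',  30, 1),
            (2, 102, '2006-01-11',  70, 2),
            (2, 103, '2006-01-12',  20, 3),
            (3, 104, '2006-01-20',  10, 4),
            (3, 105, '2006-02-01', 999, 4),  -- outside January, ignored
            (4, 106, '2006-01-31',  40, 5);
    """)
    rows = conn.execute("""
        WITH totals AS (
            -- step 1: one row per product with its January revenue
            SELECT B.category_id, B.product_id, SUM(A.revenue) AS total_revenue
            FROM A
            JOIN B ON B.product_id = A.product_id
            WHERE A.order_date BETWEEN '2006-01-01' AND '2006-01-31'
            GROUP BY B.category_id, B.product_id
        )
        SELECT category_id, product_id, total_revenue, rnk
        FROM (
            -- step 2: rank the totals inside each category
            SELECT category_id, product_id, total_revenue,
                   DENSE_RANK() OVER (PARTITION BY category_id
                                      ORDER BY total_revenue DESC) AS rnk
            FROM totals
        ) AS t
        WHERE rnk <= 3
        ORDER BY category_id, rnk
    """).fetchall()
    conn.close()
    return rows

print(top3_products_per_category())
```

Product 4 is the fourth-best seller in category 1 and is filtered out; its large February order does not count because of the date filter.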
You can use window functions to rank the grouped revenue and then pull the top X in the outer query. I have not worked in PostgreSQL in a while, so I may be missing a shortcut function below.
WITH ByRevenue AS
(
--This CTE can be queried like a regular table in the statements that follow
SELECT
B.category_id,
B.product_id,
MAX(A.revenue) as max_revenue
FROM
A
JOIN B ON B.product_id = A.product_id
WHERE
A.order_date BETWEEN '2006-01-01' AND '2006-01-31'
GROUP BY
B.category_id, B.product_id
)
,Normalized AS
(
--Number the products within each category by revenue so the outer query can apply the limit
SELECT
category_id,
product_id,
max_revenue,
ROW_NUMBER() OVER (PARTITION BY category_id ORDER BY max_revenue DESC) as rn
FROM
ByRevenue
)
--Final query: the top 3 products in each category, ranked by revenue
SELECT *
FROM Normalized
WHERE rn <= 3;
For top-n queries, the first thing to try is usually the lateral join:
WITH categories as (
SELECT DISTINCT category_id
FROM B
)
SELECT categories.category_id, sub.product_id
FROM categories
JOIN LATERAL (
SELECT a.product_id
FROM B
JOIN A ON (a.product_id = b.product_id)
WHERE b.category_id = categories.category_id
AND order_date BETWEEN '2006-01-01' AND '2006-01-31'
GROUP BY a.product_id
ORDER BY sum(revenue) desc
LIMIT 3
) sub on true;
Try using FETCH FIRST n ROWS ONLY?
Note: I'm assuming product_id is the key the two tables share, so I used it to join them. This returns the three highest-revenue rows overall, not three per category.
SELECT B.category_id, A.product_id, A.revenue
FROM A
INNER JOIN B ON A.product_id = B.product_id
WHERE A.order_date BETWEEN DATE '2006-01-01' AND DATE '2006-01-31'
ORDER BY A.revenue DESC
FETCH FIRST 3 ROWS ONLY;

Highest Count with a group

I'm having an absolute brain fade
SELECT p.ProductCategory, f.ProductSubCategory, COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory, f.ProductSubCategory
ORDER BY 1,3 DESC
This shows me the count for each ProductSubCategory, but I would like to see only the ProductSubCategory with the highest count per ProductCategory.
I wish to see (I don't care about the Count value)
There are a couple of different ways to do this. One involves joining the results back to themselves and using the max aggregate. But since you are using SQL Server, you can use ROW_NUMBER to achieve the same result:
with cte as (
select p.productcategory, p.ProductSubCategory, COUNT(*) cnt,
ROW_NUMBER() over (partition by p.productcategory order by count(*) desc) rn
from products p
join sales s on p.ProductSubCategory = s.ProductSubCategory
group by p.productcategory, p.ProductSubCategory
)
select *
from cte
where rn = 1
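The other approach mentioned above, joining the aggregated counts back to their per-category maximum, might look like the following sketch. It runs on SQLite through Python's sqlite3 module rather than SQL Server, and the SaleId column and sample rows are assumptions for illustration.

```python
import sqlite3

def top_subcategory_per_category():
    """Return (ProductCategory, ProductSubCategory) pairs with the highest sales count."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; SaleId is an invented key column.
    conn.executescript("""
        CREATE TABLE Products (ProductCategory TEXT, ProductSubCategory TEXT);
        CREATE TABLE Sales (SaleId INTEGER PRIMARY KEY, ProductSubCategory TEXT);
        INSERT INTO Products VALUES
            ('Bikes', 'Road'), ('Bikes', 'Mountain'),
            ('Clothing', 'Socks'), ('Clothing', 'Caps');
        INSERT INTO Sales (ProductSubCategory) VALUES
            ('Road'), ('Road'), ('Road'), ('Mountain'),
            ('Socks'), ('Caps'), ('Caps');
    """)
    rows = conn.execute("""
        WITH counts AS (
            SELECT p.ProductCategory, p.ProductSubCategory, COUNT(*) AS Cnt
            FROM Products p
            JOIN Sales s ON p.ProductSubCategory = s.ProductSubCategory
            GROUP BY p.ProductCategory, p.ProductSubCategory
        )
        SELECT c.ProductCategory, c.ProductSubCategory
        FROM counts c
        JOIN (
            -- highest count seen in each category
            SELECT ProductCategory, MAX(Cnt) AS MaxCnt
            FROM counts
            GROUP BY ProductCategory
        ) m ON m.ProductCategory = c.ProductCategory AND m.MaxCnt = c.Cnt
        ORDER BY c.ProductCategory
    """).fetchall()
    conn.close()
    return rows

print(top_subcategory_per_category())
```

Unlike ROW_NUMBER with rn = 1, this variant returns every subcategory that ties for the highest count in its category.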
You already have an answer, but please see the following code too. It may help you.
SELECT p.ProductCategory,
f.ProductSubCategory,
COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
JOIN (
SELECT p.ProductCategory,
f.ProductSubCategory,
ROW_NUMBER() OVER ( PARTITION BY p.ProductCategory
ORDER BY COUNT(*) DESC) AS rn
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory,
f.ProductSubCategory) Lu
ON p.ProductCategory = Lu.ProductCategory
AND f.ProductSubCategory = Lu.ProductSubCategory
WHERE Lu.rn = 1
GROUP BY p.ProductCategory,
f.ProductSubCategory

Help with SQL QUERY OF JOIN+COUNT+MAX

I need help constructing an SQL query for a MySQL database. There are 2 tables, as follows:
tblcities (id,name)
tblmembers(id,name,city_id)
Now I want to retrieve the details of the city that has the maximum number of members.
Regards
SELECT tblcities.id, tblcities.name, COUNT(tblmembers.id) AS member_count
FROM tblcities
LEFT JOIN tblmembers ON tblcities.id = tblmembers.city_id
GROUP BY tblcities.id
ORDER BY member_count DESC
LIMIT 1
Basically: retrieve all cities and count how many members each has, sort by that member count in descending order, making the highest count first - then show only that first city.
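A minimal sketch of those steps, run on SQLite through Python's sqlite3 module instead of MySQL. The city and member rows are invented for illustration, and the limit parameter is only there so the full ranking can be inspected as well.

```python
import sqlite3

def city_member_counts(limit):
    """Return up to `limit` (id, name, member_count) rows, largest count first."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical sample data; table and column names follow the question.
    conn.executescript("""
        CREATE TABLE tblcities (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE tblmembers (id INTEGER PRIMARY KEY, name TEXT, city_id INTEGER);
        INSERT INTO tblcities VALUES (1, 'Paris'), (2, 'Rome'), (3, 'Oslo');
        INSERT INTO tblmembers VALUES (1, 'Ann', 1), (2, 'Bob', 2), (3, 'Cy', 2);
    """)
    rows = conn.execute("""
        SELECT tblcities.id, tblcities.name, COUNT(tblmembers.id) AS member_count
        FROM tblcities
        LEFT JOIN tblmembers ON tblcities.id = tblmembers.city_id
        GROUP BY tblcities.id, tblcities.name
        ORDER BY member_count DESC
        LIMIT ?
    """, (limit,)).fetchall()
    conn.close()
    return rows

print(city_member_counts(1))  # the city with the most members
print(city_member_counts(3))  # the full ranking
```

Because COUNT(tblmembers.id) skips the NULLs produced by the LEFT JOIN, a city without members is reported with a count of 0 rather than 1. With LIMIT 1, only one city is returned even when several share the top count.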
Terrible, but that's a way of doing it:
SELECT * FROM tblcities WHERE id IN (
SELECT city_id
FROM tblMembers
GROUP BY city_id
HAVING COUNT(*) = (
SELECT MAX(TOTAL)
FROM (
SELECT COUNT(*) AS TOTAL
FROM tblMembers
GROUP BY city_id
) AS AUX
)
)
That way, if there is a tie, still you'll get all cities with the maximum number of members...
Select ...
From tblCities As C
Join (
Select city_id, Count(*) As MemberCount
From tblMembers
Group By city_id
Order By Count(*) Desc
Limit 1
) As MostMembers
On MostMembers.city_id = C.id
select c.id, c.name, count(*)
from tblcities c, tblmembers m
where c.id = m.city_id
group by c.id, c.name
order by count(*) desc
limit 1

mysql query with double join

I have 3 tables, but I can only manage to join the count from one of them. See below.
The query below works like a charm, but I need to add another count from another table.
There is a third table called "ci_nomatch" that contains a reference to ci_address_book.reference.
It can have multiple entries per reference (many to many), but I only need the count from that table.
So if ci_address_book had entries called "item1", "item2", "item3"
and ci_nomatch had "1,item1,user1", "2,item1,user4",
I would like the query to return "2" for item1.
Any ideas? I tried another join, but it tells me that the reference does not exist, even though it does!
SELECT c.*, IFNULL(p.total, 0) AS matchcount
FROM ci_address_book c
LEFT JOIN (
SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id
) AS p
ON c.id=p.addressbook_id
ORDER BY matchcount DESC
LIMIT 0,15
You could subquery it directly in the SELECT list:
SELECT c.*, IFNULL(p.total, 0) AS matchcount,
(SELECT COUNT(*) FROM ci_nomatch n WHERE n.reference = c.reference) AS othercount
FROM ci_address_book c
LEFT JOIN (
SELECT addressbook_id, COUNT(match_id) AS total
FROM ci_matched_sanctions
GROUP BY addressbook_id
) AS p
ON c.id=p.addressbook_id
ORDER BY matchcount DESC
LIMIT 0,15
Updated in response to the comment: including an extra column "(matchcount - othercount) AS deducted" is best done by wrapping the query in a subquery:
SELECT *, matchcount - othercount AS deducted
FROM
(
SELECT c.* , IFNULL( p.total, 0 ) AS matchcount, (
SELECT COUNT( * ) FROM ci_falsepositives n
WHERE n.addressbook_id = c.reference ) AS othercount
FROM ci_address_book c
LEFT JOIN (
SELECT addressbook_id, COUNT( match_id ) AS total
FROM ci_matched_sanctions GROUP BY addressbook_id ) AS p
ON c.id = p.addressbook_id ORDER BY matchcount DESC LIMIT 0 , 15
) S
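As a small check of the two-count approach, the sketch below combines the pre-aggregated LEFT JOIN with a correlated subquery against ci_nomatch. It runs on SQLite through Python's sqlite3 module rather than MySQL. The column layout of ci_nomatch (id, reference, user_name) is an assumption inferred from the sample values in the question, and the rows are invented.

```python
import sqlite3

def address_book_counts():
    """Return (reference, matchcount, othercount) per address book entry."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema and sample data for illustration.
    conn.executescript("""
        CREATE TABLE ci_address_book (id INTEGER PRIMARY KEY, reference TEXT);
        CREATE TABLE ci_matched_sanctions (match_id INTEGER PRIMARY KEY, addressbook_id INTEGER);
        CREATE TABLE ci_nomatch (id INTEGER PRIMARY KEY, reference TEXT, user_name TEXT);
        INSERT INTO ci_address_book VALUES (1, 'item1'), (2, 'item2'), (3, 'item3');
        INSERT INTO ci_matched_sanctions VALUES (1, 1), (2, 1), (3, 1), (4, 2);
        INSERT INTO ci_nomatch VALUES (1, 'item1', 'user1'), (2, 'item1', 'user4');
    """)
    rows = conn.execute("""
        SELECT c.reference,
               IFNULL(p.total, 0) AS matchcount,
               -- correlated subquery: counted separately for each address book row
               (SELECT COUNT(*) FROM ci_nomatch n
                WHERE n.reference = c.reference) AS othercount
        FROM ci_address_book c
        LEFT JOIN (
            SELECT addressbook_id, COUNT(match_id) AS total
            FROM ci_matched_sanctions
            GROUP BY addressbook_id
        ) AS p ON c.id = p.addressbook_id
        ORDER BY matchcount DESC, c.reference
        LIMIT 15
    """).fetchall()
    conn.close()
    return rows

print(address_book_counts())
```

Counting each child table in its own subquery keeps the two counts independent; joining both child tables directly and counting afterwards would multiply the rows of one by the rows of the other.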