Selecting rows with the most repeated values at specific column - sql

The problem in general terms: I need to select the value from one table that is associated with the most rows in another table.
Tables have this structure:
screenshot
screenshot2
The task is to find the country whose sportsmen have the most results.
First, I INNER JOIN the tables to relate each result to a country:
SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id);
Then I count how many times each country appears:
SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id))
GROUP BY country
;
And got this screenshot3
Now it feels like I'm one step away from solution ))
I guess it's possible with one more SELECT FROM (SELECT ...) and MAX() but I can't wrap it up?
P.S. I got it working by duplicating the query like this, but it feels inefficient if there are millions of rows.
SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
)
WHERE highest_participation = (SELECT MAX(highest_participation)
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
))
I also did it with a view:
CREATE VIEW temp AS
SELECT country as country_with_most_participations, COUNT(country) as country_participate_in_#_comp
FROM(
SELECT country, competition_id FROM result
INNER JOIN sportsman USING(sportsman_id)
)
GROUP BY country;
SELECT country_with_most_participations FROM temp
WHERE country_participate_in_#_comp = (SELECT MAX(country_participate_in_#_comp) FROM temp);
But I'm not sure it's the easiest way.

If I understand this correctly you want to rank the countries per competition count and show the highest ranking country (or countries) with their count. I suggest you use RANK for the ranking.
select country, competition_count
from
(
select
s.country,
count(*) as competition_count,
rank() over (order by count(*) desc) as rn
from sportsman s
inner join result r using (sportsman_id)
group by s.country
) ranked_by_count
where rn = 1
order by country;
If the order of the result rows doesn't matter, you can shorten this to:
select s.country, count(*) as competition_count
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by count(*) desc
fetch first rows with ties;

You seem to be overcomplicating this. Starting from your existing join query, you can aggregate, order the results and keep the top row(s) only.
select s.country, count(*) cnt
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by cnt desc
fetch first 1 row with ties
Note that this allows top ties, if any.
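The aggregate-then-keep-the-top-rows idea can be tried on toy data. The following is a minimal sketch, assuming SQLite through Python's sqlite3 module rather than the asker's database. SQLite has no FETCH FIRST ... WITH TIES, so the sketch swaps in a CTE that is written once and filtered on its own MAX, which also avoids repeating the join text. The table columns are inferred from the queries in the question, and the sample rows are invented.

```python
import sqlite3

# Hypothetical toy data: the real tables were only shown as screenshots.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sportsman (sportsman_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE result (competition_id INTEGER, sportsman_id INTEGER);
    INSERT INTO sportsman VALUES (1, 'USA'), (2, 'USA'), (3, 'Kenya'), (4, 'Japan');
    INSERT INTO result VALUES (10, 1), (10, 3), (11, 2), (11, 3), (12, 4);
""")

# The per-country count is defined once in a CTE and reused for the MAX filter,
# so tied countries are all returned.
top_countries = conn.execute("""
    WITH counts AS (
        SELECT s.country, COUNT(*) AS cnt
        FROM sportsman s
        INNER JOIN result r USING (sportsman_id)
        GROUP BY s.country
    )
    SELECT country, cnt
    FROM counts
    WHERE cnt = (SELECT MAX(cnt) FROM counts)
    ORDER BY country
""").fetchall()

print(top_countries)  # [('Kenya', 2), ('USA', 2)]
```

With this data Kenya and USA tie at two results each, so both rows come back.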

SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
order by 2 desc
)
where rownum=1

Related

JOIN 2 tables ORDER BY SUM value

I have 2 tables: 1st is comment, 2nd is rating
SELECT * FROM comment_table a
INNER JOIN (SELECT comment_id, SUM(rating_value) AS total_rating FROM rating_table GROUP BY comment_id) b
ON a.comment_id = b.comment_id
ORDER BY b.total_rating DESC
I tried the above SQL but it doesn't work!
The goal is to display a list of comments ordered by the total rating points of each comment.
SELECT s.* FROM (
SELECT a.*, b.total_rating FROM comment_table a
INNER JOIN (SELECT comment_id, SUM(rating_value) AS total_rating FROM rating_table GROUP BY comment_id) b
ON a.comment_id = b.comment_id
) AS s
ORDER BY s.total_rating DESC
Nest it inside another select. It will then output the data in the correct order.
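To check the ordering on a small data set, here is a rough sketch that assumes SQLite through Python's sqlite3 module, not the asker's database. The `body` column and the sample rows are hypothetical. It joins the comments to their summed ratings and sorts by the total; on SQLite the un-nested form already returns rows in the expected order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comment_table (comment_id INTEGER PRIMARY KEY, body TEXT);
    CREATE TABLE rating_table (comment_id INTEGER, rating_value INTEGER);
    INSERT INTO comment_table VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO rating_table VALUES (1, 2), (1, 1), (2, 5), (3, 4), (3, 3);
""")

# Sum the ratings per comment in a derived table, then sort by that total.
ordered_comments = conn.execute("""
    SELECT a.comment_id, a.body, b.total_rating
    FROM comment_table a
    INNER JOIN (
        SELECT comment_id, SUM(rating_value) AS total_rating
        FROM rating_table
        GROUP BY comment_id
    ) b ON a.comment_id = b.comment_id
    ORDER BY b.total_rating DESC
""").fetchall()

print(ordered_comments)  # [(3, 'third', 7), (2, 'second', 5), (1, 'first', 3)]
```

Selecting named columns instead of `*` keeps the duplicated `comment_id` out of the result.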

Trouble Aliasing tables when using subqueries and INNER JOIN

Hello! I have two tables: trips and stations, and I'm trying to print the ID and name of each distinct station along with the total number of rides from that station.
The ID and name come from the stations table, while the number of trips comes from the trips table, so I went ahead and created a query to:
SELECT id, name and num_rides
FROM (SELECT COUNT (*) num_rides FROM tableB AS b) AS num_rides
INNER JOIN tableA AS a ON a.station_id = b.start_station_id
The problem is in the JOIN statement of the outer query, where it doesn't seem to recognize the "b" alias for my table, which I aliased in the inner query.
I tried running the queries separately, and they both work fine. I'm assuming then, that the problem is that the computer doesn't remember my inner query alias on the outer query, but that doesn't make much sense, does it?
Error states: "Unrecognized name: trips" ---> trips being the alias I used for table B.
SELECT
station_id,
name,
num_of_rides AS num_of_rides_starting_at
FROM
(
SELECT
start_station_id,
COUNT(*) number_of_rides
FROM
bigquery-public-data.new_york_citibike.citibike_trips AS trips
GROUP BY
trips.start_station_id
)
AS num_of_rides
INNER JOIN
bigquery-public-data.new_york_citibike.citibike_stations AS stations ON stations.station_id = trips.start_station_id
ORDER BY num_of_rides DESC
I believe the issue is that the "trips" alias is only active inside the parentheses. Try naming that whole select statement and referencing that name.
SELECT
station_id,
name,
number_of_rides AS num_of_rides_starting_at
FROM
(
SELECT
start_station_id,
COUNT(*) number_of_rides
FROM
bigquery-public-data.new_york_citibike.citibike_trips AS trips
GROUP BY
trips.start_station_id
) AS NeedNameHere
INNER JOIN
bigquery-public-data.new_york_citibike.citibike_stations AS stations
ON stations.station_id = NeedNameHere.start_station_id
ORDER BY number_of_rides DESC
Hope this may help
SELECT
station_id, /*from table: citibike_stations */
name, /*from table: citibike_stations */
number_of_rides AS number_of_rides_starting_at_station /*from table: station_num_trips*/
FROM
(SELECT
start_station_id,
COUNT(*) number_of_rides
FROM `bigquery-public-data.new_york_citibike.citibike_trips`
GROUP BY
start_station_id
)
AS station_num_trips
INNER JOIN
`bigquery-public-data.new_york_citibike.citibike_stations`
ON
station_id = start_station_id
ORDER BY
number_of_rides DESC
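The scoping rule discussed in this thread can be reproduced on toy tables. This is a minimal sketch assuming SQLite through Python's sqlite3 module, not BigQuery. The table names `stations` and `trips`, the alias `ride_counts`, and the sample rows are hypothetical stand-ins for the public Citi Bike data. The first query refers to the inner alias `t` from the outer query and is rejected; the second refers to the derived table's own alias and runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stations (station_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE trips (trip_id INTEGER PRIMARY KEY, start_station_id INTEGER);
    INSERT INTO stations VALUES (1, 'Broadway'), (2, 'Central Park');
    INSERT INTO trips (start_station_id) VALUES (1), (2), (2), (2);
""")

# The alias "t" exists only inside the parentheses, so the outer ON clause
# cannot see it and the statement is rejected.
try:
    conn.execute("""
        SELECT s.station_id, s.name
        FROM (SELECT start_station_id, COUNT(*) AS number_of_rides
              FROM trips AS t
              GROUP BY t.start_station_id) AS ride_counts
        INNER JOIN stations AS s ON s.station_id = t.start_station_id
    """)
    inner_alias_visible = True
except sqlite3.OperationalError:
    inner_alias_visible = False

# Referring to the derived table's alias works.
station_rides = conn.execute("""
    SELECT s.station_id, s.name, ride_counts.number_of_rides
    FROM (SELECT start_station_id, COUNT(*) AS number_of_rides
          FROM trips AS t
          GROUP BY t.start_station_id) AS ride_counts
    INNER JOIN stations AS s ON s.station_id = ride_counts.start_station_id
    ORDER BY ride_counts.number_of_rides DESC
""").fetchall()

print(inner_alias_visible)  # False
print(station_rides)        # [(2, 'Central Park', 3), (1, 'Broadway', 1)]
```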

Hive sql find out how many common customer in each country

I have a table called custtable with 3 columns: custid, country, date.
There are 5 values in country: 'CH', 'US', 'UK', 'FR' and 'GE'.
I'm hoping for an elegant query to find out how many unique custid values are present in all 5 countries.
Currently I use subqueries and a temporary table to find the overlapping set, but I'd welcome suggestions for a simpler way.
Here is my way to find the overlap for 2 countries, and then I need to do another subquery:
with t1 AS
(SELECT DISTINCT [custid]
FROM custtable
   where date>20140101
and country='CH'),
t2 as
(SELECT DISTINCT [custid]
FROM custtable
   where date>20140101
and country='FR'),
t3 AS
(SELECT DISTINCT [custid]
FROM custtable
   where date>20140101
and country='US'),
t4 as
(SELECT DISTINCT [custid]
FROM custtable
   where date>20140101
and country='UK')
select count (distinct t1.custid)
from t1
inner join t3
on (t1.custid=t3.custid)
inner join t2
on (t1.custid=t2.custid)
inner join t4
on (t1.custid=t4.custid)
     
thank you for any input
I think a better way is to count how many distinct countries each custid has and filter count >= 5, e.g.,
with count_table as (
select custid, count(distinct country) as cnt
from custtable
where date>20140101
group by custid
)
select custid, cnt
from count_table
where cnt >= 5
Then count the custid values.
SELECT COUNTRY
, COUNT(DISTINCT CUSTID) AS CNT
FROM CUSTTABLE
GROUP BY COUNTRY
If you want customers in all five countries:
select custid
from custtable
where date > 20140101
group by custid
having count(distinct country) = 5;
If you want those particular five countries (as your query suggests):
select custid
from custtable
where date > 20140101 and
country in ('CH','US', 'UK','FR', 'GE')
group by custid
having count(distinct country) = 5;
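To see the HAVING COUNT(DISTINCT country) filter in action and turn it into a single number, here is a small sketch that assumes SQLite through Python's sqlite3 module rather than Hive. The sample rows are invented, and the derived-table alias `in_all_five` is hypothetical. Customer 1 appears in all five countries, customer 2 in only four, and customer 3 in all five but before the date cutoff.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE custtable (custid INTEGER, country TEXT, date INTEGER)")

countries = ("CH", "US", "UK", "FR", "GE")
rows = [(1, c, 20140201) for c in countries]        # in all five countries
rows += [(2, c, 20140301) for c in countries[:4]]   # missing 'GE'
rows += [(3, c, 20130101) for c in countries]       # all five, but too old
rows += [(1, "CH", 20140405)]                       # repeat visit, same country
conn.executemany("INSERT INTO custtable VALUES (?, ?, ?)", rows)

# Inner query: one row per customer seen in all five countries.
# Outer query: count those customers.
customers_in_all_five = conn.execute("""
    SELECT COUNT(*) FROM (
        SELECT custid
        FROM custtable
        WHERE date > 20140101
          AND country IN ('CH', 'US', 'UK', 'FR', 'GE')
        GROUP BY custid
        HAVING COUNT(DISTINCT country) = 5
    ) AS in_all_five
""").fetchone()[0]

print(customers_in_all_five)  # 1
```

COUNT(DISTINCT country) ignores the repeated 'CH' row, so customer 1 is counted once.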

Count and group the number of times each town is listed in the table

SELECT PEOPLE.TOWNKEY, TOWN_LOOKUP.TOWN FROM PEOPLE
INNER JOIN TOWN_LOOKUP
ON PEOPLE.TOWNKEY = TOWN_LOOKUP.PK
ORDER BY TOWN
Current Table Output:
You are missing the group by clause entirely:
SELECT tl.town, COUNT(*)
FROM people p
INNER JOIN town_lookup tl ON p.townkey = tl.pk
GROUP BY tl.town
ORDER BY tl.town
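The grouped count can be checked against a handful of rows. This is a rough sketch assuming SQLite through Python's sqlite3 module. The column layout beyond `townkey`, `pk` and `town` is a guess, and the sample rows and the `people_count` alias are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE town_lookup (pk INTEGER PRIMARY KEY, town TEXT);
    CREATE TABLE people (pk INTEGER PRIMARY KEY, townkey INTEGER);
    INSERT INTO town_lookup VALUES (1, 'Ashford'), (2, 'Bristol');
    INSERT INTO people (townkey) VALUES (1), (2), (2), (2), (1);
""")

# One output row per town, with the number of people linked to it.
town_counts = conn.execute("""
    SELECT tl.town, COUNT(*) AS people_count
    FROM people p
    INNER JOIN town_lookup tl ON p.townkey = tl.pk
    GROUP BY tl.town
    ORDER BY tl.town
""").fetchall()

print(town_counts)  # [('Ashford', 2), ('Bristol', 3)]
```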

Help with SQL QUERY OF JOIN+COUNT+MAX

I need help constructing an SQL query for a MySQL database. There are 2 tables as follows:
tblcities (id,name)
tblmembers(id,name,city_id)
Now I want to retrieve the details of the city that has the maximum number of members.
Regards
SELECT tblcities.id, tblcities.name, COUNT(tblmembers.id) AS member_count
FROM tblcities
LEFT JOIN tblmembers ON tblcities.id = tblmembers.city_id
GROUP BY tblcities.id
ORDER BY member_count DESC
LIMIT 1
Basically: retrieve all cities and count how many members each has, sort by that member count in descending order, making the highest count first - then show only that first city.
Terrible, but that's a way of doing it:
SELECT * FROM tblcities WHERE id IN (
SELECT city_id
FROM tblMembers
GROUP BY city_id
HAVING COUNT(*) = (
SELECT MAX(TOTAL)
FROM (
SELECT COUNT(*) AS TOTAL
FROM tblMembers
GROUP BY city_id
) AS AUX
)
)
That way, if there is a tie, still you'll get all cities with the maximum number of members...
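The difference between the two approaches shows up only when cities tie. Here is a minimal sketch, assuming SQLite through Python's sqlite3 module instead of MySQL, with invented rows in which two cities have two members each. The LIMIT 1 query returns one of the tied cities, while the HAVING ... = MAX query returns both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblcities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tblmembers (id INTEGER PRIMARY KEY, name TEXT, city_id INTEGER);
    INSERT INTO tblcities VALUES (1, 'Paris'), (2, 'Rome'), (3, 'Oslo');
    INSERT INTO tblmembers (name, city_id) VALUES
        ('Ann', 1), ('Bob', 1), ('Cid', 2), ('Dee', 2), ('Eve', 3);
""")

# Sort by member count and keep one row: a tie is broken arbitrarily.
first_only = conn.execute("""
    SELECT c.id, c.name, COUNT(m.id) AS member_count
    FROM tblcities c
    LEFT JOIN tblmembers m ON c.id = m.city_id
    GROUP BY c.id, c.name
    ORDER BY member_count DESC
    LIMIT 1
""").fetchall()

# Keep every city whose member count equals the maximum count.
all_tied = conn.execute("""
    SELECT c.id, c.name
    FROM tblcities c
    WHERE c.id IN (
        SELECT city_id
        FROM tblmembers
        GROUP BY city_id
        HAVING COUNT(*) = (
            SELECT MAX(total)
            FROM (SELECT COUNT(*) AS total FROM tblmembers GROUP BY city_id) AS aux
        )
    )
    ORDER BY c.id
""").fetchall()

print(len(first_only))  # 1
print(all_tied)         # [(1, 'Paris'), (2, 'Rome')]
```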
Select ...
From tblCities As C
Join (
Select city_id, Count(*) As MemberCount
From tblMembers
Group By city_id
Order By Count(*) Desc
Limit 1
) As MostMembers
On MostMembers.city_id = C.id
select top 1 c.id, c.name, count(*)
from tblcities c, tblmembers m
where c.id = m.city_id
group by c.id, c.name
order by count(*) desc