Get a column without adding it to the group by - sql

select year, gender, max(nHospitalizations) from (select TO_CHAR(i.since, 'YYYY') as year, u.gender, h.name,count(h.name) as nHospitalizations from hospital h
join hospitalization i on i.hospital = h.name
join person u on i.person = u.numberID
group by TO_CHAR(i.since, 'YYYY'), u.gender, h.name)
group by year, gender
order by year desc, gender asc
;
I have this query, and it's pretty much doing what I want it to. However, I want to know the name of the hospital with the most hospitalizations per year. When I add h.name to the select, SQL makes me add it to the outer group by, which means I get the count per year, gender and hospital name, just like in the subquery, instead of the hospital with the most hospitalizations per year and gender. How can I add h.name to the outer query without adding it to the outer group by?

Never use commas in the FROM clause. Always use proper, explicit, standard, readable JOIN syntax.
You want to use window functions for this:
select *
from (select count(*) as nHospitalizations, to_char(i.since, 'YYYY') as year,
u.gender, h.name,
row_number() over (partition by to_char(i.since, 'YYYY'), u.gender order by count(*) desc) as seqnum
from hospital h join
hospitalization i
on i.hospital = h.name join
person u
on i.person = u.numberID
group by to_char(i.since, 'YYYY'), u.gender, h.name
) x
where seqnum = 1;
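To make the pattern concrete, here is a minimal sketch that runs the same idea against an in-memory SQLite database from Python. The table layout is a simplified assumption (a single hypothetical hospitalization table that already carries gender), strftime stands in for TO_CHAR, and the aggregation is split into a CTE for readability. It needs SQLite 3.25 or later for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, denormalized sample data (not the asker's real schema).
conn.executescript("""
CREATE TABLE hospitalization (hospital TEXT, gender TEXT, since TEXT);
INSERT INTO hospitalization VALUES
  ('General', 'F', '2020-01-05'),
  ('General', 'F', '2020-03-10'),
  ('Mercy',   'F', '2020-02-01'),
  ('Mercy',   'M', '2020-04-01'),
  ('Mercy',   'M', '2020-06-15'),
  ('General', 'M', '2020-07-20'),
  ('Mercy',   'F', '2021-01-01');
""")

query = """
WITH counts AS (
  -- one row per (year, gender, hospital) with its hospitalization count
  SELECT strftime('%Y', since) AS year, gender, hospital, COUNT(*) AS n
  FROM hospitalization
  GROUP BY strftime('%Y', since), gender, hospital
)
SELECT year, gender, hospital, n
FROM (
  -- number the hospitals inside each (year, gender) group, busiest first
  SELECT counts.*,
         ROW_NUMBER() OVER (PARTITION BY year, gender ORDER BY n DESC) AS seqnum
  FROM counts
)
WHERE seqnum = 1
ORDER BY year DESC, gender ASC
"""
top_rows = conn.execute(query).fetchall()
for row in top_rows:
    print(row)
```

The hospital name travels along with the row that wins each partition, so it never has to appear in an outer GROUP BY.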

Related

How could I use the select statement on feature from a subquery ? (Postgree)

I'm training for an interview and trying to solve a query: for each city, I would like to find the client who spent the most. I get the correct maximum amount spent per city, but I get an error when I try to retrieve the first and last name of the customer who spent this amount. Is there an efficient way to do it? Thank you!
select max(total_payment),X.city, X.firstname, X.lastname
from (
select sum(amount) as total_payment, c.customer_id, cit.city_id, cit.city as city, c.first_name as firstname, c.last_name as lastname
from payment p
inner join customer as c on p.customer_id=c.customer_id
inner join address as ad on c.address_id=ad.address_id
inner join city as cit on ad.city_id=cit.city_id
group by c.customer_id, cit.city_id
order by city
) as X
group by X.city
Target result column:
The name and last name of the customer who spent the most for each city.
120,Paris,Nicolas, Dupont
130, Madrid, Raul, Garcia
70, London,Dave, Goldman
You want window functions:
select cc.*
from (select sum(p.amount) as total_payment, c.customer_id, cit.city_id,
cit.city as city, c.first_name as firstname, c.last_name as lastname,
row_number() over (partition by cit.city order by sum(p.amount) desc) as seqnum
from payment p join
customer c
on p.customer_id = c.customer_id join
address ad
on c.address_id = ad.address_id join
city cit
on ad.city_id = cit.city_id
group by c.customer_id, cit.city_id
) cc
where seqnum = 1;
Note that your query has two errors that should fail any interview:
You are using ORDER BY in a subquery. According to the standard and most databases, ORDER BY is either not allowed or ignored.
In your outer query, the GROUP BY columns are inconsistent with the unaggregated SELECT columns. Once again, this violates the standard and most databases return a syntax error.

Selecting rows with the most repeated values at specific column

Problem in general words: I need to select value from one table referenced to the most repeated values in another table.
Tables have this structure:
screenshot
screenshot2
The question is to find country which has the most results from sportsmen related to it.
First, INNER JOIN tables to have relation between result and country
SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id);
Then, I count how many times each country appears
SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id))
GROUP BY country
;
And got this screenshot3
Now it feels like I'm one step away from solution ))
I guess it's possible with one more SELECT FROM (SELECT ...) and MAX() but I can't wrap it up?
ps:
I did it with doubling the query like this but I feel like it's so inefficient if there are millions of rows.
SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
)
WHERE highest_participation = (SELECT MAX(highest_participation)
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
))
Also I did it with a view
CREATE VIEW temp AS
SELECT country as country_with_most_participations, COUNT(country) as country_participate_in_#_comp
FROM(
SELECT country, competition_id FROM result
INNER JOIN sportsman USING(sportsman_id)
)
GROUP BY country;
SELECT country_with_most_participations FROM temp
WHERE country_participate_in_#_comp = (SELECT MAX(country_participate_in_#_comp) FROM temp);
But not sure if it's easiest way.
If I understand this correctly you want to rank the countries per competition count and show the highest ranking country (or countries) with their count. I suggest you use RANK for the ranking.
select country, competition_count
from
(
select
s.country,
count(*) as competition_count,
rank() over (order by count(*) desc) as rn
from sportsman s
inner join result r using (sportsman_id)
group by s.country
) ranked_by_count
where rn = 1
order by country;
If the order of the result rows doesn't matter, you can shorten this to:
select s.country, count(*) as competition_count
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by count(*) desc
fetch first rows with ties;
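The difference between RANK and ROW_NUMBER when two countries tie can be sketched with a small in-memory SQLite database driven from Python. The sample rows are invented for illustration, and the aggregation is placed in a CTE rather than inline; SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Invented sample data: Kenya and Norway tie with two results each.
conn.executescript("""
CREATE TABLE sportsman (sportsman_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE result (competition_id INTEGER, sportsman_id INTEGER);
INSERT INTO sportsman VALUES (1, 'Kenya'), (2, 'Norway'), (3, 'Japan');
INSERT INTO result VALUES (10, 1), (11, 1), (10, 2), (12, 2), (10, 3);
""")

template = """
WITH counts AS (
  SELECT s.country, COUNT(*) AS competition_count
  FROM sportsman s
  JOIN result r USING (sportsman_id)
  GROUP BY s.country
)
SELECT country, competition_count
FROM (
  SELECT counts.*, {fn}() OVER (ORDER BY competition_count DESC) AS rn
  FROM counts
)
WHERE rn = 1
ORDER BY country
"""

# RANK gives every tied country position 1; ROW_NUMBER picks just one of them.
tied = conn.execute(template.format(fn="RANK")).fetchall()
single = conn.execute(template.format(fn="ROW_NUMBER")).fetchall()
print(tied)
print(single)
```

With RANK both tied countries come back; with ROW_NUMBER only one does, and which one is not defined unless the window ORDER BY includes a tiebreaker.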
You seem to be overcomplicating this. Starting from your existing join query, you can aggregate, order the results and keep the top row(s) only.
select s.country, count(*) cnt
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by cnt desc
fetch first 1 row with ties
Note that this allows top ties, if any.
SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
order by 2 desc
)
where rownum=1

query to find the athlete with the most medals per year in sql

select A."ID","ATHLETE_NAME", "YEAR"
from OLYM."OLYM_ATHLETES" A
JOIN OLYM."OLYM_MEDALS" B ON A."ID" = B."ATHLETE_GAME_ID"
JOIN OLYM."OLYM_GAMES" C ON B."EVENT_ID" = C.ID;
This gave me a table with the athlete ID, name and the year in which they won a medal. Is there any way to extract the most decorated athlete per year from this table, or am I missing something?
Table image
If I followed you correctly, you need the athlete with the most medals in each year. If so, you can use the analytic function row_number as follows:
SELECT ID, ATHLETE_NAME, YEAR, CNT FROM
(select A."ID","ATHLETE_NAME", "YEAR", COUNT(1) AS cnt,
Row_number() over (partition by "YEAR" order by count(1) desc) as rn
from OLYM."OLYM_ATHLETES" A
JOIN OLYM."OLYM_MEDALS" B ON A."ID" = B."ATHLETE_GAME_ID"
JOIN OLYM."OLYM_GAMES" C ON B."EVENT_ID" = C.ID
group by A."ID","ATHLETE_NAME", "YEAR")
WHERE RN = 1
ORDER BY YEAR
Please use the query below; it gives you the medal count per athlete and year. You can then use a HAVING clause to filter the result according to your requirement.
select A."ID","ATHLETE_NAME", "YEAR", count(1)
from OLYM."OLYM_ATHLETES" A
JOIN OLYM."OLYM_MEDALS" B ON A."ID" = B."ATHLETE_GAME_ID"
JOIN OLYM."OLYM_GAMES" C ON B."EVENT_ID" = C.ID
group by A."ID","ATHLETE_NAME", "YEAR" order by count(1) desc;

hive select max count by grouping on two fields

I am trying to write a SQL query to find the most popular artist in each country. The most popular artist is the one with the maximum number of ratings >= 8.
Below is table structure,
describe album;
albumid string
album_title string
album_artist string
describe album_ratings;
userid int
albumid string
rating int
describe cusers;
userid int
state string
country string
Below is one query that I wrote but it is not working.
select album_artist, country, count(rating)
from album, album_ratings, cusers
where album.albumid=album_ratings.albumid
and album_ratings.userid=cusers.userid
and rating>=6
group by country, album_artist
having count(rating) = (
select max(t.cnt)
from (
select count(rating) as cnt
from album, album_ratings, cusers
where album.albumid=album_ratings.albumid
and album_ratings.userid=cusers.userid
and rating>=6
group by country, album_artist
) as t
group by t.country
);
Learn to use proper, explicit JOIN syntax. Never use commas in the FROM clause.
You can do this with window functions:
select *
from (select album_artist, country, count(*) as cnt,
row_number() over (partition by country order by count(*) desc) as seqnum
from album a join
album_ratings ar
on a.albumid = ar.albumid join
cusers u
on ar.userid = u.userid
where rating >= 6
group by country, album_artist
) aru
where seqnum = 1;
If you want ties, use rank() instead of row_number().
You can use window function row_number to find most popular artist in each country (higher rating - more popular):
select *
from (
select c.country,
a.album_artist,
sum(rating) as total_rating,
row_number() over (partition by c.country order by sum(rating) desc) as rn
from cusers c
join album_ratings r on c.userid = r.userid
join album a on r.albumid = a.albumid
where r.rating >= 8
group by c.country,
a.album_artist
) t
where rn = 1;
I assumed sum(rating) instead, because I think rating should be additive.
Also, always use explicit join syntax instead of old comma based join.

How to find the oldest date in an SQL Table

I'm making a SQL query and I need some help
The question is:
What is an employee's current salary , and what was their salary when they first started working?
(give name , first letters , current salary , department name , start date and their salary when they started)
The problem is:
I keep getting all the dates, and I can't find out how to filter it to only the first date (start/hire date) being shown.
Any suggestions?
My SQL code looks like this:
SELECT M.NAME,
M.FIRSTLETTERS,
M.MONTHSALARY,
D.NAME,
H.STARTDATE,
H.SALARY
FROM E_EMPLOYEES M
LEFT OUTER JOIN E_DEPARTMENTS A
ON M.AFD = A.ANR
INNER JOIN E_historie H
ON M.MNR = H.MNR
You can use aggregation and keep for this purpose. The query looks like:
SELECT M.NAME, M.FIRSTLETTERS,
MIN(H.STARTDATE) as STARTDATE,
MAX(H.SALARY) KEEP (DENSE_RANK FIRST ORDER BY H.STARTDATE) as FIRST_SALARY,
MAX(H.SALARY) KEEP (DENSE_RANK FIRST ORDER BY H.STARTDATE DESC) as LAST_SALARY
FROM E_EMPLOYEES M INNER JOIN
E_DEPARTMENTS A
ON M.AFD = A.ANR INNER JOIN
E_historie H
ON M.MNR = H.MNR
GROUP BY M.MNR, M.NAME, M.FIRSTLETTERS;
Use ROW_NUMBER to find the first salary of each employee from history table
SELECT M.NAME,
M.FIRSTLETTERS,
M.MONTHSALARY, -- considering this is current salary
A.NAME, -- I think there is a typo in the question: the alias should be A, not D
H.STARTDATE,
H.SALARY
FROM E_EMPLOYEES M
LEFT OUTER JOIN E_DEPARTMENTS A
ON M.AFD = A.ANR
INNER JOIN (SELECT Row_number() OVER (PARTITION BY H.MNR ORDER BY H.STARTDATE ASC) AS Rn,
H.MNR,
H.STARTDATE,
H.SALARY
FROM E_historie H) H
ON M.MNR = H.MNR
ON M.MNR = H.MNR
WHERE Rn = 1
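As a rough, self-contained illustration of the ROW_NUMBER approach, the sketch below picks each employee's earliest history row in an in-memory SQLite database from Python. The history table and its rows are hypothetical stand-ins for E_historie, and SQLite 3.25 or later is assumed for window functions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical salary history; dates are ISO strings so they sort correctly.
conn.executescript("""
CREATE TABLE history (mnr INTEGER, startdate TEXT, salary INTEGER);
INSERT INTO history VALUES
  (1, '2015-01-01', 2000),
  (1, '2018-06-01', 2600),
  (2, '2019-03-15', 3100),
  (2, '2017-09-01', 2800);
""")

query = """
SELECT mnr, startdate, salary
FROM (
  -- rn = 1 marks the earliest history row of each employee
  SELECT h.*,
         ROW_NUMBER() OVER (PARTITION BY mnr ORDER BY startdate ASC) AS rn
  FROM history h
)
WHERE rn = 1
ORDER BY mnr
"""
first_salaries = conn.execute(query).fetchall()
for row in first_salaries:
    print(row)
```

Because the whole row is kept, the starting salary comes from the same record as the earliest start date, which a plain MIN(startdate) alongside an unrelated aggregate on salary would not guarantee.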