How to apply HAVING to a column and not a function - sql

I have two tables, one with the names of hacking students and another with the challenges they made. I need to return the id, name and number of challenges for each hacker, e.g.:
hacker_id  name     challenges_created
21283      Angela   6
88255      Patrick  5
5077       Rose     4
62743      Frank    4
96196      Lisa     1
But if more than one hacker created the same number of challenges and that number is less than that of the hacker who made the most challenges, those hackers must be excluded from the results. In this case, the 4s must be excluded. I found the exact answer for the problem online, which looks like this (edited to use my table names):
SELECT c.hacker_id, h.name, COUNT(c.hacker_id) AS ctn
FROM Sample0.Hackers as h
LEFT JOIN Sample0.Challenges c ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name
HAVING ctn = (SELECT TOP 1 COUNT(c1.challenge_id) FROM Sample0.Challenges AS c1 GROUP BY c1.hacker_id ORDER BY COUNT(*)) OR
ctn NOT IN (SELECT COUNT(c2.challenge_id) FROM Sample0.Challenges AS c2 GROUP BY c2.hacker_id HAVING c2.hacker_id <> c.hacker_id);
I'm getting errors on the HAVING clause, saying "Invalid column name 'ctn'". I've only worked with HAVING once and can only use a basic function on it. I don't know why it's giving me this error.

I would handle this using analytic functions:
WITH cte AS (
SELECT h.hacker_id, h.name, COUNT(c.challenge_id) AS challenges_created,
RANK() OVER (ORDER BY COUNT(c.challenge_id) DESC) rnk,
COUNT(*) OVER (PARTITION BY COUNT(c.challenge_id)) cnt
FROM Sample0.Hackers as h
LEFT JOIN Sample0.Challenges c
ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name
)
SELECT
hacker_id,
name,
challenges_created
FROM cte
WHERE rnk = 1 OR cnt = 1;
The idea here is that an aggregate record should be retained if it either is tied for the highest challenge count or there are no other records having the same challenge count.
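To make the "top rank or unique count" rule concrete, here is a minimal sketch that runs against an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in for your engine, the rows are made up to match the sample output in the question, and the counting is split into its own CTE (per_hacker, a name I chose) before the window functions are applied.

```python
import sqlite3

# In-memory SQLite database (window functions need SQLite 3.25 or newer).
# Table and column names mirror the question; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hackers (hacker_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Challenges (challenge_id INTEGER PRIMARY KEY, hacker_id INTEGER);
INSERT INTO Hackers VALUES
    (21283, 'Angela'), (88255, 'Patrick'), (5077, 'Rose'),
    (62743, 'Frank'), (96196, 'Lisa');
""")

# Give each hacker the number of challenges shown in the question.
counts = {21283: 6, 88255: 5, 5077: 4, 62743: 4, 96196: 1}
for hacker_id, n in counts.items():
    conn.executemany(
        "INSERT INTO Challenges (hacker_id) VALUES (?)", [(hacker_id,)] * n
    )

query = """
WITH per_hacker AS (
    -- Step 1: one row per hacker with the challenge count.
    SELECT h.hacker_id, h.name, COUNT(c.challenge_id) AS challenges_created
    FROM Hackers AS h
    LEFT JOIN Challenges AS c ON h.hacker_id = c.hacker_id
    GROUP BY h.hacker_id, h.name
),
ranked AS (
    -- Step 2: rank by count and count how many hackers share each count.
    SELECT hacker_id, name, challenges_created,
           RANK() OVER (ORDER BY challenges_created DESC) AS rnk,
           COUNT(*) OVER (PARTITION BY challenges_created) AS cnt
    FROM per_hacker
)
SELECT hacker_id, name, challenges_created
FROM ranked
WHERE rnk = 1 OR cnt = 1
ORDER BY challenges_created DESC, hacker_id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data Rose and Frank (both at 4) drop out, while Angela, Patrick and Lisa remain.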

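You cannot reference a column alias from the SELECT list (such as ctn) inside the HAVING clause. Repeat the aggregate expression instead, like below: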
SELECT
c.hacker_id,
h.name,
COUNT(c.hacker_id) AS ctn
FROM Sample0.Hackers as h
left JOIN Sample0.Challenges as c
ON h.hacker_id = c.hacker_id
GROUP BY c.hacker_id, h.name
HAVING COUNT(c.hacker_id) = (SELECT TOP 1 COUNT(c1.challenge_id) FROM Sample0.Challenges AS c1 GROUP BY c1.hacker_id ORDER BY COUNT(*) DESC) OR
COUNT(c.hacker_id) NOT IN (SELECT COUNT(c2.challenge_id) FROM Sample0.Challenges AS c2 where c2.hacker_id <> c.hacker_id GROUP BY c2.hacker_id )
Also, your SELECT column (c.hacker_id) and your GROUP BY column (h.hacker_id) were not from the same table, so the query threw the error that 'hacker_id' is invalid in the HAVING clause.
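For reference, SQL Server logically evaluates HAVING before the SELECT list, which is why the alias is not visible there. Below is a small sketch of the portable form, with the aggregate written out again in HAVING. It runs on an in-memory SQLite database via Python, with invented rows and a simplified threshold (>= 2) that is only for illustration. Note that SQLite itself is more permissive and accepts aliases in HAVING, so this does not reproduce the error; it only shows the form that works everywhere.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Challenges (challenge_id INTEGER PRIMARY KEY, hacker_id INTEGER);
INSERT INTO Challenges (hacker_id) VALUES (1), (1), (1), (2), (3), (3);
""")

# Portable form: the aggregate is repeated in HAVING instead of
# referring to the alias ctn defined in the SELECT list.
query = """
SELECT hacker_id, COUNT(challenge_id) AS ctn
FROM Challenges
GROUP BY hacker_id
HAVING COUNT(challenge_id) >= 2
ORDER BY hacker_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```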

Related

SQL Query optimization written for an Hacker Rank Challenge [closed]

For a HackerRank SQL challenge, I have written a SQL script that produces the required result, but it relies on too many subqueries. I would like to know whether the code can be simplified or optimized. The challenge is here:
https://www.hackerrank.com/challenges/challenges/problem
Challenge details in brief:
"Write a query to print the hacker_id, name, and the total number of challenges created by each student. Sort your results by the total number of challenges in descending order. If more than one student created the same number of challenges, then sort the result by hacker_id. If more than one student created the same number of challenges and the count is less than the maximum number of challenges created, then exclude those students from the result."
Here is the code I have written for the above challenge:
SELECt
h.hacker_id,
h.name,
t.tot_ch
from
hackers h,
(
Select
c.hacker_id,
count(c.challenge_id) Tot_ch
from
challenges c
Group by
hacker_id
)
T,
(
SELect
tot_ch,
count(tot_ch) DUPS
from
(
Select
c.hacker_id,
count(c.challenge_id) Tot_ch
from
challenges c
Group by
hacker_id
)
group by tot_ch
)
D
Where
h.hacker_id = t.hacker_id
And d.tot_ch = t.tot_ch
AND
(
CASe
when
d.dups < 2
then
1
ELSE
(
case when
t.tot_ch =
(
select
MAX(T1.tot_ch)
from
(
Select
c.hacker_id,
count(c.challenge_id) Tot_ch
from
challenges c
Group by
hacker_id
)
T1
)
then
1
End
)
end
)
= 1
ORDER BY
t.tot_ch desc, h.hacker_id;
You can use common table expressions to turn your repeated subqueries into inline views which you can query just as you would a normal view:
WITH cteCount AS (SELECT c.HACKER_ID,
COUNT(c.CHALLENGE_ID) TOT_CH
FROM CHALLENGES c
GROUP BY HACKER_ID),
cteDups AS (SELECT TOT_CH,
COUNT(TOT_CH) AS DUPS
FROM cteCount
GROUP BY TOT_CH)
SELECT h.HACKER_ID,
h.NAME,
t.TOT_CH
FROM HACKERS h
INNER JOIN cteCount t
ON t.HACKER_ID = h.HACKER_ID
INNER JOIN cteDups d
ON d.TOT_CH = t.TOT_CH
WHERE CASE
WHEN d.DUPS < 2
THEN 1
WHEN t.TOT_CH = (SELECT MAX(TOT_CH)
FROM cteCount)
THEN 1
END = 1
ORDER BY t.TOT_CH DESC,
h.HACKER_ID;
You can use a combination of analytic functions, as follows:
Select hacker_id, name, cnt
from
(Select t.*,
count(1) over (partition by cnt) as same_cnt,
Max(cnt) over () as max_cnt
from
(Select h.hacker_id,
h.name,
Count(1) as cnt
From hackers h
Join challenges c
On h.hacker_id = c.hacker_id
Group by h.hacker_id, h.name) t)
Where same_cnt = 1
or cnt = max_cnt
Cheers!!
with
cte1 as (select hacker_id, count(challenge_id) as challenges_created
from Challenges
group by hacker_id
order by hacker_id
),
cte2 as (select c.hacker_id, h.name, c.challenges_created
from cte1 c
left join Hackers h
on c.hacker_id = h.hacker_id
),
cte3 as (select max(challenges_created) as max_challenges
from cte2
),
cte4 as (select challenges_created, count(hacker_id) as hack_count_same
from cte2
where challenges_created not in(select max_challenges from cte3)
group by challenges_created
order by challenges_created
),
cte5 as (select hacker_id, name, challenges_created
from cte2
where challenges_created not in(select challenges_created from cte4 where hack_count_same > 1)
),
cte6 as (select hacker_id, name, challenges_created
from cte5
union all
select hacker_id, name, challenges_created
from cte2
where challenges_created in(select max_challenges from cte3)
and hacker_id not in(select hacker_id from cte5)
)
select *
from cte6
order by challenges_created desc, hacker_id;
select a.hacker_id,a.name,count(b.hacker_id) As challenges_created
from Hackers a, Challenges b
WHERE a.hacker_id = b.hacker_id
GROUP BY a.hacker_id,a.name
HAVING count(b.hacker_id) not in (select distinct count(hacker_id) from Challenges
WHERE hacker_id <> a.hacker_id
group by hacker_id
having count(hacker_id) < (select max(x.challenge_count)
from (select count(b.challenge_id) as challenge_count from Challenges b GROUP BY b.hacker_id) as x ))
ORDER BY count(b.hacker_id) desc, a.hacker_id

Counter Example to the Following Solution?

I recently completed a coding challenge with the following prompt:
The total score of a hacker is the sum of their maximum scores for all
of the challenges. Write a query to print the hacker_id, name, and
total score of the hackers ordered by the descending score. If more
than one hacker achieved the same total score, then sort the result by
ascending hacker_id. Exclude all hackers with a total score of 0 from
your result.
The following tables contain contest data:
Hackers: The hacker_id is the id of the hacker, and name is the name
of the hacker.
Submissions: The submission_id is the id of the submission, hacker_id
is the id of the hacker who made the submission, challenge_id is the
id of the challenge for which the submission belongs to, and score is
the score of the submission.
My solution passed the test cases, but it took me many iterations to get there.
I get the feeling that there may be an edge case / specific input which would not pass using my solution, but I'm having trouble figuring it out.
Any guesses or counter examples?
My solution:
Select ID, Name, sum(maxscore) as tot From
(Select ID, Name, chal, Max(score) as maxscore From
(Select Submissions.hacker_id as ID, Hackers.name as Name, Submissions.score as score, Submissions.challenge_id as chal
From Submissions
Inner Join Hackers on Submissions.hacker_id = Hackers.hacker_id
Where Submissions.score <> 0)
Group by chal, ID, Name)
Group by ID, Name Order by tot desc, ID asc;
Select a.hacker_id,a.name,x.sum_max
from Hackers a
inner join
(SELECT hacker_id,sum(max_score) as sum_max from (
Select hacker_id,challenge_id,max(Score) as max_score
from submissions
group by hacker_id,challenge_id)b
group by hacker_id
having sum_max>0)x on a.hacker_id=x.hacker_id
order by sum_max desc,hacker_id
Here is mine
select T.hacker_id, T.name , sum(T.max_score) sum_score from (
Select h.hacker_id, name , challenge_id, max(score) max_score from Hackers h
inner join submissions s on h.hacker_id = s.hacker_id
group by h.hacker_id, name , challenge_id) as T
group by T.hacker_id, T.name
having sum(T.max_score) > 0
order by sum_score desc , T.hacker_id
This code for MySQL successfully ran the test cases.
select m.id, h.name, m.total from
(select x.hacker_id id, sum(x.max_score) total from
(select h.hacker_id, s.challenge_id, max(s.score) as max_score from submissions as s
join hackers as h on h.hacker_id = s.hacker_id
group by h.hacker_id, s.challenge_id) as x
group by x.hacker_id
having total <> 0)as m,
hackers as h
where m.id = h.hacker_id
order by m.total desc, m.id;
First try this query, then vote:
select h.hacker_id,h.name,sum(s.score) as scores from Hackers h
inner join
(select hacker_id,max(score) as score from Submissions group by hacker_id,challenge_id) s
on h.hacker_id=s.hacker_id
group by h.hacker_id,h.name having scores>0
order by scores desc,h.hacker_id;
SELECT A, B, sum(C) FROM
(SELECT h.hacker_id AS A, h.name AS B, max(s.score)AS C FROM
Hackers h
JOIN Submissions s ON s.hacker_id=h.hacker_id
GROUP BY h.hacker_id,h.name, s.challenge_id) Sub
GROUP BY A, B
HAVING sum(C)>0
ORDER BY sum(C) DESC, A ASC;
That's how I did it:
WITH NOW AS (
SELECT CHALLENGE_ID AS CI, HACKER_ID AS HI, MAX(SCORE) AS TSCORE FROM SUBMISSIONS
WHERE HACKER_ID = HACKER_ID GROUP BY CHALLENGE_ID, HACKER_ID
)
SELECT C.HI, H.NAME, SUM(C.TSCORE) AS [TOTAL] FROM NOW AS C, HACKERS AS H
WHERE C.HI = H.HACKER_ID AND C.TSCORE > 0
GROUP BY C.HI, H.NAME ORDER BY TOTAL DESC, C.HI
SELECT
res.hacker_id as id,
res.name as name,
SUM(res.max_score) as score
from (
SELECT s.challenge_id as challenge_id, h.hacker_id as hacker_id ,h.name as name,Max(score) as max_score
FROM Hackers as h JOIN Submissions as s
ON h.hacker_id = s.hacker_id
GROUP BY challenge_id ,hacker_id,name
having max_score <> 0
) as res
GROUP BY hacker_id,name order by score desc,id asc;
Try this for MySQL:
SELECT t.hacker_id,t.name,sum(t.score) as score FROM
(select s.hacker_id, h.name, s.challenge_id, max(s.score) as score
FROM submissions s join Hackers h ON s.hacker_id = h.hacker_id
GROUP BY hacker_id, h.name, challenge_id ) as t
where score != 0 group by t.hacker_id,t.name order by sum(t.score) desc,t.hacker_id
Oracle solution:
with temp as (
select tp.hacker_id,h.name, sum(tp.score) s from (
select hacker_id ,submission_id, challenge_id ,score, rank() over (partition by
hacker_id, challenge_id order by score desc,submission_id )as rank from
submissions)tp
join hackers h on h.hacker_id=tp.hacker_id
where tp.rank=1
group by tp.hacker_id,h.name
order by s desc,tp.hacker_id)
select hacker_id,name,s from temp where s <> 0 order by s desc,hacker_id ;
I tried this and it worked for me:
select H.Hacker_id, h.Name, sum(x.Score)
from hackers h
inner join (
select hacker_id, challenge_id, max(Score) as Score
from Submissions
group by hacker_id, challenge_id
) x
on H.Hacker_id = x.hacker_Id
group by H.Hacker_Id, h.Name
having sum(x.score) > 0
order by 3 desc, h.hacker_id
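Most of the answers above share the same shape: take the maximum score per hacker and challenge, sum those maxima per hacker, then drop zero totals. Here is a hedged sketch of that shape on an in-memory SQLite database via Python. The rows are invented to exercise the edge cases the question asks about: repeated submissions, a hacker whose scores are all 0, a hacker with no submissions, and a tie on the total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hackers (hacker_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Submissions (
    submission_id INTEGER PRIMARY KEY,
    hacker_id INTEGER, challenge_id INTEGER, score INTEGER
);
INSERT INTO Hackers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy'), (4, 'Dee');
INSERT INTO Submissions (hacker_id, challenge_id, score) VALUES
    (1, 10, 20), (1, 10, 50), (1, 11, 30),   -- Ann: 50 + 30 = 80
    (2, 10, 80), (2, 11, 0),                 -- Bob: 80 + 0 = 80 (ties with Ann)
    (3, 10, 0),  (3, 11, 0);                 -- Cy: total 0, must be excluded
-- Dee has no submissions at all.
""")

query = """
SELECT h.hacker_id, h.name, SUM(m.max_score) AS total
FROM Hackers AS h
JOIN (
    -- Best score per hacker and challenge.
    SELECT hacker_id, challenge_id, MAX(score) AS max_score
    FROM Submissions
    GROUP BY hacker_id, challenge_id
) AS m ON m.hacker_id = h.hacker_id
GROUP BY h.hacker_id, h.name
HAVING SUM(m.max_score) > 0
ORDER BY total DESC, h.hacker_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

On this data, filtering zero scores before aggregating (as in the question) and filtering zero totals afterwards give the same rows, assuming scores are never negative.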

Display the city name which has most number of branches

I am trying to get the name of the city that has the most branches.
select C.City_name ,count(B.B_Name)
from tblcity C
inner join
tblBranch B
on c.city_id=B.City_id
group by C.City_name
order by count(B.B_Name) desc
The above code gives me the count of branches for each city.
Please help me get only the name of the city that has the most branches.
You can add TOP 1 to your query:
select TOP 1 C.City_name ,count(B.B_Name)
from tblcity C
inner join
tblBranch B
on c.city_id=B.City_id
group by C.City_name
order by count(B.B_Name) desc
Use DENSE_RANK():
SELECT
City_Name, cnt
FROM
(
SELECT
c.City_name,
COUNT(b.B_Name) cnt,
DENSE_RANK() OVER (ORDER BY COUNT(b.B_Name) DESC) dr
FROM tblcity c
INNER JOIN tblBranch b
ON b.city_id = c.City_id
GROUP BY c.City_name
) t
WHERE dr = 1;
Using TOP 1 WITH TIES would be another option here, but that is specific to SQL Server.
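The difference between the two answers shows up when two cities tie for first place. A minimal sketch on an in-memory SQLite database via Python, with invented city names and branches: LIMIT 1 plays the role of TOP 1 here (SQLite has no TOP), and the counts are computed in a CTE (counts, a name I chose) before DENSE_RANK is applied.

```python
import sqlite3

# Window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblcity (City_id INTEGER PRIMARY KEY, City_name TEXT);
CREATE TABLE tblBranch (B_Name TEXT, City_id INTEGER);
INSERT INTO tblcity VALUES (1, 'Pune'), (2, 'Delhi'), (3, 'Goa');
INSERT INTO tblBranch VALUES
    ('P1', 1), ('P2', 1), ('D1', 2), ('D2', 2), ('G1', 3);
""")

counts_cte = """
WITH counts AS (
    SELECT c.City_name, COUNT(b.B_Name) AS cnt
    FROM tblcity AS c
    JOIN tblBranch AS b ON b.City_id = c.City_id
    GROUP BY c.City_name
)
"""

# DENSE_RANK keeps every city tied for the highest count.
rows = conn.execute(counts_cte + """
SELECT City_name, cnt
FROM (
    SELECT City_name, cnt, DENSE_RANK() OVER (ORDER BY cnt DESC) AS dr
    FROM counts
) AS t
WHERE dr = 1
ORDER BY City_name
""").fetchall()

# LIMIT 1 (the SQLite counterpart of TOP 1) keeps a single row,
# and which of the tied cities it returns is not guaranteed.
top_one = conn.execute(counts_cte + """
SELECT City_name, cnt FROM counts ORDER BY cnt DESC LIMIT 1
""").fetchall()

print(rows)
print(top_one)
```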

Oracle - Not a single-group group function

For the following sql command
select h.hacker_id
,h.name
,challenges_created
from hackers h
inner join (
select hacker_id
,count(*) as challenges_created /* line 1 */
from challenges
group by hacker_id
order by 2 desc
) tmp on h.hacker_id = tmp.hacker_id
order by challenges_created desc
,h.hacker_id;
So far so good. But as soon as I try to add max(count(*)) as maximum after the line marked "line 1", it gives this error:
Not a single-group group function
This is the code that produces the error:
select h.hacker_id
,h.name
,challenges_created
from hackers h
inner join (
select hacker_id
,count(*) as challenges_created
,max(count(*)) as maximum
from challenges
group by hacker_id
) tmp on h.hacker_id = tmp.hacker_id
order by challenges_created desc
,h.hacker_id;
I am basically interested in getting the maximum count i.e. maximum number of challenges created so far.
I am new to SQL, so any help is appreciated. Thanks in advance. I know many similar questions have been asked recently, but none matches my situation, which is why I am asking again.
Try this:
with x as (
select h.hacker_id, h.name, count(*) challenges_created
from hackers h
inner join challenges on h.hacker_id = challenges.hacker_id
group by h.hacker_id, h.name
)
select x.*,
(select max(challenges_created) from x) as "max"
from x
order by challenges_created desc, hacker_id;
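The error comes from mixing a nested aggregate, which collapses the grouped rows into a single value, with per-group columns in the same SELECT. Besides the scalar subquery above, a window function over the grouped result can attach the overall maximum to every row. Here is a sketch of that alternative on an in-memory SQLite database via Python (not Oracle, and with invented rows), so treat it as an illustration of the idea rather than a tested Oracle query.

```python
import sqlite3

# Window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hackers (hacker_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE challenges (challenge_id INTEGER PRIMARY KEY, hacker_id INTEGER);
INSERT INTO hackers VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO challenges (hacker_id) VALUES (1), (1), (1), (2);
""")

query = """
WITH x AS (
    -- One row per hacker with the challenge count.
    SELECT h.hacker_id, h.name, COUNT(*) AS challenges_created
    FROM hackers AS h
    JOIN challenges AS c ON c.hacker_id = h.hacker_id
    GROUP BY h.hacker_id, h.name
)
SELECT hacker_id, name, challenges_created,
       MAX(challenges_created) OVER () AS maximum  -- same value on every row
FROM x
ORDER BY challenges_created DESC, hacker_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```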

oracle - maximum per group

University Table - UniversityName, UniversityId
Lease Table - LeaseId, BookId, UniversityId, LeaseDate
Book Table - BookId, UniversityId, Category, PageCount.
For each university, I have to find the category that had the most books leased.
So, something like
UniversityName Category #OfTimesLeased
I have been playing around with it with some success using Dense_Rank etc - but if there is a tie, only one of them shows up, while I want both of them to show up.
Current Query:
select b.UniversityId, MAX(tempTable.category) KEEP (DENSE_RANK FIRST ORDER BY tempTable.counter DESC)
from book b
join
(select count(l.leaseid) AS counter, b.category, b.universityid
from lease l
join book b
on b.bookid = l.bookid AND b.universityid = l.universityid
group by b.category, b.universityid) tempTable
on tempTable.universityid = b.universityid
group by b.universityid
With this query I am unable to handle ties, and I also cannot get the number of leases for the most-leased category.
Try this
WITH CTE AS
(
SELECT UniversityName, Category, Count(*) NumOfTimesLeased
FROM University u
INNER JOIN Book b on u.UniversityId = b.UniversityId
INNER JOIN Lease l on b.bookid = l.bookid and b.UniversityId = l.UniversityId
GROUP BY UniversityName, Category
),
CTE2 AS (
SELECT UniversityName, Category, NumOfTimesLeased,
RANK() OVER (PARTITION BY UniversityName
ORDER BY NumOfTimesLeased DESC) Rnk
FROM CTE)
SELECT * FROM CTE2 WHERE Rnk = 1
You are on the right track with the analytic functions:
select University, Category, NumLeased
from (select t.*,
row_number() over (partition by university order by Numleased desc) as seqnum
from (select l.university, b.category, count(*) as NumLeased
from lease l join
book b
on l.bookid = b.bookid
group by l.university, b.category
) t
) t
where seqnum = 1
I use the row_number() because you only want the one top value. Rank and dense_rank are more useful when you are looking for values other than "1".
If you want the top values to show up when there is a tie, then use dense_rank instead of row_number. The values will be on different rows.
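To see how the three functions treat a tie, here is a small sketch on an in-memory SQLite database via Python. The table lease_counts is hypothetical: it stands in for the already aggregated (university, category, count) rows that the inner query would produce, and its contents are invented.

```python
import sqlite3

# Window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lease_counts (university TEXT, category TEXT, num_leased INTEGER);
INSERT INTO lease_counts VALUES
    ('A', 'Math', 5), ('A', 'Art', 5), ('A', 'Bio', 2), ('B', 'Math', 1);
""")

query = """
SELECT university, category, num_leased,
       ROW_NUMBER() OVER (PARTITION BY university
                          ORDER BY num_leased DESC, category) AS rn,
       RANK()       OVER (PARTITION BY university
                          ORDER BY num_leased DESC) AS rnk,
       DENSE_RANK() OVER (PARTITION BY university
                          ORDER BY num_leased DESC) AS drnk
FROM lease_counts
ORDER BY university, num_leased DESC, category
"""
rows = conn.execute(query).fetchall()

# Filtering on rn = 1 keeps one category per university;
# filtering on rnk = 1 (or drnk = 1) keeps every tied category.
top_by_row_number = [r[:3] for r in rows if r[3] == 1]
top_by_rank = [r[:3] for r in rows if r[4] == 1]

print(rows)
print(top_by_row_number)
print(top_by_rank)
```

University A has two categories tied at 5 leases: row_number numbers them 1 and 2, while rank and dense_rank give both a 1. The two ranking functions only differ further down, where Bio gets rank 3 but dense rank 2.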