SQL query optimization for a HackerRank challenge [closed] - sql

For a HackerRank SQL challenge, I wrote a query that produces the required result, but it relies on several repeated subqueries. I would like to know whether it can be optimized. The challenge is here:
https://www.hackerrank.com/challenges/challenges/problem
Challenge details in brief:
"Write a query to print the hacker_id, name, and the total number of challenges created by each student. Sort your results by the total number of challenges in descending order. If more than one student created the same number of challenges, then sort the result by hacker_id. If more than one student created the same number of challenges and the count is less than the maximum number of challenges created, then exclude those students from the result."
Here is the code I wrote for the challenge:
SELECT
    h.hacker_id,
    h.name,
    t.tot_ch
FROM
    hackers h,
    (
        SELECT c.hacker_id, COUNT(c.challenge_id) tot_ch
        FROM challenges c
        GROUP BY hacker_id
    ) t,
    (
        SELECT tot_ch, COUNT(tot_ch) dups
        FROM (
            SELECT c.hacker_id, COUNT(c.challenge_id) tot_ch
            FROM challenges c
            GROUP BY hacker_id
        )
        GROUP BY tot_ch
    ) d
WHERE
    h.hacker_id = t.hacker_id
    AND d.tot_ch = t.tot_ch
    AND (
        CASE
            WHEN d.dups < 2 THEN 1
            ELSE (
                CASE
                    WHEN t.tot_ch = (
                        SELECT MAX(t1.tot_ch)
                        FROM (
                            SELECT c.hacker_id, COUNT(c.challenge_id) tot_ch
                            FROM challenges c
                            GROUP BY hacker_id
                        ) t1
                    ) THEN 1
                END
            )
        END
    ) = 1
ORDER BY
    t.tot_ch DESC, h.hacker_id;

You can use common table expressions to turn your repeated subqueries into inline views which you can query just as you would a normal view:
WITH cteCount AS (SELECT c.HACKER_ID,
COUNT(c.CHALLENGE_ID) TOT_CH
FROM CHALLENGES c
GROUP BY HACKER_ID),
cteDups AS (SELECT TOT_CH,
COUNT(TOT_CH) AS DUPS
FROM cteCount
GROUP BY TOT_CH)
SELECT h.HACKER_ID,
h.NAME,
t.TOT_CH
FROM HACKERS h
INNER JOIN cteCount t
ON t.HACKER_ID = h.HACKER_ID
INNER JOIN cteDups d
ON d.TOT_CH = t.TOT_CH
WHERE CASE
WHEN d.DUPS < 2
THEN 1
WHEN t.TOT_CH = (SELECT MAX(TOT_CH)
FROM cteCount)
THEN 1
END = 1
ORDER BY t.TOT_CH DESC,
h.HACKER_ID;
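
To see the CTE approach run end to end, here is a minimal sketch that uses Python's built-in sqlite3 module instead of the original database. The sample rows and names (Ann, Bob, and so on) are made up for illustration, and the CASE filter is written as the equivalent OR condition.

```python
import sqlite3

# Hypothetical sample data: hacker_id -> number of challenges created.
counts = {1: 3, 2: 3, 3: 2, 4: 2, 5: 1}
names = {1: "Ann", 2: "Bob", 3: "Cat", 4: "Dan", 5: "Eve"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hackers (hacker_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE challenges (challenge_id INTEGER PRIMARY KEY, hacker_id INTEGER)")
conn.executemany("INSERT INTO hackers VALUES (?, ?)", list(names.items()))
conn.executemany(
    "INSERT INTO challenges (hacker_id) VALUES (?)",
    [(hid,) for hid, n in counts.items() for _ in range(n)],
)

query = """
WITH cte_count AS (        -- challenges per hacker, computed once
    SELECT hacker_id, COUNT(challenge_id) AS tot_ch
    FROM challenges
    GROUP BY hacker_id
),
cte_dups AS (              -- how many hackers share each total
    SELECT tot_ch, COUNT(*) AS dups
    FROM cte_count
    GROUP BY tot_ch
)
SELECT h.hacker_id, h.name, t.tot_ch
FROM hackers h
JOIN cte_count t ON t.hacker_id = h.hacker_id
JOIN cte_dups d ON d.tot_ch = t.tot_ch
WHERE d.dups = 1
   OR t.tot_ch = (SELECT MAX(tot_ch) FROM cte_count)
ORDER BY t.tot_ch DESC, h.hacker_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

With this data, Ann and Bob are kept because they tie at the maximum, Cat and Dan are dropped because they tie below it, and Eve is kept because her count is unique.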

You can use a combination of analytic functions, as follows:
SELECT hacker_id, name, cnt
FROM (SELECT t.*,
             COUNT(1) OVER (PARTITION BY cnt) AS same_cnt,  -- hackers sharing this count
             MAX(cnt) OVER () AS max_cnt                    -- highest count overall
      FROM (SELECT h.hacker_id,
                   h.name,
                   COUNT(1) AS cnt
            FROM hackers h
            JOIN challenges c
              ON h.hacker_id = c.hacker_id
            GROUP BY h.hacker_id, h.name) t)
WHERE same_cnt = 1
   OR cnt = max_cnt
ORDER BY cnt DESC, hacker_id;  -- ordering required by the challenge
Cheers!!
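
To make the two window columns visible, the sketch below prints the intermediate rows before filtering. It is only an illustration: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), and the sample data is hypothetical.

```python
import sqlite3

# Hypothetical sample data: hacker_id -> number of challenges created.
counts = {1: 3, 2: 3, 3: 2, 4: 2, 5: 1}
names = {1: "Ann", 2: "Bob", 3: "Cat", 4: "Dan", 5: "Eve"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hackers (hacker_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE challenges (challenge_id INTEGER PRIMARY KEY, hacker_id INTEGER)")
conn.executemany("INSERT INTO hackers VALUES (?, ?)", list(names.items()))
conn.executemany(
    "INSERT INTO challenges (hacker_id) VALUES (?)",
    [(hid,) for hid, n in counts.items() for _ in range(n)],
)

# Per-hacker count plus two window columns computed over all hackers.
inner = """
SELECT t.hacker_id, t.name, t.cnt,
       COUNT(*) OVER (PARTITION BY t.cnt) AS same_cnt,  -- hackers with this count
       MAX(t.cnt) OVER () AS max_cnt                    -- highest count overall
FROM (SELECT h.hacker_id, h.name, COUNT(*) AS cnt
      FROM hackers h
      JOIN challenges c ON c.hacker_id = h.hacker_id
      GROUP BY h.hacker_id, h.name) AS t
"""
intermediate = conn.execute(inner + " ORDER BY hacker_id").fetchall()
for row in intermediate:
    print(row)  # (hacker_id, name, cnt, same_cnt, max_cnt)

# Keep a row when its count is unique or equals the maximum.
final = conn.execute(
    f"SELECT hacker_id, name, cnt FROM ({inner}) "
    "WHERE same_cnt = 1 OR cnt = max_cnt ORDER BY cnt DESC, hacker_id"
).fetchall()
print(final)
```

Each row carries its own same_cnt and max_cnt, so the outer filter needs no further subqueries.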

with
cte1 as (select hacker_id, count(challenge_id) as challenges_created
from Challenges
group by hacker_id
order by hacker_id
),
cte2 as (select c.hacker_id, h.name, c.challenges_created
from cte1 c
left join Hackers h
on c.hacker_id = h.hacker_id
),
cte3 as (select max(challenges_created) as max_challenges
from cte2
),
cte4 as (select challenges_created, count(hacker_id) as hack_count_same
from cte2
where challenges_created not in(select max_challenges from cte3)
group by challenges_created
order by challenges_created
),
cte5 as (select hacker_id, name, challenges_created
from cte2
where challenges_created not in(select challenges_created from cte4 where hack_count_same > 1)
),
cte6 as (select hacker_id, name, challenges_created
from cte5
union all
select hacker_id, name, challenges_created
from cte2
where challenges_created in(select max_challenges from cte3)
and hacker_id not in(select hacker_id from cte5)
)
select *
from cte6
order by challenges_created desc, hacker_id;

select a.hacker_id,a.name,count(b.hacker_id) As challenges_created
from Hackers a, Challenges b
WHERE a.hacker_id = b.hacker_id
GROUP BY a.hacker_id,a.name
HAVING count(b.hacker_id) not in (select distinct count(hacker_id) from Challenges
WHERE hacker_id <> a.hacker_id
group by hacker_id
having count(hacker_id) < (select max(x.challenge_count)
from (select count(b.challenge_id) as challenge_count from Challenges b GROUP BY b.hacker_id) as x ))
ORDER BY count(b.hacker_id) desc, a.hacker_id

Related

Getting ORA-00907: missing right parenthesis when no extra parenthesis on left

Here is my SQL query:
select hck.hacker_id, hck.name, cnt
from (
Hacker as hck
inner join (
Select hacker_id, count(challenge_id) as cnt
from Challenges
group by hacker_id
) chl_count on hck.hacker_id = chl_count.hacker_id
) having cnt = max(cnt) or
cnt in (select cnt
from chl_count
group by cnt
having count(hacker_id) = 1)
order by cnt desc, hck.hacker_id asc;
Here Hackers has the schema:
Hackers(name, hacker_id)
and Challenges has the schema:
Challenges(hacker_id, challenge_id)
I don't see any missing parenthesis in the query, and the rest of the syntax, such as the commas, looks correct as well. So what is wrong?
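It seems you are new to Oracle SQL.
Your parentheses are balanced. The likely source of ORA-00907 is "Hacker as hck": Oracle does not accept the AS keyword before a table alias. Two more things need fixing. The alias chl_count of the inline view cannot be queried again in a subquery, and with no GROUP BY in the outer query the filter belongs in WHERE, not HAVING. Define chl_count as a common table expression instead:
WITH chl_count AS (
    SELECT hacker_id, COUNT(challenge_id) AS cnt
    FROM Challenges
    GROUP BY hacker_id
)
SELECT hck.hacker_id, hck.name, chl_count.cnt
FROM Hacker hck
INNER JOIN chl_count ON hck.hacker_id = chl_count.hacker_id
-- keep the maximum count, or any count held by exactly one hacker
WHERE chl_count.cnt = (SELECT MAX(cnt) FROM chl_count)
   OR chl_count.cnt IN (SELECT cnt
                        FROM chl_count
                        GROUP BY cnt
                        HAVING COUNT(hacker_id) = 1)
ORDER BY chl_count.cnt DESC, hck.hacker_id ASC;
It should work now.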

How to apply HAVING to a column and not a function

I have two tables, one with the names of hacking students and another with the challenges they made. I need to return the id, names and the number of challenges of the hackers, e.g:
hacker_id name challenges_created
21283 Angela 6
88255 Patrick 5
5077 Rose 4
62743 Frank 4
96196 Lisa 1
But if more than one hacker created the same number of challenges and that number is less than that of the hacker who made the most challenges, those hackers must be excluded from the results. In this case, the 4s must be excluded. I found the exact answer for the problem online, which looks like this (edited to use my table names):
SELECT c.hacker_id, h.name, COUNT(c.hacker_id) AS ctn
FROM Sample0.Hackers as h
LEFT JOIN Sample0.Challenges c ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name
HAVING ctn = (SELECT TOP 1 COUNT(c1.challenge_id) FROM Sample0.Challenges AS c1 GROUP BY c1.hacker_id ORDER BY COUNT(*)) OR
ctn NOT IN (SELECT COUNT(c2.challenge_id) FROM Sample0.Challenges AS c2 GROUP BY c2.hacker_id HAVING c2.hacker_id <> c.hacker_id);
I'm getting errors on the HAVING clause, saying "Invalid column name 'ctn'". I've only worked with HAVING once and can only use a basic function on it. I don't know why it's giving me this error.
I would handle this using analytic functions:
WITH cte AS (
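-- h.hacker_id matches the GROUP BY; COUNT(c.challenge_id) yields 0, not 1, for hackers without challenges
SELECT h.hacker_id, h.name, COUNT(c.challenge_id) AS challenges_created,
RANK() OVER (ORDER BY COUNT(c.challenge_id) DESC) rnk,
COUNT(*) OVER (PARTITION BY COUNT(c.challenge_id)) cnt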
FROM Sample0.Hackers as h
LEFT JOIN Sample0.Challenges c
ON h.hacker_id = c.hacker_id
GROUP BY h.hacker_id, h.name
)
SELECT
hacker_id,
name,
challenges_created
FROM cte
WHERE rnk = 1 OR cnt = 1;
The idea here is that an aggregate record should be retained if it either is tied for the highest challenge count or there are no other records having the same challenge count.
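
As a rough check of that idea against the sample table in the question, here is a sketch using Python's sqlite3 module (SQLite 3.25 or later for window functions). The Sample0 schema prefix is dropped, the per-hacker challenge rows are generated from the counts shown in the question, and the aggregation is done in a separate CTE before the window functions are applied, which is an arrangement chosen for this sketch.

```python
import sqlite3

# Sample rows from the question: (hacker_id, name, challenges_created).
sample = [
    (21283, "Angela", 6),
    (88255, "Patrick", 5),
    (5077, "Rose", 4),
    (62743, "Frank", 4),
    (96196, "Lisa", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hackers (hacker_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE challenges (challenge_id INTEGER PRIMARY KEY, hacker_id INTEGER)")
conn.executemany("INSERT INTO hackers VALUES (?, ?)", [(h, n) for h, n, _ in sample])
conn.executemany(
    "INSERT INTO challenges (hacker_id) VALUES (?)",
    [(h,) for h, _, created in sample for _ in range(created)],
)

query = """
WITH per_hacker AS (
    SELECT h.hacker_id, h.name, COUNT(c.challenge_id) AS challenges_created
    FROM hackers h
    JOIN challenges c ON c.hacker_id = h.hacker_id
    GROUP BY h.hacker_id, h.name
),
ranked AS (
    SELECT per_hacker.*,
           RANK() OVER (ORDER BY challenges_created DESC) AS rnk,    -- 1 = top count
           COUNT(*) OVER (PARTITION BY challenges_created) AS cnt    -- hackers sharing the count
    FROM per_hacker
)
SELECT hacker_id, name, challenges_created
FROM ranked
WHERE rnk = 1 OR cnt = 1
ORDER BY challenges_created DESC, hacker_id
"""
result = conn.execute(query).fetchall()
print(result)
```

Rose and Frank share a count of 4, which is below the maximum, so both drop out, while Angela, Patrick and Lisa remain.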
You cannot reference a SELECT-list alias in the HAVING clause; repeat the aggregate expression instead, as below:
SELECT
c.hacker_id,
h.name,
COUNT(c.hacker_id) AS ctn
FROM Sample0.Hackers as h
left JOIN Sample0.Challenges as c
ON h.hacker_id = c.hacker_id
GROUP BY c.hacker_id, h.name
-- ORDER BY ... DESC so that TOP 1 returns the highest count rather than the lowest
HAVING COUNT(c.hacker_id) = (SELECT TOP 1 COUNT(c1.challenge_id) FROM Sample0.Challenges AS c1 GROUP BY c1.hacker_id ORDER BY COUNT(*) DESC) OR
COUNT(c.hacker_id) NOT IN (SELECT COUNT(c2.challenge_id) FROM Sample0.Challenges AS c2 where c2.hacker_id <> c.hacker_id GROUP BY c2.hacker_id )
Also, your selected column and your GROUP BY column did not come from the same table (c.hacker_id versus h.hacker_id), so it threw the error that 'hacker_id' is invalid in the HAVING clause.

Counter Example to the Following Solution?

I recently completed a coding challenge with the following prompt:
The total score of a hacker is the sum of their maximum scores for all
of the challenges. Write a query to print the hacker_id, name, and
total score of the hackers ordered by the descending score. If more
than one hacker achieved the same total score, then sort the result by
ascending hacker_id. Exclude all hackers with a total score of 0 from
your result.
The following tables contain contest data:
Hackers: The hacker_id is the id of the hacker, and name is the name
of the hacker.
Submissions: The submission_id is the id of the submission, hacker_id
is the id of the hacker who made the submission, challenge_id is the
id of the challenge for which the submission belongs to, and score is
the score of the submission.
My solution passed the test cases, but it took me many iterations to get there.
I get the feeling that there may be an edge case / specific input which would not pass using my solution, but I'm having trouble figuring it out.
Any guesses or counter examples?
My solution:
Select ID, Name, sum(maxscore) as tot From
(Select ID, Name, chal, Max(score) as maxscore From
(Select Submissions.hacker_id as ID, Hackers.name as Name, Submissions.score as score, Submissions.challenge_id as chal
From Submissions
Inner Join Hackers on Submissions.hacker_id = Hackers.hacker_id
Where Submissions.score <> 0)
Group by chal, ID, Name)
Group by ID, Name Order by tot desc, ID asc;
Select a.hacker_id,a.name,x.sum_max
from Hackers a
inner join
(SELECT hacker_id,sum(max_score) as sum_max from (
Select hacker_id,challenge_id,max(Score) as max_score
from submissions
group by hacker_id,challenge_id)b
group by hacker_id
having sum_max>0)x on a.hacker_id=x.hacker_id
order by sum_max desc,hacker_id
Here is mine:
select T.hacker_id, T.name , sum(T.max_score) sum_score from (
Select h.hacker_id, name , challenge_id, max(score) max_score from Hackers h
inner join submissions s on h.hacker_id = s.hacker_id
group by h.hacker_id, name , challenge_id) as T
group by T.hacker_id, T.name
having sum(T.max_score) > 0
order by sum_score desc , T.hacker_id
This code for MySQL successfully ran the test cases.
select m.id, h.name, m.total from
(select x.hacker_id id, sum(x.max_score) total from
(select h.hacker_id, s.challenge_id, max(s.score) as max_score from submissions as s
join hackers as h on h.hacker_id = s.hacker_id
group by h.hacker_id, s.challenge_id) as x
group by x.hacker_id
having total <> 0)as m,
hackers as h
where m.id = h.hacker_id
order by m.total desc, m.id;
First try this query, then vote:
select h.hacker_id,h.name,sum(s.score) as scores from Hackers h
inner join
(select hacker_id,max(score) as score from Submissions group by hacker_id,challenge_id) s
on h.hacker_id=s.hacker_id
group by h.hacker_id,h.name having scores>0
order by scores desc,h.hacker_id;
SELECT A, B, sum(C) FROM
(SELECT h.hacker_id AS A, h.name AS B, max(s.score)AS C FROM
Hackers h
JOIN Submissions s ON s.hacker_id=h.hacker_id
GROUP BY h.hacker_id,h.name, s.challenge_id) Sub
GROUP BY A, B
HAVING sum(C)>0
ORDER BY sum(C) DESC, A ASC;
That's how I did it:
WITH NOW AS (
SELECT CHALLENGE_ID AS CI, HACKER_ID AS HI, MAX(SCORE) AS TSCORE FROM SUBMISSIONS
WHERE HACKER_ID = HACKER_ID GROUP BY CHALLENGE_ID, HACKER_ID
)
SELECT C.HI, H.NAME, SUM(C.TSCORE) AS [TOTAL] FROM NOW AS C, HACKERS AS H
WHERE C.HI = H.HACKER_ID AND C.TSCORE > 0
GROUP BY C.HI, H.NAME ORDER BY TOTAL DESC, C.HI
SELECT
res.hacker_id as id,
res.name as name,
SUM(res.max_score) as score
from (
SELECT s.challenge_id as challenge_id, h.hacker_id as hacker_id ,h.name as name,Max(score) as max_score
FROM Hackers as h JOIN Submissions as s
ON h.hacker_id = s.hacker_id
GROUP BY challenge_id ,hacker_id,name
having max_score <> 0
) as res
GROUP BY hacker_id,name order by score desc,id asc;
Try this for MySQL:
SELECT t.hacker_id,t.name,sum(t.score) as score FROM
(select s.hacker_id, h.name, s.challenge_id, max(s.score) as score
FROM submissions s join Hackers h ON s.hacker_id = h.hacker_id
GROUP BY hacker_id, h.name, challenge_id ) as t
where score != 0 group by t.hacker_id,t.name order by sum(t.score) desc,t.hacker_id
Oracle solution:
with temp as (
select tp.hacker_id,h.name, sum(tp.score) s from (
select hacker_id ,submission_id, challenge_id ,score, rank() over (partition by
hacker_id, challenge_id order by score desc,submission_id )as rank from
submissions)tp
join hackers h on h.hacker_id=tp.hacker_id
where tp.rank=1
group by tp.hacker_id,h.name
order by s desc,tp.hacker_id)
select hacker_id,name,s from temp where s <> 0 order by s desc,hacker_id ;
I tried this and it worked for me:
select H.Hacker_id, h.Name, sum(x.Score)
from hackers h
inner join (
select hacker_id, challenge_id, max(Score) as Score
from Submissions
group by hacker_id, challenge_id
) x
on H.Hacker_id = x.hacker_Id
group by H.Hacker_Id, h.Name
having sum(x.score) > 0
order by 3 desc, h.hacker_id
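
On the counter-example question: the solutions above drop zero totals in two different ways. The query in the question filters out zero-score submissions before aggregating, while most answers aggregate first and then apply HAVING ... > 0. The sketch below compares the two on made-up data using Python's sqlite3 module. The second half assumes negative scores can occur, which the challenge text quoted above does not say, so treat it as a hypothetical edge case and not a confirmed failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hackers (hacker_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE submissions (submission_id INTEGER PRIMARY KEY, hacker_id INTEGER,
                          challenge_id INTEGER, score INTEGER);
INSERT INTO hackers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cat');
-- Hypothetical data: Bob only ever scores 0, Cat mixes zero and non-zero scores.
INSERT INTO submissions VALUES
    (1, 1, 10, 20), (2, 1, 10, 50), (3, 1, 11, 30),
    (4, 2, 10, 0),  (5, 2, 10, 0),
    (6, 3, 10, 0),  (7, 3, 10, 80), (8, 3, 12, 0);
""")

# Approach from the question: discard zero-score rows, then aggregate.
FILTER_FIRST = """
SELECT id, name, SUM(maxscore) AS tot
FROM (SELECT s.hacker_id AS id, h.name AS name, s.challenge_id AS chal,
             MAX(s.score) AS maxscore
      FROM submissions s
      JOIN hackers h ON h.hacker_id = s.hacker_id
      WHERE s.score <> 0
      GROUP BY s.hacker_id, h.name, s.challenge_id) AS m
GROUP BY id, name
ORDER BY tot DESC, id
"""

# Approach from most answers: aggregate, then keep totals above zero.
HAVING_LAST = """
SELECT h.hacker_id, h.name, SUM(x.max_score) AS tot
FROM hackers h
JOIN (SELECT hacker_id, challenge_id, MAX(score) AS max_score
      FROM submissions
      GROUP BY hacker_id, challenge_id) AS x ON x.hacker_id = h.hacker_id
GROUP BY h.hacker_id, h.name
HAVING SUM(x.max_score) > 0
ORDER BY tot DESC, h.hacker_id
"""

def run(sql):
    return conn.execute(sql).fetchall()

rows_filter_first = run(FILTER_FIRST)
rows_having = run(HAVING_LAST)
print(rows_filter_first, rows_having)

# Hypothetical edge case: a hacker whose per-challenge maxima cancel out.
conn.execute("INSERT INTO hackers VALUES (4, 'Dan')")
conn.execute("INSERT INTO submissions VALUES (9, 4, 10, 5), (10, 4, 11, -5)")
rows_filter_first_neg = run(FILTER_FIRST)
rows_having_neg = run(HAVING_LAST)
print(rows_filter_first_neg, rows_having_neg)
```

With non-negative scores the two approaches agree here. If negative scores were possible, the filter-first query would still list Dan with a total of 0, which the prompt says to exclude.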

Oracle - Not a single-group group function

Consider the following SQL command:
select h.hacker_id
,h.name
,challenges_created
from hackers h
inner join (
select hacker_id
,count(*) as challenges_created /* line 1 */
from challenges
group by hacker_id
order by 2 desc
) tmp on h.hacker_id = tmp.hacker_id
order by challenges_created desc
,h.hacker_id;
So far so good, but as soon as I try to add max(count(*)) as maximum at line 1, it gives the error:
Not a single-group group function
This is the code that gives the error:
select h.hacker_id
,h.name
,challenges_created
from hackers h
inner join (
select hacker_id
,count(*) as challenges_created
,max(count(*)) as maximum
from challenges
group by hacker_id
) tmp on h.hacker_id = tmp.hacker_id
order by challenges_created desc
,h.hacker_id;
I am basically interested in getting the maximum count, i.e. the maximum number of challenges created so far.
I am new to SQL, so any help is appreciated. Thanks in advance. I know many similar questions have been asked recently, but none matches my situation, which is why I am asking again.
Try this:
with x as (
select h.hacker_id, h.name, count(*) challenges_created
from hackers h
inner join challenges on h.hacker_id = challenges.hacker_id
group by h.hacker_id, h.name
)
select x.*,
(select max(challenges_created) from x) as "max"
from x
order by challenges_created desc, hacker_id;
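
For comparison, here is a small sketch of that scalar-subquery form next to a window-function form, MAX(...) OVER (), which attaches the overall maximum to every row without nesting aggregates. It runs on SQLite via Python's sqlite3 module (3.25 or later for the window function) rather than Oracle, and the data and the max_created column name are hypothetical.

```python
import sqlite3

# Hypothetical sample data: hacker_id -> number of challenges created.
counts = {1: 3, 2: 1, 3: 2}
names = {1: "Ann", 2: "Bob", 3: "Cat"}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hackers (hacker_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE challenges (challenge_id INTEGER PRIMARY KEY, hacker_id INTEGER)")
conn.executemany("INSERT INTO hackers VALUES (?, ?)", list(names.items()))
conn.executemany(
    "INSERT INTO challenges (hacker_id) VALUES (?)",
    [(hid,) for hid, n in counts.items() for _ in range(n)],
)

# Shared CTE: one row per hacker with the number of challenges created.
cte = """
WITH x AS (
    SELECT h.hacker_id, h.name, COUNT(*) AS challenges_created
    FROM hackers h
    JOIN challenges c ON c.hacker_id = h.hacker_id
    GROUP BY h.hacker_id, h.name
)
"""

# Form 1: scalar subquery over the CTE.
scalar_rows = conn.execute(cte + """
SELECT x.hacker_id, x.name, x.challenges_created,
       (SELECT MAX(challenges_created) FROM x) AS max_created
FROM x
ORDER BY challenges_created DESC, hacker_id
""").fetchall()

# Form 2: window aggregate over the whole result set.
window_rows = conn.execute(cte + """
SELECT x.hacker_id, x.name, x.challenges_created,
       MAX(x.challenges_created) OVER () AS max_created
FROM x
ORDER BY challenges_created DESC, hacker_id
""").fetchall()

print(scalar_rows)
print(window_rows)
```

Both forms compute the per-hacker count first and only then take the maximum over those counts, which is the step that a single grouped SELECT with max(count(*)) next to hacker_id cannot express.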

Highest Count with a group

I'm having an absolute brain fade
SELECT p.ProductCategory, f.ProductSubCategory, COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory, f.ProductSubCategory
ORDER BY 1,3 DESC
This shows me the count for each ProductSubCategory. I would like to see only the ProductSubCategory with the highest count per ProductCategory.
I wish to see (I don't care about the Count value)
There are a couple of different ways to do this. One involves joining the results back to themselves and using the max aggregate. But since you are using SQL Server, you can use ROW_NUMBER to achieve the same result:
with cte as (
select p.productcategory, p.ProductSubCategory, COUNT(*) cnt,
ROW_NUMBER() over (partition by p.productcategory order by count(*) desc) rn
from products p
join sales s on p.ProductSubCategory = s.ProductSubCategory
group by p.productcategory, p.ProductSubCategory
)
select *
from cte
where rn = 1
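
Here is a minimal sketch of the ROW_NUMBER approach on made-up rows, run on SQLite through Python's sqlite3 module (3.25 or later) instead of SQL Server. The category and subcategory values are invented, and the counting is split into its own CTE for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (ProductCategory TEXT, ProductSubCategory TEXT);
CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, ProductSubCategory TEXT);
-- Hypothetical data: Road outsells Mountain, Socks outsell Caps.
INSERT INTO products VALUES
    ('Bikes', 'Road'), ('Bikes', 'Mountain'),
    ('Clothing', 'Caps'), ('Clothing', 'Socks');
INSERT INTO sales (ProductSubCategory) VALUES
    ('Road'), ('Road'), ('Road'), ('Mountain'),
    ('Caps'), ('Socks'), ('Socks');
""")

query = """
WITH counted AS (          -- sales per subcategory
    SELECT p.ProductCategory, p.ProductSubCategory, COUNT(*) AS cnt
    FROM products p
    JOIN sales s ON s.ProductSubCategory = p.ProductSubCategory
    GROUP BY p.ProductCategory, p.ProductSubCategory
),
numbered AS (              -- rank subcategories inside each category
    SELECT counted.*,
           ROW_NUMBER() OVER (PARTITION BY ProductCategory
                              ORDER BY cnt DESC) AS rn
    FROM counted
)
SELECT ProductCategory, ProductSubCategory, cnt
FROM numbered
WHERE rn = 1
ORDER BY ProductCategory
"""
top_rows = conn.execute(query).fetchall()
print(top_rows)
```

ROW_NUMBER keeps exactly one row per category, so when two subcategories tie for the highest count it picks one of them arbitrarily. RANK would keep both.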
You already have an answer, but please see the following code too. It may help you.
SELECT p.ProductCategory,
f.ProductSubCategory,
COUNT(*) AS Cnt
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
JOIN (
-- number the subcategories within each category by sales count;
-- the GROUP BY is required because COUNT(*) is used in the ORDER BY
SELECT p.ProductCategory,
f.ProductSubCategory,
ROW_NUMBER() OVER ( PARTITION BY p.ProductCategory
ORDER BY COUNT(*) DESC) [Row]
FROM Sales f
JOIN Products p ON f.ProductSubCategory = p.ProductSubCategory
GROUP BY p.ProductCategory,
f.ProductSubCategory) Lu
ON p.ProductCategory = Lu.ProductCategory
AND f.ProductSubCategory = Lu.ProductSubCategory
WHERE Lu.[Row] = 1
GROUP BY p.ProductCategory,
f.ProductSubCategory