I know this has been asked before, but I've looked at other questions and my query still won't work. I have a table with MLB batting stats from the last 100 years or so. I am trying to find the playerid, the home runs, and the percentage of that year's (2012) total home runs that each player's home runs make up.
query:
select playerid, hr, hr/sum(hr) over (partition by playerid, yearid)*100 p
from mlbbattingstats
where yearid=2012 and p != 0
order by hr;
error:
Error at line 3:
ORA-00904: "P": invalid identifier
I have tried multiple different aliases and gotten the same error. Any help in what I am doing wrong would be appreciated and sorry if this has been answered previously.
You can't reference a column alias on the same query level (except for order by). You need to wrap the statement into a derived table:
select *
from (
select playerid, hr, hr/sum(hr) over (partition by playerid, yearid)*100 p
from mlbbattingstats
where yearid = 2012
)
where p <> 0
order by hr;
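The derived-table pattern can be tried out with a small sketch. It uses Python's sqlite3 module with made-up sample rows, so the data and the in-memory database are assumptions, not the asker's setup. Two things differ from the queries above and are also assumptions: the window partitions by yearid alone, on the reading that the goal is each player's share of the whole year's total, and 100.0 is used to avoid integer division. SQLite is more lenient than Oracle about aliases in where, so this only shows the working form, not the ORA-00904 error.

```python
import sqlite3

# Hypothetical sample rows; the real mlbbattingstats table is only assumed
# to have playerid, yearid and hr columns.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mlbbattingstats (playerid text, yearid integer, hr integer)"
)
conn.executemany(
    "insert into mlbbattingstats values (?, ?, ?)",
    [("a", 2012, 30), ("b", 2012, 10), ("c", 2012, 0), ("d", 2011, 50)],
)

# The inner query defines the alias p. The outer query is a new query level,
# so p is an ordinary column there and can be used in its where clause.
rows = conn.execute(
    """
    select playerid, hr, p
    from (
        select playerid, hr,
               100.0 * hr / sum(hr) over (partition by yearid) as p
        from mlbbattingstats
        where yearid = 2012
    )
    where p <> 0
    order by hr
    """
).fetchall()

for row in rows:
    print(row)
```

Player c is dropped by the outer filter, and the 2011 row never enters the window because the inner where runs before the window function is evaluated.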
If p <> 0, then hr <> 0. So, your query would seem to be equivalent to:
select playerid, hr,
hr/sum(hr) over (partition by playerid, yearid)*100 as p
from mlbbattingstats
where yearid = 2012 and hr <> 0
order by hr;
Your original problem is that you cannot use a column alias defined in a select in the where clause at the same level.
I attempted LeetCode problem 1407. Top Travellers, but I am struggling with my Oracle query below, which fails with 'Runtime error'. I'm a little too tired to understand why. Any idea where I am going wrong? I have been rusty with SQL of late. :(
select name as name,
case when rides.distance is null then 0 else sum(rides.distance) end as travelled_distance
from users
left join rides
on users.id = rides.user_id
group by rides.users_id
order by travelled_distance desc, name;
As commented, do it the other way round and put the case expression inside the sum:
select
name,
sum(case when rides.distance is null then 0 else rides.distance end) as travelled_distance
from users left join rides on users.id = rides.user_id
group by name
order by travelled_distance desc, name;
Or, simpler, use the nvl function:
select
name,
sum(nvl(rides.distance, 0)) as travelled_distance
from ...
Though, a few more objections:
you should use table aliases (as they simplify query and improve readability)
moreover, you should precede all column names with table aliases; in your case, you failed to do so for the name column. It probably belongs to the users table, but we can't tell for sure as we don't have your data model nor access to your database
group by clause should contain column(s) that aren't aggregated. In your query, that's the name column. You can put rides.users_id into that clause, but you must put name in there
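The points above can be sketched with Python's sqlite3 module. The sample users and rides are invented, and coalesce stands in for Oracle's nvl because SQLite has no nvl function. Grouping by u.id as well as u.name is an extra precaution of this sketch (it assumes names might not be unique), not something the original question requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table users (id integer primary key, name text);
    create table rides (id integer primary key, user_id integer, distance integer);
    insert into users values (1, 'Alice'), (2, 'Bob'), (3, 'Cara');
    insert into rides values (1, 1, 120), (2, 1, 30), (3, 2, 150);
    """
)

# Every column is prefixed with a table alias, the non-aggregated columns are
# all in the group by clause, and the null from the left join is replaced
# with 0 inside the sum.
rows = conn.execute(
    """
    select u.name,
           sum(coalesce(r.distance, 0)) as travelled_distance
    from users u
    left join rides r on u.id = r.user_id
    group by u.id, u.name
    order by travelled_distance desc, u.name
    """
).fetchall()

for row in rows:
    print(row)
```

Cara has no rides, so the left join gives her a single row with a null distance, which the coalesce turns into 0.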
The below solution works. Thanks to one of the Discussion posts at leetcode I could figure out the issue:
select r.name,
case when x.td is null
then 0
else x.td
end travelled_distance
from Users r
left join
(
select user_id, sum(distance) td
from Rides
group by user_id
) x
on r.id = x.user_id
order by travelled_distance desc, r.name;
Hey, guys. I'm struggling to solve one query and just can't get my head around it.
Basically, I have some tables from a data mart:
DimTheatre(TheatreId(PK), TheatreNo, Name, Address, MainTel);
DimTrow(TrowId(PK), TrowNo, RowName, RowType);
DimProduction(ProductionId(PK), ProductionNo, Title, ProductionDir, PlayAuthor);
DimTime(TimeId(PK), Year, Month, Day, Hour);
TicketPurchaseFact(TheatreId(FK), TimeId(FK), TrowId(FK), PId(FK), TicketAmount);
What I'm trying to achieve in Oracle: I need to retrieve the most popular row type in each theatre by value of ticket sales.
What I'm doing now is:
SELECT dthr.theatreid, dthr.name, max(tr.rowtype) keep(dense_rank last order
by tpf.ticketamount), sum(tpf.ticketamount) TotalSale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid
GROUP BY dthr.theatreid, dthr.name;
It does give me output, but the 'TotalSale' column is way off: it shows much higher numbers than it should. How could I approach this issue? :)
I am not sure how MAX() KEEP () would help your case if I understand the problem correctly. But the below approach should work:
SELECT x.theatreid, x.name, x.rowtype, x.total_sale
FROM
(SELECT z.theatreid, z.name, z.rowtype, z.total_sale, DENSE_RANK() OVER (PARTITION BY z.theatreid, z.name ORDER BY z.total_sale DESC) as popular_row_rank
FROM
(SELECT dthr.theatreid, dthr.name, tr.rowtype, SUM(tpf.ticketamount) as total_sale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid AND tr.trowid = tpf.trowid
GROUP BY dthr.theatreid, dthr.name, tr.rowtype) z
) x
WHERE x.popular_row_rank = 1;
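One likely reason for the inflated totals is that the query in the question lists DimTrow in the from clause without any join condition, so every fact row is paired with every DimTrow row. The sketch below uses Python's sqlite3 module and invented sample rows to show the effect; the tables are cut down to the columns needed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table dimtrow (trowid integer primary key, rowtype text);
    create table ticketpurchasefact (
        theatreid integer, trowid integer, ticketamount integer
    );
    insert into dimtrow values (1, 'Stalls'), (2, 'Circle'), (3, 'Balcony');
    insert into ticketpurchasefact values (1, 1, 100), (1, 2, 50);
    """
)

# No condition links the two tables: each of the two fact rows is repeated
# once per dimtrow row before the sum is taken.
inflated = conn.execute(
    "select sum(tpf.ticketamount) from ticketpurchasefact tpf, dimtrow tr"
).fetchone()[0]

# With the join condition each fact row matches exactly one dimtrow row.
joined = conn.execute(
    """
    select sum(tpf.ticketamount)
    from ticketpurchasefact tpf, dimtrow tr
    where tr.trowid = tpf.trowid
    """
).fetchone()[0]

print(inflated, joined)
```

With three rows in dimtrow, the unjoined total is three times the real one.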
You want the row type per theatre with the highest ticket amount. So join purchases and rows and then aggregate to get the total per rowtype. Use RANK to rank your row types per theatre and keep only the best-ranked ones. Finally, join with the theatre table to get the theatre name.
select
theatreid,
t.name,
tr.rowtype
from
(
select
p.theatreid,
r.rowtype,
rank() over (partition by p.theatreid order by sum(p.ticketamount) desc) as rn
from ticketpurchasefact p
join dimtrow r using (trowid)
group by p.theatreid, r.rowtype
) tr
join dimtheatre t using (theatreid)
where tr.rn = 1;
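A small sketch of the rank-per-theatre idea, using Python's sqlite3 module and invented sample data. As a variation, it computes the totals in a derived table first and ranks them in a second step, and it uses on instead of using for the joins; both choices are assumptions made to keep the sketch portable, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table dimtheatre (theatreid integer primary key, name text);
    create table dimtrow (trowid integer primary key, rowtype text);
    create table ticketpurchasefact (
        theatreid integer, trowid integer, ticketamount integer
    );
    insert into dimtheatre values (1, 'Globe'), (2, 'Lyric');
    insert into dimtrow values (1, 'Stalls'), (2, 'Stalls'), (3, 'Circle');
    insert into ticketpurchasefact values
        (1, 1, 100), (1, 2, 60), (1, 3, 150),
        (2, 1, 40), (2, 3, 90);
    """
)

rows = conn.execute(
    """
    select t.theatreid, t.name, ranked.rowtype, ranked.total_sale
    from (
        -- rank the row types inside each theatre by their total
        select totals.theatreid, totals.rowtype, totals.total_sale,
               rank() over (partition by totals.theatreid
                            order by totals.total_sale desc) as rn
        from (
            -- total sale per theatre and row type
            select p.theatreid, r.rowtype,
                   sum(p.ticketamount) as total_sale
            from ticketpurchasefact p
            join dimtrow r on r.trowid = p.trowid
            group by p.theatreid, r.rowtype
        ) totals
    ) ranked
    join dimtheatre t on t.theatreid = ranked.theatreid
    where ranked.rn = 1
    order by t.theatreid
    """
).fetchall()

for row in rows:
    print(row)
```

Because rank is used, row types that tie for the highest total in a theatre are all returned.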
When I run the script below, I get the error message "Cannot perform an aggregate function on an expression containing an aggregate or a subquery". Please provide some advice. Thanks.
SELECT
CONVERT(DECIMAL(18,5),SUM(CASE WHEN PATIENT_ACCOUNT_NO IN (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING ( COUNT(PATIENT_ACCOUNT_NO) > 1)) THEN 0 ELSE 1 END)) dupPatNo
FROM [DBO].[STND_ENCOUNTER]
I think the error message is pretty clear. You have a sum() function with a subquery in it (albeit within a case, but that doesn't matter).
It seems that you want to choose patients that have more than one encounter, then add 0 if the patient is in the list and 1 if the patient is not. Hmmm. . . sounds like you want to count the number of patients with only one encounter.
Try using this logic instead:
select count(*)
from (select se.*, count(*) over (partition by PATIENT_ACCOUNT_NO) as NumEncounters
from dbo.stnd_encounter se
) se
where NumEncounters = 1;
As a note, the column alias you are assigning is called dupPatNo. This sounds like the number of patients that have duplicates. In that case, the query is:
select count(distinct PATIENT_ACCOUNT_NO)
from (select se.*, count(*) over (partition by PATIENT_ACCOUNT_NO) as NumEncounters
from dbo.stnd_encounter se
) se
where NumEncounters > 1;
(Or use count(*) if you want the number of encounters on duplicate patients.)
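The three counts can be compared on a tiny made-up table. This is a sketch using Python's sqlite3 module; the encounter_id column and the sample account numbers are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table stnd_encounter (
        encounter_id integer primary key,
        patient_account_no text
    );
    insert into stnd_encounter (patient_account_no) values
        ('A'), ('A'), ('B'), ('C'), ('C'), ('C'), ('D');
    """
)

# Shared derived table: every encounter row plus the number of encounters
# that its patient has in total.
base = """
    from (
        select se.*,
               count(*) over (partition by patient_account_no) as numencounters
        from stnd_encounter se
    ) se
"""

single = conn.execute(
    "select count(*) " + base + " where numencounters = 1"
).fetchone()[0]
dup_patients = conn.execute(
    "select count(distinct patient_account_no) " + base
    + " where numencounters > 1"
).fetchone()[0]
dup_encounters = conn.execute(
    "select count(*) " + base + " where numencounters > 1"
).fetchone()[0]

print(single, dup_patients, dup_encounters)
```

Patients B and D have one encounter each, A and C are the duplicated patients, and together A and C account for five encounter rows.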
If you want to find the number of PATIENT_ACCOUNT_NO values that do not have any duplicates, use the following:
SELECT COUNT(DISTINCT dupPatNo.PATIENT_ACCOUNT_NO)
FROM (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING COUNT(PATIENT_ACCOUNT_NO) = 1
) dupPatNo
If you want to find the number of PATIENT_ACCOUNT_NO values that have at least one duplicate, use the following:
SELECT COUNT(DISTINCT dupPatNo.PATIENT_ACCOUNT_NO)
FROM (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING COUNT(PATIENT_ACCOUNT_NO) > 1
) dupPatNo
Using DISTINCT keeps the query from counting the same item more than once.
Your query looks like it is after the first result, but it's not clear what you want, so I have given a query for both.
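A minimal sketch of the group by / having approach, run through Python's sqlite3 module on invented account numbers. It uses a plain count(*) in the outer query, on the reasoning that the grouped derived table already holds each account number only once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table stnd_encounter (patient_account_no text);
    insert into stnd_encounter values
        ('A'), ('A'), ('B'), ('C'), ('C'), ('C'), ('D'), ('E');
    """
)


def count_accounts(having_clause):
    """Count account numbers whose group satisfies the given having clause."""
    sql = (
        "select count(*) from ("
        " select patient_account_no from stnd_encounter"
        " group by patient_account_no"
        " having " + having_clause +
        ") grouped"
    )
    return conn.execute(sql).fetchone()[0]


no_duplicates = count_accounts("count(patient_account_no) = 1")
with_duplicates = count_accounts("count(patient_account_no) > 1")

print(no_duplicates, with_duplicates)
```

The count_accounts helper exists only for this sketch; in practice each query would simply be written out as shown in the answer.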
I tried to look for an answer and found plenty of advice, but none of it was helpful, so I'm asking now.
I have two tables, one with distributors (columns: distributorid, name) and the second one with delivered products (columns: distributorid, productid, corruptcount, date) - the column corruptcount contains the number of corrupted deliveries. I need to select the first five distributors with the most corrupted deliveries in last two months. I need to select distributorid, name and sum of corruptcount, here is my query:
SELECT del.distributorid, d.name, SUM(del.corruptcount) AS corrupt
FROM distributor d, delivery del
WHERE d.distributorid = del.distributorid
AND d.distributorid IN
(SELECT distributorid
FROM (SELECT distributorid, SUM(corruptcount) AS corrupt
FROM delivery
WHERE storeid = 1
AND "date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE
AND ROWNUM <= 5
GROUP BY distributorid
ORDER BY corrupt DESC))
GROUP BY del.distributorid
But Oracle returns the error message "not a GROUP BY expression". And when I edit my query to this:
SELECT del.distributorid, d.name, del.corruptcount-- , SUM(del.corruptcount) AS corrupt
FROM distributor d, delivery del
WHERE d.distributorid = del.distributorid
AND d.distributorid IN
(SELECT distributorid
FROM (SELECT distributorid, SUM(corruptcount) AS corrupt
FROM delivery
WHERE storeid = 1
AND "date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE
AND ROWNUM <= 5
GROUP BY distributorid
ORDER BY corrupt DESC))
--GROUP BY del.distributorid
It works as expected and returns the correct data:
1 IBM 10
2 DELL 0
2 DELL 1
2 DELL 6
3 HP 3
8 ACER 2
9 ASUS 1
I'd like to group this data. Where and why is my query wrong? Can you help please? Thank you very, very much.
I think the problem is just the d.name in the select list; you need to include it in the group by clause as well. Try this:
SELECT del.distributorid, d.name, SUM(del.corruptcount) AS corrupt
FROM distributor d join
delivery del
on d.distributorid = del.distributorid
WHERE d.distributorid IN
(SELECT distributorid
FROM delivery
WHERE storeid = 1 AND
"date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE AND
ROWNUM <= 5
GROUP BY distributorid
ORDER BY SUM(corruptcount) DESC
)
GROUP BY del.distributorid, d.name;
I also switched the query to using explicit join syntax with an on clause, instead of the outdated implicit join syntax using a condition in the where.
I also removed the additional layer of subquery. It is not really necessary.
EDIT:
"Why does d.name have to be included in the group by?" The easy answer is that SQL requires it because it does not know which value to include from the group. You could instead use min(d.name) in the select, for instance, and there would be no need to change the group by clause.
The real answer is a wee bit more complicated. The ANSI standard does actually permit the query as you wrote it. This is because distributorid is (presumably) declared as a primary key on the distributor table. When you group by a primary key (or unique key), then you can use other columns from the same table just as you did. Although ANSI supports this, most databases do not yet. So, the real reason is that Oracle doesn't support the ANSI standard functionality that would allow your query to work.
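The two ways of satisfying the group by rule can be compared in a short sketch. It runs in Python's sqlite3 module on sample rows modelled on the output in the question; the top-five filter is left out to keep the focus on grouping. SQLite itself tolerates bare columns in a grouped query, so this only shows that the two accepted forms agree, not the Oracle error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table distributor (distributorid integer primary key, name text);
    create table delivery (distributorid integer, corruptcount integer);
    insert into distributor values (1, 'IBM'), (2, 'DELL'), (3, 'HP');
    insert into delivery values (1, 10), (2, 0), (2, 1), (2, 6), (3, 3);
    """
)

# Form 1: every non-aggregated column of the select list is in the group by.
by_both = conn.execute(
    """
    select del.distributorid, d.name, sum(del.corruptcount) as corrupt
    from distributor d
    join delivery del on d.distributorid = del.distributorid
    group by del.distributorid, d.name
    order by del.distributorid
    """
).fetchall()

# Form 2: group by the id only and wrap the name in an aggregate.
by_min = conn.execute(
    """
    select del.distributorid, min(d.name) as name,
           sum(del.corruptcount) as corrupt
    from distributor d
    join delivery del on d.distributorid = del.distributorid
    group by del.distributorid
    order by del.distributorid
    """
).fetchall()

print(by_both)
print(by_min)
```

Both forms return the same rows here because each distributorid maps to exactly one name.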
I have the following query:
select wbod.subject, wbi.object,
age(dod.object,wbod.object) as ageOfPerson
from wasbornin as wbi,
wasbornondate as wbod,
diedondate as dod
where wbi.subject=wbod.subject
and wbod.subject=dod.subject
and age(dod.object,wbod.object) = (select max(age(dod1.object,wbod1.object))
from wasbornin as wbi1,
wasbornondate as wbod1,
diedondate as dod1
where wbi1.subject = wbod1.subject
and wbod1.subject=dod1.subject
group by wbi1.object)
group by wbi.object
ORDER BY wbi.subject;
But it gives the following error:
column "wbod.subject" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select wbod.subject, wbi.object
Why does this error occur?
That's because you are selecting this column without grouping by it. If a given group (GROUP BY wbi.object) contains 150 different subjects, which one of them should be returned?
I initially misread the query - order is using wbi.subject, but error is about wbod.subject.
If I understand your query correctly, you actually don't need the sub-query or the group by:
select subject,
object,
ageOfPerson
from (
select wbod.subject,
wbi.object,
age(dod.object, wbod.object) as ageOfPerson,
dense_rank() over (partition by wbi.object order by age(dod.object, wbod.object) desc) as rnk
from wasbornin as wbi
join wasbornondate as wbod on wbi.subject=wbod.subject
join diedondate as dod on wbod.subject=dod.subject
) t
where rnk = 1;
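Here is a rough sketch of the ranking idea in Python's sqlite3 module. Several things are assumptions: the sample people are invented, plain integer years replace the date columns because SQLite has no age() function, and the window partitions by birthplace on the reading that the goal is the oldest person per birthplace.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    create table wasbornin (subject text, object text);
    create table wasbornondate (subject text, object integer);
    create table diedondate (subject text, object integer);
    insert into wasbornin values ('ann', 'Paris'), ('bob', 'Paris'), ('cy', 'Rome');
    insert into wasbornondate values ('ann', 1900), ('bob', 1910), ('cy', 1920);
    insert into diedondate values ('ann', 1980), ('bob', 1970), ('cy', 1990);
    """
)

rows = conn.execute(
    """
    select subject, birthplace, ageofperson
    from (
        select wbod.subject as subject,
               wbi.object as birthplace,
               dod.object - wbod.object as ageofperson,
               -- rank people within each birthplace, oldest first
               dense_rank() over (
                   partition by wbi.object
                   order by dod.object - wbod.object desc
               ) as rnk
        from wasbornin wbi
        join wasbornondate wbod on wbi.subject = wbod.subject
        join diedondate dod on wbod.subject = dod.subject
    ) t
    where rnk = 1
    order by birthplace
    """
).fetchall()

for row in rows:
    print(row)
```

No group by is needed: the window function does the per-birthplace comparison, and the outer where keeps only the top-ranked rows.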