Oracle SQL query, getting a a maximum of a sum - sql

Hey, guys. I'm struggling to solve one query, just cant get around it.
Basically, I got a some tables from data mart :
DimTheatre(TheatreId(PK), TheatreNo, Name, Address, MainTel);
DimTrow(TrowId(PK), TrowNo, RowName, RowType);
DimProduction(ProductionId(PK), ProductionNo, Title, ProductionDir, PlayAuthor);
DimTime(TimeId(PK), Year, Month, Day, Hour);
TicketPurchaseFact( TheatreId(FK), TimeId(FK), TrowId(FK),
PId(FK), TicketAmount);
The thing I'm trying to achieve in oracle is - I need to retrieve the most popular row type in each theatre by value of ticket sale
Thing I'm doing now is :
SELECT dthr.theatreid, dthr.name, max(tr.rowtype) keep(dense_rank last order
by tpf.ticketamount), sum(tpf.ticketamount) TotalSale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid
GROUP BY dthr.theatreid, dthr.name;
It does give me the output, but the 'TotalSale' column is totally out of place, it gives much way higher numbers than they should be.. How could I approach this issue :) ?

I am not sure how MAX() KEEP () would help your case if I understand the problem correctly. But the below approach should work:
SELECT x.theatreid, x.name, x.rowtype, x.total_sale
FROM
(SELECT z.theatreid, z.name, z.rowtype, z.total_sale, DENSE_RANK() OVER (PARTITION BY z.theatreid, z.name ORDER BY z.total_sale DESC) as popular_row_rank
FROM
(SELECT dthr.theatreid, dthr.name, tr.rowtype, SUM(tpf.ticketamount) as total_sale
FROM TicketPurchaseFact tpf, DimTheatre dthr, DimTrow tr
WHERE dthr.theatreid = tpf.theatreid AND tr.trowid = tpf.trowid
GROUP BY dthr.theatreid, dthr.name, tr.rowtype) z
) x
WHERE x.popular_row_rank = 1;

You want the row type per theatre with the highest ticket amount. So join purchases and rows and then aggregate to get the total per rowtype. Use RANK to rank your row types per theatre and stay with the best ranked ones. At last join with the theatre table to get the theatre name.
select
theatreid,
t.name,
tr.trowid
from
(
select
p.theatreid,
r.rowtype,
rank() over (partition by p.theatreid order by sum(p.ticketamount) desc) as rn
from ticketpurchasefact p
join dimtrow r using (trowid)
group by p.theatreid, r.rowtype
) tr
join dimtheatre t using (theatreid)
where tr.rn = 1;

Related

Oracle: select just last update of date

I have the following query that return me: 100 rows
SELECT uni_id, uni_mast_id, uni_type
FROM UNIVERSITIES
WHERE uni_master ='SO88'AND uni_stat= 'OK'
now i need to do a join with another table and to obtain last entry of that day then:
SELECT uni_id, uni_teach_name, MAX(cal_update), cal_status
FROM UNIVERSITIES
LEFT JOIN CALENDAR
ON unı_id = cal_id
WHERE uni_master = 'SO88'
AND uni_stat = 'OK'
AND cal_name = 'REGISTRED'
GROUP BY uni_id, uni_teach_name, uni_stat
ORDER BY cal_update
but this query gives me 102 records, because cal_update appears 2 times.
One for example with date : 22-OCT-2020 11:34:55 another for the same uni_id at time 22-OCT-2020 11:30:22
I want just to get the max date for that date, not both.
In this case the query with the join needs to return the same records of the first select query.
I think you can do what you want using row_number():
SELECT UNI_ID, UNI_TEACH_NAME, CAL_UPDATE, CAL_STATUS
FROM (SELECT U.UNI_ID, U.UNI_TEACH_NAME, C.CAL_UPDATE, C.CAL_STATUS,
ROW_NUMBER() OVER (PARTITION BY U.UNI_ID, TRUNC(C.CAL_UPDATE) ORDER BY C.CAL_UPDATE DESC) as seqnum
FROM UNIVERSITIES U LEFT JOIN
CALENDAR C
ON U.UNI_ID = C.CAL_ID AND C.CAL_NAME = 'REGISTRED'
WHERE U.UNI_MASTER = 'SO88' AND
U.UNI_STAT= 'OK'
) UC
WHERE seqnum = 1;
I have to guess where the columns come from, because the question is not clear. Any filtering columns from CALENDAR should be in the ON clause if you are using a LEFT JOIN.
You can replace the last part of the query, while aliasing the MAX(cal_update) with cal_update , as
ORDER BY cal_update DESC
FETCH FIRST 1 ROW WITH TIES
for DB version 12c+ to descendingly order by the concerned column in order to pick the record with the latest value for that column.
WITH TIES option stand for bringing all records with the same datetime values, might be replaced with ONLY in order to bring only one row even for those cases occur.
The column call_status(within the select list) should be removed which's a non- aggregated column
As an alternative to a subquery and rank, you could use KEEP...LAST :
SELECT U.UNI_ID,
U.UNI_TEACH_NAME,
MAX(C.CAL_UPDATE) AS CAL_UPDATE,
MAX(C.CAL_STATUS) KEEP (DENSE_RANK LAST ORDER BY C.CAL_UPDATE) AS CAL_STATUS
FROM UNIVERSITIES U
LEFT JOIN CALENDAR C
ON U.UNI_ID = C.CAL_ID
AND C.CAL_NAME = 'REGISTRED'
WHERE U.UNI_MASTER = 'SO88'
AND U.UNI_STAT= 'OK'
GROUP BY U.UNI_ID,
U.UNI_TEACH_NAME,
TRUNC(C.CAL_UPDATE)
I've moved the CAL_NAME check into the outer join's ON clause; if it's in the WHERE clause then it will effectively turn it back into an inner join. So this will get one row per university per day that the calendar was updated: "I want just to get the max date for that date". And it will show nulls for the calendar fields if there is no matching calendar, since it's an outer join.
If you actually only want the latest update on any day then just remove the TRUNC(C.CAL_UPDATE) from the grouping:
SELECT U.UNI_ID,
U.UNI_TEACH_NAME,
MAX(C.CAL_UPDATE) AS CAL_UPDATE,
MAX(C.CAL_STATUS) KEEP (DENSE_RANK LAST ORDER BY C.CAL_UPDATE) AS CAL_STATUS
FROM UNIVERSITIES U
LEFT JOIN CALENDAR C
ON U.UNI_ID = C.CAL_ID
AND C.CAL_NAME = 'REGISTRED'
WHERE U.UNI_MASTER = 'SO88'
AND U.UNI_STAT= 'OK'
GROUP BY U.UNI_ID,
U.UNI_TEACH_NAME
db<>fiddle with some made-up data; and also (just for fun) showing Gordon's query with the calendar name clause in both places to show the difference, and to show this gets the same result for that dummy data. (And an 18c version which shows Barbaros' too; getting back a single row.)

How to use rounded value for join in SQL

i'm just learning SQL today and i never thought how fun it's until i'm fiddling with it.
I got a problem and i need a help.
i have 2 tables, Customer and Rate, with details stated below
Customer
idcustomer = int
namecustomer = varchar
rate = decimal(3,0)
with value as described:
idcustomer---namecustomer---rate
1---JOHN DOE---100
2---MARY JANE---90
3---CLIVE BAKER---12
4---DANIEL REYES---47
Rate
rate = decimal(3,0)
description = varchar(40)
with value as described:
rate---description
10---G Rank
20---F Rank
30---E Rank
40---D Rank
50---C Rank
60---B Rank
70---A Rank
80---S Rank
90---SS Rank
100---SSS Rank
Then i ran query below in order to round all values in customer.rate field then inner join it with rate table.
SELECT *, round(rate,-1) as roundedrate
FROM customer INNER JOIN rate ON customer.roundedrate = rate.rate
It didn't produce this result:
idcustomer---namecustomer---rate---roundedrate---description
1---JOHN DOE---100---100---SSS Rank
2---MARY JANE---90---90---SS Rank
3---CLIVE BAKER---12---10---G Rank
4---DANIEL REYES---47---50---C Rank
Is there anything wrong with my code ?
Your query should produce an 'ambigious column' error because you're not specifying a table name when referring to rate (in round(rate,-1)), which exists in both tables.
Also, the where part of a sql query is executed before the select part, so you can't refer to the alias customer.roundedrate in your where statement.
Try this instead
SELECT *, round(customer.rate,-1) as roundedrate
FROM customer INNER JOIN rate ON round(customer.rate,-1) = rate.rate
http://sqlfiddle.com/#!9/e94a60/2
I would suggest a correlated subquery for this:
select c.*,
(select r.description
from rate r
where r.rate <= c.rate
order by r.rate desc
fetch first 1 row only
) as description
from customer c;
Note: fetch first 1 row only is ANSI standard SQL, which some databases do not support. MySQL uses limit. Older versions of SQL Server use select top 1 instead.

SQL Nested Select -Subquery returned more than 1 value-

I have a table Sales with columns SalesID, SalesName, SalesCity, SalesState.
I am trying to come up with a query that only shows salesName where there is one SalesName per SalesCity. So for example, if SaleA is in Houston and SaleB is in Houston, SaleA and SaleB will not be returned.
select
SalesName, SalesCity, SalesState
from
Sales
where
(select count(*) from Sales group by SalesCity) = 1;
I am not entirely sure how to link the inner select back out. I need another column in the nested select to identify the SalesID. I am currently stuck and have made no progress.
You can get the names of cities that have only 1 sale by using GROUP BY and HAVING operators. Then use these results in your where clause:
SELECT SalesName, SalesCity, SalesState
FROM Sales WHERE SalesCity IN
(
SELECT SalesCity
FROM Sales
GROUP BY SalesCity
HAVING COUNT(SalesCity) = 1
)
You can do this without a subquery:
select MIN(SalesName) as SalesName, SalesCity, MIN(SalesState) as SalesState
from Sales
group by SalesCity
having count(*) = 1;
If there is only one row for the city, then the min() will return the value on that row.

SQL Query GroupBy with date parameter

Suppose I have a table, TeamRatings, that looks something like this
|---Team----|--ValuationDate--|-Rating-|
|--Saints---|---10/15/2012----|---81.1-|
|--Broncos--|---10/15/2012----|---91.1-|
|--Ravens---|---10/16/2012----|--101.1-|
|--Broncos--|---10/22/2012----|---82.1-|
|--Ravens---|---10/22/2012----|---83.1-|
|--Saints---|---10/29/2012----|---84.1-|
|--Broncos--|---10/28/2012----|---85.1-|
|--Ravens---|---10/29/2012----|---86.1-|
Also, it is assumed that a team's rating remains unchanged until they play a new game, (representing a new record). E.g. The Broncos' rating on date 10/21/2012 is assumed to be 102.8
I want a query with a date parameter, that will return one record per team represnting that team's most recent game prior to the date specified. For instance,
If I input 10/23/2012 as my date parameter, the query should return
|---Team---|-ValuationDate---|-Rating-|
|--Saints--|---10/15/2012----|---81.1-|
|--Broncos-|---10/22/2012----|---82.1-|
|--Ravens--|---10/22/2012----|---83.1-|
Any help is greatly appreciated. Thanks!
On MS SQL Server 2005 or greater you can use a cte with ROW_NUMBER function:
WITH x
AS (SELECT team,
valuationdate,
rating,
rn = Row_number()
OVER(
partition BY team
ORDER BY valuationdate DESC)
FROM teamratings
WHERE valuationdate < #DateParam)
SELECT team,
valuationdate,
rating
FROM x
WHERE rn = 1
You can use a more general query like this:
select Team, x.ValuationDate, Rating
from TeamRatings inner join
(
select Team, max(ValuationDate) as ValuationDate
from TeamRatings
where ValuationDate < #dateParameter
group by Team
) x on TeamRatings.Team = x.Team and TeamRatings.ValuationDate = x.ValuationDate

Selecting DISTINCT records on subset of columns with MAX in another column

I have been looking at other T-SQL questions including DISTINCT and MAX here on the site for a couple hours now, but cannot find anything that quite matches my need. Here is a desription of my dataset and query objectives. Any guidance is much appreciated.
Dataset
Dataset is a list of customers, customer sites, dates and values from the last billing cycle, with the following columns. It is possible for a single customer to have multiple sites:
Customer, Site, Date, Counter, CounterValue, CollectorNode
Query Requirements
For the given billing cycle, I would like to select the following
DISTINCT (Customer and Site)
MAX(CounterValue) for this billing cycle for each DISTINCT Customer and Site
While still returning all the fields for that record from the table (CollectorNode, Date, Counter)
My challenge here is my inability to return all the columns while selecting the DISTINCT columns and MAX for each. My many varied attempts return multiple records for each customer/site combination.
Using a self JOIN:
SELECT ds.customer,
ds.site,
ds.counter,
ds.countervalue,
ds.collectornode
FROM DATASET ds
JOIN (SELECT t.customer,
t.site,
MAX(t.countervalue) AS max_countervalue
FROM DATASET t
GROUP BY t.customer, t.site) x ON x.customer = ds.customer
AND x.site = ds.site
AND x.max_countervalue = ds.countervalue
Using a CTE & ROW_NUMBER (SQL Server 2005+):
WITH example AS (
SELECT ds.customer,
ds.site,
ds.counter,
ds.countervalue,
ds.collectornode,
ROW_NUMBER() OVER(PARTITION BY ds.customer, ds.site
ORDER BY ds.countervalue DESC) AS rank
FROM DATASET ds)
SELECT e.customer,
e.site,
e.counter,
e.countervalue,
e.collectornode
FROM example e
WHERE e.rank = 1
Use a subquery to do the grouping and join the result back to the original table, like this:
SELECT g.Customer, g.Site, c.Date, c.Counter, g.MaxCounterValue, c.CollectorNode
FROM Customers c
INNER JOIN
(
SELECT Customer, Site, MAX(CounterValue) MaxCounterValue
FROM Customers
GROUP BY Customer, Site
) g
ON g.Customer = c.Customer
AND g.Site = g.Site