ORACLE SQL select max(count()) to year - sql

I have a library database and I am trying to find the most borrowed title for each year, like:
2015 - The Great Gatsby
2014 - Da vinci code
2013 - Harry Potter
....
I've tried this, but I am not sure about it:
select to_char(borrow_date,'YYYY'),title_name
from k_title
join k_book
using(title_id)
join k_rent_books
using(book_id)
group by to_char(borrow_date,'YYYY'),title_name
having count(title_id) = (
select max(cnt) FROM(select count(title_name) as cnt
from k_title
join k_book
using(title_id)
join k_rent_books
using(book_id)
group by title_id,title_name,to_char(borrow_date,'YYYY')));
I've got only 3 results
2016 - Shogun
2006 - The Revolt of Mamie Stover
1996 - The Great Gatsby
I will be happy for any help :)

Oracle has the nice capability to get the first or last value in an aggregation (as opposed to the min() or max()). This requires using something called keep.
So, the way to express what you want to do is:
select yyyy,
max(title_name) keep (dense_rank first order by cnt desc) as title_name
from (select to_char(borrow_date, 'YYYY') as yyyy,
title_name, count(*) as cnt
from k_title t join
k_book b
using (title_id) join
k_rent_books
using (book_id)
group by to_char(borrow_date, 'YYYY'), title_name
) yt
group by yyyy;
Your query is returning the year/title combinations that have the overall maximum count over all years, not the maximum per year.
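The difference between "overall maximum" and "maximum per year" can be sketched on a toy data set. This is only an illustration run through Python's sqlite3 module, not Oracle: the single rentals table and its columns are hypothetical stand-ins for the three joined tables, and since SQLite has no KEEP clause, ROW_NUMBER() is swapped in to pick the top title per year (ties broken by the highest title name, which is what max(title_name) keep (dense_rank first ...) would return).

```python
import sqlite3

# Hypothetical, simplified schema: one row per loan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rentals (borrow_year TEXT, title_name TEXT)")
conn.executemany(
    "INSERT INTO rentals VALUES (?, ?)",
    [
        ("2014", "Da Vinci Code"), ("2014", "Da Vinci Code"),
        ("2014", "Shogun"),
        ("2015", "The Great Gatsby"), ("2015", "The Great Gatsby"),
        ("2015", "The Great Gatsby"), ("2015", "Shogun"),
    ],
)

# The approach from the question: compares every group with the single
# highest count across ALL years, so only years reaching it survive.
overall_max = conn.execute("""
    SELECT borrow_year, title_name
    FROM rentals
    GROUP BY borrow_year, title_name
    HAVING COUNT(*) = (SELECT MAX(cnt)
                       FROM (SELECT COUNT(*) AS cnt
                             FROM rentals
                             GROUP BY borrow_year, title_name))
""").fetchall()

# Per-year winner: count per (year, title), then rank within each year.
top_per_year = conn.execute("""
    SELECT yyyy, title_name, cnt
    FROM (SELECT yyyy, title_name, cnt,
                 ROW_NUMBER() OVER (PARTITION BY yyyy
                                    ORDER BY cnt DESC, title_name DESC) AS rn
          FROM (SELECT borrow_year AS yyyy, title_name, COUNT(*) AS cnt
                FROM rentals
                GROUP BY borrow_year, title_name))
    WHERE rn = 1
    ORDER BY yyyy
""").fetchall()

print(overall_max)
print(top_per_year)
```

With this sample data the first query keeps only the 2015 row, while the second returns one row for each year.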

Related

Improve the performance of a SQL Query via SQL Server

I'm looking to improve the performance of the Last Year Attended (LYA) query. Right now, it's taking 20+ minutes to run this block.
The LYA query takes the most recent year a company attended a particular event and finds the year it attended prior to that maximum. For example, if a company attended an event in 2018, the query looks for the last year attended prior to 2018.
LYA for 2018 should return a Null
The data should return the following:
CompanyID MarketID Industry LAST YEAR ATTENDED
-------------------------------------------------------
123456 1234 GIFT 2018
123457 1234 HOME 2017
123458 1234 GIFT 2018
123459 1234 HOME 2018
123460 1234 APPAREL 2018
123461 1234 HOME 2018
123462 1234 HOME 2017
123463 1234 APPAREL 2018
Can anyone assist?
SELECT DISTINCT
COMPANYID, MARKETID, INDUSTRY,
[LAST YEAR ATTENDED] = (SELECT MAX(YEAR(attdate))
FROM v_marketatt va
WHERE va.companyid = vm.companyid
AND YEAR(attdate) <> (SELECT MAX(YEAR(attdate))
FROM v_marketatt vb
WHERE vb.companyid = vm.companyid)
AND MARKETCODE LIKE 'SM1%')
FROM
v_marketatt vm
WHERE
MARKETID IN (835, 1032, 1101)
UPDATE:
Found that this version is more efficient than the rest. Run time is down to 7 minutes on a clone. Instead of allowing the subquery to dip into my view twice, I had it dip once.
select
DISTINCT COMPANYID,
MARKETID,
INDUSTRY,
CSTATUS,
[LAST YEAR ATTENDED] = (select max(year(attdate)) from v_marketatt va where year(attdate) <> (select max(year(attdate)) from v_marketatt) AND MARKETCODE LIKE 'SM1%' AND va.COMPANYID = vm.COMPANYID)
from v_marketatt vm
WHERE MARKETID IN (835,1032,1101)
;
Thanks to all who responded.
The field [LAST YEAR ATTENDED] has a subquery that computes the max year on each iteration. You can try moving this piece of the query into a join, something like:
select DISTINCT vm.COMPANYID, vm.MARKETID, vm.INDUSTRY,
va.[LAST YEAR ATTENDED]
from v_marketatt vm
inner join
( select ivm.companyid, max(year(ivm.attdate)) as [LAST YEAR ATTENDED]
from v_marketatt ivm
where year(ivm.attdate) <> (select max(year(vb.attdate))
from v_marketatt vb
where vb.companyid =
ivm.companyid)
AND ivm.MARKETCODE LIKE 'SM1%'
group by ivm.companyid -- one row per company, so the join below has a key
)va on va.companyid = vm.companyid
--where companyid not in (select distinct companyid from
-- v_marketatt where marketid in (602))
WHERE vm.MARKETID IN (835,1032,1101)
I have not run this query, so there could be some minor syntax corrections needed, but if you get the concept it should be easy to pick up and fix.
Apologies for the syntax, I'm throwing this together quickly. But I suspect making use of a CTE should improve performance dramatically. I'm also not quite sure what you're doing here:
WHERE va.companyid = vm.companyid
AND YEAR(attdate) <> (SELECT MAX(YEAR(attdate))
FROM v_marketatt vb)
AND MARKETCODE LIKE 'SM1%'
So I've left that piece alone. Try something like this, which should help; clarification on the part I've noted above might unlock other things to tweak.
;with Year_CTE (companyid, [year])
as
(SELECT va.companyid, MAX(YEAR(va.attdate))
FROM v_marketatt va
WHERE YEAR(va.attdate) <> (SELECT MAX(YEAR(vb.attdate))
FROM v_marketatt vb)
AND va.MARKETCODE LIKE 'SM1%'
GROUP BY va.companyid) -- vm is not visible inside the CTE, so group here and join below
SELECT DISTINCT
vm.COMPANYID, vm.MARKETID, vm.INDUSTRY,
vb.[year]
FROM
v_marketatt vm
join Year_CTE vb on vb.companyid = vm.companyid
WHERE
vm.MARKETID IN (835, 1032, 1101)
If you need "the one before this one", I'd suggest using the LEAD() or LAG() functions.
Although I'm not quite sure I fully understand your example (see Thorsten Kettner's comments), going by the explanation I think what you want is something along the lines of:
;WITH years
AS (
SELECT COMPANYID, MARKETID, INDUSTRY, YEAR_ATTENDED = Year(attdate)
FROM v_marketatt
WHERE MARKETID IN (835, 1032, 1101)
AND MARKETCODE LIKE 'SM1%' -- not sure about this one, the example isn't very clear
GROUP BY COMPANYID, MARKETID, INDUSTRY, Year(attdate)
),
last_ones
AS (
SELECT row_nbr = ROW_NUMBER() OVER ( PARTITION BY COMPANYID, MARKETID, INDUSTRY ORDER BY YEAR_ATTENDED DESC),
COMPANYID, MARKETID, INDUSTRY,
LAST_YEAR_ATTENDED = YEAR_ATTENDED,
PREV_YEAR_ATTENDED = LEAD(YEAR_ATTENDED, 1, NULL) OVER (PARTITION BY COMPANYID, MARKETID, INDUSTRY ORDER BY YEAR_ATTENDED DESC)
FROM years
)
SELECT COMPANYID, MARKETID, INDUSTRY,
LAST_YEAR_ATTENDED,
PREV_YEAR_ATTENDED
FROM last_ones
WHERE row_nbr = 1
Since I don't have the tables nor the data here, I haven't tested the query, but I hope it will get you going...
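The LEAD() idea can be tried on a tiny data set. The sketch below is an assumption-laden illustration, not the SQL Server query itself: it runs on SQLite through Python's sqlite3 module, and the attendance table with its two columns is hypothetical. It reduces the rows to distinct years per company, then reads the most recent year and the one before it in a single pass.

```python
import sqlite3

# Hypothetical table: one row per attendance, year already extracted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE attendance (companyid INTEGER, att_year INTEGER)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?)",
    [(1, 2016), (1, 2017), (1, 2018), (1, 2018),  # company 1: three distinct years
     (2, 2018)],                                  # company 2: a single year
)

result = conn.execute("""
    WITH years AS (
        SELECT DISTINCT companyid, att_year FROM attendance
    ),
    ranked AS (
        SELECT companyid,
               att_year AS last_year,
               -- with a descending sort, the "next" row is the previous year
               LEAD(att_year) OVER (PARTITION BY companyid
                                    ORDER BY att_year DESC) AS prev_year,
               ROW_NUMBER() OVER (PARTITION BY companyid
                                  ORDER BY att_year DESC) AS rn
        FROM years
    )
    SELECT companyid, last_year, prev_year
    FROM ranked
    WHERE rn = 1
    ORDER BY companyid
""").fetchall()

print(result)
```

A company with only one year of attendance gets NULL (None in Python) as its previous year, which matches the "should return a Null" requirement in the question.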

SQL MAX function with multiple columns showing in apex?

I am trying to write an SQL query to show the year, player name, and PPG of the player with the highest PPG from each year in our database.
We have a Players table with all the stats, and a team table with stats linked to each season as a team total.
What I want to do is get the highest scorer from each season so:
2010: Jake 10 ppg
2011: Jake 12 ppg
2012: Carl 13 ppg
Etc.
Here is my current query:
SELECT Year, PlayerName, MAX(PPG) AS PPG
FROM PLAYERS_T, TEAM_T
GROUP BY Year
ORDER BY PPG;
However, this is not working. What do I need to do to make this work?
This should work, but it will return more than one row for a year if several players share the same top PPG. I don't know what the Team table is used for there.
WITH PLAYERS_T as (
SELECT 2010 "Year", 'Jake' "PlayerName", 10 ppg
UNION
SELECT 2011 "Year", 'Jake' "PlayerName", 12 ppg
UNION
SELECT 2012 "Year", 'Carl' "PlayerName", 13 ppg
)
SELECT T1."Year", T1."PlayerName", T1.PPG
FROM PLAYERS_T T1
LEFT JOIN PLAYERS_T T2
ON T1."Year" = T2."Year"
AND T1.PPG < T2.PPG
WHERE T2."Year" IS NULL
Try this one:
SELECT players_T.playername, players_T.ppg, players_T.year
FROM
(SELECT year, MAX(PPG) AS mx
FROM players_T
GROUP BY year) sub
INNER JOIN players_T ON sub.mx = players_T.ppg
WHERE sub.year = players_T.year
ORDER BY players_T.year
In the subquery, this finds the max ppg per year. Then we join with the players table on the ppg to find the player name. The result should be the player name, ppg and year together. Let me know what you find!
Edit: Need to include a WHERE clause for year
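Both answers above share the same behaviour on ties, which a small run makes visible. This is a sketch on SQLite via Python's sqlite3 module, with made-up rows (the extra players Tom and Ann are hypothetical); it uses the join-to-the-yearly-maximum form with the year condition placed in the ON clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE players_t (year INTEGER, playername TEXT, ppg INTEGER)")
conn.executemany(
    "INSERT INTO players_t VALUES (?, ?, ?)",
    [(2010, "Jake", 10), (2010, "Tom", 8),
     (2011, "Jake", 12),
     (2012, "Carl", 13), (2012, "Ann", 13)],  # two players tie in 2012
)

top_scorers = conn.execute("""
    SELECT p.year, p.playername, p.ppg
    FROM players_t p
    JOIN (SELECT year, MAX(ppg) AS mx
          FROM players_t
          GROUP BY year) sub
      ON sub.year = p.year   -- same season
     AND sub.mx = p.ppg      -- and the season's best PPG
    ORDER BY p.year, p.playername
""").fetchall()

for row in top_scorers:
    print(row)
```

Every player who matches the season maximum is returned, so 2012 yields two rows here. If exactly one row per year is required, a tie-breaking rule has to be added.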

SQL Server get customer with 7 consecutive transactions

I am trying to write a query that would get the customers with 7 consecutive transactions given a list of CustomerKeys.
I am currently doing a self join on Customer fact table that has 700 Million records in SQL Server 2008.
This is what I came up with, but it's taking a long time to run. I have a clustered index on (CustomerKey, TranDateKey).
SELECT
ct1.CustomerKey,ct1.TranDateKey
FROM
CustomerTransactionFact ct1
INNER JOIN
#CRTCustomerList dl ON ct1.CustomerKey = dl.CustomerKey --temp table with customer list
INNER JOIN
dbo.CustomerTransactionFact ct2 ON ct1.CustomerKey = ct2.CustomerKey -- Same Customer
AND ct2.TranDateKey >= ct1.TranDateKey
AND ct2.TranDateKey <= CONVERT(VARCHAR(8), dateadd(d, 6, ct1.TranDateTime), 112) -- Consecutive Transactions in the last 7 days
WHERE
ct1.LogID >= 82800000
AND ct2.LogID >= 82800000
AND ct1.TranDateKey between dl.BeginTranDateKey and dl.EndTranDateKey
AND ct2.TranDateKey between dl.BeginTranDateKey and dl.EndTranDateKey
GROUP BY
ct1.CustomerKey,ct1.TranDateKey
HAVING
COUNT(*) = 7
Please help make it more efficient. Is there a better way to write this query in 2008?
You can do this using window functions, which should be much faster. Assuming that TranDateKey is a number and you can subtract a sequential number from it, the difference is constant for consecutive days.
You can put this in a query like this:
SELECT CustomerKey, MIN(TranDateKey), MAX(TranDateKey)
FROM (SELECT ct.CustomerKey, ct.TranDateKey,
(ct.TranDateKey -
DENSE_RANK() OVER (PARTITION BY ct.CustomerKey ORDER BY ct.TranDateKey)
) as grp
FROM CustomerTransactionFact ct INNER JOIN
#CRTCustomerList dl
ON ct.CustomerKey = dl.CustomerKey
) t
GROUP BY CustomerKey, grp
HAVING COUNT(*) = 7;
If your date key is something else, there is probably a way to modify the query to handle that, but you might have to join to the dimension table.
This would be a perfect task for a COUNT(*) OVER (RANGE ...), but SQL Server 2008 supports only a limited syntax for Windowed Aggregate Functions.
SELECT CustomerKey, MIN(TranDateKey), COUNT(*)
FROM
(
SELECT CustomerKey, TranDateKey,
dateadd(d,-ROW_NUMBER()
OVER (PARTITION BY CustomerKey
ORDER BY TranDateKey),TranDateTime) AS dummyDate
FROM CustomerTransactionFact
) AS dt
GROUP BY CustomerKey, dummyDate
HAVING COUNT(*) >= 7
The dateadd calculates the difference between the current TranDateTime and a Row_Number over all dates per customer. The resulting dummyDate has no actual meaning, but it is the same meaningless date for consecutive dates.
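The "date minus row number" trick described above can be checked on a handful of rows. The following is a sketch on SQLite through Python's sqlite3 module, not SQL Server 2008: the tx table and its columns are hypothetical, julianday() stands in for the date arithmetic, and it assumes at most one transaction per customer per day.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tx (customer_key INTEGER, tran_date TEXT)")

def days(start, n):
    """Return n consecutive ISO dates beginning at start."""
    return [(start + timedelta(days=i)).isoformat() for i in range(n)]

rows = [(1, d) for d in days(date(2016, 1, 29), 7)]  # 7-day run across a month end
rows += [(1, "2016-02-10")]                          # isolated later transaction
rows += [(2, "2016-01-01"), (2, "2016-01-02"), (2, "2016-01-04")]  # has a gap
conn.executemany("INSERT INTO tx VALUES (?, ?)", rows)

streaks = conn.execute("""
    SELECT customer_key, MIN(tran_date), MAX(tran_date), COUNT(*)
    FROM (SELECT customer_key, tran_date,
                 -- day number minus position: constant within a consecutive run
                 CAST(julianday(tran_date) AS INTEGER)
                 - ROW_NUMBER() OVER (PARTITION BY customer_key
                                      ORDER BY tran_date) AS grp
          FROM tx)
    GROUP BY customer_key, grp
    HAVING COUNT(*) >= 7
""").fetchall()

print(streaks)
```

Only customer 1 qualifies, with a run from 2016-01-29 to 2016-02-04. The run crosses a month boundary, which is why the subtraction is done on a real day number rather than on a YYYYMMDD-style key.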

Select Max with groupby

I have the following table.
I need to select SemesterID,AcadamiYear,AcademicSemester of the record with highest Academic year and Academic semester of the year 2015
Expected output is
2013 1 2
I tried the following query but it returns both of the records
select MAX(AcadamiYear) as Year,
MAX(AcadamicSemester) as Semester
,SemesterID
from
tblSemesterRegistration
where [IntakeYear]='2015'
Group by SemesterID
Since you are searching for a single record, you might use TOP 1 with the ordering you intend:
select TOP 1 *
from
tblSemesterRegistration
where [IntakeYear]='2015'
Order by AcadamiYear DESC, AcadamicSemester DESC
This is the query you're looking for:
SELECT SR.*
FROM tblSemesterRegistration SR
INNER JOIN (SELECT MAX(SR2.AcadamiYear) AS [AcadamiYear]
,MAX(SR2.AcadamicSemester) AS [AcadamicSemester]
,IntakeYear
FROM tblSemesterRegistration SR2
GROUP BY SR2.IntakeYear) T ON T.AcadamiYear = SR.AcadamiYear
AND T.AcadamicSemester = SR.AcadamicSemester
AND T.IntakeYear = SR.IntakeYear
WHERE SR.IntakeYear = '2015'
Hope this will help you.
If SemesterID is the primary key, grouping on it will always yield all rows (since it is always unique).
I guess you mean to find back that semester id with the parameters set:
select r.*
from tblSemesterRegistration r
join ( select max(AcadamiYear) as Year
, max(AcadamicSemester) as Semester
from tblSemesterRegistration
where [IntakeYear]='2015'
) m
on r.acadamiyear = m.year
and r.acadamicsemester = m.semester
SELECT MAX(AcadamiYear) AS Year,
MAX(AcadamicSemester) AS Semester,
MAX(SemesterID) AS SemesterID
FROM tblSemesterRegistration
WHERE [IntakeYear] = '2015'
GROUP BY IntakeYear
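One caveat applies to the answers above that take MAX(AcadamiYear) and MAX(AcadamicSemester) independently: the two maxima can come from different rows. The sketch below shows this on SQLite via Python's sqlite3 module, with a hypothetical reg table and made-up rows; LIMIT 1 is used as SQLite's counterpart to TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE reg (semester_id INTEGER, intake_year TEXT,
                                  acad_year INTEGER, acad_semester INTEGER)""")
conn.executemany(
    "INSERT INTO reg VALUES (?, ?, ?, ?)",
    [(1, "2015", 2013, 2),   # earlier year, semester 2
     (2, "2015", 2014, 1)],  # latest year, semester 1
)

# Independent maxima: year from row 2, semester from row 1.
independent = conn.execute("""
    SELECT MAX(acad_year), MAX(acad_semester)
    FROM reg
    WHERE intake_year = '2015'
""").fetchone()

# Ordering by year, then semester, always returns an existing row.
latest = conn.execute("""
    SELECT semester_id, acad_year, acad_semester
    FROM reg
    WHERE intake_year = '2015'
    ORDER BY acad_year DESC, acad_semester DESC
    LIMIT 1
""").fetchone()

print(independent)
print(latest)
```

Here the independent maxima combine into a (year, semester) pair that matches no row, so joining back on that pair would return nothing, whereas the ordered query returns the record for the latest year and its highest semester.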

Oracle - group by of joined tables

I looked for an answer and found several pieces of advice, but none of them was helpful, so I'm asking now.
I have two tables, one with distributors (columns: distributorid, name) and the second one with delivered products (columns: distributorid, productid, corruptcount, date). The column corruptcount contains the number of corrupted deliveries. I need to select the first five distributors with the most corrupted deliveries in the last two months. I need to select distributorid, name and the sum of corruptcount. Here is my query:
SELECT del.distributorid, d.name, SUM(del.corruptcount) AS corrupt
FROM distributor d, delivery del
WHERE d.distributorid = del.distributorid
AND d.distributorid IN
(SELECT distributorid
FROM (SELECT distributorid, SUM(corruptcount) AS corrupt
FROM delivery
WHERE storeid = 1
AND "date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE
AND ROWNUM <= 5
GROUP BY distributorid
ORDER BY corrupt DESC))
GROUP BY del.distributorid
But Oracle returns the error message "not a GROUP BY expression". When I edit my query to this:
SELECT del.distributorid, d.name, del.corruptcount-- , SUM(del.corruptcount) AS corrupt
FROM distributor d, delivery del
WHERE d.distributorid = del.distributorid
AND d.distributorid IN
(SELECT distributorid
FROM (SELECT distributorid, SUM(corruptcount) AS corrupt
FROM delivery
WHERE storeid = 1
AND "date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE
AND ROWNUM <= 5
GROUP BY distributorid
ORDER BY corrupt DESC))
--GROUP BY del.distributorid
It works as expected and returns the correct data:
1 IBM 10
2 DELL 0
2 DELL 1
2 DELL 6
3 HP 3
8 ACER 2
9 ASUS 1
I'd like to group this data. Where and why is my query wrong? Can you help please? Thank you very, very much.
I think the problem is just the d.name in the select list; you need to include it in the group by clause as well. Try this:
SELECT del.distributorid, d.name, SUM(del.corruptcount) AS corrupt
FROM distributor d join
delivery del
on d.distributorid = del.distributorid
WHERE d.distributorid IN
(SELECT distributorid
FROM (SELECT distributorid
FROM delivery
WHERE storeid = 1 AND
"date" BETWEEN ADD_MONTHS(SYSDATE, -2) AND SYSDATE
GROUP BY distributorid
ORDER BY SUM(corruptcount) DESC
)
WHERE ROWNUM <= 5
)
GROUP BY del.distributorid, d.name;
I also switched the query to using explicit join syntax with an on clause, instead of the outdated implicit join syntax using a condition in the where.
I kept the additional layer of subquery but moved ROWNUM <= 5 outside of it. ROWNUM is assigned before GROUP BY and ORDER BY are applied within the same query block, so filtering on it inside the inner query would pick five arbitrary rows rather than the five distributors with the highest totals.
EDIT:
"Why does d.name have to be included in the group by?" The easy answer is that SQL requires it because it does not know which value to include from the group. You could instead use min(d.name) in the select, for instance, and there would be no need to change the group by clause.
The real answer is a wee bit more complicated. The ANSI standard does actually permit the query as you wrote it. This is because id is (presumably) declared as a primary key on the table. When you group by a primary key (or unique key), then you can use other columns from the same table just as you did. Although ANSI supports this, most databases do not yet. So, the real reason is that Oracle doesn't support the ANSI standard functionality that would allow your query to work.
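The two fixes mentioned above (adding d.name to the GROUP BY, or wrapping it in min()) can be compared on the sample rows from the question. This is only a sketch on SQLite through Python's sqlite3 module: SQLite is lenient about ungrouped columns, so it does not reproduce Oracle's "not a GROUP BY expression" error, and the store and date filters are left out. It merely shows that both rewrites produce the same grouped result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE distributor (distributorid INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE delivery (distributorid INTEGER, corruptcount INTEGER)")
conn.executemany("INSERT INTO distributor VALUES (?, ?)",
                 [(1, "IBM"), (2, "DELL"), (3, "HP"), (8, "ACER"), (9, "ASUS")])
conn.executemany("INSERT INTO delivery VALUES (?, ?)",
                 [(1, 10), (2, 0), (2, 1), (2, 6), (3, 3), (8, 2), (9, 1)])

# Fix 1: list every non-aggregated column in the GROUP BY.
by_id_and_name = conn.execute("""
    SELECT del.distributorid, d.name, SUM(del.corruptcount) AS corrupt
    FROM distributor d
    JOIN delivery del ON d.distributorid = del.distributorid
    GROUP BY del.distributorid, d.name
    ORDER BY del.distributorid
""").fetchall()

# Fix 2: keep GROUP BY on the id only and aggregate the name.
by_id_min_name = conn.execute("""
    SELECT del.distributorid, MIN(d.name) AS name, SUM(del.corruptcount) AS corrupt
    FROM distributor d
    JOIN delivery del ON d.distributorid = del.distributorid
    GROUP BY del.distributorid
    ORDER BY del.distributorid
""").fetchall()

print(by_id_and_name)
```

Because each distributorid maps to exactly one name, both forms collapse the three DELL rows into a single row with a total of 7.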