SQL RANK with multiple WHERE clause - sql

I have a few sales offices, together with their sales. I am trying to set up a report that shows how each office is performing. Getting SUMs and COUNTs is easy, but I am struggling to get the rank of a single office.
I would like this query to return the rank of a single office over the entire period and/or within a specified time range (e.g. BETWEEN '2015-01-01' AND '2015-01-15').
I also need to exclude some offices from the ranking (e.g. OfficeName NOT IN ('GGG','QQQ')), so using the sample data, the rank of office 'XYZ' would be 5.
If OfficeName = 'XYZ' is included in the WHERE clause, the RANK is obviously 1, because SQL filters out the rows that do not match the WHERE clause before the rank is computed.
Is there any way of doing this without using a temporary table?
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
--AND OfficeName = 'XYZ'
GROUP BY OfficeName
ORDER BY 2 DESC;
I am using MS SQL server 2008.
SQL Fiddle with some random data is here: http://sqlfiddle.com/#!3/fac7a/35
Many thanks for help!

If I understand you correctly, you want to do:
SELECT *
FROM (
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
) dat
WHERE OfficeName = 'XYZ';

You just need to wrap your code in a derived table or use a CTE like this, and then filter for OfficeName = 'XYZ' in the outer query.
;WITH CTE AS
(
SELECT OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t
JOIN Office o ON t.TransID=o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
)
SELECT *
FROM CTE
WHERE OfficeName = 'XYZ';
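To see why the wrapping matters, here is a minimal sketch that runs both variants against an in-memory SQLite database through Python's sqlite3 module (SQLite is only a stand-in for SQL Server, and window functions need SQLite 3.25 or later). The office_sales table and its figures are made up for illustration; they are not the data from the fiddle. The sums are computed in an inner query purely to keep the sketch portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE office_sales (OfficeName TEXT, Value INTEGER);  -- hypothetical table
    INSERT INTO office_sales VALUES
        ('AAA', 500), ('BBB', 400), ('GGG', 900),
        ('XYZ', 100), ('XYZ', 150), ('QQQ', 50);
""")

# Rank offices by their total; {extra} lets us add a filter before the ranking.
ranked = """
    SELECT OfficeName, Total, RANK() OVER (ORDER BY Total DESC) AS Rnk
    FROM (
        SELECT OfficeName, SUM(Value) AS Total
        FROM office_sales
        WHERE OfficeName NOT IN ('GGG', 'QQQ') {extra}
        GROUP BY OfficeName
    ) AS totals
"""

# Filter applied before ranking: only XYZ is left, so it trivially ranks first.
filtered_inside = conn.execute(
    ranked.format(extra="AND OfficeName = 'XYZ'")
).fetchall()

# Filter applied after ranking: the rank is computed over all remaining offices.
filtered_outside = conn.execute(
    "SELECT * FROM (" + ranked.format(extra="") + ") AS dat WHERE OfficeName = 'XYZ'"
).fetchall()

print(filtered_inside)   # [('XYZ', 250, 1)]
print(filtered_outside)  # [('XYZ', 250, 3)]
```

With these made-up numbers XYZ totals 250, behind AAA (500) and BBB (400), so the outer filter reports rank 3 while the inner filter always reports rank 1.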

Here is an amusing way to do this without a subquery:
SELECT TOP 1 OfficeName, SUM(Value) as SUM,
RANK() OVER (ORDER BY SUM(VALUE) DESC) AS Rank
FROM Transactions t JOIN
Office o
ON t.TransID = o.ID
WHERE OfficeName NOT IN ('GGG','QQQ')
GROUP BY OfficeName
ORDER BY (CASE WHEN OfficeName = 'XYZ' THEN 1 ELSE 2 END);

Related

How to select multiple max values from a sql table

I am trying to get the top performers from a table, grouped by company, but I can't seem to get the grouping right.
I have tried to use subqueries, but this goes beyond my knowledge.
I am trying to make a query that selects the rows in green. In other words I want to include the name, the company, and what they paid but only the top performers of each company.
Here is the raw data
create table test (person varchar(50),company varchar(50),paid numeric);
insert into
test
values
('bob','a',200),
('jane','a',100),
('mark','a',350),
('susan','b',650),
('thabo','b',100),
('thembi','b',210),
('lucas','b',110),
('oscar','c',10),
('janet','c',20),
('nancy','c',30)
You can use MAX() in a subquery as
CREATE TABLE T(
Person VARCHAR(45),
Company CHAR(1),
Paid INT
);
INSERT INTO T
VALUES ('Person1', 'A', 10),
('Person2', 'A', 20),
('Person3', 'B', 10);
SELECT T.*
FROM T INNER JOIN
(
SELECT Company, MAX(Paid) Paid
FROM T
GROUP BY Company
) TT ON T.Company = TT.Company AND T.Paid = TT.Paid;
Or using a window function as
SELECT Person,
Company,
Paid
FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY Company ORDER BY Paid DESC) RN
FROM T
) TT
WHERE RN = 1;
Here's your query.
select a.person, a.company, a.paid from tableA a
inner join
(select person, company, row_number() over (partition by company order by paid desc) as rn from tableA) as t1
on t1.person = a.person and t1.company = a.company
where t1.rn = 1
Maybe something like
WITH ranked AS (SELECT person, company, paid
, rank() OVER (PARTITION BY company ORDER BY paid DESC) AS rnk
FROM yourtable)
SELECT person, company, paid
FROM ranked
WHERE rnk = 1
ORDER BY company;
You can use the rank() function with a PARTITION BY clause.
DENSE_RANK also gives you the ranking within your ordered partition, but its ranks are consecutive: no ranks are skipped when several rows share a rank.
WITH cte AS (
SELECT person, company, paid,
rank() OVER (PARTITION BY company ORDER BY paid desc) rn
FROM yourtable
)
SELECT
*
FROM cte
WHERE rn = 1;
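The answers above mix ROW_NUMBER, RANK and DENSE_RANK, and the difference only shows up when there is a tie. The sketch below is an illustration rather than anyone's original answer: it uses Python's sqlite3 module with an in-memory database (window functions need SQLite 3.25 or later), and the rows are invented so that two people in company 'a' share the top amount.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE test (person TEXT, company TEXT, paid INTEGER);
    -- invented rows: bob and mark tie for first place in company 'a'
    INSERT INTO test VALUES
        ('bob', 'a', 350), ('mark', 'a', 350), ('jane', 'a', 100),
        ('susan', 'b', 650), ('thabo', 'b', 100);
""")

rows = conn.execute("""
    SELECT person, company, paid,
           ROW_NUMBER() OVER (PARTITION BY company ORDER BY paid DESC, person) AS rn,
           RANK()       OVER (PARTITION BY company ORDER BY paid DESC) AS rnk,
           DENSE_RANK() OVER (PARTITION BY company ORDER BY paid DESC) AS drnk
    FROM test
    ORDER BY company, paid DESC, person
""").fetchall()

for row in rows:
    print(row)

# Keeping rn = 1 returns exactly one row per company;
# keeping rnk = 1 returns every person tied for the top amount.
top_by_row_number = [r[0] for r in rows if r[3] == 1]
top_by_rank = [r[0] for r in rows if r[4] == 1]
print(top_by_row_number)  # ['bob', 'susan']
print(top_by_rank)        # ['bob', 'mark', 'susan']
```

So the choice between the functions comes down to whether tied top performers should all be returned (RANK or DENSE_RANK) or only one of them (ROW_NUMBER).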

SQL Windowing Ranks Functions

SELECT
*
FROM (
SELECT
Product,
SalesAmount,
ROW_NUMBER() OVER (ORDER BY SalesAmount DESC) as RowNum,
RANK() OVER (ORDER BY SalesAmount DESC) as RankOf2007,
DENSE_RANK() OVER (ORDER BY SalesAmount DESC) as DRankOf2007
FROM (
SELECT
c.EnglishProductName as Product,
SUM(a.SalesAmount) as SalesAmount,
b.CalendarYear as CalenderYear
FROM FactInternetSales a
INNER JOIN DimDate b
ON a.OrderDateKey=b.DateKey
INNER JOIN DimProduct c
ON a.ProductKey=c.ProductKey
WHERE b.CalendarYear IN (2007)
GROUP BY c.EnglishProductName,b.CalendarYear
) Sales
) Rankings
WHERE [RankOf2007] <= 5
ORDER BY [SalesAmount] DESC
I am currently sorting products by the sum of their Sales Amount in descending order and ranking them on that sum for 2007, so the product with the highest Sales Amount in that year gets rank 1, and so forth.
Currently my result looks like the table in the image (apart from the RankOf2008 and DRankOf2008 columns). I would like to add the 2008 rankings for the same top 5 products of 2007 (NULL if any of those products were unsold in 2008) as side-by-side columns in the same result, as shown in the image.
Maybe you need something like this.
First rank all products partitioned by year, that is, rank the products within each year, and then fetch the required rows with the help of a CTE.
WITH cte
AS (
SELECT *
FROM (
SELECT Product
,SalesAmount
,CalenderYear
,ROW_NUMBER() OVER (
PARTITION BY CalenderYear ORDER BY SalesAmount DESC
) AS RowNum
,RANK() OVER (
PARTITION BY CalenderYear ORDER BY SalesAmount DESC
) AS RankOf2007
,DENSE_RANK() OVER (
PARTITION BY CalenderYear ORDER BY SalesAmount DESC
) AS DRankOf2007
FROM (
SELECT c.EnglishProductName AS Product
,SUM(a.SalesAmount) AS SalesAmount
,b.CalendarYear AS CalenderYear
FROM FactInternetSales a
INNER JOIN DimDate b ON a.OrderDateKey = b.DateKey
INNER JOIN DimProduct c ON a.ProductKey = c.ProductKey
--WHERE b.CalendarYear IN (2007)
GROUP BY c.EnglishProductName
,b.CalendarYear
) Sales
) Rankings
--WHERE [RankOf2007] <= 5
--ORDER BY [SalesAmount] DESC
)
SELECT a.*
,b.DRankOf2007 AS [DRankOf2008]
,b.RankOf2007 AS [RankOf2008]
FROM cte a
LEFT JOIN cte b ON a.Product = b.Product
AND b.CalenderYear = 2008
WHERE a.CalenderYear = 2007
AND a.[RankOf2007] <= 5
Use conditional aggregation in your innermost query (i.e. select both years and sum conditionally for one of the years):
select
p.productkey,
p.englishproductname as product,
ranked.salesamount2007,
ranked.salesamount2008,
ranked.rankof2007,
ranked.rankof2008
from
(
select
productkey,
salesamount2007,
salesamount2008,
rank() over (order by salesamount2007 desc) as rankof2007,
rank() over (order by salesamount2008 desc) as rankof2008
from
(
select
s.productkey,
sum(case when d.calendaryear = 2007 then s.salesamount end) as salesamount2007,
sum(case when d.calendaryear = 2008 then s.salesamount end) as salesamount2008
from factinternetsales s
inner join dimdate d on d.datekey = s.orderdatekey
where d.calendaryear in (2007, 2008)
group by s.productkey
) aggregated
) ranked
join dimproduct p on p.productkey = ranked.productkey
where ranked.rankof2007 <= 5
order by ranked.rankof2007 desc;
For the case there are no rows for a product in 2008, salesamount2008 will be null. In standard SQL we would consider this in the ORDER BY clause:
rank() over (order by salesamount2008 desc nulls last) as rankof2008
But SQL Server doesn't comply with the SQL standard here and doesn't feature NULLS FIRST/LAST in the ORDER BY clause. Fortunately, it sorts nulls last when sorting in descending order, so it implicitly does just what we want here.
By the way: we could do the aggregation and ranking in a single step, but in that case we'd have to repeat the SUM expressions. It's a matter of personal preference, whether to do this in one step (shorter query) or two steps (no repetitive expressions).
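As a rough illustration of the conditional-aggregation approach and the NULL ordering described above, the sketch below uses Python's sqlite3 module with an in-memory database. SQLite is only a stand-in here: like SQL Server it puts NULLs last in a descending sort, and window functions need SQLite 3.25 or later. The sales table, its columns and the amounts are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (product TEXT, yr INTEGER, amount INTEGER);  -- hypothetical
    INSERT INTO sales VALUES
        ('bike', 2007, 300), ('bike', 2008, 100),
        ('helmet', 2007, 200), ('helmet', 2008, 400),
        ('lock', 2007, 100);  -- 'lock' has no 2008 sales
""")

rows = conn.execute("""
    SELECT product, amount2007, amount2008,
           RANK() OVER (ORDER BY amount2007 DESC) AS rank2007,
           RANK() OVER (ORDER BY amount2008 DESC) AS rank2008
    FROM (
        -- one row per product, with one conditional sum per year
        SELECT product,
               SUM(CASE WHEN yr = 2007 THEN amount END) AS amount2007,
               SUM(CASE WHEN yr = 2008 THEN amount END) AS amount2008
        FROM sales
        WHERE yr IN (2007, 2008)
        GROUP BY product
    ) AS aggregated
    ORDER BY rank2007
""").fetchall()

for row in rows:
    print(row)
# ('bike', 300, 100, 1, 2)
# ('helmet', 200, 400, 2, 1)
# ('lock', 100, None, 3, 3)
```

Note that 'lock', which has no 2008 sales, still receives a 2008 rank (the last one) rather than NULL, because its NULL amount simply sorts after the others.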

Getting the last value of a column for an object with many entries

Please consider the below Repair_Table of a car-repair workshop.
I'm trying to get the last "Repair_Status" value for each car within a given day.
The "Repair_Update" column will have a new entry for each change during the car repair process, such as the repair status, the repair name, or the repair team and repair sub-task, which are stored in a different table.
I tried this query but it's not giving me what I need:
select distinct(Car_ID), Repair_Status
from Repair_Table r1
left join (select Car_ID, max(Repair_Update) from Repair_Table
group by Car_ID) r2
on r2.Repair_Update = r1.Repair_Update
where convert(date,Repair_Start) = '20180122'
Window functions are an easy way to do this:
select rt.*
from (select rt.*,
row_number() over (partition by car_id, cast(repair_update as date)
order by repair_update desc
) as seqnum
from Repair_Table rt
where convert(date, Repair_Start) = '20180122'
) rt
where seqnum = 1;
You can remove the where clause if you want this information on multiple dates -- or even on all dates.
Using row_number() function:
select Car_ID, Repair_Status
from (
select Car_ID, Repair_Status,
row_number() over (partition by Car_ID order by Repair_Update desc) as rnk
from Repair_Table
) t
where t.rnk = 1

How to get the records from inner query results with the MAX value

The results are below. I need to get the records (seller and purchaser) with the max count- grouped by purchaser (marked with yellow)
You can use window functions:
with q as (
<your query here>
)
select q.*
from (select q.*,
row_number() over (order by seller desc) as seqnum_s,
row_number() over (order by purchaser desc) as seqnum_p
from q
) q
where seqnum_s = 1 or seqnum_p = 1;
Try this:
SELECT COUNT,seller,purchaser FROM YourTable ORDER BY seller,purchaser DESC
SELECT T2.MaxCount,T2.purchaser,T1.Seller FROM <Yourtable> T1
Inner JOIN
(
Select Max(Count) as MaxCount, purchaser
FROM <Yourtable>
GROUP BY Purchaser
)T2
On T2.Purchaser=T1.Purchaser AND T2.MaxCount=T1.Count
First you select the Seller from <Yourtable>, which will give you a list of all 5 sellers. Then you write another query that selects only the Purchaser and the Max(Count) grouped by Purchaser, which will give you the two yellow-marked lines. Join the two queries on the fields Purchaser and Max(Count), and add the columns from the joined table to your first query.
I can't think of a faster way, but this works pretty fast even with rather large queries. You can further order the fields as needed.

Tricky SQL SELECT Statement

I have a performance issue when selecting data in my project.
There is a table with 3 columns: "id","time" and "group"
The ids are just unique ids as usual.
The time is the creation date of the entry.
The group is there to collect certain entries together.
So the table data may look like this:
ID | TIME | GROUP
------------------------
1 | 20090805 | A
2 | 20090804 | A
3 | 20090804 | B
4 | 20090805 | B
5 | 20090803 | A
6 | 20090802 | B
...and so on.
The task is now to select the "current" entries (their ids) in each group for a given date. That is, for each group find the most recent entry for a given date.
Following preconditions apply:
I do not know the different groups in advance - there may be many different ones changing over time
The selection date may lie "in between" the dates of the entries in the table. Then I have to find the closest one in each group. That is, TIME is less than the selection date but the maximum of those to which this rule applies in a group.
What I currently do is a multi-step process which I would like to change into single SELECT statement:
1) SELECT DISTINCT group FROM table to find the available groups
2) For each group found in 1), SELECT * FROM table WHERE time<selectionDate AND group=loop ORDER BY time DESC
3) Take the first row of each result found in 2)
Obviously this is not optimal.
So I would be very happy if some more experienced SQL expert could help me to find a solution to put these steps in a single statement.
Thank you!
The following will work on SQL Server 2005+ and Oracle 9i+:
WITH groups AS (
SELECT t.group,
MAX(t.time) AS maxtime
FROM TABLE t
GROUP BY t.group)
SELECT t.id,
t.time,
t.group
FROM TABLE t
JOIN groups g ON g.group = t.group AND g.maxtime = t.time
Any database should support:
SELECT t.id,
t.time,
t.group
FROM TABLE t
JOIN (SELECT t.group,
MAX(t.time) AS maxtime
FROM TABLE t
GROUP BY t.group) g ON g.group = t.group AND g.maxtime = t.time
Here's how I would do it in SQL Server:
SELECT * FROM [table] t WHERE id =
(SELECT TOP 1 id FROM [table] WHERE [time] < selectionDate AND [group] = t.[group] ORDER BY [time] DESC)
The solution will vary by database server, since the syntax for TOP queries varies. Basically you are looking for a "top n per group" query, so you can Google that if you want.
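For what it's worth, the three-step process from the question can be collapsed into one statement with ROW_NUMBER. The following is only a sketch under some assumptions: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or later), the table is named entries and the group column is renamed grp to avoid the keyword, and current_entries is a helper invented for the example. The rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entries (id INTEGER PRIMARY KEY, time INTEGER, grp TEXT);
    INSERT INTO entries VALUES
        (1, 20090805, 'A'), (2, 20090804, 'A'), (3, 20090804, 'B'),
        (4, 20090805, 'B'), (5, 20090803, 'A'), (6, 20090802, 'B');
""")

def current_entries(selection_date):
    """Return the most recent entry per group strictly before selection_date."""
    return conn.execute("""
        SELECT id, time, grp
        FROM (
            SELECT id, time, grp,
                   ROW_NUMBER() OVER (PARTITION BY grp ORDER BY time DESC) AS rn
            FROM entries
            WHERE time < ?          -- drop entries on or after the selection date
        ) AS ranked
        WHERE rn = 1                -- newest remaining entry in each group
        ORDER BY grp
    """, (selection_date,)).fetchall()

print(current_entries(20090805))  # [(2, 20090804, 'A'), (3, 20090804, 'B')]
print(current_entries(20090804))  # [(5, 20090803, 'A'), (6, 20090802, 'B')]
```

The groups never have to be known in advance, since PARTITION BY discovers them, and the date filter is applied before the numbering so that each group's row 1 is the closest entry before the selection date.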
Here is a solution in SQL Server. The following will return the top 10 players who hit the most home runs per year since 1990. The key is to calculate the "Home Run Rank" of each player for each year.
select
HRRanks.*
from
(
Select
b.yearID, b.PlayerID, sum(b.Hr) as TotalHR,
rank() over (partition by b.yearID order by sum(b.hr) desc) as HR_Rank
from
Batting b
where
b.yearID > 1990
group by
b.yearID, b.playerID
)
HRRanks
where
HRRanks.HR_Rank <= 10
Here is a solution in Oracle (Top Salespeople per Department)
SELECT deptno, avg_sal
FROM(
SELECT deptno, AVG(sal) avg_sal
FROM emp
GROUP BY deptno
ORDER BY AVG(sal) DESC
)
WHERE ROWNUM <= 10;
Or using analytic functions:
SELECT deptno, avg_sal
FROM (
SELECT deptno, avg_sal, RANK() OVER (ORDER BY avg_sal DESC) rank
FROM
(
SELECT deptno, AVG(sal) avg_sal
FROM emp
GROUP BY deptno
)
)
WHERE rank <= 10;
Or same again, but using DENSE_RANK() instead of RANK()
select * from TABLE where (GROUP, TIME) in (
select GROUP, max(TIME) from TABLE
where TIME < 20090804
group by GROUP
)
Tested with MySQL (but I had to change the table and column names because they are keywords).
SELECT *
FROM TABB T1
QUALIFY ROW_NUMBER() OVER ( PARTITION BY GROUPP,TIMEE order by id desc )=1