SQL Grouping and dense rank concept - sql

I have a data set that looks like:
cust city hotel_id amount
-------------------------------
A 1 252 3160
B 1 256 1893
C 2 105 2188
D 2 105 3054
E 3 370 6107
F 2 110 3160
G 2 150 1893
H 3 310 2188
I 1 252 3160
J 1 250 4000
K 3 370 5000
L 3 311 1095
I need a query to display the top 3 hotels by revenue (sum of amount) for each city.
Since the same hotel can be booked by several customers in the same city, the amounts have to be summed per hotel to find its total.
Expected output:
city hotel_id amount
---------------------------
1 252 6320
1 250 4000
1 256 1893
2 105 5242
2 110 3160
2 150 1893
3 370 11107
3 310 2188
3 311 1095

SELECT
t.city, t.hotel_id, t.amount
FROM
(
SELECT city, hotel_id, SUM(amount) AS amount,
ROW_NUMBER() OVER (PARTITION BY city ORDER BY SUM(amount) DESC) AS rn
FROM yourTable
GROUP BY city, hotel_id
) t
WHERE t.rn <= 3
ORDER BY t.city, t.amount DESC;
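The ranking approach above can be checked against the sample data with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for the real database (window functions need SQLite 3.25 or later), the table name bookings is an assumption, and the aggregation and ranking are split into two CTEs for readability rather than combined in one SELECT.

```python
import sqlite3

# In-memory SQLite database; "bookings" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookings (cust TEXT, city INTEGER, hotel_id INTEGER, amount INTEGER)"
)
conn.executemany(
    "INSERT INTO bookings VALUES (?, ?, ?, ?)",
    [
        ("A", 1, 252, 3160), ("B", 1, 256, 1893), ("C", 2, 105, 2188),
        ("D", 2, 105, 3054), ("E", 3, 370, 6107), ("F", 2, 110, 3160),
        ("G", 2, 150, 1893), ("H", 3, 310, 2188), ("I", 1, 252, 3160),
        ("J", 1, 250, 4000), ("K", 3, 370, 5000), ("L", 3, 311, 1095),
    ],
)

top3 = conn.execute("""
    WITH totals AS (
        -- Step 1: one row per (city, hotel) with its summed revenue.
        SELECT city, hotel_id, SUM(amount) AS amount
        FROM bookings
        GROUP BY city, hotel_id
    ),
    ranked AS (
        -- Step 2: number the hotels inside each city, highest revenue first.
        SELECT city, hotel_id, amount,
               ROW_NUMBER() OVER (PARTITION BY city ORDER BY amount DESC) AS rn
        FROM totals
    )
    SELECT city, hotel_id, amount
    FROM ranked
    WHERE rn <= 3
    ORDER BY city, amount DESC
""").fetchall()

for row in top3:
    print(row)
```

ROW_NUMBER() always returns at most three hotels per city and breaks ties arbitrarily. Swapping in DENSE_RANK() would instead keep every hotel whose total is among the three highest distinct totals, which can return more than three rows when totals tie.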

To get the total for each hotel, group by both city and hotel_id. The #tmp table then holds all of the data you need, so you only have to select the top 3 entries for each city from it. TOP without ORDER BY returns arbitrary rows, so each branch is ordered by total; note that this approach needs one branch per city.
SELECT city, hotel_id, SUM(amount) AS total INTO #tmp
FROM [table]
GROUP BY city, hotel_id;
SELECT * FROM (SELECT TOP 3 * FROM #tmp WHERE city = 1 ORDER BY total DESC) AS c1
UNION ALL
SELECT * FROM (SELECT TOP 3 * FROM #tmp WHERE city = 2 ORDER BY total DESC) AS c2
UNION ALL
SELECT * FROM (SELECT TOP 3 * FROM #tmp WHERE city = 3 ORDER BY total DESC) AS c3
ORDER BY city, total DESC;

Related

Return 1 row from all groups

https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=9e6f83edf836f4496afb509eb9411d4a
Edited to include sql code:
CREATE TABLE TMP_PRODUCTS (STORE INT, UPC INT, PROMOCODE CHAR(3), FORSALE CHAR(1))
INSERT INTO TMP_PRODUCTS VALUES
(100,1,'123','Y'),
(100,2,'123','Y'),
(100,3,'123','N'),
(100,4,'124','Y'),
(100,5,'124','N'),
(100,6,'124','N'),
(100,7,'125','N'),
(100,8,'125','N'),
(100,9,'125','N');
SELECT
STORE,
UPC,
PROMOCODE,
DENSE_RANK() OVER (PARTITION BY STORE ORDER BY PROMOCODE) AS 'GroupCode'
FROM
TMP_PRODUCTS
WHERE
FORSALE = 'Y'
I need to return all rows where FORSALE='Y' across all groups of PROMOCODE, and also at least 1 row from all groups where FORSALE='N'. In this example all products from group 125 are FORSALE='N', but I need at least 1 row to return. Here is the output I am currently getting:
STORE UPC PROMOCODE GroupCode FORSALE
100 1 123 1 Y
100 2 123 1 Y
100 4 124 2 Y
But here is the ideal output I would like to get:
STORE UPC PROMOCODE GroupCode FORSALE
100 1 123 1 Y
100 2 123 1 Y
100 4 124 2 Y
100 7 125 3 N
It would also be completely acceptable to return 1 row from PROMOCODE 123 and 124 even though they already have some items that are FORSALE='Y'. So this would also be acceptable outcome:
STORE UPC PROMOCODE GroupCode FORSALE
100 1 123 1 Y
100 2 123 1 Y
100 3 123 1 N
100 4 124 2 Y
100 5 124 2 N
100 7 125 3 N
You can do that with an additional ROW_NUMBER() window function, so that one row from each group is always included regardless of Y/N:
select STORE, UPC, PROMOCODE, Dense_Rank() over (partition by STORE order by PROMOCODE) GROUPCODE, FORSALE
from (
select * , Row_Number() over(partition by STORE, PROMOCODE order by UPC) rn
from TMP_PRODUCTS
)x
where FORSALE = 'Y' or rn=1
If I understand correctly, the logic you want is:
SELECT STORE, UPC, PROMOCODE,
DENSE_RANK() OVER (PARTITION BY STORE ORDER BY PROMOCODE) AS GroupCode
FROM (SELECT P.*,
ROW_NUMBER() OVER (PARTITION BY STORE, PROMOCODE, FORSALE ORDER BY (SELECT NULL)) as seqnum
FROM TMP_PRODUCTS P
) P
WHERE FORSALE = 'Y' OR seqnum = 1;
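To compare the two answers, the following sketch runs both partitioning choices against the sample rows. It assumes SQLite (via Python's sqlite3, version 3.25 or later) as a stand-in for SQL Server, orders the row numbers by UPC in both cases so the result is deterministic, and uses a hypothetical helper named pick_rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TMP_PRODUCTS (STORE INT, UPC INT, PROMOCODE CHAR(3), FORSALE CHAR(1))"
)
conn.executemany(
    "INSERT INTO TMP_PRODUCTS VALUES (?, ?, ?, ?)",
    [
        (100, 1, "123", "Y"), (100, 2, "123", "Y"), (100, 3, "123", "N"),
        (100, 4, "124", "Y"), (100, 5, "124", "N"), (100, 6, "124", "N"),
        (100, 7, "125", "N"), (100, 8, "125", "N"), (100, 9, "125", "N"),
    ],
)


def pick_rows(partition_cols):
    """Keep every FORSALE='Y' row plus the first row (by UPC) of each partition."""
    sql = f"""
        SELECT STORE, UPC, PROMOCODE,
               DENSE_RANK() OVER (PARTITION BY STORE ORDER BY PROMOCODE) AS GroupCode,
               FORSALE
        FROM (
            SELECT p.*,
                   ROW_NUMBER() OVER (PARTITION BY {partition_cols} ORDER BY UPC) AS rn
            FROM TMP_PRODUCTS AS p
        ) AS x
        WHERE FORSALE = 'Y' OR rn = 1
        ORDER BY UPC
    """
    return conn.execute(sql).fetchall()


# One guaranteed row per promo code (first answer).
per_promo = pick_rows("STORE, PROMOCODE")
# One guaranteed row per promo code and FORSALE value (second answer).
per_promo_and_flag = pick_rows("STORE, PROMOCODE, FORSALE")

print(per_promo)
print(per_promo_and_flag)
```

Partitioning by PROMOCODE alone reproduces the "ideal" output, while adding FORSALE to the partition reproduces the "also acceptable" output, because each promo code then contributes its first 'N' row as well.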

How to use this in sql -- > max(sum (paid * quantity )) to solve a query

How do I get the highest-value order of each customer?
select num, max(sum(paid*quantity))
from orders join
pizza
using (order#)
group by customer#;
table
num orderN price
-------- --- -------
1 109 30
1 118 25
3 101 30
3 115 27
4 107 23
5 100 17
5 129 16
Required output:
num Pnum price
-------- --- -------
1 109 30
3 101 30
4 107 23
5 100 17
You want to select the record having the highest price in each group of nums.
If your RDBMS supports window functions, that's straightforward with ROW_NUMBER():
SELECT num, pnum, price
FROM (
SELECT t.*, ROW_NUMBER() OVER(PARTITION BY num ORDER BY price DESC) rn
FROM mytable t
) x
WHERE rn = 1
Otherwise, you can use a NOT EXISTS condition with a correlated subquery to ensure that the selected record is the one with the highest price for the current num:
SELECT num, pnum, price
FROM mytable t
WHERE NOT EXISTS (
SELECT 1 FROM mytable t1 WHERE t1.num = t.num AND t1.price > t.price
)
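The two queries agree on the sample data but behave differently when the highest price is tied within a num. The sketch below illustrates this, assuming SQLite (through Python's sqlite3, version 3.25 or later) in place of the asker's database; the extra tied row (4, 130, 23) is invented purely for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num INTEGER, pnum INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 109, 30), (1, 118, 25), (3, 101, 30), (3, 115, 27),
     (4, 107, 23), (5, 100, 17), (5, 129, 16)],
)

ROW_NUMBER_SQL = """
    SELECT num, pnum, price
    FROM (
        SELECT t.*, ROW_NUMBER() OVER (PARTITION BY num ORDER BY price DESC) AS rn
        FROM mytable AS t
    ) AS x
    WHERE rn = 1
    ORDER BY num
"""

NOT_EXISTS_SQL = """
    SELECT num, pnum, price
    FROM mytable AS t
    WHERE NOT EXISTS (
        SELECT 1 FROM mytable AS t1 WHERE t1.num = t.num AND t1.price > t.price
    )
    ORDER BY num, pnum
"""

# On the original data both approaches return the same four rows.
by_row_number = conn.execute(ROW_NUMBER_SQL).fetchall()
by_not_exists = conn.execute(NOT_EXISTS_SQL).fetchall()

# Hypothetical extra row that ties with (4, 107, 23) on price.
conn.execute("INSERT INTO mytable VALUES (4, 130, 23)")
tied_row_number = conn.execute(ROW_NUMBER_SQL).fetchall()  # still one row per num
tied_not_exists = conn.execute(NOT_EXISTS_SQL).fetchall()  # both tied rows for num 4

print(len(tied_row_number), len(tied_not_exists))
```

With a tie, ROW_NUMBER() keeps exactly one row per num (which of the tied rows is kept is arbitrary unless the ORDER BY adds a tie-breaker), whereas NOT EXISTS returns every row that shares the maximum price.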

T-SQL: Row_number() group by

I am using SQL Server 2008 R2 and have a structure as below:
create table #temp( deptid int, regionid int)
insert into #temp
select 15000, 50
union
select 15100, 51
union
select 15200, 50
union
select 15300, 52
union
select 15400, 50
union
select 15500, 51
union
select 15600, 52
select deptid, regionid, RANK() OVER(PARTITION BY regionid ORDER BY deptid) AS 'RANK',
ROW_NUMBER() OVER(PARTITION BY regionid ORDER BY deptid) AS 'ROW_NUMBER',
DENSE_RANK() OVER(PARTITION BY regionid ORDER BY deptid) AS 'DENSE_RANK'
from #temp
drop table #temp
And output currently is as below:
deptid regionid RANK ROW_NUMBER DENSE_RANK
--------------------------------------------------
15000 50 1 1 1
15200 50 2 2 2
15400 50 3 3 3
15100 51 1 1 1
15500 51 2 2 2
15300 52 1 1 1
15600 52 2 2 2
However, I need to number the rows by regionid as a group rather than row by row, so that every row with the same regionid gets the same number. To explain better, below is my desired result set.
deptid regionid RN
-----------------------
15000 50 1
15200 50 1
15400 50 1
15100 51 2
15500 51 2
15300 52 3
15600 52 3
Please let me know if my question is unclear. Thanks.
Use dense_rank() over (order by regionid) to get the expected result.
select deptid, regionid,
DENSE_RANK() OVER( ORDER BY regionid) AS 'DENSE_RANK'
from #temp
A PARTITION BY inside a rank/row_number window function restarts the numbering within each partition. To number the regionids themselves, leave out the partition and simply order by regionid.
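As a quick check of this answer, here is a sketch that loads the sample rows and applies DENSE_RANK() ordered by regionid with no partition. It assumes SQLite (Python's sqlite3, version 3.25 or later) rather than SQL Server 2008 R2, and dept_region is a hypothetical name standing in for the #temp table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "dept_region" stands in for the #temp table from the question.
conn.execute("CREATE TABLE dept_region (deptid INTEGER, regionid INTEGER)")
conn.executemany(
    "INSERT INTO dept_region VALUES (?, ?)",
    [(15000, 50), (15100, 51), (15200, 50), (15300, 52),
     (15400, 50), (15500, 51), (15600, 52)],
)

rows = conn.execute("""
    SELECT deptid, regionid,
           DENSE_RANK() OVER (ORDER BY regionid) AS rn  -- no PARTITION BY
    FROM dept_region
    ORDER BY regionid, deptid
""").fetchall()

for row in rows:
    print(row)
```

Because the whole table is a single window, rows with equal regionid are peers and share a rank, and DENSE_RANK() leaves no gaps between consecutive regions.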

sum the values in column for same date and id

I want to add up the values in the cost and amt columns when there are rows with flag 1 and flag 2 for the same person id on the same date. Please help. Thank you. The columns are:
id date cost amt flag
455 05/25/2013 150 110 1
455 05/25/2013 20 45 2
456 08/17/2013 140 60 1
456 08/17/2013 15 20 2
457 09/28/2013 135 10 1
457 09/28/2013 8 40 2
458 11/09/2013 10 30 1
output should be:
id date cost amt flag
455 05/25/2013 170 155 1
456 08/17/2013 155 80 1
457 09/28/2013 143 50 1
458 11/09/2013 10 30 1
Just for diversity, check out my solution. It uses over (partition by ) for calculation and distinct for filtering out the duplicates.
select distinct o.ID, o.Date,
SUM(o.COST) OVER(PARTITION BY o.ID, o.Date) as cost
,SUM(o.AMT) OVER(PARTITION BY o.ID, o.Date) as amt
,MIN(FLAG) OVER(PARTITION BY o.ID, o.Date) as flag
from orders o
order by o.ID, o.Date
Not really sure what you want to do with flag, but you need GROUP BY like:
SELECT id, date, SUM(cost), Sum(amt), 1 as flag
FROM yourTable
GROUP BY id,date
SELECT ID, DATE , SUM(COST), SUM(AMT), MIN(Flag)
FROM yourTable
GROUP BY ID, DATE
If you only want ids that have both flag 1 and flag 2 on the same date, filter the groups with HAVING. Note that this excludes id 458 from the sample data, which only has a flag 1 row:
SELECT id, date, SUM(cost), SUM(amt), MIN(flag) AS flag
FROM yourtable
WHERE flag IN (1, 2)
GROUP BY id, date
HAVING COUNT(DISTINCT flag) = 2
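The plain GROUP BY answers can be verified against the sample rows with the sketch below. It assumes SQLite through Python's sqlite3 module and a table named orders; the second query is an illustrative variant (not taken from the answers verbatim) that keeps only id/date groups containing both flags, using HAVING.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER, date TEXT, cost INTEGER, amt INTEGER, flag INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?, ?)",
    [
        (455, "05/25/2013", 150, 110, 1), (455, "05/25/2013", 20, 45, 2),
        (456, "08/17/2013", 140, 60, 1), (456, "08/17/2013", 15, 20, 2),
        (457, "09/28/2013", 135, 10, 1), (457, "09/28/2013", 8, 40, 2),
        (458, "11/09/2013", 10, 30, 1),
    ],
)

# One row per id and date, with cost and amt summed over all flags.
totals = conn.execute("""
    SELECT id, date, SUM(cost) AS total_cost, SUM(amt) AS total_amt, MIN(flag) AS flag
    FROM orders
    GROUP BY id, date
    ORDER BY id, date
""").fetchall()

# Variant: only groups that contain both flag 1 and flag 2.
both_flags = conn.execute("""
    SELECT id, date, SUM(cost) AS total_cost, SUM(amt) AS total_amt, MIN(flag) AS flag
    FROM orders
    WHERE flag IN (1, 2)
    GROUP BY id, date
    HAVING COUNT(DISTINCT flag) = 2
    ORDER BY id, date
""").fetchall()

print(totals)
print(both_flags)
```

The first query matches the requested output, including id 458, which has only a flag 1 row. The HAVING variant drops that id, so it is only appropriate if single-flag rows should be excluded.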

Pivot SQL with Rank

Basically I have the following query and I am trying to keep only the unique ranks from it:
WITH numbered_rows
as (
SELECT Claim,
reserve,
time,
RANK() OVER (PARTITION BY Claim ORDER BY time asc) as 'Rank'
FROM (
SELECT cc.Claim,
MAX(csd.time) as time,
csd.reserve
FROM ClaimData csd WITH (NOLOCK)
JOIN Core cc WITH (NOLOCK)
on cc.ClaimID = csd.ClaimID
GROUP BY cc.Claim, csd.Reserve
) as t
)
select *
from numbered_rows cur, numbered_rows prev
where cur.Claim= prev.Claim
and cur.Rank = prev.Rank -1
The results set I get is the following:
Claim reserve Time Rank Claim reserve Time Rank
--------------------------------------------------------------------
11 0 12/10/2012 1 11 15000 5/30/2013 2
34 2000 1/21/2013 1 34 750 1/31/2013 2
34 750 1/31/2013 2 34 0 3/31/2013 3
07 800000 5/9/2013 1 07 0 5/10/2013 2
But I only want to see the following (with the Claim 34 Rank 2 row removed because it's not the highest):
Claim reserve Time Rank Claim reserve Time Rank
--------------------------------------------------------------------
11 0 12/10/2012 1 11 15000 5/30/2013 2
34 750 1/31/2013 2 34 0 3/31/2013 3
07 800000 5/9/2013 1 07 0 5/10/2013 2
I think you can do this by reversing your logic: order by time DESC, switch cur and prev in your final select, change -1 to +1, and then limit prev.Rank to 1. That ensures you only include the latest 2 results for each claim:
WITH numbered_rows AS
( SELECT Claim,
reserve,
time,
[Rank] = RANK() OVER (PARTITION BY Claim ORDER BY time DESC)
FROM ( SELECT cc.Claim,
[Time] = MAX(csd.time),
csd.reserve
FROM ClaimData AS csd WITH (NOLOCK)
INNER JOIN Core AS cc WITH (NOLOCK)
ON cc.ClaimID = csd.ClaimID
GROUP BY cc.Claim, csd.Reserve
) t
)
SELECT *
FROM numbered_rows AS prev
INNER JOIN numbered_rows AS cur
ON cur.Claim= prev.Claim
AND cur.Rank = prev.Rank + 1
WHERE prev.Rank = 1;
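The idea of ranking by time descending and pairing rank 1 with rank 2 can be tried out on data reconstructed from the result set in the question. The sketch below makes several assumptions: SQLite (Python's sqlite3, version 3.25 or later) replaces SQL Server, a single hypothetical table claim_history holds the already-aggregated rows instead of the ClaimData/Core join, dates are stored in ISO format so they sort correctly as text, and the aliases latest and earlier correspond to prev (rank 1) and cur (rank 2) in the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical pre-aggregated table: one row per claim and reserve change.
conn.execute("CREATE TABLE claim_history (Claim TEXT, reserve INTEGER, time TEXT)")
conn.executemany(
    "INSERT INTO claim_history VALUES (?, ?, ?)",
    [
        ("11", 0, "2012-12-10"), ("11", 15000, "2013-05-30"),
        ("34", 2000, "2013-01-21"), ("34", 750, "2013-01-31"), ("34", 0, "2013-03-31"),
        ("07", 800000, "2013-05-09"), ("07", 0, "2013-05-10"),
    ],
)

pairs = conn.execute("""
    WITH numbered_rows AS (
        SELECT Claim, reserve, time,
               -- Rank 1 is the most recent row of each claim.
               RANK() OVER (PARTITION BY Claim ORDER BY time DESC) AS rnk
        FROM claim_history
    )
    SELECT earlier.Claim, earlier.reserve, earlier.time,
           latest.reserve, latest.time
    FROM numbered_rows AS latest
    INNER JOIN numbered_rows AS earlier
            ON earlier.Claim = latest.Claim
           AND earlier.rnk = latest.rnk + 1
    WHERE latest.rnk = 1          -- keep only the newest pair per claim
    ORDER BY earlier.Claim
""").fetchall()

for row in pairs:
    print(row)
```

Each claim yields a single row pairing its second-latest reserve with its latest one, so the older 2000 to 750 transition for claim 34 no longer appears.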