Select TOP or GROUP BY ranked data from nested SQL query


I have a table with events and results, and a query like this:
SELECT T2.EventDate, T2.EventPlace, T2.Par1,
T2.Par2, T2.Result, T2.ResultEventDate_sum
FROM
(SELECT TOP 15 EventDate, EventPlace, Par1, Par2,
Result, SUM(Result) OVER (PARTITION BY EventDate) AS ResultEventDate_SUM
FROM
(SELECT EventDate, EventPlace, Par1, Par2, Result,
ROW_NUMBER() OVER (PARTITION BY EventDate ORDER BY Result DESC) AS EventDate_rank
FROM MainTable
WHERE CAST(UserNb AS int) = 103
AND Col1 = 'X' AND Result > 0 AND Col2 LIKE '%Y%'
AND Par1 > 500 AND Par1 <= 700) ranked
WHERE EventDate_rank <= 5 ORDER BY ResultEventDate_sum DESC) T2
The results I get:
EventDate   EventPlace  Par1  Par2  Result  ResultEventDate_SUM
2019-05-26  PLACE nb 1  508   604   51.20   278.11
2019-05-26  PLACE nb 1  508   571   51.68   278.11
2019-05-26  PLACE nb 1  508   249   56.38   278.11
2019-05-26  PLACE nb 1  508   42    59.40   278.11
2019-05-26  PLACE nb 1  508   39    59.45   278.11
2019-06-09  PLACE nb 3  508   449   50.95   217.05
2019-06-09  PLACE nb 3  508   259   54.79   217.05
2019-06-09  PLACE nb 3  508   254   54.89   217.05
2019-06-09  PLACE nb 3  508   178   56.42   217.05
2019-06-16  PLACE nb 4  508   372   51.49   169.56
2019-06-16  PLACE nb 4  508   66    58.51   169.56
2019-06-16  PLACE nb 4  508   20    59.56   169.56
2019-06-02  PLACE nb 2  508   533   50.19   107.46
2019-06-02  PLACE nb 2  508   149   57.27   107.46
I need the best (highest) sums of at most 5 results per event, but only for the 3 best events. So I used SELECT TOP 15 (3 events by 5 results), and the result is correct when every event has 5 results. But when an event has fewer than 5 results, I also get records from a 4th event. How can I modify the query so that it always returns only 3 events, no matter how many results each one has? In this example, the last 2 records (the event with the smallest sum, 107.46) should be cut. Is there a way to achieve this by simplifying the query, without adding more code?
I tried putting COUNT(*) in the first line, but it counts the same way as the ResultEventDate_sum column, so I cannot use it as a condition. Also, when I added e.g. EXISTS, the statement became too big and took too much time.
The expected result is a table with only the 3 events with the best ResultEventDate_sum:
EventDate   EventPlace  Par1  Par2  Result  ResultEventDate_SUM
2019-05-26  PLACE nb 1  508   604   51.20   278.11
2019-05-26  PLACE nb 1  508   571   51.68   278.11
2019-05-26  PLACE nb 1  508   249   56.38   278.11
2019-05-26  PLACE nb 1  508   42    59.40   278.11
2019-05-26  PLACE nb 1  508   39    59.45   278.11
2019-06-09  PLACE nb 3  508   449   50.95   217.05
2019-06-09  PLACE nb 3  508   259   54.79   217.05
2019-06-09  PLACE nb 3  508   254   54.89   217.05
2019-06-09  PLACE nb 3  508   178   56.42   217.05
2019-06-16  PLACE nb 4  508   372   51.49   169.56
2019-06-16  PLACE nb 4  508   66    58.51   169.56
2019-06-16  PLACE nb 4  508   20    59.56   169.56
Thanks in advance for any tips.
UPDATE after tests:
Thanks to all of you. I tested your suggestions in the query, and DENSE_RANK does the job correctly and quite fast. It helped me a lot.

You need one more window function, DENSE_RANK, to rank the events by their sum and keep only the top 3 ranks.
This may help you get the required output.
SELECT EventDate, EventPlace, Par1, Par2, Result, ResultEventDate_sum
FROM (
  SELECT T2.EventDate, T2.EventPlace, T2.Par1,
         T2.Par2, T2.Result, T2.ResultEventDate_sum,
         -- rank events by their sum, highest first; EventDate separates events with equal sums
         DENSE_RANK() OVER (ORDER BY T2.ResultEventDate_sum DESC, T2.EventDate DESC) AS rnk
  FROM
    (SELECT EventDate, EventPlace, Par1, Par2,
            Result, SUM(Result) OVER (PARTITION BY EventDate) AS ResultEventDate_sum
     FROM
       (SELECT EventDate, EventPlace, Par1, Par2, Result,
               ROW_NUMBER() OVER (PARTITION BY EventDate ORDER BY Result DESC) AS EventDate_rank
        FROM MainTable
        WHERE CAST(UserNb AS int) = 103
          AND Col1 = 'X' AND Result > 0 AND Col2 LIKE '%Y%'
          AND Par1 > 500 AND Par1 <= 700) ranked
     WHERE EventDate_rank <= 5) T2
) T1
WHERE rnk <= 3
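As a rough way to check the DENSE_RANK idea outside SQL Server, the sketch below loads the 14 rows from the question's result listing into an in-memory SQLite database through Python's sqlite3 module. SQLite is only a stand-in here (it needs window-function support, version 3.25 or later), the question's WHERE filters are left out because the rows are already filtered, and the variable names are illustrative.

```python
import sqlite3

# Rows copied from the question's first result listing (already filtered).
rows = [
    ("2019-05-26", "PLACE nb 1", 508, 604, 51.20),
    ("2019-05-26", "PLACE nb 1", 508, 571, 51.68),
    ("2019-05-26", "PLACE nb 1", 508, 249, 56.38),
    ("2019-05-26", "PLACE nb 1", 508, 42, 59.40),
    ("2019-05-26", "PLACE nb 1", 508, 39, 59.45),
    ("2019-06-09", "PLACE nb 3", 508, 449, 50.95),
    ("2019-06-09", "PLACE nb 3", 508, 259, 54.79),
    ("2019-06-09", "PLACE nb 3", 508, 254, 54.89),
    ("2019-06-09", "PLACE nb 3", 508, 178, 56.42),
    ("2019-06-16", "PLACE nb 4", 508, 372, 51.49),
    ("2019-06-16", "PLACE nb 4", 508, 66, 58.51),
    ("2019-06-16", "PLACE nb 4", 508, 20, 59.56),
    ("2019-06-02", "PLACE nb 2", 508, 533, 50.19),
    ("2019-06-02", "PLACE nb 2", 508, 149, 57.27),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MainTable "
    "(EventDate TEXT, EventPlace TEXT, Par1 INTEGER, Par2 INTEGER, Result REAL)"
)
conn.executemany("INSERT INTO MainTable VALUES (?, ?, ?, ?, ?)", rows)

query = """
SELECT EventDate, EventPlace, Par1, Par2, Result, ResultEventDate_sum
FROM (
    SELECT T2.*,
           -- one rank per event: highest sum first, EventDate separates equal sums
           DENSE_RANK() OVER (ORDER BY ResultEventDate_sum DESC, EventDate DESC) AS rnk
    FROM (
        SELECT EventDate, EventPlace, Par1, Par2, Result,
               SUM(Result) OVER (PARTITION BY EventDate) AS ResultEventDate_sum
        FROM (
            SELECT EventDate, EventPlace, Par1, Par2, Result,
                   ROW_NUMBER() OVER (PARTITION BY EventDate
                                      ORDER BY Result DESC) AS EventDate_rank
            FROM MainTable
        ) ranked
        WHERE EventDate_rank <= 5      -- at most 5 results per event
    ) T2
) T1
WHERE rnk <= 3                          -- keep only the 3 best events
ORDER BY ResultEventDate_sum DESC, Result
"""

top_events = conn.execute(query).fetchall()
for row in top_events:
    print(row)
```

With this data the event of 2019-06-02 (sum 107.46) drops out, and the three remaining events keep all of their rows regardless of how many results each one has.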

In the following query, I have treated EventPlace as the event ID. It returns the details of the top 3 events (ranked by the SUM of Result per event), with the top 5 rows for each event.
SELECT * FROM
(
  SELECT *, ROW_NUMBER() OVER(PARTITION BY EventPlace ORDER BY EventPlace, Result DESC) RN
  FROM your_table
  WHERE EventPlace IN
  (
    -- Set this number to how many events
    -- you want to see details for
    SELECT TOP 3 EventPlace
    FROM
    (
      SELECT *, ROW_NUMBER() OVER(PARTITION BY EventPlace ORDER BY EventPlace, Result DESC) RN
      FROM your_table
    ) A
    -- Set this number to how many rows per event
    -- should be included in the SUM used for ranking
    WHERE RN <= 5
    GROUP BY EventPlace
    ORDER BY SUM(Result) DESC
  )
) B
WHERE RN <= 5

Related

Summing column that is grouped - SQL

I have a query:
SELECT
  o.date,
  COUNT(o.row_number) FILTER (WHERE o.row_number > 1 AND o.date_ddr IS NOT NULL AND o.telephone_number <> 'Anonymous') AS repeat_calls_24h
FROM
(
  SELECT
    ddr.date,
    ddr.telephone_number,
    ddr.date_ddr,
    ROW_NUMBER() OVER (PARTITION BY ddr.telephone_number ORDER BY ddr.date) AS row_number
  FROM
    table_a ddr
) o
GROUP BY 1
Generating the following table:
date        Repeat calls_24h
17/09/2022  182
18/09/2022  381
19/09/2022  81
20/09/2022  24
21/09/2022  91
22/09/2022  110
23/09/2022  231
What can I add to my query to get a rolling three-day sum (the current day plus the two previous days), as below?
date        Repeat calls_24h  Repeat Calls 3d
17/09/2022  182
18/09/2022  381
19/09/2022  81                644
20/09/2022  24                486
21/09/2022  91                196
22/09/2022  110               225
23/09/2022  231               432
Thanks

We can do it using lag.
select "date"
,"Repeat calls_24h"
,"Repeat calls_24h" + lag("Repeat calls_24h") over(order by "date") + lag("Repeat calls_24h", 2) over(order by "date") as "Repeat Calls 3d"
from t
date        Repeat calls_24h  Repeat Calls 3d
2022-09-17  182               null
2022-09-18  381               null
2022-09-19  81                644
2022-09-20  24                486
2022-09-21  91                196
2022-09-22  110               225
2022-09-23  231               432
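An alternative to chaining lag() calls is a window frame. The sketch below is not the answer's query: it runs against an in-memory SQLite database via Python's sqlite3 module (SQLite 3.25 or later), with a hypothetical table t and column names call_date and repeat_calls_24h, and it assumes exactly one row per day with no gaps, because the frame counts rows rather than calendar days.

```python
import sqlite3

# Daily counts from the question, with ISO dates so that text ordering is chronological.
daily = [
    ("2022-09-17", 182),
    ("2022-09-18", 381),
    ("2022-09-19", 81),
    ("2022-09-20", 24),
    ("2022-09-21", 91),
    ("2022-09-22", 110),
    ("2022-09-23", 231),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (call_date TEXT, repeat_calls_24h INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", daily)

query = """
SELECT call_date,
       repeat_calls_24h,
       -- only report a sum once the frame really holds 3 rows,
       -- which mirrors the NULLs produced by the lag() version
       CASE WHEN COUNT(*) OVER w = 3
            THEN SUM(repeat_calls_24h) OVER w
       END AS repeat_calls_3d
FROM t
WINDOW w AS (ORDER BY call_date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
ORDER BY call_date
"""

rolling = conn.execute(query).fetchall()
for row in rolling:
    print(row)
```

Without the CASE guard, the first two days would show partial sums (182 and 563) instead of NULL; which behaviour is preferable depends on the report.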

how to select a value based on multiple criteria

I'm trying to select some values based on some proprietary data, and I just changed the variables to reference house prices.
I am trying to get the total offers for houses where they were sold at the bid or at the ask price, with offers under 15 and offers * sale price less than 5,000,000.
I then want to get the total number of offers for each neighborhood on each day, but instead I'm getting the total offers across each neighborhood (n1 + n2 + n3 + n4 + n5) across all dates and the total offers in the dataset across all dates.
My current query is this:
SELECT DISTINCT(neighborhood),
DATE(date_of_sale),
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`
WHERE ((offers * accepted_sale_price < 5000000)
AND (offers < 15)
AND (house_bid = accepted_sale_price OR
house_ask = accepted_sale_price))) as bid_ask_off,
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`) as
total_offers,
FROM `big_query.a_table_name.houseprices`
GROUP BY neighborhood, DATE(date_of_sale) LIMIT 100
I am expecting one row per neighborhood and date, with the date repeated throughout as d1, d2, d3, etc., but I am instead receiving the same two totals on every row.
I'm aware that there are some inherent problems with what I'm trying to select / group, but I'm not sure what to google or what tutorials to look at in order to perform this operation.
It's querying quite a bit of data, and I want to keep costs down, as I've already racked up a smallish bill on queries.
Any help or advice would be greatly appreciated, and I hope I've provided enough information.
Here is a sample dataframe.
neighborhood   date_of_sale  offers  accepted_sale_price  house_bid  house_ask
bronx          4/1/2022      3       323                  320        323
manhattan      4/1/2022      4       244                  230        244
manhattan      4/1/2022      8       856                  856        900
queens         4/1/2022      15      110                  110        135
brooklyn       4/2/2022      12      115                  100        115
manhattan      4/2/2022      9       255                  255        275
bronx          4/2/2022      6       330                  300        330
queens         4/2/2022      10      405                  395        405
brooklyn       4/2/2022      4       254                  254        265
staten_island  4/3/2022      2       442                  430        442
staten_island  4/3/2022      13      195                  195        225
bronx          4/3/2022      4       650                  650        690
manhattan      4/3/2022      2       286                  266        286
manhattan      4/3/2022      6       356                  356        400
staten_island  4/4/2022      4       361                  361        401
staten_island  4/4/2022      5       348                  348        399
bronx          4/4/2022      8       397                  340        397
manhattan      4/4/2022      9       333                  333        394
manhattan      4/4/2022      11      392                  325        392

I think that this is what you need.
As we group by neighbourhood, we do not need DISTINCT.
We take SUM(offers) for total_offers directly from the table, and the bids from a subquery that is grouped by neighbourhood and joined back on neighbourhood. The filter compares against accepted_sale_price, the column name used in the sample data.
SELECT
  h.neighborhood,
  DATE(h.date_of_sale) AS date_,
  s.bids AS bid_ask_off,
  SUM(h.offers) AS total_offers
FROM
  `big_query.a_table_name.houseprices` h
LEFT JOIN
  (SELECT
     neighborhood,
     SUM(offers) AS bids
   FROM
     `big_query.a_table_name.houseprices`
   WHERE offers * accepted_sale_price < 5000000
     AND offers < 15
     AND (house_bid = accepted_sale_price OR
          house_ask = accepted_sale_price)
   GROUP BY neighborhood) s
ON h.neighborhood = s.neighborhood
GROUP BY
  h.neighborhood,
  DATE(h.date_of_sale),
  s.bids
LIMIT 100;
Or the following, which changes the initial query more but may be closer to what you need: the bids are grouped by neighbourhood and by day, and joined back on both.
SELECT
  h.neighborhood,
  DATE(h.date_of_sale) AS date_,
  s.bids AS bid_ask_off,
  SUM(h.offers) AS total_offers
FROM
  `big_query.a_table_name.houseprices` h
LEFT JOIN
  (SELECT
     DATE(date_of_sale) AS dos,  -- same day granularity as the outer GROUP BY
     neighborhood,
     SUM(offers) AS bids
   FROM
     `big_query.a_table_name.houseprices`
   WHERE offers * accepted_sale_price < 5000000
     AND offers < 15
     AND (house_bid = accepted_sale_price OR
          house_ask = accepted_sale_price)
   GROUP BY
     neighborhood,
     DATE(date_of_sale)) s
ON h.neighborhood = s.neighborhood
AND DATE(h.date_of_sale) = s.dos
GROUP BY
  h.neighborhood,
  DATE(h.date_of_sale),
  s.bids
LIMIT 100;
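A different technique from the joins above is conditional aggregation, which computes both totals in one pass over the table. The sketch below is only an illustration: it uses an in-memory SQLite table named houseprices loaded with the first nine rows of the sample data, keeps the dates as plain text, and assumes the filter is meant to compare against accepted_sale_price. In BigQuery the CASE expression could equally be written with IF().

```python
import sqlite3

# First nine rows of the question's sample data (dates kept as text).
sample = [
    ("bronx", "4/1/2022", 3, 323, 320, 323),
    ("manhattan", "4/1/2022", 4, 244, 230, 244),
    ("manhattan", "4/1/2022", 8, 856, 856, 900),
    ("queens", "4/1/2022", 15, 110, 110, 135),
    ("brooklyn", "4/2/2022", 12, 115, 100, 115),
    ("manhattan", "4/2/2022", 9, 255, 255, 275),
    ("bronx", "4/2/2022", 6, 330, 300, 330),
    ("queens", "4/2/2022", 10, 405, 395, 405),
    ("brooklyn", "4/2/2022", 4, 254, 254, 265),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE houseprices (neighborhood TEXT, date_of_sale TEXT, offers INTEGER, "
    "accepted_sale_price INTEGER, house_bid INTEGER, house_ask INTEGER)"
)
conn.executemany("INSERT INTO houseprices VALUES (?, ?, ?, ?, ?, ?)", sample)

query = """
SELECT neighborhood,
       date_of_sale,
       -- offers count towards bid_ask_off only when the row passes all filters
       SUM(CASE WHEN offers * accepted_sale_price < 5000000
                 AND offers < 15
                 AND (house_bid = accepted_sale_price
                      OR house_ask = accepted_sale_price)
                THEN offers ELSE 0 END) AS bid_ask_off,
       SUM(offers) AS total_offers
FROM houseprices
GROUP BY neighborhood, date_of_sale
ORDER BY date_of_sale, neighborhood
"""

# Map (neighborhood, date) to (bid_ask_off, total_offers).
totals = {(n, d): (b, t) for n, d, b, t in conn.execute(query)}
for key, value in totals.items():
    print(key, value)
```

In this sample only the queens row of 4/1/2022 fails the filter (15 offers is not below 15), so its bid_ask_off is 0 while its total_offers is 15.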

how to give different row id to sub group in a group in sql query?

I have a bunch of discount schemes for my item table, and each item has different discount schemes. I want to assign a row ID that starts from zero (0) for each item group and changes whenever the DiscountId changes. My table is in the image below.
For example, ItemCode 429 has 7 rows with the same DiscountId 427, so all of them should get row ID 0. When the DiscountId changes (same ItemCode, DiscountId 428), I want the next row ID. When the ItemCode changes, the row ID should start from zero (0) again.
Can anyone help me, please?
My current query is simply "select * from ItemDiscount_md".

Maybe something like this:
Test data:
DECLARE @tbl TABLE(ITEMCode INT, DiscountId INT)
INSERT INTO @tbl
VALUES
(73,419),(73,419),(73,420),(73,420),(73,420),
(429,427),(429,427),(429,427),(429,427),(429,427),
(429,427),(429,427),(429,427),(429,428),(429,428)
Query:
;WITH CTE
AS
(
  SELECT
    DENSE_RANK() OVER(PARTITION BY tbl.ITEMCode
                      ORDER BY DiscountId) AS Rownbr,
    tbl.*
  FROM
    @tbl AS tbl
)
SELECT
  CTE.Rownbr-1 AS RowNbr,
  CTE.DiscountId,
  CTE.ITEMCode
FROM
  CTE
Of course, you can simplify the query by writing this:
SELECT
  (DENSE_RANK() OVER(PARTITION BY tbl.ITEMCode
                     ORDER BY DiscountId))-1 AS Rownbr,
  tbl.*
FROM
  @tbl AS tbl
I just thought it was nicer and more readable with a CTE.
References:
DENSE_RANK
OVER Clause
Using Common Table Expressions
ROW_NUMBER
EDIT
To answer the comment: no, ROW_NUMBER will not return the same counter. This is the output with DENSE_RANK:
0 419 73
0 419 73
1 420 73
1 420 73
1 420 73
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
0 427 429
1 428 429
1 428 429
And this is with ROW_NUMBER:
0 419 73
1 419 73
2 420 73
3 420 73
4 420 73
0 427 429
1 427 429
2 427 429
3 427 429
4 427 429
5 427 429
6 427 429
7 427 429
8 428 429
9 428 429
As you can see, ROW_NUMBER() keeps counting the rows within each ITEMCode, whereas DENSE_RANK() gives every row with the same DiscountId the same number.
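The two outputs above can be reproduced side by side. This is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server (SQLite 3.25 or later); the table name tbl and the output column names are illustrative.

```python
import sqlite3

# Same test data as in the answer: (ITEMCode, DiscountId).
rows = (
    [(73, 419)] * 2
    + [(73, 420)] * 3
    + [(429, 427)] * 8
    + [(429, 428)] * 2
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ITEMCode INTEGER, DiscountId INTEGER)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)", rows)

query = """
SELECT ITEMCode,
       DiscountId,
       -- same DiscountId within an item gets the same zero-based number
       DENSE_RANK() OVER (PARTITION BY ITEMCode ORDER BY DiscountId) - 1 AS dense_rank_id,
       -- every row within an item gets its own zero-based number
       ROW_NUMBER() OVER (PARTITION BY ITEMCode ORDER BY DiscountId) - 1 AS row_number_id
FROM tbl
ORDER BY ITEMCode, DiscountId, row_number_id
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

The dense_rank_id column restarts at 0 for each ITEMCode and only moves on when DiscountId changes, which is the numbering the question asks for.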

A more simplified version of Arion's answer:
DECLARE @tbl TABLE(ITEMCode INT, DiscountId INT)
INSERT INTO @tbl
VALUES
(73,419),
(73,419),
(73,420),
(73,420),
(73,420),
(429,427),
(429,427),
(429,427),
(429,427),
(429,427),
(429,427),
(429,427),
(429,427),
(429,428),
(429,428)
;
SELECT
  (DENSE_RANK() OVER(PARTITION BY ITEMCode ORDER BY DiscountId) - 1) AS Rownbr,
  DiscountId,
  ITEMCode
FROM
  @tbl

If you have data like this (taken from the image) in a #temp table:
itemcode    DiscountId  DayId
----------- ----------- -----------
102         416         2
102         416         3
102         416         4
79          419         3
79          419         1
79          420         2
79          420         1
use ROW_NUMBER() to get the result below:
itemcode    DiscountId  DayId       rowid
----------- ----------- ----------- --------------------
102         416         2           1
102         416         3           2
102         416         4           3
79          419         3           1
79          419         1           2
79          420         2           1
79          420         1           2
SQL example:
select itemcode, DiscountId, DayId
, ROW_NUMBER() over (partition by DiscountId order by DiscountId) as rowid
from #temp

Sqlite substract sums (with group by) with JOIN and duplicates

I previously found the solution to my problem, but unfortunately I lost files on my hard drive and I can't find the statement I managed to produce.
I have 2 tables, T2REQ and T2STOCK. Both have 2 columns (typeID and quantity), and my problem resides in the fact that the SAME typeID can occur multiple times in BOTH tables.
What I'm trying to do is SUM(quantity) grouped by typeID and subtract the values of T2STOCK from T2REQ, but since the same typeID occurs multiple times in both tables, the SUM I get is multiplied by the number of occurrences of that typeID.
Here's a sample of T2REQ (take typeID 11399 for example):
typeID quantity
---------- ----------
34 102900
35 10500
36 3220
37 840
11399 700
563 140
9848 140
11486 28
11688 700
11399 390
4393 130
9840 390
9842 390
11399 390
11483 19.5
11541 780
And this is a sample of T2STOCK table :
typeID quantity
---------- ----------
9842 1921
9848 2400
11399 1700
11475 165
11476 27
11478 28
11481 34
11483 122
11476 2
And this is where I'm at for now. I know that SUM(t2stock.quantity) is affected (multiplied) because the JOIN pairs every t2req row with every t2stock row of the same typeID, but whatever I tried, I'm not doing it in the right order:
SELECT
t2req.typeID, sum(t2req.quantity), sum(t2stock.quantity),
sum(t2req.quantity) - sum(t2stock.quantity) as diff
FROM t2req JOIN t2stock ON t2req.typeID = t2stock.typeID
GROUP BY t2req.typeID
ORDER BY diff DESC;
typeID sum(t2req.quantity) sum(t2stock.quantity) diff
---------- ------------------- --------------------- ----------
563 140 30 110
11541 780 780 0
11486 28 40 -12
11483 19.5 122 -102.5
9840 390 1000 -610
40 260 940 -680
9842 390 1921 -1531
9848 140 2400 -2260
11399 1480 5100 -3620
39 650 7650 -7000
37 1230 116336 -115106
36 28570 967098 -938528
35 33770 2477820 -2444050
34 102900 2798355 -2695455
You can see that SUM(t2req.quantity) for typeID 11399 is correct: 1480.
And you can see that SUM(t2stock.quantity) for typeID 11399 is not correct: 5100 instead of 1700 (5100 is 1700 multiplied by 3, the number of occurrences in t2req).
What would be the best way to avoid these multiplications, caused by the JOIN when a typeID occurs multiple times in both tables, when subtracting the sums?
Sorry for the wall of text, I'm just trying to explain as best as I can since English is not my mother tongue.
Thanks a lot for your help.

You can aggregate each table before the join, so that each typeID appears only once on each side:
SELECT
t2req.typeID,
t2req.quantity,
t2stock.quantity,
t2req.quantity - t2stock.quantity as diff
FROM
(SELECT TypeID, SUM(Quantity) Quantity FROM t2req GROUP BY TypeID) t2req JOIN
(SELECT TypeID, SUM(Quantity) Quantity FROM t2stock GROUP BY TypeID) t2stock
ON t2req.typeID = t2stock.typeID
ORDER BY diff DESC;
Fiddle sample: http://sqlfiddle.com/#!7/06711/5

You can't do this in a single aggregation. Aggregate each table separately, then join the results; a LEFT JOIN keeps required types that have no stock at all:
SELECT
  COALESCE(r.typeID, s.typeID) AS typeID,
  COALESCE(r.quantity, 0) AS req_quantity,
  COALESCE(s.quantity, 0) AS stock_quantity,
  COALESCE(r.quantity, 0) - COALESCE(s.quantity, 0) AS diff
FROM (
  SELECT rr.typeID, SUM(rr.quantity) AS quantity
  FROM t2req rr
  GROUP BY rr.typeID
) r
LEFT JOIN (
  SELECT ss.typeID, SUM(ss.quantity) AS quantity
  FROM t2stock ss
  GROUP BY ss.typeID
) s ON r.typeID = s.typeID
ORDER BY 4 DESC;
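To see the multiplication and the fix on the question's own numbers, here is a small sketch using Python's sqlite3 module with the two sample tables. It assumes, as one reading of the requirement, that every required typeID should be listed even when it has no stock, hence the LEFT JOIN; the variable names are illustrative.

```python
import sqlite3

# Sample rows from the question: (typeID, quantity).
req = [
    (34, 102900), (35, 10500), (36, 3220), (37, 840), (11399, 700), (563, 140),
    (9848, 140), (11486, 28), (11688, 700), (11399, 390), (4393, 130), (9840, 390),
    (9842, 390), (11399, 390), (11483, 19.5), (11541, 780),
]
stock = [
    (9842, 1921), (9848, 2400), (11399, 1700), (11475, 165), (11476, 27),
    (11478, 28), (11481, 34), (11483, 122), (11476, 2),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2req (typeID INTEGER, quantity REAL)")
conn.execute("CREATE TABLE t2stock (typeID INTEGER, quantity REAL)")
conn.executemany("INSERT INTO t2req VALUES (?, ?)", req)
conn.executemany("INSERT INTO t2stock VALUES (?, ?)", stock)

# Joining first: the single stock row for 11399 is repeated once per t2req row.
naive = conn.execute("""
    SELECT SUM(r.quantity), SUM(s.quantity)
    FROM t2req r JOIN t2stock s ON r.typeID = s.typeID
    WHERE r.typeID = 11399
""").fetchone()
print("join, then sum:", naive)

# Aggregating first: one row per typeID on each side, so nothing is multiplied.
fixed = conn.execute("""
    SELECT r.typeID,
           r.quantity                           AS req_quantity,
           COALESCE(s.quantity, 0)              AS stock_quantity,
           r.quantity - COALESCE(s.quantity, 0) AS diff
    FROM (SELECT typeID, SUM(quantity) AS quantity FROM t2req GROUP BY typeID) r
    LEFT JOIN (SELECT typeID, SUM(quantity) AS quantity FROM t2stock GROUP BY typeID) s
           ON r.typeID = s.typeID
    ORDER BY diff DESC
""").fetchall()

by_type = {type_id: (rq, sq, diff) for type_id, rq, sq, diff in fixed}
print("sum, then join:", by_type[11399])
```

For typeID 11399 the first query reports a stock sum of 5100, while the pre-aggregated version reports 1700 and a difference of -220.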

Calculating difference from previous record

May I ask for your help with the following, please?
I am trying to calculate a change from one record to the next in my results. It will probably help if I show you my current query and results ...
SELECT A.AuditDate, COUNT(A.NickName) as [TAccounts],
SUM(IIF((A.CurrGBP > 100 OR A.CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits A
GROUP BY A.AuditDate;
The query gives me these results ...
AuditDate D/M/Y TAccounts Funded
--------------------------------------------
30/12/2011 506 285
04/01/2012 514 287
05/01/2012 514 288
06/01/2012 516 288
09/01/2012 520 289
10/01/2012 522 289
11/01/2012 523 290
12/01/2012 524 290
13/01/2012 526 291
17/01/2012 531 292
18/01/2012 532 292
19/01/2012 533 293
20/01/2012 537 295
Ideally, the results I would like to get, would be similar to the following ...
AuditDate D/M/Y TAccounts TChange Funded FChange
------------------------------------------------------------------------
30/12/2011 506 0 285 0
04/01/2012 514 8 287 2
05/01/2012 514 0 288 1
06/01/2012 516 2 288 0
09/01/2012 520 4 289 1
10/01/2012 522 2 289 0
11/01/2012 523 1 290 1
12/01/2012 524 1 290 0
13/01/2012 526 2 291 1
17/01/2012 531 5 292 1
18/01/2012 532 1 292 0
19/01/2012 533 1 293 1
20/01/2012 537 4 295 2
Looking at the row for '17/01/2012', 'TChange' has a value of 5 as the 'TAccounts' has increased from previous 526 to 531. And the 'FChange' would be based on the 'Funded' field. I guess something to be aware of is the fact that the previous row to this example, is dated '13/01/2012'. What I mean is, there are some days where I have no data (for example over weekends).
I think I need to use a subquery, but I am really struggling to figure out where to start. Could you show me how to get the results I need, please?
I am using MS Access 2010
Many thanks for your time.
Johnny.

Here is one approach you could try...
SELECT B.AuditDate, B.TAccounts,
B.TAccounts -
(SELECT Count(NickName) FROM Audits WHERE AuditDate=B.PrevAuditDate) as TChange,
B.Funded,
B.Funded -
(SELECT Count(*) FROM Audits WHERE AuditDate=B.PrevAuditDate AND (CurrGBP > 100 OR CurrUSD > 100)) as FChange
FROM (
SELECT A.AuditDate,
(SELECT Count(NickName) FROM Audits WHERE AuditDate=A.AuditDate) as TAccounts,
(SELECT Count(*) FROM Audits WHERE AuditDate=A.AuditDate AND (CurrGBP > 100 OR CurrUSD > 100)) as Funded,
(SELECT Max(AuditDate) FROM Audits WHERE AuditDate<A.AuditDate) as PrevAuditDate
FROM
(SELECT DISTINCT AuditDate FROM Audits) AS A) AS B
Instead of using a GROUP BY, I've used subqueries to get TAccounts, Funded and the previous audit date. The main SELECT then uses the previous audit date to compute TAccounts and Funded again for that date, so that the required differences can be calculated. For the first audit date there is no previous date, so those subqueries return 0 and the change equals the full count.
But I would imagine this may be slow to process.
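The "look up the previous date with a subquery" idea can be tried on the question's own figures. The sketch below is not Access SQL: it uses Python's sqlite3 module, assumes the grouped results are available as a table (or saved query) with the hypothetical name daily, stores the dates in ISO format so they sort chronologically, and uses COALESCE where Access would use Nz() to turn the missing previous row into a change of 0.

```python
import sqlite3

# First five rows of the question's grouped results: (AuditDate, TAccounts, Funded).
daily_rows = [
    ("2011-12-30", 506, 285),
    ("2012-01-04", 514, 287),
    ("2012-01-05", 514, 288),
    ("2012-01-06", 516, 288),
    ("2012-01-09", 520, 289),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily (AuditDate TEXT, TAccounts INTEGER, Funded INTEGER)")
conn.executemany("INSERT INTO daily VALUES (?, ?, ?)", daily_rows)

query = """
SELECT d.AuditDate,
       d.TAccounts,
       -- previous row = the latest AuditDate before this one (gaps such as weekends are fine)
       d.TAccounts - COALESCE(
           (SELECT p.TAccounts FROM daily p
            WHERE p.AuditDate = (SELECT MAX(x.AuditDate) FROM daily x
                                 WHERE x.AuditDate < d.AuditDate)),
           d.TAccounts) AS TChange,
       d.Funded,
       d.Funded - COALESCE(
           (SELECT p.Funded FROM daily p
            WHERE p.AuditDate = (SELECT MAX(x.AuditDate) FROM daily x
                                 WHERE x.AuditDate < d.AuditDate)),
           d.Funded) AS FChange
FROM daily d
ORDER BY d.AuditDate
"""

changes = conn.execute(query).fetchall()
for row in changes:
    print(row)
```

Because the previous row is found by date rather than by "yesterday", days without data do not break the calculation.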

It's a shame MS never made this type of thing simple in Access. How many rows are you working with in your report?
If it's under 65K, I would suggest dumping the data into an Excel spreadsheet and using a simple formula to calculate the difference between rows.

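You can try something like the following (the SQL is untested and will require some changes; in particular, Access SQL has no ROW_NUMBER(), so as written it needs an engine with window functions, such as SQL Server):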
SELECT
A.AuditDate,
A.TAccounts,
A.TAccounts - B.TAccounts AS TChange,
A.Funded,
A.Funded - B.Funded AS FChange
FROM
( SELECT
ROW_NUMBER() OVER (ORDER BY AuditDate DESC) AS ROW,
AuditDate,
COUNT(NickName) as [TAccounts],
SUM(IIF((CurrGBP > 100 OR CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits
GROUP BY AuditDate
) A
INNER JOIN
( SELECT
ROW_NUMBER() OVER (ORDER BY AuditDate DESC) AS ROW,
AuditDate,
COUNT(NickName) as [TAccounts],
SUM(IIF((CurrGBP > 100 OR CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits
GROUP BY AuditDate
) B ON B.ROW = A.ROW + 1
