Calculating difference from previous record - sql

May I ask for your help with the following please ?
I am trying to calculate a change from one record to the next in my results. It will probably help if I show you my current query and results ...
SELECT A.AuditDate, COUNT(A.NickName) as [TAccounts],
SUM(IIF((A.CurrGBP > 100 OR A.CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits A
GROUP BY A.AuditDate;
The query gives me these results ...
AuditDate D/M/Y TAccounts Funded
--------------------------------------------
30/12/2011 506 285
04/01/2012 514 287
05/01/2012 514 288
06/01/2012 516 288
09/01/2012 520 289
10/01/2012 522 289
11/01/2012 523 290
12/01/2012 524 290
13/01/2012 526 291
17/01/2012 531 292
18/01/2012 532 292
19/01/2012 533 293
20/01/2012 537 295
Ideally, the results I would like to get, would be similar to the following ...
AuditDate D/M/Y TAccounts TChange Funded FChange
------------------------------------------------------------------------
30/12/2011 506 0 285 0
04/01/2012 514 8 287 2
05/01/2012 514 0 288 1
06/01/2012 516 2 288 0
09/01/2012 520 4 289 1
10/01/2012 522 2 289 0
11/01/2012 523 1 290 1
12/01/2012 524 1 290 0
13/01/2012 526 2 291 1
17/01/2012 531 5 292 1
18/01/2012 532 1 292 0
19/01/2012 533 1 293 1
20/01/2012 537 4 295 2
Looking at the row for '17/01/2012', 'TChange' has a value of 5 as the 'TAccounts' has increased from previous 526 to 531. And the 'FChange' would be based on the 'Funded' field. I guess something to be aware of is the fact that the previous row to this example, is dated '13/01/2012'. What I mean is, there are some days where I have no data (for example over weekends).
I think I need to use a SubQuery but I am really struggling to figure out where to start. Could you show me how to get the results I need please ?
I am using MS Access 2010
Many thanks for your time.
Johnny.

Here is one approach you could try...
SELECT B.AuditDate,B.TAccounts,
B.TAccount -
(SELECT Count(NickName) FROM Audits WHERE AuditDate=B.PrevAuditDate) as TChange,
B.Funded -
(SELECT Count(*) FROM Audits WHERE AuditDate=B.PrevAuditDate AND (CurrGBP > 100 OR CurrUSD > 100)) as FChange
FROM (
SELECT A.AuditDate,
(SELECT Count(NickName) FROM Audits WHERE AuditDate=A.AuditDate) as TAccounts,
(SELECT Count(*) FROM Audits WHERE (CurrGBP > 100 OR CurrUSD > 100)) as Funded,
(SELECT Max(AuditDate) FROM Audits WHERE AuditDate<A.AuditDate) as PrevAuditDate
FROM
(SELECT DISTINCT AuditDate FROM Audits) AS A) AS B
Instead of using a Group By I've used subquerys to get both TAccounts and Funded, as well as the Previous Audit Date, which is then used on the main SELECT statement to get TAccounts and Funded again but this time for the previous date, so that any required calculation can be done against them.
But I would imagine this may be slow to process

It's a shame MS never made this type of thing simple in Access, how many rows are you working with on your report?
If it's under 65K then I would suggest dumping the data on to an Excel spreadsheet and using a simple formula to calculate the different between rows.

You can try something like the following (sql is untested and will require some changes)
SELECT
A.AuditDate,
A.TAccounts,
A.TAccounts - B.TAccounts AS TChange,
A.Funded,
A.Funded - B.Funded AS FChange
FROM
( SELECT
ROW_NUMBER() OVER (ORDER BY AuditDate DESC) AS ROW,
AuditDate,
COUNT(NickName) as [TAccounts],
SUM(IIF((CurrGBP > 100 OR CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits
GROUP BY AuditDate
) A
INNER JOIN
( SELECT
ROW_NUMBER() OVER (ORDER BY AuditDate DESC) AS ROW,
AuditDate,
COUNT(NickName) as [TAccounts],
SUM(IIF((CurrGBP > 100 OR CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits
GROUP BY AuditDate
) B ON B.ROW = A.ROW + 1

Related

Summing column that is grouped - SQL

I have a query:
SELECT
date,
COUNT(o.row_number)FILTER (WHERE o.row_number > 1 AND date_ddr IS NOT NULL AND telephone_number <> 'Anonymous' ) repeat_calls_24h
(
SELECT
telephone_number,
date_ddr,
ROW_NUMBER() OVER(PARTITION BY ddr.telephone_number ORDER BY ddr.date) row_number,
FROM
table_a
)o
GROUP BY 1
Generating the following table:
date
Repeat calls_24h
17/09/2022
182
18/09/2022
381
19/09/2022
81
20/09/2022
24
21/09/2022
91
22/09/2022
110
23/09/2022
231
What can I add to my query to provide a sum of the previous three days as below?:
date
Repeat calls_24h
Repeat Calls 3d
17/09/2022
182
18/09/2022
381
19/09/2022
81
644
20/09/2022
24
486
21/09/2022
91
196
22/09/2022
110
225
23/09/2022
231
432
Thanks
We can do it using lag.
select "date"
,"Repeat calls_24h"
,"Repeat calls_24h" + lag("Repeat calls_24h") over(order by "date") + lag("Repeat calls_24h", 2) over(order by "date") as "Repeat Calls 3d"
from t
date
Repeat calls_24h
Repeat Calls 3d
2022-09-17
182
null
2022-09-18
381
null
2022-09-19
81
644
2022-09-20
24
486
2022-09-21
91
196
2022-09-22
110
225
2022-09-23
231
432
Fiddle

how to select a value based on multiple criteria

I'm trying to select some values based on some proprietary data, and I just changed the variables to reference house prices.
I am trying to get the total offers for houses where they were sold at the bid or at the ask price, with offers under 15 and offers * sale price less than 5,000,000.
I then want to get the total number of offers for each neighborhood on each day, but instead I'm getting the total offers across each neighborhood (n1 + n2 + n3 + n4 + n5) across all dates and the total offers in the dataset across all dates.
My current query is this:
SELECT DISTINCT(neighborhood),
DATE(date_of_sale),
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`
WHERE ((offers * accepted_sale_price < 5000000)
AND (offers < 15)
AND (house_bid = sale_price OR
house_ask = sale_price))) as bid_ask_off,
(SELECT SUM(offers)
FROM `big_query.a_table_name.houseprices`) as
total_offers,
FROM `big_query.a_table_name.houseprices`
GROUP BY neighborhood, DATE(date_of_sale) LIMIT 100
Which I am expecting a result like, with date being repeated throughout as d1, d2, d3, etc.:
but am instead receiving
I'm aware that there are some inherent problems with what I'm trying to select / group, but I'm not sure what to google or what tutorials to look at in order to perform this operation.
It's querying quite a bit of data, and I want to keep costs down, as I've already racked up a smallish bill on queries.
Any help or advice would be greatly appreciated, and I hope I've provided enough information.
Here is a sample dataframe.
neighborhood date_of_sale offers accepted_sale_price house_bid house_ask
bronx 4/1/2022 3 323 320 323
manhattan 4/1/2022 4 244 230 244
manhattan 4/1/2022 8 856 856 900
queens 4/1/2022 15 110 110 135
brooklyn 4/2/2022 12 115 100 115
manhattan 4/2/2022 9 255 255 275
bronx 4/2/2022 6 330 300 330
queens 4/2/2022 10 405 395 405
brooklyn 4/2/2022 4 254 254 265
staten_island 4/3/2022 2 442 430 442
staten_island 4/3/2022 13 195 195 225
bronx 4/3/2022 4 650 650 690
manhattan 4/3/2022 2 286 266 286
manhattan 4/3/2022 6 356 356 400
staten_island 4/4/2022 4 361 361 401
staten_island 4/4/2022 5 348 348 399
bronx 4/4/2022 8 397 340 397
manhattan 4/4/2022 9 333 333 394
manhattan 4/4/2022 11 392 325 392
I think that this is what you need.
As we group by neighbourhood we do not need DISTINCT.
We take sum(offers) for total_offers directly from the table and bids from a sub-query which we join to so that it is grouped by neighbourhood.
SELECT
h.neighborhood,
DATE(h.date_of_sale) AS date_,
s.bids AS bid_ask_off,
SUM(h.offers) AS total_offers,
FROM
`big_query.a_table_name.houseprices` h
LEFT JOIN
(SELECT
neighborhood,
SUM(offers) AS bids
FROM
`big_query.a_table_name.houseprices`
WHERE offers * accepted_sale_price < 5000000
AND offers < 15
AND (house_bid = sale_price OR
house_ask = sale_price)
GROUP BY neighborhood) s
ON h.neighborhood = s.neighborhood
GROUP BY
h.neighborhood,
DATE(date_of_sale),
s.bids
LIMIT 100;
Or the following which modifies more the initial query but may be more like what you need.
SELECT
h.neighborhood,
DATE(h.date_of_sale) AS date_,
s.bids AS bid_ask_off,
SUM(h.offers) AS total_offers,
FROM
`big_query.a_table_name.houseprices` h
LEFT JOIN
(SELECT
date_of_sale dos,
neighborhood,
SUM(offers) AS bids
FROM
`big_query.a_table_name.houseprices`
WHERE offers * accepted_sale_price < 5000000
AND offers < 15
AND (house_bid = sale_price OR
house_ask = sale_price)
GROUP BY
neighborhood,
date_of_sale) s
ON h.neighborhood = s.neighborhood
AND h.date_of_sale = s.dos
GROUP BY
h.neighborhood,
DATE(date_of_sale),
s.bids
LIMIT 100;

Select TOP or GROUP BY ranked data from nested SQL query

I have a table with events and results and query like this:
SELECT T2.EventDate, T2.EventPlace, T2.Par1,
T2.Par2, T2.Result, T2.ResultEventDate_sum
FROM
(SELECT TOP 15 EventDate, EventPlace, Par1, Par2,
Result, SUM(Result) OVER (PARTITION BY EventDate) AS ResultEventDate_SUM
FROM
(SELECT EventDate, EventPlace, Par1, Par2, Result,
ROW_NUMBER() OVER (PARTITION BY EventDate ORDER BY Result DESC) AS EventDate_rank
FROM MainTable
WHERE AND CAST(UserNb AS int) = 103
AND Col1 = 'X' AND Result > 0 AND Col2 LIKE '%Y%'
AND Par1 > 500 AND Par1 <= 700) ranked
WHERE EventDate_rank <= 5 ORDER BY ResultEventDate_sum DESC) T2
The results I get:
EventDate EventPlace Par1 Par2 Result ResultEventDate_SUM
2019-05-26 PLACE nb 1 508 604 51.20 278.11
2019-05-26 PLACE nb 1 508 571 51.68 278.11
2019-05-26 PLACE nb 1 508 249 56.38 278.11
2019-05-26 PLACE nb 1 508 42 59.40 278.11
2019-05-26 PLACE nb 1 508 39 59.45 278.11
2019-06-09 PLACE nb 3 508 449 50.95 217.05
2019-06-09 PLACE nb 3 508 259 54.79 217.05
2019-06-09 PLACE nb 3 508 254 54.89 217.05
2019-06-09 PLACE nb 3 508 178 56.42 217.05
2019-06-16 PLACE nb 4 508 372 51.49 169.56
2019-06-16 PLACE nb 4 508 66 58.51 169.56
2019-06-16 PLACE nb 4 508 20 59.56 169.56
2019-06-02 PLACE nb 2 508 533 50.19 107.46
2019-06-02 PLACE nb 2 508 149 57.27 107.46
I need to get list of best (highest) sum of max 5 result from each event, but take only 3 best events. So I put SELECT TOP 15 (3 events by 5 results) and the query results is ok when there is a 5 results by each event. But if there is less then 5 results for each event I get also records from 4 event. How to modify query to be sure there will be only 3 events no matter how many results are for each one? In this example cut last 2 records (1 event with the smallest result 107.46). Is there a way to achieve this by simplify query without adding another code?
I tried to put COUNT(*) in the first line, but it counts the same like col resulteventdate, so I cannot use it as condition. Also If I added for ex. EXISTS, the statement is too big and take to much time.
Expected results is a table with only 3 events with best ResultEventDate_sum:
EventDate EventPlace Par1 Par2 Result ResultEventDate_SUM
2019-05-26 PLACE nb 1 508 604 51.20 278.11
2019-05-26 PLACE nb 1 508 571 51.68 278.11
2019-05-26 PLACE nb 1 508 249 56.38 278.11
2019-05-26 PLACE nb 1 508 42 59.40 278.11
2019-05-26 PLACE nb 1 508 39 59.45 278.11
2019-06-09 PLACE nb 3 508 449 50.95 217.05
2019-06-09 PLACE nb 3 508 259 54.79 217.05
2019-06-09 PLACE nb 3 508 254 54.89 217.05
2019-06-09 PLACE nb 3 508 178 56.42 217.05
2019-06-16 PLACE nb 4 508 372 51.49 169.56
2019-06-16 PLACE nb 4 508 66 58.51 169.56
2019-06-16 PLACE nb 4 508 20 59.56 169.56
Thanks in advance for any tips.
UPDATE after tests:
Thanks to all of you. I tested your propositions in the query and Dense_Rank do the job correctly and quite fast. Helped me a lot.
You need to use one more window function i.e. Dense_Rank.
This may help you get required output.
select EventDate,EventPlace,Par1,Par2,Result,ResultEventDate_sum from(
SELECT T2.EventDate, T2.EventPlace, T2.Par1,
T2.Par2, T2.Result, T2.ResultEventDate_sum ,
DENSE_RANK()over(order by ResultEventDate_sum,T2.EventDate desc)rnk
FROM
(SELECT EventDate, EventPlace, Par1, Par2,
Result, SUM(Result) OVER (PARTITION BY EventDate) AS ResultEventDate_SUM
FROM
(SELECT EventDate, EventPlace, Par1, Par2, Result,
ROW_NUMBER() OVER (PARTITION BY EventDate ORDER BY Result DESC) AS
EventDate_rank
FROM MainTable
WHERE AND CAST(UserNb AS int) = 103
AND Col1 = 'X' AND Result > 0 AND Col2 LIKE '%Y%'
AND Par1 > 500 AND Par1 <= 700) ranked
WHERE EventDate_rank <= 5 ) T2
)T1
WHERE RNK<=3
In this following query, I have considered EventPlace as Event_ID. This query will return TOP 3 Event (Based on SUM of Result per Event) Details (TOP 5 Rows for each Event)
SELECT * FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY EventPlace ORDER BY EventPlace,Result DESC) RN
FROM your_table
WHERE EventPlace IN
(
-- You can set any number based on
-- How many event details you wants to see
SELECT TOP 3 EventPlace
FROM
(
SELECT *, ROW_NUMBER() OVER(PARTITION BY EventPlace ORDER BY EventPlace,Result DESC) RN
FROM your_table
)A
-- You can set any number based on
-- How many row's result you want to SUM for checking
WHERE RN <= 5
GROUP BY EventPlace
ORDER BY SUM(Result) DESC
)
)B
WHERE RN <= 5

ORACLE SQL Select timestamps column from table. Producing more results than intended

Hello currently I have a working script below. I am using Oracle 10
SELECT z.no as "ID_One",
MAX(r.value) as "Max",
round(MAX(r.value)/80000,2) as "ROUND"
FROM Table1 r, Table2 z
WHERE r.timestamp > ((SYSDATE - TO_DATE('01/01/1970 00:00:00', 'MM-DD-YYYY HH24:MI:SS')) * 24 * 60 * 60) - 80000
AND r.va=21
AND r.nor IN ('7','98','3','3')
AND r.nor = z.re
GROUP BY r.nor, r.varr, z.no;
It produces a table like this
ID_ONE MAX ROUND
105 500 232
106 232 32
333 23 .21
444 34 .321
I want to select a row call timestamp from table r. However when I add " r.timestamp " in to my query it produces 500 rows of data instead of 4. It looks like it is producing the the highest number for each timestamp instead. How would I produce a table that looks like this ? fyi timestamp column is in unix time. I can do the conversion myself. I just need to know how to get out these rows.
ID_ONE MAX ROUND TIMESTAMP
105 500 232 DEC 21,2021 10:00
106 232 32 DEC 21,2021 23:12
333 23 .21 DEC 31,2021 2:12
444 34 .321 DEC 31,2021 23:12
When I add the column time stamp it does not create what is above. What I am getting instead is something like that looks like this the other two ids are below in this 500 long row of data. I only wanted the 4 that is the highest value (MAX) from this set of time. ID_ONE is my id for a stock of inventory for a warehouse.
ID_ONE ROUND TIMESTAMP MAX
106 338
.06 1406694567
106 355
.06 1406696037
106 246
.04 1406696337
106 363
.06 1406700687
106 330
.06 1406700987
106 512
.09 1406701347
106 459
.08 1406704047
106 427
.07 1406711038
106 596
.1 1406713111
106 401
.07 1406715872
106 682
.11 1406726192
106 2776
.46 1406726492
105 414
.07 1406728863
105 380
.06 1406734055
105 378
.06 1406734655
105 722
.12 1406735555
105 144
.02 1406665697
105 5
I have edited my answer kindly try the below
SELECT z.no as "ID_One",
max(r.value) as "Max",
round(MAX(r.value)/80000,2) as "ROUND",r.Timestamp
FROM Table1 r, Table2 z
where r.timestamp > ((SYSDATE - TO_DATE ('01/01/1970 00:00:00', 'MM
-DD-YYYY HH24:MI:SS')) * 24 * 60 * 60) - 80000
and r.va=21
AND r.nor IN ('7','98','3','3')
AND r.value=(select max(r1.value) from Table1 r1 where r1.va=r.va and r1.nor=r.nor)
AND r.nor = z.re group by r.nor, r.varr, z.no;
This looks like an ideal use case for analytic functions:
SELECT
v1.*,
round(v1.value/80000,2) as rounded_max_value
FROM (
SELECT
z.no as id_one,
r.value,
row_number() over (partition by r.nor, r.varr, z.no order by r.value desc) as rn,
r.timestamp
FROM Table1 r, Table2 z
WHERE r.timestamp >
((SYSDATE - TO_DATE('01/01/1970 00:00:00', 'MM-DD-YYYY HH24:MI:SS')) * 24 * 60 * 60) - 80000
AND r.va=21
AND r.nor IN ('7','98','3','3')
AND r.nor = z.re
) v1
where v1.rn = 1
This query
uses row_number over (partition by .. order by ) to get an ordering of the rows within a group
uses rn = 1 in the outer query to get only the row having the maximum value
Some additional recommendations:
if your r.nor column is numeric, then don't use string literals; use IN (7,98,3,3) instead (BTW: why do you have 3 twice in your IN list?
don't use " for column aliases unless absolutely necessary (since it makes them case-sensitive) ; they are a PITA
don't put your JOIN conditions into the WHERE clause; it makes your query harder to read. Use ANSI style joins instead.

How to sort a sql result based on values in previous row?

I'm trying to sort a sql data selection by values in columns of the result set. The data looks like:
(This data is not sorted correctly, just an example)
ID projectID testName objectBefore objectAfter
=======================================================================================
13147 280 CDM-710 Generic TP-0000120 TOC~~#~~ -1 13148
1145 280 3.2 Quadrature/Carrier Null 25 Deg C 4940 1146
1146 280 3.2 Quadrature/Carrier Null 0 Deg C 1145 1147
1147 280 3.3 External Frequency Reference 1146 1148
1148 280 3.4 Phase Noise 50 Deg C 1147 1149
1149 280 3.4 Phase Noise 25 Deg C 1148 1150
1150 280 3.4 Phase Noise 0 Deg C 1149 1151
1151 280 3.5 Output Spurious 50 Deg C 1150 1152
1152 280 3.5 Output Spurious 25 Deg C 1151 1153
1153 280 3.5 Output Spurious 0 Deg C 1152 1154
............
18196 280 IP Regression Suite 18195 -1
The order of the data is based on the objectBefore and the objectAfter columns. The first row will always be when objectBefore = -1 and the last row will be when objectAfter = -1. In the above example, the second row would be ID 13148 as that is what row 1 objectAfter is equal to. Is there any way to write a query that would order the data in this manner?
This is actually sorting a linked list:
WITH SortedList (Id, objectBefore , projectID, testName, Level)
AS
(
SELECT Id, objectBefore , projectID, testName, 0 as Level
FROM YourTable
WHERE objectBefore = -1
UNION ALL
SELECT ll.Id, ll.objectBefore , ll.projectID, ll.testName, Level+1 as Level
FROM YourTable ll
INNER JOIN SortedList as s
ON ll.objectBefore = s.Id
)
SELECT Id, objectBefore , projectID, testName
FROM SortedList
ORDER BY Level
You can find more details in this post