Create group based on the data available in SQL server table having specific condition? - sql

There is a table which contains SQL server blocking chain data, like below.
I am trying to pull only those blocking chain groups whose average wait time is greater than 20 seconds. A group can be identified as follows: it starts at a row where the blocked value is 0 and ends just before the next row where the blocked value is 0. That next row with a 0 value should not be considered part of the group.
Blocking_time SPID blocked WAIT_MS Blocking_Chain_tree_details_by_Session_id_and_header Wait_type
7/28/19 5:14 AM 130 0   HEAD -  SPID (130) - EL.dbo.test;1
7/28/19 5:14 AM 292 130 1   |      |-----  SPID (292) - EL.dbo.test123;1 PAGELATCH_EX
7/28/19 5:14 AM 949 130 1   |      |-----  SPID (949) - EL.dbo.sstest123;1 PAGELATCH_EX
7/28/19 5:32 AM 106 130 1   |      |-----  SPID (106) - EL.dbo.checjmark;1 PAGELATCH_EX
7/28/19 5:32 AM 130 0   HEAD -  SPID (130) - Eli.dbo.sss;1
7/28/19 5:32 AM 292 130 1   |      |-----  SPID (292) - EL.dbo.variable;1 PAGELATCH_EX
7/28/19 5:32 AM 949 130 1   |      |-----  SPID (949) - Eldbo.anything;1 PAGELATCH_EX
7/28/19 5:32 AM 1578 130 12000   |      |-----  SPID (1578) - EL.dbo.something;1 PAGELATCH_EX
7/28/19 9:20 AM 196 513 21700   |      |-----  SPID (196) - (#P1 uniqueidentifier,#P2 int,#P3 int,#P4 int,#P5 int,#P6 int,#P7 int,#P8 int,#P ... LCK_M_IX
NA
The expected result should look like this:
Blocking_time SPID blocked WAIT_MS Blocking_Chain_tree_details_by_Session_id_and_header Wait_type
7/28/19 5:32 AM 130 0   HEAD -  SPID (130) - Eli.dbo.sss;1
7/28/19 5:32 AM 292 130 1   |      |-----  SPID (292) - EL.dbo.variable;1 PAGELATCH_EX
7/28/19 5:32 AM 949 130 1   |      |-----  SPID (949) - Eldbo.anything;1 PAGELATCH_EX
7/28/19 5:32 AM 1578 130 12000   |      |-----  SPID (1578) - EL.dbo.something;1 PAGELATCH_EX
7/28/19 9:20 AM 196 513 21700   |      |-----  SPID (196) - (#P1 uniqueidentifier,#P2 int,#P3 int,#P4 int,#P5 int,#P6 int,#P7 int,#P8 int,#P ... LCK_M_IX

You can use a window function for this. So long as you put your grouping columns in the PARTITION BY you'll be able to get the MAX value for the group. Then you can filter to just the groups where the max time is over 20 seconds.
SELECT *
FROM
(
    SELECT Blocking_time,
           SPID,
           blocked,
           WAIT_MS,
           Blocking_Chain_tree_details_by_Session_id_and_header,
           Wait_type,
           MAX(WAIT_MS) OVER (PARTITION BY Blocking_time ORDER BY Blocking_time ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) [Max_WAIT_MS]
    FROM <YourTable>
) rawData
WHERE [Max_WAIT_MS] > 20000
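Note that partitioning by Blocking_time only approximates the groups described in the question, because a chain can span more than one timestamp (the expected result mixes 5:32 AM and 9:20 AM rows). A minimal Python sketch of the grouping rule itself follows. It is not the query above but a different technique: a running count of blocked = 0 rows numbers each chain, and chains are kept when their maximum wait exceeds 20000 ms. It assumes the rows arrive in the order shown (the table has no explicit ordering column) and treats the blank WAIT_MS on HEAD rows as 0; the tuples are abbreviated from the sample data.

```python
from itertools import accumulate, groupby

# (Blocking_time, SPID, blocked, WAIT_MS), abbreviated from the sample rows.
# Assumption: the blank WAIT_MS on a HEAD row is treated as 0.
rows = [
    ("7/28/19 5:14 AM", 130, 0, 0),
    ("7/28/19 5:14 AM", 292, 130, 1),
    ("7/28/19 5:14 AM", 949, 130, 1),
    ("7/28/19 5:32 AM", 106, 130, 1),
    ("7/28/19 5:32 AM", 130, 0, 0),
    ("7/28/19 5:32 AM", 292, 130, 1),
    ("7/28/19 5:32 AM", 949, 130, 1),
    ("7/28/19 5:32 AM", 1578, 130, 12000),
    ("7/28/19 9:20 AM", 196, 513, 21700),
]

# Running count of head rows (blocked == 0): each row receives the number
# of the chain it belongs to.
group_ids = list(accumulate(1 if r[2] == 0 else 0 for r in rows))

# Collect the rows of each chain and keep chains whose maximum wait
# exceeds 20 seconds (20000 ms).
kept = []
for gid, members in groupby(zip(group_ids, rows), key=lambda pair: pair[0]):
    chain = [row for _, row in members]
    if max(row[3] for row in chain) > 20000:
        kept.extend(chain)

for row in kept:
    print(row)
```

In SQL the same running count could be written as SUM(CASE WHEN blocked = 0 THEN 1 ELSE 0 END) OVER (ORDER BY ...), provided the table has a column that preserves the row order.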

Summing column that is grouped - SQL

I have a query:
SELECT
    date,
    COUNT(o.row_number) FILTER (WHERE o.row_number > 1 AND date_ddr IS NOT NULL AND telephone_number <> 'Anonymous') AS repeat_calls_24h
FROM
(
    SELECT
        ddr.date,
        ddr.telephone_number,
        ddr.date_ddr,
        ROW_NUMBER() OVER (PARTITION BY ddr.telephone_number ORDER BY ddr.date) AS row_number
    FROM
        table_a ddr
) o
GROUP BY 1
Generating the following table:
date        Repeat calls_24h
17/09/2022  182
18/09/2022  381
19/09/2022  81
20/09/2022  24
21/09/2022  91
22/09/2022  110
23/09/2022  231
What can I add to my query to provide a sum over three days (the current day plus the previous two), as below?
date        Repeat calls_24h  Repeat Calls 3d
17/09/2022  182
18/09/2022  381
19/09/2022  81                644
20/09/2022  24                486
21/09/2022  91                196
22/09/2022  110               225
23/09/2022  231               432
Thanks
We can do it using lag.
select "date"
,"Repeat calls_24h"
,"Repeat calls_24h" + lag("Repeat calls_24h") over(order by "date") + lag("Repeat calls_24h", 2) over(order by "date") as "Repeat Calls 3d"
from t
date        Repeat calls_24h  Repeat Calls 3d
2022-09-17  182               null
2022-09-18  381               null
2022-09-19  81                644
2022-09-20  24                486
2022-09-21  91                196
2022-09-22  110               225
2022-09-23  231               432
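To check the arithmetic behind the lag approach, here is a small Python sketch of the same rolling sum over the sample counts. The helper name rolling_sum is hypothetical. Like lag, it counts rows rather than calendar days, so it assumes there is exactly one row per day with no gaps.

```python
from datetime import date

# Daily repeat-call counts from the question's sample output.
daily = [
    (date(2022, 9, 17), 182),
    (date(2022, 9, 18), 381),
    (date(2022, 9, 19), 81),
    (date(2022, 9, 20), 24),
    (date(2022, 9, 21), 91),
    (date(2022, 9, 22), 110),
    (date(2022, 9, 23), 231),
]


def rolling_sum(values, window=3):
    """Sum each value with the window-1 values before it.

    Returns None for positions where the window is not yet full,
    mirroring the nulls that lag() produces on the first rows.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window : i + 1]))
    return out


counts = [n for _, n in daily]
three_day = rolling_sum(counts)

for (day, n), total in zip(daily, three_day):
    print(day.isoformat(), n, total)
```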

ID rows containing values greater than corresponding values based on a criteria from another row

I have a grouped dataframe. I have created a flag that identifies whether values in a row are less than the group maximums. This works fine. However, I want to unflag rows where the value in a third column is greater than the value in the same (third) column for another row within each group. I have a feeling there should be an elegant and Pythonic way to do this, but I can't figure it out.
The flag shown in the code compares the maximum value of tour_duration within each hh_id to the corresponding value of "comp_expr" and assigns 1 to the flag column when comp_expr is not greater than that maximum (0 otherwise). However, I want values in the flag column to be 0 if min(arrivaltime) for each subgroup tour_id > max(arrivaltime) for the tour_id whose tour_duration is found to be the maximum within each hh_id. For example, in the given data, tour_id 16300 has the highest value of tour_duration. But tour_id 16200 has min arrivaltime 1080, which is > max(arrivaltime) for tour_id 16300 (960). So the flag for all tour_id 16200 rows should be 0.
Kindly assist.
import pandas as pd
import numpy as np
stops_data = pd.DataFrame({'hh_id': [20044,20044,20044,20044,20044,20044,20044,20044,20044,20044,20044,20122,20122,20122,20122,20122,20122,20122,20122,20122,20122,20122,20122,20122,],'tour_id':[16300,16300,16100,16100,16100,16100,16200,16200,16200,16000,16000,38100,38100,37900,37900,37900,38000,38000,38000,38000,38000,38000,37800,37800],'arrivaltime':[360,960,900,900,900,960,1080,1140,1140,420,840,300,960,780,720,960,1080,1080,1080,1080,1140,1140,480,900],'tour_duration':[600,600,60,60,60,60,60,60,60,420,420,660,660,240,240,240,60,60,60,60,60,60,420,420],'comp_expr':[1350,1350,268,268,268,268,406,406,406,974,974,1568,1568,606,606,606,298,298,298,298,298,298,840,840]})
# 0 where the household's longest tour_duration is below comp_expr, else 1
stops_data['flag'] = np.where(
    stops_data.groupby(['hh_id'])['tour_duration'].transform('max') < stops_data['comp_expr'], 0, 1)
This is my current output: [image: current dataset and output]
This is my desired output, please see the flag column: [image: desired output, with changed flag values in bold]
>>> stops_data.loc[stops_data.tour_id
.isin(stops_data.loc[stops_data.loc[stops_data
.groupby(['hh_id','tour_id'])['arrivaltime'].idxmin()]
.groupby('hh_id')['arrivaltime'].idxmax()]['tour_id']), 'flag'] = 0
>>> stops_data
hh_id tour_id arrivaltime tour_duration comp_expr flag
0 20044 16300 360 600 1350 0
1 20044 16300 960 600 1350 0
2 20044 16100 900 60 268 1
3 20044 16100 900 60 268 1
4 20044 16100 900 60 268 1
5 20044 16100 960 60 268 1
6 20044 16200 1080 60 406 0
7 20044 16200 1140 60 406 0
8 20044 16200 1140 60 406 0
9 20044 16000 420 420 974 0
10 20044 16000 840 420 974 0
11 20122 38100 300 660 1568 0
12 20122 38100 960 660 1568 0
13 20122 37900 780 240 606 1
14 20122 37900 720 240 606 1
15 20122 37900 960 240 606 1
16 20122 38000 1080 60 298 0
17 20122 38000 1080 60 298 0
18 20122 38000 1080 60 298 0
19 20122 38000 1080 60 298 0
20 20122 38000 1140 60 298 0
21 20122 38000 1140 60 298 0
22 20122 37800 480 420 840 0
23 20122 37800 900 420 840 0
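To make the rule from the question explicit, here is a dependency-free Python sketch that applies it step by step to the same values as stops_data. It is not the pandas one-liner above, just a plain walk-through of the stated condition. One assumption: if several tours in a household tie for the longest tour_duration, the cutoff is the latest arrival across all of them.

```python
# Columns: hh_id, tour_id, arrivaltime, tour_duration, comp_expr
# (same values as stops_data in the question).
rows = [
    (20044, 16300, 360, 600, 1350), (20044, 16300, 960, 600, 1350),
    (20044, 16100, 900, 60, 268), (20044, 16100, 900, 60, 268),
    (20044, 16100, 900, 60, 268), (20044, 16100, 960, 60, 268),
    (20044, 16200, 1080, 60, 406), (20044, 16200, 1140, 60, 406),
    (20044, 16200, 1140, 60, 406), (20044, 16000, 420, 420, 974),
    (20044, 16000, 840, 420, 974), (20122, 38100, 300, 660, 1568),
    (20122, 38100, 960, 660, 1568), (20122, 37900, 780, 240, 606),
    (20122, 37900, 720, 240, 606), (20122, 37900, 960, 240, 606),
    (20122, 38000, 1080, 60, 298), (20122, 38000, 1080, 60, 298),
    (20122, 38000, 1080, 60, 298), (20122, 38000, 1080, 60, 298),
    (20122, 38000, 1140, 60, 298), (20122, 38000, 1140, 60, 298),
    (20122, 37800, 480, 420, 840), (20122, 37800, 900, 420, 840),
]

# Longest tour_duration per household.
max_duration = {}
for hh, tour, arr, dur, comp in rows:
    max_duration[hh] = max(max_duration.get(hh, dur), dur)

# Initial flag, as in the question: 0 where max duration < comp_expr, else 1.
flags = [0 if max_duration[hh] < comp else 1 for hh, tour, arr, dur, comp in rows]

# Cutoff per household: latest arrival of the tour(s) with the longest duration.
cutoff = {}
for hh, tour, arr, dur, comp in rows:
    if dur == max_duration[hh]:
        cutoff[hh] = max(cutoff.get(hh, arr), arr)

# Earliest arrival per (household, tour).
first_arrival = {}
for hh, tour, arr, dur, comp in rows:
    key = (hh, tour)
    first_arrival[key] = min(first_arrival.get(key, arr), arr)

# Unflag every row of a tour that starts after the cutoff.
for i, (hh, tour, arr, dur, comp) in enumerate(rows):
    if first_arrival[(hh, tour)] > cutoff[hh]:
        flags[i] = 0

print(flags)
```

On the sample data this reproduces the flag column shown in the output above.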

Parsing Json Data from select query in SQL Server

I have a situation where I have a table that has a single varchar(max) column called dbo.JsonData. It has x number of rows with x number of properties.
How can I create a query that will allow me to turn the result set from a select * query into a row/column result set?
Here is what I have tried:
SELECT *
FROM JSONDATA
FOR JSON Path
But it returns a single row of the json data all in a single column:
JSON_F52E2B61-18A1-11d1-B105-00805F49916B
[{"Json_Data":"{\"Serial_Number\":\"12345\",\"Gateway_Uptime\":17,\"Defrost_Cycles\":0,\"Freeze_Cycles\":2304,\"Float_Switch_Raw_ADC\":1328,\"Bin_status\":2304,\"Line_Voltage\":0,\"ADC_Evaporator_Temperature\":0,\"Mem_Sw\":1280,\"Freeze_Timer\":2560,\"Defrost_Timer\":593,\"Water_Flow_Switch\":3328,\"ADC_Mid_Temperature\":2560,\"ADC_Water_Temperature\":0,\"Ambient_Temperature\":71,\"Mid_Temperature\":1259,\"Water_Temperature\":1259,\"Evaporator_Temperature\":1259,\"Ambient_Temperature_Off_Board\":0,\"Ambient_Temperature_On_Board\":0,\"Gateway_Info\":\"{\\\"temp_sensor\\\":0.00,\\\"temp_pcb\\\":82.00,\\\"gw_uptime\\\":17.00,\\\"winc_fw\\\":\\\"19.5.4\\\",\\\"gw_fw_version\\\":\\\"0.0.0\\\",\\\"gw_fw_version_git\\\":\\\"2a75f20-dirty\\\",\\\"gw_sn\\\":\\\"328\\\",\\\"heap_free\\\":11264.00,\\\"gw_sig_csq\\\":0.00,\\\"gw_sig_quality\\\":0.00,\\\"wifi_sig_strength\\\":-63.00,\\\"wifi_resets\\\":0.00}\",\"ADC_Ambient_Temperature\":1120,\"Control_State\":\"Bin Full\",\"Compressor_Runtime\":134215680}"},{"Json_Data":"{\"Serial_Number\":\"12345\",\"Gateway_Uptime\":200,\"Defrost_Cycles\":559,\"Freeze_Cycles\":510,\"Float_Switch_Raw_ADC\":106,\"Bin_status\":0,\"Line_Voltage\":119,\"ADC_Evaporator_Temperature\":123,\"Mem_Sw\":113,\"Freeze_Timer\":0,\"Defrost_Timer\":66,\"Water_Flow_Switch\":3328,\"ADC_Mid_Temperature\":2560,\"ADC_Water_Temperature\":0,\"Ambient_Temperature\":71,\"Mid_Temperature\":1259,\"Water_Temperature\":1259,\"Evaporator_Temperature\":54,\"Ambient_Temperature_Off_Board\":0,\"Ambient_Temperature_On_Board\":0,\"Gateway_Info\":\"{\\\"temp_sensor\\\":0.00,\\\"temp_pcb\\\":82.00,\\\"gw_uptime\\\":199.00,\\\"winc_fw\\\":\\\"19.5.4\\\",\\\"gw_fw_version\\\":\\\"0.0.0\\\",\\\"gw_fw_version_git\\\":\\\"2a75f20-dirty\\\",\\\"gw_sn\\\":\\\"328\\\",\\\"heap_free\\\":10984.00,\\\"gw_sig_csq\\\":0.00,\\\"gw_sig_quality\\\":0.00,\\\"wifi_sig_strength\\\":-60.00,\\\"wifi_resets\\\":0.00}\",\"ADC_Ambient_Temperature\":1120,\"Control_State\":\"Defrost\",\"Compressor_Runtime
\":11304}"},{"Json_Data":"{\"Seri...
What am I missing?
I can't specify the columns explicitly because the json strings aren't always the same.
This is what I expect:
Serial_Number Gateway_Uptime Defrost_Cycles Freeze_Cycles Float_Switch_Raw_ADC Bin_status Line_Voltage ADC_Evaporator_Temperature Mem_Sw Freeze_Timer Defrost_Timer Water_Flow_Switch ADC_Mid_Temperature ADC_Water_Temperature Ambient_Temperature Mid_Temperature Water_Temperature Evaporator_Temperature Ambient_Temperature_Off_Board Ambient_Temperature_On_Board ADC_Ambient_Temperature Control_State Compressor_Runtime temp_sensor temp_pcb gw_uptime winc_fw gw_fw_version gw_fw_version_git gw_sn heap_free gw_sig_csq gw_sig_quality wifi_sig_strength wifi_resets LastModifiedDateUTC Defrost_Cycle_time Freeze_Cycle_time
12345 251402 540 494 106 0 98 158 113 221 184 0 0 0 1259 1259 1259 33 0 0 0 Freeze 10833 0 78 251402 19.5.4 0.0.0 2a75f20-dirty 328.00000000 10976 0 0 -61 0 2018-03-20 11:15:28.000 0 0
12345 251702 540 494 106 0 98 178 113 517 184 0 0 0 1259 1259 1259 22 0 0 0 Freeze 10838 0 78 251702 19.5.4 0.0.0 2a75f20-dirty 328.00000000 10976 0 0 -62 0 2018-03-20 11:15:42.000 0 0
...
Thank you,
Ron
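For illustration, the shape of the problem can be sketched in Python rather than T-SQL: each row holds a JSON string, the Gateway_Info property is itself a JSON-encoded string, and the set of properties varies from row to row. The sketch below decodes both levels and builds the column list as the union of all keys seen. The two rows are abbreviated, made-up samples modeled on the data above, and the helper name flatten is hypothetical. On SQL Server 2016 and later, OPENJSON is the built-in function for turning JSON text into rows and columns.

```python
import json

# Two abbreviated rows shaped like the Json_Data column: Gateway_Info is a
# JSON string nested inside the outer JSON string, and the keys differ per row.
raw_rows = [
    json.dumps({
        "Serial_Number": "12345",
        "Gateway_Uptime": 17,
        "Gateway_Info": json.dumps({"temp_pcb": 82.0, "gw_sn": "328"}),
        "Control_State": "Bin Full",
    }),
    json.dumps({
        "Serial_Number": "12345",
        "Gateway_Uptime": 200,
        "Gateway_Info": json.dumps({"temp_pcb": 82.0, "wifi_resets": 0.0}),
        "Line_Voltage": 119,
    }),
]


def flatten(raw, nested_key="Gateway_Info"):
    """Decode one row and merge the nested JSON string into the top level."""
    record = json.loads(raw)
    nested = record.pop(nested_key, None)
    if nested is not None:
        record.update(json.loads(nested))
    return record


records = [flatten(raw) for raw in raw_rows]

# Column list = union of all keys, in first-seen order.
columns = []
for rec in records:
    for key in rec:
        if key not in columns:
            columns.append(key)

# Row/column result set; properties missing from a row become None.
table = [[rec.get(col) for col in columns] for rec in records]

print(columns)
for row in table:
    print(row)
```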

SQL seems to round up the number automatically on select statement?

Hi, here is my SQL code:
SELECT a."Date", a."Missed", b."Total Client Schedules", cast(100-((a."Missed"*100) / b."Total Client Schedules")AS decimal) as "Pct Completed" -
FROM -
( -
SELECT DATE(scheduled_start) as "Date",count(*) as "Missed" FROM -
events WHERE node_name IS NOT NULL AND status IN ('Missed') GROUP BY DATE(scheduled_start) -
) as a, -
( -
SELECT DATE(scheduled_start) as "Date", count(*) as -
"Total Client Schedules" FROM events WHERE node_name IS NOT NULL GROUP BY DATE(scheduled_start) -
) as b -
WHERE a."Date" = b."Date" ORDER BY "Date" desc
And here is the output:
Date Missed Total Client Schedules Pct Completed
----------- ------------ ----------------------- --------------
2013-02-20 2 805 100
2013-02-19 14 805 99
2013-02-18 29 805 97
2013-02-17 59 805 93
2013-02-16 29 806 97
2013-02-15 49 805 94
2013-02-14 33 805 96
2013-02-13 57 805 93
2013-02-12 21 805 98
2013-02-11 35 805 96
2013-02-10 34 805 96
It always seems to round up to a whole number, when I want it to be like 99.99% or 97.2% etc.
You don't specify what database you are using. However, some databases do integer arithmetic, so 1/2 is 0 not 0.5.
To fix this, just make the constants you are using numeric rather than integer:
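cast(100.0-((a."Missed"*100.0) / b."Total Client Schedules")AS decimal)
It will then convert to a non-integer type for the arithmetic. Depending on the database, a bare decimal may default to a scale of 0 (SQL Server, MySQL and DB2 do this), which would round the result back to a whole number; if so, give an explicit precision and scale such as decimal(5,2).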
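The effect is easy to reproduce. The sketch below uses Python's built-in sqlite3 module, chosen only because SQLite also performs integer division on integer operands; the asker's database is unknown, so this is an illustration rather than a test of their system. The numbers come from the first output row (2 missed out of 805).

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Integer operands: (2 * 100) / 805 is truncated to 0 before the subtraction.
int_pct = conn.execute("SELECT 100 - (2 * 100) / 805").fetchone()[0]

# Decimal constants force non-integer arithmetic for the whole expression.
real_pct = conn.execute("SELECT 100.0 - (2 * 100.0) / 805").fetchone()[0]

print(int_pct)              # integer result
print(round(real_pct, 2))   # fractional result, rounded for display

conn.close()
```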

Calculating difference from previous record

May I ask for your help with the following, please?
I am trying to calculate a change from one record to the next in my results. It will probably help if I show you my current query and results ...
SELECT A.AuditDate, COUNT(A.NickName) as [TAccounts],
SUM(IIF((A.CurrGBP > 100 OR A.CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits A
GROUP BY A.AuditDate;
The query gives me these results ...
AuditDate D/M/Y TAccounts Funded
--------------------------------------------
30/12/2011 506 285
04/01/2012 514 287
05/01/2012 514 288
06/01/2012 516 288
09/01/2012 520 289
10/01/2012 522 289
11/01/2012 523 290
12/01/2012 524 290
13/01/2012 526 291
17/01/2012 531 292
18/01/2012 532 292
19/01/2012 533 293
20/01/2012 537 295
Ideally, the results I would like to get, would be similar to the following ...
AuditDate D/M/Y TAccounts TChange Funded FChange
------------------------------------------------------------------------
30/12/2011 506 0 285 0
04/01/2012 514 8 287 2
05/01/2012 514 0 288 1
06/01/2012 516 2 288 0
09/01/2012 520 4 289 1
10/01/2012 522 2 289 0
11/01/2012 523 1 290 1
12/01/2012 524 1 290 0
13/01/2012 526 2 291 1
17/01/2012 531 5 292 1
18/01/2012 532 1 292 0
19/01/2012 533 1 293 1
20/01/2012 537 4 295 2
Looking at the row for '17/01/2012', 'TChange' has a value of 5 as the 'TAccounts' has increased from previous 526 to 531. And the 'FChange' would be based on the 'Funded' field. I guess something to be aware of is the fact that the previous row to this example, is dated '13/01/2012'. What I mean is, there are some days where I have no data (for example over weekends).
I think I need to use a subquery, but I am really struggling to figure out where to start. Could you show me how to get the results I need, please?
I am using MS Access 2010
Many thanks for your time.
Johnny.
Here is one approach you could try...
SELECT B.AuditDate, B.TAccounts,
    B.TAccounts -
        (SELECT Count(NickName) FROM Audits WHERE AuditDate=B.PrevAuditDate) as TChange,
    B.Funded,
    B.Funded -
        (SELECT Count(*) FROM Audits WHERE AuditDate=B.PrevAuditDate AND (CurrGBP > 100 OR CurrUSD > 100)) as FChange
FROM (
    SELECT A.AuditDate,
        (SELECT Count(NickName) FROM Audits WHERE AuditDate=A.AuditDate) as TAccounts,
        (SELECT Count(*) FROM Audits WHERE AuditDate=A.AuditDate AND (CurrGBP > 100 OR CurrUSD > 100)) as Funded,
        (SELECT Max(AuditDate) FROM Audits WHERE AuditDate<A.AuditDate) as PrevAuditDate
    FROM
        (SELECT DISTINCT AuditDate FROM Audits) AS A) AS B
Instead of using a GROUP BY, I've used subqueries to get TAccounts and Funded as well as the previous audit date, which the main SELECT statement then uses to get TAccounts and Funded again, this time for the previous date, so that any required calculation can be done against them.
However, I would imagine this may be slow to process.
It's a shame MS never made this type of thing simple in Access. How many rows are you working with on your report?
If it's under 65K then I would suggest dumping the data into an Excel spreadsheet and using a simple formula to calculate the difference between rows.
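You can try something like the following (the SQL is untested and will require some changes; in particular, ROW_NUMBER() is a window function that Access SQL does not support, so this form suits engines such as SQL Server).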
SELECT
A.AuditDate,
A.TAccounts,
A.TAccounts - B.TAccounts AS TChange,
A.Funded,
A.Funded - B.Funded AS FChange
FROM
( SELECT
ROW_NUMBER() OVER (ORDER BY AuditDate DESC) AS ROW,
AuditDate,
COUNT(NickName) as [TAccounts],
SUM(IIF((CurrGBP > 100 OR CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits
GROUP BY AuditDate
) A
INNER JOIN
( SELECT
ROW_NUMBER() OVER (ORDER BY AuditDate DESC) AS ROW,
AuditDate,
COUNT(NickName) as [TAccounts],
SUM(IIF((CurrGBP > 100 OR CurrUSD > 100), 1, 0)) as [Funded]
FROM Audits
GROUP BY AuditDate
) B ON B.ROW = A.ROW + 1
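Whichever query is used, the target calculation is simply "this row minus the previous row in date order", with gaps in the dates ignored. A short Python sketch over the question's grouped results shows the expected numbers; it assumes the rows are already sorted by audit date and, as in the desired output, reports 0 for the first row.

```python
# (AuditDate, TAccounts, Funded) from the question's grouped results,
# already in date order. Dates are kept as D/M/Y strings for display only.
audits = [
    ("30/12/2011", 506, 285),
    ("04/01/2012", 514, 287),
    ("05/01/2012", 514, 288),
    ("06/01/2012", 516, 288),
    ("09/01/2012", 520, 289),
    ("10/01/2012", 522, 289),
    ("11/01/2012", 523, 290),
    ("12/01/2012", 524, 290),
    ("13/01/2012", 526, 291),
    ("17/01/2012", 531, 292),
    ("18/01/2012", 532, 292),
    ("19/01/2012", 533, 293),
    ("20/01/2012", 537, 295),
]

report = []
prev_total = prev_funded = None
for audit_date, total, funded in audits:
    # The first row has no predecessor, so its change is reported as 0.
    t_change = 0 if prev_total is None else total - prev_total
    f_change = 0 if prev_funded is None else funded - prev_funded
    report.append((audit_date, total, t_change, funded, f_change))
    prev_total, prev_funded = total, funded

for line in report:
    print(*line)
```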