SQL: Increment a row when value in another row changes

I have the following table:
Sequence Change
100 0
101 0
103 0
106 0
107 1
110 0
112 1
114 0
115 0
121 0
126 1
127 0
134 0
I need an additional column, Group, whose value increments on each occurrence of 1 in Change. How is that done? I'm using Microsoft SQL Server 2012.
Sequence Change Group
100 0 0
101 0 0
103 0 0
106 0 0
107 1 1
110 0 1
112 1 2
114 0 2
115 0 2
121 0 2
126 1 3
127 0 3
134 0 3

You want a cumulative sum:
select t.*, sum(change) over (order by sequence) as grp
from t;
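As a quick check of the cumulative-sum idea, here is a minimal sketch that runs the same window function against the question's data. It uses Python's sqlite3 module with an in-memory SQLite database as a stand-in for SQL Server (an assumption made only so the example is runnable; window functions need SQLite 3.25 or later). The table name t and the alias grp follow the answer.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the window-function syntax is the same.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (sequence INTEGER PRIMARY KEY, change INTEGER)")
data = [
    (100, 0), (101, 0), (103, 0), (106, 0), (107, 1), (110, 0), (112, 1),
    (114, 0), (115, 0), (121, 0), (126, 1), (127, 0), (134, 0),
]
conn.executemany("INSERT INTO t VALUES (?, ?)", data)

# Running total of change, ordered by sequence: each 1 starts a new group.
rows = conn.execute(
    "SELECT sequence, change, SUM(change) OVER (ORDER BY sequence) AS grp "
    "FROM t ORDER BY sequence"
).fetchall()
groups = [grp for _, _, grp in rows]
print(groups)
```

The alias is grp rather than Group because GROUP is a reserved word; quoting it as [Group] in SQL Server would also work.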

Related

how to do sum with multiple joins in PostgreSQL?

I know my question may be a duplicate, but I really don't know how to write a query that returns sums across multiple joins.
Tables I have
result_summary
num_bin id_summary count_bin
3 172 0
4 172 0
5 172 0
6 172 0
7 172 0
8 172 0
1 174 1
2 174 0
3 174 0
4 174 0
5 174 0
6 174 0
7 174 0
8 174 0
1 175 0
summary_assembly
num_lot id_machine sabun date_work date_write id_product shift count_total count_fail count_good id_summary id_operation
adfe 1 21312 2020-11-25 2020-11-25 1 A 10 2 8 170 2000
adfe 1 21312 2020-11-25 2020-11-25 1 A 1000 1 999 171 2000
adfe 1 21312 2020-11-25 2020-11-25 2 A 100 1 99 172 2000
333 1 21312 2020-12-06 2020-12-06 1 A 10 2 8 500 2000
333 1 21312 2020-11-26 2020-11-26 1 A 10000 1 9999 174 2000
333 1 21312 2020-11-26 2020-11-26 1 A 100 0 100 175 2000
333 1 21312 2020-12-06 2020-12-06 1 A 10 2 8 503 2000
333 1 21312 2020-12-07 2020-12-07 1 A 10 2 8 651 2000
333 1 21312 2020-12-02 2020-12-02 1 A 10 2 8 178 2000
employees
sabun name_emp
3532 Kim
12345 JS
4444 Gilsoo
21312 Wayn Hahn
123 Lee too
333 JD
info_product
id_product name_product
1 typeA
2 typeB
machine
id_machine id_operation name_machine
1 2000 name1
2 2000 name2
3 2000 name3
4 3000 name1
5 3000 name2
6 3000 name3
7 4000 name1
8 4000 name2
query
select S.id_summary, I.name_product, M.name_machine,
E.name_emp, S.sabun, S.date_work,
S.shift, S.num_lot, S.count_total,
S.count_good, S.count_fail,
sum(case num_bin when '1' then count_bin else 0 end) as bin1,
sum(case num_bin when '2' then count_bin else 0 end) as bin2,
sum(case num_bin when '3' then count_bin else 0 end) as bin3,
sum(case num_bin when '4' then count_bin else 0 end) as bin4,
sum(case num_bin when '5' then count_bin else 0 end) as bin5,
sum(case num_bin when '6' then count_bin else 0 end) as bin6,
sum(case num_bin when '7' then count_bin else 0 end) as bin7,
sum(case num_bin when '8' then count_bin else 0 end) as bin8
from result_assembly as R
join summary_assembly as S on R.id_summary = S.id_summary
join employees as E on S.sabun = E.sabun
join info_product as I on S.id_product = I.id_product
join machine as M on S.id_machine = M.id_machine
where I.id_product = '1'
and E.sabun='21312'
and S.shift = 'A'
and S.date_work between '2020-11-10' and '2020-12-20'
group by S.id_summary, E.name_emp, S.num_lot,
I.name_product,M.name_machine
order by S.id_summary;
result
id_summary name_product name_machine name_emp sabun date_work shift num_lot count_total count_good count_fail bin1 bin2 bin3 bin4 bin5 bin6 bin7 bin8
170 TypeA name1 Kim 21312 2020-11-25 A adfe 10 8 2 1 1 0 0 0 0 0 0
171 TypeA name1 Kim 21312 2020-11-25 A adfe 1000 999 1 1 1 0 0 0 0 0 0
174 TypeA name1 Kim 21312 2020-11-26 A 333 10000 9999 1 1 1 0 0 0 0 0 0
175 TypeA name1 Kim 21312 2020-11-26 A 333 100 100 0 0 0 0 0 0 0 0 0
178 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
179 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
180 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
181 TypeA name1 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
182 TypeA name2 Kim 21312 2020-12-02 A 333 10 8 2 1 1 0 0 0 0 0 0
186 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 1 1 0 0 0 0 0 0
193 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
194 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
195 TypeA name2 Kim 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
196 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
197 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
198 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
199 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
200 TypeA name2 JS 21312 2020-12-06 A 333 10 8 2 0 0 0 0 0 0 0 0
expected output (when summing by num_lot)
num_lot count_total count_good count_fail bin1 bin2 bin3 bin4 bin5 bin6 bin7 bin8
adfe 323 300 23 22 1 0 0 0 0 0 0
333 4312 4300 12 10 2 0 0 0 0 0 0
All of the data above was translated from the non-English original, so there may be typos.
Now I need to sum by num_lot, name_product or sabun.
id_summary is unique.
Thanks
As suggested in the comments: it seems you simply need to wrap your query in a subquery and group the result by the column num_lot
SELECT
num_lot,
SUM(count_total),
SUM(count_good)
-- some more SUM()
FROM (
--<your query>
) s
GROUP BY num_lot
It was asked in the comments what the s stands for: a subquery needs an alias, an identifier. Because I didn't want to think of a better name, I just called the subselect s. It is shorthand for AS s
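The subquery pattern can be sketched end to end as follows. This is a minimal illustration, not the asker's real schema: it uses Python's sqlite3 with an in-memory database as a stand-in for PostgreSQL, keeps only the columns needed for the sums, reduces the bins to two, and uses made-up sample rows. The inner query produces one row per id_summary, and the outer query sums those rows per num_lot.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, trimmed-down versions of the question's tables.
conn.executescript("""
CREATE TABLE summary_assembly (id_summary INTEGER PRIMARY KEY, num_lot TEXT,
                               count_total INTEGER, count_good INTEGER, count_fail INTEGER);
CREATE TABLE result_summary (num_bin INTEGER, id_summary INTEGER, count_bin INTEGER);
INSERT INTO summary_assembly VALUES (1, 'adfe', 10, 8, 2), (2, 'adfe', 100, 99, 1), (3, '333', 10, 8, 2);
INSERT INTO result_summary VALUES (1, 1, 2), (2, 1, 0), (1, 2, 1), (2, 2, 0), (1, 3, 1), (2, 3, 1);
""")

query = """
SELECT num_lot,
       SUM(count_total) AS count_total,
       SUM(count_good)  AS count_good,
       SUM(count_fail)  AS count_fail,
       SUM(bin1)        AS bin1,
       SUM(bin2)        AS bin2
FROM (
    -- inner query: one row per id_summary, bins pivoted into columns
    SELECT S.id_summary, S.num_lot, S.count_total, S.count_good, S.count_fail,
           SUM(CASE R.num_bin WHEN 1 THEN R.count_bin ELSE 0 END) AS bin1,
           SUM(CASE R.num_bin WHEN 2 THEN R.count_bin ELSE 0 END) AS bin2
    FROM result_summary AS R
    JOIN summary_assembly AS S ON R.id_summary = S.id_summary
    GROUP BY S.id_summary, S.num_lot, S.count_total, S.count_good, S.count_fail
) s
GROUP BY num_lot
ORDER BY num_lot
"""
per_lot = conn.execute(query).fetchall()
print(per_lot)
```

Summing count_total in the outer query, after the inner query has collapsed the bins, avoids counting each summary row once per bin.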
It sounds like you want to use crosstab() -- https://www.postgresql.org/docs/current/tablefunc.html

Get the non 0 value for group ID in the column

I want to print only the IDs for which every row has flag = 1, i.e. no row in the ID's group has flag = 0
(print only the IDs with flag = 1 on all visits)
Sample:
ID Val Flag
123 12 0
123 15 0
123 25 1
123 48 0
321 78 1
321 56 1
456 23 0
456 54 0
789 78 1
Expected Result:
ID
321
789
You can try the following:
select id
from tablename
group by id
having min(flag)=max(flag) and min(flag)=1
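You could also exclude every ID that has at least one row with flag = 0 (filtering on flag != 0 alone would still return 123):
SELECT DISTINCT t.ID
FROM tablename t
WHERE t.flag = 1
AND NOT EXISTS (SELECT 1 FROM tablename x WHERE x.ID = t.ID AND x.flag = 0)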
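To see why the condition has to be evaluated per group, here is a minimal sketch that runs the GROUP BY ... HAVING query on the sample data and compares it with a plain row filter. It uses Python's sqlite3 with an in-memory database (an assumption for runnability; the question does not name a database). The table name tablename follows the answers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (id INTEGER, val INTEGER, flag INTEGER)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [(123, 12, 0), (123, 15, 0), (123, 25, 1), (123, 48, 0),
     (321, 78, 1), (321, 56, 1), (456, 23, 0), (456, 54, 0), (789, 78, 1)],
)

# Group-level test: every flag in the group equals 1.
all_flagged = [r[0] for r in conn.execute(
    "SELECT id FROM tablename GROUP BY id "
    "HAVING MIN(flag) = MAX(flag) AND MIN(flag) = 1 ORDER BY id"
)]

# Row-level filter: keeps any ID with at least one flagged row.
any_flagged = [r[0] for r in conn.execute(
    "SELECT DISTINCT id FROM tablename WHERE flag != 0 ORDER BY id"
)]
print(all_flagged, any_flagged)
```

ID 123 passes the row-level filter because of its single flag = 1 row, but fails the group-level test because its minimum flag is 0.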

Properly 'Joining' two Cross Applies

I've got a query with three Cross-Applies that gather data from three different tables. The first Cr-Ap assists the 2nd and 3rd Cr-Ap's. It finds the most recent ID of a certain refill for a 'cartridge', the higher the ID the more recent the refill.
The second and third Cr-Ap's gather the SUMS of items that have been refilled and items that have been dispensed under the most recent Refill.
If I run the query for Cr-Ap 2 or 3 separately the output would look something like:
ID Amount
1 100
2 1000
3 100
4 0
5 0
etc
Amount would be either the amount of dispensed or refilled items.
Only I don't want to run these queries separately, I want them next to each other.
So what I want is a table that looks like this:
ID Refill Dispense
1 100 1
2 1000 5
3 100 7
4 0 99
5 0 3
etc
My gut tells me to do
INNER JOIN crossapply2 ON crossapply3.ID = crossapply2.ID
But this doesn't work. I'm still new to SQL, so I don't know exactly what I can and can't join; what I do know is that you can use CROSS APPLY as a kind of join. I think that might be what I need here, I just don't know how.
There's another complication: there are certain refills where nothing gets dispensed. In these scenarios the CROSS APPLY I wrote for dispenses doesn't return anything for that refill ID. I don't mean it returns NULL, I mean it skips the refill ID entirely. I'd like to see a 0 in those cases. Because those IDs are skipped, I can't get COALESCE or ISNULL to work, and this might also complicate joining the two applies, because an INNER JOIN would skip any line with no Dispensed amount even though there is a Refilled amount I'd like to see.
Here is my code:
-- Dispensed SUM and Refilled SUM combined
SELECT [CartridgeRefill].[FK_CartridgeRegistration_Id]
,Refills.Refilled
,Dispenses.Dispensed
FROM [CartridgeRefill]
CROSS APPLY(
SELECT MAX([CartridgeRefill].[Id]) AS RecentRefillID
FROM [CartridgeRefill]
GROUP BY [CartridgeRefill].[FK_CartridgeRegistration_Id]
) AS RecentRefill
CROSS APPLY(
SELECT [CartridgeRefill].[FK_CartridgeRegistration_Id] AS RefilledID
,SUM([CartridgeRefillMedication].[Amount]) AS Refilled
FROM [CartridgeRefillMedication]
INNER JOIN [CartridgeRefill] ON [CartridgeRefillMedication].[FK_CartridgeRefill_Id] = [CartridgeRefill].[Id]
WHERE [CartridgeRefillMedication].[FK_CartridgeRefill_Id] = RecentRefill.RecentRefillID
GROUP BY [CartridgeRefill].[FK_CartridgeRegistration_Id]
) AS Refills
CROSS APPLY(
SELECT [CartridgeRefill].[FK_CartridgeRegistration_Id] AS DispensedID
,SUM([CartridgeDispenseAttempt].[Amount]) AS Dispensed
FROM [CartridgeDispenseAttempt]
INNER JOIN [CartridgeRefill] ON [CartridgeDispenseAttempt].[FK_CartridgeRefill_Id] = [CartridgeRefill].[Id]
WHERE [CartridgeDispenseAttempt].[FK_CartridgeRefill_Id] = RecentRefill.RecentRefillID
GROUP BY [CartridgeRefill].[FK_CartridgeRegistration_Id]
) AS Dispenses
GO
The output of this code is as follows:
1 300 1
1 300 1
1 200 194
1 200 194
1 200 8
1 200 8
1 0 39
1 0 39
1 100 14
1 100 14
1 200 1
1 200 1
1 0 28
1 0 28
1 1000 102
1 1000 102
1 1000 557
1 1000 557
1 2000 92
1 2000 92
1 100 75
1 100 75
1 100 100
1 100 100
1 100 51
1 100 51
1 600 28
1 600 28
1 200 47
1 200 47
1 200 152
1 200 152
1 234 26
1 234 26
1 0 227
1 0 227
1 10 6
1 10 6
1 300 86
1 300 86
1 0 194
1 0 194
1 500 18
1 500 18
1 1000 51
1 1000 51
1 1000 56
1 1000 56
1 500 48
1 500 48
1 0 10
1 0 10
1 1500 111
1 1500 111
1 56 79
1 56 79
1 100 6
1 100 6
1 44 134
1 44 134
1 1000 488
1 1000 488
1 100 32
1 100 32
1 100 178
1 100 178
1 500 672
1 500 672
1 200 26
1 200 26
1 500 373
1 500 373
1 100 10
1 100 10
1 900 28
1 900 28
2 900 28
2 900 28
2 900 28
etc
It is total nonsense that I can't do much with, it goes on for about 20k lines and goes through all the ID's, eventually.
Any help is more than appreciated :)
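This looks a bit overcomplicated. Your first CROSS APPLY is not correlated with the outer table, so every outer row is paired with every cartridge's most recent refill, which is where the repeated rows come from. Compute the most recent refill per cartridge once, then apply the two sums to it. An aggregate without GROUP BY always returns exactly one row, even when nothing matches, so refills with no dispenses are kept; COALESCE turns the resulting NULL into 0.
Try
WITH cr AS (
SELECT [FK_CartridgeRegistration_Id]
,MAX([CartridgeRefill].[Id]) RecentRefillID
FROM [CartridgeRefill]
GROUP BY [FK_CartridgeRegistration_Id]
)
SELECT cr.[FK_CartridgeRegistration_Id], Refills.Refilled, Dispenses.Dispensed
FROM cr
CROSS APPLY(
SELECT COALESCE(SUM(crm.[Amount]), 0) AS Refilled
FROM [CartridgeRefillMedication] crm
WHERE crm.[FK_CartridgeRefill_Id] = cr.RecentRefillID
) AS Refills
CROSS APPLY(
SELECT COALESCE(SUM(cda.[Amount]), 0) AS Dispensed
FROM [CartridgeDispenseAttempt] cda
WHERE cda.[FK_CartridgeRefill_Id] = cr.RecentRefillID
) AS Dispenses;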
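The behaviour for refills with no dispenses can be checked with a small sketch. SQLite has no CROSS APPLY, so this example (Python's sqlite3, in-memory database) swaps in correlated scalar subqueries, which play the same role here; that substitution, the reg_id alias, and the sample rows are assumptions for illustration. The table and column names follow the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CartridgeRefill (Id INTEGER PRIMARY KEY, FK_CartridgeRegistration_Id INTEGER);
CREATE TABLE CartridgeRefillMedication (FK_CartridgeRefill_Id INTEGER, Amount INTEGER);
CREATE TABLE CartridgeDispenseAttempt (FK_CartridgeRefill_Id INTEGER, Amount INTEGER);
-- Made-up data: cartridge 10 has refills 1 and 2, cartridge 20 has refill 3.
INSERT INTO CartridgeRefill VALUES (1, 10), (2, 10), (3, 20);
INSERT INTO CartridgeRefillMedication VALUES (1, 100), (2, 300), (2, 200), (3, 50);
-- Nothing was dispensed under refill 3.
INSERT INTO CartridgeDispenseAttempt VALUES (1, 5), (2, 7), (2, 1);
""")

query = """
WITH cr AS (
    SELECT FK_CartridgeRegistration_Id AS reg_id, MAX(Id) AS RecentRefillID
    FROM CartridgeRefill
    GROUP BY FK_CartridgeRegistration_Id
)
SELECT cr.reg_id,
       (SELECT COALESCE(SUM(crm.Amount), 0)
        FROM CartridgeRefillMedication crm
        WHERE crm.FK_CartridgeRefill_Id = cr.RecentRefillID) AS Refilled,
       (SELECT COALESCE(SUM(cda.Amount), 0)
        FROM CartridgeDispenseAttempt cda
        WHERE cda.FK_CartridgeRefill_Id = cr.RecentRefillID) AS Dispensed
FROM cr
ORDER BY cr.reg_id
"""
totals = conn.execute(query).fetchall()
print(totals)
```

Cartridge 20 still appears, with 0 dispensed, because the aggregate has no GROUP BY and therefore yields one row even when no dispense attempts match.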

Parsing Json Data from select query in SQL Server

I have a table called dbo.JsonData with a single varchar(max) column. It has an arbitrary number of rows, each holding a JSON string with a varying number of properties.
How can I create a query that will allow me to turn the result set from a select * query into a row/column result set?
Here is what I have tried:
SELECT *
FROM JSONDATA
FOR JSON Path
But it returns a single row of the json data all in a single column:
JSON_F52E2B61-18A1-11d1-B105-00805F49916B
[{"Json_Data":"{\"Serial_Number\":\"12345\",\"Gateway_Uptime\":17,\"Defrost_Cycles\":0,\"Freeze_Cycles\":2304,\"Float_Switch_Raw_ADC\":1328,\"Bin_status\":2304,\"Line_Voltage\":0,\"ADC_Evaporator_Temperature\":0,\"Mem_Sw\":1280,\"Freeze_Timer\":2560,\"Defrost_Timer\":593,\"Water_Flow_Switch\":3328,\"ADC_Mid_Temperature\":2560,\"ADC_Water_Temperature\":0,\"Ambient_Temperature\":71,\"Mid_Temperature\":1259,\"Water_Temperature\":1259,\"Evaporator_Temperature\":1259,\"Ambient_Temperature_Off_Board\":0,\"Ambient_Temperature_On_Board\":0,\"Gateway_Info\":\"{\\\"temp_sensor\\\":0.00,\\\"temp_pcb\\\":82.00,\\\"gw_uptime\\\":17.00,\\\"winc_fw\\\":\\\"19.5.4\\\",\\\"gw_fw_version\\\":\\\"0.0.0\\\",\\\"gw_fw_version_git\\\":\\\"2a75f20-dirty\\\",\\\"gw_sn\\\":\\\"328\\\",\\\"heap_free\\\":11264.00,\\\"gw_sig_csq\\\":0.00,\\\"gw_sig_quality\\\":0.00,\\\"wifi_sig_strength\\\":-63.00,\\\"wifi_resets\\\":0.00}\",\"ADC_Ambient_Temperature\":1120,\"Control_State\":\"Bin Full\",\"Compressor_Runtime\":134215680}"},{"Json_Data":"{\"Serial_Number\":\"12345\",\"Gateway_Uptime\":200,\"Defrost_Cycles\":559,\"Freeze_Cycles\":510,\"Float_Switch_Raw_ADC\":106,\"Bin_status\":0,\"Line_Voltage\":119,\"ADC_Evaporator_Temperature\":123,\"Mem_Sw\":113,\"Freeze_Timer\":0,\"Defrost_Timer\":66,\"Water_Flow_Switch\":3328,\"ADC_Mid_Temperature\":2560,\"ADC_Water_Temperature\":0,\"Ambient_Temperature\":71,\"Mid_Temperature\":1259,\"Water_Temperature\":1259,\"Evaporator_Temperature\":54,\"Ambient_Temperature_Off_Board\":0,\"Ambient_Temperature_On_Board\":0,\"Gateway_Info\":\"{\\\"temp_sensor\\\":0.00,\\\"temp_pcb\\\":82.00,\\\"gw_uptime\\\":199.00,\\\"winc_fw\\\":\\\"19.5.4\\\",\\\"gw_fw_version\\\":\\\"0.0.0\\\",\\\"gw_fw_version_git\\\":\\\"2a75f20-dirty\\\",\\\"gw_sn\\\":\\\"328\\\",\\\"heap_free\\\":10984.00,\\\"gw_sig_csq\\\":0.00,\\\"gw_sig_quality\\\":0.00,\\\"wifi_sig_strength\\\":-60.00,\\\"wifi_resets\\\":0.00}\",\"ADC_Ambient_Temperature\":1120,\"Control_State\":\"Defrost\",\"Compressor_Runtime
\":11304}"},{"Json_Data":"{\"Seri...
What am I missing?
I can't specify the columns explicitly because the json strings aren't always the same.
This is what I expect:
Serial_Number Gateway_Uptime Defrost_Cycles Freeze_Cycles Float_Switch_Raw_ADC Bin_status Line_Voltage ADC_Evaporator_Temperature Mem_Sw Freeze_Timer Defrost_Timer Water_Flow_Switch ADC_Mid_Temperature ADC_Water_Temperature Ambient_Temperature Mid_Temperature Water_Temperature Evaporator_Temperature Ambient_Temperature_Off_Board Ambient_Temperature_On_Board ADC_Ambient_Temperature Control_State Compressor_Runtime temp_sensor temp_pcb gw_uptime winc_fw gw_fw_version gw_fw_version_git gw_sn heap_free gw_sig_csq gw_sig_quality wifi_sig_strength wifi_resets LastModifiedDateUTC Defrost_Cycle_time Freeze_Cycle_time
12345 251402 540 494 106 0 98 158 113 221 184 0 0 0 1259 1259 1259 33 0 0 0 Freeze 10833 0 78 251402 19.5.4 0.0.0 2a75f20-dirty 328.00000000 10976 0 0 -61 0 2018-03-20 11:15:28.000 0 0
12345 251702 540 494 106 0 98 178 113 517 184 0 0 0 1259 1259 1259 22 0 0 0 Freeze 10838 0 78 251702 19.5.4 0.0.0 2a75f20-dirty 328.00000000 10976 0 0 -62 0 2018-03-20 11:15:42.000 0 0
...
Thank you,
Ron
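The shape of the transformation being asked for can be sketched outside the database. This is not a SQL Server answer (there, OPENJSON in SQL Server 2016 and later is the usual tool for shredding JSON); it is a minimal Python illustration of the two things any solution has to handle: Gateway_Info is itself a JSON string nested inside the outer document, and the column list has to be the union of the keys seen across rows. The sample rows are shortened, made-up versions of the data above, and flatten is a hypothetical helper.

```python
import json

def flatten(raw, nested_key="Gateway_Info"):
    """Parse one JSON row and merge the doubly encoded nested object into it."""
    record = json.loads(raw)
    nested = record.pop(nested_key, None)
    if isinstance(nested, str):
        record.update(json.loads(nested))
    return record

# Hypothetical rows: the second one lacks Gateway_Info and has an extra property.
raw_rows = [
    json.dumps({"Serial_Number": "12345", "Gateway_Uptime": 17,
                "Gateway_Info": json.dumps({"temp_pcb": 82.0, "gw_sn": "328"})}),
    json.dumps({"Serial_Number": "12345", "Gateway_Uptime": 200, "Line_Voltage": 119}),
]

records = [flatten(raw) for raw in raw_rows]

# Column list = union of keys, in first-seen order.
columns = []
for rec in records:
    for key in rec:
        if key not in columns:
            columns.append(key)

# Missing properties become None (NULL in a result set).
table = [[rec.get(col) for col in columns] for rec in records]
print(columns)
print(table)
```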

proc sql statement to sum on values/rows that match a condition

I have a data table like below:
Table 1:
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT
10 111 2009 . 100 .
110 120 2009 9 10 .
231 120 2009 0 20 10
222 120 2010 0 40 20
221 222 2009 102 10 30
321 222 2009 0 30 20
213 222 2009 0 10 20
432 321 2009 99 10 0
211 432 2009 111 20 10
212 432 2009 0 20 0
I want to sum over the DAYSBETVISIT column only when the pidDifference value is 0 for each PERSONID. So I wrote the following proc sql statement.
proc sql;
create table table5 as
(
select rowid, YEAR, PERSONID, pidDifference, TIMETOEVENT, DAYSBETVISIT,
SUM(CASE WHEN PIDDifference = 0 THEN DaysBetVisit ELSE 0 END)
from WORK.Table4_1
group by PERSONID,TIMETOEVENT, YEAR
);
quit;
However, the result I got was not summing the DAYSBETVISIT values in rows where PIDDifference = 0 within the same PERSONID. It just output the same value as was present in DAYSBETVISIT in that specific row.
The column that I NEED (sumdays) but don't get with the above statement (the column the above statement actually produces is shown as OUT):
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT sumdays OUT
10 111 2009 . 100 . 0 0
110 120 2009 9 10 . 0 0
231 120 2009 0 20 10 30 10
222 120 2010 0 40 20 30 20
221 222 2009 102 10 30 0 0
321 222 2009 0 30 20 40 20
213 222 2009 0 10 20 40 20
432 321 2009 99 10 0 0 0
211 432 2009 111 20 10 0 0
212 432 2009 0 20 0 0 0
I do not know what I am doing wrong.
I am using SAS EG Version 7.15, Base SAS version 9.4.
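For your example data it looks like you just need two CASE expressions: one to define which values to SUM() and another to decide whether to report the SUM or not. Group by PERSONID only; with TIMETOEVENT and YEAR in the GROUP BY, almost every row becomes its own group, so the sum is just that row's value. PROC SQL then remerges the group total back onto each detail row.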
proc sql ;
select personid, piddifference, daysbetvisit, sumdays
, case when piddifference = 0
then sum(case when piddifference=0 then daysbetvisit else 0 end)
else 0 end as WANT
from expect
group by personid
;
quit;
Results
pid
PERSONID Difference DAYSBETVISIT sumdays WANT
--------------------------------------------------------
111 . . 0 0
120 0 10 30 30
120 0 20 30 30
120 9 . 0 0
222 0 20 40 40
222 0 20 40 40
222 102 30 0 0
321 99 0 0 0
432 0 0 0 0
432 111 10 0 0
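Outside SAS, the same per-person total can be reproduced with a window function, which makes the logic easy to test. The sketch below uses Python's sqlite3 with an in-memory database (an assumption for runnability; SQLite does not remerge the way PROC SQL does, so SUM(...) OVER (PARTITION BY personid) is swapped in for the remerge). Missing values are stored as NULL, and the column is named row_id because rowid has a special meaning in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE expect (row_id INTEGER, personid INTEGER, "
    "piddifference INTEGER, daysbetvisit INTEGER)"
)
conn.executemany(
    "INSERT INTO expect VALUES (?, ?, ?, ?)",
    [(10, 111, None, None), (110, 120, 9, None), (231, 120, 0, 10),
     (222, 120, 0, 20), (221, 222, 102, 30), (321, 222, 0, 20),
     (213, 222, 0, 20), (432, 321, 99, 0), (211, 432, 111, 10), (212, 432, 0, 0)],
)

# Inner CASE picks the rows to add up; outer CASE decides where to report the total.
query = """
SELECT row_id,
       CASE WHEN piddifference = 0
            THEN SUM(CASE WHEN piddifference = 0 THEN daysbetvisit ELSE 0 END)
                 OVER (PARTITION BY personid)
            ELSE 0 END AS want
FROM expect
"""
want = dict(conn.execute(query).fetchall())
print(want)
```

The values keyed by row_id match the sumdays column the asker listed.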
SAS proc sql doesn't support window functions. I find the re-merging aggregations to be a bit difficult to use, except in the obvious cases. So, use a subquery or join and group by:
proc sql;
create table table5 as
select t.rowid, t.YEAR, t.PERSONID, t.pidDifference, t.TIMETOEVENT, t.DAYSBETVISIT,
case when t.pidDifference = 0 then tt.sum_DaysBetVisit else 0 end as sumdays
from WORK.Table4_1 t left join
(select personid, sum(case when pidDifference = 0 then DaysBetVisit else 0 end) as sum_DaysBetVisit
from WORK.Table4_1
group by personid
) tt
on tt.personid = t.personid;
quit;
Note: rows with a missing pidDifference fail the = 0 test, so they contribute nothing to the sum and get 0 in sumdays.
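The join-and-group-by shape can be tried on a few of the question's rows. This is a minimal sketch in Python's sqlite3 (an in-memory database standing in for SAS, which is an assumption for runnability); it sums DaysBetVisit per person over the rows where pidDifference = 0 and reports that total only on those rows. The table name table4_1 mirrors the question, and row_id replaces rowid because that name is special in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table4_1 (row_id INTEGER, personid INTEGER, "
    "piddifference INTEGER, daysbetvisit INTEGER)"
)
# A subset of the question's rows; None marks a missing value.
conn.executemany(
    "INSERT INTO table4_1 VALUES (?, ?, ?, ?)",
    [(110, 120, 9, None), (231, 120, 0, 10), (222, 120, 0, 20), (432, 321, 99, 0)],
)

query = """
SELECT t.row_id,
       CASE WHEN t.piddifference = 0 THEN tt.sum_days ELSE 0 END AS sumdays
FROM table4_1 t
LEFT JOIN (
    -- one total per person, counting only rows with piddifference = 0
    SELECT personid,
           SUM(CASE WHEN piddifference = 0 THEN daysbetvisit ELSE 0 END) AS sum_days
    FROM table4_1
    GROUP BY personid
) tt ON tt.personid = t.personid
ORDER BY t.row_id
"""
sumdays = conn.execute(query).fetchall()
print(sumdays)
```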