proc sql statement to sum values in rows that match a condition

I have a data table like below:
Table 1:
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT
10 111 2009 . 100 .
110 120 2009 9 10 .
231 120 2009 0 20 10
222 120 2010 0 40 20
221 222 2009 102 10 30
321 222 2009 0 30 20
213 222 2009 0 10 20
432 321 2009 99 10 0
211 432 2009 111 20 10
212 432 2009 0 20 0
I want to sum over the DAYSBETVISIT column only when the pidDifference value is 0 for each PERSONID. So I wrote the following proc sql statement.
proc sql;
create table table5 as
(
select rowid, YEAR, PERSONID, pidDifference, TIMETOEVENT, DAYSBETVISIT,
SUM(CASE WHEN PIDDifference = 0 THEN DaysBetVisit ELSE 0 END)
from WORK.Table4_1
group by PERSONID,TIMETOEVENT, YEAR
);
quit;
However, the result I got was not summing the DAYSBETVISIT values in rows where PIDDifference = 0 within the same PERSONID. It just output the same value as was present in DAYSBETVISIT in that specific row.
The column I need (sumdays) is shown below, next to the column the above statement actually produces (OUT):
ROWID PERSONID YEAR pidDifference TIMETOEVENT DAYSBETVISIT sumdays OUT
10 111 2009 . 100 . 0 0
110 120 2009 9 10 . 0 0
231 120 2009 0 20 10 30 10
222 120 2010 0 40 20 30 20
221 222 2009 102 10 30 0 0
321 222 2009 0 30 20 40 20
213 222 2009 0 10 20 40 20
432 321 2009 99 10 0 0 0
211 432 2009 111 20 10 0 0
212 432 2009 0 20 0 0 0
I do not know what I am doing wrong.
I am using SAS EG Version 7.15, Base SAS version 9.4.

For your example data it looks like you just need two CASE expressions: an inner one to define which values go into SUM(), and an outer one to decide whether a given row reports the sum or 0.
proc sql ;
select personid, piddifference, daysbetvisit, sumdays
, case when piddifference = 0
then sum(case when piddifference=0 then daysbetvisit else 0 end)
else 0 end as WANT
from expect
group by personid
;
quit;
Results
pid
PERSONID Difference DAYSBETVISIT sumdays WANT
--------------------------------------------------------
111 . . 0 0
120 0 10 30 30
120 0 20 30 30
120 9 . 0 0
222 0 20 40 40
222 0 20 40 40
222 102 30 0 0
321 99 0 0 0
432 0 0 0 0
432 111 10 0 0

SAS proc sql doesn't support window functions. I find its automatic re-merging of aggregates a bit difficult to use, except in the obvious cases. So, use a subquery or join and group by:
proc sql;
create table table5 as
select t.rowid, t.YEAR, t.PERSONID, t.pidDifference, t.TIMETOEVENT, t.DAYSBETVISIT,
/* report the per-person total only on rows where pidDifference is 0 */
case when t.pidDifference = 0 then tt.sum_DaysBetVisit else 0 end as sumdays
from WORK.Table4_1 t left join
/* per-person total of DaysBetVisit over the rows where pidDifference is 0 */
(select personid,
sum(case when pidDifference = 0 then DaysBetVisit else 0 end) as sum_DaysBetVisit
from WORK.Table4_1
group by personid
) tt
on tt.personid = t.personid;
quit;
Note: A missing pidDifference never satisfies pidDifference = 0, so those rows add nothing to the total and report 0.
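As a runnable check of the join-plus-group-by idea, here is a minimal sketch that uses a conditional sum in the subquery and reproduces the sumdays column from the question. It runs in SQLite through Python's sqlite3 module rather than in SAS, so it is only a stand-in: the table name table4_1 and the column name row_id are assumptions made for the demo, and None stands in for a SAS missing value.

```python
import sqlite3

# Sample rows copied from the question:
# (row_id, personid, year, pidDifference, timetoevent, daysbetvisit)
# None stands in for a SAS missing value (".").
rows = [
    (10, 111, 2009, None, 100, None),
    (110, 120, 2009, 9, 10, None),
    (231, 120, 2009, 0, 20, 10),
    (222, 120, 2010, 0, 40, 20),
    (221, 222, 2009, 102, 10, 30),
    (321, 222, 2009, 0, 30, 20),
    (213, 222, 2009, 0, 10, 20),
    (432, 321, 2009, 99, 10, 0),
    (211, 432, 2009, 111, 20, 10),
    (212, 432, 2009, 0, 20, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table4_1 "
    "(row_id, personid, year, pidDifference, timetoevent, daysbetvisit)"
)
conn.executemany("INSERT INTO table4_1 VALUES (?, ?, ?, ?, ?, ?)", rows)

# The subquery computes one conditional total per person; the outer CASE
# reports that total only on rows whose own pidDifference is 0.
query = """
SELECT t.row_id,
       CASE WHEN t.pidDifference = 0 THEN tt.sum_days ELSE 0 END AS sumdays
FROM table4_1 AS t
LEFT JOIN (
    SELECT personid,
           SUM(CASE WHEN pidDifference = 0 THEN daysbetvisit ELSE 0 END) AS sum_days
    FROM table4_1
    GROUP BY personid
) AS tt ON tt.personid = t.personid
"""
sumdays = dict(conn.execute(query).fetchall())
for row_id, total in sorted(sumdays.items()):
    print(row_id, total)
```

The same SELECT should translate to proc sql with little change, but that translation is not tested here.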

Related

Properly 'Joining' two Cross Applies

I've got a query with three Cross-Applies that gather data from three different tables. The first Cr-Ap assists the 2nd and 3rd Cr-Ap's. It finds the most recent ID of a certain refill for a 'cartridge', the higher the ID the more recent the refill.
The second and third Cr-Ap's gather the SUMS of items that have been refilled and items that have been dispensed under the most recent Refill.
If I run the query for Cr-Ap 2 or 3 separately the output would look something like:
ID Amount
1 100
2 1000
3 100
4 0
5 0
etc
Amount would be either the amount of dispensed or refilled items.
Only I don't want to run these queries separately, I want them next to each other.
So what I want is a table that looks like this:
ID Refill Dispense
1 100 1
2 1000 5
3 100 7
4 0 99
5 0 3
etc
My gut tells me to do
INNER JOIN crossapply2 ON crossapply3.ID = crossapply2.ID
But this doesn't work. I'm still new to SQL, so I don't exactly know what I can and can't join. What I do know is that you can use a cross apply as a join (sorta?). I think that might be what I need to do here, I just don't know how.
But that's not all. There's another complication: there are certain refills where nothing gets dispensed. In these scenarios the cross apply I wrote for dispenses won't return anything for that refillID. By nothing I don't mean NULL, I mean it just skips the refillID. But I'd like to see a 0 in those cases. Because it just skips over those IDs, I can't get COALESCE or ISNULL to work. This might also complicate the joining of these two applies, because an INNER JOIN would skip any line where there is no Dispensed amount, even though there is a Refilled amount I'd like to see.
Here is my code:
-- Dispensed SUM and Refilled SUM combined
SELECT [CartridgeRefill].[FK_CartridgeRegistration_Id]
,Refills.Refilled
,Dispenses.Dispensed
FROM [CartridgeRefill]
CROSS APPLY(
SELECT MAX([CartridgeRefill].[Id]) AS RecentRefillID
FROM [CartridgeRefill]
GROUP BY [CartridgeRefill].[FK_CartridgeRegistration_Id]
) AS RecentRefill
CROSS APPLY(
SELECT [CartridgeRefill].[FK_CartridgeRegistration_Id] AS RefilledID
,SUM([CartridgeRefillMedication].[Amount]) AS Refilled
FROM [CartridgeRefillMedication]
INNER JOIN [CartridgeRefill] ON [CartridgeRefillMedication].[FK_CartridgeRefill_Id] = [CartridgeRefill].[Id]
WHERE [CartridgeRefillMedication].[FK_CartridgeRefill_Id] = RecentRefill.RecentRefillID
GROUP BY [CartridgeRefill].[FK_CartridgeRegistration_Id]
) AS Refills
CROSS APPLY(
SELECT [CartridgeRefill].[FK_CartridgeRegistration_Id] AS DispensedID
,SUM([CartridgeDispenseAttempt].[Amount]) AS Dispensed
FROM [CartridgeDispenseAttempt]
INNER JOIN [CartridgeRefill] ON [CartridgeDispenseAttempt].[FK_CartridgeRefill_Id] = [CartridgeRefill].[Id]
WHERE [CartridgeDispenseAttempt].[FK_CartridgeRefill_Id] = RecentRefill.RecentRefillID
GROUP BY [CartridgeRefill].[FK_CartridgeRegistration_Id]
) AS Dispenses
GO
The output of this code is as follows:
1 300 1
1 300 1
1 200 194
1 200 194
1 200 8
1 200 8
1 0 39
1 0 39
1 100 14
1 100 14
1 200 1
1 200 1
1 0 28
1 0 28
1 1000 102
1 1000 102
1 1000 557
1 1000 557
1 2000 92
1 2000 92
1 100 75
1 100 75
1 100 100
1 100 100
1 100 51
1 100 51
1 600 28
1 600 28
1 200 47
1 200 47
1 200 152
1 200 152
1 234 26
1 234 26
1 0 227
1 0 227
1 10 6
1 10 6
1 300 86
1 300 86
1 0 194
1 0 194
1 500 18
1 500 18
1 1000 51
1 1000 51
1 1000 56
1 1000 56
1 500 48
1 500 48
1 0 10
1 0 10
1 1500 111
1 1500 111
1 56 79
1 56 79
1 100 6
1 100 6
1 44 134
1 44 134
1 1000 488
1 1000 488
1 100 32
1 100 32
1 100 178
1 100 178
1 500 672
1 500 672
1 200 26
1 200 26
1 500 373
1 500 373
1 100 10
1 100 10
1 900 28
1 900 28
2 900 28
2 900 28
2 900 28
etc
It is total nonsense that I can't do much with, it goes on for about 20k lines and goes through all the ID's, eventually.
Any help is more than appreciated :)
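It looks like you have overcomplicated this a bit. Find the most recent refill per registration once, then apply the two sums to that row. Because each APPLY below computes an aggregate without a GROUP BY, it always returns exactly one row, so refills with nothing dispensed are not skipped; their sum comes back as NULL, which ISNULL(..., 0) can turn into 0.
Try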
WITH cr AS (
SELECT [FK_CartridgeRegistration_Id]
,MAX([CartridgeRefill].[Id]) RecentRefillID
FROM [CartridgeRefill]
GROUP BY [FK_CartridgeRegistration_Id]
)
SELECT cr.[FK_CartridgeRegistration_Id], Refills.Refilled, Dispenses.Dispensed
FROM cr
CROSS APPLY(
SELECT SUM(crm.[Amount]) AS Refilled
FROM [CartridgeRefillMedication] crm
WHERE crm.[FK_CartridgeRefill_Id] = cr.RecentRefillID
) AS Refills
CROSS APPLY(
SELECT SUM(cda.[Amount]) AS Dispensed
FROM [CartridgeDispenseAttempt] cda
WHERE cda.[FK_CartridgeRefill_Id] = cr.RecentRefillID
) AS Dispenses;
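To see why dropping the GROUP BY matters, here is a small runnable sketch. SQLite has no CROSS APPLY, so this stand-in uses correlated scalar subqueries instead, which behave the same way for a single aggregate: with no matching rows the SUM is NULL, and COALESCE turns it into 0. The sample rows and the short column aliases are made up for illustration; only the table and foreign-key names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CartridgeRefill (Id INTEGER, FK_CartridgeRegistration_Id INTEGER);
CREATE TABLE CartridgeRefillMedication (FK_CartridgeRefill_Id INTEGER, Amount INTEGER);
CREATE TABLE CartridgeDispenseAttempt (FK_CartridgeRefill_Id INTEGER, Amount INTEGER);

-- Hypothetical data: registration 1 has refills 1 and 2, registration 2 has refill 3.
INSERT INTO CartridgeRefill VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO CartridgeRefillMedication VALUES (1, 50), (2, 100), (2, 200), (3, 100);
-- Nothing has been dispensed yet from refill 3.
INSERT INTO CartridgeDispenseAttempt VALUES (1, 5), (2, 1), (2, 4);
""")

query = """
WITH cr AS (
    SELECT FK_CartridgeRegistration_Id AS reg_id,
           MAX(Id) AS recent_refill_id
    FROM CartridgeRefill
    GROUP BY FK_CartridgeRegistration_Id
)
SELECT cr.reg_id,
       COALESCE((SELECT SUM(crm.Amount)
                 FROM CartridgeRefillMedication AS crm
                 WHERE crm.FK_CartridgeRefill_Id = cr.recent_refill_id), 0) AS refilled,
       COALESCE((SELECT SUM(cda.Amount)
                 FROM CartridgeDispenseAttempt AS cda
                 WHERE cda.FK_CartridgeRefill_Id = cr.recent_refill_id), 0) AS dispensed
FROM cr
ORDER BY cr.reg_id
"""
result = conn.execute(query).fetchall()
for reg_id, refilled, dispensed in result:
    print(reg_id, refilled, dispensed)
```

Registration 2 still appears, with a dispensed amount of 0, even though its most recent refill has no dispense attempts.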

T SQL CTE Previous Row Calculation

I'm using SQL Server 2016.
I have the below table:
SKU Shop Week ShopPriority Replen Open_Stk Open_Stk Calc
111 100 1 1 0 17 NULL
111 200 1 2 2 NULL NULL
111 300 1 3 0 NULL NULL
111 400 1 4 0 NULL NULL
222 100 2 1 5 17 NULL
222 200 2 2 5 NULL NULL
222 300 2 3 5 NULL NULL
222 400 2 4 5 NULL NULL
This is the desired result:
SKU Shop Week ShopPriority Replen Open_Stk Open_Stk Calc
111 100 1 1 0 17 17
111 200 1 2 2 NULL 17
111 300 1 3 0 NULL 15
111 400 1 4 0 NULL 15
222 100 2 1 20 17 17
222 200 2 2 15 NULL 12
222 300 2 3 12 NULL 7
222 400 2 4 10 NULL 2
I need to update the 'Open_Stk Calc' based on the previous row:
'Open_Stk Calc' - IIF('Replen'<=IIF('Open_Stk'>=0,'Open_Stk',0),'Replen',0)
I am using a CTE to update a row based on a calculation of the previous rows. This is my SQL:
;WITH CTE AS
(
SELECT
SKU,
[Shop],
[Week],
[Store_Priority],
[Replen],
[Open_Stk],
[Open_Stk Calc],
FIRST_VALUE([Open_Stk]) OVER ( PARTITION BY [SKU] ,[Week] ORDER BY [Store_Priority] ROWS UNBOUNDED PRECEDING)
-
ISNULL(SUM(IIF([Replen] <= IIF([Open_Stk]>=0,[Open_Stk],0),[Replen],0))
OVER (PARTITION BY [SKU] ,[Week] ORDER BY [Store_Priority] ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS CurrentStock
FROM [tblTEST])
UPDATE CTE
SET [Open_Stk Calc] = CurrentStock
However, this produces the following result:
SKU Shop Week ShopPriority Replen Open_Stk Open_Stk Calc
111 100 1 1 0 17 17
111 200 1 2 2 NULL 17
111 300 1 3 0 NULL 17
111 400 1 4 0 NULL 17
And not the desired result - where have I gone wrong?
As the Microsoft documentation shows, the OVER clause can only be applied to specific kinds of functions:
Ranking functions
Aggregate functions
Analytic functions
NEXT VALUE FOR function
IIF is not one of them, as Luis Cazares noted in a comment.
Your code suggests you know what you are doing. Maybe you forgot to put your IIF inside a SUM?
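For reference, the desired column in the question is recursive: each row's value depends on the value computed for the previous row. Below is a minimal Python sketch of one possible reading of the formula, written only to pin down the expected numbers. It assumes that the first shop per SKU and week carries the opening stock, and that each later row subtracts the previous row's Replen when it fits into the stock still available. The function name open_stock_calc and the dict keys are hypothetical, and the Replen values are taken from the input table.

```python
def open_stock_calc(rows):
    """Return {(sku, shop): stock} following the previous-row rule.

    Assumption: within each (sku, week) group, ordered by priority, the
    first row has a non-null open_stk that seeds the running stock.
    """
    ordered = sorted(rows, key=lambda r: (r["sku"], r["week"], r["priority"]))
    result = {}
    current_key = None
    stock = None
    prev_replen = 0
    for r in ordered:
        key = (r["sku"], r["week"])
        if key != current_key:
            # New SKU/week group: start from its opening stock.
            current_key = key
            stock = r["open_stk"]
        else:
            # Subtract the previous row's replenishment only if it fits
            # into the stock that is still available (never below zero).
            available = max(stock, 0)
            if prev_replen <= available:
                stock -= prev_replen
        result[(r["sku"], r["shop"])] = stock
        prev_replen = r["replen"]
    return result


# Input rows from the question (SKU, Shop, Week, priority, Replen, Open_Stk).
sample = [
    {"sku": 111, "shop": 100, "week": 1, "priority": 1, "replen": 0, "open_stk": 17},
    {"sku": 111, "shop": 200, "week": 1, "priority": 2, "replen": 2, "open_stk": None},
    {"sku": 111, "shop": 300, "week": 1, "priority": 3, "replen": 0, "open_stk": None},
    {"sku": 111, "shop": 400, "week": 1, "priority": 4, "replen": 0, "open_stk": None},
    {"sku": 222, "shop": 100, "week": 2, "priority": 1, "replen": 5, "open_stk": 17},
    {"sku": 222, "shop": 200, "week": 2, "priority": 2, "replen": 5, "open_stk": None},
    {"sku": 222, "shop": 300, "week": 2, "priority": 3, "replen": 5, "open_stk": None},
    {"sku": 222, "shop": 400, "week": 2, "priority": 4, "replen": 5, "open_stk": None},
]
calc = open_stock_calc(sample)
for key in sorted(calc):
    print(key, calc[key])
```

Under these assumptions the sketch matches the Open_Stk Calc column of the desired result for both SKUs.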

Oracle - Group By Creating Duplicate Rows

I have a query that looks like this:
select nvl(trim(a.code), 'Blanks') as Ward, count(b.apcasekey) as UNSP, count(c.apcasekey) as GRAPH,
count(d.apcasekey) as "ANI/PIG",
(count(b.apcasekey) + count(c.apcasekey) + count(d.apcasekey)) as "TOTAL ACTIVE",
count(a.apcasekey) as "TOTAL OPEN" from (etc...)
group by a.code
order by Ward
The reason I have nvl(trim(a.code), 'Blanks') as Ward is that sometimes a.code is a blank string, sometimes it's a null.
The problem is that when I use the Group By statement, I can't use Ward or I get the error
Ward: Invalid Identifier
I can only use a.code so I get 2 rows for 'Blanks', as per below
1 Blanks 7 0 0 7 7
2 Blanks 23 1 1 25 30
3 W01 75 4 0 79 91
4 W02 62 1 0 63 72
5 W03 140 2 0 142 162
6 W04 6 1 0 7 7
7 W05 46 0 1 47 48
8 W06 322 46 1 369 425
9 W07 91 0 1 92 108
10 W08 93 2 0 95 104
11 W09 28 1 0 29 30
12 W10 25 0 0 25 28
What I need is for the rows with 'Blanks' to be combined into 1 row. Little help?
Thanks.
You cannot use the column alias in the GROUP BY clause, but you can group by the expression that builds the value:
GROUP BY nvl(trim(a.code), 'Blanks')
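The effect of grouping by the expression rather than the raw column can be tried out with a small sketch. It uses SQLite through Python, not Oracle, so two things differ and are assumptions of the demo: SQLite does not treat an empty string as NULL, so NULLIF(TRIM(code), '') is added to mimic Oracle's behaviour, and COALESCE stands in for nvl. The cases table and its rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases (code TEXT)")
# Hypothetical data: one blank code, one NULL code, two real ward codes.
conn.executemany("INSERT INTO cases VALUES (?)", [(" ",), (None,), ("W01",), ("W01",)])

ward_expr = "COALESCE(NULLIF(TRIM(code), ''), 'Blanks')"

# Grouping by the raw column keeps the blank and NULL codes apart,
# so 'Blanks' shows up twice.
by_code = conn.execute(
    f"SELECT {ward_expr} AS ward, COUNT(*) FROM cases GROUP BY code ORDER BY ward"
).fetchall()

# Grouping by the same expression that builds the label merges them.
by_expr = conn.execute(
    f"SELECT {ward_expr} AS ward, COUNT(*) FROM cases GROUP BY {ward_expr} ORDER BY ward"
).fetchall()

print(by_code)
print(by_expr)
```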

SUM column values based on two rows in the same tables in SQL

I have one table like below in my SQL server.
Trans_id br_code bill_no amount
1 22 111 10
2 22 111 20
3 22 111 30
4 22 111 40
5 22 111 10
6 23 112 20
7 23 112 20
8 23 112 20
9 23 112 30
and I want desired output like below table
s.no br_code bill_no amount
1 22 111 110
2 23 112 90
Try this (replace TABLE with your table name):
select br_code, bill_no, sum(amount) as amount
from TABLE
group by br_code, bill_no
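A quick way to check the grouping against the sample data is the following sketch, which runs the same kind of query in SQLite through Python. The table name trans is an assumption for the demo; the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trans (trans_id INTEGER, br_code INTEGER, bill_no INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO trans VALUES (?, ?, ?, ?)",
    [
        (1, 22, 111, 10), (2, 22, 111, 20), (3, 22, 111, 30),
        (4, 22, 111, 40), (5, 22, 111, 10),
        (6, 23, 112, 20), (7, 23, 112, 20), (8, 23, 112, 20), (9, 23, 112, 30),
    ],
)

# One output row per (br_code, bill_no) pair with the summed amount.
totals = conn.execute(
    "SELECT br_code, bill_no, SUM(amount) AS amount "
    "FROM trans GROUP BY br_code, bill_no ORDER BY br_code, bill_no"
).fetchall()

# Number the rows in Python to mimic the s.no column of the desired output.
for s_no, (br_code, bill_no, amount) in enumerate(totals, start=1):
    print(s_no, br_code, bill_no, amount)
```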

Oracle SQL Query one row with monthly value only, but need to group it by month

I can't figure out the correct query on Oracle SQL
I have the following data:
Name Monthly_amount Start_date
Bob 100 April 2014
Mike 120 June 2014
Steve 80 Sept 2014
Bob 50 Dec 2014
And I would like to get the following result
Name |Jan-14| Feb-14| Mar-14| Apr-14 |May-14| Jun-14| Jul-14 |Aug-14|Sep-14|Oct-14| Nov-14| Dec-14
Bob 0 0 0 100 100 100 100 100 100 100 100 150
Mike 0 0 0 0 0 120 120 120 120 120 120 120
Steve 0 0 0 0 0 0 0 0 80 80 80 80
Something along the lines of this should work for you. If you need the column definitions to be dynamic, you would have to build the statement dynamically, but it's better to do that kind of thing in your application.
SELECT Name,
SUM(CASE WHEN Start_date <= TO_DATE('01-JAN-2014','DD-MON-YYYY')
THEN Monthly_Amount ELSE 0 END) AS Jan14,
SUM(CASE WHEN Start_date <= TO_DATE('01-FEB-2014','DD-MON-YYYY')
THEN Monthly_Amount ELSE 0 END) AS Feb14,
(etc.)
FROM table1
GROUP BY Name;
(This assumes Name uniquely defines a person)
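As a sketch of building the monthly columns in the application, the snippet below generates the twelve SUM(CASE ...) expressions in Python and runs them against SQLite. This is a stand-in, not Oracle: dates are stored as ISO strings (first day of the start month) so that plain string comparison orders them correctly, and the column aliases m01 to m12 are made up for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (name TEXT, monthly_amount INTEGER, start_date TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("Bob", 100, "2014-04-01"),
        ("Mike", 120, "2014-06-01"),
        ("Steve", 80, "2014-09-01"),
        ("Bob", 50, "2014-12-01"),
    ],
)

# One conditional sum per month: an amount counts from its start month onward.
month_columns = ",\n".join(
    f"SUM(CASE WHEN start_date <= '2014-{m:02d}-01' "
    f"THEN monthly_amount ELSE 0 END) AS m{m:02d}"
    for m in range(1, 13)
)
query = f"SELECT name, {month_columns} FROM table1 GROUP BY name ORDER BY name"

pivot = {row[0]: list(row[1:]) for row in conn.execute(query)}
for name, amounts in pivot.items():
    print(name, amounts)
```

The month boundaries are generated from fixed integers here, so no user input ends up in the SQL text; anything user-supplied should go through bind parameters instead.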