Utilizing Case When & Possibly "Lead" or "Lag" - sql

I know what I want the data to display but can't seem to figure out the correct logic for it. Example below.
Dataset
**'ID', 'Admission Mth', 'Admission Yr', 'Category', 'Facility', 'ID_Yr_Cat', 'ID_Yr_Cat_Fac',**
'123456', 'Jan', '2017', 'Hospital', 'NYMC', '123456-2017-Hospital', '123456-2017-Hospital-NYMC',
'123456', 'Jul', '2017', 'Hospital', 'NYMC', '123456-2017-Hospital', '123456-2017-Hospital-NYMC',
'123456', 'Oct', '2018', 'Hospital', 'NYMC', '123456-2018-Hospital', '123456-2018-Hospital-NYMC',
'123456', 'Nov', '2018', 'Hospital', 'NJMC', '123456-2018-Hospital', '123456-2018-Hospital-NJMC',
'789123', 'Feb', '2017', 'Clinic', 'Philly Clinic', '789123-2017-Clinic', '789123-2017-Clinic-Philly Clinic',
'987654', 'May', '2018', 'Hospital', 'PAMC', '987654-2018-Hospital', '987654-2018-Hospital-PAMC',
'456123', 'Sept', '2017', 'Clinic', 'Philly Clinic', '456123-2017-Clinic', '456123-2017-Clinic-Philly Clinic',
'456123', 'Aug', '2018', 'Hospital', 'NYMC', '456123-2018-Hospital', '456123-2018-Hospital-NYMC',
'456123', 'Nov', '2018', 'Hospital', 'NYMC', '456123-2018-Hospital', '456123-2018-Hospital-NYMC',
'456123', 'Dec', '2018', 'Hospital', 'NJMC', '456123-2018-Hospital', '456123-2018-Hospital-NJMC'
I want the final results to display "1" flags for the hospital readmit.
Final results should show:
**'Hospital Readmit per Yr'**,
'0',
'1',
'0',
'1',
'0',
'0',
'0',
'0',
'1',
'1'
**'Hospital Readmit per Yr & Fac'**,
'0',
'1',
'0',
'0',
'0',
'0',
'0',
'0',
'1',
'0'
My thought was to use some sort of CASE WHEN with a LEAD function and a PARTITION BY clause, but I'm not sure how to write it out. I'm using MS SQL Server 2008.

This can be achieved either with CASE WHEN or with the LEAD or LAG functions. However, since you mentioned you are using SQL Server 2008, LEAD and LAG will not work: they are analytic functions that are only available in SQL Server 2012 and later. ROW_NUMBER() is available in 2008, so you may want to try something like this with CASE WHEN.
For getting [Hospital Readmit per Yr] flag value:
SELECT
CASE WHEN A.[RowNumber]>1 THEN 1 else 0 END As [Hospital Readmit per Yr]
FROM (select ROW_NUMBER() OVER (PARTITION BY [ID_Yr_Cat] order by [Admission Mth], [Admission Yr]) as [RowNumber], * from hospital) as A
For getting [Hospital Readmit per Yr & Fac] flag value:
SELECT
CASE WHEN A.[RowNumber]>1 THEN 1 else 0 END As [Hospital Readmit per Yr & Fac]
FROM (select ROW_NUMBER() OVER (PARTITION BY [ID_Yr_Cat_Fac] order by [Admission Mth], [Admission Yr]) as [RowNumber], * from hospital) as A
In case you want to look at ALL the columns to understand how the query is returning results, select A.* (including [RowNumber]) alongside the flag, since the subquery already returns every column.
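One caveat with the queries above: [Admission Mth] holds month abbreviations as text, so ordering by it sorts 'Nov' before 'Oct', and a clinic with two admissions in a year would also be flagged. A minimal sketch that orders chronologically and limits the flags to hospital rows might look like the following. It keeps the table name hospital from the answer; the month mapping (accepting both 'Sep' and 'Sept') and the aliases M, R, MthNo, RnYr and RnYrFac are assumptions for illustration. It only relies on ROW_NUMBER(), which SQL Server 2008 supports.

```sql
-- Sketch for SQL Server 2008: both flags in one pass, ordered by calendar month.
SELECT R.*,
       CASE WHEN R.[Category] = 'Hospital' AND R.[RnYr] > 1
            THEN 1 ELSE 0 END AS [Hospital Readmit per Yr],
       CASE WHEN R.[Category] = 'Hospital' AND R.[RnYrFac] > 1
            THEN 1 ELSE 0 END AS [Hospital Readmit per Yr & Fac]
FROM (
    SELECT M.*,
           -- The year is part of both partition keys, so the month number
           -- alone gives chronological order inside each partition.
           ROW_NUMBER() OVER (PARTITION BY M.[ID_Yr_Cat]
                              ORDER BY M.[MthNo]) AS [RnYr],
           ROW_NUMBER() OVER (PARTITION BY M.[ID_Yr_Cat_Fac]
                              ORDER BY M.[MthNo]) AS [RnYrFac]
    FROM (
        SELECT h.*,
               -- Assumed spellings of the month abbreviations.
               CASE h.[Admission Mth]
                   WHEN 'Jan'  THEN 1  WHEN 'Feb' THEN 2  WHEN 'Mar' THEN 3
                   WHEN 'Apr'  THEN 4  WHEN 'May' THEN 5  WHEN 'Jun' THEN 6
                   WHEN 'Jul'  THEN 7  WHEN 'Aug' THEN 8  WHEN 'Sep' THEN 9
                   WHEN 'Sept' THEN 9  WHEN 'Oct' THEN 10 WHEN 'Nov' THEN 11
                   WHEN 'Dec'  THEN 12
               END AS [MthNo]
        FROM hospital AS h
    ) AS M
) AS R
ORDER BY R.[ID], R.[Admission Yr], R.[MthNo];
```

Every admission after the first within the same ID, year and category (or the same ID, year, category and facility) gets a 1, which is the pattern the expected flags in the question follow.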

Find third last event in table - Incomplete output

As my research into Firebird continues, I've attempted to improve some of my queries. As I use Libreoffice Base, I'm not 100% sure how the data entry code works, but I believe it's something like this:
CREATE TABLE "Data Entry"(
ID int,
Date date,
"Vehicle Type" varchar,
events int,
"Hours 1" int,
"Hours 2" int
);
INSERT INTO "Data Entry" VALUES
(1, '31/12/22', 'A', '1', '0', '1'),
(2, '31/12/22', 'A', '1', '0', '1'),
(3, '29/12/22', 'A', '3', '0', '1'),
(4, '25/06/22', 'B1', '1', '0', '1'),
(5, '24/06/22' , 'B1', '1', '1', '0'),
(6, '24/06/22' , 'B1', '1', '1', '0'),
(7, '31/12/22' , 'B2', '7', '0', '1'),
(8, '29/12/22' , 'C', '1', '0', '1'),
(9, '29/12/22' , 'C', '2', '0', '1'),
(10, '19/01/22' , 'D1', '5', '1', '0'),
(11, '23/01/22' , 'D2', '6', '1', '1'),
(12, '29/07/19' , 'D3', '5', '0', '1'),
(13, '21/12/22' , 'D4', '1', '0', '1'),
(14, '19/12/22' , 'D4', '1', '1', '1'),
(15, '19/12/22' , 'D4', '1', '0', '1'),
(16, '28/12/22' , 'E', '2', '0', '1'),
(17, '24/12/22' , 'E', '3', '0', '1'),
(18, '14/07/07' , '1', '0', '0', '1'),
(19, '22/12/22' , '2', '1', '0', '1');
I tried this through the online Fiddle pages, but it throws up errors, so either I'm doing it incorrectly, or it's because there was no option for Firebird. Hopefully irrelevant, as I have the table already through the front-end.
One of my earlier queries which works as expected is shown below, along with its output:
SELECT
"Vehicle Type",
DATEDIFF(DAY, "Date", CURRENT_DATE) AS "Days Since 3rd Last Event"
FROM
(
SELECT
"Date",
"Events",
"Vehicle Type",
"Event Count",
ROW_NUMBER() OVER (PARTITION BY "Vehicle Type" ORDER BY "Date" DESC) AS "rn"
FROM
(
SELECT
"Date",
"Events",
"Vehicle Type",
SUM("Events") OVER (PARTITION BY "Vehicle Type" ORDER BY "Date" DESC) AS "Event Count"
FROM "Data Entry"
)
WHERE "Event Count" >= 3
)
WHERE "rn" = 1
Vehicle Type | Days Since 3rd Last Event
A | 3
B1 | 191
B2 | 1
C | 3
D1 | 347
D2 | 343
D3 | 1252
D4 | 14
E | 8
In this output, it does not list every vehicle because not all vehicles have an Event Count that is equal to or greater than 3. The new query I am trying to put together is a combination of different queries (omitted to keep things relevant, plus they already work on their own), with a rewrite of the above code as well:
SELECT
"Vehicle Type",
SUM("Hours 1" + "Hours 2") AS "Total Hours",
MAX(CASE
WHEN
"Total Events" = 3
THEN
DATEDIFF(DAY, "Date", CURRENT_DATE)
END
) "Days Since 3rd Last Event"
FROM
(
SELECT
"Vehicle Type",
"Date",
"Hours 1",
"Hours 2",
CASE
WHEN
"Events" > 0
THEN
SUM( "Events")
OVER(
PARTITION BY "Vehicle Type"
ORDER BY "Date" DESC
)
END
"Total Events"
FROM
"Data Entry"
)
GROUP BY "Vehicle Type"
ORDER BY "Vehicle Type"
The expected output should be:
Vehicle Type | Days Since 3rd Last Event | Total Hours
1 | | 1
2 | | 1
A | 3 | 3
B1 | 191 | 3
B2 | 1 | 1
C | 3 | 2
D1 | 347 | 1
D2 | 343 | 2
D3 | 1252 | 1
D4 | 14 | 4
E | 8 | 2
However, the actual output is:
Vehicle Type | Days Since 3rd Last Event | Total Hours
1 | | 1
2 | | 1
A | | 3
B1 | 191 | 3
B2 | | 1
C | 3 | 2
D1 | | 1
D2 | | 2
D3 | | 1
D4 | 14 | 4
E | | 2
Granted, I've mixed and matched code, made some up myself, and copied some parts from elsewhere online, so there's a good chance I've not understood something correctly and blindly added it in thinking it would work, but now I'm at a loss as to what that could be. I've had a play around with changing the values of the WHEN statements and altering the operators between =, >, and >=, but any deviation from what's currently shown above outputs incorrect numbers. At least the three numbers displayed in the actual output are correct.
You could try using two rankings:
- the first one catches the last three rows for each vehicle type
- the second one catches the oldest row among those (up to) three
Then get your date differences.
WITH last_three AS (
SELECT "Vehicle Type", "Date",
SUM("Hours 1"+"Hours 2") OVER(PARTITION BY "Vehicle Type") AS "Total Hours",
ROW_NUMBER() OVER(PARTITION BY "Vehicle Type" ORDER BY "Date" DESC) AS rn
FROM "Data Entry"
), last_third AS (
SELECT "Vehicle Type", "Date", "Total Hours",
ROW_NUMBER() OVER(PARTITION BY "Vehicle Type" ORDER BY rn DESC) AS rn2
FROM last_three
WHERE rn <= 3
)
SELECT "Vehicle Type",
DATEDIFF(DAY, "Date", CURRENT_DATE) AS "Days Since 3rd Last Event",
"Total Hours"
FROM last_third
WHERE rn2 = 1
ORDER BY "Vehicle Type"
Note: You will get values for the "Vehicle Type" 1 and 2 too. If you can explain the rationale behind having those values empty, this query can be tweaked accordingly.
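A possible reason the question's query leaves gaps is the test "Total Events" = 3: it only matches when the running total lands exactly on 3. For vehicle A, for example, the two rows dated 31/12/22 appear to share a running total of 2 and the next row brings it to 5, so no row equals 3. The sketch below keeps the question's single-subquery shape but tests >= 3 and takes the MIN of the day difference, which corresponds to the most recent date at which at least three events have accumulated. It counts events rather than rows, and the alias t is only for illustration.

```sql
-- Sketch only: same table and column names as the question (Firebird 3+ syntax).
SELECT
    t."Vehicle Type",
    -- Smallest day difference among rows where the running event count
    -- (newest first) has reached 3; NULL if it never reaches 3.
    MIN(CASE
            WHEN t."Total Events" >= 3
            THEN DATEDIFF(DAY, t."Date", CURRENT_DATE)
        END) AS "Days Since 3rd Last Event",
    SUM(t."Hours 1" + t."Hours 2") AS "Total Hours"
FROM
(
    SELECT
        "Vehicle Type",
        "Date",
        "Hours 1",
        "Hours 2",
        -- Running total of events from the newest date backwards.
        SUM("Events") OVER (
            PARTITION BY "Vehicle Type"
            ORDER BY "Date" DESC
        ) AS "Total Events"
    FROM "Data Entry"
) t
GROUP BY t."Vehicle Type"
ORDER BY t."Vehicle Type"
```

Vehicle types whose events never add up to 3, such as 1 and 2 in the sample data, keep an empty "Days Since 3rd Last Event" while still reporting their total hours.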

SSRS - Cannot read the next row for dataset DataSet1

with TotCFS as (select count(*)*1.0 as TotalCFS,
'Total CFS' as RowTitle
from PrivilegeData.TABLENAMEC c
where cast(CallCreatedDateTime as date) between @StartDate and @EndDate and CallPriority in ('1', '2', '3', '4', '5') and AreaCommand in ('FH', 'VA', 'NE', 'NW', 'SE', 'SW') and IsBolo = 0
)
select AreaCommand, CallPriority,
avg(datediff(second, CallCreatedDateTime, CallEntryDateTime)) as AverageSeconds,
left(dbo.[ConvertTimeToHHMMSS](avg(datediff(second, CallCreatedDateTime, CallEntryDateTime)), 's'), 7) as DisplayAvg,
'Create to Entry' as RowTitle, 1 as RowSort, b.SortOrder as ColumnSort
from PrivilegeData.TABLENAMEC c
inner join (select distinct AreaCommandAbbreviation, SortOrder from dimBeat) b on c.AreaCommand = b.AreaCommandAbbreviation
where cast(CallCreatedDateTime as date) between @StartDate and @EndDate and CallPriority in ('1', '2', '3', '4', '5') and AreaCommand in ('FH', 'VA', 'NE', 'NW', 'SE', 'SW') and IsBolo = 0
group by AreaCommand, CallPriority, SortOrder
UNION
select AreaCommand, CallPriority,
avg(datediff(second, CallEntryDateTime, CallDispatchDateTime)) as AvgEntryToDispatchSeconds,
left(dbo.ConvertTimeToHHMMSS(avg(datediff(second, CallEntryDateTime, CallDispatchDateTime)), 's'), 7) as DisplayAvgEntryToDispatchSeconds,
'Entry to Dispatch' as RowTitle, 2 , b.SortOrder
from PrivilegeData.TABLENAMEC c
inner join (select distinct AreaCommandAbbreviation, SortOrder from dimBeat) b on c.AreaCommand = b.AreaCommandAbbreviation
where cast(CallCreatedDateTime as date) between @StartDate and @EndDate and CallPriority in ('1', '2', '3', '4', '5') and AreaCommand in ('FH', 'VA', 'NE', 'NW', 'SE', 'SW') and IsBolo = 0
group by AreaCommand, CallPriority, SortOrder
I have about 8 unions in this query; the only difference between them is the row title. This report has been running for about a year without any problems. I use this code in SSRS as a query of type Text. I also have one of my fields, 'AverageSeconds', configured to use this expression:
=IIf((Fields!RowSort.Value) < 7,Format(DateAdd("s", Avg(Fields!AverageSeconds.Value), "00:00:00"), "H:mm:ss"), Sum(Fields!AverageSeconds.Value))
The report somehow broke, and I have tried everything I could find by searching to fix it. Please help me with this error: 'rsErrorReadingNextDataRow'.
This has got to be an issue with the data being operated upon, maybe a 0 or NULL value condition. I would start by reviewing records that were added or changed around the time the problem began.
I dropped and recreated the fact table and reran my SSIS package, which seems to have fixed it. I did that because I couldn't find a NULL or 0 value.

Deleting rows in a single database table based on the values of other rows

I have a fairly complex requirement I would like to solve using SQL in a Postgres DB. I'm sure this would be addressed in any order management system however I cannot find anything of a similar nature.
I have the following table (and values):
CREATE TABLE TABLE1 (
ID varchar(8),
ORIG_ID varchar(8),
STATUS varchar(8),
VALIDITY varchar(8)
);
INSERT INTO TABLE1
(ID, ORIG_ID, STATUS, VALIDITY)
VALUES
('1', '1', 'REPLACED','DAY'),
('2', '1', 'REPLACED','DAY'),
('3', '1', 'FILLED','DAY'),
('4', '4', 'REJECTED','DAY'),
('5', '5', 'PARTIAL','GTC'),
('6', '6', 'EXPIRED','GTD'),
('7', '7', 'REPLACED','GTD'),
('8', '7', 'PARTIAL','GTD'),
('9', '9', 'FILLED', 'GTD'),
('10', '10', 'NEW', 'DAY'),
('11', '11', 'NEW', 'GTD'),
('12', '12', 'DFD', 'GTD'),
('13', '13', 'REPLACED', 'GTD'),
('14', '13', 'FILLED', 'GTD')
;
N.B.:
- Please ignore the data types on the fields
- The final table may have thousands of entries to process
- The above can be pasted directly into SQL Fiddle if required (PostgreSQL 9.3.1)
The requirements I have are:
Delete all entries that have a STATUS of either:
- FILLED, EXPIRED, REJECTED, CANCELLED
- PARTIAL/NEW - If the VALIDITY is not GTD/GTC (i.e. only DAY)
- REPLACED - Unless there are other entries with the same ORIG_ID in a PARTIAL/NEW STATUS and not GTD/GTC (still working orders)
TBD - To Be Deleted:
TBD ('1', '1', 'REPLACED','DAY'),
TBD ('2', '1', 'REPLACED','DAY'),
TBD ('3', '1', 'FILLED','DAY'),
TBD ('4', '4', 'REJECTED','DAY'),
('5', '5', 'PARTIAL','GTC'),
TBD ('6', '6', 'EXPIRED','GTD'),
('7', '7', 'REPLACED','GTD'),
('8', '7', 'PARTIAL','GTD'),
TBD ('9', '9', 'FILLED', 'GTD'),
TBD ('10', '10', 'NEW', 'DAY'),
('11', '11', 'NEW', 'GTD'),
('12', '12', 'DFD', 'GTD'),
TBD ('13', '13', 'REPLACED', 'GTD'),
TBD ('14', '13', 'FILLED', 'GTD')
I've tried looking, and the closest I could find was the following:
Delete with join on the same table and limit clause
However I couldn't get it to work while incorporating the requirements above.
As this will be run at the end of the day, I have had a few thoughts, such as setting the STATUS of every entry with a VALIDITY of DAY to EXPIRED and then just deleting them all, but I would still hit the issue of STATUS for the GTD/GTC orders. I'm also unsure whether this would be faster than handling it all under the same logic.
Any help (or new ideas) would be appreciated on how to tackle this issue.
Delete from table1
WHERE status in ('FILLED','EXPIRED','REJECTED','CANCELLED')
OR (status in ('PARTIAL','NEW') AND validity not in ('GTD','GTC'))
OR (status = 'REPLACED' and orig_ID not in
(select ORig_ID from table1 where status in ('PARTIAL','NEW')));
http://sqlfiddle.com/#!15/9e465/28/0
What is your PostgreSQL version?
Anyway, here is what I came up with:
delete from TABLE1 where
STATUS in ('FILLED','EXPIRED','REJECTED','CANCELLED') or
(STATUS in ('PARTIAL','NEW')
and VALIDITY = 'DAY') OR
(STATUS = 'REPLACED'
and ORIG_ID not in
(select ORIG_ID from TABLE1 where STATUS in ('PARTIAL','NEW') and VALIDITY != 'DAY'))
http://sqlfiddle.com/#!15/9e465/54
Try this DELETE statement:
DELETE FROM TABLE1 as MASTER
WHERE
STATUS IN ('FILLED', 'EXPIRED', 'REJECTED', 'CANCELLED') OR
(STATUS IN ('PARTIAL','NEW') AND VALIDITY NOT IN ('GTD','GTC')) OR
(STATUS='REPLACED' AND ORIG_ID NOT IN
(SELECT OTHER.ORIG_ID FROM TABLE1 OTHER
WHERE OTHER.ORIG_ID=MASTER.ORIG_ID AND
OTHER.STATUS IN ('PARTIAL','NEW') AND OTHER.VALIDITY IN ('GTD','GTC')));
SQL Fiddle Example (with a SELECT statement)
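For comparison, the same rules can be sketched with NOT EXISTS, which avoids the NULL pitfalls of NOT IN. This reads the REPLACED rule the way the TBD list in the question implies: a REPLACED row survives only while another row with the same ORIG_ID is still PARTIAL/NEW with a GTD/GTC validity (row 7 is kept because of row 8). The aliases t and w are illustrative, and RETURNING is included only to show which rows were removed.

```sql
-- Sketch for PostgreSQL, using the table and columns from the question.
DELETE FROM table1 AS t
WHERE t.status IN ('FILLED', 'EXPIRED', 'REJECTED', 'CANCELLED')
   -- Working orders that are only valid for the day.
   OR (t.status IN ('PARTIAL', 'NEW')
       AND t.validity NOT IN ('GTD', 'GTC'))
   -- Replaced orders with no still-working order in the same chain.
   OR (t.status = 'REPLACED'
       AND NOT EXISTS (
           SELECT 1
           FROM table1 AS w
           WHERE w.orig_id = t.orig_id
             AND w.id <> t.id
             AND w.status IN ('PARTIAL', 'NEW')
             AND w.validity IN ('GTD', 'GTC')
       ))
RETURNING t.id;
```

Against the sample rows this should remove IDs 1, 2, 3, 4, 6, 9, 10, 13 and 14, leaving 5, 7, 8, 11 and 12, which matches the TBD markers. Swapping DELETE ... RETURNING for a SELECT with the same WHERE clause gives a dry run first.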
Unless I'm mistaken, there's either an error in the output in the OP, or the logic.
DELETE
FROM TABLE1 AS T
WHERE STATUS IN ('FILLED', 'EXPIRED', 'REJECTED', 'CANCELLED')
OR ( STATUS IN ('PARTIAL', 'NEW')
AND VALIDITY NOT IN ('GTD', 'GTC') )
OR ( STATUS = 'REPLACED'
AND NOT EXISTS
(SELECT 1
FROM TABLE1 tbl1
WHERE T.ID <> tbl1.ID
AND T.ORIG_ID = tbl1.ORIG_ID
AND tbl1.STATUS IN ('PARTIAL', 'NEW')
AND tbl1.VALIDITY NOT IN ('GTD', 'GTC')
)
);
http://sqlfiddle.com/#!15/9e465/16
EDIT: It seems OP means
"REPLACED - Unless [it does not have GTD/GTC status ] and there are other entries with the same ORIG_ID"
rather than
"REPLACED - Unless there are other entries with the same ORIG_ID [that do not have a GTD/GTC status]"

Two counts in three tables

I am trying to count two columns using the following query:
select distinct [District],
count (Distinct [Student Identifier Statewide California])
as '11-12 Enrollment',
(select count (Distinct IncdtKey)
From [dbo].[DisciplineStudentFile1112]
where GrdLvLKey in ('15', '01', '02', '03', '04',
'05', '06', '07', '08', '09',
'10', '11', '12', '18', '19')) as Total_Incidents
From
dbo.SSID1112StudentEnrollmentRecords with (nolock)
inner join
[dbo].[SchoolDetail] on CDSCode = dbo.SSID1112StudentEnrollmentRecords.CDSOrig
where
[EnrollStatCodeOrig] like '10'
and
[Grade Level Code] in ('PS', 'KN', '01', '02', '03',
'04', '05', '06', '07', '08',
'09', '10', '11', '12', 'UE', 'US')
group by [District]
order by [District]
My results are:
District 11-12 Enrollment Total_Incidents
AB Unified 20662 896371
CE Unified 5387 896371
DR Unified 526 896371
FJ Unified 1506 896371
KT Unified 8415 896371
I can't figure out how to get the individual counts in the Total_Incidents column instead of a total 896371 count?
An easy approach is to correlate the subquery with the outer query:
select distinct SD.District,
count ( distinct SER.[Student Identifier Statewide California] ) as [11-12 Enrollment],
( select count( distinct DSF.IncdtKey ) from dbo.DisciplineStudentFile1112 as DSF
where DSF.GrdLvLKey in ( '15', '01', '02', '03', '04', '05', '06', '07', '08', '09',
'10', '11', '12', '18', '19' ) and -- Note additional condition here.
DSF.District = SD.District ) as Total_Incidents
from dbo.SSID1112StudentEnrollmentRecords as SER with (nolock) inner join
dbo.SchoolDetail as SD on SD.CDSCode = SER.CDSOrig
where SER.EnrollStatCodeOrig like '10' and
[Grade Level Code] in ( 'PS', 'KN', '01', '02', '03', '04', '05', '06', '07', '08',
'09', '10', '11', '12', 'UE', 'US' )
group by SD.District
order by SD.District
I have made some assumptions about the table schemas. I would recommend that when using joins that you supply an alias for each table and use the alias on every reference to avoid confusion.
An alternative solution would be to use another join with DisciplineStudentFile1112 and then summarize the results using GROUP BY.
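That join-based alternative could be sketched as follows. Like the correlated version, it assumes DisciplineStudentFile1112 has a District column, which the question does not confirm. Each side is aggregated in its own derived table before joining, so incident rows are not multiplied by enrollment rows. The aliases E and I are illustrative.

```sql
-- Sketch for SQL Server: aggregate each side per district, then join.
select E.District,
       E.[11-12 Enrollment],
       coalesce(I.Total_Incidents, 0) as Total_Incidents
from ( select SD.District,
              count(distinct SER.[Student Identifier Statewide California]) as [11-12 Enrollment]
       from dbo.SSID1112StudentEnrollmentRecords as SER
       inner join dbo.SchoolDetail as SD on SD.CDSCode = SER.CDSOrig
       where SER.EnrollStatCodeOrig like '10'
         and [Grade Level Code] in ( 'PS', 'KN', '01', '02', '03', '04', '05', '06', '07', '08',
                                     '09', '10', '11', '12', 'UE', 'US' )
       group by SD.District ) as E
left join
     ( select DSF.District,  -- assumed column, as in the correlated version
              count(distinct DSF.IncdtKey) as Total_Incidents
       from dbo.DisciplineStudentFile1112 as DSF
       where DSF.GrdLvLKey in ( '15', '01', '02', '03', '04', '05', '06', '07', '08', '09',
                                '10', '11', '12', '18', '19' )
       group by DSF.District ) as I
  on I.District = E.District
order by E.District;
```

The left join keeps districts that have enrollment but no matching incidents, reporting 0 for them.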

Aggregating Data in SQL

Looking at the picture in the link, I need the Imploded Units to be averaged and the Exploded Units to be summed. My code is below. I have been searching for an answer to this for a few days, and I fear I may be trying to code a little over my head.
SELECT A.to_load_id AS "Cases",
A.from_qty - A.to_qty AS "Imploded Units",
( A.from_qty - A.to_qty ) * Nvl(SUM(p.qty), 1) AS "Exploded Units",
A.wskusku AS "Sku's",
A.wave AS "Wave",
A.from_loc AS "Processed Location",
A.free_form_text AS "Zone"
FROM audits A,
prepack P
WHERE A.from_loc LIKE 'S%'
AND A.to_loc = A.from_loc
AND free_form_text IN ( '01', '02', '03', '04',
'05', '06', '07', '08',
'09', '10', '11', '12',
'13', '14' )
AND A.wskusku = p.sku (+)
AND A.from_load_id = '42419472'
AND A.wave in ('WC055193','','','','','','','','','')
AND To_date(Substr(a.date_wms, 1, 12), 'YYYY/MM/DD HH24:MI') >=
SYSDATE - 4
GROUP BY A.to_load_id,
A.from_qty,
A.to_qty,
P.qty,
A.wskusku,
A.wave,
A.from_loc,
A.free_form_text
You need to use avg() / sum(), and leave the columns that go into those aggregates out of the GROUP BY:
SELECT A.to_load_id AS "Cases"
,avg(A.from_qty - A.to_qty) AS "Imploded Units"
,sum(A.from_qty - A.to_qty) * Nvl(SUM(P.qty), 1) AS "Exploded Units"
,A.wskusku AS "Sku's"
,A.wave AS "Wave"
,A.from_loc AS "Processed Location"
,A.free_form_text AS "Zone"
FROM audits A
LEFT JOIN prepack P ON P.sku = A.wskusku
WHERE A.from_loc LIKE 'S%'
AND A.to_loc = A.from_loc
AND A.free_form_text IN ( '01', '02', '03', '04',
'05', '06', '07', '08',
'09', '10', '11', '12',
'13', '14' )
AND A.from_load_id = '42419472'
AND A.wave in ('WC055193','','','','','','','','','')
AND To_date(Substr(A.date_wms, 1, 12), 'YYYY/MM/DD HH24:MI') >= SYSDATE - 4
GROUP BY A.to_load_id
,A.wskusku
,A.wave
,A.from_loc
,A.free_form_text
Not sure why you multiply the "Exploded Units" but not the "Imploded Units"; I copied what you have there.