Odd WHERE NOT EXISTS performance on DB2 - sql

I am experiencing very odd performance on DB2 version 9.1 when running the query below:
select a.CYCL_NUM
, a.AC_NUM
, a.AUTHS_DTE
, a.PL_ID
, a.APRVD_RSPN_CDE
, a.AUTHS_AMT
, a.AUTHS_STS_CDE
, a.TRAN_CTGR_CDE
, a.MRCHN_CTGR_CDE
, d.out_pu_au_amt
from nwhd12.chldr_auths a, nwhd12.w_chldr_ac d
where cycl_num = 200911
and a.ac_num = d.ac_num
and APRVD_RSPN_CDE = 'APV'
and not exists (
select 1 from auths_rev_hist b
where a.cycl_num = b.cycl_num
and a.auths_dte = b.auths_dte
and a.TRAN_CTGR_CDE = b.TRAN_CTGR_CDE
and a.PL_ID = b.pl_id
and a.APRVD_RSPN_CDE = b.APRVD_RSPN_CDE
and a.AUTHS_AMT = b.auths_amt
and a.TRAN_CTGR_CDE = b.TRAN_CTGR_CDE
and a.MRCHN_CTGR_CDE = MRCHN_CTGR_CDE
)
;
What is supposed to happen is that the query accesses partition 97 of nwhd12.chldr_auths, since that is the partition corresponding to cycle 200911. Instead, after accessing partition 97, it starts accessing every other partition in nwhd12.chldr_auths. Now, I was told that this is because of the "WHERE NOT EXISTS", but there is still the restriction on cycles in this statement (a.cycl_num = b.cycl_num), so why is it scanning all the partitions?
If I hard code the cycle in the where not exists, then the query performs as expected.
Thanks,
Dave

If the planner is this easily confused, you need to try a few different formulations. This one is untested (I don't even have DB2, but CTEs originated there):
WITH hist AS (
SELECT
cycl_num
, ac_num
, auths_dte
, pl_id
, aprvd_rspn_cde
, auths_amt
, auths_sts_cde
, tran_ctgr_cde
, mrchn_ctgr_cde
FROM auths_rev_hist b
)
, auths AS (
SELECT
cycl_num
, ac_num
, auths_dte
, pl_id
, aprvd_rspn_cde
, auths_amt
, auths_sts_cde
, tran_ctgr_cde
, mrchn_ctgr_cde
FROM nwhd12.chldr_auths
WHERE cycl_num = 200911
AND aprvd_rspn_cde = 'APV'
EXCEPT
SELECT ... FROM hist
)
SELECT a.*, d.out_pu_au_amt
FROM auths a, nwhd12.w_chldr_ac d
WHERE a.ac_num = d.ac_num
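The NOT EXISTS and EXCEPT formulations are not automatically interchangeable: EXCEPT compares whole rows and removes duplicates, so the two only agree when the compared column lists line up. Below is a minimal sketch that uses Python's sqlite3 as a stand-in for DB2, with hypothetical toy tables (auths, hist) holding only three of the columns. It checks that both forms return the same rows on this data, with the cycle literal repeated inside the subquery as in the asker's workaround. It says nothing about how DB2 9.1 will plan either query or which partitions it will touch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down versions of the tables in the question.
    CREATE TABLE auths (cycl_num INT, auths_dte TEXT, auths_amt INT);
    CREATE TABLE hist  (cycl_num INT, auths_dte TEXT, auths_amt INT);
    INSERT INTO auths VALUES (200911, '2009-11-01', 10),
                             (200911, '2009-11-02', 20),
                             (200910, '2009-10-05', 30);
    INSERT INTO hist  VALUES (200911, '2009-11-02', 20);
""")

# Anti-join form, with the cycle literal hard-coded inside the subquery.
not_exists_rows = conn.execute("""
    SELECT a.cycl_num, a.auths_dte, a.auths_amt
    FROM auths a
    WHERE a.cycl_num = 200911
      AND NOT EXISTS (
          SELECT 1 FROM hist b
          WHERE b.cycl_num = 200911
            AND b.auths_dte = a.auths_dte
            AND b.auths_amt = a.auths_amt)
    ORDER BY a.auths_dte
""").fetchall()

# Set-difference form: both sides must list the same columns in the same order.
except_rows = conn.execute("""
    SELECT cycl_num, auths_dte, auths_amt FROM auths WHERE cycl_num = 200911
    EXCEPT
    SELECT cycl_num, auths_dte, auths_amt FROM hist  WHERE cycl_num = 200911
    ORDER BY auths_dte
""").fetchall()

print(not_exists_rows)
print(except_rows)
```

If the real auths table can contain fully identical rows, the EXCEPT form would collapse them into one, which the NOT EXISTS form does not do.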

Related

Looking for the best way to pass in a parameter to allow me to return a specific customer or all customers in SQL

I have the query below, which could definitely be optimized, but I'm looking for the best way to add a parameter to it. If I pass in "All", it should return all WO.BillTo records; if I pass in any other value, it should compare that value to WO.BillTo and return only the records for that specific BillTo customer.
Any help or suggestions would be greatly appreciated.
SELECT WO.WONo ,
WO.BillTo ,
WO.ShipTo ,
WO.ShipName ,
WO.ClosedDate ,
WOParts.PartNo ,
WOParts.Description ,
WOParts.ShipQty ,
WOParts.SellRate ,
Customer.Name
FROM WO,WOParts,Customer
WHERE (((
((WO.ClosedDate >= #startdate ) AND (WO.ClosedDate < #closedate ) )
AND (WO.Disposition = 2 ) )
AND (WO.WONo = WOParts.WONo ) )
AND (WO.BillTo = Customer.Number ) )
AND WOParts.TransferWONoTo =''
and Woparts.shipqty <> 0
order by wo.ShipTo
This is known as a "catch-all" query. I would suggest using NULL rather than 'All' to mean "no filter", but the syntax would be the same either way. The OPTION (RECOMPILE) is there so that the plan is compiled for the actual parameter value instead of reusing a cached plan that suits only one of the two cases. I've also "updated" you to the ANSI-92 explicit JOIN syntax, as it has been around for about 30 years now:
SELECT W.WONo ,
W.BillTo ,
W.ShipTo ,
W.ShipName ,
W.ClosedDate ,
WP.PartNo ,
WP.Description ,
WP.ShipQty ,
WP.SellRate ,
C.Name
FROM dbo.WO W --Let's update to 1992!
JOIN dbo.WOParts WP ON W.WONo = WP.WONo
JOIN dbo.Customer C ON W.BillTo = C.Number
WHERE W.ClosedDate >= #startdate AND W.ClosedDate < #closedate
AND W.Disposition = 2
AND WP.TransferWONoTo =''
AND WP.shipqty <> 0
AND (W.BillTo = #YourParameter OR #YourParameter IS NULL)
ORDER BY W.ShipTo
OPTION (RECOMPILE);
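As a small illustration of the catch-all predicate, here is a sketch using Python's sqlite3 rather than SQL Server. The toy WO table, its data, and the work_orders helper are all made up for the example, and OPTION (RECOMPILE) has no SQLite equivalent, so only the NULL-means-everything logic is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down WO table with made-up customers.
    CREATE TABLE WO (WONo INT, BillTo TEXT);
    INSERT INTO WO VALUES (1, 'C100'), (2, 'C200'), (3, 'C100');
""")

def work_orders(bill_to=None):
    """Return WONo values for one BillTo customer, or all of them when bill_to is None."""
    sql = """
        SELECT WONo
        FROM WO
        WHERE (BillTo = :bill_to OR :bill_to IS NULL)
        ORDER BY WONo
    """
    # Python's None is bound as SQL NULL, which switches the filter off.
    return [row[0] for row in conn.execute(sql, {"bill_to": bill_to})]

print(work_orders())        # no filter: every work order
print(work_orders("C100"))  # only work orders billed to C100
```

Using NULL as the "no filter" marker avoids reserving a magic string such as 'All' that could, in principle, collide with a real BillTo value.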

How to increase performance of merge statement in SQL server

I have to execute merge statement on SQL server DB with below requirement.
MERGE INTO dbo.cross_charge D
USING
(
SELECT
PROJECT_CODE
, CROSS_CHARGE_OU
FROM
dbo.cross_charge
WHERE
PROJECT_CODE = #PROJECT_CODE1
AND CROSS_CHARGE_OU = #CROSS_CHARGE_OU1
) S
ON (
D.PROJECT_CODE = S.PROJECT_CODE
AND D.CROSS_CHARGE_OU = S.CROSS_CHARGE_OU
)
WHEN MATCHED THEN UPDATE SET
D.PROJECT_CODE = #PROJECT_CODE3
, D.CROSS_CHARGE_OU = #CROSS_CHARGE_OU3
, D.CROSS_CHARGE_START = #CROSS_CHARGE_START1
, D.CROSS_CHARGE_END = #CROSS_CHARGE_END1
, D.IS_Enable = #IS_Enable1
, D.UpdatedDate = #UpdatedDate1
WHEN NOT MATCHED THEN INSERT
(
PROJECT_CODE
, CROSS_CHARGE_OU
, CROSS_CHARGE_START
, CROSS_CHARGE_END
, IS_Enable
, UpdatedDate
)
VALUES
(
#PROJECT_CODE2, #CROSS_CHARGE_OU2, #CROSS_CHARGE_START, #CROSS_CHARGE_END, #IS_Enable, #UpdatedDate
)
I have to pass Project_code and cross_charge_ou as input parameters, and based on the result of the SELECT the statement will either insert or update the record.
When I execute this query for 100 records it takes 35 seconds, which is too long, as I have to execute this statement for 50K+ records.
Are there any options to improve the performance of this query? I am not very familiar with SQL.
Thank you.

Convert Oracle SQL statement into SQL Server statement

A bookings project now requires the same data extract - but from an SQL Server database - instead of Oracle. Can anyone assist converting the following into SQL Server syntax?
SELECT *
FROM (
SELECT o.ot_outlet_code
,v.lab_site_code ot_outlet_code
,v.brand
,v.region
, bd.cd_day_date booking_date, dd.cd_day_date dining_date
, f.last_change_date, f.created_date
, f.modified_date, t15.ts_timeslot_desc
, t.TIME, s.session_type
, tbs.booking_status, f.ADDED_BY_USER
, bp.product, bs.booking_source
, f.SPECIAL_OFFER, f.SEATING_PREFERENCE
, f.Tables_guest_id, covers
, booking_occurrence, breakfast_flag
, row_number() OVER (PARTITION BY f.Tables_guest_id ORDER BY f.last_change_date DESC, f.last_change_time DESC) rank_latest_record
, f.title, f.emailoptout
, f.MOBILE_OPT_IN, f.HIGH_CHAIR_COVERS
, f.GUEST_TYPE, f.Booking_ID
FROM owbi.whs_fact_rest_booking f
, owbi.whs_dim_cal_date bd
, owbi.whs_dim_cal_date dd
, owbi.whs_dim_bat_booking_source bs
, owbi.whs_dim_time_of_day t
, owbi.whs_dim_bat_product bp
, owbi.whs_dim_15_timeslot t15
, owbi.whs_dim_bat_booking_status tbs
, owbi.whs_dim_bat_session s
, owbi.bat_restaurants_v v
WHERE f.whs_dim_outlet = v.outlet
AND f.whs_dim_booking_date = bd.dimension_Key
AND f.whs_dim_dining_date = dd.dimension_key
AND f.whs_dim_bat_session = s.dimension_key
AND f.whs_dim_bat_booking_status = tbs.dimension_key
AND f.whs_dim_bat_product = bp.dimension_Key
AND f.whs_dim_bat_booking_source = bs.dimension_key
AND f.whs_dim_booking_time = t.dimension_Key
AND f.whs_dim_dining_15_timeslot = t15.dimension_key
AND dd.ey_year_code in ('2018')
AND f.whs_dim_dining_date >= 20170303
)
WHERE rank_latest_record = 1
ORDER BY BOOKING_DATE DESC;
The derived table must have an alias, e.g.:
SELECT *
FROM (
SELECT o.ot_outlet_code
,v.lab_site_code ot_outlet_code
,v.brand
,v.region
, bd.cd_day_date booking_date, dd.cd_day_date dining_date
, f.last_change_date, f.created_date
, f.modified_date, t15.ts_timeslot_desc
, t.TIME, s.session_type
, tbs.booking_status, f.ADDED_BY_USER
, bp.product, bs.booking_source
, f.SPECIAL_OFFER, f.SEATING_PREFERENCE
, f.Tables_guest_id, covers
, booking_occurrence, breakfast_flag
, row_number() OVER (PARTITION BY f.Tables_guest_id ORDER BY f.last_change_date DESC, f.last_change_time DESC) rank_latest_record
, f.title, f.emailoptout
, f.MOBILE_OPT_IN, f.HIGH_CHAIR_COVERS
, f.GUEST_TYPE, f.Booking_ID
FROM owbi.whs_fact_rest_booking f
, owbi.whs_dim_cal_date bd
, owbi.whs_dim_cal_date dd
, owbi.whs_dim_bat_booking_source bs
, owbi.whs_dim_time_of_day t
, owbi.whs_dim_bat_product bp
, owbi.whs_dim_15_timeslot t15
, owbi.whs_dim_bat_booking_status tbs
, owbi.whs_dim_bat_session s
, owbi.bat_restaurants_v v
WHERE f.whs_dim_outlet = v.outlet
AND f.whs_dim_booking_date = bd.dimension_Key
AND f.whs_dim_dining_date = dd.dimension_key
AND f.whs_dim_bat_session = s.dimension_key
AND f.whs_dim_bat_booking_status = tbs.dimension_key
AND f.whs_dim_bat_product = bp.dimension_Key
AND f.whs_dim_bat_booking_source = bs.dimension_key
AND f.whs_dim_booking_time = t.dimension_Key
AND f.whs_dim_dining_15_timeslot = t15.dimension_key
AND dd.ey_year_code in ('2018')
AND f.whs_dim_dining_date >= 20170303
) dt
WHERE rank_latest_record = 1
ORDER BY BOOKING_DATE DESC;
In SQL Server it's considered poor form not to use ANSI-style JOINs, although for inner joins it's perfectly legal to write them as a cross join with the join criteria in the WHERE clause.
It's also generally considered more readable to use CTEs instead of subqueries/inline views/derived tables in the FROM clause.
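To illustrate that the two join styles are equivalent for inner joins, here is a minimal sketch using Python's sqlite3 as a stand-in for SQL Server. The tables fact and dim_date and their data are hypothetical, cut-down versions of the warehouse tables. It runs the comma-join form and an explicit JOIN wrapped in a CTE, then compares the results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical fact and date-dimension tables.
    CREATE TABLE fact (booking_id INT, date_key INT);
    CREATE TABLE dim_date (dimension_key INT, cd_day_date TEXT);
    INSERT INTO fact VALUES (1, 10), (2, 11), (3, 10);
    INSERT INTO dim_date VALUES (10, '2018-01-01'), (11, '2018-01-02');
""")

# Old style: comma-separated tables, join condition in the WHERE clause.
comma_join = conn.execute("""
    SELECT f.booking_id, d.cd_day_date
    FROM fact f, dim_date d
    WHERE f.date_key = d.dimension_key
    ORDER BY f.booking_id
""").fetchall()

# Explicit JOIN ... ON, with the inner query expressed as a CTE
# instead of a derived table in the FROM clause.
ansi_join_cte = conn.execute("""
    WITH bookings AS (
        SELECT f.booking_id, d.cd_day_date
        FROM fact f
        JOIN dim_date d ON f.date_key = d.dimension_key
    )
    SELECT booking_id, cd_day_date
    FROM bookings
    ORDER BY booking_id
""").fetchall()

print(comma_join == ansi_join_cte)
```

The rewrite changes readability only; the rows returned are the same, so it can be done one join at a time on the real query.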

SQL Server Select one column by aggregate but not another

Not good with sql. Forgive me if the question isn't 100% clear. Here is my query
SELECT
MAX(PatientId),
[Date],
[Time],
CASE WHEN MAX(CAST(HealthScoreSkipped as INT)) = 1
THEN '--'
ELSE MAX(DailyHealthScore)
END DailyHealthScore,
ProtocolGroupName,
MAX(BloodPressure) BloodPressure,
MAX(SystolicAlert) SystolicAlert,
MAX(DiastolicAlert) DiastolicAlert,
MAX(BloodPressureSkipped) BloodPressureSkipped,
MAX(Pulse) Pulse,
MAX(PulseAlert) PulseAlert,
MAX(PulseSkipped) PulseSkipped,
MAX(BloodSugar) BloodSugar,
MAX(BloodSugarAlert) BloodSugarAlert,
MAX(BloodSugarSkipped) BloodSugarSkipped,
MAX(Steps) Steps,
MAX(StepsAlert) StepsAlert,
MAX(StepsSkipped) StepsSkipped,
MAX(O2) O2,
MAX(O2Alert) O2Alert,
MAX(O2Skipped) O2Skipped,
MAX(Weight) Weight,
MAX(WeightAlert) WeightAlert,
#BaselineWeight AS BaselineWeight,
MAX(WeightSkipped) WeightSkipped,
MAX(Temperature) Temperature,
MAX(TemperatureAlert) TemperatureAlert,
MAX(TemperatureUnit) TemperatureUnit,
MAX(TemperatureSkipped) TemperatureSkipped,
MAX(PEF) PEF,
MAX(PEFAlert) PEFAlert,
MAX(PEFSkipped) PEFSkipped,
MAX(FEV1) FEV1,
MAX(FEV1Alert) FEV1Alert,
MAX(FEV1Skipped) FEV1Skipped,
MAX(FEVRatio) FEVRatio,
MAX(FEVRatioAlert) FEVRatioAlert,
MAX(FEVRatioSkipped) FEVRatioSkipped,
#SpiroEnabled SpiroEnabled
FROM #bioAndScores
GROUP BY PatientId, Date, Time, ProtocolGroupName
The problem here is on the lines
MAX(Steps) Steps,
MAX(StepsAlert) StepsAlert
I want to select the max Steps but the stepalert value that goes with that row not the max of the stepAlert.
You can create a sub query in the select statement to get the steps alert that corresponds to your step.
Something along the lines of the query below. Note that I'm not sure why you are grouping by PatientId if you are taking MAX(PatientId); if you do want to group by PatientId, the WHERE clause of the subquery should also match on PatientId.
SELECT
MAX(bas.PatientId),
bas.[Date],
bas.[Time],
bas.ProtocolGroupName,
.
.
.
MAX(bas.Steps) Steps,
--sub query to get the StepsAlert that corresponds to max steps
(SELECT
StepsAlert
FROM
#bioAndScores subBas
WHERE
--This is the important part of finding the match for Max Steps
MAX(bas.Steps) = subBas.Steps AND
--commented out because the MAX(PatientId) was ambiguous
--bas.PatientId = subBas.PatientId AND
bas.[Date] = subBas.[Date] AND
bas.[Time] = subBas.[Time] AND
bas.ProtocolGroupName = subBas.ProtocolGroupName) as StepsAlert
FROM
#bioAndScores as bas
GROUP BY
--PatientId,
bas.Date,
bas.Time,
bas.ProtocolGroupName
Remove the MAX() function from StepsAlert and add StepsAlert to your GROUP BY clause.
MAX(Steps) AS Steps,
StepsAlert AS StepsAlert
And in your GROUP BY:
GROUP BY PatientId, Date, Time, ProtocolGroupName, StepsAlert
Just add StepsAlert column to the group by clause and remove the MAX aggregate function.
GROUP BY PatientId, Date, Time, ProtocolGroupName,StepsAlert
I would suggest you go through this to better understand how GROUP BY works.
You can do this using CROSS APPLY to select the Steps and StepsAlert values from the row with the highest Steps (ties broken by the lowest StepsAlert), like so:
select
PatientId
, [Date]
, [Time]
, DailyHealthScore = case
when MAX(CAST(HealthScoreSkipped as int))= 1 then '--'
else MAX(DailyHealthScore)
end
, ProtocolGroupName
, BloodPressure = MAX(BloodPressure)
, SystolicAlert = MAX(SystolicAlert)
, DiastolicAlert = MAX(DiastolicAlert)
, BloodPressureSkipped= MAX(BloodPressureSkipped)
, Pulse = MAX(Pulse)
, PulseAlert = MAX(PulseAlert)
, PulseSkipped = MAX(PulseSkipped)
, BloodSugar = MAX(BloodSugar)
, BloodSugarAlert = MAX(BloodSugarAlert)
, BloodSugarSkipped = MAX(BloodSugarSkipped)
, Steps = x.Steps
, StepsAlert = x.StepsAlert
, StepsSkipped = MAX(StepsSkipped)
, O2 = MAX(O2)
, O2Alert = MAX(O2Alert)
, O2Skipped = MAX(O2Skipped)
, Weight = MAX(Weight)
, WeightAlert = MAX(WeightAlert)
, BaselineWeight = #BaselineWeight
, WeightSkipped = MAX(WeightSkipped)
, Temperature = MAX(Temperature)
, TemperatureAlert = MAX(TemperatureAlert)
, TemperatureUnit = MAX(TemperatureUnit)
, TemperatureSkipped = MAX(TemperatureSkipped)
, PEF = MAX(PEF)
, PEFAlert = MAX(PEFAlert)
, PEFSkipped = MAX(PEFSkipped)
, FEV1 = MAX(FEV1)
, FEV1Alert = MAX(FEV1Alert)
, FEV1Skipped = MAX(FEV1Skipped)
, FEVRatio = MAX(FEVRatio)
, FEVRatioAlert = MAX(FEVRatioAlert)
, FEVRatioSkipped = MAX(FEVRatioSkipped)
, SpiroEnabled = #SpiroEnabled
from #bioAndScores b
cross apply (
select top 1
i.Steps
, i.StepsAlert
from #bioAndScores i
where b.PatientId = i.PatientId
and b.[Date] = i.[Date]
and b.[Time] = i.[Time]
and b.ProtocolGroupName = i.ProtocolGroupName
order by i.Steps desc, i.StepsAlert asc
) x
group by
PatientId
, date
, time
, ProtocolGroupName
, x.Steps
, x.StepsAlert
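The idea of picking StepsAlert from the row that holds the maximum Steps can be sketched on toy data. SQLite has no CROSS APPLY, so this example swaps in a correlated subquery with ORDER BY ... LIMIT 1, which plays the same role as the TOP 1 inside the APPLY. The table bio, its columns, and its rows are hypothetical stand-ins for #bioAndScores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down version of the readings table.
    CREATE TABLE bio (PatientId INT, Day TEXT, Steps INT, StepsAlert INT, Pulse INT);
    INSERT INTO bio VALUES (1, '2017-01-01', 500, 9, 60),
                           (1, '2017-01-01', 900, 0, 75),
                           (2, '2017-01-01', 300, 4, 80);
""")

rows = conn.execute("""
    SELECT g.PatientId,
           g.Day,
           g.Steps,
           -- StepsAlert from the row with the highest Steps in the same group,
           -- ties broken by the lowest StepsAlert.
           (SELECT i.StepsAlert
              FROM bio i
             WHERE i.PatientId = g.PatientId
               AND i.Day = g.Day
             ORDER BY i.Steps DESC, i.StepsAlert ASC
             LIMIT 1) AS StepsAlert,
           g.Pulse
    FROM (SELECT PatientId, Day, MAX(Steps) AS Steps, MAX(Pulse) AS Pulse
            FROM bio
           GROUP BY PatientId, Day) g
    ORDER BY g.PatientId
""").fetchall()

for row in rows:
    print(row)
```

For patient 1 the result carries StepsAlert 0, the value stored alongside 900 steps, whereas MAX(StepsAlert) would have returned 9 from a different row.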

ORA-30926: unable to get a stable set of rows in the source tables

I have a customer who gets ORA-30926: unable to get a stable set of rows in the source tables.
The log shows the error message (30926):
13:52:19 (00:00:02.406) ERROR : Error (30926) (00:00:02.406) ORA-30926: Stabile Zeilengruppe in den Quelltabellen kann nicht eingelesen werden
TS03_MIN0100: UpdTable failed. Update inv_value in cMinTimeTable:
MERGE INTO HUBWBPMS5_ENTTS03005400223 a USING ( SELECT DISTINCT a.inv_value +
( a.inv_value_sum - h.inv_value ) AS inv_value , a.rowid xzfd_rid
FROM HUBWBPMS5_ENTTS03005700223 h , HUBWBPMS5_ENTTS03005400223 a
WHERE a.voucher_no = h.voucher_no AND a.sequence_no = h.max_seq_no
AND a.client = h.client ) xzfd_t ON ( xzfd_t.xzfd_rid = a.rowid )
WHEN MATCHED THEN
UPDATE
SET a.inv_value = xzfd_t.inv_value
I have checked for duplicate values in the tables but can't find anything unusual.
Maybe someone has an idea that could be useful.
The query is:
Query causing error (temp table):
INSERT INTO HUBWBPMS5_ENTTS03005700228 ( agg_flag , ace_code , activity , category , client , cost_dep , description , dim1 , dim2 , dim3 , dim4 , inc_ref , inv_value , max_seq_no , pd , period , project , resource_id , resource_typ , trans_date , unit , voucher_no , work_order , work_type )
SELECT agg_flag , ace_code , activity , category , client , cost_dep , description , dim1 , dim2 , dim3 , dim4 , inc_ref , SUM ( inv_value ) inv_value , max_seq_no , pd , period , project , resource_id , resource_typ , trans_date , unit , voucher_no , work_order , work_type
FROM HUBWBPMS5_ENTTS03005400228
WHERE agg_flag = 1
GROUP BY agg_flag , ace_code , activity , category , client , cost_dep , description , dim1 , dim2 , dim3 , dim4 , period , trans_date , voucher_no , max_seq_no , inc_ref , pd , project , resource_id , resource_typ , unit , work_order , work_type
When you get that error, it will be from a MERGE statement, and it indicates that there are multiple rows in the source dataset that match to a row you're joining to in the target table, and as such, Oracle doesn't know which one to use to do the update.
Taking your merge statement:
MERGE INTO HUBWBPMS5_ENTTS03005400223 a
USING (SELECT DISTINCT a.inv_value + ( a.inv_value_sum - h.inv_value ) AS inv_value,
a.rowid xzfd_rid
FROM HUBWBPMS5_ENTTS03005700223 h,
HUBWBPMS5_ENTTS03005400223 a
WHERE a.voucher_no = h.voucher_no
AND a.sequence_no = h.max_seq_no
AND a.client = h.client) xzfd_t
ON (xzfd_t.xzfd_rid = a.rowid)
WHEN MATCHED THEN
UPDATE SET a.inv_value = xzfd_t.inv_value;
it looks like the join between the two tables HUBWBPMS5_ENTTS03005700223 and HUBWBPMS5_ENTTS03005400223 in the xzfd_t subquery causes multiple rows to be returned for one or more of the HUBWBPMS5_ENTTS03005400223 rows (ie. you get multiple rows returned for at least one a.rowid).
To check this, run:
SELECT xzfd_rid,
COUNT(*) cnt
FROM (SELECT DISTINCT a.inv_value + ( a.inv_value_sum - h.inv_value ) AS inv_value,
a.rowid xzfd_rid
FROM HUBWBPMS5_ENTTS03005700223 h,
HUBWBPMS5_ENTTS03005400223 a
WHERE a.voucher_no = h.voucher_no
AND a.sequence_no = h.max_seq_no
AND a.client = h.client)
GROUP BY xzfd_rid
HAVING COUNT(*) > 1;
In order to fix this, you'd need to make the xzfd_t subquery return a single row for each xzfd_rid. Possibly using row_number() to pick a single row, or an aggregate query to sum up all the h.inv_value fields per a.rowid instead of the DISTINCT.
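The diagnosis and one of the suggested fixes can be tried on toy data. This sketch uses Python's sqlite3, not Oracle, with hypothetical two-column tables a and h. SQLite does not raise ORA-30926, so the example shows only how the duplicate source rows arise and how aggregating removes them. Whether summing h.inv_value is the right business rule for the real data is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical target (a) and source (h) tables.
    CREATE TABLE a (voucher_no INT, inv_value INT);
    CREATE TABLE h (voucher_no INT, inv_value INT);
    INSERT INTO a VALUES (1, 100), (2, 200);
    INSERT INTO h VALUES (1, 10), (2, 20), (2, 25);  -- voucher 2 appears twice
""")

# Diagnostic: target rows that are matched by more than one distinct source row.
unstable = conn.execute("""
    SELECT rid, COUNT(*) AS cnt
    FROM (SELECT DISTINCT a.rowid AS rid,
                 a.inv_value - h.inv_value AS inv_value
            FROM a
            JOIN h ON a.voucher_no = h.voucher_no)
    GROUP BY rid
    HAVING COUNT(*) > 1
""").fetchall()

# One possible fix: aggregate the source so each target row gets exactly one value.
stable = conn.execute("""
    SELECT a.rowid AS rid,
           a.inv_value - SUM(h.inv_value) AS inv_value
    FROM a
    JOIN h ON a.voucher_no = h.voucher_no
    GROUP BY a.rowid, a.inv_value
    ORDER BY rid
""").fetchall()

print(unstable)
print(stable)
```

Note that DISTINCT does not help in the diagnostic query because the two source rows for voucher 2 produce different computed values, so both survive and the target row is still matched twice.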