Select unique members from a set by weighted categories/properties


I have a collection of objects (5000+) with 7 different properties. Two properties are ternary; the rest are binary. Each object has all 7 properties specified. In some scenarios a binary property may become unary.
Once in a while I need to select the top N random objects from this collection, weighted by the frequency of each label in its category relative to the total number of objects.
Currently I have all the data in a SQL Server table as (object, propertyMask) pairs; however, I can reorganize it in any other way necessary.
Examples:
black blue yellow (1,2,4)
circle square triangle (8,16,32)
solid color/meshed color (64)
dashed contour/no contour (128)
etc. (256)
The data is:
object1|9 <- 1001 black circle only (all other properties are 0)
object2|81 <- 101 0001 black square with solid color (all other properties are 0)
object3|148 <- 1001 0100 yellow square with dashed contour
etc.
Say I end up with 1k objects: 600 black, 300 yellow and 100 blue. I need to select the top 10 objects. If I consider just one property, I'll take any 6 black, 3 yellow and 1 blue objects. But I have 6 other properties to consider, so I also need the right number of circles, squares and triangles, and so on. At this point I don't even know how to approach this problem.
Any suggestions would be appreciated.
EDIT:
I repopulated the data in the following format:
name | att1 | att2 | ...
obj1 | 1 | 8 | ...
obj2 | 2 | 16 | ...
obj3 | 1 | 32 | ...
Is there a way to select the TOP N objects weighted by the frequency of each attribute? I have 7 attributes for each object; no NULL values.
Thanks!

It's messy, and it does not always fetch the exact number of rows required or a perfect distribution, but it comes pretty darn close.
So how does it work:
ValuesPivotted: pivot all the distinct values into 0/1 columns and give each row a random row number.
TargetDistribution: for each distinct value, determine how many rows you need.
SelectRows: go through ValuesPivotted one row at a time and check whether the row must be skipped because it would otherwise breach the target for one of its distinct values. Otherwise, increment the sum for each value that applies to that row.
DECLARE @TargetRowNum INT = 100;
WITH ValuesPivotted AS(
SELECT O.id
, RowNum = ROW_NUMBER() OVER (ORDER BY NEWID())
, [0] = CASE WHEN O.atr1 = 0 THEN 1 ELSE 0 END
, [1] = CASE WHEN O.atr1 = 1 THEN 1 ELSE 0 END
, [2] = CASE WHEN O.atr1 = 2 THEN 1 ELSE 0 END
, [4] = CASE WHEN O.atr2 = 4 THEN 1 ELSE 0 END
, [8] = CASE WHEN O.atr2 = 8 THEN 1 ELSE 0 END
, [16] = CASE WHEN O.atr3 = 16 THEN 1 ELSE 0 END
, [32] = CASE WHEN O.atr3 = 32 THEN 1 ELSE 0 END
, [64] = CASE WHEN O.atr4 = 64 THEN 1 ELSE 0 END
, [128] = CASE WHEN O.atr4 = 128 THEN 1 ELSE 0 END
FROM dbo.objects AS O
),
TargetDistribution AS (
SELECT Target0 = ROUND(CAST(SUM([0] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target1 = ROUND(CAST(SUM([1] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target2 = ROUND(CAST(SUM([2] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target4 = ROUND(CAST(SUM([4] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target8 = ROUND(CAST(SUM([8] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target16 = ROUND(CAST(SUM([16] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target32 = ROUND(CAST(SUM([32] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target64 = ROUND(CAST(SUM([64] ) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
, Target128 = ROUND(CAST(SUM([128]) AS FLOAT) / COUNT(*) * @TargetRowNum, 0)
FROM ValuesPivotted
),
SelectRows AS(
SELECT VP.id
, RowNum
, KeepRow = 1
, Target0 , Sum0 = [0]
, Target1 , Sum1 = [1]
, Target2 , Sum2 = [2]
, Target4 , Sum4 = [4]
, Target8 , Sum8 = [8]
, Target16 , Sum16 = [16]
, Target32 , Sum32 = [32]
, Target64 , Sum64 = [64]
, Target128 , Sum128 = [128]
FROM ValuesPivotted AS VP
CROSS JOIN TargetDistribution AS TD
WHERE VP.RowNum = 1
UNION ALL
SELECT
VP.id
, VP.RowNum
, KeepRow = ISNULL(SkipRow.Value, 1)
, Target0 , Sum0 = Sum0 + ISNULL(SkipRow.Value, [0] )
, Target1 , Sum1 = Sum1 + ISNULL(SkipRow.Value, [1] )
, Target2 , Sum2 = Sum2 + ISNULL(SkipRow.Value, [2] )
, Target4 , Sum4 = Sum4 + ISNULL(SkipRow.Value, [4] )
, Target8 , Sum8 = Sum8 + ISNULL(SkipRow.Value, [8] )
, Target16 , Sum16 = Sum16 + ISNULL(SkipRow.Value, [16] )
, Target32 , Sum32 = Sum32 + ISNULL(SkipRow.Value, [32] )
, Target64 , Sum64 = Sum64 + ISNULL(SkipRow.Value, [64] )
, Target128 , Sum128 = Sum128 + ISNULL(SkipRow.Value, [128])
FROM SelectRows AS SR
INNER JOIN ValuesPivotted AS VP
ON VP.RowNum = SR.RowNum + 1
CROSS APPLY(
SELECT Value =
CASE WHEN Sum0 + [0] <= Target0
AND Sum1 + [1] <= Target1
AND Sum2 + [2] <= Target2
AND Sum4 + [4] <= Target4
AND Sum8 + [8] <= Target8
AND Sum16 + [16] <= Target16
AND Sum32 + [32] <= Target32
AND Sum64 + [64] <= Target64
AND Sum128 + [128] <= Target128
THEN NULL ELSE 0 END
) AS SkipRow
WHERE Sum0 < Target0
OR Sum1 < Target1
OR Sum2 < Target2
OR Sum4 < Target4
OR Sum8 < Target8
OR Sum16 < Target16
OR Sum32 < Target32
OR Sum64 < Target64
OR Sum128 < Target128
)
SELECT O.*
FROM SelectRows AS SR
INNER JOIN dbo.objects AS O
ON SR.id = O.id
WHERE SR.KeepRow = 1
OPTION(MAXRECURSION 0)
EDIT: The WHERE clause in SelectRows did not do what it was supposed to do, which is to stop the recursion once all targets are met; now it does.
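To make the TargetDistribution step more concrete, here is a minimal sketch of the same quota arithmetic for a single attribute. The table variable @objects, its 20 rows and the column name att1 are made up for illustration (a scaled-down version of the 600/300/100 split from the question); only the "share of all rows times sample size, rounded" calculation is taken from the answer.

```sql
-- Hypothetical sample data: 12 black (1), 2 blue (2), 6 yellow (4).
DECLARE @TargetRowNum INT = 10;
DECLARE @objects TABLE (id INT IDENTITY(1, 1) PRIMARY KEY, att1 INT NOT NULL);

INSERT INTO @objects (att1)
VALUES (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1), (1),
       (2), (2),
       (4), (4), (4), (4), (4), (4);

-- Quota per distinct value = its share of all rows, scaled to the sample size.
-- SUM(COUNT(*)) OVER () is the total row count across all groups.
SELECT att1,
       Occurrences = COUNT(*),
       TargetRows  = ROUND(CAST(COUNT(*) AS FLOAT) / SUM(COUNT(*)) OVER () * @TargetRowNum, 0)
FROM @objects
GROUP BY att1
ORDER BY att1;
```

For this sample the quotas come out as 6, 1 and 3. Each quota is rounded independently, so with less convenient proportions the quotas of one attribute need not add up to exactly the requested sample size, which is one reason the result can be a row or two off.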

Related

Stuff with xml is taking so much time in my query

I have a view, and I wrote the following query to fetch a result set from it.
The problem is that the STUFF ... FOR XML PATH part takes a very long time. Any suggestions, please?
;WITH CTE_transaction_type1 AS
(
SELECT case_summary_id,Primary_Payer, Physician,Generate_Bill,
CASE
WHEN Current_Payer IS NULL
THEN CASE
WHEN Resposible_Patient_Name IS NULL
AND PG_Patient_Name IS NULL
THEN 'PG:' + Patient_Last_Name + ',' + Patient_First_Name
WHEN Resposible_Patient_Name IS NULL
AND NULLIF(PG_Patient_Name,' ') IS NOT NULL
THEN 'SG:' + Patient_Last_Name + ',' + Patient_First_Name
ELSE Resposible_Patient_Name
END
ELSE
CASE
WHEN Current_Payer_Name IS NULL
THEN ''
ELSE Insurance_Type + Current_Payer_Name + IIF(Plan_Name IS NULL, '', ',' + Plan_Name)
END
END AS Responsible_Party_List
FROM [charts].[vw_billing_analysis]
WHERE sort_order = 0 AND transaction_type_id = 1
)
SELECT ba.Case_Summary_ID
,ba.date_of_surgery
,ba.Organization_ID
,ba.Organization_Name
,ba.Patient_Name
,ba.Patient_First_Name
,ba.Patient_Last_Name
,ba.MRN
,ba.Physician_ID
,ba.Physician
,ba.[PROCEDURE]
,ba.Current_Payer
,ba.Current_Payer_Name
,ba.Primary_Payer
,ba.Primary_Payer_ID
,ba.Insurance_Plan_ID
,ba.Insurance_Plan_Type
,ba.Plan_Name
,ba.Speciality_ID
,ba.Speciality_Name
,ba.Transaction_Responsible_Party_ID
,ba.Charge_ID
,ba.Charge_Procedure_ID
,ba.Patient_Insurance_ID
,ba.Insurance_Type
,ba.Resposible_Patient_Name
,ba.Received_From
,ba.Guarantor_Master_ID
,ba.PG_Patient_Name
,ba.Batch_Description
,ba.Period_Name
,ba.Unassigned_payment
,ba.Charge
,ba.Charge_Corr
,ba.Debit
,ba.Payments
,ba.WriteOff
,ba.Balance
,ba.Corrected_TF
,ba.Period_ID
,ba.Batch_ID
,ba.transaction_type_id
,STUFF((
SELECT '; ' + [Procedure]
FROM [charts].[vw_billing_analysis] vba
WHERE vba.case_summary_id = ba.case_summary_id
AND vba.transaction_type_id = 1
ORDER BY ba.sort_order
FOR XML PATH('')
,TYPE
).value('.', 'NVARCHAR(MAX)'), 1, 2, '') AS Procedure_List
,trty1.Primary_Payer AS Primary_Payer_List
,trty1.Responsible_Party_List
,trty1.Physician AS Physician_List
,ba.Allowed_Amount
,IIF(ba.insurance_contract_id IS NULL
OR ba.Insurance_Plan_Type = 'Self-Pay', NULL, iif(ba.procedure_type IS NOT NULL, round(ba.[allowed_amount] - (ba.[allowed_amount] * ba.procedure_discount / 100.0), 3), ba.[procedure_amount])) AS [Expected_Amount]
,ba.Procedure_Amount
,ba.Generate_Bill
,trty1.Generate_Bill AS Generate_Bill_Case
,ba.Actual_Payment
,ba.First_Bill_Date
FROM [charts].[vw_billing_analysis] ba
left join cte_transaction_type1 trty1
ON ba.Case_Summary_ID=trty1.Case_Summary_ID
WHERE ba.Organization_ID =52
AND
(
(
(
(
ba.Period_Id = ('999999999')
OR '999999999' = '999999999'
)
AND 0 = 1
)
AND (
(
ba.Batch_Id IN ('999999999')
OR '999999999' = '999999999'
)
AND 0 = 1
)
)
OR (
0 = 0
AND ba.date_of_surgery >= '2018-02-01'
AND ba.date_of_surgery <= '2021-03-03'
)
)
AND (
CASE
WHEN ba.Insurance_Plan_Type IS NULL
AND ba.Insurance_Plan_ID IS NULL
AND ba.Current_Payer IS NULL
THEN '-1'
ELSE ba.Insurance_Plan_Type_Id
END IN ('999999999')
OR '999999999' = '999999999'
)
AND (
ba.Current_Payer IN ('999999999')
OR '999999999' = '999999999'
)
AND (
ba.Speciality_ID IN ('999999999')
OR '999999999' = '999999999'
)
AND (
ba.Physician_ID IN ('999999999')
OR '999999999' = '999999999'
)
AND (
isnull(ba.Primary_Payer_ID, - 1) IN ('999999999')
OR '999999999' = '999999999'
)
AND (
ba.appointment_type_id IN ('999999999')
OR '999999999' = '999999999'
)
AND (
SELECT SUM(CASE
WHEN vba.transaction_type_id = 1
THEN balance
WHEN vba.transaction_type_id = 3
THEN - balance
END)
FROM [charts].[vw_billing_analysis] vba
WHERE vba.case_summary_id = ba.case_summary_id
) >= 0
ORDER BY ba.Date_Of_Surgery DESC
,ba.MRN DESC
,ba.Patient_Name ASC
,ba.upd_tr_sort ASC
,ba.Sort_Order ASC

Two fast queries, slow when combined

I have this massive query that I am trying to add a left join to. The addition is commented out.
The main query runs < 4 sec, 32,000 rows.
The commented part runs < 1 sec, 51,000 rows.
But when I combine them, i.e. join the second query, the whole thing runs in 15 sec.
There are already 2 massive joins in the original query (50,000 rows both), so I don't get why this join is special.
PS: I might also be doing other suboptimal things, please criticize.
select
*,
case
when t2.status = 1 and t2.price > t2.buyprice then round((t2.price - t2.buyprice) * 0.04, 2)
when t2.status = 2 and t2.price > t2.buyprice then round((t2.price - t2.buyprice) * 0.03, 2)
when t2.status = 3 and t2.price > t2.buyprice then round((t2.price - t2.buyprice) * 0.02, 2)
when t2.status = 4 and t2.price > t2.buyprice then round((t2.price - t2.buyprice) * 0.01, 2)
else 0
end as bonus
from (
select *,
case
when t1.gratis = 1 then 10
when t1.price_vat = 0 or t1.price = 0 then
case
when t1.stock > 0 or soldLast180DaysQty > 0 then -1
when t1.stock = 0 then 12
end
when t1.buyprice = 0 then
case
when t1.stock > 0 then -1
when t1.stock = 0 then 12
end
when soldLast180DaysQty < 0 then 1
when t1.age_days < 60 then 9
when t1.last_import <= 180 then
case
when t1.soldLast180DaysQty <= t1.stock then 0
when t1.soldLast180DaysQty > t1.stock then 7
when t1.stock = 0 then 5
end
when t1.last_import >= 180 and t1.stock = 0 then
case
when soldLast180DaysQty > 0 then 10
when soldLast180DaysQty = 0 then 11
end
when t1.last_import >= 180 then
case
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) < 0.3 and t1.stock_retail / t1.stock >= 0.9 then 5
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) between 0 and 0.1 then 1
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) between 0.1 and 0.2 then 2
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) between 0.2 and 0.3 then 3
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) between 0.3 and 0.4 then 4
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) between 0.4 and 0.7 then 0
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) >= 0.9 then 6
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) between 0.8 and 0.9 then 7
when t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) between 0.7 and 0.8 then 8
end
end as status,
round(t1.soldLast180DaysQty / nullif(t1.stock + t1.soldLast180DaysQty, 0) * 100, 0) as ratio
from (
select
si.anqid id,
CAST(rtrim(si.acident) as nvarchar(7)) as code,
CAST(rtrim(si.acname) as nvarchar(100)) as name,
si.anvat as vat,
si.ansaleprice as price_vat,
round(si.anrtprice, 2) as price,
cenovnik.clientPrice, -- <---------------------- This part
round(si.anbuyprice, 2) as buyprice,
concat(round(anpricesupp, 2), ' ', acpurchcurr) as fakturna,
round(si.anrtprice - si.anbuyprice, 2) as profit,
case
when si.anrtprice is not null and si.anrtprice > 0 and si.anbuyprice is not null and si.anbuyprice > 0
then round((si.anrtprice / si.anbuyprice - 1) * 100, 0)
end as margin,
cast(si.acfieldsa as nvarchar(12)) as [group],
cast(rtrim(si.acClassif2) as nvarchar(16)) as category,
cast(rtrim(ss.acsubject) as nvarchar(7)) as supplier_code,
cast(left(ss.acname2, 30) as nvarchar(30)) as supplier_name,
rtrim(si.acclassif) as rebate,
si.anFieldNA as webActive,
si.anfieldNF as gratis,
case
when si.acpicture is not null then 'true'
else 'false'
end as picture,
isnull((select sum(anstock) from the_stock where acident = si.acident and acwarehouse = '00051'), 0) as stock_warehouse,
isnull((select sum(anstock) from the_stock where acident = si.acident and acwarehouse <> '00051'), 0) as stock_retail,
isnull((select sum(anstock) from the_stock where acident = si.acident), 0) as stock,
isnull((select sum(anReserved) from the_stock where acident = si.acident), 0) as stock_reserved,
isnull((select sum(anvalue) from the_stock where acident = si.acident), 0) as stock_value,
(
select isnull(datediff(day, max(m.addate), getdate()), 9999)
from the_moveitem mi
left join the_move m
on mi.ackey = m.ackey
where mi.acident = si.acident and m.acDocType in ('1900', '1000', '6800', '1A00')
) as last_import,
isnull(round(soldLast180Days.soldLast180DaysQty, 0), 0) soldLast180DaysQty,
isnull(round(soldLast180Days.soldLast180DaysCogs, 0), 0) soldLast180DaysCogs,
isnull(round(soldLast180Days.soldLast180DaysRevenue, 0), 0) soldLast180DaysRevenue,
isnull(round(soldLast180Days.soldLast180DaysProfit, 0), 0) soldLast180DaysProfit,
datediff(day, si.adtimeins, getdate()) as age_days
from the_setitem si
/*
left join (
SELECT
si.acident sku,
case
when dogovoren.anPrice is null and matrica.anRebate is null then si.anRTPrice
when dogovoren.anPrice is not null then dogovoren.anPrice
when dogovoren.anPrice is null then si.anRTPrice * (1 - matrica.anRebate/100)
end as clientPrice
FROM tHE_SetItem si
left join (
select acident, anPrice
from vHE_SetSubjPriceItemExtToday
where acsubject = '1111'
) dogovoren
on dogovoren.acident = si.acident
left join (
select acClassif, anRebate
from vHE_SetSubjTypePriceCateg
where acSubjType = (select acsubjtypebuyer from tHE_SetSubj where acsubject = '1111')
) matrica
on si.acClassif = matrica.acClassif
) cenovnik
on cenovnik.sku = si.acident
*/
left join tHE_SetSubj ss
on ss.acsubject = si.acsupplier
left join (
select
mi.acident,
sum(mi.anQty) soldLast180DaysQty,
sum(mi.anQty * mi.anStockPrice) soldLast180DaysCogs,
sum(mi.anPVVATBase) soldLast180DaysRevenue,
sum(mi.anPVVATBase - mi.anQty * mi.anStockPrice) soldLast180DaysProfit
from the_moveitem mi
left join the_move m
on m.ackey = mi.ackey
where m.acDocType in ('3000', '3050', '3190', '3800', '3550', '3X10', '3950', '3500', '3510', '6700', '3A00', '3210', '3220', '3230', '3240', '3450', '3250', '3260', '3270', '3540', '3460', '3280', '3290', '3310', '3320', '3440', '3330', '3340', '3350', '3360', '3370', '3380', '3390', '3410', '3470', '3420', '3430', '3480', '3490', '3520', '3530', '3560', '3610', '2540', '2740', '2730'
) and m.addate >= getdate() - 180
group by mi.acident
) soldLast180Days
on soldLast180Days.acIdent = si.acident
) t1
) t2
where
t2.status < 11
order by
t2.status asc,
t2.stock_value desc
I am using SQL Server if it's relevant.

Not really an answer, but when I've had this problem I've just created temporary tables. In SQL Server you prefix the table name with # and the table is dropped when your session ends.
You could materialize your nested query t1 as a temporary table #t1:
CREATE TABLE #t1 (id INT, code NVARCHAR(7), etc...)
INSERT INTO #t1
select
si.anqid id,
CAST(rtrim(si.acident) as nvarchar(7)) as code,
CAST(rtrim(si.acname) as nvarchar(100)) as name,
etc..
SELECT * FROM .... #t1 ...
Replace all references to t1 with #t1
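A runnable sketch of that idea, using SELECT ... INTO so the temp table's columns are inferred rather than declared by hand. The table variable @setitem and its three columns are hypothetical stand-ins for the real inner query, and the index is an optional extra; whether any of this helps depends on the actual execution plan.

```sql
-- Hypothetical stand-in for the inner query's source data.
DECLARE @setitem TABLE (id INT PRIMARY KEY, code NVARCHAR(7), stock INT);
INSERT INTO @setitem (id, code, stock)
VALUES (1, N'A1', 5), (2, N'B2', 0), (3, N'C3', 12);

-- Drop any leftover copy from an earlier run in this session.
IF OBJECT_ID('tempdb..#t1') IS NOT NULL
    DROP TABLE #t1;

-- Materialize the inner query once; column names and types are inferred.
SELECT s.id, s.code, s.stock
INTO #t1
FROM @setitem AS s;

-- Optional: index the column the outer query joins or filters on.
CREATE CLUSTERED INDEX IX_t1_code ON #t1 (code);

-- The outer query now reads the already-computed temp table.
SELECT t1.id, t1.code, t1.stock
FROM #t1 AS t1
WHERE t1.stock > 0
ORDER BY t1.stock DESC;

DROP TABLE #t1;
```

Splitting the work this way gives the optimizer real row counts for the intermediate result instead of an estimate through several nested derived tables.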

SQL query cannot be converted to LINQ

When I try to convert this piece of SQL to LINQ, I receive this error:
SQL cannot be converted to LINQ: Table [GL] not found in the current Data Context.
But in SQL it works fine.
This is my SQL query:
SELECT itm.Id ,
ISNULL(itm.Debit, 0) AS Debit ,
ISNULL(itm.Credit, 0) AS Credit ,
itm.State ,
itm.DocCreateDate ,
ISNULL(itm.Num, 0) AS Num ,
ISNULL(itm.DocTypeRef, 0) AS DocTypeRef ,
itm.Year ,
itm.Month ,
ISNULL(itm.DebitCount, 0) AS DebitCount ,
ISNULL(itm.CreditCount, 0) AS CreditCount ,
itm.DL ,
itm.DL2 ,
itm.DL3 ,
itm.DL4 ,
itm.DL5 ,
itm.DL6 ,
itm.DL7 ,
ISNULL(itm.FCRef, 0) AS FCRef ,
itm.FollowUpNum ,
ISNULL(itm.BranchRef, 1) AS BranchRef ,
itm.DocHeaderRef ,
ISNULL(itm.RowNum, 0) AS RowNum ,
ISNULL(itm.DailyNum, 0) AS DailyNum ,
ISNULL(itm.TempNum, 0) AS TempNum ,
ISNULL(itm.RefNum, 0) AS RefNum ,
itm.Descript ,
itm.Count ,
itm.FollowUpDate ,
itm.FCVal ,
itm.FCRateVal ,
itm.FactorNum ,
ISNULL(itm.DebitFCVal, 0) AS DebitFCVal ,
ISNULL(itm.CreditFCVal, 0) AS CreditFCVal ,
sl.Id AS SLRef ,
sl.SLCode ,
sl.Title AS SLTitle ,
CASE WHEN ISNULL(sl.DLSRef, 0) > 0 THEN 1
ELSE 0
END AS HasDL ,
CASE WHEN ISNULL(sl.DLSRef2, 0) > 0 THEN 1
ELSE 0
END AS HasDL2 ,
CASE WHEN ISNULL(sl.DLSRef3, 0) > 0 THEN 1
ELSE 0
END AS HasDL3 ,
CASE WHEN ISNULL(sl.DLSRef4, 0) > 0 THEN 1
ELSE 0
END AS HasDL4 ,
CASE WHEN ISNULL(sl.DLSRef5, 0) > 0 THEN 1
ELSE 0
END AS HasDL5 ,
CASE WHEN ISNULL(sl.DLSRef6, 0) > 0 THEN 1
ELSE 0
END AS HasDL6 ,
CASE WHEN ISNULL(sl.DLSRef7, 0) > 0 THEN 1
ELSE 0
END AS HasDL7 ,
CASE WHEN ISNULL(sl.HasFC, 0) > 0 THEN 1
ELSE 0
END AS HasFC ,
1 AS HasFollow ,
tl.Id AS TLRef ,
tl.Title AS TLTitle ,
tl.TLCode ,
gl.Id AS GLRef ,
gl.Title AS GLTitle ,
gl.GLCode ,
gl.Balance AS GLBalance
FROM Acc.DocItem AS itm
LEFT OUTER JOIN Acc.SL AS sl ON itm.SLRef = sl.Id
LEFT OUTER JOIN Acc.TL AS tl ON sl.TLRef = tl.Id
LEFT OUTER JOIN Acc.GL AS gl ON tl.GLRef = gl.Id
WHERE ( itm.SLRef > 0 )
If there is no way to get past this error, can you tell me the LINQ equivalent?

There is a way to do that in LINQ:
var q =
from c in categories
join p in products on c.Category equals p.Category into ps
from p in ps.DefaultIfEmpty()
select new { Category = c, ProductName = p == null ? "(No products)" : p.ProductName };
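This is the general pattern for a left outer join: 'into' names the joined group (loosely like an alias introduced with 'as' in a SQL query), and DefaultIfEmpty() keeps the categories that have no matching products.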

Calculations in SQL and preventing Divide By Zero

I am trying to rework a query that is based on cursors.
The query calculates certain stats based on multiple values. In the snippet below, the first, second and third CASE expressions work out the happiness of Unit1. If any of these fields is 0 (they can never be NULL), I get a divide-by-zero error. I could just add 1 to each field, (Unit2 + 1) / (Unit1 + 1), and that would stop the error. However, it seems like a bodge and can give an incorrect result. For example, Unit1 needs the same amount of Unit2 to keep them happy; if I have one Unit1 and no Unit2, this bodge will give 100% happy for that check. So my first problem is how to prevent the divide by zero without distorting the results. Each CASE gives me a % happy.
Select
CASE WHEN ((Unit2 / Unit1) * 100) > 100 Then 100 Else ((Unit2 / Unit1) * 100) END Unit1Happy1,
CASE WHEN ((Stock3 / (Unit1 * 2)) * 100) > 100 Then 100 Else ((Stock3 / (Unit1 * 2)) * 100) END Unit1Happy2,
CASE WHEN (((Drug3 + (Drug1 / 2)) / Unit1) * 100) > 100 Then 100 Else (((Drug3 + (Drug1 / 2)) / Unit1) * 100) END Unit1Happy3,
CASE WHEN (((Weapon6 + Weapon7 + Weapon8 + Weapon9) / Unit2) * 100) > 100 Then 100 ELSE (((Weapon6 + Weapon7 + Weapon8 + Weapon9) / Unit2) * 100) END Unit2Happ1,
CASE WHEN (((Stock2 + (Stock1 / 2)) / Unit2) * 100) > 100 Then 100 Else (((Stock2 + (Stock1 / 2)) / Unit2) * 100) END Unit2Happ2
FROM tblUserFiles
My next problem is that I need to take the lowest value for each UnitHappiness and store that value in the table. tblUserFiles has 5 fields: Unit1Happ, Unit2Happ .... Unit5Happ. Looking at the above query, if Unit1Happy1 is the lowest figure I store that figure in Unit1Happ; if Unit2Happy is the lowest I store that, etc.
My record Identifier is UserId and I need to run this for a given UserId or for the whole table.
What I am basically asking is:
What is the best method to identify the lowest value of each Unit calculation?
What is the best way to prevent the divide by zero error?
Can I approach this problem in a better way?
Update
I am working through the suggestions posted in the answers below. As this is just a training exercise, it may take a while. I do have a working query that gives the results I am looking for; I am just not sure whether the suggested answers would be more efficient.
Update tblUserFiles Set Unit1Happ = happyvals.Unit1Happiness, Unit2Happ = happyvals.Unit2Happiness, Unit3Happ = happyvals.Unit3Happiness, Unit4Happ = happyvals.Unit4Happiness, Unit5Happ = happyvals.Unit5Happiness FROM
(SELECT ch.UserId,
Case When ch.Unit1Happy1 < ch.Unit1Happy2 And ch.Unit1Happy1 < ch.Unit1Happy3 Then ch.Unit1Happy1
When ch.Unit1Happy2 < ch.Unit1Happy1 And ch.Unit1Happy2 < ch.Unit1Happy3 Then ch.Unit1Happy2
Else ch.Unit1Happy3
End As Unit1Happiness,
CASE WHEN ch.Unit2Happy1 > ch.Unit2Happy2 THEN ch.Unit2Happy1
ELSE ch.Unit2Happy2
END AS Unit2Happiness,
ch.Unit3Happy1 AS Unit3Happiness,
ch.Unit4Happy1 AS Unit4Happiness,
ch.Unit5Happy1 AS Unit5Happiness
FROM
(
Select
UserId,
CASE WHEN Unit2 = 0 OR Unit1 = 0 THEN 0
WHEN ((Unit2 / Unit1) * 100) > 100 Then 100
ELSE ((Unit2 / Unit1) * 100)
END Unit1Happy1,
CASE WHEN Stock3 = 0 OR Unit1 = 0 THEN 0
WHEN ((Stock3 / (Unit1 * 2)) * 100) > 100 THEN 100
ELSE ((Stock3 / (Unit1 * 2)) * 100)
END Unit1Happy2,
CASE WHEN Unit1 = 0 THEN 0
WHEN Drug3 = 0 AND Drug1 = 0 THEN 0
WHEN Drug1 = 0 THEN
CASE WHEN (Drug3 / Unit1) * 100 > 100 THEN 100
ELSE (Drug3 / Unit1) * 100
END
WHEN Drug3 = 0 THEN
CASE WHEN ((Drug1 / 2) / Unit1) * 100 > 100 THEN 100
ELSE ((Drug1 / 2) / Unit1) * 100
END
ELSE
CASE WHEN (((Drug3 + (Drug1 / 2)) / Unit1) * 100) > 100 THEN 100
ELSE (((Drug3 + (Drug1 / 2)) / Unit1) * 100)
END
END Unit1Happy3,
CASE WHEN Unit2 = 0 THEN 0
WHEN (Weapon6 + Weapon7 + Weapon8 + Weapon9) = 0 THEN 0
WHEN (((Weapon6 + Weapon7 + Weapon8 + Weapon9) / Unit2) * 100) > 100 THEN 100
ELSE (((Weapon6 + Weapon7 + Weapon8 + Weapon9) / Unit2) * 100)
END Unit2Happy1,
CASE WHEN Unit2 = 0 THEN 0
WHEN Stock1 = 0 AND Stock2 = 0 THEN 0
WHEN Stock1 = 0 THEN
CASE WHEN ((Stock2 / Unit2) * 100) > 100 THEN 100
ELSE ((Stock2 / Unit2) * 100)
END
WHEN Stock2 = 0 THEN
CASE WHEN (((Stock1 / 2) / Unit2) * 100) > 100 THEN 100
ELSE (((Stock1 / 2) / Unit2) * 100)
END
WHEN (((Stock2 + (Stock1 / 2)) / Unit2) * 100) > 100 THEN 100
ELSE (((Stock2 + (Stock1 / 2)) / Unit2) * 100)
END Unit2Happy2,
CASE WHEN Unit2 = 0 OR Unit3 = 0 THEN 0
WHEN ((Unit2 / Unit3) * 100) > 100 THEN 100
ELSE ((Unit2 / Unit3) * 100)
END Unit3Happy1,
CASE WHEN Unit2 = 0 OR Unit4 = 0 THEN 0
WHEN ((Unit2 / Unit4) * 100) > 100 THEN 100
ELSE ((Unit2 / Unit4) * 100)
END Unit4Happy1,
CASE WHEN Unit2 = 0 OR Unit5 = 0 THEN 0
WHEN ((Unit2 / Unit5) * 100) > 100 THEN 100
ELSE ((Unit2 / Unit5) * 100)
END Unit5Happy1
FROM tblUserFiles) ch) happyvals
Join tblUserFiles ON tblUserFiles.UserID = happyvals.UserID

Put NULLIF(Unit1, 0) around every divisor? Dividing by NULL yields NULL instead of raising a divide-by-zero error.
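A minimal sketch of the NULLIF approach for two of the Unit1 checks. The table variable @files and its rows are hypothetical; the column names follow the question. Two choices here are assumptions rather than anything stated in the thread: a missing Unit1 is reported as 0% (mirroring the CASE ... THEN 0 branches in the question's update), and the ratio is multiplied by 100.0 so that integer columns are not truncated by integer division.

```sql
-- Hypothetical sample rows; column names follow the question.
DECLARE @files TABLE (UserId INT PRIMARY KEY, Unit1 INT, Unit2 INT, Stock3 INT);
INSERT INTO @files (UserId, Unit1, Unit2, Stock3)
VALUES (1, 0, 5, 10),   -- Unit1 = 0: both ratios would otherwise divide by zero
       (2, 4, 2, 4),
       (3, 2, 6, 1);

SELECT f.UserId,
       -- A NULL ratio fails the > 100 test, falls to ELSE, and ISNULL maps it to 0.
       Unit1Happy1 = ISNULL(CASE WHEN r.Ratio1 > 100 THEN 100 ELSE r.Ratio1 END, 0),
       Unit1Happy2 = ISNULL(CASE WHEN r.Ratio2 > 100 THEN 100 ELSE r.Ratio2 END, 0)
FROM @files AS f
CROSS APPLY (
    -- NULLIF turns a zero divisor into NULL, so the division returns NULL, not an error.
    SELECT Ratio1 = f.Unit2  * 100.0 / NULLIF(f.Unit1, 0),
           Ratio2 = f.Stock3 * 100.0 / NULLIF(f.Unit1 * 2, 0)
) AS r;
```

Computing each ratio once in CROSS APPLY also avoids repeating the whole expression in both branches of the CASE.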

Something like this.
The idea is to wrap your division in a UDF (scalar user-defined function) and use the "Max(v)" trick.
The code below may not be perfect; I'm just supplying the idea.
Cursors are horrible performers 99.9% of the time. Try to solve this without cursors.
/* or Create */
ALTER FUNCTION dbo.udfSafeDivision (@num float , @denom float)
RETURNS float
AS
BEGIN
declare @returnValue float
select @returnValue = 0
if(isnull(@denom,0) != 0)
BEGIN
select @returnValue = convert(float, convert(float, @num)/convert(float, @denom))
END
return @returnValue
END
GO
IF OBJECT_ID('tempdb..#TableOne') IS NOT NULL
begin
drop table #TableOne
end
CREATE TABLE #TableOne
(
SurrogateKey int IDENTITY(1001, 1),
Unit1 int ,
Unit2 int ,
Stock1 int ,
Stock2 int ,
Stock3 int ,
Drug1 int ,
Drug3 int ,
Weapon6 int , Weapon7 int , Weapon8 int , Weapon9 int
)
Insert into #TableOne (Unit1, Unit2, Stock1, Stock2 , Stock3 , Drug1, Drug3 , Weapon6 , Weapon7 , Weapon8 , Weapon9)
select 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 1 , 2 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 1 , 2 , 3 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 1 , 2 , 3 , 4 , 0 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 1 , 2 , 3 , 4 , 5 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 1 , 2 , 3 , 4 , 5 , 6 , 0 , 0 , 0 , 0 , 0
UNION ALL select 1 , 2 , 3 , 4 , 5 , 6 , 7 , 0 , 0 , 0 , 0
UNION ALL select 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 0 , 0 , 0
UNION ALL select 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 0 , 0
UNION ALL select 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 0
UNION ALL select 1 , 2 , 3 , 4 , 5 , 6 , 7 , 8 , 9 , 10 , 11
UNION ALL select 5 , 1 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 5 , 0 , 0 , 0 , 5 , 0 , 0 , 0 , 0 , 0 , 0
UNION ALL select 10 , 0 , 0 , 0 , 0 , 5 , 10 , 0 , 0 , 0 , 0
UNION ALL select 0 , 1 , 0 , 0 , 0 , 5 , 10 , 10 , 20 , 30 , 40
UNION ALL select 0 , 50 , 20 , 10 , 0 , 0 , 0 , 0 , 0 , 0 , 0
SELECT
(SELECT Max(v)
FROM (VALUES (Unit1Happy1), (Unit1Happy2), (Unit1Happy3), (Unit2Happ1), (Unit2Happ2)) AS value(v)) as [MaxValue]
, '--------' as Sep1
, derived1.*
FROM
(
Select
CASE WHEN ((dbo.udfSafeDivision(Unit2 , Unit1)) * 100) > 100 Then 100 Else (dbo.udfSafeDivision(Unit2 , Unit1) * 100) END Unit1Happy1,
CASE WHEN (dbo.udfSafeDivision(Stock3 , (Unit1 * 2)) * 100) > 100 Then 100 Else (dbo.udfSafeDivision(Stock3 , (Unit1 * 2)) * 100) END Unit1Happy2,
CASE WHEN (dbo.udfSafeDivision((Drug3 + (Drug1 / 2)) , Unit1) * 100) > 100 Then 100 Else (dbo.udfSafeDivision((Drug3 + (Drug1 / 2)) , Unit1) * 100) END Unit1Happy3,
CASE WHEN (dbo.udfSafeDivision((Weapon6 + Weapon7 + Weapon8 + Weapon9) , Unit2) * 100) > 100 Then 100 ELSE (dbo.udfSafeDivision((Weapon6 + Weapon7 + Weapon8 + Weapon9) , Unit2) * 100) END Unit2Happ1,
CASE WHEN (dbo.udfSafeDivision((Stock2 + (Stock1 / 2)) , Unit2) * 100) > 100 Then 100 Else (dbo.udfSafeDivision((Stock2 + (Stock1 / 2)) , Unit2) * 100) END Unit2Happ2
/* the below is to debug */
, '--' as Sep1
, Unit1, Unit2, Stock1, Stock2 , Stock3 , Drug1, Drug3 , Weapon6 , Weapon7 , Weapon8 , Weapon9
, dbo.udfSafeDivision(Unit2 , Unit1) as Div1
, dbo.udfSafeDivision(Stock3 , (Unit1 * 2)) as Div3
FROM #TableOne
) as derived1
IF OBJECT_ID('tempdb..#TableOne') IS NOT NULL
begin
drop table #TableOne
end
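Since the question asks for the lowest value per unit, the same VALUES construct can be used with MIN instead of MAX. A minimal sketch under that assumption; the table variable @happy and its two rows are hypothetical, and the column names follow the question.

```sql
-- Hypothetical per-user happiness figures, already clamped to 0..100.
DECLARE @happy TABLE (UserId INT PRIMARY KEY,
                      Unit1Happy1 FLOAT, Unit1Happy2 FLOAT, Unit1Happy3 FLOAT);
INSERT INTO @happy (UserId, Unit1Happy1, Unit1Happy2, Unit1Happy3)
VALUES (1, 100, 50, 75),
       (2, 20, 20, 90);   -- the two lowest values tie

-- Turn the three columns of each row into three rows, then take their minimum.
SELECT h.UserId,
       m.Unit1Happiness
FROM @happy AS h
CROSS APPLY (
    SELECT Unit1Happiness = MIN(x.v)
    FROM (VALUES (h.Unit1Happy1), (h.Unit1Happy2), (h.Unit1Happy3)) AS x(v)
) AS m;
```

This returns 50 for the first user and 20 for the second. A CASE chain built from strict < comparisons can fall through to its ELSE branch when the two lowest values tie, as in the second row, so MIN over VALUES is the less error-prone form.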

Query Runs Slower The Second Time

I have this massive query that I can typically run in under 2 minutes. However, when I run it a second time about a minute after, it goes on infinitely... so I kill the process and my SSMS session. I don't have any other jobs running in the background.
Is something else being retained on the server? I think I'm missing something as far as how SQL Server works.
Thanks.
EDIT: Here's the SQL (had to do a little obfuscation)
SELECT pl.OrangeLocationID ,
e.EventID ,
cr.Title AS [Event Type] ,
su.LastName + ', ' + su.FirstName AS FMR ,
CONVERT(VARCHAR(20), pl.Report_Date, 101) AS [Report Entry Date] ,
l.Name ,
l.Number ,
ll.SodaPopLocationID AS [SodaPop Location ID] ,
l.Zip ,
c.Channel ,
pl.DT AS [ReportedDate] ,
RIGHT(pl.DT_start, 8) AS [ReportedStartTime] ,
RIGHT(pl.DT_end, 8) AS [ReportedEndTime] ,
[CMS].dbo.dDateDiff(pl.DT_start, pl.DT_end) AS [ReportedDuration] ,
pl.scheduled_date AS [ScheduledDate] ,
RIGHT(pl.scheduled_start, 8) AS [ScheduledStartTime] ,
RIGHT(pl.scheduled_end, 8) AS [ScheduledEndTime] ,
[CMS].dbo.dDateDiff(pl.scheduled_start, pl.DT_end) AS [ScheduledDuration] ,
e.HoursPaid AS [Rep Hours Worked] ,
ISNULL(PP.[RepCount], 0) AS [RepCount] ,
CASE WHEN [CMS].dbo.dDateDiff(pl.DT_start, pl.DT_end) = ( e.HoursPaid / ISNULL(PP.[RepCount], 1) )
THEN [CMS].dbo.oa_HourDateDiff(pl.DT_start, pl.DT_end)
WHEN [CMS].dbo.dDateDiff(pl.scheduled_start, pl.DT_end) = ( e.HoursPaid / ISNULL(PP.[RepCount], 1) )
THEN [CMS].dbo.oa_HourDateDiff(pl.scheduled_start, pl.DT_end)
ELSE ( e.HoursPaid / ISNULL(PP.[RepCount], 1) )
END AS [FinalDuration] ,
g.[Description] AS [OA Market] ,
g.SodaPop_Region AS [SodaPop Region] ,
g.SodaPop_Area AS [SodaPop Area] ,
coup4 ,
coupo ,
coupo_e ,
card_num ,
promo ,
promo_no ,
promo_no_o ,
highlight1 ,
highlight2 ,
highlight3 ,
mgmt_reaction ,
mgmt_reaction_e ,
comm_p ,
comm_n ,
r.comments ,
s_fname ,
s_lname ,
v_title ,
ll.KeyAccountCorp AS [Key Account Corp.] ,
interact_new + interact_rep AS [interact_total] ,
samp_new + samp_rep AS [samp_total] ,
purch_new + purch_rep AS [purch_total] ,
23 / ( NULLIF(( interact_new + interact_rep ), 0) * 1.0 ) AS [Int_Crate] ,
CASE WHEN sampletype = 11 THEN ( purch_new + purch_rep ) / ( NULLIF(( samp_new + samp_rep ), 0) * 1.0 )
ELSE NULL
END AS [Samp_Crate] ,
coup1 + coup2 AS [coup_total] ,
CASE WHEN coup1 + coupo > 0 THEN 1
ELSE 0
END AS [CoupDist] ,
DATEPART(month, pl.DT) AS [Visit_Month] ,
DATEPART(quarter, pl.DT) AS [Quarter] ,
DATEPART(weekday, pl.DT) AS [Weekday] ,
CASE DATEPART(weekday, pl.DT)
WHEN 6 THEN 'Fri'
WHEN 7 THEN 'Sat'
WHEN 1 THEN 'Sun'
ELSE 'Mon-Thurs'
END AS [Weekday_Grouped] ,
CASE WHEN dbo.Exception(pl.OrangeLocationID, 12) = 1
OR dbo.Exception(pl.OrangeLocationID, 13) = 1
OR dbo.Exception(pl.OrangeLocationID, 14) = 1 THEN 1
ELSE 0
END AS [EVolume] ,
CASE WHEN dbo.DoesHaveException(pl.OrangeLocationID, 18) = 1 THEN 1
ELSE 0
END AS [CVolume] ,
CASE WHEN dbo.eException(pl.OrangeLocationID, 9) = 1
OR dbo.eException(pl.OrangeLocationID, 22) = 1 THEN 1
ELSE 0
END AS [Volumes] ,
CASE WHEN dbo.eException(pl.OrangeLocationID, 8) = 1
OR dbo.eException(pl.OrangeLocationID, 21) = 1 THEN 1
ELSE 0
END AS [Sales Price] ,
CASE WHEN dbo.eException(pl.OrangeLocationID, 11) = 1 THEN 1
ELSE 0
END AS [Sample Volume] ,
ISNULL(i.[NormalizedSold], 0) AS [EQBottlesSold] ,
CASE WHEN ISNULL(purch_new, 0) = 0 THEN 0
ELSE ISNULL(i.[NormalizedSold], 0) / ( purch_new + purch_rep )
END AS [EQBottlesSoldPerPurch] ,
ac.AvgSales ,
ac.STDEVSales ,
( ISNULL(i.[NormalizedSold], 0) - ac.AvgSales ) / ac.STDEVSales AS [sl] ,
ac.AvgPurchasers ,
ac.STDEVPurchasers ,
( ISNULL(r.purch_new, 0) - ac.AvgPurchasers ) / ac.STDEVPurchasers AS [ZScore_Purchasers] ,
ac.AvgConversions ,
ac.STDEVConversions ,
( ISNULL(( purch_new + purch_rep ) / ( NULLIF(( interact_new ), 0) ), 0) - ac.AvgConversions )
/ ac.STDEVConversions AS [ZScore_Conversions] ,
ac.[AvgSalesPerPurchaser] ,
ac.[STDEVSalesPerPurchaser] ,
( ISNULL(( CASE WHEN ISNULL(purch_new, 0) = 0 THEN 0
ELSE ISNULL(i.[NormalizedSold], 0) / ( purch_new + purch_rep )
END ), 0) - ac.[AvgSalesPerPurchaser] ) / ac.[STDEVSalesPerPurchaser] AS [SalesPerPurchaser] ,
( ( ( ISNULL(i.[NormalizedSold], 0) - ac.AvgSales ) / ac.STDEVSales )
+ ( (ISNULL(( CASE WHEN ISNULL(purch_new + purch_rep, 0) = 0 THEN 0
ELSE ISNULL(i.[NormalizedSold], 0) / ( purch_new + purch_rep )
END ), 0) - ac.[AvgSalesPerPurchaser]) ) ) / 4 AS [core] ,
( ( (( ISNULL(i.[NormalizedSold], 0) - ac.AvgSales ) / ac.STDEVSales) ) / 4 ) + 3 AS [core] ,
su.aaUserID ,
l.LocationID
FROM [CMS_SodaPop].dbo.Schedule pl WITH ( NOLOCK )
INNER JOIN [CMS_SodaPop].dbo.Report r WITH ( NOLOCK ) ON r.OrangeLocationID = pl.OrangeLocationID
INNER JOIN [CMS].dbo.Users su WITH ( NOLOCK ) ON su.UserID = pl.Rep_FMR
INNER JOIN [CMS].dbo.Locations l WITH ( NOLOCK ) ON l.LocationID = pl.LocationID
INNER JOIN [CMS].dbo.OrangeReports cr WITH ( NOLOCK ) ON cr.RedID = pl.RedID
INNER JOIN [CMS_SodaPop].dbo.Events e WITH ( NOLOCK ) ON e.OrangeLocationID = pl.OrangeLocationID
INNER JOIN [CMS_SodaPop].dbo.MarketList g WITH ( NOLOCK ) ON g.GroupID = pl.GroupID
INNER JOIN [CMS_SodaPop].dbo.Locations ll WITH ( NOLOCK ) ON ll.LocationID = pl.LocationID
LEFT JOIN [CMS_SodaPop].dbo.Channels c WITH ( NOLOCK ) ON ll.ChannelID = c.ChannelID
LEFT JOIN ( SELECT OrangeLocationID ,
COUNT(DISTINCT UserID) AS [RepCount]
FROM [CMS_roll].dbo.rollItems WITH ( NOLOCK )
WHERE RedID = 154
GROUP BY OrangeLocationID
) PP ON PP.OrangeLocationID = pl.OrangeLocationID
LEFT JOIN ( SELECT OrangeLocationID ,
SUM(NormalizedSold) AS [NormalizedSold]
FROM [Analysis].dbo.[vSodaPop_Retail_Inventory] WITH ( NOLOCK )
GROUP BY OrangeLocationID
) i ON i.OrangeLocationID = pl.OrangeLocationID
LEFT JOIN [Analysis].dbo.[vSodaPop_Calculations] ac WITH ( NOLOCK ) ON ac.[Quarter] = CASE WHEN DATEPART(MM,
[DT]) IN ( 10,
11, 12 ) THEN 4
END
AND ac.[Year] = DATEPART(YY, pl.DT)
WHERE pl.Activity = 1
AND pl.RedID = 154
AND pl.GroupID <> 444
AND pl.[DT] < GETDATE()
AND DATEPART(YY, [DT]) >= 2010
AND ISNULL(i.NormalizedSold, 0) >= 0
AND DATEPART(year, GETDATE()) = DATEPART(year, r.Insert_Date)

I would have to see the query to really dig in; however, you could try adding OPTION (RECOMPILE) to the end of the query to force it to create a new execution plan.
Are you using temp tables?
Are there cursors that were not closed and deallocated?
You can look at Profiler to see if anything looks different between the two executions.

Are you sure it isn't being blocked by another process the second time?
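One way to check for blocking is to query the sys.dm_exec_requests dynamic management view from a second session while the slow run is in progress. This is only a sketch of a first diagnostic step; it requires VIEW SERVER STATE permission, and an empty result means nothing is blocked at that moment.

```sql
-- Run from a second session while the slow query is executing.
SELECT r.session_id,
       r.blocking_session_id,   -- session that holds the resource being waited on
       r.wait_type,
       r.wait_time,             -- milliseconds spent in the current wait
       r.status,
       r.command
FROM sys.dm_exec_requests AS r
WHERE r.blocking_session_id <> 0;
```

If the stuck query's session_id appears here, the blocking_session_id column identifies the session to investigate next.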

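What happens if you execute
CHECKPOINT;
GO
DBCC DROPCLEANBUFFERS;
GO
DBCC FREEPROCCACHE;
GO
between the queries? This is not really a solution, but it will help with diagnosis. Run it only on a test server, since it clears the buffer pool and the plan cache for the whole instance.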
