Oracle SQL tuning: multiple counts and distinct

Hi all, I have the following query running on Oracle.
SELECT DISTINCT
T_TRATAMIENTO.CampaignID AS CAMPAIGNID,
T_TRATAMIENTO.OfferID AS OFFERID,
T_CALENDARIO.ActualDate AS ACTUALDATE,
count(CASE T_TRATAMIENTO.CntrlTreatmtFlag WHEN 0 THEN T_TRATAMIENTO.TreatmentSize END) as NUM_OF_OFFERS,
count(CASE T_TRATAMIENTO.CntrlTreatmtFlag WHEN 1 THEN T_TRATAMIENTO.TreatmentSize END) as NUM_OF_OFFERS_CG,
count (distinct (case T_TRATAMIENTO.CntrlTreatmtFlag when 0 then T_TRATAMIENTO.OfferHistoryID END)) as NUM_OFF_VERS,
count (distinct (case T_TRATAMIENTO.CntrlTreatmtFlag when 1 then T_TRATAMIENTO.OfferHistoryID END)) as NUM_OFF_VERS_CG,
count(distinct (CASE WHEN T_TRATAMIENTO.CntrlTreatmtFlag = 0 and T_ESTATUSCONTACTO.CountsAsContact=1 THEN T_HISTORIALCONTACTO.CustomerID END)) as UNIQUE_RECIPIENTS,
count(distinct (CASE T_TRATAMIENTO.CntrlTreatmtFlag WHEN 1 THEN T_HISTORIALCONTACTO.CustomerID END)) as UNIQUE_RECIP_CG FROM
T_ESTATUSCONTACTO,
T_CALENDARIO,
T_TRATAMIENTO
LEFT OUTER JOIN
T_HISTORIALCONTACTO ON
T_TRATAMIENTO.PackageID = T_HISTORIALCONTACTO.PackageID
WHERE
T_HISTORIALCONTACTO.CellID = T_TRATAMIENTO.CellID
AND
T_HISTORIALCONTACTO.ContactStatusID = T_ESTATUSCONTACTO.ContactStatusID
AND
T_HISTORIALCONTACTO.DateID = T_CALENDARIO.DateID
AND
T_TRATAMIENTO.HasDetailHistory = 0 GROUP BY
T_TRATAMIENTO.CampaignID,
T_TRATAMIENTO.OfferID, T_CALENDARIO.ActualDate;
Table T_HISTORIALCONTACTO has 80 million records and is still growing; the other tables have fewer than 100 records each. The response time is a problem, and the plan shows a full table scan. I have already added indexes, but performance is still slow.
How can I tune this SQL query? What would you recommend? I really appreciate your help. Thanks in advance.

First, you don't need the leading DISTINCT, because you already GROUP BY the first three columns (CAMPAIGNID, OFFERID, ACTUALDATE).
Second, I recommend avoiding the Cartesian merge join, which is very expensive. Try the join approach below.
If a Cartesian join still occurs, try joining T_ESTATUSCONTACTO, T_CALENDARIO and T_TRATAMIENTO to each other directly.
Currently each of them is joined only to T_HISTORIALCONTACTO:
SELECT
T_TRATAMIENTO.CampaignID AS CAMPAIGNID,
T_TRATAMIENTO.OfferID AS OFFERID,
T_CALENDARIO.ActualDate AS ACTUALDATE,
count(CASE T_TRATAMIENTO.CntrlTreatmtFlag WHEN 0 THEN T_TRATAMIENTO.TreatmentSize END) as NUM_OF_OFFERS,
count(CASE T_TRATAMIENTO.CntrlTreatmtFlag WHEN 1 THEN T_TRATAMIENTO.TreatmentSize END) as NUM_OF_OFFERS_CG,
count (distinct (case T_TRATAMIENTO.CntrlTreatmtFlag when 0 then T_TRATAMIENTO.OfferHistoryID END)) as NUM_OFF_VERS,
count (distinct (case T_TRATAMIENTO.CntrlTreatmtFlag when 1 then T_TRATAMIENTO.OfferHistoryID END)) as NUM_OFF_VERS_CG,
count(distinct (CASE WHEN T_TRATAMIENTO.CntrlTreatmtFlag = 0 and T_ESTATUSCONTACTO.CountsAsContact=1 THEN T_HISTORIALCONTACTO.CustomerID END)) as UNIQUE_RECIPIENTS,
count(distinct (CASE T_TRATAMIENTO.CntrlTreatmtFlag WHEN 1 THEN T_HISTORIALCONTACTO.CustomerID END)) as UNIQUE_RECIP_CG FROM
T_TRATAMIENTO
LEFT OUTER JOIN T_HISTORIALCONTACTO ON T_HISTORIALCONTACTO.PackageID = T_TRATAMIENTO.PackageID
AND T_HISTORIALCONTACTO.CellID = T_TRATAMIENTO.CellID
JOIN T_CALENDARIO on T_HISTORIALCONTACTO.DateID = T_CALENDARIO.DateID
JOIN T_ESTATUSCONTACTO on T_HISTORIALCONTACTO.ContactStatusID = T_ESTATUSCONTACTO.ContactStatusID
WHERE
T_TRATAMIENTO.HasDetailHistory = 0
GROUP BY
T_TRATAMIENTO.CampaignID,
T_TRATAMIENTO.OfferID, T_CALENDARIO.ActualDate;
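The first point (DISTINCT is redundant once the same columns are in GROUP BY) can be checked on a toy table. This is only a sketch: it runs on SQLite through Python's sqlite3 module rather than on Oracle, and the `treatment` table and its columns are hypothetical stand-ins for T_TRATAMIENTO. It shows that the result set is the same, not how much time Oracle would save.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, simplified stand-in for T_TRATAMIENTO
    CREATE TABLE treatment (campaign_id INTEGER, offer_id INTEGER, flag INTEGER);
    INSERT INTO treatment VALUES
        (1, 10, 0), (1, 10, 1), (1, 11, 0), (2, 10, 0), (2, 10, 0);
""")

# Same query, with and without the leading DISTINCT
grouped = """
    SELECT {distinct} campaign_id, offer_id,
           COUNT(CASE flag WHEN 0 THEN 1 END) AS num_offers,
           COUNT(CASE flag WHEN 1 THEN 1 END) AS num_offers_cg
    FROM treatment
    GROUP BY campaign_id, offer_id
    ORDER BY campaign_id, offer_id
"""
with_distinct = conn.execute(grouped.format(distinct="DISTINCT")).fetchall()
without_distinct = conn.execute(grouped.format(distinct="")).fetchall()

# GROUP BY already yields one row per (campaign_id, offer_id)
print(with_distinct == without_distinct)
print(without_distinct)
```

Because each group already produces exactly one row, the extra DISTINCT can only add work; it never changes the rows returned.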

Related

Sum a column and perform more calculations on the result? [duplicate]

In my query below I am counting occurrences in a table based on the Status column. I also want to perform calculations based on the counts I am returning. For example, let's say I want to add 100 to the Snoozed value... how do I do this? Below is what I thought would do it:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
Snoozed + 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
I get this error:
Invalid column name 'Snoozed'.
How can I take the value of the previous SUM statement, add 100 to it, and return it as another column? What I was aiming for is an additional column labeled Test that has the Snooze count + 100.
You can't use one column to build another column in the same SELECT the way you are attempting. You have two options:
Do the full calculation (as @forpas mentioned in the comments above).
Use a temp table or table variable to store the data: select the existing columns into it, then compute the last column when you select from the temp table.
You cannot reference an alias as a column in the same SELECT list. The correct script is:
SELECT
pu.ID Id, pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+100 AS Snoozed
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
MSSQL does not allow you to reference fields (or aliases) in the SELECT statement from within the same SELECT statement.
To work around this:
Use a CTE. Define the columns you want to select from in the CTE, and then select from them outside the CTE.
;WITH OurCte AS (
SELECT
5 + 5 - 3 AS OurInitialValue
)
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM OurCte
Use a temp table. This is very similar in functionality to using a CTE, however, it does have different performance implications.
SELECT
5 + 5 - 3 AS OurInitialValue
INTO #OurTempTable
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM #OurTempTable
Use a subquery. This tends to be more difficult to read than the above. I'm not certain what the advantage is to this - maybe someone in the comments can enlighten me.
SELECT
OurInitialValue / 2 AS OurFinalValue
FROM (
SELECT
5 + 5 - 3 AS OurInitialValue
) OurSubquery
Embed your calculations. (Opinion warning.) This is really sloppy and not a great approach: you end up duplicating code, and the columns can easily get out of sync if you update the calculation in one place and not the other.
SELECT
5 + 5 - 3 AS OurInitialValue
, (5 + 5 - 3) / 2 AS OurFinalValue
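Applied to the question's query, the CTE workaround might look like the sketch below. It runs on SQLite through Python's sqlite3 module rather than on SQL Server, and it uses a cut-down, hypothetical version of the Prospects table (no join to ProspectsUsers, made-up rows), so treat it as an illustration of the pattern only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical, cut-down version of the Prospects table
    CREATE TABLE Prospects (OwnerId TEXT, Status TEXT, SnoozedId INTEGER);
    INSERT INTO Prospects VALUES
        ('a', 'Working', 0), ('a', 'Converted', 7),
        ('a', 'Working', 3), ('b', 'Uninterested', 0);
""")

rows = conn.execute("""
    WITH Counts AS (
        -- The aggregate is computed once and named here...
        SELECT OwnerId,
               COUNT(*) AS LeadCount,
               SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed
        FROM Prospects
        GROUP BY OwnerId
    )
    -- ...so the outer SELECT can reference the alias freely
    SELECT OwnerId, LeadCount, Snoozed, Snoozed + 100 AS Test
    FROM Counts
    ORDER BY OwnerId
""").fetchall()

print(rows)
```

The alias Snoozed is an ordinary column by the time the outer SELECT runs, which is why `Snoozed + 100` resolves there and not inside the CTE itself.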
You can't use a column alias in the same SELECT. Column aliases have no precedence or sequence among themselves; they are all assigned at once when the SELECT list is evaluated, which logically happens after GROUP BY and before ORDER BY.
You must repeat the code:
SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END)+ 100 AS Test
FROM
Prospects p
INNER JOIN
ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE
p.Store = '108'
GROUP BY
pu.Name, pu.Id
ORDER BY
Name
If you don't want to repeat the code, use a subquery
SELECT
ID, Name, LeadCount, Working, Uninterested,Converted, Snoozed, Snoozed +100 AS test
FROM
(SELECT
pu.ID Id,pu.Name Name,
COUNT(*) LeadCount,
SUM(CASE WHEN Status = 'Working' THEN 1 ELSE 0 END) AS Working,
SUM(CASE WHEN Status = 'Uninterested' THEN 1 ELSE 0 END) AS Uninterested,
SUM(CASE WHEN Status = 'Converted' THEN 1 ELSE 0 END) AS Converted,
SUM(CASE WHEN SnoozedId > 0 THEN 1 ELSE 0 END) AS Snoozed
FROM Prospects p
INNER JOIN ProspectsUsers pu on p.OwnerId = pu.SalesForceId
WHERE p.Store = '108'
GROUP BY pu.Name, pu.Id) t
ORDER BY Name
or a view

Change sub query to join

I have used the subquery below to compute the percentage, but I need a query without a subquery. Can anyone please help me use joins to calculate the percentage?
Query used
SELECT 'Dropping_Percentage',
( Cast(dropped_count AS DECIMAL(16, 9)) / Cast(new_count AS DECIMAL(16, 9
)) ) *
100
FROM (SELECT count AS New_count,
'1' a
FROM new_count)a,
(SELECT Count(*) Dropped_count,
'1' b
FROM pfo_bhi_new N
RIGHT JOIN pfo_bhi_old o
ON o.id_membid_claimid_c = N.id_membid_claimid_c
WHERE N.id_membid_claimid_c IS NULL)c
WHERE a.a = c.b
"I need a query without subquery."
Answer:
SELECT Count(*) All_count
, SUM(CASE WHEN N.id_membid_claimid_c IS NULL THEN 1 else 0 end) as Dropped_count
, SUM(CASE WHEN N.id_membid_claimid_c IS NULL THEN 1 else 0 end) * 1.0 / Count(*) as Dropping_Percentage
FROM pfo_bhi_new N
RIGHT JOIN pfo_bhi_old o ON o.id_membid_claimid_c = N.id_membid_claimid_c
Explanation:
Reviewing the main query with assumptions about the content:
SELECT Count(*) Dropped_count, '1' b
FROM pfo_bhi_new N
RIGHT JOIN pfo_bhi_old o ON o.id_membid_claimid_c = N.id_membid_claimid_c
WHERE N.id_membid_claimid_c IS NULL
should give you the number of records in pfo_bhi_old that do not appear in pfo_bhi_new. The assumption here is that the total should be based on the existing right join, that is, all the matching and non-matching records.
Therefore it's possible to count all the existing records by removing the WHERE clause;
COUNT(*) then gives you that total.
Next you want to count the rows with no match, which gives you the "Dropping count" you had before with the WHERE clause, that is, the rows where id_membid_claimid_c is null: SUM(CASE WHEN N.id_membid_claimid_c IS NULL THEN 1 else 0 end). Said differently, it adds 1 to the sum only when the id_membid_claimid_c field is null; otherwise it adds zero (0).
I've multiplied the numerator by 1.0 to force SQL Server to use decimal arithmetic instead of integer division, which also keeps the query easier to read than explicit casts. Note that the result is a fraction between 0 and 1; multiply by 100 if you want a percentage.
Here's what it would look like if you needed DECIMAL(16,9) operands, as in the original query:
SELECT Count(*) All_count
, SUM(CASE WHEN N.id_membid_claimid_c IS NULL THEN 1 else 0 end) as Dropped_count
, SUM(CASE WHEN N.id_membid_claimid_c IS NULL THEN 1 else 0 end) * 1.0 / Count(*) as Dropping_Percentage
, CAST(SUM(CASE WHEN N.id_membid_claimid_c IS NULL THEN 1 else 0 end) AS DECIMAL(16,9)) / CAST(Count(*) AS DECIMAL(16,9)) as Dropping_Pct_16_9
FROM pfo_bhi_new N
RIGHT JOIN pfo_bhi_old o ON o.id_membid_claimid_c = N.id_membid_claimid_c
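A small worked example of the single-pass approach, as a sketch only. It runs on SQLite through Python's sqlite3 module, so two things differ from the answer: the RIGHT JOIN is written as the equivalent LEFT JOIN with the tables swapped (older SQLite versions have no RIGHT JOIN), and the ratio is multiplied by 100.0 to return a percentage. The sample ids are made up, and the ids in pfo_bhi_new are assumed to be unique, otherwise the join would multiply rows and inflate the counts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pfo_bhi_old (id_membid_claimid_c TEXT);
    CREATE TABLE pfo_bhi_new (id_membid_claimid_c TEXT);
    -- Made-up sample ids: B and D exist only in the old table
    INSERT INTO pfo_bhi_old VALUES ('A'), ('B'), ('C'), ('D');
    INSERT INTO pfo_bhi_new VALUES ('A'), ('C'), ('E');
""")

all_count, dropped_count, dropping_pct = conn.execute("""
    SELECT COUNT(*),
           SUM(CASE WHEN n.id_membid_claimid_c IS NULL THEN 1 ELSE 0 END),
           SUM(CASE WHEN n.id_membid_claimid_c IS NULL THEN 1 ELSE 0 END)
               * 100.0 / COUNT(*)
    FROM pfo_bhi_old o
    LEFT JOIN pfo_bhi_new n
           ON o.id_membid_claimid_c = n.id_membid_claimid_c
""").fetchone()

print(all_count, dropped_count, dropping_pct)
```

Every row of pfo_bhi_old survives the outer join, so COUNT(*) is the total and the conditional SUM picks out the unmatched rows in the same pass.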

Problems with my WHERE clause (SQL)

I'm trying to write a query that returns the following columns:
owner_id,
number_of_concluded_bookings,
number_of_declined_bookings,
number_of_inquiries
However, the problem is that my WHERE clause messes up the query because I am querying the same table. Here is the code:
SELECT owner_id,
Count(*) AS number_of_cancelled_bookings
FROM bookings
WHERE state IN ('cancelled')
GROUP BY owner_id
ORDER BY 1;
It's easy to retrieve the columns individually, but I want all of them. Say I wanted number_of_concluded_bookings as well, that would mean I'd have to alter the WHERE clause ...
Help is greatly appreciated!
Consider conditional aggregations:
SELECT owner_id,
SUM(CASE WHEN state='concluded' THEN 1 ELSE 0 END) AS number_of_concluded_bookings,
SUM(CASE WHEN state='cancelled' THEN 1 ELSE 0 END) AS number_of_cancelled_bookings,
SUM(CASE WHEN state='declined' THEN 1 ELSE 0 END) AS number_of_declined_bookings,
SUM(CASE WHEN state='inquiries' THEN 1 ELSE 0 END) AS number_of_inquiries
FROM bookings
GROUP BY owner_id
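To see the conditional aggregation at work, here is a minimal sketch that runs the same query on SQLite through Python's sqlite3 module. The bookings rows are made up for illustration, and an ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookings (owner_id INTEGER, state TEXT);
    -- Made-up sample rows
    INSERT INTO bookings VALUES
        (1, 'concluded'), (1, 'concluded'), (1, 'cancelled'),
        (2, 'declined'), (2, 'inquiries'), (2, 'concluded');
""")

rows = conn.execute("""
    SELECT owner_id,
           SUM(CASE WHEN state = 'concluded' THEN 1 ELSE 0 END) AS number_of_concluded_bookings,
           SUM(CASE WHEN state = 'cancelled' THEN 1 ELSE 0 END) AS number_of_cancelled_bookings,
           SUM(CASE WHEN state = 'declined'  THEN 1 ELSE 0 END) AS number_of_declined_bookings,
           SUM(CASE WHEN state = 'inquiries' THEN 1 ELSE 0 END) AS number_of_inquiries
    FROM bookings
    GROUP BY owner_id
    ORDER BY owner_id
""").fetchall()

print(rows)
```

Moving each filter from the WHERE clause into its own CASE expression is what lets one scan of the table feed all four counters.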

efficiently compute percentages in hive or sql

SELECT
(CASE WHEN tag='FRAUD' THEN 0
ELSE 1 END) fraud_tag,
COUNT(DISTINCT account_id) AS distinct_account_count
FROM fraud_tags a
GROUP BY
(CASE WHEN tag='FRAUD' THEN 0
ELSE 1 END)
RESULT
fraud_tag distinct_account_count
0 100
1 500
Now I want to compute the fraud percentage: the number of distinct accounts with fraud_tag=0 over the total number of distinct accounts. Right now I have to do it in two steps. Any suggestions to make it more efficient?
The easiest way is to do this with the values in one row:
SELECT COUNT(DISTINCT case when tag = 'FRAUD' then account_id end) as distinct_fraud,
COUNT(DISTINCT case when tag = 'FRAUD' then NULL else account_id end) as distinct_notfraud,
(COUNT(DISTINCT case when tag = 'FRAUD' then account_id end)*1.0/count(distinct account_id)
) as fraud_rate
FROM fraud_tags ft;
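The one-row version can be tried on a toy table. This sketch uses SQLite through Python's sqlite3 module instead of Hive, assumes `tag` holds the string 'FRAUD' for fraudulent rows, and uses made-up account data. Account 4 deliberately carries both tags to show that such an account is counted on both sides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fraud_tags (account_id INTEGER, tag TEXT);
    -- Made-up rows; account 4 has both a fraud and a non-fraud tag
    INSERT INTO fraud_tags VALUES
        (1, 'FRAUD'), (1, 'FRAUD'), (2, 'OK'),
        (3, 'OK'), (4, 'FRAUD'), (4, 'OK');
""")

distinct_fraud, distinct_notfraud, fraud_rate = conn.execute("""
    SELECT COUNT(DISTINCT CASE WHEN tag = 'FRAUD' THEN account_id END),
           COUNT(DISTINCT CASE WHEN tag = 'FRAUD' THEN NULL ELSE account_id END),
           COUNT(DISTINCT CASE WHEN tag = 'FRAUD' THEN account_id END) * 1.0
               / COUNT(DISTINCT account_id)
    FROM fraud_tags
""").fetchone()

print(distinct_fraud, distinct_notfraud, fraud_rate)
```

COUNT(DISTINCT ...) ignores the NULLs that the CASE produces for non-matching rows, so each expression counts only the accounts on its side of the condition.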

SQL - How to count yes and no items

I am using SQL Server 2008.
I am writing a query where I need to count how many yesses (1) and how many nos (0 or NULL).
SELECT B.Brand, B.BrandID, COUNT(M.ModelID) AS TotalModels
FROM Brands B LEFT JOIN Models M ON B.BrandID = M.BrandID
GROUP BY B.Brand, B.BrandID
ORDER BY B.Brand
There's another field called IsBestValue in the Model table that will be NULL, 0, or 1. I want to be able to count TotalBestValueYes, TotalBestValueNo, and TotalBestValueNULL.
A long time ago I used to use something like:
(CASE WHEN IsBestValue = 1 END) // ADD ONE TO TotalBestValueYes
(CASE WHEN IsBestValue = 0 END) // ADD ONE TO TotalBestValueNo
(CASE WHEN IsBestValue = NULL END) // ADD ONE TO TotalBestValueNULL
Is using CASE in this fashion a good idea? Bad idea? Overkill?
Is there a better way to count yeses, nos and NULLs?
I don't see anything wrong with using the CASE like that if this is what you mean.
SELECT B.Brand,
B.BrandID,
COUNT(M.ModelID) AS TotalModels,
SUM((CASE WHEN M.IsBestValue = 1 THEN 1 ELSE 0 END)) TotalBestValueYes,
SUM((CASE WHEN M.IsBestValue = 0 THEN 1 ELSE 0 END)) TotalBestValueNo,
SUM((CASE WHEN M.IsBestValue IS NULL THEN 1 ELSE 0 END)) TotalBestValueNull
FROM Brands B
LEFT JOIN Models M ON B.BrandID = M.BrandID
GROUP BY B.Brand,
B.BrandID
ORDER BY B.Brand
This is the perfect case for CASE (pun intended).
CASE is a very well optimized operator and was designed for just such a usage scenario.
The normal syntax for a conditional count is along the lines of:
SELECT SUM (CASE WHEN x=y then 1 ELSE 0 END) as 'XequalsY'
...
select count(nullif(IsBestValue, 0)) as TotalBestValueYes,
count(nullif(IsBestValue, 1)) as TotalBestValueNo,
count(case when IsBestValue is null then 1 end) as TotalBestValueNull
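The NULLIF variant relies on COUNT skipping NULLs: NULLIF(IsBestValue, 0) turns every 0 into NULL, so only the 1s are counted, and the reverse for NULLIF(IsBestValue, 1). A minimal sketch on SQLite through Python's sqlite3 module, with a hypothetical one-column Models table and made-up values:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical one-column version of the Models table
    CREATE TABLE Models (IsBestValue INTEGER);
    INSERT INTO Models VALUES (1), (1), (0), (NULL), (NULL), (1);
""")

total_yes, total_no, total_null = conn.execute("""
    SELECT COUNT(NULLIF(IsBestValue, 0)),                     -- 0s become NULL, 1s are counted
           COUNT(NULLIF(IsBestValue, 1)),                     -- 1s become NULL, 0s are counted
           COUNT(CASE WHEN IsBestValue IS NULL THEN 1 END)    -- counts the NULL rows
    FROM Models
""").fetchone()

print(total_yes, total_no, total_null)
```

This only works because the column is limited to NULL, 0 and 1; with any other value, both NULLIF counts would include it.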