How do I make this query fast? It times out every time I run it against the SQL Server database

SELECT firstpartno, nOccurrence, nMale, nFemale,
       COUNT(nMale) / CAST(
           (SELECT SUM(nOccurrence) AS Expr1
            FROM (SELECT COUNT(dbo.vw_Tally1.nMale) AS nOccurrence
                  FROM dbo.vw_Split4) AS SumTally) AS decimal) AS nMProportion,
       COUNT(nFemale) / CAST(
           (SELECT SUM(nOccurrence) AS Expr1
            FROM (SELECT COUNT(dbo.vw_Tally1.nFemale) AS nOccurrence
                  FROM dbo.vw_Split4 AS vw_Split4_1) AS SumTally_1) AS decimal) AS nFProportion
FROM dbo.vw_Tally1
GROUP BY firstpartno, nOccurrence, nMale, nFemale

If I understood your question correctly, here is a solution:
SELECT
firstpartno
,nOccurrence
,nMale
,nFemale
,CASE WHEN SUM_nOccurrence.SUM_nOccurrenceMale = 0
THEN 0
ELSE COUNT(nMale)/SUM_nOccurrence.SUM_nOccurrenceMale
END AS nMProportion
,CASE WHEN SUM_nOccurrence.SUM_nOccurrenceFemale = 0
THEN 0
ELSE COUNT(nFemale)/SUM_nOccurrence.SUM_nOccurrenceFemale
END AS nFProportion
FROM
dbo.vw_Tally1
LEFT JOIN
(SELECT
CAST(SUM(nOccurrenceMale)AS decimal) AS SUM_nOccurrenceMale
,CAST(SUM(nOccurrenceFemale)AS decimal) AS SUM_nOccurrenceFemale
FROM (SELECT
COUNT(dbo.vw_Tally1.nMale) AS nOccurrenceMale
,COUNT(dbo.vw_Tally1.nFemale) AS nOccurrenceFemale
FROM dbo.vw_Split4 ) AS SumTally) SUM_nOccurrence
ON 1=1
GROUP BY
firstpartno
,nOccurrence
,nMale
,nFemale
I hope this will help you
Good Luck :)

The query looks dubious, to say the least. You select all records of vw_Tally1. For each of these records you do the following:
Select COUNT(vw_Tally1.nMale) from vw_Split4. This is COUNT(*) of vw_Split4 when vw_Tally1.nMale is not null; otherwise it is 0.
Then you sum this value, which makes no sense, as the sum of a single value is the value itself.
You do the same for nFemale.
Finally, you group by (firstpartno, nOccurrence, nMale, nFemale) and use the values obtained in this strange way to calculate something. As you don't aggregate the found values, you get an arbitrary match per group, i.e. the DBMS takes one of the matching records. As nMale and nFemale are grouping columns, the values are constant for all records of the group, so this is no big problem, but it is a lot of useless work.
So to speed this up, first think about what you actually want to select. This looks like it will end up as a very simple select statement. We can help you if you tell us what your tables contain, what result set you are after, what nMale and nFemale stand for, and what the primary keys or unique columns of the tables involved are.
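To illustrate how simple the statement might become, here is a minimal sketch under a guessed requirement: that nMProportion and nFProportion should be each row's share of the overall nMale and nFemale totals. That requirement is an assumption, not something the question confirms. The table `tally` and the function `gender_proportions` are hypothetical stand-ins for vw_Tally1, and the SQL is run through Python's sqlite3 module only so the example is self-contained (window functions need SQLite 3.25 or later). The same `SUM(...) OVER ()` expressions are valid in SQL Server.

```python
import sqlite3


def gender_proportions(rows):
    """rows: (firstpartno, nMale, nFemale) tuples standing in for vw_Tally1."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE tally (firstpartno TEXT, nMale INTEGER, nFemale INTEGER)"
    )
    con.executemany("INSERT INTO tally VALUES (?, ?, ?)", rows)
    # SUM(...) OVER () computes the grand total once over the whole result,
    # instead of re-running a subquery for every row.
    # NULLIF(..., 0) turns a zero total into NULL to avoid division by zero.
    sql = """
        SELECT firstpartno,
               nMale,
               nFemale,
               nMale   * 1.0 / NULLIF(SUM(nMale)   OVER (), 0) AS nMProportion,
               nFemale * 1.0 / NULLIF(SUM(nFemale) OVER (), 0) AS nFProportion
        FROM tally
        ORDER BY firstpartno
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

If the intended result is something else (for example, proportions within each firstpartno), the window would need a PARTITION BY clause, which is why the questions above about the data still matter.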

Related

Alter an existing SQL statement to return an additional column of data without affecting performance: what is the best approach?

In this query, I want to add a new column that gives the SUM of a.VolumetricCharge, but only where PremiseProviderBillings.BillingCategory = 'Water'. I don't want to add the condition in the obvious place (a WHERE clause), since that would limit the rows returned; I only want it to apply to the new column.
SELECT b.customerbillid,
-- Here i need SUM(a.VolumetricCharge) but where a.BillingCategory is equal to 'Water'
Sum(a.volumetriccharge) AS Volumetric,
Sum(a.fixedcharge) AS Fixed,
Sum(a.vat) AS VAT,
Sum(a.discount) + Sum(deferral) AS Discount,
Sum(Isnull(a.estimatedconsumption, 0)) AS Consumption,
Count_big(*) AS Records
FROM dbo.premiseproviderbillings AS a WITH (nolock)
LEFT JOIN dbo.premiseproviderbills AS b WITH (nolock)
ON a.premiseproviderbillid = b.premiseproviderbillid
-- Cannot add a where here since that would limit the results and change the output
GROUP BY b.customerbillid;
This is a bit of a tricky one, as what you're asking for will definitely affect performance (you're asking SQL Server to do more work, after all!).
However, we can add a column to your results that performs a conditional sum, so that it does not affect the result of the other columns.
The answer lies in using a CASE expression:
Sum(
CASE
WHEN a.BillingCategory = 'Water' THEN
a.volumetriccharge
ELSE
0
END
) AS WaterVolumetric
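As a rough, self-contained check that the conditional sum leaves the other aggregates untouched, here is a sketch that runs the same pattern through Python's sqlite3 module. It is simplified on purpose: a single hypothetical table `billings` replaces the two joined tables from the question, and `billing_totals` is an illustrative name, not part of the original schema.

```python
import sqlite3


def billing_totals(billings):
    """billings: (customerbillid, billingcategory, volumetriccharge) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE billings ("
        "customerbillid INTEGER, billingcategory TEXT, volumetriccharge REAL)"
    )
    con.executemany("INSERT INTO billings VALUES (?, ?, ?)", billings)
    # The CASE expression contributes 0 for non-Water rows, so those rows
    # still count towards Volumetric and Records; no WHERE clause is needed.
    sql = """
        SELECT customerbillid,
               SUM(CASE WHEN billingcategory = 'Water'
                        THEN volumetriccharge ELSE 0 END) AS WaterVolumetric,
               SUM(volumetriccharge) AS Volumetric,
               COUNT(*) AS Records
        FROM billings
        GROUP BY customerbillid
        ORDER BY customerbillid
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

A bill with no Water rows gets a WaterVolumetric of 0 rather than disappearing from the result, which is the behaviour the question asks for.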

Compare value in each row to average of the column (SQL)

I am trying to create a view where the score for each row is compared to the average for that column, so that I can easily identify records by their rough "grade". Simplified code:
select recordID,
case when table.ColumnA>avg(all table.ColumnA) then 'Hard' else 'Easy' end as Difficulty
from table
group by recordID, ColumnA
I've tried various combinations of this, and the case expression keeps falling through to 'else'. On investigation, this seems to be because every calculated difference comes out as 0: the row value and the average value are being treated as the same.
I have a feeling the answer has something to do with Rollup, either on this table or the source table, but the syntax required is beyond me.
Anyone?
You want a window function:
select recordID,
(case when table.ColumnA > avg(table.ColumnA) over ()
then 'Hard' else 'Easy'
end) as Difficulty
from table;
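Why the grouped version always says 'Easy': grouping by recordID and ColumnA makes every group a single row (assuming recordID is unique), so the average of the group is the row's own value and the comparison is never true. The sketch below contrasts the two queries on a hypothetical table `records`, run through Python's sqlite3 module purely for illustration (window functions need SQLite 3.25 or later). The function name `grade_records` is made up for this example.

```python
import sqlite3


def grade_records(scores):
    """scores: (recordID, ColumnA) tuples. Returns (grouped, windowed) results."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE records (recordID INTEGER, ColumnA REAL)")
    con.executemany("INSERT INTO records VALUES (?, ?)", scores)
    # Grouped form from the question: AVG is taken per one-row group,
    # so it always equals ColumnA and the CASE falls through to 'Easy'.
    grouped = con.execute("""
        SELECT recordID,
               CASE WHEN ColumnA > AVG(ColumnA) THEN 'Hard' ELSE 'Easy' END
        FROM records
        GROUP BY recordID, ColumnA
        ORDER BY recordID
    """).fetchall()
    # Window form from the answer: AVG(...) OVER () averages all rows
    # while keeping one output row per record.
    windowed = con.execute("""
        SELECT recordID,
               CASE WHEN ColumnA > AVG(ColumnA) OVER ()
                    THEN 'Hard' ELSE 'Easy' END
        FROM records
        ORDER BY recordID
    """).fetchall()
    con.close()
    return grouped, windowed
```

With scores 2, 4 and 9 the overall average is 5, so only the last record is graded 'Hard' by the window version, while the grouped version grades all three 'Easy'.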

Grouping a percentage calculation in postgres/redshift

I keep running in to the same problem over and over again, hoping someone can help...
I have a large table with a category column that has 28 entries for donkey breed, then I'm counting two specific values grouped by each of those categories in subqueries like this:
WITH totaldonkeys AS (
SELECT donkeybreed,
COUNT(*) AS total
FROM donkeytable1
GROUP BY donkeybreed
)
,
sickdonkeys AS (
SELECT donkeybreed,
COUNT(*) AS totalsick
FROM donkeytable1
JOIN donkeyhealth on donkeytable1.donkeyid = donkeyhealth.donkeyid
WHERE donkeyhealth.sick IS TRUE
GROUP BY donkeybreed
)
,
My goal is to end up with a table that primarily holds the percentage of sick donkeys for each breed, but I always end up struggling with not being able to GROUP BY without using an aggregate function, which I cannot do here:
SELECT (CAST(sickdonkeys.totalsick AS float) / totaldonkeys.total) * 100 AS percentsick,
totaldonkeys.donkeybreed
FROM totaldonkeys, sickdonkeys
GROUP BY totaldonkeys.donkeybreed
When I run this I end up with 28 results for each breed of donkey: one correct, I believe, but obviously hundreds of useless data points.
I know I'm probably being really dumb here, but I keep hitting this same problem again and again with new donkey data. I should obviously be structuring the whole thing differently, because you just can't do this final query without an aggregate function. I think I must be missing something significant.
You can easily calculate the proportion that are sick from the donkeyhealth table:
SELECT d.donkeybreed,
AVG( (dh.sick)::int ) AS proportion_sick
FROM donkeytable1 d JOIN
donkeyhealth dh
ON d.donkeyid = dh.donkeyid
GROUP BY d.donkeybreed
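For completeness, the 28 rows per breed in the question come from `FROM totaldonkeys, sickdonkeys` having no join condition, so every breed total is paired with every sick count. If you want to keep the two CTEs, joining them on donkeybreed gives one row per breed without any final GROUP BY. The sketch below is one way to do that, run through Python's sqlite3 module so it is self-contained. Assumptions: `sick` is stored as 0/1 (SQLite has no boolean type), breeds with no sick donkeys should show 0 percent (hence the LEFT JOIN and COALESCE), and `percent_sick_by_breed` is an illustrative name.

```python
import sqlite3


def percent_sick_by_breed(donkeys, health):
    """donkeys: (donkeyid, donkeybreed); health: (donkeyid, sick as 0 or 1)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE donkeytable1 (donkeyid INTEGER, donkeybreed TEXT)")
    con.execute("CREATE TABLE donkeyhealth (donkeyid INTEGER, sick INTEGER)")
    con.executemany("INSERT INTO donkeytable1 VALUES (?, ?)", donkeys)
    con.executemany("INSERT INTO donkeyhealth VALUES (?, ?)", health)
    sql = """
        WITH totaldonkeys AS (
            SELECT donkeybreed, COUNT(*) AS total
            FROM donkeytable1
            GROUP BY donkeybreed
        ),
        sickdonkeys AS (
            SELECT d.donkeybreed, COUNT(*) AS totalsick
            FROM donkeytable1 d
            JOIN donkeyhealth h ON d.donkeyid = h.donkeyid
            WHERE h.sick = 1
            GROUP BY d.donkeybreed
        )
        -- Join the two summaries on breed instead of cross joining them.
        SELECT t.donkeybreed,
               100.0 * COALESCE(s.totalsick, 0) / t.total AS percentsick
        FROM totaldonkeys t
        LEFT JOIN sickdonkeys s ON s.donkeybreed = t.donkeybreed
        ORDER BY t.donkeybreed
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result
```

Note that the AVG approach returns a proportion between 0 and 1 and only counts donkeys that have a row in donkeyhealth, whereas this version divides by all donkeys of the breed.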

Oracle SQL: trying to pull a count with AND operators (new to SQL and in need of experienced eyes)

I am new to SQL and have had pretty good luck figuring things out thus far but I am missing something in this query:
The question is how to return a distinct count from two columns using another column and the criteria if the value is greater than 0.
I have tried IF and AND operators. (My current query returns 0 rather than an error, and it works when I use only one .shp criterion.)
select count (distinct ti.TO_ADDRESS)
from ti
where ti.input_id = 'xxx_029_01z_c_zzzzbab_ecrm.shp'
and ti.input_id = 'xxx_030_01z_c_zzzzbab_ecrm.shp'
and ti.OPENED>0;
Thanks so much!!
I think you want two levels of aggregation:
select count(*)
from (select ti.TO_ADDRESS
from ti
where ti.input_id in ('xxx_029_01z_c_zzzzbab_ecrm.shp', 'xxx_030_01z_c_zzzzbab_ecrm.shp') and
ti.OPENED > 0
group by ti.TO_ADDRESS
having count(distinct ti.input_id) = 2 -- has both of them
) ti;
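The original query returns 0 because a single row can never have input_id equal to two different values at once, so the two AND conditions exclude everything. The two-level aggregation can be tried out on toy data with the sketch below, which uses Python's sqlite3 module just to be self-contained. The function name `count_addresses_in_both` and the sample rows in the usage note are illustrative, and the sketch assumes the goal is addresses that were opened in both input files.

```python
import sqlite3


def count_addresses_in_both(rows, id_a, id_b):
    """rows: (TO_ADDRESS, input_id, OPENED) tuples standing in for table ti."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE ti (TO_ADDRESS TEXT, input_id TEXT, OPENED INTEGER)")
    con.executemany("INSERT INTO ti VALUES (?, ?, ?)", rows)
    # Inner query: keep opened rows from either file, one group per address,
    # and retain only addresses that appear under both input_id values.
    # Outer query: count the surviving addresses.
    sql = """
        SELECT COUNT(*)
        FROM (SELECT TO_ADDRESS
              FROM ti
              WHERE input_id IN (?, ?) AND OPENED > 0
              GROUP BY TO_ADDRESS
              HAVING COUNT(DISTINCT input_id) = 2) AS both_ids
    """
    (count,) = con.execute(sql, (id_a, id_b)).fetchone()
    con.close()
    return count
```

For example, with one address opened in both files, one opened in only the first file, and one present in both but with OPENED = 0 in the second, the function returns 1.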

Getting a unique value from an aggregated result set

I've got an aggregated query that checks if I have more than one record matching certain conditions.
SELECT RegardingObjectId, COUNT(*) FROM [CRM_MSCRM].[dbo].[AsyncOperationBase] a
where WorkflowActivationId IN ('55D9A3CF-4BB7-E311-B56B-0050569512FE',
'1BF5B3B9-0CAE-E211-AEB5-0050569512FE',
'EB231B79-84A4-E211-97E9-0050569512FE',
'F0DDF5AE-83A3-E211-97E9-0050569512FE',
'9C34F416-F99A-464E-8309-D3B56686FE58')
and StatusCode = 10
group by RegardingObjectId
having COUNT(*) > 1
That's nice, but there is one field in AsyncOperationBase that will be unique. Say COUNT(*) = 3: AsyncOperationBaseId will then have 3 different values, since AsyncOperationBaseId is the table's primary key. I would like to get those values as well.
To be honest, I would not even know what terms and expressions to Google to find a solution.
Does anyone have a solution? Also, are there any words to describe what I'm looking for? Perhaps BI people are often faced with such a requirement or something...
I could do it with an SSRS report where the report would visually do the grouping then I could expand each grouped row to get the AsyncOperationBaseId value, but simply through SQL, I can't seem to find a way out...
Thanks.
select * from [CRM_MSCRM].[dbo].[AsyncOperationBase]
where RegardingObjectId in
(
SELECT RegardingObjectId
FROM [CRM_MSCRM].[dbo].[AsyncOperationBase] a
where WorkflowActivationId IN
(
'55D9A3CF-4BB7-E311-B56B-0050569512FE',
'1BF5B3B9-0CAE-E211-AEB5-0050569512FE',
'EB231B79-84A4-E211-97E9-0050569512FE',
'F0DDF5AE-83A3-E211-97E9-0050569512FE',
'9C34F416-F99A-464E-8309-D3B56686FE58'
)
and StatusCode = 10
group by RegardingObjectId
having COUNT(*) > 1
)
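One thing to check in the query above: the outer SELECT does not repeat the WorkflowActivationId and StatusCode filters, so it returns every row for the duplicated RegardingObjectId values, including rows that did not match the conditions. If only the matching rows are wanted, a window count is one alternative; it keeps each row (and its AsyncOperationBaseId) while still exposing the group size. Below is a minimal sketch on a hypothetical table `ops`, run through Python's sqlite3 module for illustration (window functions need SQLite 3.25 or later). `COUNT(*) OVER (PARTITION BY ...)` is also available in SQL Server. The name `duplicated_operations` is made up for this example.

```python
import sqlite3


def duplicated_operations(rows, workflow_ids, status_code=10):
    """rows: (AsyncOperationBaseId, RegardingObjectId,
    WorkflowActivationId, StatusCode) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE ops (AsyncOperationBaseId TEXT, RegardingObjectId TEXT, "
        "WorkflowActivationId TEXT, StatusCode INTEGER)"
    )
    con.executemany("INSERT INTO ops VALUES (?, ?, ?, ?)", rows)
    placeholders = ", ".join("?" for _ in workflow_ids)
    # Filter first, then count matching rows per RegardingObjectId without
    # collapsing them, and keep only rows whose group has more than one match.
    sql = f"""
        SELECT AsyncOperationBaseId, RegardingObjectId, matches
        FROM (SELECT AsyncOperationBaseId,
                     RegardingObjectId,
                     COUNT(*) OVER (PARTITION BY RegardingObjectId) AS matches
              FROM ops
              WHERE WorkflowActivationId IN ({placeholders})
                AND StatusCode = ?) AS filtered
        WHERE matches > 1
        ORDER BY RegardingObjectId, AsyncOperationBaseId
    """
    result = con.execute(sql, [*workflow_ids, status_code]).fetchall()
    con.close()
    return result
```

As for search terms, "window function" or "COUNT OVER PARTITION BY" are the usual phrases for keeping detail rows alongside a per-group aggregate.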