Help optimizing a SQL query

We've got some SQL code I'm trying to optimize. The code includes a view that is rather expensive to run. For the sake of this question, let's call it ExpensiveView. On top of the view there is a query that joins the view to itself via two subqueries.
For example:
select v1.varCharCol1, v1.intCol, v2.intCol from (
select someId, varCharCol1, intCol from ExpensiveView where rank=1
) as v1 inner join (
select someId, intCol from ExpensiveView where rank=2
) as v2 on v1.someId = v2.someId
An example result set:
some random string, 5, 10
other random string, 15, 15
This works, but it's slow since I'm having to select from ExpensiveView twice. What I'd like to do is use a case statement to only select from ExpensiveView once.
For example:
select someId,
case when rank = 1 then intCol else 0 end as rank1IntCol,
case when rank = 2 then intCol else 0 end as rank2IntCol
from ExpensiveView where rank in (1,2)
I could then group the above results by someId and get almost the same thing as the first query:
select sum(rank1IntCol), sum(rank2IntCol)
from ( *the above query* ) SubQueryData
group by someId
The problem is the varCharCol1 that I need to get when the rank is 1. I can't use it in the group by, since that column will contain different values when rank is 1 than it does when rank is 2.
Does anyone have a way to optimize the query so that it selects from ExpensiveView only once and can still get the varchar data?
Thanks in advance.

It's hard to guess since we don't see your view definition, but try this:
SELECT MIN(CASE rank WHEN 1 THEN varCharCol1 ELSE NULL END),
SUM(CASE rank WHEN 1 THEN intCol ELSE 0 END),
SUM(CASE rank WHEN 2 THEN intCol ELSE 0 END)
FROM ExpensiveView
WHERE rank IN (1, 2)
GROUP BY
someId
Note that in most cases, for queries like this:
SELECT *
FROM mytable1 m1
JOIN mytable1 m2
ON …
the SQL Server optimizer will just build an Eager Spool (a temporary index), which is later used to search for the JOIN condition, so these tricks are probably redundant.

select someId,
case when rank = 1 then varCharCol1 else null end as varCharCol1,
case when rank = 1 then intCol else 0 end as rank1IntCol,
case when rank = 2 then intCol else 0 end as rank2IntCol
from ExpensiveView where rank in (1,2)
Then use min() or max() on varCharCol1 in the enclosing query; both skip the NULLs produced for the rank 2 rows.
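To check that the single-pass rewrite returns the same rows as the self-join, here is a minimal sketch that runs both queries through Python's built-in sqlite3 module. ExpensiveView is stood in for by a plain table, the sample rows are made up, and the sketch assumes at most one row per (someId, rank). The HAVING clause is an addition not in the answers above: it mimics the inner join by dropping ids that lack either rank.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical stand-in for ExpensiveView; the rows are illustrative only.
conn.executescript("""
    CREATE TABLE ExpensiveView (someId INTEGER, rank INTEGER,
                                varCharCol1 TEXT, intCol INTEGER);
    INSERT INTO ExpensiveView VALUES
        (1, 1, 'some random string', 5),
        (1, 2, 'ignored', 10),
        (2, 1, 'other random string', 15),
        (2, 2, 'ignored too', 15),
        (3, 1, 'rank 1 only', 7);
""")

# Original approach: read the view twice and join on someId.
self_join_rows = conn.execute("""
    SELECT v1.varCharCol1, v1.intCol, v2.intCol
    FROM (SELECT someId, varCharCol1, intCol
          FROM ExpensiveView WHERE rank = 1) AS v1
    JOIN (SELECT someId, intCol
          FROM ExpensiveView WHERE rank = 2) AS v2
      ON v1.someId = v2.someId
    ORDER BY v1.someId
""").fetchall()

# Single pass: conditional aggregation. MIN ignores the NULLs that the
# CASE produces for rank 2 rows, so it returns the rank 1 string.
single_pass_rows = conn.execute("""
    SELECT MIN(CASE WHEN rank = 1 THEN varCharCol1 END),
           SUM(CASE WHEN rank = 1 THEN intCol ELSE 0 END),
           SUM(CASE WHEN rank = 2 THEN intCol ELSE 0 END)
    FROM ExpensiveView
    WHERE rank IN (1, 2)
    GROUP BY someId
    HAVING COUNT(CASE WHEN rank = 1 THEN 1 END) > 0
       AND COUNT(CASE WHEN rank = 2 THEN 1 END) > 0
    ORDER BY someId
""").fetchall()

print(self_join_rows)
print(single_pass_rows)
```

Whether the single pass is actually faster depends on the view and on the plan the optimizer picks, so it is worth comparing execution plans on the real data.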

Related

GROUP BY with COUNT condition

I have a result set such as:
Code No
1 *
1 -
1 4
1
1
Now I basically want a query that returns 2 columns: a count of the total number of rows and a count of those that don't have numbers.
Code No_Number Total
1 4 5
I'm assuming this needs a GROUP BY and a COUNT, but how can I do the 2 different counts in one query like this?
This is what I had so far, but I am a bit stuck with the rest of it:
SELECT CODE,NO
Sum(Case when No IN ('*', '-', '') then 1 else 0 end) as Count
I think you basically just need GROUP BY:
SELECT CODE,
SUM(CASE WHEN No IN ('*', '-', '') THEN 1 ELSE 0 END) AS No_Number,
COUNT(*) as total
FROM t
GROUP BY CODE;
Well, this took a moment :-), but here it is. I have used a CASE expression to populate the No_Number column: it yields 1 when the value in the original table is not a number and NULL when it is. COUNT ignores NULLs, so only the non-numeric rows are counted.
If the result set is in a table or temp table:
SELECT Code,
-- any value containing a digit is treated as a number and skipped (NULL)
COUNT(CASE WHEN [No] LIKE '%[0-9]%' THEN NULL ELSE 1 END) AS No_Number,
COUNT(Code) AS Total
FROM <tablename>
GROUP BY Code
If the result set is the product of a previous query you can use a CTE (Common Table Expression) to arrive at the required result or you could include parts of this code in the earlier query.
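The two conditional counts can be tried out with a small sketch using Python's built-in sqlite3 module. The table name t and the rows mirror the question, with the assumption that one blank value is an empty string and the other is NULL (the question does not say which). The extra IS NULL test is an addition so that both kinds of blank are counted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data from the question; which blanks are '' and which are NULL
# is an assumption made for illustration.
conn.executescript("""
    CREATE TABLE t (Code INTEGER, "No" TEXT);
    INSERT INTO t VALUES (1, '*'), (1, '-'), (1, '4'), (1, ''), (1, NULL);
""")

# SUM over a 1/0 CASE counts the rows without a number.
sum_rows = conn.execute("""
    SELECT Code,
           SUM(CASE WHEN "No" IS NULL OR "No" IN ('*', '-', '')
                    THEN 1 ELSE 0 END) AS No_Number,
           COUNT(*) AS Total
    FROM t
    GROUP BY Code
""").fetchall()

# Equivalent form: COUNT skips the NULL that CASE returns without an ELSE.
count_rows = conn.execute("""
    SELECT Code,
           COUNT(CASE WHEN "No" IS NULL OR "No" IN ('*', '-', '')
                      THEN 1 END) AS No_Number,
           COUNT(*) AS Total
    FROM t
    GROUP BY Code
""").fetchall()

print(sum_rows)
print(count_rows)
```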

Not a GROUP BY Expression & aggregate functions

I was wondering why, in the query below, I have to wrap the case statements in the MAX() aggregate function instead of using the case statements directly:
select
bank_id,
tran_branch_code,
acct_sol_id,
acct_sol_name,
transaction_date,
gl_date,
transaction_id,
account_number,
max(case
when cast(substr(GLSH_Code,0,1) as int) >= 1
and cast(substr(GLSH_Code,0,1) as int) <= 5
and trans_type = 'D'
then (trans_amount)
--else 0
end ) Ind_Part_Tran_Dr_RBU,
max(case
when cast(substr(GLSH_Code,0,1) as int) >= 1
and cast(substr(GLSH_Code,0,1) as int) <= 5
and trans_type = 'C'
then (trans_amount)
--else 0
end) Ind_Part_Tran_Cr_RBU,
max(case
when cast(substr(GLSH_Code,0,1) as int) = 0
or (cast(substr(GLSH_Code,0,1) as int) >= 6
and cast(substr(GLSH_Code,0,1) as int) <= 9)
and trans_type = 'D'
then (trans_amount)
--else 0
end)Ind_Part_Tran_Dr_FCDU,
max(case
when cast(substr(GLSH_Code,0,1) as int) = 0
or (cast(substr(GLSH_Code,0,1) as int) >= 6
and cast(substr(GLSH_Code,0,1) as int) <= 9)
and trans_type = 'C'
then (trans_amount)
--else 0
end) Ind_Part_Tran_Cr_FCDU,
ccy_alias,
ccy_name,
acct_currency,
tran_currency
from
(
SELECT
DTD.BANK_ID,
DTD.SOL_ID Acct_Sol_ID, --Account Sol ID
dtd.br_code Tran_branch_code, -- branch code of the transacting branch
sol.sol_desc Acct_sol_name, -- name/description of SOL
DTD.TRAN_DATE Transaction_Date, --TransactionDate
DTD.GL_DATE GL_Date, --GL Date
TRIM(DTD.TRAN_ID) Transaction_ID, --Transaction ID
DTD.GL_SUB_HEAD_CODE GLSH_Code, --GLSH Code
dtd.tran_amt trans_amount,
GAM.ACCT_CRNCY_CODE Acct_Currency, --Account Currency
DTD.TRAN_CRNCY_CODE Tran_Currency, --Transaction Currency
cnc.crncy_alias_num ccy_alias,
cnc.crncy_name ccy_name,
GAM.FORACID Account_Number, --Account Number
DTD.TRAN_PARTICULAR Transaction_Particulars, --Transaction Particulars
DTD.CRNCY_CODE DTD_CCY,
--GSH.CRNCY_CODE GSH_CCY,
DTD.PART_TRAN_TYPE Transaction_Code,
--'Closing_Balance',
DTD.PSTD_USER_ID PostedBy,
CASE WHEN DTD.REVERSAL_DATE IS NOT NULL
THEN 'Y' ELSE 'N' END Reversal,
TRIM(DTD.TRAN_ID) REV_ORIG_TRAN_ID,
--OTT.REF_NUM OAP_REF_NUM,
'OAP_SETTLEMENT',
'RATE_CODE',
EAB.EOD_DATE
FROM TBAADM.DTD
LEFT OUTER JOIN TBAADM.GAM ON DTD.ACID = GAM.ACID AND DTD.BANK_ID = GAM.BANK_ID
LEFT OUTER JOIN TBAADM.EAB ON DTD.ACID = EAB.ACID AND DTD.BANK_ID = EAB.BANK_ID AND EAB.EOD_DATE = '24-MAR-2014'
left outer join tbaadm.sol on dtd.sol_id = sol.sol_id and dtd.bank_id = sol.bank_id
left outer join tbaadm.cnc on dtd.tran_crncy_code = cnc.crncy_code
WHERE DTD.BANK_ID = 'CBC01'
AND GAM.ACCT_OWNERSHIP = 'O'
AND GAM.DEL_FLG != 'Y'
--AND DTD.TRAN_DATE = '14-APR-2014'
AND DTD.TRAN_DATE between '01-APR-2014' and '21-APR-2014'
--and foracid in ('50010112441109','50010161635051')
--and DTD.SOL_ID = '5001'
and GAM.ACCT_CRNCY_CODE = 'USD'
)
group by
bank_id,
tran_branch_code,
acct_sol_id,
acct_sol_name,
transaction_date,
gl_date,
transaction_id,
account_number,
ccy_alias,
ccy_name,
Acct_Currency,
Tran_Currency
Because if I remove the MAX(), I get the "Not a GROUP BY Expression" error, and Toad points me to the first occurrence of GLSH_Code. According to other websites, the cure for this is adding the MAX() function. I would just like to understand why I should use that particular function and what exactly it does in the query.
EDIT: inserted the rest of the code.
I know what MAX() does: it returns the largest value of an expression. But in this case, I can't figure out exactly what largest value the function is trying to return.
The GROUP BY clause declares that the rows should be collapsed into one result row per distinct combination of the columns listed in the GROUP BY.
This means we have to use aggregate functions like MIN, MAX, AVG, SUM, etc. on any column that is NOT listed in the GROUP BY.
It's about telling the SQL engine what the expected results should be when there is more than one option.
In a simple example, we have a table with three columns:
PrimaryId SubId RowValue
1 1 1
2 1 2
3 2 4
4 2 8
And a query like the following (which is invalid):
SELECT SubId, RowValue
FROM SampleTable
GROUP BY SubId
We know we want the distinct SubId's (because of the GROUP BY), but we don't know what RowValue should be when we aggregate the results.
SubId RowValue
1 ?
2 ?
We have to be explicit in our query, and indicate what RowValue should be as the results can vary.
If we choose MIN(RowValue) we see:
SubId RowValue
1 1
2 4
If we choose MAX(RowValue) we see:
SubId RowValue
1 2
2 8
If we choose SUM(RowValue) we see:
SubId RowValue
1 3
2 12
Without being explicit there's a high likelihood that the results will be wrong, so our SQL engine of choice protects us from ourselves by enforcing the need for aggregate functions.
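The three result tables above can be reproduced with a short sketch using Python's built-in sqlite3 module and the same SampleTable rows. One caveat about the harness: SQLite itself tolerates a bare non-aggregated column in a GROUP BY query and picks a value from some row in the group, so it will not raise the error that Oracle or SQL Server would. The sketch therefore only shows the explicit aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Same rows as the sample table in the answer above.
conn.executescript("""
    CREATE TABLE SampleTable (PrimaryId INTEGER, SubId INTEGER, RowValue INTEGER);
    INSERT INTO SampleTable VALUES (1, 1, 1), (2, 1, 2), (3, 2, 4), (4, 2, 8);
""")

# One output row per SubId; each aggregate states explicitly how the
# several RowValue entries of a group collapse into a single value.
rows = conn.execute("""
    SELECT SubId,
           MIN(RowValue) AS min_value,
           MAX(RowValue) AS max_value,
           SUM(RowValue) AS sum_value
    FROM SampleTable
    GROUP BY SubId
    ORDER BY SubId
""").fetchall()

for sub_id, min_value, max_value, sum_value in rows:
    print(sub_id, min_value, max_value, sum_value)
```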
You have a GROUP BY clause at the end that lists all the columns except Ind_Part_Tran_Dr_RBU, Ind_Part_Tran_Cr_RBU, Ind_Part_Tran_Dr_FCDU and Ind_Part_Tran_Cr_FCDU. In this case Oracle wants you to tell it what to do with these columns, i.e. which function it should use to aggregate them for every group it finds.

Stuck on a slightly tricky query, trying to ignore multiple results based on a single field

Here is a simple database representation of what I'm stuck on:
IDNumber TimeSpent Completed
1 0 No
1 0 No
1 2 No
2 0 No
3 0 No
I'm currently querying the database as such...
"SELECT Distinct (IDNumber) AS Info FROM TestTable
ORDER BY WorkOrderNumber";
And it gives me back the results
1
2
3
Which is expected.
Now, I'd like to adjust it so that any IDNumber with a row where TimeSpent != 0 or Completed != No isn't returned at all. For example, in the data above, since one row has TimeSpent = 2, I don't want IDNumber 1 to be returned by my query at all.
My first instinct was to jump to something like this...
"SELECT Distinct (IDNumber) AS Info FROM TestTable
WHERE TimeSpent='0' AND Completed='No'
ORDER BY WorkOrderNumber";
But obviously that wouldn't work. It would correctly ignore one of the IDNumber 1's but since two others still satisfy the WHERE clause it would still return 1.
Any pointers here?
SELECT DISTINCT IDNumber
FROM TestTable
WHERE IDNumber NOT IN
(SELECT IDNumber FROM TestTable WHERE TimeSpent <> 0 OR Completed <> 'No')
You can do this with an aggregation, using a having clause:
select IDNumber
from TestTable
group by IDNumber
having sum(case when TimeSpent <> 0 then 1 else 0 end) = 0 and
sum(case when Completed <> 'No' then 1 else 0 end) = 0
The having clause counts the number of rows that violate each condition. The = 0 simply says that there are no such rows.
I prefer the aggregation method because it is more flexible in terms of the conditions that you can set on the groups.
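Both approaches can be compared on the question's data with a small sketch using Python's built-in sqlite3 module. The row for IDNumber 4 is a hypothetical addition, included only to exercise the Completed condition. Note that the HAVING clause has to count the rows that violate each condition (<>), so that a count of 0 means the whole group is clean.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Data from the question, plus one hypothetical row (IDNumber 4).
conn.executescript("""
    CREATE TABLE TestTable (IDNumber INTEGER, TimeSpent INTEGER, Completed TEXT);
    INSERT INTO TestTable VALUES
        (1, 0, 'No'), (1, 0, 'No'), (1, 2, 'No'),
        (2, 0, 'No'),
        (3, 0, 'No'),
        (4, 0, 'Yes');
""")

# Approach 1: exclude every id that has at least one offending row.
not_in_rows = conn.execute("""
    SELECT DISTINCT IDNumber
    FROM TestTable
    WHERE IDNumber NOT IN
          (SELECT IDNumber FROM TestTable
           WHERE TimeSpent <> 0 OR Completed <> 'No')
    ORDER BY IDNumber
""").fetchall()

# Approach 2: group per id and require zero offending rows in the group.
having_rows = conn.execute("""
    SELECT IDNumber
    FROM TestTable
    GROUP BY IDNumber
    HAVING SUM(CASE WHEN TimeSpent <> 0 THEN 1 ELSE 0 END) = 0
       AND SUM(CASE WHEN Completed <> 'No' THEN 1 ELSE 0 END) = 0
    ORDER BY IDNumber
""").fetchall()

print(not_in_rows)
print(having_rows)
```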

How to do a SUM() inside a case statement in SQL server

I want to add some calculation inside my case statement to dynamically create the contents of a new column but I get the error:
Column 'Test1.qrank' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
This is the code I'm working on
case
when test1.TotalType = 'Average' then Test2.avgscore
when test1.TotalType = 'PercentOfTot' then (cnt/SUM(test1.qrank))
else cnt
end as displayscore
I did try to group but it didn't work.
Any hints?
The error you posted occurs when a column in the select list is neither wrapped in an aggregate function nor listed in the GROUP BY clause.
Example
This one works:
SELECT t.device,
SUM(case when t.direction = 1 then 1 else 0 end) ,
SUM(case when t.direction = 0 then 1 else 0 end) from t1 t
where t.device in ('A','B') group by t.device
This one does not (t.device is selected but the GROUP BY is omitted):
SELECT t.device,
SUM(case when t.direction = 1 then 1 else 0 end) ,
SUM(case when t.direction = 0 then 1 else 0 end) from t1 t
where t.device in ('A','B')
This produces your error, because t.device is selected alongside aggregates without being grouped.
Please post the full query to get more help.
You could use a Common Table Expression to create the SUM first, join it to the table, and then use the WHEN to get the value from the CTE or the original table as necessary.
WITH Totals (Id, Total)
AS
(
SELECT Id, SUM(AreaId) FROM dbo.MyTable GROUP BY Id
)
SELECT
CASE
WHEN t.TotalType = 'Average' THEN t.avgscore
-- multiply by 1.0 to avoid integer division
WHEN t.TotalType = 'PercentOfTot' THEN t.cnt * 1.0 / pt.Total
ELSE t.cnt
END AS [displayscore]
FROM Totals pt
JOIN dbo.MyTable t ON pt.Id = t.Id
If you're using SQL Server 2005 or above, you can use the windowing function SUM() OVER ().
case
when test1.TotalType = 'Average' then Test2.avgscore
when test1.TotalType = 'PercentOfTot' then (cnt/SUM(test1.qrank) over ())
else cnt
end as displayscore
But it'll be better if you show your full query to get context of what you actually need.
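As a rough illustration of the SUM() OVER () variant, here is a sketch using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The single test1 table, its columns and its rows are assumptions made for the example; in the question avgscore comes from a second table, Test2. The multiplication by 1.0 is an addition to avoid integer division.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows; the real schema is not shown in the question.
conn.executescript("""
    CREATE TABLE test1 (id INTEGER, TotalType TEXT, cnt INTEGER,
                        qrank INTEGER, avgscore REAL);
    INSERT INTO test1 VALUES
        (1, 'Average',      10, 2, 3.5),
        (2, 'PercentOfTot',  5, 3, NULL),
        (3, 'Other',         7, 5, NULL);
""")

# SUM(qrank) OVER () is the total over all rows (2 + 3 + 5 = 10), made
# available on every row without a GROUP BY, so the other columns stay
# selectable.
rows = conn.execute("""
    SELECT id,
           CASE
               WHEN TotalType = 'Average' THEN avgscore
               WHEN TotalType = 'PercentOfTot'
                   THEN cnt * 1.0 / SUM(qrank) OVER ()
               ELSE cnt
           END AS displayscore
    FROM test1
    ORDER BY id
""").fetchall()

print(rows)
```

An empty OVER () totals the whole result set; adding PARTITION BY inside it would compute the total per group instead.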

Order by Maximum condition match

Please help me create a select query that contains 10 'where' conditions, with the results ordered as follows:
the results should be displayed in order from the most keywords (where conditions) matched down to the least matched.
NOTE: all 10 conditions are combined with "OR".
Please help me create this query.
I am using MS SQL Server 2005.
Like:
Select *
from employee
where empid in (1,2,4,332,434)
or empname like 'raj%'
or city = 'jodhpur'
or salary >5000
In the above query, the records that match the most conditions should be at the top and the records that match fewer conditions should be at the bottom.
SELECT *
FROM (SELECT (CASE WHEN cond1 THEN 1 ELSE 0 END +
CASE WHEN cond2 THEN 1 ELSE 0 END +
CASE WHEN cond3 THEN 1 ELSE 0 END +
...
CASE WHEN cond10 THEN 1 ELSE 0 END
) AS numMatches,
other_columns...
FROM mytable
) xxx
WHERE numMatches > 0
ORDER BY numMatches DESC
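Applied to the employee example from the question, the match-counting pattern might look like the following sketch, run through Python's built-in sqlite3 module. The table layout and rows are made up for illustration, and the empid tie-breaker in the ORDER BY is an addition to keep the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical employee rows; only the four conditions come from the question.
conn.executescript("""
    CREATE TABLE employee (empid INTEGER, empname TEXT, city TEXT, salary INTEGER);
    INSERT INTO employee VALUES
        (1, 'rajesh', 'jodhpur', 6000),
        (2, 'amit',   'jaipur',  4000),
        (3, 'rajiv',  'delhi',   7000),
        (5, 'sunil',  'pune',    3000);
""")

# Each CASE contributes 1 when its condition holds, so numMatches is the
# number of conditions a row satisfies. Rows with 0 matches are dropped,
# which is equivalent to OR-ing all the conditions in a WHERE clause.
rows = conn.execute("""
    SELECT empid, numMatches
    FROM (SELECT empid,
                 (CASE WHEN empid IN (1, 2, 4, 332, 434) THEN 1 ELSE 0 END +
                  CASE WHEN empname LIKE 'raj%' THEN 1 ELSE 0 END +
                  CASE WHEN city = 'jodhpur' THEN 1 ELSE 0 END +
                  CASE WHEN salary > 5000 THEN 1 ELSE 0 END) AS numMatches
          FROM employee) AS scored
    WHERE numMatches > 0
    ORDER BY numMatches DESC, empid
""").fetchall()

print(rows)
```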
EDIT: This answer was posted before the question was modified with a concrete example. Marcelo's solution addresses the actual problem. On the other hand, my answer was giving priority to matches of specific fields.
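You may want to try something like the following, using the same expressions in the ORDER BY clause as in your WHERE clause (ordering by a comparison like this works in MySQL, where it evaluates to 1 or 0, but not in SQL Server):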
SELECT *
FROM your_table
WHERE field_1 = 100 OR
field_2 = 200 OR
field_3 = 300
ORDER BY field_1 = 100 DESC,
field_2 = 200 DESC,
field_3 = 300 DESC;
I've recently answered a similar question on Stack Overflow which you might be interested in checking out:
Is there a SQL technique for ordering by matching multiple criteria?
There are many options/answers possible. Best answer depends on size of the data, non-functional requirements, etc.
That said, what I would do is something like this (easy to read / debug):
Select * from
(Select *, iif(condition1 = bla, 1, 0) as match1, ..... , match1+match2...+match10 as totalmatchscore from sourcetable
where
condition1 = bla or
condition2 = bla2
....) as helperquery
order by helperquery.totalmatchscore desc
I could not get this to work for me on Oracle.
If you are using Oracle, the CASE WHEN approach from "Order by Maximum condition match" is a good solution.