Alternative to executing Netezza SQL subquery multiple times in Case When? - sql

SELECT DISTINCT lr.id,
lr.dept,
lr.name,
Case When lr.id IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value1%') AND lr.id IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value2%') Then 1
Else 0
End As both_panels,
Case When lr.id IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value1%') AND lr.id NOT IN (SELECT id FROM RESULTS WHERE PANEL_FLAG LIKE '%value2%') Then 1
Else 0
End As only_value1_panel
FROM RESULTS lr
I have simplified this; in reality I need many more Case When statements, and it's a performance nightmare because each subquery executes every time. Is there a more performant way to do this?
I tried creating Common Table Expressions and Temp Tables before the query, but the way I was doing it (replacing the subquery statements with a SELECT from the CTE or the Temp Table) doesn't seem to make any performance difference as it is still executing a query each time.

This would usually be handled with conditional aggregation. I think this captures your logic:
SELECT lr.id, lr.dept, lr.name,
LEAST(MAX(Case When PANEL_FLAG LIKE '%value1%' THEN 1 ELSE 0 END),
MAX(Case When PANEL_FLAG LIKE '%value2%' THEN 1 ELSE 0 END)
) As both_panels,
LEAST(MAX(Case When PANEL_FLAG LIKE '%value1%' THEN 1 ELSE 0 END),
MIN(Case When PANEL_FLAG LIKE '%value2%' THEN 0 ELSE 1 END)
) as only_value1_panel
FROM RESULTS lr
GROUP BY lr.id, lr.dept, lr.name
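As a runnable illustration of that rewrite (a sketch only: SQLite stands in for Netezza, the sample rows are invented, and SQLite's two-argument scalar MIN() plays the role of LEAST()):

```python
import sqlite3

# In-memory stand-in for the RESULTS table, with made-up rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE results (id INTEGER, dept TEXT, name TEXT, panel_flag TEXT)")
con.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [
        (1, "A", "alice", "value1"),   # id 1 is on both panels
        (1, "A", "alice", "value2"),
        (2, "B", "bob",   "value1"),   # id 2 is only on panel 1
        (3, "C", "carol", "value2"),   # id 3 is only on panel 2
    ],
)

# One pass over the table; each flag is tested per row, then aggregated.
rows = con.execute("""
    SELECT id, dept, name,
           MIN(MAX(CASE WHEN panel_flag LIKE '%value1%' THEN 1 ELSE 0 END),
               MAX(CASE WHEN panel_flag LIKE '%value2%' THEN 1 ELSE 0 END)
              ) AS both_panels,
           MIN(MAX(CASE WHEN panel_flag LIKE '%value1%' THEN 1 ELSE 0 END),
               MIN(CASE WHEN panel_flag LIKE '%value2%' THEN 0 ELSE 1 END)
              ) AS only_value1_panel
    FROM results
    GROUP BY id, dept, name
    ORDER BY id
""").fetchall()
print(rows)
```

The point of the technique is that the table is scanned once and every flag becomes a per-row CASE, instead of one subquery per flag.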

If the subqueries for the 'IN' lists have a constant WHERE clause, I would consider populating the lists in a script (commas and all), then injecting them into a SQL template file and running that.
That will run very fast.
Of course, the resulting lists should be fairly small (less than 60KB for all lists in total); otherwise the SQL statement will become too large.
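A minimal sketch of that scripting idea (the ids, template text, and column names here are all hypothetical):

```python
# Ids that would come from a prior query run by the script.
ids_for_value1 = [101, 102, 105]

# A SQL template with a placeholder where the literal IN list goes.
template = """
SELECT id, dept, name,
       CASE WHEN id IN ({value1_ids}) THEN 1 ELSE 0 END AS on_value1_panel
FROM results
"""

# Inject the comma-separated list and hand the finished SQL to the database.
sql = template.format(value1_ids=", ".join(str(i) for i in ids_for_value1))
print(sql)
```

The generated statement contains only literals, so the database never re-runs the list-producing subqueries.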

Related

sql count query with case statement

I have to execute a query from three tables avg_salary, person and emails. This simple sql query works fine.
SELECT avg_salary.id, COUNT(emails.message_from) AS email_PGA
FROM avg_salary, person, emails
WHERE person.works_in = avg_salary.id
AND person.email_address = emails.message_from
AND person.salary::numeric > avg_salary.avg
GROUP BY avg_salary.id
But I want to add another column email_PLA with the condition when
person.salary::numeric < avg_salary.avg. I can do that by joining the whole query again. But I want to use CASE in this situation. And even after trying so many times I can't get the syntax right.
I assume you need another count?
You would need something like:
SUM(CASE WHEN (person.salary::numeric < avg_salary.avg) THEN 1 ELSE 0 END) AS email_PLA
You can do conditional aggregation by using a case expression. Also, always use explicit JOIN syntax:
SELECT asal.id,
SUM(CASE WHEN p.salary::numeric > asal.avg THEN 1 ELSE 0 END) AS email_PGA,
SUM(CASE WHEN p.salary::numeric < asal.avg THEN 1 ELSE 0 END) AS email_PLA
FROM avg_salary asal
INNER JOIN person p on p.works_in = asal.id
INNER JOIN emails e on e.message_from = p.email_address
--WHERE p.salary::numeric > asal.avg
GROUP BY asal.id;
If you need different columns under different conditions, you have to run different SQL queries.
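A runnable sketch of that query (adapted to SQLite for illustration, so the Postgres-style ::numeric casts are dropped; all sample data is invented):

```python
import sqlite3

# Tiny stand-ins for the three tables, with made-up rows.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE avg_salary (id INTEGER, avg REAL);
    CREATE TABLE person (works_in INTEGER, email_address TEXT, salary REAL);
    CREATE TABLE emails (message_from TEXT);
    INSERT INTO avg_salary VALUES (1, 100.0);
    INSERT INTO person VALUES (1, 'a@x', 150.0), (1, 'b@x', 50.0);
    INSERT INTO emails VALUES ('a@x'), ('a@x'), ('b@x');
""")

# Both counts come out of a single joined pass via conditional aggregation.
rows = con.execute("""
    SELECT asal.id,
           SUM(CASE WHEN p.salary > asal.avg THEN 1 ELSE 0 END) AS email_PGA,
           SUM(CASE WHEN p.salary < asal.avg THEN 1 ELSE 0 END) AS email_PLA
    FROM avg_salary asal
    INNER JOIN person p ON p.works_in = asal.id
    INNER JOIN emails e ON e.message_from = p.email_address
    GROUP BY asal.id
""").fetchall()
print(rows)
```

'a@x' joins two email rows above the average (email_PGA = 2) and 'b@x' one below it (email_PLA = 1), all from one join.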

SQL: See how many items in a list, return a hit in a table (using LIKE)

For the sake of argument, say I'm looking for first names in a table. And, the table is irregular - no pattern, some people have 2 names, some 3, some it's all run together, whatever.
SELECT
COUNT(*)
FROM
NAME_TABLE N
WHERE
--the list
N.NAME LIKE 'John%'
OR N.NAME LIKE 'Mich%'
OR N.NAME LIKE 'Rob%'
The above would give how many hits - maybe 70 or 70,000, who knows. But what I really want is a response from 0-3,
i.e., how many of my search terms get a hit in the table.
I could just run the query and pull the entire table of hits, then use Excel to get the answer.
Or, in a more typical programming language, I could run a loop that has an x + 1 in it.
But is there a way to do this directly in an SQL query? Specifically T-SQL I guess...very specifically SQL Server 2008, but I'm kinda curious in general.
Your question is not very clear, at least to me, but my magic crystal ball tells me that eventually you are looking for something like this:
USE master;
GO
SELECT SUM(CASE WHEN PATINDEX('sys%',[name])>0 THEN 1 ELSE 0 END) AS CountOfSys
,SUM(CASE WHEN PATINDEX('plan%',[name])>0 THEN 1 ELSE 0 END) AS CountOfPlan
,SUM(CASE WHEN PATINDEX('spt_%',[name])>0 THEN 1 ELSE 0 END) AS CountOfSpt
FROM sys.objects;
Another approach would be to do a
SELECT 'sys%' AS Pattern, COUNT(*) AS CountPattern
FROM ... WHERE [name] LIKE 'sys%'
UNION ALL
SELECT 'plan%',COUNT(*)
...
UNION ALL
...
This would return a list of all your counts in tabular form.
A third option would be to place all your search patterns into a table and use that table in a CROSS JOIN (a similar idea to the UNION approach, but more flexible and more generic):
USE master;
GO
DECLARE @tblPattern TABLE(Pattern VARCHAR(100));
INSERT INTO @tblPattern VALUES('sys%'),('plan%'),('spt_%');
SELECT p.Pattern
,SUM(CASE WHEN PATINDEX(p.Pattern,o.[name])>0 THEN 1 ELSE 0 END) AS CountPattern
FROM sys.objects AS o
CROSS JOIN @tblPattern AS p
GROUP BY p.Pattern
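The same pattern-table idea can be sketched in SQLite (PATINDEX is SQL Server specific, so LIKE stands in for it here; the sample names are invented):

```python
import sqlite3

# A names table and a patterns table, both with made-up contents.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE name_table (name TEXT);
    INSERT INTO name_table VALUES ('John Smith'), ('Johnny'), ('Robert');
    CREATE TABLE patterns (pattern TEXT);
    INSERT INTO patterns VALUES ('John%'), ('Mich%'), ('Rob%');
""")

# Every name is tested against every pattern; one row of counts per pattern.
rows = con.execute("""
    SELECT p.pattern,
           SUM(CASE WHEN n.name LIKE p.pattern THEN 1 ELSE 0 END) AS count_pattern
    FROM name_table n
    CROSS JOIN patterns p
    GROUP BY p.pattern
    ORDER BY p.pattern
""").fetchall()
print(rows)
```

Adding a fourth search term is now just another row in the pattern table, not another branch in the query.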
You can use a CASE expression for each name that resolves to 1 if the name exists in the table or 0 if it does not. Then just add them together; the result will be 0-3, i.e. the number of names that exist in the table.
select
case when exists
(select 1 from name_table
where name like 'John%')
then 1 else 0 end
+
case when exists
(select 1 from name_table
where name like 'Mich%')
then 1 else 0 end
+
case when exists
(select 1 from name_table
where name like 'Rob%')
then 1 else 0 end
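A runnable version of that sum-of-EXISTS approach (SQLite, with invented sample rows):

```python
import sqlite3

# Two sample names; 'Mich%' will find no match.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE name_table (name TEXT)")
con.executemany("INSERT INTO name_table VALUES (?)",
                [("John Smith",), ("Robert Jones",)])

# Each EXISTS contributes 0 or 1; the sum is how many terms hit.
(hits,) = con.execute("""
    SELECT CASE WHEN EXISTS (SELECT 1 FROM name_table WHERE name LIKE 'John%')
                THEN 1 ELSE 0 END
         + CASE WHEN EXISTS (SELECT 1 FROM name_table WHERE name LIKE 'Mich%')
                THEN 1 ELSE 0 END
         + CASE WHEN EXISTS (SELECT 1 FROM name_table WHERE name LIKE 'Rob%')
                THEN 1 ELSE 0 END
""").fetchone()
print(hits)
```

A nice property of EXISTS here is that each probe can stop at the first matching row rather than counting them all.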

Most optimized way to get column totals in SQL Server 2005+

I am creating some reports for an application to be used by various states. The database has the potential to be very large. I would like to know which way is the best way to get column totals.
Currently I have SQL similar to the following:
SELECT count(case when prg.prefix_id = 1 then iss.id end) +
count(case when prg.prefix_id = 2 then iss.id end) as total,
count(case when prg.prefix_id = 1 then iss.id end) as c1,
count(case when prg.prefix_id = 2 then iss.id end) as c2
FROM dbo.TableName
WHERE ...
As you can see, the columns are in there twice: in one instance I'm adding them and showing the total, and in the other I'm just showing the individual values, which is required for the report.
This is a very small sample of the SQL; there are 20+ columns, and within those columns 4 or more are being summed at times.
I was thinking of declaring some @Parameters and setting each of the columns equal to a @Parameter; then I could just add up whichever @Parameters I needed to show the column totals, e.g. SET @Total = @c1 + @c2.
But does the SQL Server engine even care that the columns are in there multiple times like that? Is there a better way of doing this?
Any reason this isn't done as
select prg.prefix_id, count(1) from tablename where... group by prg.prefix_id
It would leave you with a result set of the prefix_id and the count of rows for each prefix_id... it might be preferable to a series of count(case) statements, and I think it should be quicker, but I can't confirm for sure.
I would use a subquery before resorting to @vars myself. Something like this:
select c1,c2,c1+c2 as total from
(SELECT
count(case when prg.prefix_id = 1 then iss.id end) as c1,
count(case when prg.prefix_id = 2 then iss.id end) as c2
FROM dbo.TableName
WHERE ... ) a
Use straight SQL if you can before resorting to T-SQL procedural logic. Rule of thumb: if you can do it in SQL, do it in SQL. If you want to emulate static values with straight SQL, try an inline view like this:
SELECT iv1.c1 + iv1.c2 as total,
iv1.c1,
iv1.c2
FROM
(
SELECT count(case when prg.prefix_id = 1 then iss.id end) as c1,
count(case when prg.prefix_id = 2 then iss.id end) as c2
FROM dbo.TableName
WHERE ...
) AS iv1
This way you are logically getting the counts once and can compute values based on those counts. However, I think SQL Server is smart enough not to scan for the count n times, so I don't know that your plan would differ between the SQL I sent and the SQL you have.
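A runnable sketch of the inline-view approach (SQLite, with a made-up table standing in for dbo.TableName and its joins):

```python
import sqlite3

# Minimal stand-in table: each row has a prefix_id bucket and an id.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (prefix_id INTEGER, id INTEGER)")
con.executemany("INSERT INTO t VALUES (?, ?)",
                [(1, 10), (1, 11), (2, 20)])

# The inner view counts each bucket once (COUNT skips the NULLs the CASE
# produces); the outer query derives the total from those counts.
row = con.execute("""
    SELECT iv1.c1 + iv1.c2 AS total, iv1.c1, iv1.c2
    FROM (
        SELECT COUNT(CASE WHEN prefix_id = 1 THEN id END) AS c1,
               COUNT(CASE WHEN prefix_id = 2 THEN id END) AS c2
        FROM t
    ) AS iv1
""").fetchone()
print(row)
```

Each count appears once in the text of the query; the total is pure arithmetic on the inner columns.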

Grouping data in the select statement

I have huge data which needs to be classified into different groups while retrieving. Each group has a different condition. I don't want to retrieve them separately; I want to know the number of items in each group using a single SQL statement.
For example, the pseudo code will be like this:
Select count(IssueID) as Issue1_Count if(condition1),
count(IssueID) as Issue2_Count if(condition2),
count(IssueID) as Issue3_Count if(condition3)
From table1, table2, tabl3
where common_condition1 and common_Condition2;
Can somebody help me with making an Oracle query for this?
Put it like this:
SELECT
SUM(CASE WHEN condition1 THEN 1 ELSE 0 END) as Issue1_Count,
SUM(CASE WHEN condition2 THEN 1 ELSE 0 END) as Issue2_Count,
SUM(CASE WHEN condition3 THEN 1 ELSE 0 END) as Issue3_Count
FROM
table1, table2, tabl3
WHERE
common_condition1 and common_Condition2;
Oracle's CASE expression should help you here. Have a look at this: http://www.dba-oracle.com/t_case_sql_clause.htm
There are limits though, so I'm not 100% positive you can do exactly what you have here using them.

Optimize help for sql query

We've got some SQL code I'm trying to optimize. In the code is a view that is rather expensive to run; for the sake of this question, let's call it ExpensiveView. On top of the view there is a query that joins the view to itself via two sub-queries.
For example:
select v1.varCharCol1, v1.intCol, v2.intCol from (
select someId, varCharCol1, intCol from ExpensiveView where rank=1
) as v1 inner join (
select someId, intCol from ExpensiveView where rank=2
) as v2 on v1.someId = v2.someId
An example result set:
some random string, 5, 10
other random string, 15, 15
This works, but it's slow since I'm having to select from ExpensiveView twice. What I'd like to do is use a case statement to only select from ExpensiveView once.
For example:
select someId,
case when rank = 1 then intCol else 0 end as rank1IntCol,
case when rank = 2 then intCol else 0 end as rank2IntCol
from ExpensiveView where rank in (1,2)
I could then group the above results by someId and get almost the same thing as the first query:
select sum(rank1IntCol), sum(rank2Intcol)
from ( *the above query* ) SubQueryData
group by someId
The problem is the varCharCol1 that I need to get when the rank is 1. I can't use it in the group since that column will contain different values when rank is 1 than it does when rank is 2.
Does anyone have any solutions to optimize the query so it only selects from ExpensiveView once and still is able to get the varchar data?
Thanks in advance.
It's hard to guess since we don't see your view definition, but try this:
SELECT MIN(CASE rank WHEN 1 THEN v1.varCharCol1 ELSE NULL END),
SUM(CASE rank WHEN 1 THEN rank1IntCol ELSE 0 END),
SUM(CASE rank WHEN 2 THEN rank2IntCol ELSE 0 END)
FROM query
GROUP BY
someId
Note that in most cases for the queries like this:
SELECT *
FROM mytable1 m1
JOIN mytable1 m2
ON …
the SQL Server optimizer will just build an Eager Spool (a temporary index), which will later be used for searching for the JOIN condition, so probably these tricks are redundant.
select someId,
case when rank = 1 then varCharCol1 else '_' end as varCharCol1,
case when rank = 1 then intCol else 0 end as rank1IntCol,
case when rank = 2 then intCol else 0 end as rank2IntCol
from ExpensiveView where rank in (1,2)
Then use MIN() or MAX() on varCharCol1 in the enclosing query.
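Putting those pieces together as a runnable sketch (SQLite, with a plain table named expensive_view standing in for the real view and a column rnk standing in for its rank column; the rows are the example values from the question):

```python
import sqlite3

# Stand-in for ExpensiveView: one someId with a rank-1 and a rank-2 row.
con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE expensive_view "
    "(someId INTEGER, rnk INTEGER, varCharCol1 TEXT, intCol INTEGER)"
)
con.executemany("INSERT INTO expensive_view VALUES (?, ?, ?, ?)",
                [(1, 1, "some random string", 5),
                 (1, 2, "x", 10)])

# One scan of the view: MAX() over the CASE recovers the rank-1 varchar
# (the rank-2 rows contribute NULL, which MAX ignores).
rows = con.execute("""
    SELECT someId,
           MAX(CASE WHEN rnk = 1 THEN varCharCol1 END) AS varCharCol1,
           SUM(CASE WHEN rnk = 1 THEN intCol ELSE 0 END) AS rank1IntCol,
           SUM(CASE WHEN rnk = 2 THEN intCol ELSE 0 END) AS rank2IntCol
    FROM expensive_view
    WHERE rnk IN (1, 2)
    GROUP BY someId
""").fetchall()
print(rows)
```

This reproduces the self-join's result set while reading the expensive view only once.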