Why do multiple EXISTS break a query

I am trying to add a new table, whose values need to be checked, to a stored procedure. Statement 1 checks against the existing table, while Statement 2 checks against the new table.
I currently have two EXISTS conditions that each work on their own and return the results I expect: if I comment out Statement 1, Statement 2 works, and vice versa. When I combine them, the query never completes. There is no error, but it times out, which is unexpected because each statement on its own takes only a few seconds.
I understand there is likely a better way to do this, but before I change it I would like to know why multiple EXISTS conditions like this don't work. Is the WHERE clause not meant to contain more than one EXISTS condition?
SELECT *
FROM table1 S
WHERE
    --Statement 1
    EXISTS
    (
        SELECT 1
        FROM table2 P WITH (NOLOCK)
        INNER JOIN table3 SA ON SA.ID = P.ID
        WHERE P.DATE = @Date AND P.OTHER_ID = S.ID
        AND
        (
            SA.FILTER = ''
            OR
            (
                SA.FILTER = 'bar'
                AND
                LOWER(S.OTHER) = 'foo'
            )
        )
    )
    OR
    (
        --Statement 2
        EXISTS
        (
            SELECT 1
            FROM table4 P WITH (NOLOCK)
            INNER JOIN table5 SA ON SA.ID = P.ID
            WHERE P.DATE = @Date
            AND P.OTHER_ID = S.ID
            AND LOWER(S.OTHER) = 'foo'
        )
    )
EDIT: I have included the query details. Tables 1-5 are five different tables; none of them is repeated.

Too long to comment.
Your query as written looks correct. The timeout can only really be diagnosed from the execution plan, but here are a few things that could be happening, or that you could benefit from:
Parameter sniffing on @Date. Try hard-coding this value and see whether the query is still slow.
No covering index on P.OTHER_ID, P.DATE, P.ID, or SA.ID, which would cause a table scan for these predicates.
Indexes on the above columns that aren't optimal (too many included columns, etc.).
Your query running serially when it might benefit from parallelism.
Using the LOWER function on a database that doesn't use a case-sensitive collation (most don't). The call is redundant there, though it doesn't slow things down much.
A bad query plan in cache. Try adding OPTION (RECOMPILE) at the end of the query so that a new plan is compiled. The same hint is useful when comparing the speed of two queries, to make sure that neither is using a cached plan while the other isn't, which would skew the results.
Since your query is timing out, try capturing the estimated execution plan and posting it for us on Paste The Plan.
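The two quickest checks from the list above, hard-coding the date and forcing a fresh plan, might look like the sketch below. The temp tables #table1 and #table2, their columns, the index name, and the sample date are all assumptions standing in for the real schema; only the shape of the two test queries matters.

```sql
-- Hypothetical stand-ins for the question's tables (names and columns are assumptions).
CREATE TABLE #table1 (ID int PRIMARY KEY, OTHER varchar(20));
CREATE TABLE #table2 (ID int, OTHER_ID int, [DATE] date);
INSERT INTO #table1 VALUES (1, 'foo'), (2, 'baz');
INSERT INTO #table2 VALUES (10, 1, '20200101');

-- An index that lets the EXISTS probe seek on its predicates instead of scanning.
CREATE INDEX IX_table2_date_other ON #table2 ([DATE], OTHER_ID) INCLUDE (ID);

DECLARE @Date date = '20200101';

-- Check 1: a literal instead of the parameter, to rule out a plan built for another value.
SELECT S.*
FROM #table1 AS S
WHERE EXISTS (SELECT 1
              FROM #table2 AS P
              WHERE P.[DATE] = '20200101' AND P.OTHER_ID = S.ID);

-- Check 2: keep the parameter, but compile a fresh plan for this execution.
SELECT S.*
FROM #table1 AS S
WHERE EXISTS (SELECT 1
              FROM #table2 AS P
              WHERE P.[DATE] = @Date AND P.OTHER_ID = S.ID)
OPTION (RECOMPILE);

DROP TABLE #table2, #table1;
```

If either check makes the real query fast again, the cached plan or the sniffed parameter value is the likely culprit rather than the EXISTS conditions themselves.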

I found that putting two EXISTS conditions in the WHERE clause made the whole query take significantly longer. What fixed it was keeping each EXISTS in its own query and combining the two with UNION. The final result looked like this:
SELECT *
FROM table1 S
WHERE
    --Statement 1
    EXISTS
    (
        SELECT 1
        FROM table2 P WITH (NOLOCK)
        INNER JOIN table3 SA ON SA.ID = P.ID
        WHERE P.DATE = @Date AND P.OTHER_ID = S.ID
        AND
        (
            SA.FILTER = ''
            OR
            (
                SA.FILTER = 'bar'
                AND
                LOWER(S.OTHER) = 'foo'
            )
        )
    )
UNION
--Statement 2
SELECT *
FROM table1 S
WHERE
    EXISTS
    (
        SELECT 1
        FROM table4 P WITH (NOLOCK)
        INNER JOIN table5 SA ON SA.ID = P.ID
        WHERE P.DATE = @Date
        AND P.OTHER_ID = S.ID
        AND LOWER(S.OTHER) = 'foo'
    )
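As a sanity check that the UNION form returns the same rows as the OR form, here is a minimal sketch on made-up data. The temp tables and their contents are assumptions, and the subqueries are simplified to a single table each.

```sql
-- Hypothetical miniature versions of the tables involved.
CREATE TABLE #table1 (ID int PRIMARY KEY, OTHER varchar(20));
CREATE TABLE #table2 (OTHER_ID int);  -- source for statement 1
CREATE TABLE #table4 (OTHER_ID int);  -- source for statement 2
INSERT INTO #table1 VALUES (1, 'foo'), (2, 'foo'), (3, 'baz');
INSERT INTO #table2 VALUES (1), (2);
INSERT INTO #table4 VALUES (2), (3);

-- OR form: one pass over #table1 with two EXISTS probes.
SELECT S.ID
FROM #table1 AS S
WHERE EXISTS (SELECT 1 FROM #table2 AS P WHERE P.OTHER_ID = S.ID)
   OR EXISTS (SELECT 1 FROM #table4 AS P
              WHERE P.OTHER_ID = S.ID AND LOWER(S.OTHER) = 'foo');

-- UNION form: each EXISTS gets its own query; UNION removes the overlap (ID 2).
SELECT S.ID
FROM #table1 AS S
WHERE EXISTS (SELECT 1 FROM #table2 AS P WHERE P.OTHER_ID = S.ID)
UNION
SELECT S.ID
FROM #table1 AS S
WHERE EXISTS (SELECT 1 FROM #table4 AS P
              WHERE P.OTHER_ID = S.ID AND LOWER(S.OTHER) = 'foo');

-- Both queries return IDs 1 and 2; ID 3 fails the LOWER(S.OTHER) = 'foo' test.
DROP TABLE #table4, #table2, #table1;
```

One caveat: UNION also removes duplicate rows, so the two forms are only guaranteed to match when the rows of table1 are unique, for example because the table has a primary key.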

Related

Need help in optimizing sql query

I am new to SQL and have written the query below to fetch the required results. However, the query is very slow and seems to take ages to run. Any help with optimizing it would be great.
Below is the SQL query I am using:
SELECT
Date_trunc('week',a.pair_date) as pair_week,
a.used_code,
a.used_name,
b.line,
b.channel,
count(
case when b.sku = c.sku then used_code else null end
)
from
a
left join b on a.ma_number = b.ma_number
and (a.imei = b.set_id or a.imei = b.repair_imei
)
left join c on a.used_code = c.code
group by 1,2,3,4,5

I would rewrite the query as:
select Date_trunc('week',a.pair_date) as pair_week,
a.used_code, a.used_name, b.line, b.channel,
count(*) filter (where b.sku = c.sku)
from a left join
b
on a.ma_number = b.ma_number and
a.imei in ( b.set_id, b.repair_imei ) left join
c
on a.used_code = c.code
group by 1,2,3,4,5;
For this query, you want indexes on b(ma_number, set_id, repair_imei) and c(code, sku). Beyond that, there isn't much scope for optimization.
There might be other possibilities, depending on the tables. For instance, OR/IN in the ON clause is usually a bad sign, but it is unclear what your intention really is.

SQL How to optimize insert to table from temporary table

I created a procedure that dynamically collects records from various projects (databases) into a temporary table and then inserts from that temporary table into a table, filtered by a WHERE clause. When I checked the execution plan, I found that this part of the query accounts for much of the load. How can I optimize the INSERT or the WHERE clause?
INSERT INTO dbo.PROJECTS_TESTS ( PROJECTID, ANOTHERTID, DOMAINID, is_test)
SELECT * FROM #temp_Test AS tC
WHERE NOT EXISTS (SELECT TOP 1 1
FROM dbo.PROJECTS_TESTS AS ps WITH (NOLOCK)
WHERE ps.PROJECTID = tC.projectId
AND ps.ANOTHERTID = tC.anotherLink
AND ps.DOMAINID = tC.DOMAINID
AND ps.is_test = tC.test_project
)

I think you'd be better served by a JOIN than by EXISTS. Depending on the cardinality of your join condition (currently in your WHERE), you might need DISTINCT in there too.
INSERT INTO dbo.PROJECTS_TESTS ( PROJECTID, ANOTHERTID, DOMAINID, is_test)
SELECT /* add DISTINCT here if needed */ tC.*
FROM #temp_Test AS tC
LEFT OUTER JOIN dbo.PROJECTS_TESTS AS ps
    ON ps.PROJECTID = tC.projectId
    AND ps.ANOTHERTID = tC.anotherLink
    AND ps.DOMAINID = tC.DOMAINID
    AND ps.is_test = tC.test_project
WHERE ps.PROJECTID IS NULL
Or something along those lines.
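To see when DISTINCT matters, here is a small self-contained sketch comparing the two forms. The temp tables #target and #staging and their two columns are made up for illustration; they stand in for dbo.PROJECTS_TESTS and #temp_Test.

```sql
-- Hypothetical stand-ins: #target for the destination table, #staging for the temp table.
CREATE TABLE #target  (PROJECTID int, DOMAINID int);
CREATE TABLE #staging (projectId int, DOMAINID int);
INSERT INTO #target  VALUES (1, 10);
INSERT INTO #staging VALUES (1, 10), (2, 20), (2, 20);  -- (2, 20) is duplicated on purpose

-- NOT EXISTS form: returns (2, 20) twice.
SELECT s.projectId, s.DOMAINID
FROM #staging AS s
WHERE NOT EXISTS (SELECT 1
                  FROM #target AS t
                  WHERE t.PROJECTID = s.projectId AND t.DOMAINID = s.DOMAINID);

-- LEFT JOIN / IS NULL form: also returns (2, 20) twice.
SELECT s.projectId, s.DOMAINID
FROM #staging AS s
LEFT JOIN #target AS t
    ON t.PROJECTID = s.projectId AND t.DOMAINID = s.DOMAINID
WHERE t.PROJECTID IS NULL;

-- With DISTINCT the duplicate staging row collapses to a single (2, 20).
SELECT DISTINCT s.projectId, s.DOMAINID
FROM #staging AS s
LEFT JOIN #target AS t
    ON t.PROJECTID = s.projectId AND t.DOMAINID = s.DOMAINID
WHERE t.PROJECTID IS NULL;

DROP TABLE #staging, #target;
```

Both forms keep duplicates that already exist in the staging data, so DISTINCT is only needed if the temp table itself can contain repeated rows.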

Retrieve additional rows if bit flag is true

I have a large stored procedure that returns results for a dialog with many selections. I have a new requirement to return "extra" rows if a particular bit column is set to true. The current setup looks like this:
SELECT
CustomerID,
FirstName,
LastName,
...
FROM HumongousQuery hq
LEFT JOIN (
-- New Query Text
) nsq ON hq.CustomerID = nsq.CustomerID
I have the first half of the new query:
SELECT DISTINCT
c.CustomerID,
pp.ProjectID,
ep.ProductID
FROM Customers c
JOIN Evaluations e (NOLOCK)
ON c.CustomerID = e.CustomerID
JOIN EvaluationProducts ep (NOLOCK)
ON e.EvaluationID = ep.EvaluationID
JOIN ProjectProducts pp (NOLOCK)
ON ep.ProductID = pp.ProductID
JOIN Projects p
ON pp.ProjectID = p.ProjectID
WHERE
c.EmployeeID = @EmployeeID
AND e.CurrentStepID = 5
AND p.IsComplete = 0
The Projects table has a bit column, AllowIndirectCustomers, which indicates that the project can use additional customers when the value is true. As far as I can tell, most SQL constructs are geared towards adding columns to the result set rather than rows. I tried different permutations of UNION, with no luck. Normally I would turn to a table-valued function, but I haven't been able to make it work in this scenario.
This one has been a stumper for me. Any ideas?

So basically, you're looking to negate the need to match pp.ProjectID = p.ProjectID when the flag is set. You can do that right in the JOIN criteria:
JOIN Projects p
ON pp.ProjectID = p.ProjectID OR p.AllowIndirectCustomers = 1
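To make the effect of that OR concrete, here is a minimal sketch on made-up data. The temp tables are cut-down, hypothetical versions of ProjectProducts and Projects with only the columns the join touches.

```sql
-- Hypothetical miniature versions of the question's tables.
CREATE TABLE #Projects (ProjectID int PRIMARY KEY, AllowIndirectCustomers bit);
CREATE TABLE #ProjectProducts (ProjectID int, ProductID int);
INSERT INTO #Projects VALUES (1, 0), (2, 1), (3, 0);
INSERT INTO #ProjectProducts VALUES (1, 100);

SELECT pp.ProductID, p.ProjectID, p.AllowIndirectCustomers
FROM #ProjectProducts AS pp
JOIN #Projects AS p
    ON pp.ProjectID = p.ProjectID OR p.AllowIndirectCustomers = 1
ORDER BY p.ProjectID;
-- Product 100 is paired with project 1 (ID match) and project 2 (flag set),
-- but not with project 3.

DROP TABLE #ProjectProducts, #Projects;
```

Note that every flagged project is paired with every product row, so the row count can grow quickly on real data.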

Depending on the complexity of your tables, this might not work out too easily, but you could use a CASE expression on your bit column. Something like this:
select table1.id, table1.value,
case table1.flag
when 1 then
table2.value
else null
end as secondvalue
from table1
left join table2 on table1.id = table2.id
Here's a SQL Fiddle demo

SQL Server 2008 R2 query

I'm running the following query, but it is taking too long. Is there a way to make it faster or change the way the query is written?
Please help.
SELECT *
FROM ProductGroupLocUpdate WITH (nolock)
WHERE CmStatusFlag > 2
AND EngineID IN ( 0, 1 )
AND NOT EXISTS (SELECT DISTINCT APGV.LocationID
FROM CM_ST_ActiveProductGroupsView AS APGV WITH(nolock)
WHERE APGV.LocationID = ProductGroupLocUpdate.Locationid);

Try rewriting the query with a join
SELECT PGLU.* from ProductGroupLocUpdate PGLU WITH (NOLOCK)
LEFT JOIN CM_ST_ActiveProductGroupsView APGV WITH (NOLOCK)
ON PGLU.LocationId = APGV.LocationID
WHERE APGV.LocationID IS NULL AND CmStatusFlag>2 AND EngineID IN (0,1)
Depending on how much data is in your tables, consider adding indexes on LocationId (in both tables), CmStatusFlag, and EngineID.

Joining on one of Two Tables Based on Parameter

Not sure if this can be done, but here is what I am trying to do.
I have two tables:
Table 1 is called Task and it contains all of the possible Task Names
Table 2 is called Task_subset and it contains only a subset of the Task Names included in Table 1
I have a variable called @TaskControl that is passed in as a parameter; it is equal to either Table1 or Table2.
Based on the value of @TaskControl, I want to join one of my task tables.
For example:
If @TaskControl = 'Table1':
Select * From Orders O Join Task T on T.id = O.id
If @TaskControl = 'Table2':
Select * From Orders O Join Task_subset T on T.id = O.id
How would I do this in SQL Server 2008?

Don't overcomplicate it. Put it into a stored proc like so:
CREATE PROCEDURE dbo.MyProcedure(@TaskControl varchar(20))
AS
If @TaskControl = 'Table1'
    Select * From Orders O Join Task T on T.id = O.id
ELSE If @TaskControl = 'Table2'
    Select * From Orders O Join Task_subset T on T.id = O.id
ELSE
    SELECT 'Invalid Parameter'
Or just straight T-SQL with no proc:
If @TaskControl = 'Table1'
    Select * From Orders O Join Task T on T.id = O.id
ELSE If @TaskControl = 'Table2'
    Select * From Orders O Join Task_subset T on T.id = O.id

Doing it exactly as you do it right now is the best way. A single statement that tries to dynamically join one of two tables is the last thing you want. T-SQL is a language for data access, not for DRY code-reuse programming. If you use a single statement, the optimizer has to come up with a plan that always works, whatever the value of @TaskControl, so the plan will always have to join both tables.
A more lengthy discussion on this topic is Dynamic Search Conditions in T-SQL (your dynamic join falls into the same topic as dynamic search).
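If the list of candidate tables ever grew to the point where separate IF branches became unwieldy, dynamic SQL is another way to give each table its own statement and plan. The following is only a sketch under assumptions: the temp tables stand in for Orders, Task and Task_subset, and the variable names @table and @sql are made up. The parameter is mapped to a known table name rather than concatenated directly.

```sql
-- Hypothetical stand-ins for Orders, Task and Task_subset.
CREATE TABLE #Orders (id int);
CREATE TABLE #Task (id int);
CREATE TABLE #Task_subset (id int);
INSERT INTO #Orders VALUES (1), (2);
INSERT INTO #Task VALUES (1), (2);
INSERT INTO #Task_subset VALUES (2);

DECLARE @TaskControl varchar(20) = 'Table2';

-- Map the parameter to a known table name; never concatenate the raw parameter.
DECLARE @table sysname =
    CASE @TaskControl
        WHEN 'Table1' THEN N'#Task'
        WHEN 'Table2' THEN N'#Task_subset'
    END;

IF @table IS NULL
    RAISERROR('Invalid @TaskControl value', 16, 1);
ELSE
BEGIN
    -- Each distinct statement text is compiled and cached with its own plan.
    DECLARE @sql nvarchar(max) =
        N'SELECT O.id FROM #Orders AS O JOIN ' + QUOTENAME(@table)
        + N' AS T ON T.id = O.id;';
    EXEC sys.sp_executesql @sql;
END;

DROP TABLE #Task_subset, #Task, #Orders;
```

For just two tables, the plain IF / ELSE version remains simpler and easier to read.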

If they are UNION compatible you could give this a shot. From a quick test this end it only appears to access the relevant table.
I do agree more with JNK's and Remus's answers however. This does have a recompilation cost for every invocation and not much benefit.
;WITH T AS
(
    SELECT 'Table1' AS TaskControl, id
    FROM Task
    UNION ALL
    SELECT 'Table2' AS TaskControl, id
    FROM Task_subset
)
SELECT *
FROM T
JOIN Orders O on T.id = O.id
WHERE TaskControl = @TaskControl
OPTION (RECOMPILE)

I don't know how good performance would be, and this would not scale well as you add on additional optional tables, but this should work in the situation that you present.
SELECT
    O.some_column,
    COALESCE(T.some_task_column, TS.some_task_subset_column)
FROM
    Orders O
LEFT OUTER JOIN Tasks T ON
    @task_control = 'Tasks' AND
    T.id = O.id
LEFT OUTER JOIN Task_Subsets TS ON
    @task_control = 'Task Subsets' AND
    TS.id = O.id

Try the following. It should prevent the stored procedure's plan from being bound to the value of the parameter passed during the first execution of the stored procedure (see SQL Server Parameter Sniffing for details):
create proc dbo.foo
    @TaskControl varchar(32)
as
    declare @selection varchar(32)
    set @selection = @TaskControl
    select *
    from dbo.Orders t
    join dbo.Task t1 on t1.id = t.id
    where @selection = 'Table1'
    UNION ALL
    select *
    from dbo.Orders t
    join dbo.Task_subset t1 on t1.id = t.id
    where @selection = 'Table2'
    return 0
go
The stored procedure shouldn't get recompiled on each invocation either, as @Martin suggested might happen, and the parameter value first passed in should not influence the execution plan that gets bound. If performance is an issue, run a SQL trace with Profiler and see whether the cached execution plan is reused or a recompile is triggered.
One caveat: you will need to ensure that each individual select in the UNION returns the same columns. Each select in a UNION must have the same number of columns, and each column must have a common type (or a default conversion to the common type). The first select defines the names of the columns in the result set; the column types follow data type precedence.
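A minimal sketch of that column requirement, assuming made-up temp tables in which the full task table has an extra column that the subset lacks. Listing the columns explicitly, instead of select *, keeps both branches of the UNION ALL compatible.

```sql
-- Hypothetical stand-ins; #Task has an extra column that #Task_subset lacks.
CREATE TABLE #Orders      (id int, customer varchar(50));
CREATE TABLE #Task        (id int, task_name varchar(50), owner_name varchar(50));
CREATE TABLE #Task_subset (id int, task_name varchar(50));
INSERT INTO #Orders      VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO #Task        VALUES (1, 'Pack', 'Ann'), (2, 'Ship', 'Bob');
INSERT INTO #Task_subset VALUES (2, 'Ship');

DECLARE @selection varchar(32) = 'Table2';

-- Explicit, matching column lists; SELECT * would fail here because the
-- two branches would return a different number of columns.
SELECT o.id, o.customer, t1.task_name
FROM #Orders AS o
JOIN #Task AS t1 ON t1.id = o.id
WHERE @selection = 'Table1'
UNION ALL
SELECT o.id, o.customer, t1.task_name
FROM #Orders AS o
JOIN #Task_subset AS t1 ON t1.id = o.id
WHERE @selection = 'Table2';
-- With @selection = 'Table2' this returns the single row (2, 'Globex', 'Ship').

DROP TABLE #Task_subset, #Task, #Orders;
```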
