SQL Server Query optimiser problem - sql

The way SQL Server seems to be optimising a query is causing it to fail. The two examples below illustrate this:
SELECT distinct ET.ElementName, ET.Shared, CONVERT(float,ED.Value), ED.SheetSetVersionID, ED.SheetDataID
FROM tElementData ED
INNER JOIN tElementTemplate ET
ON ED.ElementTemplateID = ET.ElementTemplateID
AND ET.ElementName like 'RPODCQRated'
The above query works fine but is not the query I need to run.
SELECT distinct ET.ElementName, ET.Shared, CONVERT(float,ED.Value), ED.SheetSetVersionID, ED.SheetDataID
FROM tElementData ED
INNER JOIN tElementTemplate ET
ON ED.ElementTemplateID = ET.ElementTemplateID
AND ET.ElementName like 'RPODCQRated'
AND CONVERT(float,ED.Value) = 0.006388
The above query throws an exception saying that it cannot convert an nvarchar value to a float. tElementData.Value is an nvarchar(500) field and some records do have non-numeric values, but every value whose tElementTemplate.ElementName is 'RPODCQRated' can be converted to a float, as the top query proves. It seems that SQL Server, in its wisdom, is applying the CONVERT(float,ED.Value) before it performs the join. I need the second query to work somehow. I can rewrite it, but there are limits on what I can do without rewriting the entire data layer of an existing application.
Things I have tried that don't help: moving the last criterion into a WHERE clause rather than the join, making the first query into a CTE and applying the WHERE clause to the CTE, and creating a scalar function that calls IsNumeric on the data before trying the convert.
The only thing I could get to work was to insert all the data into a temporary table and then apply a WHERE clause to the temporary table. Unfortunately, implementing this as a solution would involve extensive refactoring of the application's data layer in order to solve an obscure bug when searching for certain records.
Any ideas?

The only way in SQL to ensure a particular order of evaluation is to use a CASE expression:
SELECT distinct ET.ElementName, ET.Shared, CONVERT(float,ED.Value), ED.SheetSetVersionID, ED.SheetDataID
FROM tElementData ED
INNER JOIN tElementTemplate ET
ON ED.ElementTemplateID = ET.ElementTemplateID
AND ET.ElementName like 'RPODCQRated'
AND CASE WHEN ET.ElementName like 'RPODCQRated' THEN CONVERT(float,ED.Value) ELSE 0 END = 0.006388
This will likely cause a duplicate check on the ElementName, but as far as I know, it's the only way to ensure the order of evaluation.
Unless, of course, you move the entire evaluation out of the query, nest the results in a CTE and do the cast on the results.
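The CASE guard can be tried out in a small, self-contained sketch. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption made only so the example is runnable), and a hypothetical user-defined function to_float that raises on non-numeric text, much as CONVERT(float, ...) does. The table contents are invented for illustration. SQLite documents CASE as lazily evaluated, so the conversion is only attempted for rows with the matching element name.

```python
import sqlite3

def to_float(text):
    # Hypothetical strict conversion: raises ValueError on non-numeric text,
    # standing in for CONVERT(float, ...) failing in SQL Server.
    return float(text)

conn = sqlite3.connect(":memory:")
conn.create_function("to_float", 1, to_float)
conn.executescript("""
    CREATE TABLE tElementTemplate (ElementTemplateID INTEGER PRIMARY KEY, ElementName TEXT);
    CREATE TABLE tElementData (ElementTemplateID INTEGER, Value TEXT);
    INSERT INTO tElementTemplate VALUES (1, 'RPODCQRated'), (2, 'Comment');
    INSERT INTO tElementData VALUES (1, '0.006388'), (1, '1.5'), (2, 'not a number');
""")

# The CASE expression only calls to_float when the element name matches,
# so the 'not a number' row never reaches the conversion.
guarded = conn.execute("""
    SELECT ET.ElementName, ED.Value
    FROM tElementData ED
    INNER JOIN tElementTemplate ET
        ON ED.ElementTemplateID = ET.ElementTemplateID
    WHERE CASE WHEN ET.ElementName = 'RPODCQRated'
               THEN to_float(ED.Value) ELSE 0 END = ?
""", (0.006388,)).fetchall()
print(guarded)
```

The comparison value is passed as a bound parameter so that both sides of the equality are the same double, which sidesteps any difference in how the literal is parsed.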

I would try breaking it out into something like this:
;with a as
(
SELECT distinct
ET.ElementName,
ET.Shared,
CONVERT(float, ED.Value) AS Value,
ED.SheetSetVersionID,
ED.SheetDataID
FROM
tElementData ED
INNER JOIN tElementTemplate ET
ON ED.ElementTemplateID = ET.ElementTemplateID
AND ET.ElementName like 'RPODCQRated'
)
select *
from a
where a.Value = 0.006388
Or, have you tried "where ED.Value = '0.006388'", or whatever the varchar equivalent is?

I have solved this problem by using a table-valued function (TVF). The element name, the operator and the right-hand value of the last join clause are all dynamically generated. I created the TVF below and replaced the relevant part of the select statement with a call to it.
CREATE FUNCTION tvfAdvancedSearch
(
@TemplateType nvarchar(500)
)
RETURNS
@Results TABLE
(
ElementName nvarchar(50),
Shared tinyint,
Value NVARCHAR(500),
SheetSetVersionID int,
SheetDataID int
)
AS
BEGIN
INSERT INTO @Results
SELECT distinct ET.ElementName, ET.Shared, ED.Value, ED.SheetSetVersionID, ED.SheetDataID
FROM tElementData ED
INNER JOIN tElementTemplate ET
ON ED.ElementTemplateID = ET.ElementTemplateID
AND ET.ElementName like @TemplateType
RETURN
END
GO
I would also like to mention that Brian Rudolph's answer worked too, but I had already implemented this solution before I saw his post.

Related

using an alias of a function in a sql condition

I have something like this:
SELECT
cansa1.NAME,
mod(cansa1.PRODUCT_ID, 1000000) prodIdHash
FROM CANSA_TABLE cansa1
INNER JOIN CUSER_TABLE cuser1 ON cansa1.PRODUCT_ID = cuser1.PRODUCT_ID
AND mod(cansa1.PRODUCT_ID, 1000000) = cuser1.PRODUCT_HASH
This query works, but I want to replace the second occurrence of the mod() function (in the inner join) to avoid executing it twice. I tried replacing it with the alias from the select clause, but that does not work. Any idea what I can use so that this query doesn't repeat the mod() function?
Sorry for my English.
Don't worry about executing it twice. The SQL engine will optimise the query and decide whether to cache the function value or evaluate it twice. It may also rewrite the query, so that what is actually executed has a different structure from what you wrote, if it determines that this would be more efficient.
If you really want to try to rewrite it then:
SELECT c.NAME,
c.prodIdHash
FROM (
SELECT name,
PRODUCT_ID,
mod(PRODUCT_ID, 1000000) As prodIdHash
FROM CANSA_TABLE
) c
INNER JOIN CUSER_TABLE u
ON ( c.PRODUCT_ID = u.PRODUCT_ID
AND c.prodIdHash = u.PRODUCT_HASH )
However, the SQL engine may rewrite the query and push the function to the outer scope so you may need a seemingly irrelevant filter condition to materialize the inner query and force the calculation not to be rewritten:
SELECT c.NAME,
c.prodIdHash
FROM (
SELECT name,
PRODUCT_ID,
mod(PRODUCT_ID, 1000000) As prodIdHash
FROM CANSA_TABLE
WHERE ROWNUM > 0
) c
INNER JOIN CUSER_TABLE u
ON ( c.PRODUCT_ID = u.PRODUCT_ID
AND c.prodIdHash = u.PRODUCT_HASH )
However, this really seems like a case of premature optimisation. You should check if there is actually a problem first before you try and apply an optimisation that probably is not needed.
You can use a derived table (i.e. a subquery in the FROM clause):
SELECT dt.NAME, dt.prodIdHash
FROM
(SELECT
cansa1.NAME,
cansa1.PRODUCT_ID,
mod(cansa1.PRODUCT_ID, 1000000) prodIdHash
FROM CANSA_TABLE cansa1) dt
INNER JOIN CUSER_TABLE cuser1 ON dt.PRODUCT_ID = cuser1.PRODUCT_ID
AND dt.prodIdHash = cuser1.PRODUCT_HASH
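A runnable sketch of the derived-table approach, using Python's built-in sqlite3 module rather than Oracle (an assumption made purely so the example can be executed). The % operator stands in for mod(), and the rows are invented for illustration. Note that the derived table has to expose PRODUCT_ID as well as the computed alias, because the outer join condition refers to both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CANSA_TABLE (NAME TEXT, PRODUCT_ID INTEGER);
    CREATE TABLE CUSER_TABLE (PRODUCT_ID INTEGER, PRODUCT_HASH INTEGER);
    INSERT INTO CANSA_TABLE VALUES ('widget', 3000042), ('gadget', 5000007);
    INSERT INTO CUSER_TABLE VALUES (3000042, 42), (5000007, 999);
""")

# The hash is computed once inside the derived table and then referenced
# by its alias in both the select list and the join condition.
rows = conn.execute("""
    SELECT dt.NAME, dt.prodIdHash
    FROM (SELECT NAME,
                 PRODUCT_ID,
                 PRODUCT_ID % 1000000 AS prodIdHash
          FROM CANSA_TABLE) dt
    INNER JOIN CUSER_TABLE cu
        ON dt.PRODUCT_ID = cu.PRODUCT_ID
       AND dt.prodIdHash = cu.PRODUCT_HASH
""").fetchall()
print(rows)
```

Only 'widget' is returned here, because the stored hash for 'gadget' (999) does not match its computed hash (7).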

Recursive SQL function and change tracking don't work together

I have a recursive function which lets me pass in any GUID in the hierarchy and pulls back all the values below it. This is used for folder security.
ALTER FUNCTION dbo.ValidSiteClass
(
@GUID UNIQUEIDENTIFIER
)
RETURNS TABLE
AS
RETURN
(
-- Add the SELECT statement with parameter references here
WITH previous
AS ( SELECT
PK_SiteClass,
FK_Species_SiteClass,
CK_ParentClass,
ClassID,
ClassName,
Description,
SyncKey,
SyncState
FROM
dbo.SiteClass
WHERE
PK_SiteClass = @GUID
UNION ALL
SELECT
Cur.PK_SiteClass,
Cur.FK_Species_SiteClass,
Cur.CK_ParentClass,
Cur.ClassID,
Cur.ClassName,
Cur.Description,
Cur.SyncKey,
Cur.SyncState
FROM
dbo.SiteClass Cur,
previous
WHERE
Cur.CK_ParentClass = previous.PK_SiteClass)
SELECT DISTINCT
previous.PK_SiteClass,
previous.FK_Species_SiteClass,
previous.CK_ParentClass,
previous.ClassID,
previous.ClassName,
previous.Description,
previous.SyncKey,
previous.syncState
FROM
previous
)
I have a stored procedure which later needs to figure out which folders have changed in the user's hierarchy, which I use for change tracking. When I try to join the function with my change tracking, the query never returns. For example, the following never returns any results (it just spins; I stop it after 6 minutes):
DECLARE @ChangeTrackerNumber INT = 13;
DECLARE @SelectedSchema UNIQUEIDENTIFIER = '36EC6589-8297-4A82-86C3-E6AAECCC7D95';
WITH validones AS (SELECT PK_SITECLASS FROM ValidSiteClass(@SelectedSchema))
SELECT SiteClass.PK_SiteClass KeyGuid,
'' KeyString,
dbo.GetChangeOperationEnum(SYS_CHANGE_OPERATION) ChangeOp
FROM dbo.SiteClass
INNER JOIN CHANGETABLE(CHANGES SiteClass, @ChangeTrackerNumber) tracking --tracking
ON tracking.PK_SiteClass = SiteClass.PK_SiteClass
INNER JOIN validones
ON SiteClass.PK_SiteClass = validones.PK_SiteClass
WHERE SyncState IN ( 0, 2, 4 );
The only way I can make this work is with a temp table, such as:
DECLARE @ChangeTrackerNumber INT = 13;
DECLARE @SelectedSchema UNIQUEIDENTIFIER = '36EC6589-8297-4A82-86C3-E6AAECCC7D95';
CREATE TABLE #temptable
(
[PK_SiteClass] UNIQUEIDENTIFIER
);
INSERT INTO #temptable
(
PK_SiteClass
)
SELECT PK_SiteClass
FROM dbo.ValidSiteClass(@SelectedSchema);
SELECT SiteClass.PK_SiteClass KeyGuid,
'' KeyString,
dbo.GetChangeOperationEnum(SYS_CHANGE_OPERATION) ChangeOp
FROM dbo.SiteClass
INNER JOIN CHANGETABLE(CHANGES SiteClass, @ChangeTrackerNumber) tracking --tracking
ON tracking.PK_SiteClass = SiteClass.PK_SiteClass
INNER JOIN #temptable
ON SiteClass.PK_SiteClass = #temptable.PK_SiteClass
WHERE SyncState IN ( 0, 2, 4 );
DROP TABLE #temptable;
In other words, the CTE doesn't work and I need to use the temp table.
First question: isn't a CTE supposed to be the same thing as a temp table (but better)?
Second question: does anyone know why this could be so? I have tried inner joins and also a WHERE ... IN clause. Is there something different about a recursive query that might cause this odd behaviour?
Generally, when you have a table-valued function, you'd just include it like it was a regular table (assuming you have a parameter to pass to it). If you want to pass a series of parameters to it, you'd use outer apply, but that doesn't seem to be the case here.
I think (maybe) this is more like you want (notice no with clause):
select
s.PK_SiteClass KeyGuid,
'' KeyString,
dbo.GetChangeOperationEnum(c.SYS_CHANGE_OPERATION) ChangeOp
from
dbo.ValidSiteClass(@SelectedSchema) v
inner join
SiteClass s
on
s.PK_SiteClass = v.PK_SiteClass
inner join
changetable(changes SiteClass, @ChangeTrackerNumber) c
on
c.PK_SiteClass = s.PK_SiteClass
where
SyncState in ( 0, 2, 4 )
option (force order)
...which I'll admit doesn't look that mechanically different than what you have with the with clause. However, you could be running into an issue with SQL Server just picking a horrible plan not having any other clues. Including the option (force order) makes SQL Server perform the joins according to the order you put them in...and sometimes this makes an incredible difference.
I wouldn't say this is recommended. In fact, it's a hack...just to see WTF. But, play around with the order...and get SQL Server to show you the actual execution plans to see why it might have come up with something so heinous. An inline table-valued function is visible to SQL Server's query plan engine, and it may decide to not treat the function as an isolated thing the way programmers traditionally think about functions. I suspect this is why it took so long to begin with.
Funnily enough, if your function were a so-called multi-statement table-valued function, SQL Server would definitely not have the same visibility into it when planning this query, and it might run faster. Again, not a recommendation, just something that might hack a better plan.
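The temp-table workaround from the question (materialise the recursive result first, then join against it) can be sketched in runnable form. This uses Python's built-in sqlite3 module instead of SQL Server, integer keys instead of GUIDs, and a plain Changes table as a hypothetical stand-in for CHANGETABLE; all of these are assumptions made only so the example executes. It illustrates the shape of the workaround, not SQL Server's planner behaviour.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SiteClass (PK_SiteClass INTEGER PRIMARY KEY, CK_ParentClass INTEGER);
    INSERT INTO SiteClass VALUES (1, NULL), (2, 1), (3, 2), (4, NULL), (5, 4);
    -- Hypothetical stand-in for the change-tracking result set.
    CREATE TABLE Changes (PK_SiteClass INTEGER);
    INSERT INTO Changes VALUES (3), (5);
    CREATE TEMP TABLE validones (PK_SiteClass INTEGER);
""")

# Step 1: walk the hierarchy below the chosen root and store the keys.
conn.execute("""
    INSERT INTO validones
    WITH RECURSIVE previous(PK_SiteClass) AS (
        SELECT PK_SiteClass FROM SiteClass WHERE PK_SiteClass = ?
        UNION ALL
        SELECT cur.PK_SiteClass
        FROM SiteClass cur
        JOIN previous ON cur.CK_ParentClass = previous.PK_SiteClass
    )
    SELECT PK_SiteClass FROM previous
""", (1,))

# Step 2: join the already-materialised keys to the changed rows.
changed = conn.execute("""
    SELECT s.PK_SiteClass
    FROM SiteClass s
    JOIN Changes c ON c.PK_SiteClass = s.PK_SiteClass
    JOIN validones v ON v.PK_SiteClass = s.PK_SiteClass
    ORDER BY s.PK_SiteClass
""").fetchall()
print(changed)
```

Row 5 has changed too, but it sits under root 4, so only row 3 is reported for root 1.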

Check the query efficiency

I would like an opinion on the SQL query below: can I improve it using temp tables or something else, or is it good enough as it is? Basically, I am just feeding the result set from each inner query to the outer one.
SELECT S.SolutionID
,S.SolutionName
,S.Enabled
FROM dbo.Solution S
WHERE s.SolutionID IN (
SELECT DISTINCT sf.SolutionID
FROM dbo.SolutionToFeature sf
WHERE sf.SolutionToFeatureID IN (
SELECT sfg.SolutionToFeatureID
FROM dbo.SolutionFeatureToUsergroup SFG
WHERE sfg.UsergroupID IN (
SELECT UG.UsergroupID
FROM dbo.Usergroup UG
WHERE ug.SiteID = @SiteID
)
)
)
It's going to depend largely on the indexes you have on those tables. Since you are only selecting data out of the Solution table, you can put everything else in an exists clause, do some proper joins, and it should perform better.
The exists clause will allow you to remove the distinct you have on the SolutionToFeature table. Distinct will cause a performance hit because it is basically creating a temp table behind the scenes to do the comparison on whether or not the record is unique against the rest of the result set. You take a pretty big hit as your tables grow.
It will look something similar to what I have below, but without sample data or anything I can't tell if it's exactly right.
Select S.SolutionID, S.SolutionName, S.Enabled
From dbo.Solution S
Where Exists (
select 1
from dbo.SolutionToFeature sf
Inner Join dbo.SolutionFeatureToUsergroup SFG on sf.SolutionToFeatureID = SFG.SolutionToFeatureID
Inner Join dbo.Usergroup UG on sfg.UsergroupID = UG.UsergroupID
Where S.SolutionID = sf.SolutionID
and UG.SiteID = @SiteID
)
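To check that the EXISTS rewrite returns the same rows as the nested IN version, here is a small sketch using Python's built-in sqlite3 module (an assumption for runnability; the real tables live in SQL Server, and the sample rows are invented). Solution 1 is reachable through two features on purpose, to show that EXISTS does not produce duplicates even without DISTINCT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Solution (SolutionID INTEGER, SolutionName TEXT, Enabled INTEGER);
    CREATE TABLE SolutionToFeature (SolutionToFeatureID INTEGER, SolutionID INTEGER);
    CREATE TABLE SolutionFeatureToUsergroup (SolutionToFeatureID INTEGER, UsergroupID INTEGER);
    CREATE TABLE Usergroup (UsergroupID INTEGER, SiteID INTEGER);
    INSERT INTO Solution VALUES (1, 'Alpha', 1), (2, 'Beta', 1), (3, 'Gamma', 1);
    INSERT INTO SolutionToFeature VALUES (10, 1), (11, 1), (12, 2), (13, 3);
    INSERT INTO SolutionFeatureToUsergroup VALUES (10, 100), (11, 100), (12, 200), (13, 101);
    INSERT INTO Usergroup VALUES (100, 7), (101, 7), (200, 8);
""")

nested_in = conn.execute("""
    SELECT S.SolutionID, S.SolutionName, S.Enabled
    FROM Solution S
    WHERE S.SolutionID IN (
        SELECT DISTINCT sf.SolutionID
        FROM SolutionToFeature sf
        WHERE sf.SolutionToFeatureID IN (
            SELECT sfg.SolutionToFeatureID
            FROM SolutionFeatureToUsergroup sfg
            WHERE sfg.UsergroupID IN (
                SELECT ug.UsergroupID FROM Usergroup ug WHERE ug.SiteID = ?)))
    ORDER BY S.SolutionID
""", (7,)).fetchall()

with_exists = conn.execute("""
    SELECT S.SolutionID, S.SolutionName, S.Enabled
    FROM Solution S
    WHERE EXISTS (
        SELECT 1
        FROM SolutionToFeature sf
        INNER JOIN SolutionFeatureToUsergroup sfg
            ON sf.SolutionToFeatureID = sfg.SolutionToFeatureID
        INNER JOIN Usergroup ug ON sfg.UsergroupID = ug.UsergroupID
        WHERE S.SolutionID = sf.SolutionID
          AND ug.SiteID = ?)
    ORDER BY S.SolutionID
""", (7,)).fetchall()
print(nested_in, with_exists)
```

This only checks that the two forms agree on the result; whether one is faster still depends on the indexes and the plan chosen by the engine.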

Declare a variable inside of a projection

In the query below I need to declare a variable that holds the result of another query. If I don't, I have to repeat that query everywhere I need the value.
SQL Server throws an error saying that DECLARE cannot be written inside a SELECT. What can I do, or what am I missing?
SELECT A.StudentId,
(
CASE WHEN (SELECT B.OverwrittenScore
FROM dbo.OverwrittenScores AS B
WHERE B.StudentId = A.StudentId AND B.AssignmentId = @assignmentId) IS NOT NULL
THEN (SELECT B.OverwrittenScore
FROM dbo.OverwrittenScores AS B
WHERE B.StudentId = A.StudentId AND B.AssignmentId = @assignmentId)
ELSE (-- ANOTHER QUERY, BY THE MOMENT: SELECT 0
) END
) AS FinalScore
FROM dbo.Students AS A
Inside the parentheses I need to implement some logic; I may need to run another two queries there.
I wondered whether I could use the BEGIN keyword here, but it didn't work.
You don't need all that craziness. There are a lot of conceptual problems with what you're trying to do.
1. You can't declare variables in the middle of a query.
2. Scalar variables can only hold one value.
3. Scalar variables in SQL Server always begin with @. Cursor variables can be plain identifiers, but you definitely don't want a cursor here.
4. A simple JOIN will do what you're looking for. The subquery method works but is awkward (sticking queries in the SELECT statement), can't pull more than one column value, and can't be reused throughout the query like a JOIN can.
5. You can use a CASE statement directly on a column. There is no need to try to put the value into a variable first. And that wouldn't work anyway (see #2).
6. You can use the IsNull or Coalesce functions to turn a NULL into a 0 with simpler syntax.
7. I encourage you to use aliases that hint at the tables instead of using A and B. For example, S for Students and O for OverwrittenScores.
Taking all those points into consideration, you can do something like this instead:
SELECT
S.StudentId,
OverwrittenScore = Coalesce(O.OverwrittenScore, 0)
FROM
dbo.Students S
LEFT JOIN dbo.OverwrittenScores O
ON S.StudentId = O.StudentID
AND O.AssignmentId = @assignmentId
LEFT JOIN dbo.SomeOtherTable T -- add another join here if you like
ON S.StudentID = T.StudentID
AND O.OverwrittenScore IS NULL
UPDATE
I added another LEFT JOIN for you above. Do you see how it joins on the condition that O.OverwrittenScore IS NULL? This seems to me like it will probably do what you want.
Again, if you provide more detail, I can expand the answer.
Also, for what it's worth, your edit to your post is overcomplicated. If you were going to write your query that way, it would be better like this:
SELECT
S.StudentId,
FinalScore =
Coalesce(
(SELECT O.OverwrittenScore
FROM dbo.OverwrittenScores O
WHERE
S.StudentId = O.StudentId
AND O.AssignmentId = @assignmentId
),
(SELECT SomethingElse FROM SomewhereElse),
0
)
FROM dbo.Students S
I also encourage you when writing correlations or joins to put the other or outer table first in the join (as in S.StudentId = O.StudentId instead of O.StudentId = S.StudentId). I suggest this because it helps you understand the join faster, since you already know the local table and want to know the outer table, and thus your eye doesn't have to scan as far. I also recommend putting multiple conditions on separate lines. I promise you that you will be able to understand your own queries faster in the future if you get in the habit of doing this.
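The LEFT JOIN plus Coalesce pattern can be tried with a few invented rows. This sketch uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption for runnability) and leaves out the optional second join. The assignment filter sits in the ON clause rather than in WHERE, so students without an overwritten score for that assignment are kept and fall back to 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students (StudentId INTEGER);
    CREATE TABLE OverwrittenScores (StudentId INTEGER, AssignmentId INTEGER, OverwrittenScore INTEGER);
    INSERT INTO Students VALUES (1), (2), (3);
    -- Student 1 has an override for assignment 5; student 2 only for assignment 6.
    INSERT INTO OverwrittenScores VALUES (1, 5, 88), (2, 6, 70);
""")

scores = conn.execute("""
    SELECT S.StudentId,
           COALESCE(O.OverwrittenScore, 0) AS FinalScore
    FROM Students S
    LEFT JOIN OverwrittenScores O
        ON S.StudentId = O.StudentId
       AND O.AssignmentId = ?
    ORDER BY S.StudentId
""", (5,)).fetchall()
print(scores)
```

Moving the AssignmentId condition into a WHERE clause would drop students 2 and 3 from the result, because their joined columns are NULL.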

ORA-904 When referencing the base table, in a query that includes a nested table and other joins

I've distilled down a problem query to the following example, and I can't find anything on metalink indicating that this is a known bug, or expected behaviour.
The following script is self contained so anyone could reproduce this, I'm using Oracle 11.2.0.3.0 enterprise 64-bit running on RHEL, queries being submitted via SQL Developer, the same results occur in SQL Plus.
First : create the nested table type.
create type nested_type_test as table of varchar(32);
Second : create a table using the type as the nested type.
create table nested_table_test (
uuid varchar(32)
, some_values nested_type_test
)
NESTED TABLE some_values STORE AS nested_some_values;
Third : create an arbitrary table that we will use to create a join statement with.
create table join_table (
uuid varchar(32)
);
Now for some simple SQL statements:
select *
from nested_table_test ntt, table(some_Values) sv;
select *
from nested_table_test ntt, table(some_Values) sv
where ntt.uuid = 'X';
Both work. I haven't inserted data, but for the purposes of this bug / test we do not need to, since the issue is a parse issue. In the next statement, I add the join to the arbitrary table I created for this purpose.
select *
from nested_table_test ntt, table(some_Values) sv
inner join join_table jt on jt.uuid = ntt.uuid;
This now produces : ORA-00904: "NTT"."UUID": invalid identifier - yet I know that the field exists and could reference it in a where clause. Taking out the TABLE(some_values) clause like this,
select *
from nested_table_test ntt
inner join join_table jt on jt.uuid = ntt.uuid;
and the query works, so I know it is isolated to the existence of the nested table clause within the statement.
If I switch to using a manual join instead of an ANSI join, it then parses and executes again.
select *
from nested_table_test ntt, table(some_Values) sv, join_table jt
where jt.uuid = ntt.uuid;
Alternatively, and even more hack-ish:
select *
from nested_table_test ntt, table(some_Values) sv
inner join join_table jt on 1=1
where jt.uuid = ntt.uuid;
Also parses - so it is not the ntt.uuid itself that is the problem, but where it occurs within the statement that the parser seems to struggle with.
Known bug, Unknown bug or expected behaviour?
Edit : Using CROSS JOIN table(some_values) sv causes a seg fault and a trace file on the server whilst dumping the connection - that one for sure is a bug. It's also why I'm not in pure ANSI join syntax.
This is expected behaviour when you mix Oracle comma-join syntax with ANSI join syntax. An ANSI join takes precedence over the comma join (a cross join), so the inner join between table(some_Values) sv and join_table jt is resolved first. At that point ntt has not been joined yet, which is why referencing ntt.uuid in the ON clause raises the error. The fix is to choose one join style and use it throughout. For instance:
select *
from nested_table_test ntt
, table(some_Values) sv
, join_table jt
where jt.uuid = ntt.uuid;
OR
select *
from nested_table_test ntt
cross join table(some_Values) sv
join join_table jt
on (jt.uuid = ntt.uuid)