Better way of performing dynamic conditional searching in TSQL?

We have a large-ish query here that has several params, and for each one, the query only differs by one portion of the where clause, like so:
CASE WHEN @IncludeNames = 1 AND @NameFilter IS NULL THEN
(SELECT blah FROM blahBlah
INNER JOIN ...
INNER JOIN ...
INNER JOIN ...
WHERE blahBlah.Id = x.Id)
WHEN @IncludeNames = 1 AND @NameFilter IS NOT NULL THEN
(SELECT blah FROM blahBlah
INNER JOIN ...
INNER JOIN ...
INNER JOIN ...
WHERE blahBlah.Id = x.Id
AND table2.Id = @NameFilter)
It goes on like that for several instances, differing only by one condition on the where clause.
Keep in mind this is in the middle of a larger select.
Is there a good way of cleaning this up, without placing it all into one large concatenated sql string and running exec on it, or using something absurd like multiple stored procs per block, as shown here: http://www.developerfusion.com/article/7305/dynamic-search-conditions-in-tsql/7/
Server is SQL Server 2008 R2. TIA!

Try setting up your query so that each clause accepts either a specific value or an "all" default, e.g.
SELECT x.*
FROM x
WHERE (x.id = @NameFilter
OR @NameFilter IS NULL)
AND (x.typeId = @typeFilter
OR -1 = @typeFilter)
AND (x.date = @date
OR @date IS NULL)
AND (x.someStringType = @someStringType
OR '' = @someStringType)
This should allow you to combine your clauses into a single select statement. Each parameter either applies a filter or has no effect (if set to its default, such as null, empty string or -1).
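To make the "filter or no effect" behaviour concrete, here is a minimal sketch that runs the same WHERE shape against an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here, and the table contents, the `search` helper and its parameter names are made up for illustration.

```python
import sqlite3

# Hypothetical sample data; only the WHERE pattern mirrors the answer above.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE x (id INTEGER, typeId INTEGER, name TEXT);
    INSERT INTO x VALUES (1, 10, 'a'), (2, 10, 'b'), (3, 20, 'c');
""")

def search(name_filter=None, type_filter=-1):
    """Return matching ids; passing a default value switches that filter off."""
    sql = """
        SELECT id FROM x
        WHERE (:name_filter IS NULL OR id = :name_filter)
          AND (:type_filter = -1 OR typeId = :type_filter)
        ORDER BY id
    """
    params = {"name_filter": name_filter, "type_filter": type_filter}
    return [row[0] for row in conn.execute(sql, params)]

print(search())                  # no filters applied
print(search(type_filter=10))    # only the type filter applies
print(search(name_filter=2))     # only the id filter applies
```

On SQL Server this catch-all shape is often combined with OPTION (RECOMPILE) so the plan is built for the actual parameter values; whether that helps depends on the workload.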

Related

Parameters table and conditions

In my current project there is a query where a set of parameters is given and I need to check those parameters against another table. Each of these parameters can be NULL and in this case has to be ignored. What I currently do is the following:
SELECT t.col1,
t.col2,
t.col3,
t.col4,
t.col5,
t.col6,
t.col7,
t.col8
FROM table1 t
INNER JOIN #parameters p ON (p.col1 IS NULL OR p.col1 = t.col1)
AND (p.col2 IS NULL OR p.col2 = t.col2)
AND (p.col3 IS NULL OR p.col3 = t.col3)
AND (p.col4 IS NULL OR p.col4 = t.col4)
AND (p.col5 IS NULL OR p.col5 = t.col5)
AND (p.col6 IS NULL OR p.col6 = t.col6)
AND (p.col7 IS NULL OR p.col7 >= t.col7)
AND (p.col8 IS NULL OR p.col8 <= t.col8)
This means that if a column in the parameters table is NULL it is ignored; otherwise it is compared to the corresponding column in table1. This works but unfortunately is VERY slow. Does anybody know a better solution (other than concatenating a string query)?
It seems you don't have any real criteria that could be used to limit the data in your table, and that kind of structure rarely performs well. As far as I know, there's not much you can do to improve it.
Is any of these columns almost always supplied in the parameters (for all rows) and selective enough to limit the data a lot? If so, you could use a union to do something like this:
SELECT ...
FROM table1 t
INNER JOIN #parameters p ON p.col1 = t.col1 ...
union
SELECT ...
FROM table1 t
INNER JOIN #parameters p ... where p.col1 is NULL
If you're lucky something like that might work.
The other option that comes to mind is to iterate over the rows in the #parameters table, which is probably what you meant by string concatenation: either build dynamic SQL that combines the rows with OR clauses or UNION, or create a temp table (perhaps with an IGNORE_DUP_KEY index) and build and run a dynamic INSERT statement for each row in the parameters table.
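The UNION split suggested above can be checked on a small case. The sketch below uses Python's sqlite3 as a stand-in for SQL Server, reduces the problem to two columns, and uses a made-up `params` table in place of #parameters. Note that UNION removes duplicate rows, so the OR form is wrapped in DISTINCT for the comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE params (col1 INTEGER, col2 TEXT);   -- stand-in for #parameters
    INSERT INTO table1 VALUES (1, 'a'), (1, 'b'), (2, 'a'), (3, 'c');
    INSERT INTO params VALUES (1, NULL), (NULL, 'c');
""")

# Original shape: every column test carries an "IS NULL OR" escape hatch.
OR_JOIN = """
    SELECT DISTINCT t.col1, t.col2
    FROM table1 t
    JOIN params p
      ON (p.col1 IS NULL OR p.col1 = t.col1)
     AND (p.col2 IS NULL OR p.col2 = t.col2)
    ORDER BY 1, 2
"""

# Split on col1: one branch joins on equality (which an index on col1 could
# serve), the other handles the parameter rows where col1 is NULL.
UNION_SPLIT = """
    SELECT t.col1, t.col2
    FROM table1 t
    JOIN params p
      ON p.col1 = t.col1
     AND (p.col2 IS NULL OR p.col2 = t.col2)
    UNION
    SELECT t.col1, t.col2
    FROM table1 t
    JOIN params p
      ON (p.col2 IS NULL OR p.col2 = t.col2)
    WHERE p.col1 IS NULL
    ORDER BY 1, 2
"""

or_rows = conn.execute(OR_JOIN).fetchall()
union_rows = conn.execute(UNION_SPLIT).fetchall()
print(or_rows == union_rows)
```

Whether the split is actually faster depends on the indexes and on how often col1 is supplied, so it is worth comparing execution plans on real data.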

How does SQL Server Update rows with more than one value?

In an update statement for a temp table, how does SQL Server decide which value to use when there are multiple values returned, for example:
UPDATE A
SET A.dte_start_date = table1.dte_start_date
FROM #temp_table A
INNER JOIN table1 ON A.id = table1.id
In this situation the problem is that more than one dte_start_date is returned for each id value in the temp table. There's no index or unique value in the tables I'm working on, so I need to know how SQL Server will choose between the different values.
It is non-deterministic. See the following example for a better understanding. Though it is not exactly the same scenario as the one explained here, it is pretty similar.
When a single value is to be retrieved from the database, use the SET statement with a query to set the variable. For example:
SET @v_user_user_id = (SELECT u.user_id FROM users u WHERE u.login = @v_login);
Reason: Unlike Oracle, SQL Server does not raise an error if more than one row is returned from a SELECT query that is used to populate variables. The SET query above will throw an exception if the subquery returns more than one row, whereas the following will not throw an exception, and the variable will contain an arbitrary one of the returned values.
SELECT @v_user_user_id = u.user_id FROM users u WHERE u.login = @v_login;
It is non-deterministic which value is used if you have a one-to-many relationship.
In SQL Server (>= 2005) I would use a CTE, since it's a readable way to specify what I want using ROW_NUMBER. Another advantage of a CTE is that you can easily change it to do a SELECT instead of an UPDATE (or DELETE) to see what will happen.
Assuming that you want the latest record (according to dte_start_date) for every id:
WITH CTE AS
(
-- rank the candidate dates from table1 per id, newest first
SELECT A.id,
table1.dte_start_date,
rn = ROW_NUMBER() OVER (PARTITION BY A.id
ORDER BY table1.dte_start_date DESC)
FROM #temp_table A
INNER JOIN table1 ON A.id = table1.id
)
UPDATE A
SET A.dte_start_date = CTE.dte_start_date
FROM #temp_table A INNER JOIN CTE ON A.id = CTE.id
WHERE CTE.rn = 1
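As a rough illustration of making the choice explicit, the sketch below ranks the candidate dates with ROW_NUMBER and updates from the top-ranked row only. It uses Python's sqlite3 as a stand-in for SQL Server (window functions need SQLite 3.25 or later), an ordinary table named `temp_table` instead of #temp_table, and a correlated subquery instead of T-SQL's UPDATE ... FROM; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_table (id INTEGER, dte_start_date TEXT);
    CREATE TABLE table1 (id INTEGER, dte_start_date TEXT);
    INSERT INTO temp_table VALUES (1, NULL), (2, NULL), (3, 'unchanged');
    INSERT INTO table1 VALUES
        (1, '2020-01-01'), (1, '2021-06-15'), (2, '2019-03-03');
""")

conn.execute("""
    WITH ranked AS (
        -- rn = 1 marks the latest date for each id
        SELECT id, dte_start_date,
               ROW_NUMBER() OVER (PARTITION BY id
                                  ORDER BY dte_start_date DESC) AS rn
        FROM table1
    )
    UPDATE temp_table
    SET dte_start_date = (SELECT r.dte_start_date
                          FROM ranked r
                          WHERE r.id = temp_table.id AND r.rn = 1)
    -- leave rows without any match in table1 untouched
    WHERE id IN (SELECT id FROM table1)
""")

result = conn.execute(
    "SELECT id, dte_start_date FROM temp_table ORDER BY id"
).fetchall()
print(result)
```

Changing the ORDER BY inside the window (for example to ascending) is the single place where the "which row wins" rule is decided.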

Access query in SQL Server broken

I have a UNION query that was working in a Microsoft Access environment. The error I am getting in SQL Server is: "Each GROUP BY expression must contain at least one column that is not an outer reference". The query is in the below format:
SELECT tblA.ProjectID,
tblB.PersonnelID,
"TeamMember" AS ProjectRole
FROM tblA INNER JOIN tblB ON (tblA.ProjectID = tblB.ProjectID)
AND (tblA.ProjectID = tblB.ProjectID)
GROUP BY tblA.ProjectID, tblB.PersonnelID, "TeamMember"
HAVING ((Not (tblB.PersonnelID) Is Null) AND ((Sum(tblB.Hours))>0))
How to get this query working for SQL Server?
AFAIK HAVING is meant for conditions on aggregate functions like SUM, so I moved the NOT NULL check into the WHERE clause. Also, the quotes need to be single quotes. I also put the join criteria in a single set of parentheses and took the 'TeamMember' literal out of the GROUP BY, since grouping by a constant is what SQL Server reports with that error message. Try this:
SELECT tblA.ProjectID,
tblB.PersonnelID,
'TeamMember' AS ProjectRole
FROM tblA
INNER JOIN
tblB
ON (tblA.ProjectID = tblB.ProjectID
AND tblA.ProjectID = tblB.ProjectID)
WHERE tblB.PersonnelID IS NOT NULL
GROUP BY
tblA.ProjectID
, tblB.PersonnelID
HAVING SUM(tblB.Hours) > 0
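The split between row filters and aggregate filters can be tried out on a few rows. This is only a sketch using Python's sqlite3 in place of SQL Server, with invented sample data; the literal is selected but not grouped, since a constant adds nothing to the grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tblA (ProjectID INTEGER);
    CREATE TABLE tblB (ProjectID INTEGER, PersonnelID INTEGER, Hours REAL);
    INSERT INTO tblA VALUES (1), (2);
    INSERT INTO tblB VALUES
        (1, 100, 5.0), (1, 100, 3.0),  -- kept: total hours 8
        (1, 200, 0.0),                 -- dropped by HAVING: total hours 0
        (2, NULL, 4.0);                -- dropped by WHERE: no PersonnelID
""")

rows = conn.execute("""
    SELECT tblA.ProjectID, tblB.PersonnelID, 'TeamMember' AS ProjectRole
    FROM tblA
    INNER JOIN tblB ON tblA.ProjectID = tblB.ProjectID
    WHERE tblB.PersonnelID IS NOT NULL      -- per-row condition
    GROUP BY tblA.ProjectID, tblB.PersonnelID
    HAVING SUM(tblB.Hours) > 0              -- per-group condition
""").fetchall()
print(rows)
```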

Optimize query in TSQL 2005

I have to optimize this query. Can someone help me fine-tune it so it returns data faster?
Currently the query takes somewhere around 26 to 35 seconds. I also created an index on the Attachment table. Here are my query and index:
SELECT DISTINCT o.organizationlevel, o.organizationid, o.organizationname, o.organizationcode,
o.organizationcode + ' - ' + o.organizationname AS 'codeplusname'
FROM Organization o
JOIN Correspondence c ON c.organizationid = o.organizationid
JOIN UserProfile up ON up.userprofileid = c.operatorid
WHERE c.status = '4'
--AND c.correspondence > 0
AND o.organizationlevel = 1
AND (up.site = 'ALL' OR
up.site = up.site)
--AND (@Dept = 'ALL' OR @Dept = up.department)
AND EXISTS (SELECT 1 FROM Attachment a
WHERE a.contextid = c.correspondenceid
AND a.context = 'correspondence'
AND ( a.attachmentname like '%.rtf' or a.attachmentname like '%.doc'))
ORDER BY o.organizationcode
I can't just change anything in db due to permission issues, any help would be much appreciated.
I believe your headache is coming from this part specifically. A LIKE inside a WHERE EXISTS can be your performance bottleneck.
AND EXISTS (SELECT 1 FROM Attachment a
WHERE a.contextid = c.correspondenceid
AND a.context = 'correspondence'
AND ( a.attachmentname like '%.rtf' or a.attachmentname like '%.doc'))
This can be written as a join instead.
SELECT DISTINCT o.organizationlevel, o.organizationid, o.organizationname, o.organizationcode,
o.organizationcode + ' - ' + o.organizationname AS 'codeplusname'
FROM Organization o
JOIN Correspondence c ON c.organizationid = o.organizationid
JOIN UserProfile up ON up.userprofileid = c.operatorid
LEFT JOIN Attachment a ON a.contextid = c.correspondenceid
AND a.context = 'correspondence'
AND RIGHT(a.attachmentname, 4) IN ('.doc', '.rtf')
....
This eliminates both the LIKE and the WHERE EXISTS. Put your WHERE clause at the bottom. Because it is a left join, a.anycolumn IS NULL means no matching record exists and a.anycolumn IS NOT NULL means a record was found, so WHERE a.anycolumn IS NOT NULL is the equivalent of a true result in the WHERE EXISTS logic.
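To check that the two forms select the same rows, here is a small sketch using Python's sqlite3 as a stand-in for SQL Server. SQLite has no RIGHT function, so `substr(name, -4)` plays that role; the sample rows are invented. The join form needs DISTINCT because one correspondence can have several matching attachments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Correspondence (correspondenceid INTEGER PRIMARY KEY);
    CREATE TABLE Attachment (contextid INTEGER, context TEXT, attachmentname TEXT);
    INSERT INTO Correspondence VALUES (1), (2), (3);
    INSERT INTO Attachment VALUES
        (1, 'correspondence', 'a.doc'),
        (1, 'correspondence', 'b.rtf'),
        (2, 'correspondence', 'c.pdf'),
        (3, 'other',          'd.doc');
""")

EXISTS_FORM = """
    SELECT c.correspondenceid
    FROM Correspondence c
    WHERE EXISTS (SELECT 1 FROM Attachment a
                  WHERE a.contextid = c.correspondenceid
                    AND a.context = 'correspondence'
                    AND (a.attachmentname LIKE '%.rtf'
                         OR a.attachmentname LIKE '%.doc'))
    ORDER BY c.correspondenceid
"""

JOIN_FORM = """
    SELECT DISTINCT c.correspondenceid
    FROM Correspondence c
    LEFT JOIN Attachment a
           ON a.contextid = c.correspondenceid
          AND a.context = 'correspondence'
          AND substr(a.attachmentname, -4) IN ('.doc', '.rtf')
    WHERE a.contextid IS NOT NULL   -- a match was found
    ORDER BY c.correspondenceid
"""

exists_rows = conn.execute(EXISTS_FORM).fetchall()
join_rows = conn.execute(JOIN_FORM).fetchall()
print(exists_rows, join_rows)
```

Equal results say nothing about speed; which form is faster has to be measured on the real tables.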
Edit to add:
Another thought for you...I'm unsure what you are trying to do here...
AND (up.site = 'ALL' OR
up.site = up.site)
So that is WHERE up.site = 'ALL' OR 1=1? Is the OR really needed?
And quickly on RIGHT: RIGHT(column, integer) gives you the characters from the right end of the string (I used 4, so it takes the last 4 characters of the column specified). I've found it runs far faster than a LIKE.
This is true for every row where up.site is not NULL, so you can probably eliminate it (and maybe the join to up):
AND (up.site = 'ALL' OR up.site = up.site)
If you can live with dirty reads, then add WITH (NOLOCK).
And I would try Attachment as a join. It might not help, but it's worth a try. LIKE is relatively expensive, and if it is being evaluated in a loop where it could be done once, that would really help.
Join Attachment a
on a.contextid = c.correspondenceid
AND a.context = 'correspondence'
AND (a.attachmentname LIKE '%.rtf' OR a.attachmentname LIKE '%.doc')
I know there are some people on SO that insist that exists is always faster than a join. And yes it is often faster than a join but not always.
Another approach is to create a #temp table using
CREATE TABLE #Temp (contextid INT PRIMARY KEY CLUSTERED);
insert into #temp
Select distinct contextid
from Attachment
where context = 'correspondence'
AND (attachmentname like '%.rtf' or attachmentname like '%.doc')
order by contextid;
go
select ...
from correspondence c
join #Temp
on #Temp.contextid = c.correspondenceid
go
drop table #temp
Especially if correspondenceid is the primary key or part of the primary key on Correspondence, creating the PK on #Temp will help.
That way you can be sure that the LIKE expression is only evaluated once. If the LIKE is the expensive part and is being evaluated in a loop, it could be tanking the query. I use this a lot where I have a fairly expensive core query and need those results to pick up reference data from multiple tables. If you do a lot of joins, the query optimizer sometimes picks a poor plan, but if you give it a PK-to-PK join it stays fast. The downside is that it takes about 0.5 seconds to create and populate the #temp table.
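The two-step idea (filter once into a keyed temp table, then join key to key) can be sketched as follows. This uses Python's sqlite3 as a stand-in for SQL Server, a temp table named `matched` instead of #Temp, and invented sample rows including a hypothetical `subject` column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Correspondence (correspondenceid INTEGER PRIMARY KEY, subject TEXT);
    CREATE TABLE Attachment (contextid INTEGER, context TEXT, attachmentname TEXT);
    INSERT INTO Correspondence VALUES (1, 'invoice'), (2, 'memo'), (3, 'letter');
    INSERT INTO Attachment VALUES
        (1, 'correspondence', 'a.doc'),
        (1, 'correspondence', 'b.doc'),
        (2, 'correspondence', 'c.pdf'),
        (3, 'correspondence', 'd.rtf');

    -- Step 1: evaluate the LIKE filter once and keep only the distinct keys.
    CREATE TEMP TABLE matched (contextid INTEGER PRIMARY KEY);
    INSERT INTO matched
    SELECT DISTINCT contextid
    FROM Attachment
    WHERE context = 'correspondence'
      AND (attachmentname LIKE '%.rtf' OR attachmentname LIKE '%.doc');
""")

# Step 2: the main query joins primary key to primary key.
rows = conn.execute("""
    SELECT c.correspondenceid, c.subject
    FROM Correspondence c
    JOIN matched m ON m.contextid = c.correspondenceid
    ORDER BY c.correspondenceid
""").fetchall()
print(rows)
```

Because `matched` holds each key at most once, the join cannot multiply rows, so the outer query no longer needs DISTINCT for this part.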

Left Join With Where Clause

I need to retrieve all default settings from the settings table, but also grab the character setting if one exists for character x.
But this query only retrieves the settings where character is 1, not the default settings when the user hasn't set any.
SELECT `settings`.*, `character_settings`.`value`
FROM (`settings`)
LEFT JOIN `character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
WHERE `character_settings`.`character_id` = '1'
So I need something like this:
array(
'0' => array('somekey' => 'keyname', 'value' => 'thevalue'),
'1' => array('somekey2' => 'keyname2'),
'2' => array('somekey3' => 'keyname3')
)
where keys 1 and 2 hold only the default values, while key 0 holds the default value together with the character value.
The where clause is filtering away rows where the left join doesn't succeed. Move it to the join:
SELECT `settings`.*, `character_settings`.`value`
FROM `settings`
LEFT JOIN
`character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1'
When making OUTER JOINs (ANSI-89 or ANSI-92), filtration location matters because criteria specified in the ON clause are applied before the JOIN is made. Criteria against an OUTER JOINed table provided in the WHERE clause are applied after the JOIN is made. This can produce very different result sets. In comparison, it doesn't matter for INNER JOINs whether the criteria are provided in the ON or WHERE clause: the result will be the same.
SELECT s.*,
cs.`value`
FROM SETTINGS s
LEFT JOIN CHARACTER_SETTINGS cs ON cs.setting_id = s.id
AND cs.character_id = 1
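The difference is easy to see on a handful of rows. The sketch below uses Python's sqlite3 as a stand-in for MySQL, with invented settings and a hypothetical `name` column; only the position of the `character_id` condition changes between the two queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE character_settings (setting_id INTEGER, character_id INTEGER, value TEXT);
    INSERT INTO settings VALUES (1, 'volume'), (2, 'theme'), (3, 'language');
    INSERT INTO character_settings VALUES (1, 1, 'loud'), (2, 2, 'dark');
""")

# Condition in WHERE: applied after the join, so unmatched settings are lost.
where_rows = conn.execute("""
    SELECT s.name, cs.value
    FROM settings s
    LEFT JOIN character_settings cs ON cs.setting_id = s.id
    WHERE cs.character_id = 1
    ORDER BY s.id
""").fetchall()

# Condition in ON: applied while matching, so every setting is kept.
on_rows = conn.execute("""
    SELECT s.name, cs.value
    FROM settings s
    LEFT JOIN character_settings cs ON cs.setting_id = s.id
                                   AND cs.character_id = 1
    ORDER BY s.id
""").fetchall()

print(where_rows)
print(on_rows)
```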
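If I understand your question correctly, you want records from the settings table if they have no join across to the character_settings table or if the joined record has character_id = 1. Be aware that a setting whose only character_settings rows belong to other characters is still filtered out by this form.
For that you could do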
SELECT `settings`.*, `character_settings`.`value`
FROM (`settings`)
LEFT OUTER JOIN `character_settings`
ON `character_settings`.`setting_id` = `settings`.`id`
WHERE `character_settings`.`character_id` = '1' OR
`character_settings`.character_id is NULL
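One case worth checking with the `OR ... IS NULL` form is a setting that only has a row for a different character. In the sketch below (Python's sqlite3 as a stand-in for MySQL, invented rows, hypothetical `name` column), 'theme' is customised only by character 2, so its joined row is neither character 1 nor NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE character_settings (setting_id INTEGER, character_id INTEGER, value TEXT);
    INSERT INTO settings VALUES (1, 'volume'), (2, 'theme'), (3, 'language');
    INSERT INTO character_settings VALUES (1, 1, 'loud'), (2, 2, 'dark');
""")

rows = conn.execute("""
    SELECT s.name, cs.value
    FROM settings s
    LEFT OUTER JOIN character_settings cs ON cs.setting_id = s.id
    WHERE cs.character_id = 1 OR cs.character_id IS NULL
    ORDER BY s.id
""").fetchall()

# 'theme' is missing: its only joined row belongs to character 2.
print(rows)
```

If every setting must appear, keeping the `character_id` test in the ON clause avoids this gap.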
You might find it easier to understand by using a simple subquery
SELECT `settings`.*, (
SELECT `value` FROM `character_settings`
WHERE `character_settings`.`setting_id` = `settings`.`id`
AND `character_settings`.`character_id` = '1') AS cv_value
FROM `settings`
The subquery is allowed to return null, so you don't have to worry about JOIN/WHERE in the main query.
Sometimes, this works faster in MySQL, but compare it against the LEFT JOIN form to see what works best for you.
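A quick way to see the subquery form at work is the sketch below, which uses Python's sqlite3 as a stand-in for MySQL with invented rows and a hypothetical `name` column. It assumes there is at most one character_settings row per setting and character; with more than one, MySQL would reject the scalar subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE settings (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE character_settings (setting_id INTEGER, character_id INTEGER, value TEXT);
    INSERT INTO settings VALUES (1, 'volume'), (2, 'theme'), (3, 'language');
    INSERT INTO character_settings VALUES (1, 1, 'loud'), (2, 2, 'dark');
""")

rows = conn.execute("""
    SELECT s.name,
           (SELECT cs.value
            FROM character_settings cs
            WHERE cs.setting_id = s.id
              AND cs.character_id = 1) AS cv_value   -- NULL when nothing matches
    FROM settings s
    ORDER BY s.id
""").fetchall()
print(rows)
```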
SELECT s.*, c.value
FROM settings s
LEFT JOIN character_settings c ON c.setting_id = s.id AND c.character_id = '1'
For this problem, as for many others involving non-trivial left joins such as left-joining on inner-joined tables, I find it convenient and somewhat more readable to split the query with a with clause. In your example,
with settings_for_char as (
select setting_id, value from character_settings where character_id = 1
)
select
settings.*,
settings_for_char.value
from
settings
left join settings_for_char on settings_for_char.setting_id = settings.id;
The way I finally understood the top answer was by realising (following the order of execution of a SQL query) that the WHERE clause is applied to the joined table, thereby filtering out rows of the joined (or output) table that do not satisfy the WHERE condition. Moving the condition to the ON clause, however, applies it to the individual tables prior to joining. This enables the left join to retain rows from the left table even though some column entries of those rows (entries from the right table) do not satisfy the condition.
The result is correct based on the SQL statement. A left join returns all rows from the left table, and only matching values from the right table.
The ID and Name columns are from the left-side table, so they are returned.
Score is from the right table, and 30 is returned because this value relates to Name "Flow". The Score for the other names is NULL because those rows do not relate to Name "Flow".
The below would return the result you were expecting:
SELECT a.*, b.Score
FROM #Table1 a
LEFT JOIN #Table2 b
ON a.ID = b.T1_ID
WHERE 1=1
AND a.Name = 'Flow'
This SQL applies the filter to the left-hand table in the WHERE clause.