Select Fails With Nonexistent Columns - sql

Executing the following statement with SQL Server 2005 (My tests are through SSMS) results in success upon first execution and failure upon subsequent executions.
IF OBJECT_ID('tempdb..#test') IS NULL
CREATE TABLE #test ( GoodColumn INT )
IF 1 = 0
SELECT BadColumn
FROM #test
What this means is that something is comparing the columns I am accessing in my select statement against the columns that exist on a table when the script is "compiled". For my purposes this is undesirable functionality. My question is whether there is anything that can be done so that this code would execute successfully on every run, or, if that is not possible, perhaps someone could explain why the demonstrated functionality is desirable. The only solutions I currently have are to wrap the select in EXEC or to SELECT *, but I don't like either of those solutions.
Thanks

If you put:
IF OBJECT_ID('tempdb..#test') IS NOT NULL
DROP TABLE #test
GO
At the start, then the problem will go away, as the batch will get parsed before the #test table exists.
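Putting the pieces together, the full batch might look like this (a sketch based on the question's own script):

```sql
-- Drop any leftover copy first, so that when the next batch is
-- parsed, #test does not yet exist and its columns are not checked.
IF OBJECT_ID('tempdb..#test') IS NOT NULL
    DROP TABLE #test
GO

IF OBJECT_ID('tempdb..#test') IS NULL
    CREATE TABLE #test ( GoodColumn INT )

IF 1 = 0
    SELECT BadColumn
    FROM #test
```

Because `#test` is gone when the second batch is compiled, deferred name resolution kicks in and the reference to `BadColumn` is not validated until (unless) the statement actually runs.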
What you're asking is for the system to recognise that "1=0" will always evaluate to false. If it were ever true (which could potentially be the case for most real-life conditions), then you'd probably want to know that you were about to run something that would cause failure.
If you drop the temporary table and then create a stored procedure that does the same:
CREATE PROC dbo.test
AS
BEGIN
IF OBJECT_ID('tempdb..#test') IS NULL
CREATE TABLE #test ( GoodColumn INT )
IF 1 = 0
SELECT BadColumn
FROM #test
END
Then this will happily be created, and you can run it as many times as you like.
Rob

Whether or not this behaviour is "desirable" from a programmer's point of view is debatable of course -- it basically comes down to the difference between statically typed and dynamically typed languages. From a performance point of view, it's desirable because SQL Server needs complete information in order to compile and optimize the execution plan (and also cache execution plans).
In a word, T-SQL is not an interpreted or dynamically typed language, and so you cannot write code like this. Your options are either to use EXEC, or to use another language and embed the SQL queries within it.

This problem is also visible in these situations:
IF 1 = 1
select dummy = GETDATE() into #tmp
ELSE
select dummy = GETDATE() into #tmp
Although the second statement is never executed, the same error occurs.
It seems the query engine's first-level validation ignores all conditional statements.

You say you have problems with subsequent runs, and that is because the object already exists. It is recommended that you drop your temporary tables as soon as you are done with them.
Read more about temporary table performance at:
SQL Server performance.com

Related

Why is a query under an IF statement that is false running?

I have an application that uses a lot of string interpolation for SQL queries. I know it is a SQL injection threat; this is something that both the customer and we know about, and hopefully something we can focus on in the next big refactor. I say that to make sense of the {Root Container.property} tokens that come from a GUI.
I have this query
IF ({Root Container.UserSelectedProduct}=1)
begin
DECLARE @TestNumbers {Root Container.SQLProductType};
INSERT INTO @TestNumbers SELECT * FROM {Root Container.DBTable};
SELECT *
FROM {Root Container.SQLProductFunction} (@TestNumbers)
WHERE [ID] = {Root Container.Level};
end
else
Select 0
Before a user selects a product it looks like this
IF (0=1)     
BEGIN
DECLARE @TestNumbers myDataType;
INSERT INTO @TestNumbers SELECT * FROM [MySchema].[TheWrongTable];
SELECT * FROM [dbo].[myfunction] (@TestNumbers)
WHERE [ID] = 1;
END
ELSE
SELECT 0
Which is giving me the error:
Column name or number of supplied values does not match table definition.
I am aware why this error shows up: the table I am selecting from is not made for that data type.
However, why is it even attempting to compile the first IF clause when I have IF (0=1)? How come this part is not simply skipped so that only the SELECT 0 runs? I would have thought that is how it was supposed to work, but I keep getting the error about the column name/number not matching the table definition. When the user does select a product and I get IF (1=1), and I have the appropriate table/function/data type, it all works smoothly. I just don't know why it throws an error beforehand when the condition is IF (0=1). Why does this happen, and how can I get my intended behavior that everything inside the BEGIN/END under my first IF statement does not run unless the expression is true?
T-SQL is not interpreted. It must make sense regardless of what the runtime conditions are; in fact, it doesn't even do short-circuiting. Your code is invalid, and it doesn't matter that it's unreachable. T-SQL isn't going to ignore a piece of invalid code just because it could be eliminated; that sort of thing is a common source of bugs (e.g. in C++, where it's pretty common with templates).
Just make sure you still get valid SQL for the case where no product is selected; use the wrong table (or a helper table) if you have to.
The answer is simple: SQL code is fully compiled by the server before being executed, so this is basically a compile error. It's a bit like trying to compile the following in C#
if(someBoolWhichIsFalse)
intValue = "hello";
It's simply not valid.
The runtime code has not even been executed, it's still in the parsing and lexing stage. Nothing is being skipped, it just needs to be fully valid code, irrespective of runtime conditions.
This happens in every scope, i.e. on every call to a procedure or ad-hoc batch, that code must be compilable.
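One way to get the intended behaviour, given per-scope compilation, is to push the product-specific statements into their own scope with dynamic SQL (a sketch using the question's own table name; the inner batch is only compiled if the branch actually runs):

```sql
IF (0 = 1)
    -- A separate scope: this string is not parsed or compiled
    -- unless the IF branch is actually taken.
    EXEC ('SELECT * FROM [MySchema].[TheWrongTable]')
ELSE
    SELECT 0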

Detect if SQL statement is correct

Question: Is there any way to detect if an SQL statement is syntactically correct?
Explanation:
I have a very complex application, which, at some point, need very specific (and different) processing for different cases.
The solution was to have a table where there is a record for each condition, and an SQL command that is to be executed.
That table is not accessible to normal users, only to system admins who define those cases when a new special case occurs. So far, a new record was added directly to the table.
However, from time to time there was typos, and the SQL was malformed, causing issues.
What I want to accomplish is to create a UI for managing that module, where to let admins to type the SQL command, and validate it before save.
My idea was to simply run the statement in a try block and then capture the result (the exception, if any), but I'm wondering if there is a more unobtrusive approach.
Any suggestion on this validation?
Thanks
PS. I'm aware of risk of SQL injection here, but it's not the case - the persons who have access to this are strictly controlled, and they are DBA or developers - so the risk of SQL injection here is the same as the risk to having access to Enterprise Manager
You can use SET PARSEONLY ON at the top of the query. Keep in mind that this will only check if the query is syntactically correct, and will not catch things like misspelled tables, insufficient permissions, etc.
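For example, a pure syntax check might look like this (a sketch; note that the missing table on the second line goes undetected, because nothing is executed or bound):

```sql
SET PARSEONLY ON
GO
SELEC * FROM sys.objects    -- syntax error: reported
SELECT * FROM NoSuchTable   -- missing table: NOT reported, parse only
GO
SET PARSEONLY OFF
GO
```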
Looking at the page here, you can modify the stored procedure to take a parameter:
CREATE PROC TestValid @stmt NVARCHAR(MAX)
AS
BEGIN
IF EXISTS (
SELECT 1 FROM sys.dm_exec_describe_first_result_set(@stmt, NULL, 0)
WHERE error_message IS NOT NULL
AND error_number IS NOT NULL
AND error_severity IS NOT NULL
AND error_state IS NOT NULL
AND error_type IS NOT NULL
AND error_type_desc IS NOT NULL )
BEGIN
SELECT error_message
FROM sys.dm_exec_describe_first_result_set(@stmt, NULL, 0)
WHERE column_ordinal = 0
END
END
GO
This will return an error if one exists and nothing otherwise.
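Usage might look like this (hypothetical statements; unlike PARSEONLY, this approach also catches missing objects, since the DMV binds names):

```sql
-- Valid statement: returns no rows
EXEC TestValid N'SELECT name FROM sys.objects'

-- Invalid object: returns one row with the error_message text
EXEC TestValid N'SELECT * FROM NoSuchTable'
```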

Need help with SQL query on SQL Server 2005

We're seeing strange behavior when running two versions of a query on SQL Server 2005:
version A:
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = 1234
ORDER BY name ASC
version B:
DECLARE @Id AS INT;
SET @Id = 1234;
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
Both queries return 1000 rows; version A takes on average 15s; version B on average takes 4s.
Could anyone help us understand the difference in execution times of these two versions of SQL?
If we invoke this query via named parameters using NHibernate, we see the following query via SQL Server profiler:
EXEC sp_executesql N'SELECT otherattributes.* FROM listcontacts JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @id ORDER BY name ASC',
N'@id INT',
@id=1234;
...and this tends to perform as badly as version A.
Try taking a look at the execution plan for your query. This should give you some more insight into how your query is executed.
I've not seen the execution plans, but I strongly suspect that they are different in these two cases. The issue that you are having is that in case A (the faster query) the optimiser knows the value that you are using for the list id (1234) and using a combination of the distribution statistics and the indexes chooses an optimal plan.
In the second case, the optimiser is not able to sniff the value of the ID and so produces a plan that would be acceptable for any passed in list id. And where I say acceptable I do not mean optimal.
So what can you do to improve the scenario? There are a couple of alternatives here:
1) Create a stored procedure to perform the query as below:
CREATE PROCEDURE Foo
@Id INT
AS
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId WHERE listcontacts.listid = @Id
ORDER BY name ASC
GO
This will allow the optimiser to sniff the value of the input parameter when passed in and produce an appropriate execution plan for the first execution. Unfortunately it will cache that plan for reuse later, so unless you generally call the sproc with similarly selective values this may not help you much.
2) Create a stored procedure as above, but specify it to be WITH RECOMPILE. This will ensure that the stored procedure is recompiled each time it is executed and hence produce a new plan optimised for this input value
3) Add OPTION (RECOMPILE) to the end of the SQL Statement. Forces recompilation of this statement, and is able to optimise for the input value
4) Add OPTION (OPTIMIZE FOR (@Id = 1234)) to the end of the SQL statement. This will cause the plan that gets cached to be optimised for this specific input value. Great if this is a highly common value, or most common values are similarly selective, but not so great if the distribution of selectivity is more widely spread.
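Applied to version B of the query, options 3 and 4 would look something like this (a sketch):

```sql
DECLARE @Id AS INT;
SET @Id = 1234;

-- Option 3: recompile this statement each time, using the actual @Id value
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
OPTION (RECOMPILE);

-- Option 4: keep a cached plan, but optimise it for a representative value
SELECT otherattributes.*
FROM listcontacts
JOIN otherattributes ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = @Id
ORDER BY name ASC
OPTION (OPTIMIZE FOR (@Id = 1234));
```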
It's possible that instead of casting 1234 to be the same type as listcontacts.listid and then doing the comparison with each row, it might be casting the value in each row to be the same as 1234. The first requires just one cast, the second needs a cast per row (and that's probably on far more than 1000 rows, it may be for every row in the table). I'm not sure what type that constant will be interpreted as but it may be 'numeric' rather than 'int'.
If this is the cause, the second version is faster because it's forcing 1234 to be interpreted as an int and thus removing the need to cast the value in every row.
However, as the previous poster suggests, the query plan shown in SQL Server Management Studio may indicate an alternative explanation.
The best way to see what is happening is to compare the execution plans, everything else is speculation based on the limited details presented in the question.
To see the execution plan, go into SQL Server Management Studio and run SET SHOWPLAN_XML ON then run query version A, the query will not run but the execution plan will be displayed in XML. Then run query version B and see its execution plan. If you still can't tell the difference or solve the problem, post both execution plans and someone here will explain it.
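The steps described above, as a script (a sketch; run version B the same way and compare the two XML plans):

```sql
SET SHOWPLAN_XML ON
GO
-- Version A: nothing is executed; the estimated plan is returned as XML
SELECT otherattributes.* FROM listcontacts JOIN otherattributes
ON listcontacts.contactId = otherattributes.contactId
WHERE listcontacts.listid = 1234
ORDER BY name ASC
GO
SET SHOWPLAN_XML OFF
GO
```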

Functions in SQL Server 2008

Does sql server cache the execution plan of functions?
Yes, see rexem's Tibor link and Andrew's answer.
However... a simple table-valued function is unnested/expanded into the outer query anyway, like a view. See also my answer (with links) here.
That is, this type:
CREATE FUNCTION dbo.Foo ()
RETURNS TABLE
AS
RETURN (SELECT ...)
GO
According to the dmv yes, http://msdn.microsoft.com/en-us/library/ms189747.aspx but I'd have to run a test to confirm.
Object ID in the output is "ID of the object (for example, stored procedure or user-defined function) for this query plan".
Tested it and yes it does look like they are getting a separate plan cache entry.
Test Script:
create function foo (@a int)
returns int
as
begin
return @a
end
The most basic of functions created.
-- clear out the plan cache
dbcc freeproccache
dbcc dropcleanbuffers
go
-- use the function
select dbo.foo(5)
go
-- inspect the plan cache
select * from sys.dm_exec_cached_plans
go
The plan cache then has 4 entries, the one listed as objtype = Proc is the function plan cache, grab the handle and crack it open.
select * from sys.dm_exec_query_plan(<insertplanhandlehere>)
The first adhoc on my test was the actual query, the 2nd ad-hoc was the query asking for the plan cache. So it definitely received a separate entry under a different proc type to the adhoc query being issued. The plan handle was also different, and when extracted using the plan handle it provides an object id back to the original function, whilst an adhoc query provides no object ID.
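To find the function's entry without eyeballing every row, you can filter the cache by object type and join back to the statement text (a sketch using the same DMVs):

```sql
-- Cached entries of type 'Proc' include stored procedures
-- and user-defined functions; 'Adhoc' entries are the batches themselves.
SELECT cp.objtype, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE cp.objtype = 'Proc'
```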

running conditional DDL statements on sql server

I have a situation where I want to check a certain column (like a version number) and then apply a bunch of DDL changes.
The trouble is I am not able to do it within an IF BEGIN END block, since DDL statements require a GO separator between them, and T-SQL won't allow that.
I am wondering if there is any way around this.
You don't need to use a full block. A conditional will execute the next statement in its entirety if you don't use a BEGIN/END -- including a single DDL statement. This is equivalent to the behavior of if in Pascal, C, etc. Of course, that means that you will have to re-check your condition over and over and over. It also means that using variables to control the script's behavior is pretty much out of the question.
[Edit: CREATE PROCEDURE doesn't work in the example below, so I changed it to something else and moved CREATE PROCEDURE for a more extended discussion below]
If ((SELECT Version FROM table WHERE... ) <= 15)
CREATE TABLE dbo.MNP (
....
)
GO
If ((SELECT Version FROM table WHERE... ) <= 15)
ALTER TABLE dbo.T1
ALTER COLUMN Field1 CHAR(15)
GO
...
Or something like that, depending on what your condition is.
Unfortunately, CREATE/ALTER PROCEDURE and CREATE/ALTER VIEW have special requirements that make it much harder to work with. They are pretty much required to be the only thing in a statement, so you can't combine them with IF at all.
For many scenarios, when you want to "upgrade" your objects, you can work it as a conditional drop followed by a create:
IF(EXISTS(SELECT * FROM sys.objects WHERE type='p' AND object_id = OBJECT_ID('dbo.abc')))
DROP PROCEDURE dbo.abc
GO
CREATE PROCEDURE dbo.abc
AS
...
GO
If you do really need conditional logic to decide what to do, then the only way I know of is to use EXECUTE to run the DDL statements as a string.
If ((SELECT Version FROM table WHERE... ) <= 15)
EXECUTE ('CREATE PROC dbo.abc
AS
....
')
But this is very painful. You have to escape any quotes in the body of the procedure and it's really hard to read.
Depending on the changes that you need to apply, you can see all this can get very ugly fast. The above doesn't even include error checking, which is a royal pain all on its own. This is why hordes of toolmakers make a living by figuring out ways to automate the creation of deployment scripts.
Sorry; there is no easy "right" way that works for everything. This is just something that TSQL supports very poorly. Still, the above should be a good start.
GO is recognised by client tools, not by the server.
You can have CREATEs in your stored procedures or ad-hoc queries with no GOs.
Multiple "IF" statements? You can use them to test for the success of the preceding DDL statements before running subsequent ones.
Dynamic SQL? EXEC ('ALTER TABLE foo WITH CHECK ADD CONSTRAINT ...')?
As mentioned, GO is a client only batch separator to break down a single SQL text block into batches that are submitted to the SQL Server.
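Putting the thread's suggestions together, a version-guarded upgrade script might look like this (a sketch; the table, column, and constraint names are placeholders):

```sql
DECLARE @Version INT
SELECT @Version = Version FROM dbo.SchemaInfo  -- hypothetical version table

IF (@Version <= 15)
BEGIN
    -- Plain DDL runs fine inside IF/BEGIN/END on the server side
    EXEC ('ALTER TABLE foo WITH CHECK ADD CONSTRAINT ck_bar CHECK (bar >= 0)')

    -- CREATE PROCEDURE must be the only statement in its batch,
    -- so it has to go through dynamic SQL to live inside the IF
    EXEC ('CREATE PROC dbo.abc AS SELECT 1')
END
```

Since GO never reaches the server, none of this needs batch separators; the cost is that everything with batch restrictions gets pushed into strings.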