SQL Server: an update, two datetime fields, and getdate() - sql-server-2005

The requirement is that both fields must be equal. What would you do:
declare @var datetime
set @var = getdate()
update table set f1=@var, f2=@var
or simply
update table set f1=getdate(), f2=getdate()

Definitely the first way, because 2 calls to getdate() will most likely return different values.

Original answer: getdate() seems to behave like rand() and to be evaluated only once per query. This query took more than a minute to return, and all the getdate() values were the same.
select getdate()
from sys.objects s1, sys.objects s2, sys.objects s3
Updated: But when I looked at the query plan for an update of two different columns, I could see the compute scalar operator calling getdate() twice.
I tested doing an update with rand()
CREATE TABLE #t(
[f1] [float] NULL,
[f2] [float] NULL
)
insert into #t values (1,1)
insert into #t values (2,2)
insert into #t values (3,3)
update #t set f1=rand(),f2=rand()
select * from #t
That gives:
f1 f2
---------------------- ----------------------
0.54168308978257 0.574235819564939
0.54168308978257 0.574235819564939
0.54168308978257 0.574235819564939

Actually, this depends on the version of SQL.
GetDate() was a deterministic function prior to SQL 2005. The answer returned was the same value for the duration of the statement.
In SQL 2005 (and onwards), Getdate() is non-deterministic, which means every time you call it you will get a different value.
Since both GetDate() functions will be evaluated before the update starts, IMO they will come back with the same value.
Not knowing the size of your table, its partitions, or the load on your server, I would go with option #1.
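A quick way to convince yourself, as a throwaway sketch (the #check temp table is made up for the test):
declare @var datetime
set @var = getdate()

create table #check (f1 datetime null, f2 datetime null)
insert into #check values (null, null)

-- option 1: one captured value assigned to both columns
update #check set f1 = @var, f2 = @var

-- f1 and f2 are guaranteed identical here
select case when f1 = f2 then 'equal' else 'not equal' end as result from #check

drop table #check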

I'm going to go with something other than performance: readability / communication of intent.
Along those lines, option one is probably better. You are, in effect, telling future developers "I am explicitly setting f1 and f2 to the same DateTime." If the requirements change in the future, and (for some reason) f1 and f2 have to be updated at separate times (or something changes and they get evaluated at different times), you still have the same datetime for both.
In option two, all you're saying is that f1 and f2 have to be updated with the current time of whenever their update operations run. Again, if something changes in your requirements and they have to be evaluated in separate statements for some reason, they won't necessarily be the same value any more.

Related

SQL Server subquery behaviour

I have a case where I want to check whether an integer value is found in a varchar column of a table that holds a mix of values: some can be integers, some are just strings. My first thought was to use a subquery to select only the rows with numeric-esque values. The setup looks like:
CREATE TABLE #tmp (
EmployeeID varchar(50) NOT NULL
)
INSERT INTO #tmp VALUES ('aa1234')
INSERT INTO #tmp VALUES ('1234')
INSERT INTO #tmp VALUES ('5678')
DECLARE @eid int
SET @eid = 5678
SELECT *
FROM (
SELECT EmployeeID
FROM #tmp
WHERE IsNumeric(EmployeeID) = 1) AS UED
WHERE UED.EmployeeID = @eid
DROP TABLE #tmp
However, this fails, with: "Conversion failed when converting the varchar value 'aa1234' to data type int.".
I don't understand why it is still trying to compare @eid to 'aa1234' when I've selected only the rows '1234' and '5678' in the subquery.
(I realize I can just cast @eid to varchar but I'm curious about SQL Server's behaviour in this case)
You can't easily control the order in which things happen once SQL Server looks at the query you wrote and determines the optimal execution plan. It won't always produce a plan that follows the same logic you typed, in the same order.
In this case, in order to find the rows you're looking for, SQL Server has to perform two filters:
identify only the rows that match your variable
identify only the rows that are numeric
It can do this in either order, so this is also valid:
identify only the rows that are numeric
identify only the rows that match your variable
If you look at the properties of this execution plan, you see that the predicate for the match to your variable is listed first (which still doesn't guarantee order of operation), but in any case, due to data type precedence, it has to try to convert the column data to the type of the variable.
Subqueries, CTEs, or writing the query a different way - especially in simple cases like this - are unlikely to change the order SQL Server uses to perform those operations.
You can force evaluation order in most scenarios by using a CASE expression (you also don't need the subquery):
SELECT EmployeeID
FROM #tmp
WHERE EmployeeID = CASE IsNumeric(EmployeeID) WHEN 1 THEN @eid END;
In modern versions of SQL Server (you forgot to tell us which version you use), you can also use TRY_CONVERT() instead:
SELECT EmployeeID
FROM #tmp
WHERE TRY_CONVERT(int, EmployeeID) = @eid;
This is essentially shorthand for the CASE expression, but with the added bonus that it allows you to specify an explicit type, which is one of the downsides of ISNUMERIC(). All ISNUMERIC() tells you is if the value can be converted to any numeric type. The string '1e2' passes the ISNUMERIC() check, because it can be converted to float, but try converting that to an int...
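For instance, a minimal sketch of the '1e2' trap (TRY_CONVERT needs SQL Server 2012+):
select isnumeric('1e2')          as is_numeric, -- 1: it can become a float
       try_convert(float, '1e2') as as_float,   -- 100
       try_convert(int,   '1e2') as as_int      -- NULL: not a valid int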
For completeness, the best solution - if there is an index on EmployeeID - is to just use a variable that matches the column data type, as you suggested.
But even better would be to use a data type that prevents junk data like 'aa1234' from getting into the table in the first place.
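For example, a hedged sketch of the stricter schema (assuming the IDs really are integers):
CREATE TABLE #tmp (
EmployeeID int NOT NULL -- INSERT INTO #tmp VALUES ('aa1234') now fails at write time
)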

T-Sql - Select query in another select query takes long time

I have a procedure with arguments, but calling it takes a very long time. I decided to check what was wrong with my query and concluded that the problem is the Column IN (SELECT [...]) construct.
Both queries return 1500 rows.
First query: 45 seconds
Second query: 0 seconds
1.
declare @FILTER_OPTION int
declare @ID_DISTRIBUTOR type_int_value
declare @ID_DATA_TYPE type_bigint_value
declare @ID_AGGREGATION_TYPE type_int_value
set @FILTER_OPTION = 8
insert into @ID_DISTRIBUTOR values (19)
insert into @ID_DATA_TYPE values (30025)
insert into @ID_AGGREGATION_TYPE values (10)
SELECT * FROM dbo.[DATA] WHERE
[ID_DISTRIBUTOR] IN (select [VALUE] from @ID_DISTRIBUTOR)
AND [ID_DATA_TYPE] IN (select [VALUE] from @ID_DATA_TYPE)
AND [ID_AGGREGATION_TYPE] IN (select [VALUE] from @ID_AGGREGATION_TYPE)
2.
select * FROM dbo.[DATA] WHERE
[ID_DISTRIBUTOR] IN (19)
AND [ID_DATA_TYPE] IN (30025)
AND [ID_AGGREGATION_TYPE] IN (10)
Why is this happening?
How should I create a stored procedure that takes an array of arguments and still runs quickly?
Edit:
Maybe it's a problem with indexes? Indexes are created on these three columns.
For such a large performance difference, I would guess that you have one or more indexes. In particular, if you have an index on (ID_DISTRIBUTOR, ID_DATA_TYPE, ID_AGGREGATION_TYPE), then the second query can make use of the index. SQL Server can recognize that the IN is really = and the query is a simple lookup.
In the first case, SQL Server doesn't "know" that the subqueries really have only one row in them. That requires a different set of optimizations. In particular, the above index cannot be used, because the IN generally optimizes differently from =.
As for what to do: first, look at the execution plans so you can see the difference between the two versions. Then, test the second version with more than one value in the IN lists.
If you can live with just one value for each comparison, then use = rather than IN.
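For example, if each parameter table is guaranteed to hold exactly one row, you could pull the values into scalar variables first; a sketch under that assumption:
declare @dist int, @dtype bigint, @aggr int

select @dist  = [VALUE] from @ID_DISTRIBUTOR
select @dtype = [VALUE] from @ID_DATA_TYPE
select @aggr  = [VALUE] from @ID_AGGREGATION_TYPE

SELECT * FROM dbo.[DATA] WHERE
[ID_DISTRIBUTOR] = @dist
AND [ID_DATA_TYPE] = @dtype
AND [ID_AGGREGATION_TYPE] = @aggr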

Having trouble understanding this query

Basically I can't understand what this query below does:
UPDATE @so_stockmove
SET @total_move_qty = total_move_qty = (
CASE WHEN @so_docdt_id <> so_docdt_id THEN 0
ELSE ISNULL(@total_move_qty, 0)
END
) + ISNULL(move_qty,0),
balance = so_qty - @total_move_qty,
@so_docdt_id = so_docdt_id
I can only guess that it updates, for each row, the columns total_move_qty, balance, and so_docdt_id.
Can someone explain to me in detail what the query means:
UPDATE tbl SET @variable1 = columnA = expression
Update
After reading @MotoGP's comments, I did some digging and found this article by Jeff Moden, where he states the following:
Warning:
Well, sort of. Lots of folks (including some of the "big" names in the SQL world) warn against and, sometimes, outright condemn the method contained in this article as "unreliable" & "unsupported". One fellow MVP even called it an "undocumented hack" on the fairly recent "24 Hours of SQL". Even the very core of the method, the ability to update a variable from row to row, has been cursed in a similar fashion. Worse yet, except for the ability to do 3-part updates (SET @variable = columnname = expression) and to update both variables and columns at the same time, there is absolutely no Microsoft documentation to support the use of this method in any way, shape, or form. In fact, even Microsoft has stated that there is no guarantee that this method will work correctly all the time.
Now, let me tell you that, except for one thing, that's ALL true. The one thing that isn't true is its alleged unreliability. That's part of the goal of the article... to prove its reliability (which really can't be done unless you use it. It's like proving the reliability of the SELECT statement). At the end of the article, make up your own mind. If you decide that you don't want to use such a very old, yet undocumented, feature, then use a Cursor or While loop or maybe even a CLR, because all of the other methods are just too darned slow. Heh... just stop telling me that it's an undocumented hack... I already know that and, now, so do you. ;-)
First edition
Well, this query updates the columns total_move_qty and balance in a table variable called @so_stockmove, and at the same time assigns values to the variables @total_move_qty and @so_docdt_id.
I didn't know it was possible to assign values to more than one target this way in SQL Server (@variable1 = columnA = expression), but apparently it is.
Here is my test:
declare @bla char(1)
declare @tbl table
(
X char(1)
)
insert into @tbl VALUES ('A'),('B'), ('C')
SELECT *
FROM @tbl
UPDATE @tbl
SET @bla = X = 'D'
SELECT *
FROM @tbl
SELECT @bla
Results:
X -- first select before update
----
A
B
C
X -- second select after update
----
D
D
D
---- select the variable value after update
D
It just assigns the value to the variable and updates the column in the same statement.
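Applied to the running-total pattern the original query uses, here is a minimal sketch (hedged: without a clustered index and the hints Moden describes, row order is not guaranteed, which is exactly the caveat quoted above):
declare @running float
set @running = 0

declare @t table (qty float, running_total float)
insert into @t (qty) values (10), (20), (30)

-- the variable accumulates from row to row; each row's column
-- receives the value accumulated so far
update @t set @running = running_total = @running + qty

select qty, running_total from @t
-- expected (in insert order): 10->10, 20->30, 30->60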

Puzzling SQL Server behaviour - results in different formats if there is a 1<>2 expression in the WHERE clause

I have two almost identical SELECT statements. I am running them on a SQL Server 2012 with server collation Danish_Norwegian_CI_AS, and database collation Danish_Norwegian_CI_AS. The database runs in compatibility level set to SQL Server 2005 (90).
I run both of the queries on the same client via a SQL Server 2012 Management Studio. The client is a Windows 8.1 laptop.
The puzzling part is that, although the statements are almost identical, the result sets are different, as shown below (one returns 24-hour time, the other AM/PM, which gets truncated to 'P' in this case). The only difference is the 'and 1<>2' expression in the WHERE clause. I looked up and down, searched on Google, dug as deep as I could, and cannot find an explanation. I tried COLLATE to force conversion; it did not help. If I use 108 to force formatting in the CONVERT call, then the result sets are alike. But not knowing why this happens is eating me alive.
Issue recreated on SqlFiddle, SQL Server 2008:
http://sqlfiddle.com/#!3/a97f8/1
Does someone have an explanation for this?
The DDL and statements below the results can be used to recreate the issue. The script creates a table with two columns, inserts some rows, and runs two selects.
On my machine the sql without the 1<>2 expression returns:
Id StartTime
----------- ---------
2 2:00P
2 2:14P
The sql with the 1<>2 expression returns:
Id StartTime
----------- ---------
2 14:00
2 14:14
if NOT EXISTS (Select * from sysobjects where name = 'timeVarchar')
begin
create table timeVarchar (
Id int not null,
timeTest datetime not null
)
end
if not exists (select * from timeVarchar)
begin
-- delete from timeVarchar
print 'inserting'
insert into timeVarchar (Id, timeTest) values (1, '2014-04-09 11:37:00')
insert into timeVarchar (Id, timeTest) values (2, '1901-01-01 14:00:00')
insert into timeVarchar (Id, timeTest) values (3, '2014-04-05 15:00:00')
insert into timeVarchar (Id, timeTest) values (2, '1901-01-01 14:14:14')
end
select
Id,
convert ( varchar(5), convert ( time, timeTest)) as 'StartTime'
from
timeVarchar
where
Id = 2
select
Id,
convert ( varchar(5), convert ( time, timeTest)) as 'StartTime'
from
timeVarchar
where
Id = 2 and
1 <> 2
I can't answer why this is happening (at least not at the moment), but setting the conversion format explicitly does solve the issue:
select Id,
convert (varchar(5), convert (time, timeTest), 14) as "StartTime"
from timeVarchar
where Id = 2;
select Id,
convert (varchar(5), convert (time, timeTest), 14) as "StartTime"
from timeVarchar
where Id = 2
and 1 <> 2;
Going through the execution plan, the two queries end up very different indeed.
The first one passes 2 as a parameter and (!) does CONVERT_IMPLICIT of the value. The second one passes it as a part of the query itself!
In the end, the actual query that gets to run in the first case actually explicitly does CONVERT(x, y, 0). For US locale, this is not a problem, since 0 is the invariant (~US) culture. But outside of the US, you're suddenly using 0 instead of e.g. 4 (for Germany).
So, definitely, one thing to take from this is that queries that look very much alike could execute completely differently.
The second thing is - always use convert with a specific format. The defaults don't seem to be entirely reliable.
EDIT: Ah, finally fished the thing out of the MSDN:
http://msdn.microsoft.com/en-us/library/ms187928.aspx
In earlier versions of SQL Server, the default style for CAST and
CONVERT operations on time and datetime2 data types is 121 except when
either type is used in a computed column expression. For computed
columns, the default style is 0. This behavior impacts computed
columns when they are created, used in queries involving
auto-parameterization, or used in constraint definitions.
Since the first query is invoked as a parameterized query, it gets the default style 0 rather than 121. This behaviour is fixed in compatibility level 110+ (i.e. SQL Server 2012+): on those servers, the default is always 121.
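To see the two defaults side by side, a minimal sketch (the time type needs SQL Server 2008+):
declare @t time
set @t = '14:00:00'

select convert(varchar(5), @t, 0)   as style_0,  -- '2:00P' (US-style, truncated)
       convert(varchar(5), @t, 121) as style_121 -- '14:00'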
It seems the problem is solved in SQL 2012; see this link:
http://sqlfiddle.com/#!6/a97f8/4
P.S. The sqlfiddle URL you mentioned is running on SQL 2008.

Optimizing stored procedure with multiple "LIKE"s

I am passing in a comma-delimited list of values that I need to compare to the database
Here is an example of the values I'm passing in:
@orgList = "1123, 223%, 54%"
To use the wildcard I think I have to use LIKE, but the query runs a long time and only returns 14 rows (the results are correct, but it's just taking forever, probably because I'm using the join incorrectly).
Can I make it better?
This is what I do now:
declare @tempTable Table (SearchOrg nvarchar(max) )
insert into @tempTable
select * from dbo.udf_split(@orgList) as split
-- this splits the values at the comma and puts them in a table variable
-- then I do a join on the main table and the table variable to do a like on it....
-- but I think it's not right because it's too long.
select something
from maintable gt
join @tempTable tt on gt.org like tt.SearchOrg
where
AYEAR = ISNULL(@year, ayear)
and (AYEAR >= ISNULL(@yearR1, ayear) and ayear <= ISNULL(@yearr2, ayear))
and adate = ISNULL(@Date, adate)
and (adate >= ISNULL(@dateR1, adate) and adate <= ISNULL(@DateR2, adate))
The final result would be all rows where maintable.org is 1123, or starts with 223, or starts with 54.
The reason for my date craziness is that sometimes the stored procedure checks only for a year, sometimes for a year range, sometimes for a specific date, and sometimes for a date range... everything that's not used is passed in as null.
Maybe the problem is there?
Try something like this:
Declare @tempTable Table
(
-- Since the column is a varchar(10), you don't want to use nvarchar here.
SearchOrg varchar(20)
);
INSERT INTO @tempTable
SELECT * FROM dbo.udf_split(@orgList);
SELECT
something
FROM
maintable gt
WHERE
some where statements go here
And
Exists
(
SELECT 1
FROM @tempTable tt
WHERE gt.org Like tt.SearchOrg
)
Such a dynamic query, with optional filters and a LIKE driven by a table (!), is very hard to optimize because almost nothing is statically known. The optimizer has to create a very general plan.
You can do two things to speed this up by orders of magnitude:
Play with OPTION (RECOMPILE). If the compile times are acceptable, this will at least deal with all the optional filters (but not with the LIKE table).
Do code generation and EXEC sp_executesql the code. Build a query with all LIKE clauses inlined into the SQL so that it looks like this: WHERE a LIKE @like0 OR a LIKE @like1 ... (not sure if you need OR or AND). This allows the optimizer to get rid of the join and just execute a normal predicate. A sketch follows.
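A rough sketch of the code-generation idea, using the names from the question (the quote-doubling REPLACE is the minimal escaping; adapt to taste):
declare @sql nvarchar(max)

-- build one LIKE predicate per search value
select @sql = isnull(@sql + N' or ', N'')
            + N'gt.org like N''' + replace(SearchOrg, N'''', N'''''') + N''''
from @tempTable

set @sql = N'select something from maintable gt where (' + @sql + N')'
exec sp_executesql @sql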
Your query may be difficult to optimize. Part of the question is what is in the where clause. You probably want to filter these first, and then do the join using like. Or, you can try to make the join faster, and then do a full table scan on the results.
SQL Server should optimize a like statement of the form 'abc%' -- that is, where the wildcard is at the end. (See here, for example.) So, you can start with an index on maintable.org. Fortunately, your examples meet this criteria. However, if you have '%abc' -- the wildcard comes first -- then the optimization won't work.
For the index to work best, it might also need to take into account the conditions in the where clause. In other words, adding the index is suggestive, but the rest of the query may preclude the use of the index.
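For example, a hedged sketch of such an index (the INCLUDE list is a guess based on the WHERE clause shown earlier):
create index IX_maintable_org on maintable (org) include (ayear, adate)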
And, let me add, the best solution for these types of searches is to use the full text search capability in SQL Server (see here).