Using while loop in T-SQL function - sql

Non-database programmer here. I need to create a T-SQL function that returns the number of workdays between two given dates, and I believe the easiest way to do it is with a while loop. The problem is that as soon as I write something like
while @date < @endDate
begin
end
the statement won't execute, claiming "incorrect syntax near the keyword 'return'" (not very helpful). Where's the problem?
P.S. Full code:
ALTER FUNCTION [dbo].[GetNormalWorkdaysCount] (
    @startDate DATETIME,
    @endDate DATETIME
)
RETURNS INT
AS
BEGIN
    declare @Count INT,
            @CurrDate DATETIME
    set @CurrDate = @startDate
    while (@CurrDate < @endDate)
    begin
    end
    return @Count
END
GO

Unlike some languages, the BEGIN/END pair in SQL Server cannot be empty - they must contain at least one statement.
As to your actual problem: you've said you're not a DB programmer. Most beginners to SQL go down the same route and try to write procedural code to solve the problem.
SQL, however, is a set-based language, so it's usually better to find a set-based solution than to use loops.
In this instance, a calendar table would be a real help. Such a table contains one row for each date, and additional columns indicating useful information for your business (e.g. what you consider to be a working day). It then makes your query for working days look like:
SELECT COUNT(*) from Calendar
where BaseDate >= @StartDate and BaseDate < @EndDate and IsWorkingDay = 1
Populating the Calendar table becomes a one-off exercise, and you can easily populate it with e.g. 30 years' worth of dates.
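As a rough sketch of that one-off exercise, the calendar table could be created and filled along these lines. The table and column names follow the query above; the date range, the Saturday/Sunday rule for non-working days, and the use of sys.all_objects as a row source are assumptions made for illustration. Public holidays would still need to be marked separately.

```sql
-- Hypothetical calendar table: one row per date.
CREATE TABLE dbo.Calendar (
    BaseDate     DATE NOT NULL PRIMARY KEY,
    IsWorkingDay BIT  NOT NULL
);
GO

-- Fill 30 years of dates in one set-based INSERT (no loop).
-- sys.all_objects cross-joined with itself is only a convenient row source.
WITH Numbers AS (
    SELECT TOP (DATEDIFF(DAY, '20100101', '20400101'))
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS INT) AS n
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO dbo.Calendar (BaseDate, IsWorkingDay)
SELECT DATEADD(DAY, n, '20100101'),
       -- 1900-01-01 was a Monday, so offsets 5 and 6 (mod 7) are Saturday and Sunday.
       CASE WHEN DATEDIFF(DAY, '19000101', DATEADD(DAY, n, '20100101')) % 7 >= 5
            THEN 0 ELSE 1 END
FROM Numbers;
GO

-- The function body then shrinks to a single query.
CREATE FUNCTION dbo.GetNormalWorkdaysCount (@startDate DATETIME, @endDate DATETIME)
RETURNS INT
AS
BEGIN
    RETURN (
        SELECT COUNT(*)
        FROM dbo.Calendar
        WHERE BaseDate >= @startDate AND BaseDate < @endDate AND IsWorkingDay = 1
    );
END
GO
```

If the function already exists, as in the question, ALTER FUNCTION would be used instead of CREATE FUNCTION.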

Using a loop within SQL Server is rarely a good idea :)
There are a few better solutions; see the ones already presented on StackOverflow.

Related

Avoiding while loops in SQL when a counter is required

I feel like this is a common problem, but it seems that none of the answers that I have found on SO or other sites seem to address the issue of a while loop with a counter.
Let's say that I am trying to write a stored procedure in SQL that will populate a user's timesheet by inserting a row for each day for the remainder of the month. If the @endMonth variable holds the last day of the month, then I know that I could easily write a while loop and do something along these lines:
WHILE @date <= @endMonth
BEGIN
    -- Do some action with the date, like an insert
    SET @date = DATEADD(d, 1, @date) -- increment the date by one day
END
However, looking at answers here and on other sites leads me to believe that it would be best to avoid using a while loop if at all possible.
So my question is this: is there a way I can implement a loop with a counter in SQL without using the WHILE structure? What technique would I use to go about converting a loop similar to the one I posted? Or with something like this, do I have to bite the bullet and just use a while loop?
As an aside, some of the following questions come close, but none of them seem to quite address the issue of needing a counter as a loop condition. Most of the answers seem to condemn using WHILE loops, but I can't seem to find a general purpose solution as to an alternative.
sql while loop with date counter
SQL Server 2008 Insert with WHILE LOOP (this one was close, but unfortunately for me it only works with an auto increment column)
I have seen many examples of populating data like this.
First you generate the dates from the start date to the end date in a CTE, and then you can insert them into a table.
One way to do it with a CTE:
DECLARE @StartDate DateTime = '2014-06-01'
DECLARE @EndDate DateTime = '2014-06-29'
;WITH populateDates (dates) AS (
    SELECT @StartDate as dates
    UNION ALL
    SELECT DATEADD(d, 1, dates)
    FROM populateDates
    WHERE DATEADD(d, 1, dates)<=@EndDate
)
SELECT *
INTO dbo.SomeTable
FROM populateDates
You could also search the internet for how to populate dates in a SQL table.
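For the timesheet case in the question, a number-based variation of the same idea might look like the sketch below. The @Timesheet table variable, its columns, the user id 42 and the literal dates are all made up for illustration, and sys.all_objects is used only as a handy row source.

```sql
DECLARE @date     DATE = '20140612';  -- hypothetical "today"
DECLARE @endMonth DATE = '20140630';  -- last day of that month
DECLARE @Timesheet TABLE (UserId INT NOT NULL, WorkDate DATE NOT NULL);

-- One row per remaining day of the month, generated without a WHILE loop.
WITH Days AS (
    SELECT TOP (DATEDIFF(DAY, @date, @endMonth) + 1)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS INT) AS n
    FROM sys.all_objects
)
INSERT INTO @Timesheet (UserId, WorkDate)
SELECT 42, DATEADD(DAY, n, @date)
FROM Days;

SELECT UserId, WorkDate FROM @Timesheet ORDER BY WorkDate;
```

Unlike a recursive CTE, this form is not subject to the default limit of 100 recursion levels, which matters once the date range grows beyond roughly 100 days.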
As a general case, you can increment values without using cursors by assigning values and incrementing the variable in the same select, like this:
DECLARE @i INT = 0
DECLARE @table TABLE
(
    ID INT ,
    testfield VARCHAR(5)
)
INSERT INTO @table
( testfield )
VALUES ( 'abcd'),
( 'efgh' ),
( 'ijkl' ),
( 'mnop' )
UPDATE @table
SET @i = ID = @i + 1
SELECT *
FROM @table
I used a sequence, created temporarily.
I needed to do my updates outside of a script context, with plain SQL, and a sequence was the only "counter" I could come up with.

updating date by stored procedure

I have a problem!
My task is to compute the age of the books in my library database and then, using the value column, label some books as too rare, some as rare, and the rest as usual.
My library table ( ... , age- date , value- date)
Note: "age" is a misleading name for the column; "year of publication" would be better. My actual task is to find the age!
So I do this, and my value column does not change :(
create procedure foo
as
declare @bookdate date,
        @currentdate date,
        @diff int
set @currentdate = GETDATE()
select @bookdate = age from books
select @diff = DATEDIFF (yyyy , @bookdate , @currentdate )
Version #1:
UPDATE books SET value = DATEADD(year,@diff, age)
Version #2:
UPDATE books SET value = @diff
P.S. sorry for any mistakes I made, it is my first step in sql, programming at all, and asking for help in English!
To me it sounds like you want something like this (I'm assuming you're using SQL Server as you've used the GETDATE() function):
CREATE PROCEDURE foo
AS
BEGIN
SELECT *
,DATEDIFF(yyyy,age,GETDATE()) AS YearsSincePublication
,CASE WHEN DATEDIFF(yyyy,age,GETDATE()) > 200 THEN 'Too rare'
WHEN DATEDIFF(yyyy,age,GETDATE()) > 100 THEN 'Rare'
ELSE 'Usual'
END AS Value
FROM books
END
Working from the top:
* means all columns from all tables.
The DATEDIFF works out the number of years since publication, and the AS bit names the resulting column (gives it an alias).
The CASE expression is a way to test conditions (if a equals b, do c). The first WHEN checks whether the book is more than 200 years old and, if so, writes 'Too rare'; the second checks for more than 100 years and writes 'Rare'; otherwise it writes 'Usual'. Again, AS is used to name the column Value.
Finally, the table we want our data from is specified: books.
To run the stored procedure once you have created it, simply execute:
EXEC foo

Does SQL Server optimize DATEADD calculation in select query?

I have a query like this on Sql Server 2008:
DECLARE @START_DATE DATETIME
SET @START_DATE = GETDATE()
SELECT * FROM MY_TABLE
WHERE TRANSACTION_DATE_TIME > DATEADD(MINUTE, -1440, @START_DATE)
In the select query above, does SQL Server optimize the query so that the DATEADD result is not calculated again and again? Or is it my own responsibility to store the DATEADD result in a temp variable?
SQL Server functions that are considered runtime constants are evaluated only once. GETDATE() is such a function, and DATEADD(..., constant, GETDATE()) is also a runtime constant. By leaving the actual function call inside the query you let the optimizer see what value will actually be used (as opposed to a variable value sniff) and then it can adjust its cardinality estimations accordingly, possibly coming up with a better plan.
Also read this: Troubleshooting Poor Query Performance: Constant Folding and Expression Evaluation During Cardinality Estimation.
@Martin Smith
You can run this query:
set nocount on;
declare @known int;
select @known = count(*) from sysobjects;
declare @cnt int = @known;
while @cnt = @known
    select @cnt = count(*) from sysobjects where getdate()=getdate()
select @cnt, @known;
In my case, after 22 seconds it hit the boundary case and the loop exited. The important thing is that the loop exited with @cnt equal to zero. One would expect that if getdate() were evaluated per row, we would get a @cnt different from the correct @known count, but not 0. The fact that @cnt is zero when the loop exits shows that each getdate() was evaluated once and then the same constant value was used for every row in the WHERE filtering (matching none). I am aware that one positive example does not prove a theorem, but I think the case is conclusive enough.
Surprisingly, I've found that using GETDATE() inline seems to be more efficient than performing this type of calculation beforehand.
DECLARE @sd1 DATETIME, @sd2 DATETIME;
SET @sd1 = GETDATE();
SELECT * FROM dbo.table
WHERE datetime_column > DATEADD(MINUTE, -1440, @sd1)
SELECT * FROM dbo.table
WHERE datetime_column > DATEADD(MINUTE, -1440, GETDATE())
SET @sd2 = DATEADD(MINUTE, -1440, @sd1);
SELECT * FROM dbo.table
WHERE datetime_column > @sd2;
If you check the plans on those, the middle query will always come out with the lowest cost (but not always the lowest elapsed time). Of course it may depend on your indexes and data, and you should not make any assumptions based on one query that the same pre-emptive optimization will work on another query. My instinct would be to not perform any calculations inline, and instead use the @sd2 variation above... but I've learned that I can't trust my instinct all the time and I can't make general assumptions based on behavior I experience in particular scenarios.
It will be executed just once. You can double-check this in the execution plan (the "Compute Scalar" operator should show an Estimated Number of Executions of 1).

declare variables in sql ce

I am trying to get records from a table within a month in SQL Server Compact Edition.
Here is the SQL query I know:
DECLARE @startDate as DATETIME, @EndDate as DATETIME
@startDate = GetDate();
@ENdDate = DATEADD(m,1,@startDate)
select * from table where (columnname between @startdate and @enddate)
I know that you have to send one script at a time, but how can you declare variables in SQL CE (I guess it doesn't accept DECLARE)?
I think my answer is very late for this question, but hope it will be useful for someone.
You can't declare variables in SQL CE, because only one statement can be used per command, as ErikEJ said in this link.
You need to refactor your script into one big statement, if that's possible!
I will be very happy to hear a better solution.
Take a look at the post How do I populate a SQL Server Compact database?
and see if the tool referenced there may help you.
If you call it through an application (I'm not sure how you read the data),
prepare your query like this:
select * from table where (columnname between ? and ?)
I'm not sure whether you can use the between keyword, though; you may need to change that.
Then add your SqlCeParameter objects like this:
cmd.Parameters.Add(new SqlCeParameter("p1", SqlDbType.DateTime) { Value = myDate });
I'm not familiar with SQL-CE, but I think you're missing some Set statements. Try this:
DECLARE @startDate as DATETIME, @EndDate as DATETIME
Set @startDate = GetDate();
Set @ENdDate = DATEADD(m,1,@startDate)
select * from table where (columnname between @startdate and @enddate)
Update
See the Using Parameters in Queries in SQL CE reference from MSDN. You are correct that Declare is not a valid keyword, so you'll need to run the query as a parameterized version from the application itself.
select * from table where (columnname between ? and ?)

Will index be used when using OR clause in where

I wrote a stored procedure with optional parameters.
CREATE PROCEDURE dbo.GetActiveEmployee
    @startTime DATETIME=NULL,
    @endTime DATETIME=NULL
AS
SET NOCOUNT ON
SELECT columns
FROM table
WHERE (@startTime is NULL or table.StartTime >= @startTime) AND
      (@endTime is NULL or table.EndTime <= @endTime)
I'm wondering whether indexes on StartTime and EndTime will be used?
Yes they will be used (well probably, check the execution plan - but I do know that the optional-ness of your parameters shouldn't make any difference)
If you are having performance problems with your query then it might be a result of parameter sniffing. Try the following variation of your stored procedure and see if it makes any difference:
CREATE PROCEDURE dbo.GetActiveEmployee
    @startTime DATETIME=NULL,
    @endTime DATETIME=NULL
AS
SET NOCOUNT ON
DECLARE @startTimeCopy DATETIME
DECLARE @endTimeCopy DATETIME
set @startTimeCopy = @startTime
set @endTimeCopy = @endTime
SELECT columns
FROM table
WHERE (@startTimeCopy is NULL or table.StartTime >= @startTimeCopy) AND
      (@endTimeCopy is NULL or table.EndTime <= @endTimeCopy)
This disables parameter sniffing (SQL server using the actual values passed to the SP to optimise it) - In the past I've fixed some weird performance issues doing this - I still can't satisfactorily explain why however.
Another thing that you might want to try is splitting your query into several different statements depending on the NULL-ness of your parameters:
IF @startTime is NULL
BEGIN
    IF @endTime IS NULL
        SELECT columns FROM table
    ELSE
        SELECT columns FROM table WHERE table.EndTime <= @endTime
END
ELSE
BEGIN
    IF @endTime IS NULL
        SELECT columns FROM table WHERE table.StartTime >= @startTime
    ELSE
        SELECT columns FROM table WHERE table.StartTime >= @startTime AND table.EndTime <= @endTime
END
This is messy, but might be worth a try if you are having problems - the reason it helps is because SQL server can only have a single execution plan per sql statement, however your statement can potentially return vastly different result sets.
For example, if you pass in NULL and NULL you will return the entire table and the most optimal execution plan, however if you pass in a small range of dates it is more likely that a row lookup will be the most optimal execution plan.
With this query as a single statement SQL server is forced to choose between these two options, and so the query plan is likely to be sub-optimal in certain situations. By splitting the query into several statements however SQL server can have a different execution plan in each case.
(You could also use the exec function / dynamic SQL to achieve the same thing if you preferred)
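A minimal sketch of the dynamic SQL variant mentioned above, using sp_executesql so that the values stay parameterized. The dbo.Employee table and its column names are placeholders, and the two DECLAREd inputs stand in for the procedure's parameters.

```sql
-- Hypothetical inputs; inside the procedure these would be its parameters.
DECLARE @startTime DATETIME = '20140601',
        @endTime   DATETIME = NULL;

-- dbo.Employee and its columns are placeholder names.
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT e.EmployeeId, e.StartTime, e.EndTime FROM dbo.Employee AS e WHERE 1 = 1';

-- Append only the predicates that apply, so each combination
-- of supplied parameters gets its own statement text and plan.
IF @startTime IS NOT NULL SET @sql += N' AND e.StartTime >= @startTime';
IF @endTime   IS NOT NULL SET @sql += N' AND e.EndTime <= @endTime';

-- Values are passed as parameters, never concatenated into the string.
EXEC sys.sp_executesql
     @sql,
     N'@startTime DATETIME, @endTime DATETIME',
     @startTime = @startTime,
     @endTime   = @endTime;
```

Declaring both parameters in the parameter list is fine even when the generated statement references only one of them.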
There is a great article about dynamic search criteria in SQL. The method I personally use from the article is the X=@X or @X IS NULL style with OPTION (RECOMPILE) added at the end. The article explains why:
http://www.sommarskog.se/dyn-search-2008.html
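A minimal sketch of that style applied to the procedure from the question. The dbo.Employee table and its column names are placeholders standing in for the question's "table" and "columns".

```sql
CREATE PROCEDURE dbo.GetActiveEmployee
    @startTime DATETIME = NULL,
    @endTime   DATETIME = NULL
AS
SET NOCOUNT ON;

SELECT e.EmployeeId, e.StartTime, e.EndTime
FROM dbo.Employee AS e                       -- placeholder table name
WHERE (@startTime IS NULL OR e.StartTime >= @startTime)
  AND (@endTime   IS NULL OR e.EndTime   <= @endTime)
-- Compile a fresh plan on every execution, using the actual parameter values.
OPTION (RECOMPILE);
GO
```

The trade-off is a compilation on every call, which is usually acceptable for search-style procedures but may not be for ones that run many times per second.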
Yes, based on the query provided indexes on or including the StartTime and EndTime columns can be used.
However, the [variable] IS NULL OR... makes the query not sargable. If you don't want to use an IF statement (because CASE is an expression, and can not be used for control of flow decision logic), dynamic SQL is the next alternative for performant SQL.
IF @startTime IS NOT NULL AND @endTime IS NOT NULL
BEGIN
    SELECT columns
    FROM TABLE
    WHERE starttime >= @startTime
    AND endtime <= @endTime
END
ELSE IF @startTime IS NOT NULL
BEGIN
    SELECT columns
    FROM TABLE
    WHERE starttime >= @startTime
END
ELSE IF @endTime IS NOT NULL
BEGIN
    SELECT columns
    FROM TABLE
    WHERE endtime <= @endTime
END
ELSE
BEGIN
    SELECT columns
    FROM TABLE
END
Dynamically changing searches based on the given parameters is a complicated subject and doing it one way over another, even with only a very slight difference, can have massive performance implications. The key is to use an index, ignore compact code, ignore worrying about repeating code, you must make a good query execution plan (use an index).
Read this and consider all the methods. Your best method will depend on your parameters, your data, your schema, and your actual usage:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
The Curse and Blessings of Dynamic SQL by Erland Sommarskog
The portion of the above articles that apply to this query is Umachandar's Bag of Tricks, but it is basically defaulting the parameters to some value to eliminate needing to use the OR. This will give the best index usage and overall performance:
CREATE PROCEDURE dbo.GetActiveEmployee
    @startTime DATETIME=NULL,
    @endTime DATETIME=NULL
AS
SET NOCOUNT ON
DECLARE @startTimeCopy DATETIME
DECLARE @endTimeCopy DATETIME
-- yyyymmdd literals are used so the defaults do not depend on the session's date format
set @startTimeCopy = COALESCE(@startTime,'17530101')
set @endTimeCopy = COALESCE(@endTime,'99991231')
SELECT columns
FROM table
WHERE table.StartTime >= @startTimeCopy AND table.EndTime <= @endTimeCopy
Probably not. Take a look at this blog posting from Tony Rogerson SQL Server MVP:
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/05/17/444.aspx
You should at least get the idea that you need to test with credible data and examine the execution plans.
I don't think you can guarantee that the index will be used. It will depend a lot on the size of the table, the columns you are showing, the structure of the index and other factors.
Your best bet is to use SQL Server Management Studio (SSMS) and run the query, and include the "Actual Execution Plan". Then you can study that and see exactly which index or indices were used.
You'll often be surprised by what you find.
This is especially true if there is an OR or IN in the query.