Date wise optional parameter searching in SQL Stored Procedure - sql

I want to generate a leave report from my leave application table when I search. The table has LeaveFrom (datetime) and LeaveTo (datetime) columns, and I want to select rows based on these two columns. My search parameters are nullable:
@employeeid, @datefrom, @dateto.
The result must fall between the dates in LeaveFrom and LeaveTo.
I am trying to make a stored procedure for this.
ALTER PROCEDURE [dbo].[SP_GetSpecificLeaveReport]
@empid int = null,
@leavefrom date = null,
@leaveto date = null
AS
BEGIN
SET NOCOUNT ON;
SELECT ela.appliedDate, ela.appliedBy, ela.leaveFrom, ela.leaveTo, ela.noOfDays,
ejd.firstName, ejd.lastName,
ltm.leaveType
from dbo.tblEmployeeLeaveApplication as ela
inner join dbo.tblEmployeeJobDetails as ejd on ela.empId = ejd.recordId
inner join dbo.tblLeaveTypeMaster as ltm on ela.leaveTypeId = ltm.record_Id
where
END

These kinds of queries are called catch-all queries.
There are multiple ways to do this; using IIF as in Mukesh's answer is one of them, but that will only work on SQL Server 2012 or higher.
I would recommend working with a slightly longer WHERE clause for better performance as well as compatibility (this should work with any version, even SQL Server 7):
where (@empid is null or ela.empId = @empid)
and (@leavefrom is null or (
    ela.leaveFrom >= @leavefrom
    and ela.leaveFrom < dateadd(day, 1, @leavefrom)
))
and (@leaveto is null or (
    ela.leaveTo >= @leaveto
    and ela.leaveTo < dateadd(day, 1, @leaveto)
))
Note: Since your database columns are of type datetime but your parameters are of type date, I've used the >= ... < dateadd(day, 1, ...) condition so that all datetime values falling on the specified date are caught.
Also, you might need to cast from date to datetime.
Also, you should be aware that catch-all queries can suffer from poor performance due to query-plan caching.
If you do encounter a performance problem, you might want to add a recompile hint to your query so that each time you execute the stored procedure you get the optimal query plan. Read this article for more details.
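For completeness, a sketch of how the whole procedure could look with this catch-all WHERE clause; the table and column names are taken from the question, and the OPTION (RECOMPILE) hint at the end is only needed if plan caching turns out to be a problem:
ALTER PROCEDURE [dbo].[SP_GetSpecificLeaveReport]
    @empid int = null,
    @leavefrom date = null,
    @leaveto date = null
AS
BEGIN
    SET NOCOUNT ON;
    SELECT ela.appliedDate, ela.appliedBy, ela.leaveFrom, ela.leaveTo, ela.noOfDays,
           ejd.firstName, ejd.lastName,
           ltm.leaveType
    FROM dbo.tblEmployeeLeaveApplication AS ela
    INNER JOIN dbo.tblEmployeeJobDetails AS ejd ON ela.empId = ejd.recordId
    INNER JOIN dbo.tblLeaveTypeMaster AS ltm ON ela.leaveTypeId = ltm.record_Id
    WHERE (@empid IS NULL OR ela.empId = @empid)
      AND (@leavefrom IS NULL OR (ela.leaveFrom >= @leavefrom
                                  AND ela.leaveFrom < DATEADD(DAY, 1, @leavefrom)))
      AND (@leaveto IS NULL OR (ela.leaveTo >= @leaveto
                                AND ela.leaveTo < DATEADD(DAY, 1, @leaveto)))
    OPTION (RECOMPILE); -- optional: forces a fresh plan per execution instead of reusing a cached one
END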

Try adding this to the WHERE clause:
where ela.empId = ISNULL(@empid, ela.empId)
and ela.leaveFrom >= ISNULL(@leavefrom, ela.leaveFrom)
and ela.leaveTo <= ISNULL(@leaveto, ela.leaveTo)

Try the following:
ALTER PROCEDURE [dbo].[SP_GetSpecificLeaveReport]
@empid int = null,
@leavefrom date = null,
@leaveto date = null
AS
BEGIN
SET NOCOUNT ON;
SELECT
ela.appliedDate, ela.appliedBy, ela.leaveFrom, ela.leaveTo, ela.noOfDays,
ejd.firstName, ejd.lastName,
ltm.leaveType
from dbo.tblEmployeeLeaveApplication as ela
inner join dbo.tblEmployeeJobDetails as ejd on ela.empId = ejd.recordId
inner join dbo.tblLeaveTypeMaster as ltm on ela.leaveTypeId = ltm.record_Id
where
1 = case when @empid is null then 1
         when ela.empId = @empid then 1
         else 0 end
and 1 = case when @leavefrom is null then 1
             when ela.leaveFrom >= @leavefrom then 1
             else 0 end
and 1 = case when @leaveto is null then 1
             when ela.leaveTo <= @leaveto then 1
             else 0 end
END

Related

Big difference in Estimated and Actual rows when using a local variable

This is my first post on Stackoverflow so I hope I'm correctly following all protocols!
I'm struggling with a stored procedure in which I create a table variable and fill it with an insert statement using an inner join. The insert itself is simple, but it gets complicated because the inner join is filtered on a local variable. Since the optimizer doesn't have statistics for this variable, my estimated row count gets screwed up.
The specific piece of code that causes trouble:
declare @minorderid int
select @minorderid = MIN(lo.order_id)
from [order] lo with(nolock)
where lo.order_datetime >= @datefrom
insert into #OrderTableLog_initial
(order_id, order_log_id, order_id, order_datetime, account_id, domain_id)
select ot.order_id, lol.order_log_id, ot.order_id, ot.order_datetime, ot.account_id, ot.domain_id
from [order] ot with(nolock)
inner join order_log lol with(nolock)
on ot.order_id = lol.order_id
and ot.order_datetime >= @datefrom
where (ot.domain_id in (1,2,4) and lol.order_log_id not in ( select order_log_id
from dbo.order_log_detail lld with(nolock)
where order_id >= @minorderid
)
or
(ot.domain_id = 3 and ot.order_id not IN (select order_id
from dbo.order_log_detail_spa llds with(nolock)
where order_id >= @minorderid
)
))
order by lol.order_id, lol.order_log_id
The @datefrom local variable is also declared earlier in the stored procedure:
declare @datefrom datetime
if datepart(hour,GETDATE()) between 4 and 9
begin
set @datefrom = '2011-01-01'
end
else
begin
set @datefrom = DATEADD(DAY,-2,GETDATE())
end
I've also tested this with a temporary table instead of a table variable, but nothing changes. However, when I replace the local variable >= @datefrom with a fixed datestamp, my estimates and actuals are almost the same.
ot.order_datetime >= @datefrom: SQL Sentry Plan Explorer
ot.order_datetime >= '2017-05-03 18:00:00.000': SQL Sentry Plan Explorer
I've come to understand that there's a way to fix this by turning this code into dynamic SQL, but I'm not sure how to do that. I would be grateful for any suggestions. Maybe I have to use a completely different approach? Forgive me if I forgot to mention something; this is my first post.
EDIT:
MSSQL version = 11.0.5636
I've also tested with trace flag 2453, but with no success
Best regards,
Peter
Indeed, the behavior you are experiencing is caused by the variables. SQL Server won't store an execution plan for each and every possible input, so for some inputs the cached plan may or may not be optimal.
To answer your explicit question: you'll have to create a string variable, build the query as a string, then execute it.
Some notes before the actual code:
This can be prone to SQL injection (in general)
SQL Server will store the plans separately, meaning they will use more memory and possibly knock out other plans from the cache
Using an imaginary setup, this is what you want to do:
DECLARE @inputDate DATETIME2 = '2017-01-01 12:21:54';
DECLARE @dynamicSQL NVARCHAR(MAX) = CONCAT('SELECT col1, col2 FROM MyTable WHERE myDateColumn = ''', FORMAT(@inputDate, 'yyyy-MM-dd HH:mm:ss'), ''';');
INSERT INTO @myTableVar (col1, col2)
EXEC sp_executesql @stmt = @dynamicSQL;
As an additional note:
You can try to use EXISTS and NOT EXISTS instead of IN and NOT IN (see the sketch below).
You can try to use a temp table (#myTempTable) instead of the table variable and put some indexes on it. Physical temp tables can perform better with large amounts of data, and you can index them. (For more info see: What's the difference between a temp table and table variable in SQL Server? or the official documentation.)
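As a rough illustration of the NOT EXISTS suggestion, the two NOT IN subqueries from the question could be rewritten along these lines (a sketch only, reusing the question's tables and aliases; NOT EXISTS also avoids the surprises NOT IN has when the subquery can return NULLs):
where (ot.domain_id in (1,2,4)
       and not exists (select 1
                       from dbo.order_log_detail lld with(nolock)
                       where lld.order_id >= @minorderid
                         and lld.order_log_id = lol.order_log_id))
   or (ot.domain_id = 3
       and not exists (select 1
                       from dbo.order_log_detail_spa llds with(nolock)
                       where llds.order_id >= @minorderid
                         and llds.order_id = ot.order_id))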

Iterative Union ALL's

I have a large SQL Server 2012 database in which I am querying 3 tables to create a result set of 5 fields.
I want to repeat this query in a WHILE loop and UNION ALL the result sets obtained in each iteration. The iteration is driven by a variable, @this_date, which increments over the past 6 years and stops at today's date.
At each iteration a different result set will be obtained by the SELECT.
So I am trying to code the Stored Procedure as follows:
Declare @the_date as Date,
@to_date as Date
-- I set up the above dates, @the_date being 6 years behind @to_date
-- Want to loop for each day over the 6-year period
WHILE (@the_date <= @to_date)
BEGIN
-- the basic select query looks like this
Select Table1.Field-1, Table2.Field-2 ...
FROM Table1
Inner Join Table2 ...
On ( ..etc.. )
-- the JOIN conditions are based on table attributes which are compared with
-- @the_date to get a different result set each time
-- now move the date up by 1
DateAdd(Day, +1, @the_date)
-- want to concatenate the result sets
UNION ALL
END
The above gives me a syntax error:
Incorrect syntax near the keyword 'Union'.
Any ideas on a solution to my problem would be welcome
- thanks.
Don't use a UNION. You can't in a loop anyway. Instead, store the results of each iteration in a temp table or a table variable and select from that when the loop finishes.
DECLARE @the_date as Date,
@to_date as Date
CREATE TABLE #t (Col1 VARCHAR(100))
WHILE (@the_date <= @to_date)
BEGIN
INSERT #t (Col1) SELECT ... etc
SET @the_date = DATEADD(DAY, 1, @the_date) -- advance the loop date, otherwise the loop never ends
END
SELECT Col1 FROM #t
That said, if you provide some sample data and expected results we might be able to help you with a more efficient set-based solution. You should avoid iterative looping in RDBMS whenever possible.
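To give a rough idea of the set-based alternative: instead of looping a day at a time, the base query can be joined once to a generated list of dates covering the whole range. The sketch below uses placeholder table, column and join names, since the question doesn't show the real ones:
DECLARE @the_date date = DATEADD(YEAR, -6, GETDATE()),
        @to_date  date = GETDATE();
;WITH Dates AS (
    SELECT @the_date AS the_date
    UNION ALL
    SELECT DATEADD(DAY, 1, the_date)
    FROM Dates
    WHERE the_date < @to_date
)
SELECT d.the_date, t1.Field1, t2.Field2           -- placeholder column names
FROM Dates d
INNER JOIN Table1 t1 ON t1.SomeDate = d.the_date  -- placeholder join condition
INNER JOIN Table2 t2 ON t2.Table1Id = t1.Id       -- placeholder join condition
OPTION (MAXRECURSION 0);                          -- the 6-year range exceeds the default limit of 100 recursions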

Stored procedure timing out on particular connection pool

I have a stored procedure which occasionally times out when called from our website (through the website connection pool). Once it has timed out, it stays locked into the time-out until the procedure is recompiled using DROP/CREATE or sp_recompile from a Management Studio session.
While it is timing out, there is no time-out using the same parameters for the same procedure using Management Studio.
Doing an "ALTER PROCEDURE" through Management Studio and (fairly drastically) changing the internal execution of the procedure did NOT clear the time out - it wouldn't clear until a full sp_recompile was run.
The stored procedure ends with OPTION (RECOMPILE)
The procedure calls two functions, which are used ubiquitously throughout the rest of the product. The other procedures which use these functions (in similar ways) all work, even during a period where the procedure in question is timing out.
If anyone can offer any further advice as to what could be causing this time out it would be greatly appreciated.
The stored procedure is as below:
ALTER PROCEDURE [dbo].[sp_g_VentureDealsCountSizeByYear] (
@DateFrom AS DATETIME = NULL
,@DateTo AS DATETIME = NULL
,@ProductRegion AS INT = NULL
,@PortFirmID AS INT = NULL
,@InvFirmID AS INT = NULL
,@SpecFndID AS INT = NULL
) AS BEGIN
-- Returns the stats used for Market Overview
DECLARE @IDs AS IDLIST
INSERT INTO @IDs
SELECT IDs
FROM dbo.fn_VentureDealIDs(@DateFrom,@DateTo,@ProductRegion,@PortFirmID,@InvFirmID,@SpecFndID)
CREATE TABLE #DealSizes (VentureID INT, DealYear INT, DealQuarter INT, DealSize_USD DECIMAL(18,2))
INSERT INTO #DealSizes
SELECT vDSQ.VentureID, vDSQ.DealYear, vDSQ.DealQuarter, vDSQ.DealSize_USD
FROM dbo.fn_VentureDealsSizeAndQuarter(@IDs) vDSQ
SELECT
yrs.Years Heading
,COUNT(vDSQ.VentureID) AS Num_Deals
,SUM(vDSQ.DealSize_USD) AS DealSize_USD
FROM tblYears yrs
LEFT OUTER JOIN #DealSizes vDSQ ON vDSQ.DealYear = yrs.Years
WHERE (
((@DateFrom IS NULL) AND (yrs.Years >= (SELECT MIN(DealYear) FROM #DealSizes))) -- If no minimum year has been passed through, take all years from the first year found to the present.
OR
((@DateFrom IS NOT NULL) AND (yrs.Years >= DATEPART(YEAR,@DateFrom))) -- If a minimum year has been passed through, take all years from that specified to the present.
) AND (
((@DateTo IS NULL) AND (yrs.Years <= (SELECT MAX(DealYear) FROM #DealSizes))) -- If no maximum year has been passed through, take all years up to the last year found.
OR
((@DateTo IS NOT NULL) AND (yrs.Years <= DATEPART(YEAR,@DateTo))) -- If a maximum year has been passed through, take all years up to that year.
)
GROUP BY yrs.Years
ORDER BY Heading DESC
OPTION (RECOMPILE)
END
If you want the SP to be recompiled each time it is executed, you should declare it WITH RECOMPILE; your current syntax recompiles the last SELECT only:
ALTER PROCEDURE [dbo].[sp_g_VentureDealsCountSizeByYear] (
@DateFrom AS DATETIME = NULL
,@DateTo AS DATETIME = NULL
,@ProductRegion AS INT = NULL
,@PortFirmID AS INT = NULL
,@InvFirmID AS INT = NULL
,@SpecFndID AS INT = NULL
) WITH RECOMPILE
I cannot tell which part of your procedure causes the problem. You might try commenting out the final SELECT to see whether populating the temp tables from the table functions is the performance issue; if it is not, then the query itself is the problem. You might rewrite the filter as follows:
WHERE (@DateFrom IS NULL OR yrs.Years >= DATEPART(YEAR,@DateFrom))
AND (@DateTo IS NULL OR yrs.Years <= DATEPART(YEAR,@DateTo))
Or, perhaps better, declare startYear and endYear variables, set them accordingly, and change the WHERE like this:
declare @startYear int
set @startYear = isnull(year(@DateFrom), (SELECT MIN(DealYear) FROM #DealSizes))
declare @endYear int
set @endYear = isnull(year(@DateTo), (SELECT MAX(DealYear) FROM #DealSizes))
...
where yrs.Years between @startYear and @endYear
If WITH RECOMPILE does not solve the problem, and removing the last query does not help either, then you need to check the table functions you use to gather data.
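Since the question mentions that only sp_recompile clears the stuck plan, it may also be worth keeping that manual workaround handy while investigating; this is just the standard system procedure, not a fix for the root cause:
-- Marks the procedure for recompilation, so the next execution gets a fresh plan
EXEC sp_recompile N'dbo.sp_g_VentureDealsCountSizeByYear';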

Stored procedure date parameter filters - Ignore if Null

I am using the following SQL in my stored procedure to not filter by date parameters if they are null.
WHERE (Allocated >= ISNULL(@allocatedStartDate, '01/01/1900')
AND Allocated <= ISNULL(@allocatedEndDate,'01/01/3000'))
AND
(MatterOpened >= ISNULL(@matterOpenedStartDate, '01/01/1900')
AND MatterOpened <= ISNULL(@matterOpenedEndDate, '01/01/3000'))
Will this give any kind of performance hit when dealing with a lot of records?
Is there a better way to do this?
Number of records - around 500k
Or just let the query optimizer have it:
WHERE ( @allocatedStartDate is NULL or Allocated >= @allocatedStartDate ) and
( @allocatedEndDate is NULL or Allocated <= @allocatedEndDate ) and
( @matterOpenedStartDate is NULL or MatterOpened >= @matterOpenedStartDate ) and
( @matterOpenedEndDate is NULL or MatterOpened <= @matterOpenedEndDate )
Note that this is not logically equivalent to your query. The last line uses column MatterOpened, not Allocated, as I assume that was a typographic error.
If performance is really an issue, you may want to consider adding indexes and changing the stored procedure to execute different queries based on the parameters. At least break it into: no filter, filter only on Allocated, filter only on MatterOpened, filter on both columns.
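As a minimal sketch of that split, here is the structure for just the Allocated pair of parameters; the real procedure would extend the same idea to the MatterOpened pair, and dbo.Matters is a placeholder table name since the question never shows the table:
IF @allocatedStartDate IS NULL AND @allocatedEndDate IS NULL
BEGIN
    -- no Allocated filter at all
    SELECT *
    FROM dbo.Matters
    WHERE MatterOpened >= ISNULL(@matterOpenedStartDate, '01/01/1900')
      AND MatterOpened <= ISNULL(@matterOpenedEndDate, '01/01/3000');
END
ELSE
BEGIN
    -- an Allocated range was supplied, so an index on Allocated can be used for a seek
    SELECT *
    FROM dbo.Matters
    WHERE Allocated >= ISNULL(@allocatedStartDate, '01/01/1900')
      AND Allocated <= ISNULL(@allocatedEndDate, '01/01/3000')
      AND MatterOpened >= ISNULL(@matterOpenedStartDate, '01/01/1900')
      AND MatterOpened <= ISNULL(@matterOpenedEndDate, '01/01/3000');
END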
In a lot of cases, dynamic SQL can be better for you instead of trying to rely on the optimizer to cache a good plan for both NULL and non-NULL parameters.
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'SELECT
...
WHERE 1 = 1';
SET @sql = @sql + CASE WHEN @allocatedStartDate IS NOT NULL THEN
' AND Allocated >= ''' + CONVERT(CHAR(8), @allocatedStartDate, 112) + ''''
ELSE '' END;
-- repeat for other clauses
EXEC sp_executesql @sql;
No, it's not fun to maintain, but each variation should get its own plan in the cache. You'll want to test with different settings for "Optimize for ad hoc workloads" and the database-level parameterization settings. Oops, just noticed 2005. Keep those in mind for the future (and for any readers who aren't still stuck on 2005).
Also make sure to use EXEC sp_executesql and not EXEC.
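To illustrate the sp_executesql point: it also lets you keep the dates as real parameters instead of concatenating literals into the string, which removes the injection concern. A rough sketch under the same assumptions as above (@p_allocatedStartDate is just an illustrative parameter name):
DECLARE @sql NVARCHAR(MAX) = N'SELECT ... WHERE 1 = 1';
IF @allocatedStartDate IS NOT NULL
    SET @sql = @sql + N' AND Allocated >= @p_allocatedStartDate';
-- repeat for the other optional filters
EXEC sp_executesql
    @sql,
    N'@p_allocatedStartDate DATETIME',
    @p_allocatedStartDate = @allocatedStartDate;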
Instead of checking whether each variable is null in your query, check them at the beginning of your stored procedure and replace any NULLs with your defaults:
SELECT
@allocatedStartDate = ISNULL(@allocatedStartDate, '01/01/1900'),
@allocatedEndDate = ISNULL(@allocatedEndDate,'01/01/3000'),
@matterOpenedStartDate = ISNULL(@matterOpenedStartDate, '01/01/1900'),
@matterOpenedEndDate = ISNULL(@matterOpenedEndDate, '01/01/3000')
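With the defaults applied up front, the WHERE clause itself can stay plain (a sketch, reusing the question's columns):
WHERE Allocated >= @allocatedStartDate
  AND Allocated <= @allocatedEndDate
  AND MatterOpened >= @matterOpenedStartDate
  AND MatterOpened <= @matterOpenedEndDate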
Maybe something like this:
DECLARE @allocatedStartDate DATETIME=GETDATE()
DECLARE @allocatedEndDate DATETIME=GETDATE()-2
;WITH CTE AS
(
SELECT
ISNULL(@allocatedStartDate, '01/01/1900') AS allocatedStartDate,
ISNULL(@allocatedEndDate,'01/01/3000') AS allocatedEndDate
)
SELECT
*
FROM
YourTable
CROSS JOIN CTE
WHERE (Allocated >= CTE.allocatedStartDate
AND Allocated <= CTE.allocatedEndDate)
AND
(MatterOpened >= CTE.allocatedStartDate
AND MatterOpened <= CTE.allocatedEndDate)

Will index be used when using OR clause in where

I wrote a stored procedure with optional parameters.
CREATE PROCEDURE dbo.GetActiveEmployee
@startTime DATETIME=NULL,
@endTime DATETIME=NULL
AS
SET NOCOUNT ON
SELECT columns
FROM table
WHERE (@startTime is NULL or table.StartTime >= @startTime) AND
(@endTime is NULL or table.EndTime <= @endTime)
I'm wondering whether indexes on StartTime and EndTime will be used?
Yes, they will be used (well, probably; check the execution plan, but I do know that the optionality of your parameters shouldn't make any difference).
If you are having performance problems with your query then it might be a result of parameter sniffing. Try the following variation of your stored procedure and see if it makes any difference:
CREATE PROCEDURE dbo.GetActiveEmployee
@startTime DATETIME=NULL,
@endTime DATETIME=NULL
AS
SET NOCOUNT ON
DECLARE @startTimeCopy DATETIME
DECLARE @endTimeCopy DATETIME
set @startTimeCopy = @startTime
set @endTimeCopy = @endTime
SELECT columns
FROM table
WHERE (@startTimeCopy is NULL or table.StartTime >= @startTimeCopy) AND
(@endTimeCopy is NULL or table.EndTime <= @endTimeCopy)
This disables parameter sniffing (SQL Server using the actual values passed to the SP to optimise the plan). In the past I've fixed some weird performance issues by doing this, though I still can't satisfactorily explain why.
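If you'd rather not duplicate every parameter into a local copy, the documented OPTIMIZE FOR UNKNOWN hint (SQL Server 2008 and later) has a similar effect; this is an alternative to the trick above, not part of the original answer:
SELECT columns
FROM table
WHERE (@startTime is NULL or table.StartTime >= @startTime) AND
      (@endTime is NULL or table.EndTime <= @endTime)
OPTION (OPTIMIZE FOR UNKNOWN) -- plan is built from average statistics rather than the sniffed values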
Another thing that you might want to try is splitting your query into several different statements depending on the NULL-ness of your parameters:
IF @startTime is NULL
BEGIN
IF @endTime IS NULL
SELECT columns FROM table
ELSE
SELECT columns FROM table WHERE table.EndTime <= @endTime
END
ELSE
BEGIN
IF @endTime IS NULL
SELECT columns FROM table WHERE table.StartTime >= @startTime
ELSE
SELECT columns FROM table WHERE table.StartTime >= @startTime AND table.EndTime <= @endTime
END
This is messy, but might be worth a try if you are having problems. The reason it helps is that SQL Server can only have a single execution plan per SQL statement, yet your statement can potentially return vastly different result sets.
For example, if you pass in NULL and NULL you will return the entire table, for which a scan is the optimal plan, whereas if you pass in a small range of dates a row lookup is more likely to be optimal.
With this query as a single statement, SQL Server is forced to choose between these two options, and so the query plan is likely to be sub-optimal in certain situations. By splitting the query into several statements, however, SQL Server can use a different execution plan for each case.
(You could also use the exec function / dynamic SQL to achieve the same thing if you preferred)
There is a great article on dynamic search criteria in SQL. The method I personally use from the article is the X = @X OR @X IS NULL style with OPTION (RECOMPILE) added at the end. The article explains why:
http://www.sommarskog.se/dyn-search-2008.html
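Applied to the procedure in the question, that style would look roughly like this (a sketch; OPTION (RECOMPILE) makes the optimizer build a plan for the actual parameter values on every execution, so the @X IS NULL branches are resolved at compile time):
SELECT columns
FROM table
WHERE (@startTime IS NULL OR table.StartTime >= @startTime)
  AND (@endTime IS NULL OR table.EndTime <= @endTime)
OPTION (RECOMPILE)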
Yes, based on the query provided, indexes on or including the StartTime and EndTime columns can be used.
However, the @variable IS NULL OR ... pattern makes the query non-sargable. If you don't want to use an IF statement (CASE is an expression and cannot be used for control-of-flow logic), dynamic SQL is the next alternative for performant SQL.
IF @startTime IS NOT NULL AND @endTime IS NOT NULL
BEGIN
SELECT columns
FROM TABLE
WHERE starttime >= @startTime
AND endtime <= @endTime
END
ELSE IF @startTime IS NOT NULL
BEGIN
SELECT columns
FROM TABLE
WHERE starttime >= @startTime
END
ELSE IF @endTime IS NOT NULL
BEGIN
SELECT columns
FROM TABLE
WHERE endtime <= @endTime
END
ELSE
BEGIN
SELECT columns
FROM TABLE
END
Dynamically changing searches based on the given parameters is a complicated subject, and doing it one way over another, even with only a very slight difference, can have massive performance implications. The key is to use an index; ignore compact code, ignore worrying about repeating code: you must get a good query execution plan (use an index).
Read this and consider all the methods. Your best method will depend on your parameters, your data, your schema, and your actual usage:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
The Curse and Blessings of Dynamic SQL by Erland Sommarskog
The portion of the above articles that applies to this query is Umachandar's Bag of Tricks, but it is basically defaulting the parameters to sentinel values to eliminate the need for the OR. This will give the best index usage and overall performance:
CREATE PROCEDURE dbo.GetActiveEmployee
@startTime DATETIME=NULL,
@endTime DATETIME=NULL
AS
SET NOCOUNT ON
DECLARE @startTimeCopy DATETIME
DECLARE @endTimeCopy DATETIME
set @startTimeCopy = COALESCE(@startTime,'01/01/1753')
set @endTimeCopy = COALESCE(@endTime,'12/31/9999')
SELECT columns
FROM table
WHERE table.StartTime >= @startTimeCopy AND table.EndTime <= @endTimeCopy
Probably not. Take a look at this blog posting from Tony Rogerson SQL Server MVP:
http://sqlblogcasts.com/blogs/tonyrogerson/archive/2006/05/17/444.aspx
You should at least get the idea that you need to test with credible data and examine the execution plans.
I don't think you can guarantee that the index will be used. It will depend a lot on the size of the table, the columns you are showing, the structure of the index and other factors.
Your best bet is to use SQL Server Management Studio (SSMS) and run the query, and include the "Actual Execution Plan". Then you can study that and see exactly which index or indices were used.
You'll often be surprised by what you find.
This is especially true if there is an OR or IN in the query.