UDF inside an SP performance issue on ms sql

I have the following SP:
CREATE PROCEDURE [dbo].[order_s]
(
@user uniqueidentifier
)
AS
BEGIN
SET NOCOUNT ON;
SELECT
id,
name,
[begin],
[end]
FROM
orders
WHERE
@user = dbo.hasAccess(@user, id, 'select')
END
This SP calls this UDF:
CREATE FUNCTION [dbo].[hasAccess]
(
@user uniqueidentifier,
@orderId bigint,
@AccessType nchar(10)
)
RETURNS uniqueidentifier
AS
BEGIN
DECLARE @Result uniqueidentifier
SELECT
TOP 1 @Result = [user]
FROM
access
WHERE
orderId = @orderId AND
[user] = @user AND
role >= CASE
WHEN @AccessType = 'select' THEN 1
WHEN @AccessType = 'insert' THEN 5
WHEN @AccessType = 'update' THEN 7
WHEN @AccessType = 'delete' THEN 10
END
RETURN @Result
END
My question is: does calling a UDF from an SP cause any performance issues?
Is there a better way to achieve the same functionality?
Thanks for your advice...

Yes, this is a bad use of a scalar UDF. This should perform much better:
SELECT
id,
name,
[begin],
[end]
FROM
orders o
WHERE EXISTS(
SELECT *
FROM access
WHERE orderId = o.id AND [user] = @user AND role >= 1
)
Some discussion on Scalar UDFs and performance here
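The EXISTS rewrite above hard-codes role >= 1, which corresponds to the 'select' case. As a minimal sketch of how the access-type mapping from hasAccess could be kept inside the same set-based query, the following uses Python's sqlite3 module as a stand-in for SQL Server. The sample rows and the helper name orders_for are assumptions made for illustration, and the user column holds plain text rather than a uniqueidentifier.

```python
import sqlite3

# Hypothetical sample data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE access (orderId INTEGER, "user" TEXT, role INTEGER);
    INSERT INTO orders VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO access VALUES (1, 'alice', 1), (2, 'alice', 7), (3, 'bob', 10);
""")

# One set-based statement: the access check is a correlated EXISTS,
# and the CASE maps the access type to the minimum role, as the UDF did.
QUERY = """
    SELECT o.id, o.name
    FROM orders AS o
    WHERE EXISTS (
        SELECT 1
        FROM access AS a
        WHERE a.orderId = o.id
          AND a."user" = :user
          AND a.role >= CASE :access_type
                            WHEN 'select' THEN 1
                            WHEN 'insert' THEN 5
                            WHEN 'update' THEN 7
                            WHEN 'delete' THEN 10
                        END
    )
    ORDER BY o.id
"""


def orders_for(user, access_type):
    """Return the orders the user may touch with the given access type."""
    params = {"user": user, "access_type": access_type}
    return conn.execute(QUERY, params).fetchall()


select_rows = orders_for("alice", "select")
update_rows = orders_for("alice", "update")
print(select_rows)
print(update_rows)
```

An unknown access type makes the CASE yield NULL, so the comparison is never true and no rows come back, which matches the behaviour of the original function.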

As always with performance questions: profile, profile, profile.
Other than that, the only reason I can see for a UDF to cause performance problems is if the UDF itself is particularly inefficient. It should be no less efficient than calling GetDate() in a stored procedure.

Related

Return user information after validate user in stored procedure?

I have a stored procedure to validate a user. Right now, after validation, I return true if the user is valid. How can I return all the details of that user instead of true?
IF (@Flag='Authenticate')
BEGIN
IF EXISTS(SELECT UserID FROM UserInformation WITH(NOLOCK) WHERE UserName=@UserName and Password=@UserPassword)
BEGIN
INSERT INTO UserLoginHistory(UserId,LastLoggedIn)
SELECT @Userid,GETDATE()
select 'true' as [Output]
END
END

Try something like the query below. You can declare more variables as needed and store in them whatever information you want to return.
IF (@Flag='Authenticate')
BEGIN
DECLARE @UserID varchar(50) = NULL
SELECT @UserID = UserID FROM UserInformation WITH(NOLOCK) WHERE UserName=@UserName and Password=@UserPassword
IF (@UserID IS NOT NULL)
BEGIN
INSERT INTO UserLoginHistory(UserId,LastLoggedIn)
SELECT @UserID,GETDATE()
SELECT @UserID
END
END

You don't need a separate "if" to check if the user already exists. You can put that all into a single query:
IF (@Flag = 'Authenticate')
BEGIN
INSERT INTO UserLoginHistory(UserId, LastLoggedIn)
SELECT v.Userid, GETDATE()
FROM (VALUES (@UserId)) v(UserId)
WHERE EXISTS (SELECT 1
FROM UserInformation ui
WHERE ui.UserName = @UserName AND ui.Password = @UserPassword
);
SELECT ui.*
FROM UserInformation ui
WHERE ui.UserName = @UserName AND ui.Password = @UserPassword;
END;
Also, I am concerned about @UserPassword. Hopefully that value is encrypted! You should not have clear-text passwords anywhere in an application that has a user other than you -- even for a demo or course.

SQL server Where Clause variable may be null

I have the following query (SQL server):
DECLARE @UserId INT;
SET @UserId = //... set by dynamic variable
SELECT *
FROM Users
WHERE userId = @UserId
My issue is that if @UserId is null, the query will not evaluate correctly.
How can I write this query so that it evaluates correctly whether the variable is null or not null?
EDIT:
There have been many suggestions to use the following:
WHERE (@UserId IS NULL OR userId = @UserId)
or similar.
In this case, if there is a table of 3 entries with userId values of 1, 2 and 3, and the variable '@UserId' IS NULL, this query will return all 3 entries. What I actually need is for it to return no entries, as none of them has a userId of NULL.

You need to use an OR:
DECLARE @UserId INT;
SET @UserId = //... set by dynamic variable
SELECT *
FROM Users
WHERE (userId = @UserId OR @UserId IS NULL);
This, however, could well have (severe) performance issues if you're writing this in a stored procedure, reusing this code a lot or adding more NULLable parameters. If so, include OPTION (RECOMPILE) in your query so that the query plan is generated each time it's run. This stops the engine from reusing a plan that was generated for a different set of NULL parameters.
Edit: The OP wasn't clear in their question. They don't want to pass the value NULL for @UserID and return all rows; they want to pass NULL and get rows where UserID has a value of NULL. That would be:
SELECT *
FROM Users
WHERE UserID = @UserID
OR (UserID IS NULL AND @UserID IS NULL);
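To make the difference between these predicates concrete, here is a small runnable sketch using Python's sqlite3 module as a stand-in for SQL Server. The three sample rows and the helper name names are assumptions for illustration. It runs the plain equality, the "catch-all" form and the NULL-matching form against the same data.

```python
import sqlite3

# Hypothetical sample data: two users with ids and one with a NULL id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (name TEXT, userId INTEGER);
    INSERT INTO Users VALUES ('a', 1), ('b', 2), ('c', NULL);
""")

# Plain equality: NULL = NULL is not true, so a NULL parameter matches nothing.
PLAIN = "SELECT name FROM Users WHERE userId = :id ORDER BY name"

# Catch-all: a NULL parameter switches the filter off and returns every row.
CATCH_ALL = """
    SELECT name FROM Users
    WHERE (:id IS NULL OR userId = :id)
    ORDER BY name
"""

# NULL-matching: a NULL parameter returns only rows whose userId is NULL.
NULL_SAFE = """
    SELECT name FROM Users
    WHERE userId = :id
       OR (userId IS NULL AND :id IS NULL)
    ORDER BY name
"""


def names(sql, user_id):
    """Run one of the queries above and return the matching names."""
    return [row[0] for row in conn.execute(sql, {"id": user_id})]


for label, sql in (("plain", PLAIN), ("catch-all", CATCH_ALL), ("null-safe", NULL_SAFE)):
    print(label, names(sql, None), names(sql, 2))
```

With a non-NULL parameter all three forms agree; they only diverge when the parameter is NULL, which is the case the question is about.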

After reading the edit, I think you want your query to look like this:
SELECT *
FROM Users
WHERE COALESCE(userId ,0) = COALESCE(@UserId,0)
Edit:
As pointed out by Gordon Linoff and Larnu, the above query will not perform well because it is "non-SARGable". For better performance, the same query can be written as:
SELECT *
FROM Users
WHERE userId = @UserId OR (userId IS NULL AND @UserId IS NULL)

Use COALESCE:
SELECT *
FROM Users
WHERE userId = coalesce(@UserId,val)

You can simplify the Boolean logic instead:
WHERE (@UserId IS NULL OR userId = @UserId)

Try this:
DECLARE @UserId INT;
SET @UserId = //... set by dynamic variable
SELECT *
FROM Users
WHERE userId = (CASE WHEN @UserId IS NULL THEN userId ELSE @UserId END)

I haven't seen the use of INTERSECT suggested so far so putting this out there as an alternative that also results in easy to read SQL when your list of potentially NULL variables might be long.
SELECT *
FROM Users
WHERE EXISTS (
SELECT UserId
INTERSECT SELECT @UserId
)
This pattern is useful when you have other variables to check e.g.
SELECT *
FROM Users
WHERE EXISTS (
SELECT
UserId,
Username,
UserRole
INTERSECT SELECT
@UserId,
@Username,
@UserRole
)

Passing Parameter in Stored Procedure 1

The SP is not treating the @AgeBand parameter correctly.
How do I pass that parameter?
Alter Procedure sp_Dialer_Analysis
@AgeBand Varchar(50),
@Gender Varchar(50),
@Weekday Varchar(50)
AS
BEGIN
Select @AgeBand,@Gender,@Weekday,SUM(RPC)
from TableA a
left join TableB b
on a.[Contact Info] = b.MSI
where a.date >= '2017-01-01'
and b.gender = @Gender and b.AgeBand in (@AgeBand)
and DATENAME(WEEKDAY,a.date) = @Weekday
END
Exec sp_Dialer_Analysis "'50-54','55-59'",'F','Monday'
"'50-54','55-59'" is the issue.
Kindly suggest an alternative.

The condition b.AgeBand in (@AgeBand) will not work;
try using CHARINDEX(b.AgeBand,@AgeBand) > 0 instead.

You cannot pass an array into a stored procedure like that, doubly so using double quotes (").
Your best bet is to either run the procedure multiple times (yuck, performance hit) or split the array out using either a home-brewed split function or the STRING_SPLIT function introduced in SQL Server 2016.
Perhaps something like this (not tested, off the top of my head):
Alter Procedure sp_Dialer_Analysis
@AgeBand Varchar(50),
@Gender Varchar(50),
@Weekday Varchar(50)
AS
BEGIN
Select @AgeBand,@Gender,@Weekday,SUM(RPC)
from TableA a
Cross Apply String_Split(@AgeBand, ',') As s
left join TableB b
on a.[Contact Info] = b.MSI
where a.date >= '2017-01-01'
and b.gender = @Gender
And b.ageband = s.value
and DATENAME(WEEKDAY,a.date) = @Weekday
END
Exec sp_Dialer_Analysis '50-54,55-59','F','Monday'
I haven't tried SHD's answer, but prima facie I think it has merit, and it may be a better answer.
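The underlying reason the original IN (@AgeBand) fails is that one parameter is always one value, so the whole comma-separated string is compared against each AgeBand. The sketch below illustrates this with Python's sqlite3 module standing in for SQL Server. Note that it splits the list on the client and binds one placeholder per element, which is a different technique from the server-side STRING_SPLIT shown above. The sample rows are assumptions for illustration.

```python
import sqlite3

# Hypothetical sample data; only the columns needed for the demo.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableB (MSI INTEGER, AgeBand TEXT);
    INSERT INTO TableB VALUES (1, '50-54'), (2, '55-59'), (3, '60-64');
""")

age_bands = "50-54,55-59"

# One parameter is one value: AgeBand is compared with the whole string
# '50-54,55-59', which matches no row.
single_rows = conn.execute(
    "SELECT MSI FROM TableB WHERE AgeBand IN (?) ORDER BY MSI",
    (age_bands,),
).fetchall()

# Split the list first, then bind one placeholder per element.
parts = age_bands.split(",")
placeholders = ", ".join("?" for _ in parts)
split_rows = conn.execute(
    f"SELECT MSI FROM TableB WHERE AgeBand IN ({placeholders}) ORDER BY MSI",
    parts,
).fetchall()

print(single_rows)
print(split_rows)
```

Only the placeholder list is built with string formatting; the values themselves are still bound as parameters, so the query stays safe from injection.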

I think OP is asking about escape characters.
Please try this : Exec sp_Dialer_Analysis '''50-54'',''55-59''','F','Monday'

SQL Table Valued Function in Select Statement

SQL is not my strongest skill, but I have been trying to optimize this stored procedure. It had multiple scalar-valued functions that I tried to change to table-valued functions, because I read in many places that this is more efficient. I have now written them, but I am not really sure how to use them, or whether I created them correctly.
This is the function I'm calling.
Alter FUNCTION [IsNotSenateActivityTableValue]
(
@ActivityCode int,
@BillId int,
@TextToDisplay varchar(max)
)
returns @T table(result varchar(max))
as
begin
DECLARE @result varchar(max);
declare @countcodes int;
declare @ishousebill int;
select @ishousebill = count(billid)
from BillMaster
where BillID = @BillID and Chamber = 'H'
If (@ishousebill = 0)
begin
SELECT @countcodes = count([ActivityCode])
FROM [HouseCoreData].[dbo].[ActivityCode]
where ActivityDescription not like '%(H)%' and ActivityType = 'S'
and [ActivityCode] = @ActivityCode
if (@countcodes = 0)
begin
set @result = 'test'
end
else
begin
set @result = 'test2'
end
end
else
begin
set @result = @TextToDisplay
end
RETURN
END
And this is how I was trying to call them. I would prefer just being able to put them at the top, but really anything that works would be good.
SELECT distinct
ActionDates.result as ActionDate
,ActivityDescriptions.result as ActivityDescription
FROM BillWebReporting.vwBillDetailWithSubjectIndex as vw
left outer join [BillWebReporting].[HasHouseSummary] as HasSummary on vw.BillID = HasSummary.BillID
outer APPLY dbo.IsNotSenateActivityDateTableValue(ActivityCode,vw.BillID,[ActionDate]) ActionDates
OUTER APPLY dbo.IsNotSenateActivityTableValue(ActivityCode,vw.BillID,[ActivityDescription]) as ActivityDescriptions

Getting a count just to see if at least one row exists is very expensive. You should use EXISTS instead, which can potentially short circuit without materializing the entire count.
Here is a more efficient way using an inline table-valued function instead of a multi-statement table-valued function.
ALTER FUNCTION dbo.[IsNotSenateActivityTableValue] -- always use schema prefix!
(
@ActivityCode int,
@BillId int,
@TextToDisplay varchar(max)
)
RETURNS TABLE
AS
RETURN (SELECT result = CASE WHEN EXISTS
(SELECT 1 FROM dbo.BillMaster
WHERE BillID = @BillID AND Chamber = 'H'
) THEN @TextToDisplay ELSE CASE WHEN EXISTS
(SELECT 1 FROM [HouseCoreData].[dbo].[ActivityCode]
where ActivityDescription not like '%(H)%'
and ActivityType = 'S'
and [ActivityCode] = @ActivityCode
) THEN 'test2' ELSE 'test' END
END);
GO
Of course it could also just be a scalar UDF...
ALTER FUNCTION dbo.[IsNotSenateActivityScalar] -- always use schema prefix!
(
@ActivityCode int,
@BillId int,
@TextToDisplay varchar(max)
)
RETURNS VARCHAR(MAX)
AS
BEGIN
DECLARE @result VARCHAR(MAX);
SELECT @result = CASE WHEN EXISTS
(SELECT 1 FROM dbo.BillMaster
WHERE BillID = @BillID AND Chamber = 'H'
) THEN @TextToDisplay ELSE CASE WHEN EXISTS
(SELECT 1 FROM [HouseCoreData].[dbo].[ActivityCode]
where ActivityDescription not like '%(H)%'
and ActivityType = 'S'
and [ActivityCode] = @ActivityCode
) THEN 'test2' ELSE 'test' END
END;
RETURN (@result);
END
GO

Table-valued functions return a table, in which, like any other table, rows have to be inserted.
Instead of doing set @result = ....., do:
INSERT INTO @T (result) VALUES ( ..... )
EDIT: As a side note, I don't really understand the reason for this function to be table-valued. You are essentially returning one value.

First of all, UDFs generally perform poorly. I am not sure about MySQL, but in SQL Server a UDF is executed separately for each row of output, except for what are called inline UDFs. An inline UDF consists of a single SELECT statement, which is folded into the SQL of the outer query it is included in, and so is compiled only once as part of that query.
Use an inline table-valued function instead. In SQL Server, the syntax would be:
CREATE FUNCTION IsNotSenateActivityTableValue
(
@ActivityCode int,
@BillId int,
@TextToDisplay varchar(max)
)
RETURNS TABLE
AS
RETURN
(
Select case
When y.bilCnt + z.actCnt = 0 Then 'test'
when y.bilCnt = 0 then 'test2'
else @TextToDisplay end result
From (Select Count(billId) bilCnt
From BillMaster
Where BillID = @BillID
And Chamber = 'H') y
-- each derived table returns exactly one row, so a cross join is enough
Cross Join
(Select count([ActivityCode]) actCnt
From [HouseCoreData].[dbo].[ActivityCode]
Where ActivityDescription not like '%(H)%'
And ActivityType = 'S'
And [ActivityCode] = @ActivityCode) z
)
GO

SQL Query Optimization

This report used to take about 16 seconds when there were 8000 rows to process. Now there are 50000 rows and the report takes 2:30 minutes.
This was my first pass at this and the client needed it yesterday, so I wrote this code in the logical order of what needed to be done, but without optimization in mind.
Now with the report taking longer as the data increases, I need to take a second look at this and optimize it. I'm thinking indexed views, table functions, etc.
I think the biggest bottleneck is looping through the temp table, making 4 select statements, and updating the temp table...50,000 times.
I think I can condense ALL of this into one large SELECT with either (a) 4 joins to the same table to get the 4 statuses, but then I am not sure how to get the TOP 1 in there, or I can try (b) using nested subqueries, but both seem really messy compared to the current code.
I'm not expecting anyone to write code for me, but if some SQL experts can peruse this code and tell me about any obvious inefficiencies and alternate methods, or ways to speed this up, or techniques I should be using instead, it would be appreciated.
PS: Assume that this DB is for the most part normalized, but poorly designed, and that I am not able to add indexes. I basically have to work with it, as is.
Where the code says (less than) I had to replace a "less than" symbol because it was cropping some of my code.
Thanks!
CREATE PROCEDURE RptCollectionAccountStatusReport AS
SET NOCOUNT ON;
DECLARE @Accounts TABLE
(
[AccountKey] INT IDENTITY(1,1) NOT NULL,
[ManagementCompany] NVARCHAR(50),
[Association] NVARCHAR(100),
[AccountNo] INT UNIQUE,
[StreetAddress] NVARCHAR(65),
[State] NVARCHAR(50),
[PrimaryStatus] NVARCHAR(100),
[PrimaryStatusDate] SMALLDATETIME,
[PrimaryDaysRemaining] INT,
[SecondaryStatus] NVARCHAR(100),
[SecondaryStatusDate] SMALLDATETIME,
[SecondaryDaysRemaining] INT,
[TertiaryStatus] NVARCHAR(100),
[TertiaryStatusDate] SMALLDATETIME,
[TertiaryDaysRemaining] INT,
[ExternalStatus] NVARCHAR(100),
[ExternalStatusDate] SMALLDATETIME,
[ExternalDaysRemaining] INT
);
INSERT INTO
@Accounts (
[ManagementCompany],
[Association],
[AccountNo],
[StreetAddress],
[State])
SELECT
mc.Name AS [ManagementCompany],
a.LegalName AS [Association],
c.CollectionKey AS [AccountNo],
u.StreetNumber + ' ' + u.StreetName AS [StreetAddress],
CASE WHEN c.InheritedAccount = 1 THEN 'ZZ' ELSE u.State END AS [State]
FROM
ManagementCompany mc WITH (NOLOCK)
JOIN
Association a WITH (NOLOCK) ON a.ManagementCompanyKey = mc.ManagementCompanyKey
JOIN
Unit u WITH (NOLOCK) ON u.AssociationKey = a.AssociationKey
JOIN
Collection c WITH (NOLOCK) ON c.UnitKey = u.UnitKey
WHERE
c.Closed IS NULL;
DECLARE @MaxAccountKey INT;
SELECT @MaxAccountKey = MAX([AccountKey]) FROM @Accounts;
DECLARE @index INT;
SET @index = 1;
WHILE @index (less than) @MaxAccountKey BEGIN
DECLARE @CollectionKey INT;
SELECT @CollectionKey = [AccountNo] FROM @Accounts WHERE [AccountKey] = @index;
DECLARE @PrimaryStatus NVARCHAR(100) = NULL;
DECLARE @PrimaryStatusDate SMALLDATETIME = NULL;
DECLARE @PrimaryDaysRemaining INT = NULL;
DECLARE @SecondaryStatus NVARCHAR(100) = NULL;
DECLARE @SecondaryStatusDate SMALLDATETIME = NULL;
DECLARE @SecondaryDaysRemaining INT = NULL;
DECLARE @TertiaryStatus NVARCHAR(100) = NULL;
DECLARE @TertiaryStatusDate SMALLDATETIME = NULL;
DECLARE @TertiaryDaysRemaining INT = NULL;
DECLARE @ExternalStatus NVARCHAR(100) = NULL;
DECLARE @ExternalStatusDate SMALLDATETIME = NULL;
DECLARE @ExternalDaysRemaining INT = NULL;
SELECT TOP 1
@PrimaryStatus = a.StatusName, @PrimaryStatusDate = c.StatusDate, @PrimaryDaysRemaining = c.DaysRemaining
FROM CollectionAccountStatus c WITH (NOLOCK) JOIN AccountStatus a WITH (NOLOCK) ON c.AccountStatusKey = a.AccountStatusKey
WHERE c.CollectionKey = @CollectionKey AND a.StatusType = 'Primary Status' AND a.StatusName <> 'Cleared'
ORDER BY c.sysCreated DESC;
SELECT TOP 1
@SecondaryStatus = a.StatusName, @SecondaryStatusDate = c.StatusDate, @SecondaryDaysRemaining = c.DaysRemaining
FROM CollectionAccountStatus c WITH (NOLOCK) JOIN AccountStatus a WITH (NOLOCK) ON c.AccountStatusKey = a.AccountStatusKey
WHERE c.CollectionKey = @CollectionKey AND a.StatusType = 'Secondary Status' AND a.StatusName <> 'Cleared'
ORDER BY c.sysCreated DESC;
SELECT TOP 1
@TertiaryStatus = a.StatusName, @TertiaryStatusDate = c.StatusDate, @TertiaryDaysRemaining = c.DaysRemaining
FROM CollectionAccountStatus c WITH (NOLOCK) JOIN AccountStatus a WITH (NOLOCK) ON c.AccountStatusKey = a.AccountStatusKey
WHERE c.CollectionKey = @CollectionKey AND a.StatusType = 'Tertiary Status' AND a.StatusName <> 'Cleared'
ORDER BY c.sysCreated DESC;
SELECT TOP 1
@ExternalStatus = a.StatusName, @ExternalStatusDate = c.StatusDate, @ExternalDaysRemaining = c.DaysRemaining
FROM CollectionAccountStatus c WITH (NOLOCK) JOIN AccountStatus a WITH (NOLOCK) ON c.AccountStatusKey = a.AccountStatusKey
WHERE c.CollectionKey = @CollectionKey AND a.StatusType = 'External Status' AND a.StatusName <> 'Cleared'
ORDER BY c.sysCreated DESC;
UPDATE
@Accounts
SET
[PrimaryStatus] = @PrimaryStatus,
[PrimaryStatusDate] = @PrimaryStatusDate,
[PrimaryDaysRemaining] = @PrimaryDaysRemaining,
[SecondaryStatus] = @SecondaryStatus,
[SecondaryStatusDate] = @SecondaryStatusDate,
[SecondaryDaysRemaining] = @SecondaryDaysRemaining,
[TertiaryStatus] = @TertiaryStatus,
[TertiaryStatusDate] = @TertiaryStatusDate,
[TertiaryDaysRemaining] = @TertiaryDaysRemaining,
[ExternalStatus] = @ExternalStatus,
[ExternalStatusDate] = @ExternalStatusDate,
[ExternalDaysRemaining] = @ExternalDaysRemaining
WHERE
[AccountNo] = @CollectionKey;
SET @index = @index + 1;
END;
SELECT
[ManagementCompany],
[Association],
[AccountNo],
[StreetAddress],
[State],
[PrimaryStatus],
CONVERT(VARCHAR, [PrimaryStatusDate], 101) AS [PrimaryStatusDate],
[PrimaryDaysRemaining],
[SecondaryStatus],
CONVERT(VARCHAR, [SecondaryStatusDate], 101) AS [SecondaryStatusDate],
[SecondaryDaysRemaining],
[TertiaryStatus],
CONVERT(VARCHAR, [TertiaryStatusDate], 101) AS [TertiaryStatusDate],
[TertiaryDaysRemaining],
[ExternalStatus],
CONVERT(VARCHAR, [ExternalStatusDate], 101) AS [ExternalStatusDate],
[ExternalDaysRemaining]
FROM
@Accounts
ORDER BY
[ManagementCompany],
[Association],
[StreetAddress]
ASC;

Don't try to guess where the query is going wrong - look at the execution plan. It will tell you what's chewing up your resources.
You can update directly from another table, even from a table variable: SQL update from one Table to another based on a ID match
That would allow you to combine everything in your loop into a single (massive) statement. You can join to the same tables for the secondary and tertiary statuses using different aliases, e.g.,
JOIN AccountStatus As TertiaryAccountStatus...AND a.StatusType = 'Tertiary Status'
JOIN AccountStatus AS SecondaryAccountStatus...AND a.StatusType = 'Secondary Status'
I'll bet you don't have an index on the AccountStatus.StatusType field. You might try using the PK of that table instead.
HTH.

First, use a temp table instead of a table variable. Temp tables can be indexed.
Next, do not loop! Looping is bad for performance in virtually every case. This loop ran 50000 times rather than once for 50000 records; it will be horrible when you have a million records! Here is a link that will help you understand how to do set-based processing instead. It is written to avoid cursors, but loops are similar to cursors, so it should help.
http://wiki.lessthandot.com/index.php/Cursors_and_How_to_Avoid_Them
And (nolock) will give dirty data reads, which can be very bad for reporting. If you are on a version of SQL Server higher than 2000, there are better choices.
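As a rough illustration of what "set-based instead of a loop" could look like for this report, the sketch below fetches the latest non-'Cleared' status per account and status type in one statement and pivots the result into columns. It uses Python's sqlite3 module as a stand-in for SQL Server, covers only two of the four status types, and the status names and dates are invented sample data. A correlated MAX subquery is used here as one possible way to express "TOP 1 ... ORDER BY sysCreated DESC" per group.

```python
import sqlite3

# Hypothetical sample data using the table and column names from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE AccountStatus (
        AccountStatusKey INTEGER PRIMARY KEY, StatusType TEXT, StatusName TEXT);
    CREATE TABLE CollectionAccountStatus (
        CollectionKey INTEGER, AccountStatusKey INTEGER, sysCreated TEXT);
    INSERT INTO AccountStatus VALUES
        (1, 'Primary Status', 'Demand Letter'),
        (2, 'Primary Status', 'Lien Filed'),
        (3, 'Primary Status', 'Cleared'),
        (4, 'Secondary Status', 'Payment Plan');
    INSERT INTO CollectionAccountStatus VALUES
        (100, 1, '2020-01-01'),
        (100, 2, '2020-02-01'),
        (100, 3, '2020-03-01'),
        (100, 4, '2020-01-15'),
        (200, 1, '2020-01-10');
""")

# Step 1 (CTE): keep only the newest non-'Cleared' row per account and type.
# Step 2: pivot the status types into columns with conditional aggregation.
QUERY = """
    WITH latest AS (
        SELECT c.CollectionKey, a.StatusType, a.StatusName
        FROM CollectionAccountStatus AS c
        JOIN AccountStatus AS a ON a.AccountStatusKey = c.AccountStatusKey
        WHERE a.StatusName <> 'Cleared'
          AND c.sysCreated = (
              SELECT MAX(c2.sysCreated)
              FROM CollectionAccountStatus AS c2
              JOIN AccountStatus AS a2
                ON a2.AccountStatusKey = c2.AccountStatusKey
              WHERE c2.CollectionKey = c.CollectionKey
                AND a2.StatusType = a.StatusType
                AND a2.StatusName <> 'Cleared')
    )
    SELECT CollectionKey,
           MAX(CASE WHEN StatusType = 'Primary Status'
                    THEN StatusName END) AS PrimaryStatus,
           MAX(CASE WHEN StatusType = 'Secondary Status'
                    THEN StatusName END) AS SecondaryStatus
    FROM latest
    GROUP BY CollectionKey
    ORDER BY CollectionKey
"""

report = conn.execute(QUERY).fetchall()
print(report)
```

The whole result is produced by a single statement, so the database can choose one plan for all accounts instead of running four lookups and an update per account.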

SELECT @CollectionKey = [AccountNo] FROM @Accounts WHERE [AccountKey] = @index;
This query would benefit from a PRIMARY KEY declaration on your table variable.
When you say IDENTITY, you are asking the database to auto-populate the column.
When you say PRIMARY KEY, you are asking the database to organize the data into a clustered index.
These two concepts are very different. Typically, you should use both of them.
DECLARE @Accounts TABLE
(
[AccountKey] INT IDENTITY(1,1) PRIMARY KEY,
"I am not able to add indexes."
In that case, copy the data to a database where you may add indexes. And use: SET STATISTICS IO ON
