SQL Server 2008 R2: Using a WHILE loop inside a function - sql

I read an answer that said you don't want to use WHILE loops in SQL Server. I don't understand that generalization. I'm fairly new to SQL so I might not understand the explanation yet. I also read that you don't really want to use cursors unless you must. The search results I've found are too specific to the problem presented and I couldn't glean useful technique from them, so I present this to you.
What I'm trying to do is take the values in a client file and shorten them where necessary. There are a couple of constraints. I can't simply hack the field values provided: my company has standard abbreviations that are to be used. I have put these in a table, Abbreviations, which has the columns LongName and ShortName. I don't want to simply abbreviate every LongName in the row; I only want to apply the update as long as the field is too long. This is why I need the WHILE loop.
My thought process was thus:
CREATE FUNCTION [dbo].[ScrubAbbrev]
(@Field nvarchar(25),@Abbrev nvarchar(255))
RETURNS varchar(255)
AS
BEGIN
    DECLARE @max int = (select MAX(stepid) from Abbreviations)
    DECLARE @StepID int = (select min(stepid) from Abbreviations)
    DECLARE @find varchar(150)=(select Longname from Abbreviations where Stepid=@stepid)
    DECLARE @replace varchar(150)=(select ShortName from Abbreviations where Stepid=@stepid)
    DECLARE @size int = (select max_input_length from FieldDefinitions where FieldName = 'title')
    DECLARE @isDone int = (select COUNT(*) from SizeTest where LEN(Title)>(@size))
    WHILE @StepID<=@max or @isDone = 0 and LEN(@Abbrev)>(@size) and @Abbrev is not null
    BEGIN
        RETURN
        REPLACE(@Abbrev,@find,@replace)
        SET @StepID=@StepID+1
        SET @find =(select Longname from Abbreviations where Stepid=@stepid)
        SET @replace =(select ShortName from Abbreviations where Stepid=@stepid)
        SET @isDone = (select COUNT(*) from SizeTest where LEN(Title)>(@size))
    END
END
Obviously the RETURN should go at the end, but I need to reset my variables to the next @StepID, @find, and @replace.
Is this one of those times where I'd have to use a cursor (which I've never yet written)?

Generally, you don't want to use cursors or while loops in SQL because they only process a single row at a time, and thus perform very poorly. SQL is designed and optimized to process (potentially very large) sets of data, not individual values.
You could factor out the while loop by doing something like this:
UPDATE t
SET t.targetColumn = a.ShortName
FROM targetTable t
INNER JOIN Abbreviations a
    ON t.targetColumn = a.LongName
WHERE LEN(t.targetColumn) > @maxLength
This is generalized and you will need to tweak it to fit your specific data model, but here's what's going on:
For every row in "targetTable", set the value of "targetColumn" (what you want to abbreviate) to the relevant abbreviation (found in Abbreviations.ShortName) iff: the current value has a standardized abbreviation (the inner join) and the current value is longer than desired (the where condition).
You'll need to add an integer parameter or local variable, @maxLength, to indicate what constitutes "too long". This query processes the target table all at once, updating the value in the target column for every eligible row, while a function will only find the abbreviation for a single item (the intersection of one row and one column) at a time.
Note that this won't do anything if the value is too long but doesn't have a standard abbreviation. Your existing code has this same limitation, so I assume this is desired behavior.
I also recommend making this a stored procedure rather than a function. Scalar and multi-statement functions on SQL Server are treated as black boxes and can seriously harm performance, because the optimizer generally doesn't have a good idea of what they're doing.
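As a rough, self-contained sketch of the set-based update described above, the following batch uses table variables in place of real tables. The sample rows and the value chosen for @maxLength are made up for illustration; only the Abbreviations column names come from the question.

```sql
-- Hypothetical sample data; real tables would replace these table variables.
DECLARE @maxLength int = 10;  -- assumed limit for "too long"

DECLARE @Abbreviations TABLE (LongName nvarchar(150), ShortName nvarchar(150));
INSERT INTO @Abbreviations (LongName, ShortName)
VALUES (N'Vice President', N'VP'),
       (N'Manager', N'Mgr');

DECLARE @targetTable TABLE (targetColumn nvarchar(255));
INSERT INTO @targetTable (targetColumn)
VALUES (N'Vice President'),            -- too long, has an abbreviation
       (N'Manager'),                   -- has an abbreviation, but short enough
       (N'Chief Executive Officer');   -- too long, no abbreviation

-- One statement handles every eligible row; no loop or cursor is involved.
UPDATE t
SET t.targetColumn = a.ShortName
FROM @targetTable AS t
INNER JOIN @Abbreviations AS a
    ON t.targetColumn = a.LongName
WHERE LEN(t.targetColumn) > @maxLength;

SELECT targetColumn FROM @targetTable;
```

With this sample data only the first row should change (to VP): the second is within the limit and the third has no matching abbreviation.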

Related

How to get a list of IDs from a parameter which sometimes includes the IDs already, but sometimes include another sql query

I have developed a SQL query in SSMS-2017 like this:
DECLARE @property NVARCHAR(MAX) = @p;
SET @property = REPLACE(@property, '''', '');
DECLARE @propList TABLE (hproperty NUMERIC(18, 0));
IF CHARINDEX('SELECT', @property) > 0 OR CHARINDEX('select', @property) > 0
BEGIN
    INSERT INTO @propList
    EXECUTE sp_executesql @property;
END;
ELSE
BEGIN
    DECLARE @x TABLE (val NUMERIC(18, 0));
    INSERT INTO @x
    SELECT CONVERT(NUMERIC(18, 0), strval)
    FROM dbo.StringSplit(@property, ',');
    INSERT INTO @propList
    SELECT val
    FROM @x;
END;
SELECT ...columns...
FROM ...tables and joins...
WHERE ...filters...
AND HMY IN (SELECT hproperty FROM @propList)
The issue is that the value of the parameter @p can be either a list of IDs (example: 1,2,3,4) or a direct select query (example: Select ID from mytable where code='A123').
The code works well as shown above. However, it causes a problem in our system (we use Yardi7-Voyager), and we need to leave only the select statement as a query. To manage this, I was planning to create a function and use it in the where clause like:
WHERE HMY IN (SELECT myFunction(@p))
However, I could not manage it, because I cannot execute a dynamic query in a SQL function. Now I am stuck. Any ideas for handling this issue would be much appreciated.
Others have pointed out that the best fix for this would be a design change, and I agree with them. However, I'd also like to treat your question as academic and answer it in case any future readers ever have the same question in a use case where a design change wouldn't be possible/desirable.
I can think of two ways you might be able to do what you're attempting in a single select, as long as there are no other restrictions on what you can do that you haven't mentioned yet. To keep this brief, I'm just going to give you pseudo-code that can be adapted to your situation as well as those of future readers:
OPENQUERY (or OPENROWSET)
You can incorporate your code above into a stored procedure instead of a function, since stored procedures DO allow dynamic SQL, unlike functions. Then the SELECT query in your app would be a SELECT from OPENQUERY(Execute Your Stored Procedure).
UNION ALL possibilities.
I'm about 99% sure no one would ever want to use this, but I'm mentioning it to be as academically complete as I know how to be.
The second possibility would only work if there are a limited, known, number of possible queries that might be supported by your application. For instance, you can only get your Properties from either TableA, filtered by column1, or from TableB, filtered by Column2 and/or Column3.
Could be more than these possibilities, but it has to be a limited, known quantity, and the more possibilities, the more complex and lengthy the code will get.
But if that's the case, you can simply SELECT from a UNION ALL of every possible scenario, and make it so that only one of the SELECTs in the UNION ALL will return results.
For example:
SELECT ... FROM TableA WHERE Column1=fnGetValue(@p, 'Column1')
AND CHARINDEX('SELECT', @property) > 0
AND CHARINDEX('TableA', @property) > 0
AND CHARINDEX('Column1', @property) > 0
AND (Whatever other filters are needed to uniquely identify this case)
UNION ALL
SELECT
...
Note that fnGetValue() isn't a built-in function. You'd have to write it. It would parse the string in @p, find the location of 'Column1=', and return whatever value comes after it.
At the end of your UNION ALL, you'd need to add a last UNION ALL to a query that will handle the case where the user passed a comma-separated string instead of a query, but that's easy, because all the steps in your code where you populated table variables are unnecessary. You can simply do the final query like this:
WHERE NOT CHARINDEX('SELECT', @p) > 0
AND HMY IN (SELECT strval FROM dbo.StringSplit(@p, ','))
I'm pretty sure this possibility is way more work than it's worth, but it is an example of how, in general, dynamic SQL can be replaced with regular SQL that simply covers every possible option you wanted the dynamic SQL to be able to handle.
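For what it's worth, the fnGetValue() helper described above could look roughly like the sketch below. The parameter names and the parsing rules are assumptions made for illustration: it expects the text 'Column=' with no spaces around the equals sign, takes everything up to the next space (or the end of the string), and strips single quotes.

```sql
-- Hypothetical helper; not a built-in function. Assumes "Name=value" with
-- no spaces around "=" and a value that contains no spaces.
CREATE FUNCTION dbo.fnGetValue (@query nvarchar(max), @columnName nvarchar(128))
RETURNS nvarchar(4000)
AS
BEGIN
    DECLARE @marker nvarchar(130) = @columnName + N'=';
    DECLARE @start int = CHARINDEX(@marker, @query);

    -- Column not mentioned in the string: nothing to return.
    IF @start = 0
        RETURN NULL;

    -- Move to the first character after "Name=".
    SET @start = @start + LEN(@marker);

    -- The value runs to the next space, or to the end of the string.
    DECLARE @end int = CHARINDEX(N' ', @query, @start);
    IF @end = 0
        SET @end = LEN(@query) + 1;

    -- Strip any single quotes around the value.
    RETURN REPLACE(SUBSTRING(@query, @start, @end - @start), N'''', N'');
END;
GO

-- Usage: extracts the text after "code=" from the example query string.
SELECT dbo.fnGetValue(N'Select ID from mytable where code=''A123''', N'code') AS ExtractedValue;
```

A real implementation would need to deal with spacing, quoting, and multiple conditions, which is part of why this approach gets lengthy quickly.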

Having trouble understanding this query

Basically I can't understand what this query below does:
UPDATE @so_stockmove
SET @total_move_qty = total_move_qty = (
        CASE WHEN @so_docdt_id <> so_docdt_id THEN 0
             ELSE ISNULL(@total_move_qty, 0)
        END
    ) + ISNULL(move_qty,0),
    balance = so_qty - @total_move_qty,
    @so_docdt_id = so_docdt_id
I can only guess that it updates the columns total_move_qty, balance, and so_docdt_id for each row.
Can someone explain to me in detail what the query means:
UPDATE tbl SET @variable1 = columnA = expression
Update
After reading @MotoGP's comments, I did some digging and found this article by Jeff Moden where he states the following:
Warning:
Well, sort of. Lots of folks (including some of the "big" names in the SQL world) warn against and, sometimes, outright condemn the
method contained in this article as "unreliable" & "unsupported". One
fellow MVP even called it an "undocumented hack" on the fairly recent
"24 hours of SQL". Even the very core of the method, the ability to
update a variable from row to row, has been cursed in a similar
fashion. Worse yet, except for the ability to do 3 part updates (SET
@variable = columnname = expression) and to update both variables and
columns at the same time, there is absolutely no Microsoft
documentation to support the use of this method in any way, shape, or
form. In fact, even Microsoft has stated that there is no guarantee
that this method will work correctly all the time.
Now, let me tell you that, except for one thing, that's ALL true. The
one thing that isn't true is its alleged unreliability. That's part of
the goal of the article... to prove its reliability (which really
can't be done unless you use it. It's like proving the reliability of
the SELECT statement). At the end of the article, make up your own
mind. If you decide that you don't want to use such a very old ,yet,
undocumented feature, then use a Cursor or While loop or maybe even a
CLR because all of the other methods are just too darned slow. Heh...
just stop telling me that it's an undocumented hack... I already know
that and, now, so do you. ;-)
First edition
Well, this query updates the columns total_move_qty and balance in a table variable called @so_stockmove, and at the same time assigns values to the variables @total_move_qty and @so_docdt_id.
I didn't know it was possible to assign a value to more than one target this way in SQL Server (@variable1 = columnA = expression), but apparently it is.
Here is my test:
declare @bla char(1)
declare @tbl table
(
    X char(1)
)
insert into @tbl VALUES ('A'),('B'), ('C')
SELECT *
FROM @tbl
UPDATE @tbl
SET @bla = X = 'D'
SELECT *
FROM @tbl
SELECT @bla
Results:
X -- first select before update
----
A
B
C
X -- second select after update
----
D
D
D
---- select the variable value after update
D
It just sets the value to the variable and updates the field.
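To connect this back to the query in the question, here is a small sketch that runs the same statement against made-up rows. The column types and sample values are assumptions. The comments describe what each assignment is intended to do as rows are processed; as the quoted warning says, the order in which rows are visited is not documented, so no particular output is claimed here.

```sql
-- Hypothetical sample data; column types are assumed.
DECLARE @so_stockmove TABLE (
    so_docdt_id    int,
    so_qty         int,
    move_qty       int,
    total_move_qty int,
    balance        int
);
INSERT INTO @so_stockmove (so_docdt_id, so_qty, move_qty)
VALUES (1, 10, 3), (1, 10, 4), (2, 5, 2);

DECLARE @total_move_qty int, @so_docdt_id int;

UPDATE @so_stockmove
SET
    -- Running total: restart at 0 when the document line changes, otherwise
    -- carry the previous row's total, then add this row's move_qty. The result
    -- goes into both the column and the variable.
    @total_move_qty = total_move_qty = (
        CASE WHEN @so_docdt_id <> so_docdt_id THEN 0
             ELSE ISNULL(@total_move_qty, 0)
        END
    ) + ISNULL(move_qty, 0),
    -- Intended to be the ordered quantity minus the running total so far.
    balance = so_qty - @total_move_qty,
    -- Remember this row's key so the next row can detect a change.
    @so_docdt_id = so_docdt_id;

SELECT * FROM @so_stockmove;
```

In other words, the variables carry state from one row to the next, which is what makes this a running-total calculation rather than a plain update.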

Checking for a string within a string, and not in another string using SQL

I am trying to build a short SQL script that will check whether @NewProductAdded is somewhere in @NewTotalProducts, and also that @NewProductAdded is NOT in @OldTotalProducts. Please have a look at the setup below (the real data is in tables, not variables, but a simple example is all I need):
declare @NewProductAdded as varchar(max)
declare @NewTotalProducts as varchar(max)
declare @OldTotalProducts as varchar(max)
set @NewProductAdded ='ProductB'
set @NewTotalProducts = 'ProductAProductBProductC'
set @OldTotalProducts = 'ProductAProductC'
SELECT CustomerID FROM Products WHERE NewProductAdded ...
I want to make sure that 'ProductB' is contained somewhere within @NewTotalProducts, and is NOT contained anywhere within @OldTotalProducts. Product names vary vastly with thousands of combinations, and there is no way to really separate them from each other in a string. I am sure there is a simple solution or function for this, I just don't know it yet.
The specific answer to your question is like (or charindex() if you are using SQL Server or Sybase):
where @NewTotalProducts like '%'+@NewProductAdded+'%' and
      @OldTotalProducts not like '%'+@NewProductAdded+'%'
First comment. If you have to use lists stored in strings, at least use delimiters:
where ','+@NewTotalProducts+',' like '%,'+@NewProductAdded+',%' and
      ','+@OldTotalProducts+',' not like '%,'+@NewProductAdded+',%'
Second comment. Don't store lists in strings. Instead, use a temporary table or table variable:
declare @NewTotalProducts table (name varchar(255));
insert into @NewTotalProducts(name)
    select 'ProductA' union all
    select 'ProductB' . . .
Note: throughout this answer I have used SQL Server syntax. The code appears to be SQL Server.
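The charindex() alternative mentioned above is not shown in the answer; a minimal sketch using the variables from the question might look like this. The result labels are made up for illustration.

```sql
DECLARE @NewProductAdded  varchar(max) = 'ProductB';
DECLARE @NewTotalProducts varchar(max) = 'ProductAProductBProductC';
DECLARE @OldTotalProducts varchar(max) = 'ProductAProductC';

-- CHARINDEX returns the 1-based position of the first match, or 0 if none.
SELECT CASE
           WHEN CHARINDEX(@NewProductAdded, @NewTotalProducts) > 0
            AND CHARINDEX(@NewProductAdded, @OldTotalProducts) = 0
           THEN 'newly added'
           ELSE 'not newly added'
       END AS Result;
```

With the sample values this should report 'newly added'. Like the undelimited LIKE version, it can produce false matches when one product name is a substring of another.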

Selecting everything in a table... with a where statement

I have an interesting situation where I'm trying to select everything in a sql server table but I only have to access the table through an old company API instead of SQL. This API asks for a table name, a field name, and a value. It then plugs it in rather straightforward in this way:
select * from [TABLE_NAME_VAR] where [FIELD_NAME_VAR] = 'VALUE_VAR';
I'm not able to change the = sign to != or anything else, only those vars. I know this sounds awful, but I cannot change the API without going through a lot of hoops and it's all I have to work with.
There are multiple columns in this table that are all numbers, all strings, and set to not null. Is there a value I can pass this API function that would return everything in the table? Perhaps a constant or special value that means it's a number, it's not a number, it's a string, *, it's not null, etc? Any ideas?
No, this isn't possible if the API is constructed correctly.
If it is some home-grown thing, however, it may not be. You could try entering YourTable]-- as the value for TABLE_NAME_VAR such that when plugged into the query it ends up as
select * from [YourTable]--] where [FIELD_NAME_VAR] = 'VALUE_VAR';
If the ] is either rejected or properly escaped (by doubling it up) this won't work however.
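To illustrate the "properly escaped (by doubling it up)" case, this is what SQL Server's built-in QUOTENAME function does with such a value. Whether the API in question uses it is unknown; the variable name here is made up.

```sql
-- QUOTENAME wraps the name in brackets and doubles any closing bracket,
-- so the injected "]--" stays part of the identifier.
DECLARE @tableName sysname = N'YourTable]--';

SELECT QUOTENAME(@tableName) AS EscapedName;  -- [YourTable]]--]
```

A query built from the escaped name would look for a table literally named YourTable]-- and fail, rather than commenting out the rest of the statement.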
You might try to pass this VALUE_VAR
1'' or ''''=''
If it's used as-is and executed as Dynamic SQL it should result in
SELECT * FROM tab WHERE fieldname = '1' or ''=''
Here is a simple example; I hope it helps:
declare @a varchar(max)
set @a=' ''1'' or 1=1 '
declare @b varchar(max)
set @b=('select * from [TABLE_NAME_VAR] where [FIELD_NAME_VAR]='+@a)
exec(@b)
If your API allows a column name instead of a constant:
select * from [TABLE_NAME_VAR] where [FIELD_NAME_VAR] = [FIELD_NAME_VAR] ;

SQL Function in column running slow

I have a computed column (backed by a function) that is causing one of my tables to be extremely slow (its output is a column in my table). I thought it might be some logical statements in my function, so I commented those out and just returned the string 'test'. The table was still slow. I believe the SELECT statement is slowing down the function: when I comment out the select statement, everything is fine. I think I am not using functions in the correct manner.
FUNCTION [dbo].[Pend_Type](@Suspense_ID int, @Loan_ID nvarchar(10),@Suspense_Date datetime, @Investor nvarchar(10))
RETURNS nvarchar(20)
AS
BEGIN
    DECLARE @Closing_Date Datetime, @Paid_Date Datetime
    DECLARE @pendtype nvarchar(20)
    --This is the issue!!!!
    SELECT @Closing_Date = Date_Closing, @Paid_Date = Date_Paid from TABLE where Loan_ID = @Loan_ID
    SET @pendtype = 'test'
    --commented out logic
    RETURN @pendtype
END
UPDATE:
I have another computed column that does something similar and is a column in the same table. This one runs fast. Anyone see a difference in why this would be?
Declare @yOrn AS nvarchar(1)
IF((Select count(suspense_ID) From TABLE where suspense_ID = @suspenseID) = 0)
    SET @yOrn = 'N'
ELSE
    SET @yOrn = 'Y'
RETURN @yOrn
You have isolated the performance problem in the select statement:
SELECT TOP 1 @Closing_Date = Date_Closing, @Paid_Date = Date_Paid
from TABLE
where Loan_ID = @Loan_ID;
To make this run faster, create a composite index on table(Loan_ID, Date_Closing, Date_Paid).
By the way, you are using top with no order by. When multiple rows match, you can get any one of them back. Normally, top is used with order by.
EDIT:
You can create the index by issuing the following command:
create index idx_table_loan_closing_paid on table(Loan_ID, Date_Closing, Date_Paid);
Scalar functions are often executed like cursors, one row at a time; that is why they are slow and are to be avoided. I would not use the function as written but would write a set-based version instead. Incidentally, a select top 1 without an order by will not always give you the same record and is generally a poor practice. In this case I would think you would want the latest date, for instance, or the earliest one.
In this particular case I think you would be better off not using a function but using a derived table join.
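A derived table join along those lines might be sketched as follows. The table variables @Suspense and @Loans and their sample rows are hypothetical stand-ins, since the question only shows the placeholder TABLE, and taking the latest dates per loan is an assumption about the intended rule.

```sql
-- Hypothetical stand-ins for the real tables.
DECLARE @Suspense TABLE (Suspense_ID int, Loan_ID nvarchar(10));
DECLARE @Loans TABLE (Loan_ID nvarchar(10), Date_Closing datetime, Date_Paid datetime);

INSERT INTO @Suspense (Suspense_ID, Loan_ID)
VALUES (1, N'L100'), (2, N'L200');

INSERT INTO @Loans (Loan_ID, Date_Closing, Date_Paid)
VALUES (N'L100', '20200115', '20200201'),
       (N'L100', '20200310', NULL);

-- The derived table reduces the loan rows to one row per Loan_ID in a single
-- pass, instead of running a lookup once for every suspense row.
SELECT s.Suspense_ID,
       s.Loan_ID,
       d.Closing_Date,
       d.Paid_Date
FROM @Suspense AS s
LEFT JOIN (
    SELECT Loan_ID,
           MAX(Date_Closing) AS Closing_Date,  -- assumed rule: latest closing date
           MAX(Date_Paid)    AS Paid_Date      -- assumed rule: latest paid date
    FROM @Loans
    GROUP BY Loan_ID
) AS d
    ON d.Loan_ID = s.Loan_ID;
```

The commented-out logic that derives the pend type could then be written as a CASE expression over d.Closing_Date and d.Paid_Date in the same query.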