How to get a list of IDs from a parameter which sometimes includes the IDs already, but sometimes includes another SQL query

I have developed a SQL query in SSMS-2017 like this:
DECLARE @property NVARCHAR(MAX) = @p;
SET @property = REPLACE(@property, '''', '');
DECLARE @propList TABLE (hproperty NUMERIC(18, 0));
IF CHARINDEX('SELECT', @property) > 0 OR CHARINDEX('select', @property) > 0
BEGIN
    INSERT INTO @propList
    EXECUTE sp_executesql @property;
END;
ELSE
BEGIN
    DECLARE @x TABLE (val NUMERIC(18, 0));
    INSERT INTO @x
    SELECT CONVERT(NUMERIC(18, 0), strval)
    FROM dbo.StringSplit(@property, ',');
    INSERT INTO @propList
    SELECT val
    FROM @x;
END;
SELECT ...columns...
FROM ...tables and joins...
WHERE ...filters...
AND HMY IN (SELECT hproperty FROM @propList)
The issue is, it is possible that the value of the parameter @p can be a list of IDs (example: 1,2,3,4) or a direct select query (example: Select ID from mytable where code='A123').
The code above works well. However, it causes a problem in our system (we use Yardi7-Voyager), and we need to leave only the select statement as a query. To manage it, I was planning to create a function and use it in the where clause like:
WHERE HMY IN (SELECT myFunction(@p))
However, I could not manage it, as I see I cannot execute a dynamic query in a SQL function. So I am stuck. Any idea for handling this issue would be much appreciated.

Others have pointed out that the best fix for this would be a design change, and I agree with them. However, I'd also like to treat your question as academic and answer it in case any future readers ever have the same question in a use case where a design change wouldn't be possible/desirable.
I can think of two ways you might be able to do what you're attempting in a single select, as long as there are no other restrictions on what you can do that you haven't mentioned yet. To keep this brief, I'm just going to give you pseudo-code that can be adapted to your situation as well as those of future readers:
OPENQUERY (or OPENROWSET)
You can incorporate your code above into a stored procedure instead of a function, since stored procedures DO allow dynamic SQL, unlike functions. Then the SELECT query in your app would be a SELECT from OPENQUERY(Execute Your Stored Procedure).
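A minimal sketch of that approach (the linked server [LOOPBACK], the database name mydb, and the procedure name are assumptions of this sketch, not part of the original setup):

-- Hypothetical wrapper proc containing the dynamic IF/ELSE logic above,
-- ending with: SELECT hproperty FROM @propList;
CREATE PROCEDURE dbo.usp_GetPropertyList
    @p NVARCHAR(MAX)
AS
BEGIN
    SET NOCOUNT ON;
    -- ... dynamic logic from the question goes here ...
END;
GO

-- The application query stays a plain SELECT. [LOOPBACK] is assumed to be
-- a linked server pointing back at this same instance.
SELECT hproperty
FROM OPENQUERY([LOOPBACK], 'EXEC mydb.dbo.usp_GetPropertyList ''1,2,3,4''');

Note that OPENQUERY only accepts a literal string, so the parameter value has to be spliced into that string by the caller, and depending on the SQL Server version the EXEC string may also need SET NOCOUNT ON or WITH RESULT SETS so that OPENQUERY can determine the result shape.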
UNION ALL possibilities.
I'm about 99% sure no one would ever want to use this, but I'm mentioning it to be as academically complete as I know how to be.
The second possibility would only work if there is a limited, known number of possible queries that might be supported by your application. For instance, you can only get your Properties from either TableA, filtered by Column1, or from TableB, filtered by Column2 and/or Column3.
There could be more possibilities than these, but it has to be a limited, known quantity, and the more possibilities there are, the more complex and lengthy the code will get.
But if that's the case, you can simply SELECT from a UNION ALL of every possible scenario, and make it so that only one of the SELECTs in the UNION ALL will return results.
For example:
SELECT ... FROM TableA WHERE Column1=fnGetValue(@p, 'Column1')
AND CHARINDEX('SELECT', @p) > 0
AND CHARINDEX('TableA', @p) > 0
AND CHARINDEX('Column1', @p) > 0
AND (Whatever other filters are needed to uniquely identify this case)
UNION ALL
SELECT
...
Note that fnGetValue() isn't a built-in function. You'd have to write it. It would parse the string in @p, find the location of 'Column1=', and return whatever value comes after it.
At the end of your UNION ALL, you'd need to add a last UNION ALL to a query that handles the case where the user passed a comma-separated string instead of a query. That's easy, because all the steps in your code where you populated table variables become unnecessary. You can simply do the final query like this:
WHERE NOT CHARINDEX('SELECT', @p) > 0
AND HMY IN (SELECT strval FROM dbo.StringSplit(@p, ','))
I'm pretty sure this possibility is way more work than it's worth, but it is an example of how, in general, dynamic SQL can be replaced with regular SQL that simply covers every possible option you wanted the dynamic SQL to be able to handle.

Related

Selecting everything in a table... with a where statement

I have an interesting situation where I'm trying to select everything in a SQL Server table, but I can only access the table through an old company API instead of SQL. This API asks for a table name, a field name, and a value. It then plugs them in rather straightforwardly, in this way:
select * from [TABLE_NAME_VAR] where [FIELD_NAME_VAR] = 'VALUE_VAR';
I'm not able to change the = sign to != or anything else, only those vars. I know this sounds awful, but I cannot change the API without going through a lot of hoops and it's all I have to work with.
There are multiple columns in this table that are all numbers, all strings, and set to not null. Is there a value I can pass this API function that would return everything in the table? Perhaps a constant or special value that means it's a number, it's not a number, it's a string, *, it's not null, etc? Any ideas?
No, this isn't possible if the API is constructed correctly.
If this is some home-grown thing, however, it may not be. You could try entering YourTable]-- as the value for TABLE_NAME_VAR, such that when plugged into the query it ends up as
select * from [YourTable]--] where [FIELD_NAME_VAR] = 'VALUE_VAR';
If the ] is either rejected or properly escaped (by doubling it up), this won't work, however.
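For reference, proper escaping doubles the closing bracket inside the name, which is exactly what QUOTENAME does:

-- QUOTENAME doubles the ] in the name, defeating the attempt above
SELECT QUOTENAME('YourTable]--');  -- returns [YourTable]]--]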
You might try to pass this VALUE_VAR
1'' or ''''=''
If it's used as-is and executed as Dynamic SQL it should result in
SELECT * FROM tab WHERE fieldname = '1' or ''=''
Here is a simple example; hope it might help:
declare @a varchar(max)
set @a = ' ''1'' or 1=1 '
declare @b varchar(max)
set @b = ('select * from [TABLE_NAME_VAR] where [FIELD_NAME_VAR]=' + @a)
exec(@b)
If your API allows a column name instead of a constant:
select * from [TABLE_NAME_VAR] where [FIELD_NAME_VAR] = [FIELD_NAME_VAR] ;

Optimizing stored procedure with multiple "LIKE"s

I am passing in a comma-delimited list of values that I need to compare to the database
Here is an example of the values I'm passing in:
@orgList = "1123, 223%, 54%"
To use the wildcard, I think I have to do LIKE, but the query runs a long time and only returns 14 rows (the results are correct, but it's just taking forever, probably because I'm using the join incorrectly).
Can I make it better?
This is what I do now:
declare @tempTable Table (SearchOrg nvarchar(max))
insert into @tempTable
select * from dbo.udf_split(@orgList) as split
-- this splits the values at the comma and puts them in a temp table
-- then I do a join on the main table and the temp table to do a like on it....
-- but I think it's not right because it's too long.
select something
from maintable gt
join @tempTable tt on gt.org like tt.SearchOrg
where
AYEAR = ISNULL(@year, ayear)
and (AYEAR >= ISNULL(@yearR1, ayear) and ayear <= ISNULL(@yearr2, ayear))
and adate = ISNULL(@Date, adate)
and (adate >= ISNULL(@dateR1, adate) and adate <= ISNULL(@DateR2, adate))
The final result would be all rows where maintable.org is 1123, or starts with 223, or starts with 54.
The reason for my date craziness is that sometimes the stored procedure only checks for a year, sometimes for a year range, sometimes for a specific date, and sometimes for a date range... everything that's not used is passed in as null.
Maybe the problem is there?
Try something like this:
Declare @tempTable Table
(
    -- Since the column is a varchar(10), you don't want to use nvarchar here.
    SearchOrg varchar(20)
);
INSERT INTO @tempTable
SELECT * FROM dbo.udf_split(@orgList);

SELECT
    something
FROM
    maintable gt
WHERE
    some where statements go here
    AND EXISTS
    (
        SELECT 1
        FROM @tempTable tt
        WHERE gt.org LIKE tt.SearchOrg
    )
Such a dynamic query, with optional filters and LIKE driven by a table (!), is very hard to optimize because almost nothing is statically known. The optimizer has to create a very general plan.
You can do two things to speed this up by orders of magnitude:
Play with OPTION (RECOMPILE). If the compile times are acceptable, this will at least deal with all the optional filters (but not with the LIKE table).
Do code generation and EXEC sp_executesql the code. Build a query with all LIKE clauses inlined into the SQL so that it looks like this: WHERE a LIKE @like0 OR a LIKE @like1 ... (not sure if you need OR or AND). This allows the optimizer to get rid of the join and just execute a normal predicate; see the sketch below.
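A minimal sketch of the second suggestion, shown for exactly two patterns (a real version would generate both the predicate list and the parameter list from the split values):

DECLARE @sql NVARCHAR(MAX) =
    N'SELECT something FROM maintable
      WHERE org LIKE @like0 OR org LIKE @like1';

-- The patterns are still parameters, but the predicates are plain LIKEs,
-- so the optimizer no longer has to join against a table variable.
EXEC sp_executesql @sql,
     N'@like0 varchar(20), @like1 varchar(20)',
     @like0 = '223%', @like1 = '54%';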
Your query may be difficult to optimize. Part of the question is what is in the where clause. You probably want to filter these first, and then do the join using like. Or, you can try to make the join faster, and then do a full table scan on the results.
SQL Server should optimize a like statement of the form 'abc%' -- that is, where the wildcard is at the end. (See here, for example.) So, you can start with an index on maintable.org. Fortunately, your examples meet this criteria. However, if you have '%abc' -- the wildcard comes first -- then the optimization won't work.
For the index to work best, it might also need to take into account the conditions in the where clause. In other words, adding the index is suggestive, but the rest of the query may preclude the use of the index.
And, let me add, the best solution for these types of searches is to use the full text search capability in SQL Server (see here).
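For example, a full-text prefix search looks like this (this assumes a full-text index has already been created on maintable.org, which the question doesn't show):

-- "223*" is a prefix term; full-text prefix searches can use the full-text index
SELECT something
FROM maintable
WHERE CONTAINS(org, '"223*"');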

Fastest way to loop thru a SQL Query

What is the fastest way to loop through a query in T-SQL?
1) Cursors, or
2) Temp tables with a key added, or
anything else?
The fastest way to "loop" through a query is to just not do it. In SQL, you should be thinking set-based instead of loop-based. You should probably evaluate your query, ask why you need to loop, and look for ways to do it as a set.
With that said, using the FAST_FORWARD option on your cursors will help speed things along.
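For reference, a FAST_FORWARD cursor is declared like this (the table name and per-row work are illustrative):

-- FAST_FORWARD means forward-only and read-only, which lets
-- SQL Server choose a cheaper internal cursor plan.
DECLARE @id INT
DECLARE c CURSOR FAST_FORWARD FOR
    SELECT ID FROM dbo.SomeTable
OPEN c
FETCH NEXT FROM c INTO @id
WHILE @@FETCH_STATUS = 0
BEGIN
    -- per-row work goes here
    FETCH NEXT FROM c INTO @id
END
CLOSE c
DEALLOCATE c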
For your stated goal, something like this is actually a better bet - avoids the "looping" issue entirely.
declare @table table
(
    ID int
)
insert into @table select 1 union select 2 union select 3 union select 4 union select 5

declare @concat varchar(256)
-- Add comma if it is not the first item in the list
select @concat = isnull(@concat + ', ', '') + ltrim(rtrim(str(ID))) from @table order by ID desc
-- or do whatever you want with the concatenated value now...
print @concat
Depends on what you're trying to do. Some tasks are better suited for cursors, some for temp tables. That's why they both exist.
I don't think you need a cursor for that (your comment about concat) if I understand what you're going for.
Here's one of mine that grabs all the phone numbers for a contact and plops them in a field and returns it.
DECLARE @numbers VARCHAR(255)
SELECT @numbers = COALESCE(@numbers + ' | ', '') + PHONE_NUMB
FROM my_table (NOLOCK)
WHERE CONTACT_ID = @contact_id
RETURN @numbers
Cursors are usually resource hogs especially as your table size grows. So if your table size is small I would be okay with recommending a cursor, however, a larger table would probably do better with an external or temporary table.
Do you want to loop through the query output inside a stored procedure, OR from C# code?
Generally speaking, you should avoid looping through query output one row at a time. SQL is meant for set based operations so see if you can solve your problem using set based approach.
Depending on the size of your result set: table variables are in memory and require no disk read, can be treated just like a table (set operations), and are very fast until the result set gets too large for memory (which then requires swap file writes).
Here's a shortcut to get a comma-delimited string of a single field from a query that returns a number of rows. It's pretty quick compared to the alternatives of cursors, etc., and it can be part of a subquery (i.e., get some things, and in one column, the IDs of all the things related to each thing in some other table):
SELECT
COALESCE(
REPLACE(
REPLACE(
REPLACE(
(SELECT MyField AS 'c' FROM [mytable] FOR XML PATH('')),'</c><c>',','),
'<c>',''),
'</c>',''),
'')
AS MyFieldCSV
Caveat: it won't play nice if your column contains characters that FOR XML PATH will escape.
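On SQL Server 2017 and later, STRING_AGG produces the same result without that caveat (convert the column first if it isn't already a string type):

SELECT STRING_AGG(CONVERT(varchar(max), MyField), ',') AS MyFieldCSV
FROM [mytable];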
Cursors are not good; avoid them and use a WHILE loop in place of a cursor. A temp table with a key added is the best way to do the looping. I had to manipulate more than 1,000,000 rows in a table, and the cursor took 2 minutes because of complex logic; when I converted the cursor into a WHILE loop, it took only 25 seconds. So that's a big difference in performance.
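A minimal sketch of that keyed-temp-table WHILE pattern (table and column names are illustrative):

-- Stage the rows to process in a temp table with a sequential key
CREATE TABLE #work (rn INT IDENTITY(1,1) PRIMARY KEY, ID INT)
INSERT INTO #work (ID) SELECT ID FROM dbo.SomeTable

DECLARE @i INT = 1, @max INT, @id INT
SELECT @max = MAX(rn) FROM #work

WHILE @i <= @max
BEGIN
    SELECT @id = ID FROM #work WHERE rn = @i
    -- complex per-row logic goes here
    SET @i += 1
END

DROP TABLE #work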

Dynamic Query in SQL Server

I have a table with 10 columns as col_1, col_2, ... col_10. I want to write a select statement that will select a value from one of the rows and from one of these 10 columns. I have a variable that will decide which column to select from. Can such a query be written, where the column name is decided dynamically from a variable?
Yes, using a CASE statement:
SELECT CASE @MyVariable
    WHEN 1 THEN [Col_1]
    WHEN 2 THEN [Col_2]
    ...
    WHEN 10 THEN [Col_10]
END
Whether this is a good idea is another question entirely. You should use better names than Col_1, Col_2, etc.
You could also use a string substitution method, as suggested by others. However, that is an option of last resort, because it can open up your code to SQL injection attacks.
Sounds like a bad, denormalized design to me.
I think a better one would have the table as parent, with rows that contain a foreign key to a separate child table that contains ten rows, one for each of those columns you have now. Let the parent table set the foreign key according to that magic value when the row is inserted or updated in the parent table.
If the child table is fairly static, this will work.
Since I don't have enough details, I can't give code. Instead, I'll explain.
Declare a string variable, something like:
declare @sql varchar(5000)
Set that variable to be the completed SQL string you want (as a string, not actually querying... so you embed the column name you want using string concatenation).
Then call: exec(@sql)
All set.
I assume you are running purely within Transact-SQL. What you'll need to do is dynamically create the SQL statement with your variable as the column name and use the EXECUTE command to run it. For example:
EXECUTE('select ' + @myColumn + ' from MyTable')
You can do it with a T-SQl CASE statement:
SELECT 'The result' =
CASE
WHEN choice = 1 THEN col1
WHEN choice = 2 THEN col2
...
END
FROM sometable
IMHO, Joel Coehoorn's case statement is probably the best idea
... but if you really have to use dynamic SQL, you can do it with sp_executeSQL()
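A hedged sketch of the sp_executesql route; QUOTENAME guards the concatenated column name, and the row filter stays a real parameter (the table and variable names are illustrative):

DECLARE @MyVariable INT = 3,   -- which of the ten columns to read
        @RowId INT = 42,       -- which row to read (illustrative)
        @sql NVARCHAR(MAX)

SET @sql = N'SELECT ' + QUOTENAME(N'Col_' + CAST(@MyVariable AS NVARCHAR(10)))
         + N' FROM MyTable WHERE ID = @id'

EXEC sp_executesql @sql, N'@id INT', @id = @RowId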
I have no idea what platform you are using but you can use Dynamic LINQ pretty easily to do this.
var query = context.Table
.Where( t => t.Id == row_id )
.Select( "Col_" + column_id );
IEnumerator enumerator = query.GetEnumerator();
enumerator.MoveNext();
object columnValue = enumerator.Current;
Presumably, you'll know which actual type to cast this to depending on the column. The nice thing about this is you get the parameterized query for free, protecting you against SQL injection attacks.
This isn't something you should ever need to do if your database is correctly designed. I'd revisit the design of that element of the schema to remove the need to do this.

Conditional Joins - Dynamic SQL

The DBA here at work is trying to turn my straightforward stored procs into a dynamic SQL monstrosity. Admittedly, my stored procedure might not be as fast as they'd like, but I can't help but believe there's an adequate way to do what is basically a conditional join.
Here's an example of my stored proc:
SELECT *
FROM table
WHERE
(
    @Filter IS NULL OR table.FilterField IN
    (SELECT Value FROM dbo.udfGetTableFromStringList(@Filter, ','))
)
The UDF turns a comma delimited list of filters (for example, bank names) into a table.
Obviously, having the filter condition in the where clause isn't ideal. Any suggestions for a better way to conditionally join based on a stored proc parameter are welcome. Outside of that, does anyone have any suggestions for or against the dynamic SQL approach?
Thanks
You could INNER JOIN on the table returned from the UDF instead of using it in an IN clause
Your UDF might be something like
CREATE FUNCTION [dbo].[csl_to_table] (@list varchar(8000))
RETURNS @list_table TABLE ([id] INT)
AS
BEGIN
    DECLARE @index INT,
            @start_index INT,
            @id INT
    SELECT @index = 1
    SELECT @start_index = 1
    WHILE @index <= DATALENGTH(@list)
    BEGIN
        IF SUBSTRING(@list, @index, 1) = ','
        BEGIN
            SELECT @id = CAST(SUBSTRING(@list, @start_index, @index - @start_index) AS INT)
            INSERT @list_table ([id]) VALUES (@id)
            SELECT @start_index = @index + 1
        END
        SELECT @index = @index + 1
    END
    SELECT @id = CAST(SUBSTRING(@list, @start_index, @index - @start_index) AS INT)
    INSERT @list_table ([id]) VALUES (@id)
    RETURN
END
and then INNER JOIN on the ids in the returned table. This UDF assumes that you're passing in INTs in your comma-separated list.
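Usage would then look something like this (FilterField is the column from the original query):

SELECT t.*
FROM [table] t
INNER JOIN dbo.csl_to_table(@Filter) f
    ON t.FilterField = f.id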
EDIT:
In order to handle a null or no value being passed in for @filter, the most straightforward way that I can see would be to execute a different query within the sproc based on the @filter value. I'm not certain how this affects the cached execution plan (will update if someone can confirm), or if the end result would be faster than your original sproc; I think the answer here lies in testing.
Looks like the rewrite of the code is being addressed in another answer, but a good argument against dynamic SQL in a stored procedure is that it breaks the ownership chain.
That is, when you call a stored procedure normally, it executes under the permissions of the stored procedure owner, EXCEPT when executing dynamic SQL with the EXECUTE command: for the context of the dynamic SQL, it reverts back to the permissions of the caller, which may be undesirable depending on your security model.
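If dynamic SQL turns out to be unavoidable, one mitigation is the EXECUTE AS clause, which pins the execution context for the whole procedure body, dynamic SQL included (a sketch only; review the security implications for your own model):

-- Everything in the body, including sp_executesql, runs as the proc's owner
CREATE PROCEDURE dbo.usp_Search
    @Filter VARCHAR(8000)
WITH EXECUTE AS OWNER
AS
BEGIN
    DECLARE @sql NVARCHAR(MAX) = N'SELECT * FROM dbo.SomeTable'  -- illustrative
    EXEC sp_executesql @sql
END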
In the end, you are probably better off compromising and rewriting it to address the concerns of the DBA while avoiding dynamic SQL.
I am not sure I understand your aversion to dynamic SQL. Perhaps it is that your UDF has nicely abstracted away some of the messiness of the problem, and you feel dynamic SQL will bring that back. Well, consider that most if not all DAL or ORM tools rely extensively on dynamic SQL, and I think your problem could be restated as "how can I nicely abstract away the messiness of dynamic SQL".
For my part, dynamic SQL gives me exactly the query I want, and subsequently the performance and behavior I am looking for.
I don't see anything wrong with your approach. Rewriting it to use dynamic SQL to execute two different queries based on whether #Filter is null seems silly to me, honestly.
The only potential downside I can see of what you have is that it could cause some difficulty in determining a good execution plan. But if the performance is good enough as it is, there's no reason to change it.
No matter what you do (and the answers here all have good points), be sure to compare the performance and execution plans of each option.
Sometimes, hand optimization is simply pointless if it impacts your code maintainability and really produces no difference in how the code executes.
I would first simply look at changing the IN to a simple LEFT JOIN with NULL check (this doesn't get rid of your udf, but it should only get called once):
SELECT *
FROM table
LEFT JOIN dbo.udfGetTableFromStringList(@Filter, ',') AS filter
    ON table.FilterField = filter.Value
WHERE @Filter IS NULL
    OR filter.Value IS NOT NULL
It appears that you are trying to write a single query to deal with two scenarios:
1. @filter = "x,y,z"
2. @filter IS NULL
To optimise scenario 1, I would INNER JOIN on the UDF, rather than use an IN clause...
SELECT * FROM table
INNER JOIN dbo.udfGetTableFromStringList(@Filter, ',') AS filter
    ON table.FilterField = filter.Value
To optimise for scenario 2, I would NOT try to adapt the existing query; instead, I would deliberately keep those cases separate, using either an IF statement, or a UNION that simulates the IF with a WHERE clause...
TSQL IF
IF (@filter IS NULL)
    SELECT * FROM table
ELSE
    SELECT * FROM table
    INNER JOIN dbo.udfGetTableFromStringList(@Filter, ',') AS filter
        ON table.FilterField = filter.Value
UNION to Simulate IF
SELECT * FROM table
INNER JOIN dbo.udfGetTableFromStringList(@Filter, ',') AS filter
    ON table.FilterField = filter.Value
UNION ALL
SELECT * FROM table WHERE @filter IS NULL
The advantage of such designs is that each case is simple, and determining which case applies is itself simple. Combining the two into a single query, however, leads to compromises such as LEFT JOINs, and so introduces a significant performance loss to each.