Wrap SQL CONTAINS as an expression? - sql

I have a question. I working on one site on Asp.Net, which uses some ORM. I need to use a couple of FullTextSearch functions, such as Contains. But when I try to generate it with that ORM, it generates such SQL code
SELECT
[Extent1].[ID] AS [ID],
[Extent1].[Name] AS [Name]
FROM [dbo].[SomeTable] AS [Extent1]
WHERE (Contains([Extent1].[Name], N'qq')) = 1
SQL can't parse it, because Contains doesn't return bit value. And unfortunately I can't modify SQL query generation process, but I can modify statements in it.
My question is - is it possible to wrap call of CONTAINS function to something else? I tried to create another function, that will SELECT with contains, but it requires specific table\column objects, and I don't want to do one function for each table..
EDIT
I can modify result type for that function in ORM. In previous sample result type is Bit. I can change it to int,nvarchar,etc. But as I understood there is no Boolean type in SQL, and I can't specify it.

Can't you put this in a stored procedure, and tell your ORM to call the stored procedure? Then you don't have to worry about the fact that your ORM only understands a subset of valid T-SQL.
I don't know that I believe the argument that requiring new stored procedures is a blocker. If you have to write a new CONTAINS expression in your ORM code, how much different is it to wrap that expression in a CREATE PROCEDURE statement in a different window? If you want to do this purely in ORM, then you're going to have to put pressure on the vendor to pick up the pace and start getting more complete coverage of the language they should fully support.

Related

suggestions on how to build query for search with multiple options

We are building a search form for users to search our database, the form will contain mulitlpe fields which are all optional. Fields including:
company name
company code
business type (service or product)
Product or Service
Product or Service subtype --> this will depend on what is chosen in #4
Basically the users can fill all or just some of the fields and submit the form. How best should we handle the sql for this? Is it best to use dynamic sql, build out the where clause in our webpage and then forward that to the sql stored procedure to use as it's where clause? Or is it better to pass all the values to the stored procedure and let it build the where clause dynamically. Also is dynamic sql the only way? I wasn't sure if using EXECUTE(#SQLStatement) is a good practice.
What I have done in the past is when a search option is not being usesd pass in a null for its value. Then in your select statement you would do something like
WHERE i.companyname = COALESCE(#CompanyName,i.companyname)
AND i.companycode = COALESCE(#CompanyCode,i.companycode)
What happens above is that if #CompanyName is null i.companyname will be compared to itself resulting in a match. If #CompanyName has a value it will compare i.companyname against that value.
I have used this way with 15 optional filters in a database with 15,000 rows and it has performed relatively well to date
More on the COALESCE operator
Dynamic SQL isn't the only way, it'd be better if you can avoid it with methods like: http://www.sommarskog.se/dyn-search.html
If you can't get the performance from the above method and go for dynamic SQL, do not allow the web-page to construct the SQL and execute it - you will end up getting SQL injected. Also avoid text strings being passed in, as sanitising them is very difficult. Ideally have the web page pass down parameters that are numbers only (IDs and such) for you to create the dynamic SQL from.
If you do decide to use dynamic SQL be sure to read all this: http://www.sommarskog.se/dynamic_sql.html

Dynamic Pivot Query without storing query as String

I am fully familiar with the following method in the link for performing a dynamic pivot query. Is there an alternative method to perform a dynamic pivot without storing the Query as a String and inserting a column string inside it?
http://www.simple-talk.com/community/blogs/andras/archive/2007/09/14/37265.aspx
Short answer: no.
Long answer:
Well, that's still no. But I will try to explain why. As of today, when you run the query, the DB engine demands to be aware of the result set structure (number of columns, column names, data types, etc) that the query will return. Therefore, you have to define the structure of the result set when you ask data from DB. Think about it: have you ever ran a query where you would not know the result set structure beforehand?
That also applies even when you do select *, which is just a sugar syntax. At the end, the returning structure is "all columns in such table(s)".
By assembling a string, you dynamically generate the structure that you desire, before asking for the result set. That's why it works.
Finally, you should be aware that assembling the string dynamically can theoretically and potentially (although not probable) get you a result set with infinite columns. Of course, that's not possible and it will fail, but I'm sure you understood the implications.
Update
I found this, which reinforces the reasons why it does not work.
Here:
SSIS relies on knowing the metadata of the dataflow in advance and a
dynamic pivot (which is what you are after) is not compatible with
that.
I'll keep looking and adding here.

Move SELECT to SQL Server side

I have an SQLCLR trigger. It contains a large and messy SELECT inside, with parts like:
(CASE WHEN EXISTS(SELECT * FROM INSERTED I WHERE I.ID = R.ID)
THEN '1' ELSE '0' END) AS IsUpdated -- Is selected row just added?
as well as JOINs etc. I like to have the result as a single table with all included.
Question 1. Can I move this SELECT to SQL Server side? If yes, how to do this?
Saying "move", I mean to create a stored procedure or something else that can be executed before reading dataset in while cycle.
The 2 following questions make sense only if answer is "yes".
Why do I want to move SELECT? First off, I don't like mixing SQL with C# code. At second, I suppose that server-side queries run faster, since the server have more chances to cache them.
Question 2. Am I right? Is it some sort of optimizing?
Also, the SELECT contains constant strings, but they are localizable. For instance,
WHERE R.Status = "Enabled"
"Enabled" should be changed for French, German etc. So, I want to write 2 static methods -- OnCreate and OnDestroy -- then mark them as stored procedures. When registering/unregistering my assembly on server side, just call them respectively. In OnCreate format the SELECT string, replacing {0}, {1}... with required values from the assembly resources. Then I can localize resources only, not every script.
Question 3. Is it good idea? Is there an existing attribute to mark methods to be executed by SQL Server automatically after (un)registartion an assembly?
Regards,
Well, the SQL-CLR trigger will also execute on the server, inside the server process - so that's server-side as well, no benefit there.
But I agree - triggers ought to be written in T-SQL whenever possible - no real big benefit in having triggers in C#.... can you show the the whole trigger code?? Unless it contains really odd balls stuff, it should be pretty easy to convert to T-SQL.
I don't see how you could "move" the SELECT to the SQL side and keep the rest of the code in C# - either your trigger is in T-SQL (my preference), or then it is in C#/SQL-CLR - I don't think there's any way to "mix and match".
To start with, you probably do not need to do that type of subquery inside of whatever query you are doing. The INSERTED table only has rows that have been updated (or inserted but we can assume this is an UPDATE Trigger based on the comment in your code). So you can either INNER JOIN and you will only match rows in the Table with the alias of "R" or you can LEFT JOIN and you can tell which rows in R have been updated as the ones showing NULL for all columns were not updated.
Question 1) As marc_s said below, the Trigger executes in the context of the database. But it goes beyond that. ALL database related code, including SQLCLR executes in the database. There is no client-side here. This is the issue that most people have with SQLCLR: it runs inside of the SQL Server context. And regarding wanting to call a Stored Proc from the Trigger: it can be done BUT the INSERTED and DELETED tables only exist within the context of the Trigger itself.
Question 2) It appears that this question should have started with the words "Also, the SELECT". There are two things to consider here. First, when testing for "Status" values (or any Lookup values) since this is not displayed to the user you should be using numeric values. A "status" of "Enabled" should be something like "1" so that the language is not relevant. A side benefit is that not only will storing Status values as numbers take up a lot less space, but they also compare much faster. Second is that any text that is to be displayed to the user that needs to be sensitive to language differences should be in a table so that you can pass in a LanguageId or LocaleId to get the appropriate French, German, etc. strings to display. You can set the LocaleId of the user or system in general in another table.
Question 3) If by "registration" you mean that the Assembly is either CREATED or DROPPED, then you can trap those events via DDL Triggers. You can look here for some basics:
http://msdn.microsoft.com/en-us/library/ms175941(v=SQL.90).aspx
But CREATE ASSEMBLY and DROP ASSEMBLY are events that are trappable.
If you are speaking of when Assemblies are loaded and unloaded from memory, then I do not know of a way to trap that.
Question 1.
http://www.sqlteam.com/article/stored-procedures-returning-data
Question 3.
It looks like there are no appropriate attributes, at least in Microsoft.SqlServer.Server Namespace.

Consolidated: SQL Pass comma separated values in SP for filtering

I'm here to share a consolidated analysis for the following scenario:
I've an 'Item' table and I've a search SP for it. I want to be able to search for multiple ItemCodes like:
- Table structure : Item(Id INT, ItemCode nvarchar(20))
- Filter query format: SELECT * FROM Item WHERE ItemCode IN ('xx','yy','zz')
I want to do this dynamically using stored procedure. I'll pass an #ItemCodes parameter which will have comma(',') separated values and the search shud be performed as above.
Well, I've already visited lot of posts\forums and here're some threads:
Dynamic SQL might be a least complex way but I don't want to consider it because of the parameters like performance,security (SQL-Injection, etc..)..
Also other approaches like XML, etc.. if they make things complex I can't use them.
And finally, no extra temp-table JOIN kind of performance hitting tricks please.
I've to manage the performance as well as the complexity.
T-SQL stored procedure that accepts multiple Id values
Passing an "in" list via stored procedure
I've reviewed the above two posts and gone thru some solutions provided, here're some limitations:
http://www.sommarskog.se/arrays-in-sql-2005.html
This will require me to 'declare' the parameter-type while passing it to the SP, it distorts the abstraction (I don't set type in any of my parameters because each of them is treated in a generic way)
http://www.sqlteam.com/article/sql-server-2008-table-valued-parameters
This is a structured approach but it increases complexity, required DB-structure level changes and its not abstract as above.
http://madprops.org/blog/splitting-text-into-words-in-sql-revisited/
Well, this seems to match-up with my old solutions. Here's what I did in the past -
I created an SQL function : [GetTableFromValues] (returns a temp table populated each item (one per row) from the comma separated #ItemCodes)
And, here's how I use it in my WHERE caluse filter in SP -
SELECT * FROM Item WHERE ItemCode in (SELECT * FROM[dbo].[GetTableFromValues](#ItemCodes))
This one is reusable and looks simple and short (comparatively of course). Anything I've missed or any expert with a better solution (obviously 'within' the limitations of the above mentioned points).
Thank you.
I think using dynamic T-SQL will be pragmatic in this scenario. If you are careful with the design, dynamic sql works like a charm. I have leveraged it in countless projects when it was the right fit. With that said let me address your two main concerns - performance and sql injection.
With regards to performance, read T-SQL reference on parameterized dynamic sql and sp_executesql (instead of sp_execute). A combination of parameterized sql and using sp_executesql will get you out of the woods on performance by ensuring that query plans are reused and sp_recompiles avoided! I have used dynamic sql even in real-time contexts and it works like a charm with these two items taken care of. For your satisfaction you can run a loop of million or so calls to the sp with and without the two optimizations, and use sql profiler to track sp_recompile events.
Now, about SQL-injection. This will be an issue if you use an incorrect user widget such as a textbox to allow the user to input the item codes. In that scenario it is possible that a hacker may write select statements and try to extract information about your system. You can write code to prevent this but I think going down that route is a trap. Instead consider using an appropriate user widget such as a listbox (depending on your frontend platform) that allows multiple selection. In this case the user will just select from a list of "presented items" and your code will generate the string containing the corresponding item codes. Basically you do not pass user text to the dynamic sql sp! You can even use slicker JQuery based selection widgets but the bottom line is that the user does not get to type any unacceptable text that hits your data layer.
Next, you just need a simple stored procedure on the database that takes a param for the itemcodes (for e.g. '''xyz''','''abc'''). Internally it should use sp_executesql with a parameterized dynamic query.
I hope this helps.
-Tabrez

Sql Optimization: Xml or Delimited String

This is hopefully just a simple question involving performance optimizations when it comes to queries in Sql 2008.
I've worked for companies that use Stored Procs a lot for their ETL processes as well as some of their websites. I've seen the scenario where they need to retrieve specific records based on a finite set of key values. I've seen it handled in 3 different ways, illustrated via pseudo-code below.
Dynamic Sql that concatinates a string and executes it.
EXEC('SELECT * FROM TableX WHERE xId IN (' + #Parameter + ')'
Using a user defined function to split a delimited string into a table
SELECT * FROM TableY INNER JOIN SPLIT(#Parameter) ON yID = splitId
USING XML as the Parameter instead of a delimited varchar value
SELECT * FROM TableZ JOIN #Parameter.Nodes(xpath) AS x (y) ON ...
While I know creating the dynamic sql in the first snippet is a bad idea for a large number of reasons, my curiosity comes from the last 2 examples. Is it more proficient to do the due diligence in my code to pass such lists via XML as in snippet 3 or is it better to just delimit the values and use an udf to take care of it?
There is now a 4th option - table valued parameters, whereby you can actually pass a table of values in to a sproc as a parameter and then use that as you would normally a table variable. I'd be preferring this approach over the XML (or CSV parsing approach)
I can't quote performance figures between all the different approaches, but that's one I'd be trying - I'd recommend doing some real performance tests on them.
Edit:
A little more on TVPs. In order to pass the values in to your sproc, you just define a SqlParameter (SqlDbType.Structured) - the value of this can be set to any IEnumerable, DataTable or DbDataReader source. So presumably, you already have the list of values in a list/array of some sort - you don't need to do anything to transform it into XML or CSV.
I think this also makes the sproc clearer, simpler and more maintainable, providing a more natural way to achieve the end result. One of the main points is that SQL performs best at set based/not looping/non string manipulation activities.
That's not to say it will perform great with a large set of values passed in. But with smaller sets (up to ~1000) it should be fine.
UDF invocation is a little bit more costly than splitting the XML using the built-in function.
However, this only needs to be done once per query, so the performance difference will be negligible.