I often write SQL scripts that repeat the same several lines in the WHERE clause to exclude records.
For instance:
SELECT *
FROM tblAll
WHERE Field1 NOT LIKE '%AA%'
AND Field1 NOT LIKE '%BB%'
AND Field1 NOT LIKE '%00'
It would be less error-prone if I didn't have to add these lines each time. How can I create a function that would let me do this instead?
SELECT *
FROM tblAll
WHERE Field1 NOT LIKE Function
This is what I have currently:
CREATE FUNCTION [dbo].[ExcludeField1]
(
@Field1 varchar(max),
@Date datetime
)
RETURNS VARCHAR(15) AS
BEGIN
DECLARE @Field1 varchar(15)
DECLARE @Date datetime
SELECT TOP 1 @Field1=Field1 ,@Date=Date
FROM tblAll A
WHERE A.Date=@Date
AND Field1 NOT LIKE '%AA%'
AND Field1 NOT LIKE 'BB%'
AND Field1 NOT LIKE '%00%'
ORDER BY LEN(A.Field1) ASC ;
RETURN @Field1
END
GO
But I feel I'm missing something vital. The function only provides what I consider to be valid values for Field1. So my future scripts should be:
Field1 = @Field1
What's not right?
What you are trying to do may not be the best approach.
You are trading query speed for code that is easier to write.
If you have large tables, this could be disastrous. If you have small tables, it's still cumbersome.
You may find it better (and more readable) to create a temp table that runs just your repetitive filter first, and then use that smaller temp table in your other code. This makes it more readable for future people finding your code AND reduces the size (and complexity) of your following code. Depending on the type of SQL you are running, you may even be able to put your temp table code into a sub proc that can be called before each SQL query that needs it.
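As a rough sketch of the temp table idea, assuming SQL Server and the tblAll/Field1 names from the question (#Filtered is a made-up name, and the leading # is T-SQL's prefix for a session-local temp table):

```sql
-- Run the repetitive exclusions once and keep the result in a temp table.
-- #Filtered is a hypothetical name; tblAll and Field1 come from the question.
SELECT *
INTO #Filtered
FROM tblAll
WHERE Field1 NOT LIKE '%AA%'
  AND Field1 NOT LIKE '%BB%'
  AND Field1 NOT LIKE '%00';

-- Later queries in the same session read the smaller table
-- without repeating the filter.
SELECT *
FROM #Filtered
ORDER BY Field1;

-- Clean up when finished (the table is also dropped when the session ends).
DROP TABLE #Filtered;
```

The filter now lives in one place per script, at the cost of copying the surviving rows into tempdb first.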
Still, you'll get better performance just adding in the extra code to each query that needs it. That way SQL can decide what the fastest way is to grab your data and send it back to you. (Even if the SQL code is much bulkier and you have to repeat the same snippets in lots of different queries.)
However, if you insist on being able to insert code snippets that don't change, you could try building the SQL code in a different language (like Visual Basic, C, or VBA, etc.) where you can create a string of code. Then you can connect to your SQL database, send the generated SQL code over, and have it executed. This gives you the luxury of building all kinds of Frankenstein code. (And could even let you build interactive front ends so users can add all kinds of different parameters.) But beware... there be all kinds of nasty dragons for you to play with in those caves.
Personally, I'd go with temp tables if your goal is to simplify and bring some order to your repetitive code.
Hope that helps a bit. :)
EDIT
Oooops, let's not forget dynamic SQL (if your version of SQL has it):
DECLARE @sql nvarchar(max);
DECLARE @snippet nvarchar(max);
-- The reusable filter lives in one variable
set @snippet=
'Field1 NOT LIKE ''%AA%''
AND Field1 NOT LIKE ''%BB%''
AND Field1 NOT LIKE ''%00''';
set @sql=
'SELECT
Field1
,Field2
FROM tblAll
WHERE '
+ @snippet +
-- leading space keeps ORDER BY separate from the end of the snippet
' ORDER BY
Field1;'
exec(@sql);
This is actually the best way for you to control repetitive code. It allows you to lock it down in one variable that you can insert into larger/many SQL statements. That way, if you need to make future changes to the snippet, you only need to make the change in one place instead of having to hunt down all the queries that have the code.
It also leaves the SQL engine free to build the fastest plan for returning your data.
Remember... function calls can be very expensive for processing. Avoid them if possible.
Related
I'm currently working on a .NET application and want to make it as modular as possible. I've already created a basic SELECT procedure, which returns data by checking inputted parameters on SQL Server side.
I want to create a procedure that parses structured data passed as a string and inserts its contents into the corresponding table in the database.
For example, I have a table as
CREATE TABLE ExampleTable (
id_exampleTable int IDENTITY (1, 1) NOT NULL,
exampleColumn1 nvarchar(200) NOT NULL,
exampleColumn2 int NULL,
exampleColumn3 int NOT NULL,
CONSTRAINT pk_exampleTable PRIMARY KEY ( id_exampleTable )
)
And my procedure starts as
CREATE PROCEDURE InsertDataIntoCorrespondingTable
@dataTable nvarchar(max), --name of Table in my DB
@data nvarchar(max) --normalized string parameter as 'column1, column2, column3, etc.'
AS
BEGIN
IF @dataTable = 'table'
BEGIN
/**Parse this string and execute insert command**/
END
ELSE IF /**Other statements**/
END
TL;DR
So basically, I'm looking for a solution that can help me achieve something like this
EXEC InsertDataIntoCorrespondingTableByID
@dataTable = 'ExampleTable',
@data = '''exampleColumn1'', 2, 3'
Which should be equal to just
INSERT INTO ExampleTable SELECT 'exampleColumn1', 2, 3
Sure, I can push data as INSERT statements (for each and every 14 tables inside DB...), generated inside an app, but I want to conquer T-SQL :)
This might be reasonable (to some degree) on an RDBMS that supports structured data like JSON or XML natively, but doing this the way you are planning is going to cause some real pain-in-the-rear support and, more importantly, a sql injection attack vector. I would leave this to the realm of the web backend server where it belongs.
You are likely going to end up inventing your own structured data markup language and parser to solve this in SQL Server. That's a wheel that doesn't need to be reinvented. If you do end up building this, strongly consider going with JSON to avoid all the issues that structured data inherently brings with it, assuming your version of SQL Server supports JSON parsing/packaging.
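For what it's worth, here is a minimal sketch of the JSON route. It assumes SQL Server 2016 or later (OPENJSON needs database compatibility level 130 or higher); the @json variable and its property names are made up for illustration, and ExampleTable is the table from the question:

```sql
-- Hypothetical payload; in a procedure this would arrive as a parameter.
DECLARE @json nvarchar(max) =
    N'[{"exampleColumn1": "some text", "exampleColumn2": 2, "exampleColumn3": 3}]';

-- Columns are matched by name rather than by position, and the values
-- arrive as typed data instead of being spliced into SQL text.
INSERT INTO ExampleTable (exampleColumn1, exampleColumn2, exampleColumn3)
SELECT exampleColumn1, exampleColumn2, exampleColumn3
FROM OPENJSON(@json)
WITH (
    exampleColumn1 nvarchar(200) '$.exampleColumn1',
    exampleColumn2 int           '$.exampleColumn2',
    exampleColumn3 int           '$.exampleColumn3'
);
```

This still needs one INSERT per table, but nothing in it is dynamic SQL.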
Your front end that packages your data into your SDML is going to have to assume column ordinals, but column ordinal is not something that one should rely on in a database. SQL amateurs often do; I know this from years in the industry dealing with end users who are upset when a new column is introduced in a position they don't want. Adding a column to a table shouldn't break an application. If it does, that application has bad code.
Regarding the sql injection attack vector, your SP code is going to get ugly. You'll need to parse out each item in @data into a variable of its own in order to properly parameterize your dynamic sql that is being built. See here under the "working with parameters" section for what that will look like. Failure to add this to your SP code means that values passed in that @data SDML could become executable SQL instead of literals and that would be very bad. This is not easy to solve in SP language. Where it IS easy to solve though is in the backend server code. Every database library on the planet supports parameterized query building/execution natively.
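To give an idea of what parameterized dynamic SQL looks like once the values have been parsed out, here is a sketch using sp_executesql against the question's ExampleTable. The variable and parameter names (@sql, @c1, @c2, @c3) are invented for the example, and it assumes the parsing step has already happened:

```sql
-- The statement text contains only parameter placeholders, never the values.
DECLARE @sql nvarchar(max) =
    N'INSERT INTO ExampleTable (exampleColumn1, exampleColumn2, exampleColumn3)
      VALUES (@c1, @c2, @c3);';

-- The values travel separately as typed parameters, so even a hostile
-- string is stored as a literal instead of being executed.
EXEC sp_executesql
    @sql,
    N'@c1 nvarchar(200), @c2 int, @c3 int',
    @c1 = N'x''); DROP TABLE ExampleTable; --',
    @c2 = 2,
    @c3 = 3;
```

The hard part in a generic procedure is that the parameter list itself would have to be generated per table, which is where the "dynamic on top of dynamic" problem comes from.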
Once you have this built you will be dynamically generating an INSERT statement and dynamically generating variables or an array or some data structure to pass in parameters to the INSERT statement to avoid sql injection attacks. It's going to be dynamic, on top of dynamic, on top of dynamic which leads to:
From a support context, imagine that your application just totally throws up one day. You have to dive in to investigate. You track down the SDML that your front end created that caused the failure, and you open up your SP code to troubleshoot. Imagine what this code ends up looking like:
It has to determine if the table exists
It has to parse the SDML to get each literal
It has to read DB metadata to get the column list
It has to dynamically write the insert statement, listing the columns from metadata and dynamically creating sql parameters for the VALUES() list.
It has to execute sending a dynamic number of variables into the dynamically generated sql.
My support staff would hang me out to dry if they had to deal with that, and I'm the one paying them.
All of this is solved by using a proper backend to handle communication, deeper validation, sql parameter binding, error catching and handling, and all the other things that backend servers are meant to do.
I believe that your back end web server should be VERY aware of the underlying data model. It should be the connection between your view, your data, and your model. Leave the database to the things it's good at (reading and writing data). Leave your front end to the things that it's good at (presenting a UI for the end user).
I suppose you could do something like this (may need a little extra work)
declare @columns nvarchar(max);
-- Skip identity columns so the column list matches the supplied values
select @columns = string_agg(name, ', ') WITHIN GROUP ( ORDER BY column_id )
from sys.all_columns
where object_id = object_id(@dataTable)
and is_identity = 0;
-- sp_executesql requires an nvarchar statement
declare @sql nvarchar(max) = concat('INSERT INTO ', @dataTable, ' (', @columns, ') VALUES (', @data, ')');
exec sp_executesql @sql;
But please don't. If this were a good idea, there would be tons of examples of how to do it. There aren't so it's probably not a good idea.
There are, however, tons of examples of using ORMs or auto-generated code instead, because that way your code is maintainable, debuggable and performant.
I have about half a dozen generic, but fairly complex stored procedures and functions that I would like to use in a more generic fashion.
Ideally I'd like to be able to pass the table name as a parameter to the procedure, as currently it is hard coded.
The research I have done suggests I need to convert all existing SQL within my procedures to use dynamic SQL in order to splice in the dynamic table name from the parameter. However, I was wondering if there is an easier way, by referencing the table in another way?
For example:
SELECT * FROM @MyTable WHERE...
If so, how do I set the @MyTable variable from the table name?
I am using SQL Server 2005.
Dynamic SQL is the only way to do this, but I'd reconsider the architecture of your application if it requires this. SQL isn't very good at "generalized" code. It works best when it's designed and coded to do individual tasks.
Selecting from TableA is not the same as selecting from TableB, even if the select statements look the same. There may be different indexes, different table sizes, data distribution, etc.
You could generate your individual stored procedures, which is a common approach. Have a code generator that creates the various select stored procedures for the tables that you need. Each table would have its own SP(s), which you could then link into your application.
I've written these kinds of generators in T-SQL, but you could easily do it with most programming languages. It's pretty basic stuff.
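As a very small sketch of what such a generator might look like in T-SQL: the query below only produces the CREATE PROCEDURE text, one row per user table, which you would then review and run. The "Select_" naming scheme is my own assumption, and it ignores name clashes between schemas:

```sql
-- Emits one CREATE PROCEDURE script per user table in the current database.
-- "Select_" + table name is a hypothetical naming convention.
SELECT N'CREATE PROCEDURE dbo.' + QUOTENAME(N'Select_' + t.name) + N'
AS
    SELECT * FROM ' + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name) + N';
GO' AS GeneratedScript
FROM sys.tables AS t
ORDER BY t.name;
```

A real generator would emit the specific column lists and WHERE clauses each procedure needs rather than SELECT *.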
Just to add one more thing since Scott E brought up ORMs... you should also be able to use these stored procedures with most sophisticated ORMs.
You'd have to use dynamic sql. But don't do that! You're better off using an ORM.
EXEC(N'SELECT * from ' + @MyTable + N' WHERE ... ')
You can use dynamic Sql, but check that the object exists first unless you can 100% trust the source of that parameter. It's likely that there will be a performance hit as SQL server won't be able to re-use the same execution plan for different parameters.
IF OBJECT_ID(@tablename, N'U') IS NOT NULL
BEGIN
--dynamic sql
END
ALTER procedure [dbo].[test](@table_name varchar(max))
AS
BEGIN
declare @tablename varchar(max)=@table_name;
declare @statement varchar(max);
set @statement = 'Select * from ' + @tablename;
execute (@statement);
END
I just ran into a strange thing...there is some code on our site that is taking a giant SQL statement, modifying it in code by doing some search and replace based on some user values, and then passing it on to SQL Server as a query.
I was thinking that this would be cleaner as a parameterized query to a stored proc, with the user values as the parameters, but when I looked more closely I saw why they might be doing it: the table that they are selecting from depends on those user values.
For instance, in one case if the values were ("FOO", "BAR") the query would end up being something like "SELECT * FROM FOO_BAR"
Is there an easy and clear way to do this? Everything I'm trying seems inelegant.
EDIT: I could, of course, dynamically generate the sql in the stored proc, and exec that (bleh), but at that point I'm wondering if I've gained anything.
EDIT2: Refactoring the table names in some intelligent way, say having them all in one table with the different names as a new column would be a nice way to solve all of this, which several people have pointed out directly, or alluded to. Sadly, it is not an option in this case.
First of all, you should NEVER compose SQL commands on a client app like this; that's what SQL injection is. (It's OK for an admin tool that has no privileges of its own, but not for a shared-use application.)
Secondly, yes, a parametrized call to a Stored procedure is both cleaner and safer.
However, as you will need to use dynamic SQL to do this, you still do not want to include the passed string in the text of the executed query. Instead, you want to use the passed string to look up the names of the actual tables that the user should be allowed to query in this way.
Here's a simple naive example:
CREATE PROC spCountAnyTableRows( @PassedTableName as NVarchar(255) ) AS
-- Counts the number of rows from any non-system Table, *SAFELY*
BEGIN
DECLARE @ActualTableName AS NVarchar(255)
SELECT @ActualTableName = QUOTENAME( TABLE_NAME )
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = @PassedTableName
DECLARE @sql AS NVARCHAR(MAX)
SELECT @sql = 'SELECT COUNT(*) FROM ' + @ActualTableName + ';'
EXEC(@sql)
END
Some have fairly asked why this is safer. Hopefully, little Bobby Tables can make this clearer:
[image not preserved: the "Little Bobby Tables" comic]
Answers to more questions:
QUOTENAME alone is not guaranteed to be safe. MS encourages us to use it, but they have not given a guarantee that it cannot be out-foxed by hackers. FYI, real Security is all about the guarantees. The table lookup with QUOTENAME, is another story, it's unbreakable.
QUOTENAME is not strictly necessary for this example; the lookup translation on INFORMATION_SCHEMA alone is normally sufficient. QUOTENAME is in here because it is good form in security to include a complete and correct solution. QUOTENAME in here is actually protecting against a distinct, but similar, potential problem known as latent injection.
I should note that you can do the same thing with dynamic Column Names and the INFORMATION_SCHEMA.COLUMNS table.
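A sketch of the column-name version, following the same lookup pattern. The variable names and the sample values are hypothetical (in a procedure they would be parameters), and like the naive example above it ignores schemas:

```sql
-- Hypothetical inputs; FOO_BAR is from the question, SomeColumn is made up.
DECLARE @PassedTableName  nvarchar(128);
DECLARE @PassedColumnName nvarchar(128);
SET @PassedTableName  = N'FOO_BAR';
SET @PassedColumnName = N'SomeColumn';

DECLARE @ActualTable  nvarchar(258);
DECLARE @ActualColumn nvarchar(258);

-- Translate the passed strings into names that really exist in the catalog.
SELECT @ActualTable  = QUOTENAME(TABLE_NAME),
       @ActualColumn = QUOTENAME(COLUMN_NAME)
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME  = @PassedTableName
  AND COLUMN_NAME = @PassedColumnName;

-- Only run if the lookup matched; otherwise both variables stay NULL.
IF @ActualColumn IS NOT NULL
BEGIN
    DECLARE @sql nvarchar(max);
    SET @sql = N'SELECT ' + @ActualColumn + N' FROM ' + @ActualTable + N';';
    EXEC (@sql);
END
```

Only names read back from the catalog ever reach the query text; the passed strings are used purely as comparison values.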
You can also bypass the need for stored procedures by using a parameterized SQL query instead (see here: https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlcommand.parameters?view=netframework-4.8). But I think that stored procedures provide a more manageable and less error-prone security facility for cases like this.
(Un)fortunately there's no way of doing this: you can't use a table name passed as a parameter to stored code other than for dynamic SQL generation. When it comes to deciding where to generate SQL code, I prefer application code rather than stored code. Application code is usually faster and easier to maintain.
In case you don't like the solution you're working with, I'd suggest a deeper redesign (i.e. change the schema/application logic so you no longer have to pass table name as a parameter anywhere).
I would argue against dynamically generating the SQL in the stored proc; that'll get you into trouble and could cause injection vulnerability.
Instead, I would analyze all of the tables that could be affected by the query and create some sort of enumeration that would determine which table to use for the query.
Sounds like you'd be better off with an ORM solution.
I cringe when I see dynamic sql in a stored procedure.
One thing you can consider is to make a case statement that contains the same SQL command you want, once for each valid table, then pass as a string the table name into this procedure and have the case choose which command to run.
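Roughly what I mean, sketched in T-SQL. Since CASE in T-SQL is an expression and cannot choose between statements, the branching is done with IF/ELSE. FOO_BAR is from the question; FOO_BAZ and the procedure name are hypothetical:

```sql
CREATE PROCEDURE dbo.SelectFromKnownTable
    @TableName nvarchar(128)
AS
BEGIN
    SET NOCOUNT ON;

    -- The parameter is only ever compared, never concatenated into SQL.
    IF @TableName = N'FOO_BAR'
        SELECT * FROM FOO_BAR;
    ELSE IF @TableName = N'FOO_BAZ'
        SELECT * FROM FOO_BAZ;
    ELSE
        RAISERROR(N'Unknown table name.', 16, 1);
END
```

Every statement is static, so nothing the caller passes can change the SQL that runs; the price is one branch per valid table.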
By the way, as a security person, the suggestion above telling you to select from the system tables in order to make sure you have a valid table seems like a wasted operation to me. If someone can inject past the QUOTENAME(), then the injection would work on the system table just as well as on the underlying table. The only thing this helps with is ensuring it is a valid table name, and I think the suggestion above is a better approach to that since you are not using QUOTENAME() at all.
Depending on whether the set of columns in those tables is the same or different, I'd approach it in two ways in the longer term:
1) if they are the same, why not create a new column that would be used as a selector, whose value is derived from the user-supplied parameters? (Is it a performance optimization?)
2) if they are different, chances are that the handling of them is also different. As such, splitting the select/handle code into separate blocks and then calling them separately seems like the most modular approach to me. You will repeat the "select * from" part, but in this scenario the set of tables is hopefully finite.
Allowing the calling code to supply two arbitrary parts of the table name to do a select from feels very dangerous.
I don't know the reason why you have the data spread over several tables, but it sounds like you are breaking one of the fundamentals. The data should be in the tables, not as table names.
If the tables have more or less the same layout, consider if it would be best to put the data in a single table instead. That would solve your problem with the dynamic query, and it would make the database layout more flexible.
Instead of Querying the tables based on user input values, you can pick the procedure instead.
that is to say
1. Create a procedure FOO_BAR_prc and inside that you put the query 'select * from foo_bar' , that way the query will be precompiled by the database.
2. Then based on the user input now execute the correct procedure from your application code.
Since you have around 50 tables, this might not be a feasible solution though as it would require lot of work on your part.
In fact, I wanted to know how to pass a table name to create a table in a stored procedure. By reading some of the answers and attempting some modifications on my end, I was finally able to create a table with the name passed as a parameter. Here is the stored procedure so others can check it for errors.
USE [Database Name]
GO
/****** Object: StoredProcedure [dbo].[sp_CreateDynamicTable] Script Date: 06/20/2015 16:56:25 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE [dbo].[sp_CreateDynamicTable]
@tName varchar(255)
AS
BEGIN
SET NOCOUNT ON;
DECLARE @SQL nvarchar(max)
-- QUOTENAME adds the brackets and escapes any ] inside the name
SET @SQL = N'CREATE TABLE [DBO].' + QUOTENAME(@tName) + N' (DocID nvarchar(10) null);'
EXECUTE sp_executesql @SQL
END
@RBarry Young
You don't need to add the brackets to @ActualTableName in the query string because they are already included in the result from the query on INFORMATION_SCHEMA.TABLES. Otherwise, there will be error(s) when executed.
CREATE PROC spCountAnyTableRows( @PassedTableName as NVarchar(255) ) AS
-- Counts the number of rows from any non-system Table, SAFELY
BEGIN
DECLARE @ActualTableName AS NVarchar(255)
SELECT @ActualTableName = QUOTENAME( TABLE_NAME )
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_NAME = @PassedTableName
DECLARE @sql AS NVARCHAR(MAX)
--SELECT @sql = 'SELECT COUNT(*) FROM [' + @ActualTableName + '];'
-- changed to this
SELECT @sql = 'SELECT COUNT(*) FROM ' + @ActualTableName + ';'
EXEC(@sql)
END
I would avoid dynamic SQL at all costs.
Isn't the most elegant solution but does the job perfectly.
PROCEDURE TABLE_AS_PARAMTER (
    p_table_name IN VARCHAR2
) AS
BEGIN
    -- One static statement per allowed table; no dynamic SQL involved
    CASE p_table_name
        WHEN 'TABLE1' THEN
            UPDATE TABLE1
            SET COLUMN1 = 1
            WHERE ID = 1;
        WHEN 'TABLE2' THEN
            UPDATE TABLE2
            SET COLUMN1 = 1
            WHERE ID = 2;
    END CASE;
    COMMIT;
EXCEPTION
    WHEN OTHERS THEN
        ROLLBACK;
END TABLE_AS_PARAMTER;
I have a situation where I want to check a certain column (like a version number) and then apply a bunch of DDL changes.
The trouble is that I am not able to do it within an IF BEGIN END block, since DDL statements require a GO separator between them, and T-SQL won't allow that.
I am wondering if there is any way around to accomplish this
You don't need to use a full block. A conditional will execute the next statement in its entirety if you don't use a BEGIN/END -- including a single DDL statement. This is equivalent to the behavior of if in Pascal, C, etc. Of course, that means that you will have to re-check your condition over and over and over. It also means that using variables to control the script's behavior is pretty much out of the question.
[Edit: CREATE PROCEDURE doesn't work in the example below, so I changed it to something else and moved CREATE PROCEDURE for a more extended discussion below]
If ((SELECT Version FROM table WHERE... ) <= 15)
CREATE TABLE dbo.MNP (
....
)
GO
If ((SELECT Version FROM table WHERE... ) <= 15)
ALTER TABLE dbo.T1
ALTER COLUMN Field1 CHAR(15)
GO
...
Or something like that, depending on what your condition is.
Unfortunately, CREATE/ALTER PROCEDURE and CREATE/ALTER VIEW have special requirements that make them much harder to work with. They must be the only statement in their batch, so you can't combine them with IF at all.
For many scenarios, when you want to "upgrade" your objects, you can work it as a conditional drop followed by a create:
IF(EXISTS(SELECT * FROM sys.objects WHERE type='p' AND object_id = OBJECT_ID('dbo.abc')))
DROP PROCEDURE dbo.abc
GO
CREATE PROCEDURE dbo.abc
AS
...
GO
If you do really need conditional logic to decide what to do, then the only way I know of is to use EXECUTE to run the DDL statements as a string.
If ((SELECT Version FROM table WHERE... ) <= 15)
EXECUTE ('CREATE PROC dbo.abc
AS
....
')
But this is very painful. You have to escape any quotes in the body of the procedure and it's really hard to read.
Depending on the changes that you need to apply, you can see all this can get very ugly fast. The above doesn't even include error checking, which is a royal pain all on its own. This is why hordes of toolmakers make a living by figuring out ways to automate the creation of deployment scripts.
Sorry; there is no easy "right" way that works for everything. This is just something that TSQL supports very poorly. Still, the above should be a good start.
GO is recognised by client tools, not by the server.
You can have CREATEs in your stored procedures or ad-hoc queries with no GO's.
Multiple "IF" statements? You can then test for the success of earlier DDL statements before running subsequent ones.
Dynamic SQL? EXEC ('ALTER TABLE foo WITH CHECK ADD CONSTRAINT ...')?
As mentioned, GO is a client only batch separator to break down a single SQL text block into batches that are submitted to the SQL Server.
In the application I'm working on porting to the web, we currently dynamically access different tables at runtime from run to run, based on a "template" string that is specified. I would like to move the burden of doing that back to the database now that we are moving to SQL server, so I don't have to mess with a dynamic GridView. I thought of writing a Table-valued UDF with a parameter for the table name and one for the query WHERE clause.
I entered the following for my UDF but obviously it doesn't work. Is there any way to take a varchar or string of some kind and get a table reference that can work in the FROM clause?
CREATE FUNCTION TemplateSelector
(
@template varchar(40),
@code varchar(80)
)
RETURNS TABLE
AS
RETURN
(
SELECT * FROM @template WHERE ProductionCode = @code
)
Or some other way of getting a result set similar in concept to this: basically, all records in the table indicated by the varchar @template with the matching ProductionCode of the @code.
I get the error "Must declare the table variable "@template"", so SQL Server probably thinks I'm trying to select from a table variable.
On Edit: Yeah I don't need to do it in a function, I can run Stored Procs, I've just not written any of them before.
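CREATE PROCEDURE TemplateSelector
(
    @template varchar(40),
    @code varchar(80)
)
AS
BEGIN
    -- QUOTENAME guards the table name; @code is passed as a real parameter
    -- instead of being concatenated, so it needs no quoting.
    DECLARE @sql nvarchar(max) =
        N'SELECT * FROM ' + QUOTENAME(@template) + N' WHERE ProductionCode = @code';
    EXEC sp_executesql @sql, N'@code varchar(80)', @code = @code;
END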
This works, though it's not a UDF.
The only way to do this is with the exec command.
Also, you have to move it out to a stored proc instead of a function. Apparently functions can't execute dynamic sql.
The only way that this would be possible is with dynamic SQL; however, dynamic SQL is not supported by SQL Server within a function.
I'm sorry to say that I'm quite sure that it is NOT possible to do this within a function.
If you were working with stored procedures it would be possible.
Also, it should be noted that, by replacing the table name in the query, you've destroyed SQL Server's ability to cache the execution plan for the query. This pretty much reduces the advantage of using a UDF or SP to nil. You might as well just call the SQL query directly.
I have a finite number of tables that I want to be able to address, so I could write something using IF that tests @template for matches with a number of values and, for each match, runs
SELECT * FROM TEMPLATENAME WHERE ProductionCode = @code
It sounds like that is a better option
If you have numerous tables with identical structure, it usually means you haven't designed your database in a normal form. You should unify these into one table. You may need to give this table one more attribute column to distinguish the data sets.
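A sketch of what the unified design could look like. Every name here is hypothetical except ProductionCode, which comes from the question, and the real table would carry whatever other columns the template tables share:

```sql
-- One table replaces the per-template tables; Template records which
-- former table each row belonged to.
CREATE TABLE dbo.ProductionData (
    Template       varchar(40) NOT NULL,
    ProductionCode varchar(80) NOT NULL
    -- the remaining shared columns would go here
);
GO

-- With the table name turned into data, the inline table-valued function
-- from the question works without any dynamic SQL.
CREATE FUNCTION dbo.SelectByTemplate
(
    @template varchar(40),
    @code     varchar(80)
)
RETURNS TABLE
AS
RETURN
(
    SELECT *
    FROM dbo.ProductionData
    WHERE Template = @template
      AND ProductionCode = @code
);
GO
```

It would then be queried with something like SELECT * FROM dbo.SelectByTemplate('SomeTemplate', 'SomeCode'), and an index on (Template, ProductionCode) would probably be worth adding.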