Conditional Compilation of schema objects in SSDT Project

Our SSDT database project includes a table that has a computed column that can take one of several forms, depending on customer requirements. I'm trying to figure out how to manage this computed column so that we can still use the Publish function without reverting everyone's columns back to the default.
What I'm trying to accomplish can be explained in the following invalid T-SQL code:
CREATE TABLE dbo.Customer
(
    Id INTEGER,
    Region INTEGER,
    Name VARCHAR(50),
    AccountNumber AS dbo.FormatAccountNumber(Id, Region)
)

CREATE FUNCTION [dbo].[FormatAccountNumber]
(
    @Id INTEGER,
    @Region INTEGER
)
RETURNS VARCHAR(20)
AS
BEGIN
    IF '$(AccountType)' = 'Regional'
        RETURN CONVERT(VARCHAR, @Region) + '-' + CONVERT(VARCHAR, @Id)
    IF '$(AccountType)' = 'Merged'
        RETURN CONVERT(VARCHAR, @Region * 100000 + @Id)
    IF '$(AccountType)' = 'Flat'
        RETURN CONVERT(VARCHAR, @Id)
END
This, of course, doesn't work because the $(AccountType) SQLCMD variable can't be used inside the function, and wouldn't be set properly at run-time anyway. I've also tried putting the SQLCMD conditional around the entire function:
IF '$(AccountType)' = 'Flat'
CREATE FUNCTION ...
but this produces the error that "CREATE FUNCTION must be the only statement in the batch."
Is there any way to do any sort of conditional compilation of schema in the SSDT project? And if not, what options do I have for maintaining this sort of customizable field within the SSDT publishing process?

You could use a Post-Deployment script to imperatively deploy your object using dynamic SQL:
IF NOT EXISTS (SELECT * FROM sys.objects
               WHERE object_id = OBJECT_ID(N'[dbo].[FormatAccountNumber]')
                 AND type IN (N'FN', N'IF', N'TF', N'FS', N'FT'))
    EXEC('CREATE FUNCTION [dbo].[FormatAccountNumber] () RETURNS BIT AS BEGIN RETURN 0 END')
GO
IF '$(AccountType)' = 'Regional'
BEGIN
    EXEC('
    ALTER FUNCTION [dbo].[FormatAccountNumber]
    (
        @Id INTEGER,
        @Region INTEGER
    )
    RETURNS VARCHAR(20)
    AS
    BEGIN
        RETURN CONVERT(VARCHAR, @Region) + ''-'' + CONVERT(VARCHAR, @Id)
    END
    ')
END
Note that by doing this, you won't be able to make any references to the dbo.FormatAccountNumber function in your SSDT database project (unless those objects are also included in the Post-Deployment script).
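For example, the computed column that references the function would itself need to move into the Post-Deployment script. A minimal sketch (the IF NOT EXISTS guard keeps it idempotent across publishes):

IF NOT EXISTS (SELECT * FROM sys.columns
               WHERE object_id = OBJECT_ID(N'[dbo].[Customer]')
                 AND name = N'AccountNumber')
BEGIN
    EXEC('ALTER TABLE dbo.Customer
          ADD AccountNumber AS dbo.FormatAccountNumber(Id, Region)')
END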
I also toyed with an alternate solution that involved adding conditionals within the .sqlproj file itself (within the ItemGroup elements), but it got a bit messy, since MSBuild properties aren't exactly like-for-like with SQLCMD variables. I can post this if you like.
If you find yourself needing to conditionally deploy objects in your database often, you may like to consider moving to an imperative-based deployment solution (as opposed to the declarative style of SSDT projects). Imperative deployment, often called migrations, gives you greater control of deploy-time behaviour.
Disclaimer: I am the founder of ReadyRoll, which makes a product for VS that uses imperative deployment.

I have had an open discussion on MSDN about this need, but have not made much progress. The ideal situation would allow you to flag DB objects as "inheritable" in base SSDT projects, so that other projects that reference the base project or DAC won't complain about the duplicate object, and would only create the base object or "stub" if it did not exist. This would allow you to have "layers" of database models. See my post on MSDN: Extending SSDT Composite Solutions with Overriden Stored Procedures

Related

Create a generic procedure, which inserts data into any table

I'm currently working on a .NET application and want to make it as modular as possible. I've already created a basic SELECT procedure, which returns data by checking input parameters on the SQL Server side.
I want to create a procedure that parses structured data passed as a string and inserts its contents into the corresponding table in the database.
For example, I have a table as
CREATE TABLE ExampleTable (
    id_exampleTable int IDENTITY (1, 1) NOT NULL,
    exampleColumn1 nvarchar(200) NOT NULL,
    exampleColumn2 int NULL,
    exampleColumn3 int NOT NULL,
    CONSTRAINT pk_exampleTable PRIMARY KEY ( id_exampleTable )
)
And my procedure starts as
CREATE PROCEDURE InsertDataIntoCorrespondingTable
    @dataTable nvarchar(max), -- name of the target table in my DB
    @data nvarchar(max)       -- normalized string parameter as 'column1, column2, column3, etc.'
AS
BEGIN
    IF @dataTable = 'table'
    BEGIN
        /** Parse this string and execute insert command **/
    END
    ELSE IF /** Other statements **/
END
TL;DR
So basically, I'm looking for a solution that can help me achieve something like this:
EXEC InsertDataIntoCorrespondingTable
    @dataTable = 'ExampleTable',
    @data = '''exampleColumn1'', 2, 3'
Which should be equal to just
INSERT INTO ExampleTable SELECT 'exampleColumn1', 2, 3
Sure, I could push the data as INSERT statements generated inside the app (for each and every one of the 14 tables inside the DB...), but I want to conquer T-SQL :)
This might be reasonable (to some degree) on an RDBMS that supports structured data like JSON or XML natively, but doing it the way you are planning is going to cause some real pain-in-the-rear support and, more importantly, a SQL injection attack vector. I would leave this to the realm of the web backend server, where it belongs.
You are likely going to end up inventing your own structured data markup language and parser to solve this in SQL Server. That's a wheel that doesn't need to be reinvented. If you do end up building this, seriously consider going with JSON to avoid all the issues that structured data inherently brings with it, assuming your version of SQL Server supports JSON parsing/packaging.
Your front end that packages your data into your SDML is going to have to assume column ordinals, but column ordinal is not something one should rely on in a database. SQL amateurs often do; I know this from years in the industry, dealing with end users who get upset when a new column is introduced in a position they don't want. Adding a column to a table shouldn't break an application. If it does, that application has bad code.
Regarding the SQL injection attack vector, your SP code is going to get ugly. You'll need to parse out each item in @data into a variable of its own in order to properly parameterize the dynamic SQL that is being built. See here under the "working with parameters" section for what that will look like. Failure to add this to your SP code means that values passed in that @data SDML could become executable SQL instead of literals, and that would be very bad. This is not easy to solve in SP language. Where it IS easy to solve, though, is in the backend server code. Every database library on the planet supports parameterized query building/execution natively.
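For reference, a properly parameterized version of the eventual INSERT would look something like this minimal sketch (with the values hard-coded here; the real procedure would have to build the parameter list dynamically):

DECLARE @sql nvarchar(max) = N'INSERT INTO ExampleTable (exampleColumn1, exampleColumn2, exampleColumn3)
                               VALUES (@p1, @p2, @p3);';

-- The values travel as parameters, never concatenated into the SQL text
EXEC sp_executesql @sql,
    N'@p1 nvarchar(200), @p2 int, @p3 int',
    @p1 = N'some text', @p2 = 2, @p3 = 3;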
Once you have this built you will be dynamically generating an INSERT statement and dynamically generating variables or an array or some data structure to pass in parameters to the INSERT statement to avoid sql injection attacks. It's going to be dynamic, on top of dynamic, on top of dynamic which leads to:
From a support context, imagine that your application just totally throws up one day. You have to dive in to investigate. You track down the SDML that your front end created that caused the failure, and you open up your SP code to troubleshoot. Imagine what this code ends up looking like:
It has to determine if the table exists
It has to parse the SDML to get each literal
It has to read DB metadata to get the column list
It has to dynamically write the insert statement, listing the columns from metadata and dynamically creating sql parameters for the VALUES() list.
It has to execute sending a dynamic number of variables into the dynamically generated sql.
My support staff would hang me out to dry if they had to deal with that, and I'm the one paying them.
All of this is solved by using a proper backend to handle communication, deeper validation, sql parameter binding, error catching and handling, and all the other things that backend servers are meant to do.
I believe that your back end web server should be VERY aware of the underlying data model. It should be the connection between your view, your data, and your model. Leave the database to the things it's good at (reading and writing data). Leave your front end to the things that it's good at (presenting a UI for the end user).
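If you do go down this road anyway, a JSON payload at least avoids a home-grown format. A minimal sketch (assuming SQL Server 2016+ for OPENJSON; the property names mirror the example table):

DECLARE @data nvarchar(max) = N'{"exampleColumn1": "some text", "exampleColumn2": 2, "exampleColumn3": 3}';

INSERT INTO ExampleTable (exampleColumn1, exampleColumn2, exampleColumn3)
SELECT exampleColumn1, exampleColumn2, exampleColumn3
FROM OPENJSON(@data)
WITH (
    exampleColumn1 nvarchar(200),
    exampleColumn2 int,
    exampleColumn3 int
);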
I suppose you could do something like this (it may need a little extra work):
DECLARE @columns nvarchar(max);

SELECT @columns = STRING_AGG(name, ', ') WITHIN GROUP (ORDER BY column_id)
FROM sys.all_columns
WHERE object_id = OBJECT_ID(@dataTable);

DECLARE @sql nvarchar(max) = CONCAT('INSERT INTO ', @dataTable, ' (', @columns, ') VALUES (', @data, ')');

EXEC sp_executesql @sql;
But please don't. If this were a good idea, there would be tons of examples of how to do it. There aren't, so it's probably not a good idea.
There are, however, tons of examples of using ORMs or auto-generated code instead, because that way your code is maintainable, debuggable and performant.

Selecting data from a different schema within a stored procedure

Consider this:
CREATE PROCEDURE [dbo].[setIdentifier](@oldIdentifierName as varchar(50), @newIdentifierName as varchar(50))
AS
BEGIN
    DECLARE @old_id as int;
    DECLARE @new_id as int;
    SET @old_id = (SELECT value FROM Configuration WHERE id = @oldIdentifierName);
    SET @new_id = (SELECT value FROM Configuration WHERE id = @newIdentifierName);
    IF @old_id IS NOT NULL AND @new_id IS NOT NULL
    BEGIN
        UPDATE Customer
        SET type = @new_id
        WHERE type = @old_id;
    END;
END
[...]
EXECUTE dbo.setIdentifier '1', '2';
What this does is create a stored procedure that accepts two parameters which it then uses to update a Customer table.
The problem is that the entire script above runs within a schema other than "dbo". Let's just assume the schema is "company1". And when the stored procedure is called, I get an error from the SELECT statement, which says that the Configuration table cannot be found. I'm guessing this is because MS SQL by default looks for tables within the same schema as the location of the stored procedure, and not within the calling context.
My question is this:
Is there some option or parameter or switch of some kind that will tell MS SQL to look for tables in the "caller's default schema" and not within the schema that the procedure itself is stored in?
If not, what would you recommend? I don't really want to prefix the tables with the schema name, because that would be kind of inflexible. So I'm thinking about using dynamic SQL (and the schema_name() function, which returns the correct value even within the procedure), but I am just not experienced enough with MS SQL to construct the proper syntax.
It would be a tad more efficient to explicitly specify the schema name. Generally speaking, schemas are mainly used to divide a database into logical areas; I would not expect tables to schema-hop often.
Regarding your question, you might want to have a look at the 'execute as' documentation on MSDN, since it allows you to explicitly control your execution context.
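If you do go the dynamic SQL route mentioned in the question, a minimal sketch of the first lookup inside the procedure might look like this (assuming, as the asker notes, that schema_name() returns the caller's default schema even inside the procedure):

DECLARE @sql nvarchar(max);
DECLARE @value int;

-- Build the query against the caller's schema rather than the procedure's
SET @sql = N'SELECT @value = value FROM ' + QUOTENAME(SCHEMA_NAME())
         + N'.Configuration WHERE id = @idName;';

EXEC sp_executesql @sql,
    N'@value int OUTPUT, @idName varchar(50)',
    @value = @value OUTPUT, @idName = @oldIdentifierName;

SET @old_id = @value;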
I ended up passing the schema name to my script as a property on the command line for the "sqlcmd" command. Like this:
C:/> sqlcmd -vSCHEMANAME=myschema -imysqlfile
In the SQL script I can then access this variable like this:
SELECT * from $(SCHEMANAME).myTable WHERE.... etc
Not quite as flexible as dynamic sql, but "good enough" as it were.
Thanks all for taking time to respond.

Validate T-SQL programmability objects

Is there a way to validate programmability objects in SQL Server 2008?
I have a database with ~500 programmability objects which depend on other programmability objects (not only tables).
If I do some refactoring, it is very hard to find the other objects that are broken by the changes. For example, if I change the parameter count...
Original state of database:
CREATE FUNCTION [dbo].[GetSomeText]() RETURNS nvarchar(max) AS BEGIN RETURN 'asdf' END
/* uses "GetSomeText()" function */
CREATE FUNCTION [dbo].[GetOtherText]() RETURNS nvarchar(max) AS BEGIN RETURN [dbo].[GetSomeText]() + '-qwer' END
Now I do some refactoring (add a parameter @Num to the GetSomeText() function):
ALTER FUNCTION [dbo].[GetSomeText](@Num int) RETURNS nvarchar(max) AS BEGIN RETURN 'asdf' + CAST(@Num as nvarchar(max)) END
Now the function GetOtherText() is broken, because it calls the GetSomeText() function without the required parameter.
Is there a way to get information about this error?
Currently I script every programmability object as ALTER, run the ALTER script, and check for errors. This approach seems too complex (and is hard to use in a T-SQL-only environment).
EDIT:
Thanks for the answers! I know how to get dependencies or the list of all objects.
The problem is in checking the body of an object. If I get the dependency, is there any other way to check validity than running the ALTER script?
I don't think there's a way to find the dependency. You can, however, find everything that references the name of the object you're changing like this:
select OBJECT_DEFINITION(o.object_id) as objectDefinition, *
from sys.objects o
where o.type in ('P', 'FN')
and OBJECT_DEFINITION(o.object_id) like '%GetSomeText%'
o.type in ('P', 'FN') limits the search to P - Procedures and FN - Scalar Functions. Check out more info about OBJECT_DEFINITION: http://msdn.microsoft.com/en-us/library/ms176090.aspx
Perhaps you could try introducing some automated database developer/unit testing.
With 500 SQL objects it would be onerous to go back and retrofit tests for them all. The best approach might be to incrementally create these tests as the need to refactor/change existing APIs/create new SQL objects arises.
These automated tests could then be included as part of your overall continuous integration approach. Note that for the example given you would still have the issue of finding existing dependencies, but once there was sufficient test coverage the tests should highlight any breaking changes introduced.
I have created a test tool that might be of use - but there are a number of others out there:
http://dbtestunit.wordpress.com/
One of the easiest ways to get dependencies is to use sp_depends. This does work with functions, but you need to be sure you are in the right DB context:
USE MyDatabase
EXEC sp_depends @objname = N'dbo.FunctionName'
This will show you any object, whether it be a function, stored proc, table, or view, that has a dependency on the listed object.
This is not always accurate with cross-database dependencies, though, so be aware.
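Note also that sp_depends is deprecated; on SQL Server 2008 and later, sys.dm_sql_referencing_entities tends to give more reliable results. A minimal sketch, using the function name from the question:

-- Objects that reference dbo.GetSomeText
SELECT referencing_schema_name, referencing_entity_name
FROM sys.dm_sql_referencing_entities(N'dbo.GetSomeText', N'OBJECT');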

Finding Caller of SQL Function

There's a SQL function that I'd like to remove from a SQL Server 2005 database, but first I'd like to make sure that there's no one calling it. I've used the "View Dependencies" feature to remove any reference to it from the database. However, there may be web applications or SSIS packages using it.
My idea was to have the function insert a record in an audit table every time it was called. However, this will be of limited value unless I also know the caller. Is there any way to determine who called the function?
You can call extended stored procedures from a function.
Some examples are:
xp_cmdshell
xp_regwrite
xp_logevent
If you had the correct permissions, theoretically you could call an extended stored procedure from your function and store information like APP_NAME() and ORIGINAL_LOGIN() in a flat file or a registry key.
Another option is to build an extended stored procedure from scratch.
If all this is too much trouble, I'd follow the early recommendation of SQL Profiler or server side tracing.
An example of using an extended stored procedure is below. This uses xp_logevent to log every instance of the function call in the Windows application log.
One caveat of this method is that if the function is applied to a column in a SELECT query, it will be called for every row that is returned. That means there is a possibility you could quickly fill up the log.
Code:
USE [master]
GO

/* A security risk but will get the job done easily */
GRANT EXECUTE ON xp_logevent TO PUBLIC
GO

/* Test database */
USE [Sandbox]
GO

/* Test function which always returns 1 */
CREATE FUNCTION ufx_Function() RETURNS INT
AS
BEGIN
    DECLARE
        @msg VARCHAR(4000),
        @login SYSNAME,
        @app SYSNAME

    /* Gather critical information */
    SET @login = ORIGINAL_LOGIN()
    SET @app = APP_NAME()
    SET @msg = 'The function ufx_Function was executed by '
             + @login + ' using the application ' + @app

    /* Log this event */
    EXEC master.dbo.xp_logevent 60000, @msg, warning

    /* Resume normal function */
    RETURN 1
END
GO

/* Test */
SELECT dbo.ufx_Function()
This depends on your current security model. We use connection pooling with one SQL account per application: each application has its own account to connect to the database. If this is the case for you, you could run a SQL Profiler session to find the caller of that function; whichever account is calling the function will relate directly to one application.
This works for us with the way we handle SQL traffic; I hope it does the same for you.
Try this to search the code:
-- declare and set a value of @SearchValue to be your function name
SELECT DISTINCT
    s.name + '.' + o.name AS Object_Name, o.type_desc
FROM sys.sql_modules m
INNER JOIN sys.objects o ON m.object_id = o.object_id
INNER JOIN sys.schemas s ON o.schema_id = s.schema_id
WHERE m.definition LIKE '%' + @SearchValue + '%'
ORDER BY 1
To find the caller at run time, you might try using CONTEXT_INFO:
-- in the code chain doing the suspected function call:
DECLARE @CONTEXT_INFO varbinary(128)
       ,@Info varchar(128)
SET @Info = '????'
SET @CONTEXT_INFO = CONVERT(varbinary(128), 'InfoForFunction=' + ISNULL(@Info, '') + REPLICATE(' ', 128))
SET CONTEXT_INFO @CONTEXT_INFO

-- after the suspected function call
SET CONTEXT_INFO 0x0 -- reset CONTEXT_INFO
-- here is the portion to put in the function:
DECLARE @Info varchar(128)
       ,@sCONTEXT_INFO varchar(128)
SET @sCONTEXT_INFO = CONVERT(varchar(128), CONTEXT_INFO())

-- 'InfoForFunction=' is 16 characters long
IF LEFT(@sCONTEXT_INFO, 16) = 'InfoForFunction='
BEGIN
    SET @Info = RIGHT(RTRIM(@sCONTEXT_INFO), LEN(RTRIM(@sCONTEXT_INFO)) - 16)
END

-- use the @Info
SELECT @Info, @sCONTEXT_INFO
If you put different values in @CONTEXT_INFO in various places, you can narrow down who is calling the function, and refine the value until you find it.
You can try using APP_NAME() and USER_NAME(). It won't give you specifics (like an SSIS package name), but it might help.
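For example, the function could log something like the following (a quick sketch; what these return depends on how the caller connects):

SELECT APP_NAME()       AS CallingApplication,
       USER_NAME()      AS DatabaseUser,
       ORIGINAL_LOGIN() AS LoginName;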
This will help you find if this is being called anywhere in your database.
SELECT OBJECT_NAME(id) FROM sys.syscomments WHERE text LIKE '%<FunctionName>%'
Another far less elegant way is to grep -R [functionname] * through your source code. This may or may not be workable depending on the amount of code.
This has the advantage of working even if that part of the code only gets used very infrequently, which would be a big problem with your audit table idea.
You could run a trace in the profiler to see if that function is called for a week (or whatever you consider a safe window).
I think that you might also be able to use OPENROWSET to call an SP which logs to a table if you enable ad-hoc queries.
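A rough sketch of that idea (untested; it assumes 'Ad Hoc Distributed Queries' is enabled, and dbo.LogFunctionCall is a hypothetical procedure that writes to the audit table and returns a result set, which OPENROWSET requires):

-- Loopback query: OPENROWSET opens a new connection, so the logging
-- INSERT inside the procedure is not blocked by the function's
-- no-side-effects restriction
SELECT *
FROM OPENROWSET('SQLNCLI',
                'Server=(local);Trusted_Connection=yes;',
                'EXEC dbo.LogFunctionCall');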

How should I pass a table name into a stored proc?

I just ran into a strange thing...there is some code on our site that is taking a giant SQL statement, modifying it in code by doing some search and replace based on some user values, and then passing it on to SQL Server as a query.
I was thinking that this would be cleaner as a parameterized query to a stored proc, with the user values as the parameters, but when I looked more closely I see why they might be doing it... the table that they are selecting from is variably dependent on those user values.
For instance, in one case if the values were ("FOO", "BAR") the query would end up being something like "SELECT * FROM FOO_BAR"
Is there an easy and clear way to do this? Everything I'm trying seems inelegant.
EDIT: I could, of course, dynamically generate the sql in the stored proc, and exec that (bleh), but at that point I'm wondering if I've gained anything.
EDIT2: Refactoring the table names in some intelligent way, say having them all in one table with the different names as a new column would be a nice way to solve all of this, which several people have pointed out directly, or alluded to. Sadly, it is not an option in this case.
First of all, you should NEVER do SQL command composition in a client app like this; that's what SQL injection is. (It's OK for an admin tool that has no privileges of its own, but not for a shared-use application.)
Secondly, yes, a parameterized call to a stored procedure is both cleaner and safer.
However, as you will need to use dynamic SQL to do this, you still do not want to include the passed string in the text of the executed query. Instead, you want to use the passed string to look up the names of the actual tables that the user should be allowed to query in this way.
Here's a simple naive example:
CREATE PROC spCountAnyTableRows( @PassedTableName as NVarchar(255) ) AS
-- Counts the number of rows from any non-system Table, *SAFELY*
BEGIN
    DECLARE @ActualTableName AS NVarchar(255)

    SELECT @ActualTableName = QUOTENAME( TABLE_NAME )
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_NAME = @PassedTableName

    DECLARE @sql AS NVARCHAR(MAX)
    SELECT @sql = 'SELECT COUNT(*) FROM ' + @ActualTableName + ';'

    EXEC(@sql)
END
Some have fairly asked why this is safer. Hopefully, little Bobby Tables can make this clearer:
[image: xkcd's "Little Bobby Tables" comic]
Answers to more questions:
QUOTENAME alone is not guaranteed to be safe. MS encourages us to use it, but they have not given a guarantee that it cannot be out-foxed by hackers. FYI, real security is all about the guarantees. The table lookup with QUOTENAME is another story; it's unbreakable.
QUOTENAME is not strictly necessary for this example; the lookup translation on INFORMATION_SCHEMA alone is normally sufficient. QUOTENAME is in here because it is good form in security to include a complete and correct solution. QUOTENAME here is actually protecting against a distinct but similar potential problem known as latent injection.
I should note that you can do the same thing with dynamic Column Names and the INFORMATION_SCHEMA.COLUMNS table.
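A sketch of the same lookup pattern for a column name (the @PassedColumnName parameter is hypothetical, following the procedure above):

DECLARE @ActualColumnName AS NVarchar(255)

SELECT @ActualColumnName = QUOTENAME( COLUMN_NAME )
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = @PassedTableName
  AND COLUMN_NAME = @PassedColumnName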
You can also bypass the need for stored procedures by using a parameterized SQL query instead (see here: https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlcommand.parameters?view=netframework-4.8). But I think that stored procedures provide a more manageable and less error-prone security facility for cases like this.
(Un)fortunately there's no way of doing this: you can't use a table name passed as a parameter to stored code other than for dynamic SQL generation. When it comes to deciding where to generate SQL code, I prefer application code rather than stored code. Application code is usually faster and easier to maintain.
In case you don't like the solution you're working with, I'd suggest a deeper redesign (i.e., change the schema/application logic so you no longer have to pass a table name as a parameter anywhere).
I would argue against dynamically generating the SQL in the stored proc; that'll get you into trouble and could open an injection vulnerability.
Instead, I would analyze all of the tables that could be affected by the query and create some sort of enumeration that would determine which table to use for the query.
Sounds like you'd be better off with an ORM solution.
I cringe when I see dynamic sql in a stored procedure.
One thing you can consider is to make a case statement that contains the same SQL command you want, once for each valid table, then pass the table name into this procedure as a string and have the case choose which command to run (see the sketch below).
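A sketch of that approach (since T-SQL's CASE is an expression rather than a statement, IF/ELSE does the branching here; FOO_BAR is from the question, while FOO_BAZ and the procedure name are hypothetical):

CREATE PROC spSelectFromKnownTable( @TableName AS nvarchar(255) ) AS
BEGIN
    -- One static, precompiled statement per valid table; anything else is rejected
    IF @TableName = 'FOO_BAR'
        SELECT * FROM FOO_BAR;
    ELSE IF @TableName = 'FOO_BAZ'
        SELECT * FROM FOO_BAZ;
    ELSE
        RAISERROR('Unknown table name.', 16, 1);
END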
By the way, as a security person, the suggestion above telling you to select from the system tables in order to make sure you have a valid table seems like a wasted operation to me. If someone can inject past the QUOTENAME(), the injection would work on the system table just as well as on the underlying table. The only thing this helps with is ensuring it is a valid table name, and I think the suggestion above is a better approach to that, since you are not using QUOTENAME() at all.
Depending on whether the set of columns in those tables is the same or different, I'd approach it in two ways in the longer term:
1) If they are the same, why not create a new column to be used as a selector, whose value is derived from the user-supplied parameters? (Is it a performance optimization?)
2) If they are different, chances are that handling them is also different. As such, it seems like splitting the select/handle code into separate blocks and then calling them separately would be the most modular approach to me. You will repeat the "SELECT * FROM" part, but in this scenario the set of tables is hopefully finite.
Allowing the calling code to supply two arbitrary parts of the table name to do a select from feels very dangerous.
I don't know the reason why you have the data spread over several tables, but it sounds like you are breaking one of the fundamentals. The data should be in the tables, not as table names.
If the tables have more or less the same layout, consider if it would be best to put the data in a single table instead. That would solve your problem with the dynamic query, and it would make the database layout more flexible.
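A sketch of that consolidation (all names here are hypothetical):

-- One table with discriminator columns instead of FOO_BAR, FOO_BAZ, ...
CREATE TABLE dbo.CombinedData
(
    Id int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Category varchar(50) NOT NULL,     -- the old "FOO" part of the table name
    SubCategory varchar(50) NOT NULL,  -- the old "BAR" part of the table name
    Payload nvarchar(200) NULL
)

-- The dynamic "SELECT * FROM FOO_BAR" becomes an ordinary parameterized query:
-- SELECT * FROM dbo.CombinedData WHERE Category = @cat AND SubCategory = @sub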
Instead of querying the tables based on user input values, you can pick the procedure instead. That is to say:
1. Create a procedure FOO_BAR_prc and inside it put the query 'SELECT * FROM FOO_BAR'; that way the query will be precompiled by the database.
2. Then, based on the user input, execute the correct procedure from your application code.
Since you have around 50 tables, this might not be a feasible solution though, as it would require a lot of work on your part.
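A sketch of step 1, using the table name from the question:

CREATE PROC FOO_BAR_prc AS
BEGIN
    -- Static statement: precompiled plan, no dynamic SQL, no injection surface
    SELECT * FROM FOO_BAR;
END

-- Application code then simply runs: EXEC FOO_BAR_prc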
In fact, I wanted to know how to pass a table name to create a table in a stored procedure. By reading some of the answers and attempting some modifications at my end, I was finally able to create a table with the name passed as a parameter. Here is the stored procedure for others to check for any errors in it:
USE [Database Name]
GO

/****** Object: StoredProcedure [dbo].[sp_CreateDynamicTable] Script Date: 06/20/2015 16:56:25 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE PROCEDURE [dbo].[sp_CreateDynamicTable]
    @tName varchar(255)
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @SQL nvarchar(max)
    SET @SQL = N'CREATE TABLE [DBO].[' + @tName + '] (DocID nvarchar(10) NULL);'
    EXECUTE sp_executesql @SQL
END
@RBarry Young
You don't need to add the brackets to @ActualTableName in the query string, because they are already included in the result of the QUOTENAME() lookup against INFORMATION_SCHEMA.TABLES. Otherwise, there will be errors when it is executed.
CREATE PROC spCountAnyTableRows( @PassedTableName as NVarchar(255) ) AS
-- Counts the number of rows from any non-system Table, SAFELY
BEGIN
    DECLARE @ActualTableName AS NVarchar(255)

    SELECT @ActualTableName = QUOTENAME( TABLE_NAME )
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_NAME = @PassedTableName

    DECLARE @sql AS NVARCHAR(MAX)
    --SELECT @sql = 'SELECT COUNT(*) FROM [' + @ActualTableName + '];'
    -- changed to this
    SELECT @sql = 'SELECT COUNT(*) FROM ' + @ActualTableName + ';'

    EXEC(@sql)
END
I would avoid dynamic SQL at all costs.
It isn't the most elegant solution, but it does the job perfectly.
CREATE OR REPLACE PROCEDURE TABLE_AS_PARAMETER (
    p_table_name IN VARCHAR2
) AS
BEGIN
    CASE p_table_name
        WHEN 'TABLE1' THEN
            UPDATE TABLE1
            SET COLUMN1 = 1
            WHERE ID = 1;
        WHEN 'TABLE2' THEN
            UPDATE TABLE2
            SET COLUMN1 = 1
            WHERE ID = 2;
    END CASE;
    COMMIT;
EXCEPTION
    WHEN OTHERS THEN
        ROLLBACK;
END TABLE_AS_PARAMETER;