I want to add a column to a table via a stored procedure, where the column's name is the value of a parameter.
You'll have to compose a dynamic DDL SQL statement, concatenating in the name of the column received as an argument:
CREATE PROCEDURE AddColumnToTable
    @columnName VARCHAR(128)
AS
    DECLARE @sql VARCHAR(MAX);
    -- Build the string into a variable first: EXEC() only accepts literals and
    -- variables, not function calls. QUOTENAME brackets the identifier safely.
    SET @sql = 'ALTER TABLE tableName ADD ' + QUOTENAME(@columnName) + ' VARCHAR(MAX) NULL';
    EXEC (@sql);
Note that in this example the name of the table as well as the column's type are hard-coded in the SQL statement. You may want to consider adding them as parameters to the stored procedure for a more generic solution.
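A hedged sketch of that more generic form; the procedure name is made up, and the type parameter plus QUOTENAME guard are additions beyond the original:
CREATE PROCEDURE AddColumnToAnyTable
    @tableName  SYSNAME,
    @columnName SYSNAME,
    @dataType   VARCHAR(128) = 'VARCHAR(MAX)'
AS
    DECLARE @sql NVARCHAR(MAX);
    -- @dataType is still trusted as-is; in practice validate it against a whitelist
    SET @sql = 'ALTER TABLE ' + QUOTENAME(@tableName) +
               ' ADD ' + QUOTENAME(@columnName) + ' ' + @dataType + ' NULL';
    EXEC (@sql);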
Related resources:
Execute (Transact-SQL)
The Curse and Blessings of Dynamic SQL
This is a terrible idea in general. If your schema requires table changes often enough that they need to be done by stored procedure, they are happening too frequently and the design should be reviewed.
But if you are stuck with this horrible process (and I can't emphasize enough what a truly bad idea it is), then you also need an input parameter for the column's data type, plus a nullable one for the size of that type if need be. VARCHAR(MAX) is a poor choice for every possible column to be added: for indexing reasons it should only be used when you expect the column to hold more than 8,000 characters.
Why is this bad? To begin with, you have lost control over the schema. People can add anything, and there is no review to make sure you don't end up with twelve versions of the same thing under slightly different names. Next, how do you intend to fix the application to use these fields without knowing what was added? Since you should never return more fields than you need, your production code should not be using SELECT *; so how do you know which fields to reference once your users have added them?
Users in general aren't knowledgeable enough to add fields. They don't understand database structure or design, and they don't know how to normalize or performance-tune. Letting people add fields to data tables willy-nilly is short-sighted and will lead to badly performing databases and awkward, hard-to-use interfaces that annoy the customers. If you have properly done your work in designing the database and application, there should be very little that a user needs to add. If you are basing your work on the user having the flexibility to add fields, you have a disaster of a project: not only will it be hard to maintain, it will perform badly and in general your users will hate it. I've been forced to work with some of these horrible commercial products (look at Clarity if you want an example of how flexibility trumped design and made a product that is virtually unusable).
I support a database that contains a schema that has a couple hundred tables containing our most important data.
Our application also offers APIs implemented as queries stored in NVARCHAR(MAX) fields in a Query table; these queries are written against the views as well as the tables in this critical schema.
Over time, columns have been added to the tables, but the APIs haven't always kept up.
I've been asked if I can find a way via SQL to identify, as nearly as possible (some false positives/negatives OK), columns in the tables that are not referenced by either the views or the SQL queries that provide the API output.
Initially this seemed do-able. I've found some similar questions on the topic, such as here and here that sort of give guidance on how to start...although I note that even with these, there's the kind of ugly fallback method that looks like:
OBJECT_DEFINITION(OBJECT_ID('[Schema].[View]')) LIKE '%' + [Column] + '%'
Which is likely to generate false positives as well as be super slow when I'm trying to do it for a couple of thousand column names.
Isn't there anything better/more reliable? Maybe something that could compile a query down to a plan and be able to determine from the plan every column that must be accessed in order to deliver the results?
Our application also offers APIs implemented as queries stored in
NVARCHAR(MAX) fields
So you've reimplemented views? :)
If you make them actual views you can look at INFORMATION_SCHEMA - cross reference table/columns to view/columns.
Assuming you don't want to do that, and you're prepared to write a job to run occasionally (rather than real-time) you could do some super-cheesy dynamic SQL.
Loop through your definitions that are stored in NVARCHAR(MAX) with a cursor
Create a temp view or SP from the SQL in your NVARCHAR(MAX)
Examine INFORMATION_SCHEMA from your temp view/SP and put that into a temp holding table.
Do this for all your queries then you've got a list of referenced columns
Pretty ugly but should be workable for a tactical scan of your API vs database.
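A minimal sketch of that loop, assuming the stored queries live in a dbo.Query(QueryId, QueryText) table (those names are placeholders) and that each stored query is valid as a view body:
-- For each stored API query, wrap it in a throwaway view so the engine
-- parses it, then read the referenced columns from INFORMATION_SCHEMA.
-- Note: queries using ORDER BY without TOP won't compile as views.
DECLARE @id INT, @sql NVARCHAR(MAX);
CREATE TABLE #referenced (query_id INT, table_name SYSNAME, column_name SYSNAME);
DECLARE q CURSOR FOR SELECT QueryId, QueryText FROM dbo.Query;
OPEN q;
FETCH NEXT FROM q INTO @id, @sql;
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC ('CREATE VIEW dbo._api_check AS ' + @sql);
    INSERT INTO #referenced (query_id, table_name, column_name)
    SELECT @id, TABLE_NAME, COLUMN_NAME
    FROM INFORMATION_SCHEMA.VIEW_COLUMN_USAGE
    WHERE VIEW_NAME = '_api_check';
    DROP VIEW dbo._api_check;
    FETCH NEXT FROM q INTO @id, @sql;
END
CLOSE q;
DEALLOCATE q;
-- #referenced now lists every table/column your API queries touch;
-- anti-join it against INFORMATION_SCHEMA.COLUMNS to find unreferenced columns.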
What is the best way to maintain code of a big project?
Let's say you have 1000 stored procedures, and you have to add a new column to a table (or remove one).
There might be 1-2, or 30, stored procedures that are affected.
A single search for the table name might not be good enough; say you only need to know the places where the table has an insert/update/delete.
Searching for 'insert tablename' might be a good idea, but there might be one space between those two words, or two spaces, or a tab... and maybe the tablename is written as '[tablename]'.
The same goes for all three (insert/update/delete).
I am basically looking for some kind of 'restricted dependencies'
What is the best way to handle this?
Keep a database table with this kind of information, and change that table every time you make changes to stored procedures?
Keep some specific code as a comment next to each insert/update/delete, so that you can search for what you need?
Example: 'insert_tablename', 'update_tablename', 'delete_tablename'
Anyone have a better idea?
Ideally, changes are backward compatible. Not just so that you can change a table without breaking all of the objects that reference it, but also so that you can deploy all of the database changes before you deploy all of the application code (in a distributed architecture, think a downloadable desktop app or an iPhone app, where folks connect to your database remotely, this is crucial).
For example, if you add a new column to a table, it should be NULLable or have a default value so that INSERT statements don't need to be updated immediately to reference it. Stored procedures can be updated gradually to accept a new parameter to represent this column, and it should be nullable / optional so that the application(s) don't need to be aware of this column immediately. Etc.
This also demands that your original insert statements include an explicit column list. If you just say:
INSERT dbo.table VALUES(@p1, @p2, ...);
Then that makes it much tougher to make your changes backward compatible.
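A quick illustration (the table and column names are made up): with an explicit column list, adding a nullable column doesn't break existing statements:
-- The existing INSERT keeps working after the ALTER because it names its columns.
ALTER TABLE dbo.Orders ADD Notes VARCHAR(500) NULL;
INSERT dbo.Orders (OrderId, CustomerId) VALUES (@p1, @p2);  -- unaffected by Notes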
As for removing a column, well, that's a little tougher. Dependencies are not perfect in SQL Server, but you should be able to find a lot of information from these dynamic management objects:
sys.dm_sql_referenced_entities
sys.dm_sql_referencing_entities
sys.sql_expression_dependencies
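For example (the object names here are placeholders), to list everything that references a table you're about to change, or every column a given procedure touches:
-- Who references dbo.Orders?
SELECT referencing_schema_name, referencing_entity_name
FROM sys.dm_sql_referencing_entities('dbo.Orders', 'OBJECT');

-- Which columns does dbo.SomeProcedure reference?
SELECT referenced_entity_name, referenced_minor_name AS column_name
FROM sys.dm_sql_referenced_entities('dbo.SomeProcedure', 'OBJECT')
WHERE referenced_minor_name IS NOT NULL;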
You might also find these articles interesting:
Keeping sysdepends up to date
Make your database changes backward compatible when adding a new column
Make your database changes backward compatible when dropping a column
Make your database changes backward compatible when renaming an entity
Make your database changes backward compatible when changing a relationship
In an ad-hoc query, using SELECT ColumnName is better, but does it matter in a stored procedure once its execution plan has been cached?
Always explicitly state the columns, even in a stored procedure. SELECT * is considered bad practice.
For instance you don't know the column order that will be returned, some applications may be relying on a specific column order.
I.e. the application code may look something like:
Id = Column[0]; // bad design
If you've used SELECT *, Id may no longer be the first column, causing the application to crash. Also, if the database is modified and five additional fields have been added, you are returning extra fields that may not be relevant.
These topics always elicit blanket statements like ALWAYS do this or NEVER do that, but the reality is that, like most things, it depends on the situation. I'll concede that it's typically good practice to list out columns, but whether or not it's bad practice to use SELECT * depends on the situation.
Consider a variety of tables that all have a common field or two. For example, we have a number of tables with different layouts, but they all have 'access_dt' and 'host_ip'. These tables aren't typically used together, but there are instances when suspicious activity prompts a full report of all activity. Those reports aren't common and are manually reviewed; as such, they are well served by a stored procedure that generates a report by looping through every log table and using SELECT *, leveraging the common fields between all the tables.
It would be a waste of time to list out fields in this situation.
Again, I agree that it's typically good practice to list out fields, but it's not always bad practice to use SELECT *.
Edit: Tried to clarify example a bit.
It's a best practice in general, but if you actually do need all the columns, the quickly-read SELECT * is the better choice.
The important thing is to avoid retrieving data you don't need.
It is considered bad practice in situations like stored procedures when you are querying large datasets. SELECT * forces the engine to return every column, which rules out covering indexes and pushes the query toward table scans, and table scans hurt performance. It's also a matter of readability.
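A small illustration of that point (the index and table are hypothetical): with an index on CustomerId that INCLUDEs OrderDate, the narrow query can be answered from the index alone, while SELECT * forces a lookup back to the base table for every row:
CREATE INDEX IX_Orders_CustomerId ON dbo.Orders (CustomerId) INCLUDE (OrderDate);
SELECT CustomerId, OrderDate FROM dbo.Orders WHERE CustomerId = 42;  -- covered by the index
SELECT * FROM dbo.Orders WHERE CustomerId = 42;                      -- key lookup per row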
Some other food for thought: if your query has any joins at all, you are returning data you don't need, because the data in the join columns is duplicated. Further, if the table is later changed to add things you don't need (such as columns for audit purposes), you may be returning data to the user that they should not be seeing.
Nobody has mentioned the case when you need ALL columns from a table, even if the columns change, e.g. when archiving table rows as XML. I agree one should not use "SELECT *" as a replacement for "I need all the columns that currently exist in the table," just out of laziness or for readability. There needs to be a valid reason. It could be essential when one needs "all the columns that could exist in the table."
Also, how about when creating "wrapper" views for tables?
A specification is essentially a text string representing a WHERE clause created by an end user.
I have stored procedures that copy a set of related tables and records to other places. The operation is always the same, but dependent on some crazy user requirements like "products that are frozen and blue and on sale on Tuesday".
What if we fed the user specification (or string parameter) to a scalar function that returned true/false, and that function executed the specification as dynamic SQL, or just EXEC (@variable)?
It could tell us whether those records exist. We could add the result of the function to our copy products where clause.
It would keep us from recompiling the copy script each time our where clauses changed. Plus it would isolate the product selection into a single function.
Anyone ever do anything like this or have examples? What bad things could come of it?
EDIT:
This is the specification I simply added to the end of each insert/select statement:
and exists (
select null as nothing
from SameTableAsOutsideTable inside
where inside.ID = outside.id and -- Join operations to outside table
inside.page in (6, 7) and -- Criteria 1
inside.dept in (7, 6, 2, 4) -- Criteria 2
)
It would be great to feed a parameter into a function that produces records based on the user criteria, so all that above could be something like:
and dbo.UserCriteria(@page = '6,7', @dept = '7,6,2,4')
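As a sketch of what I'm imagining, an inline table-valued function could do it (all names are illustrative, and STRING_SPLIT needs SQL Server 2016+):
CREATE FUNCTION dbo.UserCriteria (@page VARCHAR(100), @dept VARCHAR(100))
RETURNS TABLE
AS
RETURN
    -- page/dept are assumed numeric; the split string values convert implicitly
    SELECT i.ID
    FROM SameTableAsOutsideTable AS i
    WHERE i.page IN (SELECT value FROM STRING_SPLIT(@page, ','))
      AND i.dept IN (SELECT value FROM STRING_SPLIT(@dept, ','));
-- and the EXISTS clause above becomes:
-- and exists (select 1 from dbo.UserCriteria('6,7', '7,6,2,4') f where f.ID = outside.id)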
Dynamic Search Conditions in T-SQL
When optimizing SQL, the important thing is optimizing the access path to the data (i.e. index usage). This trumps code reuse, maintainability, nice formatting and just about every other development perk you can think of, because a bad access path will cause the query to perform hundreds of times slower than it should. The article linked sums up very well all the options you have, and your envisioned function is nowhere on the radar. Your options will gravitate around dynamic SQL or very complicated static queries. I'm afraid there is no free lunch on this topic.
It doesn't sound like a very good idea to me. Even supposing that you had proper defensive coding to avoid SQL injection attacks it's not going to really buy you anything. The code still needs to be "compiled" each time.
Also, it's pretty much always a bad idea to let users create free-form WHERE clauses. Users are pretty good at finding new and innovative ways to bring a server to a grinding halt.
If you or your users or someone else in the business can't come up with some concrete search requirements, then it's likely that someone isn't thinking about it hard enough and doesn't really know what they want. You can have pretty versatile search capabilities without turning the users completely loose on the system. Alternatively, look at some of the BI tools out there and consider creating a data mart where they can do these kinds of ad hoc searches.
How about this:
You create another stored procedure (instead of a function) and pass the right condition to it.
Based on that condition, it dumps the matching record ids into a temp table.
Next, your copy/move procedure reads the ids from that table and does what's needed.
Or you could create a user-defined function that returns a table containing nothing but the ids of the records that match your (dynamic) criteria.
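A rough sketch of the temp-table handoff described above (procedure and table names are made up); note the caller creates the temp table so both procedures can see it:
CREATE PROCEDURE dbo.SelectMatchingIds
    @condition NVARCHAR(MAX)
AS
    -- The dynamic SQL sees the caller's #matching_ids because it shares the session
    EXEC (N'INSERT INTO #matching_ids (ID)
            SELECT ID FROM SameTableAsOutsideTable WHERE ' + @condition);

-- Caller:
CREATE TABLE #matching_ids (ID INT);
EXEC dbo.SelectMatchingIds @condition = N'page IN (6, 7) AND dept IN (7, 6, 2, 4)';
-- ...the copy procedure can now join against #matching_ids...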
If I am totally off, please correct me.
Hope this helps.
If you are forced to use dynamic queries and you don't have any solid, predefined search requirements, it is strongly recommended to use sp_executesql instead of EXEC. It provides parameterized queries to prevent SQL injection attacks (to some extent), and it reuses execution plans, which speeds up performance. (More info)
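A minimal example of the parameterized form (table and column names are illustrative):
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT ID FROM SameTableAsOutsideTable WHERE page = @page AND dept = @dept;';

EXEC sp_executesql @sql,
     N'@page INT, @dept INT',
     @page = 6, @dept = 7;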
I'm working with a legacy application that has, surprise surprise, next to no useful documentation on naming conventions or overall data structure.
There are similarly named tables in the database, with one of them ending in HX. Does this hold any significance in anyone's database experience?
The data seems to be replicated so I would assume that it is a Historical table of sorts, but I just want to be sure before I avoid populating it.
Thanks in Advance.
Cory
I've seen this before, but I don't think its usage was part of any standard. Where I saw it, it was used to prefix tables (hx_ReportingUsage) that stored information on historical index usage, something similar to what this article talks about.
Again, I think this was just an internal naming convention. Your best bet would probably be to search all stored procs, UDFs and code you have access to for the table name and see if you can piece together how it's being used.
If you are using SQL Server you can use this query to look for text in stored procs:
SELECT OBJECT_NAME(id)
FROM syscomments
WHERE [text] LIKE '%foobar%'
AND OBJECTPROPERTY(id, 'IsProcedure') = 1
GROUP BY OBJECT_NAME(id)
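One caveat as an aside: syscomments stores long definitions in 4000-character chunks, so a match straddling a chunk boundary can be missed. On SQL Server 2005 and later, sys.sql_modules holds the full definition:
SELECT OBJECT_NAME(object_id)
FROM sys.sql_modules
WHERE definition LIKE '%foobar%'
  AND OBJECTPROPERTY(object_id, 'IsProcedure') = 1;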
In the few places I've come across this it was related to history tables. It has depended on the system. I've seen this mostly in the Oil & Gas world, not so much outside that industry.
I wouldn't just assume that's the case, though. If it's possible, I'd do a dependency search to find out whether any scripts depend on the table; if they're history tables, you'll likely find a copy procedure or trigger somewhere that keeps them populated.
Check out sp_depends to see if that can shed any light on the subject.
Exec sp_depends 'table_name_hx'
You should get a list of everything that references it, and what type of object each reference is.
It might have come from the medical field. They often have abbreviations like Rx for prescription, Sx for symptoms, etc. Hx is medical history: http://en.wikipedia.org/wiki/Medical_history. This could have found its way into other fields and even databases.