What is the purpose of tbl_Product, tbl_Order naming conventions? - sql

I've seen tbl_ prefixes decorating all tables in the last two code bases I've worked with. I can't think of any reason they'd be useful for application developers or database administrators.
If a DBA needs to see what objects are tables they can always join up to DMV's or the schema tables in master right? I can't think of how they'd be useful to a programmer either, even more so if the project is using an ORM tool.
Still even while writing stored procs they just seem to get in the way.
Can anybody explain how they'd be useful in a non-subjective way? Ex ( having tbl_ helps me perform x task )

I’ve heard the same thing over and over again the reason being, it helps them to know the type of the object.
Inside a query, using prefixes could help them separate tables from views for instance.
I really don’t agree with that. After getting used to the database schema, prefixes become redundant and as happens with everything that is redundant, it can become desynchronized or make changes harder to do.
Let’s say you have a table that for whatever reason you have to split into two tables.
Let’s say you decide to create a view that emulates the original table selecting data from the two new tables.
Are you going to rename throughout you code base or are you going to stick with a view prefixed as tbl_?
So my point is database object names should not have any prefixes for inferring their types.

I can't think of many benefits. You could definitely see cleaner SQL like this:
select post.id, post.subject, post.body from tbl_post post where post.author="eric"
Makes your variables easier. Otherwise, this just looks like you've been dealing with people who learned databases on MS Access.

This is hungarian notation, or rather a misuse of it. The original use was to put some important aspect of the use of a variable as a prefix. The fact that a table is a table is hardly a useful way to use hungarian notation.
In ASP hungarian notation is used to specify a data type, as VBSCript only has variants. I've seen ASP programmers apply this to tables and fields in the data base too, that is one way that this misuse come into use.

the one benefit of it is that you can tell the difference between view, table, and materialized view. Of course, this doesn't really matter when you are writing the code, it is maintaining it that matters. If someone realizes that they are pulling from a view, they might be able to optimize the code better. Views based on views based on views can be very inefficient. Without the tbl_ or view_ prefix, it might be more difficult to tell if this is happening.

I think that when using an ORM such as Entiry Framework, it may make it easier to tell which side the the mapping is dealing with tables and which side is dealing with objects (entities). My tbl_Employee maps to Employee, for example. There is no way to get the layers of abstaction confused.

Related

Are modifiable join views a reasonable design choice?

To be clear, by modifiable join view I mean a view constructed from the joining of two or more tables that allows insert/update/delete actions that modify any/all of the component tables.
This may be a postgres specific question, not sure. I am also interested if other DBMSs have idiosyncratic features for modifiable join views, since as far as I can tell, they are not possible in standard SQL.
I'm working on a postgres schema, and some of my recent reading has suggested that it is possible to construct modifiable join views using instead rules (CREATE RULE ... DO INSTEAD ...). Modifiable join views seem desirable since it would allow for hiding strong normalization behind an interface, providing a mechanism for classic abstraction. Rules are the only option for implementation, since currently triggers cannot be set on views.
However, the first modifiable view I tried to design ran into problems, and I find out that many consider non-trivial rules to be harmful (see links in comments to this SO answer). Also, I can't find any examples of modifiable join views on the web.
Questions (Edit to put finer points on the questions):
Do you have any experience with modifiable join views and can you provide a concrete example with select/insert/delete/update ability?
Are they practical, i.e. can they be treated transparently without having to tiptoe around mines/black holes?
Are they ever a good design choice, in terms of functionality/effort ratio and maintainability?
Would greatly appreciate links to any examples/discussions on this topic. Thanks.
Yes, I have some experience with updatable views in general. I think they're practical in PostgreSQL. Like all design choices, they can be a good choice, and they can be a bad choice.
I find them particularly useful in dealing with supertype/subtype tables. I create one view for each subtype; the view joins the subtype to the supertype. Revoke permissions on the base tables, write rules for the view, and give client code access only to the views. All data manipulation done by client code then goes through the view and the rules defined on them.
I don't think rules are really different from any other feature in any other environment. And by environment, I mean C, C++, Java, Ruby, Python, Erlang, and BASIC, not just dbms environments.
Use the good features of a language. Avoid the bad ones.
"Don't use malloc()" is bad advice. "Always check the return value of malloc()" is good advice. "Never use rules" is bad advice. "Avoid using rules in ways that are known to have questionable behavior" is good advice. The rules you need for views on supertype/subtype tables are simple and easy to understand. They don't misbehave.
At the theoretical level, views provide logical data independence. But that's only possible if the views are updatable. (And many views should be updatable directly by the database engine, without any need of rules or triggers.)
I use them as a replacement for ORMs. I think as long as you do not run-a-muck sprinkling them everywhere through the database they can be easy enough to understand. I define a schema for an application and then whatever views are in that schema are the methods and operations of that app. The client code can be mostly automated after that since the views give the abstraction I need to write generic client code.
People point out that the rule rewrite is not a real table (but it is posing as one) which makes it possible to write things that will break. This is possible but I have yet to come across it yet. The idea is to hide the complexity in the rewrite and then only do simple deletes and update with no joins. If it turns out that a join is needed - it is time to rewrite the rule, not the top level query.
At the end, I find it a very compact way to write the database. All the ways of interfacing with it are written as rules. No connection should have access to a real table. Your business logic is very explicit. If a view does not have an UPDATE rule for it - it can not be updated period. Since you have written all this in the database level instead of the client level, it is not tied to a web framework or a particular language. This leads to a lot of flexibility in how you want to connect to the database. Imagine you used web framework, but as time goes on you need direct access to the database for another source. Direct access will also bypass all of ORM business rules you worked so hard on. With a rule writing interface you can expose, the interface without fear that the new connection will corrupt the data.
If people say you can really F UP a database with them - then sure - of course you can. But you can with everything else too. If people say you can not use them at all with out mucking things up, then I would disagree.
Two quick links:
Why using rules is bad idea
Triggers on views
My personal preference is to use views only for reading data, (virtually) never for inserting or updating. By essentially re-normalizing data (which sounds like what you are doing) in your database, you are likely creating a system that will be very difficult to test and maintain in the long term.
If at all possible, look at mapping your denormalized data back to a normal schema somewhere in your application code, and providing it to the database that way (to individual tables IMHO) in a single transaction.
I know in SQL Server if you update a view you must limit the change to only one table anyway which makes using views for updating useless in my mind as you have to know which fields go with which tables anyway.
If you want to abstract the information out and not have to worry about the database structure for inserts adn updates, an ORM mught do a better job for you than views.
I have never used modifiable views of any sort but as you are asking whether they are a "reasonable design choice", can I suggest an alternative design choice with many benefits where modifiable views are not needed: a Transactional API
Basically what this amounts to is:
Users have no access to tables and cannot issue insert, update, delete statements at all
Users have access to functions that represent well defined transactions - at the simplest level these may just do a single DML, but often would not. The important thing is that they map to transactions in the 'business' sense rather than in the 'database' sense
For querying, users have access to (non-modifiable) views
I do usually do views in the form of "last-valid-record" just hidding and tracking modifications (like a wiki)
The only drawback that I see to this is: then you use your view as a table, and you join it with anything, and and you use it on "wheres", and you insert records on it, and so on, but behinds you have made lot of performance lost compared to the same acctions against a real table (more bigger and more complex). I think it depends on how many people must understud de schema. Its true that some DBMS also admit to index the views, but I think you lose an important amount of performance anyway. Sorry about my english.

Why prefix sql function names?

What is a scenario that exemplifies a good reason to use prefixes, such as fn_GetName, on function names in SQL Server? It would seem that it would be unnecessary since usually the context of its usage would make it clear that it's a function. I have not used any other language that has ever needed prefixes on functions, and I can't think of a good scenario that would show why SQL is any different.
My only thinking is that perhaps in older IDE's it was useful for grouping functions together when the database objects were all listed together, but modern IDE's already make it clear what is a function.
You are correct about older IDEs
As a DBA, trying to fix permissions using SQL Server Enterprise Manager (SQL Server 2000 and 7.0), it was complete bugger trying to view permissions. If you had ufn or usp or vw it became easier to group things together because of how the GUI presented the data.
Saying that, if you have SELECT * FROM Thing, what is Thing? A view or a table? It may work for you as a developer but it will crash when deployed because you don't grant permissions on the table itself: only views or procs. Surely a vwThing will keep your blood pressure down...?
If you use schemas, it becomes irrelevant. You can "namespace" your objects. For example, tables in "Data" and other objects per client in other schemas eg WebGUI
Edit:
Function. You have table valued and scalar functions. If you stick with the code "VerbNoun" concept, how do you know which is which without some other clue. (of course, this can't happen because object names are unique)
SELECT dbo.GetName() FROM MyTable
SELECT * FROM dbo.GetName()
If you use a plural to signify a table valued function, this is arguably worse
SELECT dbo.GetName() FROM MyTable
SELECT * FROM dbo.GetNames()
Whereas this is less ambiguous, albeit offensive to some folk ;-)
SELECT dbo.sfnGetName() FROM MyTable
SELECT * FROM dbo.tfnGetName()
With schemas. And no name ambiguity.
SELECT ScalarFN.GetName() FROM MyTable
SELECT * FROM TableFN.GetName()
Your "any other language" comment doesn't apply. SQL isn't structured like c#, Java, f#, Ada (OK, PL/SQL might be), VBA, whatever: there is no object or namespace hierarchy. No Object.DoStuff method stuff.
A prefix may be just the thing to keep you sane...
There's no need to prefix function names with fn_ any more than there's a need to prefix table names with t_ (a convention I have seen). This sort of systematic prefix tends to be used by people who are not yet comfortable with the language and who need the convention as an extra help to understanding the code.
Like all naming conventions, it hardly matters what the convention actually is. What really matter is to be consistent. So even if the convention is wrong, it is still important to stick to it for consistency. Yes, it may be argued that if the naming convention is wrong then it should be changed, but the effort implies a lot: rename all objects, change source code, all maintenance procedures, get the dev team committed to follow the new convention, have all the support and ops personnel follow the new rules etc etc. On a large org, the effort to change a well established naming convention is just simply overwhelming.
I don't know what your situation is, but you should consider carefully before proposing a naming convention change just for sake of 'pretty'. No matter how bad the existing naming convention in your org is, is far better to stick to it and keep the naming consistency than to ignore it and start your own.
Of course, more often than not, the naming convention is not only bad, is also not followed and names are inconsistent. In that case, sucks to be you...
What is a scenario that exemplifies a
good reason to use prefixes
THere is none. People do all kind of stuff because they always did so, and quite a number of bad habits are explained with ancient knowledge that is wrong for many years.
I'm not a fan of prefixes, but perhaps one advantage could be that these fn_ prefixes might make it easier to identify that a particular function is user-defined, rather than in-built.
We had many painful meetings at our company about this. IMO, we don't need prefixes on any object names. However, you could make an argument for putting them on views, if the name of the view might conflict with the underlying table name.
In any case, there is no SQL requirement that prefixes be used, and frankly, IMO, they have no real value.
As others have noticed, any scenario you define can easily be challenged, so there's no rule defining it as necessary; however, there's equally no rule saying it's unnecessary, either. Like all naming conventions, it's more important to have consistent usage rather than justification.
One (and the only) advantage I can think of is that it a consequently applied prefixing scheme could make using intellisense / autocomplete easier since related functionality is automagically grouped together.
Since I recently stumbled over this question, I'd like to add the following point in favour of prefixes: Imagine, you have some object id from some system table and you want to determine whether it's a function, proc, view etc. You could of course run a test for each type, it's much easier though to extract the prefix from the object name and then act upon that. This gets even easier when you use an underscore to separate the prefix from the name, e.g. usp_Foo instead of uspFoo. IMHO it's not just about stupid IDEs.

Many-to-many relationship: use associative table or delimited values in a column?

Update 2009.04.24
The main point of my question is not developer confusion and what to do about it.
The point is to understand when delimited values are the right solution.
I've seen delimited data used in commercial product databases (Ektron lol).
SQL Server even has an XML datatype, so that could be used for the same purpose as delimited fields.
/end Update
The application I'm designing has some many-to-many relationships. In the past, I've often used associative tables to represent these in the database. This has caused some confusion to the developers.
Here's an example DB structure:
Document
---------------
ID (PK)
Title
CategoryIDs (varchar(4000))
Category
------------
ID (PK)
Title
There is a many-to-many relationship between Document and Category.
In this implementation, Document.CategoryIDs is a big pipe-delimited list of CategoryIDs.
To me, this is bad because it requires use of substring matching in queries -- which cannot make use of indexes. I think this will be slow and will not scale.
With that model, to get all Documents for a Category, you would need something like the following:
select * from documents where categoryids like '%|' + #targetCategoryId + '|%'
My solution is to create an associative table as follows:
Document_Category
-------------------------------
DocumentID (PK)
CategoryID (PK)
This is confusing to the developers. Is there some elegant alternate solution that I'm missing?
I'm assuming there will be thousands of rows in Document. Category may be like 40 rows or so. The primary concern is query performance. Am I over-engineering this?
Is there a case where it's preferred to store lists of IDs in database columns rather than pushing the data out to an associative table?
Consider also that we may need to create many-to-many relationships among documents. This would suggest an associative table Document_Document. Is that the preferred design or is it better to store the associated Document IDs in a single column?
Thanks.
This is confusing to the developers.
Get better developers. That is the right approach.
Your suggestion IS the elegant, powerful, best practice solution.
Since I don't think the other answers said the following strongly enough, I'm going to do it.
If your developers 1) can't understand how to model a many-to-many relationship in a relational database, and 2) strongly insist on storing your CategoryIDs as delimited character data,
Then they ought to immediately lose all database design privileges. At the very least, they need an actual experienced professional to join their team who has the authority to stop them from doing something this unwise and can give them the database design training they are completely lacking.
Last, you should not refer to them as "database developers" again until they are properly up to speed, as this is a slight to those of us who actually are competent developers & designers.
I hope this answer is very helpful to you.
Update
The main point of my question is not developer confusion and what to do about it.
The point is to understand when delimited values are the right solution.
Delimited values are the wrong solution except in extremely rare cases. When individual values will ever be queried/inserted/deleted/updated this proves it was the wrong decision, because you have to parse and touch all the other values just to work with the desired one. By doing this you're violating first (!!!) normal form (this phrase should sound to you like an unbelievably vile expletive). Using XML to do the same thing is wrong, too. Storing delimited values or multi-value XML in a column could make sense when it is treated as an indivisible and opaque "property bag" that is NOT queried on by the database but is always sent whole to another consumer (perhaps a web server or an EDI recipient).
This takes me back to my initial comment. Developers who think violating first normal form is a good idea are very inexperienced developers in my book.
I will grant there are some pretty sophisticated non-relational data storage implementations out there using text property bags (such as Facebook(?) and other multi-million user sites running on thousands of servers). Well, when your database, user base, and transactions per second are big enough to need that, you'll have the money to develop it. In the meantime, stick with best practice.
It's almost always a big mistake to use comma separated IDs.
RDBMS are designed to store relationships.
My solution is to create an
associative table as follows: This is
confusing to the developers
Really? this is database 101, if this is confusing to them then maybe they need to step away from their wizard generated code and learn some basic DB normalization.
What you propose is the right solution!!
The Document_Category table in your design is certainly the correct way to approach the problem. If it's possible, I would suggest that you educate the developers instead of coming up with a suboptimal solution (and taking a performance hit, and not having referential integrity).
Your other options may depend on the database you're using. For example, in SQL Server you can have an XML column that would allow you to store your array in a pre-defined schema and then do joins based on the contents of that field. Other database systems may have something similar.
The many-to-many mapping you are doing is fine and normalized. It also allows for other data to be added later if needed. For example, say you wanted to add a time that the category was added to the document.
I would suggest having a surrogate primary key on the document_category table as well. And a Unique(documentid, categoryid) constraint if that makes sense to do so.
Why are the developers confused?
The 'this is confusing to the developers' design means you have under-educated developers. It is the better relational database design - you should use it if at all possible.
If you really want to use the list structure, then use a DBMS that understands them. Examples of such databases would be the U2 (Unidata, Universe) DBMS, which are (or were, once upon a long time ago) based on the Pick DBMS. There are likely to be other similar DBMS providers.
This is the classic object-relational mapping problem. The developers are probably not stupid, just inexperienced or unaccustomed to doing things the right way. Shouting "3NF!" over and over again won't convince them of the right way.
I suggest you ask your developers to explain to you how they would get a count of documents by category using the pipe-delimited approach. It would be a nightmare, whereas the link table makes it quite simple.
The number one reason that my developers try this "comma-delimited values in a database column" approach is that they have a perception that adding a new table to address the need for multiple values will take too long to add to the data model and the database.
Most of them know that their work around is bad for all kinds of reasons, but they choose this suboptimal method because they just can. They can do this and maybe never get caught, or they will get caught much later in the project when it is too expensive and risky to fix it. Why do they do this? Because their performance is measured solely on speed and not on quality or compliance.
It could also be, as on one of my projects, that the developers had a table to put the multi values in but were under the impression that duplicating that data in the parent table would speed up performance. They were wrong and they were called out on it.
So while you do need an answer to how to handle these costly, risky, and business-confidence damaging tricks, you should also try to find the reason why the developers believe that taking this course of action is better in the short and the long run for the project and company. Then fix both the perception and the data structures.
Yes, it could just be laziness, malicious intent, or cluelessness, but I'm betting most of the time developers do this stuff because they are constantly being told "just get it done". We on the data model and database design sides need to ensure that we aren't sending the wrong message about how responsive we can be to requests to fulfill a business requirement for a new entity/table/piece of information.
We should also see that data people need to be constantly monitoring the "as-built" part of our data architectures.
Personally, I never authorize the use of comma delimited values in a relational database because it is actually faster to build a new table than it is to build a parsing routine to create, update, and manage multiple values in a column and deal with all the anomalies introduced because sometimes that data has embedded commas, too.
Bottom line, don't do comma delimited values, but find out why the developers want to do it and fix that problem.

What are views good for?

I'm just trying to get a general idea of what views are used for in RDBMSes. That is to say, I know what a view is and how to make one. I also know what I've used them for in the past.
But I want to make sure I have a thorough understanding of what a view is useful for and what a view shouldn't be useful for. More specifically:
What is a view useful for?
Are there any situations in which it is tempting to use a view when you shouldn't use one?
Why would you use a view in lieu of something like a table-valued function or vice versa?
Are there any circumstances that a view might be useful that aren't apparent at first glance?
(And for the record, some of these questions are intentionally naive. This is partly a concept check.)
In a way, a view is like an interface. You can change the underlying table structure all you want, but the view gives a way for the code to not have to change.
Views are a nice way of providing something simple to report writers. If your business users want to access the data from something like Crystal Reports, you can give them some views in their account that simplify the data -- maybe even denormalize it for them.
1) What is a view useful for?
IOPO In One Place Only•Whether you consider the data itself or the queries that reference the joined tables, utilizing a view avoids unnecessary redundancy. •Views also provide an abstracting layer preventing direct access to the tables (and the resulting handcuffing referencing physical dependencies). In fact, I think it's good practice1 to offer only abstracted access to your underlying data (using views & table-valued functions), including views such as CREATE VIEW AS SELECT * FROM tblData1I hafta admit there's a good deal of "Do as I say; not as I do" in that advice ;)
2) Are there any situations in which it is tempting to use a view when you shouldn't use one?
Performance in view joins used to be a concern (e.g. SQL 2000). I'm no expert, but I haven't worried about it in a while. (Nor can I think of where I'm presently using view joins.)Another situation where a view might be overkill is when the view is only referenced from one calling location and a derived table could be used instead. Just like an anonymous type is preferable to a class in .NET if the anonymous type is only used/referenced once. • See the derived table description in http://msdn.microsoft.com/en-us/library/ms177634.aspx
3) Why would you use a view in lieu of something like a table-valued function or vice versa?
(Aside from performance reasons) A table-valued function is functionally equivalent to a parameterized view. In fact, a common simple table-valued function use case is simply to add a WHERE clause filter to an already existing view in a single object.
4) Are there any circumstances that a view might be useful that aren't apparent at first glance?
I can't think of any non-apparent uses of the top of my head. (I suppose if I could, that would make them apparent ;)
Views can be used to provide security (ie: users can have access to views that only access certain columns in a table), views can provide additional security for updates, inserts, etc. Views also provide a way to alias column names (as do sp's) but views are more of an isolation from the actual table.
In a sense views denormalize. Denormalization is sometimes necessary to provide data in a more meaningful manner. This is what a lot of applications do anyway by way of domain modeling in their objects. They help present the data in a way that more closely matches a business' perspective.
In addition to what the others have stated, views can also be useful for removing more complecated SQL queries from the application.
As an example, instead of in an application doing:
sql = "select a, b from table1 union
select a, b from table2";
You could abstract that to a view:
create view union_table1_table2_v as
select a,b from table1
union
select a,b from table2
and in the app code, simply have:
sql = "select a, b from union_table1_table2_v";
Also if the data structures ever change, you won't have to change the app code, recompile, and redeploy. you would just change the view in the db.
Views hide the database complexity. They are great for a lot of reasons and are useful in a lot of situations, but if you have users that are allowed to write their own queries and reports, you can use them as a safeguard to make sure they don't submit badly designed queries with nasty cartesian joins that take down your database server.
The OP asked if there were situations where it might be tempting to use a view, but it's not appropriate.
What you don't want to use a view for is a substitute for complex joins. That is, don't let your procedural programming habit of breaking a problem down into smaller pieces lead you toward using several views joined together instead of one larger join. Doing so will kill the database engine's efficiency since it's essentially doing several separate queries rather than one larger one.
For example, let's say you have to join tables A, B, C, and D together. You may be tempted to make a view out of tables A & B and a view out of C & D, then join the two views together. It's much better to just join A, B, C, and D in one query.
Views can centralize or consolidate data. Where I'm at we have a number of different databases on a couple different linked servers. Each database holds data for a different application. A couple of those databases hold information that are relavent to a number of different applications. What we'll do in those circumstances is create a view in that application's database that just pulls data from the database where the data is really stored, so that the queries we write don't look like they're going across different databases.
The responses so far are correct -- views are good for providing security, denormalization (although there is much pain down that road if done wrong), data model abstraction, etc.
In addition, views are commonly used to implement business logic (a lapsed user is a user who has not logged in in the last 40 days, that sort of thing).
Views save a lot of repeated complex JOIN statements in your SQL scripts. You can just encapsulate some complex JOIN in some view and call it in your SELECT statement whenever needed. This would sometimes be handy, straight forward and easier than writing out the join statements in every query.
A view is simply a stored, named SELECT statement. Think of views like library functions.
I wanted to highlight the use of views for reporting. Often, there is a conflict between normalizing the database tables to speed up performance, especially for editing and inserting data (OLTP uses), and denormalizing to reduce the number of table joins for queries for reporting and analysis (OLAP uses). Of necessity, OLTP usually wins, because data entry must have optimal performance. Creating views, then, for optimal reporting performance, can help to satisfy both classes of users (data entry and report viewers).
I remember a very long SELECT which involved several UNIONs. Each UNION included a join to a price table which was created on the fly by a SELECT that was itself fairly long and hard to understand. I think it would have been a good idea to have a view that to create the price table. It would have shortened the overall SELECT by about half.
I don't know if the DB would evaluate the view once, or once each time in was invoked. Anyone know? If the former, using a view would improved performance.
Anytime you need [my_interface] != [user_interface].
Example:
TABLE A:
id
info
VIEW for TABLE A:
Customer Information
this is a way you might hide the id from the customer and rename the info to a more verbose name both at once.
The view will use underlying index for primary key id, so you won't see a performance loss, just better abstraction of the select query.

SQL Server 2005: Wrapping Tables by Views - Pros and Cons

Background
I am working on a legacy small-business automation system (inventory, sales, procurement, etc.) that has a single database hosted by SQL Server 2005 and a bunch of client applications. The main client (used by all users) is an MS Access 2003 application (ADP), and other clients include various VB/VBA applications like Excel add-ins and command-line utilities.
In addition to 60 or so tables (mostly in 3NF), the database contains about 200 views, about 170 UDFs (mostly scalar and table-valued inline ones), and about 50 stored procedures. As you might have guessed, some portion of so-called "business logic" is encapsulated in this mass of T-SQL code (and thus is shared by all clients).
Overall, the system's code (including the T-SQL code) is not very well organized and is very refactoring-resistant, so to speak. In particular, schemata of most of the tables cry for all kinds of refactorings, small (like column renamings) and large (like normalization).
FWIW, I have pretty long and decent application development experience (C/C++, Java, VB, and whatnot), but I am not a DBA. So, if the question will look silly to you, now you know why it is so. :-)
Question
While thinking about refactoring all this mess (in a peacemeal fashion of course), I've come up with the following idea:
For each table, create a "wrapper" view that (a) has all the columns that the table has; and (b) in some cases, has some additional computed columns based on the table's "real" columns.
A typical (albeit simplistic) example of such additional computed column would be sale price of a product derived from the product's regular price and discount.
Reorganize all the code (both T-SQL and VB/VBA client code) so that only the "wrapper" views refer to tables directly.
So, for example, even if an application or a stored procedure needed to insert/update/delete records from a table, they'd do that against the corresponding "table wrapper" view, not against the table directly.
So, essentially this is about isolating all the tables by views from the rest of the system.
This approach seems to provide a lot of benefits, especially from maintainability viewpoint. For example:
When a table column is to be renamed, it can be done without rewriting all the affected client code at once.
It is easier to implement derived attributes (easier than using computed columns).
You can effectively have aliases for column names.
Obviously, there must be some price for all these benefits, but I am not sure that I am seeing all the catches lurking out there.
Did anybody try this approach in practice? What are the major pitfalls?
One obvious disadvantage is the cost of maintaining "wrapper" views in sync with their corresponding tables (a new column in a table has to be added to a view too; a column deleted from a table has to be deleted from the view too; etc.). But this price seems to be small and fair for making the overall codebase more resilient.
Does anyone know any other, stronger drawbacks?
For example, usage of all those "wrapper" views instead of tables is very likely to have some adverse performance impact, but is this impact going to be substantial enough to worry about it? Also, while using ADODB, it is very easy to get a recordset that is not updateable even when it is based just on a few joined tables; so, are the "wrapper" views going to make things substantially worse? And so on, and so forth...
Any comments (especially shared real experience) would be greatly appreciated.
Thank you!
P.S. I stepped on the following old article that discusses the idea of "wrapper" views:
The Big View Myth
The article advises to avoid the approach described above. But... I do not really see any good reasons against this idea in the article. Quite the contrary, in its list of good reasons to create a view, almost each item is exactly why it is so tempting to create a "wrapper" view for each and every table (especially in a legacy system, as a part of refactoring process).
The article is really old (1999), so whatever reasons were good then may be no longer good now (and vice versa). It would be really interesting to hear from someone who considered or even tried this idea recently, with the latest versions of SQL Server and MS Access...
When designing a database, I prefer the following:
no direct table access by the code (but is ok from stored procedures and views and functions)
a base view for each table that includes all columns
an extended view for each table that includes lookup columns (types, statuses, etc.)
stored procedures for all updates
functions for any complex queries
this allows the DBA to work directly with the table (to add columns, clean things up, inject data, etc.) without disturbing the code base, and it insulates the code base from any changes made to the table (temporary or otherwise)
there may be performance penalties for doing things this way, but so far they have not been significant - and the benefits of the layer of insulation have been life-savers several times
You won't notice any performance impact for one-table views; SQL Server will use the underlying table when building the execution plans for any code using those views. I recommend you schema-bind those views, to avoid accidentally changing the underlying table without changing the view (think of the poor next guy.)
When a table column is to be renamed
In my experience, this rarely happens. Adding columns, removing columns, changing indexes and changing data types are the usual alter table scripts that you'll run.
It is easier to implement derived attributes (easier than using computed columns).
I would dispute that. What's the difference between putting the calculation in a column definition and putting it in a view definition? Also, you'll see a performance hit for moving it into a view instead of a computed column. The only real advantage is that changing the calculation is easier in a view than by altering a table (due to indexes and data pages.)
You can effectively have aliases for column names.
That's the real reason to have views; aliasing tables and columns, and combining multiple tables. Best practice in my past few jobs has been to use views where I needed to denormalise the data (lookups and such, as you've already pointed out.)
As usual, the most truthful response to a DBA question is "it depends" - on your situation, skillset, etc. In your case, refactoring "everything" is going to break all the apps anyways. If you do fix the base tables correctly, the indirection you're trying to get from your views won't be required, and will only double your schema maintenance for any future changes. I'd say skip the wrapper views, fix the tables and stored procs (which provide enough information hiding already), and you'll be fine.
I agree with Steven's comment--primarily because you are using Access. It's extremely important to keep the pros/cons of Access in focus when re-designing this database. I've been there, done that with the Access front-end/SQL Server back-end (although it wasn't an ADP project).
I would add that views are nice for ensuring that data is not changed outside of the Access forms in the project. The downside is that stored procedures be required for all updates--if you don't already have those, they'd have to be created too.