I am learning SQL. It seems that PostgreSQL allows you to update a table through a 'view', if you have visibility of a few select columns of the table. On the other hand, SQLite simply does not support this (which makes more sense to me).
I wonder whether it is a good practice to update tables through views even when it is allowed?
This question may be a matter of opinion, but I would say that updating data through a view is generally not a good practice, although there are exceptions.
One of the main reasons to define views is to isolate users from changes in underlying data structures. Because not all views are updatable, that means that a change to the definition of a view (but not the result set) could invalidate code.
In some databases, it is possible to get around this by using triggers on views.
I should add that this is "general" thinking. Another reason to have views is for access control and security. For instance, some users may not be able to see some columns in some tables; they have access to the view but not the underlying table. In this case, updates to views are a bit more reasonable.
All that said, I should point out that I'm not really a fan of having users update data directly at all. My preference is to do such updates through stored procedures, so there is much better control over the data model, auditing, and user-access.
Related
I am studying for a final exam, and came across this question:
Explain why updating views is not recommended. Explain how triggers can be used to support view updates.
I have looked on the web, and read a couple chapters from the book to no avail.
I have seen points made to where views can help make life easier, but none arguing against them.
Is this a possible answer?
One could use the INSTEAD OF clause in a TRIGGER statement in order to circumvent the updating of a table. This would allow for the update of multiple tables that could be represented by one view.
So, my questions are:
1.) Why is updating views not recommended?
2.) How can triggers be used as a solution to the problem?
There are many restrictions on inherently updatable views.
This can be both frustrating and fragile, as future evolution of your view and/or schema might make the view no longer inherently updatable, breaking code that relies on this feature.
At the expense of a few lines of code, an INSTEAD OF trigger both reduces the above concern and allows you to update views that are not inherently updatable. You can also use an INSTEAD OF trigger on an inherently updatable view to override the default behavior.
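To make the INSTEAD OF idea concrete, here is a minimal sketch using SQLite through Python's standard sqlite3 module. SQLite is used only because it runs anywhere; the table, view, and trigger names (person, address, person_address, person_address_update) are made up for illustration, and trigger syntax differs somewhat in other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema for illustration only.
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE address (person_id INTEGER PRIMARY KEY REFERENCES person(id),
                          city TEXT NOT NULL);
    INSERT INTO person VALUES (1, 'Ann');
    INSERT INTO address VALUES (1, 'Oslo');

    -- A join view: SQLite will not accept UPDATE on it by itself.
    CREATE VIEW person_address AS
        SELECT p.id, p.name, a.city
        FROM person p JOIN address a ON a.person_id = p.id;

    -- The trigger spells out what an UPDATE on the view should do
    -- to each underlying table.
    CREATE TRIGGER person_address_update
    INSTEAD OF UPDATE ON person_address
    BEGIN
        UPDATE person SET name = NEW.name WHERE id = OLD.id;
        UPDATE address SET city = NEW.city WHERE person_id = OLD.id;
    END;
""")

# One UPDATE against the view changes rows in both base tables.
conn.execute("UPDATE person_address SET name = 'Anna', city = 'Bergen' WHERE id = 1")
row = conn.execute("SELECT name, city FROM person_address WHERE id = 1").fetchone()
print(row)
```

Because the trigger, not the engine's built-in rules, defines the update, the view definition can change later without silently changing what client UPDATE statements are allowed to do.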
When researching views, be sure to rely on relatively recent data as views have changed over the years so older opinions on the subject may not be valid. When views were first made updateable, the restrictions were many and control of the operation was little or nonexistent.
With the ability to write triggers on views in many systems, the restrictions have fallen away and control is precise. We can now determine exactly what happens to all the data during DML to views. It is now to the point where I disallow direct access to tables by applications. All DML originating from apps has to go through views (or stored procedures, but they are not as good as view triggers). The benefits are so vast I don't see why it hasn't become a universal standard.
Indeed, many people's "views" on views (unfortunately, many people who are in charge of databases) seem to be stuck in the 1990's. Some don't want any views at all in their database. Some allow views but don't allow using them for DML. Many insist on giving them special names ("VW_name", "name_View" or similar) which breaks an extremely useful wall of abstraction for your data. Data abstraction is not a strong point of databases, so grab it where you can.
To be clear, by modifiable join view I mean a view constructed from the joining of two or more tables that allows insert/update/delete actions that modify any/all of the component tables.
This may be a postgres specific question, not sure. I am also interested if other DBMSs have idiosyncratic features for modifiable join views, since as far as I can tell, they are not possible in standard SQL.
I'm working on a postgres schema, and some of my recent reading has suggested that it is possible to construct modifiable join views using instead rules (CREATE RULE ... DO INSTEAD ...). Modifiable join views seem desirable since it would allow for hiding strong normalization behind an interface, providing a mechanism for classic abstraction. Rules are the only option for implementation, since currently triggers cannot be set on views.
However, the first modifiable view I tried to design ran into problems, and I found out that many consider non-trivial rules to be harmful (see links in comments to this SO answer). Also, I can't find any examples of modifiable join views on the web.
Questions (Edit to put finer points on the questions):
Do you have any experience with modifiable join views and can you provide a concrete example with select/insert/delete/update ability?
Are they practical, i.e. can they be treated transparently without having to tiptoe around mines/black holes?
Are they ever a good design choice, in terms of functionality/effort ratio and maintainability?
Would greatly appreciate links to any examples/discussions on this topic. Thanks.
Yes, I have some experience with updatable views in general. I think they're practical in PostgreSQL. Like all design choices, they can be a good choice, and they can be a bad choice.
I find them particularly useful in dealing with supertype/subtype tables. I create one view for each subtype; the view joins the subtype to the supertype. Revoke permissions on the base tables, write rules for the view, and give client code access only to the views. All data manipulation done by client code then goes through the view and the rules defined on them.
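A rough sketch of the supertype/subtype pattern follows. It is not the PostgreSQL rule-based setup described above: it swaps in SQLite INSTEAD OF triggers (run through Python's sqlite3 module) so the example is self-contained, and it skips the permission step because SQLite has no GRANT/REVOKE. All names (party, employee, employee_v) are hypothetical, and the caller is assumed to supply the key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Supertype and one subtype (hypothetical names).
    CREATE TABLE party (party_id INTEGER PRIMARY KEY,
                        display_name TEXT NOT NULL);
    CREATE TABLE employee (party_id INTEGER PRIMARY KEY REFERENCES party(party_id),
                           salary INTEGER NOT NULL);

    -- One view per subtype, joining the subtype to the supertype.
    CREATE VIEW employee_v AS
        SELECT p.party_id, p.display_name, e.salary
        FROM party p JOIN employee e ON e.party_id = p.party_id;

    -- Inserting through the view writes one row to each base table.
    CREATE TRIGGER employee_v_insert INSTEAD OF INSERT ON employee_v
    BEGIN
        INSERT INTO party (party_id, display_name)
            VALUES (NEW.party_id, NEW.display_name);
        INSERT INTO employee (party_id, salary)
            VALUES (NEW.party_id, NEW.salary);
    END;

    -- Deleting through the view removes the subtype row, then the supertype row.
    CREATE TRIGGER employee_v_delete INSTEAD OF DELETE ON employee_v
    BEGIN
        DELETE FROM employee WHERE party_id = OLD.party_id;
        DELETE FROM party WHERE party_id = OLD.party_id;
    END;
""")

conn.execute(
    "INSERT INTO employee_v (party_id, display_name, salary) VALUES (1, 'Bob', 50000)"
)
inserted = conn.execute("SELECT * FROM employee_v").fetchall()

conn.execute("DELETE FROM employee_v WHERE party_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM party").fetchone()[0]
print(inserted, remaining)
```

Client code only ever names employee_v; which base table each column lives in stays an implementation detail of the view and its triggers.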
I don't think rules are really different from any other feature in any other environment. And by environment, I mean C, C++, Java, Ruby, Python, Erlang, and BASIC, not just dbms environments.
Use the good features of a language. Avoid the bad ones.
"Don't use malloc()" is bad advice. "Always check the return value of malloc()" is good advice. "Never use rules" is bad advice. "Avoid using rules in ways that are known to have questionable behavior" is good advice. The rules you need for views on supertype/subtype tables are simple and easy to understand. They don't misbehave.
At the theoretical level, views provide logical data independence. But that's only possible if the views are updatable. (And many views should be updatable directly by the database engine, without any need of rules or triggers.)
I use them as a replacement for ORMs. I think as long as you do not run amok sprinkling them everywhere through the database, they can be easy enough to understand. I define a schema for an application, and then whatever views are in that schema are the methods and operations of that app. The client code can be mostly automated after that, since the views give the abstraction I need to write generic client code.
People point out that the rule rewrite is not a real table (but it is posing as one), which makes it possible to write things that will break. This is possible, but I have yet to come across it. The idea is to hide the complexity in the rewrite and then only do simple deletes and updates with no joins. If it turns out that a join is needed, it is time to rewrite the rule, not the top-level query.
At the end, I find it a very compact way to write the database. All the ways of interfacing with it are written as rules. No connection should have access to a real table. Your business logic is very explicit. If a view does not have an UPDATE rule, it cannot be updated, period. Since you have written all this at the database level instead of the client level, it is not tied to a web framework or a particular language. This leads to a lot of flexibility in how you want to connect to the database. Imagine you used a web framework, but as time goes on you need direct access to the database for another source. Direct access will also bypass all of the ORM business rules you worked so hard on. With a rule-based interface, you can expose the interface without fear that the new connection will corrupt the data.
If people say you can really F UP a database with them, then sure, of course you can. But you can with everything else too. If people say you cannot use them at all without mucking things up, then I would disagree.
Two quick links:
Why using rules is bad idea
Triggers on views
My personal preference is to use views only for reading data, (virtually) never for inserting or updating. By essentially re-normalizing data (which sounds like what you are doing) in your database, you are likely creating a system that will be very difficult to test and maintain in the long term.
If at all possible, look at mapping your denormalized data back to a normal schema somewhere in your application code, and providing it to the database that way (to individual tables IMHO) in a single transaction.
I know that in SQL Server, if you update a view, you must limit the change to only one table anyway, which makes using views for updating useless in my mind, as you have to know which fields go with which tables.
If you want to abstract the information out and not have to worry about the database structure for inserts and updates, an ORM might do a better job for you than views.
I have never used modifiable views of any sort but as you are asking whether they are a "reasonable design choice", can I suggest an alternative design choice with many benefits where modifiable views are not needed: a Transactional API
Basically what this amounts to is:
Users have no access to tables and cannot issue insert, update, delete statements at all
Users have access to functions that represent well defined transactions - at the simplest level these may just do a single DML, but often would not. The important thing is that they map to transactions in the 'business' sense rather than in the 'database' sense
For querying, users have access to (non-modifiable) views
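The three points above can be sketched as follows. In a server DBMS the transaction would typically be a stored procedure that users are granted execute rights on; SQLite has neither stored procedures nor grants, so this sketch stands in a Python function for the procedure. The account table, the account_balance view, and the transfer function are all hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema; the CHECK constraint forbids overdrafts.
    CREATE TABLE account (id INTEGER PRIMARY KEY,
                          balance INTEGER NOT NULL CHECK (balance >= 0));
    INSERT INTO account VALUES (1, 100), (2, 0);

    -- Read-only access goes through a view.
    CREATE VIEW account_balance AS SELECT id, balance FROM account;
""")


def transfer(conn, src, dst, amount):
    """One 'business' transaction: both updates succeed or neither does."""
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )


transfer(conn, 1, 2, 30)

try:
    # The debit would overdraw account 1, so the credit is rolled back too.
    transfer(conn, 1, 2, 500)
except sqlite3.IntegrityError:
    pass

balances = conn.execute(
    "SELECT id, balance FROM account_balance ORDER BY id"
).fetchall()
print(balances)
```

The caller never writes an UPDATE; it can only ask for a transfer, which is the unit that makes sense in business terms.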
I usually write views in a "last-valid-record" form, which simply hides and tracks modifications (like a wiki).
The only drawback I see is this: once you use your view as a table, joining it with anything, using it in WHERE clauses, inserting records into it, and so on, you have lost a lot of performance behind the scenes compared to the same actions against a real table (a bigger and more complex one). I think it depends on how many people must understand the schema. It's true that some DBMSs also allow views to be indexed, but I think you lose a significant amount of performance anyway. Sorry about my English.
When should a View actually be used over an actual Table? What gains should I expect this to produce?
Overall, what are the advantages of using a view over a table? Shouldn't I design the table in the way the view should look like in the first place?
Oh there are many differences you will need to consider
Views for selection:
Views provide abstraction over tables. You can add/remove fields easily in a view without modifying your underlying schema
Views can model complex joins easily.
Views can hide database-specific stuff from you, e.g. if you need to do some checks using Oracle's SYS_CONTEXT function, among many other things
You can easily manage your GRANTS directly on views, rather than the actual tables. It's easier to manage if you know a certain user may only access a view.
Views can help you with backwards compatibility. You can change the underlying schema, but the views can hide those facts from a certain client.
Views for insertion/updates:
You can handle security issues with views by using such functionality as Oracle's "WITH CHECK OPTION" clause directly in the view
Drawbacks
You lose information about relations (primary keys, foreign keys)
It's not obvious whether you will be able to insert/update a view, because the view hides its underlying joins from you
Views can:
Simplify a complex table structure
Simplify your security model by allowing you to filter sensitive data and assign permissions in a simpler fashion
Allow you to change the logic and behavior without changing the output structure (the output remains the same but the underlying SELECT could change significantly)
Increase performance (SQL Server Indexed Views)
Offer specific query optimization with the view that might be difficult to glean otherwise
And you should not design tables to match views. Your base model should concern itself with efficient storage and retrieval of the data. Views are partly a tool that mitigates the complexities that arise from an efficient, normalized model by allowing you to abstract that complexity.
Also, asking "what are the advantages of using a view over a table? " is not a great comparison. You can't go without tables, but you can do without views. They each exist for a very different reason. Tables are the concrete model and Views are an abstracted, well, View.
Views are acceptable when you need to ensure that complex logic is followed every time. For instance, we have a view that creates the raw data needed for all financial reporting. By having all reports use this view, everyone is working from the same data set, rather than one report using one set of joins and another forgetting to use one which gives different results.
Views are acceptable when you want to restrict users to a particular subset of data. For instance, if you do not delete records but only mark the current one as active and the older versions as inactive, you want a view to use to select only the active records. This prevents people from forgetting to put the where clause in the query and getting bad results.
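As a small illustration of the active-records case, here is a sketch using SQLite via Python's sqlite3 module. The price table, its is_active flag, and the active_price view are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Old versions are kept and flagged inactive rather than deleted.
    CREATE TABLE price (product TEXT NOT NULL,
                        amount INTEGER NOT NULL,
                        is_active INTEGER NOT NULL);
    INSERT INTO price VALUES ('widget', 10, 0),
                             ('widget', 12, 1),
                             ('gadget', 7, 1);

    -- The filter lives in one place, so no query can forget it.
    CREATE VIEW active_price AS
        SELECT product, amount FROM price WHERE is_active = 1;
""")

active = conn.execute(
    "SELECT product, amount FROM active_price ORDER BY product"
).fetchall()
print(active)
```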
Views can be used to ensure that users only have access to a set of records - for instance, a view of the tables for a particular client and no security rights on the tables can mean that the users for that client can only ever see the data for that client.
Views are very helpful when refactoring databases.
Views are not acceptable when you use views to call views which can result in horrible performance (at least in SQL Server). We almost lost a multimillion dollar client because someone chose to abstract the database that way and performance was horrendous and timeouts frequent. We had to pay for the fix too, not the client, as the performance issue was completely our fault. When views call views, they have to completely generate the underlying view. I have seen this where the view called a view which called a view and so many millions of records were generated in order to see the three the user ultimately needed. I remember one of these views took 8 minutes to do a simple count(*) of the records. Views calling views are an extremely poor idea.
Views are often a bad idea to use to update records as usually you can only update fields from the same table (again this is SQL Server, other databases may vary). If that's the case, it makes more sense to directly update the tables anyway so that you know which fields are available.
According to Wikipedia,
Views can provide many advantages over tables:
Views can represent a subset of the data contained in a table.
Views can limit the degree of exposure of the underlying tables to the outer world: a given user may have permission to query the view, while denied access to the rest of the base table.
Views can join and simplify multiple tables into a single virtual table.
Views can act as aggregated tables, where the database engine aggregates data (sum, average, etc.) and presents the calculated results as part of the data.
Views can hide the complexity of data. For example, a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table.
Views take very little space to store; the database contains only the definition of a view, not a copy of all the data that it presents.
Views can provide extra security, depending on the SQL engine used.
A common practice is to hide joins in a view to present the user a more denormalized data model. Other uses involve security (for example by hiding certain columns and/or rows) or performance (in case of materialized views)
Views are handy when you need to select from several tables, or just to get a subset of a table.
You should design your tables in such a way that your database is well normalized (minimum duplication). This can make querying somewhat difficult.
Views are a bit of separation, allowing you to view the data in the tables differently than they are stored.
You should design your table WITHOUT considering the views.
Apart from saving joins and conditions, Views do have a performance advantage: SQL Server may calculate and save its execution plan in the view, and therefore make it faster than "on the fly" SQL statements.
Views may also ease your work regarding user access at field level.
First of all, as the name suggests, a view is immutable. That's because a view is nothing other than a virtual table created from a stored query in the DB.
Because of this you have some characteristics of views:
you can show only a subset of the data
you can join multiple tables into a single view
you can aggregate data in a view (select count)
Views don't actually hold data; they don't need any tablespace since they are virtual aggregations of underlying tables.
So there are a gazillion use cases for which views are better fitted than tables. Just think about displaying only the active users on a website: a view would be better because you operate only on a subset of the data that is actually in your DB (active and inactive users).
check out this article
hope this helped..
Recently I faced an interview and I was asked the above question.
I was dumbstruck when I thought about it.
Interviewer said:
All people are saying views have lots of advantages but I find no disadvantages, why so?
When the table is not there, the view will not work.
DML is not possible if the view is based on more than one table.
A view is also a database object, so it occupies some space.
When the table is dropped, the view becomes inactive; it depends on the table objects.
Querying a view takes more time than querying the table directly.
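The dependency on the underlying table is easy to see in a small experiment. This sketch uses SQLite through Python's sqlite3 module, with made-up names (users, user_names). Behavior varies by engine: SQLite lets the table be dropped and the view then fails when queried, whereas PostgreSQL, for example, refuses the DROP TABLE unless CASCADE is given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE VIEW user_names AS SELECT name FROM users;

    -- SQLite allows this even though a view still refers to the table.
    DROP TABLE users;
""")

try:
    conn.execute("SELECT * FROM user_names").fetchall()
    error = None
except sqlite3.OperationalError as exc:
    # The view definition survives, but it can no longer be resolved.
    error = str(exc)

print(error)
```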
Most of the things I would say have already been covered, I would add this though.
Views are useful in many situations but making too much use of them can be a mistake because they tie your hands in terms of query structure. Often when your overall query contains several views within it (especially when views are layered), or when a view has been adapted for a slightly different purpose to what was originally intended, you find that there is a far better way of writing the query if you just expand the views and change the logic.
Like any tool, views can be misused particularly when you're not sure how they should be used properly.
Chris Mullins defines three basic view implementation rules:
The View Usage Rule
The Proliferation Avoidance Rule
The View Synchronization Rule
If you don't get these things right you get code maintenance problems, performance problems, security problems, etc.
The only disadvantage I can think of is that you may force the user to join several views to get the data in a way that is useful to them, as you now have largely static queries.
So, if the view was created one time and it is expected to never change, you may end up with a preponderance of views that creates a maze for the user to navigate through, so there should be some process to update views, to keep them useful as needs change.
1) When a table is dropped, the view will be affected.
2) If a column is renamed, the view will raise an "Invalid column name" exception.
3) When a view is created for a large table, it occupies some memory.
If you write complex views, even querying simple data from the view will take more time.
It affects performance. Querying from view takes more time than directly querying from the table.
If a view joins more than one table, you may not be able to perform any DML operations.
Table dependence: if you change the table, you need to update the view as well.
A view permits the DBA (database administrator) to tightly control what goes in and comes out of a database.
In banking a view is often used to permanently keep track of every change made to the table. The real table typically contains additional columns that are not seen by "the view" such as:
last-modified (when the last change was made)
last-action (update/delete/add)
last-actioner (person who updated the row)
So when displaying the view of the table only the latest update or add of any row is displayed. However the table still contains every existing change and row deletion.
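A minimal sketch of such a history table and its "latest row only" view is shown below, using SQLite via Python's sqlite3 module. The column names follow the list above (with underscores); everything else, including the account_history and account_current names, the version column, and the sample rows, is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Every change is stored as a new row; nothing is overwritten.
    CREATE TABLE account_history (
        version       INTEGER PRIMARY KEY AUTOINCREMENT,
        account_id    INTEGER NOT NULL,
        balance       INTEGER,
        last_modified TEXT NOT NULL,
        last_action   TEXT NOT NULL,   -- 'add', 'update' or 'delete'
        last_actioner TEXT NOT NULL
    );

    -- Made-up sample data.
    INSERT INTO account_history
        (account_id, balance, last_modified, last_action, last_actioner)
    VALUES
        (1, 100,  '2024-01-01', 'add',    'alice'),
        (1, 150,  '2024-01-02', 'update', 'bob'),
        (2, 50,   '2024-01-01', 'add',    'alice'),
        (2, NULL, '2024-01-03', 'delete', 'carol');

    -- The view shows only the newest version of each account
    -- and hides accounts whose newest version is a deletion.
    CREATE VIEW account_current AS
        SELECT h.account_id, h.balance
        FROM account_history h
        WHERE h.version = (SELECT MAX(version)
                           FROM account_history
                           WHERE account_id = h.account_id)
          AND h.last_action <> 'delete';
""")

current = conn.execute("SELECT account_id, balance FROM account_current").fetchall()
history_rows = conn.execute("SELECT COUNT(*) FROM account_history").fetchone()[0]
print(current, history_rows)
```

Readers of account_current see one live row per account, while the full audit trail stays in the base table.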
The major downside to a view is to the user of the table (the application programmer) who cannot directly change the underlying table (for performance reasons, for example). Additionally it does create more work for the database administrator. You might also consider the extra CPU burden placed upon the server - particularly if it is utilised by many clients.
I'm architecting a new app at the moment, with a high read:write ratio. At my current employer we have lots of denormalised data on our tables for performance reasons. Is it better practice to have totally 3NF tables and then use indexed views to do all the denormalisation? Should I run queries against the tables or views?
An example of the things I am interested in is aggregates of columns in child tables (e.g. having a user's post count stored somewhere).
In general it's a good idea to have denormalized views if you need to access across multiple normalized tables very frequently. In most cases it'll be a significant performance increase over using a join and querying directly against the tables, and it's usually not any less maintainable, since either your view or join can be written to be agnostic about changes to parts of the tables that it doesn't use.
Whether all your tables should be in the third normal form is another question. In most applications I've worked with the answer is most tables should be normalized this way, but there are exceptions. Whether to make an exception has to do with how the data is used, and whether you can be very confident about that use not changing in the future.
Having to go back and re-normalize later because you did something the wrong way can be costly, but over-normalizing data that should be straightforward to use and understand can make things more complicated and difficult to maintain than they need to be. Your mileage may vary.
If you are going to use views to present denormalized data to the user (and you're using SQL Server), you should check out the SCHEMABINDING clause. If a view is schemabound, you can index it, and the index will be updated when the underlying tables are updated. In this way, if the indexes are set up well, people who are looking for data can actually select from the index, so it won't need to rebuild the complex view for every query, but users will still see up-to-date data when the underlying tables change.