I'm trying to understand the difference between the two, and when it would be advisable to use one over the other.
A datamart is a whole database: generally like a simpler data warehouse in that it is usually the source for reporting or analysis. It is usually the end point of ETL processes pulling and aggregating data from multiple sources.
A materialised view is a stored query. It is 'materialised' in the sense that some aspect of it will be permanently stored, as opposed to an ordinary view which is evaluated on the fly. Often this is in order to apply indexes to a view: the view has to be schema bound to the underlying data, and updates to the underlying data will cause the materialised view's indexes to be updated so they're ready ahead of time before the view is called.
So really, the question of which to use doesn't make sense: they're completely different things.
If the problem is complex querying, then go for views.
If performance is the problem, then go for data marts.
Related
The BigQuery documentation describes each kind of view but doesn't provide particular use cases, when to use one and when to use the other, so the question here is what are some of the best use cases to use one over the other.
Views are generally used when data is to be accessed infrequently and data in tables get updated on a frequent basis.
Some Use Cases where you might benefit from using views are:
When you want to consult a table.
Views are commonly faster than materialized views.
Making use when creating a dashboard or an impression etiquette is highly recommended.
Materialized Views are used when data is to be accessed frequently and data in tables do not get updated on a frequent basis.
Some Use Cases where you might benefit from using materialized views are:
Pre-aggregate data. Aggregation of streaming data.
Pre-filter data. Run queries that only read a particular subset of the table.
Pre-join data. Query joins, especially between large and small tables.
Recluster data. Run queries that would benefit from a clustering scheme that differs from the base tables.
Note that these are just some Use Cases between views and materialized views.
I've just started joining Stored Procedures together using views and it seems a simple way of building up a short query using the results of others.
Are there any disadvantages to over relying on Views before I plough on with this method? Am I better to pursue the temporary table option?
The main differences are that a view only actually stores the query not the results (with the exception of materialised views) and views persist after the end of your session. Views are an excellent way of hiding complexity, but does not make the queries run more quickly than if you wrote out the whole thing in one query. Views also do not use up storage space (except for a very small amount for the metadata).
I would recommend using views if you do not have any requirements to speed the queries up further or if you need to be able to reference the data without recreating it subsequent sessions.
Temporary tables do store the result, but just for the current session, so if you need a base query to speed up further queries for the duration of your session, this can be useful.
In fact, views are mostly used for security reasons, and they also make queries more simple (for some cases.) So it just depends on what you are doing, based on if it requires storing and other requirements.
Recently started working with a database in which the convention is to create a view for every table. If you assume that there is a one to one mapping between tables and views, I was wondering if anyone could tell me the performance impacts of doing something like this. BTW, this is on Oracle.
Assuming the question is about non-materialized views -- Really depends on the query that the view is based on, and what is being done to it. Sometimes, the predicates can be pushed into the view query by the optimizer. If not, then it wouldn't be as good as against the table itself. Views are built on top of tables -- why would you expect that the performance would be better?
Layering views, where you build one view on top of another, is a bad practice because you won't know about issues until run time. It's also less of a chance that predicate pushing will occur with layered views.
Views can also be updateable -- they aren't a reliable means to restricting access to resources if someone has INSERT/UPDATE/DELETE privileges on the underlying tables.
Materialized views are as good as tables, but are notoriously restrictive in what they support.
You don't explain what you're doing in the views? A 1:1 with the tables sounds like you are using the views more like synonyms than a view. IOW, are the views = "SELECT * FROM table", then you'll see no performance hit except on hard parse.
If you are joining to other tables or placing filter clauses in them which prevent predicate pushing than you're bound to see a major hit sometime.
The only pain I have had with views is a distributed query over a DB link. The local optimizer gets some details about the remote object, but the view doesn't tell it about any indexes so you can get some kooky plans.
I've heard about some places that use it as a standard since they can easily 're-order' the columns in a view. Not a big benefit in my opinion by YMMV
When should a View actually be used over an actual Table? What gains should I expect this to produce?
Overall, what are the advantages of using a view over a table? Shouldn't I design the table in the way the view should look like in the first place?
Oh there are many differences you will need to consider
Views for selection:
Views provide abstraction over tables. You can add/remove fields easily in a view without modifying your underlying schema
Views can model complex joins easily.
Views can hide database-specific stuff from you. E.g. if you need to do some checks using Oracles SYS_CONTEXT function or many other things
You can easily manage your GRANTS directly on views, rather than the actual tables. It's easier to manage if you know a certain user may only access a view.
Views can help you with backwards compatibility. You can change the underlying schema, but the views can hide those facts from a certain client.
Views for insertion/updates:
You can handle security issues with views by using such functionality as Oracle's "WITH CHECK OPTION" clause directly in the view
Drawbacks
You lose information about relations (primary keys, foreign keys)
It's not obvious whether you will be able to insert/update a view, because the view hides its underlying joins from you
Views can:
Simplify a complex table structure
Simplify your security model by allowing you to filter sensitive data and assign permissions in a simpler fashion
Allow you to change the logic and behavior without changing the output structure (the output remains the same but the underlying SELECT could change significantly)
Increase performance (Sql Server Indexed Views)
Offer specific query optimization with the view that might be difficult to glean otherwise
And you should not design tables to match views. Your base model should concern itself with efficient storage and retrieval of the data. Views are partly a tool that mitigates the complexities that arise from an efficient, normalized model by allowing you to abstract that complexity.
Also, asking "what are the advantages of using a view over a table? " is not a great comparison. You can't go without tables, but you can do without views. They each exist for a very different reason. Tables are the concrete model and Views are an abstracted, well, View.
Views are acceptable when you need to ensure that complex logic is followed every time. For instance, we have a view that creates the raw data needed for all financial reporting. By having all reports use this view, everyone is working from the same data set, rather than one report using one set of joins and another forgetting to use one which gives different results.
Views are acceptable when you want to restrict users to a particular subset of data. For instance, if you do not delete records but only mark the current one as active and the older versions as inactive, you want a view to use to select only the active records. This prevents people from forgetting to put the where clause in the query and getting bad results.
Views can be used to ensure that users only have access to a set of records - for instance, a view of the tables for a particular client and no security rights on the tables can mean that the users for that client can only ever see the data for that client.
Views are very helpful when refactoring databases.
Views are not acceptable when you use views to call views which can result in horrible performance (at least in SQL Server). We almost lost a multimillion dollar client because someone chose to abstract the database that way and performance was horrendous and timeouts frequent. We had to pay for the fix too, not the client, as the performance issue was completely our fault. When views call views, they have to completely generate the underlying view. I have seen this where the view called a view which called a view and so many millions of records were generated in order to see the three the user ultimately needed. I remember one of these views took 8 minutes to do a simple count(*) of the records. Views calling views are an extremely poor idea.
Views are often a bad idea to use to update records as usually you can only update fields from the same table (again this is SQL Server, other databases may vary). If that's the case, it makes more sense to directly update the tables anyway so that you know which fields are available.
According to Wikipedia,
Views can provide many advantages over tables:
Views can represent a subset of the data contained in a table.
Views can limit the degree of exposure of the underlying tables to the outer world: a given user may have permission to query the view, while denied access to the rest of the base table.
Views can join and simplify multiple tables into a single virtual table.
Views can act as aggregated tables, where the database engine aggregates data (sum, average, etc.) and presents the calculated results as part of the data.
Views can hide the complexity of data. For example, a view could appear as Sales2000 or Sales2001, transparently partitioning the actual underlying table.
Views take very little space to store; the database contains only the definition of a view, not a copy of all the data that it presents.
Views can provide extra security, depending on the SQL engine used.
A common practice is to hide joins in a view to present the user a more denormalized data model. Other uses involve security (for example by hiding certain columns and/or rows) or performance (in case of materialized views)
Views are handy when you need to select from several tables, or just to get a subset of a table.
You should design your tables in such a way that your database is well normalized (minimum duplication). This can make querying somewhat difficult.
Views are a bit of separation, allowing you to view the data in the tables differently than they are stored.
You should design your table WITHOUT considering the views.
Apart from saving joins and conditions, Views do have a performance advantage: SQL Server may calculate and save its execution plan in the view, and therefore make it faster than "on the fly" SQL statements.
View may also ease your work regarding user access at field level.
First of all as the name suggests a view is immutable. thats because a view is nothing other than a virtual table created from a stored query in the DB.
Because of this you have some characteristics of views:
you can show only a subset of the data
you can join multiple tables into a single view
you can aggregate data in a view (select count)
view dont actually hold data, they dont need any tablespace since they are virtual aggregations of underlying tables
so there are a gazillion of use cases for which views are better fitted than tables, just think about only displaying active users on a website. a view would be better because you operate only on a subset of the data which actually is in your DB (active and inactive users)
check out this article
hope this helped..
I'm architecting a new app at the moment, with a high read:write ratio. At my current employer we have lots of denormalised data on our tables for performance reasons. Is it better practice to have totally 3NF tables and then use indexed views to do all the denormalisation? Should I run queries against the tables or views?
An example of some of the things I am interested are aggregates of columns child tables (e.g. having user post count stored somewhere).
In general it's a good idea to have denormalized views if you need to access across multiple normalized tables very frequently. In most cases it'll be a significant performance increase over using a join and querying directly against the tables, and it's usually not any less maintainable, since either your view or join can be written to be agnostic about changes to parts of the tables that it doesn't use.
Whether all your tables should be in the third normal form is another question. In most applications I've worked with the answer is most tables should be normalized this way, but there are exceptions. Whether to make an exception has to do with how the data is used, and whether you can be very confident about that use not changing in the future.
Having to go back and re-normalize later because you did something the wrong way can be costly, but over-normalizing data that should be straightforward to use and understand can make things more complicated and difficult to maintain than they need to be. Your mileage may vary.
If you are going to use views to present denormalized data to the user (and you're using SQL Server), you should check out the SCHEMABINDING clause. If a view is schemabound, you can index it, and the index will be updated when the underlying tables are updated. In this way, if the indexes are set up well, people who are looking for data can actually select from the index, so it won't need to rebuild the complex view for every query, but users will still see up-to-date date when the underlying tables change.