Scenario:
I have 3 tables that need to be joined together, a WHERE clause to limit the result set, and only a few columns being selected from each table. Simple. However, the query to do this isn't very pretty, and when using an ORM between the database and the application, it's like trying to put a square peg into a round hole.
My way around this is to create a view that wraps the query, so my application model now maps directly to a view in the database; no more crazy mapping in the ORM layer.
Question:
Assuming no other factors come into play here, will the query against the view incur any additional performance penalties that I wouldn't have hit if I executed the SQL statement directly? - This is not an indexed view, assume the same where clause, keep this simple.
I am being led to believe that a view suffers from extra overhead of "being built". My understanding is that with all else the same, the two should have identical performance.
Please clarify. Thanks!
From MSDN:
View resolution
When an SQL statement references a nonindexed view, the parser and query optimizer analyze the source of both the SQL statement and the view and then resolve them into a single execution plan. There is not one plan for the SQL statement and a separate plan for the view.
There should not be any difference in performance. Views help you organize your queries; they are not a performance enhancement, unless you are using indexed views.
Only the definition of a nonindexed view is stored, not the rows of the view. The query optimizer incorporates the logic from the view definition into the execution plan it builds for the SQL statement that references the nonindexed view.
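As a minimal sketch of what that means in practice, the T-SQL below defines a view over three tables and then runs the same filter once through the view and once as a direct query. The schema (dbo.Customers, dbo.Orders, dbo.OrderLines and their columns) is purely hypothetical and not taken from the question. Comparing the two execution plans in Management Studio should show the same plan shape, since the view definition is folded into the referencing statement before optimization.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE dbo.Customers (
    CustomerId INT PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL
);
CREATE TABLE dbo.Orders (
    OrderId    INT PRIMARY KEY,
    CustomerId INT NOT NULL REFERENCES dbo.Customers (CustomerId),
    OrderDate  DATE NOT NULL
);
CREATE TABLE dbo.OrderLines (
    OrderLineId INT PRIMARY KEY,
    OrderId     INT NOT NULL REFERENCES dbo.Orders (OrderId),
    Amount      DECIMAL(10, 2) NOT NULL
);
GO

-- The view only stores this definition, not any rows.
CREATE VIEW dbo.vwCustomerOrderLines AS
SELECT c.Name, o.OrderDate, l.Amount
FROM dbo.Customers AS c
JOIN dbo.Orders AS o ON o.CustomerId = c.CustomerId
JOIN dbo.OrderLines AS l ON l.OrderId = o.OrderId;
GO

-- Query through the view.
SELECT Name, OrderDate, Amount
FROM dbo.vwCustomerOrderLines
WHERE OrderDate >= '20240101';

-- The same query written directly against the tables.
SELECT c.Name, o.OrderDate, l.Amount
FROM dbo.Customers AS c
JOIN dbo.Orders AS o ON o.CustomerId = c.CustomerId
JOIN dbo.OrderLines AS l ON l.OrderId = o.OrderId
WHERE o.OrderDate >= '20240101';
```

GO is the batch separator used by Management Studio and sqlcmd; it is needed here because CREATE VIEW has to be the first statement in its batch.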
In Oracle, the performance is the same. A view is really just a named SQL statement, but fancier.
When you start nesting views, and joining views with other tables or views, things get complicated real quick. If Oracle can't push your filters down through the view to the table, it often has to materialize (build a temp table of) parts of the query, and this is when you get the bad performance.
I am using views for query convenience. The view is a join between three tables, using INNER JOIN and RIGHT OUTER JOIN. The overall result set from the view could be 500,000 records. I then perform other queries off of this view, similar to:
SELECT colA, colB, colC FROM vwMyView WHERE colD = 'ABC'
This query might return only 30 or so results. How will this be for performance? Internally in the SQL engine will the view always be executed, then the WHERE clause applied after, or is SQL Server smart enough to apply the WHERE clause first so that the JOIN operations are only done on a subset of records?
If I'm only returning 30 records to the middle tier, do I need to worry too much that the SQL Server had to trawl through 500,000 records to get to those 30 records? I have indexes applied on all important columns on the base tables.
I am using MS SQL Server, and the view is not materialized.
Usually, a view is treated in much the same way as a macro might be in other languages - the body of the view is "expanded out" into the query it's a part of, before the query is optimized. So your concern about it first computing all 500,000 results first is unfounded.
The exception to the above is an indexed view (SQL Server; the query has to use the NOEXPAND hint, or you have to be running an edition such as Enterprise where the optimizer matches indexed views automatically) or a materialized view (Oracle; not sure of the requirements). In those cases the view isn't expanded out, because the results have already been computed beforehand and are stored much like a real table's rows. So again, there shouldn't be too much concern when actually querying.
Without a materialized view, the SQL behind your view will always be executed when the view is used, e.g. inside the FROM clause. Of course, some caching may be possible, but this depends on your DBMS and your configuration.
To see what the database is doing in the background, you might like to start by using EXPLAIN ANALYZE <your query>.
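EXPLAIN ANALYZE is PostgreSQL-style syntax. Since the question is about SQL Server, a rough equivalent there is to switch on I/O statistics, or to ask for the plan as text, and then run the query. This is only a sketch using the view and column names from the question:

```sql
-- Report logical reads per base table touched by the query.
SET STATISTICS IO ON;
GO
SELECT colA, colB, colC FROM vwMyView WHERE colD = 'ABC';
GO
SET STATISTICS IO OFF;
GO

-- Alternatively, return the estimated plan as text without executing the query.
-- SET SHOWPLAN_TEXT has to be the only statement in its batch.
SET SHOWPLAN_TEXT ON;
GO
SELECT colA, colB, colC FROM vwMyView WHERE colD = 'ABC';
GO
SET SHOWPLAN_TEXT OFF;
GO
```

If the plan shows index seeks on the base tables rather than full scans, the filter on colD is being applied before the joins, not after the whole view has been computed.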
Performance of queries on large datasets typically needs clever application of indexes. In your case a simple index on colD will probably do the trick. Depending on the data, different types of indexes might need scrutiny. Hash indexes, B-trees etc. all behave differently depending on the data, so there is no one solution that rules them all here. Otherwise, optimization is better left to the query optimizer in your RDBMS. The developers there spend quite some time optimizing, and the critical segments are probably written in fast low-level code.
On another note, clever cleaning of the data might be considered as well. And if aggregation is required, consider data warehousing with clever dimensions and pre-aggregated values. Storage is cheap these days; computing time maybe not so.
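For the "simple index on colD" suggestion, a sketch in T-SQL could look like the following. The table dbo.BaseTable and its other columns are hypothetical; only colD comes from the question, and the index has to be created on whichever base table actually holds that column, not on the view.

```sql
-- Hypothetical base table behind the view; only colD is from the question.
CREATE TABLE dbo.BaseTable (
    Id   INT IDENTITY(1, 1) PRIMARY KEY,
    colA INT NOT NULL,
    colD VARCHAR(10) NOT NULL
);
GO

-- A plain nonclustered (B-tree) index on the filter column.
CREATE NONCLUSTERED INDEX IX_BaseTable_colD
    ON dbo.BaseTable (colD);
GO
```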
Hi, we have a stored procedure, scheduled to run daily, that fetches records from a table holding a huge amount of data after filtering. My question is: if I create a view on the table and fetch the data from the view, will this be a faster or a slower process?
With a standard view, it shouldn't make any difference, as the inner SQL just gets expanded out into the query. Note that the same applies to inline table-valued user-defined functions (think "parameterised view").
However, if you make it an indexed view, then you could see a performance improvement.
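A minimal sketch of what turning a view into an indexed view involves on SQL Server is below. The tables, columns, and the literal 42 are hypothetical. The parts that matter are WITH SCHEMABINDING, two-part object names, COUNT_BIG(*) alongside the GROUP BY, and the unique clustered index, which is what actually materializes the rows. It also assumes the session SET options required for indexed views (such as ANSI_NULLS and QUOTED_IDENTIFIER ON) are in effect.

```sql
-- Hypothetical base tables, for illustration only.
CREATE TABLE dbo.Orders (
    OrderId    INT PRIMARY KEY,
    CustomerId INT NOT NULL
);
CREATE TABLE dbo.OrderLines (
    OrderLineId INT PRIMARY KEY,
    OrderId     INT NOT NULL REFERENCES dbo.Orders (OrderId),
    Amount      DECIMAL(10, 2) NOT NULL  -- SUM in an indexed view needs a non-nullable expression
);
GO

-- Schema-bound view; required before it can be indexed.
CREATE VIEW dbo.vwOrderTotals
WITH SCHEMABINDING
AS
SELECT o.CustomerId,
       COUNT_BIG(*)  AS LineCount,    -- mandatory when the view uses GROUP BY
       SUM(l.Amount) AS TotalAmount
FROM dbo.Orders AS o
JOIN dbo.OrderLines AS l ON l.OrderId = o.OrderId
GROUP BY o.CustomerId;
GO

-- The unique clustered index stores the view's result set on disk.
CREATE UNIQUE CLUSTERED INDEX IX_vwOrderTotals
    ON dbo.vwOrderTotals (CustomerId);
GO

-- NOEXPAND asks the optimizer to read the stored rows instead of expanding the view.
SELECT CustomerId, TotalAmount
FROM dbo.vwOrderTotals WITH (NOEXPAND)
WHERE CustomerId = 42;
```

The trade-off is that every insert, update, or delete on the base tables now also has to maintain the stored view rows.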
Just remember, a view is nothing but a select statement (indexed views are different). If you have:
SELECT * FROM TABLE
And that is in a procedure, if you put the same thing in a view and then did:
SELECT * FROM VIEW
Within a procedure, there is absolutely no difference between these two. But, if things get more complicated so that you're joining against a lot of tables, then it really depends on how they're being accessed.
For example, if you create a view that accesses 6 tables and then you write a query that only needs to pull data from 3 of those tables, you may benefit from a process called simplification that takes place within the optimization process and you'll see a plan that only references 3 tables. However, you might not. If not, then a query that you would write against the 3 tables will generally run faster than a plan against a view that accesses more than 3 tables.
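Here is a small sketch of the simplification described above, with a hypothetical two-table schema. Because Orders.CustomerId is NOT NULL and has a trusted foreign key to Customers, the inner join cannot add or remove rows, so when a query asks only for Orders columns the optimizer may drop Customers from the plan. Whether it actually does so should be confirmed in the execution plan.

```sql
-- Hypothetical schema, for illustration only.
CREATE TABLE dbo.Customers (
    CustomerId INT PRIMARY KEY,
    Name       NVARCHAR(100) NOT NULL
);
CREATE TABLE dbo.Orders (
    OrderId    INT PRIMARY KEY,
    CustomerId INT NOT NULL
        CONSTRAINT FK_Orders_Customers REFERENCES dbo.Customers (CustomerId),
    OrderDate  DATE NOT NULL
);
GO

CREATE VIEW dbo.vwOrdersWithCustomer AS
SELECT o.OrderId, o.OrderDate, c.Name
FROM dbo.Orders AS o
JOIN dbo.Customers AS c ON c.CustomerId = o.CustomerId;
GO

-- Only Orders columns are requested, so the join to Customers is a candidate for removal.
SELECT OrderId, OrderDate
FROM dbo.vwOrdersWithCustomer
WHERE OrderDate >= '20240101';

-- Requesting c.Name forces the join back into the plan.
SELECT OrderId, OrderDate, Name
FROM dbo.vwOrdersWithCustomer
WHERE OrderDate >= '20240101';
```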
If you start nesting views, having views that call views or join to views, then you may see a very serious performance degradation.
In general, if you're working with stored procedures, I would suggest you just write your queries against the tables directly. It won't hurt performance at all, and it could help you avoid issues with nested views and plan simplification.
Just complementing @AdaTheDev's answer:
the same applies with table-valued user defined functions
That's true for inline table-valued functions, but not 100% true for multi-statement table-valued functions. This second type of function will use a lot more resources (memory) than the first one.
About the indexed view: it can help, but bear in mind that it can drastically increase your storage space.
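To make the distinction concrete, here is a sketch of the same hypothetical lookup written both ways (dbo.Orders and the function names are made up for illustration). The inline version is a single RETURN (SELECT ...), which can be expanded into the calling query much like a view. The multi-statement version fills a table variable first, and the caller then reads from that table variable.

```sql
-- Hypothetical base table, for illustration only.
CREATE TABLE dbo.Orders (
    OrderId    INT PRIMARY KEY,
    CustomerId INT NOT NULL,
    OrderDate  DATE NOT NULL
);
GO

-- Inline table-valued function: one SELECT, expanded like a parameterised view.
CREATE FUNCTION dbo.fnOrdersInline (@CustomerId INT)
RETURNS TABLE
AS
RETURN (
    SELECT OrderId, OrderDate
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId
);
GO

-- Multi-statement table-valued function: rows are copied into @result first.
CREATE FUNCTION dbo.fnOrdersMulti (@CustomerId INT)
RETURNS @result TABLE (OrderId INT, OrderDate DATE)
AS
BEGIN
    INSERT INTO @result (OrderId, OrderDate)
    SELECT OrderId, OrderDate
    FROM dbo.Orders
    WHERE CustomerId = @CustomerId;
    RETURN;
END;
GO
```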
I'm wondering if this is a bad practice or if in general this is the correct approach.
Let's say that I've created a view that combines a few attributes from a few tables.
My question, what do I need to do so I can query against this view as if it were a table without worrying about performance?
All attributes in the original tables are indexed, my concern is that the result view will have hundreds of thousands of records, which I will want to narrow down quite a bit based on user input.
What I'd like to avoid, is having multiple versions of the code that generates this view floating around with a few extra "where" conditions to facilitate the user input filtering.
For example, assume my view has this header VIEW(Name, Type, DateEntered); this may have 100,000+ rows (possibly millions). I'd like to be able to make this view in SQL Server, and then in my application write queries like this:
SELECT Name, Type, DateEntered FROM MyView WHERE DateEntered BETWEEN @date1 AND @date2;
Basically, I am denormalizing my data for a series of reports that need to be run, and I'd like to centralize where I pull the data from, maybe I'm not looking at this problem from the right angle though, so I'm open to alternative ways to attack this.
My question, what do I need to do so I can query against this view as if it were a table without worrying about performance?
SQL Server is very good at view unnesting.
Your queries will be as efficient as if the view's query were used in the query itself.
This means that
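-- CREATE VIEW has to be the only statement in its batch, hence the GO
CREATE VIEW myview AS
SELECT *
FROM /* complex joins */
GO
-- Join a table to the view
SELECT *
FROM mytable
JOIN myview
ON …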
and
SELECT *
FROM mytable
JOIN (
SELECT *
FROM /* complex joins */
) myview
ON …
will have the same performance.
SQL Server 2005 has indexed views - these provide indexes on views. That should help with performance. If the underlying tables already have good indexes on the queried fields, these will be used - you should only add indexed views when this is not the case.
These are known in other database systems as materialized views.
The view will make use of the index in your WHERE clause to filter the results.
Views aren't stored result sets. They're stored queries, so you'll have the performance gained from your indexes each time you query the view.
Why would it perform badly? I mean, you can think of a view as a compiled select statement. It makes use of existing indexes on the underlying tables, even when you add extra WHERE clauses. In my opinion it is a good approach. In any case it's better than having virtually the same select statement scattered all over your application (from a design and maintainability point of view at least).
If not indexed then...
When you query a view, the view itself is effectively ignored: its definition is expanded into the main query.
It is the same as querying the main tables directly.
What will kill you is view on top of view on top of view, in my experience.
It should, in general, perform no worse than the inline code.
Note that it is possible to make views which hide very complex processing (joins, pivots, stacked CTEs, etc), and you may never want anyone to be able to SELECT * FROM view on such a view for all time or all products or whatever. If you have standard filter criteria, you can use an inline table-valued function (effectively a parameterized view), which would require all users to supply the expected parameters.
In your case, for instance, they would always have to do:
SELECT Name, Type, DateEntered
FROM MyITVF(@date1, @date2);
To share the view logic between multiple ITVFs, you can build many inline table-valued functions on top of the view, but not give access to the underlying tables or views to users.
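A sketch of how such a function might be defined is below. MyView, its three columns, and the MyITVF name come from the thread; the base table dbo.Entries, the column types, and the ReportUsers role are assumptions added only so the example is complete.

```sql
-- Hypothetical stand-in for whatever MyView really selects from.
CREATE TABLE dbo.Entries (
    Name        NVARCHAR(100) NOT NULL,
    Type        NVARCHAR(50)  NOT NULL,
    DateEntered DATETIME      NOT NULL
);
GO
CREATE VIEW dbo.MyView AS
SELECT Name, Type, DateEntered
FROM dbo.Entries;
GO

-- Inline table-valued function: callers must always supply a date range.
CREATE FUNCTION dbo.MyITVF (@date1 DATETIME, @date2 DATETIME)
RETURNS TABLE
AS
RETURN (
    SELECT Name, Type, DateEntered
    FROM dbo.MyView
    WHERE DateEntered BETWEEN @date1 AND @date2
);
GO

-- Expose only the function; ReportUsers is a hypothetical database role
-- that gets no direct permissions on dbo.MyView or dbo.Entries.
CREATE ROLE ReportUsers;
GO
GRANT SELECT ON dbo.MyITVF TO ReportUsers;
GO
```

Because the function is inline, it should be expanded into the calling query in the same way the view is, so the date filter can still use an index on DateEntered in the base table.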
The MySQL certification guide suggests that views can be used for:
creating a summary that may involve calculations
selecting a set of rows with a WHERE clause, hide irrelevant information
result of a join or union
allow for changes made to base table via a view that preserve the schema of original table to accommodate other applications
but from the question "how to implement search for 2 different table data?":
"And maybe you're right that it doesn't work since mysql views are not good friends with indexing. But still. Is there anything to search for in the shops table?"
I learned that views don't work well with indexing. So, will there be a big performance hit for the convenience a view may provide?
A view can be simply thought of as a SQL query stored permanently on the server. Whatever indices the query optimizes to will be used. In that sense, there is no difference between the SQL query or a view. It does not affect performance any more negatively than the actual SQL query. If anything, since it is stored on the server, and does not need to be evaluated at run time, it is actually faster.
It does afford you these additional advantages
reusability
a single source for optimization
This MySQL forum thread about indexing views gives a lot of insight into what MySQL views actually are.
Some key points:
A view is really nothing more than a stored select statement
The data of a view is the data of tables referenced by the View.
creating an index on a view will not work as of the current version
If merge algorithm is used, then indexes of underlying tables will be used.
The underlying indices are not visible, however. DESCRIBE on a view will show no indexed columns.
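The last two points can be checked with a small MySQL sketch. The shops table is borrowed from the quoted question, but its columns, the index name, and the view name are assumptions. With the merge algorithm, EXPLAIN on a query against the view should list the base table shops with idx_city as the chosen key, while DESCRIBE on the view shows an empty Key column.

```sql
-- Hypothetical table; only the name "shops" comes from the thread.
CREATE TABLE shops (
    id   INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL,
    city VARCHAR(100) NOT NULL,
    INDEX idx_city (city)
);

-- Ask explicitly for the MERGE algorithm, so the view text is merged
-- into the statement that references it.
CREATE ALGORITHM = MERGE VIEW v_shops AS
SELECT id, name, city
FROM shops;

-- The plan should reference shops and its index, not a temporary table.
EXPLAIN SELECT name FROM v_shops WHERE city = 'Berlin';

-- The view itself reports no indexed columns.
DESCRIBE v_shops;
```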
MySQL views, according to the official MySQL documentation, are stored queries that when invoked produce a result set.
A database view is nothing but a virtual or logical table (commonly consisting of a SELECT query with joins). Because a database view, like a database table, consists of rows and columns, you can query data against it.
Views should be used when:
Simplifying complex queries (like IF ELSE and JOIN or working with triggers and such)
Adding an extra layer of security and limiting or restricting data access (since views are merely virtual tables, they can be made read-only for a specific set of DB users, restricting INSERT)
Backward compatibility and query reusability
Working with computed columns. Computed columns should NOT be on DB tables, because that would make for a bad schema design.
Views should not be used when:
the associated table(s) is/are tentative or subject to frequent structure changes.
According to http://www.mysqltutorial.org/introduction-sql-views.aspx
A database table should not have calculated columns; however, a database view should.
I tend to use a view when I need to calculate totals, counts etc.
Hope that helps!
One more downside of views is that they don't work well with MySQL replication, and they can leave the master and the slave a bit out of step with each other.
http://bugs.mysql.com/bug.php?id=30998
I have always hoped and assumed that it is not - that set theory (or something) provides a shortcut to the result.
I have created a non-updateable view that aggregates data from several tables, in a way that produces an exponential number of records. From this view, I query one record at a time. Because the underlying dataset is small, this technique works well - but I'm concerned it won't scale.
I've heard MySQL uses temporary tables to implement views. My heart lurches at the thought of potentially massive temp tables popping into and out of existence for each and every query.
Use the EXPLAIN <select query> syntax to see what really happens within your query.
Generally speaking, using a view is equivalent to using subquery with the same SQL. No better and no worse, just a shortcut to prevent writing the same subquery over and over again.
Sometimes you'll end up with temporary tables being used to resolve some complex queries, but it shouldn't happen often if the DB structure is well optimized, and using views instead of subqueries won't change anything.
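As a sketch of the temporary-table case in MySQL: a view that aggregates cannot use the merge algorithm, so MySQL falls back to materializing it. The orders table, its columns, and the literal 42 are hypothetical. Running EXPLAIN on a query against such a view will typically show a DERIVED row for the view body, and the equivalent subquery gets the same treatment, which is the point made above.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE orders (
    id          INT PRIMARY KEY,
    customer_id INT NOT NULL,
    amount      DECIMAL(10, 2) NOT NULL,
    INDEX idx_customer (customer_id)
);

-- GROUP BY / SUM() rule out the MERGE algorithm for this view.
CREATE VIEW v_order_totals AS
SELECT customer_id, SUM(amount) AS total
FROM orders
GROUP BY customer_id;

-- Query through the view.
EXPLAIN SELECT total FROM v_order_totals WHERE customer_id = 42;

-- The same thing as a subquery; the plan should look the same.
EXPLAIN SELECT total
FROM (
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
) AS t
WHERE customer_id = 42;
```

Whether the outer filter on customer_id is pushed into the materialized part depends on the MySQL version, so it is worth checking the EXPLAIN output on the server you actually run.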