How do MySQL views work? - sql

When I create a view I am basically making a new table that will automatically be transacted upon when data in one of the tables it joins changes; is that correct?
Also why can't I use subqueries in my view?

A view works like a table, but it is not a table. It never exists; it is only a prepared SQL statement that is run when you reference the view name. IE:
CREATE VIEW foo AS
SELECT * FROM bar
SELECT * FROM foo
...is equivalent to running:
SELECT x.*
FROM (SELECT * FROM bar) x
A MySQLDump will never contain rows to be inserted into a view...
Also why can't I use subqueries in my view????
That, sadly, is by (albeit questionable) design. There's numerous limitations for MySQL views, which are documented: http://dev.mysql.com/doc/refman/5.0/en/create-view.html
So if it's just an imaginary table/prepared statement does that mean it theoretically has the same performance (or even less) as a normal table/query?
No.
A table can have indexes associated, which can make data retrieval faster (at some cost for insert/update). Some databases support "materialized" views, which are views that can have indexes applied to them - which shouldn't be a surprise that MySQL doesn't support given the limited view functionality (which only began in v5 IIRC, very late to the game).
Because a view is a derived table, the performance of the view is only as good as the query it is built on. If that query sucks, the performance issue will just snowball... That said, when querying a view - if a view column reference in the WHERE clause is not wrapped in a function (IE: WHERE v.column LIKE ..., not WHERE LOWER(t.column) LIKE ...), the optimizer may push the criteria (called a predicate) onto the original query - making it faster.

I ran into the same problem also (to my surprise, because my search seems to indicate that Oracle and MS do support it).
I get around this limitation (at least for now, until proven non-usable) by creating two additional views for my final view.
Example:
CREATE VIEW Foo1 AS
SELECT * FROM t ORDER BY ID, InsertDate DESC
CREATE VIEW Foo2 AS
SELECT * FROM Foo1 GROUP BY ID
CREATE VIEW Foo AS
SELECT * FROM Foo2 ORDER BY ID
The example above basically has a table 't' which is a temporal table containing all the revisions. My 'Foo' (view) basically is a simple view of only my most current revisions of each record. Seems to work alright for now!
Update:
I don't know if this is another bug in MySQL 5.1, but the above example doesn't in fact work! The 'Foo1' works as expected, but the 'Foo2' seems to ignore the order prior to grouping so my end result is not what is intended. I even get the same result if I change the 'DESC' for 'ASC' (surprisingly).
Also, if you read the 17.5.1. View Syntax section, it clearly states:
"A view can be created from many kinds of SELECT statements. It can refer to base tables or other views. It can use joins, UNION, and subqueries."
I'm going to update my database to 5.6 and try it again!

The difference is :
for view you can only have subqueries in the where - part, not in the from - part so a
CREATE VIEW v AS SELECT * FROM foo WHERE id IN (SELECT id FROM bar)
would work - but at the same time you get a read-only view ... A simple view on a single table would allow to update "through" the view to the underlying table

Related

What exactly is a sql "view"

I suppose the definition might be different for different databases (I've tagged a few databases in the question), but suppose I have the following (in pseudocode):
CREATE VIEW myview FROM
SELECT * FROM mytable GROUP BY name
And then I can query the view like so:
SELECT * FROM myview WHERE name like 'bob%'
What exactly is the "view" doing in this case? Is it just a short-hand and the same as doing:
SELECT * FROM (
SELECT * FROM mytable GROUP BY name
) myview WHERE name like 'bob%'
Or does creating a view reserve storage (or memory, indexes, whatever else)? In other words, what are the internals of what happens when a view is created and accessed?
A view is a name that refers to a stored SQL query. When referenced, the definition of the query are replaced in the referencing query. It is basically the short-hand that you describe.
A view is defined by the standard and is pretty much the same thing across all databases.
A view does not permanently store data. Each time it is referenced the code is run. One caveat is that -- in some databases -- the view may be pre-compiled, so the pre-compiled code is actually included in the query plan.
By contrast, some databases support materialized views. These are very different beasts and they do store data.
Some other reasons for views:
Not everyone is a SQL expert so the Data Base Administrator might develop views consisting of complex joins on multiple tables to provide users easy access to the data they might need to access but might not know how to best do that.
On some databases you can also create read-only views. Again, a DBA might create these to limit what operations a user can perform on certain tables.
A DBA might also create a view to limit what columns of a table a user can see.

SQL Server, VIEW has mixed up data

My team and I are using Microsoft SQL Server 2014 - Standard Edition (64-bit)
We have created some views as we usually do, they work(ed) normally while we were coding and testing.
Then suddenly, one of our QA noticed that the data in the application was mixed up, for example, description data was in the name field, name was where sex was supposed to be, etc.
We cheched data in the Tables and it was correct, then we checked the view, querying the view like this SELECT * FROM VIEW and realized that the view had the data mixed up.. The next logic step was to check the view queries, for our surprise all the queries were correct. so what was happening?
Well, that is the question, why the data in a view is corrupt or mixed up if the queries within the view are correct and they were working well for long time?
We just ALTERED the view, not modifying anything and that fixed the issue.
But, we need to know the cause of the data corruption, because we don't want to monitor and alter views all the time.
VIEW CODE AS REQUESTED
ALTER VIEW [dbo].[pvvClient] AS
SELECT *
FROM Table
INNER JOIN Table 2 ON.....
The first thing that came to my mind was that the Table(s) has changed and that raised this behavior, do you think SCHEMABINDING can help to avoid this kind of issues
When you put * in the column list of a view and the underlying tables change your view will not automatically update to include the changed columns. In fact, if you delete a column you can get the data mixed up across columns. This has been discussed and documented many times. Aaron Bertrand has a great article covering this topic.
Bad habits to kick : using SELECT * / omitting the column list
Moral of the story, avoid using select * unless the select is inside an EXISTS.
It doesn't make sense that simply altering the view without changing any of the code would fix it, but an important point is this part of your view...
select pvtConsumidorFinanciero.*
If this table definition changes... that is, if more columns are added or some are removed, the columns in this view would also change. That is why it is good practice to never select * in a view, especially when querying another view.
Additionally, this table could have the same column names as other tables.
What also could have happened is in your application, you are select * from view. Again, if a DBA changed the view, this could mess up your application, so i would avoid it an explicitly list the columns you want returned in the order you want them returned.
I think this is is also part of the answer:
When you create views it is a good practice to use SCHEMABINDING, this way when you alter the table under the view, you are forced to review your view as well.

use of views to protect the actual tables in sql

how do views act as a mediator between the actual tables and an end user ? what's the internal process which occurs when a view is created. i mean that when a view is created on a table, then does it stands like a wall between the table and the end user or else? how do views protect the actual tables, only with the check option? but if a user inserts directly into the table then how come do i protect the actual tables?
if he/she does not use : insert into **vw** values(), but uses: insert into **table_name** values() , then how is the table protected now?
Non-materialized views are just prepackaged SQL queries. They execute the same as any derived table/inline view. Multiple references to the same view will run the query the view contains for every reference. IE:
CREATE VIEW vw_example AS
SELECT id, column, date_column FROM ITEMS
SELECT x.*, y.*
FROM vw_example x
JOIN vw_example y ON y.id = x.id
...translates into being:
SELECT x.*, y.*
FROM (SELECT id, column, date_column FROM ITEMS) x
JOIN (SELECT id, column, date_column FROM ITEMS) y ON y.id = x.id
Caching
The primary benefit is caching because the query will be identical. Queries are cached, including the execution plan, in order to make the query run faster later on because the execution plan has been generated already. Caching often requires queries to be identical to the point of case sensitivity, and expires eventually.
Predicate Pushing
Another potential benefit is that views often allow "predicate pushing", where criteria specified on the view can be pushed into the query the view represents by the optimizer. This means that the query could scan the table once, rather than scan the table in order to present the resultset to the outer/ultimate query.
SELECT x.*
FROM vw_example x
WHERE x.column = 'y'
...could be interpreted by the optimizer as:
SELECT id, column, date_column
FROM ITEMS
WHERE x.column = 'y'
The decision for predicate pushing lies solely with the optimizer. I'm unaware of any ability for a developer to force the decision, only that it really depends on the query the view uses and what additional criteria is being applied.
Commentary on Typical Use of Non-materialized Views
Sadly, it's very common to see a non-materialized SQL view used for nothing more than encapsulation to simplify writing queries -- simplification which isn't a recommended practice either. SQL is SET based, and doesn't optimize well using procedural approaches. Layering views on top of one another is also not a recommended practice.
Updateable Views
Non-materialized views are also updatable, but there are restrictions because a view can be made of numerous tables joined together. An updatable, non-materialized view will stop a user from being able to insert new records, but could update existing ones. The CHECK OPTION depends on the query used to create the view for enforcing a degree of update restriction, but it's not enough to ensure none will ever happen. This demonstrates that the only reliable means of securing against unwanted add/editing/deletion is to grant proper privileges to the user, preferably via a role.
Views do not protect tables, though they can be used in a permissions-based table-protection scheme. Views simply provide a convenient way to access tables. If you give a user access to views and not tables, then you have probably greatly restricted access.

can I rely on the order of fields in a sql view?

If I create a view and select my fields in the order I want to "receive" them in can I be fully assured that I can call "Select * from myView" from my apps instead of specifying ALL of the fieldnames yet again in my select query?
I ask this because I pass whole datarows to my DataModels and construct the objects by assigning properties to the different indexes in the itemarray attached to this datarow. If these fields get out of order there's no telling what could happen to my object.
I know that I can't rely on an order-by that lives inside of a view (been burned before on this one). But the order of the fields I was not sure about.
Sorry if this is sql noob level. We all start somewhere with it. Right now all the extraneous field names in my app code is making readability somewhat difficult so if I can safely go back and replace a lot of syntax with a * then that would be great.
These tables are small so i'm not worried about implications of using a * over individual fields. I'm just looking to not code unnecessary syntax.
Column order is guaranteed, row order (as you noted) is not.
Column order may not be guaranteed or reliable if both of these are true
the view definition has SELECT * or SELECT tableA.* internally
any changes are made to the table(s) concerned
You'd need to run sp_refreshview: see this question/answer for potential issues.
Of course, if you have simple SELECT * FROM table in a view, why not just use the table and save some maintenance pain?
Finally, and I have to say it, it isn't recommeded to use SELECT *... :-)
Yes, left-to-right ordering of columns is guaranteed in SQL. In fact, it's one of the top three flaws used to prove that SQL is not truly relational (e.g. see The Importance of Column Names by Hugh Darwen), duplicate rows and the NULL value being the other two.
Yes, I've always relied on select * returning fields in the order specified in the view or table.
For example Microsoft SQL - "* Specifies that all columns from all tables and views in the FROM clause should be returned. The columns are returned by table or view, as specified in the FROM clause, and in the order in which they exist in the table or view."

Performance when querying a View

I'm wondering if this is a bad practice or if in general this is the correct approach.
Lets say that I've created a view that combines a few attributes from a few tables.
My question, what do I need to do so I can query against this view as if it were a table without worrying about performance?
All attributes in the original tables are indexed, my concern is that the result view will have hundreds of thousands of records, which I will want to narrow down quite a bit based on user input.
What I'd like to avoid, is having multiple versions of the code that generates this view floating around with a few extra "where" conditions to facilitate the user input filtering.
For example, assume my view has this header VIEW(Name, Type, DateEntered) this may have 100,000+ rows (possibly millions). I'd like to be able to make this view in SQL Server, and then in my application write querlies like this:
SELECT Name, Type, DateEntered FROM MyView WHERE DateEntered BETWEEN #date1 and #date2;
Basically, I am denormalizing my data for a series of reports that need to be run, and I'd like to centralize where I pull the data from, maybe I'm not looking at this problem from the right angle though, so I'm open to alternative ways to attack this.
My question, what do I need to do so I can query against this view as if it were a table without worrying about performance?
SQL Server is very good in view unnesting.
Your queries will be as efficient as if the view's query were used in the query itself.
This means that
CREATE VIEW myview AS
SELECT *
FROM /* complex joins */
SELECT *
FROM mytable
JOIN myiew
ON …
and
SELECT *
FROM mytable
JOIN (
SELECT *
FROM /* complex joins */
) myview
ON …
will have the same performance.
SQL Server 2005 has indexed views - these provide indexes on views. That should help with performance. If the underlying tables already have good indexes on the queried fields, these will be used - you should only add indexed views when this is not the case.
These are known in other database systems as materialized views.
The view will make use of the index in your WHERE clause to filter the results.
Views aren't stored result sets. They're stored queries, so you'll have the performance gained from your indexes each time you query the view.
Why would it perform badly? I, mean you can think of a view as a compiled select statement. It makes use of existing indexes on the underlying tables, even when you add extra where clauses. In my opinion it is a good approach. In any case it's better than having virtually the same select statement scattered all over your application (from a design and maintainability point of view at least).
If not indexed then...
When you query a view, it's ignored. The view is expanded into the main query.
It is the same as querying the main tables directly.
What will kill you is view on top of view on top of view, in my experience.
It should, in general, perform no worse than the inline code.
Note that it is possible to make views which hide very complex processing (joins, pivots, stacked CTEs, etc), and you may never want anyone to be able to SELECT * FROM view on such a view for all time or all products or whatever. If you have standard filter criteria, you can use an inline table-valued function (effectively a parameterized view), which would require all users to supply the expected parameters.
In your case, for instance, they would always have to do:
SELECT Name, Type, DateEntered
FROM MyITVF(#date1, #date2);
To share the view logic between multiple ITVFs, you can build many inline table-valued functions on top of the view, but not give access to the underlying tables or views to users.