We are trying to use ARRAY_AGG as a pattern in a MATERIALIZED VIEW to get the 'latest' product event for a given productId.
The SQL below is a standard pattern (well documented on this site) and works on its own, but in the context of BigQuery MVs it fails with the attached error.
We essentially want to use this type of SQL in a materialized view, where 'latest' is incrementally updated by the BQ MV, rather than the alternative of scheduled queries that reprocess all the events in product_events.
CREATE MATERIALIZED VIEW `project.product_events_latest`
AS
SELECT
  ARRAY_AGG(
    e ORDER BY PARSE_TIMESTAMP("%Y-%m-%dT%H:%M:%E*S%Ez", event.eventOccurredTime) DESC LIMIT 1
  )[OFFSET(0)].*
FROM
  `project.product_events` e
GROUP BY
  e.productEvent.product.id
I'm not sure what the unsupported feature is, and whether there is a way to rewrite the query to make it work, or whether it just isn't possible yet. Any help appreciated!
Currently, BigQuery materialized views do not support OFFSET or other post-computation on top of aggregate functions.
Instead, move [OFFSET(0)] into a regular view on top of the materialized view.
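A minimal sketch of that split, reusing the table and field names exactly as they appear in the question. The names `product_events_latest_mv`, `productId`, and `latest` are assumptions made for illustration, and whether ARRAY_AGG with ORDER BY ... LIMIT is accepted inside a materialized view should be checked against the current BigQuery documentation:

```sql
-- Materialized view: holds only the aggregate, with no post-processing on top of it.
-- `product_events_latest_mv`, `productId` and `latest` are hypothetical names.
CREATE MATERIALIZED VIEW `project.product_events_latest_mv`
AS
SELECT
  e.productEvent.product.id AS productId,
  ARRAY_AGG(
    -- Ordering expression copied unchanged from the question.
    e ORDER BY PARSE_TIMESTAMP("%Y-%m-%dT%H:%M:%E*S%Ez", event.eventOccurredTime) DESC LIMIT 1
  ) AS latest
FROM
  `project.product_events` e
GROUP BY
  e.productEvent.product.id;

-- Regular view: unpacks the single-element array produced by the materialized view.
CREATE VIEW `project.product_events_latest`
AS
SELECT
  latest[OFFSET(0)].*
FROM
  `project.product_events_latest_mv`;
```

Queries then read from the regular view, so the array unpacking runs at query time while the aggregation itself is maintained by the materialized view.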
What is the difference between Views and Materialized Views in Oracle?
Materialized views are disk based and are updated periodically based upon the query definition.
Views are virtual only and run the query definition each time they are accessed.
Views
They evaluate the data in the tables underlying the view definition at the time the view is queried. It is a logical view of your tables, with no data stored anywhere else.
The upside of a view is that it will always return the latest data to you. The downside of a view is that its performance depends on how good a select statement the view is based on. If the select statement used by the view joins many tables, or uses joins based on non-indexed columns, the view could perform poorly.
Materialized views
They are similar to regular views, in that they are a logical view of your data (based on a select statement), however, the underlying query result set has been saved to a table. The upside of this is that when you query a materialized view, you are querying a table, which may also be indexed.
In addition, because all the joins have been resolved at materialized view refresh time, you pay the price of the join once (or as often as you refresh your materialized view), rather than each time you select from the materialized view. In addition, with query rewrite enabled, Oracle can optimize a query that selects from the source of your materialized view in such a way that it instead reads from your materialized view. In situations where you create materialized views as forms of aggregate tables, or as copies of frequently executed queries, this can greatly speed up the response time of your end user application. The downside though is that the data you get back from the materialized view is only as up to date as the last time the materialized view has been refreshed.
Materialized views can be set to refresh manually, on a set schedule, or based on the database detecting a change in data from one of the underlying tables. Materialized views can be incrementally updated by combining them with materialized view logs, which act as change data capture sources on the underlying tables.
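As a rough illustration of the log-based incremental refresh described above, here is a sketch in Oracle SQL. The table `orders` and its columns `customer_id` and `amount` are hypothetical, and the exact prerequisites for fast refresh depend on the Oracle version and on the shape of the query:

```sql
-- Hypothetical base table: orders(order_id, customer_id, amount).

-- The materialized view log records changes to the base table so that
-- only changed rows need to be applied at refresh time.
CREATE MATERIALIZED VIEW LOG ON orders
  WITH ROWID, SEQUENCE (customer_id, amount)
  INCLUDING NEW VALUES;

-- Aggregate materialized view refreshed incrementally whenever a
-- transaction on orders commits.
CREATE MATERIALIZED VIEW customer_totals
  BUILD IMMEDIATE
  REFRESH FAST ON COMMIT
AS
SELECT customer_id,
       COUNT(*)      AS row_count,     -- commonly required for fast refresh of aggregates
       COUNT(amount) AS amount_count,  -- commonly required alongside SUM(amount)
       SUM(amount)   AS total_amount
FROM orders
GROUP BY customer_id;
```

Replacing `REFRESH FAST ON COMMIT` with `REFRESH COMPLETE ON DEMAND` would give the manual, full-recompute variant instead.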
Materialized views are most often used in data warehousing / business intelligence applications, where querying large fact tables with thousands of millions of rows would otherwise produce response times slow enough to make the application unusable.
Materialized views also help to guarantee a consistent moment in time, similar to snapshot isolation.
A view uses a query to pull data from the underlying tables.
A materialized view is a table on disk that contains the result set of a query.
Materialized views are primarily used to increase application performance when it isn't feasible or desirable to use a standard view with indexes applied to it. Materialized views can be updated on a regular basis either through triggers or by using the ON COMMIT REFRESH option. This does require a few extra permissions, but it's nothing complex. ON COMMIT REFRESH has been in place since at least Oracle 10.
Materialised view - a table on a disk that contains the result set of a query
Non-materialised view - a query that pulls data from the underlying table
Views are essentially logical table-like structures populated on the fly by a given query. The results of a view query are not stored anywhere on disk and the view is recreated every time the query is executed. Materialized views are actual structures stored within the database and written to disk. They are updated based on the parameters defined when they are created.
View: A view is just a named query. It doesn't store anything. When a view is queried, the query in the view definition runs. The actual data comes from the table.
Materialised view: Stores data physically and is updated periodically. Querying the MV returns the data stored in the MV.
Adding to Mike McAllister's pretty-thorough answer...
Materialized views can only be set to refresh automatically through the database detecting changes when the view query is considered simple by the compiler. If it's considered too complex, it won't be able to set up what are essentially internal triggers to track changes in the source tables to only update the changed rows in the mview table.
When you create a materialized view, you'll find that Oracle creates both the mview and a table with the same name, which can make things confusing.
Materialized views are a logical view of the data driven by a select query, but the result of the query is stored in a table on disk. The definition of the query is also stored in the database.
A materialized view performs better than a normal view because its data is stored in a table, and that table may be indexed, which makes joining faster. The joins are also done at materialized view refresh time, so there is no need to run the join every time, as there is with a view.
Another difference is that a view always returns the latest data, whereas a materialized view has to be refreshed to return the latest data.
A materialized view needs an extra trigger or some automatic method to keep it refreshed; this is not required for views.
I have a question about views. Suppose I have a view and I insert a record into its base table. Does the view update after this insert, or do I have to run a SELECT to update it?
I think my question is obvious: is a view just a SELECT, or is its result saved in the database so that it has to be updated when its base table is updated?
Normal views are not persisted. If a row is inserted through an updatable view, then selecting from the view (or from the affected underlying tables) will show it.
Not entirely sure what problem you are trying to solve. Views (non-indexed) suffice for most applications.
Have a look at Indexed Views: Improving Performance with SQL Server 2008 Indexed Views:
In the case of a nonindexed view, the portions of the view necessary
to solve the query are materialized at run time. Any computations such
as joins or aggregations are done during query execution for each
query referencing the view. After a unique clustered index is
created on the view, the view's result set is materialized immediately
and persisted in physical storage in the database, saving the overhead
of performing this costly operation at execution time.
The typical use of an indexed view is when you have expensive aggregations to perform.
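To make the quoted passage concrete, a sketch of an indexed view over an aggregation in T-SQL might look like the following. The table `dbo.Orders` and its columns `CustomerId` and `Amount` (assumed NOT NULL) are hypothetical, and indexed views carry further requirements (SET options, determinism) that are not shown here:

```sql
-- Hypothetical base table: dbo.Orders(OrderId, CustomerId, Amount NOT NULL).

-- The view must be schema-bound and use two-part table names.
CREATE VIEW dbo.CustomerTotals
WITH SCHEMABINDING
AS
SELECT CustomerId,
       COUNT_BIG(*) AS RowCnt,      -- required when the indexed view uses GROUP BY
       SUM(Amount)  AS TotalAmount
FROM dbo.Orders
GROUP BY CustomerId;
GO

-- Creating the unique clustered index is what materializes and persists the result set.
CREATE UNIQUE CLUSTERED INDEX IX_CustomerTotals
    ON dbo.CustomerTotals (CustomerId);
```

Until the index is created, `dbo.CustomerTotals` behaves like any other nonindexed view and its aggregation runs at query time.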
Think of a view as a select statement. Instead of having to write out the entire select statement, you just select the view and it runs that select statement for you. So yes, anything you do to the underlying tables will automatically be visible in the view.
When I create a view I am basically making a new table that will automatically be transacted upon when data in one of the tables it joins changes; is that correct?
Also why can't I use subqueries in my view?
A view works like a table, but it is not a table. It never exists; it is only a prepared SQL statement that is run when you reference the view name. IE:
CREATE VIEW foo AS
  SELECT * FROM bar;

SELECT * FROM foo;
...is equivalent to running:
SELECT x.*
FROM (SELECT * FROM bar) x
A MySQLDump will never contain rows to be inserted into a view...
Also why can't I use subqueries in my view?
That, sadly, is by (albeit questionable) design. There are numerous limitations for MySQL views, which are documented: http://dev.mysql.com/doc/refman/5.0/en/create-view.html
So if it's just an imaginary table/prepared statement does that mean it theoretically has the same performance (or even less) as a normal table/query?
No.
A table can have indexes associated, which can make data retrieval faster (at some cost for insert/update). Some databases support "materialized" views, which are views that can have indexes applied to them. It shouldn't be a surprise that MySQL doesn't support them, given its limited view functionality (which only began in v5 IIRC, very late to the game).
Because a view is a derived table, the performance of the view is only as good as the query it is built on. If that query sucks, the performance issue will just snowball... That said, when querying a view, if a view column reference in the WHERE clause is not wrapped in a function (IE: WHERE v.column LIKE ..., not WHERE LOWER(v.column) LIKE ...), the optimizer may push the criteria (called a predicate) onto the original query, making it faster.
I ran into the same problem also (to my surprise, because my search seems to indicate that Oracle and MS do support it).
I get around this limitation (at least for now, until proven non-usable) by creating two additional views for my final view.
Example:
CREATE VIEW Foo1 AS
  SELECT * FROM t ORDER BY ID, InsertDate DESC;
CREATE VIEW Foo2 AS
  SELECT * FROM Foo1 GROUP BY ID;
CREATE VIEW Foo AS
  SELECT * FROM Foo2 ORDER BY ID;
The example above basically has a table 't' which is a temporal table containing all the revisions. My 'Foo' (view) basically is a simple view of only my most current revisions of each record. Seems to work alright for now!
Update:
I don't know if this is another bug in MySQL 5.1, but the above example doesn't in fact work! The 'Foo1' works as expected, but the 'Foo2' seems to ignore the order prior to grouping so my end result is not what is intended. I even get the same result if I change the 'DESC' for 'ASC' (surprisingly).
Also, if you read the 17.5.1. View Syntax section, it clearly states:
"A view can be created from many kinds of SELECT statements. It can refer to base tables or other views. It can use joins, UNION, and subqueries."
I'm going to update my database to 5.6 and try it again!
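For the "most current revision of each record" goal described above, one common alternative that does not rely on ordering before GROUP BY is a correlated subquery in the WHERE clause. This is only a sketch, reusing the table `t` and the columns `ID` and `InsertDate` from the example; the view name `FooLatest` and the alias `t2` are made up for illustration:

```sql
-- Keep only the row whose InsertDate is the newest one for its ID.
-- The subquery sits in the WHERE clause rather than in the FROM clause.
CREATE VIEW FooLatest AS
SELECT t.*
FROM t
WHERE t.InsertDate = (
    SELECT MAX(t2.InsertDate)
    FROM t AS t2
    WHERE t2.ID = t.ID
);
```

If two revisions of the same ID share the same InsertDate, both rows are returned, so a tie-breaking column would be needed in that case.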
The difference is:
for a view you can only have subqueries in the WHERE part, not in the FROM part, so a
CREATE VIEW v AS SELECT * FROM foo WHERE id IN (SELECT id FROM bar)
would work, but at the same time you get a read-only view. A simple view on a single table would allow you to update "through" the view to the underlying table.
I have created a simple view consisting of 3 tables in SQL.
By right-clicking the view in Object Explorer and selecting Design, I modified my custom view. I just added an ascending sort on a field.
The problem is that the changes are not reflected in the output of the view.
After saving the view and selecting Open View, the sort is not applied to the output.
So what is going on here?
Technically, it is possible to bake sorting into a VIEW but it is highly discouraged. Instead, you should apply sort while selecting from the view like so:
Select ...
From MyView
Order By SortByCol ASC
If you really wanted to know (but again, I would strongly recommend against this), you can use the TOP command to get around the limitation of sorting in the view:
Select TOP 100 PERCENT Col1, Col2....
From Table1
Order By SortByCol ASC
It seems:
There is a restriction on the SELECT clauses in a view definition in SQL Server 2000, SQL 2005 and SQL 2008. A CREATE VIEW statement cannot include ORDER BY clause, unless there is also a TOP clause in the select list of the SELECT statement. The ORDER BY clause is used only to determine the rows that are returned by the TOP clause in the view definition. The ORDER BY clause does not guarantee ordered results when the view is queried, unless ORDER BY is also specified in the query itself.
There is also a hotfix that needs to be applied. After that, you should use TOP 100 PERCENT to make sure that the ORDER BY works.
HTH
Generally, Views cannot be sorted.
(As others mentioned, there's a hack to do it, but since you are using a visual query designer rather than writing your view definition in SQL, it's probably difficult to implement that hack.)
You didn't actually "modify" your view, you only changed the SELECT statement that EM was using to select from your view. Sort settings are not retained in the view definition.
When you close the tab, EM doesn't remember your sort preference for that view, so when you open the view again, it comes out in whatever order SQL Server decides.