What is a materialized view in Oracle, and what is it used for? I searched this topic on the net but could not get a clear idea of it. Could you please explain it with a clear example so that I can understand the topic better?
A materialized view is an RDBMS-provided mechanism to trade additional storage consumption for better query performance.
For example, suppose you have a really big query with ten table joins that takes a long time to return data. If you convert the query into a materialized view, the results of the query are materialized into a special database table on disk automatically. What's even better is that as rows are added, updated, or deleted in the underlying tables, the changes are automatically reflected in the materialized view.
The trade-off of this handy tool is slower inserts and updates on the underlying tables. Materialized views are one of the few redeeming qualities of Oracle, IMHO.
Here is an example of a two-table join MATERIALIZED VIEW.
CREATE MATERIALIZED VIEW MV_Test
NOLOGGING
CACHE
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT V.*, P.*, V.ROWID as V_ROWID, P.ROWID as P_ROWID, (2+1) as Calc1, 'Constant1' as Const1
FROM TPM_PROJECTVERSION V
INNER JOIN TPM_PROJECT P ON P.PROJECTID = V.PROJECTID
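Note that REFRESH FAST ON COMMIT on a join view like this requires materialized view logs with ROWID on each base table (and, because both tables contain PROJECTID, in practice you would list the columns explicitly instead of selecting both V.* and P.* to avoid a duplicate column name error). A minimal sketch of the prerequisite logs, using the same table names:
CREATE MATERIALIZED VIEW LOG ON TPM_PROJECT WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON TPM_PROJECTVERSION WITH ROWID;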
Now, instead of running that same big query every time, you can run a simpler query against the new materialized view, which will return faster. The really cool thing is that you can also add derived and calculated columns.
SELECT * FROM MV_Test WHERE ...
P.S.
MATERIALIZED VIEWS are not a panacea. Use them in cases where you have a really slow query with lots of joins that is run frequently and where the reads far outweigh the writes.
A (nearly) real-world example.
Suppose you were asked to develop an enterprise-wide real-time inventory report that will output total worth of inventory across all warehouses of the enterprise.
You would then need to create a query to
sum up all transactions stored in the inventory transaction table, grouped by item and warehouse
join the sums with the table storing the current price per unit of measure
sum up again per warehouse
In an enterprise, such a query would take hours to complete (even medium-sized companies may have hundreds of thousands of different items), and its performance would deteriorate over time (imagine this query running over 5 years of data).
So, you would write the same (more or less) query as a materialized view. When it is created, Oracle will populate a table (think of it as a hidden table) with the results of your query, and then, each time a transaction is committed to the inventory, it will update the record for that specific item. If an item's price changes, it will update its worth. In general, every change on the underlying tables will be reflected in your materialized view immediately. Then, your report will run in a very reasonable time.
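As a rough sketch of what that might look like (the table and column names here are hypothetical, and the COUNT columns are required by Oracle to fast-refresh aggregates):
CREATE MATERIALIZED VIEW LOG ON inv_tx
  WITH ROWID, SEQUENCE (item_id, warehouse_id, qty)
  INCLUDING NEW VALUES;

CREATE MATERIALIZED VIEW mv_inv_qty
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS
SELECT item_id,
       warehouse_id,
       SUM(qty)   AS total_qty,
       COUNT(qty) AS cnt_qty,   -- needed so Oracle can fast-refresh SUM(qty)
       COUNT(*)   AS cnt_all    -- needed for fast refresh of aggregate views
FROM inv_tx
GROUP BY item_id, warehouse_id;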
On top of that, by using GROUP BY with ROLLUP or CUBE and the GROUPING function, you can get different levels of drilling on the same materialized view.
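For example, a drill-down query over the (hypothetical) aggregate view sketched above might look like this:
SELECT warehouse_id,
       item_id,
       SUM(total_qty)    AS qty,
       GROUPING(item_id) AS is_warehouse_total   -- 1 on warehouse subtotal rows
FROM mv_inv_qty
GROUP BY ROLLUP (warehouse_id, item_id);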
Keep in mind, though, that this is an idealized example. In practice, ON COMMIT (i.e. updating the materialized view at the same time as the underlying tables) may cause problems when you create a materialized view over frequently updated tables (and inventory transactions usually are that), and you may, depending on the case, write intermediate MVs to boost performance. Refreshing such a view every 5 minutes is a viable alternative.
MVs are a very powerful feature, but you need to use them with care.
What is a Materialized View?
A materialized view is a replica of a target master from a single point in time. The master can be either a master table at a master site or a master materialized view at a materialized view site. Whereas in multimaster replication tables are continuously updated by other master sites, materialized views are updated from one or more masters through individual batch updates, known as refreshes, from a single master site or master materialized view site, as illustrated in Figure 3-1. The arrows in Figure 3-1 represent database links.
--http://docs.oracle.com/cd/A97630_01/server.920/a96567/repmview.htm
Example: this creates a materialized view for the employees table. The refresh behaviour can be set to your preference, so read the documentation in the link above.
CREATE MATERIALIZED VIEW employee_mv
BUILD IMMEDIATE
REFRESH FORCE
ON DEMAND
AS
SELECT * FROM employees
A materialized view can also contain only a subset of the data:
CREATE MATERIALIZED VIEW employee_mv
BUILD IMMEDIATE
REFRESH FORCE
ON DEMAND
AS
SELECT name, ssn, address FROM employees
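Since both of these are ON DEMAND, they only pick up new data when you refresh them explicitly. A minimal sketch of doing that from PL/SQL ('C' requests a complete refresh):
BEGIN
  DBMS_MVIEW.REFRESH('EMPLOYEE_MV', method => 'C');  -- manual, complete refresh
END;
/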
Related
I have created a table that is a join of 3-4 other tables. The field values in the original source tables from which this table was created DO change, but rarely.
Updating or recreating the table takes about 30 mins-1 hour, and then some reports are run against it. However, this requires keeping track of any changes from the original source tables.
If, instead, I run reports off a VIEW, I know with 100% certainty that all the field values are correct - but will my SELECT performance suffer and become slower due to the view 'going back and fetching' values each time?
In this case, speed is just as important as accuracy, and my ultimate question is whether to use a view or a table. Thank you to anyone who's taken the time to read this!
We have a system that makes use of a database view, which takes data from a few reference tables (lookups) and then does a lot of pivoting and complex work on a hierarchy table of (pretty much fixed and static) locations, returning a view of the data to the application.
This view is getting slow, as new requirements are added.
A solution that may be an option would be to create a normal table, select from the view into this table, and let the application use that highly indexed and fast table for its querying.
The issue is, I guess, that if the underlying tables change, the new table will show old results. But the data that drives this table changes very infrequently. And if it does, a business/technical process could be put in place so that an 'Update the Table' procedure is run to refresh the data. Or even an update/insert trigger on the primary driving table?
Is this practice advised/ill-advised? And are there ways of making it safer?
The ideal solution is to optimise the underlying queries.
In SSMS run the slow query and include the actual execution plan (Ctrl + M), this will give you a graphical representation of how the query is being executed against your database.
Another helpful tool is to turn on IO statistics; IO is usually the main bottleneck with queries. Put this line at the top of your query window:
SET STATISTICS IO ON;
Check if SQL Server recommends any missing indexes (displayed in green in the execution plan). As you say the data changes infrequently, it should be safe to add additional indexes if needed.
In the execution plan you can hover your mouse over any element for more information. Check the estimated number of rows against the actual rows returned; if these vary greatly, update the statistics for the tables, which can help the query optimiser find the best execution plan.
To do this for all tables in a database:
USE [Database_Name]
GO
exec sp_updatestats
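Or, as a lighter-weight alternative for a single table (the table name here is a placeholder):
UPDATE STATISTICS dbo.YourTable WITH FULLSCAN;  -- rebuild statistics for one table only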
Still no luck in optimising the view / query?
Be careful with update triggers: if the schema changes on the view/table (say you add a new column to the source table), the new column will not be inserted into your 'optimised' table unless you update the trigger.
If it is not a business requirement to report on real-time data, there is not too much harm in having a separate optimized table for reporting (much like a data mart); just use a SQL Agent job to refresh it nightly during non-peak hours.
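A minimal sketch of what that nightly refresh step might run (the table and view names are hypothetical):
TRUNCATE TABLE dbo.ReportingCache;

INSERT INTO dbo.ReportingCache (Col1, Col2, Col3)
SELECT Col1, Col2, Col3
FROM dbo.SlowView;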
There are a few cons to this approach though:
More storage space / duplicated data
More complex database
Additional workload during the refresh
Decreased cache hits
I have 14 tables in BQ, which are updated several times a day.
Via JOIN of three of them, I have created a new one.
My question is: would this new table be updated each time new data is pushed into the BQ tables it is based on? If not, is there a way to make this JOIN "live" so that the newly created table is updated automatically?
Thank you!
BigQuery also supports views, virtual tables defined by a SQL query.
BigQuery's views are logical views, not materialized views, which means that the query that defines the view is re-executed every time the view is queried. Queries are billed according to the total amount of data in all table fields referenced directly or indirectly by the top-level query.
BigQuery supports up to eight levels of nested views.
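A minimal sketch of defining such a view over the joined tables (the project, dataset, table, and column names are all made up):
CREATE VIEW `my_project.my_dataset.joined_view` AS
SELECT a.id, a.col1, b.col2, c.col3
FROM `my_project.my_dataset.table_a` a
JOIN `my_project.my_dataset.table_b` b ON a.id = b.id
JOIN `my_project.my_dataset.table_c` c ON a.id = c.id;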
You can create a view or a materialized view so that your required data set is updated instantly, but this queries the underlying tables, so beware of joining massive tables.
For more complex table sync from/to BQ and other apps (two-way sync), I finally used https://www.stacksync.cloud/
It offers real-time updates and, eventually, two-way sync. Check it out too for the less technical folks!
In SQL Server 2008+, we'd like to enable tracking of historical changes to a "Customers" table in an operational database.
It's a new table and our app controls all writing to the database, so we don't need evil hacks like triggers. Instead we will build the change tracking into our business object layer, but we need to figure out the right database schema to use.
The number of rows will be under 100,000 and number of changes per record will average 1.5 per year.
There are at least two ways we've been looking at modelling this:
1) As a Type 2 Slowly Changing Dimension table called CustomersHistory, with columns for EffectiveStartDate, EffectiveEndDate (set to NULL for the current version of the customer), and auditing columns like ChangeReason and ChangedByUsername. Then we'd build a Customers view over that table which is filtered to EffectiveEndDate=NULL. Most parts of our app would query using that view, and only the parts that need to be history-aware would query the underlying table. For performance, we could materialize the view and/or add a filtered index on EffectiveEndDate=NULL.
2) With a separate audit table. Every change to a Customer record writes once to the Customer table and again to a CustomerHistory audit table.
From a quick review of StackOverflow questions, #2 seems to be much more popular. But is this because most DB apps have to deal with legacy and rogue writers?
Given that we're starting from a blank slate, what are pros and cons of either approach? Which would you recommend?
In general, the issue with SCD Type II is that, if the average number of changes in the values of the attributes is very high, you end up with a very fat dimension table. This growing dimension table, joined with a huge fact table, slows down query performance gradually. It's like slow poisoning: initially you don't see the impact, and when you realize it, it's too late!
Now, I understand that you will create a separate materialized view with EffectiveEndDate = NULL and that it will be used in most of your joins. Additionally, in your case the data volume is comparatively low (100,000 rows). With an average of only 1.5 changes per year, I don't think data volume or query performance is going to be your problem in the near future.
In other words, your table is truly a slowly changing dimension (as opposed to a rapidly changing dimension, where your option #2 is a better fit). In your case, I would prefer option #1.
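For reference, a minimal sketch of the option #1 current-rows view and filtered index (the key column name is hypothetical):
CREATE VIEW dbo.Customers
AS
SELECT *
FROM dbo.CustomersHistory
WHERE EffectiveEndDate IS NULL;
GO

-- Filtered index covering only the current rows
CREATE NONCLUSTERED INDEX IX_CustomersHistory_Current
ON dbo.CustomersHistory (CustomerId)
WHERE EffectiveEndDate IS NULL;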
I have a system that has a materialized view containing roughly 1 billion items. On a consistent two-hour basis I need to update about 200 million (20%) of the records. My question is: what should the refresh strategy on my materialized view be? As of right now it is refreshed on an interval. I am curious about the performance impact of refreshing on an interval versus REFRESH NEVER combined with renaming/replacing the old materialized view with the new one. The underlying issue is the indexes used by Oracle, which create a massive amount of redo. Any suggestions are appreciated.
UPDATE
Since some people seem to think this is off topic, my current viewpoint is to do the following:
Create an Oracle Scheduler chain that invokes a series of PL/SQL (programming language, I promise) functions to refresh the materialized views in a pseudo-parallel fashion. However, since I fell into the position of a DBA of sorts, I am looking to solve a data problem with an algorithm and/or some code.
OK, so here is the solution I came up with; your mileage may vary, and any feedback is appreciated after the fact. The overall strategy was to do the following:
1) Utilize the Oracle Scheduler making use of parallel execution of chains (jobs)
2) Utilize views (the regular kind) as the interface from the application into the database
3) Rely on materialized views to be built in the following manner
create materialized view foo
parallel
nologging
never refresh
as
select statement
as needed use the following:
create index baz on foo(bar) nologging
The advantage of this is that we can build the materialized view in the background before dropping and recreating the interface view described in step 2. The trick is creating dynamically named materialized views while keeping the view with the same name; the key is not to blow away the original materialized view until the new one is finished. This also allows for quick drops, as there is minimal redo to care about. This enabled materialized view creation over ~1 billion records in 5 minutes, which met our requirement of "refreshes" every thirty minutes. Further, this can be handled on a single database node, so it is possible even with constrained hardware.
Here is a PL/SQL procedure that will create it for you:
CREATE OR REPLACE PROCEDURE foo_bar AS
  -- Timestamp-based name, so each build produces its own materialized view
  foo_view varchar2(500) := 'foo_' || to_char(sysdate, 'dd_MON_yyyy_hh_mi_ss');
BEGIN
  EXECUTE IMMEDIATE
    'create materialized view ' || foo_view || '
     parallel
     nologging
     never refresh
     as
     select * from cats';
END foo_bar;
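Once the new materialized view has been built, the swap described in step 2 is just a view redefinition plus a cleanup of the previous copy. Roughly (the interface view name and timestamped names are hypothetical, following the foo_bar naming pattern):
-- Point the stable interface view at the freshly built materialized view
CREATE OR REPLACE VIEW foo_vw AS
  SELECT * FROM foo_28_MAR_2016_10_30_00;

-- Only once nothing references the previous copy, drop it (fast, minimal redo)
DROP MATERIALIZED VIEW foo_28_MAR_2016_08_30_00;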