Life cycle of an Oracle Materialized view - sql

I am looking for the life cycle of an Oracle materialized view. For example the statement:
Create materialized view foo
Refresh On Commit
...
Will this view be updated every time there is a commit to my database, or only when a commit touches one of the tables referenced in the view's query? Beyond this, at what point does Oracle destroy the old cache and replace it with the new one? Specifically, what is the window of "staleness" for a materialized view, and does it depend on how long the materialized view takes to build?

The ON COMMIT clause will modify the commit process of all transactions that issue DML on a base table:
Specify ON COMMIT to indicate that a fast refresh is to occur whenever the database commits a transaction that operates on a master table of the materialized view. This clause may increase the time taken to complete the commit, because the database performs the refresh operation as part of the commit process.
The commit will be dependent upon the success of the refresh of the materialized view (which means that a commit can fail because a dependent MV can't be refreshed).
The refresh takes place in the same transaction as the one that issues the commit. This means that as soon as the commit is complete, the changes are visible to all sessions (data is thus never stale).
Some of the things you have to be aware of:
The use of on-commit MVs has a performance cost: materialized view logs (which act like DML "triggers" on the base table) increase the work done by each DML statement, and the commit itself will perform more work than usual. Benchmark your workload to make sure the extra work won't be a burden.
With an aggregate on-commit MV, concurrent transactions can update the same MV row, which can lead to contention during the commit on top of the extra work.
Some tools don't expect a commit to fail, which can lead to UI problems (usually in old client-server apps).

Related

Materialized View: How to automatically refresh it upon table data changes?

Is there a way to make an Oracle materialized view refresh itself automatically when there are changes to the tables used in the materialized view? What Refresh Mode and Refresh Method should I use? What options should I use in SQL Developer?
Thank you in advance
Yes, you can define a Materialized View with ON COMMIT, e.g.:
CREATE MATERIALIZED VIEW sales_mv
BUILD IMMEDIATE
REFRESH FAST ON COMMIT
AS SELECT t.calendar_year, p.prod_id ... FROM ...
In this case the MV is refreshed as part of every commit, provided of course that the committed transaction modified a master table.
Since the refresh is done on each commit, it is strongly recommended to use FAST refresh rather than COMPLETE, which would take too long.
You have several restrictions and pre-conditions in order to use FAST REFRESH, check Oracle documentation: CREATE MATERIALIZED VIEW, FAST Clause for details.
I don't think there's any way to 'automatically' replicate the changes to the m.view right after they are made. But there are ways to use FAST (incremental) refresh on demand; you'd only have to schedule a job for the m.view or an m.view group to do the refresh. You can also use an m.view log to keep track of all the DML and then have it propagated to the m.view with a fast refresh on a remote database through a db link.
If you need the changes to be replicated as soon as they are made, then I recommend using GoldenGate or Streams (if you don't want to license GG). Just beware that Oracle discontinued support for Streams in favor of GoldenGate, so if you have any issues, you're on your own. But anyway, it's a pretty solid replication tool once you get the hang of it.

Emulating materialized views in PostgreSQL with concurrent refreshes

I'm using PostgreSQL 9.2.4 and would like to emulate a materialized view. Are there any well-known methods for doing this, including concurrent refreshes?
PostgreSQL wiki - materialized views links to two trigger-based implementations.
The general idea is to put AFTER INSERT OR UPDATE OR DELETE ... FOR EACH ROW triggers on each involved table that do partial updates on the target table. Implementation is fairly specific to the nature of the view.
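As a rough illustration of the trigger idea, here is a minimal sketch that uses SQLite through Python's sqlite3 module as a stand-in for PostgreSQL. The tables orders and order_totals and the trigger names are hypothetical, and PostgreSQL triggers would call a trigger function rather than use an inline body, so treat this only as the shape of the approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (
    id INTEGER PRIMARY KEY, customer TEXT NOT NULL, amount INTEGER NOT NULL);
-- Target table playing the role of the materialized view.
CREATE TABLE order_totals (customer TEXT PRIMARY KEY, total INTEGER NOT NULL);

CREATE TRIGGER orders_ai AFTER INSERT ON orders FOR EACH ROW
BEGIN
    INSERT OR IGNORE INTO order_totals (customer, total) VALUES (NEW.customer, 0);
    UPDATE order_totals SET total = total + NEW.amount WHERE customer = NEW.customer;
END;

CREATE TRIGGER orders_ad AFTER DELETE ON orders FOR EACH ROW
BEGIN
    UPDATE order_totals SET total = total - OLD.amount WHERE customer = OLD.customer;
END;

CREATE TRIGGER orders_au AFTER UPDATE ON orders FOR EACH ROW
BEGIN
    -- Undo the old row's contribution, then add the new row's.
    UPDATE order_totals SET total = total - OLD.amount WHERE customer = OLD.customer;
    INSERT OR IGNORE INTO order_totals (customer, total) VALUES (NEW.customer, 0);
    UPDATE order_totals SET total = total + NEW.amount WHERE customer = NEW.customer;
END;
""")

conn.execute("INSERT INTO orders (customer, amount) VALUES ('alice', 10)")
conn.execute("INSERT INTO orders (customer, amount) VALUES ('alice', 5)")
conn.execute("INSERT INTO orders (customer, amount) VALUES ('bob', 7)")
conn.execute("UPDATE orders SET amount = 20 WHERE customer = 'bob'")
conn.execute("DELETE FROM orders WHERE amount = 5")
conn.commit()

# The trigger-maintained table should match a full recomputation.
totals = dict(conn.execute("SELECT customer, total FROM order_totals"))
recomputed = dict(conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer"))
```

Note that this sketch leaves a row with total 0 behind when a customer's last order is deleted; whether that matters depends on the view being emulated.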
For some more complex views you can't really do partial updates and need to do a concurrent view refresh instead. That usually involves creating a new table, populating it, committing, beginning a new transaction, dropping the old table, renaming the new one to the name of the old one, and committing again.
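The build-then-swap sequence described above can be sketched as follows, again with SQLite via Python's sqlite3 standing in for PostgreSQL. The helper refresh_by_swap and the table names are hypothetical, and locking behavior during the swap differs between engines, so this only shows the order of the steps.

```python
import sqlite3

def refresh_by_swap(conn, name, select_sql):
    """Rebuild table `name` from `select_sql` and swap it in for the old copy.

    `name` and `select_sql` are interpolated into SQL, so they must come
    from trusted code, never from user input.
    """
    staging = name + "_new"
    # Transaction 1: build and populate the replacement table.
    conn.execute("BEGIN")
    conn.execute(f"DROP TABLE IF EXISTS {staging}")
    conn.execute(f"CREATE TABLE {staging} AS {select_sql}")
    conn.execute("COMMIT")
    # Transaction 2: drop the old table and rename the new one into place.
    conn.execute("BEGIN")
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"ALTER TABLE {staging} RENAME TO {name}")
    conn.execute("COMMIT")

# isolation_level=None puts sqlite3 in autocommit mode so BEGIN/COMMIT are explicit.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [("alice", 10), ("bob", 7)])
query = "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"

refresh_by_swap(conn, "order_totals", query)
first = dict(conn.execute("SELECT customer, total FROM order_totals"))

conn.execute("INSERT INTO orders VALUES ('alice', 5)")
refresh_by_swap(conn, "order_totals", query)
second = dict(conn.execute("SELECT customer, total FROM order_totals"))
```

Readers keep seeing the old table while the first transaction runs; only the short second transaction touches the name they query.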
Starting from 9.4, Postgres supports concurrent refresh, as described in the official documentation. However, two preconditions need to be satisfied:
You must create a unique index on the materialized view.
The unique index must cover all the rows of the materialized view. In other words, you cannot have a WHERE clause in your CREATE INDEX command.
The command to refresh the materialized view concurrently is the following:
REFRESH MATERIALIZED VIEW CONCURRENTLY mat_view_name;
Note that refreshing the materialized view concurrently is generally slower than a normal refresh. However, it ensures that none of your queries on the materialized view are blocked during the refresh.

materialized view logging exclude deletes

I am using MVIEWs with Fast refresh to replicate some tables across a network. Everything works great, however I ran into an issue when considering my Delete/Purge process.
The source tables for the MVIEWs, which feed the log tables, have a data retention of 7 days, i.e. I will be running a nightly purge process to delete data older than 7 days from the current date.
The target MVIEWs however are on an ODS and have a data retention policy of 30 days. Also, these MVIEWs are NOT currently populating another schema or set of tables.
Problem is, when I Delete from the source tables, those delete statements will propagate through to the target MVIEWs and now I no longer have 30 days worth of data - only 7.
Is there a way to exclude logging DELETE for the MVIEW log tables? I noticed in the MLOG$_Table_Name there is a column 'DMLTYPE$$'. Could I somehow delete from the Log table all records where DMLTYPE$$ = 'D'?
Thanks everyone, and yes, I did try researching this online first.
Regards,
Steve
I suppose that you could manually delete data from the materialized view logs before running the refresh. That would probably work. But it would not be a solution that I'd be really comfortable with. It would be a very bespoke solution that would probably not be officially supported. And if there might ever be another materialized view that depends on the materialized view log, you'd have to ensure that you're only deleting those rows that relate to your materialized view's subscription. Plus, the materialized view on the destination would need to be updatable in order for you to be able to manually remove the rows older than 30 days via a separate process.
If these are the business requirements, something like Oracle Streams (or GoldenGate) would be a much more appropriate architectural solution. Those products are designed to give you more flexibility about which logical change records (LCRs) you apply. In Streams, for example, it is easy enough to create a custom apply handler that discards delete LCRs. And since you're applying LCRs to a table on the destination rather than a materialized view, your 30 day purge process is much easier to manage. This would be a relatively common Streams setup rather than a highly unusual materialized view setup.
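To make the "discard delete change records, purge separately" idea concrete, here is a purely conceptual sketch in plain Python. It does not use the Streams or GoldenGate APIs; the change-record tuples, the dict-based target, and the function names apply_changes and purge_older_than are all assumptions made for illustration.

```python
from datetime import date, timedelta

def apply_changes(target, changes):
    """Apply insert/update records to `target`, discarding delete records."""
    for op, key, row in changes:
        if op == "D":
            continue  # the source purge must not remove rows from the target
        target[key] = row

def purge_older_than(target, today, days):
    """Independent retention job that runs on the target only."""
    cutoff = today - timedelta(days=days)
    for key in [k for k, row in target.items() if row["created"] < cutoff]:
        del target[key]

today = date(2024, 1, 31)
changes = [
    ("I", 1, {"created": today - timedelta(days=10)}),
    ("I", 2, {"created": today - timedelta(days=1)}),
    ("I", 3, {"created": today - timedelta(days=40)}),
    ("D", 1, None),  # purged at the source after 7 days
    ("D", 3, None),
]

target = {}
apply_changes(target, changes)
kept_after_apply = sorted(target)

purge_older_than(target, today, days=30)
kept_after_purge = sorted(target)
```

Row 1 survives even though the source deleted it, and row 3 is removed only by the target's own 30 day purge.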

What happens to data-modifying queries when a JDBC application exits abnormally or the connection is dropped (Oracle, if that matters)?

I call extensive UPDATE statements and PL/SQL procedures.
What will happen to the data if my application loses its connection to the DB, the server halts, etc.?
In the case of an SQL UPDATE statement I think it will be rolled back.
For a PL/SQL procedure I assume that execution stops at some point: any commit issued before that point will have been applied, but the rest of the code will not run.
Am I right?
Yes, it should roll back to the last rollback/commit call.
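The general rule can be illustrated with a small sketch. It uses SQLite through Python's sqlite3 rather than Oracle over JDBC, so it is only an analogy: closing the connection without committing stands in for the dropped session, and the table t is hypothetical.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.db")

# isolation_level=None means autocommit, so transactions are explicit.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, note TEXT)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t (note) VALUES ('committed')")
conn.execute("COMMIT")

conn.execute("BEGIN")
conn.execute("INSERT INTO t (note) VALUES ('never committed')")
conn.close()  # session ends with a transaction still open

# A fresh session sees only the work that was committed.
conn = sqlite3.connect(path)
rows = [r[0] for r in conn.execute("SELECT note FROM t ORDER BY id")]
conn.close()
```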
This became too long for a comment.
DDL statements (TRUNCATE, CREATE, DROP, ...) implicitly commit. So if you issue one in your stored procedure calls, everything before that statement will be committed whether you want it or not. If the JDBC session is lost after the TRUNCATE, the changes made before it are still committed.
And yes, if you are inserting large volumes without intermediate commits, things can slow down. This is typically because you are building up rollback segments. There is a sweet spot with large inserts where you insert a batch of, say, 1,000 records at a time, committing after each batch.
What you are describing does not seem like normal transactional activity but more like bulk loading. If you are bulk loading, then maintain state so that you can restart the load or discard the records already loaded if you replay. Consider things like shipping the data as a file and importing it (or using an external table) rather than necessarily inserting via a client connection. The APPEND hint, together with NOLOGGING on the target table, can speed up inserts (but note that the db will not be in a typical 'recoverable' state afterward and should be backed up again).
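A minimal sketch of batching with restartable state, using SQLite through Python's sqlite3 as a stand-in for Oracle. The function load_in_batches and the table target are hypothetical, and the restart logic assumes each record carries an increasing integer id and that records arrive sorted by it.

```python
import sqlite3

def load_in_batches(conn, records, batch_size=1000):
    """Insert (id, payload) records, committing after every batch.

    On a restart, records whose id is already in the table are skipped,
    so replaying the same input does not load anything twice.
    Returns the number of commits performed.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS target (id INTEGER PRIMARY KEY, payload TEXT)")
    last_id = conn.execute("SELECT COALESCE(MAX(id), 0) FROM target").fetchone()[0]
    pending = [r for r in records if r[0] > last_id]

    commits = 0
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        conn.executemany("INSERT INTO target (id, payload) VALUES (?, ?)", batch)
        conn.commit()  # at most one batch is lost if the session dies here
        commits += 1
    return commits

conn = sqlite3.connect(":memory:")
records = [(i, f"row {i}") for i in range(1, 2501)]

# First run is "interrupted" after 1500 records; the second replays everything.
first_commits = load_in_batches(conn, records[:1500])
second_commits = load_in_batches(conn, records)
loaded = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
```

The batch size of 1,000 is the figure suggested above, not a tuned value; the right size depends on the workload.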

Should I run VACUUM in transaction or after?

I have a mobile application sync process. The transaction does a lot of modification on the database. Since this is done on mobile I need to issue a VACUUM to compact the database.
I am wondering when should I issue a VACUUM
in the transaction, as final statement
or after the transaction?
I am currently asking about SQLite, but if it's different for other engines, let me know in the answers (PostgreSQL, MySQL, Oracle, SQL Server).
Whether you want to or not, with PostgreSQL you can't run VACUUM inside a transaction, as stated in the manual:
VACUUM cannot be executed inside a transaction block.
I would say outside of the transaction. Certainly in PostgreSQL, VACUUM is designed to remove the "dead" tuples (i.e. the old row when a record has been changed or deleted.)
If you're running VACUUM in a transaction that has modified records, these dead rows won't have been marked for deletion.
Depending on which type of VACUUM you're doing, it may also require a table lock which will block if there are other transactions running, so you could potentially end up in a deadlock situation (transaction 1 is blocked waiting for a table lock to do its VACUUM, transaction 2 gets blocked waiting for a row to be released that transaction 1 has locked.)
I'd also recommend that this isn't done from the application (run it as a scheduled task instead), as it can take a while to complete and can slow down other queries.
As for SQL Server, there is no VACUUM; what you're looking for is shrink. You can turn on auto shrink in 2005, which will automatically reclaim space when the server decides to, or issue a DBCC statement to shrink the database and log file, but this depends on your backup routine and strategy on a per-database level.
Vacuum is like defrag: it's good to do if you've recently deleted a lot of stuff, or maybe after you've inserted a lot of stuff, but by no means should you do it in every transaction. It's slower than almost any other database command and is more of a maintenance task.
We sometimes add/remove the majority of our db file, so then a vacuum would be a good idea, but I still would not consider it a part of the same transaction that did the work.
How frequently is the transaction run?
It's really a daily sort of process, not a query-by-query process. In PostgreSQL, VACUUM without FULL does not take an exclusive lock, so it can run alongside normal reads and writes, but it still cannot be executed inside a transaction block.
If you're going to do it, then it should be outside the transaction, since it is independent of the transaction's data integrity.
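For the SQLite case the question asks about, the behavior can be checked with a short sketch using Python's sqlite3 module. SQLite refuses to VACUUM while a transaction is open on the connection, so the call has to come after the commit. The file name sync.db and the table items are hypothetical.

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sync.db")

# isolation_level=None means autocommit, so transactions are explicit.
conn = sqlite3.connect(path, isolation_level=None)
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, body TEXT)")

# Earlier data that the sync will remove.
conn.execute("BEGIN")
conn.executemany("INSERT INTO items (body) VALUES (?)",
                 [("x" * 1000,) for _ in range(2000)])
conn.execute("COMMIT")

# The sync transaction: VACUUM as its final statement is rejected.
conn.execute("BEGIN")
conn.execute("DELETE FROM items")
try:
    conn.execute("VACUUM")
    vacuum_in_txn_failed = False
except sqlite3.OperationalError:
    vacuum_in_txn_failed = True
conn.execute("COMMIT")

# After the commit the freed pages are still part of the file;
# VACUUM outside the transaction rebuilds the file without them.
size_before = os.path.getsize(path)
conn.execute("VACUUM")
size_after = os.path.getsize(path)
conn.close()
```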