I have created a materialized view from a table named "table1". The table "table1" gets updated with a huge amount of data every 5 days, and I need to refresh the materialized view 2 hours after table1 is changed. How do I achieve this? I found some answers involving cron jobs and pgAdmin. I tried using a trigger function, but I am stuck: how do I execute it a specific amount of time after the table is updated? Could someone help me out? A small rough example will be sufficient; I am kind of new to databases.
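One rough way to do this in PostgreSQL (a sketch, not a vetted answer; the view name my_matview, the bookkeeping table table1_mv_state, and the pg_cron extension are all assumptions, and a plain OS cron job running the same check through psql would work just as well): a trigger stamps the time of the last change, and a frequent polling job refreshes the view once that stamp is at least 2 hours old.
-- 1) One-row bookkeeping table: when did table1 last change, when was it last refreshed?
CREATE TABLE table1_mv_state (
    last_change  timestamptz,
    last_refresh timestamptz
);
INSERT INTO table1_mv_state VALUES (NULL, NULL);
-- 2) Statement-level trigger stamps every change to table1.
CREATE FUNCTION note_table1_change() RETURNS trigger
LANGUAGE plpgsql AS $$
BEGIN
    UPDATE table1_mv_state SET last_change = now();
    RETURN NULL;
END $$;
CREATE TRIGGER trg_table1_changed
AFTER INSERT OR UPDATE OR DELETE ON table1
FOR EACH STATEMENT EXECUTE FUNCTION note_table1_change();  -- EXECUTE PROCEDURE on PostgreSQL 10 and older
-- 3) Poll every 10 minutes; refresh once the last change is at least
--    2 hours old and has not yet been picked up by a refresh.
SELECT cron.schedule('refresh_table1_mv', '*/10 * * * *', $cmd$
DO $body$
BEGIN
    IF EXISTS (SELECT 1 FROM table1_mv_state
               WHERE last_change <= now() - interval '2 hours'
                 AND (last_refresh IS NULL OR last_refresh < last_change)) THEN
        REFRESH MATERIALIZED VIEW my_matview;
        UPDATE table1_mv_state SET last_refresh = now();
    END IF;
END $body$;
$cmd$);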
Related
I have a join query whose data can change during the day and from one date to the next (rows get deleted), so I want to preserve certain data by picking it up and saving it elsewhere the next day, keeping 3 months' worth.
Usually I would use a materialized view (for performance / to avoid touching the production tables) and refresh it every night or via logs, but the issue here is that I want to be able to ADD yesterday's new data without refreshing the whole mview (the deleted rows would then vanish from the mview too), and simply say: whatever is older than 3 months can be deleted.
How can I do this? Maybe I'm wrong to be thinking of an mview at all and the only way is with dbms_scheduler?
Use your own table, then. Schedule a job (using the dbms_scheduler you mentioned) which will (see the sketch after this list):
insert new rows (dated yesterday)
delete rows older than 3 months
Properly index it so that you can fetch the "archive" data faster than you could without an index. Don't forget to regularly gather statistics on both the table and its index(es).
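A rough sketch of such a job (Oracle; archive_table, your_join_view, and the snapshot_date column are illustrative names, and the nightly schedule is arbitrary):
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'archive_daily_job',
    job_type        => 'PLSQL_BLOCK',
    job_action      => q'[BEGIN
      -- insert yesterday's rows from the join into your own table
      INSERT INTO archive_table
      SELECT q.*, TRUNC(SYSDATE) - 1 AS snapshot_date
      FROM   your_join_view q;
      -- drop anything older than 3 months
      DELETE FROM archive_table
      WHERE  snapshot_date < ADD_MONTHS(TRUNC(SYSDATE), -3);
      COMMIT;
    END;]',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=3',   -- every night at 03:00
    enabled         => TRUE);
END;
/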
I currently have around 1000 tables, of which I need to track around 500 across various BigQuery datasets, and I want to generate a report or build a dashboard so that we can monitor the tables and act promptly if one is not refreshed.
Could someone please tell me how I can do that with minimal usage of BigQuery slots?
I think you should be able to query the last modification time as shown here:
https://cloud.google.com/bigquery/docs/dataset-metadata
You could then add a table with the max allowed time interval for a table to be updated and include that table in the query to create your own alerts.
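A hedged sketch of that idea (the __TABLES__ metadata view carries last_modified_time in epoch milliseconds; myproject, mydataset, and the monitoring.refresh_sla thresholds table are assumptions). Queries against metadata views like this don't scan the tracked tables themselves, so slot usage should be minimal:
SELECT t.table_id,
       TIMESTAMP_MILLIS(t.last_modified_time) AS last_modified,
       s.max_staleness_hours
FROM   `myproject.mydataset.__TABLES__` AS t
JOIN   `myproject.monitoring.refresh_sla` AS s
       ON s.table_id = t.table_id
WHERE  TIMESTAMP_MILLIS(t.last_modified_time)
       < TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL s.max_staleness_hours HOUR);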
There is a preview feature, INFORMATION_SCHEMA.PARTITIONS, that gives you the LAST_MODIFIED_TIME per table in a dataset:
select *
from yourDataset.INFORMATION_SCHEMA.PARTITIONS;
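If you only need one row per table, a hedged refinement of the same idea (column names as documented for the preview at the time of writing):
SELECT table_name,
       MAX(last_modified_time) AS last_modified
FROM   yourDataset.INFORMATION_SCHEMA.PARTITIONS
GROUP  BY table_name;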
Twice a day, I run a heavy query and save the results (40 MB worth of rows) to a table.
I truncate this results table before inserting the new results such that it only ever has the latest query's results in it.
The problem is that while the new results are being written, there is technically no data in the table and/or a lock on it. Anyone interacting with the site during that window could experience an interruption. I haven't experienced this yet, but I am looking to mitigate it in the future.
What is the best way to remedy this? Is it proper to write the new results to a table named results_pending, then drop the results table and rename results_pending to results?
Two methods come to mind. One is to swap partitions for the table. To be honest, I haven't done this in SQL Server, but it should work at a low level.
I would normally have all access go through a view. Then, I would create the new day's data in a separate table -- and change the view to point to the new table. The view change is close to "atomic". Well, actually, there is a small period of time when the view might not be available.
Then, at your leisure you can drop the old version of the table.
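A rough sketch of the view approach (T-SQL; dbo.results_new, the column list, and dbo.heavy_query_source are placeholders for your own objects):
-- Load the fresh batch into a table readers are not currently using.
TRUNCATE TABLE dbo.results_new;
INSERT INTO dbo.results_new (id, val)
SELECT id, val FROM dbo.heavy_query_source;  -- stand-in for the heavy query
GO
-- Repoint the view; ALTER VIEW must be the only statement in its batch.
ALTER VIEW dbo.results AS
SELECT id, val FROM dbo.results_new;
GO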
TRUNCATE is a DDL operation, which causes problems like this. If you are using snapshot isolation with row versioning and want users to see either the old or the new data, then use a single transaction to DELETE the old records and INSERT the new data.
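A minimal sketch of that single transaction (T-SQL, assuming READ_COMMITTED_SNAPSHOT or SNAPSHOT isolation is enabled on the database so readers see the last committed rows instead of blocking; names are placeholders):
BEGIN TRANSACTION;
    DELETE FROM dbo.results;                    -- DML, unlike TRUNCATE (DDL)
    INSERT INTO dbo.results (id, val)
    SELECT id, val FROM dbo.heavy_query_source; -- stand-in for the heavy query
COMMIT TRANSACTION;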
Another option if a lot of the data doesn't actually change is to UPDATE / INSERT / DELETE only those records that need it and leave unchanged records alone.
I need to replicate a table from an external db to an internal db for performance reasons. Several apps will use this local db to do joins and compare data. I only need to replicate every hour or so but if there is a performance solution, I would prefer to replicate every 5 to 10 minutes.
What would be the best way to replicate? The first thing that comes to mind is DROP and then CREATE:
DROP TABLE clonedTable;
CREATE TABLE clonedTable AS SELECT * from foo.extern#data.sourceTable;
There has to be a better way, right? Hopefully an atomic solution, to avoid the fraction of a second where the table doesn't exist but someone might try to query it.
The simplest possible solution would be a materialized view that is set to refresh every hour.
CREATE MATERIALIZED VIEW mv_cloned_table
REFRESH COMPLETE
START WITH sysdate + interval '1' minute
NEXT sysdate + interval '1' hour
AS
SELECT *
FROM foo.external_table#database_link;
This will delete all the data currently in mv_cloned_table, insert all the data from the table in the external database, and then schedule itself to run again an hour after it finishes (so it will actually be 1 hour + however long it takes to refresh between refreshes).
There are lots of ways to optimize this.
If the folks that own the source database are amenable to it, you can ask them to create a materialized view log on the source table. That would allow your materialized view to replicate just the changes, which should be much more efficient and would allow you to schedule refreshes much more frequently.
If you have the cooperation of the folks that own the source database, you could also use Streams instead of materialized views which would let you replicate the changes in near real time (a lag of a few seconds would be common). That also tends to be more efficient on the source system than maintaining the materialized view logs would be. But it tends to take more admin time to get everything working properly-- materialized views are much less flexible and less efficient but pretty easy to configure.
If you don't mind the table being empty during a refresh (it would exist, it would just have no data), you can do a non-atomic refresh on the materialized view, which does a TRUNCATE followed by a direct-path INSERT rather than a DELETE and a conventional-path INSERT. The former is much more efficient, but it means the table appears empty while you're doing joins and data comparisons on the local server, which seems unlikely to be appropriate in this situation.
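For reference, a non-atomic complete refresh is just a flag on a manual refresh call (a sketch; the mview name is the one from above):
BEGIN
  DBMS_MVIEW.REFRESH(list           => 'MV_CLONED_TABLE',
                     method         => 'C',     -- complete refresh
                     atomic_refresh => FALSE);  -- TRUNCATE + direct-path INSERT
END;
/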
If you want to go down the path of having the source side create a materialized view log so that you can do an incremental refresh, then, assuming the source table has a primary key, you'd ask them to
CREATE MATERIALIZED VIEW LOG ON foo.external_table
WITH PRIMARY KEY
INCLUDING NEW VALUES;
The materialized view that you would create would then be
CREATE MATERIALIZED VIEW mv_cloned_table
REFRESH FAST
START WITH sysdate + interval '1' minute
NEXT sysdate + interval '1' hour
WITH PRIMARY KEY
AS
SELECT *
FROM foo.external_table#database_link;
I have a table with an MVIEW log, and I would like to know whether it's suspicious to have:
SELECT count(*) FROM Table;        -- 8,036,132 rows
SELECT count(*) FROM MLOG$_Table;  -- 81,657,998 rows
I'm asking this question because I get an error when trying to refresh my MVIEW:
ORA-30036: unable to extend segment by 4 in undo tablespace 'UNDOTBS1'
Is there something that can be done other than extending the undo tablespace?
Thanks in advance
Yes, that is suspicious.
You need materialized view logs to be able to do a fast refresh. A fast refresh is really an incremental refresh: a refresh that applies only the most recent changes, to avoid having to do a complete refresh, which could be time-consuming. If your materialized view log contains 10 times as many rows as your original table, that defeats the purpose of a fast refresh.
I'd first look into why this materialized view log contains this many rows. If you can avoid that, then your other problem - the ORA-30036 - will likely disappear as well.
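A hedged sketch of that investigation (Oracle; requires access to the DBA views, and TABLE1 stands in for your master table). A common culprit is a registered materialized view that no longer refreshes, or was dropped without being unregistered, which pins rows in the log indefinitely:
-- Who is registered against the master table? A stale subscriber pins log rows.
SELECT owner, name, mview_site
FROM   dba_registered_mviews;
-- If an obsolete subscriber is the cause, purge the rows it pins.
-- num  => how many of the least recently refreshed mviews to purge for;
-- flag => 'delete' forces deletion even if it pushes those mviews
--         into a complete refresh next time.
BEGIN
  DBMS_MVIEW.PURGE_LOG(master => 'TABLE1', num => 9999, flag => 'delete');
END;
/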