Periodically Deleting Rows in TSQL - sql

I have an audit table set up which essentially mirrors one of my tables, along with a date, user and command type. Here's what it might look like:
AuditID  UserID  Individual    modtype  user  audit_performed
1        1239    Day Meff      INSERT   dbo   2010-11-04 14:50:56.357
2        2334    Dasdf fdlla   INSERT   dbo   2010-11-04 14:51:07.980
3        3324    Dasdf fdla    DELETE   dbo   2010-11-04 14:51:11.130
4        5009    Day Meffasdf  UPDATE   dbo   2010-11-04 14:51:12.777
Since these types of tables can get big pretty quickly, I was thinking of putting in some sort of automatic delete of the older rows. For example, if I have 3 months of history, I'd like to delete the oldest month while retaining the last two. And all of this must be automatic - I imagine that once a certain date is hit, a query activates and deletes the oldest month of audit data. What is the best way to do this?
I'm using SQL Server 2005 by the way.

A SQL agent job should be fine here. You definitely don't need to do this on every single insert with a trigger. I doubt you even need to do it every day. You could schedule a job that runs once a month and clears out anything older than 2 months (so at most you'd have 3 months of data minus 1 day at any given time).
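A minimal sketch of the statement such a job step could run (dbo.AuditLog and audit_performed are stand-ins for the table and timestamp column from the question; the schedule itself is configured in SQL Server Agent):
-- Run monthly from a SQL Server Agent job step.
-- dbo.AuditLog is a placeholder name for the audit table.
DELETE FROM dbo.AuditLog
WHERE audit_performed < DATEADD(MONTH, -2, GETDATE());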

You could use SQL Server Agent: you can schedule a repeating job, such as deleting entries from the current audit table after a certain period. Here is how you would do it.
I would recommend storing the data in another table, an audit_archive table, and deleting it from the current audit table. That way, in case you want some history you still have it, and your table also doesn't get too big.
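A rough sketch of that archive-then-delete step, assuming audit_archive has the same columns as the audit table and no identity column of its own (dbo.AuditLog is again a stand-in name):
BEGIN TRANSACTION;

-- move anything older than 2 months into the archive...
INSERT INTO dbo.audit_archive
SELECT * FROM dbo.AuditLog
WHERE audit_performed < DATEADD(MONTH, -2, GETDATE());

-- ...then remove it from the live audit table
DELETE FROM dbo.AuditLog
WHERE audit_performed < DATEADD(MONTH, -2, GETDATE());

COMMIT;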

You could try a trigger: every time a row is added, it would clear anything older than 3 months.
You could also use SQL Agent to run a script every day that does the same thing.

Have you looked at using triggers? You could define a trigger to run when you add a row (on INSERT) that deletes any rows that are more than three months old.
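If you do go the trigger route, here is a sketch of what it could look like (again using dbo.AuditLog and audit_performed as assumed names; the Agent-job answers above avoid paying this cost on every insert):
-- Hypothetical purge trigger; fires after each insert into the audit table.
CREATE TRIGGER trg_AuditLog_purge
ON dbo.AuditLog
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;
    DELETE FROM dbo.AuditLog
    WHERE audit_performed < DATEADD(MONTH, -3, GETDATE());
END;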

Related

Oracle: Data consistency across multiple tables to be displayed

I have 3 reports based on 3 different tables, which ideally should match each other in the audit.
They are updated sequentially once a day.
The problem here is that when one of the tables has been updated and the second one is still in progress, the customer sees a data discrepancy between the reports for some time.
We tried the solution where we commit only after all 3 tables are updated, but we started having issues with the undo tablespace. The application has many other things running as well.
I am looking for a solution where we can restrict the data shown to the user to a specific point in time, so that he sees updated data only after all 3 tables have been refreshed/updated.
I think you can use SELECT * FOR UPDATE on all 3 tables before starting the update procedure.
In that case users can still select data, and they will see only the unchanged data until the update session finishes and commits.
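A sketch of what that suggestion looks like (the report table names are placeholders):
-- Lock all three report tables, run the updates in one transaction, then commit;
-- readers keep seeing the previously committed data until the COMMIT.
SELECT * FROM report_table1 FOR UPDATE;
SELECT * FROM report_table2 FOR UPDATE;
SELECT * FROM report_table3 FOR UPDATE;

-- ... run the three update steps here ...

COMMIT;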
You can use a flashback query to show data as-of a point in time:
select * from table1 as of timestamp timestamp '2021-12-10 12:00:00';
The application would need to determine the latest time when the tables were synchronized - perhaps with a log table that records when the update process last started. However, the flashback query also uses the UNDO tablespace, but it should at least use less UNDO, since some of the committed transactions will by then have freed up some space.
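A sketch of how the application might tie these together, assuming a hypothetical refresh_log table whose sync_time column records the last point at which all 3 tables were known to be consistent:
-- refresh_log / sync_time are invented names written by the refresh process.
DECLARE
  v_as_of TIMESTAMP;
BEGIN
  SELECT MAX(sync_time) INTO v_as_of FROM refresh_log;

  -- every report query then reads its table as of the same point in time
  FOR rec IN (SELECT * FROM table1 AS OF TIMESTAMP v_as_of) LOOP
    NULL;  -- render / process the row
  END LOOP;
END;
/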

SQL Server : executing a trigger AFTER the data is committed. Is it possible?

I might be fighting a losing battle here, but after many attempts, I thought I should ask.
We have a 'use' table that data gets written to from various different sources and in different ways.
We have a procedure that looks at this table and, based on that data, calculates various metrics and inserts them into another table; this is run overnight.
I'd like to be able to keep this data table up-to-date as the use table changes, rather than just regenerate overnight.
I wrote a trigger for the use table that, based on the users being updated, ran the procedure just for those users and updated their data.
The problem, of course, is this data is calculated by looking at all records for that user in the use table and as the trigger fires during the insert/transaction, the calculation doesn't include the data being inserted.
I then thought I could change the trigger to insert the userids into another table and run a trigger on that table, so it does the calculations then, thinking it was another transaction. I was wrong.
Currently, the only working solution I have is to insert userids into this 2nd table and have a SQL job running every 10 seconds, waiting for userids to appear in the table and then running the sync proc from there. But it's not ideal, as there are a lot of DBs this would need to run through and it would be awkward to maintain.
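For reference, a stripped-down sketch of that workaround (tblUseQueue and usp_SyncUserMetrics are invented names standing in for the queue table and the existing sync proc):
-- Trigger on the use table just records which users changed.
CREATE TRIGGER trg_tblUse_queue
ON dbo.tblUse
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    INSERT INTO dbo.tblUseQueue (userid)
    SELECT userid FROM inserted
    UNION
    SELECT userid FROM deleted;
END;

-- SQL Agent job step, scheduled every 10 seconds, drains the queue:
DECLARE @userid INT;
SELECT TOP (1) @userid = userid FROM dbo.tblUseQueue;
WHILE @userid IS NOT NULL
BEGIN
    EXEC dbo.usp_SyncUserMetrics @userid;   -- existing sync proc (name assumed)
    DELETE FROM dbo.tblUseQueue WHERE userid = @userid;
    SET @userid = NULL;
    SELECT TOP (1) @userid = userid FROM dbo.tblUseQueue;
END;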
Ideally I'd like to be able to run the trigger after the commit has taken place so all the records are available. I've looked at AFTER INSERT but even that runs before the commit, so I'm unsure as to what use it even is.
Anyone have any ideas, or am I trying to do something that is just not possible?
Really basic example of what I mean:
tblUse
userid, activityid
1, 300
1, 301
2, 300
3, 303
So userid 1 currently has 2 records in the table
userid 1 performs an activity and a record gets added.
So after the transaction has been committed, userid 1 will have 3 records in the table.
The problem I have is if I were to count the records at the point the trigger is run I'll still only get 2 as the 3rd hasn't committed yet.
I also can't just move the calculations into the trigger and union the INSERTED table data as it's a very big and complicated script that joins on multiple tables.
Thank you for any ideas.

Oracle materialized view add new data but without update

I have a join query and the data can change during the day and from day to day (rows get deleted), so I want to keep certain data by picking it up the next day and saving it elsewhere for 3 months.
Usually, I would use a materialized view (for performance / to avoid touching the production tables) and refresh it every night or fast-refresh it on materialized view logs. The issue here is that I want to be able to ADD the new data from yesterday without refreshing the whole mview (the deleted rows would then disappear from the mview as well), and say: whatever is older than 3 months can be deleted.
How can I do this? Maybe I'm totally wrong to be thinking about an mview and the only way is with dbms_scheduler?
Use your own table, then. Schedule a job (using the dbms_scheduler you mentioned) which will:
insert new rows (dated yesterday)
delete rows older than 3 months
Properly index it so that you'd be able to fetch "archive" data faster than without an index. Don't forget to regularly gather statistics on both table and index(es).
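A rough sketch of such a job, assuming an archive table named my_archive, a prod_join view standing in for the join on the production tables, and a data_date column (all invented names):
BEGIN
  DBMS_SCHEDULER.CREATE_JOB(
    job_name        => 'ARCHIVE_DAILY_JOB',
    job_type        => 'PLSQL_BLOCK',
    job_action      => q'[
      BEGIN
        -- 1) insert yesterday's rows from the production join
        INSERT INTO my_archive
        SELECT * FROM prod_join
        WHERE  data_date = TRUNC(SYSDATE) - 1;
        -- 2) purge anything older than 3 months
        DELETE FROM my_archive
        WHERE  data_date < ADD_MONTHS(TRUNC(SYSDATE), -3);
        COMMIT;
      END;]',
    start_date      => SYSTIMESTAMP,
    repeat_interval => 'FREQ=DAILY; BYHOUR=1',
    enabled         => TRUE);
END;
/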

Cannot find last modification date for a table in Oracle SQL

I have two tables in my database, and I am trying to get the last DML (insert, update or delete) timestamp using "SCN_TO_TIMESTAMP(MAX(ora_rowscn))" and "dba_tab_modifications" on an Oracle 12c database.
Following is the information for the two tables:
Table Name | Create Date | Last DML (as given from user) | SCN_TO_TIMESTAMP(MAX(ora_rowscn))
-----------+-------------+-------------------------------+----------------------------------
Table1     | 25 SEP 2017 | 13 OCT 2020                   | ORA-08181: specified number is not a valid system change number
           |             |                               | ORA-06512: at "SYS.SCN_TO_TIMESTAMP"
Table2     | 30 JAN 2017 | 29 OCT 2020                   |
Following is the result:
Table Name | SCN_TO_TIMESTAMP(MAX(ora_rowscn))    | dba/all_tab_modifications
-----------+--------------------------------------+--------------------------
Table1     | ORA-08181: specified number is not a | NULL (0 row returned)
           | valid system change number           |
           | ORA-06512: at "SYS.SCN_TO_TIMESTAMP" |
Table2     | 29/OCT/20 03:40:15.000000000 AM      | 29/OCT/20 03:50:52
Earliest date from dba/all_tab_modifications:
02/OCT/18 22:00:02
Can anyone shed some light on why I am not able to get the last DML for Table1, but I am able to get it for Table2?
I was thinking of executing "DBMS_STATS.FLUSH_DATABASE_MONITORING_INFO" as advised in other blogs. However, my thinking is that if the DML for the second table has been monitored, it should have already been flushed.
Both tables are updated inside different stored procedures under the same user ID.
Can anyone give me an idea of how I can get the last DML for the first table? Thanks in advance!
Realistically, if you need this information, you need to store it in the table, use auditing, or do something else to capture changes (i.e. triggers that populate a table of modifications).
max(ORA_ROWSCN) will work to give you the last SCN of a modification (note that by default, this is stored at the block level not at the row level, so rows with the max(ora_rowscn) aren't necessarily the most recently modified). But Oracle only maintains the mapping of SCN to timestamp for a limited period of time. In the documentation, Oracle guarantees it will maintain the mapping for 120 hours (5 days). If the last modification was more than a few days ago, scn_to_timestamp will no longer work. If your system has a relatively constant rate of SCN generation, you could try to build your own function to generate approximate timestamps but that could produce significant inaccuracies.
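For reference, the kind of query being discussed, along with the row-level caveat mentioned above:
-- Works only while MAX(ora_rowscn) still falls inside Oracle's SCN-to-timestamp
-- mapping window (roughly the last few days); otherwise ORA-08181 is raised.
SELECT SCN_TO_TIMESTAMP(MAX(ora_rowscn)) AS last_change
FROM   table1;

-- ORA_ROWSCN is block-level unless the table was created with ROWDEPENDENCIES:
-- CREATE TABLE t (...) ROWDEPENDENCIES;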
dba_tab_modifications is used by the optimizer to identify tables that need new stats gathered so that data is even more transient. If you have statistics gathering enabled every night, you'd expect that information about some tables would get removed every night depending on which tables had fresh statistics gathered. Plus, the timestamp isn't intended to accurately identify the time the underlying table was modified but the time that Oracle wrote the monitoring information.
If this is something you need going forward, you could
Add a timestamp to the table that gets populated when a row is modified.
Add some logging to the stored procedures that lets you identify when tables were modified.
Put a trigger on the table that logs modifications in whatever form is useful to you.
Use Oracle's built-in auditing to capture DML affecting the table.
If you're really determined, assuming that the database is in archivelog mode and that you have all the archived log files since each table was last modified, you could use LogMiner to read through each archived log and find the timestamp of the last modification. But that will be exceedingly slow and depends on your backup strategy allowing you to recover old log files back to the last change.
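For illustration, a minimal sketch of the first and third options combined: a last-modified column kept current by a row trigger, plus a side log for deletes (all names invented):
-- Option 1: stamp every inserted/updated row with a timestamp.
ALTER TABLE table1 ADD (last_dml_ts TIMESTAMP);

CREATE OR REPLACE TRIGGER trg_table1_last_dml
BEFORE INSERT OR UPDATE ON table1
FOR EACH ROW
BEGIN
  :NEW.last_dml_ts := SYSTIMESTAMP;
END;
/

-- Option 3: deletes can't stamp the deleted row itself, so log them separately.
CREATE TABLE table1_dml_log (dml_type VARCHAR2(10), dml_ts TIMESTAMP);

CREATE OR REPLACE TRIGGER trg_table1_delete_log
AFTER DELETE ON table1
BEGIN
  INSERT INTO table1_dml_log (dml_type, dml_ts) VALUES ('DELETE', SYSTIMESTAMP);
END;
/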

Saving delete information to temp tables using into

Is there a way of using INTO with the DELETE keyword to save the deleted rows into a temp table in SQL Server?
Without using a trigger, the OUTPUT clause lets you do it
-- create if needed
SELECT * INTO #KeepItSafe FROM TheTable WHERE 0 = 1;
--redirect the deleted rows to the temp table using OUTPUT
DELETE TheTable OUTPUT DELETED.* INTO #KeepItSafe WHERE ...;
I guess my main question is what are you trying to do?
1 - Save a copy of the data before deleting?
Both the SELECT/INTO followed by DELETE or DELETE OUTPUT patterns will work fine.
But why keep the data??
2 - If you are looking to audit the table (inserts, updates, deletes), look at my how to prevent unwanted transactions slide deck w/code - http://craftydba.com/?page_id=880.
The audit table can hold information from multiple tables since the data is saved as XML. Therefore, you can un-delete if necessary. It tracks who and what made the change.
3 - If it is a one time delete, I use the SELECT INTO followed by the DELETE. Wrap the DELETE in a TRANSACTION. Check the copied data in the saved table. That way, unexpected deletions can be rolled back.
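A sketch of that one-time pattern (the backup table name and the predicate are placeholders):
BEGIN TRANSACTION;

-- keep a copy of exactly what the delete will remove
SELECT * INTO dbo.TheTable_deleted_copy
FROM dbo.TheTable
WHERE SomeColumn = 'some value';

DELETE FROM dbo.TheTable
WHERE SomeColumn = 'some value';

-- inspect dbo.TheTable_deleted_copy here, then:
COMMIT;   -- or ROLLBACK if the wrong rows were removed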
4 - Last but not least, if you are never going to purge the data from the audit table or the copy table, why not mark the row as deleted but keep it forever?
Many systems like PeopleSoft use effective dating to show whether a record is no longer active. In the BI world this is called a type 2 dimension table (slowly changing dimensions).
See the Data Warehouse Institute article:
http://www.bidw.org/datawarehousing/scd-type-2/
Each record has a begin and end date. All active records have an end date of NULL.
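A tiny sketch of that effective-dating idea (column names and the key value are only examples):
-- Add an effective date range; "deleting" a row just closes its range.
ALTER TABLE dbo.TheTable ADD begin_date DATETIME, end_date DATETIME;

-- Soft delete instead of DELETE:
UPDATE dbo.TheTable
SET end_date = GETDATE()
WHERE SomeKey = 42        -- the row being retired
  AND end_date IS NULL;   -- only the currently active version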
Again, all the above solutions work.
The main question is what is the business requirements?
Sincerely
John
The Crafty Dba