I have an SQLite3 database, with an associated WAL file.
Is there a way, possibly a PRAGMA statement, that will tell SQLite to IGNORE the WAL when reading the database?
The issue here is that the WAL contains a different page layout for one table, so if the database is read with the WAL file present, a row is missing. Reading the DB without the WAL shows the missing row.
So ultimately, I need a way to either disable the WAL temporarily whilst I check what is missing, or to view all of the changes present within the WAL.
I hope that makes sense!
Related
To my understanding, a database can postpone writing to the table files to boost I/O performance. When a transaction is COMMITted, the data are written to the WAL files.
I'm curious how long the writing to the table files can be delayed. In particular, when I use a simple SELECT, e.g.
SELECT * from myTable;
after a COMMIT, is it possible that the database has to retrieve data from the WAL files in addition to the table files?
The documentation talks about being able to postpone the flushing of data pages:
If we follow this procedure, we do not need to flush data pages to disk on every transaction commit.
What WAL files allow an RDBMS to do is keep "dirty" data pages in memory and flush them to disk at a later time. It does not mean that the data pages themselves are modified at a later time.
So the answer to your question is "No, a SELECT always retrieves data from the data pages, not from the WAL files".
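In PostgreSQL, for instance, that deferred flush can even be forced by hand; this one-liner is only an illustration that flushing the data pages is a separate step from serving reads:

CHECKPOINT;   -- forces all dirty data pages held in shared memory to be written out to the data files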
PostgreSQL does not read from WAL for this purpose during normal operations. It would only read WAL in order to apply changes to the data files after a crash (or during replication onto another server).
When data from ordinary data files is changed, the pages of the data files are kept in shared memory (in shared_buffers) until they are written to disk. Any other processes wanting to see that data will find it in that shared memory, and it will be in its changed form. They will always look for it in shared_buffers before they try to read it from disk, so no one will ever see the stale on-disk version of the data, except for the recovery process after a crash.
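If you want to see this for yourself, here is a hedged illustration (it assumes a reasonably recent PostgreSQL with the pg_buffercache contrib extension installed): it lists which relations currently have changed ("dirty") pages sitting in shared_buffers waiting to be flushed.

CREATE EXTENSION IF NOT EXISTS pg_buffercache;

SELECT c.relname,
       count(*) AS buffered_pages,
       sum(CASE WHEN b.isdirty THEN 1 ELSE 0 END) AS dirty_pages
FROM   pg_buffercache b
JOIN   pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
GROUP  BY c.relname
ORDER  BY dirty_pages DESC
LIMIT  10;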
I am interested to know how the isolation level READ COMMITTED is implemented in Oracle. I already know that the DB makes records in the REDO log, but so far I think the REDO log is only used to replay a transaction if an unexpected crash happens during that transaction. I also know that DBWR writes the "dirty" (changed) blocks every time the REDO log file fills up.
My question is: if DBWR writes the dirty blocks to disk, how is the READ COMMITTED isolation level provided? During the write, does DBWR write the data directly to the data files, or to some special "place" on disk that is visible to the current transaction and invisible to other transactions, so that after the COMMIT this "place" becomes visible and that's all? How does this work in reality? Sorry for my bad English.
In addition to the REDO log, you also have the UNDO tablespace.
When data is updated, the old value is stored in the UNDO tablespace. When Oracle sees that you would otherwise be reading uncommitted data for a record, it reconstructs the old, committed value from there.
UNDO is also used during database recovery: in addition to re-applying committed writes that had not made it to the database files before the crash, the opposite can also take place: rolling back uncommitted changes that did reach the database files before the crash.
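A hedged two-session sketch of what this means in practice (the classic EMP table is just an assumed example); under READ COMMITTED, the second session's query is served a consistent image rebuilt from UNDO rather than the uncommitted block contents:

-- Session A
UPDATE emp SET sal = sal * 2 WHERE empno = 7369;   -- not yet committed

-- Session B (READ COMMITTED is the Oracle default)
SELECT sal FROM emp WHERE empno = 7369;            -- still sees the old value, rebuilt from UNDO

-- Session A
COMMIT;

-- Session B
SELECT sal FROM emp WHERE empno = 7369;            -- now sees the new, committed value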
I'd like to dump a MySQL database in such a way that a file is created for the definition of each table, and another file is created for the data in each table. I'd like this to be done in a way that guarantees database integrity by locking the entire database for the duration of the dump. What is the best way to do this? Similarly, what's the best way to lock the database while restoring a set of these dump files?
edit
I can't assume that mysql will have permission to write to files.
If you are using InnoDB tables and therefore have transaction support, you don't need to lock the database at all; you can just add the --single-transaction command-line option. This gives you a consistent snapshot without locking anything, by using the transaction mechanism.
If you don't have transaction support, you can get what you describe with the --lock-tables command-line option, which will lock the tables for the duration of the dump.
It's also worth noting that READ LOCKs aren't quite as good as they sound, since any write operation will block, and that blocked write will in turn block subsequent read operations; see this article for more details.
Edit:
I don't think it's possible to dump the definitions and data separately without file-write permission and the --tab option. I think your only option is to roll your own mysqldump-like script that uses one of the transaction or READ LOCK mechanisms that mysqldump uses. Failing that, it may be easier to just post-process a single dump file into the format you want.
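If you do end up rolling your own, here is a minimal sketch of the snapshot mechanism that mysqldump's --single-transaction relies on (InnoDB only; the table name is just an example). Your script would run the SHOW CREATE TABLE and the SELECT for each table and write the results to separate files on the client side:

SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ;
START TRANSACTION WITH CONSISTENT SNAPSHOT;

SHOW CREATE TABLE myTable;   -- table definition, written to one file by your script
SELECT * FROM myTable;       -- table data, written to another file by your script

COMMIT;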
Story: today one of our customers asked us whether the data he had deleted in the program was truly unrecoverable.
Aside from scheduled backups, we shrink the log file once a day, and we use the DELETE command to remove records from our tables where needed.
Still, just for the sake of it, I opened the .mdf file with an editor (I used PSPad) and searched for a particular unique piece of data that I was sure was inside one of the tables.
Problem: I found it in the file, then executed the DELETE command, and it was still there.
Question:
Is there a particular command we are not aware of to delete the records physically from the disk?
Note: we know there are particular techniques to recover lost data from the hard drives, but here I am talking about a notepad-wannabe!
The text may still be there, but SQL Server has no concept of that data having any structure or being available.
The "freed space" is simply deallocated: not removed, compacted or zeroed.
The "Instant File Initialization" feature relies on this too (not zeroing the entire MDF file) and previous disk data is still available eben for a brand new database:
Because the deleted disk content is overwritten only as new data is written to the files, the deleted content might be accessed by an unauthorized principal.
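If you want to convince yourself of this from inside SQL Server, here is a hedged sketch (the database/table names and the page number are assumptions, and DBCC IND / DBCC PAGE are undocumented commands) that dumps a raw page, deleted bytes included:

DBCC TRACEON (3604);                          -- send DBCC output to the client instead of the error log
DBCC IND ('MyDatabase', 'dbo.MyTable', -1);   -- list the pages that belong to the table
DBCC PAGE ('MyDatabase', 1, 312, 2);          -- hex-dump page 312 of file 1 (page id assumed)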
Edit: To reclaim space:
ALTER INDEX ... REBUILD is the best way.
DBCC SHRINKFILE using NOTRUNCATE can compact pages into the gaps left by deallocated pages, but won't reclaim space within a page for deleted rows.
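A hedged sketch of both options (the table name and the logical data file name are assumptions):

ALTER INDEX ALL ON dbo.MyTable REBUILD;            -- rewrites the index pages, leaving no deleted-row slack behind
DBCC SHRINKFILE (N'MyDatabase_Data', NOTRUNCATE);  -- compacts allocated pages toward the front of the file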
SQL Server just marks the space of deleted rows as available, but does not reorganize the database and does not zero out the freed up space. Try to "Shrink" the database, and the deleted rows should no longer be found.
Thanks, gbn, for your correction. A page is the allocation unit of the database, and shrinking a database only eliminates pages, but does not compact them. You'd have to delete all rows in a page in order to see them disappear after shrinking.
If your client is concerned about data security, they should use Transparent Data Encryption. Even if you obliterate the information from the table, the record is still in the log. Even when the log is recycled, the info is still in the backups.
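For reference, a hedged sketch of enabling TDE (this assumes SQL Server 2008 or later in an edition that supports it; the key, certificate and database names are placeholders):

USE master;
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password here>';
CREATE CERTIFICATE TdeCert WITH SUBJECT = 'TDE certificate';

USE MyDatabase;
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_256
    ENCRYPTION BY SERVER CERTIFICATE TdeCert;

ALTER DATABASE MyDatabase SET ENCRYPTION ON;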
You could update the record with dummy values before issuing the delete, thereby overwriting the data on disk before the database marks it as free. (Whether this also works with LOB fields would warrant investigation, though).
And of course, you'd still have the problem of logs and backups, but I take it you already solved those.
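A hedged sketch of that idea (the table and column names are assumptions): overwrite the values first, then delete, so the page no longer holds the original data.

UPDATE dbo.Customers
SET    Name  = REPLICATE('X', LEN(Name)),
       Email = REPLICATE('X', LEN(Email))
WHERE  CustomerId = 42;

DELETE FROM dbo.Customers
WHERE  CustomerId = 42;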
For compliance reasons, when I delete a user's personal information from the database in my current project, the relevant rows need to be really, irrecoverably deleted.
The database we are using is Postgres 8.x.
Is there anything I can do, beyond running COMPACT/VACUUM regularly?
Thankfully, our backups will be held by others, and they are allowed to keep the deleted information.
"Irrecoverable deletion" is harder than it sounds, and extends beyond your database. For example, are you planning on going back to all previous instances of your database on tape/backup where this row also exists, and deleting it there too?
Consider a regular deletion and the periodic VACUUMing that you mentioned before.
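A minimal sketch of that approach (the table and column names are assumptions). Note that plain VACUUM only marks the dead row versions as reusable; VACUUM FULL actually rewrites the table file, at the cost of an exclusive lock:

DELETE FROM users WHERE user_id = 42;
VACUUM users;          -- makes the dead row versions reusable
-- VACUUM FULL users;  -- rewrites the table file entirely (exclusive lock)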
To accomplish the "D" in ACID, relational databases use a transaction log type system for changes to the database. When a delete is made that delete is made to a memory copy of the data (buffer cache) and then written to a transaction log file in synchronous mode. If the database were to crash the transaction log would be replayed to bring the system back to the correct state. So a delete exists in multiple locations where it would have to be removed. Only at some later time is the record "deleted" from the actual data file on disk (and any indexes). This amount of time varies depending on the database.
Do you back up your database? If yes, make sure you delete it from the backups too.
Is that because of a security risk? In that case, I'd change the data in the row and then delete the row.
Perhaps I'm off on a tangent, but do you really want to delete users like that? Most identity & access management approaches recommend keeping users around but in a flagged-as-deleted state, so as not to lose auditing ability (what has this user been up to in the previous five years?).
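A hypothetical soft-delete sketch (the column names are assumptions): the row stays for auditing, but the application treats the user as gone.

UPDATE users
SET    is_deleted = TRUE,
       deleted_at = now()
WHERE  user_id = 42;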
Deleting user information might be needed for legitimate compliance reasons, or for nefarious black-hat purposes. In neither case is there a deletion method that guarantees no traces of the user's existence are left, as has been noted in other posts.
Perhaps you should elaborate as to why such an irrevocable delete is desirable...?
This is not something that you can do on the software side. It's a hardware issue: to really delete it, you need to physically destroy the drive.
How about overwriting the record with random characters/dates/numbers etc?
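A hypothetical sketch of that suggestion (the table and column names are assumptions): overwrite the sensitive columns first, so the original values are not what lingers in the dead tuples, then delete.

UPDATE users
SET    name  = md5(random()::text),
       email = md5(random()::text)
WHERE  user_id = 42;

DELETE FROM users WHERE user_id = 42;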