Implications of overwriting a corrupted database file - vba

We have a split front-end/back-end Access database that frequently corrupts (for various reasons: bad architecture, bad code, too many users, slow network and so on; we're currently rewriting it in SQL Server). Usually when this happens, admins email all staff and ask them to exit the front end and any other files that link to the back end (e.g. some Excel reports have connections to it) so we can open the DB and have it auto-compact/repair when it detects it's in a corrupted state.
Getting users out of the system is like herding cats, and we're not always able to do it in good time. I've implemented a form timer event that checks a third DB for a flag indicating whether the front end should remain open; the idea is that we set the flag to false when we need the front ends closed. This seems to be effective, but I can't say for sure that it works on 100% of installs, as sometimes we still find that the file is locked. That may be because of the Excel reports, though these are viewed rarely.
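A minimal sketch of the kind of timer check described above, assuming the flag lives in a Yes/No field AllowOpen in a table tblAppControl linked from the control database, and that the form's TimerInterval is set (e.g. to 60000 for one minute); all names here are illustrative only.

    Private Sub Form_Timer()
        Dim rs As DAO.Recordset
        Dim allowOpen As Boolean
        Set rs = CurrentDb.OpenRecordset( _
            "SELECT AllowOpen FROM tblAppControl", dbOpenSnapshot)
        allowOpen = True
        If Not rs.EOF Then allowOpen = (Nz(rs!AllowOpen, True) = True)
        rs.Close
        If Not allowOpen Then
            ' Shut the front end down so its hold on the back end is released.
            DoCmd.Quit acQuitSaveNone
        End If
    End Sub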
Lately, rather than waiting for people to exit, I've been making a copy of the corrupted DB before opening it, repairing the copy, and then overwriting the original with the repaired copy when the repair is finished. This seems to work well.
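A hedged sketch of what that copy-repair-overwrite routine might look like; the paths are placeholders and error handling is omitted.

    Public Sub RepairBackEndViaCopy()
        Const BACKEND As String = "\\server\share\BackEnd.accdb"     ' assumed location
        Const WORKCOPY As String = "C:\Temp\BackEnd_Copy.accdb"
        Const REPAIRED As String = "C:\Temp\BackEnd_Repaired.accdb"

        ' Clear out stale temp files; CompactRepair needs a destination that doesn't exist yet.
        If Dir(WORKCOPY) <> "" Then Kill WORKCOPY
        If Dir(REPAIRED) <> "" Then Kill REPAIRED

        FileCopy BACKEND, WORKCOPY                    ' snapshot the corrupted back end
        If Application.CompactRepair(WORKCOPY, REPAIRED, True) Then
            FileCopy REPAIRED, BACKEND                ' overwrite the original with the repaired copy
        End If
    End Sub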
My question is: what are the issues, if any, around overwriting the back end? Could it cause any problems that aren't immediately apparent? I've been doing this for a few weeks now and haven't noticed any issues, but it just feels like bad practice. For example, what happens to the lock file? Does that get updated automatically?

Not much, because the worst case already happened.
When you copy an open Access database, there's a risk that open transactions and in-flight writes get caught half-finished: they may not be committed, may corrupt the database, or may trash the VB project part of the database.
But the file is already corrupted, and if a transaction is still open when the database is closed, you will get an error message (which is also a plausible reason why your form timer approach doesn't always work).
I don't have statistics, but I suspect that closing a corrupt database, which writes its pending transactions into it, is more dangerous than copying it with transactions still open, since those writes might overwrite things they shouldn't.
Of course, never do this when your database is healthy, since copying a file while writes are in progress can itself cause corruption.
And if you have intermittent corruption, the real issue should be preventing that from occurring; the bug Gord Thompson referred to in a comment (this one) is very common and likely the culprit. The copy-and-repair approach can go fine 20 times in a row, until it goes wrong once and you have to revert to a backup, possibly losing data (or worse, you have no backup and lose much more data).

Related

Split MS Access Database takes a long time to open - back end keeps locking and unlocking

I have a split MS Access database. Most of the data is populated through SQL queries run through VBA. When I first connect to the back-end data, it takes a long time and the back end file (.accdc file) locks and unlocks 3 or 4 times. It's not the same number of locks every time, but the locking and unlocking coincides with the long open time. Opening the front end itself is quick, because it doesn't connect to the back end at that point; it's the first connection to the back end that can take a while.
Any suggestions on things to look into to speed this up and make it happen more reliably on the first try? This is a multi-user file, and I'd rather not make any changes to the registry, since that would require making the same update for everyone in my department. I'm mostly concerned about how long it takes to open, but the locking and unlocking seemed peculiar and might be contributing to the delay, or be a symptom of something else going on.
In most cases if you use a persistent connection, then the slow process you note only occurs once at startup.
This and some other performance tips can be found here:
http://www.fmsinc.com/MicrosoftAccess/Performance/LinkedDatabase.html
Nine times out of ten, the above will fix the "delays" when running the application. To test this, simply open any linked table, minimize it, and then try running your code or your startup form - note how the delays are gone.
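For reference, a minimal sketch of what forcing a persistent connection can look like in VBA, assuming the back end sits at the UNC path shown; call it once from your startup form.

    Public gPersistDb As DAO.Database

    Public Sub OpenPersistentConnection()
        ' Keeping this Database object open for the life of the application means
        ' the back end's lock file is set up once instead of on every access.
        Set gPersistDb = DBEngine.OpenDatabase("\\server\share\BackEnd.accdb", False, False)
    End Sub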

Locking Metakit Database in TCL

What is the preferred way to lock a Metakit database from TCL?
Basically I have an application that reads/writes from a Metakit database file, and I'm worried that if the user has two instances of my application running, they could corrupt the database (by doing two writes at the same time).
I know I could use sockets to communicate between instances, but I'd rather not as that could conflict with existing software on the PC. I also thought about using a lock file, but if the process crashes the database would be permanently locked. I know on UNIX it's common to write the PID to a lock file, but I don't know how to tell if a process is still running in a cross platform way. My primary target is Windows.
I'm not totally opposed to adding some native code (compiled C binary), but thought there might be a better pure-TCL way first.
Thanks!
Using a lock file is not so unusual, even though a crash can leave the lock behind and make the database hard to unlock. There are some simple workarounds for this issue:
Place the lock file somewhere that gets cleaned up after a reboot, such as /tmp on Unix.
If the application starts and finds the lock still in place, tell the user what's going on and suggest how to fix it: offer to delete the lock file (after issuing a sufficient warning), or tell the user where it is so they can take the risk of deleting it themselves.
The description on the Metakit page about commits says that there are a number of access modes that can be used to allow multiple readers concurrent with a single writer (probably using locking under the hood). Standard Metakit is reasonably careful about not leaving its files in an inconsistent state, so I expect it handles that side of things fairly well. What I don't know is how the features discussed on that page are exposed to Tcl scripts.

MS Access 2007 Performance Issues

I am building a database in Access 2007, and we don't even have any data yet, but the database is constantly freezing. I used the built-in performance checker and it said everything was fine, but I am worried that the database will be unusably slow if I don't fix it soon.
Here is why I think it may be slow:
We have 300+ queries saved in the database, all of which need to run weekly.
We have 4 main reports and a sub report for nearly all of the queries above. Why? Because the 4 main reports need information from all of the queries, and we are using sub reports as the source.
A few of our queries are pulling information from at least 15 other sub queries.
Other than this, I don't know why it could be slow, unless it's just my computer. Could someone please give me some insight into what might be wrong, how I might improve our database's performance, and whether this number of queries and sub reports is abnormally high?
Thanks,
Links to tables on a network share, or even a default printer that lives on the network, can cause many delays. One often-used solution is to force a persistent connection to stay open. During development you can simply open any linked table in the front end (one that is linked to the back end) and then minimize it. This will often solve those delays. A list of other things to check can be found here:
http://www.granite.ab.ca/access/performancefaq.htm
If the above persistent connection works, you also want to ensure that your startup code opens a connection to the back end into a global database variable, or perhaps opens a table into a global recordset.
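A rough sketch of that "global recordset" variant; tblAnyLinkedTable stands in for any table linked to the back end.

    Public gKeepOpenRs As DAO.Recordset

    Public Sub HoldBackEndOpen()
        ' Deliberately never closed until shutdown, so the connection to the
        ' back end stays open for the whole session.
        Set gKeepOpenRs = CurrentDb.OpenRecordset("tblAnyLinkedTable", dbOpenSnapshot)
    End Sub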

Is it better to close the Access DB after every operation or keep it open for later operations

I am working on a VB.NET project that grabs data from an Access DB file. All the code snippets I have come across open the DB, do stuff and close it for each operation. I currently have the DB open for the entire time the application is running and only close it when the application exits.
My question is: Is there a benefit to opening the connection to the DB file for each operation instead of keeping it open for the duration the application is running?
In many database systems it is good practice to keep connections open only while they are in use, since an open connection consumes resources in the database. It is also considered good practice for your code to know as little as possible about the concrete database in use (for instance, by programming against interfaces such as IDbConnection rather than concrete types such as OleDbConnection).
For this reason, it could be a good idea to follow the practice of keeping the connection open as little as possible, regardless of whether it matters for the particular database you use. It simply makes your code more portable, and it increases your chance of not getting it wrong if your next project happens to target a system where keeping connections open is a bad thing to do.
So, your question should really be reversed: is there anything to gain by keeping the connection open?
There is no benefit to that with the Jet/ACE database engine. The cost of creating the LDB file (the record-locking file) is very high. You could perhaps avoid it by opening the file exclusively (if it's single-user), but my sense is that opening exclusively is slower than opening multi-user.
The advice about opening and closing connections is based on the assumption that there is a database server on the other end of the connection. If you consider how that works, the cost of opening and closing a connection is very slight, as the database daemon already has the data files open and handles locking on the fly via structures in memory (and perhaps on disk -- I really don't know how it's implemented in any particular server database) that already exist once the server is up and running.
With Jet/ACE, all users are contending for two files, the data file and the locking file, and setting that up is much more expensive than the incremental cost of creating a new connection to a server database.
Now, in situations where you're aiming for high concurrency with a Jet/ACE data store, there might have to be a trade-off among these factors, and you might get higher concurrency by being much more miserly with your connections. But I would say that if you're into that realm with Jet/ACE, you should probably be contemplating upsizing to a server-based back end in the first place, rather than wasting time on optimizing Jet/ACE for an environment it was not designed for.
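To illustrate the point about paying the Jet/ACE connection cost only once, here is a rough sketch of opening a single connection and reusing it, written in VBA with late-bound ADO to match the rest of this post; the path and provider string are assumptions, and the same pattern carries over to an OleDbConnection held for the life of a VB.NET application.

    Private mCn As Object   ' ADODB.Connection, late bound

    Public Function GetConnection() As Object
        ' Open the connection once and hand back the same object on later calls,
        ' so the LDB locking file is only set up a single time.
        If mCn Is Nothing Then
            Set mCn = CreateObject("ADODB.Connection")
            mCn.Open "Provider=Microsoft.ACE.OLEDB.12.0;" & _
                     "Data Source=\\server\share\BackEnd.accdb"
        End If
        Set GetConnection = mCn
    End Function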

Using a SQL Server for application logging. Pros/Cons?

I have a multi-user application that keeps a centralized logfile for activity. Right now, that logging is going into text files to the tune of about 10MB-50MB / day. The text files are rotated daily by the logger, and we keep the past 4 or 5 days worth. Older than that is of no interest to us.
They're read rarely: either when developing the application for error messages, diagnostic messages, or when the application is in production to do triage on a user-reported problem or a bug.
(This is strictly an application log. Security logging is kept elsewhere.)
But when they are read, they're a pain in the ass. Grepping 10MB text files is no fun even with Perl: the fields (transaction ID, user ID, etc.) in the file are useful, but they're just text. Messages are written sequentially, one line at a time, so interleaved activity is all mixed up when you try to follow a particular transaction or user.
I'm looking for thoughts on the topic. Anyone done application-level logging with an SQL database and liked it? Hated it?
I think that logging directly to a database is usually a bad idea, and I would avoid it.
The main reason is this: a good log will be most useful when you can use it to debug your application post-mortem, once the error has already occurred and you can't reproduce it. To be able to do that, you need to make sure that the logging itself is reliable. And to make any system reliable, a good start is to keep it simple.
So having a simple file-based log with just a few lines of code (open file, append line, close file or keep it open, repeat...) will usually be more reliable and more useful in the future, when you really need it to work.
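As a minimal sketch of that kind of simple file-based logger (the log path is illustrative, and directory creation and error handling are omitted):

    Public Sub LogLine(ByVal msg As String)
        Dim f As Integer
        f = FreeFile
        Open "C:\Logs\app.log" For Append As #f
        Print #f, Format(Now, "yyyy-mm-dd hh:nn:ss") & vbTab & msg
        Close #f
    End Sub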
On the other hand, logging successfully to an SQL server requires a lot more components to work correctly, and there will be many more possible error situations where you won't be able to log the information you need, simply because the log infrastructure itself isn't working. And what's worse: a failure in the logging procedure (like database corruption or a deadlock) will probably affect the performance of the application, and then you have a situation where a secondary component prevents the application from performing its primary function.
If you need to do a lot of analysis of the logs and you are not comfortable using text-based tools like grep, then keep the logs in text files and periodically import them into an SQL database. If the import fails you won't lose any log information, and it won't even affect the application's ability to function. Then you can do all the data analysis in the DB.
I think those are the main reasons why I don't do logging to a database, although I have done it in the past. Hope it helps.
We used a Log Database at my last job, and it was great.
We had stored procedures that would spit out overviews of general system health for different metrics that I could load from a web page. We could also quickly spit out a trace for a given app over a given period, and if I wanted it was easy to get that as a text file, if you really just like grep-ing files.
To ensure the logging system does not itself become a problem, we of course had a common code framework, shared among the different apps, that handled writing to the log table. Part of that framework also logged to a file, in case the problem was with the database itself, and part of it handled cycling the logs. As for the space issues, the log database is on a different backup schedule, and it's really not an issue. Space (not backed up) is cheap.
I think that addresses most of the concerns expressed elsewhere. It's all a matter of implementation. But if I stopped here it would still be a case of "not much worse", and that's a bad reason to go to the trouble of setting up DB logging. What I liked about this is that it allowed us to do some new things that would be much harder to do with flat files.
There were four main improvements over files. The first is the system overviews I've already mentioned. The second, and imo most important, was a check to see whether any app was missing messages where we would normally expect to find them. That kind of thing is near-impossible to spot in traditional file logging unless you spend a lot of time each day reviewing mind-numbing logs for apps that just tell you everything's okay 99% of the time. It's amazing how freeing it is to have a view that shows missing log entries. Most days we didn't need to look at most of the log files at all... something that would be dangerous and irresponsible without the database.
That brings up the third improvement. We generated a single daily status e-mail, and it was the only thing we needed to review on days when everything ran normally. The e-mail showed errors and warnings. Missing logs were re-logged as warnings by the same db job that sent the e-mail, and missing the e-mail was a big deal. We could forward a particular log message to our bug tracker with one click, right from within the daily e-mail (it was HTML-formatted and pulled its data from a web app).
The final improvement was that if we did want to follow a specific app more closely, say after making a change, we could subscribe to an RSS feed for that specific application until we were satisfied. It's harder to do that from a text file.
Where I'm at now, we rely a lot more on third-party tools and their logging abilities, and that means going back to a lot more manual review. I really miss the DB, and I'm contemplating writing a tool to read those logs and re-log them into a DB to get these abilities back.
Again, we did this with text files as a fallback, and it's the new abilities that really make the database worthwhile. If all you're gonna do is write to a DB and try to use it the same way you did the old text files, it adds unnecessary complexity and you may as well just use the old text files. It's the ability to build out the system for new features that makes it worthwhile.
Yeah, we do it here, and I can't stand it. One problem we have is that if there is a problem with the db (connection, corruption, etc.), all logging stops. My other big problem with it is that it's difficult to look through when tracing problems. We also have problems with the log tables taking up too much space, and with having to worry about truncating them when we move databases, because our logs are so large.
I think it's clunky compared to log files. I find it difficult to see the "big picture" with everything stored in the database. I'll admit I'm a log file person; I like being able to open a text file and look through it (with regex) instead of using SQL to try to search for something.
The last place I worked we had log files of 100 MB plus. They're a little difficult to open, but if you have the right tool it's not that bad. We had a system for the log messages too: you could quickly look at the file and determine which set of log entries belonged to which process.
We've used SQL Server centralized logging before, and as the previous poster mentioned, the biggest problem was that interrupted connectivity to the database meant interrupted logging. I actually ended up adding a queuing routine to the logger that would try the DB first and write to a physical file if that failed. You'd just have to add code to that routine so that, on a successful log to the db, it checks whether any other entries are queued locally and writes those too.
I like having everything in a DB, as opposed to physical log files, but just because I like parsing it with reports I've written.
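A hedged sketch of that try-the-database-first, fall-back-to-a-file idea; the connection string, table and column names are invented, and the step that flushes locally queued entries back to the database is only indicated by a comment.

    Public Sub WriteLog(ByVal msg As String)
        On Error GoTo FileFallback
        Dim cn As Object
        Set cn = CreateObject("ADODB.Connection")
        cn.Open "Provider=SQLOLEDB;Data Source=LogServer;" & _
                "Initial Catalog=AppLogs;Integrated Security=SSPI"
        cn.Execute "INSERT INTO dbo.AppLog (LoggedAt, Message) " & _
                   "VALUES (GETDATE(), '" & Replace(msg, "'", "''") & "')"
        cn.Close
        ' On success, a real implementation would also push any locally queued
        ' entries back to the database here.
        Exit Sub
    FileFallback:
        ' Database unavailable: append to a local file instead, as in the
        ' simple file logger sketched earlier.
        Dim f As Integer
        f = FreeFile
        Open "C:\Logs\fallback.log" For Append As #f
        Print #f, Format(Now, "yyyy-mm-dd hh:nn:ss") & vbTab & msg
        Close #f
    End Sub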
I think the problem you have with logging could be solved by logging to SQL, provided that you are able to split out the fields you are interested in into different columns. You can't treat the SQL database like one big text field and expect it to be better; it won't be.
Once you get everything you're interested in logged to the columns you want, it's much easier to track the sequential actions of something because you can isolate it by column. For example, if you have an "entry" process, you log everything as normal but put the text "entry process" into a "logtype" or "process" column. Then, when you have problems with the entry process, a WHERE clause on that column isolates all the entry-process activity.
We do it in our organization in large volumes with SQL Server. In my opinion, writing to a database is better because of the search and filter capability. Performance-wise, 10 to 50 MB of data per day, kept for only 5 days, does not affect your application. Tracking transactions and users is very easy compared to doing it from a text file, since you can filter by transaction or user.
You mention that the files are read rarely. So decide whether it is worth the development effort to build the logging framework: compare the time you spend searching log files in a year against the time it will take to code and test. If you are spending an hour or more a day searching logs, it is better to dump the logs into a database, which can drastically reduce the time spent solving issues.
If you spend less than an hour, you can use a text search tool like "SRSearch", a great tool I have used; it searches across multiple files in a folder and presents the results as small snippets (like Google search results), and you click one to open the file containing the result you're interested in. There are other text search tools available too. If the environment is Windows, you also have Microsoft Log Parser, a good free tool that lets you query your file like a database, provided the file is written in a specific format.
Here are some additional pros and cons, and the reasons why I prefer log files to databases:
Space is not that cheap when you're using VPSs. Recovering space on live database systems is often a huge hassle, and you might have to shut down services while recovering it. If your logs are so important that you have to keep them for years (as we do) then this is a real problem. Remember that most databases do not recover space when you delete data; they simply reuse it - not much help if you are actually running out of space.
If you access the logs frequently and have to pull daily reports from a database with one huge log table and millions and millions of records, you will impact the performance of your database services while querying the data.
Log files can be created and older logs archived daily. Depending on the type of logs, massive amounts of space can be recovered by archiving them. We save around 6x the space when we compress our logs, and in most cases you'll probably save much more.
Individual smaller log files can be compressed and transferred easily without impacting the server. Previously we had hundreds of GB of logs in a database. Moving such large databases between servers becomes a major hassle, especially because you have to shut down the database server while doing so. What I'm saying is that maintenance becomes a real pain the day you have to start moving large databases around.
Writing to log files is in general a lot faster than writing to a DB. Don't underestimate the speed of your operating system's file IO.
Log files only suck if you don't structure your logs properly. You may have to use additional tools and you may even have to develop your own to help process them, but in the end it will be worth it.
You could log to a comma- or tab-delimited text format, or enable your logs to be exported to CSV. When you need to read from a log, export the CSV file to a table on your SQL server; then you can query it with standard SQL statements. To automate the process you could use SQL Server Integration Services.
I've been reading all the answers and they're great. But at a company I worked for, due to several restrictions and audit requirements, it was mandatory to log to a database. Anyway, we had several ways to log, and the solution was to install a pipeline: our programmers could connect to the pipeline and log to a database, a file, the console, or even forward the log to a port to be consumed by other applications.
This pipeline doesn't interrupt the normal process, and keeping a log file at the same time as you log to the database ensures you rarely lose a line.
I suggest you look into log4net; it's great for this.
http://logging.apache.org/log4net/
I could see it working well, provided you had the ability to filter what needs to be logged and when. A log file (or table, such as it is) is useless if you can't find what you're looking for or if it contains unnecessary information.
Since your logs are read rarely, I would write them to a file (better performance and reliability).
Then, if and only if you need to read them, I would import the log file into a database (better analysis).
Doing so, you get the advantages of both methods.