I am working on migrating an MS Access database over to a newer SQL platform.
But, with all of the users who are currently using it, we're migrating slowly/carefully.
The first step is that we are re-writing the VBA code into C#, which is then deployed in a .dll along with the database.
Now, the VBA code calls into the C# to do the business logic, then the VBA continues to do the displays/UI, while Access still hosts the database.
The problem comes in that I have a report that is being run after the business logic from the C# in one place, and apparently MS Access has a cache, which clears every 5 seconds. So, the transaction that occurs in the C# code writes to the database, but the VBA code is still using the cache. This is causing errors, as the records added to the database (which the VBA report is trying to report on) don't exist in the cache yet...
I'm guessing that the C# .dll must be getting treated as a "second connection" to the MS Access database, which is what typically seems to cause this error according to my searches (one connection is writing while the other is reading).
Since the cache is cleared out every 5 seconds, we can just put the process to sleep, and wake it up after 5 seconds, and then run the report, but that's pretty terrible for an end user.
And, making things difficult, the cache seems like it only gets used in the deployed version (so, when running from source / in debug mode, the error never happens).
Doing some searches, there seem to be plenty of people who have said "just refresh the cache." But, the question is: within VBA, how do you refresh the cache?
Any advice would be welcome.
Thanks
I've been fighting the same issue for years as I write a lot of tools around an old PowerBuilder application that has an Access MDB back end.
The cache does exist and it is VERY real. When data is inserted on a different connection than it is queried on, the cache can be directly observed and measured. It was also documented by Microsoft before they blackholed a bunch of their old articles...
Microsoft Jet has a read-cache that is updated every PageTimeout milliseconds (default is 5000ms = 5 seconds). It also has a lazy-write mechanism that operates on a separate thread to main processing and thus writes changes to disk asynchronously. These two mechanisms help boost performance, but in certain situations that require high concurrency, they may create problems.
I've found a couple of workarounds that are not the best, but somewhat make do until I find something better or can re-write the app with a better back end database.
The seemingly best answer I've found (that may actually work for you since you say you need VBA) is to use JRO.RefreshCache. I've been trying to figure out how to implement this using C# or VB.net without any luck. Below is a link to a code example where you execute the RefreshCache method on your 2nd connection that needs to pull the data. I have not tested this myself.
https://documentation.help/MSJRO/jrmthrefreshcachex.htm
A workaround I've found that will deliver the query results within 500ms to 1000ms of insert time (instead of anywhere between 500 and 5000 ms - or more):
Use System.Data.ODBC instead of OleDB, with connection string: Driver={Microsoft Access Driver (*.mdb, *.accdb)};Dbq=;
If someone knows how to use the JRO.RefreshCache method with OLEDB and C# or VB.net, I'd be forever grateful. I believe the issue is it's looking for an ADO connection to be passed in, not an OLEDB connection.
I am not aware of ANY documentation suggesting that some 5 second cache exists. Where did this idea come from?
Furthermore, if you have 5 users, then you are not going to be able to update their caches, are you?
In other words, refreshing some cache for one user is still not going to solve the problem for multiple users anyway, is it?
The simple matter is that if you load up a form with 100 records, and other users are ALSO working on those 100 rows, then users will not see each other's changes until you tell Access to re-load the form.
You can do this with a Me.Refresh in the form, and then it will show changes made by other users (or even your C# code!).
However, that is not really the solution here.
How does near EVERY system deal with this issue?
Answer:
You don't, you "design" the software to take the user work flow into account.
So, in place of loading up a form with 100 rows of data (which you should not do, unless some SUPER DUPER reason exists for doing that),
you provide a UI in which the user FIRST searches for whatever it is they want to work on.
In other words, say you just booked a user on a tour. Now, they call the office back, and want to change some details of that tour. But, a different tour staff might pick up the phone. So, now a 2nd user opens the tour?
So, you solve that issue by NOT loading all the tours into that form in the first place.
You provide a search screen, so they can search for the user, find the user, maybe type in an invoice number or whatever.
You display the results in a pick list, and then launch the form to the ONE record (and perhaps detail records from child tables).
So there is no concept of a cache in Access any more than there is in C#.
However, if you load up a datatable in C#, and then display that data?
Well, what about the other users on that system? They will not see changes to that data ANY MORE than they would in the current Access form.
So, if you want to update some data in C#? Then fine, but you need to do two things:
First, before you call any C# code that may update the current form record, you need to FORCE a save of that current record. Do this BEFORE you call any code, be it VBA code or C# code, that is going to update the record the user is working on. Second, refresh the form afterwards so it displays the changes.
You can save the current record in Access in MANY different ways, but the typical approach is:
' single record save - current record
If Me.Dirty Then Me.Dirty = False
' VBA or C# code goes here.
' optional: refresh the current form to reflect changes
Me.Refresh
So, in most cases, it is the "design" of your software that will solve this issue.
For example, in the tour example, or in fact ANY system, the user can't work, can't update, and can't do their job UNLESS they first find/search and have a means to bring up that form + record data in the first place.
So, ANY typical good design will:
Ask the user for that name, invoice number or whatever.
Display the results of the search, and THEN allow the user to pick the record/data to work on. When they are done, they close that form and are RIGHT BACK to the search form to do battle with the next customer or task or phone call or whatever.
So, a search form might look like this:
In above, I typed in smi, and then displayed a pick list.
The user can further type in say part of the first name, and thus now get this:
So, maybe they type in an invoice number, customer number, booking number or whatever.
So, you display the results, and then they can select the row or "thing" to work on.
Thus, we click on the row (or the glasses button above), and then jump to the ONE record.
So, the user does whatever they have to do with the customer. Now, when done, they close the ONE thing, the ONE main record.
This not only saves the data (so others in the office can now use that booking data), but it also puts them right back at the search screen, ready to do battle with the next customer.
So, not only does this mean we have a VERY bandwidth friendly design (we only pull the one main record into that form), but it is also better for work flow.
The Access form's cache thus becomes a non-issue, since we are only dealing with the one record.
And as I pointed out, if the system is multi-user, then you are NOT going to be able to update and deal with multiple users' cached data anyway, are you?
Think of ANY system you EVER used from a software point of view.
When you use google, does it download the WHOLE internet, and then you use ctrl-f to search megs and megs of data in the browser?
Nope!
You search first, get a list of results, and THEN pick one!
And when that list is displayed, maybe others on the internet are updating and adding new data, but if all of that were cached in your browser, then it would not work!
And the same goes for a desktop accounting system. You don't load up all accounts, and THEN have the user go ctrl-f to search all the data. You search for the customer or invoice number and PICK ONE to work on.
And it does not make sense to load up a form with 1000 customers, and then go ctrl-f to find that customer. The same goes for an instant banking machine. It does not download ALL customers and THEN let you search. It asks you FIRST to get what you need. So, be it browser based, desktop based, or JUST ABOUT ANY software you use?
You pretty much eliminate the cache issue, since you are not pre-loading boatloads of data, but asking and letting the user search for the data they need.
So, in regards to the Access form data and cache?
If you are on a form, and call VBA code, or c# code or whatever?
If that code updates the current form's record, you have NO MORE OR LESS of an issue when calling VBA code than when calling C# code! If that code updates the current form's record, and the record is dirty (has pending edits), then you get that message about the current form's record having been updated by another user!
So, your cache issue does NOT IN ANY WAY exist MORE or LESS as an issue in typical Access software.
As a general rule, if you are on a form with pending edits, and say want to pop up some form to edit related data?
You have to ensure that pending edits are SAVED before you launch a form that can edit the same data, or run code that may edit that data.
As a result, ZERO cache issues should exist, and they exist no more and no less when calling SQL or VBA update code in a form than when calling some C# code from that form.
So, write the pending update for that form.
Then run your VBA, SQL, or c# code.
And then do a me.Refresh to display any changes made by those external routines.
There is no documentation, or ANY article I can find, that suggests some kind of 5 second cache or update. It is an urban myth, and your software challenge here is the same whether you use C#, VBA, or even SQL Server stored procedures.
They are all the same issue, and I dare say that Access is often used as a front end to SQL Server, and ALL OF the SAME issues exist when using SQL Server with MS Access.
I have just started using RavenDB on a personal project and so far inserting, updating and querying have all been very easy to implement. However, I have come across a situation where I need a GetOrCreate method and I'm wondering what the best way to achieve this is.
Specifically I am integrating with OpenID and once authentication has taken place the user is redirected to my site. At this point I'd either like to retrieve their user record from Raven (by querying on the ClaimsIdentifier property) or create a new record. The user's ID is currently being set by Raven.
Obviously I can write this in two statements but without some sort of transaction around the select and the create I could potentially end up with two user records in the database with the same claims identifier.
Is there any way to achieve this kind of functionality? Possibly even more importantly, do you think I'm going down the wrong path? I'm assuming that even if I could create a transaction, it would make scaling out to multiple servers difficult and in any case could add a performance bottleneck.
Would a better approach be to have the Query and Create operations as separate statements, check for duplicates when the user is retrieved, and merge at that point? Or do something similar but on a scheduled task?
I can't help but feel I'm missing something obvious here so any advice on this problem would be greatly appreciated.
Note: while scaling out to multiple servers may seem unnecessary for a personal project, I'm using it as an evaluation of Raven before using it at work.
Dan, although RavenDB has support for transactions, I wouldn't go that way in your case. Instead, you could just use the user's ClaimsIdentifier as the user document's id, because these are guaranteed to be unique.
Alternatively, you can also stay with user ids being generated by Raven (HiLo btw) and use the new UniqueConstraintsBundle, which lets you attribute certain properties to be unique. Internally it will create an additional document that has the value of your unique property as its id.
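The first suggestion (deriving the document id from the claims identifier) can be sketched independently of any RavenDB specifics. Below is a minimal Python sketch that uses the standard-library sqlite3 module as a stand-in keyed store; it is not RavenDB's API, and the table name users, the "users/" id prefix, and the function names are all hypothetical. The point it tries to show is that when the key is the unique property itself, a second create attempt collides with the existing key instead of producing a duplicate record.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for a document store keyed by document id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id TEXT PRIMARY KEY, body TEXT NOT NULL)")
    return conn


def get_or_create_user(conn: sqlite3.Connection, claims_identifier: str) -> dict:
    """Return the user document for claims_identifier, creating it if absent."""
    doc_id = "users/" + claims_identifier  # hypothetical id convention
    new_doc = {"id": doc_id, "claims_identifier": claims_identifier}
    # INSERT OR IGNORE does nothing when the id already exists, so two callers
    # racing on the same identifier cannot both create a record.
    with conn:
        conn.execute(
            "INSERT OR IGNORE INTO users (id, body) VALUES (?, ?)",
            (doc_id, json.dumps(new_doc)),
        )
    row = conn.execute("SELECT body FROM users WHERE id = ?", (doc_id,)).fetchone()
    return json.loads(row[0])
```

Calling get_or_create_user twice with the same identifier returns the same document and leaves a single row in the table.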
We have a CMS built entirely in house. I'm the new web developer guy with literally 4 weeks of ColdFusion experience. What I want to do is add version control to our dynamic pages, something like what WordPress does. When you modify a page in WordPress, it makes some database entries and keeps a copy of each page when you save it. So if you create a page and modify it 6 times, all in one day, you have 7 different versions to roll back to if necessary. Is there an easy way to do something similar in ColdFusion?
Please note I'm not talking about source control or version control of actual CFM files, all pages are done on the backend dynamically using SQL.
Sure you can. Just stash the page content in another database table. You can do that with ColdFusion or via a trigger in the database.
One way (there are many) to do this is to add a column called "version" and a column called "live" in the table where you're storing all of your cms pages.
The column called "live" is optional but might make things easier for you in some ways when starting out.
The column "version" will tell you which revision of a document in the CMS you have. By a process of elimination you could say the newest one (highest version #) is the latest and live one. However, you may need to override this some time and turn an old page live, which is what the "live" setting is for.
So when you click "edit" on a page, you would take the version that was clicked and copy it into a new, higher version number. It stays as a draft until you click publish (at which time it's written as 'live').
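As a rough illustration of the "version" and "live" columns described above, here is a minimal Python sketch using sqlite3 in place of the real CMS database. The table layout (pages with page_id, version, live, content) and the function names are assumptions made for the example, not anything from the original CMS; the same SQL could be issued from ColdFusion queries.

```python
import sqlite3


def open_cms() -> sqlite3.Connection:
    """Create an in-memory table with one row per (page, version)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE pages (
               page_id INTEGER NOT NULL,
               version INTEGER NOT NULL,
               live    INTEGER NOT NULL DEFAULT 0,
               content TEXT NOT NULL,
               PRIMARY KEY (page_id, version)
           )"""
    )
    return conn


def save_draft(conn, page_id: int, content: str) -> int:
    """Store content as a new, higher version; it stays a draft (live = 0)."""
    with conn:
        (latest,) = conn.execute(
            "SELECT COALESCE(MAX(version), 0) FROM pages WHERE page_id = ?",
            (page_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO pages (page_id, version, live, content) VALUES (?, ?, 0, ?)",
            (page_id, latest + 1, content),
        )
    return latest + 1


def publish(conn, page_id: int, version: int) -> None:
    """Mark exactly one version of the page as live (also usable for rollback)."""
    with conn:
        conn.execute("UPDATE pages SET live = 0 WHERE page_id = ?", (page_id,))
        conn.execute(
            "UPDATE pages SET live = 1 WHERE page_id = ? AND version = ?",
            (page_id, version),
        )


def live_content(conn, page_id: int):
    """Return the content of the live version, or None if nothing is published."""
    row = conn.execute(
        "SELECT content FROM pages WHERE page_id = ? AND live = 1", (page_id,)
    ).fetchone()
    return row[0] if row else None
```

Rolling back is then just a call to publish with an older version number; no rows are ever overwritten.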
I hope that helps. This kind of an approach should work okay with most schema designs but I can't say for sure either without seeing it.
Jas' solution works well if most of the changes are to one field, for example the full text of a page of content.
However, if you have many fields, and people only tend to change one or two at a time, a new entry into the table for each version can quickly get out of hand, with many almost identical versions in the history.
In this case what I like to do is store the changes on a per-field basis in a table called ChangeHistory. I include the table name, row ID, field name, previous value, new value, and who made the change and when.
This acts as a complete change history for any field in any table. I'm also able to view changes by record, by user, or by field.
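A per-field change log along these lines might look like the following minimal Python sketch, again with sqlite3 standing in for the real database. Only the ChangeHistory table name and its list of columns come from the description above; the pages table, the exact column names, and the update_page function are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone

# Whitelist of editable columns; column names are never taken from user input.
PAGE_FIELDS = ("title", "body")


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
    conn.execute(
        """CREATE TABLE ChangeHistory (
               table_name TEXT, row_id INTEGER, field_name TEXT,
               old_value TEXT, new_value TEXT,
               changed_by TEXT, changed_at TEXT
           )"""
    )
    return conn


def update_page(conn, page_id: int, changes: dict, user: str) -> int:
    """Apply changes to one existing page; log one history row per changed field."""
    row = conn.execute(
        "SELECT title, body FROM pages WHERE id = ?", (page_id,)
    ).fetchone()
    current = dict(zip(PAGE_FIELDS, row))
    now = datetime.now(timezone.utc).isoformat()
    logged = 0
    with conn:
        for field, new_value in changes.items():
            if field not in PAGE_FIELDS:
                raise ValueError(f"unknown field: {field}")
            if current[field] == new_value:
                continue  # unchanged fields produce no history row
            conn.execute(
                f"UPDATE pages SET {field} = ? WHERE id = ?", (new_value, page_id)
            )
            conn.execute(
                "INSERT INTO ChangeHistory VALUES (?, ?, ?, ?, ?, ?, ?)",
                ("pages", page_id, field, current[field], new_value, user, now),
            )
            logged += 1
    return logged
```

Filtering ChangeHistory by row_id, changed_by, or field_name gives the by-record, by-user, and by-field views.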
For realtime page generation from the database, your best bet is separate "live" and "versioned" tables. The reason is that keeping all data, live and versioned, in one table will negatively impact performance. So if page generation relies on a single SELECT query from the live table, you can easily version the result set using ColdFusion's Web Distributed Data eXchange format (WDDX) via the <cfwddx> tag. WDDX is a serialized data format that works particularly well with ColdFusion data (sorta like Python's pickle, albeit without the ability to deal with objects).
The versioned table could be as such:
PageID
Created
Data
Where data is the column storing the WDDX.
Note, you could also use built-in JSON support as well for version serialization (serializeJSON & deserializeJSON), but cfwddx tends to be more stable.
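To make the live/versioned split concrete, here is a minimal Python sketch. It uses JSON for the serialized snapshot, since Python has no built-in WDDX support, and sqlite3 in place of the real database. The versioned table follows the PageID / Created / Data layout above; the live table's columns and all function names are assumptions for illustration only.

```python
import json
import sqlite3
from datetime import datetime, timezone


def open_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # lets rows be converted to dicts by column name
    conn.execute(
        "CREATE TABLE PagesLive (PageID INTEGER PRIMARY KEY, Title TEXT, Body TEXT)"
    )
    conn.execute("CREATE TABLE PagesVersioned (PageID INTEGER, Created TEXT, Data TEXT)")
    return conn


def snapshot_page(conn, page_id: int) -> None:
    """Serialize the current live row and append it to the versioned table."""
    row = conn.execute(
        "SELECT * FROM PagesLive WHERE PageID = ?", (page_id,)
    ).fetchone()
    data = json.dumps(dict(row))
    created = datetime.now(timezone.utc).isoformat()
    with conn:
        conn.execute(
            "INSERT INTO PagesVersioned (PageID, Created, Data) VALUES (?, ?, ?)",
            (page_id, created, data),
        )


def restore_latest(conn, page_id: int) -> None:
    """Overwrite the live row with the most recently stored snapshot."""
    row = conn.execute(
        "SELECT Data FROM PagesVersioned WHERE PageID = ? ORDER BY rowid DESC LIMIT 1",
        (page_id,),
    ).fetchone()
    saved = json.loads(row["Data"])
    with conn:
        conn.execute(
            "UPDATE PagesLive SET Title = ?, Body = ? WHERE PageID = ?",
            (saved["Title"], saved["Body"], page_id),
        )
```

Page rendering only ever reads PagesLive, so the size of the history does not affect the hot query.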
I am looking to understand how enterprise search solutions tackle the issue of user-permissions.
My question is on displaying the search results for users. The naive approach would display the search results to the user, and then if the user clicks a document he is not authorized to see, he will fail to open it. However, it is forbidden even to display a document's title or excerpt if the user does not have permission to read it. So do the various enterprise search engines:
1. index each document together with its ACL?
2. index all documents with no permission info, but check each link in every search result to see whether the querying user has permission to view this link?
Option #2 makes more sense to me, but also seems much slower than option #1.
Option #1 suffers from the need to constantly update the changes in permissions on the indexed documents.
I am looking to understand what is the common approach in the existing solutions in the market today. Is there a third option?
I'm surprised to see that this 5 year old question hasn't got any answers, as I think it's quite a common and important problem in enterprise search.
As outlined in the question there are two common approaches to deal with document-level security:
early-binding-security: indexing ACLs along with the content, and
late-binding-security: handling security at query-time, by filtering out protected results
Handling security on the content side only is never recommended, as by that point confidential information might already have been revealed (e.g. the title or preview of a document in the search result).
The advantage of implementing security with a late-binding approach is that it's very flexible, because there is no need to re-index content when ACLs change. The biggest drawback, however, is that confidential information might be leaked via facet values, and it's not possible to retrieve and display correct facet counts. It is also more difficult to properly populate the result list and handle pagination. Last but not least, this approach can significantly slow down query performance.
The advantage of implementing security with an early-binding approach is, that it addresses all of the above disadvantages for the price of re-indexing the content as soon as ACLs change. However, leaks are still possible, e.g. when a group membership or ACL just got changed and isn't reflected yet in the search index. To address this gap the two approaches early-binding and late-binding are often combined.
Last but not least, there might be a third option, depending on the enterprise search platform you are using: Attivio's Active Security is based on query-time joins, which allow security information to be indexed independently of the document itself and merged with the document at query time, ensuring that only authorised content makes it into the search results.
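The difference between early and late binding can be shown with a toy in-memory index. This is only a sketch under simplifying assumptions: the Doc class, the substring "search", and the can_read callback are all hypothetical and do not correspond to any particular search product's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str
    # Groups allowed to read the document, as captured at indexing time.
    allowed_groups: frozenset = field(default_factory=frozenset)


def search_early_binding(index, query, user_groups):
    """ACLs live in the index; unauthorized documents never enter the result set."""
    return [
        d.doc_id
        for d in index
        if query in d.text and d.allowed_groups & user_groups
    ]


def search_late_binding(index, query, can_read):
    """The index ignores ACLs; every hit is checked against the source at query time."""
    hits = [d.doc_id for d in index if query in d.text]
    return [doc_id for doc_id in hits if can_read(doc_id)]
```

In the early-binding function the filter is part of the query, so counts and pagination are computed over permitted documents only. In the late-binding function the hit list is built first and trimmed afterwards, which is where the extra per-result permission checks and the facet-count problems come from.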
Part of the setup routine for the product I'm working on installs a database update utility. The utility checks the current version of the user's database and (if necessary) executes a series of SQL statements that upgrade the database to the current version.
Two key features of this routine:
Once initiated, it runs without user interaction
SQL operations preserve the integrity of the user's data
The goal is to keep the setup/database routine as simple as possible for the end user (the target audience is non-technical). However, I find that in some cases, these two features are at odds. For example, I want to add a unique index to one of my tables - yet it's possible that existing data already breaks this rule. I could:
Silently choose what's "right" for the user and discard (or archive) data; or
Ask the user to understand what a unique index is and get them to choose what data goes where
Neither option sounds appealing to me. I could compromise and not create a unique index at all, but that would suck. I wonder what others do in this situation?
Check out SQL Packager from Red-Gate. I have not personally used it, but these guys make good tools overall and this seems to do what you're looking for. It lets you modify the script to customize the install:
http://www.red-gate.com/products/SQL_Packager/index.htm
You never throw a user's data out. One possible option is to try to create the unique index. If the index creation fails, let them know it failed, tell them what they need to research, and provide them a script they can run if they find they have a data error that they choose to fix up.
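The try-then-report approach could be sketched as follows. This is a minimal Python example against sqlite3, not the asker's actual database engine, and the function name and index naming scheme are made up for illustration. It attempts the unique index and, if existing data violates it, leaves the data untouched and returns the offending values so the installer can show them to the user.

```python
import sqlite3


def try_add_unique_index(conn: sqlite3.Connection, table: str, column: str):
    """Try to create a unique index on table.column.

    Returns [] on success. If existing rows violate uniqueness, nothing is
    modified and a list of (value, count) pairs for the duplicates is returned.
    The table and column names must come from trusted upgrade code, not users.
    """
    index_name = f"ux_{table}_{column}"
    try:
        with conn:
            conn.execute(
                f'CREATE UNIQUE INDEX "{index_name}" ON "{table}" ("{column}")'
            )
        return []
    except sqlite3.IntegrityError:
        # Index creation failed because of duplicate values; report them.
        return conn.execute(
            f'SELECT "{column}", COUNT(*) FROM "{table}" '
            f'GROUP BY "{column}" HAVING COUNT(*) > 1 ORDER BY "{column}"'
        ).fetchall()
```

The upgrade routine can continue without the index when the returned list is non-empty, and re-run the same function later once the user has resolved the duplicates.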