Merge the same Access database frequently (daily/weekly) - sql

I need to use one Access (2007) database at 2 offline locations and then get all the data back into one database. Some advised me to use SharePoint, but after some trial and frustration I wonder if it's really the best way.
Is it possible to manage this in an automated way, with update queries or so?
I have 26 tables, but only 14 need to be updated frequently. I use an autonumber to create the parent key and use cascade updating for the linked tables.

If your data can handle it, it's probably better to use a more natural key for the tables that require frequent updating, i.e. ideally you can uniquely identify a record by some combination of the columns in that record. Autonumbers in two databases can, and very likely will, step on each other, and when you do merge, any records based on an old autonumber need to be remapped properly. That can be done but is kind of a pain. It'd be nicer to avoid it all from the start.
As for using SharePoint (I assume the suggestion is to replace your tables with lists, not just to put your accdb on SharePoint), it has a lot of limitations in terms of the kinds of indexes that can be created and the relationships you can establish. Maybe your data are simple enough to live with this. I have yet to be able to justify the move.
Ultimately, the answer to your question is YES, it is possible to manage the synchronization with insert/update queries and very likely some VBA (possibly lots, depending on how complicated your table hierarchy is). You'll need to be vigilant about two people updating a single record, and you'll need to come up with some means to resolve the conflict.
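As a hedged sketch of how the append side of that could look (the table and column names here are hypothetical, and it assumes each copy of the database stamps its rows with a SiteCode so that SiteCode plus the local autonumber forms a portable key), a merge can be driven by ordinary append queries:

INSERT INTO tblParent (SiteCode, LocalID, SomeField)
SELECT b.SiteCode, b.LocalID, b.SomeField
FROM tblParent_Branch AS b
LEFT JOIN tblParent AS m
  ON (m.SiteCode = b.SiteCode AND m.LocalID = b.LocalID)
WHERE m.SiteCode IS NULL;

Child tables would be appended the same way, joining back to their parents on SiteCode and LocalID rather than on the master's autonumber.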

Related

How can I create multiple relationships in MS Access at one time?

I feel that this should be a simple question, but I can't seem to find an answer anywhere.
I have an MS Access database where all the key fields have their proper key icon when I view the tables, but no relationships are defined. I need to create relationships on the "UnitID" key field for all the data tables. Some relationships are one-to-one and others are one-to-many (or one-to-none), but that doesn't matter; I don't need to enforce referential integrity. I just need to query the database and work with the query result tables, not add anything or change the data. All the UnitID fields have the same name.
Right now, I am just pulling up the relationship tab and dragging-and-dropping the names for each table, which takes forever. I can use the edit relationships icon that brings up a form, but it still needs to be re-opened for each table.
I am working with a government, publicly downloadable Access database. I realize Access isn't ideal, but that is the format it comes in and the program I'm supposed to use for my job.
If there is a way to do it in the interface, that would be best, since I can share it directly with others in my office who are unfamiliar with macros. But I have used VBA before for Excel and know some basic SQL. I've never used macros in Access, so I don't know what their capabilities are; can this be done if there is no built-in functionality?
So are you talking about the Relationship Designer Window (Database Tools | Relationships menu option) in MS Access as pictured? With all the tables added, it takes about 5 seconds to click UnitID on one table, drag/drop to UnitID on another table and click Create. I guess it might take an hour or two to do them all?
Why must you have Relationships created at all? They don't define what Queries you can run. And if you don't need Referential Integrity, then I don't see much practical use for them anyhow.
If you can't get your Queries to run, then I would look elsewhere for the root of the problem.
By the way, once you get this problem solved, consider this: you may not need to actually create any Query result Tables if they are used as intermediate results. Since the result of a Query is a Table, anywhere that the syntax mentions "Table" you can insert a Query. That is, Queries can be nested inside of other Queries. I mention this because you seem to be saying that you need a whole lot of result Tables, which in itself is going to get messy, not to mention that they will take up a lot of space and, worse, will be redundant and will have to be recreated whenever your source Tables change (liable to be a maintenance nightmare).
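For what it's worth, if you do later decide you want the relationships without all the dragging and dropping, Access will accept DDL run as a query in SQL view (or via CurrentDb.Execute); note that a constraint created this way does enforce referential integrity, and the table names below are hypothetical:

ALTER TABLE tblReadings
ADD CONSTRAINT fkReadingsUnitID
FOREIGN KEY (UnitID) REFERENCES tblUnits (UnitID);

You'd run one such statement per data table, substituting the table name each time.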

Best way to ensure referential integrity

I'm a SQL noob, and whilst I'm aware of the major tools available, I'm not experienced enough to know the best tool for certain situations.
As an example, I currently have a group of tables where referential integrity is needed. Each table does not have all the necessary columns itself to be able to constrain the data, so I have at least 3 options open to me.
Create other table/tables that connect the data together - apart from duplicated data, this leaves multiple files to keep synced.
Create a trigger - not too difficult, but how trustworthy is a trigger? And is it scalable?
Create a function - not something I've done before, but I came across an example showing how it could be used to constrain data stored across multiple tables.
Given what I'm trying to do - maintain integrity by joining data, what should I consider, and are all 3 methods suited to what I'm trying to do?
Here is an example using a bridge table to link tables that don't share the necessary columns:
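(A minimal sketch, with hypothetical Orders and Products tables; the bridge table carries only the two keys plus foreign keys back to the parents, so the link itself is constrained without duplicating any other data.)

CREATE TABLE OrderProduct (
    OrderID   INT NOT NULL,
    ProductID INT NOT NULL,
    PRIMARY KEY (OrderID, ProductID),
    FOREIGN KEY (OrderID)   REFERENCES Orders (OrderID),
    FOREIGN KEY (ProductID) REFERENCES Products (ProductID)
);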
Using foreign keys is the best (and fastest, with the lightest footprint) way to guarantee data consistency. If you want a States table to guarantee that only valid state spellings (and only states that you do business in) are added to sales orders in your other tables (so when you search for all sales to New Jersey you only have to search for one spelling), there is no easier way than to use FKs.
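A minimal sketch of that States lookup (column names are made up for illustration):

CREATE TABLE States (
    StateCode CHAR(2)     NOT NULL PRIMARY KEY,
    StateName VARCHAR(50) NOT NULL
);

-- only rows whose StateCode exists in States can be inserted
CREATE TABLE SalesOrders (
    OrderID   INT     NOT NULL PRIMARY KEY,
    StateCode CHAR(2) NOT NULL,
    FOREIGN KEY (StateCode) REFERENCES States (StateCode)
);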

Permanent table, temp tables or php session?

My web app offers personalized recommendations. When a user starts using it, about 1000+ rows are inserted into one big recommendation table, correlating with other tables in the database. Every item the user votes for affects all of those 1000+ rows.
Since the recommendation info is only useful during the session, and since the recommendation table is getting huge, we'd like to switch to a more appropriate method. There's the possibility of deleting the relevant rows as soon as the user session is over. I guess a PHP session array or temp tables are better for this case?
One temp table per session will lead to catalog pollution, so not really recommended.
Have you considered actually keeping the data, so you can periodically mine it to improve the suggestions?
First: consider redesigning your data structure, I think it is not optimal.
Store a user's recommendations in a table of (user, recommended item, score): I don't see any need for a temp table or anything else.
Otherwise, you could start using sessions, but you should encapsulate the code carefully, making it easy to change if/when this solution is no longer maintainable.
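A minimal sketch of that layout (MySQL-flavoured, all names hypothetical; the item_similarity table in the UPDATE is only there to illustrate doing the vote adjustment set-based):

CREATE TABLE user_recommendation (
    user_id INT NOT NULL,
    item_id INT NOT NULL,
    score   DECIMAL(8,4) NOT NULL,
    PRIMARY KEY (user_id, item_id)
);

-- adjust the affected rows in one statement when the user votes
UPDATE user_recommendation
SET score = score + 0.1
WHERE user_id = 42
  AND item_id IN (SELECT item_id FROM item_similarity WHERE related_item_id = 123);

-- and clear the user's rows when the session ends
DELETE FROM user_recommendation WHERE user_id = 42;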
I suspect that the method is flawed - 1000+ recommendations per user? How many of them do they ever look at? If you don't know the answer to that question - then you need to spend some time thinking about why you don't know the answer.
Every item the user votes for affects all of those 1000+ rows
Are you sure your data is properly normalised?
But leaving that aside for the moment. The right place to generate / store that is in the database - a relational database is explicitly designed for, and a lot more efficient at, generating and maintaining tabular sets of data than a conventional programming language.

How can my application benefit from temporary tables?

I've been reading a little about temporary tables in MySQL but I'm an admitted newbie when it comes to databases in general and MySQL in particular. I've looked at some examples and the MySQL documentation on how to create a temporary table, but I'm trying to determine just how temporary tables might benefit my applications and I guess secondly what sorts of issues I can run into. Granted, each situation is different, but I guess what I'm looking for is some general advice on the topic.
I did a little googling but didn't find exactly what I was looking for on the topic. If you have any experience with this, I'd love to hear about it.
Thanks,
Matt
Temporary tables are often valuable when you have a fairly complicated SELECT you want to perform and then perform a bunch of queries on that...
You can do something like:
CREATE TEMPORARY TABLE myTopCustomers
SELECT customers.*, COUNT(*) AS num
FROM customers
JOIN purchases USING (customerID)
JOIN items USING (itemID)
GROUP BY customers.customerID
HAVING num > 10;
And then do a bunch of queries against myTopCustomers without having to do the joins to purchases and items on each query. Then when your application no longer needs the database handle, no cleanup needs to be done.
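For example, once the temp table exists, the follow-up queries are cheap (only the num column is guaranteed by the example above; everything else comes from customers.*):

SELECT * FROM myTopCustomers ORDER BY num DESC LIMIT 10;
SELECT AVG(num) FROM myTopCustomers;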
Almost always you'll see temporary tables used for derived tables that were expensive to create.
First a disclaimer - my job is reporting so I wind up with far more complex queries than any normal developer would. If you're writing a simple CRUD (Create Read Update Delete) application (this would be most web applications) then you really don't want to write complex queries, and you are probably doing something wrong if you need to create temporary tables.
That said, I use temporary tables in Postgres for a number of purposes, and most will translate to MySQL. I use them to break up complex queries into a series of individually understandable pieces. I use them for consistency - by generating a complex report through a series of queries and offloading some of those queries into modules I use in multiple places, I can make sure that different reports are consistent with each other. (And make sure that if I need to fix something, I only need to fix it once.) And, rarely, I deliberately use them to force a specific query plan. (Don't try this unless you really understand what you are doing!)
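As a hedged illustration of the break-it-into-pieces approach (Postgres-flavoured, with a hypothetical orders table), each stage is small enough to understand and test on its own:

CREATE TEMP TABLE monthly_sales AS
SELECT customer_id, date_trunc('month', order_date) AS month, SUM(total) AS sales
FROM orders
GROUP BY customer_id, date_trunc('month', order_date);

CREATE TEMP TABLE monthly_rank AS
SELECT customer_id, month, RANK() OVER (PARTITION BY month ORDER BY sales DESC) AS rnk
FROM monthly_sales;

-- the final report just joins the understandable pieces back together
SELECT r.month, r.customer_id, m.sales
FROM monthly_rank AS r
JOIN monthly_sales AS m USING (customer_id, month)
WHERE r.rnk <= 10;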
So I think temp tables are great. But that said, it is very important for you to understand that databases generally come in two flavors. The first is optimized for pumping out lots of small transactions, and the other is optimized for pumping out a smaller number of complex reports. The two types need to be tuned differently, and a complex report run on a transactional database runs the risk of blocking transactions (and therefore making web pages not return quickly). Therefore you generally want to avoid using one database for both purposes.
My guess is that you're writing a web application that needs a transactional database. In that case, you shouldn't use temp tables. And if you do need complex reports generated from your transactional data, a recommended best practice is to take regular (e.g. daily) backups, restore them on another machine, then run reports against that machine.
The best place to use temporary tables is when you need to pull a bunch of data from multiple tables, do some work on that data, and then combine everything to one result set.
In MS SQL, Temporary tables should also be used in place of cursors whenever possible because of the speed and resource impact associated with cursors.
If you are new to databases, there are some good books by Joe Celko that review best practices for ANSI SQL. SQL for Smarties describes in great detail the use of temp tables, the impact of indexes, WHERE clauses, etc. It's a great reference book with in-depth detail.
I've used them in the past when I needed to create evaluated data. That was before the time of views and subselects in MySQL, though, and I generally use those now where I would have needed a temporary table. The only time I might use them is if the evaluated data took a long time to create.
I haven't done them in MySQL, but I've done them on other databases (Oracle, SQL Server, etc).
Among other tasks, temporary tables provide a way for you to create a queryable (and returnable, say from a sproc) dataset that's purpose-built. Let's say you have several tables of figures -- you can use a temporary table to roll those figures up to nice, clean totals (or other math), then join that temp table to others in your schema for final output. (An example of this, in one of my projects, is calculating how many scheduled calls a given sales-related employee must make per week, bi-weekly, monthly, etc.)
I also often use them as a means of "tilting" the data -- turning columns to rows, etc. They're good for advanced data processing -- but only use them when you need to. (My golden rule, as always, applies: If you don't know why you're using x, and you don't know how x works, then you probably shouldn't use it.)
Generally, I wind up using them most in sprocs, where complex data processing is needed. I'd love to give a concrete example, but mine would be in T-SQL (as opposed to MySQL's more standard SQL), and also they're all client/production code which I can't share. I'm sure someone else here on SO will pick up and provide some genuine sample code; this was just to help you get the gist of what problem domain temp tables address.
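Purely to convey the gist, here is a hedged, generic sketch of the roll-up-then-join pattern described above (T-SQL-style temp table, every name hypothetical):

SELECT EmployeeID, COUNT(*) AS CallsThisWeek
INTO #CallTotals
FROM ScheduledCalls
WHERE CallDate >= DATEADD(day, -7, GETDATE())
GROUP BY EmployeeID;

SELECT e.EmployeeName, t.CallsThisWeek
FROM Employees AS e
JOIN #CallTotals AS t ON t.EmployeeID = e.EmployeeID;

DROP TABLE #CallTotals;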

Effectively transforming data from one SQL database to another in live environment

We have a bit of a messy database situation.
Our main back-office system is written in Visual FoxPro with local data (yes, I know!)
In order to effectively work with the data in our websites, we have chosen to regularly export data to a SQL database. However the process that does this basically clears out the tables each time and does a re-insert.
This means we have two SQL databases - one that our FoxPro export process writes to, and another that our websites read from.
This question is concerned with the transform from one SQL database to the other (SqlFoxProData -> SqlWebData).
For a particular table (one of our main application tables), because various data transformations take place during this process, it's not straightforward UPDATE, INSERT and DELETE statements using self-joins; we're having to use cursors instead (I know!)
This has been working fine for many months but now we are starting to hit upon performance problems when an update is taking place (this can happen regularly during the day)
Basically when we are updating SqlWebData.ImportantTable from SqlFoxProData.ImportantTable, it's causing occasional connection timeouts/deadlocks/other problems on the live websites.
I've worked hard at optimising queries, caching etc etc, but it's come to a point where I'm looking for another strategy to update the data.
One idea that has come to mind is to have two copies of ImportantTable (A and B), with some concept of which table is currently 'active': we update the non-active table, then switch the currently active table,
i.e. websites read from ImportantTableA whilst we're updating ImportantTableB, then we switch websites to read from ImportantTableB.
Question is, is this feasible and a good idea? I have done something like it before but I'm not convinced it's necessarily good for optimisation/indexing etc.
Any suggestions welcome, I know this is a messy situation... and the long term goal would be to get our FoxPro application pointing to SQL.
(We're using SQL 2005 if it helps)
I should add that data consistency isn't particularly important in this instance, seeing as the data is always slightly out of date
There are a lot of ways to skin this cat.
I would attack the locking issues first. It is extremely rare that I would use CURSORS, and I think improving the performance and locking behavior there might resolve a lot of your issues.
I expect that I would solve it by using two separate staging tables: one for the FoxPro export in SQL, and one, side by side, transformed into the final format in SQL. Then either swap the final table for production using sp_rename, or simply use 3 INSERT/UPDATE/DELETE transactions to apply all changes from the final table to production. Either way, there is going to be some locking there, but how big are we talking about?
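If you go the sp_rename route, the swap itself is quick; a hedged sketch (the _staging and _old names are made up):

BEGIN TRANSACTION;
EXEC sp_rename 'dbo.ImportantTable', 'ImportantTable_old';
EXEC sp_rename 'dbo.ImportantTable_staging', 'ImportantTable';
COMMIT TRANSACTION;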
You should be able to maintain one db for the website and just replicate to that table from the other sql db table.
This is assuming that you do not update any data from the website itself.
"For a particular table (one of our main application tables), because various data transformations take places during this process, it's not a straightforward UPDATE, INSERT and DELETE statements using self-joins, but we're having to use cursors instead (I know!)"
I cannot think of a case where I would ever need to perform an insert, update or delete using a cursor. If you can write the select for the cursor, you can convert it into an insert, update or delete. You can join to other tables in these statements and use the CASE statement for conditional processing. Taking the time to do this in a set-based fashion may solve your problem.
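For example, a hedged sketch of a cursor-free update joining the two databases' copies of the table (the column names are made up; only the database and table names come from the question):

UPDATE w
SET w.Description = f.Description,
    w.Price = CASE WHEN f.Discount > 0
                   THEN f.Price * (1 - f.Discount)
                   ELSE f.Price END
FROM SqlWebData.dbo.ImportantTable AS w
JOIN SqlFoxProData.dbo.ImportantTable AS f
  ON f.ID = w.ID;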
One thing you may consider if you have lots of data to move. We occasionally create a view over the data we want and then have two tables - one active and one that data will be loaded into. When the data is finished loading, as part of your process run a simple command to switch the table the view uses to the one you just finished loading into. That way the users are only down for a couple of seconds at most. You won't create locking issues where they are trying to access data as you are loading.
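A hedged sketch of that switch in SQL Server 2005 terms (the view name and the A/B table names are hypothetical; the websites would query the view rather than the tables):

CREATE VIEW dbo.vwImportantTable AS
SELECT * FROM dbo.ImportantTableA;
GO
-- after ImportantTableB has finished loading, repoint the view in one quick step
ALTER VIEW dbo.vwImportantTable AS
SELECT * FROM dbo.ImportantTableB;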
You might also look at using SSIS to move the data.
Do you have the option of making the updates more atomic, rather than the stated 'clear out and re-insert'? I think Visual FoxPro supports triggers, right? For your key tables, can you add a trigger to the update/insert/delete to capture the ID of records that change, then move (or delete) just those records?
Or how about writing all changes to an offline database, and letting SQL Server replication take care of the sync?
[sorry, this would have been a comment, if I had enough reputation!]
Based on your response to Ernie above, you asked how to replicate databases. Here is Microsoft's how-to about replication in SQL 2005.
However, if you're asking about replication and how to do it, it indicates to me that you are a little light in experience for SQL server. That being said, it's fairly easy to muck things up and while I'm all for learning by experience, if this is mission critical data, you might be better off hiring a DBA or at the very least, testing the #$##$% out of this before you actually implement it.