A while back, I made and published software with an sdf database. Now I want to improve it and add more features: new forms and new tables in the database. How can I take the data stored in a table (one that still has the same properties in my new database) and add it to my new database?
Typically, the way to approach this problem is not to move the data to a new database, but rather use the SQL ALTER statement to add or remove the columns you need from the old database in place. The installer for the new program needs to be smart enough to detect the old database file, and you write code for the installer (or a check when the program starts up) that is able to handle the upgrade process.
This works especially well if you are doing source control correctly. With source control, you have to commit or check in changes for the code, but it's no good committing a change that needs the database to have a column that's not yet available. Thus a good source control environment encourages you to write the ALTER statement to add that column as part of the rest of the feature work. Someone else needs a different column on the same table? They write their own ALTER statement. Later, their branch can be merged with yours, and the database still ends up exactly as it needs to be. Moreover, these commits to the database project can then be collected and used for the upgrade process when you are ready to publish the application.
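As a rough illustration of the "check when the program starts up" idea, here is a minimal sketch in Python. It uses the standard library's sqlite3 module as a stand-in for the sdf database, and the table and column names (customers, email) are hypothetical. The details of ALTER differ between database engines, so treat this as an outline rather than a drop-in solution.

```python
import sqlite3


def column_exists(conn, table, column):
    """Return True if `table` already has a column named `column`."""
    # PRAGMA table_info yields one row per column; index 1 is the column name.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return any(row[1] == column for row in rows)


def upgrade_in_place(conn):
    """Add the column the new release needs without touching existing rows."""
    if not column_exists(conn, "customers", "email"):  # hypothetical names
        conn.execute("ALTER TABLE customers ADD COLUMN email TEXT")
    conn.commit()


# Simulate the database that shipped with the old release.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers (name) VALUES ('Alice')")

upgrade_in_place(conn)  # first start of the new release adds the column
upgrade_in_place(conn)  # later starts find the column and do nothing

rows = conn.execute("SELECT id, name, email FROM customers").fetchall()
print(rows)  # [(1, 'Alice', None)]
```

The existing row survives the upgrade, and the new column is simply empty until the application fills it.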
I have made changes to my model-first EDMX file, and now I want to apply the changes to my database. I've just added a new table and some new fields, nothing destructive. I want to apply the "diff" to my database, but without all the hassle of database migrations. What I actually need is a non-destructive SQL file containing only the differences.
Currently, I am doing this manually by creating new database SQL from model, deleting all the code non-relevant to the table I am creating, and running the SQL. However, my table is currently empty so I can do this destructively. Moreover, if there are any changes referencing other entities too (e.g. adding a new foreign key to one of the existing tables), the SQL is, obviously, destructive. So I need to add them manually by writing my own SQL.
Is there any tool or a shorter workaround that would automate this whole process? I am looking for something that will compare the current database and the newly created EDMX, and apply only the diff to the database, as a one-time process. The whole database migration system of Entity Framework is extreme overhead and unnecessary work; the whole process, which will run only once, can be boiled down to a single SQL file. Is there such a tool/method? What is the best practice for this (other than EF migrations)?
In this question, I was facing an issue where I was writing an update for a deployed application to bring the database up to date with the newer version we are deploying. Basic outline as follows:
Began with currently deployed version of application
Added new functionality that used existing database
Added new database tables and relationships
Added new functionality that depended on the new database structure
Testing complete, ready for deployment
The issue here is that the currently deployed application has been in use for a few months and has a lot of data that would need to be preserved, so simply replacing the old with the new was not viable (at least not for the database, but of course it works for the code). So I used the following steps to write a script in SQL for the updated version of the application to run the first time it starts up to make the necessary changes to the database without touching existing data (aside from populating the new tables):
Use VS2010's "Generate database from model" functionality to create a .sql (the model was originally created using the "Generate model from database" functionality)
Remove all parts of the .sql that act on the existing tables, except for those that add FKs between new and old tables
Use the resulting script to build the new database
Sounds pretty clean and done, right? Wrong. The mapping from the model to the database was all wrong for the new tables. Long story short, the database that generated the model had tables named in the plural (and the mapping was correct and the application worked), and the script generated from the model also created tables named in the plural (identical to the table names in the database the model was generated from), yet the model did not map to them. The solution ended up being to change the script to name the tables in the singular, and then everything worked flawlessly.
What happened here? The code remained untouched, no changes were made to the model, and the old tables continued to work fine the entire time, yet somewhere in the process of
Generate script
Delete "new" tables and constraints (those that don't yet exist in the deployed version)
Run script to re-add the tables
the mapping decided to be to singularly named tables (User instead of Users, Address instead of Addresses, etc).
Can anyone explain to me how/why this would happen this way?
You might want to look at some of the tools that Redgate supplies: good tools for comparing two DB structures and generating a script to update one to match the other.
http://www.red-gate.com/?utm_source=google&utm_medium=cpc&utm_content=brand_aware&utm_campaign=redgate&gclid=CIamkumgw6sCFcYPfAodnGVjsQ
I'm writing an application that is using a database (currently MySQL 4) to store data.
It is likely that I will make changes to this later, in the form of updates that add additional data. Updating the application is simple: it essentially comes down to overwriting the program files with the new ones. However, how do I go about updating the database schema?
The database is remote and so my application might exist in several places, so simply dumping the ALTER and CREATE statements in an installer would result in the changes being made multiple times, and I have been asked explicitly for an automatic solution that allows for the application copies to be updated over a transition period, and for schema updates to be automatic.
I considered examining the schema at start-up to look for missing tables and columns, and adding them as needed, however this does not seem like a clean solution. I also considered putting some kind of “schema version” number on the database, but can’t see any way to do this short of a single row table with an int “Version” column which doesn’t seem a good way either.
I can highly recommend Liquibase. It really does work - I've used it and was very impressed.
Essentially, it keeps its own log of statements run on a database and runs them only if not already run/needed. It is XML driven and allows you to use optional pre- and post-execution statements and conditions. You check your XML files into your source control and invoke it from your build tool. It's even suitable for driving production releases.
It's magic.
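To make the "keeps its own log of statements run" idea concrete, here is a minimal sketch in Python with sqlite3. This is not Liquibase's implementation or file format; the change_log table, the change set ids, and the apply_change_sets function are all hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical change sets: (id, SQL). In a real setup these would live in
# files under source control.
CHANGE_SETS = [
    ("001-create-users", "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)"),
    ("002-add-user-email", "ALTER TABLE users ADD COLUMN email TEXT"),
]


def apply_change_sets(conn, change_sets):
    """Run each change set at most once; return the ids run by this call."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS change_log ("
        "id TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    already_run = {row[0] for row in conn.execute("SELECT id FROM change_log")}
    newly_run = []
    for change_id, sql in change_sets:
        if change_id in already_run:
            continue  # recorded in the log, so skip it
        conn.execute(sql)
        conn.execute("INSERT INTO change_log (id) VALUES (?)", (change_id,))
        newly_run.append(change_id)
    conn.commit()
    return newly_run


conn = sqlite3.connect(":memory:")
first = apply_change_sets(conn, CHANGE_SETS)   # runs both change sets
second = apply_change_sets(conn, CHANGE_SETS)  # nothing left to run
```

Because the log lives in the database itself, running the same list against a database that is already up to date does nothing.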
Rather than rolling your own system for versioning your database it's probably worth looking into an existing framework that will manage it for you.
I use Liquibase and have integrated it into my build using the Maven plugin. Worth checking out!
Just as you proposed, add a table where you store the current version of the database schema. Then you only have to apply the changes between your last schema update and the new release, and set the new version number accordingly. I've done this to update our production database about 300 times, it just works.
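A minimal sketch of this version-table approach, in Python with sqlite3 standing in for the real database. The schema_version table, the MIGRATIONS list, and the function names are assumptions made for illustration, not part of any particular framework.

```python
import sqlite3

# Hypothetical migrations: MIGRATIONS[n] upgrades the schema from version n
# to version n + 1.
MIGRATIONS = [
    ["CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)"],
    ["ALTER TABLE items ADD COLUMN price REAL"],
    ["CREATE INDEX idx_items_name ON items (name)"],
]


def get_version(conn):
    """Read the stored schema version, creating the single-row table if needed."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        return 0
    return row[0]


def upgrade(conn):
    """Apply only the migrations between the stored version and the newest one."""
    version = get_version(conn)
    for target in range(version, len(MIGRATIONS)):
        for statement in MIGRATIONS[target]:
            conn.execute(statement)
        conn.execute("UPDATE schema_version SET version = ?", (target + 1,))
        conn.commit()  # record each step so a failure does not repeat earlier ones
    return get_version(conn)


conn = sqlite3.connect(":memory:")
after_first = upgrade(conn)   # applies all three steps
after_second = upgrade(conn)  # already current, applies nothing
```

This sketch does not guard against two application copies upgrading the same remote database at the same moment; a real deployment would need a lock or an equivalent safeguard around the upgrade.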
Here's a more general question on how you handle database schema changes in a development team.
We are a team of developers and the databases used during development are running locally on everyone's box as we want to avoid the requirement to have web access all the time. So running a single central database instance somewhere is not a real option.
Whenever one of us decides that it is time to extend/change the db schema, we mail database files (MYI/MYD) or SQL files to execute around, or give others instructions on the phone what they need to do to get the changed code running on their local DBs. That's not the perfect approach for sure. The same problem arises when we need to adjust the DB schema on staging or production once a new release is ready.
I was wondering ... how do you guys handle this kind of stuff? For source code, we use SVN.
Really appreciate your input!
Thanks,
Michael
One approach we've used in the past is to script the entire DDL for the database, along with any test/setup data needed. Store that in SVN, then when there's a change, any developer can pull down the changes, drop the database, and rebuild it from the script files.
At the very least you should have the scripts of all the objects in the database (tables, stored procedures, etc) under source control.
I don't think mailing schema changes is a real option for a professional development team.
We had a system on one of my previous teams that was the best I've encountered for dealing with this situation.
The nightly build of the application included a build of a database (SQL Server). The database got built to the Test DB server. Each developer then had a DTS package (this was a while ago, and I'm sure they upgraded to SSIS packages) to pull down that nightly DB build to their local DB environment.
This kept the master copy in one location and put the onus on the developers to keep their local dev databases fresh.
At my work, we deal with pretty large databases that are time-consuming to generate, so for us, starting from scratch with a new DB isn't ideal. Like Harper, we have our DDL in SVN. Additionally, we store a version number in a database table. Every check-in that changes the DB must be accompanied by a script that:
Will upgrade the database schema and modify any existing data appropriately, and
Will update the version number in the database.
Further, we number the scripts and database versions such that a script we've written knows how to upgrade further along a branch or from an older branch to a newer one without any input from the developer (apart from the database name and the directory to the upgrade scripts).
Thus, if I've got a copy of a customer's 4GB DB that's from a year old version and I want to test how their data will work with the version we cut yesterday, I can just run our script and let it handle the upgrades rather than having to start from scratch and redo every INSERT, UPDATE and DELETE performed since the database was created.
We have a non-SQL description of the database schema. When the application starts, it compares the desired database schema with the actual database schema, and performs whatever ADD TABLE, ADD COLUMN, ADD INDEX, etc. statements it needs to do to get the database to look right.
This doesn't handle every case; sometimes you have to delete the database and recreate it if you've changed something that the schema resolver can't handle, but most of the time we don't need to worry about it.
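A schema resolver of this kind might look roughly like the following Python sketch, again with sqlite3 as a stand-in. The DESIRED_SCHEMA dictionary format and the function names are hypothetical, and the sketch only handles missing tables and missing columns; renames, type changes, and drops are exactly the cases it cannot deal with.

```python
import sqlite3

# Hypothetical non-SQL description: table name -> {column name: column type}.
DESIRED_SCHEMA = {
    "users": {"id": "INTEGER", "name": "TEXT", "email": "TEXT"},
    "orders": {"id": "INTEGER", "user_id": "INTEGER", "total": "REAL"},
}


def actual_columns(conn, table):
    """Return the column names currently in `table` (empty set if it is missing)."""
    return {row[1] for row in conn.execute(f"PRAGMA table_info({table})")}


def plan_changes(conn, desired):
    """Compare desired and actual schema; return the statements needed."""
    statements = []
    for table, columns in desired.items():
        existing = actual_columns(conn, table)
        if not existing:
            cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
            statements.append(f"CREATE TABLE {table} ({cols})")
            continue
        for name, ctype in columns.items():
            if name not in existing:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {name} {ctype}")
    return statements


def resolve(conn, desired):
    """Apply the planned statements and return them for logging."""
    plan = plan_changes(conn, desired)
    for statement in plan:
        conn.execute(statement)
    conn.commit()
    return plan


# An older database that lacks users.email and the whole orders table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
plan = resolve(conn, DESIRED_SCHEMA)
```

Running resolve a second time returns an empty plan, which is what makes it safe to call on every application start.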
I'd certainly keep the database schema in source code control.
At my present job, every time there's a schema change, we write the SQL for the change (alter table xyz add column ...) and put it in SVN. Then developers can update test databases by running this script. It's pretty clumsy but it works.
At a previous job I wrote some code that at application start-up would automatically compare the actual database schema to what it expected, and if it was not up to date perform the updates. Mostly this was done for deployment reasons: When we shipped new copies of the software, it would then automatically update the user's database. But it was also handy for developers.
I think there should be some generic SQL tool to do this. Maybe there is, but I've never seen one.
I am writing code to migrate data from our live Access database to a new Sql Server database which has a different schema with a reorganized structure. This Sql Server database will be used with a new version of our application in development.
I've been writing migration code in C# that calls Sql Server and Access and transforms the data as required. I migrated, for the first time, a table whose entries relate to new entries in another table that I have not updated recently, and that caused an error because the record in the corresponding table in SQL Server could not be found.
So, my SqlServer productions table has data only up to 1/14/09, and I'm continuing to migrate more tables from Access. So I want to write an update method that can figure out what the new stuff is in Access that hasn't been reflected in Sql Server.
My current idea is to write a query on the SQL side which does SELECT Max(RunDate) FROM ProductionRuns, to give me the latest date in that field in the table. On the Access side, I would write a query that does SELECT * FROM ProductionRuns WHERE RunDate > ?, where the parameter is that max date found in SQL Server, and perform my translation step in code, and then insert the new data in Sql Server.
What I'm wondering is, do I have the syntax right for getting the latest date in that Sql Server table? And is there a better way to do this kind of migration of a live database?
Edit: What I've done is make a copy of the current live database. Which I can then migrate without worrying about changes, then use that to test during development, and then I can migrate the latest data whenever the new database and application go live.
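The incremental idea described in the question (take the latest RunDate already on the target, then pull only newer rows from the source) can be sketched as follows. This is Python with two in-memory sqlite3 databases standing in for Access and SQL Server; the Id and Quantity columns and the function names are hypothetical, and dates are assumed to be stored as ISO strings so that comparison works as text.

```python
import sqlite3


def latest_run_date(target):
    """Return the newest RunDate already migrated, or None if the table is empty."""
    return target.execute("SELECT MAX(RunDate) FROM ProductionRuns").fetchone()[0]


def migrate_new_runs(source, target):
    """Copy rows newer than the target's latest RunDate; return how many were copied."""
    cutoff = latest_run_date(target)
    query = "SELECT Id, RunDate, Quantity FROM ProductionRuns"
    if cutoff is None:
        rows = source.execute(query).fetchall()
    else:
        rows = source.execute(query + " WHERE RunDate > ?", (cutoff,)).fetchall()
    # A real migration would transform each row here to fit the new schema.
    target.executemany(
        "INSERT INTO ProductionRuns (Id, RunDate, Quantity) VALUES (?, ?, ?)", rows
    )
    target.commit()
    return len(rows)


ddl = "CREATE TABLE ProductionRuns (Id INTEGER PRIMARY KEY, RunDate TEXT, Quantity INTEGER)"
source = sqlite3.connect(":memory:")  # stand-in for the live Access database
target = sqlite3.connect(":memory:")  # stand-in for the new SQL Server database
source.execute(ddl)
target.execute(ddl)
source.executemany(
    "INSERT INTO ProductionRuns VALUES (?, ?, ?)",
    [(1, "2009-01-13", 10), (2, "2009-01-14", 20), (3, "2009-01-15", 30)],
)
target.executemany(
    "INSERT INTO ProductionRuns VALUES (?, ?, ?)",
    [(1, "2009-01-13", 10), (2, "2009-01-14", 20)],
)

copied = migrate_new_runs(source, target)        # copies only the 2009-01-15 row
copied_again = migrate_new_runs(source, target)  # nothing newer remains
```

One limitation of a date cutoff: rows that were edited after being migrated, or rows added later with a RunDate equal to the cutoff, are not picked up.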
I personally would divide the process into two steps.
I would create an exact copy of Access DB in SQLServer and copy all the data
Copy the data from this temporary SQLServer DB to your destination database
That way you can write a set of SQL statements to accomplish the second step.
Alternatively use SSIS
Generally, when you convert data to a new database that will take its place in production, you shut out all users of the database for a period of time, run the migration and turn on the new database. This ensures no changes to the data are made while doing the conversion. Of course I never would have done this using C# either. Data migration is a database task and should have been done in SSIS (or DTS if you have an older version of SQL Server).
If the database you are converting to is just in development, I would create a backup of the Access database and load the data from there to test the data loading process and to get the data in so you can do the application development. Then when it is time to do the real load, you just close down the real database to users and use it to load from. If you are trying to keep both in sync while you develop, well, I wouldn't do that, but if you must, make a nightly backup of the file and load first thing in the morning using your process.
You may want to look at investing in a tool like SQL Data Compare.
I believe it has support for access databases too, and you can download a trial.
If you are happy with your C# code, but it fails because of the constraints in your destination database, you can temporarily disable them and then re-enable them after you copy the whole lot.
I am assuming that your destination database is a brand new DB with no data, and is not used by anyone when the transfer happens.
It sounds like you have two problems:
You're migrating data from one database to another.
You're changing your schema.
Doing either of these things is tricky if you are trying to migrate the data while people are using the data.
The simplest approach is to migrate the data based on a static copy of the data, and also to queue updates to that data from the moment you captured the static copy. I don't know how easy this is in Access, but in SQLServer or Oracle you can use the redo logs for this or a manual solution using triggers. The poor-man's way of doing this is to make triggers for all the relevant tables that log the primary key of the records that have changed. Then after the old database is shut off you can iterate over those keys and get those records from the old database and put them into the new database. Just copy the whole record; if the record was deleted then delete it from the new database.
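Here is a minimal sketch of that poor-man's trigger approach, written in Python with sqlite3 standing in for both databases. The customers table, the change_log table, and the replay_changes function are hypothetical, and the sketch copies records unchanged; it leaves out the transformation step a schema change would require.

```python
import sqlite3

old = sqlite3.connect(":memory:")  # stand-in for the old, live database
new = sqlite3.connect(":memory:")  # stand-in for the new database

old.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
""")

# Static copy: migrate everything that exists at this moment.
new.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
new.executemany("INSERT INTO customers VALUES (?, ?)",
                old.execute("SELECT id, name FROM customers").fetchall())
new.commit()

# From now on, triggers log the primary key of every changed record.
old.executescript("""
CREATE TABLE change_log (customer_id INTEGER);
CREATE TRIGGER log_insert AFTER INSERT ON customers
BEGIN INSERT INTO change_log VALUES (NEW.id); END;
CREATE TRIGGER log_update AFTER UPDATE ON customers
BEGIN INSERT INTO change_log VALUES (NEW.id); END;
CREATE TRIGGER log_delete AFTER DELETE ON customers
BEGIN INSERT INTO change_log VALUES (OLD.id); END;
""")

# Users keep working against the old database after the static copy.
old.execute("UPDATE customers SET name = 'Anne' WHERE id = 1")
old.execute("DELETE FROM customers WHERE id = 2")
old.execute("INSERT INTO customers VALUES (4, 'Dee')")
old.commit()


def replay_changes(old, new):
    """Re-copy every logged record; delete those that no longer exist."""
    keys = [row[0] for row in old.execute("SELECT DISTINCT customer_id FROM change_log")]
    for key in keys:
        row = old.execute("SELECT id, name FROM customers WHERE id = ?", (key,)).fetchone()
        if row is None:
            new.execute("DELETE FROM customers WHERE id = ?", (key,))
        else:
            new.execute("INSERT OR REPLACE INTO customers (id, name) VALUES (?, ?)", row)
    new.commit()
    return len(keys)


replayed = replay_changes(old, new)  # run once the old database is shut off
final_rows = new.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
```

Only the keys are logged, not the changed values, so the replay always picks up the latest state of each record no matter how many times it changed.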
Your problem is compounded by the fact that you can't simply copy the data, you have to transform it. This means you probably have to shut down both databases and re-migrate the records based on the change list. It will take a lot of planning to ensure you get things right and I'd recommend writing a testing script that can validate that the resulting data is correct.
Also I'd ensure that the code for the migration runs inside one of the databases if possible. Otherwise you are copying the data twice and this will significantly harm the performance.