What DB changes require a re-syncing of dbml linq objects? - vb.net

What database table schema changes will require me to locate all the programs where I used a dbml file based on the table and re-compile? I love the advantages of linq but this maintenance seems like a big productivity hit.

Two kinds of changes will require you to re-sync the DBML and recompile your app:
1: Changes you want to incorporate in your code
E.g. you added a new field and you want to edit or show this field in your code. The L2S classes cannot know about the new field, so you have to re-sync them.
2: Changes that will break your existing code
E.g. if you change a column's datatype from varchar to int and your existing code does not know this, it will fail at runtime when it tries to insert a string into the int field.
"I love the advantages of linq but this maintenance seems like a big productivity hit."
Not really. Any other data access approach would need some kind of maintenance in these cases.

Related

Change Database schema but keep data access compatible

I'm working on a fairly large project rewriting software in .NET C#. The original software was written in Visual DataFlex for DOS and used to store data in a separate file for each table. We now use a driver program for the original software, which means the database is now in SQL. What I think really needs to change is the database schema. Could we write SQL triggers so that when the driver program reads or writes the old schema, the data is also read from and written to a new schema? That way we could change the schema but keep everything working. I haven't used triggers before, so I am not sure whether this would work or what disadvantages or problems it could have.
I left the project that was rewriting this system, but since then I have used views to change the way the data structure appears, and I have used triggers a bit more. I now agree with the comments on this question: changing the schema is a big job, and it is better to focus on rewriting the code first, written in a way that can handle a schema change later on.

Solution For Updating LINQ to SQL Files After Database Schema Change

I recently started using LINQ to SQL in my database layer for a C# Windows Forms project. Until now, I have been very impressed with how fast I have been able to implement the data access layer. The problem that I am facing is similar to the post from 2008 below
Best way to update LINQ to SQL classes after database schema change
In short, I am struggling to find an efficient solution for updating the LINQ to SQL files after making minor changes to the database such as constraints, foreign keys, new columns, etc...
Thus far, I have merely been deleting the tables in the LINQ to SQL designer and dragging them back onto the designer. However, I now need to rename many of the associations in the designer. The problem is that each time I re-create the LINQ to SQL files, I lose the changes that I made to them manually. Can someone tell me if there are any new solutions or methods for solving this problem? The post that I have included, as well as many other dated sources of information, mentions that SQLMetal and Huagati are good tools. Additionally, I have read that you can create your LINQ to SQL files manually rather than auto-generating them with the designer (this is what I had to do when using Hibernate with Java).
I know that manually creating the domain classes and mapping files will be time-consuming. I am not familiar with SQLMetal or Huagati. Can anyone recommend the most elegant or preferred way to deal with this issue? I know that I could use Entity Framework, but I have inherited this project and I am under a very tight deadline. I can refactor it to another framework once I have this phase complete.
After much research and reading, I have determined that the best solution for updating my DBML after minor database changes is to manually edit the file. The procedure used to update the DBML is below:
Right-click on the DBML file
Open with XML Editor
Add or change the columns in the affected table
Add or change any associations
Save the DBML
Rebuild the project
This is not ideal, but once it has been done a few times it is pretty painless for the types of changes that I occasionally need to make to the database, such as changing data types, adding keys, etc.
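The manual edit described above can also be scripted. What follows is only a sketch, not part of the original answer: the sample DBML is a heavily trimmed, hypothetical file, the `add_column` helper is made up for illustration, and the namespace URI and attribute names are written from memory of what the designer emits, so check them against your own .dbml before relying on this.

```python
import xml.etree.ElementTree as ET

# Namespace assumed for LINQ to SQL .dbml files; verify against your own file.
DBML_NS = "http://schemas.microsoft.com/linqtosql/dbml/2007"
ET.register_namespace("", DBML_NS)  # keep it as the default namespace on output

# Hypothetical, trimmed-down .dbml content used only for this illustration.
SAMPLE_DBML = """<Database Name="Shop" xmlns="http://schemas.microsoft.com/linqtosql/dbml/2007">
  <Table Name="dbo.Customer" Member="Customers">
    <Type Name="Customer">
      <Column Name="Id" Type="System.Int32" DbType="Int NOT NULL" IsPrimaryKey="true" CanBeNull="false" />
      <Column Name="Name" Type="System.String" DbType="VarChar(50)" CanBeNull="true" />
    </Type>
  </Table>
</Database>"""


def add_column(dbml_text, table_name, attrs):
    """Return DBML text with a new <Column> appended to the given table's <Type>."""
    root = ET.fromstring(dbml_text)
    for table in root.findall(f"{{{DBML_NS}}}Table"):
        if table.get("Name") != table_name:
            continue
        type_el = table.find(f"{{{DBML_NS}}}Type")
        existing = {c.get("Name") for c in type_el.findall(f"{{{DBML_NS}}}Column")}
        if attrs["Name"] in existing:
            raise ValueError(f"column {attrs['Name']} already exists")
        ET.SubElement(type_el, f"{{{DBML_NS}}}Column", attrs)
        return ET.tostring(root, encoding="unicode")
    raise KeyError(f"table {table_name} not found")


# Mirror a new nullable varchar column that was added to the database.
updated = add_column(
    SAMPLE_DBML,
    "dbo.Customer",
    {"Name": "Email", "Type": "System.String", "DbType": "VarChar(100)", "CanBeNull": "true"},
)
print(updated)
```

Associations could be handled the same way, but they need matching entries on both tables, so it is easier to get them wrong than a plain column.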
I don't touch the DBML or the LINQ to SQL generated files because there is a risk that my changes would be overwritten. I put my changes only in my own partial classes. When the database schema changes, I remove the old table from the DBML designer and drag the new table onto it.

NHibernate and code first

Do you use SchemaExport and SchemaUpdate in real applications? Do you create the model first and then generate the schema? Does it work? Or do you use it only for tests?
Usually, I create the db (using a Visual Studio database project) and then the mappings and persistent classes, or EF entities using the designer. But now I want to try the code-first approach with Fluent NHibernate.
I have researched SchemaExport and SchemaUpdate and found some issues. For example, SchemaUpdate doesn't delete db objects, creates NOT NULL columns as nullable if the table already exists, doesn't generate a primary key on many-to-many tables, and so on. This means that I have to recreate the db very often. But what about the data? And how do I deploy changes to the production db?
I want to know: do you really use code first and SchemaExport (SchemaUpdate) in your applications? Maybe you can give me some advice.
I use SchemaUpdate in production. It is safe precisely because it never does destructive operations like deleting columns. However, it is not a comprehensive solution for updating your database. If you use it, you will still have to supplement it with scripts that handle things like deletions (as you mention), indexes, column type changes, added table data, etc. But SchemaUpdate covers the 90% case for me.
The only downside I've discovered is that over time it seems to occasionally add duplicate foreign-key constraints to my table.
One more thing: you should run SchemaUpdate manually from a build tool, not your app itself. It is not safe to give your application the rights to modify your db schema!
I use SchemaUpdate/SchemaExport for rapid evolution of my model, but they are not a replacement for a database migration tool. As you mention, data cannot be migrated in a sensible manner in many cases. The tool does not have enough context. (e.g. How can you automatically migrate a FullName column to FirstName/LastName?) I answered a similar question here where I discuss db migration tools in the context of NHibernate.
NHibernate, ORM : how is refactoring handled? existing data?
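To make the FullName example concrete, here is a small hand-written migration sketch using Python's built-in sqlite3 module. The `person` table, the column names, and the splitting rule are all assumptions made for illustration. The point is that the rule in `split_full_name` is a human decision that a schema-diffing tool has no way to infer.

```python
import sqlite3


def split_full_name(full_name):
    """Assumed rule: the first word is the first name, the rest is the last name."""
    first, _, last = full_name.strip().partition(" ")
    return first, last.strip()


def migrate(conn):
    """Add the new columns, then fill them from the old full_name column."""
    conn.execute("ALTER TABLE person ADD COLUMN first_name TEXT")
    conn.execute("ALTER TABLE person ADD COLUMN last_name TEXT")
    rows = conn.execute("SELECT id, full_name FROM person").fetchall()
    for person_id, full_name in rows:
        first, last = split_full_name(full_name)
        conn.execute(
            "UPDATE person SET first_name = ?, last_name = ? WHERE id = ?",
            (first, last, person_id),
        )
    conn.commit()


# Hypothetical starting schema and data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO person (full_name) VALUES (?)",
    [("Ada Lovelace",), ("Jean Claude Van Damme",), ("Plato",)],
)

migrate(conn)
migrated = conn.execute(
    "SELECT first_name, last_name FROM person ORDER BY id"
).fetchall()
for row in migrated:
    print(row)
```

The second row is split in a way its owner would probably not agree with, and the third has no last name at all. Deciding what to do with such rows is exactly the context that an automatic schema update lacks.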
Yes, you can use these in real applications; I do.
Of course, almost all the work happens in that first go. My practice has been to create a separate project that references the mappings in my main project assembly and handles database creation and the initial data import, if any.
Once the project is in production, I usually unload that project from the solution, but keep it around for reference or if I ever need to switch from create scripts to update scripts.
As for the way NHibernate creates the database, you have to do a little more specification in your Fluent mappings than you otherwise might. I like to specify null/not null, foreign key constraint names, etc. to have maximum control over the way the database gets created.
I don't think you'd ever want to use automapping in this scenario.
Just as with any generated code, whether it is POCO generation from a tool or database generation as in your question, it will probably get you 80% of the way there. From there it would be wise to tweak the other 20%, adding your indexes and any other performance tweaks, to get it just right.

Upgrade strategies for bad DB schema designs

I've shown up at a new job and discovered a database that is in dire need of some help. There are many, many things wrong with it, including:
No foreign keys...anywhere. They're faked by using ints and managing the relationship in code.
Practically every field can be NULL, which isn't really true
Naming conventions for tables and columns are practically non-existent
Varchars which are storing concatenated strings of relational information
Folks can argue, "It works", which it does. But moving forward, it's a total pain to manage all of this in code, and it opens us up to bugs IMO. Basically, the DB is being used as a flat file since it's not doing a whole lot of work.
I want to fix this. The issues I see now are:
We have a lot of data (migration, possibly tricky)
All of the DB logic is in code (with migration comes big code changes)
I'm also tempted to do something "radical" like moving to a schema-free DB.
What are some good strategies when faced with an existing DB built upon a poorly designed schema?
Enforce Foreign Keys: If a relationship exists in the domain, then it should have a Foreign Key.
Renaming existing tables/columns is fraught with danger, especially if there are many systems accessing the Database directly. Gotchas include tasks that run only periodically; these are often missed.
Of Interest: Scott Ambler's article: Introduction To Database Refactoring
and Catalog of Database Refactorings
Views are commonly used to transition between changing data models because of the encapsulation. A view looks like a table to the code that queries it, but it is a stored query rather than stored data, so you can change which column is returned for a given column alias as desired. This allows you to set up your codebase to use a view, so you can move from the old table structure to the new one without the application needing to be updated. But it means the view has to return the data in the existing format. For example - your current data model has:
SELECT t.column --a list of concatenated strings, assuming comma separated
FROM TABLE t
...so the first version of the view would be the query above, but once you created the new table that uses 3NF, the query for the view would use:
SELECT GROUP_CONCAT(t.column SEPARATOR ',')
FROM NEW_TABLE t
...and the application code would never know that anything changed.
The problem with MySQL is that its view support is limited - you can't use variables within a view, and (at least in older versions) a view can't contain a subquery in its FROM clause.
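The view switch described above can be tried end to end with a small sketch. This one uses SQLite through Python's sqlite3 module instead of MySQL, so `group_concat(tag, ',')` stands in for `GROUP_CONCAT(... SEPARATOR ',')`. The table and view names (`legacy_item`, `item_tag`, `item_v`) are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Old model: tags stored as one comma-separated string per item.
conn.execute("CREATE TABLE legacy_item (id INTEGER PRIMARY KEY, tags TEXT)")
conn.execute("INSERT INTO legacy_item VALUES (1, 'red,small')")

# Step 1: the application queries a view, not the table itself.
conn.execute("CREATE VIEW item_v AS SELECT id, tags FROM legacy_item")
before = conn.execute("SELECT id, tags FROM item_v").fetchall()

# Step 2: introduce the normalized table, one row per (item, tag).
conn.execute(
    "CREATE TABLE item_tag ("
    " item_id INTEGER NOT NULL,"
    " tag TEXT NOT NULL,"
    " PRIMARY KEY (item_id, tag))"
)
conn.executemany("INSERT INTO item_tag VALUES (?, ?)", [(1, "red"), (1, "small")])

# Step 3: redefine the view on top of the new table, keeping the old shape.
conn.execute("DROP VIEW item_v")
conn.execute(
    "CREATE VIEW item_v AS"
    " SELECT item_id AS id, group_concat(tag, ',') AS tags"
    " FROM item_tag GROUP BY item_id"
)
after = conn.execute("SELECT id, tags FROM item_v").fetchall()

# The application's query text did not change between 'before' and 'after'.
print(before)
print(after)
```

One caveat with this approach: SQLite does not promise any particular order for the concatenated values, so code that depends on the order of the items in the string needs extra care during the transition.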
In reality, the changes you wish to make amount to rewriting the application from the ground up. Moving logic from the codebase into the data model will drastically change how the application gets its data. A Model-View-Controller (MVC) structure is well suited to changes like these, because it minimizes the cost of similar changes in the future.
I'd say leave it alone until you really understand it. Then make sure you don't start with one of the Things You Should Never Do.
Read Scott Ambler's book on Refactoring Databases. It covers a good many techniques for how to go about improving a database - including the transitional measures needed to allow both old and new programs to work with the changing design.
Create a completely new schema and make sure that it is fully normalized and contains any unique, check and not null constraints etc that are required and that appropriate data types are used.
Prepopulate each table that fills the parent role in a foreign key relationship with a single 'Unknown' record.
Create an ETL (Extract Transform Load) process (I can recommend SSIS (SQL Server Integration Services) but there are plenty of others) that you can use to refill the new schema from the existing one on a regular basis. Use the 'Unknown' record as the parent of any orphaned records - there will be plenty ;). You will need to put some thought into how you will consolidate duplicate records - this will probably need to be on a case by case basis.
Use as many iterations as are necessary to refine your new schema (ensure that the ETL Process is maintained and run regularly).
Create views over the new schema that match the existing schema as closely as possible.
Incrementally modify any clients to use the new schema making temporary use of the views where necessary. You should be able to gradually turn off parts of the ETL process and eventually disable it completely.
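The 'Unknown' parent record and the repeatable ETL step can be sketched as follows. This is a toy version in Python with sqlite3, not SSIS. It assumes the old and new tables live in the same database, that the table names (`old_customer`, `old_purchase`, `customer`, `purchase`) are hypothetical, and that the legacy data never uses id 0, which is reserved here for the placeholder row.

```python
import sqlite3

UNKNOWN_ID = 0  # assumed to be unused by legacy data


def build_new_schema(conn):
    """Create the constrained schema and pre-populate the 'Unknown' parent."""
    conn.executescript(
        """
        CREATE TABLE customer (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE purchase (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            amount      INTEGER NOT NULL
        );
        """
    )
    conn.execute("INSERT INTO customer (id, name) VALUES (?, 'Unknown')", (UNKNOWN_ID,))
    conn.commit()


def run_etl(conn):
    """Refill the new tables from the legacy ones; safe to run repeatedly."""
    conn.execute("DELETE FROM purchase")
    conn.execute("DELETE FROM customer WHERE id <> ?", (UNKNOWN_ID,))
    conn.execute("INSERT INTO customer (id, name) SELECT id, name FROM old_customer")
    # Orphans (missing or NULL parent) are attached to the 'Unknown' customer.
    conn.execute(
        """
        INSERT INTO purchase (id, customer_id, amount)
        SELECT o.id,
               CASE WHEN c.id IS NULL THEN ? ELSE o.customer_id END,
               o.amount
        FROM old_purchase AS o
        LEFT JOIN old_customer AS c ON c.id = o.customer_id
        """,
        (UNKNOWN_ID,),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Legacy tables: no foreign keys, everything nullable.
conn.execute("CREATE TABLE old_customer (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE old_purchase (id INTEGER, customer_id INTEGER, amount INTEGER)")
conn.execute("INSERT INTO old_customer VALUES (1, 'Acme')")
conn.executemany(
    "INSERT INTO old_purchase VALUES (?, ?, ?)",
    [(10, 1, 500), (11, 99, 250), (12, None, 75)],
)
conn.commit()

build_new_schema(conn)
run_etl(conn)
run_etl(conn)  # a second run gives the same result
loaded = conn.execute("SELECT id, customer_id FROM purchase ORDER BY id").fetchall()
print(loaded)
```

Duplicate consolidation is deliberately left out, since, as noted above, it usually has to be decided case by case.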
First see how tightly the code is coupled to the DB. If it is all mixed in with no DAO layer, you shouldn't think about a rewrite. If there is a DAO layer, then it would be time to rewrite that layer and the DB along with it. If possible, build the migration tool on top of the two DAOs.
But my guess is there is no DAO, so you need to find which areas of the code you are going to be changing and which parts of the DB they relate to. Hopefully you can cut the work up into smaller parts that can be updated as you maintain the code. The biggest deal is to get FKs in there and to start checking for proper indexes; there is a good chance they aren't set up correctly.
I wouldn't worry too much about naming until the rest of the db is under control. As for the NULLs: if the program chokes on a value being NULL, don't let it be NULL, but if the program can handle it, I wouldn't worry about it at this point. In the future, if the code is supplying a default value, move that default to the DB, but that is way down the line from the sound of things.
Do something about the varchars sooner rather than later. If anything, make that the first pure background fix to the program.
The other thing to do is estimate the effort of each area's change and add that to the cost of new development on that section of code. That way you can fix the parts as you add new features.

Do you put your indexes in source control?

And how do you keep them in synch between test and production environments?
When it comes to indexes on database tables, my philosophy is that they are an integral part of writing any code that queries the database. You can't introduce new queries or change a query without analyzing the impact to the indexes.
So I do my best to keep my indexes in synch between all of my environments, but to be honest, I'm not doing very well at automating this. It's a sort of haphazard, manual process.
I periodically review index stats and delete unnecessary indexes. I usually do this by creating a delete script that I then copy back to the other environments.
But here and there indexes get created and deleted outside of the normal process and it's really tough to see where the differences are.
I've found one thing that really helps is to go with simple, numeric index names, like
idx_t_01
idx_t_02
where t is a short abbreviation for a table. I find index maintenance impossible when I try to get clever with all the columns involved, like,
idx_c1_c2_c5_c9_c3_c11_5
It's too hard to differentiate indexes like that.
Does anybody have a really good way to integrate index maintenance into source control and the development lifecycle?
Indexes are a part of the database schema and hence should be source controlled along with everything else. Nobody should go around creating indexes on production without going through the normal QA and release process, particularly performance testing.
There have been numerous other threads on schema versioning.
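For the "tough to see where the differences are" problem raised in the question, a small comparison script can at least make the drift visible before it is scripted and checked in. The sketch below is an assumption-laden illustration: it uses SQLite's `sqlite_master` catalog via Python's sqlite3 module, whereas on another DBMS you would read that system's own catalog, and the two in-memory databases stand in for test and production.

```python
import sqlite3


def index_definitions(conn):
    """Map index name -> CREATE INDEX text for explicitly created indexes."""
    rows = conn.execute(
        "SELECT name, sql FROM sqlite_master WHERE type = 'index' AND sql IS NOT NULL"
    )
    return {name: sql for name, sql in rows}


def diff_indexes(first, second):
    """Report indexes missing on either side or defined differently."""
    a, b = index_definitions(first), index_definitions(second)
    return {
        "only_in_first": sorted(set(a) - set(b)),
        "only_in_second": sorted(set(b) - set(a)),
        # Plain text comparison: whitespace differences also count as changes.
        "changed": sorted(n for n in set(a) & set(b) if a[n] != b[n]),
    }


# Two hypothetical environments that have drifted apart.
test_db = sqlite3.connect(":memory:")
prod_db = sqlite3.connect(":memory:")
for db in (test_db, prod_db):
    db.execute("CREATE TABLE t (c1 INTEGER, c2 INTEGER)")

test_db.execute("CREATE INDEX idx_t_01 ON t (c1)")
test_db.execute("CREATE INDEX idx_t_02 ON t (c2)")
prod_db.execute("CREATE INDEX idx_t_01 ON t (c1, c2)")
prod_db.execute("CREATE INDEX idx_t_03 ON t (c2)")

report = diff_indexes(test_db, prod_db)
print(report)
```

A report like this only shows the drift. The fix is still to put the agreed index definitions into the versioned schema scripts and apply them through the normal release process.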
The full schema for your database should be in source control right beside your code. When I say "full schema" I mean table definitions, queries, stored procedures, indexes, the whole lot.
When doing a fresh installation, you:
- check out version X of the product.
- from the "database" directory of your checkout, run the database script(s) to create your database.
- use the codebase from your checkout to interact with the database.
When you're developing, every developer should be working against their own private database instance. When they make schema changes, they check in a new set of schema definition files that work against their revised codebase.
With this approach you never have codebase-database sync issues.
Yes, any DML or DDL changes are scripted and checked in to source control, mostly thru activerecord migrations in rails. I hate to continually toot rails' horn, but in many years of building DB-based systems I find the migration route to be so much better than any home-grown system I've used or built.
However, I do name all my indexes (don't let the DBMS come up with whatever crazy name it picks). Don't prefix them, that's silly (because you have type metadata in sysobjects, or in whatever db you have), but I do include the table name and columns, e.g. tablename_col1_col2.
That way, if I'm browsing sysobjects I can easily see the indexes for a particular table (it's also force of habit: wayyyy back in the day, on some DBMS I used, index names had to be unique across the whole DB, so the only way to ensure that was to include the table name).
I think there are two issues here: the index naming convention, and adding database changes to your source control/lifecycle. I'll tackle the latter issue.
I've been a Java programmer for a long time now, but have recently been introduced to a system that uses Ruby on Rails for database access for part of the system. One thing that I like about RoR is the notion of "migrations". Basically, you have a directory full of files that look like 001_add_foo_table.rb, 002_add_bar_table.rb, 003_add_blah_column_to_foo.rb, etc. These Ruby source files extend a parent class, overriding methods called "up" and "down". The "up" method contains the set of database changes needed to bring the previous version of the database schema to the current version. Similarly, the "down" method reverts the change back to the previous version. When you want to set the schema to a specific version, the Rails migration scripts check the database to see what the current version is, then find the .rb files that get you from there up (or down) to the desired revision.
To make this part of your development process, you can check these into source control, and season to taste.
There's nothing specific or special about Rails here, just that it's the first time I've seen this technique widely used. You can probably use pairs of SQL DDL files, too, like 001_UP_add_foo_table.sql and 001_DOWN_remove_foo_table.sql. The rest is a small matter of shell scripting, an exercise left to the reader.
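As a rough illustration of the up/down idea outside Rails, here is a minimal runner sketched in Python with sqlite3 instead of shell. Everything in it is hypothetical: the `MIGRATIONS` list stands in for the numbered files, `schema_version` is an invented bookkeeping table, and versions are assumed to be contiguous integers starting at 1.

```python
import sqlite3

# (version, description, up SQL, down SQL); stands in for the numbered files.
MIGRATIONS = [
    (1, "add foo table", "CREATE TABLE foo (id INTEGER PRIMARY KEY, name TEXT)", "DROP TABLE foo"),
    (2, "add bar table", "CREATE TABLE bar (id INTEGER PRIMARY KEY)", "DROP TABLE bar"),
    (3, "add index on foo.name", "CREATE INDEX foo_name ON foo (name)", "DROP INDEX foo_name"),
]


def current_version(conn):
    """Read the stored schema version, initialising it to 0 on first use."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        conn.commit()
        return 0
    return row[0]


def migrate_to(conn, target):
    """Run 'up' or 'down' steps until the database is at the target version."""
    version = current_version(conn)
    if target > version:
        steps = [(v, up) for v, _, up, _ in MIGRATIONS if version < v <= target]
    else:
        # Undo in reverse order; each 'down' lands on the previous version.
        steps = [(v - 1, down) for v, _, _, down in reversed(MIGRATIONS) if target < v <= version]
    for new_version, sql in steps:
        conn.execute(sql)
        conn.execute("UPDATE schema_version SET version = ?", (new_version,))
        conn.commit()
    return current_version(conn)


def object_names(conn):
    """List user tables and indexes, ignoring the bookkeeping table."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master"
        " WHERE type IN ('table', 'index')"
        " AND name NOT LIKE 'sqlite_%' AND name <> 'schema_version'"
        " ORDER BY name"
    )
    return [name for (name,) in rows]


conn = sqlite3.connect(":memory:")
version_up = migrate_to(conn, 3)
names_up = object_names(conn)
version_down = migrate_to(conn, 1)
names_down = object_names(conn)
print(version_up, names_up)
print(version_down, names_down)
```

Note that indexes travel through the same mechanism as tables here, which is one way to keep them in step across environments.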
I always source-control SQL (DDL, DML, etc). It's code like any other. It's good practice.
I am not sure indexes should be the same across different environments since they have different data sizes. Unless your test and production environments have the same exact data, the indexes would be different.
As to whether they belong in source control, I am not really sure.
I do not put my indexes in source control, but rather the creation scripts for the indexes. ;-)
Index-naming:
IX_CUSTOMER_NAME for the field "name" in the table "customer"
PK_CUSTOMER_ID for the primary key,
UI_CUSTOMER_GUID, for the GUID-field of the customer which is unique (therefore the "UI" - unique index).
On my current project, I have two things in source control - a full dump of an empty database (using pg_dump -c so it has all the ddl to create tables and indexes) and a script that determines what version of the database you have, and applies alters/drops/adds to bring it up to the current version. The former is run when we're installing on a new site, and also when QA is starting a new round of testing, and the latter is run at every upgrade. When you make database changes, you're required to update both of those files.
In a Grails app, the indexes are stored in source control by default, since you define the index inside the file that represents your domain object. Just offering the 'Grails' perspective as an FYI.