NHibernate and code first

Do you use SchemaExport and SchemaUpdate in real applications? Do you create the model first and then generate the schema from it? Does that actually work, or do you only use them for tests?
Usually I create the database first (using a Visual Studio database project) and then write the mappings and persistent classes, or generate EF entities with the designer. But now I want to try a code-first approach with Fluent NHibernate.
I have looked into SchemaExport and SchemaUpdate and found some issues. For example, update doesn't delete database objects, it creates not-null columns as nullable if the table already exists, it doesn't generate a primary key on many-to-many tables, and so on. That means I would have to recreate the database very often. But what about the data? And how would I deploy changes to the production database?
So: do you really use code first and SchemaExport/SchemaUpdate in your applications? Maybe you can give me some advice...

I use SchemaUpdate in production. It is safe precisely because it never performs destructive operations like deleting columns. However, it is not a comprehensive solution for updating your database. If you use it you will still have to supplement it with scripts for the things it won't do: deletions (as you mention), indexes, changing column types, seeding table data, etc. But SchemaUpdate covers the 90% case for me.
The only downside I've discovered is that over time it seems to occasionally add duplicate foreign-key constraints to my table.
One more thing: you should run SchemaUpdate manually from a build tool, not your app itself. It is not safe to give your application the rights to modify your db schema!
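A rough sketch of what such a separate tool might look like, assuming a Fluent NHibernate setup (the mapping class, database name and connection string here are made up):

// Hypothetical console tool run from the build, never from the application itself.
using FluentNHibernate.Cfg;
using FluentNHibernate.Cfg.Db;
using NHibernate.Cfg;
using NHibernate.Tool.hbm2ddl;

public class UpdateSchemaTool
{
    public static void Main()
    {
        // Build the same configuration the application uses for its mappings.
        Configuration cfg = Fluently.Configure()
            .Database(MsSqlConfiguration.MsSql2008
                .ConnectionString("Server=.;Database=MyAppDb;Integrated Security=SSPI;"))
            .Mappings(m => m.FluentMappings.AddFromAssemblyOf<OrderMap>())
            .BuildConfiguration();

        // First flag: echo the generated DDL to stdout; second flag: apply it to the database.
        new SchemaUpdate(cfg).Execute(true, true);
    }
}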

I use SchemaUpdate/SchemaExport for rapid evolution of my model, but they are not a replacement for a database migration tool. As you mention, data cannot be migrated in a sensible manner in many cases. The tool does not have enough context. (e.g. How can you automatically migrate a FullName column to FirstName/LastName?) I answered a similar question here where I discuss db migration tools in the context of NHibernate.
NHibernate, ORM : how is refactoring handled? existing data?

Yes, you can use these in real applications; I do.
Of course, almost all the work happens in that first go. My practice has been to create a separate project that references the mappings in my main project assembly and handles database creation and the initial data import, if any.
Once the project is in production, I usually unload that project from the solution, but keep it around for reference or if I ever need to switch from create scripts to update scripts.
As for the way NHibernate creates the database, you have to do a little more specification in your Fluent mappings than you otherwise might. I like to specify null/not null, foreign key constraint names, etc. to have maximum control over the way the database gets created.
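For instance, a sketch of that kind of fully explicit mapping (the entities and names here are invented):

using System;
using FluentNHibernate.Mapping;

public class Customer
{
    public virtual int Id { get; set; }
}

public class Order
{
    public virtual int Id { get; set; }
    public virtual DateTime PlacedOn { get; set; }
    public virtual string Notes { get; set; }
    public virtual Customer Customer { get; set; }
}

// Everything the schema needs is stated explicitly rather than left to defaults,
// so SchemaExport produces exactly the database you expect.
public class OrderMap : ClassMap<Order>
{
    public OrderMap()
    {
        Table("Orders");
        Id(x => x.Id).GeneratedBy.Identity();
        Map(x => x.PlacedOn).Not.Nullable();
        Map(x => x.Notes).Nullable().Length(2000);
        References(x => x.Customer)
            .Column("CustomerId")
            .Not.Nullable()
            .ForeignKey("FK_Orders_Customers"); // control the generated constraint name
    }
}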
I don't think you'd ever want to use automapping in this scenario.

As with any code generation, whether it's POCO generation from a tool or database generation as in your question, it will probably get you 80% of the way there. From there it's wise to tweak the remaining 20% yourself, adding your indexes and any other performance adjustments to get it just right.

Related

Which ORM frameworks will build and execute the SQL DDL for you?

Entity Framework Code First will build the database for you if it doesn't exist and structure it based on your mapping objects. I believe Roundhouse will do the same thing with Fluent Mapping files using NHibernate.
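For reference, a minimal EF 4.1 code-first sketch of that behaviour (the classes are made up; CreateDatabaseIfNotExists is actually the default initializer, shown explicitly here):

using System.Data.Entity;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class ShopContext : DbContext
{
    public DbSet<Product> Products { get; set; }
}

public class Program
{
    public static void Main()
    {
        // On first use, EF builds the database from the model if it does not exist yet.
        Database.SetInitializer(new CreateDatabaseIfNotExists<ShopContext>());

        using (var db = new ShopContext())
        {
            db.Products.Add(new Product { Name = "Widget" });
            db.SaveChanges();
        }
    }
}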
Are there any other ORM's (or tools like Roundhouse) that will take care of all your SQL DDL creation and execution?
NHibernate does not need Fluent Mappings to generate database schema. This feature is built into the NHibernate core:
new SchemaExport(_configuration).Execute(false, true, false);
In my experience, however, this is mostly useful for in-memory integration tests or initial rollouts. Production databases need to be upgraded in place. If the project sticks around, you will need to add and remove columns, tables and foreign keys without affecting data. There is a continuity and versioning aspect to it. NHibernate only knows your current mapping. It does not know, for example, that two months ago you stored your customer's first and last name in a column called "CustomerName" and then decided to split it into two columns, "FirstName" and "LastName" (which is probably the most primitive change that can be made). NHibernate's job is to map your current schema to objects, not to remember the data-modeling choices you made a few years ago.
In my experience there is no magic tool that will write upgrade scripts; they have to be written manually, or at least reviewed by a developer. Tools can give you a framework for executing these scripts, like RoundhousE. Scott Allen has an excellent series about the 'forward-only, run-once' approach.
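To illustrate the idea, here is a very rough sketch of a 'forward-only, run-once' runner, assuming a SchemaVersions table that records which scripts have already been applied (all names are hypothetical, and a real tool would also handle GO batches and transactions):

using System;
using System.Data.SqlClient;
using System.IO;
using System.Linq;

public class ScriptRunner
{
    public static void Main(string[] args)
    {
        string connectionString = args[0];
        string scriptDirectory = args[1];

        using (var conn = new SqlConnection(connectionString))
        {
            conn.Open();

            // Apply each script at most once, in file-name order.
            foreach (var file in Directory.GetFiles(scriptDirectory, "*.sql").OrderBy(f => f))
            {
                string name = Path.GetFileName(file);

                var check = new SqlCommand(
                    "SELECT COUNT(*) FROM SchemaVersions WHERE ScriptName = @name", conn);
                check.Parameters.AddWithValue("@name", name);
                if ((int)check.ExecuteScalar() > 0)
                    continue; // already ran against this database

                new SqlCommand(File.ReadAllText(file), conn).ExecuteNonQuery();

                var record = new SqlCommand(
                    "INSERT INTO SchemaVersions (ScriptName, AppliedOn) VALUES (@name, GETDATE())", conn);
                record.Parameters.AddWithValue("@name", name);
                record.ExecuteNonQuery();

                Console.WriteLine("Applied " + name);
            }
        }
    }
}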
Hibernate does, if you set hbm2ddl.auto to create, create-drop or update in the config file. That's with Java; I assume it's the same for NHibernate.
If an ORM does not do DDL there is not much point in having it; well, it's a key feature at least, IMHO.
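It is the same property name in NHibernate. A small sketch of setting it in code instead of the XML config:

using NHibernate.Cfg;

var cfg = new Configuration().Configure();   // reads hibernate.cfg.xml as usual
cfg.SetProperty("hbm2ddl.auto", "update");   // or "create" / "create-drop" / "validate"
var sessionFactory = cfg.BuildSessionFactory();  // the schema is created/updated when the factory is built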

Entity Framework - Schema Upgrade, Multiple DBMS, and Code First

I'm looking into using Microsoft's Entity Framework in an upcoming project which is a point release of an existing product. Our current product supports two DBMS (Oracle and SQL Server), the schema of each is maintained in separate .sql script files.
The entity framework (4.1) looks appealing because it allows various scenarios to be implemented automatically via code generation, reflection, etc. However, as far as I can tell, some of these benefits appear to be mutually exclusive of others.
For example, to support multiple DBMSes, I am inferring that I would need to use a model-first or code-first design, in which case EF would generate the schema for each according to the model (I have seen little to no posts or documentation on this, so I may be wrong). This means that our existing schema would need to be either abandoned (model-first) or mapped (code-first). Additionally, updating the schema would require manual scripts, as EF does not appear to support schema upgrades (without wiping out data).
Are model-first and code-first the only viable means of supporting multiple DBMSes in EF? I realize that technically it would be impossible to guarantee that two arbitrary schemas are the same, so I am thinking this is true.
Are there any potential pitfalls of code-first and mapping to multiple DBMS systems? For example, Oracle does not have auto-increment columns; you have to use sequences. How is this mapped in the DbContext? Do I need to create separate maps for each DBMS?
Does EF support any mechanism to upgrade an existing DBMS schema to one that is representative of the EF model (schema recreation != upgrade), or am I limited to doing this manually?
I did come up with one possible way to use database first and support multiple DBMSes, but it is a maintenance nightmare. The idea was to add another layer of abstraction over the two generated data models and create converter classes for each of the EF-generated models. This seems like the best way of doing it so that each DBMS could potentially have its own model, yet my code would handle the mapping. But in doing this, what am I really gaining from EF? Maybe query generation, but is that worth it?
Actually both model-first and database-first have the same constraint. Both approaches use an EDMX file whose SSDL part (the description of the store, i.e. the database layer) is tied directly to a single database provider, so if you want two different database providers you must have two different SSDL parts and keep them in sync. You can use a single CSDL (the description of the conceptual layer, i.e. your model classes) and a single MSL or two MSLs (the description of the mapping between SSDL and CSDL; a single file is possible only if tables and columns have exactly the same names in both SSDLs). As far as I know, an EDMX file can contain only a single SSDL, CSDL and MSL part, so I expect the designer has no support for this scenario and you will have to modify the second SSDL manually or use two EDMXs and model each change twice.
The code-first approach can make this much simpler, but the question is how good the Oracle provider is when using code first and database generation. The provider is responsible for correctly interpreting the required features, such as sequences for auto-increment columns.
EF itself currently has no support for upgrading an existing DB. When using EDMX, the database-generation process is controlled either by a T4 template or by a workflow, so it can be customized, and there is already a separate feature called Entity Designer Database Generation Power Pack which allows incremental building of the database with the model-first approach. The problem is that this feature uses the VS Database tools, and I think those tools work only with SQL Server. I have never liked these automated tools, so I still think the database upgrade should be controlled manually, with the help of some tool to produce a difference script between the current and the last deployed database version. You should need the diff script only when deploying the new version to a production environment; in testing and development environments you can always recreate the whole database.
There should be no abstraction needed when working with two EDMX models. The models must produce the same conceptual layer. In that case you need only a single set of POCO classes which are mapped by convention (same class name as the entity, same properties with the same types and accessibility), so they will work with both models.
Edit:
Based on @Tridus' answer, I'm just adding that you can create the databases first and use the fluent API from EF 4.1 to map them. Your databases must have exactly the same schema (table names, column names, etc.), and they can't use any provider-specific features (I hope sequences will not be a problem, because they are just the way Oracle handles auto-increment columns).
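A small sketch of that fluent mapping against an existing schema (all the table and column names are invented):

using System.Data.Entity;

public class Customer
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

public class ExistingDbContext : DbContext
{
    public DbSet<Customer> Customers { get; set; }

    protected override void OnModelCreating(DbModelBuilder modelBuilder)
    {
        // Map onto the schema that already exists; the same names must be used in
        // both the SQL Server and the Oracle database for a single model to work.
        modelBuilder.Entity<Customer>().ToTable("CUSTOMERS");
        modelBuilder.Entity<Customer>().Property(c => c.Id).HasColumnName("CUSTOMER_ID");
        modelBuilder.Entity<Customer>().Property(c => c.FirstName).HasColumnName("FIRST_NAME");
    }
}

// Somewhere at startup: tell EF not to try creating or changing the existing databases.
// Database.SetInitializer<ExistingDbContext>(null);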
This is actually fairly doable with a database-first design, but there are some caveats you won't be able to get around easily, because the databases handle things differently.
Sequences are one (in that they're ignored by EF entirely). You can fake it in Oracle by putting a trigger on the table that populates the column on insert, but I also found that if you have to update the model later, EF "forgets" that the column is an identity column and tries to stick a 0 in it again. I also found it unreliable to get the new ID back in Oracle when using a trigger. We just wound up selecting from the sequence and setting the ID on the object before doing the insert, because that's how you usually do it in Oracle. You could also use a stored procedure that handles it.
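The 'select from the sequence first' approach might look roughly like this with an EF 4.1 DbContext (MyOracleContext, Order and ORDERS_SEQ are all made up):

// MyOracleContext, Order and the ORDERS_SEQ sequence are hypothetical.
using (var db = new MyOracleContext())
{
    // Ask Oracle for the next sequence value ourselves...
    decimal nextId = db.Database
        .SqlQuery<decimal>("SELECT ORDERS_SEQ.NEXTVAL FROM DUAL")
        .Single();

    // ...then insert with the id already set; no identity column or trigger is involved,
    // so there is nothing for EF to overwrite.
    db.Orders.Add(new Order { Id = (int)nextId, PlacedOn = DateTime.UtcNow });
    db.SaveChanges();
}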
Numbers aren't handled the same way. SQL Server uses number types that map cleanly to Int32, Int64, etc. Oracle's number format is totally different, and a full-range Int32 in SQL Server becomes a NUMBER(10,0) in Oracle... which EF treats as an Int64 because it can hold more than an Int32. I also found that Oracle's EF provider likes to use Decimal a lot even when it doesn't have to, but that's probably just a beta issue.
Stored Procedures in Oracle require some values to be put in app.config/web.config in order to work in EF. I'm not sure if that's going to just be clutter in SQL Server or if it'll cause problems.
Finally, EF Code First is pretty immature and according to the docs doesn't support changing the database structure in this version. I'm not sure if Oracle's provider supports it either (it might, haven't tried it).
Most of this is stuff you can get around, but you're going to need to do some work to hide the differences from the rest of your code and it'll probably take a wrapper layer to do it.
edit - In regard to your #4: EF 4.1 can generate partial POCO classes. Instead of writing a wrapper around each of the generated models to hide the differences, you can create another partial-class code file that won't be regenerated when you update the model, and add properties/methods there that hide the differences. Your app code just has to use those instead, and they handle the issue (for the number issue I mentioned, you could hide it completely behind another property that does the necessary casting for Oracle).
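A sketch of that partial-class trick for the number issue (the class and property names are invented):

// Generated file (regenerated whenever the model is updated): Oracle's NUMBER(10,0)
// comes through as long.
public partial class Invoice
{
    public long Id { get; set; }
}

// Hand-written file in the same project; it survives regeneration.
public partial class Invoice
{
    // Application code uses InvoiceId and never sees the provider-specific type.
    public int InvoiceId
    {
        get { return (int)Id; }
        set { Id = value; }
    }
}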

Is retrofitting Entity Framework into my app appropriate for my situation?

So, I have an app that uses a SQL Server Express DB. I have about 80ish tables, all with a primary key but no foreign keys. (The reason we have no foreign keys is how we do our SQL client-to-server replication. It's not true replication but a sync system that was in place when we took over the app. We have no guarantee which records will make it to the database first when a client syncs to the server, so it is possible for a record to arrive with a foreign key that points to a nonexistent record.)
We use a type-per-model convention: for each of our business objects there is a table in the DB. We currently use stored procedures for every database operation, which means every new class brings at least four new stored procedures (CRUD). We have abstracted our data access layer away from our business objects; each business object has a corresponding businessObjectDAO.
My question is, is entity framework feasible for me to move to? With no foreign key relationships I'm going to have to set up every association between tables manually. Is it worth the time to do this?
My biggest hang up right now is trying to figure out how I map my DAOs to the EF partial classes.
Should I be creating one big .edmx or multiple?
A lot of questions I know. This is my first big architectural type decision and I've been given the go ahead to make the change if I think it is beneficial and feasible.
Maybe I should try LINQ to SQL? NHibernate is out because we're not allowed to use open-source products in production (stupid, I know).
Thanks
Cody
My personal recommendation is that if something is working, leave it alone. I am a big, big fan of both LINQ to SQL and Entity Framework, and I have managed to get my workplace to adopt LINQ to SQL. I realize that if you brought one of these into your project, maintainability would probably improve, but by the sound of it the initial migration would be more work than it's worth in the end.

Upgrade strategies for bad DB schema designs

I've shown up at a new job and discovered a database which is in dire need of some help. There are many, many things wrong with it, including:
No foreign keys...anywhere. They're faked by using ints and managing the relationship in code.
Practically every field can be NULL, even though most of them should never be
Naming conventions for tables and columns are practically non-existent
Varchars which are storing concatenated strings of relational information
Folks can argue, "It works", and it does. But moving forward it's a total pain to manage all of this in code, and it opens us up to bugs, IMO. Basically, the DB is being used as a flat file, since it's not doing a whole lot of work.
I want to fix this. The issues I see now are:
We have a lot of data (migration, possibly tricky)
All of the DB logic is in code (so migration means big code changes)
I'm also tempted to do something "radical" like moving to a schema-free DB.
What are some good strategies when faced with an existing DB built upon a poorly designed schema?
Enforce Foreign Keys: If a relationship exists in the domain, then it should have a Foreign Key.
Renaming existing tables/columns is fraught with danger, especially if there are many systems accessing the Database directly. Gotchas include tasks that run only periodically; these are often missed.
Of Interest: Scott Ambler's article: Introduction To Database Refactoring
and Catalog of Database Refactorings
Views are commonly used to transition between changing data models because of the encapsulation they provide. A view looks like a table, but does not exist as a physical object in the database - you can change which column is returned for a given column alias as desired. This allows you to set up your codebase to use a view, so you can move from the old table structure to the new one without the application needing to be updated. But it means the view has to return the data in the existing format. For example, your current data model has:
SELECT t.column --a list of concatenated strings, assuming comma separated
FROM TABLE t
...so the first version of the view would be the query above, but once you created the new table that uses 3NF, the query for the view would use:
SELECT GROUP_CONCAT(t.column SEPARATOR ',') -- reassemble the comma-separated list
FROM NEW_TABLE t
GROUP BY t.parent_id -- assuming the normalized rows are keyed back to the old table's id
...and the application code would never know that anything changed.
The problem with MySQL is that its view support is limited: you can't use variables within a view, nor can a view contain subqueries.
The reality of the changes you wish to make is that you are effectively rewriting the application from the ground up. Moving logic from the codebase into the data model will drastically change how the application gets its data. Model-View-Controller (MVC) is an ideal pattern to have in place for changes like these, to minimize the cost of future ones.
I'd say leave it alone until you really understand it. Then make sure you don't start with one of the Things You Should Never Do.
Read Scott Ambler's book on Refactoring Databases. It covers a good many techniques for how to go about improving a database - including the transitional measures needed to allow both old and new programs to work with the changing design.
Create a completely new schema and make sure that it is fully normalized and contains any unique, check and not null constraints etc that are required and that appropriate data types are used.
Prepopulate each table that fills the parent role in a foreign key relationship with a single 'Unknown' record.
Create an ETL (Extract Transform Load) process (I can recommend SSIS (SQL Server Integration Services) but there are plenty of others) that you can use to refill the new schema from the existing one on a regular basis. Use the 'Unknown' record as the parent of any orphaned records - there will be plenty ;). You will need to put some thought into how you will consolidate duplicate records - this will probably need to be on a case by case basis.
Use as many iterations as are necessary to refine your new schema (ensure that the ETL Process is maintained and run regularly).
Create views over the new schema that match the existing schema as closely as possible.
Incrementally modify any clients to use the new schema making temporary use of the views where necessary. You should be able to gradually turn off parts of the ETL process and eventually disable it completely.
First, see how tightly the code is coupled to the DB. If it is all mixed together with no DAO layer, you shouldn't think about a rewrite; but if there is a DAO layer, then it may be time to rewrite that layer and the DB along with it. If possible, build the migration tool on top of the two DAOs.
But my guess is there is no DAO, so you need to find which areas of the code you are going to be changing and which parts of the DB they relate to. Hopefully you can cut the work into smaller parts that can be updated as you maintain the application. The biggest deal is to get FKs in there and to start checking for proper indexes; there is a good chance they aren't being done correctly.
I wouldn't worry too much about naming until the rest of the DB is under control. As for the NULLs: if the program chokes on a NULL value, don't let the column be NULL, but if the program can handle it, I wouldn't worry about it at this point. If the code is applying a default value, eventually move that into the DB, but that is way down the line from the sound of things.
Do something about the varchars sooner rather than later. If anything, make that the first pure background fix to the program.
The other thing to do is estimate the effort of changing each area and then add that price to the cost of new development on that section of code. That way you can fix the parts as you add new features.

How can I generate "migration" DDL from NHibernate mapping files?

I'm using NHibernate 2 and PostgreSQL in my project. The SchemaExport class does a great job of generating the DDL schema for the database, but that only helps until the application is first deployed.
Is there any way to generate "migration" DDL (a batch of ALTER TABLEs instead of a DROP/CREATE pair) using the NHibernate mapping files?
Look into SchemaUpdate. It has a very similar API to SchemaExport, but it only generates migrations.
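A small sketch: with the first flag on and the second off, SchemaUpdate only prints the ALTER statements, so you can review them before touching the database (this reuses the same _configuration object as the SchemaExport line above):

// Print the generated migration DDL without executing it.
new SchemaUpdate(_configuration).Execute(true, false);

// Or apply the changes directly to the database:
new SchemaUpdate(_configuration).Execute(false, true);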
While SchemaUpdate very much answers my needs, it still has several problems. For example, it refuses to put a new constraint on an existing database column even when it wouldn't conflict with the existing data.
I'm going to try to extend SchemaUpdate a little bit or, failing that, switch to one of those hand-driven migration tools (for example, the Rails one).