I'm looking into using Microsoft's Entity Framework in an upcoming project which is a point release of an existing product. Our current product supports two DBMS (Oracle and SQL Server), the schema of each is maintained in separate .sql script files.
Entity Framework (4.1) looks appealing because it allows various scenarios to be implemented automatically via code generation, reflection, etc. However, as far as I can tell, some of these benefits appear to be mutually exclusive with others.
For example, to support multiple DBMSes, I am inferring that I would need to use a model-first or code-first design, in which case EF would generate the schema for each according to the model (I have seen few, if any, posts or documentation on this, so I may be wrong). This means that our existing schema would need to be either abandoned (model-first) or mapped (code-first). Additionally, updating the schema would require manual scripts, as EF does not appear to support schema upgrades (without wiping out data).
Are model-first and code-first the only viable means of supporting multiple DBMSes in EF? I realize that technically it would be impossible to guarantee that two arbitrary schemas are the same, so I am thinking this is true.
Are there any potential pitfalls of code-first and mapping to multiple DBMS systems? For example, Oracle does not have auto-increment columns; you have to use sequences. How is this mapped in the DbContext? Do I need to create separate maps for each DBMS?
Does EF support any mechanism to upgrade an existing DBMS schema to one that is representative of the EF model (schema recreation =/= upgrade), or am I limited to doing this manually?
I did come up with one possible way to use database first and support multiple DBMSes, however it is a maintenance nightmare. The idea was to add another layer of abstraction to the two generated data models and create converter classes for each of the EF generated models. This seems like the best way of doing it so that each DBMS could potentially have its own model, yet my code would handle the mapping. But in doing this, what am I really gaining from EF? Maybe query generation, but is that worth it?
Actually, both model-first and database-first have the same constraints. Both approaches use an EDMX file, which contains an SSDL part (a description of the store = the database layer) tied directly to a single database provider, so if you want to support two different database providers you must have two different SSDL parts and keep them in sync. You can use a single CSDL (a description of the conceptual layer = your model classes) and either one or two MSLs (a description of the mapping between SSDL and CSDL; a single file is possible only if tables and columns have exactly the same names in both SSDLs). As far as I know, an EDMX file can contain only a single SSDL, CSDL and MSL part, so I expect that the designer has no support for this scenario and you will have to modify the second SSDL manually or use two EDMXs = model each change twice.
The code-first approach can make this much simpler, but the question is how good the Oracle provider is at code-first and database generation. The provider is responsible for correctly implementing needed features like sequences in the case of auto-increment columns.
EF itself currently has no support for upgrading an existing database. When using EDMX, the database generation process is controlled either by a T4 template or a workflow, so it can be customized, and there is already a separate feature called the Entity Designer Database Generation Power Pack which allows incremental building of the database with the model-first approach. The problem is that this feature uses the VS Database tools, and I think those tools work only with SQL Server. I have never liked these automated tools, so I still think database upgrades should be controlled manually with the help of some tool that produces a diff script between the current and the last deployed database versions. You should only need a diff script when deploying a new version to a production environment; in testing and development environments you can always recreate the whole database.
There should be no abstraction needed when working with two EDMX models. The models must produce the same conceptual layer. In that case you need only a single set of POCO classes which are mapped by conventions (same class name as the entity, same properties with the same types and accessibility), so they will work with both models.
Edit:
Based on Tridus's answer, I'm just adding that you can create the databases first and use the fluent API from EF 4.1 to map them. Your databases must have exactly the same schema (table names, column names, etc.) and they can't use any provider-specific features (I hope sequences will not be a problem, because they are just the way Oracle handles auto-increment columns).
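For illustration, a minimal sketch of what that fluent mapping might look like in EF 4.1. The Customer class and the CUSTOMERS table/column names are invented; the real mapping would have to match whatever common schema both databases share:

    using System.Data.Entity;

    // Hypothetical POCO mapped onto an existing table shared by both databases.
    public class Customer
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public class MyContext : DbContext
    {
        public DbSet<Customer> Customers { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            // Explicit table and column names so EF does not invent its own.
            modelBuilder.Entity<Customer>().ToTable("CUSTOMERS");
            modelBuilder.Entity<Customer>().HasKey(c => c.Id);
            modelBuilder.Entity<Customer>().Property(c => c.Id).HasColumnName("CUSTOMER_ID");
            modelBuilder.Entity<Customer>().Property(c => c.Name).HasColumnName("NAME").HasMaxLength(100);
        }
    }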
This is actually fairly doable with a database-first design, but there are some caveats you won't be able to get around easily due to how the databases handle things differently.
Sequences are one (in that they're just ignored by EF entirely). You can fake that in Oracle by putting a trigger on the table that populates the column on insert, but I also found that if you have to update the model later, EF "forgets" that the column is an identity column and will try to stick a 0 in it again. I also found it unreliable in Oracle to try to get the new ID back when using a trigger. We just wound up selecting from the sequence and setting the ID on the object before doing the insert, because that's how you usually do it in Oracle anyway. You could also use a stored procedure that handles it.
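To make that concrete, here is a rough sketch of the "select from the sequence first" approach. The ORDERS_SEQ sequence and Order entity are made-up names, and this assumes an EF 4.1 DbContext (with an EDMX ObjectContext you would use ExecuteStoreQuery instead):

    using System.Data.Entity;
    using System.Linq;

    public static class OracleIdHelper
    {
        public static int GetNextOrderId(DbContext ctx)
        {
            // Oracle NUMBER comes back as decimal, so materialize and convert.
            return (int)ctx.Database
                .SqlQuery<decimal>("SELECT ORDERS_SEQ.NEXTVAL FROM DUAL")
                .Single();
        }
    }

    // Usage: assign the ID before adding the entity, then save as usual.
    // var order = new Order { Id = OracleIdHelper.GetNextOrderId(ctx), ... };
    // ctx.Orders.Add(order);
    // ctx.SaveChanges();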
Numbers aren't handled the same way. SQL Server uses number types that map to Int32, Int64, etc. Oracle's number format is totally different, and a full-range Int32 in SQL Server is a Number(10,0) in Oracle... which ends up as an Int64 in EF because it's bigger than an Int32. I also found that Oracle's EF provider likes to use Decimal a lot even when it doesn't have to, but that's probably just a beta issue.
Stored Procedures in Oracle require some values to be put in app.config/web.config in order to work in EF. I'm not sure if that's going to just be clutter in SQL Server or if it'll cause problems.
Finally, EF Code First is pretty immature and according to the docs doesn't support changing the database structure in this version. I'm not sure if Oracle's provider supports it either (it might, haven't tried it).
Most of this is stuff you can get around, but you're going to need to do some work to hide the differences from the rest of your code and it'll probably take a wrapper layer to do it.
Edit - regarding your #4: EF 4.1 can generate partial POCO classes. Instead of writing a wrapper around each of the generated models to hide any differences, you can create another partial class code file that won't be regenerated when you update the model, and then add properties/methods that hide the differences. Your app code would just have to know to use those instead, and they'd handle the issue (like the number issue I mentioned: you could completely hide it with another property that does the necessary casting for Oracle).
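For example, a hypothetical partial class next to a generated Order entity could hide the Oracle number-mapping issue like this (Order and Quantity are invented names):

    // The generated half of Order exposes Quantity as decimal (thanks to Oracle's
    // NUMBER mapping); this hand-written half gives the rest of the app an int view.
    public partial class Order
    {
        public int QuantityAsInt
        {
            get { return (int)Quantity; }
            set { Quantity = value; }
        }
    }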
Related
I need to be able to save all the data that gets updated like so.
User inserts a car model (Make, Type, Year), then comes back and updates the Year. I need to be able to save both versions so they have a history of all the work that they did. What is the best way to do that?
There are a number of ways to do this. One way is to write some SQL triggers and do it entirely in the database; have a look at trigger-based audit tables for some clues.
Another way is to do the auditing within the Entity Framework code. There is a nuget package called AuditDbContext with the source on Codeplex.
You need to decide if you want to do the auditing in EF or in SQL. Obviously if you need to audit everything and you might sometimes access the database from different applications which don't use the same EF datalayer (e.g. different technologies, etc), then SQL triggers might well be the way to go.
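If you go the EF route, a minimal sketch of the idea (this is not the AuditDbContext package itself; the AuditLog entity, its columns and the ShopContext are invented) is to override SaveChanges and record the changed properties:

    using System;
    using System.Data;
    using System.Data.Entity;
    using System.Linq;

    public class AuditLog
    {
        public int Id { get; set; }
        public string Entity { get; set; }
        public string Property { get; set; }
        public string OldValue { get; set; }
        public string NewValue { get; set; }
        public DateTime ChangedAt { get; set; }
    }

    public class ShopContext : DbContext
    {
        public DbSet<AuditLog> AuditLogs { get; set; }
        // ... your normal DbSets (e.g. Cars) go here ...

        public override int SaveChanges()
        {
            // Snapshot the modified entries before adding audit rows.
            foreach (var entry in ChangeTracker.Entries()
                                               .Where(e => e.State == EntityState.Modified)
                                               .ToList())
            {
                foreach (var prop in entry.OriginalValues.PropertyNames)
                {
                    var oldValue = entry.OriginalValues[prop];
                    var newValue = entry.CurrentValues[prop];
                    if (!Equals(oldValue, newValue))
                    {
                        AuditLogs.Add(new AuditLog
                        {
                            Entity = entry.Entity.GetType().Name,
                            Property = prop,
                            OldValue = oldValue == null ? null : oldValue.ToString(),
                            NewValue = newValue == null ? null : newValue.ToString(),
                            ChangedAt = DateTime.UtcNow
                        });
                    }
                }
            }
            return base.SaveChanges();
        }
    }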
Maybe (if you are facing the "history" issue more often) the CQRS pattern is of interest to you; a good primer: Microsoft on CQRS. There is a framework built on .NET for this pattern (I have not tried it yet): NCQRS.
If you really just want the requirement in your question fulfilled now and you are using SQL Server 2008 or later, then Change Tracking may be another option. I would prefer that to triggers (but in the end all such behind-the-scenes logging solutions introduce additional risk).
Is using strongly typed datasets good?
Currently I am working on a project developed using VB.Net in Visual Studio 2010.
Previously they were using SQL queries directly with SqlCommand from System.Data.SqlClient, but then I shifted everything to strongly typed DataSets and started using TableAdapters everywhere.
Now I just want to ask: is this a good approach for a project?
Or should I shift back to the old way, using just SqlCommands?
Or is there a better way to structure SQL data access, given that this is an ERP and most of the code is data access?
We use strongly typed datasets all the time now.
After shifting to this approach it felt really wrong to have SQL queries in code instead of having them handled by the table adapter. But there is a bit of overhead with datasets, so I guess both ways are good for different solutions.
It's really nice to have IntelliSense on all field names, and if you change a table adapter so it returns something different, you get design-time errors everywhere you need to change the code to reflect the change, instead of finding out at runtime when the customer is running the program.
There are so many win-win things with strongly typed datasets that I'll never go back.
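For anyone unfamiliar with the workflow being described, a rough sketch (NorthwindDataSet and CustomersTableAdapter stand in for whatever the designer generates in your project; shown in C# for brevity even though the project above is VB.NET):

    // Fill a typed table, edit typed rows with IntelliSense, push changes back.
    var customers = new NorthwindDataSet.CustomersDataTable();
    using (var adapter = new NorthwindDataSetTableAdapters.CustomersTableAdapter())
    {
        adapter.Fill(customers);                 // no SQL text in application code
        foreach (var row in customers)
        {
            Console.WriteLine(row.CompanyName);  // typed field access
        }
        customers[0].ContactName = "New contact";
        adapter.Update(customers);               // writes the change back
    }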
Table adapters make a lot of mess with bigger databases, and updating the table structure also causes confusion.
I would recommend using auto code generators for the CRUD operations.
To me your old pattern looks better than switching altogether to table adapters and strongly typed datasets.
If you ever want to move your data across the wire to other platforms (silverlight, web services, wcf services, etc), then using any kind of dataset will box you into a corner.
The way that we have resolved this is to have classes whose list of properties match the database exactly. To move the data in and out of the database, we use reflection to either match stored procedure parameters or generate dynamic SQL statements, depending on the circumstance and platform.
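A rough sketch of the dynamic-SQL variant of that idea, under the stated assumption that property names match column names exactly (SQL Server flavor shown; this is an illustration, not the actual generated code described above):

    using System;
    using System.Data.SqlClient;
    using System.Linq;
    using System.Reflection;

    public static class SqlBuilder
    {
        // Builds "INSERT INTO <table> (...) VALUES (@...)" from the object's properties.
        public static SqlCommand BuildInsert(object entity, string table, SqlConnection conn)
        {
            var props = entity.GetType()
                              .GetProperties(BindingFlags.Public | BindingFlags.Instance);

            var columns = string.Join(", ", props.Select(p => p.Name));
            var values  = string.Join(", ", props.Select(p => "@" + p.Name));

            var cmd = new SqlCommand(
                string.Format("INSERT INTO {0} ({1}) VALUES ({2})", table, columns, values),
                conn);

            foreach (var p in props)
                cmd.Parameters.AddWithValue("@" + p.Name, p.GetValue(entity, null) ?? DBNull.Value);

            return cmd;
        }
    }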
When a database table is changed, the developer making the change is also responsible for updating the class structure and vice-versa.
In order to reduce the amount of hand-coding required, we use the code generation capabilities of CodeSmith to generate classes from the database and create the basic implementations of our standard add/update stored procedures that require field enumeration.
As an added benefit, this approach removes the tight link between the database and business object structure. We are able to use our same data access code and business object classes against SQL Server, Oracle, Sqlite, and SqlServerCE databases. This code is used to create applications in Windows, PocketPC, Web, iPad, and Android apps; all of the mobile apps use local databases specific to the platform, but using the common data access code.
It is a bit more work to setup initially, but it will pay significant dividends in the long run.
Is it any better? I've heard of the Code First extension, but is it ready for prime time? Please share your experience with development, any performance overheads, etc.
I think this is a timely question, as I was wondering the exact same thing. I am trying to create a serious e-commerce model and I am trying to keep my POCOs free of persistence concerns as well as trying to stay true to Domain Driven Design. So far, I am very wary, and I am on the fence about whether I should jump ship to NHibernate. The only thing keeping me from doing so is that I assume that Microsoft will improve (and quickly).
Some of the biggest problems so far:
Inability to finely control object materialization. EF calls the zero-arg constructor on your POCO, and this is a behavior you cannot change.
No enum support. The community has been screaming -- screaming! -- for this, and it hasn't happened. The workarounds are terrible, and pollute your domain model.
Weird mapping bugs when trying to control column names and relationships in the database. The main ones I can think of are with compound keys and many-to-many relationships. These can be worked around (a sketch of the workarounds follows this list), and I assume they will be fixed by release time, but they are frustrating nonetheless.
Bad SQL. I also do DBA work, and the SQL that EF generates (with or without Code-First) is atrocious.
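Here is the kind of fluent-API workaround referred to above for compound keys and many-to-many mappings (EF 4.1; the OrderLine/Product/Category entities and table names are invented):

    using System.Collections.Generic;
    using System.Data.Entity;

    public class OrderLine { public int OrderId { get; set; } public int LineNumber { get; set; } }
    public class Category  { public int Id { get; set; } public ICollection<Product> Products { get; set; } }
    public class Product   { public int Id { get; set; } public ICollection<Category> Categories { get; set; } }

    public class StoreContext : DbContext
    {
        public DbSet<OrderLine> OrderLines { get; set; }
        public DbSet<Product> Products { get; set; }

        protected override void OnModelCreating(DbModelBuilder modelBuilder)
        {
            // Compound key
            modelBuilder.Entity<OrderLine>()
                .HasKey(l => new { l.OrderId, l.LineNumber });

            // Many-to-many with an explicit join table and key column names
            modelBuilder.Entity<Product>()
                .HasMany(p => p.Categories)
                .WithMany(c => c.Products)
                .Map(m => m.ToTable("ProductCategory")
                           .MapLeftKey("ProductId")
                           .MapRightKey("CategoryId"));
        }
    }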
And this is just the tip of the iceberg: I am only starting to learn EF4 and I'm running into awful roadblocks. As I think of more reasons, I'll add them here. I'm still struggling through it.
(I wonder whether the community will give it another vote of "no confidence.")
More:
To add to the "Weird mapping bugs" problem: You cannot control the name of a column if it participates in a self-referencing relationship (for example, if you have a hierarchy). I assume this will be fixed in the final release.
Lack of batching, resulting in multiple round trips to the database. For example, how do you delete a bunch of items from a collection? Load all the entities into memory and delete them one at a time (sketched after this list). A smaller gripe is the number of DB hits when inserting into tables that participate in an inheritance relationship.
No intelligent way to deal with model changes. EF Code-First loves to completely drop your entire database if it needs to change the schema.
Few extensibility points. You can literally count on one hand the number of events that EF4 allows you to subscribe to (and Code-First doesn't provide much more).
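The deletion pattern being complained about looks roughly like this (StoreContext, OrderLines and the orderId variable are placeholders):

    using (var ctx = new StoreContext())
    {
        // One query to load the entities...
        var stale = ctx.OrderLines.Where(l => l.OrderId == orderId).ToList();

        // ...then one DELETE statement per entity on SaveChanges.
        foreach (var line in stale)
            ctx.OrderLines.Remove(line);

        ctx.SaveChanges();
    }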
As for me, I prefer EF, but with some enhancements. Basically, EF offers you the following advantages:
Visual Model Editor
Database/Model Update wizard (instead of manual XML changes, which would be terrible for me)
Also, I'm using third-party commercial tools based on EF and L2S (LinqConnect) that provide the following features:
Geography support
Optimized SQL generation
A product fully integrated with Visual Studio
Smart database update wizard (synchronization mode)
I'm learning databases, using SqlCe, and need business object to database mapping.
Currently I'm trying to decide whether to use Linq to Sql or Entity Framework. (I understand L2S a bit, but haven't familiarized myself with EF yet.)
The program will only be developed and used by myself, so I have good control of the priorities:
I don't need to consider a potential change of database type or data storage type, as I'm quite certain SQLce will stay sufficient.
I DO expect continued development and changes to the data schema while the program is in active use; business object properties (and hence database columns) will change, and possibly the overall table schema. So old data must be migrated to the new schema.
I also want to keep a decent degree of layer separation DAL/BLL, although this may not be necessary, it is good for me to learn these principles.
My question is: with these priorities, would I have any benefit from choosing either Linq2Sql or Entity Framework? (And please explain why.)
Btw, the project involves very simple table scheme and relations with only ~4 tables total.
Thanks!
You can use LINQ to SQL for this; actually, LINQ to SQL is a subset of the ADO.NET Entity Framework.
Given your needs, it's better to use LINQ to SQL, because your database is not complicated and just has a few tables. LINQ to SQL is easier to use compared to the ADO.NET Entity Framework.
Keep in mind that Linq2Sql only works with MS SQL Server out of the box, not with SqlCe.
As it seems, there are some tricks to get it to work, but I never tried it myself...no idea if it works as well as with the "real" SQL Server.
So I guess Entity Framework would be the safer choice.
Do you use SchemaExport and SchemaUpdate in real applications? Do you initially create the model and then generate the schema? Does it work? Or do you use them only for tests?
Usually, I create the db (using a Visual Studio database project) and then the mappings and persistent classes, or EF entities using the designer. But now I want to try the code-first approach with Fluent NHibernate.
I have researched SchemaExport and SchemaUpdate and found some issues. For example, update doesn't delete db objects, creates not-null columns as nullable if the table already exists, doesn't generate a primary key on many-to-many tables, and so on. It means that I have to recreate the db very often. But what about the data? And how do I deploy changes to the production db, and so on...
I want to know whether you really use code first and SchemaExport (SchemaUpdate) in your applications. Maybe you can give me some advice...
I use SchemaUpdate in production. It is safe precisely because it never does destructive operations like deleting columns. However, it is not a comprehensive solution for updating your database. If you use it you will still have to supplement it with scripts to do things like deleting objects (as you mention), adding indexes, changing column types, adding table data, etc. But SchemaUpdate covers the 90% case for me.
The only downside I've discovered is that over time it seems to occasionally add duplicate foreign-key constraints to my table.
One more thing: you should run SchemaUpdate manually from a build tool, not your app itself. It is not safe to give your application the rights to modify your db schema!
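A minimal sketch of such a tool with Fluent NHibernate (the connection string, the dialect and the CustomerMap mapping class are placeholders for your own):

    using FluentNHibernate.Cfg;
    using FluentNHibernate.Cfg.Db;
    using NHibernate.Tool.hbm2ddl;

    public static class SchemaUpdater
    {
        public static void Main()
        {
            var cfg = Fluently.Configure()
                .Database(MsSqlConfiguration.MsSql2008
                    .ConnectionString("Server=.;Database=MyDb;Trusted_Connection=True;"))
                .Mappings(m => m.FluentMappings.AddFromAssemblyOf<CustomerMap>())
                .BuildConfiguration();

            // First argument: write the generated DDL to stdout so it can be reviewed;
            // second argument: actually apply the (non-destructive) changes.
            new SchemaUpdate(cfg).Execute(true, true);
        }
    }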
I use SchemaUpdate/SchemaExport for rapid evolution of my model, but they are not a replacement for a database migration tool. As you mention, data cannot be migrated in a sensible manner in many cases. The tool does not have enough context. (e.g. How can you automatically migrate a FullName column to FirstName/LastName?) I answered a similar question here where I discuss db migration tools in the context of NHibernate.
NHibernate, ORM : how is refactoring handled? existing data?
Yes, you can use these in real applications; I do.
Of course, almost all the work happens in that first go. My practice has been to create a separate project that references the mappings in my main project assembly and handles database creation and the initial data import, if any.
Once the project is in production, I usually unload that project from the solution, but keep it around for reference or if I ever need to switch from create scripts to update scripts.
As for the way NHibernate creates the database, you have to do a little more specification in your Fluent mappings than you otherwise might. I like to specify null/not null, foreign key constraint names, etc. to have maximum control over the way the database gets created.
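For example, a Fluent NHibernate mapping that is explicit about nullability, column names and foreign-key constraint names might look like this (the Order/Customer classes and the names used are illustrative):

    using FluentNHibernate.Mapping;

    public class Customer { public virtual int Id { get; set; } }

    public class Order
    {
        public virtual int Id { get; set; }
        public virtual string Number { get; set; }
        public virtual string Comment { get; set; }
        public virtual Customer Customer { get; set; }
    }

    public class OrderMap : ClassMap<Order>
    {
        public OrderMap()
        {
            Table("Orders");
            Id(x => x.Id).GeneratedBy.Identity();
            Map(x => x.Number).Not.Nullable().Length(20);
            Map(x => x.Comment).Nullable();
            References(x => x.Customer)
                .Column("CustomerId")
                .Not.Nullable()
                .ForeignKey("FK_Orders_Customers");
        }
    }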
I don't think you'd ever want to use automapping in this scenario.
Just as with any generated code, whether it's POCO generation from a tool or database generation as in your question, it will probably get you 80% of the way there. From there it would be wise to tweak the other 20% to add your indexes and any other performance adjustments to get it just right.