How do I wrap an EF 4.1 DbContext in a repository?

All,
I have a requirement to hide my EF implementation behind a Repository. My simple question: is there a way to execute a 'find' across both a DbSet AND the DbSet.Local without having to deal with them both?
For example - I have standard repository implementation with Add/Update/Remove/FindById. I break the generic pattern by adding a FindByName method (for demo purposes only :). This gives me the following code:
Client App:
ProductCategoryRepository categoryRepository = new ProductCategoryRepository();
categoryRepository.Add(new ProductCategory { Name = "N" });
var category1 = categoryRepository.FindByName("N");
Implementation
public ProductCategory FindByName(string s)
{
// Assume name is unique for demo
return _legoContext.Categories.Where(c => c.Name == s).SingleOrDefault();
}
In this example, category1 is null.
However, if I implement the FindByName method as:
public ProductCategory FindByName(string s)
{
var t = _legoContext.Categories.Local.Where(c => c.Name == s).SingleOrDefault();
if (t == null)
{
t = _legoContext.Categories.Where(c => c.Name == s).SingleOrDefault();
}
return t;
}
In this case, I get what I expect when querying against both a new entry and one that is only in the database. But this presents a few issues that I am confused over:
1) I would assume (as a user of the repository) that cat2 below is not found. But it is found, and the great part is that cat2.Name is "Goober".
ProductCategoryRepository categoryRepository = new ProductCategoryRepository();
var cat = categoryRepository.FindByName("Technic");
cat.Name = "Goober";
var cat2 = categoryRepository.FindByName("Technic");
2) I would like to return a generic IQueryable from my repository.
It just seems like a lot of work to wrap the calls to the DbSet in a repository. Typically, this means that I've screwed something up. I'd appreciate any insight.

With older versions of EF, complicated situations could arise quite fast due to the required references. In this version I would recommend not exposing IQueryable but ICollections or ILists. This will contain EF in your repository and create a good separation.
Edit: furthermore, by sending back ICollection, IEnumerable or IList you are restraining and controlling the queries being sent to the database. This will also allow you to fine-tune and maintain the system with greater ease. By exposing IQueryable, you are exposing yourself to side effects which occur when people add more to the query (.Take(), .Where(), .SelectMany()): EF will see these additions and will generate SQL to reflect these uncontrolled queries. Not confining the queries can result in queries getting executed from the UI, and leads to more complicated tests and maintenance issues in the long run.
Since the point of the repository pattern is to be able to swap implementations out at will, the details of DbSets should be completely hidden.
I think that you're on a good path. The only thing I would probably ask myself is:
Is the context long-lived? If not, then do not worry about querying Local. An object that has been inserted or deleted should only be accessible once it has been committed.
If this is a long-lived context and you need access to deleted and inserted objects, then querying Local is a good idea, but as you've pointed out, you may run into difficulties at some point.
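As a rough illustration of the "return lists, not IQueryable" advice, here is a minimal sketch. The interface name, the FindByNamePrefix method and the constructor taking an IQueryable are assumptions made up for this example; in an EF-backed version the source would presumably be _legoContext.Categories.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ProductCategory
{
    public int Id { get; set; }
    public string Name { get; set; }
}

// Hypothetical contract: callers receive materialized results, never IQueryable.
public interface IProductCategoryRepository
{
    ProductCategory FindByName(string name);
    IList<ProductCategory> FindByNamePrefix(string prefix);
}

// Stand-in for the EF-backed repository. The query source is injected so the
// sketch runs without EF; with EF it would be the context's DbSet.
public class ProductCategoryRepository : IProductCategoryRepository
{
    private readonly IQueryable<ProductCategory> _source;

    public ProductCategoryRepository(IQueryable<ProductCategory> source)
    {
        _source = source;
    }

    public ProductCategory FindByName(string name)
    {
        // Assumes the name is unique, as in the question.
        return _source.SingleOrDefault(c => c.Name == name);
    }

    public IList<ProductCategory> FindByNamePrefix(string prefix)
    {
        // ToList() executes the query inside the repository, so callers
        // cannot append operators that would change the generated SQL.
        return _source.Where(c => c.Name.StartsWith(prefix))
                      .OrderBy(c => c.Name)
                      .ToList();
    }
}
```

Anything a caller does with the returned list (further filtering, paging) then happens in memory, which is the trade-off this approach accepts in exchange for keeping the queries in one place.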

Related

"Convert" Entity Framework program to raw SQL

I used Entity Framework to create a prototype for a project, and now that it's working I want to make the program ready for production.
I face many challenges with EF, the biggest one being concurrency management (it's financial software).
Given that there seems to be no way to handle pessimistic concurrency with EF, I have to switch to stored procs in SQL.
To be honest, I'm a bit afraid of the workload that may represent.
I would like to know if anybody has been in the same situation before, and what the best strategy is to convert .NET code using EF to raw SQL.
Edit:
I'm investigating CLR but it's not clear if pessimistic concurrency can be managed with it. Is it a more interesting option than T-SQL in this case? It would allow me to reuse part of my C# code and its structure of functions calling other functions, if I understand correctly.
I was there and the good news is you don't have to give up Entity Framework if you don't want to. The bad news is you have to update the database yourself. Which isn't as hard as it seems. I'm currently using EF 5 but plan to go to EF 6. I don't see why this still wouldn't work for EF 6.
First thing is in the constructor of the DbContext cast it to IObjectContextAdapter and get access to the ObjectContext. I make a property for this
public virtual ObjectContext ObjContext
{
get
{
return ((IObjectContextAdapter)this).ObjectContext;
}
}
Once you have that, subscribe to the SavingChanges event. This isn't our exact code; some things are copied out of other methods and redone. It just gives you an idea of what you need to do.
ObjContext.SavingChanges += SaveData;
private void SaveData(object sender, EventArgs e)
{
var context = sender as ObjectContext;
if (context != null)
{
context.DetectChanges();
var tsql = new StringBuilder();
var dbParams = new List<KeyValuePair<string, object>>();
var deletedEntites = context.ObjectStateManager.GetObjectStateEntries(EntityState.Deleted);
foreach (var delete in deletedEntites)
{
// Set state to unchanged - so entity framework will ignore
delete.ChangeState(EntityState.Unchanged);
// Method to generate tsql for deleting entities
DeleteData(delete, tsql, dbParams);
}
var addedEntites = context.ObjectStateManager.GetObjectStateEntries(EntityState.Added);
foreach (var add in addedEntites)
{
// Set state to unchanged - so entity framework will ignore
add.ChangeState(EntityState.Unchanged);
// Method to generate tsql for added entities
AddData(add, tsql, dbParams);
}
var editedEntites = context.ObjectStateManager.GetObjectStateEntries(EntityState.Modified);
foreach (var edit in editedEntites)
{
// Method to generate tsql for updating entities
UpdateEditData(edit, tsql, dbParams);
// Set state to unchanged - so entity framework will ignore
edit.ChangeState(EntityState.Unchanged);
}
if (tsql.Length > 0)
{
var dbcommand = Database.Connection.CreateCommand();
dbcommand.CommandText = tsql.ToString();
foreach (var dbParameter in dbParams)
{
var dbparam = dbcommand.CreateParameter();
dbparam.ParameterName = dbParameter.Key;
dbparam.Value = dbParameter.Value;
dbcommand.Parameters.Add(dbparam);
}
var results = dbcommand.ExecuteNonQuery();
}
}
}
We set a modified entity to unchanged only after generating its update because, while it is still marked as modified, you can do
var changedProperties = edit.GetModifiedProperties();
to get a list of all the changed properties. Since all the entities are now marked as unchanged, EF will not send any updates to SQL.
You will also need to mess with the metadata to go from entity to table and from property to field. This isn't that hard to do, but messing with the metadata does take some time to learn. It's something I still struggle with sometimes. I refactored all that out into an IMetaDataHelper interface where I pass in the entity type and property name to get the table and field back, along with caching the result so I don't have to query metadata all the time.
At the end, tsql is a batch that has all the T-SQL the way we want it, with the locking hints and the transaction level. We also change how numeric fields are updated: instead of setting nfield = 10, the T-SQL says nfield = nfield + 2 if the user increased the value by 2, which also avoids the concurrency issue.
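The relative update could be generated along these lines. This is only a sketch: AppendNumericUpdate and its parameter list are hypothetical names invented here (they are not from our code base), the table and column names are assumed to come from the metadata lookup rather than from user input, and locking hints are left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RelativeUpdateSketch
{
    // Hypothetical helper: appends "SET column = column + @delta" instead of
    // "SET column = @newValue", so two concurrent edits both get applied.
    public static void AppendNumericUpdate(
        StringBuilder tsql,
        List<KeyValuePair<string, object>> dbParams,
        string table, string column, string keyColumn,
        object keyValue, decimal originalValue, decimal currentValue)
    {
        // Parameter names are derived from the current count so they stay
        // unique within the batch.
        string deltaParam = "@p" + dbParams.Count;
        dbParams.Add(new KeyValuePair<string, object>(deltaParam, currentValue - originalValue));

        string keyParam = "@p" + dbParams.Count;
        dbParams.Add(new KeyValuePair<string, object>(keyParam, keyValue));

        tsql.AppendFormat("UPDATE {0} SET {1} = {1} + {2} WHERE {3} = {4};",
            table, column, deltaParam, keyColumn, keyParam);
        tsql.AppendLine();
    }
}
```

The original and current values would come from the state entry for the modified entity; only the difference is sent to the database.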
What you won't get is a lock in SQL from the moment someone starts to edit your entity, but I don't see how you would get that with stored procedures either.
All in all it took me about 2 solid days to get this all up and running for us.

Managing relationships in Laravel, adhering to the repository pattern

While creating an app in Laravel 4 after reading T. Otwell's book on good design patterns in Laravel I found myself creating repositories for every table on the application.
I ended up with the following table structure:
Students: id, name
Courses: id, name, teacher_id
Teachers: id, name
Assignments: id, name, course_id
Scores (acts as a pivot between students and assignments): student_id, assignment_id, scores
I have repository classes with find, create, update and delete methods for all of these tables. Each repository has an Eloquent model which interacts with the database. Relationships are defined in the model per Laravel's documentation: http://laravel.com/docs/eloquent#relationships.
When creating a new course, all I do is call the create method on the Course Repository. That course has assignments, so when creating one, I also want to create an entry in the scores table for each student in the course. I do this through the Assignment Repository. This implies the assignment repository communicates with two Eloquent models: the Assignment and Student models.
My question is: as this app will probably grow in size and more relationships will be introduced, is it good practice to communicate with different Eloquent models in repositories, or should this be done using other repositories instead (I mean calling other repositories from the Assignment repository), or should it be done in the Eloquent models altogether?
Also, is it good practice to use the scores table as a pivot between assignments and students or should it be done somewhere else?
I am finishing up a large project using Laravel 4 and had to answer all of the questions you are asking right now. After reading all of the available Laravel books over at Leanpub, and tons of Googling, I came up with the following structure.
One Eloquent Model class per database table
One Repository class per Eloquent Model
A Service class that may communicate between multiple Repository classes.
So let's say I'm building a movie database. I would have at least the following Eloquent Model classes:
Movie
Studio
Director
Actor
Review
A repository class would encapsulate each Eloquent Model class and be responsible for CRUD operations on the database. The repository classes might look like this:
MovieRepository
StudioRepository
DirectorRepository
ActorRepository
ReviewRepository
Each repository class would extend a BaseRepository class which implements the following interface:
interface BaseRepositoryInterface
{
public function errors();
public function all(array $related = null);
public function get($id, array $related = null);
public function getWhere($column, $value, array $related = null);
public function getRecent($limit, array $related = null);
public function create(array $data);
public function update(array $data);
public function delete($id);
public function deleteWhere($column, $value);
}
A Service class is used to glue multiple repositories together and contains the real "business logic" of the application. Controllers only communicate with Service classes for Create, Update and Delete actions.
So when I want to create a new Movie record in the database, my MovieController class might have the following methods:
public function __construct(MovieRepositoryInterface $movieRepository, MovieServiceInterface $movieService)
{
$this->movieRepository = $movieRepository;
$this->movieService = $movieService;
}
public function postCreate()
{
if( ! $this->movieService->create(Input::all()))
{
return Redirect::back()->withErrors($this->movieService->errors())->withInput();
}
// New movie was saved successfully. Do whatever you need to do here.
}
It's up to you to determine how you POST data to your controllers, but let's say the data returned by Input::all() in the postCreate() method looks something like this:
$data = array(
'movie' => array(
'title' => 'Iron Eagle',
'year' => '1986',
'synopsis' => 'When Doug\'s father, an Air Force Pilot, is shot down by MiGs belonging to a radical Middle Eastern state, no one seems able to get him out. Doug finds Chappy, an Air Force Colonel who is intrigued by the idea of sending in two fighters piloted by himself and Doug to rescue Doug\'s father after bombing the MiG base.'
),
'actors' => array(
0 => 'Louis Gossett Jr.',
1 => 'Jason Gedrick',
2 => 'Larry B. Scott'
),
'director' => 'Sidney J. Furie',
'studio' => 'TriStar Pictures'
)
Since the MovieRepository shouldn't know how to create Actor, Director or Studio records in the database, we'll use our MovieService class, which might look something like this:
public function __construct(MovieRepositoryInterface $movieRepository, ActorRepositoryInterface $actorRepository, DirectorRepositoryInterface $directorRepository, StudioRepositoryInterface $studioRepository)
{
$this->movieRepository = $movieRepository;
$this->actorRepository = $actorRepository;
$this->directorRepository = $directorRepository;
$this->studioRepository = $studioRepository;
}
public function create(array $input)
{
$movieData = $input['movie'];
$actorsData = $input['actors'];
$directorData = $input['director'];
$studioData = $input['studio'];
// In a more complete example you would probably want to implement database transactions and perform input validation using the Laravel Validator class here.
// Create the new movie record
$movie = $this->movieRepository->create($movieData);
// Create the new actor records and associate them with the movie record
foreach($actorsData as $actor)
{
$actorModel = $this->actorRepository->create($actor);
$movie->actors()->save($actorModel);
}
// Create the director record and associate it with the movie record
$director = $this->directorRepository->create($directorData);
$director->movies()->associate($movie);
// Create the studio record and associate it with the movie record
$studio = $this->studioRepository->create($studioData);
$studio->movies()->associate($movie);
// Assume everything worked. In the real world you'll need to implement checks.
return true;
}
So what we're left with is a nice, sensible separation of concerns. Repositories are only aware of the Eloquent model they insert and retrieve from the database. Controllers don't care about repositories, they just hand off the data they collect from the user and pass it to the appropriate service. The service doesn't care how the data it receives is saved to the database, it just hands off the relevant data it was given by the controller to the appropriate repositories.
Keep in mind you're asking for opinions :D
Here's mine:
TL;DR: Yes, that's fine.
You're doing fine!
I do exactly what you are doing often and find it works great.
I often, however, organize repositories around business logic instead of having a repo-per-table. This is useful as it's a point of view centered around how your application should solve your "business problem".
A Course is a "entity", with attributes (title, id, etc) and even other entities (Assignments, which have their own attributes and possibly entities).
Your "Course" repository should be able to return a Course and the Courses' attributes/Assignments (including Assignment).
You can accomplish that with Eloquent, luckily.
(I often end up with a repository per table, but some repositories are used much more than others, and so have many more methods. Your "courses" repository may be much more full-featured than your Assignments repository, for instance, if your application centers more around Courses and less about a Courses' collection of Assignments).
The tricky part
I often use repositories inside of my repositories in order to do some database actions.
Any repository which implements Eloquent in order to handle data will likely return Eloquent models. In that light, it's fine if your Course model uses built-in relationships in order to retrieve or save Assignments (or any other use case). Our "implementation" is built around Eloquent.
From a practical point of view, this makes sense. We're unlikely to change data sources to something Eloquent can't handle (to a non-sql data source).
ORMS
The trickiest part of this setup, for me at least, is determining if Eloquent is actually helping or harming us. ORMs are a tricky subject, because while they help us greatly from a practical point of view, they also couple your "business logic entities" code with the code doing the data retrieval.
This sort of muddles up whether your repository's responsibility is actually for handling data or handling the retrieval / update of entities (business domain entities).
Furthermore, they act as the very objects you pass to your views. If you later have to get away from using Eloquent models in a repository, you'll need to make sure the variables passed to your views behave in the same way or have the same methods available. Otherwise changing your data sources will roll into changing your views, and you've (partially) lost the purpose of abstracting your logic out to repositories in the first place: the maintainability of your project goes down.
Anyway, these are somewhat incomplete thoughts. They are, as stated, merely my opinion, which happens to be the result of reading Domain Driven Design and watching videos like "uncle bob's" keynote at Ruby Midwest within the last year.
I like to think of it in terms of what my code is doing and what it is responsible for, rather than "right or wrong". This is how I break apart my responsibilities:
Controllers are the HTTP layer and route requests through to the underlying apis (aka, it controls the flow)
Models represent the database schema, and tell the application what the data looks like, what relationships it may have, as well as any global attributes that may be necessary (such as a name method for returning a concatenated first and last name)
Repositories represent the more complex queries and interactions with the models (I don't do any queries on model methods).
Search engines - classes that help me build complex search queries.
With this in mind, it makes sense every time to use a repository (whether you create interfaces, etc. is a whole other topic). I like this approach, because it means I know exactly where to go when I need to do certain work.
I also tend to build a base repository, usually an abstract class which defines the main defaults - basically CRUD operations, and then each child can just extend and add methods as necessary, or overload the defaults. Injecting your model also helps this pattern to be quite robust.
Think of Repositories as a consistent filing cabinet of your data (not just your ORMs). The idea is that you want to grab data in a consistent simple to use API.
If you find yourself just doing Model::all(), Model::find(), Model::create() you probably won't benefit much from abstracting away a repository. On the other hand, if you want to do a bit more business logic to your queries or actions, you may want to create a repository to make an easier to use API for dealing with data.
I think you were asking if a repository would be the best way to deal with some of the more verbose syntax required to connect related models. Depending on the situation, there are a few things I may do:
Hanging a new child model off of a parent model (one-one or one-many), I would add a method to the child repository something like createWithParent($attributes, $parentModelInstance) and this would just add the $parentModelInstance->id into the parent_id field of the attributes and call create.
Attaching a many-many relationship, I actually create functions on the models so that I can run $instance->attachChild($childInstance). Note that this requires existing elements on both sides.
Creating related models in one run, I create something that I call a Gateway (it may be a bit off from Fowler's definitions). That way I can call $gateway->createParentAndChild($parentAttributes, $childAttributes) instead of a bunch of logic that may change or that would complicate the logic that I have in a controller or command.

NHibernate: Using slightly different hbm mapping files depending on context

One of my applications is a public website, the other is an intranet. The public website runs using a limited security user that must access a certain table through a view, whereas the intranet can access the table itself.
This seems like it would be quite simple to setup using Fluent NHibernate. In my ClassMap I could do a check like this:
public class MyEntityClassMap : ClassMap<MyEntity>
{
public MyEntityClassMap()
{
if (NHibernateConfig.Current.Context == "intranet")
Table("t_MyEntity");
else
Table("v_MyEntity_pub");
... etc
}
}
Is there a simple way of doing this for embedded hbm files? The only method I can think of would be to have two copies of the hbm file, which would be confusing and far from ideal.
Is there perhaps a better way of achieving the same result?
Actually, what you ask is possible. You can access the embedded XML files and alter their content before the SessionFactory is built (on Application Start).
Assuming you choose to reference "t_MyEntity" in your mappings by default, here is how you can dynamically change this reference when you want to use the "v_MyEntity_pub" view instead (the code may not work as it is, but you will get the idea):
NHibernate.Cfg.Configuration cfg = new NHibernate.Cfg.Configuration();
System.Reflection.Assembly assembly = System.Reflection.Assembly.Load(ASSEMBLYNAME);
if (NHibernateConfig.Current.Context == "intranet") //this is how you have stated you distinguish the intranet application from the other one.
{
// The intranet uses the embedded mappings unchanged.
cfg.AddAssembly(assembly);
}
else
{
// The public site swaps the table for the view before each mapping is added.
foreach (string resourceName in assembly.GetManifestResourceNames())
{
// Skip embedded resources that are not mapping files.
if (!resourceName.EndsWith(".hbm.xml")) continue;
using (StreamReader sr = new StreamReader(assembly.GetManifestResourceStream(resourceName)))
{
string resourceContent = sr.ReadToEnd();
resourceContent = resourceContent.Replace("t_MyEntity", "v_MyEntity_pub");
cfg.AddXmlString(resourceContent);
}
}
}
ISessionFactory sessionFactory = cfg.BuildSessionFactory();
The above code should be executed only once for the lifetime of your application; the table-name replacement applies only to the public (non-intranet) application.
Although this is perhaps not the most helpful answer to your problem, I don't believe that this is possible in a mapping file. I also don't think that two hbm files would work for the same name, as it would be unable to distinguish between the two, you would instead have to have two identical objects each with slightly different names and mapping files. Which as you said in your question, would be completely confusing and ideal would simply be a spot on the horizon that you were hoping to, someday, reach.
Why is it that you can't access everything directly through the view? I'm assuming there is no writing involved in this process? Is there any way you can change this method of accessing data while still maintaining your security?

Nhibernate Tag Cloud Query

This has been a 2 week battle for me so far with no luck. :(
Let me first state my objective. To be able to search entities which are tagged "foo" and "bar". Wouldn't think that was too hard right?
I know this can be done easily with HQL but since this is a dynamically built search query that is not an option. First some code:
public class Foo
{
public virtual int Id { get;set; }
public virtual IList<Tag> Tags { get;set; }
}
public class Tag
{
public virtual int Id { get;set; }
public virtual string Text { get;set; }
}
Mapped as a many-to-many because the Tag class is used on many different types. Hence no bidirectional reference.
So I build my detached criteria up using an abstract filter class. Let's assume for simplicity I am just searching for Foos with tags "Apples" (TagId 1) && "Oranges" (TagId 3). This would look something like:
SQL:
SELECT ft.FooId
FROM Foo_Tags ft
WHERE ft.TagId IN (1, 3)
GROUP BY ft.FooId
HAVING COUNT(DISTINCT ft.TagId) = 2; /*Number of items we are looking for*/
Criteria
var idsIn = new List<int>() {1, 3};
var dc = DetachedCriteria.For(typeof(Foo), "f")
.CreateCriteria("Tags", "t")
.Add(Restrictions.InG("t.Id", idsIn))
.SetProjection( Projections.ProjectionList()
.Add(Projections.Property("f.Id"))
.Add(Projections.RowCount(), "RowCount")
.Add(Projections.GroupProperty("f.Id")))
.ProjectionCriteria.Add(Restrictions.Eq("RowCount", idsIn.Count));
var c = Session.CreateCriteria(typeof(Foo)).Add(Subqueries.PropertyIn("Id", dc));
Basically this is creating a DC that projects a list of Foo Ids which have all the tags specified.
This compiled in NH 2.0.1 but didn't work as it complained it couldn't find Property "RowCount" on class Foo.
After reading this post I was hopeful that this might be fixed in 2.1.0 so I upgraded. To my extreme disappointment I discovered that ProjectionCriteria has been removed from DetachedCriteria and I cannot figure out how to make the dynamic query building work without DetachedCriteria.
So I tried to think how to write the same query without needing the infamous Having clause. It can be done with multiple joins on the tag table. Hooray I thought that's pretty simple. So I rewrote it to look like this.
var idsIn = new List<int>() {1, 3};
var dc = DetachedCriteria.For(typeof(Foo), "f")
.CreateCriteria("Tags", "t1").Add(Restrictions.Eq("t1.Id", idsIn[0]))
.CreateCriteria("Tags", "t2").Add(Restrictions.Eq("t2.Id", idsIn[1]));
In a vain attempt to produce the below SQL, which would do the job (I realise it's not quite correct).
SELECT f.Id
FROM Foo f
JOIN Foo_Tags ft1
ON ft1.FooId = f.Id
AND ft1.TagId = 1
JOIN Foo_Tags ft2
ON ft2.FooId = f.Id
AND ft2.TagId = 3
Unfortunately I fell at the first hurdle with this attempt, receiving the exception "Duplicate Association Path". Reading around this seems to be an ancient and still very real bug/limitation.
What am I missing?
I am starting to curse NHibernate's name for making what you would think is so simple and common a query so difficult. Please help, anyone who has done this before. How did you get around NHibernate's limitations?
Forget reputation and a bounty. If someone does me a solid on this I will send you a 6 pack for your trouble.
I managed to get it working like this :
var dc = DetachedCriteria.For<Foo>( "f")
.CreateCriteria("Tags", "t")
.Add(Restrictions.InG("t.Id", idsIn))
.SetProjection(Projections.SqlGroupProjection("{alias}.FooId", "{alias}.FooId having count(distinct t1_.TagId) = " + idsIn.Count,
new[] { "Id" },
new IType[] { NHibernateUtil.Int32 }));
The only problem here is the count(t1_.TagId) - but I think that the alias should be generated the same every time in this DetachedCriteria - so you should be on the safe side hard coding that.
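For reference, the logic that the GROUP BY / HAVING clause implements can be checked in memory with LINQ to Objects. This is only a sketch of the set logic, not NHibernate code: TagSearchSketch and FooIdsWithAllTags are made-up names, and each KeyValuePair stands in for one (FooId, TagId) row of Foo_Tags.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TagSearchSketch
{
    // Key = FooId, Value = TagId. Like the SQL, this assumes wantedTagIds
    // contains no duplicates, since it compares against wantedTagIds.Count.
    public static List<int> FooIdsWithAllTags(
        IEnumerable<KeyValuePair<int, int>> fooTags,
        ICollection<int> wantedTagIds)
    {
        return fooTags
            // WHERE ft.TagId IN (...)
            .Where(ft => wantedTagIds.Contains(ft.Value))
            // GROUP BY ft.FooId
            .GroupBy(ft => ft.Key)
            // HAVING COUNT(DISTINCT ft.TagId) = number of wanted tags
            .Where(g => g.Select(ft => ft.Value).Distinct().Count() == wantedTagIds.Count)
            .Select(g => g.Key)
            .OrderBy(id => id)
            .ToList();
    }
}
```

The DISTINCT matters when the link table can hold the same (FooId, TagId) pair twice; without it, a Foo tagged "Apples" twice would be counted as having two of the wanted tags.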
Ian,
Since I'm not sure what db backend you are using, can you do some sort of a trace against the produced SQL query and take a look at the SQL to figure out what went wrong?
I know I've done this in the past to understand how Linq-2-SQL and Linq-2-Entities have worked, and been able to tweak certain cases to improve the data access, as well as to understand why something wasn't working as initially expected.

NHibernate testing, mocking ISession

I am using NHibernate and Rhinomocks and having trouble testing what I want. I would like to test the following repository method without hitting the database (where _session is injected into the repository as ISession):
public class Repository : IRepository
{
(... code snipped for brevity ...)
public T FindBy<T>(Expression<Func<T, bool>> where)
{
return _session.Linq<T>().Where(where).FirstOrDefault();
}
}
My initial approach is to mock ISession, and return an IQueryable stub (hand coded) when Linq is called. I have an IList of Customer objects I would like to query in memory to test my Linq query code without hitting the db. And I'm not sure what this would look like. Do I write my own implementation of IQueryable? If so, has someone done this for this approach? Or do I need to look at other avenues?
Thanks!
How I've done this test is to not pass the expression to the repository, instead expose IQueryable giving the repository an interface such as:
public interface IRepository<T>
{
IQueryable<T> All();
// whatever else you want
}
Easily implemented like so:
public IQueryable<T> All()
{
return session.Linq<T>();
}
This means that instead of calling your method on the repository like:
var result = repository.FindBy(x => x.Id == 1);
You can do:
var result = repository.All().Where(x => x.Id == 1);
Or the LINQ syntax:
var result = from instance in repository.All()
where instance.Id == 1
select instance;
This then means you can get the same test by mocking the repository out directly which should be easier. You just get the mock to return a list you have created and called AsQueryable() on.
As you have pointed out, the point of this is to let you test the logic of your queries without involving the database which would slow them down dramatically.
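A minimal sketch of that idea, using a hand-written fake rather than a mocking library so it stays self-contained. InMemoryRepository and CustomerLookup are names invented for this example; with Rhino Mocks you would stub All() to return the same AsQueryable() list instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public interface IRepository<T>
{
    IQueryable<T> All();
}

// Hand-written fake: serves a fixed list through LINQ to Objects.
public class InMemoryRepository<T> : IRepository<T>
{
    private readonly List<T> _items;

    public InMemoryRepository(IEnumerable<T> items)
    {
        _items = items.ToList();
    }

    public IQueryable<T> All()
    {
        return _items.AsQueryable();
    }
}

// Hypothetical code under test: it only knows about IRepository<Customer>.
public class CustomerLookup
{
    private readonly IRepository<Customer> _repository;

    public CustomerLookup(IRepository<Customer> repository)
    {
        _repository = repository;
    }

    public Customer FindById(int id)
    {
        return _repository.All().Where(c => c.Id == id).FirstOrDefault();
    }
}
```

One caveat: LINQ to Objects will happily run expressions that the NHibernate LINQ provider may not be able to translate, so a passing in-memory test checks the query logic but not that the real provider accepts it.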
From my point of view, this would be considered integration testing. NHibernate has its own tests that it passes, and it seems to me like you're trying to duplicate some of those tests in your own test suite. I'd either add the NHibernate code and tests to your project and add this there alongside their tests (if they don't already have one very similar), using their methods of testing, or move this to an integration testing scenario and hit the database.
If it's just the fact you don't want to have to set up a database to test against, you're in luck since you're using NHibernate. With some googling you can find quite a few examples of how to use SQLite to "kinda" do integration testing with the database but keep it in memory.