Expected behaviour of a Repository - orm

I'm writing an ORM and am unsure of the expected behaviour of the Repository, or more precisely, of the boundary between the Repository and the Unit Of Work.
From my understanding, a Repository might look like this:
interface IPersonRepository
{
public function find(Criteria criteria);
public function add(Person person);
public function delete(Person person);
}
According to Fowler (PoEAA, page 322):
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. [...] Objects can be added to and removed from the Repository, as they can from a simple collection of objects.
This would imply that the following test should work (assuming that we already have a Person persisted, whose last name is Fowler):
collection = repository.find(lastnameEqualsFowlerCriteria);
person = collection[0];
assertEquals(person.lastname, "Fowler");
person.lastname = "Evans";
newCollection = repository.find(lastnameEqualsFowlerCriteria);
assertFalse(newCollection.contains(person));
That means that when mapping to a database, even if no explicit save() method has been called somewhere, the Person model must have been automatically persisted by the Repository, so that the next query returned the correct collection, not containing the original Person.
But, isn't that the role of the Unit Of Work, to decide which model to persist to the database, and when?
In the above implementation, the Repository has to decide to persist the Person previously retrieved when receiving another find() call, so that the result is consistent with the modification. But if no other find() call were issued, the model would not have been persisted implicitly at all.
In the context of a Unit Of Work, it is not really a problem, because we can start a transaction at the beginning, and rollback any insert to the db anyway if needed.
But when used alone, can't this Repository lead to unexpected, unpredictable behaviour?

A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. [...] Objects can be added to and removed from the Repository, as they can from a simple collection of objects.
This does not mean you do not need a save method. You still need to explicitly commit your changes to storage.
See The Unit Of Work Pattern And Persistence Ignorance
public interface IUnitOfWork {
void MarkDirty(object entity);
void MarkNew(object entity);
void MarkDeleted(object entity);
void Commit();
void Rollback();
}
In a way, you can think of the Unit of Work as a place to dump all transaction-handling code. The responsibilities of the Unit of Work are to:
Manage transactions.
Order the database inserts, deletes, and updates.
Prevent duplicate updates. Inside a single usage of a Unit of Work object, different parts of the code may mark the same Invoice object as changed, but the Unit of Work class will only issue a single UPDATE command to the database.
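The responsibilities listed above can be illustrated with a minimal sketch. It is written in Java rather than C#, and everything in it is an assumption made for illustration: the class name UnitOfWork, the log field that stands in for the SQL a real data mapper would issue, and the commit order (inserts, then updates, then deletes). It is not the API of any real ORM.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a Unit of Work; not taken from any real ORM.
class UnitOfWork {
    // Sets give "mark the same entity twice, write it once" for free.
    // They rely on equals/hashCode; a real implementation would track identity.
    private final Set<Object> newEntities = new LinkedHashSet<>();
    private final Set<Object> dirtyEntities = new LinkedHashSet<>();
    private final Set<Object> deletedEntities = new LinkedHashSet<>();

    // Stands in for the statements a data mapper would send to the database.
    final List<String> log = new ArrayList<>();

    void markNew(Object entity) {
        newEntities.add(entity);
    }

    void markDirty(Object entity) {
        // An entity that is still pending insert needs no separate UPDATE.
        if (!newEntities.contains(entity)) {
            dirtyEntities.add(entity);
        }
    }

    void markDeleted(Object entity) {
        // Deleting something that was never inserted is a no-op.
        if (newEntities.remove(entity)) {
            return;
        }
        dirtyEntities.remove(entity);
        deletedEntities.add(entity);
    }

    void commit() {
        // Assumed order: inserts, then updates, then deletes.
        for (Object e : newEntities) log.add("INSERT " + e);
        for (Object e : dirtyEntities) log.add("UPDATE " + e);
        for (Object e : deletedEntities) log.add("DELETE " + e);
        clear();
    }

    void rollback() {
        // Nothing was written yet, so discarding the pending marks is enough here.
        clear();
    }

    private void clear() {
        newEntities.clear();
        dirtyEntities.clear();
        deletedEntities.clear();
    }
}
```

Nothing reaches the log until commit() is called, which is the point made above: the Repository can look like a collection while the decision about when to write stays with the Unit of Work.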

I think what you're asking about is the following: http://martinfowler.com/eaaCatalog/identityMap.html
The Repository should keep fetched objects in memory, and all subsequent requests for the same entity should be served from that map rather than loaded again from persistent storage, hence your example should work fine.
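A minimal sketch of the Identity Map idea, in Java. The names PersonIdentityMap, loader and loads are assumptions for illustration; the loader function stands in for whatever data mapper actually reads the row.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

class Person {
    final long id;
    String lastname;

    Person(long id, String lastname) {
        this.id = id;
        this.lastname = lastname;
    }
}

// Hypothetical sketch: one in-memory instance per id for the lifetime of the map.
class PersonIdentityMap {
    private final Map<Long, Person> loaded = new HashMap<>();
    private final Function<Long, Person> loader; // stand-in for the data mapper
    int loads = 0; // counts trips to the (simulated) database

    PersonIdentityMap(Function<Long, Person> loader) {
        this.loader = loader;
    }

    Person get(long id) {
        // Load on first request only; afterwards return the same instance.
        return loaded.computeIfAbsent(id, key -> {
            loads++;
            return loader.apply(key);
        });
    }
}
```

Note that this only guarantees one instance per identity. A find() by criteria that is evaluated in the database would still see the old last name unless pending changes are written first, so the map by itself does not make the test in the question pass.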

Related

domain design with nhibernate

In my domain I have something called Project, which basically holds a lot of simple configuration properties that describe what should happen when the project gets executed. When the Project gets executed it produces a huge number of LogEntries. In my application I need to analyse these log entries for a given Project, so I need to be able to load them from the database (Oracle) piece by piece, one time frame at a time. How would you model this relationship as DB tables and as objects?
I could have a Project table and a ProjectLog table with a foreign key to the primary key of Project, and do the "same" thing at the object level: a class Project with a property
IEnumerable<LogEntry> LogEntries { get; }
and have NHibernate do all the mapping. But how would I design my ProjectRepository in this case? I could have a method
void FillLog(Project projectToFill, DateTime start, DateTime end);
How can I tell NHibernate that it should not load the LogEntries until someone calls this method, and how would I make NHibernate load a specific time frame within that method?
I am pretty new to ORM. Maybe that design is not optimal for NHibernate, or in general? Maybe I should design it differently?
Instead of having the Project entity as the only aggregate root, why not turn the reference around: let LogEntry have a Project property and also act as an aggregate root.
public class LogEntry
{
    public virtual Project Project { get; set; }
    // ...other properties
}
public class Project
{
    // remove the LogEntries property from Project
    // public virtual IList<LogEntry> LogEntries { get; set; }
}
Now, since both of those entities are aggregate roots, you would have two different repositories: ProjectRepository and LogEntryRepository. LogEntryRepository could have a method GetByProjectAndTime:
IEnumerable<LogEntry> GetByProjectAndTime(Project project, DateTime start, DateTime end);
The 'correct' way of loading partial / filtered / criteria-based lists under NHibernate is to use queries. There is lazy="extra" but it doesn't do what you want.
As you've already noted, that breaks the DDD model of Root Aggregate -> Children. I struggled with just this problem for an absolute age, because first of all I hated having what amounted to persistence concerns polluting my domain model, and I could never get the API surface to look 'right'. Filter methods on the owning entity class work but are far from pretty.
In the end I settled for extending my entity base class (all my entities inherit from it, which I know is slightly unfashionable these days but it does at least let me do this sort of thing consistently) with a protected method called Query<T>() that takes a LINQ expression defining the relationship and, under the hood in the repository, calls LINQ-to-NH and returns an IQueryable<T> that you can then query into as you require. I can then facade that call beneath a regular property.
The base class does this:
protected virtual IQueryable<TCollection> Query<TCollection>(Expression<Func<TCollection, bool>> selector)
where TCollection : class, IPersistent
{
return Repository.For<TCollection>().Where(selector);
}
(I should note here that my Repository implementation implements IQueryable<T> directly and then delegates the work down to the NH Session.Query<T>())
And the facading works like this:
public virtual IQueryable<Form> Forms
{
get
{
return Query<Form>(x => x.Account == this);
}
}
This defines the list relationship between Account and Form as the inverse of the actual mapped relationship (Form -> Account).
For 'infinite' collections - where there is a potentially unbounded number of objects in the set - this works OK, but it means you can't map the relationship directly in NHibernate and therefore can't use the property directly in NH queries, only indirectly.
What we really need is a replacement for NHibernate's generic bag, list and set implementations that knows how to use the LINQ provider to query into lists directly. One has been proposed as a patch (see https://nhibernate.jira.com/browse/NH-2319). As you can see the patch was not finished or accepted and from what I can see the proposer didn't re-package this as an extension - Diego Mijelshon is a user here on SO so perhaps he'll chime in... I have tested out his proposed code as a POC and it does work as advertised, but obviously it's not tested or guaranteed or necessarily complete, it might have side-effects, and without permission to use or publish it you couldn't use it anyway.
Until and unless the NH team get around to writing / accepting a patch that makes this happen, we'll have to keep resorting to workarounds. NH and DDD just have conflicting views of the world, here.

Business Entity - should lists be exposed only as ReadOnlyCollections?

In trying to centralize how items are added, or removed from my business entity classes, I have moved to the model where all lists are only exposed as ReadOnlyCollections and I provide Add and Remove methods to manipulate the objects in the list.
Here is an example:
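using System.Collections.Generic;
using System.Collections.ObjectModel;

public class Course
{
    public string Name { get; set; }
}

public class Student
{
    // Assumes the intended limit is three courses per student.
    private const int MaxCourses = 3;
    private readonly List<Course> _courses = new List<Course>();

    public string Name { get; set; }

    // Read-only view; callers must go through Add and Remove.
    public ReadOnlyCollection<Course> Courses
    {
        get { return _courses.AsReadOnly(); }
    }

    public void Add(Course course)
    {
        // "< MaxCourses" caps the list at three; "<= 3" would admit a fourth.
        if (course != null && _courses.Count < MaxCourses)
        {
            _courses.Add(course);
        }
    }

    public bool Remove(Course course)
    {
        // Removal needs no upper-bound check.
        return course != null && _courses.Remove(course);
    }
}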
Part of my objective in doing the above is to not end up with an Anemic data-model (an anti-pattern) and also avoid having the logic that adds and removes courses all over the place.
Some background: the application I am working with is an ASP.NET application in which the lists used to be exposed directly, which resulted in Courses being added to the Student in all kinds of ways (in some places a check was made, in others it was not).
But my question is: is the above a good idea?
Yes, this is a good approach. In my opinion you're not doing anything other than decorating your list, and it's better than implementing your own IList (you save many lines of code, even though you lose the more elegant way to iterate through your Course objects).
You may consider accepting a validation strategy object, as in the future you might have a new requirement, for example a new kind of student that can have more than 3 courses.
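A minimal sketch of what such a validation strategy could look like, written in Java rather than C#. The names EnrollmentPolicy and MaxCoursesPolicy are hypothetical, as is the choice to have add() return a boolean instead of silently ignoring a rejected course.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class Course {
    final String name;

    Course(String name) {
        this.name = name;
    }
}

// Hypothetical strategy interface: decides whether a course may be added.
interface EnrollmentPolicy {
    boolean canAdd(List<Course> current, Course candidate);
}

// One possible rule: at most `max` courses per student.
class MaxCoursesPolicy implements EnrollmentPolicy {
    private final int max;

    MaxCoursesPolicy(int max) {
        this.max = max;
    }

    @Override
    public boolean canAdd(List<Course> current, Course candidate) {
        return candidate != null && current.size() < max;
    }
}

class Student {
    private final List<Course> courses = new ArrayList<>();
    private final EnrollmentPolicy policy;

    Student(EnrollmentPolicy policy) {
        this.policy = policy;
    }

    // Read-only view, so every change has to pass through add().
    List<Course> getCourses() {
        return Collections.unmodifiableList(courses);
    }

    boolean add(Course course) {
        if (!policy.canAdd(getCourses(), course)) {
            return false;
        }
        return courses.add(course);
    }
}
```

A new kind of student then only needs a different policy, for example new Student(new MaxCoursesPolicy(5)), and the Student class itself stays unchanged.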
I'd say this is a good idea when adding/removing needs to be controlled in the manner you suggest, such as for business rule validation. Otherwise, as you know from previous code, there's really no way to ensure that the validation is performed.
The balance that you'll probably want to reach, however, is when to do this and when not to. Doing this for every collection of every kind seems like overkill. However, if you don't do this and then later need to add this kind of gate-keeping code then it would be a breaking change for the class, which may or may not be a headache at the time.
I suppose another approach could be to have a custom descendant of IList<T> which has generic gate-keeping code for its Add() and Remove() methods which notifies the system of what's happening. Something like exposing an event which is raised before the internal logic of those methods is called. Then the Student class would supply a delegate or something (sorry for being vague, I'm very coded-out today) when instantiating _courses to apply business logic to the event and cancel the operation (throw an exception, I imagine) if the business validation fails.
That could be overkill as well, depending on the developer's disposition. But at least with something a little more engineered like this you get a single generic implementation for everything with the option to add/remove business validation as needed over time without breaking changes.
I've done that in the past and regretted it: a better option is to use different classes to read domain objects than the ones you use to modify them.
For example, use a behavior-rich Student domain class that jealously guards its ownership of courses - it shouldn't expose them at all if student is responsible for them - and a StudentDataTransferObject (or ViewModel) that provides a simple list of strings of courses (or a dictionary when you need IDs) for populating interfaces.

How to synchronize property updates to a single domain instance?

I am looking for ideas to synchronize updates to a single instance of a persisted object.
A simple domain object:
public class Employee {
long id;
String phone;
String address;
}
Suppose two UI instances pull up Employee(1) where id=1. The first client edits the phone property of Employee(1); the second client edits the address property of Employee(1). When they submit their changes, both need to be persisted.
A possible solution would be to create an update function for each property:
public void updatePhone(Employee employee) {
// right now I am synchronizing _employeeUpdateLock
// synchronize instance of Employee won't work
synchronized( something ) {
// update phone
}
}
// a similar function for address
This approach unfortunately doesn't scale well: the API constantly has to be kept in line with the properties. Note that
public void update(Employee employee) { ... }
won't work because the function can't tell which property the client intends to change, unless a copy of the original object can be pulled up within the update function.
Hibernate provides a mechanism to lock a row in the database. This doesn't scale well either.
Perhaps the solution depends on how frequently a row is expected to be modified. For low-frequency modifications, synchronized blocks and locks are fine. For high-frequency modifications, a copy of the row taken at the time of retrieval can be used to figure out which properties were updated.
I am hoping to find a better paradigm to solve this problem. Thanks.
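The "copy of the row at the time of retrieval" idea can be sketched as a per-property merge. This is only an illustration under stated assumptions: the class EmployeeMerger, the copy() helper and the constructor are hypothetical, and the fields are those of the Employee class above.

```java
import java.util.Objects;

class Employee {
    long id;
    String phone;
    String address;

    Employee(long id, String phone, String address) {
        this.id = id;
        this.phone = phone;
        this.address = address;
    }

    // Snapshot taken when a client loads the record.
    Employee copy() {
        return new Employee(id, phone, address);
    }
}

// Hypothetical helper: applies only the properties the client actually changed.
class EmployeeMerger {
    /**
     * @param original snapshot the client started from
     * @param edited   what the client submitted
     * @param current  what is stored right now
     */
    static Employee merge(Employee original, Employee edited, Employee current) {
        Employee result = current.copy();
        if (!Objects.equals(original.phone, edited.phone)) {
            result.phone = edited.phone;
        }
        if (!Objects.equals(original.address, edited.address)) {
            result.address = edited.address;
        }
        return result;
    }
}
```

With this, a single update(original, edited) entry point is enough, because the diff against the snapshot reveals which property the client intended to change. If two clients change the same property, the later submission wins in this sketch; detecting that case would need something extra, such as a version number on the row.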
I don't really understand your intended architecture. Several clients are to share one instance of the same data object? How is that even possible - short of using some kind of remote object model (which is considered a bad thing nowadays)?

Is this a ddd anti-pattern?

Is it a violation of persistence ignorance to inject a repository interface into an Entity object like this? Without an interface I clearly see a problem, but when using an interface is there really a problem? Is the code below a good or bad pattern, and why?
public class Contact
{
private readonly IAddressRepository _addressRepository;
public Contact(IAddressRepository addressRepository)
{
_addressRepository = addressRepository;
}
private IEnumerable<Address> _addressBook;
public IEnumerable<Address> AddressBook
{
get
{
if(_addressBook == null)
{
_addressBook = _addressRepository.GetAddresses(this.Id);
}
return _addressBook;
}
}
}
It's not exactly a good idea, but it may be OK for some limited scenarios. I'm a little confused by your model, as I have a hard time believing that Address is your aggregate root, and therefore it wouldn't be usual to have a full-blown address repository. Based on your example, you are probably using a table data gateway or DAO rather than a repository.
I prefer to use a data mapper to solve this problem (an ORM or similar solution). Basically, I would take advantage of my ORM to treat address-book as a lazy loaded property of the aggregate root, "Contact". This has the advantage that your changes can be saved as long as the entity is bound to a session.
If I weren't using an ORM, I'd still prefer that the concrete Contact repository implementation set the property of the AddressBook backing store (list, or whatever). I might have the repository set that enumeration to a proxy object that does know about the other data store, and loads it on demand.
You can inject the load function from outside. The new Lazy<T> type in .NET 4.0 comes in handy for that:
public Contact(Lazy<IEnumerable<Address>> addressBook)
{
_addressBook = addressBook;
}
private Lazy<IEnumerable<Address>> _addressBook;
public IEnumerable<Address> AddressBook
{
get { return this._addressBook.Value; }
}
Also note that IEnumerable<T>s might be intrinsically lazy anyhow when you get them from a query provider. But for any other type you can use the Lazy<T>.
Normally when you follow DDD you always operate with the whole aggregate. The repository always returns you a fully loaded aggregate root.
It doesn't make much sense (in DDD at least) to write code as in your example. A Contact aggregate will always contain all the addresses (if it needs them for its behavior, which I doubt to be honest).
So typically ContactRepository is supposed to construct the whole Contact aggregate for you, where Address is an entity or, most likely, a value object inside this aggregate.
Because Address is an entity/value object that belongs to (and is therefore managed by) the Contact aggregate, it will not have its own repository, as you are not supposed to manage entities that belong to an aggregate outside of that aggregate.
In summary: always load the whole Contact and call its behavior method to do something with its state.
Since it's been 2 years since I asked the question, and the question was somewhat misunderstood, I will try to answer it myself.
Rephrased question:
"Should business entity classes be fully persistence ignorant?"
I think entity classes should be fully persistence ignorant, because you will instantiate them in many places in your code base, so it quickly becomes messy to always have to inject the Repository class into the entity constructor, and it doesn't look very clean either. This becomes even more evident if you need to inject several repositories. Therefore I always use a separate handler/service class to do the persistence work for the entities. These classes are instantiated far less frequently and you usually have more control over where and when this happens. Entity classes are kept as lightweight as possible.
I now always have one Repository per aggregate root, and if I need some extra business logic when entities are fetched from repositories I usually create one service class for the aggregate root.
The code in the question was a bad example, so taking a tweaked version of it, this is how I would do it now.
Instead of:
public class Contact
{
private readonly IContactRepository _contactRepository;
public Contact(IContactRepository contactRepository)
{
_contactRepository = contactRepository;
}
public void Save()
{
_contactRepository.Save(this);
}
}
I do it like this:
public class Contact
{
}
public class ContactService
{
private readonly IContactRepository _contactRepository;
public ContactService(IContactRepository contactRepository)
{
_contactRepository = contactRepository;
}
public void Save(Contact contact)
{
_contactRepository.Save(contact);
}
}

NHibernate - Changing sub-types

How do you go about changing the subtype of a row in NHibernate? For example if I have a Customer entity and a subclass of TierOneCustomer, I have a case where I need to change a Customer to a TierOneCustomer but the TierOneCustomer should have the same Id (PK) as the original Customer entity.
The mapping looks something like this:
<class name="Customer" table="SiteCustomer" discriminator-value="C">
<id name="Id" column="Id" type="Int64">
<generator class="identity" />
</id>
<discriminator column="CustomerType" />
... properties snipped ...
<subclass name="TierOneCustomer" discriminator-value="P">
... more properties ...
</subclass>
</class>
I'm using the one-table per class hierarchy model, so using plain-sql, it'd be just a matter of a sql update of the discriminator (CustomerType) and set the appropriate columns relevant for the type. I can't find the solution in NHibernate, so would appreciate any pointers.
I'm also thinking whether the model is correct considering this use-case, but before I go down that route, I want to make sure doing as described above is actually possible in the first place. If not, I'll almost certainly think about changing the model.
Short answer is yes, you can change the discriminator value for the particular row(s) using native SQL.
However, I don't think NHibernate is intended to work this way, since the discriminator is generally "invisible" to the object layer: its value is supposed to be set initially according to the class of the persisted object and never changed.
I recommend looking into a cleaner approach. From the standpoint of the object model, you're trying to convert a superclass object into one of its subclass types while not changing the identity of its persisted instance, and that's where the conflict is (the converted object isn't really supposed to be the same thing). Two alternative approaches are:
Create a new instance of TierOneCustomer based on the information in the original Customer object, then delete the original object. If you were relying on the Customer's Primary Key for retrieval, you'll need to take note of the new PK.
or
Change your approach so the object type (discriminator) doesn't need to change. Instead of relying on a subclass to distinguish TierOneCustomer from Customer, you can use a property that you can modify freely at any time, i.e. Customer.Tier = 1.
Here are some related discussions on the Hibernate Forums that may be of interest:
Can we update the discriminator column in Hibernate
Table-per-Class Problem: Discriminator and Property
Converting a persisted instance into a subclass
You're doing something wrong.
What you are trying to do is to change the type of an object. You can't do that in .NET or in Java. That simply doesn't make sense. An object is of exactly one concrete type, and its concrete type cannot be changed from the time the object is created until the time the object is destroyed (black magic notwithstanding). In order to accomplish what you are trying to do, but with the class hierarchy you laid out, you would have to destroy the customer object which you want to turn into a tier-one customer object, create a new tier-one customer object, and copy all the relevant properties from the customer object to the tier-one customer object. That is how you do it with objects, in object-oriented languages, with your class hierarchy.
Obviously, the class hierarchy you have isn't working for you. You don't destroy customers in real life when they become tier-one customers! So don't do it with objects either. Instead, come up with a class hierarchy that makes sense, given the scenarios you need to implement. Your use scenarios include:
A customer who previously did not have tier-one status now gains tier-one status.
That means you need a class hierarchy which can accurately capture this scenario. As a hint, you should favor composition over inheritance. That means, it may be a better idea to have a property named IsTierOne, or a property named DiscountStrategy, etc., depending on what works best.
The entire purpose of NHibernate (and Hibernate for Java) is to make the database invisible. To allow you to work with objects natively, with the database magically there behind the scenes to make your objects persistent. NHibernate will let you work with the database natively, but that's not the type of scenario which NHibernate is built for.
This is REALLY late, but may be of use to the next person looking to do something similar:
While the other answers are correct that you shouldn't change the discriminator in most cases, you can do it purely within the scope of NH (no native SQL), with some clever use of mapped properties. Here's the gist of it using FluentNH:
public enum CustomerType //not sure it's needed
{
Customer,
TierOneCustomer
}
public class Customer
{
//You should be able to use the Type name instead,
//but I know this enum-based approach works
public virtual CustomerType Type
{
get {return CustomerType.Customer;}
set {} //small code smell; setter exists, no error, but it doesn't do anything.
}
...
}
public class TierOneCustomer:Customer
{
public override CustomerType Type {get {return CustomerType.TierOneCustomer;} set{}}
...
}
public class CustomerMap:ClassMap<Customer>
{
public CustomerMap()
{
...
DiscriminateSubClassesOnColumn<string>("CustomerType");
DiscriminatorValue(CustomerType.Customer.ToString());
//here's the magic; make the discriminator updatable
//"Not.Insert()" is required to prevent the discriminator column
//showing up twice in an insert statement
Map(x => x.Type).Column("CustomerType").Update().Not.Insert();
}
}
public class TierOneCustomerMap:SubclassMap<TierOneCustomer>
{
public TierOneCustomerMap()
{
//same idea, different discriminator value
...
DiscriminatorValue(CustomerType.TierOneCustomer.ToString());
...
}
}
The end result is that the discriminator value is specified for inserts, and used to determine the instantiated type on retrieval, but then if a record of a different subtype with the same Id is saved (as if the record was cloned or un-bound from the UI to a new type), the discriminator value is updated on the existing record with that ID as an object property, so that future retrievals of that type are as the new object. The setter is required on the properties because AFAIK NHibernate can't be told that a property is read-only (and thus "write-only" to the DB); in NHibernate's world, if you write something to the DB, why wouldn't you want it back?
I used this pattern recently to allow users to change the basic type of a "tour", which is in reality a set of rules governing the scheduling of the actual "tour" (a single digital "visit" to a client's on-site equipment to ensure it all works properly). While they're all "tour schedules" and need to be collectable in lists/queues etc as such, the different types of schedules require very different data and very different processing, calling for a similar data structure as the OP has. I therefore completely understand the OP's desire to treat a TierOneCustomer in a substantially different way while minimizing the effect at the data layer, so, here ya go.
If you're doing it offline (e.g. in a DB upgrade script), just use SQL and ensure consistency yourself.
If this is something you plan to have happen while the app is running, I think your requirements are wrong, just as keeping the same pointer address for a different object is wrong.
If you save the ID and use it to access the customer again (e.g. in a URL) consider making a new field that contains a token for this that will be the business key. Since it's not the ID, it's easy to create a new entity instance and copy over the token (you'll probably need to remove the token from the old one).