How to synchronize property updates to a single domain instance?

I am looking for ideas to synchronize updates to a single instance of a persisted object.
A simple domain object:
public class Employee {
    long id;
    String phone;
    String address;
}
Suppose two UI instances pull up Employee(1) where id=1. The first client edits the phone property of Employee(1); the second client edits the address property of Employee(1). When they submit their changes, both need to be persisted.
A possible solution would be to create an update function for each property:
public void updatePhone(Employee employee) {
    // right now I am synchronizing on _employeeUpdateLock;
    // synchronizing on the Employee instance won't work
    synchronized (something) {
        // update phone
    }
}
// a similar function for address
This approach unfortunately doesn't scale well: the API must constantly be realigned with the properties. Note that
public void update(Employee employee) { ... }
won't work because the function can't tell which property the client intends to change, unless a copy of the original object can be pulled up within the update function.
Hibernate provides a mechanism to lock a row in the database. This doesn't scale well either.
Perhaps the solution depends on how frequently a row is expected to be modified. For low-frequency modifications, synchronized blocks and locks are fine. For high-frequency modifications, a copy of the row taken at retrieval time can be used to figure out which properties were updated.
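To make the "copy of the row at retrieval time" idea concrete, here is a minimal Java sketch. EmployeeRecord and EmployeeMerger are hypothetical names introduced only for illustration, and the in-memory "stored" object stands in for whatever the persistence layer holds. Only fields that differ from the client's snapshot are written back, so a phone edit and an address edit made against the same starting state both survive.

```java
import java.util.Objects;

// Hypothetical stand-in for the Employee class from the question.
final class EmployeeRecord {
    final long id;
    String phone;
    String address;

    EmployeeRecord(long id, String phone, String address) {
        this.id = id;
        this.phone = phone;
        this.address = address;
    }

    EmployeeRecord copy() {
        return new EmployeeRecord(id, phone, address);
    }
}

final class EmployeeMerger {
    // Private lock object; a sketch only, a real system would rely on
    // database transactions or optimistic versioning instead.
    private final Object lock = new Object();

    /**
     * Copies into 'stored' only the fields that 'edited' changed relative to
     * 'snapshot' (the state the client originally loaded). Untouched fields
     * keep whatever another client may have saved in the meantime.
     */
    void merge(EmployeeRecord stored, EmployeeRecord snapshot, EmployeeRecord edited) {
        synchronized (lock) {
            if (!Objects.equals(snapshot.phone, edited.phone)) {
                stored.phone = edited.phone;
            }
            if (!Objects.equals(snapshot.address, edited.address)) {
                stored.address = edited.address;
            }
        }
    }
}
```

If two clients change the same field, this sketch lets the last writer win; detecting that conflict would need a version number or a comparison against the stored value as well.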
I am hoping to find a better paradigm to solve this problem. Thanks.

I don't really understand your intended architecture. Several clients are to share one instance of the same data object? How is that even possible - short of using some kind of remote object model (which is considered a bad thing nowadays)?

Related

Value object in event sourcing

Is there a place for value objects in an event sourced domain model?
Let's define a value object as an object with immutable state that guards its invariants and has no particular identifier.
An event sourced domain model in this context is a domain that is entirely or partially event sourced, meaning that its current state can be derived from applying all events that have occurred in the past. Events themselves are considered immutable, even over time.
Debate has taken place about the validity of using value objects within events - this question goes slightly further: Do value objects have a place in event sourced domains at all?
The (potential) problem with using value objects is that it becomes rather tricky to alter the domain in such a way that invariants are tightened.
An example of this scenario would be to have a Username value object, with the sole constraint that the name must be anywhere between 2 and 16 characters.
While this has been working well for some time, the business decides to only allow usernames of at least 5 characters.
A migration period begins and users with names of less than 5 characters are asked to update their names.
Let's say the process was successful, correction events are applied and everyone is happy.
We tighten the constraints on our Username value object to require at least 5 characters.
For a while everyone is happy, but then we discover a problem with the snapshots and replay all events.
We now face an exception from our Username object: by loading the historic data, we're breaking an invariant of our domain.
The rules of a value object apply retroactively - does this make them inherently unsuitable for event sourcing? Would it be worth applying versioning of value objects? Is there a simpler way of avoiding such problems?
I would say that the moment you redefined what Username means without migrating the historical data, you essentially created two different meanings of Username.
Because the word now has two meanings, you have to make that explicit in the code somehow. "Versioning" is one way, although I wouldn't use such a generic solution; there are other modeling options.
You could make it explicit that the history of a "username" is just that, a history. So for example create a HistoricUsername, which is the event-sourced object, even a value object if you want. And create a Username which is at all times the username with the most current rules, which is not persisted at all, but created from a HistoricUsername if it can.
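A minimal Java sketch of that split might look like the following. The class shapes, the Optional-returning factory, and the 5 to 16 character limits (taken from the question's scenario) are assumptions made for illustration, not a prescribed design.

```java
import java.util.Optional;

// Event-sourced value: accepts whatever the history contains, no rules applied.
final class HistoricUsername {
    private final String value;

    HistoricUsername(String value) {
        this.value = value;
    }

    String value() {
        return value;
    }
}

// Current-rules value: never persisted, only derived from a HistoricUsername.
final class Username {
    private static final int MIN_LENGTH = 5;
    private static final int MAX_LENGTH = 16;

    private final String value;

    private Username(String value) {
        this.value = value;
    }

    // Returns an empty Optional when the historic name violates today's rules.
    static Optional<Username> from(HistoricUsername historic) {
        String v = historic.value();
        if (v == null || v.length() < MIN_LENGTH || v.length() > MAX_LENGTH) {
            return Optional.empty();
        }
        return Optional.of(new Username(v));
    }

    String value() {
        return value;
    }
}
```

Replaying events only ever constructs HistoricUsername, so tightening the rules in Username cannot break a replay.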
Some people suggest sometimes to extract the "rules" from the object, and re-apply it later. That way the object itself is valid at all times and you can ask it to validate itself against rules that might change. I don't really prefer these kinds of solutions, but it's an option, and the Username would still be a value-object.
So the problem is not really that value-objects don't fit into event-sourcing, it's just that the modeling has to be more accurate.
Do value objects have a place in event sourced domains at all?
Yes.
Is there a simpler way of avoiding such problems?
"Don't do that."
The problem you are describing is really one about messaging - if we make backwards incompatible changes to our messages, then things break.
(More precisely, you have a "Username" message, and you are trying to re-use that message with a new set of constraints that reject some previously valid uses of the message).
The answer is that you don't introduce backwards incompatible changes - instead, introduce new names that match the new requirements, and deprecate the old ones.
Which is to say, adding support for new messages, and removing support for the old messages, become two separately managed options.
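As a rough Java illustration of "new names instead of changed rules": all type names below (UserEvent, UsernameChosen, UsernameChosenV2, UserProjection) are hypothetical, and events are plain in-memory objects rather than deserialized messages. The old message keeps its old meaning, the new message carries the tightened rule, and the projection understands both.

```java
// All type names here are hypothetical, not taken from the posts above.
interface UserEvent {}

// Legacy message: recorded under the old 2..16 rule and accepted as-is on replay.
final class UsernameChosen implements UserEvent {
    final String name;

    UsernameChosen(String name) {
        this.name = name;
    }
}

// New message: introduced together with the 5..16 rule.
// The rule is checked only when a new event is created.
final class UsernameChosenV2 implements UserEvent {
    final String name;

    private UsernameChosenV2(String name) {
        this.name = name;
    }

    static UsernameChosenV2 create(String name) {
        if (name == null || name.length() < 5 || name.length() > 16) {
            throw new IllegalArgumentException("Username must be between 5 and 16 characters");
        }
        return new UsernameChosenV2(name);
    }
}

final class UserProjection {
    private String username;

    // Supports both message names; dropping the old branch later is a separate decision.
    void apply(UserEvent event) {
        if (event instanceof UsernameChosen) {
            username = ((UsernameChosen) event).name;
        } else if (event instanceof UsernameChosenV2) {
            username = ((UsernameChosenV2) event).name;
        } else {
            throw new IllegalArgumentException("Unknown event: " + event.getClass());
        }
    }

    String username() {
        return username;
    }
}
```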
Greg Young's book Versioning in an Event Sourced System dedicates some chapters to this idea. Also, Rich Hickey ends up touching on these important ideas in most of his talks -- I'd suggest starting from Spec-ulation.
The "value object", meaning that the type that the current implementation of the domain model uses to move the information around, is a separate concern from the messages. The data structures we use in memory don't need to be coupled to our serialization formats.
The representation of the information on the wire is distinct from the representation of information in memory, and that in turn is distinct from the abstractions that manipulate the information in memory.
The challenging thing is that, at the beginning of a project, you have the least amount of information about when the different representations are going to diverge.
We've solved this in a slightly different way. By separating the public API of our value objects from the internal (domain only) API, we are able to evolve one without affecting the other.
For example:
public class Username
{
    private readonly string value;

    // Domain-only (internal) constructor.
    // Does not enforce constraints and can only be called within the domain.
    internal Username(string value)
    {
        this.value = value;
    }

    // Public factory method.
    // Enforces business constraints. Used by consumers of the domain (application layer etc.)
    // to create new instances of the value object.
    public static Username Create(string value)
    {
        // Business constraints. These will evolve and grow over time.
        if (value == null)
        {
            throw new ArgumentNullException(nameof(value));
        }
        if (value.Length < 2)
        {
            throw new ArgumentException("Username must be at least 2 characters.", nameof(value));
        }
        return new Username(value);
    }
}
Consumers of the domain must use the static Create method to create a new instance of the value object. This factory method contains all of our business constraints and prevents an instance being created in an invalid state.
Inside the domain, classes have access to the internal (constraint-less) constructor. Since this does not enforce any business constraints, an instance of the value object can always be created in this way (regardless of its value). By using this constructor when replaying events we can ensure that loading historical data will always succeed.
The benefits of this design are:
A single class is used to represent the domain concept (no need for multiple classes, versioning etc.).
Business rules are free to evolve over time.
Historical data always works. A Username from a year ago is still a user name, even if our rules have changed.
Although already answered I do find this an interesting situation.
I agree with others that the event data should be record-based and, therefore, nothing more than a data container that may be used to reconstitute the aggregate.
That being said when the rules change so does the domain. A major portion of domain-driven design is to capture as much of the domain (rules/structure) as is required. If this is the case should the changes in the rules not also be kept?
For instance, if we have a Username Value Object and it starts out with the 2 to 16 characters rules then that is coded as such:
public class Username
{
    public string Value { get; }

    public Username(string value)
    {
        if (value.Length < 2 || value.Length > 16)
        {
            throw new DomainException("Username must be between 2 and 16 characters");
        }
        Value = value;
    }
}
Now we get to 1 March 2018 and the rule changes. We can keep the rule around:
public class Username
{
    public string Value { get; }

    public Username(string value, DateTime registrationDate)
    {
        if (registrationDate < new DateTime(2018, 3, 1) &&
            (value.Length < 2 || value.Length > 16))
        {
            throw new DomainException("Username must be between 2 and 16 characters");
        }
        if (registrationDate >= new DateTime(2018, 3, 1) &&
            (value.Length < 5 || value.Length > 16))
        {
            throw new DomainException("Username must be between 5 and 16 characters");
        }
        Value = value;
    }
}
That is the basic idea. In this way we keep our "old" rules around as well. This may become quite a hassle but I don't have enough experience to say. Changing our rules retroactively may introduce some pretty tricky situations, so I guess one would need to evaluate this on a case-by-case basis.
Just a thought.

Expected behaviour of a Repository

I'm writing an ORM and am unsure of the expected behaviour of the Repository, or more precisely, the frontier between the Repository and the Unit Of Work.
From my understanding, a Repository might look like this:
interface IPersonRepository
{
    public function find(Criteria criteria);
    public function add(Person person);
    public function delete(Person person);
}
According to Fowler (PoEAA, page 322):
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. [...] Objects can be added to and removed from the Repository, as they can from a simple collection of objects.
This would imply that the following test should work (assuming that we already have a Person persisted, whose last name is Fowler):
collection = repository.find(lastnameEqualsFowlerCriteria);
person = collection[0];
assertEquals(person.lastname, "Fowler");
person.lastname = "Evans";
newCollection = repository.find(lastnameEqualsFowlerCriteria);
assertFalse(newCollection.contains(person));
That means that when mapping to a database, even if no explicit save() method has been called somewhere, the Person model must have been automatically persisted by the Repository, so that the next query returned the correct collection, not containing the original Person.
But, isn't that the role of the Unit Of Work, to decide which model to persist to the database, and when?
In the above implementation, the Repository has to decide to persist the Person previously retrieved when receiving another find() call, so that the result is consistent with the modification. But if no other find() call were issued, the model would not have been persisted implicitly at all.
In the context of a Unit Of Work, it is not really a problem, because we can start a transaction at the beginning, and rollback any insert to the db anyway if needed.
But when used alone, can't this Repository lead to unexpected, unpredictable behaviour?
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. [...] Objects can be added to and removed from the Repository, as they can from a simple collection of objects.
This does not mean you do not need a save method. You still need to explicitly commit your changes to storage.
See The Unit Of Work Pattern And Persistence Ignorance
public interface IUnitOfWork {
    void MarkDirty(object entity);
    void MarkNew(object entity);
    void MarkDeleted(object entity);
    void Commit();
    void Rollback();
}
In a way, you can think of the Unit of Work as a place to dump all transaction-handling code. The responsibilities of the Unit of Work are to:
Manage transactions.
Order the database inserts, deletes, and updates.
Prevent duplicate updates. Inside a single usage of a Unit of Work object, different parts of the code may mark the same Invoice object as changed, but the Unit of Work class will only issue a single UPDATE command to the database.
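The "prevent duplicate updates" responsibility can be sketched in Java roughly as follows. SimpleUnitOfWork and its recorded statement strings are hypothetical; a real implementation would issue SQL inside a transaction rather than append to a list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class SimpleUnitOfWork {
    // Identity-based set: the same object marked dirty twice is stored once.
    private final Set<Object> dirty =
            Collections.newSetFromMap(new IdentityHashMap<Object, Boolean>());

    // Stands in for statements sent to the database.
    private final List<String> issuedStatements = new ArrayList<>();

    void markDirty(Object entity) {
        dirty.add(entity);
    }

    // Issues exactly one UPDATE per dirty entity, then forgets them.
    void commit() {
        for (Object entity : dirty) {
            issuedStatements.add("UPDATE " + entity.getClass().getSimpleName());
        }
        dirty.clear();
    }

    // Discards pending changes without issuing anything.
    void rollback() {
        dirty.clear();
    }

    List<String> issuedStatements() {
        return Collections.unmodifiableList(issuedStatements);
    }
}
```

Nothing reaches the "database" until commit() is called, which is the explicit save step referred to above.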
I think what you're asking about is the following: http://martinfowler.com/eaaCatalog/identityMap.html
The Repository should keep fetched objects in memory, and subsequent requests for the same entity should be served from memory rather than from persistent storage; hence your example should work fine.
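A minimal Java sketch of an identity map inside a repository, assuming lookups by id. Person, IdentityMapRepository, and the loader function (which stands in for the data mapper) are hypothetical names used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

// Hypothetical domain class for the example.
final class Person {
    final long id;
    String lastname;

    Person(long id, String lastname) {
        this.id = id;
        this.lastname = lastname;
    }
}

final class IdentityMapRepository {
    // One in-memory instance per id for the lifetime of this repository.
    private final Map<Long, Person> loaded = new HashMap<>();

    // Stands in for the data mapper that actually reads from storage.
    private final LongFunction<Person> loader;

    IdentityMapRepository(LongFunction<Person> loader) {
        this.loader = loader;
    }

    // Hits storage only the first time an id is requested.
    Person findById(long id) {
        return loaded.computeIfAbsent(id, key -> loader.apply(key));
    }
}
```

This guarantees that repeated lookups by identity return the same, possibly modified, instance. Whether a criteria query such as "lastname equals Fowler" sees unsaved changes still depends on evaluating the criteria against the in-memory objects or flushing before the query.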

Business Entity - should lists be exposed only as ReadOnlyCollections?

In trying to centralize how items are added, or removed from my business entity classes, I have moved to the model where all lists are only exposed as ReadOnlyCollections and I provide Add and Remove methods to manipulate the objects in the list.
Here is an example:
public class Course
{
    public string Name { get; set; }
}

public class Student
{
    private List<Course> _courses = new List<Course>();

    public string Name { get; set; }

    public ReadOnlyCollection<Course> Courses
    {
        get { return _courses.AsReadOnly(); }
    }

    public void Add(Course course)
    {
        if (course != null && _courses.Count <= 3)
        {
            _courses.Add(course);
        }
    }

    public bool Remove(Course course)
    {
        bool removed = false;
        if (course != null && _courses.Count <= 3)
        {
            removed = _courses.Remove(course);
        }
        return removed;
    }
}
Part of my objective in doing the above is to not end up with an Anemic data-model (an anti-pattern) and also avoid having the logic that adds and removes courses all over the place.
Some background: the application I am working with is an ASP.NET application in which the lists were previously exposed directly as lists, which resulted in Courses being added to the Student in all kinds of ways (in some places a check was made, in others it was not).
But my question is: is the above a good idea?
Yes, this is a good approach. In my opinion you're not doing anything other than decorating your list, and it's better than implementing your own IList (you save many lines of code, even though you lose the more elegant way of iterating through your Course objects).
You may also consider accepting a validation strategy object, since in the future you might get a new requirement, for example a new kind of student that can have more than 3 courses.
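The validation strategy idea could look roughly like this, sketched in Java. EnrollmentPolicy, MaxCoursesPolicy, and StudentWithPolicy are hypothetical names, and courses are plain strings to keep the example short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical strategy: decides whether one more course may be added.
interface EnrollmentPolicy {
    boolean canAdd(List<String> currentCourses, String candidate);
}

// One possible rule: a fixed upper bound on the number of courses.
final class MaxCoursesPolicy implements EnrollmentPolicy {
    private final int max;

    MaxCoursesPolicy(int max) {
        this.max = max;
    }

    @Override
    public boolean canAdd(List<String> currentCourses, String candidate) {
        return candidate != null && currentCourses.size() < max;
    }
}

final class StudentWithPolicy {
    private final List<String> courses = new ArrayList<>();
    private final EnrollmentPolicy policy;

    StudentWithPolicy(EnrollmentPolicy policy) {
        this.policy = policy;
    }

    // Single entry point for adding; the rule lives in the injected policy.
    boolean add(String course) {
        if (!policy.canAdd(Collections.unmodifiableList(courses), course)) {
            return false;
        }
        courses.add(course);
        return true;
    }

    // Read-only view, mirroring the ReadOnlyCollection in the question.
    List<String> courses() {
        return Collections.unmodifiableList(courses);
    }
}
```

A different kind of student then only needs a different policy instance, for example new MaxCoursesPolicy(5), with no change to the student class itself.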
I'd say this is a good idea when adding/removing needs to be controlled in the manner you suggest, such as for business rule validation. Otherwise, as you know from previous code, there's really no way to ensure that the validation is performed.
The balance that you'll probably want to reach, however, is when to do this and when not to. Doing this for every collection of every kind seems like overkill. However, if you don't do this and then later need to add this kind of gate-keeping code then it would be a breaking change for the class, which may or may not be a headache at the time.
I suppose another approach could be to have a custom descendant of IList<T> which has generic gate-keeping code for its Add() and Remove() methods which notifies the system of what's happening. Something like exposing an event which is raised before the internal logic of those methods is called. Then the Student class would supply a delegate or something (sorry for being vague, I'm very coded-out today) when instantiating _courses to apply business logic to the event and cancel the operation (throw an exception, I imagine) if the business validation fails.
That could be overkill as well, depending on the developer's disposition. But at least with something a little more engineered like this you get a single generic implementation for everything with the option to add/remove business validation as needed over time without breaking changes.
I've done that in the past and regretted it: a better option is to use different classes to read domain objects than the ones you use to modify them.
For example, use a behavior-rich Student domain class that jealously guards its ownership of courses - it shouldn't expose them at all if student is responsible for them - and a StudentDataTransferObject (or ViewModel) that provides a simple list of strings of courses (or a dictionary when you need IDs) for populating interfaces.

Is this a valid use of the lock keyword in C#?

I have a singleton class AppSetting in an ASP.NET app where I need to check a value and optionally update it. I know I need to use a locking mechanism to prevent multi-threading issues, but can someone verify that the following is a valid approach?
private static void ValidateEncryptionKey()
{
    if (AppSetting.Instance.EncryptionKey.Equals(Constants.ENCRYPTION_KEY, StringComparison.Ordinal))
    {
        lock (AppSetting.Instance)
        {
            if (AppSetting.Instance.EncryptionKey.Equals(Constants.ENCRYPTION_KEY, StringComparison.Ordinal))
            {
                AppSetting.Instance.EncryptionKey = GenerateNewEncryptionKey();
                AppSetting.Instance.Save();
            }
        }
    }
}
I have also seen examples where you lock on a private field in the current class, but I think the above approach is more intuitive.
Thanks!
Intuitive, maybe, but the reason those examples lock on a private field is to ensure that no other piece of code in the application can take the same lock in such a way as to deadlock the application, which is always good defensive practice.
If it's a small application and you're the only programmer working on it, you can probably get away with locking on a public field/property (which I presume AppSetting.Instance is?), but in any other circumstances, I'd strongly recommend that you go the private field route. It will save you a whole lot of debugging time in the future when someone else, or you in the future having forgotten the implementation details of this bit, takes a lock on AppSetting.Instance somewhere distant in the code and everything starts crashing.
I'd also suggest you lose the outermost if. Taking a lock isn't free, sure, but it's a lot faster than doing a string comparison, especially since you need to do it a second time inside the lock anyway.
So, something like:
// Private field inside AppSetting; nothing outside the class can lock on it.
private readonly object _instanceLock = new object();

private static void ValidateEncryptionKey()
{
    lock (AppSetting.Instance._instanceLock)
    {
        if (AppSetting.Instance.EncryptionKey.Equals(Constants.ENCRYPTION_KEY, StringComparison.Ordinal))
        {
            AppSetting.Instance.EncryptionKey = GenerateNewEncryptionKey();
            AppSetting.Instance.Save();
        }
    }
}
An additional refinement, depending on what your requirements are to keep the EncryptionKey consistent with the rest of the state in AppSetting.Instance, would be to use a separate private lock object for the EncryptionKey and any related fields, rather than locking the entire instance every time.
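For comparison, the same "dedicated private lock per group of fields" refinement sketched in Java, where synchronized plays the role of C#'s lock. AppSettings, DEFAULT_KEY, and the placeholder key generation are assumptions made for illustration only.

```java
final class AppSettings {
    // Hypothetical placeholder value that marks "no real key generated yet".
    private static final String DEFAULT_KEY = "CHANGE-ME";

    // Dedicated lock for the encryption key only; other fields can use their own.
    private final Object encryptionKeyLock = new Object();

    private String encryptionKey = DEFAULT_KEY;
    private int generatedCount = 0;

    // Check-and-update happens entirely inside the lock, so two threads
    // cannot both see the default key and both generate a replacement.
    String ensureEncryptionKey() {
        synchronized (encryptionKeyLock) {
            if (DEFAULT_KEY.equals(encryptionKey)) {
                generatedCount++;
                // Placeholder for real key generation and saving.
                encryptionKey = "generated-" + generatedCount;
            }
            return encryptionKey;
        }
    }

    int generatedCount() {
        synchronized (encryptionKeyLock) {
            return generatedCount;
        }
    }
}
```

Because encryptionKeyLock is private and final, no code outside the class can take the same lock, which is the defensive property described in the answer.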

Ensuring inserts after a call to a custom NHibernate IIdentifierGenerator

The setup
Some of the "old old old" tables of our database use an exotic primary key generation scheme [1] and I'm trying to overlay this part of the database with NHibernate. This generation scheme is mostly hidden away in a stored procedure called, say, 'ShootMeInTheFace.GetNextSeededId'.
I have written an IIdentifierGenerator that calls this stored proc:
public class LegacyIdentityGenerator : IIdentifierGenerator, IConfigurable
{
    // ... snip ...
    public object Generate(ISessionImplementor session, object obj)
    {
        var connection = session.Connection;
        using (var command = connection.CreateCommand())
        {
            SqlParameter param;
            session.ConnectionManager.Transaction.Enlist(command);
            command.CommandText = "ShootMeInTheFace.GetNextSeededId";
            command.CommandType = CommandType.StoredProcedure;
            param = command.CreateParameter() as SqlParameter;
            param.Direction = ParameterDirection.Input;
            param.ParameterName = "@sTableName";
            param.SqlDbType = SqlDbType.VarChar;
            param.Value = this.table;
            command.Parameters.Add(param);
            // ... snip ...
            command.ExecuteNonQuery();
            // ... snip ...
            return ((IDataParameter)command
                .Parameters["@sTrimmedNewId"]).Value as string;
        }
    }
}
The problem
I can map this in the XML mapping files and it works great, BUT....
It doesn't work when NHibernate tries to batch inserts, such as in a cascade, or when the session is not Flush()ed after every call to Save() on a transient entity that depends on this generator.
That's because NHibernate seems to be doing something like
for (each thing that I need to save)
{
    [generate its id]
    [add it to the batch]
}
[execute the sql in one big batch]
This doesn't work because, since the generator is asking the database every time, NHibernate just ends up getting the same ID generated multiple times, since it hasn't actually saved anything yet.
The other NHibernate generators like IncrementGenerator seem to get around this by asking the database for the seed value once and then incrementing the value in memory during subsequent calls in the same session. I would rather not do this in my implementation if I don't have to, since all of the code that I need is sitting in the database already, just waiting for me to call it correctly.
Is there a way to make NHibernate actually issue the INSERT after each call to generate an ID for entities of a certain type? Fiddling with the batch size settings doesn't seem to help.
Do you have any suggestions/other workarounds besides re-implementing the generation code in memory or bolting on some triggers to the legacy database? I guess I could always treat these as "assigned" generators and try to hide that fact somehow within the guts of the domain model....
Thanks for any advice.
The update: 2 months later
It was suggested in the answers below that I use an IPreInsertEventListener to implement this functionality. While this sounds reasonable, there were a few problems with this.
The first problem was that setting the id of an entity to the AssignedGenerator and then not actually assigning anything in code (since I was expecting my new IPreInsertEventListener implementation to do the work) resulted in an exception being thrown by the AssignedGenerator, since its Generate() method essentially does nothing but check to make sure that the id is not null, throwing an exception otherwise. This is worked around easily enough by creating my own IIdentifierGenerator that is like AssignedGenerator without the exception.
The second problem was that returning null from my new IIdentifierGenerator (the one I wrote to overcome the problems with the AssignedGenerator) resulted in the innards of NHibernate throwing an exception, complaining that a null id was generated. Okay, fine, I changed my IIdentifierGenerator to return a sentinel string value, say, "NOT-REALLY-THE-REAL-ID", knowing that my IPreInsertEventListener would replace it with the correct value.
The third problem, and the ultimate deal-breaker, was that IPreInsertEventListener runs so late in the process that you need to update both the actual entity object as well as an array of state values that NHibernate uses. Typically this is not a problem and you can just follow Ayende's example. But there are three issues with the id field relating to the IPreInsertEventListeners:
The property is not in the @event.State array but instead in its own Id property.
The Id property does not have a public set accessor.
Updating only the entity but not the Id property results in the "NOT-REALLY-THE-REAL-ID" sentinel value being passed through to the database since the IPreInsertEventListener was unable to insert in the right places.
So my choice at this point was to use reflection to get at that NHibernate property, or to really sit down and say "look, the tool just wasn't meant to be used this way."
So I went back to my original IIdentifierGenerator and made it work for lazy flushes: it got the high value from the database on the first call, and then I re-implemented that ID generation function in C# for subsequent calls, modeling this after the Increment generator:
private string lastGenerated;

public object Generate(ISessionImplementor session, object obj)
{
    string identity;
    if (this.lastGenerated == null)
    {
        identity = GetTheValueFromTheDatabase();
    }
    else
    {
        identity = GenerateTheNextValueInCode();
    }
    this.lastGenerated = identity;
    return identity;
}
This seems to work fine for a while, but like the increment generator, we might as well call it the TimeBombGenerator. If there are multiple worker processes executing this code in non-serializable transactions, or if there are multiple entities mapped to the same database table (it's an old database, it happened), then we will get multiple instances of this generator with the same lastGenerated seed value, resulting in duplicate identities.
##$##$#.
My solution at this point was to make the generator cache a dictionary of WeakReferences to ISessions and their lastGenerated values. This way, the lastGenerated is effectively local to the lifetime of a particular ISession, not the lifetime of the IIdentifierGenerator, and because I'm holding WeakReferences and culling them out at the beginning of each Generate() call, this won't explode in memory consumption. And since each ISession is going to hit the database table on its first call, we'll get the necessary row locks (assuming we're in a transaction) we need to prevent duplicate identities from happening (and if they do, such as from a phantom row, only the ISession needs to be thrown away, not the entire process).
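The per-session cache can be sketched in Java with a WeakHashMap, which drops an entry once its key is no longer strongly referenced, so no manual culling step is needed in this version. PerSessionIdGenerator and its two function parameters are hypothetical stand-ins for the database lookup and the in-code increment; session objects are assumed to use default identity-based equals and hashCode.

```java
import java.util.Map;
import java.util.WeakHashMap;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

final class PerSessionIdGenerator {
    // Keys are held weakly: an entry vanishes after its session is garbage collected.
    private final Map<Object, String> lastGeneratedBySession = new WeakHashMap<>();

    // Stand-in for the stored procedure call that reads the seed from the table.
    private final Supplier<String> seedFromDatabase;

    // Stand-in for the ID scheme re-implemented in code.
    private final UnaryOperator<String> nextInCode;

    PerSessionIdGenerator(Supplier<String> seedFromDatabase, UnaryOperator<String> nextInCode) {
        this.seedFromDatabase = seedFromDatabase;
        this.nextInCode = nextInCode;
    }

    // First call per session asks the database; later calls continue in memory.
    synchronized String generate(Object session) {
        String last = lastGeneratedBySession.get(session);
        String next = (last == null) ? seedFromDatabase.get() : nextInCode.apply(last);
        lastGeneratedBySession.put(session, next);
        return next;
    }
}
```

The cached value is a String that does not reference the session, which matters: a WeakHashMap value that strongly referenced its own key would keep the entry alive forever.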
It is ugly, but more feasible than changing the primary key scheme of a 10-year-old database. FWIW.
[1] If you care to know about the ID generation, you take a substring(len - 2) of all of the values currently in the PK column, cast them to integers and find the max, add one to that number, add all of that number's digits, and append the sum of those digits as a checksum. (If the database has one row containing "1000001", then we would get max 10000, +1 equals 10001, checksum is 02, resulting new PK is "1000102".) Don't ask me why.
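The scheme in the footnote can be worked through in a few lines of Java. LegacyIdScheme is a hypothetical name; the sketch assumes that "substring(len - 2)" means dropping the last two characters (the checksum) and that the checksum is always written as two zero-padded digits, which matches the "02" in the example but is otherwise a guess.

```java
import java.util.List;

final class LegacyIdScheme {
    // Computes the next key from the keys currently in the PK column.
    static String nextId(List<String> existingIds) {
        long max = 0;
        for (String id : existingIds) {
            // Drop the two trailing checksum digits and parse the rest.
            long base = Long.parseLong(id.substring(0, id.length() - 2));
            max = Math.max(max, base);
        }
        long next = max + 1;

        // Sum the decimal digits of the new base number.
        int digitSum = 0;
        for (long n = next; n > 0; n /= 10) {
            digitSum += (int) (n % 10);
        }

        // Append the digit sum as a two-digit checksum (assumed zero-padded).
        return next + String.format("%02d", digitSum);
    }
}
```

With the footnote's single row "1000001", the base is 10000, the next base is 10001, the digit sum is 2, and the result is "1000102".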
A potential workaround is to generate and assign the ID in an event listener rather than using an IIdentifierGenerator implementation. The listener should implement IPreInsertEventListener and assign the ID in OnPreInsert.
Why don't you just make private string lastGenerated; static?