Ad hoc data and repository pattern

Ad hoc data and repository pattern - sql

What is the recommended way to return ad hoc (custom case by case) data from repository which don't fit any model entities or which extend some?
The 101 example would be the ubiquitous hello word application: a blog system. Suppose you want to load a list of posts where post entry has some additional information which does not exists in the Post entity. Let’s say it is the number of comments and the date and time of the last comment. This would be highly trivial if one was using the plain old SQL and reading data directly from the database. How am I supposed to do it optimally using the repository pattern if I cannot afford loading the entire collections of Comments for each Post, and I want to do it in one database hit? Is there any commonly used pattern for this situation? Now imagine that you have moderately complex web application where each page needs a slightly different custom data, and loading full hierarchies is not possible (performance, memory requirements etc).
Some random ideas:
Add a list of properties to each model which could be populated by the custom data.
Subclass model entities case by case, and create custom readers for each subclass.
Use LINQ, compose ad hoc queries and read anonymous classes.
Note: I’ve asked a similar question recently, but it seemed to be too general and did not attract much attention.
Example:
Based on suggestions in answers below, I am adding a more concrete example. Here is the situation I was trying to describe:
IEnumarable<Post> posts = repository.GetPostsByPage(1);
foreach (Post post in posts)
{
// snip: push post title, content, etc. to view
// determine the post count and latest comment date
int commentCount = post.Comments.Count();
DateTime newestCommentDate = post.Comments.Max(c => c.Date);
// snip: push the count and date to view
}
If I don’t do anything extra and use an off the shelf ORM, this will result to n+1 queries or possibly one query loading all posts and comments. But optimally, I would like to be able to just execute one SQL which would return one row for each post including the post title, body etc. and the comment count and most recent comment date in the same. This is trivial in SQL. The problem is that my repository won’t be able to read and fit this type of data into the model. Where do the max dates and the counts go?
I’m not asking how to do that. You can always do it somehow: add extra methods to the repository, add new classes, special entities, use LINQ etc., but I guess my question is the following. How come the repository pattern and the proper model driven development are so widely accepted, but yet they don’t seem to address this seemingly very common and basic case.

There's a lot to this question. Are you needing this specific data for a reporting procedure? If so, then the proper solution is to have separate data access for reporting purposes. Flattened databases, views, ect.
Or is it an ad-hoc query need? If so, Ayende has a post on this very problem. http://ayende.com/Blog/archive/2006/12/07/ComplexSearchingQueryingWithNHibernate.aspx
He utilizes a "Finder" object. He's using NHibernate, so essentially what he's doing is creating a detached query.
I've done something similar in the past by creating a Query object that I can fill prior to handing it to a repository (some DDD purist will argue against it, but I find it elegant and easy to use).
The Query object implements a fluent interface, so I can write this and get the results back:
IQuery query = new PostQuery()
.WithPostId(postId)
.And()
.WithCommentCount()
.And()
.WithCommentsHavingDateLessThan(selectedDate);
Post post = _repository.Find(query);
However, in your specific case I have to wonder at your design. You are saying you can't load the comments with the post. Why? Are you just being overly worrisome about performance? Is this a case of premature optimization? (seems like it to me)
If I had a Post object it would be my aggregate root and it would come with the Comments attached. And then everything you want to do would work in every scenario.

Since we needed to urgently solve the issue I outlined in my original question, we resorted to the following solution. We added a property collection (a dictionary) to each model entity, and if the DAL needs to, it sticks custom data into to. In order to establish some kind of control, the property collection is keyed by instances of a designated class and it supports only simple data types (integers, dates, ...) which is all we need at movement, and mostly likely will ever need. A typical case which this solves is: loading an entity with counts for its subcollections instead of full populated collections. I suspect that this probably does not get any award for a software design, but it was the simplest and the most practical solution for our case.

Can't say I really see what the problem is, just firing in the air here:
Add a specific entity to encapsulate the info yo want
Add a property Comments to the Post. (I don't see why this would require you to fetch all comments - you can just fetch the comments for the particular post you're loading)
Use lazy loading to only fetch the comments when you access the property
I think you would have a greater chance of seeing your question answered if you would make platform, language and O/R mapper specific (seems to be .NET C# or VB, since you mentioned LINQ. LINQ 2 SQL? Entity framework? Something else?)

If you aren't locked into a RDBMs then a Database like CouchDB or Amazons SimpleDB might be something to look at. What you are describing is trivial in a CouchDB View. This probably doesn't really answer you specific question but sometimes it's good to look at radically different options.

For this I generally have a RepositoryStatus and a Status class that acts as my Data Transfer Object (DTO). The Status class is used in my application service layer (for the same reason) from which the RepositoryStatus inherits. Then with this class I can return error messages, response objects, etc. from the Repository layer. This class is generic in that it will accept any object in and cast it out for the receiver.
Here is the Status class:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using RanchBuddy.Core.Domain;
using StructureMap;
namespace RanchBuddy.Core.Services.Impl
{
[Pluggable("Default")]
public class Status : IStatus
{
public Status()
{
_messages = new List<string>();
_violations = new List<RuleViolation>();
}
public enum StatusTypes
{
Success,
Failure
}
private object _object;
public T GetObject<T>()
{
return (T)_object;
}
public void SetObject<T>(T Object)
{
_object = Object;
}
private List<string> _messages;
public void AddMessage(string Message)
{
_messages.Add(Message);
}
public List<string> GetMessages()
{
return _messages;
}
public void AddMessages(List<string> Messages)
{
_messages.AddRange(Messages);
}
private List<RuleViolation> _violations;
public void AddRuleViolation(RuleViolation violation)
{
_violations.Add(violation);
}
public void AddRuleViolations(List<RuleViolation> violations)
{
_violations.AddRange(violations);
}
public List<RuleViolation> GetRuleViolations()
{
return _violations;
}
public StatusTypes StatusType { get; set; }
}
}
And here is the RepositoryStatus:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using RanchBuddy.Core.Services.Impl;
using StructureMap;
namespace RanchBuddy.Core.DataAccess.Impl
{
[Pluggable("DefaultRepositoryStatus")]
public class RepositoryStatus : Status, IRepositoryStatus
{
}
}
As you can see the RepositoryStatus doesn't yet do anything special and just relies on the Status objects utilities. But I wanted to reserve the right to extend at a later date!
I am sure that some of the die-hards out there will state that this should not be used if you are to be a prueist...however I know your pain in that sometimes you need to pass out more than just a returned object!

Related

Mapping of Interface is Not Supported, But Linq-Sql Object Already Implements Property

So, I created a DataContext (Linq-Sql) in VS from an existing database. It has a table called Users, thus I have a User object. In particular, I want to focus on the UserID and Username properties.
Now, I have an interface:
interface IUser
{
int Id { get; }
string Username { get; }
}
I want to create a partial User class and implement IUser. The reason for this is so that I can treat any User as an IUser in many places and not be concerned about the precise User class:
public partial class User : IUser
{
public int Id
{
get { return UserID; }
}
}
I don't implement the Username get property because I know that the entity object already implements it.
When I have a query like dc.Users.SingleOrDefault(p => p.Id == 5); I know that it's an error because it'll translate that call to an SQL statement and it's going to try to find the Id column, which doesn't exist - UserID exists. So I understand this mapping issue.
When I query dc.Users.SingleOrDefault(p => p.Username == "admin"), it also throws an error, BUT Username IS indeed an existing column in the database, so my impression is that no custom/additional mapping needs to take place. What am I missing?
Can someone point me to a good source on how to combat Linq vs. partial classes implement a custom interface?
Update Question:
Before I try it, does anyone know if "rigging" the datacontext.designer.cs file with our custom interfaces (to implement to the classes themselves instead of in a separate partial class file) will work? Is there a consequence of doing this?

I've come across this before using Generics and LINQ, and the way I solved it was to change p.Id == 5 to p.Id.Equals(5) and LINQ was able to map the query.
In regards to rigging autogenerated code, I have done this in my projects, the only annoyance is having to type all the interfaces again if you regenerate your DBML file. I looked in to dynamically adding interfaces to classes and found this SO post, but I haven't tried it out yet:
What is the nicest way to dynamically implement an interface in C#?
Either way, re-typing is a much better trade off for us right now as we've been able to remove a lot of duplication in our implementation code with this method.
Unfortunately I'm not experienced enough with LINQ or .NET to explain why Equals() works when == does not :)

domain design with nhibernate

In my domain I have something called Project which basically holds a lot of simple configuration propeties that describe what should happen when the project gets executed. When the Project gets executed it produces a huge amount of LogEntries. In my application I need to analyse these log entries for a given Project, so I need to be able to partially successively load a portion (time frame) of log entries from the database (Oracle). How would you model this relationship as DB tables and as objects?
I could have a Project table and ProjectLog table and have a foreign key to the primary key of Project and do the "same" thing at object level have class Project and a property
IEnumerable<LogEntry> LogEntries { get; }
and have NHibernate do all the mapping. But how would I design my ProjectRepository in this case? I could have a methods
void FillLog(Project projectToFill, DateTime start, DateTime end);
How can I tell NHibernate that it should not load the LogEntries until someone calls this method and how would I make NHibernate to load a specifc timeframe within that method?
I am pretty new to ORM, maybe that design is not optimal for NHibernate or in general? Maybe I shoul design it differently?

Instead of having a Project entity as an aggregate root, why not move the reference around and let LogEntry have a Product property and also act as an aggregate root.
public class LogEntry
{
public virtual Product Product { get; set; }
// ...other properties
}
public class Product
{
// remove the LogEntries property from Product
// public virtual IList<LogEntry> LogEntries { get; set; }
}
Now, since both of those entities are aggregate roots, you would have two different repositories: ProductRepository and LogEntryRepository. LogEntryRepository could have a method GetByProductAndTime:
IEnumerable<LogEntry> GetByProductAndTime(Project project, DateTime start, DateTime end);

The 'correct' way of loading partial / filtered / criteria-based lists under NHibernate is to use queries. There is lazy="extra" but it doesn't do what you want.
As you've already noted, that breaks the DDD model of Root Aggregate -> Children. I struggled with just this problem for an absolute age, because first of all I hated having what amounted to persistence concerns polluting my domain model, and I could never get the API surface to look 'right'. Filter methods on the owning entity class work but are far from pretty.
In the end I settled for extending my entity base class (all my entities inherit from it, which I know is slightly unfashionable these days but it does at least let me do this sort of thing consistently) with a protected method called Query<T>() that takes a LINQ expression defining the relationship and, under the hood in the repository, calls LINQ-to-NH and returns an IQueryable<T> that you can then query into as you require. I can then facade that call beneath a regular property.
The base class does this:
protected virtual IQueryable<TCollection> Query<TCollection>(Expression<Func<TCollection, bool>> selector)
where TCollection : class, IPersistent
{
return Repository.For<TCollection>().Where(selector);
}
(I should note here that my Repository implementation implements IQueryable<T> directly and then delegates the work down to the NH Session.Query<T>())
And the facading works like this:
public virtual IQueryable<Form> Forms
{
get
{
return Query<Form>(x => x.Account == this);
}
}
This defines the list relationship between Account and Form as the inverse of the actual mapped relationship (Form -> Account).
For 'infinite' collections - where there is a potentially unbounded number of objects in the set - this works OK, but it means you can't map the relationship directly in NHibernate and therefore can't use the property directly in NH queries, only indirectly.
What we really need is a replacement for NHibernate's generic bag, list and set implementations that knows how to use the LINQ provider to query into lists directly. One has been proposed as a patch (see https://nhibernate.jira.com/browse/NH-2319). As you can see the patch was not finished or accepted and from what I can see the proposer didn't re-package this as an extension - Diego Mijelshon is a user here on SO so perhaps he'll chime in... I have tested out his proposed code as a POC and it does work as advertised, but obviously it's not tested or guaranteed or necessarily complete, it might have side-effects, and without permission to use or publish it you couldn't use it anyway.
Until and unless the NH team get around to writing / accepting a patch that makes this happen, we'll have to keep resorting to workarounds. NH and DDD just have conflicting views of the world, here.

Business Entity - should lists be exposed only as ReadOnlyCollections?

In trying to centralize how items are added, or removed from my business entity classes, I have moved to the model where all lists are only exposed as ReadOnlyCollections and I provide Add and Remove methods to manipulate the objects in the list.
Here is an example:
public class Course
{
public string Name{get; set;}
}
public class Student
{
private List<Course>_courses = new List<Course>();
public string Name{get; set;}
public ReadOnlyCollection<Course> Courses {
get{ return _courses.AsReadOnly();}
}
public void Add(Course course)
{
if (course != null && _courses.Count <= 3)
{
_courses.Add(course);
}
}
public bool Remove(Course course)
{
bool removed = false;
if (course != null && _courses.Count <= 3)
{
removed = _courses.Remove(course);
}
return removed;
}
}
Part of my objective in doing the above is to not end up with an Anemic data-model (an anti-pattern) and also avoid having the logic that adds and removes courses all over the place.
Some background: the application I am working with is an Asp.net application, where the lists used to be exposed as a list previously, which resulted in all kinds of ways in which Courses were added to the Student (some places a check was made and others the check was not made).
But my question is: is the above a good idea?

Yes, this is a good approach, in my opinion you're not doing anything than decorating your list, and its better than implementing your own IList (as you save many lines of code, even though you lose the more elegant way to iterate through your Course objects).
You may consider receiving a validation strategy object, as in the future you might have a new requirement, for ex: a new kind of student that can have more than 3 courses, etc

I'd say this is a good idea when adding/removing needs to be controlled in the manner you suggest, such as for business rule validation. Otherwise, as you know from previous code, there's really no way to ensure that the validation is performed.
The balance that you'll probably want to reach, however, is when to do this and when not to. Doing this for every collection of every kind seems like overkill. However, if you don't do this and then later need to add this kind of gate-keeping code then it would be a breaking change for the class, which may or may not be a headache at the time.
I suppose another approach could be to have a custom descendant of IList<T> which has generic gate-keeping code for its Add() and Remove() methods which notifies the system of what's happening. Something like exposing an event which is raised before the internal logic of those methods is called. Then the Student class would supply a delegate or something (sorry for being vague, I'm very coded-out today) when instantiating _courses to apply business logic to the event and cancel the operation (throw an exception, I imagine) if the business validation fails.
That could be overkill as well, depending on the developer's disposition. But at least with something a little more engineered like this you get a single generic implementation for everything with the option to add/remove business validation as needed over time without breaking changes.

I've done that in the past and regretted it: a better option is to use different classes to read domain objects than the ones you use to modify them.
For example, use a behavior-rich Student domain class that jealously guards its ownership of courses - it shouldn't expose them at all if student is responsible for them - and a StudentDataTransferObject (or ViewModel) that provides a simple list of strings of courses (or a dictionary when you need IDs) for populating interfaces.

Is this a ddd anti-pattern?

Is it a violation of the Persistance igorance to inject a repository interface into a Entity object Like this. By not using a interface I clearly see a problem but when using a interface is there really a problem? Is the code below a good or bad pattern and why?
public class Contact
{
private readonly IAddressRepository _addressRepository;
public Contact(IAddressRepository addressRepository)
{
_addressRepository = addressRepository;
}
private IEnumerable<Address> _addressBook;
public IEnumerable<Address> AddressBook
{
get
{
if(_addressBook == null)
{
_addressBook = _addressRepository.GetAddresses(this.Id);
}
return _addressBook;
}
}
}

It's not exactly a good idea, but it may be ok for some limited scenarios. I'm a little confused by your model, as I have a hard time believing that Address is your aggregate root, and therefore it wouldn't be ordinary to have a full-blown address repository. Based on your example, you probably are actually using a table data gateway or dao rather than a respository.
I prefer to use a data mapper to solve this problem (an ORM or similar solution). Basically, I would take advantage of my ORM to treat address-book as a lazy loaded property of the aggregate root, "Contact". This has the advantage that your changes can be saved as long as the entity is bound to a session.
If I weren't using an ORM, I'd still prefer that the concrete Contact repository implementation set the property of the AddressBook backing store (list, or whatever). I might have the repository set that enumeration to a proxy object that does know about the other data store, and loads it on demand.

You can inject the load function from outside. The new Lazy<T> type in .NET 4.0 comes in handy for that:
public Contact(Lazy<IEnumerable<Address>> addressBook)
{
_addressBook = addressBook;
}
private Lazy<IEnumerable<Address>> _addressBook;
public IEnumerable<Address> AddressBook
{
get { return this._addressBook.Value; }
}
Also note that IEnumerable<T>s might be intrinsically lazy anyhow when you get them from a query provider. But for any other type you can use the Lazy<T>.

Normally when you follow DDD you always operate with the whole aggregate. The repository always returns you a fully loaded aggregate root.
It doesn't make much sense (in DDD at least) to write code as in your example. A Contact aggregate will always contain all the addresses (if it needs them for its behavior, which I doubt to be honest).
So typically ContactRepository supposes to construct you the whole Contact aggregate where Address is an entity or, most likely, a value object inside this aggregate.
Because Address is an entity/value object that belongs to (and therefore managed by) Contact aggregate it will not have its own repository as you are not suppose to manage entities that belong to an aggregate outside this aggregate.
Resume: always load the whole Contact and call its behavior method to do something with its state.

Since its been 2 years since I asked the question and the question somewhat misunderstood I will try to answer it myself.
Rephrased question:
"Should Business entity classes be fully persistance ignorant?"
I think entity classes should be fully persistance ignorant, because you will instanciate them many places in your code base so it will quickly become messy to always have to inject the Repository class into the entity constructor, neither does it look very clean. This becomes even more evident if you are in need of injecting several repositories. Therefore I always use a separate handler/service class to do the persistance jobs for the entities. These classes are instanciated far less frequently and you usually have more control over where and when this happens. Entity classes are kept as lightweight as possible.
I now always have 1 Repository pr aggregate root and if I have need for some extra business logic when entities are fetched from repositories I usually create 1 ServiceClass for the aggregate root.
By taking a tweaked example of the code in the question as it was a bad example I would do it like this now:
Instead of:
public class Contact
{
private readonly IContactRepository _contactRepository;
public Contact(IContactRepository contactRepository)
{
_contactRepository = contactRepository;
}
public void Save()
{
_contactRepository.Save(this);
}
}
I do it like this:
public class Contact
{
}
public class ContactService
{
private readonly IContactRepository _contactRepository;
public ContactService(IContactRepository contactRepository)
{
_contactRepository = contactRepository;
}
public void Save(Contact contact)
{
_contactRepository.Save(contact);
}
}

Using NHibernate Collection Filters with DDD collections

I am trying to map a domain model in NHibernate. The domain model is implemented with what I think is DDD style. The mapping works mostly but then when I try to use a collection filter on an a collection I get an exception which says: The collection was unreferenced.
I know the problem comes from how I've implemented the collection. My question: Is it possible to use collection filters in nHibernate on collections implemented this way or should I just forget it, i.e. nHibernate cannot work with this.
The code is as follows:
Person
{
IList<Address> _addresses = new List<Address>();
public string FirstName {get; set;}
...
public void addAddress(Address address)
{
// ... do some checks or validation
_addresses.Add(address);
}
public void removeAddress(Address address) {...}
public ReadOnlyCollection<Address> Addresses
{
get { return new ReadOnlyCollection<Address>(_addresses); }
}
}
The main issue is that I don't want to expose the internal addresses collection publicly.
Every other thing works, I use the field.camelcase-underscore access so nHibernate interacts directly with the field. I've been working through the Hibernate in Action book, an now I'm in chapter 7 where it deals with collection filters.
Is there any way around this. I've got it to work by exposing the internal collection like this:
public ReadOnlyCollection<Address> Addresses
{
get { return _addresses; }
}
but I really dont want to do this.
Help would really be appreciated.
Jide

If I recall correctly - NHibernate filter works as additional clause in sql queries to reduce returned rows from db.
My question to You is - why do You need that?
I mean - how much addresses one person might have? 1? 5? 10?
About collection isolation...
I myself just accept it as a sacrifice for NHibernate (just like argument-less ctor's and "virtual`ity") and use exposed IList everywhere (with private setters) just to reduce technical complexity. Their contents surely can be modified from outside, but I just don't do that.
It's more important to keep code easily understandable than making it super safe. Safety will follow.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas