Soliciting feedback/options/comments regarding a "best" pattern to use for reference data in my services.
What do I mean by reference data?
Let's use Northwind as an example. An Order is related to a Customer in the database. When I implement my Orders Service, in some cases I'll want the reference a "full" Customer from an Order and other cases when I just want a reference to the Customer (for example a Key/Value pair).
For example, if I were doing a GetAllOrders(), I wouldn't want to return a fully filled out Order, I'd want to return a lightweight version of an Order with only reference data for each order's Customer. If I did a GetOrder() method, though, I'd probably want to fill in the Customer details because chances are a consumer of this method might need it. There might be other situations where I might want to ask that the Customer details be filled in during certain method calls, but left out for others.
Here is what I've come up with:
[DataContract]
public OrderDTO
{
[DataMember(Required)]
public CustomerDTO;
//etc..
}
[DataContract]
public CustomerDTO
{
[DataMember(Required)]
public ReferenceInfo ReferenceInfo;
[DataMember(Optional)]
public CustomerInfo CustomerInfo;
}
[DataContract]
public ReferenceInfo
{
[DataMember(Required)]
public string Key;
[DataMember(Required)]
public string Value;
}
[DataContract]
public CustomerInfo
{
[DataMember(Required)]
public string CustomerID;
[DataMember(Required)]
public string Name;
//etc....
}
The thinking here is that since ReferenceInfo (which is a generic Key/Value pair) is always required in CustomerDTO, I'll always have ReferenceInfo. It gives me enough information to obtain the Customer details later if needed. The downside to having CustomerDTO require ReferenceInfo is that it might be overkill when I am getting the full CustomerDTO (i.e. with CustomerInfo filled in), but at least I am guaranteed the reference info.
Is there some other pattern or framework piece I can use to make this scenario/implementation "cleaner"?
The reason I ask is that although we could simply say in Northwind to ALWAYS return a full CustomerDTO, that might work fine in the simplistic Northwind situation. In my case, I have an object that has 25-50 fields that are reference/lookup type data. Some are more important to load than others in different situations, but i'd like to have as few definitions of these reference types as possible (so that I don't get into "DTO maintenance hell").
Opinions? Feedback? Comments?
Thanks!
We're at the same decision point on our project. As of right now, we've decided to create three levels of DTOs to handle a Thing: SimpleThing, ComplexThing, and FullThing. We don't know how it'll work out for us, though, so this is not yet an answer grounded in reality.
One thing I'm wondering is if we might learn that our services are designed at the "wrong" level. For example, is there ever an instance where we should bust a FullThing apart and only pass a SimpleThing? If we do, does that imply we've inappropriately put some business logic at too high of a level?
Amazon Product Advertising API Web service is a good example of the same problem that you are experiencing.
They use different DTOs to provide callers with more or less detail depending on their circumstances. For example there is the small response group, the large response group and in the middle medium response group.
Having different DTOs is a good technique if as you say you don't want a chatty interface.
It seems like a complicated solution to me. Why not just have a customer id field in the OrderDTO class and then let the application decide at runtime whether it needs the customer data. Since it has the customer id it can pull the data down when it so decides.
I've decided against the approach I was going to take. I think much of my initial concerns were a result of a lack of requirements. I sort of expected this to be the case, but was curious to see how others might have tackled this issue of determining when to load up certain data and when not to.
I am flattening my Data Contract to contain the most used fields of reference data elements. This should work for a majority of consumers. If the supplied data is not enough for a given consumer, they'll have the option to query a separate service to pull back the full details for a particular reference entity (for example a Currency, State, etc). For simple lookups that really are basically Key/Value pairs, we'll be handling them with a generic Key/Value pair Data Contract. I might even use the KnownType attribute for my more specialized Key/Value pairs.
[DataContract]
public OrderDTO
{
[DataMember(Required)]
public CustomerDTO Customer;
//in this case, I think consumers will need currency data,
//so I pass back a full currency item
[DataMember(Required)]
public Currency Currency;
//in this case, I think consumers are not likely to need full StateRegion data,
//so I pass back a "reference" to it
//User's can call a separate service method to get full details if needed, or
[DataMember(Required)]
public KeyValuePair ShipToStateRegion;
//etc..
}
[DataContract]
[KnownType(Currency)]
public KeyValuePair
{
[DataMember(Required)]
public string Key;
[DataMember(Required)]
public string Value;
//enum consisting of all possible reference types,
//such as "Currency", "StateRegion", "Country", etc.
[DataMember(Required)]
public ReferenceType ReferenceType;
}
[DataContract]
public Currency : KeyValuePair
{
[DataMember(Required)]
public decimal ExchangeRate;
[DataMember(Required)]
public DateTime ExchangeRateAsOfDate;
}
[DataContract]
public CustomerDTO
{
[DataMember(Required)]
public string CustomerID;
[DataMember(Required)]
public string Name;
//etc....
}
Thoughts? Opinions? Comments?
We've faced this problem in object-relational mapping as well. There are situations where we want the full object and others where we want a reference to it.
The difficulty is that by baking the serialization into the classes themselves, the datacontract pattern enforces the idea that there's only one right way to serialize an object. But there are lots of scenarios where you might want to partially serialize a class and/or its child objects.
This usually means that you have to have multiple DTOs for each class. For example, a FullCustomerDTO and a CustomerReferenceDTO. Then you have to create ways to map the different DTOs back to the Customer domain object.
As you can imagine, it's a ton of work, most of it very tedious.
One other possibility is to treat the objects as property bags. Specify the properties you want when querying, and get back exactly the properties you need.
Changing the properties to show in the "short" version then won't require multiple round trips, you can get all of the properties for a set at one time (avoiding chatty interfaces), and you don't have to modify your data or operation contracts if you decide you need different properties for the "short" version.
I typically build in lazy loading to my complex web services (ie web services that send/receive entities). If a Person has a Father property (also a Person), I send just an identifier for the Father instead of the nested object, then I just make sure my web service has an operation that can accept an identifier and respond with the corresponding Person entity. The client can then call the web service back if it wants to use the Father property.
I've also expanded on this so that batching can occur. If an operation sends back 5 Persons, then if the Father property is accessed on any one of those Persons, then a request is made for all 5 Fathers with their identifiers. This helps reduce the chattiness of the web service.
Related
I have an object, let's call it a Request, that has associations to several other objects like:
Employee submitter;
Employee subjectsManager;
Employee pointOfContact;
And several value properties like strings, dates, and enums.
Now, I also need to keep track of another object, the subject, but this can be one of 3 different types of people. For simplicity let's just talk about 2 types: Employee and Consultant. Their data comes from different repositories and they have different sets of fields, some overlapping. So say an employee has a
String employeeName;
String employeeId;
String socialSecurityNumber;
Whereas a consultant has
String consultantName;
String socialSecurityNumber;
String phoneNumber;
One terrible idea is that the Request has both a Consultant and an Employee, and setSubject(Consultant) assigns one, setSubject(Employee) assigns the other. This sounds awful. One of my primary goals is to avoid "if the subject is this type then do this..." logic.
My thought is that perhaps an EmployeeRequest and a ConsultantRequest should extend Request, but I'm not sure how, say, setSubject would work. I would want it to be an abstract method in the base class but I don't know what the signature would be since I don't know what type the parameter would be.
So then it makes sense to go at it from an interface perspective. One important interface is that these Request objects will be passed to a single webservice that I don't own. I will have to map the object's fields in a somewhat complex manner that partially makes sense. For fields like name and SSN the mapping is straightforward, but many of the fields that don't line up across all types of people are dumped into a concatenated string AdditionalInfo field (wump wump). So they'll all have a getAdditionalInfo method, a getName, etc, and if there's any fields that don't line up they can do something special with that one.
So that makes me feel like the Request itself should not necessarily be subclassed but could contain a reference to an ISubjectable (or whatever) that implements the interface needed to get the values to send across the webservice. This sounds pretty decent and prevents a lot of "if the subject is an employee then do this..."
However, I would still at times need to access the additional fields that only a certain type of subject has, for example on a display or edit page, so that brings me right back to "if subject is instance of an employee then go to the edit employee page..." This may be unavoidable though and if so I'm ok with that.
Just for completeness I'll mention the "union of all possible fields" approach -- don't think I'd care to do that one either.
Is the interface approach the most sensible or am I going about it wrong? Thanks.
A generic solution comes to mind; that is, if the language you're using supports it:
class Request<T extends Subject> {
private T subject;
public void setSubject(T subject) {
this.subject = subject;
}
public T getSubject() {
return subject;
}
}
class EmployeeRequest extends Request<Employee> {
// ...
}
class ConsultantRequest extends Request<Consultant> {
// ...
}
You could similarly make the setSubject method abstract as you've described in your post, and then have separate implementations of it in your subclasses. Or you may not even need to subclass the Request class:
Request<Employee> employeeRequest = new Request<>();
employeeRequest.setSubject(/* ... */);
// ...
int employeeId = employeeRequest.getSubject().getEmployeeId();
Assume we have class Car which MAIN field is called VIN (Vehicle Identification Number). VIN gives us a lot of information such us:
owner
place of registration
country of production
year of production
color
engine type
etc. etc
I can continue and add more information:
last known GPS coordinates
fine list
is theft (boolean)
etc. etc.
It seems reasonable to store some of information (for example year of production and engine type) right inside Car object. However storing all this information right inside Car object will make it too complicated, "overloaded" and hard to manage. Moreover while application evolves I can add more and more information.
So where is the border? What should be stored inside Car object and what should be stored outside in something like Dictionary<Car, GPSCoordinates>
I think that probably I should store "static" data inside Car object so making it immutable. And store "dynamic" data in special storages.
I would use a class called CarModel for the base attributes shared by every possible car in your application (engine size, color, registration #, etc). You can then extend this class with any number of more specific subclasses like Car, RentalCar, or whatever fits your business logic.
This way you have one clear definition of what all cars share and additional definitions for the different states cars can be in (RentalCar with its unique parameters, for example).
Update:
I guess what you're looking for is something like this (although I would recommend against it):
public class Car
{
// mandatory
protected int engineSize;
protected int color;
// optional
protected Map<String, Object> attributes = new HashMap<String, Object>();
public void set(String name, Object value)
{
attributes.put(name, value);
}
public Object get(String name)
{
return attributes.get(name);
}
}
Why this is not a good solution:
Good luck trying to persist this class to a database or design anything that relies on a well known set of attributes for it.
Nightmare to debug potential problems.
Not a very good use of OOP with regard to type definitions. This can be abused to turn the Car class into something it is not.
Just because your Car class provide a property GPSCoordinates does not mean you need to hold those coordinates internally. Essentially, that's what encapsulation is all about.
And yes, you can then add properties such as "IsInGarageNow", "WasEverDrivedByMadonna" or "RecommendedOil".
In my domain I have something called Project which basically holds a lot of simple configuration propeties that describe what should happen when the project gets executed. When the Project gets executed it produces a huge amount of LogEntries. In my application I need to analyse these log entries for a given Project, so I need to be able to partially successively load a portion (time frame) of log entries from the database (Oracle). How would you model this relationship as DB tables and as objects?
I could have a Project table and ProjectLog table and have a foreign key to the primary key of Project and do the "same" thing at object level have class Project and a property
IEnumerable<LogEntry> LogEntries { get; }
and have NHibernate do all the mapping. But how would I design my ProjectRepository in this case? I could have a methods
void FillLog(Project projectToFill, DateTime start, DateTime end);
How can I tell NHibernate that it should not load the LogEntries until someone calls this method and how would I make NHibernate to load a specifc timeframe within that method?
I am pretty new to ORM, maybe that design is not optimal for NHibernate or in general? Maybe I shoul design it differently?
Instead of having a Project entity as an aggregate root, why not move the reference around and let LogEntry have a Product property and also act as an aggregate root.
public class LogEntry
{
public virtual Product Product { get; set; }
// ...other properties
}
public class Product
{
// remove the LogEntries property from Product
// public virtual IList<LogEntry> LogEntries { get; set; }
}
Now, since both of those entities are aggregate roots, you would have two different repositories: ProductRepository and LogEntryRepository. LogEntryRepository could have a method GetByProductAndTime:
IEnumerable<LogEntry> GetByProductAndTime(Project project, DateTime start, DateTime end);
The 'correct' way of loading partial / filtered / criteria-based lists under NHibernate is to use queries. There is lazy="extra" but it doesn't do what you want.
As you've already noted, that breaks the DDD model of Root Aggregate -> Children. I struggled with just this problem for an absolute age, because first of all I hated having what amounted to persistence concerns polluting my domain model, and I could never get the API surface to look 'right'. Filter methods on the owning entity class work but are far from pretty.
In the end I settled for extending my entity base class (all my entities inherit from it, which I know is slightly unfashionable these days but it does at least let me do this sort of thing consistently) with a protected method called Query<T>() that takes a LINQ expression defining the relationship and, under the hood in the repository, calls LINQ-to-NH and returns an IQueryable<T> that you can then query into as you require. I can then facade that call beneath a regular property.
The base class does this:
protected virtual IQueryable<TCollection> Query<TCollection>(Expression<Func<TCollection, bool>> selector)
where TCollection : class, IPersistent
{
return Repository.For<TCollection>().Where(selector);
}
(I should note here that my Repository implementation implements IQueryable<T> directly and then delegates the work down to the NH Session.Query<T>())
And the facading works like this:
public virtual IQueryable<Form> Forms
{
get
{
return Query<Form>(x => x.Account == this);
}
}
This defines the list relationship between Account and Form as the inverse of the actual mapped relationship (Form -> Account).
For 'infinite' collections - where there is a potentially unbounded number of objects in the set - this works OK, but it means you can't map the relationship directly in NHibernate and therefore can't use the property directly in NH queries, only indirectly.
What we really need is a replacement for NHibernate's generic bag, list and set implementations that knows how to use the LINQ provider to query into lists directly. One has been proposed as a patch (see https://nhibernate.jira.com/browse/NH-2319). As you can see the patch was not finished or accepted and from what I can see the proposer didn't re-package this as an extension - Diego Mijelshon is a user here on SO so perhaps he'll chime in... I have tested out his proposed code as a POC and it does work as advertised, but obviously it's not tested or guaranteed or necessarily complete, it might have side-effects, and without permission to use or publish it you couldn't use it anyway.
Until and unless the NH team get around to writing / accepting a patch that makes this happen, we'll have to keep resorting to workarounds. NH and DDD just have conflicting views of the world, here.
In trying to centralize how items are added, or removed from my business entity classes, I have moved to the model where all lists are only exposed as ReadOnlyCollections and I provide Add and Remove methods to manipulate the objects in the list.
Here is an example:
public class Course
{
public string Name{get; set;}
}
public class Student
{
private List<Course>_courses = new List<Course>();
public string Name{get; set;}
public ReadOnlyCollection<Course> Courses {
get{ return _courses.AsReadOnly();}
}
public void Add(Course course)
{
if (course != null && _courses.Count <= 3)
{
_courses.Add(course);
}
}
public bool Remove(Course course)
{
bool removed = false;
if (course != null && _courses.Count <= 3)
{
removed = _courses.Remove(course);
}
return removed;
}
}
Part of my objective in doing the above is to not end up with an Anemic data-model (an anti-pattern) and also avoid having the logic that adds and removes courses all over the place.
Some background: the application I am working with is an Asp.net application, where the lists used to be exposed as a list previously, which resulted in all kinds of ways in which Courses were added to the Student (some places a check was made and others the check was not made).
But my question is: is the above a good idea?
Yes, this is a good approach, in my opinion you're not doing anything than decorating your list, and its better than implementing your own IList (as you save many lines of code, even though you lose the more elegant way to iterate through your Course objects).
You may consider receiving a validation strategy object, as in the future you might have a new requirement, for ex: a new kind of student that can have more than 3 courses, etc
I'd say this is a good idea when adding/removing needs to be controlled in the manner you suggest, such as for business rule validation. Otherwise, as you know from previous code, there's really no way to ensure that the validation is performed.
The balance that you'll probably want to reach, however, is when to do this and when not to. Doing this for every collection of every kind seems like overkill. However, if you don't do this and then later need to add this kind of gate-keeping code then it would be a breaking change for the class, which may or may not be a headache at the time.
I suppose another approach could be to have a custom descendant of IList<T> which has generic gate-keeping code for its Add() and Remove() methods which notifies the system of what's happening. Something like exposing an event which is raised before the internal logic of those methods is called. Then the Student class would supply a delegate or something (sorry for being vague, I'm very coded-out today) when instantiating _courses to apply business logic to the event and cancel the operation (throw an exception, I imagine) if the business validation fails.
That could be overkill as well, depending on the developer's disposition. But at least with something a little more engineered like this you get a single generic implementation for everything with the option to add/remove business validation as needed over time without breaking changes.
I've done that in the past and regretted it: a better option is to use different classes to read domain objects than the ones you use to modify them.
For example, use a behavior-rich Student domain class that jealously guards its ownership of courses - it shouldn't expose them at all if student is responsible for them - and a StudentDataTransferObject (or ViewModel) that provides a simple list of strings of courses (or a dictionary when you need IDs) for populating interfaces.
I'd like to use for table storage an entity like this:
public class MyEntity
{
public String Text { get; private set; }
public Int32 SomeValue { get; private set; }
public MyEntity(String text, Int32 someValue)
{
Text = text;
SomeValue = someValue;
}
}
But it's not possible, because the ATS needs
Parameterless constructor
All properties public and
read/write.
Inherit from TableServiceEntity;
The first two, are two things I don't want to do. Why should I want that anybody could change some data that should be readonly? or create objects of this kind in a inconsistent way (what are .ctor's for then?), or even worst, alter the PartitionKey or the RowKey. Why are we still constrained by these deserialization requirements?
I don't like develop software in that way, how can I use table storage library in a way that I can serialize and deserialize myself the objects? I think that as long the objects inherits from TableServiceEntity it shouldn't be a problem.
So far I got to save an object, but I don't know how retrieve it:
Message m = new Message("message XXXXXXXXXXXXX");
CloudTableClient tableClient = account.CreateCloudTableClient();
tableClient.CreateTableIfNotExist("Messages");
TableServiceContext tcontext = new TableServiceContext(account.TableEndpoint.AbsoluteUri, account.Credentials);
var list = tableClient.ListTables().ToArray();
tcontext.AddObject("Messages", m);
tcontext.SaveChanges();
Is there any way to avoid those deserialization requirements or get the raw object?
Cheers.
If you want to use the Storage Client Library, then yes, there are restrictions on what you can and can't do with your objects that you want to store. Point 1 is correct. I'd expand point 2 to say "All properties that you want to store must be public and read/write" (for integer properties you can get away with having read only properties and it won't try to save them) but you don't actually have to inherit from TableServiceEntity.
TableServiceEntity is just a very light class that has the properties PartitionKey, RowKey, Timestamp and is decorated with the DataServiceKey attribute (take a look with Reflector). All of these things you can do to a class that you create yourself and doesn't inherit from TableServiceEntity (note that the casing of these properties is important).
If this still doesn't give you enough control over how you build your classes, you can always ignore the Storage Client Library and just use the REST API directly. This will give you the ability to searialize and deserialize the XML any which way you like. You will lose the all of the nice things that come with using the library, like ability to create queries in LINQ.
The constraints around that ADO.NET wrapper for the Table Storage are indeed somewhat painful. You can also adopt a Fat Entity approach as implemented in Lokad.Cloud. This will give you much more flexibility concerning the serialization of your entities.
Just don't use inheritance.
If you want to use your own POCO's, create your class as you want it and create a separate tableEntity wrapper/container class that holds the pK and rK and carries your class as a serialized byte array.
You can use composition to achieve what you want.
Create your Table Entities as you need to for storage and create your POCOs as wrappers on those providing the API you want the rest of your application code to see.
You can even mix in some interfaces for better code.
How about generating the POCO wrappers at runtime using System.Reflection.Emit http://blog.kloud.com.au/2012/09/30/a-better-dynamic-tableserviceentity/