DDD Aggregates: Entity holding identifier to Non-Root Entity in another Aggregate

I'm trying to understand best practice for relationships between entities and aggregates.
Imagine a design where you have a Product Aggregate, made up of two entities:
Aggregate Root: Product
Child Entity: Sku
Where a product can have many Skus. The part numbers and names of the Skus and Product are tied by an invariant: changing the name of one must transactionally ensure the other is updated. Likewise, the product aggregate needs to ensure there are never duplicate Skus.
We have another Aggregate, StorageLocation, where one or more Skus are stored. However, it's important that the StorageLocation knows the specific Sku it's storing, i.e. a StorageLocation in Canada should store the Sku local to that country, and not a Sku intended for the US market.
This implies to me that the StorageLocation needs to keep a reference to the Sku (as a reference to the Product Aggregate Root by itself does not provide enough information to determine the Sku being Stored).
From my reading, this seems to break the principle that an Aggregate should not hold a reference to a non-root entity in another Aggregate. So my questions are:
Is holding just an identifier to the Product and Sku in the StorageLocation aggregate acceptable?
I'm aware that a transient reference is deemed allowable, but in this instance (at least as far as I can tell) this information needs to be stored. As mentioned, storing a reference to the Product (or ProductId) is not enough.
Product and Skus have natural identifiers (Part Number, Sku Number). Does this provide greater flexibility for storing these values in the StorageLocation aggregate, as they have meaning beyond technical implications?
Am I approaching this the wrong way, and do I need to look at things differently? I often find it difficult to break out of a PK / FK mindset.
Thanks

The guidance not to store a reference to a child entity is founded on good principles, but I believe it often causes confusion. The actual goal is that the child entity need not have a 'globally unique' identifier, and that no repository gives direct access to the child entity using such an identifier.
However, if your StorageLocation holds the globally unique identifier for the Product, as well as a possibly locally unique identifier for the SKU, then there is nothing wrong with patterns such as:
var storageLocation = _storageLocationRepository.Get(id);
var product = _productRepository.Get(storageLocation.ProductId);
product.DoSomethingToSku(storageLocation.SkuId);
The key is that by always obtaining the product from a repository and then interacting with the child entity via the product, you ensure the product has the opportunity to protect its own invariants.
So to summarise:
Feel free to store an identifier to a child entity as long as you also store the global identifier for its aggregate root.
In this case, the child entity id may be locally unique or globally unique; it doesn't matter as long as access to the child entity is via the aggregate root.
Never load a child entity directly from a repository and call methods on it, as the aggregate root then has no chance to protect its invariants.
In fact, you should never even have a repository for a child entity.
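The summary above can be sketched in Python as follows. This is only an illustration under assumptions: the class names mirror the question, but the in-memory ProductRepository, the add_sku and rename_sku methods, and all ids are hypothetical and not part of the original answer.

```python
from dataclasses import dataclass, field


@dataclass
class Sku:
    """Child entity: its id only needs to be unique within one Product."""
    sku_id: str
    name: str


@dataclass
class Product:
    """Aggregate root: every change to a Sku goes through here."""
    product_id: str
    skus: dict = field(default_factory=dict)

    def add_sku(self, sku_id: str, name: str) -> None:
        # Invariant from the question: no duplicate Skus within a product.
        if sku_id in self.skus:
            raise ValueError(f"duplicate sku {sku_id}")
        self.skus[sku_id] = Sku(sku_id, name)

    def rename_sku(self, sku_id: str, new_name: str) -> None:
        # The root looks up its own child, so it could check rules first.
        self.skus[sku_id].name = new_name


@dataclass(frozen=True)
class StorageLocation:
    """Separate aggregate: stores identifiers only, never a Sku object."""
    location_id: str
    product_id: str
    sku_id: str


class ProductRepository:
    """Hypothetical in-memory repository; there is no SkuRepository."""

    def __init__(self):
        self._products = {}

    def save(self, product: Product) -> None:
        self._products[product.product_id] = product

    def get(self, product_id: str) -> Product:
        return self._products[product_id]


products = ProductRepository()
widget = Product("P-100")
widget.add_sku("CA", "Widget (Canada)")
widget.add_sku("US", "Widget (US)")
products.save(widget)

location = StorageLocation("LOC-1", product_id="P-100", sku_id="CA")

# Load the root by its global id, then reach the child through it.
product = products.get(location.product_id)
product.rename_sku(location.sku_id, "Widget (Canada, v2)")
```

The StorageLocation keeps both ids as plain values, so the Sku id can stay locally unique and the Product remains the only way in.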

Is holding just an identifier to the Product and Sku in the StorageLocation aggregate acceptable?
Yes. Identifiers are value objects in DDD, so storing one in the StorageLocation aggregate does not break the rule against holding a reference to another aggregate's child entity: the identifier is just a value and has no direct association with the originating aggregate.
I'm aware that a transient reference is deemed allowable, but in this instance (at least as far as I can tell) this information needs to be stored. As mentioned, storing a reference to the Product (or ProductId) is not enough.
The dehydration of the StorageLocation aggregate root back to the db should include everything that needs to be persisted. The infrastructure layer determines how your domain objects are stored in the physical repository and could be a completely different model design than your domain model depending on the concerns with the persistence technology.
Product and Skus have natural identifiers (Part Number, Sku Number). Does this provide greater flexibility for storing these values in the StorageLocation aggregate as they have meaning beyond technical implications.
There is nothing to compare "greater flexibility" to because you would only use natural identifiers or transient references for referring to entities inside the Product aggregate from the StorageLocation aggregate.
Am I approaching this the wrong way and need to look at things differently. I often find it difficult to break out of a PK / FK mindset.
Keep in mind that persistence layer concerns should not become entangled with the domain model, and that the intent is to make the domain model clear and synonymous with current business rules.

Related

Class Diagram: In A Composition Relationship Should a Child Class Always Have An ID Field?

I'm having a hard time converting my database tables and foreign keys to a class diagram with classes and associations.
My question is:
"In a composition relationship, should a child class always have an ID field?"
In my class diagram, there are two component classes, PurchaseItem and PurchaseFinisher, which compose the Purchase class. PurchaseItem already comes with an ID field from its table, but PurchaseFinisher doesn't, because it is identified by the id_purchase and id_payment_method foreign keys.
Thanks in advance.
This is my DB diagram:
I can't see redundancy in between Purchase or Product, as you said. Could you, please, show me that based on my DB diagram? My tables are well modeled (hope so). My fault is in the classes definition.
In a class diagram, no class requires an id property: each class instance (aka object) has its own identity with or without explicit id property.
In a database, you need of course an explicit id property to uniquely identify the object among others in the database and find it back. By the way, you may annotate such properties with a trailing {id} . UML does not define any semantic for it, but it is in general sufficiently expressive to help database designers.
In the case of composition, the main question is whether a composed object can easily be identified by alternate means. There are several related ORM database techniques, for example:
you can use the owning object’s id together with another property if this is sufficient to identify the element. The two together would make a composite primary key in database.
you can use a unique id to identify the object (surrogate primary key) and use the id of the owning object as foreign key.
For PurchaseItem you have everything that is needed, although the diagram does not tell which of the two approaches you’ll use (e.g. is the id unique globally, or unique within the purchase?).
But for PurchaseFinisher it is unclear whether you could uniquely identify an occurrence. If a payment method can only be used once per purchase, it’s fine, as the payment method may be used to identify the object.
If it were allowed to pay the same amount twice (say, half of the overall price each time) in the same currency with the same payment method, you’d have indistinguishable duplicates. So some kind of identifier would be needed from the database point of view.
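The two keying approaches can be compared in a small sketch using Python's built-in sqlite3 module. The table names follow the question's classes and the id_purchase and id_payment_method columns come from the question; the amount and description columns and all values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE purchase (id INTEGER PRIMARY KEY);

-- Approach 1: owner id plus another property form a composite key.
-- This only works if a payment method is used at most once per purchase.
CREATE TABLE purchase_finisher (
    id_purchase       INTEGER NOT NULL REFERENCES purchase(id),
    id_payment_method INTEGER NOT NULL,
    amount            NUMERIC NOT NULL,
    PRIMARY KEY (id_purchase, id_payment_method)
);

-- Approach 2: surrogate key, with the owner id as a plain foreign key.
CREATE TABLE purchase_item (
    id          INTEGER PRIMARY KEY,
    id_purchase INTEGER NOT NULL REFERENCES purchase(id),
    description TEXT NOT NULL
);
""")

conn.execute("INSERT INTO purchase (id) VALUES (1)")
conn.execute("INSERT INTO purchase_finisher VALUES (1, 10, 50)")

# A second payment with the same method on the same purchase is rejected.
try:
    conn.execute("INSERT INTO purchase_finisher VALUES (1, 10, 50)")
    duplicate_allowed = True
except sqlite3.IntegrityError:
    duplicate_allowed = False

# With a surrogate key, two otherwise identical rows stay distinguishable.
conn.execute("INSERT INTO purchase_item (id_purchase, description) VALUES (1, 'pen')")
conn.execute("INSERT INTO purchase_item (id_purchase, description) VALUES (1, 'pen')")
item_ids = [row[0] for row in conn.execute("SELECT id FROM purchase_item ORDER BY id")]
```

If paying twice with the same method must be possible, the composite key on purchase_finisher is too strict and the surrogate-key layout is the one to use.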

Should uniqueness be ignored when deciding if something is an Entity or Value Object?

Is uniqueness considered a persistence concern in DDD?
The reason I ask is because I have a Customer object in an order quoting context. e.g. an order is for a customer and the customer must pay a certain rate.
Technically, I won't allow a customer to have the same code or name as another. This means that if I have two Customer objects with the same code and name, they'll always be treated the same, like a value object.
But instinctively, a Customer feels like an entity. Is the unique constraint throwing me off, or am I right to think it's a value object?
The order quoting context will also allow customers to be added/edited/removed from an admin page. Could the confusion be caused by this? Should admin pages be part of another context where Customer is an entity, and the order quoting context will use Customer as a value object?
This is an excellent question, and you have partially answered it already: your Customer is an Entity in your administration bounded context.
A good rule of thumb for deciding whether an object is an entity is to think in terms of identity. If your object requires an identity that stays the same over time, even if the person's name or contact details change, then it's likely an Entity.
However, with that concept defined, you can have a CustomerId composed of what is invariant from a business point of view: in your case, the code and name.
This CustomerId is not a technical ID, it's a business ID, and your entity's identity will be this ID. In the order quoting bounded context, you can then reference your Customer using this same object (probably defined somewhere in a shared context, or by duplicating the code: it's OK in DDD to duplicate some code to promote loose coupling).
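A minimal Python sketch of that idea, assuming a CustomerId built from code and name as suggested above. The Customer and Quote classes, the contact_email and rate fields, and the sample values are hypothetical and only serve the illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerId:
    """Business identity: a value object, compared by code and name."""
    code: str
    name: str


@dataclass(eq=False)
class Customer:
    """Entity in the administration context; identity is the CustomerId."""
    customer_id: CustomerId
    contact_email: str  # hypothetical detail that may change over time

    def __eq__(self, other):
        # Two Customer objects are the same entity if their ids match,
        # regardless of their other attributes.
        return isinstance(other, Customer) and self.customer_id == other.customer_id

    def __hash__(self):
        return hash(self.customer_id)


@dataclass(frozen=True)
class Quote:
    """Order quoting context: refers to the customer by id only."""
    customer_id: CustomerId
    rate: float


acme = CustomerId("C-001", "Acme")
before = Customer(acme, "old@example.com")
after = Customer(CustomerId("C-001", "Acme"), "new@example.com")
quote = Quote(customer_id=acme, rate=12.5)
```

Here before and after compare equal even though the contact details differ, while the quoting context never needs more than the CustomerId value.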

How do I structure a generic item that can have a relationship with different tables?

In my example, I have a watch, which is an indication a user wants notifications about events on a different item, say a group and an organization.
I see two ways to do this:
Have a groupwatch resource, with a groupwatch table, with id,user,group (group FK to group resource and table); and an orgwatch resource, with an orgwatch table, with id,user,organization (org FK to organization resource and table)
Have a generic watch resource, with a watch table, with id,user,type,typeid. type is one of group or organization, and typeid is the ID of the group or organization being watched.
Since both of them are watches, it seems a waste to have two different tables and resources to watch 2 different objects. It gets worse if I start watching 4, 5, 6, 20, 50 different types of resources.
On the other hand, a foreign key relationship appears impossible if I just have a generic typeid, which means that my database (if relational) and my framework (activerecord or anything else) cannot enforce it correctly.
How do I best implement this type of "association to different types of record/table for each record in my table"?
UPDATE:
Are my only choices for doing this:
separate tables/resources for each watch type, which enables the database to enforce relational integrity and do joins
single table for all watches, but I will have to enforce relational integrity and do joins at the app level?
If you add a new type of resource once every six months, you may want to define your tables in such a way that adding new resources involves changing data definitions. If you add a new resource type every week, you may want to make your data definitions stay the same when you add new types. There's a downside to either choice.
If you do choose to define tables in such a way that the types are visible in the table structure, there are two patterns often used with type/subtype (aka class/subclass) situations.
One pattern has been called "single table inheritance". Put data about all the types in a single table, and leave some columns NULL wherever they do not apply.
Another pattern has been called "class table inheritance". Define one table for the superclass, with all the data that is common to all the types. Then define tables for each subtype (subclass) to contain class specific data. Make the primary key of the subtype tables a duplicate of the primary key in the supertype table, and also declare it as a foreign key that references the primary key of the supertype table. It's going to be up to the app, at insert time, to replicate the value of the primary key in the supertype table over in the subtype table.
I like Fowler's treatment of these two patterns.
http://martinfowler.com/eaaCatalog/classTableInheritance.html
http://www.martinfowler.com/eaaCatalog/singleTableInheritance.html
This matter of sharing primary keys has a few beneficial effects.
First, it enforces the one-to-one nature of the IS-A relationships.
Second, it makes it easy to find out whether a given entry belongs to a desired subtype, by just joining with the subtype table. You don't really need an extra type field.
Third, it speeds up the joins, because of the index that gets built when you declare a primary key.
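As one possible reading of class table inheritance applied to the watch example, here is a sketch using Python's built-in sqlite3 module. The table and column names (watch, group_watch, org_watch, grp, user_id) and the sample rows are assumptions, not taken from the question's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE grp (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE organization (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Supertype: the data every watch has in common.
CREATE TABLE watch (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL
);

-- Subtypes: the primary key is also a foreign key to watch.id.
CREATE TABLE group_watch (
    watch_id INTEGER PRIMARY KEY REFERENCES watch(id),
    group_id INTEGER NOT NULL REFERENCES grp(id)
);
CREATE TABLE org_watch (
    watch_id        INTEGER PRIMARY KEY REFERENCES watch(id),
    organization_id INTEGER NOT NULL REFERENCES organization(id)
);
""")

conn.execute("INSERT INTO grp VALUES (1, 'chess club')")
conn.execute("INSERT INTO organization VALUES (1, 'acme')")

# The application copies the supertype key into the subtype row.
watch_id = conn.execute("INSERT INTO watch (user_id) VALUES (42)").lastrowid
conn.execute("INSERT INTO group_watch VALUES (?, ?)", (watch_id, 1))

# Joining with a subtype table selects only watches of that subtype.
group_watches = conn.execute(
    "SELECT w.id, w.user_id, g.name FROM watch w "
    "JOIN group_watch gw ON gw.watch_id = w.id "
    "JOIN grp g ON g.id = gw.group_id"
).fetchall()

# A watch on a group that does not exist is rejected by the database.
other_id = conn.execute("INSERT INTO watch (user_id) VALUES (43)").lastrowid
try:
    conn.execute("INSERT INTO group_watch VALUES (?, ?)", (other_id, 999))
    dangling_allowed = True
except sqlite3.IntegrityError:
    dangling_allowed = False
```

Each new watchable type still needs a new subtype table, which is the trade-off described above, but the foreign keys stay enforceable by the database.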
If you want a structure that can adapt to new attributes without changing data definitions, you can look into E-A-V design. Be careful, though. Sometimes this results in data that is nearly impossible to use, because the logical structure is so obscure. I usually think of E-A-V as an anti-pattern for this reason, although there are some who really like the results they get from it.

How do I represent this model in tables?

I have a table of warehouses and a table of clients to manage several warehouses belonging to different clients
warehouse
=====
id
address
capacity
owner_client
client
=====
id
name
My issue is, I have an ACME client, and ACME has an "ACME safety rating" attribute only applicable to their warehouses. Currently we just have this as a field of warehouses and it's null for non-ACME warehouses. But this feels wrong and has required some workarounds and special cases.
What's the best way to represent this? I've thought of making an "Acme safety ratings" table with the number and FK to the warehouse, but now I've made a table specific for one client? What if we need to start tracking "is_foobar_accesible" for the baz client?
The relationally pure way to do this would be to implement your initial suggestion, i.e. have a separate table such as ACME_WAREHOUSES that holds the attributes, such as SAFETY_RATING, that are only applicable to this client. A different CLIENT_WAREHOUSES table would be created for each client that has its own attributes. In this way you could use standard database constraint functionality to ensure the integrity of the data in these tables.
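A small sketch of such a per-client table, using Python's built-in sqlite3 module. The column types, the rating column spelled SAFETY_RATING, the 1 to 5 rating range and the sample rows are assumptions chosen only to show a constraint at work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE WAREHOUSES (
    ID       INTEGER PRIMARY KEY,
    ADDRESS  TEXT,
    CAPACITY INTEGER
);
-- One row per ACME warehouse; other warehouses simply have no row here.
CREATE TABLE ACME_WAREHOUSES (
    WAREHOUSE_ID  INTEGER PRIMARY KEY REFERENCES WAREHOUSES(ID),
    SAFETY_RATING INTEGER NOT NULL CHECK (SAFETY_RATING BETWEEN 1 AND 5)
);
""")
conn.execute("INSERT INTO WAREHOUSES VALUES (1, '1 Main St', 500)")
conn.execute("INSERT INTO WAREHOUSES VALUES (2, '2 Side St', 300)")
conn.execute("INSERT INTO ACME_WAREHOUSES VALUES (1, 4)")

# An out-of-range rating is rejected by the CHECK constraint.
try:
    conn.execute("INSERT INTO ACME_WAREHOUSES VALUES (2, 99)")
    bad_rating_allowed = True
except sqlite3.IntegrityError:
    bad_rating_allowed = False

# LEFT JOIN shows the rating where it exists and NULL elsewhere.
ratings = conn.execute(
    "SELECT w.ID, a.SAFETY_RATING FROM WAREHOUSES w "
    "LEFT JOIN ACME_WAREHOUSES a ON a.WAREHOUSE_ID = w.ID ORDER BY w.ID"
).fetchall()
```

The main table carries no client-specific nullable column, and the rating is typed and range-checked by the database itself.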
Another method would be to add a series of nullable columns to the WAREHOUSES table such as ACME_SAFETY_RATING and BAZ_FOOBAR_ACCESSIBLE. This is not relationally pure as it means null values can exist in this table. However, you can still use standard database functionality to ensure the integrity of the data. It can be a bit more convoluted if certain values are mandatory in certain situations. Also, if there are many clients with many differing attributes the number of columns in the table can become unwieldy.
Another method is the Entity-Attribute-Value model. Generally, this is to be avoided if at all possible. It is not relationally pure, as your column values are now no longer defined over domains, and it is extremely difficult, if not impossible, to ensure the integrity of the data. Any real attempt to do so will require a lot of bespoke coding (which needs to be carefully implemented to cater for things like concurrency control that database constraints give you for free) as you cannot use standard database constraints. However, if you are just interested in storing values for information and not doing anything with them you could use this method.
The EAV method does have a danger that because it appears so easy to add attributes to an entity, it becomes the default way of doing so. It is then used to add attributes for which vital processing is dependent and, because you cannot ensure the integrity of the data using this method, you find the values being used are meaningless and the whole logical basis for the processing is destroyed.
I would create a ClientProperty and ClientWarehousePropertyValue table so that you can store these Client owned properties and their values for each warehouse:
ClientProperty
===============
ID
ClientID
Name
ClientWarehousePropertyValue
============================
WarehouseID
ClientPropertyID
Value
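The two tables above could be created and queried roughly as follows, sketched with Python's built-in sqlite3 module. The column types, the UNIQUE constraint, the composite primary key and the sample rows are assumptions added for illustration; only the table and column names come from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE client (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE warehouse (
    id           INTEGER PRIMARY KEY,
    address      TEXT,
    capacity     INTEGER,
    owner_client INTEGER NOT NULL REFERENCES client(id)
);
CREATE TABLE ClientProperty (
    ID       INTEGER PRIMARY KEY,
    ClientID INTEGER NOT NULL REFERENCES client(id),
    Name     TEXT NOT NULL,
    UNIQUE (ClientID, Name)          -- assumption: one definition per name
);
CREATE TABLE ClientWarehousePropertyValue (
    WarehouseID      INTEGER NOT NULL REFERENCES warehouse(id),
    ClientPropertyID INTEGER NOT NULL REFERENCES ClientProperty(ID),
    Value            TEXT,           -- untyped: every value is stored as text
    PRIMARY KEY (WarehouseID, ClientPropertyID)
);
""")
conn.execute("INSERT INTO client VALUES (1, 'ACME')")
conn.execute("INSERT INTO warehouse VALUES (1, '1 Main St', 500, 1)")
conn.execute("INSERT INTO ClientProperty VALUES (1, 1, 'ACME safety rating')")
conn.execute("INSERT INTO ClientWarehousePropertyValue VALUES (1, 1, '4')")

# Only return properties defined by the client that owns the warehouse.
rows = conn.execute(
    "SELECT w.address, p.Name, v.Value "
    "FROM ClientWarehousePropertyValue v "
    "JOIN ClientProperty p ON p.ID = v.ClientPropertyID "
    "JOIN warehouse w ON w.id = v.WarehouseID "
    "WHERE w.owner_client = p.ClientID"
).fetchall()
```

Note that Value is a single text column, so this layout shares the typing and integrity limits of the Entity-Attribute-Value model discussed earlier in the thread.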

How to model a mutually exclusive relationship in SQL Server

I have to add functionality to an existing application and I've run into a data situation that I'm not sure how to model. I am being restricted to the creation of new tables and code. If I need to alter the existing structure I think my client may reject the proposal, although if it's the only way to get it right, this is what I will have to do.
I have an Item table that can be linked to any number of tables, and these tables may increase over time. An Item can only be linked to one other table, but the record in the other table may have many items linked to it.
Examples of the tables/entities being linked to are Person, Vehicle, Building, Office. These are all separate tables.
Examples of Items are Pen, Stapler, Cushion, Tyre, A4 Paper, Plastic Bag, Poster, Decoration.
For instance a Poster may be allocated to a Person or Office or Building. In the future if they add a Conference Room table it may also be added to that.
My initial thoughts are:
Item
{
ID,
Name
}
LinkedItem
{
ItemID,
LinkedToTableName,
LinkedToID
}
The LinkedToTableName field will then allow me to identify the correct table to link to in my code.
I'm not overly happy with this solution, but I can't quite think of anything else. Please help! :)
Thanks!
It is not a good practice to store table names as column values. This is a bad hack.
There are two standard ways of doing what you are trying to do. The first is called single-table inheritance. This is easily understood by ORM tools but trades off some normalization. The idea is, that all of these entities - Person, Vehicle, whatever - are stored in the same table, often with several unused columns per entry, along with a discriminator field that identifies what type the entity is.
The discriminator field is usually an integer type, that is mapped to some enumeration in your code. It may also be a foreign key to some lookup table in your database, identifying which numbers correspond to which types (not table names, just descriptions).
The other way to do this is multiple-table inheritance, which is better for your database but not as easy to map in code. You do this by having a base table which defines some common properties of all the objects - perhaps just an ID and a name - and all of your "specific" tables (Person etc.) use the base ID as a unique foreign key (usually also the primary key).
In the first case, the exclusivity is implicit, since all entities are in one table. In the second case, the relationship is between the Item and the base entity ID, which also guarantees uniqueness.
Note that with multiple-table inheritance, you have a different problem - you can't guarantee that a base ID is used by exactly one inheritance table. It could be used by several, or not used at all. That is why multiple-table inheritance schemes usually also have a discriminator column, to identify which table is "expected." Again, this discriminator doesn't hold a table name, it holds a lookup value which the consumer may (or may not) use to determine which other table to join to.
Multiple-table inheritance is a closer match to your current schema, so I would recommend going with that unless you need to use this with Linq to SQL or a similar ORM.
See here for a good detailed tutorial: Implementing Table Inheritance in SQL Server.
Find something common to Person, Vehicle, Building, Office. For the lack of a better term I have used Entity. Then implement super-type/sub-type relationship between the Entity and its sub-types. Note that the EntityID is a PK and a FK in all sub-type tables. Now, you can link the Item table to the Entity (owner).
In this model, one item can belong to only one Entity; one Entity can have (own) many items.
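The super-type/sub-type model described above might look like this when sketched with Python's built-in sqlite3 module. Only Entity, EntityID, Item, Person and Office come from the thread; the FullName and RoomNo columns and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
-- Supertype shared by everything an Item can be allocated to.
CREATE TABLE Entity (EntityID INTEGER PRIMARY KEY);

-- Subtypes: EntityID is both the primary key and a foreign key.
CREATE TABLE Person (
    EntityID INTEGER PRIMARY KEY REFERENCES Entity(EntityID),
    FullName TEXT NOT NULL
);
CREATE TABLE Office (
    EntityID INTEGER PRIMARY KEY REFERENCES Entity(EntityID),
    RoomNo   TEXT NOT NULL
);

-- A single owner column means an item has exactly one owner.
CREATE TABLE Item (
    ID       INTEGER PRIMARY KEY,
    Name     TEXT NOT NULL,
    EntityID INTEGER NOT NULL REFERENCES Entity(EntityID)
);
""")
conn.execute("INSERT INTO Entity VALUES (1)")
conn.execute("INSERT INTO Person VALUES (1, 'Ann')")
conn.execute("INSERT INTO Entity VALUES (2)")
conn.execute("INSERT INTO Office VALUES (2, 'B-12')")

conn.execute("INSERT INTO Item (Name, EntityID) VALUES ('Poster', 2)")
conn.execute("INSERT INTO Item (Name, EntityID) VALUES ('Pen', 1)")
conn.execute("INSERT INTO Item (Name, EntityID) VALUES ('Stapler', 1)")

# Items owned by people: join Item to the Person subtype via EntityID.
items_for_people = [r[0] for r in conn.execute(
    "SELECT i.Name FROM Item i JOIN Person p ON p.EntityID = i.EntityID "
    "ORDER BY i.Name")]

# An item pointing at a non-existent owner is rejected.
try:
    conn.execute("INSERT INTO Item (Name, EntityID) VALUES ('Tyre', 99)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False
```

Adding a Conference Room later would mean one new subtype table keyed by EntityID, with no change to Item.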
Your link table is OK.
The trouble you will have is that you will need to generate dynamic SQL at runtime. Parameterized SQL does not typically allow the objects in the FROM list to be parameters.
If you want to avoid this, you may be able to denormalize a little, say by creating a table to hold the id (assuming the ids are unique across the other tables), the type_id representing which table is the source, and a generated description, e.g. the name value from the initial record.
You would trigger the creation of this denormalized list when the base info is modified, and you could use that for generalized queries, and then resort to your dynamic queries when needed at runtime.
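To illustrate the dynamic SQL point, here is a sketch using Python's built-in sqlite3 module against the question's LinkedItem layout. The whitelist, the owner_name helper, the Name columns on Person and Office, and the sample rows are assumptions; making ItemID the primary key is one way to express that an item links to only one row.

```python
import sqlite3

# Hypothetical whitelist: the only table names the code will ever splice
# into SQL text. Values read from the database are checked against it.
LINKABLE_TABLES = frozenset({"Person", "Office"})

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Person (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Office (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE Item (ID INTEGER PRIMARY KEY, Name TEXT);
CREATE TABLE LinkedItem (
    ItemID            INTEGER PRIMARY KEY REFERENCES Item(ID),
    LinkedToTableName TEXT NOT NULL,
    LinkedToID        INTEGER NOT NULL
);
INSERT INTO Person VALUES (1, 'Ann');
INSERT INTO Office VALUES (1, 'B-12');
INSERT INTO Item VALUES (1, 'Poster');
INSERT INTO LinkedItem VALUES (1, 'Office', 1);
""")


def owner_name(item_id: int) -> str:
    """Return the Name of whatever row the item is linked to."""
    table_name, linked_id = conn.execute(
        "SELECT LinkedToTableName, LinkedToID FROM LinkedItem WHERE ItemID = ?",
        (item_id,),
    ).fetchone()
    if table_name not in LINKABLE_TABLES:
        raise ValueError(f"unexpected table name: {table_name!r}")
    # The table name cannot be a bound parameter, so it is spliced in as
    # text; the row id is still passed as a parameter.
    sql = f'SELECT Name FROM "{table_name}" WHERE ID = ?'
    return conn.execute(sql, (linked_id,)).fetchone()[0]
```

Nothing in the database guarantees that LinkedToID exists in the named table, so that check, like the join itself, stays in application code.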