I am trying to develop a data model for a very diverse set of interconnected objects. As the application matures, the types of objects supported will increase significantly. I want to avoid having to modify the model/schema whenever new object types are added.
As a simple example, let's say I'm starting with a model of people and buildings. A building can have multiple owners; a person can own multiple buildings; a person can live in a house and work in an office... Future versions might add cars and corporations. Cars can have owners, corporations can manufacture cars, people can work for corporations, etc. Most of the relationships will be many-to-many, some will be one-to-many, very few will be one-to-one.
While concepts like "owner", "employer", or "manufacture" can be considered properties of a "building", "corporation", or "car" object, I don't want to redefine the data model to support a new property type.
My current idea is to model this similar to a graph, where each piece of data is its own node. The node object would be very simple:
Unique identifier
Name (human representation)
Node type
Relationships
Extending the previous example, the possible node types would be:
Person
Car
Company
Building
A relationship would be:
Node A
Node B
Relationship type - uses, owns, has, is, etc
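The node and relationship structure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (Node, Relationship, Graph), the integer IDs, and the choice to keep relationships in one edge list rather than on each node are all hypothetical and not part of the original proposal.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)  # hypothetical ID source; a real system might use UUIDs


@dataclass(frozen=True)
class Node:
    node_id: int
    name: str       # human-readable representation
    node_type: str  # e.g. "Person", "Building"


@dataclass(frozen=True)
class Relationship:
    source: Node    # "Node A"
    target: Node    # "Node B"
    rel_type: str   # e.g. "owns", "uses"


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)

    def add_node(self, name, node_type):
        node = Node(next(_ids), name, node_type)
        self.nodes[node.node_id] = node
        return node

    def relate(self, source, rel_type, target):
        self.relationships.append(Relationship(source, target, rel_type))

    def targets(self, source, rel_type):
        """Nodes reached from `source` via `rel_type`."""
        return [r.target for r in self.relationships
                if r.source == source and r.rel_type == rel_type]

    def sources(self, target, rel_type):
        """Nodes that point at `target` via `rel_type`."""
        return [r.source for r in self.relationships
                if r.target == target and r.rel_type == rel_type]


graph = Graph()
alice = graph.add_node("Alice", "Person")
bob = graph.add_node("Bob", "Person")
house = graph.add_node("12 Elm St", "Building")
graph.relate(alice, "owns", house)
graph.relate(bob, "owns", house)
graph.relate(alice, "lives in", house)

owned_by_alice = [n.name for n in graph.targets(alice, "owns")]
owners_of_house = [n.name for n in graph.sources(house, "owns")]
```

Adding a new node type or relationship type here is just a new string value, which is the flexibility being asked for; nothing in the structure checks that a given relationship type makes sense between two node types.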
I have a few questions:
Are there any drawbacks to this approach?
Is there an existing pattern or model that describes this?
Are there better approaches?
Is there an existing pattern or model that describes this?
What you describe sounds like a network data model, also known as an object or object-oriented data model.
Are there any drawbacks to this approach?
Your model doesn't support ternary and higher relationships. It also creates fixed access paths between nodes, which supports node-to-node navigation, but which can make many queries convoluted. I also don't see any support for subtyping.
Without composite determinants, some situations will be difficult to model or query. You don't support predicates like (Object, Language) -> Name (or (Company, Role) -> Person, etc). One way is to create special relationship types, but your model is going to be asymmetric and more complicated to query.
Are there better approaches?
The relational model of data handles n-ary relations between object types / domains, and allows for the representation of complex predicates. N-ary relations mean it supports object hypergraphs, and user-defined joins mean ad-hoc access paths. Composite determinants are supported, and most implementations support a variety of integrity constraints.
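As a rough illustration of composite determinants, the two predicates mentioned above, (Object, Language) -> Name and (Company, Role) -> Person, can each be written as a relation with a composite key. The sketch below uses Python's built-in sqlite3 module; the table names, column names, and sample rows are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE object_name (
    object_id INTEGER NOT NULL,
    language  TEXT    NOT NULL,
    name      TEXT    NOT NULL,
    PRIMARY KEY (object_id, language)  -- (Object, Language) -> Name
);
CREATE TABLE company_role (
    company TEXT NOT NULL,
    role    TEXT NOT NULL,
    person  TEXT NOT NULL,
    PRIMARY KEY (company, role)        -- (Company, Role) -> Person
);
""")

conn.executemany(
    "INSERT INTO object_name VALUES (?, ?, ?)",
    [(1, "en", "house"), (1, "de", "Haus"), (2, "en", "car")],
)
conn.executemany(
    "INSERT INTO company_role VALUES (?, ?, ?)",
    [("Acme", "CEO", "Alice"), ("Acme", "CFO", "Bob")],
)

# The pair (object, language) determines exactly one name.
german_name = conn.execute(
    "SELECT name FROM object_name WHERE object_id = ? AND language = ?",
    (1, "de"),
).fetchone()[0]

acme_ceo = conn.execute(
    "SELECT person FROM company_role WHERE company = ? AND role = ?",
    ("Acme", "CEO"),
).fetchone()[0]

# A second name for the same (object, language) pair violates the key.
try:
    conn.execute("INSERT INTO object_name VALUES (1, 'en', 'home')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

In a binary node-to-node model, the same fact would need an extra intermediate node or a special relationship type, which is the asymmetry the answer warns about.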
In particular, look at Object-Role Modeling (http://www.orm.net, https://www.ormfoundation.org).
I want to avoid having to modify the model/schema whenever new object types are added.
Try doing a web search for "universal schema for knowledge representation". Facts about the world aren't limited to simple atomic observations like "John Smith has a dog named Spot". We have to deal with facts like "Company A is not allowed to distribute product B in regions within 100km of point C after date D if that product contains ingredients E or F". The most powerful knowledge representation we've got so far is natural language, and as far as I know we don't yet have a simple model of its structure.
I'm currently reading Ologs: A Categorical Framework for Knowledge Representation. Perhaps this will be of interest to you too.
The image shows the logistics of a warehouse, in a very simplified form. The concept is as follows. There are documents: ReceivingWayBill, DispatchingWaybill, ReplacementOrder.
They interact with the main classes: Warehouse, Counterparty, Item.
Then there is the register class, ItemRemainsInWarehouse. A document is the confirmation of an operation (receiving, dispatching, and so on). The register simply stores the quantity of goods remaining.
Please set aside the many other problems with this diagram, such as the lack of generalization, getters and setters, and plenty more.
Can anyone tell me whether the relationships between the classes are placed correctly? I have used aggregation everywhere; should some of them be modeled as associations instead?
It is hard (maybe impossible) to correct your whole model from the explanation provided, but here are some improvements.
You should show the multiplicity of your relationships; it is important. In some relationships you have 1 (ReplacementOrder, Warehouse), and some of your relationships are perhaps * (Item, ReceivingWayBill).
You put aggregation between your classes, and aggregation is a type of association. You can use plain associations too. There are many similar questions and answers that explain the differences between association and aggregation (and composition); see Question 1, Question 2 and Question 3. But I recommend this answer.
I think there is no very significant difference between aggregation and association. See my example in this question.
Robert C. Martin says (see here):
Association represents the ability of one instance to send a message to another instance.
Aggregation is the typical whole/part relationship. This is exactly the same as an association with the exception that instances cannot have cyclic aggregation relationships (i.e. a part cannot contain its whole).
Therefore, some of your relationships are exactly aggregations (the relationships between Item and the other classes). Your Counterparty does not have a well-defined API. Your other relationships are about using the Warehouse class. I think (just a guess) the other classes only use the Warehouse class's services (public methods). In that case, they can be associations. Otherwise, if they need an instance of Warehouse as a part, they are aggregations.
Aggregation is evil!
Read the UML specs about the two variants they introduced (p. 110):
none: Indicates that the Property has no aggregation semantics. [hear, hear!]
shared: Indicates that the Property has shared aggregation semantics. Precise semantics of shared aggregation varies by application area and modeler.
composite: Indicates that the Property is aggregated compositely, i.e., the composite object has responsibility for the existence and storage of the composed objects (see the definition of parts in 11.2.3).
Composite aggregation is a strong form of aggregation that requires a part object be included in at most one composite object at a time. If a composite object is deleted, all of its part instances that are objects are deleted with it.
Now, that last sentence clearly indicates where you should use composite (!) aggregation: in security-related applications. When you delete a person record in a database you also need to delete all related entities. The often-used example of a car being composed of a motor, tires, etc. does not really fit. The tires do not vanish when you "delete" the car, simply because you cannot delete it. Even worse is the use of shared aggregation, since it has no definition per definition (sic!).
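The delete-with-the-whole behavior of composite aggregation has a close analogue in a database foreign key declared with ON DELETE CASCADE. The following is a minimal sketch using Python's sqlite3 module; the person and address tables and their columns are hypothetical and only serve to show the parts disappearing with the composite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE person (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE address (
    id        INTEGER PRIMARY KEY,
    person_id INTEGER NOT NULL
              REFERENCES person(id) ON DELETE CASCADE,  -- part dies with whole
    street    TEXT NOT NULL
);
""")

conn.execute("INSERT INTO person (id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO address (person_id, street) VALUES (?, ?)",
    [(1, "12 Elm St"), (1, "3 Oak Ave")],
)

addresses_before = conn.execute("SELECT COUNT(*) FROM address").fetchone()[0]

# Deleting the composite (the person) removes its parts (the addresses).
conn.execute("DELETE FROM person WHERE id = 1")

addresses_after = conn.execute("SELECT COUNT(*) FROM address").fetchone()[0]
```

Each address row belongs to at most one person at a time, which matches the "at most one composite object" requirement quoted from the specification.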
So what should you do? Use multiplicities! That is what people usually want to show: there are 0..n, 1, etc. elements related to the class at the other side. Optionally, you can name these using roles to make it explicit.
If you consider DispatchingWaybill and ReceivingWaybill it looks like those are association classes. With the right multiplicities (1-* / *-1) you can leave it this way. (Edit: note the little dots at the association's ends which tell that the class at the opposite has an attribute named after the role.)
Alternatively, attach each of them with a dashed line to an association between the classes they are currently connected to.
I'm getting started with UML after years of programming and I want to make sure that I'm properly using the symbols in my diagrams.
Does the following diagram look like a proper representation of a simple Car class?
Update:
Actually, I just realized that Make knows about model so I removed the arrowhead, but Make and Model do not know about Car so I added arrowheads:
I would make some adjustments to your original diagram:
Make
The Make identifies the manufacturer, so Make will have many Models, and because no other manufacturer can take ownership of those models, the black diamond is shown to denote composition
The relationship is bidirectionally navigable, so no arrows are shown
Because we can navigate from Make to Model, there is a SortedSet (sorted by year) of Models shown in Make
Model
Model holds the only direct relationship to Make; this guarantees the validity of the relationships and avoids incorrect connections that could result from allowing other classes to create relationships with Make
Because Model may be used to navigate to Make, Model contains a reference to its Make; this ensures that any Car may navigate to its Model information and from there (and only there) navigate to its Make
For year, the immutable type Integer was chosen, because the value will never change once it has been assigned
Car
The relationship between Model and Car is only navigable from Car to Model, because the business domain has no need to obtain a set of all Cars that were manufactured for any given Model (and it keeps things simpler)
Each car is assigned a VIN, shown as an instance of UUID - Universally Unique Identifier
The relationship between Car and Engine is only navigable from Car to Engine, so Car has an Engine member; there are use cases where parts of the system that receive a reference to a Car may want to navigate to the Car's Engine
Engine
Given that an Engine may start its life in one Car but then be moved into a different Car, the relationship is shown using a white diamond, denoting aggregation
The Engine is assigned a Part Number, which is a specific identifier for that instance of Engine; therefore the immutable type String is shown
The Engine maintains a mileage counter, which is defined to be of type: int, because the value is mutable and will change over the lifetime of the Engine
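The relationships listed above can be approximated in code. The sketch below is one possible Python reading of the diagram, not a translation of it: the field names, the plain sorted list standing in for the SortedSet, and the sample make, model, and part-number values are all assumptions.

```python
from dataclasses import dataclass, field
from uuid import UUID, uuid4


@dataclass
class Make:
    name: str
    models: list = field(default_factory=list)  # kept sorted by year

    def add_model(self, name, year):
        # Composition: a Model is only ever created through its Make.
        model = Model(name, year, self)
        self.models.append(model)
        self.models.sort(key=lambda m: m.year)
        return model


@dataclass(frozen=True)
class Model:
    name: str
    year: int  # never changes once assigned
    # Back-reference to the owning Make; excluded from repr and equality
    # so the Make <-> Model cycle does not recurse.
    make: Make = field(repr=False, compare=False)


@dataclass
class Engine:
    part_number: str
    mileage: int = 0  # mutable; changes over the Engine's lifetime


@dataclass
class Car:
    model: Model    # navigable from Car to Model only
    engine: Engine  # aggregation: the Engine can be moved to another Car
    vin: UUID = field(default_factory=uuid4)


acme = Make("Acme")
roadster = acme.add_model("Roadster", 2021)
hatchback = acme.add_model("Hatchback", 2019)

car_a = Car(roadster, Engine("ENG-001"))
car_b = Car(hatchback, Engine("ENG-002"))

# Move an Engine from one Car to another; the Engine outlives the swap.
moved_engine = car_a.engine
car_a.engine = Engine("ENG-003")
car_b.engine = moved_engine
moved_engine.mileage += 100
```

Note that a Car reaches its Make only through its Model (car_a.model.make), which mirrors the "from there (and only there)" navigation rule above.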
I hope this is helpful and provides some good feedback that makes your modeling exercise thought-provoking. I know I had fun creating the diagram and thinking about the relationships and some of the details. I haven't done this in a while.
Say you need to architect an app with an entity that can be associated with multiple other kinds of entities. For example, you have a Picture entity that can be associated with a Meal entity, a Person entity, a Boardroom entity, a Furniture entity, etc. I can think of a number of different ways to address this problem, but -- perhaps because I'm new to Core Data -- I'm not comfortable with any of them.
The most obvious approach that comes to mind is simply creating a relationship between Picture and each entity that supports associated pictures, but this seems sloppy since pictures will have multiple "null pointers."
Another possibility is creating a superentity -- Pictureable -- or something. Every entity that supports associated pictures would be a subentity of Pictureable, and Picture itself would have a one-to-one with Pictureable. I find this approach troubling because it can't be used more than once in the context of a project (since Core Data doesn't support multiple inheritance) AND the way Core Data seems to create one table for any given root entity -- assuming a SQLite backing -- has me afeard of grouping a whole bunch of disparate subentities under the umbrella of a common superentity (I realize that thinking along these lines may smack of premature optimization, so let me know if I'm being a ninny).
A third approach is to create a composite key for Picture that consists of a "type" and a "UID." Assuming every entity in my data model has a UID, I can use this key to derive an associated managed object from a Picture instance and vice versa. This approach worries me because it sounds like it might get slow when fetching en masse; it also doesn't feel native enough to me.
A fourth approach -- the one I'm leaning towards for the app I'm working on -- is creating subentities for both Picture and X (where X is either Meal, Person, Boardroom, etc.) and creating a one-to-one between both of those subentities. While this approach seems like the lesser of all evils, it still seems abstruse to my untrained eye, so I wonder if there's a better way.
Edit 1: In the last paragraph, I meant to say I'm leaning towards creating subentities just for Picture, not both Picture and X.
I think the best variations on this theme are (not necessarily in order):
Use separate entities for the pictures associated with Meal, Person, Boardroom, etc. Those entities might all have the same attributes, and they might in fact all be implemented using the same class. There's nothing wrong with that, and it makes it simple to have a bidirectional relationship between each kind of entity and the entity that stores its picture.
Make the picture an attribute of each of the entity types rather than a separate entity. This isn't a great plan with respect to efficiency if you're storing the actual picture data in the database, but it'd be fine if you store the image as a separate file and store the path to that file in an attribute. If the images or the number of records is small, it may not really be a problem even if you do store the image data in the database.
Use a single entity for all the pictures but omit the inverse relationship back to the associated entity. There's a helpful SO question that considers this, and the accepted answer links to the even more helpful Unidirectional Relationships section of the docs. This can be a nice solution to your problem if you don't need the picture->owner relationship, but you should understand the possible risk before you go down that road.
Give your picture entity separate relationships for each possible kind of owner, as you described in the first option you listed. If you'll need to be able to access all the pictures as a group and you need a relationship from the picture back to its owner, and if the number of possible owner entities is relatively small, this might be your best option even if it seems sloppy to have empty attributes.
As you noticed, when you use inheritance with your entities, all the sub-entities end up together in one big table. So, your fourth option (using sub-entities for each kind of picture) is similar under the hood to your first option.
Thinking more about this question, I'm inclined toward using entity inheritance to create subentities for the pictures associated with each type of owner entity. The Picture entity would store just the data that's associated with any picture. Each subentity, like MealPicture and PersonPicture, would add a relationship to its own particular sort of owner. This way, you get bidirectional Meal<->MealPicture and Person<->PersonPicture relationships, and because each subentity inherits all the common Picture stuff you avoid the DRY violation that was bugging you. In short, you get most of the best parts of options 1 and 3 above. Under the hood, Core Data manages the pictures as in option 4 above, but in use each of the picture subentities only exposes a single relationship.
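The shape of that subentity arrangement can be sketched outside Core Data. The Python classes below are only an analogy: in a real project these would be entities configured in the Core Data model editor, and the attribute names (path, pictures, meal, person) are assumptions chosen for the example.

```python
class Picture:
    """Common data shared by every kind of picture."""

    def __init__(self, path):
        self.path = path


class Meal:
    def __init__(self, name):
        self.name = name
        self.pictures = []  # Meal -> MealPicture side of the relationship

    def add_picture(self, path):
        picture = MealPicture(path, self)
        self.pictures.append(picture)
        return picture


class Person:
    def __init__(self, name):
        self.name = name
        self.pictures = []  # Person -> PersonPicture side of the relationship

    def add_picture(self, path):
        picture = PersonPicture(path, self)
        self.pictures.append(picture)
        return picture


class MealPicture(Picture):
    """Adds exactly one owner relationship: back to a Meal."""

    def __init__(self, path, meal):
        super().__init__(path)
        self.meal = meal


class PersonPicture(Picture):
    """Adds exactly one owner relationship: back to a Person."""

    def __init__(self, path, person):
        super().__init__(path)
        self.person = person


lunch = Meal("Lunch")
alice = Person("Alice")
lunch_photo = lunch.add_picture("lunch.jpg")
alice_photo = alice.add_picture("alice.jpg")

# Every picture can still be handled through the common Picture type.
all_paths = sorted(p.path for p in lunch.pictures + alice.pictures)
```

Each subclass carries a single, always-populated owner reference, so there are no empty relationships, while the shared attributes live in one place.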
Just to expand a bit on Caleb's excellent summation...
I think it's important not to overemphasize the similarities between entities and classes. Both are abstractions that help define concrete objects, but entities are very "lightweight" compared to classes. For one thing, entities don't have behaviors, just properties. For another, they exist purely to provide other concrete objects, e.g. the managed object context and persistent stores, with a description of the data model so those concrete objects can piece everything together.
In fact, under the hood, there is no NSEntity class; there is only an NSEntityDescription class. Entities are really just descriptions of how the objects in an object graph will fit together. As such, you really don't get all the overhead and inefficiency of multiplying classes when you multiply entities, e.g. having a bunch of largely duplicate entities doesn't slow down the app, use more memory, interfere with method chains, etc.
So, don't be afraid to use multiple seemingly redundant entities when that is the simplest solution. In Core Data, that is often the most elegant solution.
I am struggling with exactly this dilemma right now. I have many different entities in my model that can be "quantified". Say I have Apple, Pear, and Farmer; for all of those entities, I need an AppleStack, PearStack, and FarmerGroup, which are all just object + number. I need a generic approach to this because I want to support it in a model editor I am writing, so I decided to define an abstract ObjectValue entity with the attributes object and value. Then I will create child entities of ObjectValue, subclass them, and declare a valueEntity constant. This way I define it only once, and I can write generic code that, for example, returns the possible values of the object relationship. Moreover, if I need special attributes (and I actually do for a few of those), I can still add them in the child entities.
I'm writing an application to help diabetics manage their condition. Information that is tracked includes blood sugar results, nutrition, exercise, and medication information.
In similar applications these entries are all presented in a single table view even though each type of entry has different fields. This data is manually tracked by many diabetics in a logbook, and I'm looking to keep that paradigm.
Each entry has some common information (timestamp, category, and notes) as well as information specific to each entry type. For instance, meal entries would have detailed nutrition information (carb counts, fiber, fat, etc), medication entries would indicate which medication and dosage, etc.
I've considered two different approaches but I'm getting stuck at both a conceptual level and a technical level when attempting to implement either. The first approach was to create an abstract entity to contain all the common fields and then create entities for each log entry type (meals, meds, bg, etc.) that are parented to the abstract entity. I had this all modeled out but couldn't quite figure out how to bind these items to an array controller to have them show up in a single table view.
The second approach is to have one entity that contains the common fields, and then model the specific entry types as separate entities that have a relationship back to the common record (sort of like a decorator pattern). This was somewhat easier to build the UI for (at least for the common field entity), but I come to the same problem when wanting to bind the specific data entities.
Of course the easiest approach is to just throw all the fields from each different entry type into one entity but that goes against all my sensibilities. And it seems I would still run into a similar problem when I go to bind things to the table view.
My end goal is to provide an interface to the user that shows each entry in chronological order in a unified interface instead of having to keep a separate list of each entry type. I'm fine with adding code where needed, but I'd like to use the bindings as much as possible.
Thanks in advance for any advice.
Don't get bogged down with entity inheritance. You shouldn't use it to save duplicating attributes, like you would with classes. Its major use is to allow different entities to be in the same relationship. Also, entity inheritance and class inheritance don't have to overlap. You can have a class inheritance hierarchy without an entity inheritance hierarchy.
I'm not sure I understand exactly what you really need but here's some generic advice: You shouldn't create your data model based on the needs of the UI. The data model is really a simulation of the real-world objects, events or conditions that your app deals with. You should create your data model first and foremost to accurately simulate the data. Ideally, you should create a data model that could be used with any UI e.g. command-line, GUI, web page etc.
Once your model is accurately set up, then whipping up the UI is usually easy.
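To make the "model first, UI second" point concrete, here is a small UI-independent sketch of the logbook from the question: common fields on a base type, specific fields on each entry type, and one chronological list built from all of them. It is plain Python rather than Core Data, and every class name, field name, and sample value is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class LogEntry:
    """Fields common to every logbook entry."""
    timestamp: datetime
    notes: str

    category = "Entry"  # overridden by each entry type

    def details(self):
        raise NotImplementedError

    def row(self):
        # One uniform row shape that any UI (table, CLI, web) could show.
        return (self.timestamp.isoformat(), self.category,
                self.details(), self.notes)


@dataclass
class MealEntry(LogEntry):
    carbs_g: float
    fiber_g: float
    category = "Meal"

    def details(self):
        return f"{self.carbs_g} g carbs, {self.fiber_g} g fiber"


@dataclass
class MedicationEntry(LogEntry):
    medication: str
    dose_units: float
    category = "Medication"

    def details(self):
        return f"{self.medication}: {self.dose_units} units"


@dataclass
class GlucoseEntry(LogEntry):
    mg_dl: int
    category = "Glucose"

    def details(self):
        return f"{self.mg_dl} mg/dL"


entries = [
    MealEntry(datetime(2024, 1, 5, 8, 0), "breakfast",
              carbs_g=45.0, fiber_g=6.0),
    MedicationEntry(datetime(2024, 1, 5, 8, 5), "with meal",
                    medication="insulin", dose_units=4.0),
    GlucoseEntry(datetime(2024, 1, 5, 7, 30), "fasting", mg_dl=102),
]

# A single chronological logbook, regardless of entry type.
logbook = sorted(entries, key=lambda entry: entry.timestamp)
categories = [entry.category for entry in logbook]
rows = [entry.row() for entry in logbook]
```

Because every entry can produce the same row shape, a single table view only needs to know about the common interface, not about each entry type.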
context:
I have an entity Book. A book can have one or more Descriptions. Descriptions are value objects.
problem:
A description can be more specific than another description. For example, if a description covers the content of the book and how the cover looks, it is more specific than a description that only discusses how the cover looks. I don't know how to model this and how to have the repository save it. It is not the responsibility of the book nor of the book description to know these relationships. Some other object can handle this and then ask the repository to save the relationships. But BookRepository.addMoreSpecificDescription(Description, MoreSpecificDescription) seems difficult for the repository to save.
How is such a thing handled in DDD?
The other two answers are one direction (+1 btw). I am coming in after your edit to the original question, so here are my two cents...
I define a Value Object as an object with two or more properties that can be (and is) shared amongst other entities. If they are shared only within a single Aggregate Root, that's fine too. What matters is the fact that they can be (and are) shared.
To use your example, you define a "Description" as a Value Object. That tells me that a "Description" with multiple properties can be shared amongst several Books. In the real world, that does not make sense, as we all know each book has unique descriptions written by whoever authored or published the book. Hehe. So, I would argue that Descriptions aren't really Value Objects, but are themselves additional Entity objects within your Book Aggregate Root Entity boundary (you can have multiple entities within a single aggregate root's entity). Even books that are re-released, newer revisions, etc. have slightly different descriptions describing that slight change.
I believe that answers your question - make the descriptions entity objects and protect them behind your main Book Entity Aggregate Root (e.g. Book.GetDescriptions()...). The rest of this answer addresses how I handle Value Objects in Repositories, for others reading this post...
For storing Value Objects in a repository, and retrieving them, we start to encroach onto the same territory I wrestled with when I switched from a "Database-first" modeling approach to a DDD approach. I wrestled with how to store a Value Object in the DB and retrieve it without an Identity, until I stepped back and realized what I was doing...
In Domain Driven Design, you are modeling the Value Objects in your domain - not your data store. That is the key phrase. It means you are not designing the Value Objects to be stored as independent objects in the data store; you can store them however you like!
Let's take the common DDD example of a Value Object, that being an Address(). DDD presents a Mailing Address as the perfect Value Object example, as the definition of a Value Object is an object whose properties sum up to create the uniqueness of the object. If a property changes, it will be a different Value Object. And the same Value Object (the sum of its properties) can be shared amongst other Entities.
A Mailing Address is a location, a long/lat of a specific location on the planet. Multiple people can live at the address, and when someone moves, the new people to occupy the same Mailing Address now use the same Value Object.
So, I have a Person() object with a MailingAddress() object that has the address information in it. It is protected behind my Person() aggregate root with get/update/create methods/services.
Now, how do we store that and share it amongst the people in the same household? Ah, there lies DDD: you aren't modeling your data store straight from your domain model (even though that would be nice). With that said, you simply create a single table that represents your Person object, and it has the columns for your mailing address within it. It is the job of your Repository to re-hydrate that information back into your Person() and MailingAddress() objects from the data store, and to split it up during the Create/Update operations.
Yep, you'd have duplicate data now in your data store. Three Person() entities with the same mailing address all now have three separate copies of that Value Object data - and that is ok! Value Objects are meant to be copied and destroyed quite easily. "Copy" is the operative word there in the DDD playbook.
So to sum up, Domain Driven Design is about modeling your Domain to represent your actual business use of the objects. You model a Person() entity and a MailingAddress Value Object separately, as they are represented differently in your application. You persist them as copied data, that being additional columns in your Person table.
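The copied-data persistence summarized above might look like the following in code. This is a minimal sketch using Python dataclasses and sqlite3, not a prescribed DDD implementation: the PersonRepository class, its save and get methods, the column names, and the sample people are all hypothetical.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class MailingAddress:
    """Value Object: no identity, equal when all properties are equal."""
    street: str
    city: str
    postal_code: str


@dataclass
class Person:
    """Entity: identified by person_id."""
    person_id: int
    name: str
    mailing_address: MailingAddress


class PersonRepository:
    def __init__(self, conn):
        self.conn = conn
        # One table; the address columns live alongside the person columns.
        conn.execute("""
            CREATE TABLE IF NOT EXISTS person (
                id          INTEGER PRIMARY KEY,
                name        TEXT NOT NULL,
                street      TEXT NOT NULL,
                city        TEXT NOT NULL,
                postal_code TEXT NOT NULL
            )""")

    def save(self, person):
        # Split the Value Object into columns on create/update.
        address = person.mailing_address
        self.conn.execute(
            "INSERT OR REPLACE INTO person VALUES (?, ?, ?, ?, ?)",
            (person.person_id, person.name,
             address.street, address.city, address.postal_code),
        )

    def get(self, person_id):
        row = self.conn.execute(
            "SELECT id, name, street, city, postal_code "
            "FROM person WHERE id = ?",
            (person_id,),
        ).fetchone()
        if row is None:
            return None
        pid, name, street, city, postal_code = row
        # Re-hydrate the Value Object from its columns.
        return Person(pid, name, MailingAddress(street, city, postal_code))


repo = PersonRepository(sqlite3.connect(":memory:"))
home = MailingAddress("12 Elm St", "Springfield", "12345")
repo.save(Person(1, "Alice", home))
repo.save(Person(2, "Bob", home))

alice = repo.get(1)
bob = repo.get(2)
same_value = alice.mailing_address == bob.mailing_address
same_object = alice.mailing_address is bob.mailing_address
```

After loading, the two people hold equal but separate MailingAddress instances, which is exactly the "copy" behavior described: equality comes from the properties, not from a shared row or identity.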
All of the above is strict DDD. But DDD is meant to be just "suggestions", not rules to live by. That's why you are free to do what I and many others have done, kind of a loose-DDD style. If you don't like the copied data, your only other option is to create a separate table for MailingAddress(), stick an Identity column on it, and update your MailingAddress() object to now have that identity on it, knowing you only use that identity to link it to other Person() objects that share it (I personally like a third, many-to-many relationship table, to keep the speed of the queries up). You would mask that Identity (i.e. with an internal modifier) from being exposed outside of your Aggregate Root/Domain, so other layers (such as the Application or UI) do not know of the Identity column of the MailingAddress, if possible. Also, I would create a dedicated Repository just for MailingAddress, and use your PersonService layer to combine them into the correct object, Person.MailingAddress().
Sorry for the rant... :)
First, I think that reviews should be entities.
Second, why are you trying to model relationships between reviews? I don't see a natural relationship between them. "More specific than" is too vague to be useful as a relationship.
If you're having difficulty modeling the situation, that suggests that maybe there is no relationship.
I agree with Jason. I don't know what your rationale is for making reviews value objects.
I would expect a BookReview to have BookReviewContentItems so that you could have a method on the BookReview to call to decide if it is specific enough, where the method decides based on querying its collection of content items.
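One way to read that suggestion is sketched below. Only the names BookReview and BookReviewContentItem come from the answer; the topic and text fields, the two methods, and the rule that "specific enough" means covering a required set of topics are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BookReviewContentItem:
    topic: str  # e.g. "cover", "content"
    text: str


@dataclass
class BookReview:
    reviewer: str
    items: list = field(default_factory=list)

    def topics(self):
        return {item.topic for item in self.items}

    def is_specific_enough(self, required_topics):
        # Assumed rule: the review must cover every required topic.
        return set(required_topics) <= self.topics()

    def is_more_specific_than(self, other):
        # Assumed rule: covers everything the other review does, and more.
        return other.topics() < self.topics()


cover_only = BookReview("Ann", [
    BookReviewContentItem("cover", "Striking red dust jacket."),
])
full = BookReview("Ben", [
    BookReviewContentItem("cover", "Plain but sturdy."),
    BookReviewContentItem("content", "A thorough tour of the subject."),
])

full_is_enough = full.is_specific_enough({"cover", "content"})
cover_is_enough = cover_only.is_specific_enough({"cover", "content"})
full_beats_cover = full.is_more_specific_than(cover_only)
```

With this shape, the "more specific than" comparison is derived from the content items on demand, so nothing about the relationship between two reviews has to be stored by the repository.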