OO: Good method for maintaining consistency between related objects - oop

I have two classes, Server and Application, with a many-to-many relationship: a server can run multiple applications, an application can run across multiple servers. A third class, Host, represents a single application on a single server, including references to the Application and Server objects as well as additional data such as the amount of disk space used by the application on the server. Both Server and Application objects contain a list of all their hosts: hence, Applications know about Hosts and Hosts know about Applications, and Servers know about Hosts and Hosts know about Servers.
The purpose of my project is to work out the schedule for migrating a bunch of applications onto new servers. Originally each application had a migration-start and migration-end date. Some applications also have start and end dates for virtualisation. Virtualisation occurs if the migration cannot be performed within the application's constraints (never mind what these are). It occurs prior to the migration and frees the application from its constraints. An object called 'Schedule' is held by the Application object, which includes these 4 dates as well as a boolean flag to say whether it is to be virtualised, and a list of 'months' which contain the man-hours required to migrate (or virtualise) the application in each particular month.
We now want to allow servers to undergo virtualisation separately, on a specified date. All the applications (or parts of applications, i.e. hosts) on these servers will be virtualised on this date; they will be migrated along with the rest of the application. We originally decided to have the server class hold its own Schedule object. The virtualisation dates were then set in the server. However, we decided we wanted to keep the server and application schedules consistent - so that, for example, the server schedule's migration-start and end dates should be set to the earliest start and latest end dates, respectively, of all applications running on that server. This meant that every time we updated the Application dates, we had to remember to update all its server dates (via the host object). Or, if we wanted to update the Application's man-hours for a particular month, we had to update the server's man-hours also.
Then we thought about putting a single Schedule object inside each Host object. This solves the consistency problem, but leads to quite a bit of redundancy: since all Host objects belonging to an application will necessarily have the same migration dates (but possibly different virtualisation dates), when you set the migration dates for an app, you have to set the same dates for every host. Also, there are a few instances where we need to work out the earliest-start and latest-finish dates for servers AND applications, as above. This would involve either: holding this data in each of the application and server objects (effectively giving each its own Schedule, thereby bringing back problems with consistency), or: calculating this data on-the-fly each time it is needed, by looping through all the hosts' schedules. The same goes for the man-hours required by an application each month, which is calculated at the application level, divided into hours for each host per month, and then recalculated when we need to figure it out at the application level again. This is, as you would expect, not efficient in the slightest.
This isn't a straightforward question, but I'm wondering if there are any accepted strategies for dealing with this sort of situation. Sorry in advance for the prolixity of my post; hopefully I've made the situation clear enough.

This gets complex from the third paragraph onwards.
I would use the following design principle:
Keep the Application, Server and Host objects down to the minimum required behaviour and state.
For example, the Application object may contain the start date, end date, and the virtualization start and end dates. Consider whether it really needs to contain a list of servers, or an instance of Host.
Then think about a small framework like this:
a) MigrationManager, which carries out the complete migration process using a List
b) MigrationContext, which aggregates the information for the migration process
c) ErrorContext, which aggregates the error and exception handling
The MigrationManager gets an instance of Scheduler and schedules the migration.
In this way we can gradually evolve a framework around the core business objects and business logic.
The important things to remember:
Separation of concerns.
Reusability of code: for example, your Application object need not be tied to the whole migration process. Those things can be done by another object instead.
(This answer is based on my high-level understanding and on assumptions that could be wrong, but I think it may give you some direction for building the application to meet the requirements.)
One more suggestion: use a modelling tool such as StarUML or ArgoUML to put your ideas in pictorial form. This will help all the team members get into the problem very quickly.
Thanks

I think a fundamental principle of object-oriented programming is that, to the extent possible, every mutable aspect of state should at all times have exactly one well-defined owner (that owner may, in turn, be owned by exactly one other entity, which is in turn owned by one other entity, etc.). Other objects and entities may hold references to that mutable state, but any such references should be thought of in terms of the owner. For example, if a method accepts a reference to a collection and is supposed to populate it, the method would not think in terms of operating on a collection it owns, but rather in terms of operating on a collection owned by someone else, for the benefit of that entity.
Sometimes it is necessary for various objects to have separate copies of what is supposed to be the same mutable state. This situation frequently arises in things like graphical user interfaces, where an object might "own" the rotational angle of an object, but a display control might need to cache a private rendering of that object in its current orientation. Handling of such cases may be greatly simplified when one object is designated the absolute master, and it keeps other objects notified of its state. It will be greatly complicated if there are multiple objects, none of which has exclusive ownership, but all of which are supposed to nonetheless keep in sync with each other.
If you can work your model so that no piece of state needs to be duplicated, I would highly recommend doing so. It's crucial, though, that your model be capable of representing all scenarios of interest, including the possibility that two things which are supposed to be in the same state, might not be. What's most important is that every aspect of state has a well-defined chain of ownership, so that when an aspect of state is changed, one can say whose state is affected.
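To make the single-owner idea concrete for the scheduling question, here is a minimal Python sketch. It assumes (my assumption, not something the question states) that the Application is the sole owner of the migration dates, that each Host owns only its own virtualisation date, and that a Server never stores dates but derives them from its hosts on demand. The class and attribute names are illustrative only.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass(eq=False)
class Application:
    """Sole owner of the migration dates (an assumption of this sketch)."""
    name: str
    migration_start: date
    migration_end: date
    hosts: List["Host"] = field(default_factory=list, repr=False)


@dataclass(eq=False)
class Server:
    """Stores no dates of its own; derives them from the owning Applications."""
    name: str
    hosts: List["Host"] = field(default_factory=list, repr=False)

    @property
    def migration_start(self) -> Optional[date]:
        # Earliest start over all applications hosted here, None if empty.
        return min((h.application.migration_start for h in self.hosts), default=None)

    @property
    def migration_end(self) -> Optional[date]:
        # Latest end over all applications hosted here, None if empty.
        return max((h.application.migration_end for h in self.hosts), default=None)


@dataclass(eq=False)
class Host:
    """One application on one server; owns only the per-host state."""
    application: Application
    server: Server
    virtualisation_date: Optional[date] = None

    def __post_init__(self) -> None:
        # Register with both sides so each can reach its hosts.
        self.application.hosts.append(self)
        self.server.hosts.append(self)
```

Because the server's dates are computed from the applications rather than copied, changing an application's dates cannot leave a server out of date. If looping over the hosts ever proves too slow, a cache could be added later, with the Application as the master that invalidates it.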


In ECS (Entity Component System) what is the difference between Component-Manager and a System?

I'm trying to understand ECS. So a component is just plain data, and some manager holds these components in a container and loops through all of them to act on this data, to "update" them.
Is this manager what people call a "component-manager" or is it a "system"? Or do they mean the same thing? If not, what do a component-manager and a system each do?
ECS means different things to different people. There are a large number of approaches when it comes to implementation but I personally go by the following rules:
A Component is just plain data, typically a structure or some object with no logic associated with it whatsoever.
An Entity is a collection of components. It is defined by an identifier, typically an integer, that can be used to look up components like an index.
A System is where all the game logic lives. Each System has an archetype, that is, a specific set of components that it operates on. Systems have an update function which, when invoked, accesses the specific set of components it's interested in (its archetype) for all entities that have that specific collection of components. This update function is triggered externally (by what? see the next paragraph).
Now, here's the bit that addresses your question directly (or at least attempts to). Video games are simulations and they are typically driven by what's called an update loop (typically synced to a monitor's refresh rate). In an ECS architecture, there is typically dedicated code that strings your systems together in a queue and, on each time-step of the update loop, executes those systems in sequence (i.e. calls their update functions). That bit of dedicated code not only manages the system update loop but is also responsible for managing components (stored as lists/arrays that can be indexed by an entity id) and a myriad of other tasks. In many implementations it's referred to as the "Engine". This is what I take to be a "component-manager". But that could mean something else in another ECS approach. Just my two cents. Hope it helped.
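The rules above can be sketched in a few lines of Python. This is only one possible reading of them, not a reference implementation: the names Engine, MovementSystem, Position and Velocity are made up for illustration, and components are stored in one dictionary per component type, keyed by entity id.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Position:  # component: plain data, no logic
    x: float
    y: float


@dataclass
class Velocity:  # component: plain data, no logic
    dx: float
    dy: float


class Engine:
    """Owns the component stores and runs the systems in order."""

    def __init__(self) -> None:
        self._next_id = 0
        self._components: Dict[type, Dict[int, object]] = {}
        self._systems = []

    def create_entity(self, *components) -> int:
        # An entity is just an id used to look up its components.
        entity = self._next_id
        self._next_id += 1
        for component in components:
            self._components.setdefault(type(component), {})[entity] = component
        return entity

    def add_system(self, system) -> None:
        self._systems.append(system)

    def query(self, *types):
        # Yield every entity that has all of the requested component types.
        stores = [self._components.get(t, {}) for t in types]
        for entity in sorted(set.intersection(*(set(s) for s in stores))):
            yield entity, tuple(s[entity] for s in stores)

    def update(self, dt: float) -> None:
        # One time-step of the update loop: run each system in sequence.
        for system in self._systems:
            system.update(self, dt)


class MovementSystem:
    archetype = (Position, Velocity)  # the components this system needs

    def update(self, engine: Engine, dt: float) -> None:
        for _entity, (pos, vel) in engine.query(*self.archetype):
            pos.x += vel.dx * dt
            pos.y += vel.dy * dt
```

In this sketch an entity with only a Position is skipped by MovementSystem, because it does not match the system's archetype.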

Aggregate - Correct Usage (DDD)

I have been trying to get started on Domain Driven Design (DDD) and therefore I've been studying it for a while now. I have a problem and I seek help around how I can solve it in a DDD fashion.
I have a Client class, which contains a hell lot of attributes - some of them are simple attributes, such as string contactName whereas others are complex ones, such as list addresses, list websites, etc.
DDD advocates that Client should be an Entity and it should also be an Aggregate root - ie, the client code should manipulate only the Client object itself and it's down to the Client object to perform operations on its inner objects (addresses, websites, names, etc.).
Here's the point where I get confused:
There are tons of business rules in the application that depend on the Client's inner objects - for instance:
Depending on the Client's country of birth or residence and her address, some FATCA (a US regulation) restrictions may be applicable.
I need to enrich some inner objects with data that comes from other systems, both internal to my organisation as well as external.
The application has to decide whether a Client is allowed to perform an operation and to that end, the app needs to scrutinize a lot of client details and make a decision - also, as the app scrutinizes the Client it needs to update many of its attributes to keep track of what led the application to that decision.
I could list hundreds of rules here - but you get the idea. My point is that I need to update many of the Client's inner attributes. From the domain perspective, the root is the Client - it's the Client that the user searches for in the GUI. The user cares only about the Client as a whole. Say, an isolated address is meaningless - it only exists if it's part of a Client.
Having said all that, my question is:
Eric Evans says it's OK for the root to return transient references to inner objects, preferably VOs (keyword here: VO) - but any manipulation on the inner objects should be performed by the root itself.
I have hundreds of manipulations that I need to perform on my clients - if I move all of them to the root, the root is going to become huge - it will have at least 10K lines of code!
According to Eric, a VO should be immutable - so if my root returns VOs, the client code won't be allowed to change them. So doing something like this would be unacceptable in a service: client.getExternalInfo().update(getDataFromExternalSystem())
So my question boils down to how on Earth I should update the inner objects without breaking the DDD rules?
I don't see any easy way out.
UPDATE I:
I've just come across Specifications, which seem to be the ideal DDD concept for my problem.
I'm still reading about it but I have decided to post this update anyway.
I have been studying DDD for a while myself and am struggling to master it.
First, you're right: Specification is a fine pattern to use for validation or business rules in general, assuming the rules you are applying fit well with a predicate tree.
Of course, I don't know the specifics of your design, but I do wonder about the model itself. You mention that your Client class has "a hell lot of attributes". Are you sure your model is not somewhat anemic? Could your design benefit from some more analysis, perhaps breaking it out into other Aggregates? Is this a single Bounded Context? Should it be?
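For what it's worth, a Specification as a predicate tree can be sketched roughly like this in Python. Treat it as an illustration under assumptions: the Client fields and the "born in or resident in the US" rule are placeholders I made up, not actual FATCA criteria, and the operator overloads are just one way to compose predicates.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Client:
    # Hypothetical, heavily simplified stand-in for the real aggregate.
    country_of_birth: str
    country_of_residence: str


class Specification:
    """Wraps a predicate so that rules can be combined into a tree."""

    def __init__(self, predicate: Callable[[Client], bool]) -> None:
        self._predicate = predicate

    def is_satisfied_by(self, candidate: Client) -> bool:
        return self._predicate(candidate)

    def __and__(self, other: "Specification") -> "Specification":
        return Specification(
            lambda c: self.is_satisfied_by(c) and other.is_satisfied_by(c))

    def __or__(self, other: "Specification") -> "Specification":
        return Specification(
            lambda c: self.is_satisfied_by(c) or other.is_satisfied_by(c))

    def __invert__(self) -> "Specification":
        return Specification(lambda c: not self.is_satisfied_by(c))


# Placeholder rules, composed into a small predicate tree.
born_in_us = Specification(lambda c: c.country_of_birth == "US")
resident_in_us = Specification(lambda c: c.country_of_residence == "US")
fatca_may_apply = born_in_us | resident_in_us
```

Each rule lives in its own small object outside the aggregate root, which is one way to keep the root from growing to thousands of lines while it still only exposes read-only data to the rules.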
Specifications are definitely the way to go for complex business logic.
One question though - are you modeling the inner entities like addresses and names as ValueObjects? The best rule of thumb I can think of for those is if you can say they're equal, without an ID, they're likely value objects. Does your domain consider names to have a state?
If you're looking at a problem where a few entities take in many types of change AND need an audit trail, you might also want to explore Event Sourcing. Ideally the entity declares its reaction to an event, but you can also have the mutating code held in the event for easy extensibility. There are pros and cons to that approach, of course.
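A very small Python sketch of the first variant (the entity declares its reaction to each event) might look like the following. The event names AddressChanged and OperationDenied, and the `_on_<EventName>` handler convention, are assumptions made for this illustration only.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class AddressChanged:  # hypothetical event
    new_address: str


@dataclass(frozen=True)
class OperationDenied:  # hypothetical event
    reason: str


class Client:
    """State is changed only by applying events; the event list is the audit trail."""

    def __init__(self) -> None:
        self.address: Optional[str] = None
        self.denial_reasons: List[str] = []
        self.history: List[object] = []

    def apply(self, event: object) -> None:
        # Dispatch to the handler named after the event class, then record it.
        handler = getattr(self, "_on_" + type(event).__name__)
        handler(event)
        self.history.append(event)

    def _on_AddressChanged(self, event: AddressChanged) -> None:
        self.address = event.new_address

    def _on_OperationDenied(self, event: OperationDenied) -> None:
        self.denial_reasons.append(event.reason)

    @classmethod
    def replay(cls, events: Iterable[object]) -> "Client":
        # Rebuild the current state from the recorded events.
        client = cls()
        for event in events:
            client.apply(event)
        return client
```

The recorded history doubles as the record of what led the application to a decision, which is the audit-trail need described in the question.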

Erlang ETS tables versus message passing: Optimization concerns?

I'm coming into an existing (game) project whose server component is written entirely in Erlang. At times, it can be excruciating to get a piece of data from this system (I'm interested in how many widgets player 56 has) from the process that owns it. Assuming I can find the process that owns the data, I can pass a message to that process and wait for it to pass a message back, but this does not scale well to multiple machines and it kills response time.
I have been considering replacing many of the tasks that exist in this game with a system where information that is frequently accessed by multiple processes would be stored in a protected ets table. The table's owner would do nothing but receive update messages (the player has just spent five widgets) and update the table accordingly. It would catch all exceptions and simply go on to the next update message. Any process that wanted to know if the player had sufficient widgets to buy a fooble would need only to peek at the table. (Yes, I understand that a message might be in the buffer that reduces the number of widgets, but I have that issue under control.)
I'm afraid that my question is less of a question and more of a request for comments. I'll upvote anything that is both helpful and sufficiently explained or referenced.
What are the likely drawbacks of such an implementation? I'm interested in the details of lock contention that I am likely to see in having one-writer-multiple-readers, what sort of problems I'll have distributing this across multiple machines, and especially: input from people who've done this before.
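To make the proposed design easier to discuss, here is a rough analogy in Python rather than Erlang. It is not ETS: a plain dict stands in for the protected table, a queue stands in for the owner's mailbox, and it assumes CPython, where reading a dict from another thread is safe. The names WidgetTable, send_update, lookup and flush are invented for the sketch.

```python
import queue
import threading


class WidgetTable:
    """One writer thread owns all updates; any thread may read directly."""

    def __init__(self) -> None:
        self._table = {}               # stands in for the protected table
        self._updates = queue.Queue()  # stands in for the owner's mailbox
        self._writer = threading.Thread(target=self._run, daemon=True)
        self._writer.start()

    def _run(self) -> None:
        while True:
            player_id, delta = self._updates.get()
            try:
                self._table[player_id] = self._table.get(player_id, 0) + delta
            except Exception:
                # As in the question: ignore a bad update and carry on.
                pass
            finally:
                self._updates.task_done()

    def send_update(self, player_id, delta) -> None:
        # Asynchronous: returns before the table has been changed.
        self._updates.put((player_id, delta))

    def lookup(self, player_id) -> int:
        # Readers peek at the table without messaging the owner.
        return self._table.get(player_id, 0)

    def flush(self) -> None:
        # Block until every queued update has been processed.
        self._updates.join()
```

The sketch also shows the caveat already acknowledged above: between send_update and the writer processing it, lookup can return a stale count, so a reader cannot safely decide "the player can afford this" and then assume the count is unchanged.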
First of all, the default ETS behaviour is consistent, as you can see in the documentation: Erlang ETS.
It provides atomicity and isolation for single operations, and also for multiple updates/reads if they are done within a single function call (remember that in Erlang a function call is roughly equivalent to a reduction, the unit the Erlang scheduler uses to share time between processes, so an ETS operation spread over several function calls could be split into parts, creating a possible race condition).
If you are interested in a multi-node ETS architecture, you should take a look at Mnesia, which gives you out-of-the-box multi-node concurrency on top of ETS: Mnesia.
(Hint: I'm talking specifically about ram_copies tables and the add_table_copy and change_config functions.)
That being said, I don't see the problem with using a process (possibly backed by an unnamed ETS table).
Let me explain: the main problem with your project is the first, basic assumption.
It's simple: you don't have a single writing process!
Every time a player takes an object, hits another player and so on, a function with side effects is called that updates the game state, so even if you have a single process managing the game state, it must also tell the other players' clients 'hey, remember that object there? Just forget it!'. This is why the main problem with many multiplayer games is lag: when networking is not the main issue, lag is often due to blocking send/receive routines.
From this point of view, using an ETS table directly, using a persistent table, using the process dictionary (BAD!!!) and so on all amount to the same thing, because you have to consider synchronization issues, just as in object-oriented programming languages that use shared memory (Java, anyone?).
In the end, you should consider just ONE main concern developing your application: consistency.
Only after a consistent application has been developed should you concern yourself with performance tuning.
Hope it helps!
Note: I've talked about something like a MMORPG server because I thought you were talking about something similar.
An ETS table would not solve your problems in that regard. Your code (that wants to get or set the player widget count) will always run in a process and the data must be copied there.
Whether that is from a process heap or an ETS table makes little difference, especially when getting the data from a remote node (that said, reading from ETS is often faster because it's well optimized and doesn't perform any work other than getting and setting data). For multiple readers ETS is most likely faster, since a process would handle the requests sequentially.
What would make a difference, however, is whether or not the data is cached on the local node. That's where self-replicating database systems such as Mnesia, Riak or CouchDB come in. Mnesia is in fact implemented using ETS tables.
As for locking, the latest version of Erlang comes with enhancements to ETS which enable multiple readers to simultaneously read from a table plus one writer that writes. The only locked element is the row being written to (thus better concurrent performance than a normal process, if you expect many simultaneous reads for one data point).
Note, however, that all interaction with ETS tables is non-transactional! That means that you cannot rely on writing a value based on a previous read, because the value might have changed in the meantime. Mnesia handles that using transactions. You can still use the dirty_* functions in Mnesia to squeeze near-ETS performance out of most operations, if you know what you're doing.
It sounds like you have a bunch of things that can happen at any time, and you need to aggregate the data in a safe, uniform way. Take a look at the Generic Event behavior. I'd recommend using this to create an event server, and having all these processes share this information via events sent to your server; at that point you can choose to log it or store it somewhere (like an ETS table). As an aside, ETS tables are not good for persistent data like how many "widgets" a player has - consider Mnesia, or an excellent crash-only db like CouchDB. Both of these replicate very well across machines.
You bring up lock contention - you shouldn't have any locks. Each process handles its messages sequentially, in the order they are received. In fact, the entire point of the message passing semantics built into the language is to avoid shared-state concurrency.
To summarize, normally you communicate with messages, from process to process. This is hairy for you, because you need information from processes scattered all over the place, so my recommendation is based on the idea of concentrating all information that is "interesting" outside of the originating processes into a single, real-time source.

How to handle complex availability of information in OOP from a RESTful API

My issue is that I'm dealing with a RESTful API that returns information about objects, and when writing classes to represent them, I'm not sure how best to handle all the possibilities of the status of each variable's availability. From what I can tell, there are 5 possibilities: The information
is available
has not been requested
is currently being requested (asynchronously)
is unavailable
is not applicable
So with these, having an object represent its data with a value or null doesn't cut it. To give a more concrete example, I'm working with an API about the United States Congress, so the problem goes like this:
I request information about a bill, and it contains a stub about the sponsoring legislator.
I eventually need to request all the information about that legislator. Not all the legislators will have all the information. Those in the House of Representatives won't have a senate class (Senators' six-year terms are staggered so a third expire every two years, the House is entirely re-elected every two years). Some won't have a twitter id, just because they don't have one. And, of course, if I have already requested information, I shouldn't try to request it again.
There are a couple of options I see:
I can create a Legislator object and fill it with what information I have, but then I have to have some mechanism of tracking information availability with the getters and setters. This is kind of what I'm doing right now, but it requires a lot of repeated code.
I could create a separate class for abbreviated objects and replace them when I get more with immutable "complete" objects, but then I have to be really careful about replacing all references to them and also go through a bunch of hoops for unavailable, and especially, not applicable information.
So, I'm just wondering what other people's take on this issue is. Are there other (better?) ways of handling this complexity? What are the advantages and drawbacks of different approaches? What should I consider about what I'm trying to do in choosing an approach?
[Note: I'm working in Objective-C, but this isn't necessarily specific to that language.]
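One way to cut down the repeated tracking code in the first option is to wrap each value together with its availability status. The sketch below uses Python rather than Objective-C, purely for illustration; the Availability and Tracked names are invented here, and the Legislator fields are just the examples mentioned in the question.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class Availability(Enum):
    # The five possibilities listed in the question.
    AVAILABLE = auto()
    NOT_REQUESTED = auto()
    REQUESTING = auto()
    UNAVAILABLE = auto()
    NOT_APPLICABLE = auto()


@dataclass(frozen=True)
class Tracked(Generic[T]):
    """A value plus the reason it is (or is not) present."""
    status: Availability = Availability.NOT_REQUESTED
    value: Optional[T] = None

    @classmethod
    def of(cls, value: T) -> "Tracked[T]":
        return cls(Availability.AVAILABLE, value)

    @property
    def should_request(self) -> bool:
        # Only fetch what has never been asked for.
        return self.status is Availability.NOT_REQUESTED


@dataclass
class Legislator:
    name: Tracked[str] = Tracked()
    twitter_id: Tracked[str] = Tracked()
    senate_class: Tracked[int] = Tracked()
```

With this shape, a House member's senate_class can be set once to `Tracked(Availability.NOT_APPLICABLE)` and nothing will try to request it again, while a missing twitter id can be recorded as UNAVAILABLE instead of being indistinguishable from "not yet requested".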
If you want to treat those remote resources as objects on the client side, then do yourself a huge favour and forget about the REST buzzword. You will drive yourself crazy. Just accept that you are doing HTTP RPC and move on as you would with any other RPC project.
However, if you really want to do REST, you need to understand what is meant by the "State Transfer" part of the REST acronym and you need to read about HATEOAS. It is a huge mental shift for building clients, but it does have a bunch of benefits. But maybe you don't need those particular benefits.
What I do know is that if you try to use a "REST API" to retrieve objects over the wire, you are going to come to the conclusion that REST is a load of crap.
It's an interesting question, but I think you're probably overthinking this a bit.
Firstly, I think you're dwelling on the possible states of the information a bit too much; the more basic point is that you either have the information or you don't. WHY you have the information doesn't really matter, except in one case. Let me explain: if the information about a certain bill or legislator or anything else is not applicable, you shouldn't be requesting it / needing it. That "state" is irrelevant. Similarly, if the information is in the process of being requested, then it is simply not yet available; the only state you really care about is whether you have the information or do not yet have it.
If you start worrying about further depths of the request process, you risk getting into a deep, endless cycle of managing state: has the information changed between when I got it and now? All you can know about the information is whether you've been told what it is. This is fundamental to the REST process; you're getting a REPRESENTATION of the underlying data, but make no mistake about it: the representation is NOT the underlying data, any more than a congressman's name is the congressman himself.
Second, don't worry about information availability. If an object has a subobject, when you query the object, query for the subobject. If you get back data, great. If you get back that the data isn't available, that too is a representation of the subobject's data; it's just a different representation than you were hoping for, but it's equally valid. I'd represent that as an object with a null value; the object exists (was instantiated because it belonged to the parent), but you have no valid data about it (the representation returned was empty due to some reason; lack of availability, server down, data changed; whatever).
Finally, the real key here is to remember that a RESTful structure is driven by hypermedia: a request for an object that does not return the full object's data should return a URI for requesting the subobject's data, and so forth. Those structures aren't static, as your object structure seems to be hoping to treat them; they're dynamic, and it's up to the server to determine the representation (i.e., the interrelationship). Attempting to set that in stone with a concrete object representation ahead of time means that you're dealing with the system in a way that REST was never meant to be dealt with.

Service Design (WCF, ASMX, SOA)

Soliciting feedback/thoughts on a pattern or best practice to address a situation that I have seen a few times over the years, yet I haven't found any one solution that addresses it the way I'd like.
Here is the background.
Company has 3 applications supporting 3 separate "lines of business" that are very much related to each other. Two of the applications are literally copy/paste from the original. The applications need to be able to grow at different rates and have slightly different functionality. The main differences in functionality come from the data entry fields. The differences essentially fall into one of the following categories:
A) One instance has a few fields that the other does not.
B) String field has a max length of 200 in one instance, but 50 in another.
C) Lookup/Reference fields have different underlying values (i.e. same table structures, but coming from different databases).
D) A field is defined as a user supplied, free text, value in one instance, but a lookup/reference in another.
The problem is that there are other applications within the company that need to consume data from these three separate applications, but ideally, talk to them in a core/centralized manner (i.e. through a central service rather than 3 separate services). My question is how to handle, in particular, item D above. I am thinking a "lowest common denominator" approach might be the only way. For example:
<SomeFieldName>
<Code></Code> <!-- would store a FK ref value if instance used lookup, otherwise would be empty or nonexistent-->
<Text></Text> <!-- would store the text from the lookup if instance used lookup, would store user supplied text if not-->
</SomeFieldName>
Other thoughts/ideas on this?
TIA!
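As a rough illustration of the "lowest common denominator" shape above, here is a Python sketch that always carries the display text and only carries a code when the source instance uses a lookup. The NormalisedField class and the to_xml helper are hypothetical names; only the SomeFieldName, Code and Text element names come from the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NormalisedField:
    """Common form: text is always set, code only for lookup-backed instances."""
    text: str
    code: Optional[str] = None


def to_xml(name: str, value: NormalisedField) -> str:
    # Build <name><Code>..</Code><Text>..</Text></name>; Code is left
    # empty when the instance stores user-supplied free text.
    root = ET.Element(name)
    ET.SubElement(root, "Code").text = value.code
    ET.SubElement(root, "Text").text = value.text
    return ET.tostring(root, encoding="unicode")


# Lookup-backed instance versus free-text instance, same field name.
lookup_xml = to_xml("SomeFieldName", NormalisedField(text="Blue", code="42"))
free_text_xml = to_xml("SomeFieldName", NormalisedField(text="typed by user"))
```

Consumers that only need something to display can read Text and ignore Code, while those that understand a given instance's lookup values can still use Code when it is present.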
So are the differences strictly from a data model view, or are there functional business / behavioral differences at the application level?
If the latter is the case then I would definitely go down the path you appear to be heading down with SOA. How you implement your SOA just depends on your architecture needs. For the design I would look into various patterns. It's hard to say for sure which one(s) would meet your needs without more information / examples of how the behavioral / functional differences are being used. Off the top of my head, though, from what you have described I would probably start by looking at a Strategy pattern in my initial design.
Definitely prototype this using TDD so that you can determine whether you're heading down the right path.
How about this: extend your LCD approach and put a facade in front of these systems. Devise a normalised form of the data which (if populated with enough data) can be transformed to any of the specific instances. [Heading towards an ESB here.]
Then you have the problem: how does a client know what "enough" is? Some kind of metadata may be needed so that you can present a suitable UI. So extend the services to provide an operation that delivers the metadata.