In ECS (Entity Component System) what is the difference between Component-Manager and a System? - game-engine

I'm trying to understand ECS. So a component is just plain data, and some manager holds these components in a container and loops through all of them to act on this data, to "update" them.
Is this manager what people call a "component-manager", or is it a "system"? Or do they mean the same thing? If not, what do a component-manager and a system each do?

ECS means different things to different people. There are a large number of approaches when it comes to implementation but I personally go by the following rules:
A Component is just plain data, typically a structure or some object with no logic associated with it whatsoever.
An Entity is a collection of components. It is defined by an identifier, typically an integer, that can be used like an index to look up its components.
A System is where all the game logic lives. Each System has an archetype, that is, a specific set of components that it operates on. Systems have an update function which, when invoked, accesses the specific set of components it is interested in (its archetype) for all entities that have that particular collection of components. This update function is triggered externally (by what? see the next paragraph).
Now, here's the bit that addresses your question directly (or at least attempts to). Video games are simulations, and they are typically driven by what's called an update loop (usually synced to a monitor's refresh rate). In ECS architecture, there is typically dedicated code that strings your systems together in a queue and, on each time-step of the update loop, executes those systems in sequence (i.e. calls their update functions). That bit of dedicated code not only manages the system update loop but is also responsible for managing components (stored as lists/arrays that can be indexed by an entity id) and a myriad of other tasks. In many implementations it's referred to as the "Engine". This is what I take to be a "component-manager", but it could mean something else in another ECS approach. Just my two cents. Hope it helped.
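To make those terms concrete, here is a minimal sketch in Python (the names Position, Velocity, MovementSystem and Engine are my own, purely illustrative): components are plain data, an entity is just an id, a system declares its archetype and an update function, and the engine owns the component storage and runs the systems each time-step.

from dataclasses import dataclass

@dataclass
class Position:          # a Component: plain data, no logic
    x: float = 0.0
    y: float = 0.0

@dataclass
class Velocity:          # another Component
    dx: float = 0.0
    dy: float = 0.0

class MovementSystem:
    # The archetype: the set of component types this system operates on.
    archetype = (Position, Velocity)

    def update(self, engine, dt):
        # Operate on every entity that has all components in the archetype.
        for entity, (pos, vel) in engine.entities_with(*self.archetype):
            pos.x += vel.dx * dt
            pos.y += vel.dy * dt

class Engine:
    # The "component-manager"/engine: stores components indexed by entity id
    # and drives the systems in sequence on every time-step.
    def __init__(self, systems):
        self.systems = systems
        self.next_id = 0
        self.components = {}            # component type -> {entity id -> instance}

    def create_entity(self, *components):
        entity = self.next_id            # an Entity is just an id
        self.next_id += 1
        for c in components:
            self.components.setdefault(type(c), {})[entity] = c
        return entity

    def entities_with(self, *types):
        stores = [self.components.get(t, {}) for t in types]
        shared = set(stores[0]).intersection(*stores[1:]) if stores else set()
        for entity in shared:
            yield entity, tuple(store[entity] for store in stores)

    def step(self, dt):
        for system in self.systems:      # the update loop: run systems in order
            system.update(self, dt)

engine = Engine([MovementSystem()])
engine.create_entity(Position(0, 0), Velocity(1, 2))
engine.step(1.0 / 60.0)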

Related

ECS / CES shared and dependent components and cache locality

I have been trying to wrap my head around how ECS works when there are components which are shared or dependent. I've read numerous articles on ECS and can't seem to find a definitive answer to this.
Assume the following scenario:
I have an entity which has a ModelComponent (or MeshComponent), a PositionComponent and a ParticlesComponent (or EmitterComponent).
The ModelRenderSystem needs both the ModelComponent and the PositionComponent.
The ParticleRenderSystem needs ParticlesComponent and the PositionComponent.
In the ModelRenderSystem, for cache efficiency/locality, I would like to run through all the ModelComponents, which are in a compact array, and render them; however, for each model I need to pull the PositionComponent. I haven't even started thinking about how to deal with the textures, shaders, etc. for each model (which will definitely blow the cache).
There is a similar issue with the ParticleRenderSystem: I need both the ParticlesComponent and the PositionComponent, and I want to be able to run through all ParticlesComponents in a cache-efficient/friendly manner.
I considered having ModelComponent and ParticlesComponent each hold their own position, but then they would need to be synced every time the model's position changes (imagine a particle effect on a character). This adds another entity or component which needs to track and sync components or values (and potentially negates any cache efficiency).
How does everyone else handle these kinds of dependency issues?
One way to reduce the complexity could be to invert the flow of data.
Consider that your ModelRenderSystem has a listener callback that allows the entity framework to inform it that an entity has been added to the simulation that contains both a position and model component. During this callback, the system could register a callback on the position component or the system that owns that component allowing the ModelRenderSystem to be informed when that position object changes.
As the position change events come in, the ModelRenderSystem can queue up a list of modifications it must replicate during its update phase; then, during update, it's really just a matter of looking up each modification's model and setting the position to the value in the event.
The benefit is that, per frame, you only ever replicate position changes that actually happened during the frame, and you minimize the lookups needed to replicate the data. While propagating position updates to the various systems of interest may not be as cache-friendly, the gains you observe elsewhere outweigh that.
Lastly, don't forget that systems do not necessarily need to iterate over the components proper. The components in your entity system exist to let you toggle pluggable behavior easily. The systems can always manage a more cache-friendly data structure, and the callback approach above lets you do that and manage data replication very easily with minimal coupling.
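As a rough illustration of that inverted flow (all names are hypothetical; the real thing would live inside your entity framework), a sketch in Python:

class PositionComponent:
    def __init__(self, x=0.0, y=0.0):
        self.x, self.y = x, y
        self._listeners = []

    def on_change(self, callback):
        self._listeners.append(callback)

    def set(self, x, y):
        self.x, self.y = x, y
        for cb in self._listeners:
            cb(self)

class ModelRenderSystem:
    def __init__(self):
        self.render_items = {}      # entity id -> (model, [x, y]) packed for rendering
        self._pending = []          # position change events queued until update()

    def entity_added(self, entity, model, position):
        # Called by the entity framework when an entity with both a model and a
        # position component enters the simulation.
        self.render_items[entity] = (model, [position.x, position.y])
        position.on_change(lambda pos, e=entity: self._pending.append((e, pos.x, pos.y)))

    def update(self):
        # Replicate only the positions that changed since the last frame.
        for entity, x, y in self._pending:
            self.render_items[entity][1][:] = (x, y)
        self._pending.clear()
        # ...then iterate self.render_items to draw, without touching other systems.

system = ModelRenderSystem()
pos = PositionComponent(1.0, 2.0)
system.entity_added(entity=7, model="knight.mesh", position=pos)
pos.set(3.0, 4.0)        # queues a change event
system.update()          # replicates it into the packed render data

The key point is that the render system owns its own packed render data and only touches other components when a change event tells it to.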

In DDD, are repositories the only type of classes which can touch persistence?

In DDD, aggregate roots are persisted via repositories. But are repositories the only classes that can touch persistence in a bounded context?
I am using CQRS alongside DDD. On the query side, things like view counts and upvotes need to be persisted, but I feel it is awkward to model them as aggregate roots. I am limiting DDD aggregate root modeling to the command side. The query side is not allowed to use repositories, but the query side often needs a small amount of persistence capability.
Also, I am using domain events, certain domain events also need to be persisted. I need something called event storage, but I only heard such terms appear in event sourcing (ES) and I am not using ES.
If such persistence classes are indeed needed, what should I call them, and which layer should they belong to?
[Update]
When I read the answers below, I realized my question is a bit ambiguous. By touch, I mainly mean write (but also read).
Thanks.
In the query side, things like view count, upvotes, these things need to be persisted
Not necessarily. CQRS doesn't specify whether the read model should be materialized in its own database, or how the read model is updated.
The simplest CQRS implementation is one where the query side and command side use the same tables. The persistent source for Read Models could also be SQL (materialized) views based on these tables. If you do have a separate database for reads, it can be kept up-to-date by additional Command Handlers or sub-handlers or Event Handlers that operate after the command has been executed.
You can see a minimalist - yet perfectly CQRS compliant - implementation here: https://github.com/gregoryyoung/m-r/tree/master/SimpleCQRS
But are repositories the only classes that can touch persistence in a bounded context?
No. In a CQRS context, Read Model Facades (a.k.a. read-side repositories) can also read from persistence, and your read model update mechanism writes to it.
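As a rough sketch of that split (illustrative names; SQLite is used only for brevity, not as a recommendation): the update handler only writes, the facade only reads, even though here they happen to share one table.

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE post_stats (post_id TEXT PRIMARY KEY, view_count INTEGER, upvotes INTEGER)")

class PostStatsUpdateHandler:          # the read model update mechanism (write only)
    def on_post_viewed(self, post_id):
        cur = conn.execute(
            "UPDATE post_stats SET view_count = view_count + 1 WHERE post_id = ?",
            (post_id,))
        if cur.rowcount == 0:
            conn.execute("INSERT INTO post_stats VALUES (?, 1, 0)", (post_id,))
        conn.commit()

class PostStatsFacade:                 # the Read Model Facade (queries only)
    def stats_for(self, post_id):
        row = conn.execute(
            "SELECT view_count, upvotes FROM post_stats WHERE post_id = ?",
            (post_id,)).fetchone()
        return {"view_count": row[0], "upvotes": row[1]} if row else None

PostStatsUpdateHandler().on_post_viewed("post-42")
print(PostStatsFacade().stats_for("post-42"))   # {'view_count': 1, 'upvotes': 0}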
Also, I am using domain events, certain domain events also need to be persisted. I need something called event storage, but I only heard such terms appear in event sourcing (ES) and I am not using ES.
Event stores are the main storage technology of event-sourced systems. You could use one to store a few domain events on the side in a non-ES application, but it may be overkill and too complex for the task. It depends on whether you need all the guarantees it offers in terms of delivery, consistency, concurrency/versioning, etc. Otherwise, a regular RDBMS or NoSQL store can do the trick.
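If you just need to keep a few domain events around, a plain table is often enough. A hedged sketch (illustrative names; SQLite stands in for whatever store you already have):

import json
import sqlite3
from datetime import datetime, timezone

events_db = sqlite3.connect(":memory:")
events_db.execute("""
    CREATE TABLE domain_events (
        id         INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT NOT NULL,
        occurred   TEXT NOT NULL,
        payload    TEXT NOT NULL
    )""")                               # payload holds the JSON-serialized event data

def persist_event(event_type, payload):
    events_db.execute(
        "INSERT INTO domain_events (event_type, occurred, payload) VALUES (?, ?, ?)",
        (event_type, datetime.now(timezone.utc).isoformat(), json.dumps(payload)))
    events_db.commit()

persist_event("QuestionUpvoted", {"question_id": "q-1", "voter_id": "u-7"})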
First, you need to think about your object model independently of how you will store it in the database. You're designing an object model. Forget about the database for a moment.
You're saying that you don't want view counts or upvotes to be aggregate roots. That means you want to put them in an aggregate with some other objects. One of those objects is the aggregate root.
Without more details about your model it's hard to say exactly what you could do, but the basic approach would be to persist the aggregate root with the corresponding repository. The repository is responsible not only for storing the aggregate root, but the entire aggregate, following the relationships.
Think about the other side, when you are using the repository to retrieve an entity. You get an instance of your aggregate root, but if you follow the relationships, you also have all those other objects. It's perfectly logical that when you save an entity, all those other objects are saved too.
I don't know which technology you're using, but you should write your repository so that it does that.
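As a sketch of what "saving the whole aggregate" might look like (illustrative names; an in-memory dict stands in for the real persistence technology): view count and upvotes live inside the aggregate, and the repository persists and rehydrates everything the root owns.

class QuestionStats:                 # part of the aggregate, not a root
    def __init__(self):
        self.view_count = 0
        self.upvotes = 0

class Question:                      # aggregate root
    def __init__(self, question_id, title):
        self.id = question_id
        self.title = title
        self.stats = QuestionStats()

    def record_view(self):
        self.stats.view_count += 1

class QuestionRepository:
    def __init__(self):
        self._rows = {}              # stand-in for a real database

    def save(self, question):
        # Persisting the root means persisting everything it owns.
        self._rows[question.id] = {
            "title": question.title,
            "view_count": question.stats.view_count,
            "upvotes": question.stats.upvotes,
        }

    def get(self, question_id):
        row = self._rows[question_id]
        q = Question(question_id, row["title"])
        q.stats.view_count = row["view_count"]
        q.stats.upvotes = row["upvotes"]
        return q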
Also, why is the query side not allowed to use repositories? Repositories are not only used to save data; they are also used to retrieve it. How are you retrieving objects without repositories (even if you don't modify them)?

OO: Good method for maintaining consistency between related objects

I have two classes, Server and Application, with a many-to-many relationship: a server can run multiple applications, an application can run across multiple servers. A third class, Host, represents a single application on a single server, including references to the Application and Server objects as well as additional data such as the amount of disk space used by the application on the server. Both Server and Application objects contain a list of all their hosts: hence, Applications know about Hosts and Hosts know about Applications, and Servers know about Hosts and Hosts know about Servers.
The purpose of my project is to work out the schedule for migrating a bunch of applications onto new servers. Originally, each application had a migration-start and migration-end date. Some applications also have start and end dates for virtualisation. Virtualisation occurs if the migration cannot be performed within the application's constraints (never mind what these are); it occurs prior to the migration and frees the application from its constraints. The Application object holds an object called 'Schedule', which includes these four dates, a boolean flag to say whether the application is to be virtualised, and a list of 'months' containing the man-hours required to migrate (or virtualise) the application in each particular month.
We now want to allow servers to undergo virtualisation separately, on a specified date. All the applications (or parts of applications, i.e. hosts) on these servers will be virtualised on this date; they will be migrated along with the rest of the application. We originally decided to have the server class hold its own Schedule object. The virtualisation dates were then set in the server. However, we decided we wanted to keep the server and application schedules consistent - so that, for example, the server schedule's migration-start and end dates should be set to the earliest start and latest end dates, respectively, of all applications running on that server. This meant that every time we updated the Application dates, we had to remember to update all its server dates (via the host object). Or, if we wanted to update the Application's man-hours for a particular month, we had to update the server's man-hours also.
Then we thought about putting a single Schedule object inside each Host object. This solves the consistency problem, but leads to quite a bit of redundancy: since all Host objects belonging to an application will necessarily have the same migration dates (but possibly different virtualisation dates), when you set the migration dates for an app you have to set the same dates for every host. Also, there are a few instances where we need to work out the earliest-start and latest-finish dates for servers AND applications, as above. This would involve either holding this data in each of the application and server objects (effectively giving each its own Schedule, thereby bringing back the consistency problems), or calculating this data on the fly each time it is needed by looping through all the hosts' schedules. The same goes for the man-hours required by an application each month, which is calculated at the application level, fractioned into hours for each host per month, and then recalculated when we need it at the application level again. This is, as you would expect, not efficient in the slightest.
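To be concrete, the "calculate on the fly" option is essentially this (names are illustrative, not our actual code):

from dataclasses import dataclass
from datetime import date

@dataclass
class Schedule:
    migration_start: date
    migration_end: date

@dataclass
class Host:
    schedule: Schedule

def migration_window(hosts):
    # Earliest migration start and latest migration end across the given hosts.
    return (min(h.schedule.migration_start for h in hosts),
            max(h.schedule.migration_end for h in hosts))

hosts = [Host(Schedule(date(2024, 1, 10), date(2024, 2, 1))),
         Host(Schedule(date(2024, 1, 5), date(2024, 1, 20)))]
print(migration_window(hosts))   # earliest start, latest end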
This isn't a straightforward question, but I'm wondering if there are any accepted strategies for dealing with this sort of situation. Sorry in advance for the prolixity of my post; hopefully I've made the situation clear enough.
This gets complex from the third paragraph onwards.
I would use the following design principles:
Keep the Application, Server, and Host objects containing only the minimum required behaviours and state.
For example, the Application object may contain the start date, end date, and the virtualisation start and end dates. Think about whether it needs to contain a list of servers, or an instance of Host.
Then think about a small framework like this:
a) A MigrationManager that performs the complete migration process using a list.
b) A MigrationContext that composes the information for the migration process.
c) An ErrorContext that composes the error and exception handling.
The MigrationManager gets an instance of a Scheduler and schedules the migration.
In this way we can gradually evolve a framework around the core business objects and business logic.
The important things to remember:
Separation of concerns.
Reusability of code: for example, your Application object may not need to be tied up with the whole migration process; those things can be done by another object.
(This answer is based on my high-level understanding and assumptions, which could be wrong, but I think it may give you some direction for building the application to meet the requirements.)
One more suggestion: use a modelling tool such as StarUML or ArgoUML to put your ideas in pictorial form. This will help all of the members get into the question very quickly.
Thanks
I think a fundamental principle of object-oriented programming is that, to the extent possible, every mutable aspect of state should at all times have exactly one well-defined owner (that owner may, in turn, be owned by exactly one other entity, which is in turn owned by one other entity, etc.). Other objects and entities may hold references to that mutable state, but any such references should be thought of in terms of the owner. For example, if a method accepts a reference to a collection and is supposed to populate it, the method would not think in terms of operating on a collection it owns, but rather in terms of operating on a collection owned by someone else, for the benefit of that entity.
Sometimes it is necessary for various objects to have separate copies of what is supposed to be the same mutable state. This situation frequently arises in things like graphical user interfaces, where an object might "own" the rotational angle of an object, but a display control might need to cache a private rendering of that object in its current orientation. Handling such cases is greatly simplified when one object is designated the absolute master and keeps the other objects notified of its state. It becomes greatly complicated if there are multiple objects, none of which has exclusive ownership, but all of which are supposed to nonetheless keep in sync with each other.
If you can work your model so that no piece of state needs to be duplicated, I would highly recommend doing so. It's crucial, though, that your model be capable of representing all scenarios of interest, including the possibility that two things which are supposed to be in the same state, might not be. What's most important is that every aspect of state has a well-defined chain of ownership, so that when an aspect of state is changed, one can say whose state is affected.
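A small sketch of that "one owner, notified copies" arrangement (the names are mine, not from the poster's model): the owner holds the dates, and any cached copy is only ever updated through notifications from the owner.

class ApplicationSchedule:                 # the single owner of the dates
    def __init__(self, start, end):
        self._start, self._end = start, end
        self._observers = []

    def subscribe(self, observer):
        self._observers.append(observer)
        observer.schedule_changed(self._start, self._end)   # push current state

    def set_dates(self, start, end):
        self._start, self._end = start, end
        for observer in self._observers:
            observer.schedule_changed(start, end)

class ServerScheduleView:
    # Keeps a cached copy for display/aggregation; never edits it directly.
    def __init__(self):
        self.start = self.end = None

    def schedule_changed(self, start, end):
        self.start, self.end = start, end

owner = ApplicationSchedule("2024-01-10", "2024-02-01")
view = ServerScheduleView()
owner.subscribe(view)
owner.set_dates("2024-01-05", "2024-02-01")
print(view.start, view.end)   # the cache follows the owner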

Erlang ETS tables versus message passing: Optimization concerns?

I'm coming into an existing (game) project whose server component is written entirely in erlang. At times, it can be excruciating to get a piece of data from this system (I'm interested in how many widgets player 56 has) from the process that owns it. Assuming I can find the process that owns the data, I can pass a message to that process and wait for it to pass a message back, but this does not scale well to multiple machines and it kills response time.
I have been considering replacing many of the tasks that exist in this game with a system where information that is frequently accessed by multiple processes would be stored in a protected ets table. The table's owner would do nothing but receive update messages (the player has just spent five widgets) and update the table accordingly. It would catch all exceptions and simply go on to the next update message. Any process that wanted to know if the player had sufficient widgets to buy a fooble would need only to peek at the table. (Yes, I understand that a message might be in the buffer that reduces the number of widgets, but I have that issue under control.)
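(Not Erlang, but to show the shape of what I mean, here is the pattern sketched in Python: a queue stands in for the owner's mailbox and a plain dict for the protected ETS table; all names are illustrative.)

import queue
import threading

widget_counts = {}                    # the shared, read-mostly "table"
updates = queue.Queue()               # update messages sent to the table owner

def table_owner():
    while True:
        player, delta = updates.get() # e.g. ("player-56", -5) after spending widgets
        try:
            widget_counts[player] = widget_counts.get(player, 0) + delta
        except Exception:
            pass                      # swallow bad updates and go on to the next one
        updates.task_done()

threading.Thread(target=table_owner, daemon=True).start()

updates.put(("player-56", 20))        # the owner applies updates in order
updates.put(("player-56", -5))
updates.join()                        # wait until pending updates are applied

# Readers just peek at the current state instead of messaging the owner.
print(widget_counts.get("player-56", 0))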
I'm afraid that my question is less of a question and more of a request for comments. I'll upvote anything that is both helpful and sufficiently explained or referenced.
What are the likely drawbacks of such an implementation? I'm interested in the details of lock contention that I am likely to see in having one-writer-multiple-readers, what sort of problems I'll have distributing this across multiple machines, and especially: input from people who've done this before.
First of all, default ETS behaviour is consistent, as you can see in the documentation: Erlang ETS.
It provides atomicity and isolation, even for multiple updates/reads, provided they are done in the same function (remember that in Erlang a function call is roughly equivalent to a reduction, the unit of measure the Erlang scheduler uses to share time between processes, so an ETS operation spread over multiple functions could be split into several parts, creating a possible race condition).
If you are interested in a multiple-node ETS architecture, maybe you should take a look at Mnesia if you want out-of-the-box multiple-node concurrency with ETS: Mnesia.
(hint: I'm talking specifically of ram_copies tables, add_table_copy and change_config methods).
That being said, I don't understand the problem with a process (possibly backed up by an unnamed ETS table).
Let me explain better: the main problem with your project is the first, basic assumption.
It's simple: you don't have a single writing process!
Every time a player takes an object, hits another player, and so on, it calls a function with side effects that updates the game state. So even if you have a single process managing the game state, it must also tell the other player clients 'hey, you remember that object there? Just forget it!'. This is why the main problem with many multiplayer games is lag: when networking is not the main issue, lag is often due to blocking send/receive routines.
From this point of view, directly using an ETS table, a persistent table, a process dictionary (BAD!!!) and so on is all the same thing, because you still have to consider synchronization issues, just like in object-oriented programming languages using shared memory (Java, anyone?).
In the end, you should consider just ONE main concern developing your application: consistency.
Only after a consistent application has been developed should you concern yourself with performance tuning.
Hope it helps!
Note: I've talked about something like a MMORPG server because I thought you were talking about something similar.
An ETS table would not solve your problems in that regard. Your code (that wants to get or set the player widget count) will always run in a process and the data must be copied there.
Whether that is from a process heap or an ETS table makes little difference (that said, reading from ETS is often faster because it's well optimized and does no work other than getting and setting data), especially when getting the data from a remote node. For multiple readers, ETS is most likely faster, since a process would handle the requests sequentially.
What would make a difference, however, is whether the data is cached on the local node or not. That's where self-replicating database systems, such as Mnesia, Riak or CouchDB, come in. Mnesia is in fact implemented using ETS tables.
As for locking, the latest version of Erlang comes with enhancements to ETS which enable multiple readers to simultaneously read from a table plus one writer that writes. The only locked element is the row being written to (thus better concurrent performance than a normal process, if you expect many simultaneous reads for one data point).
Note, however, that all interaction with ETS tables is non-transactional! That means you cannot rely on writing a value based on a previous read, because the value might have changed in the meantime. Mnesia handles that using transactions. You can still use the dirty_* functions in Mnesia to squeeze near-ETS performance out of most operations, if you know what you're doing.
It sounds like you have a bunch of things that can happen at any time, and you need to aggregate the data in a safe, uniform way. Take a look at the Generic Event behaviour. I'd recommend using this to create an event server and having all these processes share their information via events to your server; at that point you can choose to log it or store it somewhere (like an ETS table). As an aside, ETS tables are not good for persistent data like how many "widgets" a player has - consider Mnesia, or an excellent crash-only DB like CouchDB. Both of these replicate very well across machines.
You bring up lock contention - you shouldn't have any locks. Messages are processed in a synchronous order as they are received by each process. In fact, the entire point of the message passing semantics built into the language is to avoid shared-state concurrency.
To summarize: normally you communicate with messages, from process to process. This is hairy for you because you need information from processes scattered all over the place, so my recommendation is based on the idea of concentrating all the information that is "interesting" outside of the originating processes into a single, real-time source.

Service Design (WCF, ASMX, SOA)

Soliciting feedback/thoughts on a pattern or best practice to address a situation that I have seen a few times over the years, yet I haven't found any one solution that addresses it the way I'd like.
Here is the background.
Company has 3 applications supporting 3 separate "lines of business" that are very much related to each other. Two of the applications are literally copy/paste from the original. The applications need to be able to grow at different rates and have slightly different functionality. The main differences in functionality come from the data entry fields. The differences essentially fall into one of the following categories:
A. One instance has a few fields that the other does not.
B. A string field has a max length of 200 in one instance, but 50 in another.
C. Lookup/reference fields have different underlying values (i.e. same table structures, but coming from different databases).
D. A field is defined as a user-supplied, free-text value in one instance, but a lookup/reference in another.
The problem is that there are other applications within the company that need to consume data from these three separate applications, but ideally, talk to them in a core/centralized manner (i.e. through a central service rather than 3 separate services). My question is how to handle, in particular, item D above. I am thinking a "lowest common denominator" approach might be the only way. For example:
<SomeFieldName>
<Code></Code> <!-- would store a FK ref value if instance used lookup, otherwise would be empty or nonexistent-->
<Text></Text> <!-- would store the text from the lookup if instance used lookup, would store user supplied text if not-->
</SomeFieldName>
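On the consuming side, the rule would simply be "prefer the code, fall back to the text". A quick sketch (Python, with sample values, purely for illustration):

import xml.etree.ElementTree as ET

xml = """
<SomeFieldName>
    <Code>42</Code>
    <Text>Approved</Text>
</SomeFieldName>
"""

field = ET.fromstring(xml)
code = (field.findtext("Code") or "").strip()
text = (field.findtext("Text") or "").strip()

if code:
    print("lookup value, FK =", code, "display =", text)
else:
    print("free text value =", text)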
Other thoughts/ideas on this?
TIA!
So, are the differences strictly from a data-model point of view, or are there functional business/behavioral differences at the application level?
If the latter is the case, then I would definitely go down the path you appear to be heading down with SOA. How you implement your SOA just depends upon your architecture needs. For the design, I would look at various patterns; it's hard to say for sure which one(s) would meet the needs without more information/examples on how the behavioral/functional differences are being used. Off the top of my head, though, with what you have described, I would probably start off by looking at a Strategy pattern in my initial design.
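As a very rough sketch of what a Strategy-based design might look like here (all names and rules are invented for illustration, not taken from your applications): each line of business plugs in its own rule for how a given field behaves.

class FreeTextField:
    def __init__(self, max_length):
        self.max_length = max_length

    def normalize(self, raw):
        if len(raw) > self.max_length:
            raise ValueError(f"text longer than {self.max_length} characters")
        return {"Code": None, "Text": raw}

class LookupField:
    def __init__(self, allowed):
        self.allowed = allowed           # e.g. loaded from that instance's database

    def normalize(self, raw):
        if raw not in self.allowed:
            raise ValueError(f"unknown lookup value: {raw}")
        return {"Code": self.allowed[raw], "Text": raw}

# Each application/line of business wires up its own strategy per field.
line_of_business_a = {"SomeFieldName": FreeTextField(max_length=200)}
line_of_business_b = {"SomeFieldName": LookupField({"Approved": 1, "Rejected": 2})}

print(line_of_business_a["SomeFieldName"].normalize("anything goes"))
print(line_of_business_b["SomeFieldName"].normalize("Approved"))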
Definitely prototype this using TDD so that you can determine if you're heading down the right path.
How about: extend your LCD approach and put a facade in front of these systems. Devise a normalised form of the data which (if populated with enough data) can be transformed to any of the specific instances. [Heading towards an ESB here.]
Then you have the problem: how does a client know what "enough" is? Some kind of metadata may be needed so that you can present a suitable UI. So extend the services to provide an operation to deliver the metadata.