I'm new to O.O.P and would like advice on best practice.
Say for example I have a Course class which holds course information, and a Location class which holds location details. Classes have corresponding repository classes. Now, each Course HAS A location which I have added Location as a property.
When I am pulling the details of a Course from the database, is it best practice to:
A – Populate the Location object from within the CourseRepository Class meaning SQL would return both course and location details
B – Only populate Course object, returning the Location ID, then use the LocationRepository class to find the location details
I’m leaning more towards B as this is a separation of responsibility, however, the thing that’s getting me is performance. Say I need a List instead which returns a result of 50. Would it be wise to query SQL 50 times to seek location details? Would appreciate your thoughts on this.
Lewis
In part, you're thinking in a wrong conceptual direction. It should be: one location can have many courses, not the reciprocal.
That said, theoretical, a Course domain object should not contain a location as class member, but just a location id. On the other hand, a domain object Location could contain an array of Course objects as class member, if needed. You see the difference?
Now, in your case, indeed pass a Location as argument to a Course object. And, in the Course repository, define a method like fetchCoursesWithLocations() in which you run only one sql query to fetch 50 courses TOGETHER WITH the corresponding location details - based on your criterias - into an array. Then loop through the records array. For each of the record item build a Location object and a Course object (to which you pass the Location object as argument). Then pass each so created Course object to another array holding all resulting Course objects, or to a CourseCollection object (which I recommend). In the end return the Courses array (or the CourseCollection content) from the method.
Now, all is somehow too complex to present in here. But I'll give you here three great articles (a serie) which will make the whole process very clear to you. You'll find out in there how a CourseCollection should see, too. In the articles (from the second one upwards), it is used the term "Mapper", which I'm pretty sure it's the same as your "repository". Actually, there are two abstraction layers for data access in the db: mappers and repositories. Plus the adapters.
Look to the part with the PostMapper and the CommentMapper. They are the parallels to your CourseRepository, respectively your LocationRepository. The same roles have Post and Comment models (domain objects!): as parallels to your Course and Location.
The articles are:
Building a Domain Model - An Introduction to Persistence
Agnosticism
Building a Domain Model - Integrating Data Mappers
Handling Collections of Aggregate Roots - the Repository Pattern
Related
I want to understand what would be the best way to represent this in a RESTful way, taking in consideration that the codebase it's a very large - inherited - legacy project and I have to add a lot of new functionality on top of it.
The API Definition is built with OpenaAPI3.
Let's take in consideration the following example:
/v1/{customer}/types/{id}
But the Types collection also has a database constraint of Unique(customer, code) - customer and code being columns from the Types table.
What I need to implement now is a new endpoint that will retrieve a single entity, based on the customer path param and code path param, without having to use the ID path param.
It's a matter of reducing the number of calls, that's why I don't want to make use of the ID path param also.
One solution would be to use query params:
/v1/{customer}/types?code=123
But this will basicaly return a Singleton List so it's not that trivial and definetley not a best practice.
What would be your take on this? I know I should have the ID in the place I want that entity to be returned, but this some case I want to get resovled without having to do another call to get the ID of the entity so I can call the initial endpoint.
I'm starting to work with dto's (data transfer objects) and I have some doubts about the best way to build the system architecture of the API.
Imagine a domain entity 'A', with relations to 'B', 'C' and 'D'. We have a service 'S' that return a json list with all "A's". It's correct to create an 'ADTO' in that service, fill with "BDTO's", "CDTO's" and "DDTO's"? If then we have another service "S2", and we need to return an specific set of "B's", then we need to create another tree of "B2DTO's" with "C2DTOS's", "D2DTO's"... ? Is this the correct way to do it?
I see that this way, we'll have a huge and complex tree of DTO's, with an specific DTO's for each use case.
EDIT:
I forgot the assemblers part. Is necessary to implement a different assembler for every DTO? for example, for an entity A, we have two DTO's. Can I use the same assembler or is better to have A1Assembler and A2Assembler?
Your DTOs should represent a set of data that you want your client to have. Usually, you should never 'copy' your entities into DTOs because you may have fields that you don't want to share with the world. Let's supposed that you are creating automatically a 'tracking' column with the ID of who entered that data, or say that you have a Customer entity with password fields. You don't want that to be part of your DTOs. That's why you must be EXTRA CAREFUL when using AutoMapper etc.
When you design DTOs think about what your client needs from that endpoint specifically. Sometimes DTOs may look the same and that's ok. Also, your DTOs can be as simple or as complex as needed. One crazy example, lets say that a page where you show an artist, his songs, the voting rate for those songs and some extra data.
If your use case justifies it, you may very well put all of that into a DTO. DTO all they do is carry data.
YES, your services should return DTOS (POCO).
Also, DTO is just naming convention. Don't get caught up in the "dto" suffix. A 'command' coming from a client is a DTO, but in that case you would call it AddNewCustomerCommand for example.
Makes sense?
I think you mistake what your DTO's are. You'll have 2 kind of DTO's Roughly speaking
1) they can be your domain entities, then you can return ADTO, BDTO and CDTO. But those DTO's can be fairly consistent (why would B2DTO be any different from BDTO)
If you look at what your json would look at
{
Id: 1
name: "foobar",
$type: "A",
B: [ {
name: "b-bar",
$type: "B"}]
CIds: [ 2,23, 42]
}
Here you see 2 kind of objects, some (B's) are returned in full in your DTO as subobjects. Others (like C) are turned by Id and can be queried separately. If it's S2 which implements the C query or not you don't care about.
2) When you get to an architecture like CQRS then you do get different DTO's. (projections or commands) but then you would also see this in the naming of the DTO's. Forexample are
AListOnOverviewPageDTO, AUserEditDetailDTO, etc.
Now it makes very much sense to have different DTO's since they are projections representing very different usecases. (and not a full object as is common in DDD)
Update The reason you want different DTO's are twofold. First of all it allows you to optimize each call separately. Maybe the list needs to be faster so (using CQRS) you can put the right indexes on your data so your listDTO will be returned faster. It allows you to reason about usecases more easily. Since each DTO represents 1 usecase/screen (Otherwise you get cases ok in the userlistDTO i need to populate only these 3 fields in this case ..etc.).
Furthermore you need to make sure you API is honest. NEVER EVER return a "age" field with empty data but have some other call return the same user but with another call return the same user with a real age. It makes your backend appear broken. However if i had a call to /users/list and another call to /users/1/detail it would be natural if the detail calls returned more fields about a specific user
I have an api that displays, let's say, cities:
/cities
/city/{id}
The cities endpoint returns an overview of the city (id, city name, city area), whereas the city endpoint returns the same plus some more (population, image, thumbnail...). Now, when modelling this in the client I see different alternatives:
Have a CityOverview class which has a City subclass that adds the extra attributes.
Have a City class that has all the attributes with a CityOverview subclass that hides all the extra attributes (say, in Java, by throwing an UnsupportedOperationException on all the getters for attributes it doesn't have).
Have the above classes with no inheritance relationship.
Have a single City class that allows all the extra attributes to be nulled.
What are the pros and cons of the above approaces and/or any other you can think of?
I would go with option 3 - have 2 classes ,i.e. not an inheritance relationship. Here are the reasons for this decisions -
You will need to de-serialize\serialize JSON using an api such as jackson. For this you will need a field mapped POJO. Since, you have 2 separate classes now, you can map different POJOs to the 2 different api calls. It makes the code cleaner.
Inheritance is not an option because of 2 reasons - 1).At one time any of your API calls will either bring data for parent or child. I.e. the fields of either will always be empty depending on which API call you make. This is an unnecessary waste.
2). From an OOP design point-of-view the data object being returned in cities call is not a parent of the data being returned in the city{id} call. So, we should not have this in the class design as well. They together make up the "City" entity.
I think that if there is not a big difference between the OverviewCity and the City You should only keep a single City class in the BE.
In your /cities api your can pass the full list of your cities.
Displaying a list of cities with a few details on each city (OverviewCity) can be created easily from a City object by the client side.
In the Backend I dont think there is any need to support 2 classes.
I successfully amended the nice CloudBalancing example to include the fact that I may only have a limited number of computers open at any given time (thanx optaplanner team - easy to do). I believe this is referred to as a bounded-space problem. It works dandy.
The processes come in groupwise, say 20 processes in a given order per group. I would like to amend the example to have optaplanner also change the order of these groups (not the processes within one group). I have therefore added a class ProcessGroup in the domain with a member List<Process>, the instances of ProcessGroup being stored in a List<ProcessGroup>. The desired optimisation would shuffle the members of this List, causing the instances of ProcessGroup to be placed at different indices of the List List<ProcessGroup>. The index of ProcessGroup should be ProcessGroup.index.
The documentation states that "if in doubt, the planning entity is the many side of the many-to-one relationsship." This would mean that ProcessGroup is the planning entity, the member index being a planning variable, getting assigned to (hopefully) different integers. After every new assignment of indices, I would have to resort the list List<ProcessGroup in ascending order of ProcessGroup.index. This seems very odd and cumbersome. Any better ideas?
Thank you in advance!
Philip.
The current design has a few disadvantages:
It requires 2 (genuine) entity classes (each with 1 planning variable): probably increases search space (= longer to solve, more difficult to find a good or even feasible solution) + it increases configuration complexity. Don't use multiple genuine entity classes if you can avoid it reasonably.
That Integer variable of GroupProcess need to be all different and somehow sequential. That smelled like a chained planning variable (see docs about chained variables and Vehicle Routing example), in which case the entire problem could be represented as a simple VRP with just 1 variable, but does that really apply here?
Train of thought: there's something off in this model:
ProcessGroup has in Integer variable: What does that Integer represent? Shouldn't that Integer variable be on Process instead? Are you ordering Processes or ProcessGroups? If it should be on Process instead, then both Process's variables can be replaced by a chained variable (like VRP) which will be far more efficient.
ProcessGroup has a list of Processes, but that a problem property: which means it doesn't change during planning. I suspect that's correct for your use case, but do assert it.
If none of the reasoning above applies (which would surprise me) than the original model might be valid nonetheless :)
I am prototyping an idea on the iPhone but I am at the SQLite vs CoreData crossroads. The main reason is that I can't seem to figure out how to do grouping with core data.
Essentially I want to show the most recent item posted grouped by username. It is really easy to do in a SQL statement but I have not been able to make it work in core data. I figure since I am starting a new app, I might as well try to make core data work but this part is a major snag.
I added a predicate to my fetchrequest but that only gave me the single most recently added record and not the most recently added record per user.
The data model is pretty basic at this point. It uses the following fields:
username (string), post (string), created (datetime)
So long story short, are these types of queries possible with CoreData? I imagine that if SQLite is under the hood, there has to be some way to do it.
First of all, don't think of Core Data as another way of doing SQL. SQL is not "under the hood" of Core Data. Core Data deals with objects. Entity descriptions are not tables and entity instances are not records. Programming with Core Data has nothing to do with SQL, it merely uses SQL as one of several possible types of persistent stores. You don't deal with it directly and should never, ever think of Core Data in SQL terms.
That way lies madness.
You need drink a lot of tequila and punch yourself in the head repeatedly until you forget everything you ever knew about SQL. Otherwise, you will just end up with an object graph that is nothing but a big spread sheet.
There are several ways to accomplish what you want in Core Data. Usually you would construct fetch with a compound predicate that would return all post within a certain date range made by a specific user. Fetched results controllers are especially handy for this.
A most straightforward method would be to set up you object graph like:
UserEntity
--Attribute username
--Relationship post <-->> PostEntity
PostEntity
--Attribute creationDate
--Attribute content
-- Relationship user <<--> UserEntity
Then in your UserEntity class have a method like so:
- (NSArray *) mostRecentPost{
NSPredicate *recentPred=[NSPredicate predicateWithFormat:#"creationDate>%#", [NSDate dateWithTimeIntervalSinceNow:-(60*60*24)]];
NSSet *recentSet=[self.post filteredSetUsingPredicate:recentPred];
NSSortDescriptor *dateSort=[[NSSortDescriptor alloc] initWithKey:#"creationDate" ascending:NO];
NSArray *returnArray=[[recentSet allObjects] sortedArrayUsingDescriptors:[NSArray arrayWithObject:dateSort]];
return returnArray;
}
When you want a list of the most recent post of a particular user sorted by date just call:
NSArray *arrayForDisplay=[aUserEntityClassInstance mostRecentPost];
Edit:
...do I just pass each post block of
data (content,creationDate) to the
post entity? Do I also pass the
username to the post entity? How does
the user entity know when to create a
new user?
Let me pseudo code it. You have two classes that define instances of userObj and a postObj. When a new post comes in, you:
Parse inputPost for a user;
Search existing userObj for that name;
if userObj with name does not exist
create new userObj;
set userObj.userName to name;
else
return the existing userObj that matches the name;
Parse inputPost for creation date and content;
Search post of chosen userObj;
if an exiting post does not match content or creation date
create new postObj
set postObj.creationDate to creation date;
set postObj,content to content;
set postObj.user to userObj; // the reciprocal in userObj is created automatically
else // most likely ignore as duplicate
You have separate userObj and postObj because while each post is unique, each user may have many post.
The important concept to grasp is that your dealing with object i.e. encapsulated instance of data AND logic. This isn't just rows and columns in a db. For example, you could write managed object subclasses in which a single instance could decide whether to form a relationship with an instance of another class unless a specific internal state of the object was reached. Records in dbs don't have that sort of logic or autonomy.
The best way to get a handle on using objects graphs for data models is to ignore not only db but Core Data itself. Instead, set out to write a small test app in which you hand code all the data model classes. It doesn't have to be elaborate just a couple of attributes per class and a reference of some sort to the other class. Think about how you would manage parsing the data out to each class, linking the classes and their data together and then getting it back out. Do that by hand once or twice and the nature of object graphs becomes readily apparent.
There are other considerations that might tip your decision in the direction of SQLite versus Core Data with a SQLite store. I found myself nodding in agreement while reading a good blog post on the subject. I've found exactly the same thing (and am consequently moving a high-performance app away from Core Data): "Core Data is the right answer, except when it’s not..."
It's a great technology, but one size definitely does not fit all.
If 'posts' is a NSSet of User, you could get the last post with a predicate:
NSDate *lastDate = [userInstance valueForKeyPath:#"#max.date"];
NSSet *resultsTemp = [setOfPosts filteredSetUsingPredicate:[NSPredicate predicateWithFormat:#"fecha==%#", lastDate] ];
The resultsTemp set will contain an object of type Post which has the newest date.