Doctrine2: fill up related entities after loading main - lazy-loading

There are HotelComment and CommentPhoto entities (1:n): a user can attach photos to their own comment. I load a slice of comments with one query and want to load the photos for those comments with a second query (using WHERE IN).
$comments = $commentsRepo->findByHotel($hotel);
$comments->loadPhotos(); // of course, $comments is just a plain array at this point
This loading is needed on demand, not in a PostLoad event.
So the question is: how can I associate the loaded photos with the HotelComment objects? With ReflectionProperty (setAccessible() + setValue())? Is there a simpler solution? I'm also afraid that the UoW will detect the HotelComment entities as modified and send updates to the database.

If you want to hydrate the related objects this one time only, and not every time the object is loaded, you need to use DQL:
$em->createQuery("SELECT comments, photos FROM HotelComment comments JOIN comments.photos photos");
You can put this in a method on the repository.
This will issue a single SELECT statement, with an INNER JOIN to the comment photos table.

You have to configure your relation as "LAZY". See doctrine documentation:
ManyToOne
ManyToMany
OneToOne
Then you'll be able to load it lazily with $comments->loadPhotos(), at least according to the documentation.
UPDATE: I think you don't have to do anything special to keep your entities from being flushed to the DB. When you query your entities with DQL they are in the managed state, so attaching them to another managed entity's collection does not change their state, and they are not flushed unless you have modified them.
However, that doesn't help at all, because associations are fetched before first use, so adding an entity to the collection with the following code will result in an implicit database query:
$comment->addPhoto($photo);
//in Comment class
function addPhoto(Photo $photo){
//var_dump(count($this->photos)); //if you have any - they are already here
$this->photos->add($photo);
}
Maybe declaring your collection as public (or the ReflectionProperty trick) would help fool Doctrine, but that's a dirty hack, so I haven't even tried it.
Detaching the parent entity doesn't help either. I've run out of ideas for now.
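The two-query pattern the question asks about (load the parents, then load all children with a single WHERE IN and group them by parent id) can be sketched independently of Doctrine. The following is a minimal Python illustration using sqlite3, not Doctrine code; the table names, column names, and the load_comments_with_photos helper are all hypothetical.

```python
import sqlite3
from collections import defaultdict


def load_comments_with_photos(conn, hotel_id):
    """Load comments with one query, then their photos with a second (WHERE IN)."""
    comments = [
        {"id": cid, "text": text, "photos": []}
        for cid, text in conn.execute(
            "SELECT id, text FROM hotel_comment WHERE hotel_id = ? ORDER BY id",
            (hotel_id,),
        )
    ]
    if not comments:
        return comments  # avoid building an empty IN () clause

    ids = [c["id"] for c in comments]
    placeholders = ",".join("?" * len(ids))  # "?,?,?" for three ids
    photos_by_comment = defaultdict(list)
    for comment_id, path in conn.execute(
        "SELECT comment_id, path FROM comment_photo "
        f"WHERE comment_id IN ({placeholders}) ORDER BY id",
        ids,
    ):
        photos_by_comment[comment_id].append(path)

    # Attach each group of photos to its parent comment in memory.
    for c in comments:
        c["photos"] = photos_by_comment.get(c["id"], [])
    return comments


# Hypothetical schema and sample data for the sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hotel_comment (id INTEGER PRIMARY KEY, hotel_id INTEGER, text TEXT);
    CREATE TABLE comment_photo (id INTEGER PRIMARY KEY, comment_id INTEGER, path TEXT);
    INSERT INTO hotel_comment VALUES (1, 10, 'Great view'), (2, 10, 'Noisy'), (3, 11, 'Other hotel');
    INSERT INTO comment_photo VALUES (1, 1, 'a.jpg'), (2, 1, 'b.jpg'), (3, 3, 'c.jpg');
""")
result = load_comments_with_photos(conn, 10)
```

However many comments are in the slice, this issues exactly two SELECT statements; comments without photos simply end up with an empty list.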

Optimising DRF Serialization

I am facing a problem where I have to optimise the serialization of the ORM object. I have an object say Foo for which I have a huge serializer. I have mentioned a lot of fields, like
class FooSerializer(ModelSerializer):
    bar = serializers.StringRelatedField()  # source="bar" is redundant here and DRF rejects it
    apple = serializers.StringRelatedField(source="bar.food")
    cat = serializers.StringRelatedField(source="bar.animals.pet")
    ball = serializers.StringRelatedField(source="bar.toy")
    # a lot of other complex fields related with Foo
    # direct-indirect, 1-1 or 1-M relations
    class Meta:
        model = Foo
        fields = ['bar', 'apple', 'cat', 'ball', ....]
Now, this is causing the serialisation to take a lot of time. I added logging and saw many SQL queries being executed. A lot of these queries are repeated. As I understand the documentation, even though a Django QuerySet is evaluated lazily, serialization in DRF issues queries for each field as it is populated. Please also explain how serializer fields are populated at a lower level, as that will help me more.
What I want to achieve is the minimum possible number of queries. In the example above, to get bar.food and bar.toy I want a single query that fetches the bar object, through which I can then access food and toy.
One possible solution I can think of is evaluating all related objects up front and passing them in the context. That is, evaluate the bar object and send it as context; my apple field would then be populated as self.context['bar'].food in a SerializerMethodField. Can you suggest a better way? Maybe some kind of batch processing?
Assume:
The serialised data is hot and we cannot cache it.
Edit:
Current SQL queries being done are in double digits for each serialisation.
Edit (Query as requested by Daneil)
SELECT `app_foo`.`id`, `app_foo`.`field_1`, (many app_foo fields),
`app_foo`.`created_at`, `app_foo`.`updated_at` FROM `app_foo` INNER JOIN
`app_bar` ON `app_foo`.`id` = `app_bar`.`id` WHERE `app_foo`.`id` = 12; args(12,)
Try using select_related and prefetch_related.
The result cache of the primary QuerySet and all specified related objects will then be fully loaded into memory. This changes the typical behavior of QuerySets, which normally try to avoid loading all objects into memory before they are needed, even after a query has been executed in the database.
More detail here
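To see why this reduces the query count, here is a minimal sketch in plain Python with sqlite3 rather than Django itself. It contrasts one lookup per object (what happens when each serialized Foo follows its bar relation lazily) with a single JOIN, which is roughly the kind of query select_related produces for a foreign key. The schema and function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bar (id INTEGER PRIMARY KEY, food TEXT, toy TEXT);
    CREATE TABLE foo (id INTEGER PRIMARY KEY, bar_id INTEGER REFERENCES bar(id));
    INSERT INTO bar VALUES (1, 'apple', 'ball'), (2, 'pear', 'kite');
    INSERT INTO foo VALUES (11, 1), (12, 2), (13, 1);
""")

queries = []
conn.set_trace_callback(queries.append)  # record every statement sent to SQLite


def serialize_lazy():
    """One query for the foo rows, then one more per row to follow foo.bar."""
    out = []
    for foo_id, bar_id in conn.execute("SELECT id, bar_id FROM foo ORDER BY id").fetchall():
        food, toy = conn.execute(
            "SELECT food, toy FROM bar WHERE id = ?", (bar_id,)
        ).fetchone()
        out.append({"id": foo_id, "apple": food, "ball": toy})
    return out


def serialize_joined():
    """A single JOIN that brings the related bar columns along with each foo."""
    rows = conn.execute(
        "SELECT foo.id, bar.food, bar.toy FROM foo "
        "JOIN bar ON bar.id = foo.bar_id ORDER BY foo.id"
    ).fetchall()
    return [{"id": i, "apple": food, "ball": toy} for i, food, toy in rows]


lazy = serialize_lazy()
lazy_count = len(queries)
queries.clear()
joined = serialize_joined()
joined_count = len(queries)
```

Both functions return the same data, but the lazy version's query count grows with the number of rows. In Django, select_related covers foreign-key and one-to-one relations with a JOIN, while prefetch_related runs one extra query per relation and joins the results in Python, which suits many-to-many and reverse relations.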

RavenDB Consistency - WaitForIndexesAfterSaveChanges() / WaitForNonStaleResultsAsOfNow

I am using RavenDB 3.5.
I know that querying entities is not ACID, but loading by ID is.
Apparently writing to the DB is also ACID.
So far so good.
Now a question:
I've found some code:
session.Advanced.WaitForIndexesAfterSaveChanges();
entity = session.Load<T>(id);
session.Delete(entity);
session.SaveChanges();
// Func<T, T> command
command?.Invoke(entity);
What would be the purpose of calling WaitForIndexesAfterSaveChanges() here?
Is it because of the command being executed?
Or is it rather because dependent/consuming queries are supposed to see those changes immediately?
If that is the case, I could remove WaitForIndexesAfterSaveChanges() from this code block and just add WaitForNonStaleResultsAsOfNow() to the queries, couldn't I?
When would I use WaitForIndexesAfterSaveChanges() in the first place if my critical queries are already flagged with WaitForNonStaleResultsAsOfNow()?
The most likely reason for this behavior is wanting to wait, in this operation, for the indexes to complete.
A good example why you want to do that is when you create a new item, and the next operation is going to show a list of items. You can use WaitForIndexesAfterSaveChanges to wait, during the save, for the indexes to update.
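The difference between waiting at write time and waiting at query time can be illustrated with a toy model. The sketch below is plain Python and does not use or imitate the RavenDB client API; ToyStore, its methods, and the flag names are hypothetical stand-ins for "wait for indexes after saving" and "wait for non-stale results when querying".

```python
class ToyStore:
    """Toy document store: loads by id are immediate, the query index lags behind."""

    def __init__(self):
        self.docs = {}      # id -> document, always current
        self.index = set()  # ids visible to queries, may be stale
        self.pending = []   # index updates that have not been applied yet

    def save(self, doc_id, doc, wait_for_indexes=False):
        self.docs[doc_id] = doc
        self.pending.append(doc_id)
        if wait_for_indexes:
            self.run_indexing()  # the writer pays for the wait

    def run_indexing(self):
        """Apply all pending index updates (stands in for background indexing)."""
        self.index.update(self.pending)
        self.pending.clear()

    def query_ids(self, wait_for_non_stale=False):
        if wait_for_non_stale:
            self.run_indexing()  # the reader pays for the wait
        return sorted(self.index)


store = ToyStore()
store.save("items/1", {"name": "first"})
stale = store.query_ids()                             # index has not caught up
caught_up = store.query_ids(wait_for_non_stale=True)  # reader waits instead

store.save("items/2", {"name": "second"}, wait_for_indexes=True)
fresh = store.query_ids()                             # no waiting needed on read
```

In this model, waiting on the write means every later query sees the change without having to opt in, while waiting on the query puts the cost only on the readers that ask for it.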

Retrieving records and manipulating them as Ruby objects

New to Sequel and SQL in general, so bear with me. I'm using Sequel's many_through_many plugin and I retrieve resources that are indirectly associated with particular tasks through groups, via a groups_tasks join table and a groups_resources join table. Then when I query task.resources on a Task dataset I get resource objects in Ruby, like so:
>>[#<Resource @values={:id=>2, :group_id=>nil, :display_name=>"some_name"}>, #<Resource @values={:id=>3, :group_id=>nil, :display_name=>"some_other_name"}>]
Now, I want to be able to add a new instance variable, schedule, to these resource objects and do work on it in Ruby. However, every time I query task.resources for each task, Sequel brings the resources into Ruby as different Resource objects each time (which makes sense), despite their being the same records in the database:
>>
"T3"
#<Resource:0x007fd4ca0c6fd8>
#<Resource:0x007fd4ca0c6920>
#<Resource:0x007fd4ca0c60d8>
#<Resource:0x007fd4ca0c57a0>
"T1"
#<Resource:0x007fd4ca0a4c08>
#<Resource:0x007fd4ca097f58>
#<Resource:0x007fd4ca097b48>
"T2"
#<Resource:0x007fd4ca085ba0>
#<Resource:0x007fd4ca0850d8>
I had wanted to just put a setter in class Resource and do resource.schedule = Schedule.new, but since they're all different objects, each resource is going to have a ton of different schedules. What's the most straightforward way to manipulate these resource objects client side, but maintain their task associations that I query from the server?
If I am understanding your question correctly, you want to retrieve Resource objects and then manipulate some attribute named schedule. I am not very familiar with Sequel, but looking over the docs it seems to work similarly to ActiveRecord.
Set up your instance variable (I imagine using something like attr_accessor :schedule on the Resource class).
Store the records in a variable. That way you will be working with the same instances each time, rather than with the new instances Sequel returns on every query.
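One way to "store the records" so that every task sees the same object is a small identity map keyed by primary key. This is a language-neutral idea, sketched here in Python rather than Ruby or Sequel; the Resource and ResourceRegistry classes and the sample rows are hypothetical.

```python
class Resource:
    def __init__(self, id, display_name):
        self.id = id
        self.display_name = display_name
        self.schedule = None  # client-side state, never written to the database


class ResourceRegistry:
    """Identity map: at most one Resource object per primary key."""

    def __init__(self):
        self._by_id = {}

    def canonical(self, row):
        """Return the shared Resource for this row, creating it on first sight."""
        resource = self._by_id.get(row["id"])
        if resource is None:
            resource = Resource(row["id"], row["display_name"])
            self._by_id[row["id"]] = resource
        return resource


# Rows as they might come back from separate per-task queries.
task_rows = {
    "T1": [{"id": 2, "display_name": "some_name"},
           {"id": 3, "display_name": "some_other_name"}],
    "T2": [{"id": 2, "display_name": "some_name"}],
}

registry = ResourceRegistry()
resources_for = {
    task: [registry.canonical(row) for row in rows]
    for task, rows in task_rows.items()
}

# Setting the schedule through one task is visible through the other.
resources_for["T1"][0].schedule = ["mon 9:00"]
```

The task-to-resource associations still come from the queries; only the objects themselves are deduplicated on the client side.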

How to model "hasMany" in NoSQL for near real time query?

I've been playing with Couchbase and I'm trying to find best ways to model relationships.
belongsTo: this is fairly easy. When I have Posts and Comments, I can have the following structure in comments.
Comment:
id: 1
parent: this is where I store an id of post
hasMany: This seemed pretty easy at first. Assuming I have Posts and Users and users can like a Post, I had the following structure.
Posts:
id: 1
likedBy: [
'user-id-1',
'user-id-2'
]
This works if I have maybe a thousand likes, but as the number of likes increases it gets slower and slower, and I have to lock the document.
My first solution was to use a view, but a view is not real time, even though it is adequate for most queries. There is always a delay for indexing.
Then I thought about using a relational database just to store the relationships, and I think this might be a pretty good choice, but I would like to know if there is something I'm missing.
For the comments I might use something like this, but instead of "SomeEventType" and date time stamp like it has in the blog post, I would do the ID of the post itself. This way you get the counter object for that post, which gives you the upper bound of the array of comments. Then you can iterate through that list, use pagination or do a bulk get for all of them. Since this would be interacting purely with the Data Service, it would meet your consistency and real time needs.
For the number of likes, you could use a counter object. For recording which users like a post or comment, you could store that in a separate object, or maybe have an index object per user like the one in your question? Let me think more about this one.
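The counter-plus-keyed-documents idea can be sketched with a dictionary standing in for the key-value store. This is a minimal Python illustration, not Couchbase client code; the ToyKV class, the key naming scheme, and storing one small document per like are all assumptions made for the example.

```python
class ToyKV:
    """Dict-backed stand-in for a key-value store; not a real Couchbase client."""

    def __init__(self):
        self.data = {}

    def incr(self, key):
        self.data[key] = self.data.get(key, 0) + 1
        return self.data[key]

    def get_multi(self, keys):
        return [self.data[k] for k in keys if k in self.data]


def add_comment(kv, post_id, text):
    """Bump the post's counter and store the comment under the new number."""
    n = kv.incr(f"post::{post_id}::comment_count")
    kv.data[f"post::{post_id}::comment::{n}"] = {"n": n, "text": text}
    return n


def list_comments(kv, post_id, page=1, per_page=10):
    """The counter is the upper bound, so the page's keys can be computed directly."""
    upper = kv.data.get(f"post::{post_id}::comment_count", 0)
    first = (page - 1) * per_page + 1
    last = min(page * per_page, upper)
    keys = [f"post::{post_id}::comment::{i}" for i in range(first, last + 1)]
    return kv.get_multi(keys)


def like(kv, post_id, user_id):
    """One tiny document per (post, user) instead of one ever-growing array."""
    key = f"post::{post_id}::like::{user_id}"
    if key not in kv.data:  # a real store would need an atomic insert here
        kv.data[key] = True
        kv.incr(f"post::{post_id}::like_count")
    return kv.data[f"post::{post_id}::like_count"]


kv = ToyKV()
for text in ["first", "second", "third"]:
    add_comment(kv, 1, text)
page_two = list_comments(kv, 1, page=2, per_page=2)
like(kv, 1, "user-id-1")
like_count = like(kv, 1, "user-id-1")  # a repeated like is ignored
```

Because every read and write is a direct key lookup, nothing here depends on a view or index catching up, and no single document grows with the number of likes.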

Grouping a Core Data data result?

I am prototyping an idea on the iPhone but I am at the SQLite vs CoreData crossroads. The main reason is that I can't seem to figure out how to do grouping with core data.
Essentially I want to show the most recent item posted grouped by username. It is really easy to do in a SQL statement but I have not been able to make it work in core data. I figure since I am starting a new app, I might as well try to make core data work but this part is a major snag.
I added a predicate to my fetchrequest but that only gave me the single most recently added record and not the most recently added record per user.
The data model is pretty basic at this point. It uses the following fields:
username (string), post (string), created (datetime)
So long story short, are these types of queries possible with CoreData? I imagine that if SQLite is under the hood, there has to be some way to do it.
First of all, don't think of Core Data as another way of doing SQL. SQL is not "under the hood" of Core Data. Core Data deals with objects. Entity descriptions are not tables and entity instances are not records. Programming with Core Data has nothing to do with SQL, it merely uses SQL as one of several possible types of persistent stores. You don't deal with it directly and should never, ever think of Core Data in SQL terms.
That way lies madness.
You need to drink a lot of tequila and punch yourself in the head repeatedly until you forget everything you ever knew about SQL. Otherwise, you will just end up with an object graph that is nothing but a big spreadsheet.
There are several ways to accomplish what you want in Core Data. Usually you would construct a fetch with a compound predicate that returns all posts within a certain date range made by a specific user. Fetched results controllers are especially handy for this.
The most straightforward method would be to set up your object graph like:
UserEntity
--Attribute username
--Relationship post <-->> PostEntity
PostEntity
--Attribute creationDate
--Attribute content
-- Relationship user <<--> UserEntity
Then in your UserEntity class have a method like so:
- (NSArray *) mostRecentPost{
    NSPredicate *recentPred=[NSPredicate predicateWithFormat:@"creationDate>%@", [NSDate dateWithTimeIntervalSinceNow:-(60*60*24)]];
    NSSet *recentSet=[self.post filteredSetUsingPredicate:recentPred];
    NSSortDescriptor *dateSort=[[NSSortDescriptor alloc] initWithKey:@"creationDate" ascending:NO];
    NSArray *returnArray=[[recentSet allObjects] sortedArrayUsingDescriptors:[NSArray arrayWithObject:dateSort]];
    return returnArray;
}
When you want a list of the most recent post of a particular user sorted by date just call:
NSArray *arrayForDisplay=[aUserEntityClassInstance mostRecentPost];
Edit:
...do I just pass each post block of
data (content,creationDate) to the
post entity? Do I also pass the
username to the post entity? How does
the user entity know when to create a
new user?
Let me pseudo code it. You have two classes that define instances of userObj and a postObj. When a new post comes in, you:
Parse inputPost for a user;
Search existing userObj for that name;
if userObj with name does not exist
    create new userObj;
    set userObj.userName to name;
else
    return the existing userObj that matches the name;
Parse inputPost for creation date and content;
Search posts of chosen userObj;
if no existing post matches the content and creation date
    create new postObj;
    set postObj.creationDate to creation date;
    set postObj.content to content;
    set postObj.user to userObj; // the reciprocal in userObj is created automatically
else // most likely ignore as duplicate
You have separate userObj and postObj because while each post is unique, each user may have many posts.
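The pseudocode above can be made concrete with hand-written classes, in the spirit of coding the object graph by hand before involving Core Data. This is a minimal Python sketch rather than Objective-C; the class names, the ingest function, and the rule that a duplicate means "same content and same creation date" are assumptions for illustration.

```python
from datetime import datetime


class User:
    def __init__(self, username):
        self.username = username
        self.posts = []  # to-many side of the relationship


class Post:
    def __init__(self, content, created, user):
        self.content = content
        self.created = created
        self.user = user  # to-one side of the relationship


def ingest(users, username, content, created):
    """Find or create the user, then attach the post unless it is a duplicate."""
    user = users.get(username)
    if user is None:
        user = users[username] = User(username)

    for existing in user.posts:
        if existing.content == content and existing.created == created:
            return existing  # most likely a duplicate, so ignore it

    post = Post(content, created, user)
    user.posts.append(post)  # maintain the reciprocal link by hand
    return post


users = {}
first = ingest(users, "alice", "hello", datetime(2010, 1, 1, 9, 0))
duplicate = ingest(users, "alice", "hello", datetime(2010, 1, 1, 9, 0))
second = ingest(users, "alice", "again", datetime(2010, 1, 2, 9, 0))
other = ingest(users, "bob", "hi", datetime(2010, 1, 1, 10, 0))
```

Core Data maintains the inverse relationship automatically; in this hand-rolled version the append to user.posts does that job.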
The important concept to grasp is that you're dealing with objects, i.e. encapsulated instances of data AND logic. This isn't just rows and columns in a db. For example, you could write managed object subclasses in which a single instance could refuse to form a relationship with an instance of another class unless a specific internal state of the object was reached. Records in dbs don't have that sort of logic or autonomy.
The best way to get a handle on using object graphs for data models is to ignore not only the db but Core Data itself. Instead, set out to write a small test app in which you hand code all the data model classes. It doesn't have to be elaborate, just a couple of attributes per class and a reference of some sort to the other class. Think about how you would manage parsing the data out to each class, linking the classes and their data together and then getting it back out. Do that by hand once or twice and the nature of object graphs becomes readily apparent.
There are other considerations that might tip your decision in the direction of SQLite versus Core Data with a SQLite store. I found myself nodding in agreement while reading a good blog post on the subject. I've found exactly the same thing (and am consequently moving a high-performance app away from Core Data): "Core Data is the right answer, except when it’s not..."
It's a great technology, but one size definitely does not fit all.
If 'posts' is an NSSet relationship on User, you could get the last post with a predicate:
NSDate *lastDate = [userInstance valueForKeyPath:@"posts.@max.date"];
NSSet *resultsTemp = [setOfPosts filteredSetUsingPredicate:[NSPredicate predicateWithFormat:@"date == %@", lastDate]];
The resultsTemp set will contain the Post object that has the newest date.
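The grouping the question asks for, the most recent item per username, comes down to keeping the newest post seen so far for each user. Here is a minimal Python sketch of that logic over already-fetched records; it is not Core Data code, and the dictionary keys simply mirror the fields listed in the question (username, post, created).

```python
from datetime import datetime


def latest_post_per_user(posts):
    """Return a dict mapping each username to that user's newest post."""
    latest = {}
    for post in posts:
        current = latest.get(post["username"])
        if current is None or post["created"] > current["created"]:
            latest[post["username"]] = post
    return latest


posts = [
    {"username": "alice", "post": "first", "created": datetime(2010, 1, 1)},
    {"username": "bob", "post": "hi", "created": datetime(2010, 1, 3)},
    {"username": "alice", "post": "newer", "created": datetime(2010, 1, 5)},
    {"username": "bob", "post": "older", "created": datetime(2010, 1, 2)},
]
latest = latest_post_per_user(posts)
```

This makes a single pass over the posts, so it stays cheap even when the fetch returns every post rather than a date-limited subset.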