I have an interesting problem. I'm using Ruby 1.9.2 and Rails 3.1.3.
I have 2 models, for simplification let's say customers and stores.
Stores have many customers, and a customer belongs to a store.
I'm trying to collect all customers for a store, and create a place for a few more that I can populate with values later. Instead, customer.save is called when I don't expect it.
store = Store.find(1)
customers_array = store.customers
random_array = Array.new
customers_count = customers_array.count + 1
(customers_count..2).each do |i|
  customer = Customer.new
  customer.id = "#{i}000000000000"
  random_array << customer    # this line doesn't call customer.save
  customers_array << customer # this line calls customer.save when store has customers
end
For some reason when the customer is pushed into the array, customer.save is called.
It doesn't happen if the array you push to is a plain array rather than an association relation.
I found a workaround, but I'm still wondering why that happens.
The workaround:
store = Store.find(1)
initial_customers_array = store.customers
additional_customers_array = Array.new
customers_count = initial_customers_array.count + 1
(customers_count..2).each do |i|
  customer = Customer.new
  customer.id = "#{i}000000000000"
  additional_customers_array << customer
end
customers_array = initial_customers_array + additional_customers_array
<< is an alias for push
http://apidock.com/rails/ActiveRecord/Associations/CollectionProxy/%3C%3C
http://apidock.com/rails/ActiveRecord/Associations/CollectionProxy/push
which, in ActiveRecord::Associations::CollectionProxy, calls concat
http://apidock.com/rails/ActiveRecord/Associations/CollectionAssociation/concat (view the source of the method)
https://github.com/rails/rails/blob/master/activerecord/lib/active_record/associations/collection_proxy.rb#L283
which calls concat_records
http://apidock.com/rails/v3.2.3/ActiveRecord/Associations/CollectionAssociation/concat_records
where you can see the insert taking place.
So, with an existing record (persisted into the database), running << or .push will insert records into the collection, persisting them to the database if necessary. Calling << on an Array, not the record collection, as you're doing in
random_array << customer
calls Ruby's Array#<< method, not the AR equivalent (as you found, no save takes place in this case).
Edit: To be clear, the workaround you found is more or less how I typically handle the situation you're dealing with; my answer focuses more on why << has this behavior.
Another way around this would be to change your second line (of your original code) to:
customers_array = store.customers.to_a
That casts the active record association to a real array object, so the << method will be the normal Array#push method.
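To make the contrast concrete, here's a minimal sketch using the Store/Customer models from the question (no extra attributes assumed):

store = Store.find(1)

new_customer = Customer.new
store.customers << new_customer   # association proxy: issues an INSERT immediately
new_customer.persisted?           # => true (assuming validations pass)

another_customer = Customer.new
store.customers.to_a << another_customer # plain Array: no database access at all
another_customer.persisted?              # => false; you'd have to call save yourself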
Related
I'm trying to make a news feed. Each time the page is called, the server must send multiple items. One item contains a post, its number of likes, number of comments, number of comment children, the comments data, the comment children data, etc.
My problem is that each time my page is called, it takes more than 5 seconds to load. I've already implemented a caching system, but it's still slow.
posts = Posts.objects.filter(page="feed").order_by('-likes')[:10].cache()
posts = PostsSerializer(posts, many=True)

hasPosted = Posts.objects.filter(page="feed", author="me").cache()
hasPosted = PostsSerializer(hasPosted, many=True)

for post in posts.data:
    commentsNum = Comments.objects.filter(parent=post["id"]).cache(ops=['count'])
    post["comments"] = len(commentsNum)

    comments = Comments.objects.filter(parent=post["id"]).order_by('-likes')[:10].cache()
    liked = Likes.objects.filter(post_id=post["id"], author="me").cache()
    comments = CommentsSerializer(comments, many=True)
    commentsObj[post["id"]] = {}

    for comment in comments.data:
        children = CommentChildren.objects.filter(parent=comment["id"]).order_by('date')[:10].cache()
        numChildren = CommentChildren.objects.filter(parent=comment["id"]).cache(ops=['count'])
        post["comments"] = post["comments"] + len(numChildren)
        children = CommentChildrenSerializer(children, many=True)
        liked = Likes.objects.filter(post_id=comment["id"], author="me").cache()

        for child in children.data:
            if child["parent"] == comment["id"]:
                liked = Likes.objects.filter(post_id=child["id"], author="me").cache()
I'm trying to find a simple way to fetch all of this data faster and without unnecessary database hits. I need to reduce the loading time from 5 seconds to less than 1 if possible.
Any suggestions?
Add the number of children as an integer field on the comment that gets updated every time a child comment is added or removed. That way, you won't have to query for that value. You can do this using signals (see the sketch at the end of this answer).
Add an ArrayField (if you're using Postgres) or something similar on your Profile model that stores the primary keys of all liked posts. Instead of querying the Likes model, you would be able to do this:
profile = Profile.objects.get(name='me')
liked = comment_pk in profile.liked_posts
Use select_related to CommentChildren instead of making an extra query for it.
Implementing these 3 items will get rid of all the db queries being executed in the "comment in comments.data" for loop, which is probably taking up the majority of the processing time.
If you're interested, check out django-debug-toolbar which enables you to see what queries are being executed on every page.
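Here's a rough sketch of how the first suggestion could look with signals. It assumes you add an IntegerField named children_count (with default=0) to Comments, that CommentChildren.parent is a ForeignKey to Comments, and that the import path myapp.models matches your project; all of those names are assumptions, not taken from your code:

# signals.py (sketch)
from django.db.models import F
from django.db.models.signals import post_delete, post_save
from django.dispatch import receiver

from myapp.models import Comments, CommentChildren  # adjust to your app

@receiver(post_save, sender=CommentChildren)
def increment_children_count(sender, instance, created, **kwargs):
    # Keep the denormalised counter on the parent comment in sync.
    if created:
        Comments.objects.filter(pk=instance.parent_id).update(
            children_count=F('children_count') + 1)

@receiver(post_delete, sender=CommentChildren)
def decrement_children_count(sender, instance, **kwargs):
    Comments.objects.filter(pk=instance.parent_id).update(
        children_count=F('children_count') - 1)

For the second suggestion, the field itself would look something like liked_posts = ArrayField(models.IntegerField(), default=list) on your Profile model (django.contrib.postgres.fields.ArrayField, PostgreSQL only).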
I have 2 related models with 10 million rows each and want to perform an efficient paginated request of 50,000 items of one of them and access related data on the other one:
class RnaPrecomputed(models.Model):
    id = models.CharField(max_length=22, primary_key=True)
    rna = models.ForeignKey('Rna', db_column='upi', to_field='upi', related_name='precomputed')
    description = models.CharField(max_length=250)

class Rna(models.Model):
    id = models.IntegerField(db_column='id')
    upi = models.CharField(max_length=13, db_index=True, primary_key=True)
    timestamp = models.DateField()
    userstamp = models.CharField(max_length=30)
As you can see, RnaPrecomputed is related to Rna via a foreign key. Now, I want to fetch a specific page of 50,000 RnaPrecomputed items and the corresponding Rnas related to them. Without a select_related() call I expect an N+1 queries problem. Here are the timings:
First, for reference, I won't touch the related model at all:
rna_paginator = paginator.Paginator(RnaPrecomputed.objects.all(), 50000)
message = ""
for object in rna_paginator.page(400).object_list:
    message = message + str(object.id)
Takes:
real 0m12.614s
user 0m1.073s
sys 0m0.188s
Now, I'll try accessing data on related model:
rna_paginator = paginator.Paginator(RnaPrecomputed.objects.all(), 50000)
message = ""
for object in rna_paginator.page(400).object_list:
    message = message + str(object.rna.upi)
it takes:
real 2m27.655s
user 1m20.194s
sys 0m4.315s
Which is a lot, so I probably have an N+1 queries problem.
But now, if I use select_related(),
rna_paginator = paginator.Paginator(RnaPrecomputed.objects.all().select_related('rna'), 50000)
message = ""
for object in rna_paginator.page(400).object_list:
    message = message + str(object.rna.upi)
it takes even more:
real 7m9.720s
user 0m1.948s
sys 0m0.337s
So somehow select_related() made things 3 times slower instead of faster. And without it I presumably have N+1 queries, where for each RnaPrecomputed entry the Django ORM has to issue an additional query to fetch the corresponding Rna?
What am I doing wrong and how to make select_related() perform well with paginated queryset?
It's worth checking that you're not missing an index in your database. You have db_index=True for the Rna.upi field, but are you sure the index exists in the database?
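One quick way to check from Django itself, as a sketch (assumes a reasonably recent Django and that Rna is importable from your models module):

from django.db import connection
from myapp.models import Rna  # adjust to your app

# List the indexes/constraints the database actually has on the Rna table.
with connection.cursor() as cursor:
    constraints = connection.introspection.get_constraints(cursor, Rna._meta.db_table)

for name, info in constraints.items():
    print(name, info['columns'], 'index' if info['index'] else '')

Alternatively, inspect the table directly in your database shell.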
If the select_related is making the count() query slow, then you could try doing the select_related on the paginated object_list.
for object in rna_paginator.page(300).object_list.select_related():
    message = message + str(object.rna.upi)
I have three models, Team, Post, and User, and a join table between Team and User called Member. Users in teams can vote on posts. When a post is voted on, I want to give points to the user who created the post. I therefore have a `points` integer attribute in my `Member` model. I have a method where I give points to users, which looks like this:
def give_points_to_user(post, increase)
  member = post.user.members.where(team_id: post.team.id).first
  if increase
    member.points += 5
  else
    member.points -= 5
  end
  member.save!
end
Calling this method gives me this error:
undefined method `+' for nil:NilClass
So, how should my find (or where) call look if I want to find the correct member? That is, the member/user who created the post.
undefined method `+' for nil:NilClass
The problem is that the points value is initially nil, so member.points += 5 won't work: member.points += 5 is really member.points = member.points + 5, and with a nil value that becomes member.points = nil + 5, which fails with that error.
Solution
As #Mandeep said, you can make it work by setting a default value for the points attribute in the table. If the default is zero (0), then the result of member.points += 5 would be 5.
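A minimal sketch of that fix (migration class and table names here follow Rails conventions; adjust to your schema):

class AddDefaultToMemberPoints < ActiveRecord::Migration
  def up
    change_column_default :members, :points, 0
    # Existing rows that already contain NULL need backfilling too:
    execute "UPDATE members SET points = 0 WHERE points IS NULL"
  end

  def down
    change_column_default :members, :points, nil
  end
end

Alternatively, without touching the schema, you could guard against nil in the method itself: member.points = (member.points || 0) + (increase ? 5 : -5).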
I would like to be able to pull all records from the db:
u = User.all
And then once loaded be able to apply AR methods to the resulting collection:
u.first
Is this possible in Rails?
Once you actually query the database, the results become an array instead of an ActiveRecord::Relation. (Though #first would still work fine, since it's a method that also exists on Array).
If you just need a starting point to build an ActiveRecord::Relation though, you can use scoped:
# Doesn't execute a query yet
u = User.scoped
# This now executes a query similar to SELECT * FROM users LIMIT 1
u.first
Note that in Rails 4.0, #all now does the same thing as #scoped (whereas in Rails 3, it returns an array).
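To illustrate the chaining (Rails 3 style; the admin column is only an example, not from your schema):

u = User.scoped                  # an ActiveRecord::Relation, no query yet
admins = u.where(admin: true)    # still no query, just building the relation
admins.order(:created_at).first  # SQL runs here: SELECT ... ORDER BY created_at LIMIT 1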
Why don't you try it?
User.all doesn't return an AR collection; it returns an Array. Get rid of the .all and you will have a working example.
Is there a more efficient method for doing a Rails SQL statement of the following code?
It will be called across the site to hide certain content or users based on whether a user is blocked, so it needs to be fairly efficient or it will slow everything else down as well.
users.rb file:
def is_blocked_by_or_has_blocked?(user)
  status = relationships.where('followed_id = ? AND relationship_status = ?',
                               user.id, relationship_blocked).first ||
           user.relationships.where('followed_id = ? AND relationship_status = ?',
                                    self.id, relationship_blocked).first
  return status
end
In that code, relationship_blocked is just an abstraction of an integer to make it easier to read later.
In a view, I am calling this method like this:
- unless current_user.is_blocked_by_or_has_blocked?(user)
  - # show the content for unblocked users here
Edit
This is a sample query; it stops after it finds the first instance (no need to check for a reverse relationship):
Relationship Load (0.2ms) SELECT "relationships".* FROM "relationships" WHERE ("relationships".follower_id = 101) AND (followed_id = 1 AND relationship_status = 2) LIMIT 1
You can change it to only run one query by making it use an IN (x,y,z) statement in the query (this is done by passing an array of ids to :followed_id). Also, by using .count, you bypass Rails instantiating an instance of the model for the resulting relationships, which will keep things faster (less data to pass around in memory):
def is_blocked_by_or_has_blocked?(user)
  relationships.where(:followed_id => [user.id, self.id], :relationship_status => relationship_blocked).count > 0
end
Edit - To get it to look both ways:
Relationship.where(:user_id => [user.id, self.id], :followed_id => [user.id, self.id], :relationship_status => relationship_blocked).count > 0