Should I drop a polymorphic association? - ruby-on-rails-3

My code is still just in development, not production, and I'm hitting a wall with generating data that I want for some views.
Without burying you guys in details, I basically want to navigate through multiple model associations to get some information at each level. The one association giving me problems is a polymorphic belongs_to. Here are the most relevant associations
Model Post
belongs_to :subtopic
has_many :flags, :as => :flaggable
Model Subtopic
has_many :flags, :as => :flaggable
Model Flag
belongs_to :flaggable, :polymorphic => true
I'd like to display multiple flags in a Flags#index view. There's information from other models that I want to display, as well, but I'm leaving out the specifics here to keep this simpler.
In my FlagsController#index action, I'm currently using @flags = paginate_by_sql to pull everything I want from the database. I can successfully get the data, but I can't get the associated model objects eager-loaded (though the data I want is all in memory). I'm looking at a few options now:
1. Rewrite my views to work on the SQL data in the @flags object. This should work and will prevent the 5-6 association-model-SQL queries per row on the index page, but will look very hackish. I'd like to avoid this if possible.
2. Simplify my views and create additional pages for the more detailed information, to be loaded only when viewing one individual flag.
3. Change the model hierarchy/definitions away from polymorphic associations to inheritance. Effectively make a module or class FlaggableObject that would be the parent of both Subtopic and Post.
I'm leaning towards the third option, but I'm not certain that I'll be able to cleanly pull all the information I want using Rails' ActiveRecord helpers only.
I would like insight on whether this would work and, more importantly, whether you have a better solution.
EDIT: Some nit-picky include behavior I've encountered
@flags = Flag.find(:all,:conditions=> "flaggable_type = 'Post'", :include => [{:flaggable=>[:user,{:subtopic=>:category}]},:user]).paginate(:page => 1)
=> (valid response)
@flags = Flag.find(:all,:conditions=> ["flaggable_type = 'Post' AND
post.subtopic.category_id IN ?", [2,3,4,5]], :include => [{:flaggable=>
[:user, {:subtopic=>:category}]},:user]).paginate(:page => 1)
=> ActiveRecord::EagerLoadPolymorphicError: Can not eagerly load the polymorphic association :flaggable

Don't drop the polymorphic association. Use includes(:association_name) to eager-load the associated objects. paginate_by_sql won't work, but paginate will.
@flags = Flag.includes(:flaggable).paginate(:page => 1)
It will do exactly what you want, using one query from each table.
See A Guide to Active Record Associations. You may see older examples using the :include option, but the includes method is the new interface in Rails 3.0 and 3.1.
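To make "one query from each table" concrete, here is a minimal plain-Ruby sketch of what preloading a polymorphic association amounts to. It is not ActiveRecord's implementation: the Struct models, the TABLES hash, and the fetch_by_ids helper are hypothetical stand-ins for the database, chosen only so the example runs on its own.

```ruby
# Hypothetical in-memory stand-ins for the flags, posts and subtopics tables.
Flag     = Struct.new(:id, :flaggable_type, :flaggable_id, :flaggable)
Post     = Struct.new(:id, :title)
Subtopic = Struct.new(:id, :name)

TABLES = {
  "Post"     => [Post.new(1, "First post"), Post.new(2, "Second post")],
  "Subtopic" => [Subtopic.new(1, "Ruby")]
}

QUERY_LOG = []

# Simulates a single "SELECT ... WHERE id IN (...)" against one table.
def fetch_by_ids(type, ids)
  QUERY_LOG << "#{type} ids=#{ids.sort.inspect}"
  TABLES.fetch(type).select { |row| ids.include?(row.id) }
end

# Preload: group the flags by type, then issue one lookup per type.
def preload_flaggables(flags)
  flags.group_by(&:flaggable_type).each do |type, group|
    rows  = fetch_by_ids(type, group.map(&:flaggable_id).uniq)
    by_id = rows.each_with_object({}) { |row, index| index[row.id] = row }
    group.each { |flag| flag.flaggable = by_id[flag.flaggable_id] }
  end
  flags
end

flags = [
  Flag.new(1, "Post", 1),
  Flag.new(2, "Subtopic", 1),
  Flag.new(3, "Post", 2),
  Flag.new(4, "Post", 1)
]

preload_flaggables(flags)
puts QUERY_LOG.size   # number of simulated lookups: one per flaggable type
```

The grouping step is also why a condition on the flaggable's own columns cannot be folded into the flags query: the target table is not known until the flags have been read.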
Update from original poster:
If you're getting this error: Can not eagerly load the polymorphic association :flaggable, try something like the following:
Flag.where("flaggable_type = 'Post'").includes([{:flaggable=>[:user, {:subtopic=>:category}]}, :user]).paginate(:page => 1)
See comments for more details.

Issues: Count over a polymorphic association.
@flags = Flag.find(:all,:conditions => ["flaggable_type = 'Post' AND post.subtopic.category_id IN ?",
[2,3,4,5]], :include => [{:flaggable => [:user, {:subtopic=>:category}]},:user])
.paginate(:page => 1)
Try like the following:
@flags = Flag.find(:all,:conditions => ["flaggable_type = 'Post' AND post.subtopic.category_id IN ?",
[2,3,4,5]], :include => [{:flaggable => [:user, {:subtopic=>:category}]},:user])
.paginate(:page => 1, :total_entries => Flag.count(:conditions =>
["flaggable_type = 'Post' AND post.subtopic.category_id IN ?", [2,3,4,5]]))

Related

Rails sort collection by association count

I'm working in Rails 4, and have two relevant models:
Account Model
has_many :agent_recalls, primary_key: "id", :foreign_key => "pickup_agent_id", class_name: "Booking"
Hence, queries like Account.find(10).agent_recalls would work.
What I want to do is sort the entire Account collection by this agent_recalls association.
Ideally it'd look something like (but obviously not):
@agents = Account.where(agent: true).order(:agent_recalls)
Question: What's the correct query to output an ordered list, by this agent_recall count?
Well, to accomplish what you are looking for you have two options:
First, do it in a single query. That implies a join, though, and an inner join drops the Accounts that don't have any agent_recalls, so I would discard this option.
Second, and I think more appropriate for what you are trying to do:
Account.where(agent: true).includes(:agent_recalls).sort_by { |a| a.agent_recalls.size }
As you can see, it is a mix of a query and plain Ruby. Hope it helps :)
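The counting-and-sorting half of that answer can be sketched in plain Ruby. This is only an illustration under assumed data: the Account and Booking Structs and the sample rows are hypothetical stand-ins for the real tables. The point it shows is that a default count of 0 keeps agents without bookings in the result, which an inner join would drop.

```ruby
# Hypothetical stand-ins for the accounts and bookings tables.
Account = Struct.new(:id, :name, :agent)
Booking = Struct.new(:id, :pickup_agent_id)

accounts = [
  Account.new(1, "Ann", true),
  Account.new(2, "Bob", true),
  Account.new(3, "Cy",  true),
  Account.new(4, "Dee", false)
]
bookings = [Booking.new(10, 2), Booking.new(11, 2), Booking.new(12, 1)]

# One pass over the bookings builds the counts; the default of 0 keeps
# agents without any bookings in the ranking.
recall_counts = bookings.each_with_object(Hash.new(0)) do |booking, counts|
  counts[booking.pickup_agent_id] += 1
end

agents = accounts.select(&:agent)
ranked = agents.sort_by { |account| recall_counts[account.id] }

ranked.each { |account| puts "#{account.name}: #{recall_counts[account.id]}" }
```

Negating the count inside the sort_by block (or calling reverse on the result) gives the busiest agents first.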

Have more than 400 000, repopulating the DB takes 5 hours

Simply running
ElectricityProfile.find_each do |ep|
if UserProfile.exists?(ep.owner_id) && ep.owner_type == 'UserProfile'
ElectricityProfileSummary.create(ep)
end
end
Takes ages (5 hours) to populate the table. Is there any better way to populate the DB?
Let's say I get all the data from the DB, store it in an array, hash, etc., and then push it to the DB in one go:
array_of_electricity_profiles = []
ElectricityProfile.find_each do |ep|
if UserProfile.exists?(ep.owner_id) && ep.owner_type == 'UserProfile'
array_of_electricity_profiles.push(ep)
end
end
ElectricityProfileSummary.mass_create(array_of_electricity_profiles) # => or any other method :)
Sorry, I forgot to mention that I have overridden the create method; it takes multiple models and creates an ElectricityProfileSummary:
create!(:owner_id => electricity_profile.owner_id,
:owner_type => electricity_profile.owner_type,
:property_type => electricity_profile.owner.user_property_type,
:household_size => electricity_profile.owner.user_num_of_people,
:has_swimming_pool => electricity_profile.has_swimming_pool,
:bill => electricity_bill,
:contract => electricity_profile.on_contract,
:dirty => true,
:provider => electricity_profile.supplier_id,
:plan => electricity_profile.plan_id,
:state => state,
:postcode => postcode,
:discount => discount,
:has_air_conditioner => electricity_profile.has_air_conditioner,
:has_electric_hot_water => electricity_profile.has_electric_hot_water,
:has_electric_central_heating => electricity_profile.has_electric_central_heating,
:has_electric_cooktup => electricity_profile.has_electric_cooktup
)
Doing this in a stored procedure or raw SQL would probably be the best way to go since ActiveRecord can be very expensive when dealing with that many records. However, you can speed it up quite a bit by using includes or joins.
It looks like you only want to create ElectricityProfileSummary models. I am a little unsure of how your relationships look, but assuming you have the following:
class ElectricityProfile < ActiveRecord::Base
belongs_to :owner, polymorphic: true
end
class UserProfile < ActiveRecord::Base
has_many :electricity_profiles, as: :owner
end
... you should be able to do something like this:
ElectricityProfile.includes(:owner).each do |ep|
ElectricityProfileSummary.create(ep)
end
Now, I am basing this on the assumption that you are using a polymorphic relationship between ElectricityProfile and UserProfile. If that is not the case, let me know. (I made the assumption because you have owner_id and owner_type, which as a pair make up the two fields necessary for polymorphic relationships.)
Why is using an includes better? Using includes causes ActiveRecord to eager load the relationship between the two models, so you're not doing n+1 queries like you are now. Actually, because you are creating records based on the number of ElectricityProfile records, you're still doing n+1, but what you are doing now is more expensive than n+1 because you are querying UserProfile for every single ElectricityProfile, and then you are querying UserProfile again when creating the ElectricityProfileSummary because you are lazy loading the relationship between EP and UP.
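Note that includes does not use an inner join here. A polymorphic association cannot be joined at all (that is what raises EagerLoadPolymorphicError), so Rails preloads the owners with one extra query per owner type. An ElectricityProfile whose owner is missing is still returned, just with a nil owner, so you still need an in-memory check such as ep.owner_type == 'UserProfile' && ep.owner before creating the summary. Once the owners are preloaded, that check should not cost an additional query.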
If you wrap your import loop in a single transaction block, the import should speed up considerably. Read on about ROR transactions here.
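The saving from replacing the per-row exists? check can be sketched in plain Ruby. This is an illustration only, not ActiveRecord code: Profile, Owner, the OWNERS hash, and find_owners are hypothetical stand-ins, and the summary contains just a few of the real columns. Each batch triggers one simulated owner lookup instead of one per profile.

```ruby
# Hypothetical stand-ins for electricity_profiles and user_profiles rows.
Profile = Struct.new(:id, :owner_id, :owner_type)
Owner   = Struct.new(:id, :household_size)

OWNERS  = { 1 => Owner.new(1, 3), 2 => Owner.new(2, 5) }
LOOKUPS = []

# Simulates one "SELECT ... WHERE id IN (...)" against user_profiles.
def find_owners(ids)
  LOOKUPS << ids
  OWNERS.values_at(*ids).compact.each_with_object({}) do |owner, index|
    index[owner.id] = owner
  end
end

def build_summaries(profiles, batch_size)
  user_profiles = profiles.select { |profile| profile.owner_type == "UserProfile" }
  user_profiles.each_slice(batch_size).flat_map do |batch|
    owners = find_owners(batch.map(&:owner_id).uniq)
    # Profiles whose owner no longer exists are skipped here, without a
    # per-row existence query.
    batch.select { |profile| owners.key?(profile.owner_id) }.map do |profile|
      { owner_id: profile.owner_id,
        owner_type: profile.owner_type,
        household_size: owners[profile.owner_id].household_size }
    end
  end
end

profiles = [
  Profile.new(1, 1, "UserProfile"),
  Profile.new(2, 2, "UserProfile"),
  Profile.new(3, 9, "UserProfile"),   # owner 9 does not exist
  Profile.new(4, 1, "Company")        # wrong owner type
]

summaries = build_summaries(profiles, 2)
puts summaries.size   # summaries that would be created
puts LOOKUPS.size     # simulated owner lookups, one per batch
```

With a realistic batch size (find_each defaults to batches of 1000), 400,000 profiles would need roughly 400 owner lookups rather than 400,000.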

putting a condition on an includes

I have the following relationships:
Category has_many :posts
Post has_many :comments
Post has_many :commenters, :through => :comments
I have the following eager load, giving me posts, comments and commenters (note that I need all 3, and hence the includes as opposed to joins)
category.posts.includes(:comments, :commenters)
However, I'd like to limit comments (and if possible commenters) to only those created in the past two weeks while still returning the same set of posts. Initially I thought I could specify a condition on the includes:
category.posts.includes(:comments, :commenters).where("comments.created_at > ?", 2.weeks.ago)
But I found that this returns only the posts that meet the condition. I'm thinking that I may need to do something like performing a subquery on comments and then performing a join. Is there an easy way to do this with AR, or would I be better off doing this with SQL?
Finally managed to figure this out from reading this page:
http://api.rubyonrails.org/classes/ActiveRecord/Associations/ClassMethods.html
I simply needed to create an association in my Post model like:
Post has_many :recent_comments, :class_name => 'Comment', :conditions => ["created_at > ?", 2.weeks.ago]
Then I could do the following to get the desired ActiveRecord::Association object:
category.posts.includes(:recent_comments => :commenters)
There was also a suggestion of doing this by using a scope on a model. However, I read somewhere (I think it was on SO) that scopes are on their way out and that ARel has taken their place so I decided to do this without scopes.
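What the recent_comments association achieves, compared with the where clause from the question, can be sketched in plain Ruby: the filter is applied to the comments, never to the posts, so a post with no recent comments still appears with an empty list. The Post and Comment Structs and the sample data below are hypothetical stand-ins for illustration.

```ruby
# Hypothetical stand-ins for the posts and comments tables.
Post    = Struct.new(:id, :title)
Comment = Struct.new(:id, :post_id, :created_at)

day    = 24 * 60 * 60          # seconds in a day
now    = Time.now
cutoff = now - 14 * day        # two weeks ago

posts = [Post.new(1, "Old thread"), Post.new(2, "Active thread")]
comments = [
  Comment.new(1, 1, now - 30 * day),   # too old
  Comment.new(2, 2, now - 1 * day),    # recent
  Comment.new(3, 2, now - 60 * day)    # too old
]

# Filter the comments, not the posts, then index them per post.
recent_by_post = comments
  .select { |comment| comment.created_at > cutoff }
  .group_by(&:post_id)

posts.each do |post|
  recent = recent_by_post.fetch(post.id, [])
  puts "#{post.title}: #{recent.size} recent comment(s)"
end
```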
Try:
category.posts.all(:include => {:comments => :commenters}, :conditions => ["comments.created_at > ? AND commenters.created_at > ?", 2.weeks.ago, 2.weeks.ago])

Cancan Thinking Sphinx current_ability Questions

I'm trying to get CanCan working with Thinking Sphinx but am running into some issues.
Before using Sphinx, I had this in my companies view:
@companies = Company.accessible_by(current_ability)
That prevented my users from seeing anyone else's companies...
After installing Sphinx, I ended up with:
@companies = Company.accessible_by(current_ability).search(params[:search], :include => :order, :match_mode => :extended ).paginate(:page => params[:page])
Which now displays all my companies and isn't refining per user based on ability.
It would seem TS isn't set up for CanCan?
I think it's more that accessible_by is probably a scope - which is Database/SQL-driven. Sphinx has its own query interface, and so ActiveRecord scopes don't apply.
An inefficient workaround (gets all companies first):
company_ids = Company.accessible_by(current_ability).collect &:id
@companies = Company.search params[:search],
:include => :order,
:match_mode => :extended,
:page => params[:page],
:with => {:sphinx_internal_id => company_ids}
A couple of things to note: sphinx_internal_id is the indexed model's primary key - Sphinx has its own unique identifier named id, hence the distinction. Also: You don't want to call paginate on a search collection - Sphinx always paginates, so just pass the :page param through to the search call.
There'd be two better workarounds that I can think of - either have a Sphinx equivalent of accessible_by, with the relevant information added to your indices as attributes - or, simpler if not quite as ideal, just get the company ids returned in the first line of my above snippet without loading up every company as an ActiveRecord object. Both will probably mean bypassing and/or duplicating Cancan's helpers.
Although... maybe this would do the trick, taking the latter approach:
sql = Company.accessible_by(current_ability).select(:id).to_sql
company_ids = Company.connection.select_values sql
@companies = Company.search params[:search],
:include => :order,
:match_mode => :extended,
:page => params[:page],
:with => {:sphinx_internal_id => company_ids}
Avoids loading unnecessary Company objects, uses the Cancan helper (provided it is/returns a scope), and works neatly with what Sphinx/Thinking Sphinx expects. I've not used Cancan though, so this is a bit of guesswork.

looping through a rails data object from the controller

I have called a function from a class to find all the items related to a particular ID in a many to many HABTM relationship.
Procedures -> Tasks with a join table: procedures_tasks
I call the information like @example = Procedure.get_tasks(1,1)
I would like to be able to iterate through the data returned so that I can create an instance of each task_id related to the procedure in question
def self.get_tasks(model_id, operating_system_id)
find(:first, :select => 'tasks.id, procedures.id', :conditions => ["model_id = ? AND operating_system_id = ?", model_id, operating_system_id], :include => [:tasks])
end
I tried rendering the data as I normally would and then using .each do |f| in the view layer, but I get:
undefined method `each' for #<Procedure:0x2b879be1db30>
Original Question:
I am creating a rails application to track processes we perform. When a new instance of a process is created I want to automatically create rows for all the tasks that will need to be performed.
tables:
decommissions
models
operating_systems
procedures
tasks
procedures_tasks
host_tasks
procedures -> tasks is many to many through the procedures_tasks join table.
When you start a new decommissioning process you specify a model and OS. The model and OS determine which procedure you follow, and each procedure has a list of tasks available in the join table. I want to create an entry in host_tasks for each task relevant to the procedure for the decommission being created.
I've done my head in over this for days, any suggestions?
class Procedure < ActiveRecord::Base
has_and_belongs_to_many :tasks
#has_many :tasks, :through => :procedures_tasks
# has_many :procedures_tasks
belongs_to :model
belongs_to :operating_system
validates_presence_of :name
validates_presence_of :operating_system_id
validates_presence_of :model_id
def self.get_tasks(model_id, operating_system_id)
find(:first, :select => 'tasks.id, procedures.id', :conditions => ["model_id = ? AND operating_system_id = ?", model_id, operating_system_id], :include => [:tasks])
end
end
The get_tasks method will retrieve the tasks associated with the procedure, but I don't know how to manipulate the data pulled from the database in Rails. I haven't been able to access the attributes of the returned object through the controller; is that because they haven't been rendered yet?
Ideally I would like to format this data so that I only have an array of the task_ids, which I can then loop through to create new rows in the appropriate table.
It wasn't looping through because I was using the :first option when finding the data. I changed it to :all which allowed me to .each do |f| etc.
Not the best option, but there will only ever be one option anyway, so it won't cause a problem.
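An alternative to switching to :all is to keep the single procedure and iterate over its tasks collection instead. Below is a minimal plain-Ruby sketch of that idea; the Task and Procedure classes are simplified stand-ins for the ActiveRecord models, and the decommission_id value of 42 is a made-up placeholder.

```ruby
# Simplified stand-ins for the ActiveRecord models.
class Task
  attr_reader :id, :name

  def initialize(id, name)
    @id = id
    @name = name
  end
end

class Procedure
  attr_reader :id, :tasks

  def initialize(id, tasks)
    @id = id
    @tasks = tasks
  end
end

procedure = Procedure.new(1, [Task.new(7, "Wipe disks"), Task.new(8, "Remove from DNS")])

# A single record has no #each; its tasks collection does.
puts procedure.respond_to?(:each)

# The array of task ids asked for in the question.
task_ids = procedure.tasks.map(&:id)

# Hypothetical host_tasks rows, one per task of the procedure.
host_tasks = task_ids.map { |task_id| { decommission_id: 42, task_id: task_id } }
host_tasks.each { |row| puts row.inspect }
```

In the Rails code this would correspond to calling tasks on the record returned by get_tasks, rather than calling each on the record itself.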