How to make an ActiveRecord request to get an item common to several other items - sql

I am trying to modify Sharetribe, a Ruby on Rails framework for online communities. It has a method that returns the relevant search filters.
Currently it returns a filter if it is present in any one of the categories (identified by category_ids).
I would like it to return a filter if and only if it is present in ALL of the categories identified by category_ids.
Being new to Rails and ActiveRecord, I'm a bit lost. Here is the method that returns the relevant filters:
# Database select for "relevant" filters based on the `category_ids`
#
# If `category_ids` is present, returns only filter that belong to
# one of the given categories. Otherwise returns all filters.
#
def select_relevant_filters(category_ids)
  relevant_filters =
    if category_ids.present?
      @current_community
        .custom_fields
        .joins(:category_custom_fields)
        .where("category_custom_fields.category_id": category_ids, search_filter: true)
        .distinct
    else
      @current_community
        .custom_fields.where(search_filter: true)
    end
  relevant_filters.sort
end
Is there a way to change the SQL query, or should I retrieve all the fields as it does now and then discard the ones I am not interested in?

Try the following
def select_relevant_filters_if_all(category_ids)
  relevant_filters =
    if category_ids.present?
      @current_community
        .custom_fields
        .joins(:category_custom_fields)
        .where("category_custom_fields.category_id": category_ids, search_filter: true)
        # Group per filter (not per join row), so the count is the number of matched categories
        .group("custom_fields.id")
        # Keep a filter only if it matched every requested category
        .having("COUNT(DISTINCT category_custom_fields.category_id) = ?", category_ids.uniq.size)
    else
      @current_community
        .custom_fields.where(search_filter: true)
    end
  relevant_filters.sort
end
This is a new method for your HomeController. Note that its name is different, to avoid monkeypatching the original. Comments are welcome.
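The "present in ALL categories" rule comes down to grouping the join rows by filter and keeping only the groups whose distinct category count equals the number of requested categories. Here is a minimal plain-Ruby sketch of that logic, with no database involved. `JOIN_ROWS` and `field_ids_in_all_categories` are hypothetical names used only for illustration; they are not part of Sharetribe.

```ruby
# Hypothetical in-memory stand-in for the category_custom_fields join table:
# each pair is [custom_field_id, category_id].
JOIN_ROWS = [
  [1, 10], [1, 11], [1, 12],
  [2, 10], [2, 11],
  [3, 12]
].freeze

# Returns the sorted ids of the fields linked to every one of category_ids.
def field_ids_in_all_categories(join_rows, category_ids)
  wanted = category_ids.uniq
  join_rows
    # WHERE category_id IN (...)
    .select { |_field_id, category_id| wanted.include?(category_id) }
    # GROUP BY field id
    .group_by { |field_id, _category_id| field_id }
    # HAVING COUNT(DISTINCT category_id) = number of requested categories
    .select { |_field_id, rows| rows.map(&:last).uniq.size == wanted.size }
    .keys
    .sort
end

p field_ids_in_all_categories(JOIN_ROWS, [10, 11])
```

Counting distinct categories per filter, rather than counting join rows, keeps the comparison correct even if a join row is duplicated.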

So I solved my problem by selecting the filters that are present in all of the subcategories of the selected category. To do that, I select all filters of all subcategories and keep only the ones that are returned a number of times exactly equal to the number of subcategories.
all_relevant_filters = select_relevant_filters(m_selected_category.own_and_subcategory_ids.or_nil)
nb_sub_category = m_selected_category.subcategory_ids.size
if nb_sub_category.none?
relevant_filters = all_relevant_filters
else
relevant_filters = all_relevant_filters.select{ |e| all_relevant_filters.count(e) == nb_sub_category.get }.uniq
end

Related

Ruby on Rails - Active Record Filter where the value of a referenced table is > 0

I am trying to filter selected data in Ruby on Rails down to the records whose "amount_available" attribute is greater than zero. This would be no problem via @events.where(ticket_categories.amount_available > 0), but ticket_categories is an array without a fixed length, because there can be multiple categories. How can I easily iterate through the array in the where clause and do this comparison?
I only need the events in the output where at least one associated category has amount_available > 0.
This is my code:
@upcoming_events = @events.where("date >= ?", Date.current)
@available_events = @upcoming_events.where(ticket_categories[0].amount_available > 0)
json_response(@available_events)
You can chain where conditions and you can add conditions that are based on associated models with joins:
available_events = @events
  .where('date >= ?', Date.current)
  .joins(:ticket_categories).where('ticket_categories.amount_available > 0')
  .group(:id)
render json: available_events
Note: Database joins might return duplicate records (depending on your database structure and the condition) therefore the need to group the result set by id.
That was only an illustration, because the Events table is linked to TicketCategories via has_many. I use PostgreSQL and have now solved it with the following code:
@upcoming_events = @close_events.where("date >= ?", Date.current)
available_events = []
@upcoming_events.each do |event|
  event.ticket_categories.each do |category|
    if category.amount_available > 0
      available_events.push(event)
      break
    end
  end
end
render json: available_events
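The nested loop above keeps an event as soon as one of its categories has stock left, which is what `select` combined with `any?` expresses directly. Below is a minimal sketch using plain Structs as stand-ins for the models. `TicketCategory`, `Event` and `available_events` are illustrative names only; the real models are ActiveRecord classes.

```ruby
# Plain-Ruby stand-ins for the ActiveRecord models (illustration only).
TicketCategory = Struct.new(:name, :amount_available)
Event = Struct.new(:title, :ticket_categories)

# Keeps the events that have at least one category with tickets left.
def available_events(events)
  events.select do |event|
    event.ticket_categories.any? { |category| category.amount_available > 0 }
  end
end

events = [
  Event.new("Concert", [TicketCategory.new("VIP", 0), TicketCategory.new("Standard", 12)]),
  Event.new("Play", [TicketCategory.new("Standard", 0)]),
  Event.new("Talk", [])
]

p available_events(events).map(&:title)
```

This still loads every event and its categories into memory, so on large tables the joins-based query is likely the better fit.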

Filtering records based on all members of association

I have a model called Story, which has — and belongs to many — Tags. I'm trying to create functionality to display only certain stories based on the story attributes. I do this by chaining where()s:
q = Story.where(condition1)
q = q.where(condition2)
...et cetera. One of the things I want to be able to filter on is tags, which at first I tried to do as follows:
q = q.joins(:tags)
q = q.where(tagCondition1)
q = q.where(tagCondition2)
...
However, this only finds stories that have a single tag that matches all conditions. I want to find all stories that have at least one tag that matches each condition. That is, currently if I have the conditions LIKE %al% and LIKE %be%, it will match a story with the tag 'alpha beta'; I want it to also match a story with the tag 'alpha' and the tag 'beta'.
Maybe you need the below to match multiple conditions:
q.where([tagCondition1, tagCondition2, ...])
You can use HAVING with a count.
class Story < ActiveRecord::Base
  has_and_belongs_to_many :tags

  def self.with_tags(*tags, min: 1)
    joins(:tags)
      .where(tags: { name: tags })
      .group('stories.id') # the table name is plural by Rails convention
      .having("count(*) >= ?", min) # at least `min` of the given tags
  end
end
Usage:
params[:tag] = "foo bar baz"
Story.with_tags(params[:tag], *params[:tag].split)
Would include stories with any of the tags ["foo bar baz", "foo", "bar", "baz"].
The query is wrong: you are combining the conditions with AND, but you want to match the first OR the second condition. Maybe you should use something like
where("tags.field like '%al%' or tags.field like '%be%'")
Okay, so here's what I ended up doing (added for anyone who's googling and has a similar problem):
conditions.each do |condition|
  q_part = select('1').from('stories_tags')
  q_part = q_part.where('stories_tags.story_id = stories.id')
  q_part = q_part.where('stories_tags.name SIMILAR TO ?', condition)
  q = q.where("EXISTS (#{q_part.to_sql})")
end
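Chaining one EXISTS subquery per condition means "for every condition, at least one tag of the story matches it". The same rule can be sketched in plain Ruby with `all?` and `any?`. `TaggedStory` and `stories_matching_all` are hypothetical names, and regular expressions stand in for the SQL patterns.

```ruby
# Plain-Ruby stand-in for a story with its tag names (illustration only).
TaggedStory = Struct.new(:title, :tags)

# Keeps stories where each pattern is matched by at least one tag.
# Different patterns may be satisfied by different tags.
def stories_matching_all(stories, patterns)
  stories.select do |story|
    patterns.all? do |pattern|
      story.tags.any? { |tag| tag.match?(pattern) }
    end
  end
end

stories = [
  TaggedStory.new("one tag, both patterns", ["alpha beta"]),
  TaggedStory.new("two tags, one pattern each", ["alpha", "beta"]),
  TaggedStory.new("only one pattern", ["alpha"])
]

p stories_matching_all(stories, [/al/, /be/]).map(&:title)
```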

Scope with association and ActiveRecord

I have an app that records calls. Each call can have multiple units associated with it. Part of my app has a reports section which basically just does a query on the Call model for different criteria. I've figured out how to write some scopes that do what I want and chain them to the results of my reporting search functionality. But I can't figure out how to search by "unit". Below are relevant excerpts from my code:
Call.rb
has_many :call_units
has_many :units, through: :call_units
#Report search logic
def self.report(search)
search ||= { type: "all" }
# Determine which scope to search by
results = case search[:type]
when "open"
open_status
when "canceled"
cancel
when "closed"
closed
when "waitreturn"
waitreturn
when "wheelchair"
wheelchair
else
scoped
end
#Search results by unit name, this is what I need help with. Scope or express otherwise?
results = results. ??????
results = results.by_service_level(search[:service_level]) if search[:service_level].present?
results = results.from_facility(search[:transferred_from]) if search[:transferred_from].present?
results = results.to_facility(search[:transferred_to]) if search[:transferred_to].present?
# If searching with BOTH a start and end date
if search[:start_date].present? && search[:end_date].present?
results = results.search_between(Date.parse(search[:start_date]), Date.parse(search[:end_date]))
# If search with any other date parameters (including none)
else
results = results.search_by_start_date(Date.parse(search[:start_date])) if search[:start_date].present?
results = results.search_by_end_date(Date.parse(search[:end_date])) if search[:end_date].present?
end
results
end
Since I have an association for units already, I'm not sure if I need to make a scope for units somehow or express the results somehow in the results variable in my search logic.
Basically, you want a scope that uses a join so you can apply a where condition against the associated model. Is that correct?
So in SQL you're looking for something like
select * from calls c
inner join call_units cu on cu.call_id = c.id
inner join units u on u.id = cu.unit_id
where u.name = ?
and the scope would be (from memory, I haven't debugged this) something like:
scope :by_unit_name, lambda {|unit_name|
joins(:units).where('units.unit_name = ?', unit_name)
}
units.name isn't a column in the db. Changing it to units.unit_name didn't raise an exception and seems to be what I want. Here's what I have in my results variable:
results = results.by_unit_name(search[:unit_name]) if search[:unit_name].present?
When I try to search by a different unit name no results show up. Here's the code I'm using to search:
<%= select_tag "search[unit_name]", options_from_collection_for_select(Unit.order(:unit_name), :unit_name, :unit_name, selected: params[:search].try(:[], :unit_name)), prompt: "Any Unit" %>

Rails ignores columns from second table when using .select

By example:
r = Model.arel_table
s = SomeOtherModel.arel_table
Model.select(r[:id], s[:othercolumn].as('othercolumn')).
joins(:someothermodel)
Will produce the SQL:
SELECT `model`.`id`, `someothermodel`.`othercolumn` AS othercolumn FROM `model` INNER JOIN `someothermodel` ON `model`.`id` = `someothermodel`.`model_id`
Which is correct. However, when the models are loaded, the attribute othercolumn is ignored because it is not an attribute of Model.
It's similar to eager loading and includes, but I don't want all columns, only the one specified so include is no good.
There must be an easy way of getting columns from other models? I'd prefer to have the items returned as instances of Model rather than simple arrays/hashes.
When you do a select with joins or includes, you get back an ActiveRecord::Relation. This relation is composed only of objects of the class you called select on. The selected columns from the joined models are added to the returned objects. Because these attributes are not attributes of Model, they don't show up when you inspect the objects, and I believe this is the primary source of confusion.
You could try this out in your rails console:
> result = Model.select(r[:id], s[:othercolumn].as('othercolumn')).joins(:someothermodel)
=> #<ActiveRecord::Relation [#<Model id: 1>]>
# "othercolumn" is not shown in the result but doing the following will yield correct result
> result.first.othercolumn
=> "myothercolumnvalue"

ActiveRecord find_each combined with limit and order

I'm trying to run a query of about 50,000 records using ActiveRecord's find_each method, but it seems to be ignoring my other parameters like so:
Thing.active.order("created_at DESC").limit(50000).find_each {|t| puts t.id }
Instead of stopping at 50,000 as I'd like and sorting by created_at, here's the resulting query, which gets executed over the entire dataset:
Thing Load (198.8ms) SELECT "things".* FROM "things" WHERE "things"."active" = 't' AND ("things"."id" > 373343) ORDER BY "things"."id" ASC LIMIT 1000
Is there a way to get similar behavior to find_each but with a total max limit and respecting my sort criteria?
The documentation says that find_each and find_in_batches don't retain sort order and limit because:
Sorting ASC on the PK is used to make the batch ordering work.
Limit is used to control the batch sizes.
You could write your own version of this function like @rorra did. But you can get into trouble when mutating the objects. If, for example, you sort by created_at and save the object, it might come up again in one of the next batches. Similarly, you might skip objects because the order of results has changed when executing the query to get the next batch. Only use that solution with read-only objects.
My primary concern was that I didn't want to load 30000+ objects into memory at once. My concern was not the execution time of the query itself. Therefore I used a solution that executes the original query but only caches the IDs. It then divides the array of IDs into chunks and queries/creates the objects per chunk. This way you can safely mutate the objects, because the sort order is kept in memory.
Here is a minimal example similar to what I did:
batch_size = 512
ids = Thing.order('created_at DESC').pluck(:id) # Replace .order(:created_at) with your own scope
ids.each_slice(batch_size) do |chunk|
Thing.find(chunk, :order => "field(id, #{chunk.join(',')})").each do |thing|
# Do things with thing
end
end
The trade-offs to this solution are:
The complete query is executed to get the ID's
An array of all the ID's is kept in memory
Uses the MySQL specific FIELD() function
Hope this helps!
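If the MySQL-specific FIELD() function is not available, one alternative is to fetch each chunk in whatever order the database returns it and restore the order in Ruby from the chunk of IDs. The sketch below simulates that with an in-memory array. `ThingRow`, `fetch_by_ids` and `each_in_order` are hypothetical names, and `fetch_by_ids` stands in for something like `Thing.where(id: chunk)`.

```ruby
# Plain-Ruby stand-in for a database row (illustration only).
ThingRow = Struct.new(:id, :created_at)

# Stand-in for Thing.where(id: ids): rows come back in storage order,
# not in the order of the ids passed in.
def fetch_by_ids(table, ids)
  table.select { |row| ids.include?(row.id) }
end

# Walks ordered_ids in chunks and yields rows in exactly that order.
def each_in_order(table, ordered_ids, batch_size)
  ordered_ids.each_slice(batch_size) do |chunk|
    by_id = fetch_by_ids(table, chunk).map { |row| [row.id, row] }.to_h
    # values_at restores the chunk order; compact drops ids deleted meanwhile
    by_id.values_at(*chunk).compact.each { |row| yield row }
  end
end

TABLE = [
  ThingRow.new(1, 30), ThingRow.new(2, 10),
  ThingRow.new(3, 40), ThingRow.new(4, 20)
].freeze

# Equivalent of Thing.order('created_at DESC').pluck(:id)
ordered_ids = TABLE.sort_by { |row| -row.created_at }.map(&:id)

each_in_order(TABLE, ordered_ids, 2) { |row| puts row.id }
```

The array of IDs is still held in memory, so the first two trade-offs listed above apply here as well.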
find_each uses find_in_batches under the hood.
It's not possible to choose the order of the records: as described in the find_in_batches documentation, it is automatically set to ascending on the primary key ("id ASC") to make the batch ordering work.
However, the criteria are applied, so what you can do is:
Thing.active.find_each(batch_size: 50000) { |t| puts t.id }
Regarding the limit, it wasn't implemented yet: https://github.com/rails/rails/pull/5696
To answer your second question, you can create the logic yourself:
total_records = 50000
batch = 1000
(0..(total_records - batch)).step(batch) do |i|
puts Thing.active.order("created_at DESC").offset(i).limit(batch).to_sql
end
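The loop above assumes the total is a multiple of the batch size. Here is a small sketch that computes the offset and limit of each window and shrinks the last one when the total does not divide evenly. `batch_windows` is a hypothetical helper; each pair would be passed to `.offset(...)` and `.limit(...)`.

```ruby
# Returns [offset, limit] pairs covering `total` records in steps of `batch`.
# The last window is shortened so the overall cap is never exceeded.
def batch_windows(total, batch)
  (0...total).step(batch).map do |offset|
    [offset, [batch, total - offset].min]
  end
end

batch_windows(2500, 1000).each do |offset, limit|
  puts "OFFSET #{offset} LIMIT #{limit}"
end
```

As with the original loop, OFFSET-based paging can skip or repeat rows if records are inserted or deleted between queries.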
Retrieving the ids first and processing them with in_groups_of:
ordered_photo_ids = Photo.order(likes_count: :desc).pluck(:id)
ordered_photo_ids.in_groups_of(1000, false).each do |photo_ids|
photos = Photo.order(likes_count: :desc).where(id: photo_ids)
# ...
end
It's important to also add the ORDER BY query to the inner call.
Rails 6.1 adds support for descending order in find_each, find_in_batches and in_batches.
One option is to put an implementation tailored for your particular model into the model itself (speaking of which, id is usually a better choice for ordering records, created_at may have duplicates):
class Thing < ActiveRecord::Base
def self.find_each_desc limit
batch_size = 1000
i = 1
records = self.order(created_at: :desc).limit(batch_size)
while records.any?
records.each do |task|
yield task, i
i += 1
return if i > limit
end
records = self.order(created_at: :desc).where('id < ?', records.last.id).limit(batch_size)
end
end
end
Or else you can generalize things a bit, and make it work for all the models:
lib/active_record_extensions.rb:
ActiveRecord::Batches.module_eval do
def find_each_desc limit
batch_size = 1000
i = 1
records = self.order(id: :desc).limit(batch_size)
while records.any?
records.each do |task|
yield task, i
i += 1
return if i > limit
end
records = self.order(id: :desc).where('id < ?', records.last.id).limit(batch_size)
end
end
end
ActiveRecord::Querying.module_eval do
delegate :find_each_desc, :to => :all
end
config/initializers/extensions.rb:
require "active_record_extensions"
P.S. I'm putting the code in files according to this answer.
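The descending iteration above is a form of keyset pagination: each batch asks for rows whose id is lower than the last id already seen. The plain-Ruby sketch below shows that control flow on an array of ids. `page_desc` and `each_id_desc` are hypothetical names, and `page_desc` stands in for a query such as `where('id < ?', before_id).order(id: :desc).limit(batch_size)`.

```ruby
# Stand-in for: where('id < ?', before_id).order(id: :desc).limit(batch_size)
# A nil before_id means "start from the highest id".
def page_desc(ids, before_id, batch_size)
  candidates = before_id ? ids.select { |id| id < before_id } : ids
  candidates.sort.reverse.first(batch_size)
end

# Yields at most `limit` ids, highest first, fetching batch_size ids at a time.
def each_id_desc(ids, limit:, batch_size: 1000)
  yielded = 0
  page = page_desc(ids, nil, batch_size)
  until page.empty?
    page.each do |id|
      return if yielded >= limit
      yield id
      yielded += 1
    end
    # The next batch starts strictly below the last id already seen
    page = page_desc(ids, page.last, batch_size)
  end
end

each_id_desc([1, 2, 3, 5, 8, 13, 21], limit: 5, batch_size: 2) { |id| puts id }
```

Because both the ordering and the cursor use id, a batch boundary cannot skip or repeat rows, which is the reason for preferring id over created_at here.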
You can iterate backwards by standard ruby iterators:
Thing.last.id.step(0,-1000) do |i|
Thing.where(id: (i-1000+1)..i).order('id DESC').each do |thing|
#...
end
end
Note: +1 is because BETWEEN which will be in query includes both bounds but we need include only one.
Sure, with this approach there could be fetched less than 1000 records in batch because some of them are deleted already but this is ok in my case.
As remarked by @Kirk in one of the comments, find_each supports limit as of version 5.1.0.
Example from the changelog:
Post.limit(10_000).find_each do |post|
# ...
end
The documentation says:
Limits are honored, and if present there is no requirement for the batch size: it can be less than, equal to, or greater than the limit.
(setting a custom order is still not supported though)
I was looking for the same behaviour and came up with this solution. This DOES NOT order by created_at, but I thought I would post it anyway.
max_records_to_retrieve = 50000
last_index = Thing.count
start_index = [(last_index - max_records_to_retrieve), 0].max
Thing.active.find_each(:start => start_index) do |u|
# do stuff
end
Drawbacks of this approach:
- You need 2 queries (first one should be fast)
- This guarantees a max of 50K records but if ids are skipped you will get less.
You can try ar-as-batches Gem.
From their documentation you can do something like this
Users.where(country_id: 44).order(:joined_at).offset(200).as_batches do |user|
user.party_all_night!
end
Using Kaminari or something similar makes this easy.
Create a batch loader class.
module BatchLoader
extend ActiveSupport::Concern
def batch_by_page(options = {})
options = init_batch_options!(options)
next_page = 1
loop do
next_page = yield(next_page, options[:batch_size])
break next_page if next_page.nil?
end
end
private
def default_batch_options
{
batch_size: 50
}
end
def init_batch_options!(options)
options ||= {}
default_batch_options.merge!(options)
end
end
Create Repository
class ThingRepository
include BatchLoader
# @param [Integer] per_page
# @param [Proc] block
def batch_changes(per_page=100, &block)
relation = Thing.active.order("created_at DESC")
batch_by_page do |next_page|
query = relation.page(next_page).per(per_page)
yield query if block_given?
query.next_page
end
end
end
Use the repository
repo = ThingRepository.new
# Pass the block directly: batch_changes yields each page and returns nil
repo.batch_changes(5000) do |g|
  g.each do |t|
    #...
  end
end
Adding find_in_batches_with_order solved my use case, where I already had the ids but needed batching and ordering. It was inspired by @dirk-geurs's solution.
# Create file config/initializers/find_in_batches_with_order.rb with the following code.
ActiveRecord::Batches.class_eval do
## Only flat order structure is supported now
## example: [:forename, :surname] is supported but [:forename, {surname: :asc}] is not supported
def find_in_batches_with_order(ids: nil, order: [], batch_size: 1000)
relation = self
arrangement = order.dup
index = order.find_index(:id)
unless index
arrangement.push(:id)
index = arrangement.length - 1
end
ids ||= relation.order(*arrangement).pluck(*arrangement).map{ |tupple| tupple[index] }
ids.each_slice(batch_size) do |chunk_ids|
chunk_relation = relation.where(id: chunk_ids).order(*order)
yield(chunk_relation)
end
end
end
Leaving Gist here https://gist.github.com/the-spectator/28b1176f98cc2f66e870755bb2334545
I had the same problem with a query with DISTINCT ON where you need an ORDER BY with that field, so this is my approach with Postgres:
def filtered_model_ids
Model.joins(:father_model)
.select('DISTINCT ON (model.field) model.id')
.order(:field)
.map(&:id)
end
def processor
filtered_model_ids.each_slice(BATCH_SIZE).lazy.each do |batch|
Model.find(batch).each do |record|
# Code
end
end
end
My code
batch_size = 100
total_count = klass.count
offset = 0
processed_count = 0
while processed_count < total_count
  relation = klass.order({ active_at: :asc, created_at: :desc }).offset(offset).limit(batch_size)
  relation.each do |record|
    record.process
  end
  # Advance the window; without this every iteration reloads the first batch
  offset += batch_size
  processed_count += batch_size
end
Do it in one query and avoid iterating:
User.offset(2).order('name DESC').last(3)
will produce a query like this
SELECT "users".* FROM "users" ORDER BY name ASC LIMIT $1 OFFSET $2 [["LIMIT", 3], ["OFFSET", 2]]