Efficient way to update multiple records with independent values? - sql

I have the following (overly db expensive) method:
def reorder_area_routes_by_demographics!
  self.area_routes.joins(:route).order(self.demo_criteria, :proximity_rank).readonly(false).each_with_index do |area_route, i|
    area_route.update_attributes(match_rank: i)
  end
end
But this results in an UPDATE query for each area_route. Is there a way to do this in one query?
--Edit--
Final solution, per coreyward's suggestion:
def reorder_area_routes_by_demographics!
  sorted_ids = area_routes.joins(:route).order(self.demo_criteria, :proximity_rank).pluck(:'area_routes.id')
  AreaRoute.update_all [efficient_sort_sql(sorted_ids), *sorted_ids], {id: sorted_ids}
end
def efficient_sort_sql(sorted_ids, offset=0)
  offset.upto(offset + sorted_ids.count - 1).inject('match_rank = CASE id ') do |sql, i|
    # Interpolate the rank (the block variable i); the id is bound through the ? placeholder
    sql << "WHEN ? THEN #{i} "
  end << 'END'
end

I use the following to do a similar task: updating the sort positions of a bevy of records according to their order in params. You might need to refactor or incorporate this differently to accommodate the scopes you're applying, but I think this will send you in the right direction.
def efficient_sort_sql(sortable_ids, offset = 1)
  offset.upto(offset + sortable_ids.count - 1).reduce('position = CASE id ') do |sql, i|
    sql << "WHEN ? THEN #{i} "
  end << 'END'
end
Model.update_all [efficient_sort_sql(sortable_ids, offset), *sortable_ids], { id: sortable_ids }
sortable_ids is an array of integers representing the ids of each object. The resulting SQL looks something like this:
UPDATE pancakes SET position = CASE id WHEN 5 THEN 1 WHEN 3 THEN 2 WHEN 4 THEN 3 WHEN 1 THEN 4 WHEN 2 THEN 5 END WHERE id IN (5,3,4,1,2);
This is, ugliness aside, a pretty performant query and, being a single statement, it will (at least in PostgreSQL) either fully succeed or fully fail.
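To make the shape of the generated statement concrete, here is a minimal plain-Ruby sketch that builds the whole UPDATE string without ActiveRecord. The helper name `case_update_sql` and its parameters are hypothetical, and ids are coerced with `Integer()` so that they can be interpolated directly. In a Rails app the `update_all` form above, with bound `?` placeholders, would normally be the better choice.

```ruby
# Hypothetical helper: builds one UPDATE that assigns consecutive ranks
# (starting at `offset`) to the given ids, in the order they are listed.
def case_update_sql(table, column, ids, offset = 1)
  ids = ids.map { |id| Integer(id) } # raises on anything that is not an integer
  return nil if ids.empty?           # a CASE with no WHEN branches is invalid SQL

  whens = ids.each_with_index.map { |id, i| "WHEN #{id} THEN #{offset + i}" }
  "UPDATE #{table} SET #{column} = CASE id #{whens.join(' ')} END " \
    "WHERE id IN (#{ids.join(', ')})"
end

puts case_update_sql('pancakes', 'position', [5, 3, 4, 1, 2])
```

The `WHERE id IN (...)` clause matters: without it, every row whose id is not listed in the CASE would have the column set to NULL.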

Related

Spark Sql Parser append additional parameter in UDF call

I take SQL statements as input from users, something like "CASE WHEN CALL_UDF("12G", 2) < 0 THEN 4 ELSE 5 END".
I would like to add an additional parameter to the UDF call in this string.
Expected result: "CASE WHEN CALL_UDF("12G", 2, additional_parameter) < 0 THEN 4 ELSE 5 END".
To achieve this I am trying to use SparkSqlParser, but I ran into problems implementing the replacement. Perhaps someone has implemented a similar solution. Thanks.
What I have already tried:
// Triple quotes keep the inner double quotes from ending the Scala string early
val expression = parser.parseExpression("""CASE WHEN CALL_UDF("12G", 2) < 0 THEN 3 ELSE 4 END""")
  .transformDown {
    case expression if expression.isInstanceOf[UnresolvedFunction] && expression.asInstanceOf[UnresolvedFunction].name.funcName == "CALL_UDF" =>
      UnresolvedFunction(FunctionIdentifier("CALL_UDF"), expression.children.toList ++ Seq(parser.parseExpression("4")), false)
  }

Rails: batched attribute queries using AREL

I'd like to use something like find_in_batches, but instead of grouping fully instantiated AR objects, I would like to group a certain attribute, like, let's say, the id. So, basically, a mixture of using find_in_batches and pluck:
Cars.where(:engine => "Turbo").pluck(:id).find_in_batches do |ids|
  puts ids
end
# [1, 2, 3....]
# ...
Is there a way to do this (maybe with Arel) without having to write the OFFSET/LIMIT logic myself or resorting to pagination gems like will_paginate or kaminari?
This is not the ideal solution, but here's a method that copies most of find_in_batches and yields a relation instead of an array of records (untested). Just monkey-patch it into Relation:
def in_batches( options = {} )
  relation = self
  unless arel.orders.blank? && arel.taken.blank?
    ActiveRecord::Base.logger.warn("Scoped order and limit are ignored, it's forced to be batch order and batch size")
  end
  if (finder_options = options.except(:start, :batch_size)).present?
    raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
    raise "You can't specify a limit, it's forced to be the batch_size" if options[:limit].present?
    relation = apply_finder_options(finder_options)
  end
  start = options.delete(:start)
  batch_size = options.delete(:batch_size) || 1000
  relation = relation.reorder(batch_order).limit(batch_size)
  relation = start ? relation.where(table[primary_key].gteq(start)) : relation
  while ( size = relation.size ) > 0
    yield relation
    break if size < batch_size
    primary_key_offset = relation.last.id
    if primary_key_offset
      relation = relation.where(table[primary_key].gt(primary_key_offset))
    else
      raise "Primary key not included in the custom select clause"
    end
  end
end
With this, you should be able to do:
Cars.where(:engine => "Turbo").in_batches do |relation|
  relation.pluck(:id)
end
This is not the best implementation possible (especially with regard to the primary_key_offset calculation, which instantiates a record), but you get the idea.
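The record instantiation mentioned above can be avoided when each batch is already an array of ids, because the next offset is then simply the last id in the batch. The following plain-Ruby sketch shows only that loop. `each_id_batch` and the `fetch` callable are hypothetical names, and `fetch` stands in for a query that returns up to `limit` ids greater than `after_id`, in ascending order.

```ruby
# Hypothetical helper: yields arrays of ids in ascending order.
# `fetch` is any callable taking (after_id, limit) and returning up to
# `limit` ids greater than `after_id` (all ids when after_id is nil).
def each_id_batch(fetch, batch_size: 1000)
  after_id = nil
  loop do
    ids = fetch.call(after_id, batch_size)
    break if ids.empty?

    yield ids
    break if ids.size < batch_size # a short batch means nothing is left

    after_id = ids.last # next offset comes straight from the ids, no record needed
  end
end

# In-memory stand-in for the database query, for illustration only.
all_ids = [1, 2, 3, 5, 8, 13, 21]
fetch = lambda do |after_id, limit|
  all_ids.select { |id| after_id.nil? || id > after_id }.first(limit)
end

each_id_batch(fetch, batch_size: 3) { |ids| p ids }
```

In a Rails app, `fetch` might wrap an ordered, limited `pluck(:id)` on the scoped relation; that wiring is left out here because it depends on the model.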

django using .extra() got error `only a single result allowed for a SELECT that is part of an expression`

I'm trying to use .extra() where the query returns more than 1 result, like:
'SELECT "books_books"."*" FROM "books_books" WHERE "books_books"."owner_id" = %s' % request.user.id
I got an error: only a single result allowed for a SELECT that is part of an expression
I tried it on the dev server using sqlite3. Does anybody know how to fix this? Or is my query wrong?
EDIT:
I'm using django-simple-ratings, and my model looks like this:
class Thread(models.Model):
    #
    #
    ratings = Ratings()
I want to display each Thread's ratings and whether the user has already rated it. For 2 items, this hits the database 6 times: 1 for the actual Thread and 2 for accessing the ratings. The query:
threads = Thread.ratings.order_by_rating().filter(section = section)\
    .select_related('creator')\
    .prefetch_related('replies')
threads = threads.extra(select = dict(myratings = "SELECT SUM('section_threadrating'.'score') AS 'agg' FROM 'section_threadrating' WHERE 'section_threadrating'.'content_object_id' = 'section_thread'.'id' ",))
Then I can print each Thread's ratings without hitting the db again. For the 2nd query, I add:
#continue from extra
blahblah.extra(select = dict(myratings = '#####code above####',
voter_id = "SELECT 'section_threadrating'.'user_id' FROM 'section_threadrating' WHERE ('section_threadrating'.'content_object_id' = 'section_thread'.'id' AND 'section_threadrating'.'user_id' = '3') "))
The user_id is hard-coded here. Then, when I use it in a template like this:
{% ifequal threads.voter_id user.id %}
#the rest of the code
I get this error: only a single result allowed for a SELECT that is part of an expression
Let me know if it's not clear enough.
The problem is in the query. Generally, when you write a subquery as part of an expression, it must return only 1 result. So a subquery like the one for voter_id:
select ..., (select section_threadrating.user_id from ...) as voter_id from ....
is invalid, because it can return more than one result. If you are sure it will always return one result, you can use the max() or min() aggregation function:
blahblah.extra(select = dict(myratings = '#####code above####',
voter_id = "SELECT max('section_threadrating'.'user_id') FROM 'section_threadrating' WHERE ('section_threadrating'.'content_object_id' = 'section_thread'.'id' AND 'section_threadrating'.'user_id' = '3') "))
This will make the subquery always return 1 result.
Removing that hard-code, what user_id are you expecting to retrieve here? Maybe you just can't reduce to 1 user using only SQL.

Use When Case in SQL server to Ruby On Rails

I have this query in SQL
select @cd_x=
  Case
    when tp_x2='ZZZ' then tp_x3
    when tp_x2='XXX' then tp_x3
    else
      tp_x2
  end
from table
where id=@id
How can I translate this query into a statement in Ruby on Rails?
I think you would be looking at something like:
@cd_x = table.select("CASE WHEN tp_x2='ZZZ' THEN tp_x3 WHEN tp_x2='XXX' then tp_x3 ELSE tp_x2 END").where(:id => id)
I am using this as a model, though they created a whole module out of it rather than one line: Case Statement in ActiveRecord
I had a similar problem. I wanted to update multiple records with one query. Basically an order-sorting update, using SQL CASE.
#List of product ids in sorted order. Get from jqueryui sortable plugin.
#product_ids = [3,1,2,4,7,6,5]
#product_ids.each_with_index do |id, index|
# Product.where(id: id).update_all(sort_order: index+1)
#end
##CASE syntax example:
##Product.where(id: product_ids).update_all("sort_order = CASE id WHEN 539 THEN 1 WHEN 540 THEN 2 WHEN 542 THEN 3 END")
case_string = "sort_order = CASE id "
product_ids.each_with_index do |id, index|
  case_string += "WHEN #{id} THEN #{index+1} "
end
case_string += "END"
Product.where(id: product_ids).update_all(case_string)

searching for and ranking results

I'm trying to write a relatively simple algorithm to search for a string on several attributes.
Given some data:
1: name: 'Josh', location: 'los angeles'
2: name: 'Josh', location: 'york'
search string: "josh york"
The results should be [2, 1] because that query string hits the 2nd record twice, and the 1st record once.
It's safe to assume case-insensitivity here.
So here's what I have so far, in ruby/active record:
query_string = "josh new york"
some_attributes = [:name, :location]
results = {}
query_string.downcase.split.each do |query_part|
  some_attributes.each do |attribute|
    find(:all, :conditions => ["#{attribute} like ?", "%#{query_part}%"]).each do |result|
      if results[result]
        results[result] += 1
      else
        results[result] = 1
      end
    end
  end
end
results.sort{|a,b| b[1]<=>a[1]}
The issue I have with this method is that it produces a large number of queries (query_string.split.length * some_attributes.length).
Can I make this more efficient somehow by reducing the number of queries?
I'm okay with sorting within Ruby, although if that can somehow be jammed into the SQL that'd be nice too.
Why aren't you using something like Ferret? Ferret is a Ruby + C extension to make a full text index. Since you seem to be using ActiveRecord, there's also acts_as_ferret.
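If a full-text index is more than the situation needs, the query count can also be cut by fetching candidates once and scoring them in Ruby. The sketch below is plain Ruby over an array of hashes, with hypothetical names (`rank_matches`, `records`). In ActiveRecord the candidates would come from a single query whose LIKE conditions are joined with OR, which is not shown here.

```ruby
# Hypothetical helper: scores each record by how many (query word, attribute)
# pairs match case-insensitively, then returns ids ordered by descending score.
def rank_matches(records, query_string, attributes)
  words = query_string.downcase.split

  scores = records.each_with_object({}) do |record, acc|
    hits = words.sum do |word|
      attributes.count { |attr| record[attr].to_s.downcase.include?(word) }
    end
    acc[record[:id]] = hits if hits > 0 # records with no hit are dropped
  end

  # Sort by score (highest first); ties fall back to ascending id.
  scores.sort_by { |id, hits| [-hits, id] }.map(&:first)
end

records = [
  { id: 1, name: 'Josh', location: 'los angeles' },
  { id: 2, name: 'Josh', location: 'york' }
]

p rank_matches(records, 'josh york', %i[name location]) # => [2, 1]
```

This keeps the scoring rule from the question (one point per word and attribute match) while touching the data source only once.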