Naming SQL queries in Rails / ActiveRecord - sql

When using Rails with ActiveRecord (and PostgreSQL), executing "simple" queries adds a name to them, e.g. calling
Article.all
# => Article Load (2.6ms) SELECT "articles".* FROM "articles"
names the query Article Load. However, when executing slightly more complex queries, no name is generated, for example with
Article.group(:article_group_id).count
# => (1.2ms) SELECT COUNT(*) AS count_all, "articles"."article_group_id" AS articles_article_group_id FROM "articles" GROUP BY "articles"."article_group_id"
I can add a name if executing a custom query using the execute method:
ActiveRecord::Base.connection.execute("SELECT * FROM articles", "My custom query name")
# => My custom query name (2.5ms) SELECT * FROM articles
But is there a way to add a custom name to a query built with the ActiveRecord-methods?
If you wonder why: The name is useful for all kinds of monitoring, e.g. when looking at slow queries in AppSignal.

Since you only want a custom query name for monitoring purposes, I think you only need to change the query name in the ActiveRecord::ConnectionAdapters::AbstractAdapter#log method. This is the method that logs each executed SQL query, including the query name.
Here is my solution:
# lib/active_record/base.rb
# note that MUST be base.rb
# otherwise you need to add initializer to extend Rails core
#
module ActiveRecord
  module ConnectionAdapters
    class AbstractAdapter
      attr_accessor :log_tag
      private
      alias old_log log
      def log(sql, name = "SQL", binds = [], type_casted_binds = [], statement_name = nil)
        if name != 'SCHEMA'
          name = @log_tag # nil when no tag was set, so untagged queries are logged without a name
          @log_tag = nil # reset so the tag applies to the next query only
        end
        old_log(sql, name, binds, type_casted_binds, statement_name) do
          yield
        end
      end
    end
  end
  module QueryMethods
    def log_tag(tag_name) # class method
      spawn.log_tag(tag_name)
      self
    end
  end
  module Querying
    delegate :log_tag, to: :all
  end
  class Relation
    def log_tag(tag_name) # instance method
      conn = klass.connection
      conn.log_tag = tag_name
      self
    end
  end
end
Demo
Task.log_tag("DEMO").group(:status).count
# DEMO (0.7ms) SELECT COUNT(*) AS count_all, "tasks"."status" AS tasks_status FROM "tasks" GROUP BY "tasks"."status"
Task.where(status: 6).log_tag("SIX").first(20)
# SIX (0.8ms) SELECT "tasks".* FROM "tasks" WHERE "tasks"."status" = ? ORDER BY "tasks"."id" ASC LIMIT ?
Task.where(status: 6).first(20)
# (0.8ms) SELECT "tasks".* FROM "tasks" WHERE "tasks"."status" = ? ORDER BY "tasks"."id" ASC LIMIT ?
Note
If you want a fixed name for one specific query, you can use a hash whose key is the full SQL string (or a digest of it, similar to how Rails core computes a query signature: query_signature = ActiveSupport::Digest.hexdigest(to_sql)) and whose value is the query name you want.
# set up beforehand
ActiveRecord::ConnectionAdapters::LogTags[Product.where...to_sql] = "DEMO"
# AbstractAdapter
LogTags = Hash.new
def log(sql, name...)
name = LogTags[sql]
# ...
end
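The lookup described in the note above can be sketched in plain Ruby, independent of Rails. The class QueryNameRegistry and its method names are hypothetical, and Digest::SHA256 from the standard library stands in for ActiveSupport::Digest. This only illustrates keying names by a digest of the SQL text; it does not patch the adapter.

```ruby
require "digest"

# Hypothetical registry: maps a digest of the full SQL string to a custom name.
class QueryNameRegistry
  def initialize
    @names = {}
  end

  # Register a name for one exact SQL string.
  def register(sql, name)
    @names[signature(sql)] = name
  end

  # Return the registered name, or the given default when nothing matches.
  def name_for(sql, default = "SQL")
    @names.fetch(signature(sql), default)
  end

  private

  # Digest of the SQL text, so long statements are not stored as hash keys.
  def signature(sql)
    Digest::SHA256.hexdigest(sql)
  end
end

registry = QueryNameRegistry.new
registry.register('SELECT * FROM "articles"', "DEMO")
puts registry.name_for('SELECT * FROM "articles"') # prints DEMO
puts registry.name_for('SELECT * FROM "tasks"')    # prints SQL
```

A lookup only matches when the SQL text is character-for-character identical, so the same query rendered with inlined values and with bind placeholders produces two different keys.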

Related

How to use exec_query with dynamic SQL

I am working on a query and am using exec_query with binds to avoid potential SQL injection. However, I am running into an issue when trying to check that an id is in an array.
SELECT JSON_AGG(agg_data)
FROM (
SELECT t1.col1, t1.col2, t2.col1, t2.col2, t3.col3, t3.col4, t4.col7, t4.col8, t5.col5, t5.col6
FROM t1
JOIN t2 ON t1.id = t2.t1_id
JOIN t3 ON t1.id = t3.t3_id
JOIN t4 ON t2.id = t4.t2_id
JOIN t5 ON t3.id = t5.t3_id
WHERE t2.id IN ($1) AND t4.id = $2
) agg_data
This gives the error invalid input syntax for integer: '1,2,3,4,5'
And SELECT ... WHERE t.id = ANY($1) gives ERROR: malformed array literal: "1,2,3,4,5,6,7" DETAIL: Array value must start with "{" or dimension information.
If I add the curly braces around the bind variable I get invalid input syntax for integer: "$1"
Here is the way I'm using exec_query
connection.exec_query(<<~EOQ, "-- CUSTOM SQL --", [[nil, array_of_ids], [nil, model_id]], prepare: true)
SELECT ... WHERE t.id IN ($1)
EOQ
I have tried plain interpolation, but that triggers Brakeman warnings about SQL injection, so I can't go that way :(
Any help on being able to make this check is greatly appreciated. And if exec_query is the wrong way to go about this, I'm definitely down to try other things :D
In my class, I am using AR's internal SQL injection prevention to look up the ids for the first bind variable, then plucking the ids and joining them into a string for the SQL query. I am doing the same for the other bind variable: finding the object and using its id, just as a further precaution. So by the time the user inputs are used for the query, they've been through AR already. It's a Brakeman scan that is throwing the error. I have a meeting on Monday with our security team about this, but wanted to check here also :D
Let Rails do the sanitization for you:
ar = [1,2,8,9,100,800]
MyModel.where(id: ar)
Your concern about SQL injection suggests that ar is derived from user input. It's superfluous, but you may want to make sure it's a list of integers: ar = user_ar.map(&:to_i).
# with just Rails sanitization
ar = "; drop table users;" # sql injection
MyModel.where(id: ar)
# query is:
# SELECT `my_models`.* from `my_models` WHERE `my_models`.`id` = NULL;
# or
ar = [1,2,8,100,"; drop table users;"]
MyModel.where(id: ar)
# query is
# SELECT `my_models`.* from `my_models` WHERE `my_models`.`id` in (1,2,8,100);
Rails has got you covered!
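For the optional integer coercion mentioned above, here is a minimal plain-Ruby sketch of the difference between to_i and a strict conversion. The helper name strict_ids is hypothetical. to_i silently turns non-numeric input into 0, while Integer() raises, which may be preferable if you would rather reject bad input than query for id 0.

```ruby
# Hypothetical helper: strictly coerce user-supplied ids to integers.
# Integer(str, 10) raises ArgumentError for anything that is not a
# base-10 whole number, whereas to_i silently returns 0.
def strict_ids(values)
  values.map { |v| Integer(v.to_s, 10) }
end

p ["1", "2", "8"].map(&:to_i)  # prints [1, 2, 8]
p "; drop table users;".to_i   # prints 0
p strict_ids(["1", "2", "8"])  # prints [1, 2, 8]

begin
  strict_ids(["1", "; drop table users;"])
rescue ArgumentError
  puts "rejected non-numeric id"
end
```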
With Arel you could compose that query as:
class Aggregator
  def initialize(connection: ActiveRecord::Base.connection)
    @connection = connection
    @t1 = Arel::Table.new('t1')
    @t2 = Arel::Table.new('t2')
    @t3 = Arel::Table.new('t3')
    @t4 = Arel::Table.new('t4')
    @t5 = Arel::Table.new('t5')
    @columns = [
      :col1,
      :col2,
      @t2[:col1],
      @t2[:col2],
      @t3[:col3],
      @t3[:col4],
      @t4[:col7],
      @t4[:col8],
      @t5[:col5],
      @t5[:col6]
    ]
  end
  def query(t2_ids:, t4_id:)
    agg_data = t1.project(*columns)
      .where(
        t2[:id].in(t2_ids)
          .and(t4[:id].eq(t4_id))
      )
      .join(t2).on(t1[:id].eq(t2[:t1_id]))
      .join(t3).on(t1[:id].eq(t3[:t1_id]))
      .join(t4).on(t1[:id].eq(t4[:t1_id]))
      .join(t5).on(t1[:id].eq(t5[:t1_id]))
      .as('agg_data')
    yield agg_data if block_given?
    t1.project('JSON_AGG(agg_data)')
      .from(agg_data)
  end
  def exec_query(t2_ids:, t4_id:)
    connection.exec_query(
      query(t2_ids: t2_ids, t4_id: t4_id),
      "-- CUSTOM SQL --"
    )
  end
  private
  attr_reader :connection, :t1, :t2, :t3, :t4, :t5, :columns
end
Of course it would be a lot cleaner to just set up some models so that you can do t1.joins(:t2, :t3, :t4, ...). Your performance concerns are pretty unfounded, as ActiveRecord has quite a few methods to query and get raw results instead of model instances.
Using bind variables for a WHERE IN () condition is somewhat problematic as you have to use a matching number of bind variables to the number of elements in the list:
irb(main):118:0> T1.where(id: [1, 2, 3])
T1 Load (0.2ms) SELECT "t1s".* FROM "t1s" WHERE "t1s"."id" IN (?, ?, ?) /* loading for inspect */ LIMIT ?
This means that you have to know the number of bind variables beforehand when preparing the query. As a hacky workaround you can use some creative typecasting to get Postgres to split a comma-separated string into an array:
class Aggregator
  # ...
  def query
    agg_data = t1.project(*columns)
      .where(
        t2[:id].eq('any (string_to_array(?)::int[])')
          .and(t4[:id].eq(Arel::Nodes::BindParam.new('$2')))
      )
      .join(t2).on(t1[:id].eq(t2[:t1_id]))
      .join(t3).on(t1[:id].eq(t3[:t1_id]))
      .join(t4).on(t1[:id].eq(t4[:t1_id]))
      .join(t5).on(t1[:id].eq(t5[:t1_id]))
      .as('agg_data')
    yield agg_data if block_given?
    t1.project('JSON_AGG(agg_data)')
      .from(agg_data)
  end
  def exec_query(t2_ids:, t4_id:)
    connection.exec_query(
      query,
      "-- CUSTOM SQL --", # comma added: the binds array below is a separate argument
      [
        [t2_ids.map {|id| Arel::Nodes.build_quoted(id) }.join(',')],
        [t4_id]
      ]
    )
  end
  # ...
end
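The two options discussed above (one placeholder per list element, or a single array-valued parameter) can be sketched with plain string building. Both helper names are hypothetical. The ANY($1::int[]) form assumes the driver passes the literal through as a single text parameter that Postgres then casts, which is worth verifying against your adapter before relying on it.

```ruby
# Hypothetical helpers, plain Ruby, no database required.

# Build "$1, $2, ..." with one numbered placeholder per element,
# starting at the given position.
def numbered_placeholders(count, start: 1)
  (start...(start + count)).map { |n| "$#{n}" }.join(", ")
end

# Build a PostgreSQL integer array literal such as "{1,2,3}".
# Integer() rejects anything that is not a base-10 whole number.
def pg_int_array_literal(ids)
  body = ids.map { |id| Integer(id.to_s, 10) }.join(",")
  "{#{body}}"
end

ids = [1, 2, 3]

# Option 1: the placeholder count follows the list size.
in_sql = "WHERE t2.id IN (#{numbered_placeholders(ids.size)}) AND t4.id = $#{ids.size + 1}"
puts in_sql # prints WHERE t2.id IN ($1, $2, $3) AND t4.id = $4

# Option 2: a fixed statement with one array-valued bind.
any_sql = "WHERE t2.id = ANY($1::int[]) AND t4.id = $2"
puts any_sql
puts pg_int_array_literal(ids) # prints {1,2,3}
```

The second form keeps the SQL text constant regardless of list length, which is what makes it usable with a prepared statement; the curly braces belong in the bound value, not around $1 in the SQL.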

How to insert custom value after validation in rails model

This has been really difficult to find information on. The crux of it all is that I've got a Rails 3.2 app that accesses a MySQL database table with a column of type POINT. Without non-native code, rails doesn't know how to interpret this, which is fine because I only use it in internal DB queries.
The problem, however, is that it gets cast as an integer, and forced to null if blank. MySQL doesn't allow null for this field because there's an index on it, and integers are invalid, so this effectively means that I can't create new records through rails.
I've been searching for a way to change the value just before insertion into the db, but I'm just not up enough on my rails lit to pull it off. So far I've tried the following:
...
after_validation :set_geopoint_blank
def set_geopoint_blank
  raw_write_attribute(:geopoint, '') if geopoint.blank?
  #this results in NULL value in INSERT statement
end
---------------------------
#thing_controller.rb
...
def create
  @thing = Thing.new
  @thing.geopoint = 'GeomFromText("POINT(' + lat + ' ' + lng + ')")'
  @thing.save
  # This also results in NULL and an error
end
---------------------------
#thing_controller.rb
...
def create
  @thing = Thing.new
  @thing.geopoint = '1'
  @thing.save
  # This results in `1` being inserted, but fails because that's invalid spatial data.
end
To me, the ideal would be to be able to force rails to put the string 'GeomFromText(...)' into the insert statement that it creates, but I don't know how to do that.
Awaiting the thoughts and opinions of the all-knowing community....
Ok, I ended up using the first link in steve klein's comment to just insert raw sql. Here's what my code looks like in the end:
def create
  # Create a Thing instance and assign it the POSTed values
  @thing = Thing.new
  @thing.assign_attributes(params[:thing], :as => :admin)
  # Check to see if all the passed values are valid
  if @thing.valid?
    # If so, start a DB transaction
    ActiveRecord::Base.transaction do
      # Insert the minimum data, plus the geopoint
      # (to_f coerces lat/lng to numbers before they are concatenated into the SQL)
      sql = 'INSERT INTO `things`
        (`thing_name`,`thing_location`,`geopoint`)
        values (
        "tmp_insert",
        "tmp_location",
        GeomFromText("POINT(' + params[:thing][:lat].to_f.to_s + ' ' + params[:thing][:lng].to_f.to_s + ')")
        )'
      id = ActiveRecord::Base.connection.insert(sql)
      # Then load the newly created Thing instance and update its values with the passed values
      @real_thing = Thing.find(id)
      @real_thing.update_attributes(params[:thing], :as => :admin)
    end
    # Notify the user of success
    flash[:message] = { :header => 'Thing successfully created!' }
    redirect_to edit_admin_thing_path(@real_thing)
  else
    # If passed values not valid, alert and re-render form
    flash[:error] = { :header => 'Oops! You\'ve got some errors:', :body => @thing.errors.full_messages.join("</p><p>").html_safe }
    render 'admin/things/new'
  end
end
Not beautiful, but it works.
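Because the coordinates are concatenated into raw SQL, the numeric coercion is what keeps the statement safe. A stricter variant can be sketched as follows; the helper name wkt_point is hypothetical. It uses Float(), which raises on non-numeric input, instead of to_f, which silently returns 0.0.

```ruby
# Hypothetical helper: build the WKT text for a point from untrusted input.
# Float() raises ArgumentError for non-numeric strings, so only numbers
# can end up inside the generated text.
def wkt_point(lat, lng)
  "POINT(#{Float(lat)} #{Float(lng)})"
end

puts wkt_point("51.5", "-0.12") # prints POINT(51.5 -0.12)

begin
  wkt_point("51.5); DROP TABLE things; --", "0")
rescue ArgumentError
  puts "rejected non-numeric coordinate"
end
```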

Rails: batched attribute queries using AREL

I'd like to use something like find_in_batches, but instead of grouping fully instantiated AR objects, I would like to group a certain attribute, like, let's say, the id. So, basically, a mixture of using find_in_batches and pluck:
Cars.where(:engine => "Turbo").pluck(:id).find_in_batches do |ids|
puts ids
end
# [1, 2, 3....]
# ...
Is there a way to do this (maybe with Arel) without having to write the OFFSET/LIMIT logic myself or resorting to pagination gems like will_paginate or kaminari?
This is not the ideal solution, but here's a method that just copy-pastes most of find_in_batches but yields a relation instead of an array of records (untested) - just monkey-patch it into Relation :
def in_batches( options = {} )
  relation = self
  unless arel.orders.blank? && arel.taken.blank?
    ActiveRecord::Base.logger.warn("Scoped order and limit are ignored, it's forced to be batch order and batch size")
  end
  if (finder_options = options.except(:start, :batch_size)).present?
    raise "You can't specify an order, it's forced to be #{batch_order}" if options[:order].present?
    raise "You can't specify a limit, it's forced to be the batch_size" if options[:limit].present?
    relation = apply_finder_options(finder_options)
  end
  start = options.delete(:start)
  batch_size = options.delete(:batch_size) || 1000
  relation = relation.reorder(batch_order).limit(batch_size)
  relation = start ? relation.where(table[primary_key].gteq(start)) : relation
  while ( size = relation.size ) > 0
    yield relation
    break if size < batch_size
    primary_key_offset = relation.last.id
    if primary_key_offset
      relation = relation.where(table[primary_key].gt(primary_key_offset))
    else
      raise "Primary key not included in the custom select clause"
    end
  end
end
With this, you should be able to do :
Cars.where(:engine => "Turbo").in_batches do |relation|
relation.pluck(:id)
end
This is not the best implementation possible (especially the primary_key_offset calculation, which instantiates a record), but you get the idea.
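The batching loop above can also be written so that it only ever handles ids, which avoids instantiating a record to find the offset. The following is a minimal sketch in plain Ruby; each_id_batch and the fetch_page callable are hypothetical, and the in-memory lambda merely stands in for a query that filters on id greater than the last seen id, orders by id, applies the limit, and plucks the id.

```ruby
# Hypothetical sketch: yield batches of ids using keyset pagination.
# fetch_page is any callable taking (after_id, limit) and returning the next
# ids in ascending order; after_id is nil for the first page.
def each_id_batch(fetch_page, batch_size: 1000)
  last_id = nil
  loop do
    ids = fetch_page.call(last_id, batch_size)
    break if ids.empty?

    yield ids
    # A short page means there is nothing left to fetch.
    break if ids.size < batch_size

    last_id = ids.last
  end
end

# In-memory stand-in for the database, for illustration only.
ALL_IDS = (1..7).to_a
fake_fetch = lambda do |after_id, limit|
  ALL_IDS.select { |id| after_id.nil? || id > after_id }.first(limit)
end

each_id_batch(fake_fetch, batch_size: 3) { |ids| p ids }
# prints [1, 2, 3], then [4, 5, 6], then [7]
```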

Call display value for choices with SQL

I'm wanting to write SQL for a Django field that uses CharField(choices=()) and have the display value show up in the SQL rather than the call value. Any idea how to do this? It's similar to get_FOO_display().
For reference's sake, here's my model:
class Person(models.Model):
    STUDENT_CHOICES=(
        (0,'None'),
        (1,'UA Current LDP'),
        (2,'UA LDP Alumni'),
        (3,'MSU Current LDP'),
        (4,'MSU LDP Alumni')
    )
    ...
    studentStatus=models.IntegerField(choices=STUDENT_CHOICES, verbose_name="Student Status", null=True, blank=True)
And my query:
def mailingListQuery(request):
    ...
    if request.POST:
        ...
        sql = """
        ...
        per."studentStatus" # Here's where I want to access the display value
        left outer join person as per on (per.party_id = p.id)
        """
Thanks in advance!
You can use something like:
STUDENT_CHOICES=(
    ('None', 'None'),
)
Also, avoid using raw SQL. If you really need it, always use parametrized queries:
connection.cursor().execute('sql with %s params', [params])

rails 3 where statement

In Rails 3, the where method of Active Record returns an ActiveRecord::Relation, i.e. it uses lazy loading, like
cars = Car.where(:colour => 'black') # No Query
cars.each {|c| puts c.name } # Fires "select * from cars where ..."
But when I run
cars = Car.where(:colour => 'black')
in the console, it returns the result right away, without this lazy loading. Why?
Your console implicitly calls inspect on the result of your expression, which triggers the query.
You can avoid the inspection by appending a semicolon:
cars = Car.where(:colour => 'black');
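The mechanism can be illustrated with a small stand-in class in plain Ruby. LazyRelation is hypothetical and is not the Rails class; it only shows that a lazily loaded object whose inspect method needs the records will run its loader as soon as a console prints it.

```ruby
# Hypothetical stand-in for a lazily loaded relation (not the Rails class).
class LazyRelation
  include Enumerable

  attr_reader :query_count

  def initialize(&loader)
    @loader = loader
    @records = nil
    @query_count = 0
  end

  # Iterating forces the load, like cars.each in the question.
  def each(&block)
    load_records.each(&block)
  end

  # inspect also forces the load; a console calls inspect to print a result.
  def inspect
    load_records.inspect
  end

  private

  # Run the loader once and cache the result.
  def load_records
    @records ||= begin
      @query_count += 1
      @loader.call
    end
  end
end

cars = LazyRelation.new { ["Beetle", "Mini"] }
puts cars.query_count # prints 0, nothing loaded yet
cars.inspect          # what the console does implicitly
puts cars.query_count # prints 1
```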