I am working on a query and am using exec_query with binds to avoid potential SQL injection. However, I am running into an issue when trying to check that an id is in an array.
SELECT JSON_AGG(agg_data)
FROM (
SELECT t1.col1, t1.col2, t2.col1, t2.col2, t3.col3, t3.col4, t4.col7, t4.col8, t5.col5, t5.col6
FROM t1
JOIN t2 ON t1.id = t2.t1_id
JOIN t3 ON t1.id = t3.t3_id
JOIN t4 ON t2.id = t4.t2_id
JOIN t5 ON t3.id = t5.t3_id
WHERE t2.id IN ($1) AND t4.id = $2
) agg_data
This gives the error: invalid input syntax for integer: '1,2,3,4,5'
And SELECT ... WHERE t.id = ANY($1) gives ERROR: malformed array literal: "1,2,3,4,5,6,7" DETAIL: Array value must start with "{" or dimension information.
If I add the curly braces around the bind variable I get invalid input syntax for integer: "$1"
Here is the way I'm using exec_query
connection.exec_query(<<~EOQ, "-- CUSTOM SQL --", [[nil, array_of_ids], [nil, model_id]], prepare: true)
SELECT ... WHERE t.id IN ($1)
EOQ
I have tried plain interpolation, but that throws Brakeman errors about SQL injection, so I can't go that way :(
Any help on being able to make this check is greatly appreciated. And if exec_query is the wrong way to go about this, I'm definitely down to try other things :D
In my class, I am using AR's internal SQL injection prevention to look up the records for the first bind variable, then plucking the ids and joining them into a string for the SQL query. I am doing the same for the other bind variable: finding the object and using its id, just as a further precaution. So by the time the user inputs are used for the query, they've been through AR already. It's a Brakeman scan that is throwing the error. I have a meeting on Monday with our security team about this, but wanted to check here also :D
Let Rails do the sanitization for you:
ar = [1,2,8,9,100,800]
MyModel.where(id: ar)
Your concern about SQL injection suggests that ar is derived from user input. It's superfluous, but you may want to make sure it's a list of integers: ar = user_ar.map(&:to_i).
# with just Rails sanitization
ar = "; drop table users;" # sql injection
MyModel.where(id: ar)
# query is:
# SELECT `my_models`.* from `my_models` WHERE `my_models`.`id` = NULL;
# or
ar = [1,2,8,100,"; drop table users;"]
MyModel.where(id: ar)
# query is
# SELECT `my_models`.* from `my_models` WHERE `my_models`.`id` in (1,2,8,100);
Rails has got you covered!
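If you do want the extra integer check mentioned above, note that to_i turns anything it cannot parse into 0 rather than rejecting it. Here is a minimal sketch of a stricter filter in plain Ruby; strict_integer_ids is a hypothetical helper name, not something Rails provides, and it assumes Ruby 2.7+ for filter_map.

```ruby
# Hypothetical helper (not part of Rails): keep only values that parse
# cleanly as base-10 integers instead of letting to_i coerce junk to 0.
def strict_integer_ids(values)
  values.filter_map do |value|
    # With exception: false, Integer() returns nil instead of raising.
    Integer(value.to_s, 10, exception: false)
  end
end

user_input = ["1", "2", "8", "100", "; drop table users;"]

coerced = user_input.map(&:to_i)          # the junk string becomes 0
strict  = strict_integer_ids(user_input)  # the junk string is dropped
```

Whether you would rather drop bad values silently or raise on them is a design choice; the sketch drops them.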
With Arel you could compose that query as:
class Aggregator
def initialize(connection: ActiveRecord::Base.connection)
@connection = connection
@t1 = Arel::Table.new('t1')
@t2 = Arel::Table.new('t2')
@t3 = Arel::Table.new('t3')
@t4 = Arel::Table.new('t4')
@t5 = Arel::Table.new('t5')
@columns = [
:col1,
:col2,
@t2[:col1],
@t2[:col2],
@t3[:col3],
@t3[:col4],
@t4[:col7],
@t4[:col8],
@t5[:col5],
@t5[:col6]
]
end
def query(t2_ids:, t4_id:)
agg_data = t1.project(*columns)
.where(
t2[:id].in(t2_ids)
.and(t4[:id].eq(t4_id))
)
.join(t2).on(t1[:id].eq(t2[:t1_id]))
.join(t3).on(t1[:id].eq(t3[:t1_id]))
.join(t4).on(t1[:id].eq(t4[:t1_id]))
.join(t5).on(t1[:id].eq(t5[:t1_id]))
.as('agg_data')
yield agg_data if block_given?
t1.project('JSON_AGG(agg_data)')
.from(agg_data)
end
def exec_query(t2_ids:, t4_id:)
connection.exec_query(
# exec_query expects a SQL string, so render the Arel AST first
query(t2_ids: t2_ids, t4_id: t4_id).to_sql,
"-- CUSTOM SQL --"
)
end
private
attr_reader :connection, :t1, :t2, :t3, :t4, :t5, :columns
end
Of course it would be a lot cleaner to just set up some models so that you can do t1.joins(:t2, :t3, :t4, ...). Your performance concerns are pretty unfounded, as ActiveRecord has quite a few methods to query and get raw results instead of model instances.
Using bind variables for a WHERE IN () condition is somewhat problematic as you have to use a matching number of bind variables to the number of elements in the list:
irb(main):118:0> T1.where(id: [1, 2, 3])
T1 Load (0.2ms) SELECT "t1s".* FROM "t1s" WHERE "t1s"."id" IN (?, ?, ?) /* loading for inspect */ LIMIT ?
Which means that you have to know the number of bind variables beforehand when preparing the query. As a hacky workaround you can use some creative typecasting to get Postgres to split a comma-separated string into an array:
class Aggregator
# ...
def query
agg_data = t1.project(*columns)
.where(
t2[:id].eq("any (string_to_array(?, ',')::int[])") # string_to_array needs a delimiter argument
.and(t4[:id].eq(Arel::Nodes::BindParam.new('$2')))
)
.join(t2).on(t1[:id].eq(t2[:t1_id]))
.join(t3).on(t1[:id].eq(t3[:t1_id]))
.join(t4).on(t1[:id].eq(t4[:t1_id]))
.join(t5).on(t1[:id].eq(t5[:t1_id]))
.as('agg_data')
yield agg_data if block_given?
t1.project('JSON_AGG(agg_data)')
.from(agg_data)
end
def exec_query(t2_ids:, t4_id:)
connection.exec_query(
query,
"-- CUSTOM SQL --",
[
[t2_ids.map {|id| Arel::Nodes.build_quoted(id) }.join(',')],
[t4_id]
]
)
end
# ...
end
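To make the two options concrete, here is a rough sketch in plain Ruby of (a) generating one numbered placeholder per element and (b) building a Postgres array literal that can be sent as a single bind and cast with ANY($1::int[]). Both helper names are hypothetical and neither is part of Rails or the pg gem; how the resulting binds are handed to exec_query depends on your Rails version, so that part is left out.

```ruby
# Hypothetical helpers; neither is part of Rails or the pg gem.

# One numbered placeholder per element, e.g. "$1, $2, $3".
def numbered_placeholders(count, offset: 0)
  (1..count).map { |i| "$#{i + offset}" }.join(", ")
end

# A Postgres array literal such as "{1,2,3}", meant to be sent as ONE bind
# and cast on the server side: WHERE t2.id = ANY($1::int[]).
# Integer() raises on anything that is not an integer, so no junk gets in.
def pg_int_array_literal(ids)
  "{#{ids.map { |id| Integer(id) }.join(',')}}"
end

ids = [1, 2, 3]

# Option (a): the SQL text changes with the number of ids.
in_list_sql = "WHERE t2.id IN (#{numbered_placeholders(ids.size)}) AND t4.id = $#{ids.size + 1}"

# Option (b): the SQL text is fixed; only the bind value changes.
any_sql    = "WHERE t2.id = ANY($1::int[]) AND t4.id = $2"
array_bind = pg_int_array_literal(ids)
```

Option (b) keeps the statement text constant, which is what you want if the query is prepared once and reused. It also explains the "malformed array literal" error in the question: the bind has to arrive already wrapped in curly braces.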
I have this raw query, but when I run it I get a SQL syntax error.
Here are the date params I use:
$last_week = Carbon::now()->subdays(7);
$last_2_week = Carbon::now()->subdays(14);
Here is the query:
DB::raw("(SELECT SUM(vd.qty_available) FROM products AS P
JOIN variations AS V ON V.product_id=P.id
JOIN variation_location_details AS VD ON VD.variation_id=V.id
JOIN transaction_sell_lines as ts ON ts.variation_id=v.id
JOIN transactions as t ON ts.transaction_id=t.id
WHERE t.transaction_date>=$last_2_week) AS last_2_week_quantity"),
Error: SQLSTATE[42000]: Syntax error or access violation: 1064
The actual SQL output will be similar to WHERE t.transaction_date >= 2020-01-01 00:00:00
which is invalid; you should wrap the value in single quotes.
Something like:
WHERE t.transaction_date>= '{$last_2_week}'
Or better:
WHERE t.transaction_date>= '{$last_2_week->toDateTimeString()}'
The toDateTimeString() call forces the output into the correct format in this case.
Or, even better, use parameter binding:
->whereRaw("... t.transaction_date>= ?", [$last_2_week->toDateTimeString()])
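The difference between the three variants is not specific to PHP. Here is a small illustration in plain Ruby (used only as a neutral scripting language; nothing below is Laravel or Carbon API) of what the SQL text looks like in each case.

```ruby
# A fixed timestamp standing in for the Carbon value in the question.
timestamp = Time.utc(2020, 1, 1).strftime("%Y-%m-%d %H:%M:%S")

# 1. Plain interpolation: the date lands in the SQL unquoted, which the
#    server cannot parse.
unquoted = "WHERE t.transaction_date >= #{timestamp}"

# 2. Interpolation inside single quotes: valid SQL, but the value is still
#    pasted into the statement text.
quoted = "WHERE t.transaction_date >= '#{timestamp}'"

# 3. Parameter binding: the statement text only carries a placeholder and
#    the value travels separately.
bound_sql = "WHERE t.transaction_date >= ?"
bindings  = [timestamp]
```

With the third form the driver takes care of quoting, so the statement text never depends on the value.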
Put some space around the operator.
Instead of:
WHERE t.transaction_date>=$last_2_week) AS last_2_week_quantity")
Put:
WHERE t.transaction_date >= $last_2_week) AS last_2_week_quantity")
In my Ruby on Rails app I'm using blazer (https://github.com/ankane/blazer) and I have the following SQL query:
SELECT *
FROM survey_results sr
LEFT JOIN clients c ON c.id = sr.client_id
WHERE sr.client_id = {client_id}
This query works really well, but I need to add conditional logic to check whether the client_id variable is present. If it is, I filter by this variable; if not, I skip this WHERE clause. How can I do it in PostgreSQL?
Check if it's null OR your condition, like this:
WHERE {client_id} IS NULL OR sr.client_id = {client_id}
The WHERE clause evaluates as follows: if the variable is empty, the first condition is true, so the whole clause is true and no filter is applied. If not, evaluation continues to the next condition of the OR.
If anyone runs into the psql error operator does not exist: bigint = bytea, here is my workaround:
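The same "no value means no filter" logic can be sketched in plain Ruby over in-memory rows. Row, ROWS and filter_results are made-up names for illustration, and nil stands in for SQL NULL, so this only mirrors the shape of the predicate and not SQL's three-valued logic.

```ruby
# Hypothetical in-memory stand-in for the survey_results table.
Row = Struct.new(:id, :client_id)

ROWS = [Row.new(1, 10), Row.new(2, 20), Row.new(3, 10)].freeze

# Mirrors: WHERE {client_id} IS NULL OR sr.client_id = {client_id}
def filter_results(rows, client_id)
  rows.select { |row| client_id.nil? || row.client_id == client_id }
end

all_rows      = filter_results(ROWS, nil).map(&:id)  # no filter applied
client_10_ids = filter_results(ROWS, 10).map(&:id)   # only client 10
```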
WHERE ({client_id} < 0 AND sr.client_id > {client_id}) OR sr.client_id = {client_id}
Note that client_id generally cannot be negative, so you can use that fact to avoid the cast issue.
My solution:
I use spring data jpa, native query.
Here is my repository interface signature.
@Query(... where (case when 0 in :itemIds then true else i.id in :itemIds end) ...)
List<Item> getItems(@Param("itemIds") List<Long> itemIds);
Prior to calling this method, I check if itemIds is null. If it is, I set it to a list containing only 0L:
if(itemIds == null) {
itemIds = new ArrayList<Long>();
itemIds.add(0L);
}
itemRepo.getItems(itemIds);
My IDs start from 1, so there is no case where ID = 0.
I am trying the following code but NHibernate is throwing the following exception:
Expression type 'NhSumExpression' is not supported by this SelectClauseVisitor.
var data =
(
from a in session.Query<Activity>()
where a.Date.Date >= dateFrom.Date && a.Date.Date <= dateTo.Date
group a by new { Date = a.Date.Date, UserId = a.RegisteredUser.ExternalId } into grp
select new ActivityData()
{
UserID = grp.Key.UserId,
Date = grp.Key.Date,
Bet = grp.Sum(a => a.Amount < 0 ? (a.Amount * -1) : 0),
Won = grp.Sum(a => a.Amount > 0 ? (a.Amount) : 0)
}
).ToArray();
I've been looking around and found this answer
But I am not sure what I should use in place of the Projections.Constant being used in that example, and how I should create a group by clause consisting of multiple fields.
It looks like your grouping over multiple columns is correct.
This issue reported in the NHibernate bug tracker is similar: NH-2865 - "Expression type 'NhSumExpression' is not supported by this SelectClauseVisitor."
The problem is that, apart from the less-than-helpful error message, it's not really a bug as such. What happens in NH-2865 is that the Sum expression contains something that NHibernate doesn't know how to convert into SQL, which results in this exception being thrown by a later part of the query processing.
So the question is: what does your Sum expression contain that NHibernate cannot convert? The thing that jumps to mind is the use of the ternary operator. I believe the NHibernate LINQ provider has support for the ternary operator, but maybe there is something in this particular combination that is problematic.
However, I think your expressions can be written like this instead:
Bet = grp.Sum(a => Math.Min(a.Amount, 0) * -1), // Or Math.Abs() instead of multiplication.
Won = grp.Sum(a => Math.Max(a.Amount, 0))
If that doesn't work, try a really simple expression instead, like the following. If that works, we at least know the grouping itself works as expected.
Won = grp.Sum(a => a.Amount)
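As a quick sanity check that the Math.Min / Math.Max rewrite computes the same totals as the ternary version, here is the arithmetic in plain Ruby. The sample amounts are made up, and this says nothing about whether NHibernate can translate either form.

```ruby
# Made-up sample amounts: negative values are bets, positive values are wins.
amounts = [-50, 20, -5, 0, 30]

# Ternary form, as in the question.
bet_ternary = amounts.sum { |a| a < 0 ? a * -1 : 0 }
won_ternary = amounts.sum { |a| a > 0 ? a : 0 }

# Min/Max form, as suggested above.
bet_min = amounts.sum { |a| [a, 0].min * -1 }
won_max = amounts.sum { |a| [a, 0].max }

puts "bet: #{bet_ternary} / #{bet_min}"  # prints "bet: 55 / 55"
puts "won: #{won_ternary} / #{won_max}"  # prints "won: 50 / 50"
```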
In the following code find_by_sql fails with exception: wrong number of parameters (0 for 1).
Any idea what's going on?
def filter_new_unfollowers(unfollower_ids)
relationships = TwitterRelationship.find_by_sql["SELECT * FROM twitter_relationships
INNER JOIN twitter_identities ON (twitter_identities.twitter_id=twitter_relationships.source_twitter_id)
INNER JOIN member_twitter_identities ON (member_twitter_identities.twitter_identity_id = twitter_identities.id)
WHERE member_twitter_identities.member_id IN (?)", unfollower_ids]
end
The way you wrote it, you are trying to execute find_by_sql with no arguments, and then call the [] operator on the result (but it failed before you got that far).
You need a space before the "[". To be even more clear, I would put parentheses around the array argument "...find_by_sql([...])".
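The parsing difference can be reproduced without Rails. In this minimal sketch FakeModel is a made-up stand-in whose find_by_sql simply returns its argument; only the call syntax matters.

```ruby
# Made-up stand-in for the ActiveRecord model; only the call syntax matters.
class FakeModel
  def self.find_by_sql(sql_with_binds)
    sql_with_binds
  end
end

ids = [1, 2, 3]

# No space and no parentheses: Ruby parses this as
# FakeModel.find_by_sql()["...", ids], so the method is called with zero
# arguments and raises ArgumentError before [] is ever reached.
error = begin
  FakeModel.find_by_sql["SELECT 1 WHERE id IN (?)", ids]
  nil
rescue ArgumentError => e
  e
end

# Explicit parentheses: the array is passed as the single argument.
result = FakeModel.find_by_sql(["SELECT 1 WHERE id IN (?)", ids])
```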
Try adding parentheses:
relationships = TwitterRelationship.find_by_sql(["SELECT * FROM twitter_relationships
INNER JOIN twitter_identities ON (twitter_identities.twitter_id=twitter_relationships.source_twitter_id)
INNER JOIN member_twitter_identities ON (member_twitter_identities.twitter_identity_id = twitter_identities.id)
WHERE member_twitter_identities.member_id IN (?)", unfollower_ids])