I'm trying to do native SQL in Doctrine. Basically I have 2 parameters:
CANDIDATE_ID - user for who we delete entries,
list of FILE_ID to keep
So I make
$this->getEntityManager()->getConnection()->
executeUpdate( "DELETE FROM FILE WHERE CANDIDATE_ID = :ID AND NOT ID IN :KEEPID",
array(
"ID" => $candidate->id,
"KEEPID" => array(2) )
);
But Doctrine fails:
Notice: Array to string conversion in D:\xampp\htdocs\azk\vendor\doctrine\dbal\lib\Doctrine\DBAL\Connection.php on line 786
Is this bug in Doctrine? I'm making somewhere else select with IN but with QueryBuilder and it's working. Maybe someone could suggest better way of deleting entries, with QueryBuilder for example?
$stmt = $conn->executeQuery('SELECT * FROM articles WHERE id IN (?)',
array(array(1, 2, 3, 4, 5, 6)),
array(\Doctrine\DBAL\Connection::PARAM_INT_ARRAY)
);
From Doctrine's documentation.
You can't pass an array of IDs to a parameter. You can do this for scalar values, but even if this had a 'toString', it wouldn't be what you want.
String concatenation is one method,
"DELETE FROM FILE WHERE CANDIDATE_ID = :ID AND NOT ID IN (". implode(",", $list_of_ids) .")"
But this method goes straight around parameters, and therefore suffers in terms of readability, and is limited to a certain maximum line length, which can vary between databases.
Another approach is to write a function returning a table result, which takes a string of IDs as a parameter.
You could also solve this with a join to a table containing the IDs to keep.
It's a problem I've seen many times with few good answers, but it's usually caused by a misunderstanding in the way the database is modelled. This is a 'code smell' for database access.
Related
I am wondering how I can use where cause with the ActiveRecord find method.
Here is the code I am using:
Supplier.joins(:products).find(params[:id]).where('suppliers.permalink = ? AND variants.master = ?', params[:id], TRUE)
which gives me:
undefined method `where' for #<Supplier:0x007fe49b4eb330>
Supplier.joins(:products).find(params[:id]).where('suppliers.permalink = ? AND variants.master = ?', params[:id], TRUE)
What you're doing here is finding the first record with the id contained in params[:id], then trying to run a where statement on that single record. where only works when run against the model itself.
The confusing part here is that you are using params[:id] both for the primary key (find searches the id field) but then also comparing it to the permalink column in the where clause.
To explain the usage of both methods:
find will search for result(s) from the table, matching the argument you provide it to the id field. You can pass in multiple id's and this method is mostly used to select a row that you know exists, by id. Most commonly it is used with a single id and returns a single instance.
where is used to find all results from the table that match the clause and return a collection of records. You can then refine these results or select one, for example by using .first:
Supplier.joins(:products).where('suppliers.permalink = ? AND variants.master = ?', params[:permalink], true).first
(Note that you're using joins(:products) but then querying variants table. Is this incorrect?)
Supplier.joins(:products).where('suppliers.permalink = ? AND variants.master = ?', params[:id], TRUE).find(params[:id])
I'm trying to learn about SQL injections and have tried to implement these, but when I put this code in my controller:
params[:username] = "johndoe') OR admin = 't' --"
#user_query = User.find(:first, :conditions => "username = '#{params[:username]}'")
I get the following error:
Couldn't find all Users with 'id': (first, {:conditions=>"username = 'johndoe') OR admin = 't' --'"}) (found 0 results, but was looking for 2)
I have created a User Model with the username "johndoe", but I am still getting no proper response. BTW I am using Rails 4.
You're using an ancient Rails syntax. Don't use
find(:first, :condition => <condition>) ...
Instead use
User.where(<condtion>).first
find accepts a list of IDs to lookup records for. You're giving it an ID of :first and an ID of condition: ..., which aren't going to match any records.
User.where(attr1: value, attr2: value2)
or for single items
User.find_by(attr1: value, attr2: value)
Bear in mind that while doing all this, it would be valuable to check what the actual sql statement is by adding "to_sql" to the end of the query method (From what I remember, find_by just does a LIMIT by 1)
I am using Doctrine2 and Zf2 , now when I need to fetch count of rows, I have got the following two ways to fetch it. But my worry is which will be more optimized and faster way, as in future the rows would be more than 50k. Any suggestions or any other ways to fetch the count ?? Is there any function to get count which can be used with findBy ???
Or should I use normal Zf2 Database library to fetch count. I just found that ORM is not preferred to fetch results when data is huge. Please any help would be appreciated
$members = $this->getEntityManager()->getRepository('User\Entity\Members')->findBy(array('id' => $id, 'status' => '1'));
$membersCnt = sizeof($members);
or
$qb = $this->getEntityManager()->createQueryBuilder();
$qb->select('count(p)')
->from('User\Entity\Members', 'p')
->where('p.id = '.$id)
->andWhere('p.status = 1');
$membersCnt = $qb->getQuery()->getSingleScalarResult();
Comparison
1) Your EntityRepository::findBy() approach will do this:
Query the database for the rows matching your criteria. The database will return the complete rows.
The database result is then transformed (hydrated) into full PHP objects (entities).
2) Your EntityManager::createQueryBuilder() approach will do this:
Query the database for the number of rows matching your criteria. The database will return a simple number (actually a string representing a number).
The database result is then transformed from a string to a PHP integer.
You can safely conclude that option 2 is far more efficient than option 1:
The database can optimize the query for counting, which might make the query faster (take less time).
Far less data is returned from the database.
No entities are hydrated (only a simple string to integer cast).
All in all less processing power and less memory will be used.
Security comment
Never concatenate values into a query!
This can make you vulnerable to SQL injection attacks when those values are (derived from) user-input.
Also, Doctrine2 can't make use of prepared statements / parameter binding, which can lead to some performance-loss when the same query is used often (with or without different parameters).
In other words, replace this:
->where('p.id = '.$id)
->andWhere('p.status = 1')
with this:
->where('p.id = :id')
->andWhere('p.status = :status')
->setParameters(array('id' => $id, 'status' => 1))
or:
->where($qb->expr()->andX(
$qb->expr()->eq('p.id', ':id'),
$qb->expr()->eq('p.status', ':status')
)
->setParameters(array('id' => $id, 'status' => 1))
Additionally
For this particular query, there's no need to use the QueryBuilder, you can use straight DQL in stead:
$dql = 'SELECT COUNT(p) FROM User\Entity\Members p WHERE p.id = :id AND p.status = :status';
$q = $this->getEntityManager()->createQuery($dql);
$q->setParameters(array('id' => $id, 'status' => 1));
$membersCnt = $q->getSingleScalarResult();
You should totally go to the dql version of the count.
With the first method you will hydrate (convert from db resultset to objects) each of the rows as single object and put them on one array and then count the amount items in that array. That will be a totally waste of memory and cycles if the only objective is to know the number of elements in that result set.
With the second method the dql will be gracefully converted to SELECT COUNT(*) Blah blah blah
plain SQL sentence and will retrieve directly the count from db.
The comment about ORM is not preferred to when to retrieve data is huge is true, in big batch process you should paginate your query to retrieve data instead all at the same time to avoid memory overrides but in that case you are only retrieving a single number, the total count so this rule doesn’t apply.
Query builder is so slow .
Use DQL for faster select .
$query = $this->getEntityManager()->createQuery("SELECT count(m) FROM User\Entity\Members m WHERE m.status = 1 AND m.id = :id ");
$query->setParameter(':id', $id);
You need setParameter for prevent SQL injection .
Stored procedure is fastest but it depend on your DB .
Make all relations of entity Lazy.
With a simple model like that
class Model < ActiveRecord::Base
# ...
end
we can do queries like that
Model.where(["name = :name and updated_at >= :D", \
{ :D => (Date.today - 1.day).to_datetime, :name => "O'Connor" }])
Where the values in the hash will be substituted into the final SQL statement with proper escaping depending on the underlying database engine.
I would like to know a similar feature for SQL execution like:
ActiveRecord::Base.connection.execute( \
["update models set name = :name, hired_at = :D where id = :id;"], \
{ :id => 73465, :D => DateTime.now, :name => "O'My God" }] \
) # THIS CODE IS A FANTASY. NOT WORKING.
(Please do not solve the example with loading a Model object, modifying and then saving! The example is only an illustration for the feature I would like to have / know. Concentrate on the subject!)
The original problem is that I want to insert large amount (many thousand lines) of data into the database. I want to use some features of the SQL abstraction of the ActiveRecord framework but I don't want to use model objects based on ActiveRecord::Base because they are damn slow! (8 queries per second for my current problem.)
query = ActiveRecord::Base.connection.raw_connection.prepare("INSERT INTO users (name) VALUES(:name)")
query.execute(:name => 'test_name')
query.close
Extending the #peufeu solution with concrete code example for bulk insert:
users_places = []
users_values = []
timestamp = Time.now.strftime('%Y-%m-%d %H:%M:%S')
params[:users].each do |user|
users_places << "(?,?,?,?)"
users_values << user[:name] << user[:punch_line] << timestamp << timestamp
end
bulk_insert_users_sql_arr = ["INSERT INTO users (name, punch_line, created_at, updated_at) VALUES #{users_places.join(", ")}"] + users_values
begin
sql = ActiveRecord::Base.send(:sanitize_sql_array, bulk_insert_users_sql_arr)
ActiveRecord::Base.connection.execute(sql)
rescue
"something went wrong with the bulk insert sql query"
end
Here is the reference to sanitize_sql_array method in ActiveRecord::Base, it generates the proper query string by escaping the single quotes in the strings. For example the punch_line "Don't let them get you down" will become "Don\'t let them get you down".
Yes you could do raw SQL, but checkout the ar-extensions gem that helps with batch inserts:
https://github.com/zdennis/ar-extensions
Here's a post on it, and various other techniques:
http://www.coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/
For INSERTs, batching them using a long VALUES clause (as shown by Simon's link) is the fastest way (unless you want to generate a text file and load it in your database with MySQL's LOAD DATA INFILE). But you have to be very careful about escaping your text values (which is not done in the example).
I was asking "what database are you using" because it does matter for mass UPDATEs.
For instance, you can do this on postgres (and I believe SQL Server changing "columnX" to "colX" ):
UPDATE foo
JOIN (VALUES (1,2),(3,4),... long list) v ON (foo.id=v.column1)
SET foo.bar = v.column2
And you can update a load of rows using a single statement, very fast.
If you don't need Ruby to perform some Ruby-specific magic on your data, the fastest way to transfer data from one DB to a different one is to export as a text file (CSV or tab separated), load it on the other DB (LOAD DATA INFILE on MySQL), perhaps in a temporary table, and bulk process using SQL.
EDIT : Here's how I do this in Python :
sql = [ "INSERT INTO foo (column list) VALUES " ]
values = []
for tuple in tuple_list:
append "(?,?,?,?)" to sql
extend values list with tuple
Then join sql into a string, you get "INSERT INTO foo (column list) VALUES (?,?,?,?),(?,?,?,?),(?,?,?,?)" with the "(?,?,?,?)" repeated as many times as you have lines to insert.
Then "values" contains a list of (a1,b1,c1,d1,a2,b2,c2,d2,a3,b3,c3,d3) with an,bn,cn,dn being the tuples you want to insert for line n. Each one corresponds to a placeholder in the sql string.
Then pass this to the usual "execute query with parameters" function which will handle quoting and escaping as usual.
I encountered a similar issue recently when tying to insert 100K+ records into a MySQL database for a Rails 4 app using mysql2 gem. The data included characters that had to be sanitized prior to insert.
The solution I ended going with was a slightly modified version of Option 3 described at https://www.coffeepowered.net/2009/01/23/mass-inserting-data-in-rails-without-killing-your-performance/
Here's the relevant code block from the above link:
TIMES = 10000
inserts = []
TIMES.times do
inserts.push "(3.0, '2009-01-23 20:21:13', 2, 1)"
end
sql = "INSERT INTO user_node_scores (`score`, `updated_at`, `node_id`, `user_id`) VALUES #{inserts.join(", ")}"
The modification I made was using the public method ActiveRecord::Base.sanitize() on values that required it.
inserts = []
created = Time.now.strftime "%Y-%m-%d %H:%M:%S"
params[:audits].each do |audit|
inserts.push "(#{audit.user_id), #{created}," + ActiveRecord::Base.sanitize(audit.comment) + ", #{audit.status})"
end
sql = "INSERT INTO user_audits (`user_id`, `created_at`, `comment`, `status`) VALUES #{inserts.join(", ")}"
Per section 2.2 of rails guide on Active Record query interface here:
which seems to indicate that I can pass a string specifying the condition(s), then an array of values that should be substituted at some point while the arel is being built. So I've got a statement that generates my conditions string, which can be a varying number of attributes chained together with either AND or OR between them, and I pass in an array as the second arg to the where method, and I get:
ActiveRecord::PreparedStatementInvalid: wrong number of bind variables (1 for 5)
which leads me to believe I'm doing this incorrectly. However, I'm not finding anything on how to do it correctly. To restate the problem another way, I need to pass in a string to the where method such as "table.attribute = ? AND table.attribute1 = ? OR table.attribute1 = ?" with an unknown number of these conditions anded or ored together, and then pass something, what I thought would be an array as the second argument that would be used to substitute the values in the first argument conditions string. Is this the correct approach, or, I'm just missing some other huge concept somewhere and I'm coming at this all wrong? I'd think that somehow, this has to be possible, short of just generating a raw sql string.
This is actually pretty simple:
Model.where(attribute: [value1,value2])
Sounds like you're doing something like this:
Model.where("attribute = ? OR attribute2 = ?", [value, value])
Whereas you need to do this:
# notice the lack of an array as the last argument
Model.where("attribute = ? OR attribute2 = ?", value, value)
Have a look at http://guides.rubyonrails.org/active_record_querying.html#array-conditions for more details on how this works.
Instead of passing the same parameter multiple times to where() like this
User.where(
"first_name like ? or last_name like ? or city like ?",
"%#{search}%", "%#{search}%", "%#{search}%"
)
you can easily provide a hash
User.where(
"first_name like :search or last_name like :search or city like :search",
{search: "%#{search}%"}
)
that makes your query much more readable for long argument lists.
Sounds like you're doing something like this:
Model.where("attribute = ? OR attribute2 = ?", [value, value])
Whereas you need to do this:
#notice the lack of an array as the last argument
Model.where("attribute = ? OR attribute2 = ?", value, value) Have a
look at
http://guides.rubyonrails.org/active_record_querying.html#array-conditions
for more details on how this works.
Was really close. You can turn an array into a list of arguments with *my_list.
Model.where("id = ? OR id = ?", *["1", "2"])
OR
params = ["1", "2"]
Model.where("id = ? OR id = ?", *params)
Should work
If you want to chain together an open-ended list of conditions (attribute names and values), I would suggest using an arel table.
It's a bit hard to give specifics since your question is so vague, so I'll just explain how to do this for a simple case of a Post model and a few attributes, say title, summary, and user_id (i.e. a user has_many posts).
First, get the arel table for the model:
table = Post.arel_table
Then, start building your predicate (which you will eventually use to create an SQL query):
relation = table[:title].eq("Foo")
relation = relation.or(table[:summary].eq("A post about foo"))
relation = relation.and(table[:user_id].eq(5))
Here, table[:title], table[:summary] and table[:user_id] are representations of columns in the posts table. When you call table[:title].eq("Foo"), you are creating a predicate, roughly equivalent to a find condition (get all rows whose title column equals "Foo"). These predicates can be chained together with and and or.
When your aggregate predicate is ready, you can get the result with:
Post.where(relation)
which will generate the SQL:
SELECT "posts".* FROM "posts"
WHERE (("posts"."title" = "Foo" OR "posts"."summary" = "A post about foo")
AND "posts"."user_id" = 5)
This will get you all posts that have either the title "Foo" or the summary "A post about foo", and which belong to a user with id 5.
Notice the way arel predicates can be endlessly chained together to create more and more complex queries. This means that if you have (say) a hash of attribute/value pairs, and some way of knowing whether to use AND or OR on each of them, you can loop through them one by one and build up your condition:
relation = table[:title].eq("Foo")
hash.each do |attr, value|
relation = relation.and(table[attr].eq(value))
# or relation = relation.or(table[attr].eq(value)) for an OR predicate
end
Post.where(relation)
Aside from the ease of chaining conditions, another advantage of arel tables is that they are independent of database, so you don't have to worry whether your MySQL query will work in PostgreSQL, etc.
Here's a Railscast with more on arel: http://railscasts.com/episodes/215-advanced-queries-in-rails-3?view=asciicast
Hope that helps.
You can use a hash rather than a string. Build up a hash with however many conditions and corresponding values you are going to have and put it into the first argument of the where method.
WRONG
This is what I used to do for some reason.
keys = params[:search].split(',').map!(&:downcase)
# keys are now ['brooklyn', 'queens']
query = 'lower(city) LIKE ?'
if keys.size > 1
# I need something like this depending on number of keys
# 'lower(city) LIKE ? OR lower(city) LIKE ? OR lower(city) LIKE ?'
query_array = []
keys.size.times { query_array << query }
#['lower(city) LIKE ?','lower(city) LIKE ?']
query = query_array.join(' OR ')
# which gives me 'lower(city) LIKE ? OR lower(city) LIKE ?'
end
# now I can query my model
# if keys size is one then keys are just 'brooklyn',
# in this case it is 'brooklyn', 'queens'
# #posts = Post.where('lower(city) LIKE ? OR lower(city) LIKE ?','brooklyn', 'queens' )
#posts = Post.where(query, *keys )
now however - yes - it's very simple. as nfriend21 mentioned
Model.where(attribute: [value1,value2])
does the same thing