I have a series of INSERT statements to execute, and I would like to use Laravel's Eloquent to do so:
$mountain = Mountain::find(1);
$dragons = [1, 2, 3, 4];

foreach ($dragons as $id) {
    $mountain->dragons()->attach($id); // Eloquent performs an `INSERT` here
}
I know that Eloquent uses prepared statements, but will it re-prepare the same query on each iteration, or is it smart enough to cache the prepared INSERT statement on the first iteration and then simply call ->execute on each subsequent one?
From PHP's PDO documentation:
By using a prepared statement the application avoids repeating the analyze/compile/optimize cycle. This means that prepared statements use fewer resources and thus run faster.
I realize that saving every prepared statement ever throughout the application lifecycle might not be the best idea, but in certain cases it seems warranted. If Eloquent doesn't do this by default, is there a way to tell it to cache the prepared statements for a particular model operation?
Check out the attach method in Illuminate\Database\Eloquent\Relations\BelongsToMany.
There is a call to $query->insert, where $query is an instance of the Illuminate\Database\Query\Builder class.
Now look at the insert method in that Builder class.
I won't post the Laravel code, as it's too long, but you can clearly see that the query is compiled and executed against the current connection immediately, on every call.
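For concreteness, each attach() call ends up compiling and running a statement along these lines (a sketch - the pivot table and column names simply assume Laravel's default naming conventions for these two models):
-- Hypothetical pivot-table INSERT, prepared anew on every iteration
INSERT INTO dragon_mountain (mountain_id, dragon_id) VALUES (?, ?);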
The advantage of open source code is that you are free to read it, so don't be afraid to dig in.
As noted by Your Common Sense, the time savings here are minute. If you are performing many inserts, this won't be your bottleneck. Perhaps look at queues or background services instead - the user shouldn't have to wait while your app performs 1,000 inserts.
Well, a smart application certainly would not keep every prepared statement open forever just in case. Such behavior would do far more harm than good.
And about the 'good': you're reading too much into that "caching" and "analyze/compile/optimize". The speed gain from prepared statements is a bit exaggerated. And speaking of your particular case, there is not much to optimize in an INSERT query anyway. You'll never be able to measure such an "optimization".
Thus, there is not much to worry about.
Is it possible to use WAITFOR in a SQL Server View?
I have a view that is being referenced by a few applications and I need to introduce some slowness to the view to do some application testing (to test how well they handle it).
Unfortunately, I can't find a way to edit this view and make it run slowly.
There's no way that you can make a SQL view have a delay. It doesn't even really make sense to do so.
If you really want to introduce some wait time, you could theoretically build a stored procedure that sleeps, then returns the results of your query.
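A minimal sketch of that idea (the procedure name, view name, and delay are placeholders):
-- Hypothetical wrapper: sleep first, then return the view's rows unchanged
CREATE PROCEDURE dbo.GetMyViewSlowly
AS
BEGIN
    WAITFOR DELAY '00:00:05'; -- pause for five seconds
    SELECT * FROM dbo.MyView; -- then run the original view's query
END;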
Alternatively, you could introduce the slowdown into your application logic (which makes more sense).
A VIEW is not a program - in fact, SQL DML statements are not programs at all: they do not represent a series of instructions. Instead, they are a representation of relational algebra, so the idea of having a delay in them is meaningless.
I note that a VIEW is ultimately always referenced by a SELECT statement. You can add delays around the SELECT if it's inside a PROCEDURE or non-inline FUNCTION, but as others have said, you cannot add a delay between rows in a SQL Server result set.
I have a view that is being referenced by a few applications and I need to introduce some slowness to the view to do some application testing (to test how well they handle it).
I think a better way of testing this is to use actual testing tools, such as mocks, stubs, or fakes.
Assuming it's a .NET system you're targeting, you could wrap the data reader in a delegating DbDataReader subclass that awaits Task.Delay(100) inside its ReadAsync override, for example.
If you don't have any way of modifying the application's source code, you could use a network speed limiter to artificially throttle your computer's network connection to less than a kilobyte per second.
Another approach might be to write a custom script for Wireshark that detects and parses TDS (Tabular Data Stream, SQL Server's wire protocol) and proxies it as a new server while inserting delays of its own. This may be the best approach for a long-term solution, as it lends itself well to other projects - you or your company could even sell it as a database latency testing tool and make a nice little earner from it.
Finally, you could switch from a VIEW to a CURSOR with a WAITFOR DELAY step between each FETCH - but this may require changing the application source code extensively, and for little gain, as queries generated from stateless components (inline functions, views, and SELECT) will always have a better runtime execution plan and provable correctness. (Cursors are a legacy of xBase-style databases - avoid them; use one only if your query cannot be expressed as a SELECT.)
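For reference, the cursor variant might look roughly like this (a sketch - the view, column name, and delay are placeholders):
-- Walk the view row by row, pausing between fetches
DECLARE @id int;
DECLARE row_cursor CURSOR FAST_FORWARD FOR
    SELECT Id FROM dbo.MyView;
OPEN row_cursor;
FETCH NEXT FROM row_cursor INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    WAITFOR DELAY '00:00:00.100'; -- 100 ms between rows
    -- hand the row to the caller here
    FETCH NEXT FROM row_cursor INTO @id;
END;
CLOSE row_cursor;
DEALLOCATE row_cursor;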
No, this is not possible.
You could use a stored procedure instead to achieve this functionality.
If we have a prepared statement like:
SELECT my_func($1::text, $2::int)
Is there any gain in speed if I prepare a statement with this call and then invoke it via the prepared statement?
Let me quote the docs here:
Prepared statements have the largest performance advantage when a single session is being used to execute a large number of similar statements. The performance difference will be particularly significant if the statements are complex to plan or rewrite, for example, if the query involves a join of many tables or requires the application of several rules. If the statement is relatively simple to plan and rewrite but relatively expensive to execute, the performance advantage of prepared statements will be less noticeable.
Emphasis mine. I think it clearly states the conditions under which PREPARE can be beneficial.
Still, virtually all client languages (PHP included) provide a native way to prepare statements, so the whole machinery is executed for you behind the scenes.
To make it short:
if it is a one-timer from the client, execute it directly;
if it comes from the application and involves user input, use your platform and its functionality to prepare, for security reasons;
if the statement is executed many times within a session, use any means (either PREPARE or the platform's functionality) to prepare, for performance reasons.
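A minimal PostgreSQL sketch of that last case, reusing the call from the question (the argument values are placeholders):
-- Prepare once per session, then execute repeatedly with different values
PREPARE my_func_call (text, int) AS
    SELECT my_func($1, $2);
EXECUTE my_func_call('foo', 1);
EXECUTE my_func_call('bar', 2);
DEALLOCATE my_func_call; -- optional; prepared statements vanish when the session ends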
All that I will do is insert records into the database.
However, I would definitely like the database independence that Ruby/ActiveRecord can offer me.
What would be the recommended choice?
Using simple insert queries, rewriting them by hand, and maintaining my own database class for things such as batch insertions;
Using the power of ActiveRecord, but also having the overhead;
Some other solution, mayhaps?
For the record, optimistically speaking, I'll be doing one insertion every second or every couple of seconds.
I'll also (probably) be using ActiveRecord for reading from the database later on - but in a different application.
The main reason for writing your own queries would be to optimize performance if Active Record would prove too inefficient. But since one insert per second isn't really that much of a load, Active Record's performance will probably be more than enough for your needs.
Thus, I would definitely use Active Record in this situation -- there's no need to bother with your own database wrapper unless you really need to. Also, an extra bonus is that you can reuse the model definitions for reading data later on.
I've been reading a lot about prepared statements and in everything I've read, no one talks about the downsides of using them. Therefore, I'm wondering if there are any "there be dragons" spots that people tend to overlook?
A prepared statement is just a parsed and precompiled SQL statement that waits for bound variables to be supplied before it is executed.
Any executed statement becomes prepared sooner or later (it needs to be parsed, optimized, and compiled before it can be executed).
A prepared statement simply reuses the results of that parsing, optimization, and compilation.
Usually database systems use some kind of optimization to save some time on query preparation even if you don't use prepared queries yourself.
Oracle, for instance, when parsing a query, first checks the library cache; if the same statement has already been parsed, it reuses the cached execution plan.
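If you want to see that cache in action, Oracle exposes it through V$SQL (a sketch - it requires access to that view, and the LIKE pattern is a placeholder):
SELECT sql_text, parse_calls, executions
FROM v$sql
WHERE sql_text LIKE '%my_table%';
-- if executions far exceeds parse_calls, the cached plan is being reused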
If you use a statement only once, or if you automatically generate dynamic SQL statements (and either properly escape everything or know for certain that your parameters contain only safe characters), then you should not use prepared statements.
There is one other small issue with prepared statements vs. dynamic SQL: they can be harder to debug. With dynamic SQL, you can always just write a problem query out to a log file and run it directly on the server, exactly as your program sees it. With prepared statements, it can take a little more work to test your query with a specific set of parameters determined from crash data. But not that much more, and the extra security definitely justifies the cost.
In some situations, the database engine might come up with an inferior query plan when using a prepared statement, because it can't make the right assumptions without having the actual bind values for a search.
See e.g. the "Notes" section at
http://www.postgresql.org/docs/current/static/sql-prepare.html
So it might be worth testing your queries with and without prepared statements to find out which is faster. Ideally, you would then decide on a per-statement basis whether to use prepared statements or not, although not all ORMs will allow you to do that.
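You can watch this happen in PostgreSQL itself (a sketch - the table and column names are placeholders):
-- Compare the plan for a bound parameter with the plan for a visible literal
PREPARE q (int) AS SELECT * FROM orders WHERE status = $1;
EXPLAIN EXECUTE q(3); -- plan chosen for the prepared statement
EXPLAIN SELECT * FROM orders WHERE status = 3; -- plan when the planner sees the value
DEALLOCATE q;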
The only downside that I can think of is that they take up memory on the server. It's not much, but there are probably some edge cases where it would be a problem - though I'm hard pressed to think of any.
One of my co-workers claims that even though the execution path is cached, there is no way parameterized SQL generated from an ORM is as quick as a stored procedure. Any help with this stubborn developer?
I would start by reading this article:
http://decipherinfosys.wordpress.com/2007/03/27/using-stored-procedures-vs-dynamic-sql-generated-by-orm/
Here is a speed test between the two:
http://www.blackwasp.co.uk/SpeedTestSqlSproc.aspx
Round 1 - You can start a profiler trace and compare the execution times.
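On the database side, SET STATISTICS TIME gives you per-statement timings without a full trace (a sketch - the procedure name and the stand-in query are placeholders):
-- Compare server-side CPU and elapsed time for the two approaches
SET STATISTICS TIME ON;
EXEC dbo.GetCustomerOrders @CustomerId = 42; -- placeholder stored procedure
SELECT * FROM dbo.Orders WHERE CustomerId = 42; -- stand-in for the ORM-generated query
SET STATISTICS TIME OFF;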
For most people, the best way to convince them is to "show them the proof." In this case, I would create a couple basic test cases to retrieve the same set of data, and then time how long it takes using stored procedures versus NHibernate. Once you have the results, hand it over to them and most skeptical people should yield to the evidence.
I would only add a couple of things to Rob's answer:
First, make sure the amount of data involved in the test cases is similar to production values. In other words, if your queries normally run against tables with hundreds of thousands of rows, then create such a test environment.
Second, make everything else equal except for the use of an NHibernate-generated query versus a stored procedure call. Hopefully you can execute the test by simply swapping out a provider.
Finally, realize that there is usually a lot more at stake than just stored procedures vs. ORM. With that in mind the test should look at all of the factors: execution time, memory consumption, scalability, debugging ability, etc.
The problem here is that you've accepted the burden of proof. You're unlikely to change someone's mind like that. Like it or not, people - even programmers - are just too emotional to be easily swayed by logic. You need to put the burden of proof back on him - get him to convince you otherwise - and that will force him to do the research and discover the answer for himself.
A better argument for stored procedures is security. If you use only stored procedures, with no dynamic SQL, you can disable SELECT, INSERT, UPDATE, DELETE, ALTER, and CREATE permissions for the application's database user. This protects you against most second-order SQL injection, whereas parameterized queries are only effective against first-order injection.
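A sketch of that lockdown in T-SQL (app_user is a placeholder; procedures owned by dbo can still reach the tables through ownership chaining):
-- Block direct table access, allow procedure execution only
DENY SELECT, INSERT, UPDATE, DELETE ON SCHEMA::dbo TO app_user;
GRANT EXECUTE ON SCHEMA::dbo TO app_user;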
Measure it, but in a non-micro-benchmark, i.e. something that represents real operations in your system. Even if there would be a tiny performance benefit for a stored procedure it will be insignificant against the other costs your code is incurring: actually retrieving data, converting it, displaying it, etc. Not to mention that using stored procedures amounts to spreading your logic out over your app and your database with no significant version control, unit tests or refactoring support in the latter.
Benchmark it yourself. Write a testbed class that executes a sample stored procedure a few hundred times, and run the NHibernate code the same number of times. Compare the average and median execution time of each method.
It is just as fast if the query is the same each time. SQL Server 2005 caches query plans at the level of each statement in a batch, regardless of where the SQL comes from.
The long-term difference might be that stored procedures are many, many times easier for a DBA to manage and tune, whereas hundreds of different queries that have to be gleaned from profiler traces are a nightmare.
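If you want to verify that plan caching yourself, the cache is visible through DMVs (a sketch - works on SQL Server 2005 and later):
-- Inspect cached plans and how often each has been reused;
-- objtype shows whether a plan came from a proc, a prepared statement, or ad hoc SQL
SELECT cp.objtype, cp.usecounts, st.text
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
ORDER BY cp.usecounts DESC;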
I've had this argument many times over.
Almost always, I end up grabbing a really good DBA, running a proc and a piece of code with the profiler on, and getting the DBA to show that the results are so close the difference is negligible.
Measure it.
Really, any discussion on this topic is probably futile until you've measured it.
He may be correct for the specific use case he has in mind. A stored procedure will probably execute faster for some complex piece of SQL that can be arbitrarily tuned. However, one thing you get from tools like Hibernate is caching, which may prove much faster over the lifetime of your actual application.
The additional layer of abstraction will make it slower than a pure call to a sproc. Simply because you have additional allocations on the managed heap and additional pushes and pops on the call stack, it is more efficient to call a sproc than to have an ORM build the query, no matter how good the ORM is.
How much slower, if it's even measurable, is debatable. This is also mitigated by the fact that most ORMs have a caching mechanism to avoid issuing the query at all.
Even if the stored procedure is 10% faster (it probably isn't), you may want to ask yourself how much it really matters. What really matters in the end, is how easy it is to write and maintain code for your system. If you are coding a web app, and your pages all return in 0.25 seconds, then the extra time saved by using stored procedures is negligible. However, there can be many added advantages of using an ORM like NHibernate, which would be extremely hard to duplicate using only stored procedures.