First query slow on Firebird

The first query run on a large dataset on a Firebird database after starting our application is always very slow. Subsequent calls to the same query (it is a stored procedure) are fine. I assume that this is to do with something being loaded into memory but I could do with a explanation of what and whether there is anything that can be done to get around the issue.

Since it is a stored procedure, the first run has to compile the procedure, fetch the data pages from disk, and fill the cache.
On the second run the procedure is not compiled again (it is already cached) and the results are near-instant (the fetched pages are also held in memory on some operating systems, so no disk I/O is needed).
One way to improve the first run is to optimize the stored procedure or the tables.
How large are they (number of records in each table)?
One simple way to work around this is to set up a cron script that runs once per day/hour to prefill the caches, so the stored procedure is already warm when your application calls it.
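A minimal sketch of such a warm-up, assuming a hypothetical selectable procedure named GET_SALES_SUMMARY; the script could be run from cron via Firebird's isql tool, and its only purpose is to force the procedure to compile and its data pages to be read once:

    -- warmup.sql: run from a scheduled job purely to prime the cache.
    -- GET_SALES_SUMMARY and its parameter are placeholder names.
    SELECT COUNT(*) FROM GET_SALES_SUMMARY(CURRENT_DATE);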

Maybe it's not about the query, but about a long connection time (delay)? Older Firebird/InterBase engines had such a problem.

You didn't explain which Firebird version you are using, but in version 2.5 there is a bug (CORE-3227 - slow compilation of stored procedures) that can be the cause of your problem. More details:
http://www.firebirdnews.org/?p=5282&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+FirebirdNews+%28Firebird+News%29

Related

SQL Server Stored Procedure Caching Old Results Even After Update

I have a weird issue I've never run into before. I have a stored procedure that joins a bunch of data. When I do an update on a table that is found in one of the joins, it doesn't update until, say, 20-30 seconds later, or not at all. I see the value updated in the actual table, but the stored procedure has the old value. I didn't think stored procedures could cache like this, or delay like this. Where should I look to fix this?
Try aliasing all of your tables and using the correct alias in the SELECT portion of your query.
Unfortunately there can be some caching, possibly of the old query plan for the previous version of the stored procedure. It seems like a bug in SQL Server. After a short period of time it seems to update, but yes - pretty sure it's a bug.
Source: using SSMS on SQL Server 2012 -> 2017 for large data management
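If a stale plan really is the cause, one way to rule it out (an assumption on my part, not something the answers above confirm) is to force the cached plan to be discarded; dbo.usp_GetJoinedData is a placeholder name for the procedure in question:

    -- Mark the procedure for recompilation so the next call builds a fresh plan.
    EXEC sp_recompile N'dbo.usp_GetJoinedData';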

Stored Procedure vs direct SQL command in SSIS data flow source

I'm providing maintenance support for some SSIS packages. The packages have some data flow sources with complex embedded SQL scripts that need to be modified from time to time. I'm thinking about moving those SQL scripts into stored procedures and call them from SSIS, so that they are easier to modify, test, and deploy. I'm just wondering if there is any negative impact for the new approach. Can anyone give me a hint?
Yes, there are issues with using stored procs as data flow sources (not with using them in Execute SQL tasks in the control flow, though).
You might want to read this:
http://www.jasonstrate.com/2011/01/31-days-of-ssis-no-more-procedures-2031/
Basically the problem is that SSIS cannot always figure out the result set, and thus the columns, from a stored proc. I personally have run into this when the stored proc uses a temp table.
I don't know that I would go as far as the author of the article and not use procs at all, but be careful that you are not trying to do too much with them, and if you have to do something complicated, do it in an Execute SQL task before the data flow.
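One workaround worth knowing about (an assumption on my part, requiring SQL Server 2012 or later, and not something the answer above mentions) is to declare the output metadata explicitly with WITH RESULT SETS, so SSIS can discover the columns even when the procedure builds its result in a temp table; the procedure and column names below are placeholders:

    -- SQL command for the data flow source; the result shape is declared up front.
    EXEC dbo.usp_ComplexExtract
    WITH RESULT SETS
    (
        (
            OrderID   INT,
            OrderDate DATETIME,
            Amount    DECIMAL(18, 2)
        )
    );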
I can honestly see nothing but improvements. Stored procedures will offer better security, the possibility for better performance due to cached execution plans, and easier maintenance, like you pointed out.
Refactor away!
You will not face issues using only simple stored procedures as a data source. If the procedure uses temp tables or CTEs, there is no guarantee you will not face issues. Even when you can preview results at design time, you may get errors at run time.
My experience has been that trying to get a sproc to function as a data source is just not worth the headache. Maybe some simple sprocs are fine, and in some cases TVFs will work well instead, but if you need to do some complex operations there's no alternative to a sproc.
The best workaround I've found is to create an output table for each sproc you need to use in SSIS.
Modify the sproc to truncate the new output table at start, and to write its output to this instead of (or in addition to) ending with a SELECT statement.
Call the sproc with an Exec SQL task before your data flow.
Have your data flow read from the output table - a much simpler task.
If you want to save space, truncate the output table again with another Exec SQL. I prefer to leave it, as it lets me examine the data later and lets me rerun the output data flow if it fails without calling the sproc again.
This is certainly less elegant than reading directly from a sproc's output, but it works. FWIW, this pattern follows the philosophy (obligatory in Oracle) that a sproc should not try to be a parameterized view.
Of course, all this assumes that you have privs to adjust the sproc in question. If necessary, you could write a new wrapper sproc which truncates the output table, then calls the old sproc and redirects its output to the new table.
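A minimal sketch of this pattern under assumed, placeholder names (dbo.ExtractOutput, dbo.usp_BuildExtract, and the joined tables); the wrapper truncates the output table and materializes the result, so the data flow only has to read a plain table:

    -- Output table read by the SSIS data flow (placeholder schema).
    CREATE TABLE dbo.ExtractOutput
    (
        OrderID  INT            NOT NULL,
        Customer VARCHAR(100)   NOT NULL,
        Amount   DECIMAL(18, 2) NOT NULL
    );
    GO

    -- Called from an Execute SQL task before the data flow.
    CREATE PROCEDURE dbo.usp_BuildExtract
    AS
    BEGIN
        SET NOCOUNT ON;
        TRUNCATE TABLE dbo.ExtractOutput;

        INSERT INTO dbo.ExtractOutput (OrderID, Customer, Amount)
        SELECT o.OrderID, c.Name, o.Amount   -- the original complex logic goes here
        FROM dbo.Orders AS o
        JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID;
    END;
    GO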

Fastest way to insert in parallel to a single table

My company is cursed by a symbiotic partnership turned parasitic. To get our data from the parasite, we have to use a painfully slow odbc connection. I did notice recently though that I can get more throughput by running queries in parallel (even on the same table).
There is a particularly large table that I want to extract data from and move it into our local table. Running queries in parallel I can get data faster, but I also imagine that this could cause issues with trying to write data from multiple queries into the same table at once.
What advice can you give me on how to best handle this situation so that I can take advantage of the increased speed of using queries in parallel?
EDIT: I've gotten some great feedback here, but I think I wasn't completely clear on the fact that I'm pulling data via a linked server (which uses the ODBC drivers). In other words, that means I can run normal INSERT statements, and I believe that would provide better performance than either SqlBulkCopy or BULK INSERT (actually, I don't believe BULK INSERT would even be an option).
Have you read Load 1TB in less than 1 hour?
Run as many load processes as you have available CPUs. If you have 32 CPUs, run 32 parallel loads. If you have 8 CPUs, run 8 parallel loads.
If you have control over the creation of your input files, make them of a size that is evenly divisible by the number of load threads you want to run in parallel. Also make sure all records belong to one partition if you want to use the switch partition strategy.
Use BULK insert instead of BCP if you are running the process on the SQL Server machine.
Use table partitioning to gain another 8-10%, but only if your input files are GUARANTEED to match your partitioning function, meaning that all records in one file must be in the same partition.
Use TABLOCK to avoid row at a time locking.
Use ROWS PER BATCH = 2500, or something near this, if you are importing multiple streams into one table.
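For illustration, a hedged sketch of what one of the parallel load commands might look like; the table and file names are placeholders, and TABLOCK / ROWS_PER_BATCH correspond to the advice quoted above:

    -- One of several parallel sessions, each loading its own file into the same table.
    BULK INSERT dbo.StagingTable
    FROM 'D:\loads\extract_part_01.dat'
    WITH
    (
        FIELDTERMINATOR = '|',
        ROWTERMINATOR   = '\n',
        TABLOCK,                -- avoid row-at-a-time locking
        ROWS_PER_BATCH  = 2500  -- hint for multiple streams into one table
    );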
For SQL Server 2008, there are certain circumstances where you can utilize minimal logging for a standard INSERT SELECT:
SQL Server 2008 enhances the methods that it can handle with minimal logging. It supports minimally logged regular INSERT SELECT statements. In addition, turning on trace flag 610 lets SQL Server 2008 support minimal logging against a nonempty B-tree for new key ranges that cause allocations of new pages.
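A hedged sketch of that minimally logged INSERT ... SELECT shape, with placeholder table names; TABLOCK on the target is required, and trace flag 610 only matters when the target already contains data. Since the question involves a linked server, the source is shown as a four-part name:

    -- Optional: allow minimal logging into a nonempty B-tree.
    DBCC TRACEON (610);

    -- Minimal logging for INSERT ... SELECT requires a table-level lock on the target.
    INSERT INTO dbo.LocalCopy WITH (TABLOCK) (OrderID, Customer, Amount)
    SELECT OrderID, Customer, Amount
    FROM LinkedServer.RemoteDb.dbo.BigRemoteTable
    WHERE OrderID BETWEEN 1 AND 1000000;   -- each parallel session takes its own key range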
If you're looking to do this in code, i.e. C#, there is the option to use SqlBulkCopy (in the System.Data.SqlClient namespace), and as this article suggests, it's possible to do this in parallel:
http://www.adathedev.co.uk/2011/01/sqlbulkcopy-to-sql-server-in-parallel.html
If by any chance you've upgraded to SQL 2014, you can insert in parallel (compatibility level must be 110). See this:
http://msdn.microsoft.com/en-us/library/bb510411%28v=sql.120%29.aspx

Oracle Report taking forever, but the query runs fast

My query takes about 3-5 seconds to run. When I run the report, a simple summary of a few columns, it takes 25-30 minutes! It is a Group Left report. I've tried playing around with the query, and I've tried handling the grouping in the query, with no luck. Any ideas what might be causing this?
Is the query being executed within a stored procedure? If yes, try executing the SQL without passing the variables through an SQL stored procedure.
If there is a difference in the time it takes to execute, try some optimizations, for example removing parameter sniffing (create local variables within the stored procedure which contain copies of the variable values being passed to the stored procedure). These can give you an indication of whether the query requires optimization.
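The answer above is phrased in SQL Server terms; a minimal T-SQL sketch of the local-variable copy it describes, with placeholder procedure, table, and column names:

    CREATE PROCEDURE dbo.usp_ReportSummary
        @StartDate DATE,
        @EndDate   DATE
    AS
    BEGIN
        -- Copy the incoming parameters so the optimizer cannot sniff the caller's values.
        DECLARE @LocalStart DATE = @StartDate;
        DECLARE @LocalEnd   DATE = @EndDate;

        SELECT Region, SUM(Amount) AS Total
        FROM dbo.Sales
        WHERE SaleDate BETWEEN @LocalStart AND @LocalEnd
        GROUP BY Region;
    END;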
In my experience, sometimes queries that return a lot of data will appear to run fast from a tool like Toad or SQL Developer, but when you try to fetch all the rows you hit the real overall performance of the query.
So perhaps your query returns a lot of rows and all that time is spent doing the I/O.

Same SQL Query Slower from NHibernate Application than SQL Studio?

Our application issues an NHibernate-generated SQL query. At application runtime, the query takes about 12 seconds to run against a SQL Server database. SQL Profiler shows over 500,000 reads.
However, if I capture the exact query text using SQL Profiler, and run it again from SQL Studio, it takes 5 seconds and shows less than 4,600 reads.
The query uses a couple of parameters whose values are supplied at the end of the SQL text, and I'd read a little about parameter sniffing and inefficient query plans, but I had thought that related to stored procedures. Maybe NHibernate holds the resultset open while it instantiates its entities, which could explain the longer duration, but what could explain the extra 494,000 "reads" for the same query as performed by NHibernate? (No additional queries appear in the SQL Profiler trace.)
The query is specified as a LINQ query using NHibernate 3.1's LINQ facility. I didn't include the query itself because it seems like a basic question of philosophy: what could explain such a dramatic difference?
In case it's pertinent, there also happens to be a varbinary(max) column in the results, but in our situation it always contains null.
Any insight is much appreciated!
Be sure to read: http://www.sommarskog.se/query-plan-mysteries.html
The same rules apply to procs and sp_executesql. A huge reason for shoddy plans can be passing in an nvarchar parameter for a varchar field; it causes index scans as opposed to seeks.
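For illustration, a hypothetical case of that mismatch (the table and column are placeholders, and dbo.Customers.Email is assumed to be an indexed varchar column); the nvarchar parameter forces an implicit conversion on the column side:

    DECLARE @EmailN NVARCHAR(200) = N'someone@example.com';
    SELECT CustomerID FROM dbo.Customers WHERE Email = @EmailN;  -- tends to produce an index scan

    DECLARE @EmailV VARCHAR(200) = 'someone@example.com';
    SELECT CustomerID FROM dbo.Customers WHERE Email = @EmailV;  -- matching type allows an index seek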
I very much doubt the output is affecting the perf here; it is likely to be an issue with one of the params sent in, or the selectivity of the underlying tables.
When testing your output from Profiler, be sure to include sp_executesql and make sure your settings match (stuff like SET ARITHABORT); otherwise you will cause a new plan to be generated.
You can always dig up the shoddy plan from the execution cache via sys.dm_exec_query_stats.
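A hedged example of doing that via sys.dm_exec_query_stats (the LIKE filter text is a placeholder for a fragment of your query):

    -- Find cached plans and their I/O stats for statements matching a text fragment.
    SELECT TOP (20)
           st.text,
           qs.execution_count,
           qs.total_logical_reads,
           qp.query_plan
    FROM sys.dm_exec_query_stats AS qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
    WHERE st.text LIKE '%YourQueryFragment%'
    ORDER BY qs.total_logical_reads DESC;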