Same Length Sql on Local and Production - sql

I have the same SQL, of the same length, on my local and production servers. When I test it locally it takes about 2 seconds to run, but when I run the same thing on the production server it takes about 7 seconds.
Why so much difference?

The primary factor responsible for variation in SQL response time (especially when running the same query a few times in a row) is caching. Several caching effects may be at play at the same time:
Code caching (the next time you issue the same query you won't have to do a hard parse, which saves time and resources)
Data caching, at one or more of these levels:
a) database-level caching (use of the buffer cache)
b) OS-level caching
c) hardware-level caching
You can determine what exactly is going on by enabling autotrace and analyzing its output. If you see a lot of recursive calls the first time, and none (or far fewer) subsequently, that points to code caching (cursor sharing saves you from parsing every time). If you see a lot of physical reads the first time, but far fewer subsequently, then the database buffer cache is at play. If the number of physical reads stays the same, but elapsed time changes, then it could be low-level data caching (OS or hardware).
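The rules of thumb above can be written down as a small decision function. This is only a sketch: the `RunStats` fields mirror the autotrace figures mentioned in the answer, but the class, the function names and the 50% threshold for "far fewer" are assumptions made for illustration, not anything Oracle defines.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Figures copied by hand from one autotrace run (hypothetical container)."""
    recursive_calls: int
    physical_reads: int
    elapsed_seconds: float


def much_lower(first, later, ratio=0.5):
    """True if `later` dropped to at most `ratio` of `first` (arbitrary threshold)."""
    return first > 0 and later <= first * ratio


def likely_cache_effects(first, later):
    """Apply the three rules of thumb to a first run and a later run."""
    effects = []
    if much_lower(first.recursive_calls, later.recursive_calls):
        effects.append("code caching")
    if much_lower(first.physical_reads, later.physical_reads):
        effects.append("database buffer cache")
    elif (later.physical_reads == first.physical_reads
          and later.elapsed_seconds < first.elapsed_seconds):
        effects.append("OS or hardware caching")
    return effects


cold = RunStats(recursive_calls=120, physical_reads=5000, elapsed_seconds=7.0)
warm = RunStats(recursive_calls=0, physical_reads=40, elapsed_seconds=0.3)
print(likely_cache_effects(cold, warm))
```

An empty result means none of the three patterns matched, in which case other factors such as workload deserve a look.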
There are, of course, other factors that may affect elapsed time -- such as database workload -- but if you are observing this over a short period of time, then it's probably not them.

There are two more ways to fix this: run the procedure against cache tables, which will make it faster, or simply add an index.

Related

How do I compare two SQL queries to run on Postgres

I need to compare two queries that will run in my Postgres database.
How do I know the execution time and any other statistics of them so I can produce a reliable benchmark between them?
I can think of two interesting data points to collect and compare:
The execution time.
For that, simply execute the query using psql connected via UNIX sockets (to factor out the network) and use psql's \timing command to measure the execution time as seen on the client.
Do not use EXPLAIN (ANALYZE) for that since that would add notable overhead which affects your measurements.
Make sure to run the query several times to get a reliable number. That number will correspond to the execution time with a warm cache.
If you want to measure execution time with a cold cache, restart PostgreSQL and empty the file system cache.
The number of blocks touched by the query.
For that, run EXPLAIN (ANALYZE, BUFFERS) once for each query.
The number of blocks touched is significant for performance: the fewer blocks a query touches, the faster it will (often) be. This number is particularly significant for performance with a cold cache; the fewer blocks, the less execution time will depend on caching.
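If the queries are issued from application code rather than psql, the same "run it several times" advice can be followed with a small client-side timer. The sketch below assumes a hypothetical `run_query` callable that executes the query and fetches all rows; an in-memory SQLite table stands in for the real PostgreSQL connection purely so the example is self-contained. The first timing is only a rough proxy for a cold cache; a truly cold measurement still needs the restart described above.

```python
import sqlite3
import statistics
import time


def time_query(run_query, repetitions=5):
    """Return (first_run_seconds, median_of_later_runs_seconds)."""
    if repetitions < 2:
        raise ValueError("need at least two runs to separate first from warm")
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run_query()
        timings.append(time.perf_counter() - start)
    # The first run pays for parsing and cache misses; the rest are warm.
    return timings[0], statistics.median(timings[1:])


# Stand-in database so the sketch runs anywhere (not PostgreSQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (n INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10000)])

first, warm = time_query(lambda: conn.execute("SELECT sum(n) FROM t").fetchall())
print(f"first run: {first:.6f}s, warm median: {warm:.6f}s")
```

Comparing two queries then means calling `time_query` once per query with the same number of repetitions and comparing the warm medians.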

Is it reasonable to rely on an RDBMS buffer cache?

I'm writing a RESTful API which queries an RDBMS. The first time the SQL query in question is run, it's taking around 10 seconds. Subsequent runs of the query with the same bind variables take milliseconds. I believe this is due to the results being present in the buffer cache.
I don't want to store the results of this query in the application layer due to space considerations and would rather run it every time I need it. For performance reasons, I'm relying on the fact that the results will be retrieved through a buffer cache hit. Subsequent query runs will be seconds or minutes apart and it appears that the results stay present in the buffer cache for longer than this.
Is it a reasonable design decision to leverage the buffer cache in this manner?
I would suggest wrapping the query logic at the application layer in some sort of function. Then implement the function with a query that relies on the database's buffer cache.
In the future, you may want to change the implementation to locally cache the results. Encapsulating the functionality makes it easy to change the implementation, without affecting other code.
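A minimal sketch of that encapsulation, assuming a hypothetical `run_query` callable that executes the SQL with the given bind values. The class name, the `ttl_seconds` parameter and the injectable `clock` are all illustrative choices. With `ttl_seconds=0` every call goes to the database and relies on its buffer cache; raising it later switches to local caching without touching any caller.

```python
import time


class ReportSource:
    """Single entry point for the query; callers never see how rows are obtained."""

    def __init__(self, run_query, ttl_seconds=0.0, clock=time.monotonic):
        self._run_query = run_query   # hypothetical callable that hits the database
        self._ttl = ttl_seconds       # 0 means: always ask the database
        self._clock = clock           # injectable so expiry can be tested
        self._cache = {}              # bind values -> (expiry time, rows)

    def fetch(self, *binds):
        now = self._clock()
        hit = self._cache.get(binds)
        if hit is not None and hit[0] > now:
            return hit[1]             # still fresh: skip the database
        rows = self._run_query(*binds)
        if self._ttl > 0:
            self._cache[binds] = (now + self._ttl, rows)
        return rows


calls = []


def fake_query(customer_id):
    """Stand-in for the real query; records each database round trip."""
    calls.append(customer_id)
    return [("row for", customer_id)]


direct = ReportSource(fake_query)      # today: rely on the database's cache
direct.fetch(7)
direct.fetch(7)
print(len(calls))                      # both calls reached the database
```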
Is this a reasonable design decision?
I would suggest you'd want to do some fairly extensive load and performance testing to answer that question.
I am a pessimist, but when you have queries that take 10 seconds, it's not unusual for them to slow down over time (you're probably not hitting an index, so as the data grows, the performance gets worse). Those queries also tend to struggle with contention for resources, so performance degrades as the number of concurrent queries increases.
I'd also want to look at which business events would cause a cache refresh in Oracle, and their frequency. If you're querying orders in a web shop for instance, you may get new data every second, so your cache will not have a long time-to-live.
If you rely on the buffer cache to keep the application performant, I'd want to understand exactly what happens when the cache is missed, how many concurrent queries you can handle, and what happens when 10 seconds becomes 20 or 30 or 60.

SQL Server query performance slows over time

I've seen this question asked in many ways all over the Internet but despite implementing the abundance of advice (and some voodoo), I'm still struggling. I have a 100 GB+ database that is constantly inserting and updating records in very large transactions (200+ statements per transaction). After a system restart, the performance is amazing (data is written to a large SATA III SSD connected via USB 3.0). The SQL Server instance is running on a VM under VMware Workstation. The host is set to hold the entire VM in memory. The VM itself has a paging cache of 5000 MB. The SQL Server user is set to 'hold pages in memory'. I have 5 GB of RAM allocated to the VM, and the max memory of the SQL Server instance is set to half a gigabyte.
I have played with every single one of these parameters to attempt to maintain consistent performance, but sure and steady, the performance eventually degrades to the point where it begins to time out. Here's the kicker though, if I stop the application that's loading the database, and then execute the stored proc in the Management Studio, it runs like lightning, clearly indicating it's not an issue with the query, and probably nothing to do with memory management or paging. If I then restart the loader app, it still crawls. If I reboot the VM however, the app once again runs like lightning...for a while...
Does anybody have any other suggestions based upon the symptoms presented?
Depending on how large your hot set is, 5 GB of memory may simply be too little for a 100+ GB database.
Check indexes and query plans. We cannot help you without them. And I bet you are missing some indexes, which is the standard performance issue people have.
Otherwise, once you have done your homework, head over to dba.stackexchange.com and ask there.
Generally, consider that 200 statements per transaction may simply indicate seriously sub-optimal programming. For example, you could bulk-load the data into a temp table and then merge it into the final one.
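The stage-then-merge idea can be sketched as follows. To keep the example self-contained it uses SQLite through Python's `sqlite3` module, with `INSERT OR REPLACE ... SELECT` standing in for SQL Server's `MERGE`; the table and column names (`records`, `staging`, `id`, `value`) are made up for illustration.

```python
import sqlite3


def bulk_upsert(conn, rows):
    """Load rows into a temp staging table, then merge them in one statement."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "CREATE TEMP TABLE IF NOT EXISTS staging "
            "(id INTEGER PRIMARY KEY, value TEXT)"
        )
        conn.execute("DELETE FROM staging")
        # One batched insert instead of hundreds of individual statements.
        conn.executemany("INSERT INTO staging (id, value) VALUES (?, ?)", rows)
        # Single set-based merge into the final table (SQLite stand-in for MERGE).
        conn.execute(
            "INSERT OR REPLACE INTO records (id, value) "
            "SELECT id, value FROM staging"
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id INTEGER PRIMARY KEY, value TEXT)")
conn.execute("INSERT INTO records VALUES (1, 'old')")
conn.commit()

bulk_upsert(conn, [(1, "new"), (2, "added")])
result = conn.execute("SELECT id, value FROM records ORDER BY id").fetchall()
print(result)
```

The point of the pattern is that the final table sees one set-based statement per batch rather than one statement per row.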
Actually, I may have a working theory. What I did was add some logic to the app that when it times out, sit for two minutes, and then try again, and voila! Back to full speed. I rubber-ducky'd my co-worker and came up with the concept that my perceived SSD write speeds were actually the write speed to the VMWare host's virtual USB 3 buffer, and that the actual SSD write speeds were slower. I'm probably hitting against the host's buffer size and by forcing the app to wait 2 minutes, the host has a chance to dump its back-buffered data to the SSD. Elementary, Watson :)
If this approach also fails to be sustainable, I'll report in.
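The wait-and-retry logic described above might look roughly like this. It is a sketch under assumptions: the loader's real code is not shown in the thread, `operation` is a hypothetical callable for one database write, and the database driver is assumed to surface timeouts as Python's built-in `TimeoutError` (real drivers usually raise their own exception types).

```python
import time


def run_with_retry(operation, wait_seconds=120, max_attempts=3, sleep=time.sleep):
    """Call operation(); on a timeout, pause and try again, up to max_attempts."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts:
                raise              # give up and let the caller see the timeout
            sleep(wait_seconds)    # give the write buffer time to drain


# Demonstration with a fake operation that times out twice, then succeeds.
attempts = []
waits = []


def flaky_write():
    attempts.append(1)
    if len(attempts) < 3:
        raise TimeoutError("simulated timeout")
    return "written"


outcome = run_with_retry(flaky_write, sleep=waits.append)  # record waits, don't sleep
print(outcome, waits)
```

Passing `sleep` as a parameter keeps the two-minute pause out of tests while leaving the default behaviour unchanged.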
Try executing this to determine your problem queries:
SELECT TOP 20
    qs.sql_handle,
    qs.execution_count,
    qs.total_worker_time AS Total_CPU,
    -- total_worker_time and total_elapsed_time are reported in microseconds;
    -- dividing by 1000000.0 avoids integer division truncating the seconds
    total_CPU_inSeconds = qs.total_worker_time / 1000000.0,
    average_CPU_inSeconds = (qs.total_worker_time / 1000000.0) / qs.execution_count,
    qs.total_elapsed_time,
    total_elapsed_time_inSeconds = qs.total_elapsed_time / 1000000.0,
    st.text,
    qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
ORDER BY qs.total_worker_time DESC;
Then check your estimated and actual execution plans on the queries this command helps you pinpoint.
Source: "How do I find out what is hammering my SQL Server?" and the bottom of the page at http://technet.microsoft.com/en-us/magazine/2007.11.sqlquery.aspx
Beyond the excellent indexing suggestions already given,
be sure to read up on parameter sniffing. That could be the cause of the problem.
SQL Server - parameter sniffing
http://www.sommarskog.se/query-plan-mysteries.html#compileps
As a result you could have a bad query plan being re-used, or SQL's buffer could be getting full and writing pages out of memory to disk (maybe that's other allocated memory in your case).
You could run DBCC FREEPROCCACHE and DBCC FREESYSTEMCACHE ('ALL') to empty it and see if you get a performance boost.
You should give SQL Server more memory too: as much as you can while leaving room for other critical programs and the OS. You might have 5 GB of RAM on the VM, but SQL Server only gets half a gigabyte to play with, which seems REALLY small for what you're describing.
If those things don't move you in the right direction, install the SQL Management Data Warehouse so you can see exactly what is happening when your slowdown begins. Running it takes up additional memory, but it will give the DBAs more to go on.
In the end, what I did was a combination of two things: adding logic to recover when timeouts occur, and setting the VM's core count to reflect only the host's physical cores, not its logical cores. For example, the host has 2 hyper-threaded cores. When I set my VM to use 4 cores, it occasionally hangs in some infinite loop, but when I set it to 2 cores, it runs without fail. Still, aberrant behavior like this is difficult to mitigate reliably.

Fine tuning oracle query with pipelined function

I have a query (that powers an Oracle Application Express Report) that I was told by our users was executing "slowly" or at an unacceptable speed (wasn't given an actual load time for the page and the query is the only thing on the page).
The query involves many tables and actually references a pipelined function which identifies the currently logged-in users to our website and returns a custom "table" of records they have permission to based upon a custom security scheme we have.
My main question is around Oracle's caching of queries and how they could be affected by our setup.
When I took the query out of the webpage and ran it in SQL Developer (manually specifying a user ID to simulate a logged-in user), the run time went from 71 seconds to 19 seconds to 0.5 seconds over successive runs. Clearly, Oracle is utilizing its caching mechanism to make subsequent runs faster.
How is this affected by the following?
The fact that different users will get different tables from the pipelined function (all the same columns, just a different number of rows and different values in the rows). Does the pipelining prevent caching from working? Am I only seeing caching because I'm running a very isolated test?
Furthermore, is caching easily influenced by the number of people using the system? I'm not sure how "much" can get cached. If we have 50 concurrent users accessing different parts of the website that load different queries all day long, is it likely that Oracle won't be able to cache many (or any) of them because it is constantly seeing requests for different queries?
Sorry my question isn't very technical.
I'm a developer who has been asked to help out in this seemingly DBA question.
Also, this is complicated because I can't really determine what the actual load times are since our users don't report that level of detail.
Any thoughts on:
how I can determine if this query is actually slow?
what the average processing time would be?
and how to proceed with fine tuning if it is a problem?
Thanks!
It doesn't sound like this has anything to do with APEX, pipelined table functions, or query caching. It sounds like you are describing the effects of plain old data caching (most likely at the database level but potentially at the operating system and disk subsystem layers).
As a very basic overview, data is stored in rows, rows are stored in blocks (most commonly 8 kb in size), blocks are stored in extents (generally a few MB in size), and extents roll up to segments (i.e. a table). Oracle maintains a buffer cache where the most recently accessed blocks are stored. When you run a query, Oracle figures out which blocks it needs to read in order to get your data (this is the query plan). It then looks to see whether those blocks are in the buffer cache or whether they have to be read from disk. Obviously, reading a block from cache is much more efficient than reading it off the disk since RAM is much faster than disk. If you run the same query with the same set of bind variable values multiple times in a row, you'll be accessing the same set of blocks each time but more and more of the blocks you care about are going to be in the cache. So you'd generally expect that the second and third time that you call the query, you'll see faster performance.
If you run the query with a different set of bind variable values, if the second set of bind variable values causes Oracle to access many of the same blocks, those executions will benefit from the data the prior test cached. Otherwise, you'd be back to square 1 potentially reading all the data you need off disk. Most likely, you'll see some combination of the two.
Remember as well that it is not just Oracle that is caching data. Frequently, the operating system will be caching the most active pieces of the underlying Oracle data files. And the I/O subsystem will be caching the most recently accessed data as well. So even if Oracle thinks that it needs to go out to fetch a block because it is not in the database's buffer cache, the file system or the I/O subsystem may have cached that data so it may not require an actual physical read off of disk. These other caches behave similarly where running the same query multiple times in a row is likely to cause the cache to be "warm" and improve the performance of the later runs.
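The warm-cache effect described above can be reproduced with a toy model. This is a deliberately simplified sketch: a plain least-recently-used cache keyed by block number, which is not how Oracle's buffer cache is actually implemented, and the names `ToyBufferCache` and `run_query` are invented for the illustration. It counts how many blocks each "query" has to read from disk.

```python
from collections import OrderedDict


class ToyBufferCache:
    """Plain LRU cache of block ids; counts reads that miss the cache."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.physical_reads = 0

    def read(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)   # cache hit: mark as recently used
            return
        self.physical_reads += 1                # cache miss: "read from disk"
        self.blocks[block_id] = True
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)     # evict the least recently used block


def run_query(cache, block_ids):
    """Read every block the query needs; return how many came from disk."""
    before = cache.physical_reads
    for block_id in block_ids:
        cache.read(block_id)
    return cache.physical_reads - before


cache = ToyBufferCache(capacity=100)
first_run = run_query(cache, range(0, 50))       # cold: every block is a miss
second_run = run_query(cache, range(0, 50))      # same binds: all blocks cached
other_binds = run_query(cache, range(25, 75))    # different binds, half the blocks shared
print(first_run, second_run, other_binds)        # prints: 50 0 25
```

The third call shows the "combination of the two" case: the overlapping blocks are served from the cache and only the new ones are read from disk.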

SQL DB performance and repeated queries at short intervals

If a query is constantly sent to a database at short intervals, say every 5 seconds, could the number of reads generated cause problems in terms of performance or availability? If the database is Oracle are there any tricks that can be used to avoid a performance hit? If the queries are coming from an application is there a way to reduce any impact through software design?
Unless your query is very intensive or horribly written then it won't cause any noticeable issues running once every few seconds. That's not very often for queries that are generally measured in milliseconds.
You may still want to optimize it though, simply because there are better ways to do it. In Oracle and ADO.NET you can use an OracleDependency for the command that ran the query the first time and then subscribe to its OnChange event which will get called automatically whenever the underlying data would cause the query results to change.
It depends on the query. I assume the reason you want to execute it periodically is that the data being returned will change frequently. If that's the case, then application-level caching is obviously not an option.
Past that, is this query "big" in terms of the number of rows returned, tables joined, data aggregated / calculated? If so, it could be a problem if:
You are querying faster than it takes to execute the query. If you are calling it once a second, but it takes 2 seconds to run, that's going to become a problem.
If the query is touching a lot of data and you have a lot of other queries accessing the same tables, you could run into lock escalation issues.
As with most performance questions, the only real answer is to test. In this case test with realistic data in the DB and run this query concurrent with the other query load you expect on the system.
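The first point above (polling faster than the query can finish) is easy to see with a little arithmetic. The sketch below assumes the simplest possible model, in which executions of the query are served strictly one after another; real databases run queries concurrently, so treat this as an illustration of the trend rather than a prediction. The function name and parameters are made up for the example.

```python
def polling_latencies(interval_seconds, duration_seconds, polls):
    """Time from each poll being issued to its result, if runs are serialized."""
    latencies = []
    previous_finish = 0
    for i in range(polls):
        issued = i * interval_seconds
        start = max(issued, previous_finish)   # wait for the earlier run to end
        previous_finish = start + duration_seconds
        latencies.append(previous_finish - issued)
    return latencies


print(polling_latencies(interval_seconds=1, duration_seconds=2, polls=5))
# prints: [2, 3, 4, 5, 6]
print(polling_latencies(interval_seconds=5, duration_seconds=2, polls=3))
# prints: [2, 2, 2]
```

When the interval is shorter than the run time, each poll waits longer than the one before it; when the interval is comfortably longer, latency stays flat.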
Along the lines of Samuel's suggestion, Oracle provides facilities in JDBC to do database change notification so that your application can subscribe to changes in the underlying data rather than re-running the query every few seconds. If the data is changing less frequently than you're running the query, this can be a major performance benefit.
Another option would be to use Oracle TimesTen as an in memory cache of the data on the middle tier machine(s). That will reduce the network round-trips and it will go through a very optimized retrieval path.
Finally, I'd take a look at using the query result cache to have Oracle cache the results.