Optimize 0ms query (or optimizing ahead)?

In the final stage of development I started looking at the code to try to find some bad practices. I discovered that while accessing a page, I'm querying the DB n times (where n is the number of rows in an HTML table) just to get the translation (different languages) for a given record... I immediately thought that was bad and I tried a small optimization.
Running the SQL Profiler shows that those queries took 0 ms.
Since the tables I'm querying are small (20-100 records), I thought I could fetch all the data once and cache it in the web server's RAM, retrieving it later with LINQ to Objects. Execution time this way is also 0 ms.
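Roughly, this is what I have in mind (a minimal sketch; the Translation class, table, and column names are just placeholders for my real schema):

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Linq;

    // Hypothetical sketch: load the whole (small) translation table once,
    // keep it in memory, and answer per-row lookups with LINQ to Objects.
    public class Translation
    {
        public int RecordId { get; set; }
        public string Language { get; set; }
        public string Text { get; set; }
    }

    public static class TranslationCache
    {
        private static List<Translation> _all = new List<Translation>();

        public static void Load(string connectionString)
        {
            var loaded = new List<Translation>();
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(
                "SELECT RecordId, Language, Text FROM Translations", conn))
            {
                conn.Open();
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        loaded.Add(new Translation
                        {
                            RecordId = reader.GetInt32(0),
                            Language = reader.GetString(1),
                            Text = reader.GetString(2)
                        });
                    }
                }
            }
            _all = loaded;
        }

        // One in-memory lookup instead of one DB round-trip per table row.
        public static string Get(int recordId, string language)
        {
            return _all.Where(t => t.RecordId == recordId && t.Language == language)
                       .Select(t => t.Text)
                       .FirstOrDefault();
        }
    }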
The environment where I'm running these tests is a DB and web server on the same machine with 0% load. It's only me using the application.
The question starts here. Since I see no performance difference at all, should I avoid this optimization? Or should I keep it as a way to balance the load between the DB and the web server (which will be on two different machines in the production environment)?
In my opinion this optimization can't hurt performance; it can only improve things if the DB is heavily loaded. But something in the back of my mind says it's wrong to optimize when there is no need...
Thanks for your opinion.

I don't think you've actually shown that there's no performance difference at all.
Try running the query a million times each way, and time how long it takes in total... I think you'll see a big difference.
The SQL profiler only shows you (as far as I'm aware) the time taken on the database to perform the query. It doesn't take into account:
The time taken to set up the connection or prepare the query
The time taken in network latency issuing the query
The time taken in network latency returning the results
The time taken converting the results into useful objects.
Of course, premature optimisation is a bad thing in general, but this does sound like a reasonable change to make - if you know that the contents of the tables won't change.
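For example, a rough way to time the two approaches end to end would be something like this (a minimal sketch; the two delegates stand for the per-row DB query and the in-memory lookup, which aren't shown here):

    using System;
    using System.Diagnostics;

    static class QueryBenchmark
    {
        // Run each variant many times and compare the totals, so that
        // connection, network, and materialisation overheads are included.
        public static void Compare(Func<string> getTranslationFromDb,
                                   Func<string> getTranslationFromCache)
        {
            const int iterations = 1000000;

            var sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++) getTranslationFromDb();
            sw.Stop();
            Console.WriteLine("DB round-trips:  {0} ms total", sw.ElapsedMilliseconds);

            sw = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++) getTranslationFromCache();
            sw.Stop();
            Console.WriteLine("In-memory cache: {0} ms total", sw.ElapsedMilliseconds);
        }
    }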

SQL Server is a bit peculiar in that way: all query execution times between 0 and 15 ms are rounded down to 0 ms. So you don't actually know from looking at the number whether your query is taking 0 ms or 15 ms. There's a big difference between doing 1000 * 1 ms queries and 1000 * 15 ms queries.
Regarding translations, I've found that the best way is to use Resources, tie them to a SQL database, and cache the translations in the web application for a reasonable amount of time. That is pretty efficient.
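For instance, in ASP.NET the translations can be dropped into the runtime cache with an absolute expiry so they are re-read from the database every so often (a minimal sketch; LoadTranslationsFromDb and the 30-minute window are just placeholders):

    using System;
    using System.Collections.Generic;
    using System.Web;
    using System.Web.Caching;

    public static class TranslationStore
    {
        // Cache the whole translation set for 30 minutes (an arbitrary
        // "reasonable amount of time"); after that it is reloaded from SQL.
        public static IDictionary<string, string> GetTranslations()
        {
            var cached = HttpRuntime.Cache["translations"] as IDictionary<string, string>;
            if (cached != null)
                return cached;

            IDictionary<string, string> fresh = LoadTranslationsFromDb();
            HttpRuntime.Cache.Insert("translations", fresh, null,
                                     DateTime.UtcNow.AddMinutes(30),
                                     Cache.NoSlidingExpiration);
            return fresh;
        }

        private static IDictionary<string, string> LoadTranslationsFromDb()
        {
            // Placeholder: query the SQL translation table here.
            return new Dictionary<string, string>();
        }
    }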
Other than that, what Jon Skeet says... :)

Both answers are true.
I actually measured 1 million runs as Jon suggested and, in fact... there is a huge difference! Thanks Jon.
And what Jonas says is true as well: the query actually took something around 15 ms (measured by the program) even though SQL Profiler says 0.
Thanks guys!

Related

BigQuery interactive query response times degradation since 2/19/16

For the Google BigQuery infrastructure folks: we've been running a set of short-running interactive queries for many months now, averaging about 5 seconds to complete. Starting Friday 2/19 these response times have been rising steadily (the SQL has not changed and we're querying a steady stream of data using a sliding window).
Is this a global BigQuery issue you are aware of?
There is good news and bad news; the good news is that the query took only 0.5 seconds to execute. The bad news is that it took 191 seconds to find the files where the data was stored.
We have a couple of performance regressions that cause high tail latency for resolving paths. Tables (like yours) where the data is stored in many paths will see worse performance.
This performance issue is exacerbated by the fact that you're using time-range decorators, which means our efforts to optimize the file layout don't work as well.
We are starting the roll-out of a fix to the underlying performance problem this afternoon; it will likely take at least a week for it to take effect everywhere. I'll update this answer once it is complete (if I forget, please remind me).
In the meantime, you may get faster results by removing the time-range decorators from your queries. You are already filtering by time, so the queries should still be correct. Of course, this may mean that the queries cost a bit more to run.

Troubleshooting a speed bottleneck in a program with SQL queries

This is an open-ended interview question. I have not been able to come up with a satisfactory answer.
The question is:
If you were trying to fix a speed bug involving a feature that took 90 seconds to execute when the customer expected it to take less than 10 seconds, how would you approach the problem and solve it? Assume the feature had 10 queries, 30 calculations, and 3000 lines of code spread over 5 modules.
I think the first part of the answer is that you would use a profiler for whatever language the code is in, to verify that the bottleneck is in the SQL queries and not in some processing. The profiler will also tell you which queries are taking the most time by showing the amount of time spent in each method. Once you have that, you can use a database query optimizer to fix the queries if that is the problem.
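As a first pass, even without a full profiler, you can wrap each of the feature's queries so the log shows where the 90 seconds actually go (a minimal sketch that assumes a SQL Server back end; the connection string and SQL are whatever the feature already uses):

    using System;
    using System.Data;
    using System.Data.SqlClient;
    using System.Diagnostics;

    static class TimedDb
    {
        // Execute one of the feature's queries and log how long it took,
        // so the slowest of the 10 queries stands out immediately.
        public static DataTable RunTimed(string connectionString, string sql)
        {
            var sw = Stopwatch.StartNew();
            var table = new DataTable();
            using (var conn = new SqlConnection(connectionString))
            using (var adapter = new SqlDataAdapter(sql, conn))
            {
                adapter.Fill(table); // opens/closes the connection as needed
            }
            sw.Stop();
            Console.WriteLine("{0,6} ms  {1}", sw.ElapsedMilliseconds, sql);
            return table;
        }
    }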
Too open-ended, but there are a few things you can try:
Make sure you understand the customer's requirements correctly
Make sure you understand where the performance bottleneck is in your current setup
Make sure that the right indexes exist for the queries you have in mind
Optimize the DB schema, de-normalize where necessary to avoid joins
Explore caching and pre-computing results where applicable as an option so you don't have to query the DB in the first place.
If all technical avenues are explored or time and effort would be too much, reset expectations with the customer if necessary.
Hmm... I would try to isolate each of the "steps" and see how long each of them takes to execute. I would focus on the SQL first, by running a trace with Profiler, because the queries usually take the longest to run. Once I have the values I would decide the next step. I can't say I would focus 100% on the database if I see that the DB is only responsible for, say, 10% of the execution time.

SQL DB performance and repeated queries at short intervals

If a query is constantly sent to a database at short intervals, say every 5 seconds, could the number of reads generated cause problems in terms of performance or availability? If the database is Oracle are there any tricks that can be used to avoid a performance hit? If the queries are coming from an application is there a way to reduce any impact through software design?
Unless your query is very intensive or horribly written, it won't cause any noticeable issues running once every few seconds. That's not very often for queries that are generally measured in milliseconds.
You may still want to optimize it though, simply because there are better ways to do it. In Oracle and ADO.NET you can use an OracleDependency for the command that ran the query the first time and then subscribe to its OnChange event which will get called automatically whenever the underlying data would cause the query results to change.
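A minimal sketch of what that looks like with ODP.NET (the connection string, query, and handler body are placeholders, and the DB user also needs the change notification privilege):

    using System;
    using Oracle.DataAccess.Client;

    class ChangeNotificationExample
    {
        static void Main()
        {
            using (var conn = new OracleConnection("User Id=scott;Password=secret;Data Source=orcl"))
            {
                conn.Open();
                var cmd = new OracleCommand("SELECT id, value FROM my_table", conn);

                // Register interest in changes to this query's result set.
                var dependency = new OracleDependency(cmd);
                dependency.OnChange += OnTableChanged;

                // Executing the command registers the notification with the server.
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read()) { /* consume initial results */ }
                }

                Console.ReadLine(); // keep the app alive to receive notifications
            }
        }

        static void OnTableChanged(object sender, OracleNotificationEventArgs e)
        {
            // Re-run the query (or refresh the cache) only when the data changed.
            Console.WriteLine("Underlying data changed; refresh the results now.");
        }
    }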
It depends on the query. I assume the reason you want to execute it periodically is that the data being returned will change frequently. If that's the case, then application-level caching is obviously not an option.
Past that, is this query "big" in terms of the number of rows returned, tables joined, data aggregated / calculated? If so, it could be a problem if:
You are querying faster than it takes to execute the query. If you are calling it once a second, but it takes 2 seconds to run, that's going to become a problem.
If the query is touching a lot of data and you have a lot of other queries accessing the same tables, you could run into lock escalation issues.
As with most performance questions, the only real answer is to test. In this case test with realistic data in the DB and run this query concurrent with the other query load you expect on the system.
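For a rough concurrency test, you can fire the query from several workers at once and watch the total time (a minimal sketch; runTheQuery stands in for whatever the application actually executes every few seconds):

    using System;
    using System.Diagnostics;
    using System.Threading.Tasks;

    static class LoadTest
    {
        // Run the query from several workers concurrently against a DB with
        // realistic data, to see whether contention shows up under load.
        public static void Run(Action runTheQuery, int workers, int repetitions)
        {
            var sw = Stopwatch.StartNew();
            Parallel.For(0, workers, _ =>
            {
                for (int i = 0; i < repetitions; i++)
                    runTheQuery();
            });
            sw.Stop();
            Console.WriteLine("{0} workers x {1} runs took {2} ms",
                              workers, repetitions, sw.ElapsedMilliseconds);
        }
    }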
Along the lines of Samuel's suggestion, Oracle provides facilities in JDBC to do database change notification so that your application can subscribe to changes in the underlying data rather than re-running the query every few seconds. If the data is changing less frequently than you're running the query, this can be a major performance benefit.
Another option would be to use Oracle TimesTen as an in memory cache of the data on the middle tier machine(s). That will reduce the network round-trips and it will go through a very optimized retrieval path.
Finally, I'd take a look at using the query result cache to have Oracle cache the results.

How much SQL Query is too much SQL Query?

I was researching CMSes to use and ran into a review of vBulletin 4.0, which uses about 200 queries on one page load.
I was then worried.
Further research led me to check how many queries other sites use, and I found that some forum software, such as Invision Power Board and phpBB, uses as few as 6 or 8.
Currently, my site uses about 25 to 40 queries.
Should I be worried?
Don't be worried about number of queries.
Be worried about:
Pages loading too slowly
The SQL being too complicated to maintain.
Clarification:
SQL being too complicated can come from either too many queries OR a few queries that are very complicated (lots of joins and subqueries, etc.).
If you aim for something, aim for 3 reads and 1 write per HTTP hit.
While these are somewhat arbitrary numbers (they are actually taken from Advanced PHP Programming), they emphasize two ideas:
the number of SQL round-trips should be low, certainly under 10, per HTTP call
there is a difference between reads and writes, and the ratio should favour reads; writes create contention
Also remember that not all reads are equal: the 3 reads should be highly optimized reads, not full table scans with 4-5 outer joins...
It depends. The more you hit the DB, the more load you have. If you need to display values from several different tables, you will probably need to run several queries. If you only have a couple of users and you know you're not going to have lots of data, it probably doesn't matter.
Some things to consider:
Are you running the same query multiple times per page load? If you can reuse the result, do it.
Are you running a query-per-result of another query? If so, maybe let the DB do the join and only do one pull (see the sketch after this list).
If your page is slow from hitting the db too much, look at memcached.
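To illustrate the query-per-result point above, the idea is to replace the 1 + N round-trips with a single joined query (a hedged sketch in C#/ADO.NET with made-up table and column names; the same idea applies in PHP):

    using System.Collections.Generic;
    using System.Data.SqlClient;

    static class OrderQueries
    {
        // Instead of fetching the orders and then running one extra query per
        // order to get its customer (1 + N round-trips), let the database do
        // the join and make a single round-trip.
        public static List<string> GetOrderSummaries(string connectionString)
        {
            const string sql =
                @"SELECT o.Id, o.Total, c.Name
                    FROM Orders o
                    JOIN Customers c ON c.Id = o.CustomerId";

            var summaries = new List<string>();
            using (var conn = new SqlConnection(connectionString))
            using (var cmd = new SqlCommand(sql, conn))
            {
                conn.Open();
                using (var reader = cmd.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        summaries.Add(string.Format("Order {0} ({1}): {2}",
                            reader.GetInt32(0), reader.GetDecimal(1), reader.GetString(2)));
                    }
                }
            }
            return summaries;
        }
    }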
You might try re-factoring your code over time to decrease the number of round-trips to SQL Server. One way to do this could be to utilize caching. For example, data you need frequently can be loaded when the application is started, then grabbed from the cache when it is needed.
Another approach could be to de-normalize your data into tables that are specifically designed to give you the data your site needs in a fewer number of queries.
Also consider whether some of those queries (those you use to populate lookup values, for instance) can be cached. That way, if the same query is called on multiple pages, or each time you move from one group of records to another, the database isn't hit again to run exactly the same query. I remember one time we were trying to determine why the site was so slow when the stored proc that was running was very fast, and we found using Profiler that it was being sent over and over and over again when it didn't need to be.
You can cache all those queries with vBulletin. If you look at pbnation.com, they have over a million visitors a day and only around 3-4 queries per page load. Everything else is cached in memcached.

Monitoring SQL JOB Performance Issues

I am working on a SQL Job which involves processing around 75000 records.
Now, the job works fine for 10000-20000 records at a speed of around 500/min. After around 20000 records, execution just dies; it loads around 3000 records every 30 minutes and stays at a similar speed.
I asked a similar question yesterday and got a few good suggestions on procedure changes.
Here's the link:
SQL SERVER Procedure Inconsistent Performance
I am still not sure how to find the real issue here. Here are few of the questions I have:
If the problem is tempdb, how can I monitor activity on tempdb?
How can I check whether the network is the bottleneck?
Are there any other ways to find out what is different between when the job runs fast and when it slows down?
I have been the administrator for a couple of large data warehouse implementations where this type of issue was common. Although I can't be sure of it, it sounds like the performance of your server is being degraded either by growing log files or by memory usage. A great tool for reviewing these types of issues is Perfmon.
A great article on using this tool can be found here
Unless your server is really chimped down, 75000 records should not be a problem for tempdb, so I really doubt that is your problem.
Your prior question indicated SQL Server, so I'd suggest running a trace while the proc is running. You can get statement timings etc from the trace and use that to determine where or what is slowing things down.
You should be running each customer's processing in separate transactions, or in small groups of customers. Otherwise, the working set of items that the ultimate transaction has to write keeps getting bigger, and each addition causes a rewrite. You can end up forcing your current data to be paged out, and that really slows things down.
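Along those lines, a minimal sketch of committing in small per-batch transactions instead of one giant one (the batch size, table, and dbo.ProcessCustomer procedure name are all made up):

    using System.Collections.Generic;
    using System.Data.SqlClient;
    using System.Linq;

    static class BatchProcessor
    {
        // Commit every 500 customers so the transaction's working set stays
        // small instead of growing across all 75,000 records.
        public static void ProcessInBatches(string connectionString, IList<int> customerIds)
        {
            const int batchSize = 500;

            for (int start = 0; start < customerIds.Count; start += batchSize)
            {
                var batch = customerIds.Skip(start).Take(batchSize).ToList();

                using (var conn = new SqlConnection(connectionString))
                {
                    conn.Open();
                    using (var tx = conn.BeginTransaction())
                    {
                        foreach (int id in batch)
                        {
                            using (var cmd = new SqlCommand("dbo.ProcessCustomer", conn, tx))
                            {
                                cmd.CommandType = System.Data.CommandType.StoredProcedure;
                                cmd.Parameters.AddWithValue("@CustomerId", id);
                                cmd.ExecuteNonQuery();
                            }
                        }
                        tx.Commit(); // one commit per batch, not per record or per job
                    }
                }
            }
        }
    }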
Check the memory allocated to SQL Server. If it's too small, you end up paging SQL Server's processes. If it's too large, you can leave the OS without enough memory.