If in an interview I am asked: as a Db2 DBA, how would you approach a job or a query that is consuming more time than normal? Which commands would you use, and what steps would you take to resolve it?
If it is a job I would use db2mon.sh as a starting point; if it is a query I would first try db2advis to see whether it recommends any indexes.
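For a quick first look at the kind of data db2mon.sh reports, the MON_GET table functions it is built on can also be queried directly. A minimal sketch (the column choice and row limit are arbitrary):

    -- Top 5 statements by CPU time from the package cache
    SELECT NUM_EXECUTIONS,
           TOTAL_CPU_TIME,
           SUBSTR(STMT_TEXT, 1, 80) AS STMT
    FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2)) AS T
    ORDER BY TOTAL_CPU_TIME DESC
    FETCH FIRST 5 ROWS ONLY;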
The Db2 Knowledge Center has a section on "Troubleshooting", which includes Troubleshooting Db2 Servers, and within that there is a section on Troubleshooting SQL Performance which mentions db2mon.
There is also a section on Performance Tuning. As a DBA, it is worth reading all these sections (at least once anyway).
I have gone through some documentation on the net, and using hints is mostly discouraged. I still have doubts about this. Can hints really be useful in production, especially when the same query is used by hundreds of different customers?
Are hints only useful when we know the number of records present in the tables? I am using leading in my query, and it gives faster results when the data is very large, but the performance is not that great when fewer records are fetched.
This answer by David is very good, but I would appreciate it if someone clarified this in more detail.
Most hints are a way of communicating our intent to the optimizer. For instance, the leading hint you mention means "join the tables in this order". Why is this necessary? Often it's because the optimal join order is not obvious: the query is badly written or the database statistics are inaccurate.
So one use of hints such as leading is to figure out the best execution path, then to figure out why the database doesn't choose that plan without the hint. Does gathering fresh statistics solve the problem? Does rewriting the FROM clause solve the problem? If so, we can remove the hints and deploy the naked SQL.
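To make that concrete, a leading hint pinning the join order might look like this (the tables and columns are invented for illustration):

    -- Force the optimizer to start the join with ORDERS, then CUSTOMERS
    SELECT /*+ LEADING(o c) */ c.name, o.total
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    WHERE o.order_date > DATE '2020-01-01';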
Sometimes we cannot resolve this conundrum and have to keep the hints in Production. However, this should be a rare exception. Oracle has had lots of very clever people working on the Cost-Based Optimizer for many years, so its decisions are usually better than ours.
But there are other hints we would not blink at seeing in Production. append is often crucial for tuning bulk inserts. driving_site can be vital in tuning distributed queries.
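A typical use of append in a bulk load (table names are placeholders):

    -- APPEND requests a direct-path insert, writing above the high-water mark
    -- and bypassing the buffer cache
    INSERT /*+ APPEND */ INTO sales_history
    SELECT * FROM sales_staging;
    -- the same session cannot read the table again until the direct-path
    -- insert is committed
    COMMIT;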
Conversely, other hints are almost always abused. Yes, parallel, I'm talking about you. Blindly adding /*+ parallel (t23, 16) */ will probably not make your query run sixteen times faster, and not infrequently will result in slower retrieval than single-threaded execution.
So, in short, there is no universally applicable advice as to when we should use hints. The key things are:
understand how the database works, and especially how the cost-based optimizer works;
understand what each hint does;
test hinted queries in a proper tuning environment with Production-equivalent data.
Obviously the best place to start is the Oracle documentation. However, if you feel like spending some money, Jonathan Lewis's book on the Cost-Based Optimizer is the best investment you could make.
I couldn't just rephrase this, so I will paste it here
(it's a brief explanation of "When Not To Use Hints" that I had bookmarked):
In summary, don’t use hints when
What the hint does is poorly understood, which is of course not limited to the (ab)use of hints;
You have not looked at the root cause of bad SQL code and thus not yet tapped into the vast expertise and experience of your DBA in tuning the database;
Your statistics are out of date, and you can refresh the statistics more frequently or even fix the statistics to a representative state;
You do not intend to check the correctness of the hints in your statements on a regular basis, which means that, when statistics change, the hint may be woefully inadequate;
You have no intention of documenting the use of hints anyway.
Source link here.
I can summarize this as: the use of hints is not only a last resort, it is often also a sign that the root cause of the issue has not been understood. The CBO (Cost-Based Optimizer) does an excellent job if you just ensure some basics for it (a DBMS_STATS sketch follows the list). Those include:
1. Fresh statistics
1.1. Index statistics
1.2. Table statistics
1.3. Histograms
2. Correct JOIN conditions and INDEX utilization
3. Correct database settings
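For the statistics point, a minimal sketch (schema and table names are placeholders):

    -- Refresh table, index, and histogram statistics in one call
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname    => 'APP_SCHEMA',
        tabname    => 'ORDERS',
        cascade    => TRUE,                        -- include index statistics
        method_opt => 'FOR ALL COLUMNS SIZE AUTO'  -- let Oracle decide histograms
      );
    END;
    /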
This article is worth reading:
Top 10 Reasons for poor Oracle performance
Presented by none other than Mr. Donald Burleson.
Cheers
In general, hints should be used only in exceptional cases. I know of the following situations where they make sense:
Workaround for Oracle bugs
Example: Once, for a SELECT statement, I got the error ORA-01795: maximum number of expressions in a list is 1000, although the query did not contain an IN expression at all.
The problem was: the queried table contains more than 1000 (sub-)partitions, and Oracle did a transformation of my query. Using the (undocumented) hint NO_EXPAND_TABLE solved the issue.
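A sketch of what such a workaround looks like; since NO_EXPAND_TABLE is undocumented, the argument form (the table alias) is an assumption here, and the table is a placeholder:

    -- Suppress the table-expansion transformation that triggered ORA-01795
    SELECT /*+ NO_EXPAND_TABLE(t) */ *
    FROM heavily_partitioned_table t
    WHERE t.status = 'OPEN';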
Data warehouse application while staging
While staging you can have significant changes to your data that the table/index statistics do not reflect, as statistics are gathered only once a week by default. If you know your data structure, then hints can be useful, as they are faster than running DBMS_STATS.GATHER_TABLE_STATS(...) manually all the time in between your operations. On the other hand, you can run DBMS_STATS.GATHER_TABLE_STATS() even for single columns, which might be the better approach (see the sketch below).
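The single-column variant might look like this (schema, table, and column names are placeholders):

    -- Re-gather statistics only for the column the staging step skewed
    BEGIN
      DBMS_STATS.GATHER_TABLE_STATS(
        ownname    => 'STAGE',
        tabname    => 'FACT_LOAD',
        method_opt => 'FOR COLUMNS load_batch_id SIZE AUTO'
      );
    END;
    /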
Online Application Upgrade Hints
From Oracle documentation:
The CHANGE_DUPKEY_ERROR_INDEX, IGNORE_ROW_ON_DUPKEY_INDEX, and RETRY_ON_ROW_CHANGE hints are unlike other hints in that they have a semantic effect. The general philosophy explained in "Hints" does not apply for these three hints.
I've just inherited an old PostgreSQL installation and need to do some diagnostics to find out why this database is running slowly. On MS SQL Server you would use a tool such as Profiler to see what queries are running and what their execution plans look like.
What tools, if any, exist for PostgreSQL that I can do this with? I would appreciate any help, since I'm quite new to Postgres.
Use the pg_stat_statements extension to find long-running queries, then use SELECT * FROM pg_stat_statements ORDER BY total_time/calls DESC LIMIT 10 to get the ten slowest, then use EXPLAIN to see the plan...
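Spelled out (note that on PostgreSQL 13 and later the column is total_exec_time rather than total_time):

    -- The extension must also be listed in shared_preload_libraries
    CREATE EXTENSION IF NOT EXISTS pg_stat_statements;

    -- Ten statements with the highest mean execution time
    SELECT query,
           calls,
           total_time / calls AS mean_time_ms
    FROM pg_stat_statements
    ORDER BY total_time / calls DESC
    LIMIT 10;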
My general approach is usually a mixture of techniques, none of which require extensions.
Set log_min_duration_statement to catch long-running queries (see the sketch after this list). https://dba.stackexchange.com/questions/62842/log-min-duration-statement-setting-is-ignored should get you started.
Use profiling of client applications to see which queries they spend their time on. Sometimes a query has a small duration but is repeated so frequently that it causes performance problems.
Of course, explain analyze can then help. If you are looking inside plpgsql functions, however, you often need to pull out the queries and run explain analyze on them directly.
Note: ALWAYS run EXPLAIN ANALYZE in a transaction that rolls back, or in a read-only transaction, unless you know that the statement does not write to the database.
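A minimal sketch of both points, with a placeholder threshold and table:

    -- Log every statement slower than 500 ms
    ALTER SYSTEM SET log_min_duration_statement = '500ms';
    SELECT pg_reload_conf();

    -- EXPLAIN ANALYZE really executes the statement, so wrap writes in a rollback
    BEGIN;
    EXPLAIN ANALYZE
    UPDATE accounts SET balance = balance - 10 WHERE id = 42;
    ROLLBACK;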
A SQL Server query takes 1 second when run in Query Analyzer with a single user. I started the stress tool written by Adam Machanic with the same query and ran it for 200 users; in parallel I ran the same query in Query Analyzer, and it then takes more than 20 seconds.
How do I find which join or where clause is creating the problem in a stress-test situation? What is taking so long?
Thanks,
Ron
It's likely going to be locking and blocking. A starting point is reading this article on MSDN, which gives a sproc you can run (and whose output is very verbose). Indexes may be one way to sort it out, but without more information (schema, query, volumes of data, etc.) it's unlikely we can provide more.
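Independent of that sproc, a quick first check for blocking while the stress run is going (standard DMVs, available from SQL Server 2005 on):

    -- Sessions that are currently blocked, and by whom
    SELECT session_id, blocking_session_id, wait_type, wait_time, wait_resource
    FROM sys.dm_exec_requests
    WHERE blocking_session_id <> 0;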
see this article: Slow in the Application, Fast in SSMS? Understanding Performance Mysteries by Erland Sommarskog, it is the most comprehensive article that I've ever seen on this issue.
I'd bet that it is one of The Default Settings, like QUOTED_IDENTIFIER.
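To check that theory, the per-session SET options are visible in a DMV; a mismatch (ARITHABORT is the classic one) between SSMS and the application explains many "slow in the application, fast in SSMS" cases:

    -- Compare SET options across connected sessions
    SELECT session_id, program_name, quoted_identifier, arithabort
    FROM sys.dm_exec_sessions
    WHERE is_user_process = 1;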
I've found a number of resources that talk about tuning the database server, but I haven't found much on the tuning of the individual queries.
For instance, in Oracle, I might try adding hints to ignore indexes or to use sort-merge vs. correlated joins, but I can't find much on tuning Postgres other than using explicit joins and recommendations when bulk loading tables.
Do any such guides exist so I can focus on tuning the most run and/or underperforming queries, hopefully without adversely affecting the currently well-performing queries?
I'd even be happy to find something that compared how certain types of queries performed relative to other databases, so I had a better clue of what sort of things to avoid.
update:
I should've mentioned, I took all of the Oracle DBA classes along with their data modeling and SQL tuning classes back in the 8i days ... so I know about EXPLAIN, but that's more for telling you what's going wrong with the query, not necessarily how to make it better. (E.g., are 'where var=1 or var=2' and 'where var in (1,2)' considered the same when generating an execution plan? What if I'm doing it with 10 permutations? When are multi-column indexes used? Are there ways to get the planner to optimize for fastest start vs. fastest finish? What sort of 'gotchas' might I run into when moving from MySQL, Oracle or some other RDBMS?)
I could write any complex query dozens if not hundreds of ways, and I'm hoping not to have to try them all to find which one works best through trial and error. I've already found that 'SELECT count(*)' won't use an index, but 'SELECT count(primary_key)' will ... maybe a 'PostgreSQL for experienced SQL users' sort of document exists that explains which sorts of queries to avoid, how best to re-write them, or how to get the planner to handle them better.
update 2:
I found a Comparison of different SQL Implementations which covers PostgreSQL, DB2, MS-SQL, MySQL, Oracle and Informix, and explains whether and how you can do various things, plus the gotchas involved. Its references section links to Oracle / SQL Server / DB2 / Mckoi / MySQL Database Equivalents (which is what its title suggests) and to the wikibook SQL Dialects Reference, which covers whatever people contribute (including some DB2, SQLite, MySQL, PostgreSQL, Firebird, Virtuoso, Oracle, MS-SQL, Ingres, and Linter).
As for badly performing queries: run EXPLAIN ANALYZE and read the output.
You can put the EXPLAIN ANALYZE output on a site like explain.depesz.com; it will help you find the elements that really take the most time.
There is a nice online tool that takes the output of EXPLAIN ANALYZE, and graphically shows you critical parts (e.g. wrong estimates, hot spots, etc)
http://explain.depesz.com/help
Btw, I think posted queries become public, and the "previous explains" link has been hit by spambots.
http://www.postgresql.org/docs/current/static/indexes-examine.html
You can give hints: SET enable_indexscan TO false; would make PostgreSQL try not to use indexes.
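Note that these planner settings are session-wide toggles rather than per-query hints, so it's worth scoping them tightly. A sketch with a made-up table:

    SET enable_indexscan = off;    -- discourage index scans for this session
    EXPLAIN ANALYZE
    SELECT * FROM orders WHERE customer_id = 42;
    RESET enable_indexscan;        -- restore the default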
To address your point: unfortunately, the only way to tune a query in Postgres is pretty much to tune the database underlying it. In Oracle you can set all of those options on a query-by-query basis, trumping the optimizer's plan in the process, but in Postgres you're pretty much at the mercy of the optimizer, for good and ill.
The PGAdmin3 tool includes a graphical explain feature for breaking down how a query is handled. It is also especially helpful for showing where table scans occur.
The best I've seen are here: http://wiki.postgresql.org/wiki/Using_EXPLAIN, but the latest PDF there is from 2008, so there may be something more recent. I'm interested to hear other users' answers.
Also, something's brewing in the contrib packages: http://www.sai.msu.su/~megera/wiki/plantuner
Are there any standard queries that can be run that will show the performance of a SQL Server 2005 database?
Note: I need to know the performance of every aspect of the database.
EDIT:
I am looking for a way to measure the time it takes for typical queries to execute. I am then going to apply indexing to certain tables in the database and then time how long the same queries take to execute and see if there is a significant difference. Is there an easy way to do this?
Thanks!
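One common lightweight technique for this kind of before/after timing is SET STATISTICS TIME and SET STATISTICS IO; a minimal sketch (the query and table are placeholders):

    SET STATISTICS TIME ON;   -- prints parse/compile and execution times
    SET STATISTICS IO ON;     -- prints logical/physical reads per table
    SELECT * FROM dbo.Orders WHERE CustomerID = 42;
    SET STATISTICS TIME OFF;
    SET STATISTICS IO OFF;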
For general background research/analysis of SQL Server performance, I prefer to watch how SQL Server is performing while it is performing. The best tools for that are SQL Profiler and sometimes Windows System Monitor (aka Performance Monitor, aka PerfMon). Alas, neither is particularly simple, let alone a simple query against the system, though some PerfMon counters are exposed through a few DMVs I can't dig up just now.
BOL has reasonable information on these; a good top-level (online) page for this is here. Be wary: there is serious DBA stuff beyond that point.
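(The DMV alluded to above is most likely sys.dm_os_performance_counters; a sketch, with the counter name as an example:)

    -- PerfMon counters surfaced inside SQL Server
    SELECT object_name, counter_name, instance_name, cntr_value
    FROM sys.dm_os_performance_counters
    WHERE counter_name = 'Page life expectancy';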
There are some dynamic management views and functions built in:
http://msdn.microsoft.com/en-us/library/ms188754(SQL.90).aspx
select * from sys.dm_db_index_usage_stats  -- how often each index is seeked, scanned, and updated
select * from sys.dm_os_memory_objects     -- memory allocations by internal object
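Building on the first of those, a slightly more targeted sketch that relates index reads to writes in the current database:

    SELECT OBJECT_NAME(s.object_id) AS table_name,
           i.name AS index_name,
           s.user_seeks + s.user_scans + s.user_lookups AS reads,
           s.user_updates AS writes
    FROM sys.dm_db_index_usage_stats AS s
    JOIN sys.indexes AS i
      ON i.object_id = s.object_id AND i.index_id = s.index_id
    WHERE s.database_id = DB_ID();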