Looking for SQL Optimization Interview Questions - sql

I'm on a project where I was asked to take a quick peek at some reporting SQL (in a SQL Server 2K5 environment) and was surprised at what I found: 4 to 5 levels of subquerys, distinct clauses, unions, and NoLock hints (which were needed because the SQL was running so long it was blocking standard processing) - all in the same set!.
Because I (foolishly :) mentioned that I thought the SQL was inefficient I've been labeled the "expert" and have been tasked with creating a test for a couple of interviewees to do that will assess their SQL optimizing abilities. I'm hoping someone can point me to some URLS, or maybe provide a list that I can use to help weed out the good from the bad.

Since you mentioned the SQL Server 2005 environment:
More SQL Server interview questions than you possibly could have imagined:
The classic set. Most interviewees will probably have studied these...maybe a good way to gauge who has prepared.
Another classic
Questions from one of the original Stack Overflow DBAs
Another link for best SQL questions and answers

I would give them a Query plan (via EXPLAIN, or whatever your flavor of SQL uses as a keyword) and see if they can decipher what it means, what the weak points are, and how to improve the query.
Look at MySQL's Explain Documentation for help using MySQL's explain and what it means.

Related

Guides for PostgreSQL query tuning?

I've found a number of resources that talk about tuning the database server, but I haven't found much on the tuning of the individual queries.
For instance, in Oracle, I might try adding hints to ignore indexes or to use sort-merge vs. correlated joins, but I can't find much on tuning Postgres other than using explicit joins and recommendations when bulk loading tables.
Do any such guides exist so I can focus on tuning the most run and/or underperforming queries, hopefully without adversely affecting the currently well-performing queries?
I'd even be happy to find something that compared how certain types of queries performed relative to other databases, so I had a better clue of what sort of things to avoid.
update:
I should've mentioned, I took all of the Oracle DBA classes along with their data modeling and SQL tuning classes back in the 8i days ... so I know about 'EXPLAIN', but that's more to tell you what's going wrong with the query, not necessarily how to make it better. (eg, are 'while var=1 or var=2' and 'while var in (1,2)' considered the same when generating an execution plan? What if I'm doing it with 10 permutations? When are multi-column indexes used? Are there ways to get the planner to optimize for fastest start vs. fastest finish? What sort of 'gotchas' might I run into when moving from mySQL, Oracle or some other RDBMS?)
I could write any complex query dozens if not hundreds of ways, and I'm hoping to not have to try them all and find which one works best through trial and error. I've already found that 'SELECT count(*)' won't use an index, but 'SELECT count(primary_key)' will ... maybe a 'PostgreSQL for experienced SQL users' sort of document that explained sorts of queries to avoid, and how best to re-write them, or how to get the planner to handle them better.
update 2:
I found a Comparison of different SQL Implementations which covers PostgreSQL, DB2, MS-SQL, mySQL, Oracle and Informix, and explains if, how, and gotchas on things you might try to do, and his references section linked to Oracle / SQL Server / DB2 / Mckoi /MySQL Database Equivalents (which is what its title suggests) and to the wikibook SQL Dialects Reference which covers whatever people contribute (includes some DB2, SQLite, mySQL, PostgreSQL, Firebird, Vituoso, Oracle, MS-SQL, Ingres, and Linter).
As for badly performing queries - do explain analyze and read it.
You can put explain analyze output on site like explain.depesz.com - it will help you find the elements that really take the most time.
There is a nice online tool that takes the output of EXPLAIN ANALYZE, and graphically shows you critical parts (e.g. wrong estimates, hot spots, etc)
http://explain.depesz.com/help
Btw, I think posted queries become public, and the "previous explains" link has been hit by spambots.
http://www.postgresql.org/docs/current/static/indexes-examine.html
You can give hints: SET enable_indexscan TO false; would make PostgreSQL try to not use indexes
To address your point, unfortunately the only way to tune a query in Postgres is pretty much to tune the database underlying it. In oracle, you can set all of those options on a query by query basis, trump the optimizers plan in the process, but in Postgres, you're pretty much at the mercy of the optimizer, for good and ill.
The PGAdmin3 tool includes a graphical explanation tool for breaking down how a query is handled. It also is especially helpful for showing where table scans occur.
Best I've seen are in here: http://wiki.postgresql.org/wiki/Using_EXPLAIN, but the latest PDF in there is from 2008, so there may be something more recent. I'm interested to hear other user's answers.
Also, something's brewing in the contrib packages: http://www.sai.msu.su/~megera/wiki/plantuner

Help finding old SQL tool that rewrote queries [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 8 years ago.
Improve this question
There was this old sql server tool called Lectoneth or something like that, you'd put sql queries in it, and it would rewrite it for you.
I think quest bought them out, but I can't find where to download a free copy of that software.
Really helps when you have no dba, and have lots of sql queries to rewrite.
Thanks
Craig
Doesn't ring a bell, and presumably you've seen, but nothing obvious on Quest's website
Perhaps a tool like Red Gate's SQL Prompt would help - the Pro edition does SQL reformatting.
Edit
Think i've found what you're looking for, mentioned here - LECCO SQL Expert. The link to the Lecco website does indeed direct to quest, but a 404.
LECCO SQL Expert is the only complete
SQL performance tuning and
optimization solution offering
problematic SQL detection and
automatic SQL rewrite. With its
built-in Artificial Intelligence (AI)
based Feedback Searching Engine, LECCO
SQL Expert reduces the effort required
to optimize SQL and makes even the
most junior programmer an expert.
Developers use LECCO SQL Expert to
optimize SQL during application
development. DBAs eliminate
problematic SQL before users
experience application performance
problems by using LECCO SQL Expert in
production systems.
Looks like it's no longer about - all mentions of I could find indicated it supported up to SQL 2000, and stale links - looks like it wasn't a free tool. As said in my comments, I think this kind of thing is a skill well worth possessing and would benefit in the long run to not relay on a tool to try and do it for you.
I wasn't aware of this tool before now, so I have picked up something from this question - got me intrigued!
Final Update:
To confirm, that product has indeed gone as Lecco was acquired some years ago now. Thanks to Brent Ozar for confirmation.
I think you're looking for a product that's been merged into Toad for SQL Server. The commercial version of Toad has a SQL Optimizer feature that tries lots of ways to rewrite your SQL statements, then tests them to find which ways are the fastest.
You can download Toad here:
http://www.toadsoft.com/
But be aware that that feature is a paid-version-only feature.
Well rather than spending your time looking for a magic bullet, why not spend some time learning performance tuning (you will need a book, this is too complex for the Internet generally). Plus it is my belief that if you want ot write decent new code, you need to understand performance in databases. There is no reason to be unable to write code that avvoids the most common problems.
First, rewrite every query to use ANSII syntax anytime you open it up to revise it for any other reason. Code review all SQl changes and do not pass the code review unless explicit joins were used.
Your first step in performance tuning to identify which queries and procs are causing the trouble. You can use tools that will tell you the worst performing queries in terms of overall time, but don't forget to tune the queries that are run frequnetly as well. Cutting seconds off a query that runs thousands of times a day can really speed things up. Also since you are in oprod already, likely your users are complaining about certain areas, those areas should be looked at first.
Things to look for that cause performance problems:
Cursors
Correlated subqueries
Views that call views
Lack of proper indexing
Functions (especially scalar function that make the query run row by row insted of through a set)
Where clauses that aren't sargeable
EAV tables
Returning more data than you need (If you have anything with select * and a join, immediately fix that.)
Reusing sps that act on one record to loop throuhg a large group of records
Badly designed autogenerated complex queries from ORMs
Incorrect data types resulting in the need to be continually be converting data in order to use it.
Since you have the old style syntax it is highly likely you have a lot of accidental cross joins
Use of distinct when it can be replaced with a derived table instead
Use of union when Union all would work
Bad table design that requires difficult construction of queries that can never perform well. If you find yourself frequently joining to the same table multiple times to get the data you need, then look at the design of the tables.
Also since you have used implicit joins you need to be aware that even in SQL Server 2000 the left and right implicit syntax does not work correctly. Sometimes this interprets as a cross join instead of a left join or right join. I would make it a priority to find and fix all of these queries immediately as they may currently be returning an incorrect result set. Bad data results are even worse that slow data returns.
Good luck.

How different is PostgreSQL to MySQL? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 6 years ago.
Improve this question
I've been asked to support and take on a PostgreSQL app, but am a MySQL guy - is this a realistic task?
PostgreSQL has some nice features like generate_series, custom aggregate functions, arrays etc, which can ease your life greatly if you take some time to learn them.
On the other hand it lacks some features of MySQL like using and assigning session variables in queries, FORCE INDEX, etc., which is quite annoying if you are used to these features.
If you just use basic SQL, then you will hardly notice any difference.
How different is PostgreSQL to MySQL?
That depends if you're talking about SQL only (which is mostly the same) or the stored procedures (which are quite different).
is this a realistic task?
Absolutely. PostgreSQL has very good documentation and community. There are also lot of ppl, who have experience with MySQL and PostgreSQL.
"MySQL vs PostgreSQL wiki" — centers on "which is better", but gives you some idea of differences.
PostgreSQL compared to MySQL is as any other pair of DBMSs compared. What they have in common is non-functional, specifically the consequences of each being open source. In terms of features, use, and strengths they are no closer to each other than PostgreSQL is to Oracle or DB2 is to Sybase.
Now on to your real question: you are a SQL guy, albeit one who has not yet had experience with PostgreSQL. This is a completely realistic task for you, and a good one since you'll expand your understanding of the varieties of DBMSs and gain a perspective on MySQL that you can't get from working solely within its sphere.
As someone who was once in exactly the same position, my guess is that you'll quickly pick up PostgreSQL and might even hesitate to return to MySQL ;-).
If you're interested in the different flavors of SQL, here are a few resources (though some may be outdated):
SQLZoo
SQL Dialects Reference Wikibook
Tips on Writing Portable SQL
SQL Bible
You may want to take a look at these pages:
Why PostgreSQL Instead of MySQL: Comparing Reliability and Speed in 2007, Why PostgreSQL Instead of MySQL 2009.
I faced the same situation about a month ago.... I have been doing fine with postgres. There is a strong online community for postgres and you should be able to find help if you run into any trouble and learn stuff easily :)
I didn't take very long to switch from MySQL to PostgreSQL back when I first started using PostgreSQL in anger at a previous company. I found it very nice and very refreshing (not that MySQL was bad) compared to MySQL which I had used previously. PostgreSQL was also a good stepping stone to Oracle which I use at my current company. I liked that it had a proper command line application like MySQL, but the configuration options are harder - but if you're not setting it up then there is no problem.

Can you recommend a good source for Teradata Best Practices?

Looks like my data warehouse project is moving to Teradata next year (from SQL Server 2005).
I'm looking for resources about best practices on Teradata - from limitations of its SQL dialect to idioms and conventions for getting queries to perform well - particularly if they highlight things which are significantly different from SQL Server 2005. Specifically tips similar to those found in The Art of SQL (which is more Oracle-focused).
My business processes are currently in T-SQL stored procedures and rely fairly heavily on SQL Server 2005 features like PIVOT, UNPIVOT, and Common Table Expressions to produce about 27m rows of output a month from a 4TB data warehouse.
One place to start is here: http://www.teradataforum.com/
This might be a little late, but there are a few things which I can warn you about Teradata which I have learned.
Use the most recent version as often as possible.
For V12 the optimizer was re-written and the database performs much better now.
Try to realize that SQL Server and Teradata are very different beasts, most of the concepts will not transition well.
Do not underestimate the importance of a primary index.
The locks that teradata uses are very primitive when compared to other databases.
Do NOT use TERA mode. You do not have any code which is legacy, ANSI mode is far superior and is widely encouraged.
Join indexes are very helpful tools, but they do not provide all the answers.
Parallelism, take the time to understand how FASTLOAD, MULTILOAD, and TPUMP works and find out how one can leverage it with their ETL strategy.
If you are attempting to run a query which needs to be performant, do not use any casts, the optimizer will not use statistics to generate the best execution plan.
Working with dates are going to be a pain, just a warning.
Teradata is very DDL oriented, try to understand all the syntax related when creating a table.
Compression is a wonderful tool, if you have any values which are repeated in a table, make use of it.
There are not many tools available with Teradata, be prepared to build a lot. The tools that exist are very expensive.
Unfortunately, I do not know much about SQL Server, so I cannot say what tools in SQL Server appear in Teradata.
Hope this helps
I would also look into the recently launched Teradata Developer Exchange as well as the TeradataForum and forums on Teradata's main website.
I don't know of any good references available online. Teradata has some design manuals that are available for download, but they're more instruction manuals and not "best practices" as such. check them out here: http://www.info.teradata.com/DataWarehouse/eTeradata-BrowseBy.cfm?page=Teradata%20Database
Alternatively, you need to find a friendly Teradata expert to bounce ideas off. Try Teradata themselves, or find a local consultant with Teradata experience.
Best Practices on Teradata isn't a topic that gets lots of discussions and most of the best tricks tend to be proprietary knowledge of the person/people who discovered them.
Sorry,
David Stewardson
Satyam Computer Services
Top of the list on a Google search for "Teradata Best Practices" gave me TERADATA ADVISORY GROUP SETS BEST PRACTICES FOR BUSINESS OBJECTS AND TERADATA CUSTOMERS
EDIT: Seeing as that's just advertising, as you've pointed out, see how you go with these. Please bear in mind that I don't have a clue what Teradata is and can't see myself using it any time this side of the 22nd century AD.
Teradata Discussion Forums
Best Practices for Teradata Deployments
Best Study Guides For NCR Teradata Certifications
The middle one looks promising with it's nice long link tree at the top
Oracle® Business Intelligence Applications Installation and Configuration Guide > Preinstallation and Predeployment Considerations for Oracle BI Applications > Teradata-Specific Database Guidelines for Oracle Business Analytics Warehouse >
and the first link, to the forums, should put you in touch with the right people.

Library of Useful (Difficult) SQL scripts [closed]

As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Does anyone know where I can find a library of common but difficult (out of the ordinary) SQL script examples. I am talking about those examples you cannot find in the documentation but do need very often to accomplish tasks such as finding duplicates etc.
It would be a big time saver to have something like that handy.
EDIT: Thanks everyone, I think this is turning into a great quick reference. The more descriptive the more effective it would be, so please if you see your way open - please edit and add some descriptions of what one could find. Many thanks to those that have already done so!
You may find this wiki on LessThanDot useful, for the most part, it is by Denis Gobo, Microsoft SQL MVP.
EDIT:
The wiki includes 100+ SQL Server Programming Hacks, the list is, I think, too long to include here, however, there is a comprehensive index.
Also available from the same site: SQL Server Admin Hacks.
Here are a few that I find very useful:
SQL Server Best Practices - Microsoft SQL Server White Papers and Best Practices
vyaskn - A mixture of articles From DBA to Developer
Backup, Integrity Check and Index
Optimization
SQLServerCentral Scripts - Scripts for most most DBA tasks and more
Script Repository: SQL Server 2005 - TechNet Script Center
Scripts and Tools for Performance Tuning and Troubleshooting SQL Server 2005
SQL Server Query Processing Team - Hardcore advice from the MS SQL optimisation team
Common Solutions for T-SQL Problems
Davide Mauri's Index Scripts
Some Administration stuff
Glenn Berry: Five Very Useful Index Selection Queries for SQL Server 2005
Find "Missing" Indexes for the entire instance of SQL Server
Find "Missing" Indexes for a single table
Examine the current index structure for a single table
Look at index usage for a single table
Look for possible bad indexes inside the entire current database
Drill into your workload (Bonus)
SQL Server Central: Seven Monitoring Scripts
Failed jobs report
Free space by drive
Disabled jobs
Running jobs
Server role members
Last backup date
SQL Log
And last, but not least this resource: SQL Server Programming Hacks - 100+ List
Joe Celko's SQL Puzzles and Answers
The Art of SQL (slight Oracle bias)
Sql Cookbook has a variety of interesting example, though some will undoubtedly be unsupported by your RDBMS of choice. O'Reilly also has a T-SQL Cookbook, but I've never personally read it.
directly from MS Script Repository: SQL Server 2005:
http://www.microsoft.com/technet/scriptcenter/scripts/sql/sql2005/default.mspx?mfr=true
Nigel's very usefull stuff:
http://www.nigelrivett.net/#TransactSQL
Forgive me for the self-advertising, but I have posted a few on my blog (http://progblog.wordpress.com) because I'm rubbish at SQL and it's a good place to store things I know I'll need in the future :-) If anyone has anything more substantial then please post, I'm as keen as anyone to get hold of something like this!
I would guess that a copy of the "SQL Cookbook" would help too.
I've had some use of these SQL "hacks" for Oracle a couple of times.
Concatenate as grouping function
In query data generation for joining purposes
Here is another link for SQL Server: best practices - dozens of script examples
http://www.sqlusa.com/bestpractices2005/
Check Out SQLCAT.com (MS SQL BEST PRACTICES TEAM)
Riffing off the Celko answer: SQL For Smarties. This has great in depth chapters that will augment the SQL Puzzles book. Also there is another Celko book I just learned of named
Joe Celko's Trees and Hierarchies in SQL for Smarties.