Looks like my data warehouse project is moving to Teradata next year (from SQL Server 2005).
I'm looking for resources about best practices on Teradata - from limitations of its SQL dialect to idioms and conventions for getting queries to perform well - particularly if they highlight things which are significantly different from SQL Server 2005. Specifically tips similar to those found in The Art of SQL (which is more Oracle-focused).
My business processes are currently in T-SQL stored procedures and rely fairly heavily on SQL Server 2005 features like PIVOT, UNPIVOT, and Common Table Expressions to produce about 27m rows of output a month from a 4TB data warehouse.
One place to start is here: http://www.teradataforum.com/
This might be a little late, but there are a few things which I can warn you about Teradata which I have learned.
Use the most recent version as often as possible.
For V12 the optimizer was re-written and the database performs much better now.
Try to realize that SQL Server and Teradata are very different beasts, most of the concepts will not transition well.
Do not underestimate the importance of a primary index.
The locks that teradata uses are very primitive when compared to other databases.
Do NOT use TERA mode. You do not have any code which is legacy, ANSI mode is far superior and is widely encouraged.
Join indexes are very helpful tools, but they do not provide all the answers.
Parallelism, take the time to understand how FASTLOAD, MULTILOAD, and TPUMP works and find out how one can leverage it with their ETL strategy.
If you are attempting to run a query which needs to be performant, do not use any casts, the optimizer will not use statistics to generate the best execution plan.
Working with dates are going to be a pain, just a warning.
Teradata is very DDL oriented, try to understand all the syntax related when creating a table.
Compression is a wonderful tool, if you have any values which are repeated in a table, make use of it.
There are not many tools available with Teradata, be prepared to build a lot. The tools that exist are very expensive.
Unfortunately, I do not know much about SQL Server, so I cannot say what tools in SQL Server appear in Teradata.
Hope this helps
I would also look into the recently launched Teradata Developer Exchange as well as the TeradataForum and forums on Teradata's main website.
I don't know of any good references available online. Teradata has some design manuals that are available for download, but they're more instruction manuals and not "best practices" as such. check them out here: http://www.info.teradata.com/DataWarehouse/eTeradata-BrowseBy.cfm?page=Teradata%20Database
Alternatively, you need to find a friendly Teradata expert to bounce ideas off. Try Teradata themselves, or find a local consultant with Teradata experience.
Best Practices on Teradata isn't a topic that gets lots of discussions and most of the best tricks tend to be proprietary knowledge of the person/people who discovered them.
Sorry,
David Stewardson
Satyam Computer Services
Top of the list on a Google search for "Teradata Best Practices" gave me TERADATA ADVISORY GROUP SETS BEST PRACTICES FOR BUSINESS OBJECTS AND TERADATA CUSTOMERS
EDIT: Seeing as that's just advertising, as you've pointed out, see how you go with these. Please bear in mind that I don't have a clue what Teradata is and can't see myself using it any time this side of the 22nd century AD.
Teradata Discussion Forums
Best Practices for Teradata Deployments
Best Study Guides For NCR Teradata Certifications
The middle one looks promising with it's nice long link tree at the top
Oracle® Business Intelligence Applications Installation and Configuration Guide > Preinstallation and Predeployment Considerations for Oracle BI Applications > Teradata-Specific Database Guidelines for Oracle Business Analytics Warehouse >
and the first link, to the forums, should put you in touch with the right people.
Related
we are just starting to use Google Analytics data in BigQuery and previously used just the MSSQL Server in the work environment. We would like to move some of the analysis to the GCP and BigQuery, but could not decide on what is the better option to use - standard or legacy SQL?
In both cases we would have to adjust to the new language version, but the real question is what is the best choice when it comes to Google Analytics data analysis? Is there something that from the technical point of view should make us choose legacy over standard, or the other way around?
It is very misleading for us that there are two versions, because legacy seems to be more developed now, but perphaps standard will be the main version for SQL in the future in BQ?
BigQuery Standard SQL is the way to go. It has much more features than Legacy SQL.
Note: it is not binary choice. You always can use Legacy SQL - if there is something that you will find easier to express with it. From my experience it is mostly opposite - with very few exceptions. Most prominent (for me for example being) - Table Decorators - Support for table decorators in standard SQL is planned but not yet implemented.
I would recommend looking into Migrating from legacy SQL - not from migration point of view as you are the new to BigQuery - but because it is a good place to see and compare features of both dialects in one place.
Also I recommend to check BigQuery Issue Tracker so you can get some extra insight
Standard SQL is the preferred SQL dialect for use in BigQuery, as stated in the migration guide. While legacy SQL has been around for quite some time--and is still the default at the time of this writing--there is no active development work on it. If you are evaluating which to use, you should pick standard SQL, since in addition to being more similar to T-SQL (SQL Server's dialect) it is more expressive, has fewer surprising edge cases, and generally has more features.
Go with Standard SQL, as that's on the longterm roadmap.
From experience some queries are faster under Legacy SQL, but this is changing as Standard SQL is the one that is actively developed.
I know there are things out there to help to optimize queries, ect... but is there anything else, something like a full package that can scan your database and highlight all the performance issues, naming conventions, tables not properly normalized, etc?
I know this is the job of a DBA and if the DBA is good, he shouldn't need a tool like that, but sometimes you start a new job, you get in charge of an existing database and the DB is a mess, so you don't know where to start...
Thanks to everyone
Dave
In my opinion a normalized database does not guarantee good performance. Normalization is concerned primarily with data consistency.
It's not really practical for an automated tool to handle what you are proposing because there is no one case fits all implementation for best practice, which after all is to be treated as more of a guideline. That's why businesses hire people like me to take an objective look at their unique SQL Server environment (shameless plug).
The performance aspect can be addressed either by the wealth of features already available in the SQL Server product or by off the shelf tools such a those from Quest/Redgate and the like.
If you want to get a quick overall feel for the performance of a new SQL Server box that has come under your administrative control then I suggest either using the freely available Performance Dashboard Reports or SQL Server DMV's. You could also take a look at the current Wait Types on the server.
SQL Server 2005 Performance Dashboard
Reports
SQL Server Wait Type Example
How to identify the most costly SQL
Server queries using DMV’s
I hope this answers your question and do let me know if I can assist further.
Edited in response to comment:
Maybe a general Health Check could provide useful info.
SQL Server Best Practice
Analyzer
yesterday i went for an interview to be a sql / .net developer. my experience with sql is limited to basic pl/sql with oracle. they drilled me "do you know ssrs, do you know tsql, etc" well i kept saying no because i havent worked with them.
question: what do i have to learn in order to be able to work with microsoft sql? is it really that much different than oracle?
Grab a copy of SQL Server Express (see here) and start playing with it. There are sample databases that you can download to get you started.
SQL is the same, as it's a standard. T-SQL is an add on that has some flavors that are helpful to know. The way you setup procedures, functions, etc. is also different than PL-SQL, so that would be good to read up on. Outside the SQL Server engine and the various built-in tools, there are a lot of other MS products:
SSRS - SQL Server Reporting Services features a reporting engine, which are developed in Visual Studio.
SSIS - SQL Server Integration Services is a data import/export, etc. process, it's very handy to use for data import/export and other batch processing
SSAS - Analysis Services for OLAP
And so on. I don' tknow that SSAS helps you in this regard, but SSRS is pretty big so as a developer, reporting is a key feature and that would be handy to know something about. SSIS is good to know a little bit about, but might not be that handy, depending on what the org's needs are.
HTH.
SQL is pretty much SQL. There are some engine-specific differences but for most apps they're not significant. The management tools are obviously different. The OOB tools are vastly different.
SSRS is a reporting package (think Crystal reports on double steroids and you'd be close) not a DB engine. That should be listed as a separate job requirement.
I'd say get an MSDN license OR the free trial for SQL Server and install them all and try them out. Bookstore is a fairly generic app that you can extend forever and tryout new things.
Just keep in mind that someone hiring you is still going to want actual app experience, not your trials. If you can't get it at work, volunteer at an organization.
A good place to start is reading the MSDN SQL Server resource page. You'll find good information there about the whole MS SQL Ecosystem.
Then get a trial license, a virtual machine and start playing around.
It's kinda limited to their knowledge, as if you know basic ANSI sql then you can get almost all the basics running on SQL Server as they have a common base. As for SSRS, that is specific and will require reading and playing with it to learn. The SQL2008 Express with Advanced services should help you out.
With .net developer interviews I've been to they expect you to know the basics at minimum and be able to do joins and stuff in sql. Learning how to do temp tables and stored procedures as well as updates/selects/deletes and stuff should get you a bit further.
Potentially if they want that kinda experience either they are aiming the roll too low, or you've managed to slip through the net for a higher level role (which is sometimes a good thing) :-)
Can anyone recommend a good ANSI SQL reference manual?
I don't necessary mean a tutorial but a proper reference document to lookup when you need either a basic or more in-depth explanation or example.
Currently I am using W3Schools SQL Tutorial and SQL Tutorial which are ok, but I don't find them "deep" enough.
Of course, each major RDBMS producer will have some sort of reference manuals targeting their own product, but they tend to be biased and sometime will use proprietary extensions.
EDITED: The aim of the question was to focus on the things database engines have in common i.e. the SQL roots. But understanding the differences can also be a positive thing - this is quite interesting.
Here's the ‘Second Informal Review Draft’ of SQL:1992, which seems to have been accurate enough for everything I've looked up. 1992 covers most of the stuff routinely used across DBMSs.
SQL isn't like C or Java, where there is a standard for the language, and then a number of companies and organizations are implementing the language as best they can, following the standard.
Instead, the major databases came before the SQL standard, and the standard is a sort of compromise where every database vendor wanted to get their particular dialect and features in the standard.
Therefore, there is much more variety between databases than between typical programming language compilers, and to use a database, you really need to know that particular SQL dialect.
Having said that, I've got Gultzan and Peltzer's SQL-99 Complete, Really here in my bookshelf. It is a good book if you need to know what the standard really contains. (And yes, there is a newer version since SQL-99, but noone seems to care.)
EDIT: Actually, there is not just one newer version since SQL-99, but three: SQL:2003, SQL:2006, and SQL:2008. And still noone seems to care. Actually, many don't even care about SQL-99, so SQL-92 is still, in a way, "the standard".
ANSI documents can all be purchased from -- you guessed it -- ANSI.
http://webstore.ansi.org/
The main problem with an ANSI SQL reference manual is that you can't find a DB which implements it. And when it does, then you'll find that ANSI SQL can't solve some of the daily problems. Which is why all professional databases define extensions.
So at work, you'll need a reference manual for the specific version of the database which you use.
This reminds me of my 2nd year university course where we learn relational theory instead of SQL.
Read a good book on Relational Theory. Database theory and practice have evolved since Edgar Codd originally defined the relational model back in 1969. Independent of any SQL products, SQL and Relational Theory draws on decades of research to present the most up-to-date treatment of the material available anywhere. Anyone with a modest to advanced background in SQL will benefit from the many insights in this book.
Oreilly January 2009
I found the best way to learn SQL was to actually get to writing queries and understanding the nature of joins/conditionals etc. I found this link with a lot of DIY examples to be the best : http://sqlzoo.net/
It's a littel outdated, but this book is really helpful is looking at how the differnt vendors implement things, I belive it includes ANSII standard.
http://www.amazon.com/SQL-Nutshell-2nd-Kevin-Kline/dp/0596004818/ref=sr_1_1?ie=UTF8&s=books&qid=1257963172&sr=8-1
I really like just about anything Joe Celko has written Celko's Books
I think this may be helpful to you.
Understanding the ANSI SQL standard
By: Kevin Kline
http://www.amazon.com/gp/product/1565927443/102-0105946-4028970?v=glance&n=283155
The DevGuru resources always worked well for me:
http://www.devguru.com/technologies/t-sql/home.asp
Although I must admit it's not strictly an 'ANSI' focused resource. I've always been MS SQL centric, and it was helpful to me when I was starting out. IMHO Your best bet will be to use several resources - specifically including at least one of for each DB platform you want to use.
To Quote the DevGuru intro for their T-SQL resource:
Although there are standards for SQL,
such as ANSI SQL92 and SQL99, most
databases use their own dialect and/or
extentions. Microsoft's flavor of SQL
used in SQL Server 7 and SQL Server
2000 is called T-SQL. While many of
the examples in this quick reference
may work on other databases, it is
assumed that SQL Server 2000 is used,
especially for advanced topics such as
stored procedures.
One of the aspects of computer science/practical software engineering I am weaker at is actually doing significant work in database systems. That is to say, I can do simple queries on smaller datasets, no problem. However, working with complex queries on large datasets invokes a level of understanding of databases beyond me right now. For example, I built an amusing query some time ago that computed a join using a n^2 size where n=20,000- the hosting server suspended my account for blowing the CPU. Shocking.
I am interested in bringing myself up to speed on how to design schema and queries that, well, don't bring down the server. Pursuant to that end, what materials do you recommend that discuss professional database/SQL design and writing?
I would go to the bookstore and pick out some books on performance tuning for the database of your choice (it is very differnet depending onteh database backend). This will help you understand what not to do which is critical to designing databases.
Here's a site with a lot of good info
http://wiki.lessthandot.com/index.php/Category:Data_Management
For generic SQL I would go for Celko's books. For vendor specific, it depends on the platform of your choice. I know the SQL Server platform well and for that my praise go to the Inside series.
Blogs are also usefull, look at the all time SQL tag right here on SO and check the top answerers info, some have personal blogs that are very usefull. Eg. go through Quassnoi's blog, it has a LOT of useful info on MySQL, Oracle, SQL Server.