Why are table aliases commonly lowercase? - sql

I always see examples this way, but why? Is this a good practice?

So they're distinguishable from the rest of the query (which is typically written in upper case).
As for whether or not it's a best practice...if you're writing queries in all upper case, then yes it definitely makes your queries easier to read and understand.

I use lower case for the names invented by me.
These are table names, column names, my function names, aliases, etc.
The upper case is for the names invented by somebody else.
That is, reserved words, built-in functions, etc.
dual and dummy in Oracle are notable exceptions to this rule, but they are a table name and a column name, so I just treat like with like.

Convention is always a good practice, so it is good to follow what your dev team has agreed upon. Many people subscribe to putting keywords in UPPER case, so differentiating aliases from keywords by making them lower is common.

I think, as with most casing questions in SQL, it's all personal preference. I like all lowercase in my queries so I tend to just keep it that way with aliases as well.

As said before, I think it's personal preference.
I mostly use lower case, except aliases which I always capitalize.
I write queries only in stored procedures so I write only the important part of my query (and other TSQL "commands" like BEGIN, END, IF, ELSE, WHILE, etc.) in upper case.
All aliases are capitalized so I can see at a glance to which table an attribute belongs.
If someone joins my team (project) he has to do the same, as I do when I join someone else's project.
As I try to make it more readable, I think that line breaks and indentation are more important than case (as long as it stays the same through the whole project).

I think there are several reasons to write them in lowercase:
Lowercase looks less like "shouting". Lots of UPPERCASE characters look like shouting (most people on a forum won't like posts in UPPERCASE). Some people write keywords like IF in upper case; if you also write your aliases in uppercase, you might get confused.
If you start a table name with a capital and your aliases with lower case characters you can keep them apart. Otherwise you might get confused when there are also tables with shorter names.
But no matter what standard you use in a team, as long as everybody is using the same rules you can read each other's code and it will look less messy. Sometimes code can be so complex you don't want to get distracted by code which violates the "rules".

Related

Should underscores be used in column names?

Technically, the underscore character (_) can be used in column names. But is it good practice to use underscores in column names? It seems to make the name more readable but I'm concerned about technical issues which may arise from using them. Column names will not be prefixed with an underscore.
There is no direct technical issue with using an underscore in the name. In fact, I do it quite often and find it helpful. Ruby even auto-generates underscores in column names, and SQL Server's own system objects use underscores too.
In general, it is a good idea to have some naming convention that you stick to in the database, and if that includes underscores, no big deal.
Any character can be used in the name, if you put square brackets or quotes around the name when referring to it. I try to avoid spaces though, since it makes things harder to read.
There are a few things you want to avoid when coming up with a naming convention for SQL Server. They are:
Don't prefix stored procedures with sp_ unless you are planning to make them system wide.
Don't prefix columns with their data type (since you may want to change it).
Avoid putting stuff in the sys schema (you can with hacking, but you shouldn't).
Pretend your code is case sensitive, even when it isn't. You never know when you end up on a server that has tempdb set up to be case sensitive.
When creating a temp table, always specify the collation for string types.
There is no problem with this, as long as it makes the column name clearer.
If you check the PostgreSQL documentation you may find that almost all the objects are named in snake case.
Moreover, a lot of system objects in MySQL, MS SQL Server, Oracle DB, and the aforementioned PostgreSQL use snake case.
From my personal experience it is not a big deal to use underscores for object naming.
But there is a caveat.
The underscore is a wildcard that matches any single character in the SQL LIKE operator:
SELECT * FROM FileList WHERE Extention LIKE 'ex_'
It is a potential issue when there is a lot of dynamic SQL code, especially if we are talking about autogenerated object names. And such bugs are quite hard to find.
Personally I would rather avoid underscores in naming. But at the same time there is no need to rewrite all the existing code if this type of naming is already being used.
Forewarned is forearmed.
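To make the caveat concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table file_list and its rows are made up for illustration, and the ESCAPE character '!' is an arbitrary choice; other databases support ESCAPE too, but check your own engine's rules.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding autogenerated object names.
conn.execute("CREATE TABLE file_list (name TEXT)")
conn.executemany(
    "INSERT INTO file_list VALUES (?)",
    [("my_table",), ("myXtable",), ("my-table",)],
)

# Unescaped: '_' matches any single character, so all three rows match.
loose = conn.execute(
    "SELECT name FROM file_list WHERE name LIKE 'my_table' ORDER BY name"
).fetchall()

# Escaped: '!_' matches only a literal underscore.
strict = conn.execute(
    "SELECT name FROM file_list WHERE name LIKE 'my!_table' ESCAPE '!' ORDER BY name"
).fetchall()

print(loose)
print(strict)
```

The unescaped pattern silently returns more rows than intended, which is why such bugs are hard to spot in dynamic SQL.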

Using backquote/backticks for mysql queries

I have been building MySQL queries with backticks. For example,
SELECT `title` FROM `table` WHERE (`id` = 3)
as opposed to:
SELECT title FROM table WHERE (id = 3)
I think I got this practice from the phpMyAdmin exports, and from what I understood, even Rails generates its queries like this.
But nowadays I see fewer and fewer queries built like this, and the code looks messier and more complicated with backticks in queries. Even with SQL helper functions, things would be simpler without them. Hence, I'm considering leaving them behind.
I wanted to find out if there are other implications of this practice, such as SQL (MySQL in my case) interpretation speed, etc. What do you think?
Backticks also allow spaces and other special characters (except for backticks, obviously) in table/column names. They're not strictly necessary but a good idea for safety.
If you follow sensible rules for naming tables and columns backticks should be unnecessary.
Every time I see this discussed, I try to lobby for their inclusion, because, well, the answer is hidden in here already, although wryly winked away without further thought. When we mistakenly use a keyword as a field or table name, we can escape confusion by various methods, but only the keenly aware back-tick ` allows an even greater benefit!!!
Every word in a SQL statement is run through the entire keyword hash table to see if it conflicts; therefore, you've done your query a great favor by telling the compiler that, hey, I know what I'm doing, you don't need to check these words because they represent table and field names. Speed and elegance.
Cheers,
Brad
backticks are used to escape reserved keywords in your mysql query, e.g. you want to have a count column—not that uncommon.
you can use other special characters or spaces in your column/table/db names
they do not keep you safe from injection attacks (if you allow users to enter column names in some way—bad practice anyway)
they are not standardized sql and will only work in mysql; other dbms will use " instead
Well, if you ensure that you never accidentally use a keyword as an identifier, you don't need the backticks. :-)
You can read the documentation on identifiers at http://dev.mysql.com/doc/refman/5.6/en/identifiers.html
SQL generators will often include backticks, as it is simpler than including a list of all MySQL reserved words. To use any sequence of BMP Unicode characters except U+0000 as an identifier (but see note 1 below), they can simply:
Replace all backticks with double backticks
Surround that with single backticks
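A minimal Python sketch of those two steps. The function name quote_mysql_identifier is hypothetical (it is not part of any MySQL library), and the U+0000 check simply mirrors the restriction mentioned above.

```python
def quote_mysql_identifier(name: str) -> str:
    """Backtick-quote a name for use as a MySQL identifier (hypothetical helper)."""
    if "\x00" in name:
        # U+0000 is the one character excluded above.
        raise ValueError("U+0000 is not permitted in an identifier")
    # Step 1: double every backtick. Step 2: wrap the result in backticks.
    return "`" + name.replace("`", "``") + "`"


print(quote_mysql_identifier("count"))
print(quote_mysql_identifier("My Field"))
print(quote_mysql_identifier("odd`name"))
```

This only covers identifier quoting. Values should still go through bound parameters, and letting users supply identifiers at all remains questionable.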
When writing handmade queries, I know (most of) MySQL's reserved words, and I prefer to not use backticks where possible as it is shorter and IMO easier to read.
Most of the time, it's just a style preference -- unless of course, you have a field like date or My Field, and then you must use backticks.
1. Though see https://bugs.mysql.com/bug.php?id=68676
My belief was that the backticks were primarily used to prevent erroneous queries which used common SQL keywords as identifiers, e.g. LIMIT and COUNT.

Why is SQL's grammar inside-out?

In just about any formally structured set of information, you start reading either from the start towards the end, or occasionally from the end towards the beginning (street addresses, for example). But in SQL, especially in SELECT queries, in order to properly understand a query's meaning you have to start in the middle, at the FROM clause. This can make long queries very difficult to read, especially if they contain nested SELECT queries.
Usually in programming, when something doesn't seem to make any sense, there's a historical reason behind it. Starting with the SELECT instead of the FROM doesn't make sense. Does anyone know the reason it's done that way?
I think the way in which a SQL statement is structured makes logical sense as far as English sentences are structured. Basically
I WANT THIS
FROM HERE
WHERE WHAT I WANT MEETS THESE CRITERIA
I don't think it makes much sense, in English at least, to say
FROM HERE
I WANT THIS
WHERE WHAT I WANT MEETS THESE CRITERIA
The SQL Wikipedia entry briefly describes some history:
During the 1970s, a group at IBM San Jose Research Laboratory developed the System R relational database management system, based on the model introduced by Edgar F. Codd in his influential paper, "A Relational Model of Data for Large Shared Data Banks". Donald D. Chamberlin and Raymond F. Boyce of IBM subsequently created the Structured English Query Language (SEQUEL) to manipulate and manage data stored in System R. The acronym SEQUEL was later changed to SQL because "SEQUEL" was a trademark of the UK-based Hawker Siddeley aircraft company.
The original name explicitly mentioned English, explaining the syntax.
Digging a little deeper, we find the FLOW-MATIC programming language.
FLOW-MATIC, originally known as B-0 (Business Language version 0), is possibly the first English-like data processing language. It was invented and specified by Grace Hopper, and development of the commercial variant started at Remington Rand in 1955 for the UNIVAC I. By 1958, the compiler and its documentation were generally available and being used commercially.
FLOW-MATIC was the inspiration behind the Common Business Oriented Language, one of the oldest programming languages still in active use. Keeping with that spirit, SEQUEL was designed with English-like syntax (1970s is modern, compared with 1950s and 1960s).
In perspective, "modern" programming systems still access databases using the age old ideas behind
MULTIPLY PRICE BY QUANTITY GIVING COST.
I must disagree. SQL grammar is not inside-out.
From the very first look you can tell whether the query will SELECT, INSERT, UPDATE, or DELETE data (all the rest of SQL, e.g. DDL, omitted on purpose).
Back to your SELECT statement confusion: The aim of SQL is to be declarative. Which means you express WHAT you want and not HOW you want it. So it makes every sense to first state WHAT YOU WANT (list of attributes you're selecting) and then provide the DBMS with some additional info on where that should be looked up FROM.
Placing the WHERE clause at the end makes great sense too: Imagine a funnel, wide at the top, narrow at the bottom. By adding a WHERE clause towards the end of the statement, you are choking down the amount of resulting data. Applying restrictions to your query any place else than at the bottom would require the developer to turn their head around.
ORDER BY clause at the very end: once the data has gone through the funnel, sort it.
JOINS (JOIN criteria) really belong into the FROM clause.
GROUPING: basically running data through a funnel before it gets into another funnel.
SQL syntax is sweet. There's nothing inside out about it. Maybe that's why SQL is so popular even after so many decades. It's rather easy to grasp and to make sense out of. (Although I have once faced a 7-page (A4-size) SQL statement which took me quite a while to get my head around.)
It's designed to be English like. I think that's the primary reason.
As a side note, I remember the initial previews of LINQ were directly modeled after it (select ... from ...). This was changed in later previews to be more programming language like (so that the scope goes downwards). Anders Hejlsberg specifically mentioned this weird fact about SQL (which makes IntelliSense harder and doesn't match C# scope rules) as the reason they made this decision.
Anyhow, good or bad, it's what it is and it's too late to change anything.
The order of the clauses in SQL is absolutely logical. Remember that SQL is a declarative language, where you declare what you want and the system figures out how best to get it for you. The first clause is the select clause where you list the columns that you want in the result table. This is the primary purpose of the query. Having stated what you want the result to look like, you next state where the data should come from. The where clause limits the amount of data being returned. There is no point in thinking about how to limit your data unless you know where it comes from, so it goes after the from clause. The group by clause works with the aggregation operators in the select clause and could go anywhere after the from clause however it is better to think about aggregation on the filtered data, so it comes after the where clause. The having clause has to come after the group by clause. The order by clause is about how the data is presented and could go anywhere after the select.
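As a rough illustration of that clause order, here is a sketch using Python's built-in sqlite3 module. The hats table and its data are invented for the example; each SQL comment notes the role the paragraph above assigns to that clause.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical data: hats with a color, a hatband flag, and a price.
conn.execute("CREATE TABLE hats (color TEXT, has_band INTEGER, price INTEGER)")
conn.executemany(
    "INSERT INTO hats VALUES (?, ?, ?)",
    [("blue", 1, 10), ("blue", 1, 30), ("green", 1, 5),
     ("red", 0, 50), ("red", 1, 20)],
)

rows = conn.execute(
    """
    SELECT color, SUM(price) AS total   -- what the result should look like
    FROM hats                           -- where the data comes from
    WHERE has_band = 1                  -- limit the rows before aggregating
    GROUP BY color                      -- aggregate the filtered rows
    HAVING SUM(price) >= 20             -- limit the groups
    ORDER BY total DESC                 -- how the result is presented
    """
).fetchall()

print(rows)
```

The red hat without a band is dropped by WHERE before grouping, and the green group is dropped by HAVING afterwards, leaving blue and red.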
It's consistent with the rest of SQL's syntax of having every statement start with a verb (CREATE, DROP, UPDATE, etc.).
The major disadvantage of having the column list first is that it's inconvenient for auto-complete (as Hejlsberg has mentioned), but this wasn't a concern when the syntax was designed in the 1970s.
We could have had the best of both worlds with a syntax like SELECT FROM SomeTable: ColumnA, ColumnB, but it's too late to change it now.
Anyhow, SQL's SELECT statement order isn't unique. It closely matches that of Python list comprehensions (with if in place of WHERE):
[(rec.a, rec.b) for rec in data if rec.a > 0]
History of the language aside (although it is fascinating) I think the thing you are missing is that SQL isn't about telling the system what to do, so much as what end result you want (and it figures out how to do it)
Saying 'go over there to that rack, pick up the hats with hatbands, blue hats first, then green, then red, and bring them to me' is very much telling the system how to do what you want. It's programmer-think, where we presume the worker is very stupid and needs minutely detailed instructions.
SQL starts with the end result first: the data you want, the order of the columns, etc. It's very much the perspective of someone who is building a report: "I want firstname, lastname, then age, then..." That is, after all, the purpose of making the request. So it starts with that, the format of the results you want. Then it goes into where you expect it to find the data, what criteria to look for, the order to present it, etc.
So as an alternative to specifying in minute detail what you want the worker to do, SQL presumes the system knows how to do that, and centers more on what you want.
So instead of pedantically telling your worker to go here, get this, bring it over there, it's more like saying "I want hats, from rack 12, which have hatbands, and please sort them by color."

Why can you have a column named ORDER in DB2?

In DB2, you can name a column ORDER and write SQL like
SELECT ORDER FROM tblWHATEVER ORDER BY ORDER
without even needing to put any special characters around the column name. This is causing me pain that I won't get into, but my question is: why do databases allow the use of SQL keywords for object names? Surely it would make more sense to just not allow this?
I largely agree with the sentiment that keywords shouldn't be allowed as identifiers. Most modern computing languages have 20 or maybe 30 keywords, in which case imposing a moratorium on their use as identifiers is entirely reasonable. Unfortunately, SQL comes from the old COBOL school of languages ("computing languages should be as similar to English as possible"). Hence, SQL (like COBOL) has several hundred keywords.
I don't recall if the SQL standard says anything about whether reserved words must be permitted as identifiers, but given the extensive (excessive!) vocabulary it's unsurprising that several SQL implementations permit it.
Having said that, using keywords as identifiers isn't half as silly as the whole concept of quoted identifiers in SQL (and these aren't DB2 specific). Permitting case sensitive identifiers is one thing, but quoted identifiers permit all sorts of nonsense including spaces, diacriticals and in some implementations (yes, including DB2), control characters! Try the following for example:
CREATE TABLE "My
Tablé" ( A INTEGER NOT NULL );
Yes, that's a line break in the middle of an identifier along with an e-acute at the end... (which leads to interesting speculation on what encoding is used for database meta-data and hence whether a non-Unicode database would permit, say, a table definition containing Japanese column names).
Many SQL parsers (especially DB2/z, which I use) are smarter than some of the regular parsers which sometimes separate lexical and semantic analysis totally (this separation is mostly a good thing).
The SQL parsers can figure out based on context whether a keyword is valid or should be treated as an identifier.
Hence you can get columns called ORDER or GROUP or DATE (that's a particularly common one).
It does annoy me with some of the syntax coloring editors when they brand an identifier with the keyword color. Their parsers aren't as 'smart' as the ones in DB2.
Because object names are ... names. All database systems let you use quoted names to stop you from running into trouble.
If you are running into issues, the fault lies not with the practice of permitting object names to be names, but with faulty implementations, or with faulty code libraries which don't automatically quote everything or cannot be made to quote names as-needed.
Interestingly you can use keywords as field names in SQL Server as well. The only difference is that you would need to put square brackets around the name of the field
so you can do something like
create table [order](
id int,
[order] varchar(50) )
and then :)
select
[order]
from
[order]
order by [order]
That is of course a bit of an extreme example, but at least with the use of square brackets you can see that [order] is not a keyword.
The reason I would see people using names already reserved by keywords is when there is a direct mapping between column names, or names of the tables and the data presentation. You can call that being lazy or convenient.
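For anyone who wants to try quoted keyword identifiers without a DB2 or SQL Server instance, here is a small sketch using Python's built-in sqlite3 module. It uses the standard double-quote form rather than square brackets, and it shows SQLite's behavior only: unlike DB2, SQLite rejects the unquoted version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Standard SQL double quotes turn the keyword into an ordinary identifier.
conn.execute('CREATE TABLE "order" (id INTEGER, "order" TEXT)')
conn.executemany('INSERT INTO "order" VALUES (?, ?)', [(1, "b"), (2, "a")])

rows = conn.execute('SELECT "order" FROM "order" ORDER BY "order"').fetchall()
print(rows)

# Without quotes, SQLite's parser treats ORDER as a keyword and gives up.
try:
    conn.execute("SELECT order FROM order ORDER BY order")
    unquoted_ok = True
except sqlite3.OperationalError as exc:
    unquoted_ok = False
    print("unquoted query failed:", exc)
```

Whether the unquoted form works is entirely parser-dependent, which is the point several answers above are making.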

SQL Table Aliases - Good or Bad? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
What are the pros and cons of using table aliases in SQL? I personally try to avoid them, as I think they make the code less readable (especially when reading through large where/and statements), but I'd be interested in hearing any counter-points to this. When is it generally a good idea to use table aliases, and do you have any preferred formats?
Table aliases are a necessary evil when dealing with highly normalized schemas. For example, and I'm not the architect on this DB so bear with me, it can take 7 joins in order to get a clean and complete record back which includes a person's name, address, phone number and company affiliation.
Rather than the somewhat standard single character aliases, I tend to favor short word aliases so the above example's SQL ends up looking like:
select person.FirstName
,person.LastName
,addr.StreetAddress
,addr.City
,addr.State
,addr.Zip
,phone.PhoneNumber
,company.CompanyName
from tblPeople person
left outer join tblAffiliations affl on affl.personID = person.personID
left outer join tblCompany company on company.companyID = affl.companyID
... etc
Well, there are some cases you must use them, like when you need to join to the same table twice in one query.
It also depends on whether you have unique column names across tables. In our legacy database we have 3-letter prefixes for all columns, stemming from an abbreviated form of the table name, simply because one ancient database system we were once compatible with didn't support table aliases all that well.
If you have column names that occur in more than one table, specifying the table name as part of the column reference is a must, and thus a table alias will allow for a shorter syntax.
Am I the only person here who really hates them?
Generally, I don't use them unless I have to. I just really hate having to read something like
select a.id, a.region, a.firstname, a.blah, b.yadda, b.huminahumina, c.crap
from table toys as a
inner join prices as b on a.blah = b.yadda
inner join customers as c on c.crap = something else
etc
When I read SQL, I like to know exactly what I'm selecting when I read it; aliases actually confuse me more because I've got to slog through lines of columns before I actually get to the table name, which generally represents information about the data that the alias doesn't. Perhaps it's okay if you made the aliases, but I commonly read questions on StackOverflow with code that seems to use aliases for no good reason. (Additionally, sometimes, someone will create an alias in a statement and just not use it. Why?)
I think that table aliases are used so much because a lot of people are averse to typing. I don't think that's a good excuse, though. That excuse is the reason we end up with terrible variable naming, terrible function acronyms, bad code...I would take the time to type out the full name. I'm a quick typer, though, so maybe that has something to do with it. (Maybe in the future, when I've got carpal tunnel, I'll reconsider my opinion on aliases. :P ) I especially hate running across table aliases in PHP code, where I believe there's absolutely no reason to have to do that - you've only got to type it once!
I always use column qualifiers in my statements, but I'm not averse to typing a lot, so I will gladly type the full name multiple times. (Granted, I do abuse MySQL's tab completion.) Unless it's a situation where I have to use an alias (like some described in other answers), I find the extra layer of abstraction cumbersome and unnecessary.
Edit: (Over a year later) I'm dealing with some stored procedures that use aliases (I did not write them and I'm new to this project), and they're kind of painful. I realize that the reason I don't like aliases is because of how they're defined. You know how it's generally good practice to declare variables at the top of your scope? (And usually at the beginning of a line?) Aliases in SQL don't follow this convention, which makes me grind my teeth. Thus, I have to search the entire code for a single alias to find out where it is (and what's frustrating is, I have to read through the logic before I find the alias declaration). If it weren't for that, I honestly might like the system better.
If I ever write a stored procedure that someone else will have to deal with, I'm putting my alias definitions in a comment block at the beginning of the file, as a reference. I honestly can't understand how you guys don't go crazy without it.
Good
As it has been mentioned multiple times before, it is a good practice to prefix all column names to easily see which column belongs to which table - and aliases are shorter than full table names so the query is easier to read and thus understand. If you use a good aliasing scheme of course.
And if you create or read the code of an application, which uses externally stored or dynamically generated table names, then without aliases it is really hard to tell at the first glance what all those "%s"es or other placeholders stand for. It is not an extreme case, for example many web apps allow to customize the table name prefix at installation time.
Microsoft SQL's query optimiser benefits from using either fully qualified names or aliases.
Personally I prefer aliases, and unless I have a lot of tables they tend to be single letter ones.
--seems pretty readable to me ;-)
select a.Text
from Question q
inner join Answer a
on a.QuestionId = q.QuestionId
There's also a practical limit on how long a SQL string can be and still be executed - aliases make this limit easier to avoid.
If I write a query myself (by typing into the editor and not using a designer) I always use aliases for the table name just so I only have to type the full table name once. I really hate reading queries generated by a designer with the full table name as a prefix to every column name.
I suppose the only thing that really speaks against them is excessive abstraction. If you have a good idea what the alias refers to (good naming helps; 'a', 'b', 'c' can be quite problematic especially when you're reading the statement months or years later), I see nothing wrong with aliasing.
As others have said, joins require them if you're using the same table (or view) multiple times, but even outside that situation, an alias can serve to clarify a data source's purpose in a particular context. In the alias's name, try to answer why you are accessing particular data, not what the data is.
I LOVE aliases!!!! I have done some tests using them vs. not and have seen some processing gains. My guess is the processing gains would be higher when you're dealing with larger datasets and complex nested queries than without. If I'm able to test this, I'll let you know.
You need them if you're going to join a table to itself, or if you use the column again in a subquery...
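The self-join case can be sketched with Python's built-in sqlite3 module. The employee table is made up for the example, and the aliases emp and mgr are named for the role each copy of the table plays in the query rather than for the table itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table where each row points at its manager's row.
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, manager_id INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", None), (2, "Bob", 1), (3, "Cid", 1)],
)

# The same table appears twice, so each occurrence needs its own alias.
rows = conn.execute(
    """
    SELECT emp.name, mgr.name
    FROM employee AS emp
    JOIN employee AS mgr ON mgr.id = emp.manager_id
    ORDER BY emp.name
    """
).fetchall()

print(rows)
```

Without the aliases there would be no way to say which copy of employee a column such as name refers to.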
Aliases are great if you consider that my organization has table names like:
SchemaName.DataPointName_SubPoint_Sub-SubPoint_Sub-Sub-SubPoint...
My team uses a pretty standard set of abbreviations, so the guesswork is minimized. We'll have say ProgramInformationDataPoint shortened to pidp, and submissions to just sub.
The good thing is that once you get going in this manner and people agree with it, it makes those HAYUGE files just a little smaller and easier to manage. At least for me, fewer characters to convey the same info seems to go a little easier on my brain.
I like long explicit table names (it's not uncommon to be more than 100 characters) because I use many tables and if the names aren't explicit, I might get confused as to what each table stores.
So when I write a query, I tend to use shorter aliases that make sense within the scope of the query and that makes the code much more readable.
I always use aliases in my queries and it is part of the code guidebook in my company. First of all you need aliases or table names when there are columns with identical names in the joining tables. In my opinion the aliases improve readability in complex queries and allow me to see quickly the location of each column. We even use aliases with single table queries, because experience has shown that single table queries don't stay single table for long.
IMHO, it doesn't really matter with short table names that make sense. I have on occasion worked on databases where the table name could be something like VWRECOFLY or some other random string (dictated by company policy) that really represents users, so in that case I find aliases really help to make the code FAR more readable. (users.username makes a lot more sense than VWRECOFLY.username)
I always use aliases, since to get proper performance on MSSQL you need to prefix with schema at all times. So you'll see a lot of
Select
Person.Name
From
dbo.Person As Person
I always use aliases when writing queries. Generally I try and abbreviate the table name to 1 or 2 representative letters. So Users becomes u and debtor_transactions becomes dt etc...
It saves on typing and still carries some meaning.
The shorter names make it more readable to me as well.
If you do not use an alias, it's a bug in your code just waiting to happen.
SELECT Description -- actually in a
FROM
table_a a,
table_b b
WHERE
a.ID = b.ID
What happens when you do a little thing like add a column called Description to Table_B? That's right, you'll get an error. Adding a column shouldn't need to break anything. I never see writing good code, bug free code, as a necessary evil.
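This failure is easy to reproduce. The sketch below uses Python's built-in sqlite3 module with the table and column names from the query above; the exact error text is SQLite's and will differ in other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (ID INTEGER, Description TEXT)")
conn.execute("CREATE TABLE table_b (ID INTEGER)")
conn.execute("INSERT INTO table_a VALUES (1, 'widget')")
conn.execute("INSERT INTO table_b VALUES (1)")

unqualified = "SELECT Description FROM table_a a, table_b b WHERE a.ID = b.ID"
qualified = "SELECT a.Description FROM table_a a, table_b b WHERE a.ID = b.ID"

# Works today, because only table_a has a Description column.
before = conn.execute(unqualified).fetchall()

# Someone adds a column with the same name to the other table...
conn.execute("ALTER TABLE table_b ADD COLUMN Description TEXT")

# ...and the unqualified query no longer resolves.
try:
    conn.execute(unqualified)
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

# The alias-qualified query is unaffected by the schema change.
after = conn.execute(qualified).fetchall()

print(before, error, after)
```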
Aliases are required when joining tables with columns that have identical names.