SQL Server extra text after table name in SELECT - sql

I found a SQL statement to the effect of:
SELECT * FROM Users x
My question is: what is x? I have never seen this before.
Thanks.

x is an Alias for the table Users.
Using Table Aliases
The readability of a SELECT statement can be improved by giving a
table an alias, also known as a correlation name or range variable. A
table alias can be assigned either with or without the AS keyword:
SELECT * FROM Users x
SELECT * FROM Users AS x

It's an alias. The AS keyword is optional and has been left out, but it is the same as:
SELECT * FROM Users AS x
This means you can use x in the rest of the query to refer back to the table Users specified here (in some implementations of SQL, SQL Server being one of them, you must use the alias once it has been assigned). For example:
SELECT x.MyColumn
FROM Users x
WHERE x.AnotherColumn = 42
There are three general use cases for aliases:
Readability. For long table names or when the name will be used many times, it can improve readability. For example, imagine the following without the alias:
SELECT x.SomeColumn, x.SomeOtherColumn, x.AThirdColumn
FROM [my crAzy Table Name with spaces in it] x
WHERE x.AnotherColumn = 42
Disambiguation. This is often needed for self-joins. Note the use of the same table twice; you must use an alias to differentiate the two instances of the Users table:
SELECT x.SomeColumn, COUNT(y.SomeColumn)
FROM Users x
INNER JOIN Users y ON x.SomeOtherColumn < y.SomeOtherColumn
GROUP BY x.SomeColumn
Sub-queries in a FROM or JOIN clause (also called derived tables) must have a name. This is done by specifying an alias:
SELECT x.SomeColumn
FROM
(
SELECT SomeColumn
FROM Users
) x
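The self-join and derived-table cases above can be tried on a throwaway database. This is only a minimal sketch: it uses Python's built-in sqlite3 module as a stand-in for SQL Server, and the Users columns and sample rows are made up for illustration.

```python
import sqlite3

# In-memory database with a hypothetical Users table (sample data is invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Users (SomeColumn TEXT, SomeOtherColumn INTEGER);
    INSERT INTO Users VALUES ('ann', 1), ('bob', 2), ('cat', 3);
""")

# Self-join: x and y are two independent references to the same table.
self_join = conn.execute("""
    SELECT x.SomeColumn, COUNT(y.SomeColumn)
    FROM Users x
    INNER JOIN Users y ON x.SomeOtherColumn < y.SomeOtherColumn
    GROUP BY x.SomeColumn
    ORDER BY x.SomeColumn
""").fetchall()

# Derived table: the sub-query in FROM gets its name from the alias d.
derived = conn.execute("""
    SELECT d.SomeColumn
    FROM (SELECT SomeColumn FROM Users) AS d
    ORDER BY d.SomeColumn
""").fetchall()

print(self_join)
print(derived)
```

Without the aliases x and y, the engine would have no way to tell which copy of Users each column reference belongs to.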

It's just an alias for Users, which can be used in the query.
Imagine:
you want to retrieve data from 2 tables that both have an Id column
if you want to retrieve these Ids, you have to prefix the column name with the table name to avoid ambiguity.
With alias:
select t1.Id, t2.Id
from mytableWithAReallyComplicatedName t1
inner join mySecondtableWithAReallyComplicatedName t2 on t1.Id = t2.Id
Without alias
select mytableWithAReallyComplicatedName.Id, mySecondtableWithAReallyComplicatedName.Id
from mytableWithAReallyComplicatedName
inner join mySecondtableWithAReallyComplicatedName on mytableWithAReallyComplicatedName.Id = mySecondtableWithAReallyComplicatedName.Id
If the table names are long, the second version quickly becomes far less practical to read.

Related

SQL join: keep same column name, then refer to it

I'm regularly running into the following issue.
select
A.command_id as command_id,
sum(B.compile_time) as compile_time,
sum(B.run_time) as run_time,
compile_time + run_time as total_time
from commands as A
inner join subcommands as B on A.command_id = B.command_id
group by A.command_id
This doesn't seem to work because on line 5, the SQL engine seems to think that I'm referring to the columns of table B, and not the columns of the resulting table. Is there a way to fix that? Something like this.compile_time?
Of course I can rename the columns of the resulting table, e.g. total_compile_time and total_run_time. But this situation happens to me enough times that I hate having to be creative about the naming every time. It just makes sense to have the same column names in the result.
You can't use a column alias elsewhere in the same SELECT list, because the alias only comes into existence once the SELECT clause has been evaluated; it is not yet available inside that clause.
To avoid the error, you must repeat the sum function:
select
A.command_id as command_id,
sum(B.compile_time) as compile_time,
sum(B.run_time) as run_time,
sum(B.compile_time) + sum(B.run_time) as total_time
from commands as A
inner join subcommands as B on A.command_id = B.command_id
group by A.command_id
The database engine evaluates the clauses of a query in a specific logical order, and in that order the column aliases only become available after the SELECT clause has completed.
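If repeating the aggregates feels clumsy, a derived table is another way to work with the evaluation order: compute the sums in an inner query, then refer to their aliases from the outer query. This is a sketch only, using Python's built-in sqlite3 as a stand-in engine; the sample rows and the alias t are invented for illustration.

```python
import sqlite3

# Hypothetical sample data for the commands/subcommands tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE commands (command_id INTEGER PRIMARY KEY);
    CREATE TABLE subcommands (command_id INTEGER, compile_time REAL, run_time REAL);
    INSERT INTO commands VALUES (1), (2);
    INSERT INTO subcommands VALUES (1, 1.0, 2.0), (1, 3.0, 4.0), (2, 5.0, 6.0);
""")

# The inner query defines the aliases; the outer query runs afterwards,
# so t.compile_time and t.run_time refer to the summed values.
rows = conn.execute("""
    SELECT t.command_id,
           t.compile_time,
           t.run_time,
           t.compile_time + t.run_time AS total_time
    FROM (
        SELECT A.command_id AS command_id,
               SUM(B.compile_time) AS compile_time,
               SUM(B.run_time) AS run_time
        FROM commands AS A
        INNER JOIN subcommands AS B ON A.command_id = B.command_id
        GROUP BY A.command_id
    ) AS t
    ORDER BY t.command_id
""").fetchall()

print(rows)
```

The result keeps the column names compile_time and run_time, so no extra names need to be invented for the output.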

SQL: Is the newly selected temp table in FROM clause not passed to the sub-query in WHERE clause?

Here's a question I'm having trouble with. Basically, there are originally two tables: "a" and "b". I first "joined" them (without using a JOIN clause) with some conditions: "a.id=b.id", "b.class='xxx'". Then I named that temp table A, and I want to select the rows with the highest income among the people in A.
The error returned is "the relation A doesn't exist", and the error arrow points to the clause "select max(A.income) from A". Therefore, I suspect that the temp table A created in the FROM clause is not passed to the sub-query in the WHERE clause?
select * from
(select * from a,b where a.id=b.id and b.class='xxx') as A
where A.income = all
(select max(A.income) from A)
I've encountered this problem while using Postgres, but I think it may also happen in other databases like MySQL or MSSQL. Are there any possible solutions? Without using a WITH clause? Thanks. (The reason I say "sub-query" instead of "query" is that I've tried conditions like "where A.income>1000" and they all work.)
The problem is that your alias a hides the table with the same name. Use a different alias name.
It is unclear whether you want to select from the original table a in the subquery or from the alias. If it is the former, then the above will solve your problem.
If you want to reference the alias in the subquery, you had better use a common table expression:
WITH alias_name AS (/* your FROM subquery */)
SELECT ... /* alias_name can be used in a subquery here */
You can try the following:
select * from a join b on a.id=b.id where b.class='xxx'
and income = all (select max(income) from a join b on a.id=b.id where b.class='xxx')
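The difference between a derived-table alias and a common table expression can be checked with a small experiment. The sketch below makes several assumptions: it uses Python's built-in sqlite3 rather than Postgres, the tables people and memberships and the name joined are hypothetical stand-ins for a, b and A, and a plain = replaces = all because SQLite does not support the ALL quantifier.

```python
import sqlite3

# Hypothetical stand-ins for tables "a" and "b" (names and rows are invented).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER, income INTEGER);
    CREATE TABLE memberships (id INTEGER, class TEXT);
    INSERT INTO people VALUES (1, 100), (2, 300), (3, 900);
    INSERT INTO memberships VALUES (1, 'xxx'), (2, 'xxx'), (3, 'yyy');
""")

# Attempt 1: reuse the derived-table alias as a table name in a sub-query.
failing = """
    SELECT * FROM
        (SELECT p.id, p.income FROM people p, memberships m
         WHERE p.id = m.id AND m.class = 'xxx') AS joined
    WHERE joined.income = (SELECT MAX(joined.income) FROM joined)
"""
try:
    conn.execute(failing)
    derived_alias_visible = True
except sqlite3.OperationalError:
    # The alias names a row source, not a relation that FROM can look up.
    derived_alias_visible = False

# Attempt 2: a CTE name can be referenced again inside the sub-query.
rows = conn.execute("""
    WITH joined AS (
        SELECT p.id, p.income FROM people p, memberships m
        WHERE p.id = m.id AND m.class = 'xxx'
    )
    SELECT id, income FROM joined
    WHERE income = (SELECT MAX(income) FROM joined)
""").fetchall()

print(derived_alias_visible, rows)
```

Columns of the alias (such as joined.income) stay visible inside the sub-query as correlated references; what fails is using the alias as a table in the sub-query's own FROM clause.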

Using ON clause to JOIN tables with same column name

I wanted to ask about the condition of an ON clause while joining tables:
SELECT c_User.ID
FROM c_User
WHERE EXISTS (
SELECT *
FROM c_Group
JOIN c_Member ON (c_Group.Group_Name LIKE 'mcp%')
WHERE
c_Group.Name = c_Member.Parent_Name
AND c_Member.Child_Name = c_User.Lower_User_Name
)
I know that tables c_Member and c_Group have one column with the same name, Directory_ID. What I expected was c_Member and c_Group to join on that column using something like:
c_Group JOIN c_Member ON (c_Group.Directory_ID = c_Member.Directory_ID)
WHERE c_Group.Group_Name like 'mcp%'
How is this condition able to match the rows?
c_Member ON (c_Group.Group_Name LIKE 'mcp%')
Is this a shorter way of referring to two tables joining on a column with the same name, while applying the LIKE condition?
If so, then can such a style work for a table with multiple column names that are the same?
This is your correlated subquery:
SELECT *
FROM c_Group
JOIN c_Member ON (c_Group.Group_Name LIKE 'mcp%')
WHERE
c_Group.Name = c_Member.Parent_Name
AND c_Member.Child_Name = c_User.Lower_User_Name
This subquery works, but the way it is spelled makes it quite unclear:
The join condition (c_Group.Group_Name LIKE 'mcp%') is actually not related to the table being joined (c_Member); what it really does is filter the rows of table c_Group (there is no magic such as a shorter way of referring to two tables joining on a column with the same name while applying the LIKE condition). It would make more sense to move it to the WHERE clause (this would still be functionally equivalent).
On the other hand, the WHERE clause contains conditions that relate to the tables being joined (for example: c_Group.Name = c_Member.Parent_Name). A more sensible option would be to put them in the ON clause of the JOIN.
Other remarks:
when using EXISTS, you would usually prefer SELECT 1 instead of SELECT * (most RDBMS will optimize this under the hood for you, but this makes the intent clearer).
table aliases can be used to make the query more readable
I would suggest the following syntax for the query (which is functionally equivalent to the original, but a lot clearer):
SELECT u.ID
FROM c_User u
WHERE EXISTS (
SELECT 1
FROM c_Group g
JOIN c_Member m ON g.Name = m.Parent_Name AND m.Child_Name = u.Lower_User_Name
WHERE g.Group_Name LIKE 'mcp%'
)
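The claim that the two spellings are equivalent for an inner join can be checked on sample data. This is a sketch under assumptions: it runs on Python's built-in sqlite3 rather than the asker's database, and the table contents are invented (only the table and column names come from the question).

```python
import sqlite3

# Hypothetical rows; only the schema names are taken from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c_User (ID INTEGER, Lower_User_Name TEXT);
    CREATE TABLE c_Group (Name TEXT, Group_Name TEXT);
    CREATE TABLE c_Member (Parent_Name TEXT, Child_Name TEXT);
    INSERT INTO c_User VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO c_Group VALUES ('g1', 'mcp-admins'), ('g2', 'staff');
    INSERT INTO c_Member VALUES ('g1', 'alice'), ('g2', 'bob'), ('g2', 'alice');
""")

# Original spelling: the filter sits in ON, the join conditions in WHERE.
original = """
    SELECT c_User.ID FROM c_User
    WHERE EXISTS (
        SELECT * FROM c_Group
        JOIN c_Member ON (c_Group.Group_Name LIKE 'mcp%')
        WHERE c_Group.Name = c_Member.Parent_Name
          AND c_Member.Child_Name = c_User.Lower_User_Name
    )
    ORDER BY c_User.ID
"""

# Suggested spelling: join conditions in ON, the filter in WHERE.
rewritten = """
    SELECT u.ID FROM c_User u
    WHERE EXISTS (
        SELECT 1 FROM c_Group g
        JOIN c_Member m ON g.Name = m.Parent_Name
                       AND m.Child_Name = u.Lower_User_Name
        WHERE g.Group_Name LIKE 'mcp%'
    )
    ORDER BY u.ID
"""

original_rows = conn.execute(original).fetchall()
rewritten_rows = conn.execute(rewritten).fetchall()
print(original_rows, rewritten_rows)
```

This interchangeability holds for inner joins only; with an outer join, moving a condition between ON and WHERE generally changes the result.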

Selecting ambiguous column from subquery with postgres join inside

I have the following query:
select x.id0
from (
select *
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0
) x;
Since id0 is in both sessions and clicked_products, I get the expected error:
column reference "id0" is ambiguous
However, to fix this problem in the past I simply needed to specify a table. In this situation, I tried:
select sessions.id0
from (
select *
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0
) x;
However, this results in the following error:
missing FROM-clause entry for table "sessions"
How do I return just the id0 column from the above query?
Note: I realize I can trivially solve the problem by getting rid of the subquery altogether:
select sessions.id0
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0;
However, I need to do further aggregations and so do need to keep the subquery syntax.
The only way you can do that is by using aliases for the columns returned from the subquery so that the names are no longer ambiguous.
Qualifying the column with the table name does not work, because sessions is not visible at that point (only x is).
True, this way you cannot use SELECT *, but you shouldn't do that anyway. For a reason why, your query is a wonderful example:
Imagine that you have a query like yours that works, and then somebody adds a new column with the same name as a column in the other table. Then your query suddenly and mysteriously breaks.
Avoid SELECT *. It is ok for ad-hoc queries, but not in code.
select x.id from
(select sessions.id0 as id, clicked_products.* from sessions
inner join
clicked_products on
sessions.id0 = clicked_products.session_id0 ) x;
However, you have to specify other columns from the table sessions since you cannot use SELECT *
I assume:
select x.id from (select sessions.id0 id
from sessions
inner join clicked_products
on sessions.id0 = clicked_products.session_id0 ) x;
should work.
Another option is to use a Common Table Expression, which is more readable and easier to test.
But you still need aliases or unique column names.
In general, selecting everything with * is not a good idea; reading all columns is a waste of I/O.
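A CTE version with explicitly aliased columns, followed by the kind of further aggregation the question mentions, might look like the sketch below. The assumptions: it runs on Python's built-in sqlite3 instead of Postgres (SQLite does not raise the same ambiguity error, so only the fix is shown), and the started column, the aliases session_id and click_id, and the sample rows are invented.

```python
import sqlite3

# Both tables have an id0 column, as in the question; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sessions (id0 INTEGER, started TEXT);
    CREATE TABLE clicked_products (id0 INTEGER, session_id0 INTEGER);
    INSERT INTO sessions VALUES (10, '2020-01-01'), (11, '2020-01-02');
    INSERT INTO clicked_products VALUES (500, 10), (501, 10), (502, 11);
""")

# Each id0 gets its own unambiguous name inside the CTE, so the outer
# query can aggregate on x.session_id without any clash.
rows = conn.execute("""
    WITH x AS (
        SELECT s.id0 AS session_id,
               cp.id0 AS click_id
        FROM sessions s
        INNER JOIN clicked_products cp ON s.id0 = cp.session_id0
    )
    SELECT x.session_id, COUNT(x.click_id) AS clicks
    FROM x
    GROUP BY x.session_id
    ORDER BY x.session_id
""").fetchall()

print(rows)
```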

SQL - table alias scope

I've just learned ( yesterday ) to use "exists" instead of "in".
BAD
select * from table where nameid in (
select nameid from othertable where otherdesc = 'SomeDesc' )
GOOD
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
And I have some questions about this:
1) The explanation as I understood was: "The reason why this is better is because only the matching values will be returned instead of building a massive list of possible results". Does that mean that while the first subquery might return 900 results the second will return only 1 ( yes or no )?
2) In the past I have had the RDBMS complaining: "only the first 1000 rows might be retrieved". Would this second approach solve that problem?
3) What is the scope of the alias in the second subquery? Does the alias only live inside the parentheses?
for example
select * from table t where exists (
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeDesc' )
AND
select nameid from othertable o where t.nameid = o.nameid and otherdesc = 'SomeOtherDesc' )
That is, if I use the same alias (o for table othertable) in the second "exists", will it cause any problem with the first exists? Or are they totally independent?
Is this something specific to Oracle, or is it valid for most RDBMS?
Thanks a lot
It's specific to each DBMS and depends on the query optimizer. Some optimizers detect the IN clause and rewrite it.
In all DBMSes I tested, the alias is only valid inside the parentheses.
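BTW, you can rewrite the query as a join. Note that this version keeps rows matching either description rather than both, and it can return the same row from t more than once when several othertable rows match: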
select t.*
from table t
join othertable o on t.nameid = o.nameid
and o.otherdesc in ('SomeDesc','SomeOtherDesc');
And, to answer your questions:
1) Yes
2) Yes
3) Yes
You are treading into complicated territory, known as 'correlated sub-queries'. Since we don't have detailed information about your tables and the key structures, some of the answers can only be 'maybe'.
In your initial IN query, the notation would be valid whether or not OtherTable contains a column NameID (and, indeed, whether OtherDesc exists as a column in Table or OtherTable - which is not clear in any of your examples, but presumably is a column of OtherTable). This behaviour is what makes a correlated sub-query into a correlated sub-query. It is also a routine source of angst for people when they first run into it - invariably by accident. Since the SQL standard mandates the behaviour of interpreting a name in the sub-query as referring to a column in the outer query if there is no column with the relevant name in the tables mentioned in the sub-query but there is a column with the relevant name in the tables mentioned in the outer (main) query, no product that wants to claim conformance to (this bit of) the SQL standard will do anything different.
The answer to your Q1 is "it depends", but given plausible assumptions (NameID exists as a column in both tables; OtherDesc only exists in OtherTable), the results should be the same in terms of the data set returned, but may not be equivalent in terms of performance.
The answer to your Q2 is that in the past, you were using an inferior if not defective DBMS. If it supported EXISTS, then the DBMS might still complain about the cardinality of the result.
The answer to your Q3 as applied to the first EXISTS query is "t is available as an alias throughout the statement, but o is only available as an alias inside the parentheses". As applied to your second example box - with AND connecting two sub-selects (the second of which is missing the open parenthesis when I'm looking at it), then "t is available as an alias throughout the statement and refers to the same table, but there are two different aliases both labelled 'o', one for each sub-query". Note that the query might return no data if OtherDesc is unique for a given NameID value in OtherTable; otherwise, it requires two rows in OtherTable with the same NameID and the two OtherDesc values for each row in Table with that NameID value.
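The scoping rule for the two sub-queries can be checked directly. The following is a sketch with assumptions: it uses Python's built-in sqlite3 rather than Oracle, the outer table is named main_table (hypothetical, since table is a reserved word), the missing parenthesis and EXISTS keyword are filled in, and the sample rows are invented.

```python
import sqlite3

# Hypothetical data: nameid 1 has both descriptions, 2 and 3 have one each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE main_table (nameid INTEGER);
    CREATE TABLE othertable (nameid INTEGER, otherdesc TEXT);
    INSERT INTO main_table VALUES (1), (2), (3);
    INSERT INTO othertable VALUES
        (1, 'SomeDesc'), (1, 'SomeOtherDesc'),
        (2, 'SomeDesc'),
        (3, 'SomeOtherDesc');
""")

# t is visible in both sub-queries; each sub-query has its own separate o.
rows = conn.execute("""
    SELECT t.nameid
    FROM main_table t
    WHERE EXISTS (
        SELECT 1 FROM othertable o
        WHERE t.nameid = o.nameid AND o.otherdesc = 'SomeDesc'
    )
    AND EXISTS (
        SELECT 1 FROM othertable o
        WHERE t.nameid = o.nameid AND o.otherdesc = 'SomeOtherDesc'
    )
""").fetchall()

print(rows)
```

Reusing the alias o causes no conflict, and only the nameid that has both descriptions in othertable comes back.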
1) Oracle-specific: When you write a query using the IN clause, you're telling the rule-based optimizer that you want the inner query to drive the outer query. When you write EXISTS in a where clause, you're telling the optimizer that you want the outer query to be run first, using each value to fetch a value from the inner query. See "Difference between IN and EXISTS in subqueries".
2) Probably.
3) Alias declared inside subquery lives inside subquery. By the way, I don't think your example with 2 ANDed subqueries is valid SQL. Did you mean UNION instead of AND?
Personally I would use a join, rather than a subquery for this.
SELECT t.*
FROM yourTable t
INNER JOIN otherTable ot
ON (t.nameid = ot.nameid AND ot.otherdesc = 'SomeDesc')
It is difficult to generalize that EXISTS is always better than IN. Logically, if that were the case, the SQL community would have replaced IN with EXISTS.
Also, please note that IN and EXISTS are not the same; the results may differ between the two.
With IN, the inner query is usually evaluated once (often as a full scan of the inner table) and its NULLs are not removed, so a NULL in the inner result can change the outcome, most visibly with NOT IN. EXISTS only checks whether a matching row exists, so NULLs do not get in the way, and for a correlated subquery the inner query is logically run for every row of the outer query.
Assuming there are no NULLs and it is a simple query (with no correlation), EXISTS might perform better if the row you are looking for is not the last row. If it happens to be the last row, EXISTS may need to scan to the end just like IN, so performance would be similar.
But IN and EXISTS are not interchangeable.
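The NULL difference is easiest to see with the negated forms. The sketch below is illustrative only: it uses Python's built-in sqlite3, and the tables outer_t and inner_t and their rows are hypothetical.

```python
import sqlite3

# Hypothetical tables; the inner table deliberately contains a NULL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE outer_t (nameid INTEGER);
    CREATE TABLE inner_t (nameid INTEGER);
    INSERT INTO outer_t VALUES (1), (2);
    INSERT INTO inner_t VALUES (1), (NULL);
""")

def run(sql):
    """Run a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

# Positive forms agree: both find nameid 1.
in_rows = run(
    "SELECT nameid FROM outer_t WHERE nameid IN (SELECT nameid FROM inner_t)")
exists_rows = run(
    "SELECT nameid FROM outer_t o WHERE EXISTS "
    "(SELECT 1 FROM inner_t i WHERE i.nameid = o.nameid)")

# Negated forms disagree: "2 NOT IN (1, NULL)" evaluates to NULL, not true,
# so NOT IN returns nothing, while NOT EXISTS still returns nameid 2.
not_in_rows = run(
    "SELECT nameid FROM outer_t WHERE nameid NOT IN (SELECT nameid FROM inner_t)")
not_exists_rows = run(
    "SELECT nameid FROM outer_t o WHERE NOT EXISTS "
    "(SELECT 1 FROM inner_t i WHERE i.nameid = o.nameid)")

print(in_rows, exists_rows, not_in_rows, not_exists_rows)
```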