Column ambiguously defined in subquery using rownums - sql

I have to execute SQL statements written by users and show their results. An example statement could be this:
SELECT t1.*, t2.* FROM table1 t1, table2 t2 WHERE t1.id = t2.id
This SQL works fine as it is, but I need to manually add pagination and show the rownum, so the SQL ends up like this.
SELECT z.*
FROM(
SELECT y.*, ROWNUM rn
FROM (
SELECT t1.*, t2.* FROM table1 t1, table2 t2 WHERE t1.id = t2.id
) y
WHERE ROWNUM <= 50) z
WHERE rn > 0
This throws an exception: "ORA-00918: column ambiguously defined", because both table1 and table2 contain a field with the same name ("id").
What could be the best way to avoid this?
Regards.
UPDATE
In the end, we had to go the ugly way and parse each incoming SQL statement before executing it. Basically, we resolved the asterisks to discover which fields we needed to add, and aliased every field with a unique id. This introduced a performance penalty, but our client understood it was the only option given the requirements.
I will mark Lex's answer as accepted, as it's the solution we ended up working on.
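A rough sketch of the "resolve the asterisks and alias every field" idea from the update. It is an illustration only, not the code that was actually used. It assumes Python with an in-memory SQLite database, the helper name aliased_select_list is made up, and PRAGMA table_info is SQLite-specific (on Oracle the column list would presumably come from a dictionary view such as ALL_TAB_COLUMNS).

```python
import sqlite3


def aliased_select_list(conn, tables):
    """Expand alias.* for each (table_name, alias) pair into uniquely aliased columns."""
    parts = []
    for table_name, alias in tables:
        # SQLite-specific catalog lookup; rows are (cid, name, type, notnull, dflt_value, pk).
        for row in conn.execute(f"PRAGMA table_info({table_name})"):
            column = row[1]
            # Prefix the column with the table alias so every output name is unique.
            parts.append(f"{alias}.{column} AS {alias}_{column}")
    return ", ".join(parts)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, price REAL)")

select_list = aliased_select_list(conn, [("table1", "t1"), ("table2", "t2")])
inner_sql = f"SELECT {select_list} FROM table1 t1, table2 t2 WHERE t1.id = t2.id"
print(inner_sql)
```

The resulting inner_sql has no duplicate column names, so it can be wrapped in the ROWNUM pagination query without the outer y.* becoming ambiguous.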

I think you have to specify aliases for (at least one of) table1.id and table2.id, and possibly for any other columns that share a name as well.
So instead of SELECT t1.*, t2.* FROM table1 t1, table2 t2 use something like:
SELECT t1.id t1id, t2.id t2id, [rest of columns] FROM table1 t1, table2 t2
I'm not familiar with Oracle syntax, but I think you'll get the idea.

I was searching for an answer to something similar. I was referencing an aliased sub-query that had a couple of NULL columns. I had to alias the NULL columns because I had more than one:
select a.*, t2.column, t2.column, t2.column
from (select t1.column, t1.column, NULL, NULL, t1.column from t1
where t1.column='VALUE') a
left outer join t2 on t2.column=a.column;
Once I aliased the NULL columns in the sub-query it worked fine.

If you could modify the query syntactically (or get the users to do so) to use explicit JOIN syntax with the USING clause, this would automatically fix the problem at hand:
SELECT t1.*, t2.*
FROM table1 t1
JOIN table2 t2 USING (id)
The USING clause does the same as ON t1.id = t2.id (or the implicit JOIN you have in the question), except that only one id column remains in the result, thereby eliminating your problem.
You would still run into problems if there are more columns with identical names that are not included in the USING clause. Aliases as described by @Lex are indispensable then.
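A small way to see both effects, sketched with Python and an in-memory SQLite database rather than Oracle, so treat it as an analogy only. The extra name, a and b columns are made up for the demonstration, and the sketch uses a plain SELECT * because how a qualified t1.* interacts with USING differs between database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT, a TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT, b TEXT);
    INSERT INTO table1 VALUES (1, 'left', 'x');
    INSERT INTO table2 VALUES (1, 'right', 'y');
""")

# USING (id) merges the join column, but "name" exists in both tables
# and is not part of the USING clause.
cur = conn.execute("SELECT * FROM table1 t1 JOIN table2 t2 USING (id)")
columns = [d[0] for d in cur.description]
print(columns)
```

The column list should contain id once and name twice, which is exactly the leftover clash that still needs explicit aliases.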

Use the NVL function (which replaces null values) to fix this.
SELECT z.*
FROM(
SELECT y.*, ROWNUM rn
FROM (
SELECT t1.*, t2.* FROM table1 t1, table2 t2 WHERE
NVL(t1.id,0) = NVL(t2.id,0)
) y
WHERE ROWNUM <= 50) z
WHERE rn > 0

Related

Is it possible to attach table alias to column names to figure out where columns are coming from?

I have a query that I'm trying to rework that returns over 1,000 columns when I SELECT * FROM several tables. I want to know if there is a way in SQL to tag each column name with its table alias so I can tell which table each column comes from. It looks like the following:
SELECT *
FROM table1 t1
join table2 t2
join table3 t3
join table4 t4
Current column output:
id, id, id, id, name, name, name, name, order, order, order, order
Desired Column output:
t1.id, t1.name, t1.order, t2.id, t2.name, t2.order,t3.id, t3.name, t3.order, t4.id, t4.name, t4.order
This is a very simple example, but you can imagine trying to fish the column you need out of a sea of 1,000 columns while trying to figure out what table it came from! Any ideas?
I'm not aware of a way to prefix each column with the table alias. However, I do know how you could easily break the columns into groups that would allow you to figure out which table each column comes from.
SELECT 'T1' as [Table1]
, t1.*
, 'T2' as [Table2]
, t2.*
, 'T3' as [Table3]
, t3.*
, 'T4' as [Table4]
, t4.*
FROM table1 t1
join table2 t2
join table3 t3
join table4 t4
This would break out the columns into groups by table and place a little bookmark column between the groups to help you understand where they're coming from.
I know it's not exactly what you asked for, but I believe it would help you a lot in figuring out what comes from which table.
Your other option is, as others have said, specifying the prefix on every column, which it sounds like you don't want to do. However, it can be a lot quicker to do this if you drag the columns from the Object Explorer and use ALT+SHIFT to add the prefix to each column.
Here's an article about copying columns from object explorer - https://www.qumio.com/Blog/Lists/Posts/Post.aspx?ID=56
Here's an article about adjusting code using ALT+SHIFT - https://blogs.msdn.microsoft.com/sql_pfe_blog/2017/04/11/quick-tip-shiftalt-for-multiple-line-edits/
The first method would take less than a minute; the second I could see taking less than 10 minutes even for 1,000 columns.
You have to assign non-default column aliases manually:
select t1.id as t1_id, t1.name as t1_name, t1.order as t1_order,
t2.id as t2_id, t2.name as t2_name, t2.order as t2_order,
. . .
You might find that a spreadsheet or query can help, if you have a lot of columns.
Some products may have exceptions, but generally no, you can't do that. You either have to use wildcards (SELECT *) or specify the columns you wish returned by full and complete name.
If you specify columns, you can "alias" them, that is, set the column name to something other than the source name. For example (pseudo-code, leaving out the "ON" clause):
SELECT
T1.Id as T1_Id
,T2.Id as T2_Id
from table1 T1
join table2 T2
Note that you can combine table aliases with wildcards. For example:
SELECT
T2.*
from table1 T1
join table2 T2
join table3 T3
join table4 T4
will return all the columns from table2, and only from table2. This might help in revising your query by getting a list of the available columns in each table.

works fine in one case / (column ambiguously defined)error in another

I have 2 tables with a column named the same. Column is BAN_KEY
when I run this query
with
t1 as
(
select *
from table1
),
t2 as
(
select *
from table2
),
t3 as
(
select *
from t1, t2
where t1.c1 = t2.c2
)
select * from t3
I get error column ambiguously defined, but when I do it this way
with
t1 as
(
select *
from table1
),
t2 as
(
select *
from table2
)
select *
from t1, t2
where t1.c1 = t2.c2
The result looks like this
BAN_KEY | BAN_KEY_1 | other columns
some values...
What's the reason for this?
First, learn to use proper JOIN syntax. Simple rule: Never use commas in the FROM clause. Always use proper, explicit JOINs.
That has nothing to do with your question. The answer is much simpler. For a CTE (or table), Oracle needs to be able to assign column names to the result so they can be accessed subsequently. It accepts the column names that you provide, assuming that your intention is correct. Duplicate column names are not allowed because the reference would be ambiguous; hence the error.
Why doesn't this happen for a result set? Oracle does not require that the columns in the result set of a query be unique. For convenience, though, it distinguishes between columns with the same name.
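One way to keep the t3 CTE while avoiding the clash is to give the two BAN_KEY columns distinct aliases inside it. A minimal sketch, run here through Python against an in-memory SQLite database instead of Oracle; the alias names t1_ban_key and t2_ban_key and the sample rows are made up, and the pass-through CTEs t1 and t2 are folded into plain table aliases for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (ban_key INTEGER, c1 INTEGER);
    CREATE TABLE table2 (ban_key INTEGER, c2 INTEGER);
    INSERT INTO table1 VALUES (100, 1);
    INSERT INTO table2 VALUES (200, 1);
""")

query = """
WITH t3 AS (
    -- Every column of the CTE gets a unique name, so later references are unambiguous.
    SELECT t1.ban_key AS t1_ban_key, t1.c1,
           t2.ban_key AS t2_ban_key, t2.c2
    FROM table1 t1
    JOIN table2 t2 ON t1.c1 = t2.c2
)
SELECT * FROM t3
"""
cur = conn.execute(query)
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()
print(column_names, rows)
```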

WHERE + NOT EXISTS + 2 Columns

I have a query that should return all records in T1 that are not linked to records in T2:
SELECT DISTINCT fldID, fldValue FROM T1
WHERE NOT EXISTS
(
SELECT T1.fldID, T1.fldValue
FROM T2
JOIN T1 ON T2.fldID = T1.fldPtr
)
But it returns empty set -- should be one record.
If I use query like this (clause on one field):
SELECT DISTINCT fldID FROM T1
WHERE fldID NOT IN
(
SELECT T1.fldID
FROM T2
JOIN T1 ON T2.fldID = T1.fldPtr
)
It returns correct result.
But SQL Server does not support the syntax
WHERE ( fldID, fldValue ) NOT IN ....
Please help me figure out how to compose a query that will check several columns.
Thanks!
You can also use EXCEPT for this:
SELECT DISTINCT fldID, fldValue FROM T1
EXCEPT
SELECT T1.fldID, T1.fldValue
FROM T2
JOIN T1 ON T2.fldID = T1.fldPtr
A more efficient and elegant query that will work with every database is:
SELECT T1.*
FROM T1
LEFT JOIN T2
ON T2.fldID = T1.fldPtr
AND T2.fldValue = T1.fldValue
WHERE T2.fldID IS NULL
The LEFT JOIN attempts to match using both criteria, then the WHERE clause filters the joins, and only non-joins have NULL values for the LEFT JOINed table.
This approach is IMHO pretty much the industry standard for finding non-matches. It is usually more efficient than a NOT EXISTS(), although several databases optimize a NOT EXISTS() to this query anyway.
Use both of those columns in the sub-query join:
SELECT DISTINCT fldID, fldValue FROM T1
WHERE NOT EXISTS
(
SELECT *
FROM T2
JOIN T1 ON T2.fldID = T1.fldPtr
AND T1.fldValue = T2.fldValue
)
Something like (I think, as I'm not sure I 100% understand your question):
SELECT DISTINCT fldID FROM T1
WHERE fldID NOT IN
(
SELECT T1.fldID
FROM T2
JOIN T1 ON T2.fldID = T1.fldPtr
WHERE T2.fldValue = T1.fldValue
)
If you have the same structure in both tables you can use the EXCEPT operator http://technet.microsoft.com/en-us/library/ms188055.aspx
In a more general case, you have to use a left join and look for NULL values in the second table.
Try the query below.
select DISTINCT fldID
from Table1
WHERE cast(fldID as varchar(100))+'~'+cast(fldValue as varchar(100))
NOT IN (select cast(fldID as varchar(100))+'~'+cast(fldValue as varchar(100)) from table2)
This is an easier query. It returns all T1.fldID values that are not linked to records in T2:
SELECT DISTINCT T1.fldID
FROM T1
LEFT JOIN T2 ON T2.fldID = T1.fldPtr
WHERE T2.fldID IS NULL
Using IN to exclude a large number of values is terrible for performance. Try the following:
SELECT T1.*
FROM T1
LEFT JOIN T2 ON T2.fldID = T1.fldPtr AND T1.fldValue = T2.fldvalue
WHERE T2.fldID IS NULL
(from my comment:) you do not have to reference t1 again in the subquery. Doing so would cause a logic of the form select all the records from t1 that don't exist in t1 ..., which is always empty, just like select all blue balls that are not blue, or select all odd numbers that are even ...
The first query should be:
SELECT DISTINCT fldID, fldValue
FROM T1
WHERE NOT EXISTS (
SELECT * FROM T2
WHERE T2.fldID = T1.fldPtr
);
And: in your original query, the subquery is uncorrelated. The t1 in the subquery shadows the t1 in the main query, so the subquery does not refer to any table or alias from the main query. It returns either true (some row exists) or false, and that result is completely unrelated to the rows in the main query. (Yet another good reason to use aliases instead of real table names in your queries.)
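To make the difference concrete, here is a small sketch that runs both forms side by side. It uses Python with an in-memory SQLite database rather than SQL Server, and the sample rows are made up: records 1 and 2 in T1 point at existing T2 rows, record 3 does not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T1 (fldID INTEGER, fldValue TEXT, fldPtr INTEGER);
    CREATE TABLE T2 (fldID INTEGER);
    INSERT INTO T1 VALUES (1, 'a', 10), (2, 'b', 20), (3, 'c', 99);
    INSERT INTO T2 VALUES (10), (20);
""")

# Original form: the inner T1 hides the outer T1, so the subquery is evaluated
# independently of the outer row and NOT EXISTS is false for every row.
uncorrelated = conn.execute("""
    SELECT DISTINCT fldID, fldValue FROM T1
    WHERE NOT EXISTS (
        SELECT T1.fldID, T1.fldValue
        FROM T2
        JOIN T1 ON T2.fldID = T1.fldPtr
    )
""").fetchall()

# Corrected form: the subquery refers to the outer T1 row through T1.fldPtr.
correlated = conn.execute("""
    SELECT DISTINCT fldID, fldValue FROM T1
    WHERE NOT EXISTS (
        SELECT * FROM T2
        WHERE T2.fldID = T1.fldPtr
    )
""").fetchall()

print(uncorrelated, correlated)
```

With this data the first query comes back empty, while the second returns only the record whose fldPtr has no match in T2.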

How to join all columns from one table

I tried doing this but it failed.
SELECT table2.ID, table1.* FROM table2
LEFT JOIN table1 ON table1.ID = table2.table1ID
How do you select all columns from a table?
EDIT: There is no error in the above query. I don't know what caused the error but the code is now working.
You had a field name conflict, as both tables have an ID field. You must alias one of them:
SELECT table2.ID as t2_id, table1.* FROM table2
LEFT JOIN table1 ON table1.ID = table2.table1ID
What you have is syntactically correct. What exactly did you mean by "it failed"? Did you get an error message, or just not the results you wanted? (BTW, it is bad practice to select *; only return the columns you need. In this case you do not need all the columns, as the ID field in table1 will have exactly the same data as the field in table2 it is joined to.)
SELECT t2.ID, t1.* FROM table2 t2
LEFT JOIN table1 t1 ON t1.ID = t2.table1ID
this works on sql 2000+
If I am working inside a stored procedure where I have a defined #Table data type, there is no issue with using select table.* especially if I am using it for an output SELECT at the end. So the comment about production servers and network traffic in this case is meaningless as the entire stored procedure executes in memory. A select.* in this case is merely returning all the columns which have been defined ahead of time.

How to convert a SQL subquery to a join

I have two tables with a 1:n relationship: "content" and "versioned-content-data" (for example, an article entity and all the versions created of that article). I would like to create a view that displays the top version of each "content".
Currently I use this query (with a simple subquery):
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable,
t1.version
FROM mytable as t1
WHERE (version = (SELECT MAX(version) AS topversion
FROM mytable
WHERE (fk_idothertable = t1.fk_idothertable)))
The subquery is actually a query to the same table that extracts the highest version of a specific item. Notice that the versioned items will have the same fk_idothertable.
In SQL Server I tried to create an indexed view of this query but it seems I'm not able since subqueries are not allowed in indexed views. So... here's my question... Can you think of a way to convert this query to some sort of query with JOINs?
It seems like indexed views cannot contain:
subqueries
common table expressions
derived tables
HAVING clauses
I'm desperate. Any other ideas are welcome :-)
Thanks a lot!
This probably won't help if table is already in production but the right way to model this is to make version = 0 the permanent version and always increment the version of OLDER material. So when you insert a new version you would say:
UPDATE thetable SET version = version + 1 WHERE id = :id
INSERT INTO thetable (id, version, title, ...) VALUES (:id, 0, :title, ...)
Then this query would just be
SELECT id, title, ... FROM thetable WHERE version = 0
No subqueries, no MAX aggregation. You always know what the current version is. You never have to select max(version) in order to insert the new record.
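A minimal sketch of that scheme, run through Python against an in-memory SQLite database. The three-column table and the helper name publish are assumptions made for the demonstration; a real table would carry the other content columns too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE thetable (id INTEGER, version INTEGER, title TEXT)")


def publish(conn, item_id, title):
    """Age every existing version of the item by one, then insert the new one as version 0."""
    with conn:  # both statements commit together, or roll back together on error
        conn.execute(
            "UPDATE thetable SET version = version + 1 WHERE id = ?", (item_id,)
        )
        conn.execute(
            "INSERT INTO thetable (id, version, title) VALUES (?, 0, ?)",
            (item_id, title),
        )


publish(conn, 1, "first draft")
publish(conn, 1, "second draft")
publish(conn, 2, "other article")

# The current version of every item is simply the row with version = 0.
current = conn.execute(
    "SELECT id, title FROM thetable WHERE version = 0 ORDER BY id"
).fetchall()
print(current)
```

Note that if the table has a unique constraint on (id, version), some databases may check it row by row during the UPDATE, so that part would need testing on the target system.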
Maybe something like this?
SELECT
t2.id,
t2.title,
t2.contenttext,
t2.fk_idothertable,
t2.version
FROM mytable t1, mytable t2
WHERE t1.fk_idothertable = t2.fk_idothertable
GROUP BY t2.fk_idothertable, t2.version
HAVING t2.version=MAX(t1.version)
Just a wild guess...
You might be able to turn the MAX into an aliased derived table that does the GROUP BY.
It might look something like this:
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable,
t1.version
FROM mytable as t1 JOIN
(SELECT fk_idothertable, MAX(version) AS topversion
FROM mytable
GROUP BY fk_idothertable) as t2
ON t1.version = t2.topversion
I think FerranB was close but didn't quite have the grouping right:
with
latest_versions as (
select
max(version) as latest_version,
fk_idothertable
from
mytable
group by
fk_idothertable
)
select
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable,
t1.version
from
mytable as t1
join latest_versions on (t1.version = latest_versions.latest_version
and t1.fk_idothertable = latest_versions.fk_idothertable);
If SQL Server accepts LIMIT clause, I think the following should work:
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable,
t1.version
FROM mytable as t1 order by t1.version DESC LIMIT 1;
(DESC - For descending sort; LIMIT 1 chooses only the first row and
DBMS usually does good optimization on seeing LIMIT).
I don't know how efficient this would be, but:
SELECT t1.*, t2.version
FROM mytable AS t1
JOIN (
SELECT mytable.fk_idothertable, MAX(mytable.version) AS version
FROM mytable
GROUP BY mytable.fk_idothertable
) t2 ON t1.fk_idothertable = t2.fk_idothertable
Like this...I assume that the 'mytable' in the subquery was a different actual table...so I called it mytable2. If it was the same table then this will still work, but then I imagine that fk_idothertable will just be 'id'.
SELECT
t1.id,
t1.title,
t1.contenttext,
t1.fk_idothertable,
t1.version
FROM mytable as t1
INNER JOIN (SELECT MAX(Version) AS topversion,fk_idothertable FROM mytable2 GROUP BY fk_idothertable) t2
ON t1.id = t2.fk_idothertable AND t1.version = t2.topversion
Hope this helps