SQL Duplicate column name error - sql

I am trying to find an error in a massive SQL statement (not mine) - I have cut a lot of it out to make it readable - even pared down it still throws the error
SELECT DISTINCT Profiles.ID
FROM
(select * from Profiles RIGHT JOIN FriendList ON (FriendList.Profile = 15237)
order by LastLoggedIn DESC ) as Profiles
This returns an error
Duplicate column name 'ID'
I have tested the the last part (select * from Profiles ... order by LastLoggedIn DESC) and it works fine by itself
I have tried to troubleshoot by changing column names in the DISTINCT section without any luck.
One solution I read was to remove the DISTINCT, but that didn't help.
I just can't see where the duplicate column error can be coming from. Could it be a database integrity problem?
Any help much appreciated.

Your Profile and FriendList tables both have an ID column. Because you say select *, you're getting two columns named ID in the sub-select which is aliased to Profiles, and SQL doesn't know which one Profiles.ID refers to (note that Profiles here is referring to the alias of the sub-query, not the table of the same name).
Since you only need the ID column, you can change it to this:
SELECT DISTINCT Profiles.ID FROM
( select Profiles.ID from Profiles RIGHT JOIN FriendList ON (FriendList.Profile = 15237)
order by LastLoggedIn DESC ) as Profiles

Replace the "select *" with "select col1, col2..." and the error should become apparent (i.e. multiple columns named "ID"). Nothing to do with distinct or database integrity.

you have a table called Profiles and you are "creating" a temp table called Profiles in your From, that would be my guess as to what is causing the problem. call your temp bananas and try SELECT DISTINCT bananas.ID FROM and see if that works

As the error says, each of the tables that you're joining together has a column named ID. You'll have to specify which ID column you want (Profiles.ID or FriendList.ID) or include ID in the join conditions.

Profiles and FriendList both have an ID column. You are asking to call the entire join "Profiles", and then using Profiles.ID, but SQL doesn't know which ID you mean.

Related

Why do I get a duplicate column name error only when I SELECT FROM (SELECT)

I imagine this is a really basic oversight on my part but I have an SQL query which works fine. But I when I SELECT from that result (SELECT FROM (SELECT))
I get a 'duplicate column' error. There are duplicate column names, for sure, in two tables where I compare them but they do not cause a problem in the initial result. For example:
SELECT _dia_tagsrel.tag_id,_dia_tagsrel.article_id, _dia_tags.tag_id, _dia_tags.tag
FROM _dia_tagsrel
JOIN _dia_tags
ON _dia_tagsrel.tag_id = _dia_tags.tag_id
Works fine but when I try to select from it, I get the error:
SELECT DISTINCT tag FROM
(SELECT _dia_tagsrel.tag_id,_dia_tagsrel.article_id, _dia_tags.tag_id, _dia_tags.tag
FROM _dia_tagsrel
JOIN _dia_tags
ON _dia_tagsrel.tag_id = _dia_tags.tag_id) a
Regardless of the DISTINCT. Ok, I can change the column names to be unique but the question really is - why do i get the error when I SELECT FROM (SELECT) and not in the initial query?
Thanks
Solution:
SELECT DISTINCT tag_id, tag FROM (SELECT _dia_tagsrel.tag_id, _dia_tagsrel.article_id, _dia_tags.tag
FROM _dia_tagsrel
JOIN _dia_tags
ON _dia_tagsrel.tag_id = _dia_tags.tag_id) a
I only needed to SELECT one of the duplicate columns, even though I was comparing the both of them. Provided by answer below.
In you are second query i.e., the sub query, you are selecting tag_id twice. Though it is from two different tables, it works out whey you are selecting the data. But when you select the columns with same name twice, it provides you duplicate error. Below is the way you have selected the column which is incorrect
_dia_tagsrel.tag_id,_dia_tagsrel.article_id, _dia_tags.tag_id, _dia_tags.tag
While using sub queries, merge, in or exists clause, avoid using the same column names multiple times.
Simple join works out no need of having subquery,
SELECT _dia_tagsrel.tag_id,_dia_tagsrel.article_id, _dia_tags.tag_id, _dia_tags.tag
FROM _dia_tagsrel
JOIN _dia_tags
ON _dia_tagsrel.tag_id = _dia_tags.tag_id
Your first query returns four columns:
tag_id
article_id
tag_id
tag
Duplicate column names are allowed in a result set, but are not allowed in a table -- or derived table, view, CTE, or most subqueries (an exception are EXISTS subqueries).
I hope you can see the duplicate. There is no need to select tag_id twice, because the JOIN requires that the values are the same. So just select three columns:
SELECT tr.tag_id, tr.article_id, t.tag
FROM _dia_tagsrel tr JOIN
_dia_tags t
ON tr.tag_id = t.tag_id
Your subquery has two tag_ids, so how database engine decide which one you want to use.
So, either use one (join requires tag_ids to be same) or re-name it :
If _dia_tag has unique tags then you can use EXISTS instead of INNER JOIN:
SELECT t.tag
FROM _dia_tags t
WHERE EXISTS (SELECT 1 FROM _dia_tagsrel tr WHERE tr.tag_id = t.tag_id);

Nesting SQL queries

I have the following SQL query. In this query, I am left joining a table tblTestImport with access query named "unique". I am trying to integrate the query "unique" into the code below. I am not having any luck, please help.
DELETE tblTestImport.ID
FROM tblTestImport
WHERE tblTestImport.[ID]
in (SELECT tblTestImport.ID
FROM tblTestImport
LEFT JOIN **[unique]**
ON tblTestImport.ID = **unique.**LastOfID
WHERE (((**unique.**LastOfID) Is Null)));
Code for "unique" query
SELECT Last(tblTestImport.ID) AS LastOfID
FROM tblTestImport
GROUP BY tblTestImport.Url, tblTestImport.Kms, tblTestImport.Price, tblTestImport.Time;
Further info: I am trying to delete duplicates from the Access Table, and leave unique ones only.
tblTestImport has duplicate records. "Unique" query displays unique records. Then, I join tblTestImport table with "unique" query, to determine which unique records do not exist in tblTestImport. This gives me a list duplicates, which I want to delete.
Within the big chunk of code, i have [unique], which I would like replace with small piece of code below.
Your query will return no results, assuming that id is never NULL.
Why? Well unique.lastOfId is always a valid id in tblTestImport. As such, the LEFT JOIN will always match, and lastOfId will never be NULL.
So, no rows are in the subquery.
My suggestion is that you ask another question. Explain what you want to do and provide sample data and desired results.
It's hard to tell exactly what you're after based on the current setup of your question. However, this may be what you're looking for:
DELETE tblTestImport.ID
FROM tblTestImport
WHERE tblTestImport.[ID] IN
(SELECT tblTestImport.ID
FROM tblTestImport
LEFT JOIN [unique] ON tblTestImport.ID = unique.LastOfID
WHERE unique.LastOfID Is Null
AND tblTestImport.ID IN (
SELECT Last(tblTestImport.ID) AS LastOfID
FROM tblTestImport
GROUP BY tblTestImport.Url, tblTestImport.Kms, tblTestImport.Price, tblTestImport.Time)
)
(This is a simple nested query in the previous query, but it would all depend on the relationship of the data itself. This query assumes that they are linked on the same tblTestImport.ID field)
Try the following, untested:
DELETE FROM tblTestImport
WHERE ID <> (SELECT Min(ID) AS MinOfID FROM tblTestImport AS Dupe
WHERE (Dupe.Url= tblTestImport.Url)
AND (Dupe.Kms= tblTestImport.Kms)
AND (Dupe.Price= tblTestImport.Price)
AND (Dupe.Time= tblTestImport.Time));
Solution based on the following http://allenbrowne.com/subquery-01.html

Insert Into where not exists from specific category

I have a table that contains several repair categories, and items that are associated with each repair category. I am trying to insert all the standard items from a specific repair category that don't already exist into a Details table.
TblEstimateDetails is a join table for an Estimate Table and StandardItem Table. And TblCategoryItems is a join table for the Repair Categories and their respective Standard Items.
For example in the attached image, Left side are all the Standard Items in a Repair Category, and Right side are all the Standard Items that are already in the EstimateDetails table.
Standard Items All vs Already Included
I need to be able to insert the 6 missing GUIDS from the left, and into the table on the right, and only for a specific estimate GID.
This is being used in an Access VBA script, which I will translate into the appropriate code once I get the sql syntax correct.
Thank you!
INSERT INTO TblEstimateDetails(estimate_GID, standard_item_GID)
SELECT
'55DEEE29-7B79-4830-909C-E59E831F4297' AS estimate_GID
, standard_item_GID
FROM TblCategoryItems
WHERE repair_category_GID = '32A8AE6D-A512-4868-8E1A-EF0357AB100E'
AND NOT EXISTS
(SELECT standard_item_GID
FROM TblEstimateDetails
WHERE estimate_GID = '55DEEE29-7B79-4830-909C-E59E831F4297');
Some things to try: 1) simplify to a select query to see if it selects the right records, 2) use a NOT IN statement instead of NOT EXISTS. There's no reason NOT EXISTS shouldn't work, I'd just try a different way if it isn't working.
SELECT '55DEEE29-7B79-4830-909C-E59E831F4297' AS estimate_GID,
standard_item_GID
FROM TblCategoryItems
WHERE repair_category_GID = '32A8AE6D-A512-4868-8E1A-EF0357AB100E'
AND standard_item_GID NOT IN
(SELECT standard_item_GID FROM TblEstimateDetails
WHERE estimate_GID = '55DEEE29-7B79-4830-909C-E59E831F4297');
Got it figured out. Access needs the subquery to be correlated to main query to work. So I set the WHERE clause in the subquery to equal the matching column in the main query. And I had to join the Estimates table so that it picked only the items in a specific estimate.
SELECT
'06A2E0A9-9AE5-4073-A856-1CCE6D9C48BB' AS estimate_GID
, CI.standard_item_GID
FROM TblCategoryItems CI
INNER JOIN TblEstimates E ON CI.repair_category_GID=E.repair_category_GID
WHERE E.repair_category_GID = '15238097-305E-4456-B86F-3787C9B8219B'
AND NOT EXISTS
(SELECT ED.standard_item_GID
FROM TblEstimateDetails ED
WHERE E.estimate_GID=ED.estimate_GID
);

Why is selecting specified columns, and all, wrong in Oracle SQL?

Say I have a select statement that goes..
select * from animals
That gives a a query result of all the columns in the table.
Now, if the 42nd column of the table animals is is_parent, and I want to return that in my results, just after gender, so I can see it more easily. But I also want all the other columns.
select is_parent, * from animals
This returns ORA-00936: missing expression.
The same statement will work fine in Sybase, and I know that you need to add a table alias to the animals table to get it to work ( select is_parent, a.* from animals ani), but why must Oracle need a table alias to be able to work out the select?
Actually, it's easy to solve the original problem. You just have to qualify the *.
select is_parent, animals.* from animals;
should work just fine. Aliases for the table names also work.
There is no merit in doing this in production code. We should explicitly name the columns we want rather than using the SELECT * construct.
As for ad hoc querying, get yourself an IDE - SQL Developer, TOAD, PL/SQL Developer, etc - which allows us to manipulate queries and result sets without needing extensions to SQL.
Good question, I've often wondered this myself but have then accepted it as one of those things...
Similar problem is this:
sql>select geometrie.SDO_GTYPE from ngg_basiscomponent
ORA-00904: "GEOMETRIE"."SDO_GTYPE": invalid identifier
where geometrie is a column of type mdsys.sdo_geometry.
Add an alias and the thing works.
sql>select a.geometrie.SDO_GTYPE from ngg_basiscomponent a;
Lots of good answers so far on why select * shouldn't be used and they're all perfectly correct. However, don't think any of them answer the original question on why the particular syntax fails.
Sadly, I think the reason is... "because it doesn't".
I don't think it's anything to do with single-table vs. multi-table queries:
This works fine:
select *
from
person p inner join user u on u.person_id = p.person_id
But this fails:
select p.person_id, *
from
person p inner join user u on u.person_id = p.person_id
While this works:
select p.person_id, p.*, u.*
from
person p inner join user u on u.person_id = p.person_id
It might be some historical compatibility thing with 20-year old legacy code.
Another for the "buy why!!!" bucket, along with why can't you group by an alias?
The use case for the alias.* format is as follows
select parent.*, child.col
from parent join child on parent.parent_id = child.parent_id
That is, selecting all the columns from one table in a join, plus (optionally) one or more columns from other tables.
The fact that you can use it to select the same column twice is just a side-effect. There is no real point to selecting the same column twice and I don't think laziness is a real justification.
Select * in the real world is only dangerous when referring to columns by index number after retrieval rather than by name, the bigger problem is inefficiency when not all columns are required in the resultset (network traffic, cpu and memory load).
Of course if you're adding columns from other tables (as is the case in this example it can be dangerous as these tables may over time have columns with matching names, select *, x in that case would fail if a column x is added to the table that previously didn't have it.
why must Oracle need a table alias to be able to work out the select
Teradata is requiring the same. As both are quite old (maybe better call it mature :-) DBMSes this might be historical reasons.
My usual explanation is: an unqualified * means everything/all columns and the parser/optimizer is simply confused because you request more than everything.

Select rows that are different in SQL

I have a table with way too many columns and a couple million rows that I need to query for differences.
On these rows there will hopefully be only one column that is different and that should be the Auto incremented id field.
What I need to do is check to see if these rows ARE actually the same and if there are any that have any differences in any of the fields.
So for example, if the "Name" column is supposed to be "Peter, Paul and Mary" and the "Order #" column is supposed to be "132" I need to find any rows where those values aren't true, but I need to find it for every column in the table AND I don't actually know what the correct values are (meaning I can't just create a "SELECT...WHERE Name='This'" for each column).
So how can I find the rows that are different? (using straight SQL, no programming)
Would you think this answer is what you are looking for and would help you? here's a Link to find the appropriate sql query.
Let's suppose you coded a email newsletter signup form, but you forgot to double check that the email address was not a duplicate, or already in the database. We can write a query to find all the emails in our table that are duplicates, or occurs in more than one row.
The following SQL query works great for finding duplicate values in a table.
SELECT email,
COUNT(email) AS NumOccurrences
FROM users
GROUP BY email
HAVING ( COUNT(email) > 1 )
By using group by and then having a count greater than one, we find rows with with duplicate email addresses using the above SQL.
Blockquote
If you know the limit of the wrong results (say 10 for example) then you could order them and get only the first 11 results. You see where I am going with this, right?
I have no SQL expertise whatsoever though :)
Do you need to do this programmatically, or can you just run a few queries yourself to check it?
If the latter, I'd just do "select distinct name, order#" to start. This should return a list that includes "Peter Paul and Mary, 132" and possibly some other things.
Then find the other things by doing select ... where name = "this" as you suggest.
You could get even more info out of that first query by doing "select distinct name, order#, count(*) from ... group by name, order#". This would give you both the list of values and the frequency of a given set of values.
if I understand you correctly, (your question is not 100% clear to me), you are tryin g to find the rows that are unnecessary duplicates ? If so, Try these SQL queries:
Select A.Id, B.Id
From Table A
Join Table B
On A.Id <> B.Id
And A.ColA = B.ColA
And A.ColB = B.Col
And A.ColC = B.ColC
...
Or
Select ColA, ColB, etc.
From Table
Group By ColA, ColB, etc.
Having Count(*) > 1
If you have a correlation between two "independent" columns where there is really only one "correct" value for column B whenever column A is a given value, then you have a broken database design, because these correlation should have been factored out as a separate table.
Try this:
SELECT Name, OrderNum
FROM Orders T1
FULL OUTER JOIN (
SELECT Name, OrderNum
FROM Orders
GROUP BY Name, OrderNum
HAVING COUNT(*) > 1) T2
ON T1.Name = T2.Name
AND T1.OrderNum = T2.OrderNum
The nested select is identifying the duplicates, so you will need to target your common fields, the FULL OUTER JOIN excludes the duplicates from your result set. So essentially you are joining the table on itself to identify the duplicates and exclude them from your results. If you want only the duplicates then change the FULL OUTER JOIN to just JOIN.