SQL subquery multiple times error - sql

I am making a subquery but I am getting a strange error
The column 'RealEstateID' was specified multiple times for 'NotSold'.
here is my code
SELECT *
FROM
(SELECT *
FROM RealEstatesInfo AS REI
LEFT JOIN Purchases AS P
ON P.RealEstateID=REI.RealEstateID
WHERE DateBought IS NULL) AS NotSold
INNER JOIN OwnerEstate AS OE
ON OE.RealEstateID=NotSold.RealEstateID
It's on SQL server by the way.

That's because there will be 2 realestiteids in your subquery. You need to change it to explicitly list the columns from both table and only include 1 realestateid. It doesn't matter which as you use it for your join.
If you're very Lazy you can select rei.* and only name the p cols apart from realestateid.
Btw select * is probably never a good idea in sub queries or derived tables or ctes.

Related

Prevent duplicate record when inner join query in SQL

I used the inner join command to get the data from two tables.
But, when I run the SQL query.
I got the same record duplicated 48 times.
The SQL query I created is below
SELECT
ABS_LIMIT.B1_NAME, ABS_LIMIT.B2_NAME, ABS_LIMIT.B3_NAME, ABS_LIMIT.ELEM_NAME
FROM
ABS_LIMIT
INNER JOIN
RTU_SCAN ON RTU+SCAN.B1_NAME = ABS_LIMIT.B1_NAME
WHERE
ABS_LIMIT.B3_NAME LIKE 'AMP%';
Does anyone have any idea how to remove the duplicate from the query result?
You never SELECT any columns from RTU_SCAN so you can use EXISTS rather than an INNER JOIN:
SELECT a.B1_NAME,
a.B2_NAME,
a.B3_NAME,
a.ELEM_NAME
FROM ABS_LIMIT a
WHERE EXISTS (SELECT 1 FROM RTU_SCAN r WHERE r.B1_NAME = a.B1_NAME)
AND a.B3_NAME LIKE 'AMP%';
Then, if there are duplicates in RTU_SCAN they will not propagate duplicate rows in the output.
Alternatively, you could use DISTINCT to remove duplicates:
SELECT DISTINCT
a.B1_NAME,
a.B2_NAME,
a.B3_NAME,
a.ELEM_NAME
FROM ABS_LIMIT a
INNER JOIN RTU_SCAN r
ON r.B1_NAME = a.B1_NAME
AND a.B3_NAME LIKE 'AMP%';
However, it will probably be less efficient to generate duplicates and then filter them out using DISTINCT compared to using EXISTS and not generating the duplicates in the first place.

Selecting ambiguous column from subquery with postgres join inside

I have the following query:
select x.id0
from (
select *
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0
) x;
Since id0 is in both sessions and clicked_products, I get the expected error:
column reference "id0" is ambiguous
However, to fix this problem in the past I simply needed to specify a table. In this situation, I tried:
select sessions.id0
from (
select *
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0
) x;
However, this results in the following error:
missing FROM-clause entry for table "sessions"
How do I return just the id0 column from the above query?
Note: I realize I can trivially solve the problem by getting rid of the subquery all together:
select sessions.id0
from sessions
inner join clicked_products on sessions.id0 = clicked_products.session_id0;
However, I need to do further aggregations and so do need to keep the subquery syntax.
The only way you can do that is by using aliases for the columns returned from the subquery so that the names are no longer ambiguous.
Qualifying the column with the table name does not work, because sessions is not visible at that point (only x is).
True, this way you cannot use SELECT *, but you shouldn't do that anyway. For a reason why, your query is a wonderful example:
Imagine that you have a query like yours that works, and then somebody adds a new column with the same name as a column in the other table. Then your query suddenly and mysteriously breaks.
Avoid SELECT *. It is ok for ad-hoc queries, but not in code.
select x.id from
(select sessions.id0 as id, clicked_products.* from sessions
inner join
clicked_products on
sessions.id0 = clicked_products.session_id0 ) x;
However, you have to specify other columns from the table sessions since you cannot use SELECT *
I assume:
select x.id from (select sessions.id0 id
from sessions
inner join clicked_products
on sessions.id0 = clicked_products.session_id0 ) x;
should work.
Other option is to use Common Table Expression which are more readable and easier to test.
But still need alias or selecting unique column names.
In general selecting everything with * is not a good idea -- reading all columns is waste of IO.

RedShift SQL subquery with Inner join

I am using AWS Redshift SQL. I want to inner join a sub-query which has group by and inner join inside of it. When I do an outside join; I am getting an error that column does not exist.
Query:
SELECT si.package_weight
FROM "packageproduct" ub "clearpathpin" cp ON ub.cpipr_number = cp.pin_number
INNER JOIN "clearpathpin" cp ON ub.cpipr_number = cp.pin_number
INNER JOIN (
SELECT sf."AWB", SUM(up."weight") AS package_weight
FROM "productweight" up ON up."product_id" = sf."item_id"
GROUP BY sf."AWB"
HAVING sf."AWB" IS NOT NULL
) AS si ON si.item_id = ub.order_item_id
LIMIT 100;
Result:
ERROR: column si.item_id does not exist
It's simply because column si.item_id does not exist
Include item_id in the select statement for the table productweight
and it should work.
There are many things wrong with this query.
For your subquery, you have an ON statement, but it is not joining:
FROM "productweight" up ON up."product_id" = sf."item_id"
When you join the results of this subquery, you are referencing a field that does not exist within the subquery:
SELECT sf."AWB", SUM(up."weight") AS package_weight
...
) AS si ON si.item_id = ub.order_item_id
You should imagine the subquery as creating a new, separate, briefly-existing table. The outer query than joins that temporary table to the rest of the query. So anything not explicitly resulted in the subquery will not be available to the outer query.
I would recommend when developing you write and run the subquery on its own first. Only after it returns the results you expect (no errors, appropriate columns, etc) then you can copy/paste it in as a subquery and start developing the main query.

SQL optimization (inner join or selects)

I have a dilema, i had a teacher that thought me basically that inner joins are hell (he reproved me because I missed the delivery of the final proyect by 3 mins...), now i have another that tells me that using just selects is inefficient, so I don't know what is white nor black... could someone enlighten me with their knowledge?
Joins
SELECT
NombreP AS Nombre,
Nota
FROM Lleva
INNER JOIN Estudiante ON CedEstudiante = Estudiante.Cedula
WHERE
Lleva.SiglaCurso='CI1312';
No Joins
SELECT
NombreP AS Nombre,
Nota
FROM (
SELECT
Nota,
CedEstudiante
FROM Lleva
WHERE
SiglaCurso='CI1312'
) AS Lleva, (
SELECT
NombreP,
Cedula
FROM Estudiante
) AS Estudiante
WHERE
CedEstudiante = Estudiante.Cedula;
So wich one is more efficient?
Lets re-write the code so it's easier to understand:
SELECT E.NombreP AS Nombre
,L.Nota
FROM Lleva L INNER JOIN Estudiante E ON L.CedEstudiante = E.Cedula
WHERE
L.SiglaCurso='CI1312';
A table using a subquery may look like this:
SELECT L.Nota
,(SELECT E.Nombre
FROM Estudiante E
WHERE E.Cedula = L.CedEstudiante
)
FROM Lleva L
WHERE
L.SiglaCurso='CI1312'
What you actually did in your original query was an implicit join. This is similar to inner join, without declaring the exact joining conditions. Implicit joins will attempt to join on similarly named columns between tables. Most programmers do not advise or use implicit join.
As for join versus subquery, they are applied in different situations.
They are not equivalent. Notice what I have put in bold below:
A subquery will guarantee 1 returned value or return NULL; if there are multiple values returned in the subquery you will get an error for returning more than 1 value and have to solve the problem with an aggregation perhaps (max value, top 1 value). Subqueries that return no match will return NULL without affecting the rest of the row.
The JOIN (INNER JOIN) operates differently. It can match rows and get the single value you're looking for just like a subquery. But a join can multiply returned rows if the joining conditions are not distinct (singular/non-repeating). This is why joins are usually done on Primary Keys (PK's). PK's are distinct by definition. The INNER JOIN will also remove rows if a joining condition doesn't occur between tables. This may be what your first professor was trying to explain- an INNER JOIN can work in many cases - similar to a subquery- but can also can return additional rows or remove rows from the output.

Difference visibility in subquery join and where

I had problems with a simple join:
SELECT *
FROM worker wo
WHERE EXISTS (
SELECT wp.id_working_place
FROM working_place wp
JOIN working_place_worker wpw ON ( wp.id_working_place = wpw.id_working_place
AND wpw.id_worker = wo.id_worker)
)
The error I had was ORA-00904: "WO"."ID_WORKER": not valid identifier.
Then I decided to move the union of tables from join clause to the where clause:
SELECT *
FROM worker wo
WHERE EXISTS (
SELECT wp.id_working_place
FROM working_place wp
JOIN working_place_worker wpw ON ( wp.id_working_place = wpw.id_working_place)
WHERE wpw.id_worker = wo.id_worker
)
And this last query works perfect.
Why is not possible to make it in the join? The table should be visible like it is in the where clause. Am I missing something?
In
FROM working_place wp
JOIN working_place_worker wpw ON ...
WHERE ...
the ON clause refers only to the two tables participating in the join, namely wp and wpw. Names from the outer query are not visible to it.
The WHERE clause (and its cousin HAVING is the means by which the outer query is correlated to the subquery. Names from the outer query are visible to it.
To make it easy to remember,
ON is about the JOIN, how two tables relate to form a row (or rows)
WHERE is about the selection criteria, the test the rows must pass
While the SQL parser will admit literals (which aren't column names) in the ON clause, it draws the line at references to columns outside the join. You could regard this as a favor that guards against errors.
In your case, the wo table is not part of the JOIN, and is rejected. It is part of the whole query, and is recognized by WHERE.