SQL columns that were Inner joined showing up separately? - sql

I'm trying to join some tables together, but when I join them the columns I am joining on are showing up separately.
My code looks something like this:
Select * FROM
SimulationOutput as SO
Inner Join SimulationInput as SI
ON SO.Key = SI.Key
AND
SO.Scenario = SI.Scenario
When I do this, I get the result I expected but my result also includes duplicates of The "Key" columns and the "Scenario" columns that I just joined on.
Is this expected behavior?
https://www.w3schools.com/sql/trysql.asp?filename=trysql_select_join_inner
On that website
I ran the following code:
Select * FROM
Orders
Inner Join Customers ON Orders.CustomerID = Customers.CustomerID
ORDER BY Orders.CustomerID
And in that output, the CustomerId column only shows up once, the Customers.CustomerID was dropped. I am trying to understand why this is happening in the latter, but not the former.
The reason I ask is because I intend to use a series of nested tables. And I'm running into duplicate column problems where SQL doesn't know which Scenario column to choose. I understand that I can specify the specific column, but I was planning on specifying my specific columns at the top fo the nested loop, and use a series of nested Select* in between.
So the structure would be something like
Select Column1, Column2, Column3 From
(Select * FROM ......
(Select * FROM.......
(Select * FROM..... ) as T1) as T2) as T3
I understand that this may not be ideal and for this question my end result is moot, I want to understand why the columns I'm joining on aren't being dropped the way they are on the w3schools website.
Is it because I am using an AND statement? Is there a better way to inner join using multiple criteria? Is it because of my version of SQL (TSQL?)
Image From W3 schools output

Related

Specifying SELECT, then joining with another table

I just hit a wall with my SQL query fetching data from my MS SQL Server.
To simplify, say i have one table for sales, and one table for customers. They each have a corresponding userId which i can use to join the tables.
I wish to first SELECT from the sales table where say price is equal to 10, and then join it on the userId, in order to get access to the name and address etc. from the customer table.
In which order should i structure the query? Do i need some sort of subquery or what do i do?
I have tried something like this
SELECT *
FROM Sales
WHERE price = 10
INNER JOIN Customers
ON Sales.userId = Customers.userId;
Needless to say this is very simplified and not my database schema, yet it explains my problem simply.
Any suggestions ? I am at a loss here.
A SELECT has a certain order of its components
In the simple form this is:
What do I select: column list
From where: table name and joined tables
Are there filters: WHERE
How to sort: ORDER BY
So: most likely it was enough to change your statement to
SELECT *
FROM Sales
INNER JOIN Customers ON Sales.userId = Customers.userId
WHERE price = 10;
The WHERE clause must follow the joins:
SELECT * FROM Sales
INNER JOIN Customers
ON Sales.userId = Customers.userId
WHERE price = 10
This is simply the way SQL syntax works. You seem to be trying to put the clauses in the order that you think they should be applied, but SQL is a declarative languages, not a procedural one - you are defining what you want to occur, not how it will be done.
You could also write the same thing like this:
SELECT * FROM (
SELECT * FROM Sales WHERE price = 10
) AS filteredSales
INNER JOIN Customers
ON filteredSales.userId = Customers.userId
This may seem like it indicates a different order for the operations to occur, but it is logically identical to the first query, and in either case, the database engine may determine to do the join and filtering operations in either order, as long as the result is identical.
Sounds fine to me, did you run the query and check?
SELECT s.*, c.*
FROM Sales s
INNER JOIN Customers c
ON s.userId = c.userId;
WHERE s.price = 10

Getting way more results than expected in SQL left join query

My code is such:
SELECT COUNT(*)
FROM earned_dollars a
LEFT JOIN product_reference b ON a.product_code = b.product_code
WHERE a.activity_year = '2015'
I'm trying to match two tables based on their product codes. I would expect the same number of results back from this as total records in table a (with a year of 2015). But for some reason I'm getting close to 3 million.
Table a has about 40,000,000 records and table b has 2000. When I run this statement without the join I get 2,500,000 results, so I would expect this even with the left join, but somehow I'm getting 300,000,000. Any ideas? I even refered to the diagram in this post.
it means either your left join is using only part of foreign key, which causes row multiplication, or there are simply duplicate rows in the joined table.
use COUNT(DISTINCT a.product_code)
What is the question are are trying to answer with the tsql?
instead of select count(*) try select a.product_code, b.product_code. That will show you which records match and which don't.
Should also add a where b.product_code is not null. That should exclude the records that don't match.
b is the parent table and a is the child table? try a right join instead.
Or use the table's unique identifier, i.e.
SELECT COUNT(a.earned_dollars_id)
Not sure what your datamodel looks like and how it is structured, but i'm guessing you only care about earned_dollars?
SELECT COUNT(*)
FROM earned_dollars a
WHERE a.activity_year = '2015'
and exists (select 1 from product_reference b ON a.product_code = b.product_code)

SQL join result errors

I'm trying to run this join and I'm not receiving the correct values.
My first query return like 25,000 record
SELECT count(*) from table1 as DSO,
table2 as EAR
WHERE
(UCASE(TRIM(EAR.value)) = UCASE(TRIM(DSO.value))
AND
UCASE(TRIM(EAR.value1) = UCASE(TRIM(DSO.value1))
my second Query return like 3,000,000
SELECT count(*) from table1 as DSO
left join table2 as EAR,
ON
(UCASE(TRIM(EAR.value)) = UCASE(TRIM(DSO.value))
AND
UCASE(TRIM(EAR.value1) = UCASE(TRIM(DSO.value1))
The total of records of the table 1 are like 45,000, thats what I Should recieve.
First query is an INNER JOIN and second one is a LEFT JOIN. You should expect quite different results. Also, look at the way db2400 treats NULLs with the UCASE and TRIM functions. My guess is that your left join is making some matches that you don't want.
The INNER JOIN in the first query is going to exclude any records from table1 that don't have a match in table2. That pretty quickly explains the lower count.
Either join will happily create more than one row for each record in table1 if it finds multiple matches in table2. The difference is that the LEFT JOIN will ALSO create one row for each record in table1 that doesn't have a match in table2. It sounds like you expect there to be a 1:1 match between the two tables, but that is not what you are getting.

Duplicate columns with inner Join

SELECT
dealing_record.*
,shares.*
,transaction_type.*
FROM
shares
INNER JOIN shares ON shares.share_ID = dealing_record.share_id
INNER JOIN transaction_type ON transaction_type.transaction_type_id = dealing_record.transaction_type_id;
The above SQL code produces the desired output but with a couple of duplicate columns. Also, with incomplete display of the column headers. When I change the
linesize 100
the headers shows but data displayed overlaps
I have checked through similar questions but I don't seem to get how to solve this.
You have duplicate columns, because, you're asking to the SQL engine for columns that they will show you the same data (with SELECT dealing_record.* and so on) , and then duplicates.
For example, the transaction_type.transaction_type_id column and the dealing_record.transaction_type_id column will have matching rows (otherwise you won't see anything with an INNER JOIN) and you will see those duplicates.
If you want to avoid this problem or, at least, to reduce the risk of having duplicates in your results, improve your query, using only the columns you really need, as #ConradFrix already said. An example would be this:
SELECT
dealing_record.Name
,shares.ID
,shares.Name
,transaction_type.Name
,transaction_type.ID
FROM
shares
INNER JOIN shares ON shares.share_ID = dealing_record.share_id
INNER JOIN transaction_type ON transaction_type.transaction_type_id = dealing_record.transaction_type_id;
Try to join shares with dealing_record, not shares again:
select dealing_record.*,
shares.*,
transaction_type.*
FROM shares inner join dealing_record on shares.share_ID = dealing_record.share_id
inner join transaction_type on transaction_type.transaction_type_id=
dealing_record.transaction_type_id;

How to select all attributes in sql Join query

The following sql query below produces the specified result.
select product.product_no,product_type,salesteam.rep_name,salesteam.SUPERVISOR_NAME
from product
inner join salesteam
on product.product_rep=salesteam.rep_id
ORDER BY product.Product_No;
However my intensions are to further produce a more detailed result which will include all the attributes in the PRODUCT table. my approach is to list all the attributes in the first line of the query.
select product.product_no,product.product_date,product.product_colour,product.product_style,
product.product_age product_type,salesteam.rep_name,salesteam.SUPERVISOR_NAME
from product
inner join salesteam
on product.product_rep=salesteam.rep_id
ORDER BY product.Product_No;
Is there another way it can be done instead of listing all the attributes of PRoduct table one by one?
You can use * to select all columns from all tables, or you can use [table/alias].* to select all columns from the specified table. In your case, you can use product.*:
select product.*,salesteam.rep_name,salesteam.SUPERVISOR_NAME
from product
inner join salesteam
on product.product_rep=salesteam.rep_id
ORDER BY product.Product_No;
It is important to note that you should only do this if you are 100% sure you need every single column, and always will. There are performance implications associated with this; if you're selecting 100 columns from a table when you really only need 4 or 5 of them, you're adding a lot of overhead to the query. The DBMS has to work harder, and you're also sending more data across the wire (if your database is not on the same machine as your executing code).
If any columns are later added to the product table, those columns will also be returned by this query in the future.
select
product.*,
salesteam.rep_name,
salesteam.SUPERVISOR_NAME
from product inner join salesteam on
product.product_rep=salesteam.rep_id
ORDER BY
product.Product_No;
This should do.
You can write like this
select P.* --- all Product columns
,S.* --- all salesteam columns
from product P
inner join salesteam S
on P.product_rep=S.rep_id
ORDER BY P.Product_No;