duplicate query result when join table - sql

I face issue about duplicate data when join table, here my sample data table I have
-- Table A
I want to join with
-- Table B
this my query notation for join both table,
select a.trans_id, name
from tableA a
inner join tableB b
on a.ID_Trans = b.trans_id
and this the result, why I get the duplicating data which should show only two lines of data, please help me to solve this case.

Firstly, as you have been told multiple times in the comments, this is working exactly as you have written, and (more importantly) as intended. You have 2 rows in tableA and those 2 rows match 2 rows in your table tableB according to the ON clause. This means that each join operation, for the each of the rows in tableA, results in 2 rows as well; thus 4 rows (2 * 2 = 4).
Considering that your table, TableA only has one column then it seems that you should be cleaning up that data and deleting the duplicates. There are plenty of examples on how to do that already (example).
Perhaps the column you show us in TableA is one many, and thus instead you have a denormalisation issue, and instead there should be another table with the details of Id_trans and a PRIMARY KEY or UNIQUE CONSTRAINT/INDEX on it. Then you would join fron that table to TableB.
Finally, what you might be after is an EXISTS, which would look like this:
SELECT B.trans_id, B.[name]
FROM dbo.TableB B
WHERE EXISTS(SELECT 1
FROM dbo.TableA A
WHERE A.ID_Trans = B.trans_id); --Odd that it's called ID_Trans in one table, and Trans_ID in another

As the comments mentioned your query does exactly what you asked it to do but I think you wanted something like:
select a.trans_id, a.name, b.name
from tableA a
inner join tableB b on a.trans_id = b.trans_id
group by a.trans_id, a.name, b.name

Since there are two rows in both table with same ID join will make them four. You can use distinct to remove duplicates:
select distinct a.trans_id, name
from tableA a
inner join tableB b
on a.id_trans = b.trans_id
But I would suggest to use exists:
select trans_id, name
from tableB b
exists (select 1 from tableA a where a.trans_id=b.trans_id)

Related

join or merge two table based on id merge

I have two tables:
I am looking for the results like mentioned in the last.
I tried union (only similar col can be merged), left join, right join i am getting repeated fields in Null areas what can be other options where i can get null without column repeating
A full join would get all results from both tables.
select
A.ID,
A.ColA,
A.ColB,
B.ColC,
B.ColD
from TableA A
full join Table B on A.ID = B.ID
Here is a good post to understand joins
You can try distinct:
select distinct * from
tableA a,
tableB b
where a.id = b.id;
It will not give any duplicate tuples.

Join of Two Tables where Data Matches in One Column

For some reason I have a hard time grasping joins and this one should be very simple with the knowledge that I have in SQL.
Anyway, I have 2 tables. We will call them TableA and TableB. One of the columns in TableA is "ID". TableB only consists of the column "ID". I want to return all rows in TableA whose ID is present in TableB.
I know this should be very simple to figure out, but my brain doesn't want to work today.
You can do this using an EXISTS:
Select A.*
From TableA A
Where Exists
(
Select *
From TableB B
Where A.Id = B.Id
)
You can also use a JOIN if you wish, but depending on your data, you may want to couple that with a SELECT DISTINCT:
Select Distinct A.*
From TableA A
Join TableB B On A.Id = B.Id
One thing to keep in mind is that the ID of TableA is not necessarily related to the ID of TableB.
this should work
SELECT B.ID
FROM TableA A
JOIN TableB B
ON (A.ID=B.ID)
WHERE A.ID=B.ID
You can also use IN operator like this:
Select *
From TableA
Where ID in
(
Select distinct ID
From TableB
)

Joining on 2 tables but only selecting rows from one of the tables

I have 2 tables with identical names and schema. I would like to join on them, but only select rows from one of the tables. What is a good way to do this? The below query selects the rows from both tables, but I just want table a2 from the other DB.
select a.fkey_id, a2.fkeyid_id, a.otherthing, a2.otherthing from mytable a
inner join otherdb.dbo.mytable a2 on a.fkey_id=a2.fkey_id
I tried using left outer join but since the schemas are identical between the 2 tables this doesn't seem to work.
EDIT: I am only including the "a" table columns in the select to get an idea of what values the rows are returning. I just don't want any rows returned from "a", so I'd like to filter those rows out somehow.
Just take out the references to "a2" columns from the select list.
select a.fkey_id, a.otherthing from mytable a
inner join otherdb.dbo.mytable a2 on a.fkey_id=a2.fkey_id
OR
select a.* from mytable a
inner join otherdb.dbo.mytable a2 on a.fkey_id=a2.fkey_id
Which begs the questions on why you're joining to the other table if you don't want data from it. Is this a filtering method? If so, it would better performance-wise to do an exists.
select a.* from mytable a
WHERE EXISTS (
SELECT 1
FROM otherdb.dbo.mytable a2
WHERE a.fkey_id=a2.fkey_id)
select a.fkey_id
, a.otherthing
from mytable a
WHERE EXISTS (SELECT 1
FROM otherdb.dbo.mytable a2
WHERE a.fkey_id=a2.fkey_id)

Select returns the same row multiple times

I have the following problem:
In DB, I have two tables. The value from one column in the first table can appear in two different columns in the second one.
So, the configuration is as follows:
TABLE_A: Column Print_group
TABLE _B: Columns Print_digital and Print_offset
The value from the different rows and Print_group column of the Table_A can appear in one row of the Table_B but in different column.
I have the following query:
SELECT DISTINCT * FROM Table_A
INNER JOIN B ON (Table_A. Print_digital = Table_B.Print_group OR
Table_A.Print_offset = Table_B.Print_group)
The problem is that this query returns the same row from the Table_A two times.
What I am doing wrong? What is the right query?
Thank you for your help
If I'm understanding your question correctly, you just need to clarify your fields to come from Table_A:
SELECT DISTINCT A.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group
EDIT:
Given your comments, looks like you just need SELECT DISTINCT B.*
SELECT DISTINCT B.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group
I've still another question... first,to be clear, the right query version is
SELECT DISTINCT A.*
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group.
If I want it returns also one column from the B table it again returns duplicate rows. My query (the bad one) is the following one:
SELECT DISTINCT A.*, B.Id
FROM Table_A A
INNER JOIN B ON A.Print_digital = B.Print_group
OR A.Print_offset = B.Print_group

How to loop through rows in two tables and create a new set based on the merged results in SQL

Here is my obstacle.
I have two tables. Table A contains more rows than Table B. I have to merge the results and if Table A does not contain a row from Table B then I insert it into the new set. If however, a row from Table A contains a row with the same primary key as Table B, the new set will take the row from Table B.
Would this best be done in a cursor or is there an easier way to do this? I ask because there are 20 million rows and while I am new to sql, i've heard cursors are expensive.
Your phrasing is a little vague. It seems that you want everything from TableB and then rows from TableA that have no matching primary key in B. The following query solves this problem:
select *
from tableB union all
select *
from tableA
where tableA.pk not in (select pk from tableB)
Yep, cursors are expensive.
There's a MERGE command in later versions of SQL that will do this in one shot, but it's sooo cumbersome. Better to do it in two pieces - first:
UPDATE A SET
field1 = B.field1
,field2 = B.field2
, etc
FROM A JOIN B on B.id = A.id
Then:
INSERT A SELECT * FROM B --enumerate fields if different
WHERE B.id not in (select id FROM A)
An OUTER JOIN should do what you need and be more efficient than a cursor.
Try this query
--first get the rows that match between TableA and TableB
INSERT INTO [new set]
SELECT TableB.* --or columns of your choice
FROM TableA LEFT JOIN TableB ON [matching key criteria]
WHERE TableB.[joining column/PK] IS NOT NULL
--then get the rows from TableA that don't have a match
INSERT INTO [new set]
SELECT TableA.* --you didn't say what was inserted if there was no matching row
FROM TableA LEFT JOIN TableB ON [matching key criteria]
WHERE TableB.[joining column/PK] IS NULL