Inner join without duplicates, is it possible?

Given these two tables
Table A1 has two rows with the same value 'a'
A1
a
a
Table A2 has two rows with primary key values A and B, both associated with 'a'
A2
PK col2
A a
B a
What I want is a join of A1 and A2 with this result
a A
a B
Obviously a plain inner join doesn't work here, because it returns each of those rows twice. Is there a way to do this in SQL Server 2008?

You can wipe out the duplicates by using DISTINCT
select distinct
A1.col1,
A2.PK
from
A1
inner join A2
on A1.col1 = A2.col2
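To see why the plain join duplicates rows and what DISTINCT changes, here is a minimal sketch. It assumes an in-memory SQLite database (through Python's sqlite3 module) as a stand-in for SQL Server 2008, and the column name col1 for A1 is taken from the answer above rather than from the question.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A1 (col1 TEXT);
    CREATE TABLE A2 (PK TEXT PRIMARY KEY, col2 TEXT);
    INSERT INTO A1 VALUES ('a'), ('a');
    INSERT INTO A2 VALUES ('A', 'a'), ('B', 'a');
""")

# Same join, with and without DISTINCT.
join_sql = """
    SELECT {distinct} A1.col1, A2.PK
    FROM A1
    INNER JOIN A2 ON A1.col1 = A2.col2
    ORDER BY A2.PK
"""

plain_rows = conn.execute(join_sql.format(distinct="")).fetchall()
distinct_rows = conn.execute(join_sql.format(distinct="DISTINCT")).fetchall()

print(plain_rows)     # [('a', 'A'), ('a', 'A'), ('a', 'B'), ('a', 'B')]
print(distinct_rows)  # [('a', 'A'), ('a', 'B')]
```

Each of the two A1 rows matches both A2 rows, so the join produces four rows; DISTINCT collapses them after the fact.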

If using DISTINCT is allowed:
SELECT DISTINCT a.*, b.pk
FROM A1 a
INNER JOIN A2 b ON (a.[test] = b.fk)

The post gives no join condition, so we need a cross join. I applied a cross join and removed the duplicate values using DISTINCT.
Select distinct A1.Col1, A2.Pk
From A1 ,A2

"and restrict the duplicate values using distinct."
At least in Postgres 9+, DISTINCT eliminates duplicates that already exist in the result; it does not prevent or restrict them from appearing.

SELECT DISTINCT A.*
FROM aTable AS A
INNER JOIN
bTable AS B USING(columnId)

Related

Query left join without all the right rows from B table

I have 2 tables, A and B.
I need all columns from A + 1 column from B in my select.
Unfortunately, B has multiple rows (all identical) for one row in A
on the join condition.
I tried, but I can't match one row in A to a single row in B (with a left join, for example) while keeping my select list.
How can I write this query? It is for Oracle SQL.
Thanks in advance.
This is a good use for outer apply. The structure of the query looks like this:
select a.*, b.col
from a outer apply
(select top 1 b.col
from b
where b.? = a.?
) b;
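Normally, you would only use top 1 with order by. In this case, it doesn't seem to make a difference which row you choose. Note that top 1 is SQL Server syntax; in Oracle 12c and later, the equivalent is fetch first 1 row only at the end of the subquery.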
You can aggregate B in a subquery, grouping by the join key, and use an aggregate (like max or min) to pick any of the identical B values:
select a.*
, b.min_col1
from TableA a
left join
(
select a_id
, min(col1) as min_col1
from TableB
group by
a_id
) b
on b.a_id = a.id
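A small sketch of the pre-aggregation approach, assuming SQLite (via Python's sqlite3) rather than Oracle, and with made-up sample rows; the table and column names follow the query above, and the name column is hypothetical.

```python
import sqlite3

# In-memory SQLite database; the sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE TableB (a_id INTEGER, col1 TEXT);
    INSERT INTO TableA VALUES (1, 'first'), (2, 'second'), (3, 'third');
    -- Several identical B rows per A row, and none at all for id 3.
    INSERT INTO TableB VALUES (1, 'x'), (1, 'x'), (1, 'x'), (2, 'y'), (2, 'y');
""")

# Collapse B to one row per a_id before joining, so A rows are never multiplied.
rows = conn.execute("""
    SELECT a.id, a.name, b.min_col1
    FROM TableA a
    LEFT JOIN (
        SELECT a_id, MIN(col1) AS min_col1
        FROM TableB
        GROUP BY a_id
    ) b ON b.a_id = a.id
    ORDER BY a.id
""").fetchall()

print(rows)  # [(1, 'first', 'x'), (2, 'second', 'y'), (3, 'third', None)]
```

Because the subquery returns at most one row per a_id, every row of A appears exactly once, with NULL where B has nothing.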

Concatenating two tables distributively

I'm not 100% sure how to phrase the question, but I'm pretty much trying to do this:
say I have two tables:
table a:
a1
a2
and
table b:
b1
b2
I want to combine them and create a table such as:
a1 b1
a1 b2
a2 b1
a2 b2
(for every row in table a, create one row for each row in table b, sort of)
I figure I'd be able to do this using a loop of some sort, but I was wondering if there was any way to do this with set logic?
The syntax you're looking for is a cross join:
SELECT a.*, b.*
FROM a
CROSS JOIN b
You don't need any loops.
This is a very simple task in SQL.
You can do:
select a.*, b.*
from a
cross join b
or:
select a.*, b.*
from a
inner join b on (1=1)
No need to loop; a simple one-line query will work.
SELECT a.*, b.* FROM a,b
Note: listing the tables separated by a comma with no join condition is a cross join by default, so there is no need to write the CROSS JOIN keywords.
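The three spellings suggested in the answers above can be checked against each other. This is only a sketch, assuming SQLite through Python's sqlite3 module and a hypothetical single column named val in each table.

```python
import sqlite3

# In-memory SQLite database; the column name "val" is an assumption.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (val TEXT);
    CREATE TABLE b (val TEXT);
    INSERT INTO a VALUES ('a1'), ('a2');
    INSERT INTO b VALUES ('b1'), ('b2');
""")

# Three ways of writing the same Cartesian product.
queries = {
    "cross join": "SELECT a.val, b.val FROM a CROSS JOIN b ORDER BY a.val, b.val",
    "inner join on 1=1": "SELECT a.val, b.val FROM a INNER JOIN b ON (1=1) ORDER BY a.val, b.val",
    "comma": "SELECT a.val, b.val FROM a, b ORDER BY a.val, b.val",
}

results = {name: conn.execute(sql).fetchall() for name, sql in queries.items()}

for name, rows in results.items():
    print(name, rows)
# Every variant prints [('a1', 'b1'), ('a1', 'b2'), ('a2', 'b1'), ('a2', 'b2')]
```

The explicit CROSS JOIN form is probably the easiest to read, since it makes clear that the missing join condition is intentional.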

SQL join including all rows from one table irrespective of how many are represented in the other

I have two tables:
I want to output the following:
I tried this statement:
SELECT TableA.bu_code
, SUM(TableB.count_invalid_date) AS TotalInvDate
FROM TableA
LEFT JOIN TableB ON TableA.bu_code = TableB.bu_code
GROUP BY TableA.bu_code
But it doesn't show every row represented in TableA, instead it does this:
Is there a single SQL statement that can output what I want?
You could use a left join after performing the group by:
SELECT a.bu_code, COALESCE (TotalInvDate, 0)
FROM TableA a
LEFT JOIN (SELECT bu_code, SUM(count_invalid_date) AS TotalInvDate
FROM TableB
GROUP BY bu_code) b ON a.bu_code = b.bu_code
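Here is a rough, runnable version of that idea. The question's sample tables were not preserved, so the rows below are invented, and SQLite (via Python's sqlite3) stands in for the real database.

```python
import sqlite3

# In-memory SQLite database; the bu_code values and counts are made up.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (bu_code TEXT);
    CREATE TABLE TableB (bu_code TEXT, count_invalid_date INTEGER);
    INSERT INTO TableA VALUES ('BU1'), ('BU2'), ('BU3');
    -- BU2 has no rows in TableB at all.
    INSERT INTO TableB VALUES ('BU1', 2), ('BU1', 3), ('BU3', 4);
""")

# Sum per bu_code first, then left join so every TableA row survives.
rows = conn.execute("""
    SELECT a.bu_code, COALESCE(b.TotalInvDate, 0) AS TotalInvDate
    FROM TableA a
    LEFT JOIN (SELECT bu_code, SUM(count_invalid_date) AS TotalInvDate
               FROM TableB
               GROUP BY bu_code) b ON a.bu_code = b.bu_code
    ORDER BY a.bu_code
""").fetchall()

print(rows)  # [('BU1', 5), ('BU2', 0), ('BU3', 4)]
```

COALESCE turns the NULL produced by the unmatched left join row into 0, so BU2 still shows up in the output.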
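There may be orphaned TableB rows without a parent row in TableA.
If so, you could try this GROUP BY syntax. Note that ROLLUP adds a grand-total row (with a NULL bu_code) to the grouped result; because the query left joins from TableA, the orphaned TableB rows themselves only appear if the join is changed to a FULL OUTER JOIN.
GROUP BY ROLLUP(TableA.bu_code)
Refer to the Microsoft SQL Server documentation on the GROUP BY clause for more details on the ROLLUP option.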
SELECT A.bu_code AS [bu_code],
ISNULL(sum( B.count_invalid_date),0) AS [TotalInvDate]
FROM
TableA A
left JOIN
TableB B
ON
A.bu_code=B.bu_code
GROUP BY
A.bu_code

Joining on 2 tables but only selecting rows from one of the tables

I have 2 tables with identical names and schema. I would like to join on them, but only select rows from one of the tables. What is a good way to do this? The below query selects the rows from both tables, but I just want table a2 from the other DB.
select a.fkey_id, a2.fkey_id, a.otherthing, a2.otherthing from mytable a
inner join otherdb.dbo.mytable a2 on a.fkey_id=a2.fkey_id
I tried using left outer join but since the schemas are identical between the 2 tables this doesn't seem to work.
EDIT: I am only including the "a" table columns in the select to get an idea of what values the rows are returning. I just don't want any rows returned from "a", so I'd like to filter those rows out somehow.
Just take out the references to "a2" columns from the select list.
select a.fkey_id, a.otherthing from mytable a
inner join otherdb.dbo.mytable a2 on a.fkey_id=a2.fkey_id
OR
select a.* from mytable a
inner join otherdb.dbo.mytable a2 on a.fkey_id=a2.fkey_id
This raises the question of why you're joining to the other table if you don't want data from it. Is this a filtering method? If so, it would be better performance-wise to use EXISTS.
select a.* from mytable a
WHERE EXISTS (
SELECT 1
FROM otherdb.dbo.mytable a2
WHERE a.fkey_id=a2.fkey_id)
select a.fkey_id
, a.otherthing
from mytable a
WHERE EXISTS (SELECT 1
FROM otherdb.dbo.mytable a2
WHERE a.fkey_id=a2.fkey_id)
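Besides performance, the two forms can return different row counts when the other table has several matches per key. The sketch below assumes SQLite through Python's sqlite3 module, with an attached in-memory database named otherdb playing the role of otherdb.dbo; the sample rows are invented.

```python
import sqlite3

# Main in-memory database plus a second attached one standing in for "otherdb".
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS otherdb")
conn.executescript("""
    CREATE TABLE mytable (fkey_id INTEGER, otherthing TEXT);
    CREATE TABLE otherdb.mytable (fkey_id INTEGER, otherthing TEXT);
    INSERT INTO mytable VALUES (1, 'x'), (2, 'y'), (3, 'z');
    -- Key 1 appears twice in the other database, key 2 not at all.
    INSERT INTO otherdb.mytable VALUES (1, 'p'), (1, 'q'), (3, 'r');
""")

# Inner join: one output row per matching pair, so key 1 is repeated.
join_rows = conn.execute("""
    SELECT a.fkey_id, a.otherthing
    FROM mytable a
    INNER JOIN otherdb.mytable a2 ON a.fkey_id = a2.fkey_id
    ORDER BY a.fkey_id
""").fetchall()

# EXISTS: each row of "a" is kept at most once.
exists_rows = conn.execute("""
    SELECT a.fkey_id, a.otherthing
    FROM mytable a
    WHERE EXISTS (SELECT 1
                  FROM otherdb.mytable a2
                  WHERE a.fkey_id = a2.fkey_id)
    ORDER BY a.fkey_id
""").fetchall()

print(join_rows)    # [(1, 'x'), (1, 'x'), (3, 'z')]
print(exists_rows)  # [(1, 'x'), (3, 'z')]
```

If the join key is unique in the other table, both queries return the same rows; EXISTS just states the filtering intent more directly.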

SQL where clause for left outer join

I have a problem with a view I want to create. I have two tables joined in a left outer join, say tableA and tableB, where tableB is left outer joined.
I want to select only those rows from table B where state equals 4, so I add WHERE state = 4 to my query. Now the result set is trimmed quite a bit because all rows without a matching row in tableB are removed from the result (since state isn't 4 for those rows). I also tried WHERE state = 4 OR state IS NULL, doesn't work either (since state technically isn't NULL when there is no state).
So what I need is a WHERE statement which is only evaluated when there actually is a row, does such a thing exist?
If not I see two options: join (SELECT * FROM tableB WHERE state = 4) instead of table B, or create a view with the same WHERE statement and join that instead. What's the best option performance wise?
This is SQL Server 2008 R2 by the way.
You put the conditions in the on clause. Example:
select a.this, b.that
from TableA a
left join TableB b on b.id = a.id and b.State = 4
You can add state = 4 to the join condition.
select *
from T1
left outer join T2
on T1.T1ID = T2.T1ID and
T2.state = 4
Even easier than a subquery is expanding the ON clause, like this:
select *
from TableA a
left join
TableB b
on a.b_id = b.id
and b.state = 4
All rows from TableA will appear, and only those from TableB with state 4.
SQL Server will probably execute the view, the expanded ON clause, and the subquery in exactly the same way. So performance-wise, there should be little difference.
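The difference between filtering in WHERE and filtering in ON is easy to reproduce. This is a sketch only, assuming SQLite (via Python's sqlite3) instead of SQL Server 2008 R2, with hypothetical sample rows and a hypothetical column named that on TableB.

```python
import sqlite3

# In-memory SQLite database; the sample data is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (id INTEGER PRIMARY KEY);
    CREATE TABLE TableB (id INTEGER, state INTEGER, that TEXT);
    INSERT INTO TableA VALUES (1), (2), (3);
    -- id 1 matches with state 4, id 2 matches with state 3, id 3 has no match.
    INSERT INTO TableB VALUES (1, 4, 'keep'), (2, 3, 'other');
""")

base = "SELECT a.id, b.that FROM TableA a LEFT JOIN TableB b ON b.id = a.id"

# Filter after the join: unmatched rows have NULL state and are dropped.
where_rows = conn.execute(base + " WHERE b.state = 4 ORDER BY a.id").fetchall()

# Adding IS NULL keeps id 3, but id 2 (matched, state 3) is still dropped.
where_or_null_rows = conn.execute(
    base + " WHERE b.state = 4 OR b.state IS NULL ORDER BY a.id"
).fetchall()

# Filter inside the join condition: every TableA row survives.
on_rows = conn.execute(base + " AND b.state = 4 ORDER BY a.id").fetchall()

print(where_rows)          # [(1, 'keep')]
print(where_or_null_rows)  # [(1, 'keep'), (3, None)]
print(on_rows)             # [(1, 'keep'), (2, None), (3, None)]
```

The middle result suggests why the OR state IS NULL attempt falls short: rows of TableA whose only match has a different state are joined first and then filtered out.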
Alternative approach: (1) inner join to table B where state equals 4, (2) antijoin to table B to find the rows of table A that have no state-4 match, (3) union the results:
SELECT A1.ID, A1.colA, B1.ColB
FROM tableA AS A1
INNER JOIN tableB AS B1
ON A1.ID = B1.ID
AND B1.state = 4
UNION
SELECT A1.ID, A1.colA, '{{MISSING}}' AS ColB
FROM tableA AS A1
WHERE NOT EXISTS (
SELECT *
FROM tableB AS B1
WHERE A1.ID = B1.ID
AND B1.state = 4
);
Alternatively:
SELECT A1.ID, A1.colA, B1.ColB
FROM tableA AS A1
JOIN tableB AS B1
ON A1.ID = B1.ID
AND B1.state = 4
UNION
SELECT ID, colA, '{{NA}}' AS ColB
FROM tableA
WHERE ID IN (
SELECT ID
FROM tableA
EXCEPT
SELECT ID
FROM tableB
WHERE state = 4
);