SQL join with distinct column on one table - sql

Maybe I'm searching with the wrong words, because I can't find the answer elsewhere: I need to join two tables but make sure the ID from one of the tables is distinct. Something like the below:
SELECT B.COLUMN_A, B.COLUMN_B, B.COLUMN_C
FROM TABLE1 A
JOIN TABLE2 B
ON (Distinct) A.COLUMN_A = B.COLUMN_A;
The value A.COLUMN_A from TABLE1 needs to be DISTINCT.
I've tried the below but that didn't work:
SELECT B.COLUMN_A, B.COLUMN_B, B.COLUMN_C
FROM TABLE1 A
JOIN (SELECT DISTINCT COLUMN_A FROM TABLE2) B
ON A.COLUMN_A = B.COLUMN_A;
I keep getting an ORA-00904: invalid identifier error on B.COLUMN_C. If I try to use ) AS B then I get an ORA-00905: missing keyword error.

If you don't care which of the other values you get, use GROUP BY:
SELECT b.column_a, b.column_b, b.column_c
FROM table1 a
JOIN (
SELECT column_a, max(column_b) as column_b, max(column_c) as column_c
FROM table2
GROUP BY column_a
) b ON a.column_a = b.column_a
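
A minimal sketch of what the GROUP BY approach returns, run against SQLite through Python's sqlite3 module rather than Oracle. The sample rows are made up for illustration. It shows one caveat: MAX is applied per column, so the combined row may not exist in table2 at all.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column_a INTEGER);
    CREATE TABLE table2 (column_a INTEGER, column_b TEXT, column_c TEXT);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (1, 'x', 'q'), (1, 'a', 'z'), (2, 'm', 'n');
""")

group_by_rows = conn.execute("""
    SELECT b.column_a, b.column_b, b.column_c
    FROM table1 a
    JOIN (
        SELECT column_a, MAX(column_b) AS column_b, MAX(column_c) AS column_c
        FROM table2
        GROUP BY column_a
    ) b ON a.column_a = b.column_a
    ORDER BY b.column_a
""").fetchall()

# For column_a = 1 the result mixes 'x' (from one row) with 'z' (from another).
print(group_by_rows)
```

If the values must come from the same source row, a ROW_NUMBER-based pick is the safer choice.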

Use a ROW_NUMBER to get a single row per COLUMN_A:
SELECT *
FROM table1 A
JOIN
(
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY COLUMN_A ORDER BY COLUMN_A) AS rn
FROM table2 t
) B
ON A.column_a = B.column_a
AND B.rn = 1
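
Here is a small sketch of the ROW_NUMBER approach, run on SQLite (3.25 or later, for window functions) via Python's sqlite3 instead of Oracle. The sample rows and the choice to order by column_b are assumptions made so that the picked row is deterministic.

```python
import sqlite3

# In-memory database with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column_a INTEGER);
    CREATE TABLE table2 (column_a INTEGER, column_b TEXT, column_c TEXT);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES (1, 'x', 'q'), (1, 'a', 'z'), (2, 'm', 'n');
""")

row_number_rows = conn.execute("""
    SELECT b.column_a, b.column_b, b.column_c
    FROM table1 a
    JOIN (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY column_a ORDER BY column_b) AS rn
        FROM table2 t
    ) b ON a.column_a = b.column_a AND b.rn = 1
    ORDER BY b.column_a
""").fetchall()

# Each column_a appears once, and every returned row exists in table2 as-is.
print(row_number_rows)
```

Ordering by the partition column itself leaves the choice of row up to the database, so an explicit tie-breaker in ORDER BY is usually worth adding.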

Maybe you need something like this:
select * from
(
select column_a,column_b,column_c
from
(
select column_a,column_b,column_c, count(1) over (partition by column_a) as num
from tableB
)
where num = 1
)tB
inner join tableA
using (column_a)
The double nesting is not necessary, but I hope it makes the query more readable

If you need col_a, col_b, col_c, want to ensure col_a never repeats, and don't care which col_b and col_c values are returned, then:
SELECT col_a, col_b, col_c
FROM table2
WHERE rowid IN ( SELECT min(A.rowid)
FROM table2 A, table1 B
WHERE B.col_a = A.col_a
GROUP BY A.col_a )
In above you choose one distinct row of Table2 that is also present in Table1. Then using that row's id you select all three columns.

You are not selecting any of the columns from TABLE1, so your join to (distinct) TABLE1 records is really just a semi-join, which is most easily expressed as:
SELECT B.COLUMN_A, B.COLUMN_B, B.COLUMN_C
FROM TABLE2 B
WHERE EXISTS ( SELECT 'at least one row in table1'
FROM TABLE1 A
WHERE A.COLUMN_A = B.COLUMN_A );
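
To illustrate the difference, here is a rough sketch on SQLite via Python's sqlite3 (the sample data is invented). A plain join repeats TABLE2 rows when TABLE1 holds duplicate COLUMN_A values, while the EXISTS form returns each TABLE2 row at most once.

```python
import sqlite3

# In-memory database; table1 deliberately contains a duplicate key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column_a INTEGER);
    CREATE TABLE table2 (column_a INTEGER, column_b TEXT, column_c TEXT);
    INSERT INTO table1 VALUES (1), (1), (3);
    INSERT INTO table2 VALUES (1, 'x', 'q'), (2, 'm', 'n'), (3, 'c', 'd');
""")

# Plain inner join: the row for column_a = 1 comes back twice.
join_rows = conn.execute("""
    SELECT b.column_a, b.column_b, b.column_c
    FROM table1 a
    JOIN table2 b ON a.column_a = b.column_a
    ORDER BY b.column_a
""").fetchall()

# Semi-join: each table2 row is returned once if any match exists.
exists_rows = conn.execute("""
    SELECT b.column_a, b.column_b, b.column_c
    FROM table2 b
    WHERE EXISTS (SELECT 1 FROM table1 a WHERE a.column_a = b.column_a)
    ORDER BY b.column_a
""").fetchall()

print(join_rows)
print(exists_rows)
```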

Related

Get exclusive users in each table

I have 4 tables as shown below
For each table I want the count of users that are present exclusively in that table (not present in the other tables). The result should look something like this
I have one way of getting desired result as shown below:
First Column:
SELECT COUNT(DISTINCT A.id) table1_only
FROM table1 A
LEFT JOIN (SELECT DISTINCT id
FROM table2
UNION
SELECT DISTINCT id
FROM table3
UNION
SELECT DISTINCT id
FROM table4) B
ON A.id = B.id
WHERE B.id IS NULL
Second Column:
SELECT COUNT(DISTINCT A.id) table2_only
FROM table2 A
LEFT JOIN (SELECT DISTINCT id
FROM table1
UNION
SELECT DISTINCT id
FROM table3
UNION
SELECT DISTINCT id
FROM table4) B
ON A.id = B.id
WHERE B.id IS NULL
Third Column:
SELECT COUNT(DISTINCT A.id) table3_only
FROM table3 A
LEFT JOIN (SELECT DISTINCT id
FROM table1
UNION
SELECT DISTINCT id
FROM table2
UNION
SELECT DISTINCT id
FROM table4) B
ON A.id = B.id
WHERE B.id IS NULL
Fourth Column:
SELECT COUNT(DISTINCT A.id) table4_only
FROM table4 A
LEFT JOIN (SELECT DISTINCT id
FROM table1
UNION
SELECT DISTINCT id
FROM table2
UNION
SELECT DISTINCT id
FROM table3) B
ON A.id = B.id
WHERE B.id IS NULL
But I wanted to know if there is any efficient and scalable way to get same result. Just for 4 tables the amount of code is too much.
Any ways of optimizing this task will be really helpful.
Sample fiddle. (This fiddle is for mysql, I am looking for a generic SQL based approach than any db specific approach)
P.S.:
There is no requirement that the result be column-wise. It can be row-wise as well, as shown below:
I would approach this by combining the data from all tables. Then aggregate and filter:
select which, count(*) as num_in_table_only
from (select id, min(which) as which, count(*) as cnt
from ((select id, 1 as which from table1) union all
(select id, 2 as which from table2) union all
(select id, 3 as which from table3) union all
(select id, 4 as which from table4)
) t
group by id
) i
where cnt = 1
group by which
Note: In your sample data, the ids are unique in each table. This solution assumes that is true, but can easily be tweaked to handle duplicates within a table.
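
One possible version of that tweak, sketched on SQLite via Python's sqlite3 with invented ids: counting distinct source tables per id (instead of rows) keeps the result correct when an id repeats inside one table. The alias only_table is a name chosen here for illustration.

```python
import sqlite3

# In-memory database; id 4 is duplicated inside table2 on purpose.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER);
    CREATE TABLE table2 (id INTEGER);
    CREATE TABLE table3 (id INTEGER);
    CREATE TABLE table4 (id INTEGER);
    INSERT INTO table1 VALUES (1), (2), (3);
    INSERT INTO table2 VALUES (2), (4), (4);
    INSERT INTO table3 VALUES (3), (5);
    INSERT INTO table4 VALUES (6), (7), (1);
""")

exclusive_counts = conn.execute("""
    SELECT only_table, COUNT(*) AS num_in_table_only
    FROM (
        SELECT t.id, MIN(t.which) AS only_table
        FROM (
            SELECT id, 1 AS which FROM table1
            UNION ALL SELECT id, 2 FROM table2
            UNION ALL SELECT id, 3 FROM table3
            UNION ALL SELECT id, 4 FROM table4
        ) t
        GROUP BY t.id
        HAVING COUNT(DISTINCT t.which) = 1   -- id seen in exactly one table
    ) i
    GROUP BY only_table
    ORDER BY only_table
""").fetchall()

# Rows are (table number, count of ids found only in that table).
print(exclusive_counts)
```

A table with no exclusive ids (table1 in this sample) simply does not appear in the output. Adding a fifth table only takes one more UNION ALL branch.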

select query respecting conditions

I have a table containing 4 columns (id, val1, val2, val3).
Does anyone know how to select rows where val3 is the same but val1 is different?
For example:
row1: (id1, user1, matheos, cvn)
row2: (id2, user2, matheos, cvn)
row3: (id3, user3, Claudia, bnps)
Then I would return row1 and row2.
Your explanation is not entirely clear, but the following query will find matching rows according to the criteria you specified:
select a.*, b.*
from my_table a
join my_table b on b.val3 = a.val3
and b.val1 <> a.val1
and b.id < a.id
In order to produce the rows separately, you can also do:
select *
from my_table a
where exists (
select null from my_table b where b.val3 = a.val3 and b.val1 <> a.val1
)
Based on your explanation, you can try this:
select distinct t1.* from mytable t1
JOIN mytable t2 ON t1.val3 = t2.val3
and t1.val1 != t2.val1;
Demo: SQL Fiddle
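
As a quick check of the "same val3, different val1" condition, here is a sketch that loads the three example rows from the question into SQLite (via Python's sqlite3) and runs the EXISTS form. The table name my_table follows the first answer.

```python
import sqlite3

# In-memory table holding the three rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (id TEXT, val1 TEXT, val2 TEXT, val3 TEXT);
    INSERT INTO my_table VALUES
        ('id1', 'user1', 'matheos', 'cvn'),
        ('id2', 'user2', 'matheos', 'cvn'),
        ('id3', 'user3', 'Claudia', 'bnps');
""")

# Keep a row if some other row shares its val3 but has a different val1.
matching_ids = [row[0] for row in conn.execute("""
    SELECT a.id
    FROM my_table a
    WHERE EXISTS (
        SELECT 1 FROM my_table b
        WHERE b.val3 = a.val3 AND b.val1 <> a.val1
    )
    ORDER BY a.id
""")]

print(matching_ids)
```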

How do I count three different distinct values and group on an ID in MS-Access?

So I know MS-Access does not allow SELECT COUNT(DISTINCT....) FROM ..., but I am trying to find a more viable alternative to the usual standard of
SELECT COUNT(*) FROM (SELECT DISTINCT Name FROM table1)
My problem is I am trying to do three separate Count functions and group them on ID. If I use the method above, it is giving me the total unique value count for the whole table instead of the total count for only the value of ID. I tried doing
(SELECT COUNT(*) FROM (SELECT DISTINCT Name FROM table1 as T2
WHERE T2.ColumnA = T1.ColumnA)) As MyVal
FROM table1 as T1
but it tells me I need to specify a value for T1.ColumnA.
The SQL query I am trying to accomplish is this:
SELECT ID,
COUNT(DISTINCT ColumnA) as CA,
COUNT(DISTINCT ColumnB) as CB,
COUNT(DISTINCT ColumnC) as CC
FROM table1
GROUP BY ID
Any ideas?
You can use subqueries. Assuming you have a table where each id occurs once:
select (select count(*)
from (select columnA
from table1 t1
where t1.id = t.id
group by columnA
) as a
) as num_a,
(select count(*)
from (select columnB
from table1 t1
where t1.id = t.id
group by columnB
) as b
) as num_b,
(select count(*)
from (select columnC
from table1 t1
where t1.id = t.id
group by columnC
) as c
) as num_c
from <table with ids> as t;
I'm not sure if you'll think this is "viable".
EDIT:
This makes it even more complicated . . . it suggests that MS Access doesn't support correlation clauses more than one level deep (might you consider switching to another database?).
In any case, the brute force way:
select a.id, a.numA, b.numB, c.numC
from ((select id, count(*) as numA
from (select id, columnA
from table1 t1
group by id, columnA
) as a
group by id
) as a inner join
(select id, count(*) as numB
from (select id, columnB
from table1 t1
group by id, columnB
) as b
group by id
) as b
on a.id = b.id
) inner join
(select id, count(*) as numC
from (select id, columnC
from table1 t1
group by id, columnC
) as c
group by id
) as c
on c.id = a.id;
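
To sanity-check the nested GROUP BY idea, the sketch below runs it on SQLite through Python's sqlite3 (not on MS Access, which I have not tested here) and compares it with COUNT(DISTINCT ...), which SQLite does support. The sample rows are invented.

```python
import sqlite3

# In-memory table with hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, columnA TEXT, columnB TEXT, columnC TEXT);
    INSERT INTO table1 VALUES
        (1, 'a', 'x', 'p'), (1, 'a', 'y', 'p'), (1, 'b', 'y', 'p'),
        (2, 'c', 'z', 'q'), (2, 'c', 'z', 'r');
""")

# Emulation: dedupe (id, column) pairs first, then count rows per id.
emulated = conn.execute("""
    SELECT a.id, a.numA, b.numB, c.numC
    FROM (SELECT id, COUNT(*) AS numA
          FROM (SELECT id, columnA FROM table1 GROUP BY id, columnA) AS da
          GROUP BY id) AS a
    JOIN (SELECT id, COUNT(*) AS numB
          FROM (SELECT id, columnB FROM table1 GROUP BY id, columnB) AS db
          GROUP BY id) AS b ON a.id = b.id
    JOIN (SELECT id, COUNT(*) AS numC
          FROM (SELECT id, columnC FROM table1 GROUP BY id, columnC) AS dc
          GROUP BY id) AS c ON c.id = a.id
    ORDER BY a.id
""").fetchall()

# Reference result using COUNT(DISTINCT ...) directly.
reference = conn.execute("""
    SELECT id, COUNT(DISTINCT columnA), COUNT(DISTINCT columnB),
           COUNT(DISTINCT columnC)
    FROM table1 GROUP BY id ORDER BY id
""").fetchall()

print(emulated)
print(reference)
```

One difference to keep in mind: COUNT(DISTINCT col) ignores NULLs, while the GROUP BY emulation counts NULL as its own group, so the two can disagree when a column contains NULLs.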

SQL Server. Delete from Select

I am using SQL Server 2012, and have the following query. Let's call this query A.
SELECT a.col, a.fk
FROM Table1 a
INNER JOIN (
select b.col
from Table1 b
group by b.col
having count(*) > 1)
b on b.col = a.col
I want to delete only the rows returned from query A, specifically rows that match the returned col AND fk
I am thinking of doing the following, but it will only delete rows that match on the col.
delete from Table1
where col in (
SELECT a.col
FROM Table1 a
INNER JOIN (
select b.col
from Table1 b
group by b.col
having count(*) > 1)
b on b.col = a.col)
Use delete from Join syntax
delete t1
from table1 t1
INNER JOIN (SELECT a.col, a.fk
FROM Table1 a
INNER JOIN (
select b.col
from Table1 b
group by b.col
having count(*) > 1)
b on b.col = a.col) t2
ON t1.col = t2.col AND t1.fk = t2.fk
You can combine the col and fk fields into a single value that identifies the wanted rows:
delete from Table1
where cast(col as varchar(50))+'//'+cast(fk as varchar(50)) in (
SELECT cast(a.col as varchar(50))+'//'+cast(a.fk as varchar(50))
FROM Table1 a
INNER JOIN (
select b.col
from Table1 b
group by b.col
having count(*) > 1)
b on b.col = a.col)
You can express Query A like this:
SELECT col, fk
FROM (
SELECT a.col, a.fk, COUNT(*) OVER (PARTITION BY a.col) AS [count]
FROM Table1 a
) counted
WHERE [count] > 1
Which leads to a nice way to do the DELETE using a CTE:
;WITH ToDelete AS (
SELECT a.col, a.fk, COUNT(*) OVER (PARTITION BY a.col) AS [count]
FROM Table1 a
)
DELETE FROM ToDelete
WHERE [count] > 1
This does give the same result as the DELETE statement in your question though.
If you want to delete all but one row with the duplicate col value you can use something like this:
;WITH ToDelete AS (
SELECT a.col, a.fk
, ROW_NUMBER() OVER (PARTITION BY a.col ORDER BY a.fk) AS [occurance]
FROM Table1 a
)
DELETE FROM ToDelete
WHERE [occurance] > 1
The ORDER BY clause will determine which row is kept.
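
For anyone who wants to try the keep-one-row logic outside SQL Server, here is a rough equivalent on SQLite via Python's sqlite3. SQLite cannot delete through a CTE the way SQL Server can, so this sketch swaps in a rowid lookup; the ROW_NUMBER part is the same idea. The sample rows are made up.

```python
import sqlite3

# In-memory table with duplicate col values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (col TEXT, fk INTEGER);
    INSERT INTO Table1 VALUES
        ('a', 1), ('a', 2), ('a', 3), ('b', 1), ('c', 5), ('c', 6);
""")

# Number the rows within each col by fk, then delete everything after the first.
conn.execute("""
    DELETE FROM Table1
    WHERE rowid IN (
        SELECT rid FROM (
            SELECT rowid AS rid,
                   ROW_NUMBER() OVER (PARTITION BY col ORDER BY fk) AS occurrence
            FROM Table1
        )
        WHERE occurrence > 1
    )
""")

remaining = conn.execute(
    "SELECT col, fk FROM Table1 ORDER BY col, fk"
).fetchall()

# The row with the lowest fk survives for each col.
print(remaining)
```

Switching to ORDER BY fk DESC would keep the row with the highest fk instead.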

Joining 2 SQL SELECT result sets into one

I've got 2 select statements, returning data like this:
Select 1
col_a col_b
Select 2
col_a col_c
If I do union, I get something like
col_a col_b
And rows joined. What I need is getting it like this:
col_a col_b col_c
Joined on data in col_a
Use JOIN to join the subqueries and use ON to say where the rows from each subquery must match:
SELECT T1.col_a, T1.col_b, T2.col_c
FROM (SELECT col_a, col_b, ...etc...) AS T1
JOIN (SELECT col_a, col_c, ...etc...) AS T2
ON T1.col_a = T2.col_a
If there are some values of col_a that are in T1 but not in T2, you can use a LEFT OUTER JOIN instead.
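
A small sketch of the difference, using SQLite through Python's sqlite3. The tables tab1 and tab2 and their rows are stand-ins for the two SELECT statements in the question.

```python
import sqlite3

# In-memory tables standing in for the two result sets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (col_a INTEGER, col_b TEXT);
    CREATE TABLE tab2 (col_a INTEGER, col_c TEXT);
    INSERT INTO tab1 VALUES (1, 'b1'), (2, 'b2'), (3, 'b3');
    INSERT INTO tab2 VALUES (1, 'c1'), (3, 'c3'), (4, 'c4');
""")

# Inner join: only col_a values present in both result sets.
inner_rows = conn.execute("""
    SELECT T1.col_a, T1.col_b, T2.col_c
    FROM (SELECT col_a, col_b FROM tab1) AS T1
    JOIN (SELECT col_a, col_c FROM tab2) AS T2
      ON T1.col_a = T2.col_a
    ORDER BY T1.col_a
""").fetchall()

# Left outer join: every T1 row is kept; col_c is NULL where T2 has no match.
left_rows = conn.execute("""
    SELECT T1.col_a, T1.col_b, T2.col_c
    FROM (SELECT col_a, col_b FROM tab1) AS T1
    LEFT OUTER JOIN (SELECT col_a, col_c FROM tab2) AS T2
      ON T1.col_a = T2.col_a
    ORDER BY T1.col_a
""").fetchall()

print(inner_rows)
print(left_rows)
```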
Use a FULL OUTER JOIN:
select
a.col_a,
a.col_b,
b.col_c
from
(select col_a, col_b from tab1) a
full outer join
(select col_a, col_c from tab2) b
on a.col_a = b.col_a
SELECT table1.col_a, table1.col_b, table2.col_c
FROM table1
INNER JOIN table2 ON table1.col_a = table2.col_a