SQL JOIN question (yet another one) - sql

Sounds simple but I'm stuck
Table A
col_a  col_b
1      b
2      c
3      z
4      d
33     a

Table B
col_a  col_c
1      c
2      d
3      a
4      e
5      k
6      l
33     b
33     b
I want to JOIN table A with B:
select * from A inner join B on A.col_a = B.col_a
I am expecting to get 5 records as a result.
Expected join result
col_a  col_b  col_c  col_x[n]...
1      b      c      ...
2      c      d      ...
3      z      a      ...
4      d      e      ...
33     a      b      ...

Actual result
col_a  col_b  col_c  col_y[n]...
1      b      c      ...
2      c      d      ...
3      z      a      ...
4      d      e      ...
33     a      b      ...
33     a      b      ...
Why did MySQL match 33 twice? Because there are two rows with 33 in table B.
What I want, though, is just one record per value of col_a. How do I do that?
EDIT: I am updating the table design to include more columns holding non-identical data, because the tables as originally shown raised more questions than they answered. Anyway, the answer to this is to use GROUP BY, but the performance penalty is huge, especially on a table with more than 50 million records (and growing).
However, the best approach for my problem turned out to be a compound statement (using UNION ALL) with one branch for every distinct value in col_a. It ran 5 to 10 times faster!

You have 33 twice in Table B.
Either SELECT DISTINCT or GROUP BY col_a, ...:
SELECT DISTINCT *
FROM A
JOIN B ON ( A.col_a = B.col_a )
;
or
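SELECT A.col_a, A.col_b, B.col_c
FROM A
JOIN B ON ( A.col_a = B.col_a )
-- qualify the columns: a bare col_a is ambiguous because both tables have it
GROUP BY A.col_a, A.col_b, B.col_c
;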
You should clean up that table, though.
Depending on how many times rows are repeated, it might be faster to deduplicate B in a subquery first:
SELECT *
FROM A
JOIN (select distinct * from B) AS C
ON ( A.col_a = C.col_a )
;
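To see the effect on the question's sample data, here is a minimal sketch that runs the queries through Python's built-in sqlite3 module. That is an assumption made for convenience (the question is about MySQL), and the explicit column lists are mine, but the plain join, the DISTINCT join and the deduplicating subquery behave the same way on this data.

```python
import sqlite3

# In-memory database loaded with the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (col_a INTEGER, col_b TEXT);
    CREATE TABLE B (col_a INTEGER, col_c TEXT);
    INSERT INTO A VALUES (1,'b'),(2,'c'),(3,'z'),(4,'d'),(33,'a');
    INSERT INTO B VALUES (1,'c'),(2,'d'),(3,'a'),(4,'e'),
                         (5,'k'),(6,'l'),(33,'b'),(33,'b');
""")

# Plain inner join: the duplicated (33, 'b') row in B matches twice.
plain = conn.execute(
    "SELECT A.col_a, A.col_b, B.col_c "
    "FROM A JOIN B ON A.col_a = B.col_a"
).fetchall()

# DISTINCT collapses the identical result rows after the join.
distinct = conn.execute(
    "SELECT DISTINCT A.col_a, A.col_b, B.col_c "
    "FROM A JOIN B ON A.col_a = B.col_a"
).fetchall()

# Deduplicating B first means the join never sees the repeated row.
deduped = conn.execute(
    "SELECT A.col_a, A.col_b, C.col_c "
    "FROM A JOIN (SELECT DISTINCT * FROM B) AS C ON A.col_a = C.col_a"
).fetchall()

print(len(plain), len(distinct), len(deduped))
```

The plain join returns six rows and the other two return five. Note that both fixes only help when the duplicate rows in B are identical in every selected column.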

The quick & dirty answer is:
select DISTINCT * from A inner join B on A.col_a = B.col_a
But the real question is, why do you have two identical entries in Table B?
Usually when you have to use DISTINCT, it indicates a problem in your data model.

Related

postgresql Join two table without relation

Let's say I have tables like this:
Table A:
column_a
1
2
3
Table B:
column_b
a
b
c
and I want a result like this:
column_a column_b
1 a
1 b
1 c
2 a
2 b
2 c
3 a
3 b
3 c
Is it possible to join two tables without a relation like this in a query?
Yes, you're looking for a cross join:
select a.column_a, b.column_b
from a
cross join b
order by a.column_a, b.column_b;
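As a quick check of that query, here is a small sketch using Python's sqlite3 module instead of PostgreSQL (an assumption made so the example is self-contained; CROSS JOIN produces the same set of rows in both). The table and column names are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (column_a INTEGER);
    CREATE TABLE b (column_b TEXT);
    INSERT INTO a VALUES (1),(2),(3);
    INSERT INTO b VALUES ('a'),('b'),('c');
""")

# Every row of a is paired with every row of b: 3 x 3 = 9 rows.
pairs = conn.execute("""
    SELECT a.column_a, b.column_b
    FROM a
    CROSS JOIN b
    ORDER BY a.column_a, b.column_b
""").fetchall()

for row in pairs:
    print(row)
```

Keep in mind that the result size is the product of the two row counts, so a cross join between large tables grows very quickly.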

How to identify non-existing keys with reference to a table that has all mandatory keys, SQL?

I have the table 'Table01' which contains the keys that should be mandatory:
id
1
2
3
4
And I also have the table 'Table02' which contains the records to be filtered:
id  customer  weight
1   a         100
2   a         300
3   a         200
4   a         45
1   b         20
2   b         100
3   b         17
1   c         80
4   c         90
2   d         30
3   d         30
4   d         50
So I want to identify which mandatory ids are missing from 'Table02', and for each missing id, which 'customer' it is missing for.
The resulting table should look like this:
customer  id
b         4
c         2
c         3
d         1
What I have tried so far is a 'right join'.
proc sql;
create table table03 as
select
b.id
from table02 a
right join table01 b
on a.id=b.id
where a.id is null;
run;
But that query does not identify all the ids that should be mandatory.
I hope someone can help me, thank you very much.
Here is one way:
select cl.customer, a.id
from
Table01 a
cross join
( select customer
from Table02
group by customer
) cl
where not exists ( select 1 from Table02 b
where b.customer = cl.customer
and b.id = a.id
)
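To check the idea against the question's data, here is a minimal sketch run through Python's sqlite3 module (my assumption; the question uses SAS proc sql). It uses the question's table and column names (Table01, Table02, customer, id) and adds an ORDER BY only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table01 (id INTEGER);
    CREATE TABLE Table02 (id INTEGER, customer TEXT, weight INTEGER);
    INSERT INTO Table01 VALUES (1),(2),(3),(4);
    INSERT INTO Table02 VALUES
        (1,'a',100),(2,'a',300),(3,'a',200),(4,'a',45),
        (1,'b',20),(2,'b',100),(3,'b',17),
        (1,'c',80),(4,'c',90),
        (2,'d',30),(3,'d',30),(4,'d',50);
""")

# Build every (customer, mandatory id) pair, then keep only the pairs
# that have no matching row in Table02.
missing = conn.execute("""
    SELECT cl.customer, a.id
    FROM Table01 a
    CROSS JOIN (SELECT DISTINCT customer FROM Table02) cl
    WHERE NOT EXISTS (
        SELECT 1 FROM Table02 b
        WHERE b.customer = cl.customer AND b.id = a.id
    )
    ORDER BY cl.customer, a.id
""").fetchall()

print(missing)
```

The plain right join in the question cannot find these rows because every id from 1 to 4 appears somewhere in Table02; the check has to be made per customer, which is what the cross join sets up.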
You can use an EXCEPT between two sub-selects. The first creates a matrix of all possibilities, and the except table is a selection of the extant customers.
Example:
data ids;
do id = 1 to 4; output; end;
run;
data have;
input id customer $ weight;
datalines;
1 a 100
2 a 300
3 a 200
4 a 45
1 b 20
2 b 100
3 b 17
1 c 80
4 c 90
2 d 30
3 d 30
4 d 50
run;
proc sql;
create table want(label='Customers missing some ids') as
select matrix.*
from
(select distinct have.customer, ids.id from have, ids) as matrix
except
(select customer, id from have)
;
quit;
If you are doing this in SQL Server, you can do something like what @eshirvana posted above, or use a CTE:
;with cte as
(
SELECT t1.id, t2.customer
FROM Table01 t1
cross join (select distinct customer from Table02) t2
)
SELECT a.customer, a.id FROM cte a
LEFT JOIN Table02 b
ON a.id=b.id AND a.customer=b.customer
where b.id is null

SQL query - Cumulatively concatenate strings in consecutive rows

I'm a data analyst, so I write SQL queries to retrieve data from a database. I'm not sure what kind of SQL exactly, just assume the most standard (also not things like 'DECLARE #tbl', and no create functions etc.)
Here is my problem.
Given the following table:
name  number  letter
A     1       a
A     2       b
A     3       c
A     4       d
B     1       a
B     2       b
B     3       c
B     4       d
I want the following result (concatenate letter cumulatively, ordered by number):
name  number  letter  result
A     1       a       a
A     2       b       a,b
A     3       c       a,b,c
A     4       d       a,b,c,d
B     1       a       a
B     2       b       a,b
B     3       c       a,b,c
B     4       d       a,b,c,d
Any help is highly appreciated. Thanks very much.
This answers the original version of the question which was tagged MySQL.
MySQL doesn't support group_concat() as a window function. So a subquery may be your best alternative:
select t.*,
(select group_concat(t2.letter order by t2.number)
from t t2
where t2.name = t.name and t2.number <= t.number
) as letters
from t;

SQL apply different where condition (filter) for each group in table

I have the following SQL table A in my database:
index, group, foo
1 A 2
2 A 2
3 A 0
4 A 1
5 B 2
6 B 1
7 C 1
There are a few more groups, and I need to write a query that filters table A based on filter table B. For each group in table A, its index should be equal to or greater than index_egt from table B for the same group.
If the group is not listed in table B, the group won't be filtered.
index_egt, group
3 A
5 B
Expected result:
index, group, foo
3 A 0
4 A 1
5 B 2
6 B 1
7 C 1
Try this: the A.index>=B.index_egt condition handles groups that are listed in TableB, and B.index_egt IS NULL handles groups that are not listed:
SELECT
A.index,
A.group,
A.foo
FROM TableA AS A
LEFT JOIN TableB AS B ON A.group=B.group
WHERE A.index>=B.index_egt
OR B.index_egt IS NULL
select
a.*
from
A a
left join
B b ON b.group = a.group
where
a.index >= b.index_egt OR b.index_egt IS NULL
I always like this trick with coalesce
SELECT a.*
FROM a_table_with_no_name a
LEFT JOIN b_table_with_no_name b ON b.group = a.group
WHERE a.index >= COALESCE(b.index_egt,a.index)
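Here is a small sketch of the COALESCE variant on the question's data, run through Python's sqlite3 module (an assumption, since the question does not name a database). Because index and group are reserved words, the sketch renames them to idx and grp, and the table names table_a and table_b are placeholders too.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- idx/grp stand in for the question's index/group columns
    CREATE TABLE table_a (idx INTEGER, grp TEXT, foo INTEGER);
    CREATE TABLE table_b (index_egt INTEGER, grp TEXT);
    INSERT INTO table_a VALUES
        (1,'A',2),(2,'A',2),(3,'A',0),(4,'A',1),
        (5,'B',2),(6,'B',1),(7,'C',1);
    INSERT INTO table_b VALUES (3,'A'),(5,'B');
""")

# For groups missing from table_b the LEFT JOIN yields NULL, so COALESCE
# falls back to the row's own idx and the comparison is always true.
filtered = conn.execute("""
    SELECT a.idx, a.grp, a.foo
    FROM table_a a
    LEFT JOIN table_b b ON b.grp = a.grp
    WHERE a.idx >= COALESCE(b.index_egt, a.idx)
    ORDER BY a.idx
""").fetchall()

print(filtered)
```

This matches the expected result in the question: rows 1 and 2 of group A are dropped, and group C passes through unfiltered.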

Conditional Join in Oracle SQL

Consider below 3 tables.
Table a
Col a Col b Col c
1 000 Actual data
1 001 Actual data
2 000 Actual data
3 000 Actual data
3 001 Actual data
3 002 Actual data
Table b
Col a Col b Col d
1 000 Actual data
1 001 Actual data
2 000 Actual data
Table c
Col a Col b Col d
3 000 Actual data
3 001 Actual data
3 002 Actual data
Table a is the parent table, and tables b and c are child tables. Col a and Col b are common to all three tables and are the columns to join on.
The join should work so that table c is searched only if the data is not found in table b.
Desired:
cola col b col c col d
1 000 somedata moredata
1 001 somedata moredata
2 000 somedata moredata
3 000 somedata moredata
3 001 somedata moredata
3 002 somedata moredata
Currently I left join b to a and c to a, but I think every record in a is then looked up in both b and c, which makes the query less cost effective. I want to tune it so that c is searched only if the record does NOT exist in b.
What you really need is a way to "collect" all the rows from table B, and if there are none, then all the rows from table C. Doing the join to A is then standard.
Something like this should work. Make it a subquery and join to your first table.
select col_a, col_b, col_d
from table_b
union all
select col_a, col_b, col_d
from table_c
where (select count(*) from table_b) = 0
If table_b has at least one row, then nothing will be selected from table_c (because the where condition will be false for all rows in table_c). However, if table_b is empty, all the rows from table_c will be selected.
What you need to do is first build a union of tables B and C that keeps every row from B, plus only those rows from C that have no match in B, and then join the result with table A. Thus:
SELECT B.cola, B.colb, B.cold FROM B
UNION ALL
SELECT C.cola, C.colb, C.cold FROM C
WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.cola = C.cola AND B.colb = C.colb)
Now, using this as a derived table, you can join with table A like:
SELECT A.cola, A.colb, A.colc, tmp.cold
FROM A
JOIN
( SELECT B.cola, B.colb, B.cold FROM B
UNION ALL
SELECT C.cola, C.colb, C.cold FROM C
WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.cola = C.cola AND B.colb = C.colb)
) tmp
ON A.cola = tmp.cola
AND A.colb = tmp.colb
Two left joins:
select a.*, b.*, c.*
from a
left join b
on a.cola=b.cola
and a.colb = b.colb
left join c
on a.cola=c.cola
and a.colb=c.colb
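With two left joins, the "prefer b, fall back to c" rule can be expressed with COALESCE in the select list. Below is a minimal sketch run through Python's sqlite3 module rather than Oracle (an assumption for the sake of a runnable example). The cold column name and the 'from b' / 'from c' values are placeholders standing in for the question's "Actual data".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (cola INTEGER, colb TEXT, colc TEXT);
    CREATE TABLE b (cola INTEGER, colb TEXT, cold TEXT);
    CREATE TABLE c (cola INTEGER, colb TEXT, cold TEXT);
    -- placeholder values; the question only says "Actual data"
    INSERT INTO a VALUES (1,'000','a1'),(1,'001','a2'),(2,'000','a3'),
                         (3,'000','a4'),(3,'001','a5'),(3,'002','a6');
    INSERT INTO b VALUES (1,'000','from b'),(1,'001','from b'),(2,'000','from b');
    INSERT INTO c VALUES (3,'000','from c'),(3,'001','from c'),(3,'002','from c');
""")

# COALESCE takes b.cold when the row exists in b, otherwise c.cold.
merged = conn.execute("""
    SELECT a.cola, a.colb, a.colc, COALESCE(b.cold, c.cold) AS cold
    FROM a
    LEFT JOIN b ON a.cola = b.cola AND a.colb = b.colb
    LEFT JOIN c ON a.cola = c.cola AND a.colb = c.colb
    ORDER BY a.cola, a.colb
""").fetchall()

for row in merged:
    print(row)
```

This gives the desired six rows, but it only prefers b over c in the output; it does not by itself stop the database from probing c for rows that were already found in b.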