To use SQL join or not - sql

I have a query involving two tables, Table_A and Table_B.
Table_A : A_ID, B_ID, A3, A4, A5
Table_B : B_ID, B2.
I want the value of column A3 from Table_A and B2 from Table_B.
I tried this query:
select
b.B2, a.A3
from
Table_A a
join
Table_B b on (a.A_ID = b.B_ID)
The query returns nothing. What am I doing wrong?

It would seem that the B_ID column in Table_A is the foreign key to Table_B. In that case, you need to join on that column:
select
b.B2, a.A3
from
Table_A a
inner join
Table_B b on a.B_ID = b.B_ID
Use a.B_ID (not a.A_ID) to establish the join between the two tables.
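To see why the original condition returns nothing, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in database (an assumption; the question does not name its engine). The sample rows are invented for illustration, columns A4 and A5 are left out, and A_ID and B_ID deliberately use different value ranges, so comparing them never matches.

```python
import sqlite3

# In-memory SQLite database; sample values below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_B (B_ID INTEGER PRIMARY KEY, B2 TEXT);
    CREATE TABLE Table_A (
        A_ID INTEGER PRIMARY KEY,
        B_ID INTEGER REFERENCES Table_B(B_ID),
        A3   TEXT
    );
    INSERT INTO Table_B VALUES (10, 'b-ten'), (20, 'b-twenty');
    INSERT INTO Table_A VALUES (1, 10, 'a-one'), (2, 20, 'a-two'), (3, 10, 'a-three');
""")

# Original condition: compares Table_A's own key with Table_B's key.
wrong = conn.execute(
    "SELECT b.B2, a.A3 FROM Table_A a JOIN Table_B b ON a.A_ID = b.B_ID"
).fetchall()

# Corrected condition: compares the foreign key with the key it refers to.
right = conn.execute(
    "SELECT b.B2, a.A3 FROM Table_A a JOIN Table_B b ON a.B_ID = b.B_ID "
    "ORDER BY a.A_ID"
).fetchall()

print(wrong)  # no rows
print(right)  # one row per Table_A record
```

With this data the first query is empty because no A_ID value equals any B_ID value, while the second returns one row for each Table_A record.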

Related

Sparksql to select certain records against 3 tables

I have 3 tables and need to fetch the records as below
Table_A,
Table_B,
Table_C
Select only the Table_A records that are common to both Table_B and Table_C, ignore the ones that are not in both, and make sure the final result has no duplicates.
Approach 1 Tried: inner join Table_A with Table_B and again separate inner join Table_A with Table_C finally did union.
Ab = Table_A.join(Table_B,Table_A["id"] == Table_B["id"], "inner").select(common columns)
Ac = Table_A.join(Table_C,Table_A["id"] == Table_C["id"], "inner").select(common columns)
result = Ab.union(Ac)  # got more duplicates
result = result.dropDuplicates(["id"])
But still I got the duplicates.
Approach 2 Tried with SparkSql:
Table_A A
left outer join Table_B B
on A.id = B.id
left outer join Table_C C
on A.id = C.id
With this approach there are no duplicates, but it returns more records than Table_A, and also the uncommon records.
Any suggestion on the best approach would be appreciated.
In Spark SQL, I would recommend exists:
select a.*
from table_a a
where exists (select 1 from table_b b where b.id = a.id)
and exists (select 1 from table_c c where c.id = a.id)
This does the filtering you want, and will not duplicate records of table_a in the result set, even if there are multiple matches in table_b or table_c.
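The difference between the exists filter and a plain join can be checked with a small sketch. It uses Python's sqlite3 module rather than Spark (an assumption made only to keep the example self-contained), and the rows and the name column are made up; id 1 is repeated in table_b and table_c on purpose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER, name TEXT);
    CREATE TABLE table_b (id INTEGER);
    CREATE TABLE table_c (id INTEGER);
    INSERT INTO table_a VALUES (1, 'one'), (2, 'two'), (3, 'three');
    INSERT INTO table_b VALUES (1), (1), (2);  -- id 1 appears twice
    INSERT INTO table_c VALUES (1), (3), (1);  -- id 1 appears twice
""")

# Semi-join with EXISTS: each table_a row is kept at most once.
exists_rows = conn.execute("""
    SELECT a.id, a.name
    FROM table_a a
    WHERE EXISTS (SELECT 1 FROM table_b b WHERE b.id = a.id)
      AND EXISTS (SELECT 1 FROM table_c c WHERE c.id = a.id)
    ORDER BY a.id
""").fetchall()

# Plain inner joins: each matching combination produces its own row.
join_rows = conn.execute("""
    SELECT a.id, a.name
    FROM table_a a
    JOIN table_b b ON b.id = a.id
    JOIN table_c c ON c.id = a.id
""").fetchall()

print(exists_rows)     # only id 1, once
print(len(join_rows))  # id 1 repeated for every b/c combination
```

Only id 1 is present in both other tables, so the exists version returns it once, while the join version returns it once per pair of matching rows.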

Fetch data from 2 tables, when either not present in 2nd table or has a null value

I am trying to fetch data from 2 tables - table_a and table_b.
Columns of table_a - a1,a2,a3
Columns of table_b - b1, b2, b3
The foreign key mapping is between a1 and b2.
Not all values in table_a may be present in table_b, and rows in table_b may at times have a null value in column b3.
I am trying to fetch all the rows that are either present in table_a but not in table_b, or present in both tables with a null value in column b3 of table_b.
Could you please help me with a SQL query for the same?
I am currently using the below -
select *
from table_a left join
table_b
on (a1 = b2 and (b3 is null or b1 is null));
I am trying to fetch all the rows that are either present in table_a but not in table_b, or present in both tables with a null value in column b3 of table_b.
Consider:
select *
from table_a a
left join table_b b on b.b2 = a.a1
where b.b3 is null
Rationale: the left join combined with the where condition b.b3 is null covers the following situations:
there is no match in table_b (then b.b3 ends up null)
there is a match in table_b but b3 is null in the matching record
If there is a match and b3 is not null, the row is filtered out.
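The cases listed above can be checked with a short sketch, again using Python's sqlite3 module as a stand-in engine (an assumption). The rows are invented so that each case occurs exactly once, and the second query is the attempt from the question with table aliases added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (a1 INTEGER, a2 TEXT, a3 TEXT);
    CREATE TABLE table_b (b1 INTEGER, b2 INTEGER, b3 TEXT);
    INSERT INTO table_a VALUES (1, 'x', 'p'), (2, 'y', 'q'), (3, 'z', 'r');
    -- a1 = 1: match with b3 filled; a1 = 2: match with b3 NULL; a1 = 3: no match.
    INSERT INTO table_b VALUES (100, 1, 'filled'), (101, 2, NULL);
""")

# Null test in WHERE: applied after the left join has been built.
where_rows = conn.execute("""
    SELECT a.a1, b.b1, b.b3
    FROM table_a a
    LEFT JOIN table_b b ON b.b2 = a.a1
    WHERE b.b3 IS NULL
    ORDER BY a.a1
""").fetchall()

# Null test in ON: it only decides which table_b rows attach,
# so every table_a row is still returned.
on_rows = conn.execute("""
    SELECT a.a1, b.b1, b.b3
    FROM table_a a
    LEFT JOIN table_b b ON (a.a1 = b.b2 AND (b.b3 IS NULL OR b.b1 IS NULL))
    ORDER BY a.a1
""").fetchall()

print(where_rows)
print(on_rows)
```

In this sketch the WHERE version drops the row for a1 = 1, whose match has b3 filled in, while the ON version keeps it with nulls in the table_b columns.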

SQL Query Duplicating records

I've got two tables.
Let's call them table_A and table_B.
Table_B contains the ForeignKey of table_A.
Table_A
ID Name
1 A
2 B
3 C
Table_B
ID table_a_fk
1 2
2 3
Now I want to get all the names out of table_a IF table_b does not contain the ID of the record in table_a.
I've tried it with this query:
SELECT a.name
FROM table_a a, table_b b
WHERE a.id != b.table_a_fk
With this query I'm getting the right result, but I get it about 5 times and I don't know why.
I hope someone can explain that to me.
Your query creates a cartesian product between your two tables A and B. It is the cartesian product that generates those duplicate values. Instead, you want to use an anti-join, which is most commonly written in SQL using NOT EXISTS:
SELECT a.name
FROM table_a a
WHERE NOT EXISTS (
SELECT *
FROM table_b b
WHERE a.id = b.table_a_fk
)
Another way to express an anti-join with NOT IN (only if table_b.table_a_fk is NOT NULL):
SELECT a.name
FROM table_a a
WHERE a.id NOT IN (
SELECT b.table_a_fk
FROM table_b b
)
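The NOT NULL caveat for NOT IN matters in practice. The sketch below, run against SQLite through Python's sqlite3 module (an assumed stand-in engine), adds one hypothetical table_b row whose table_a_fk is NULL to the question's data and compares the two forms.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, table_a_fk INTEGER);
    INSERT INTO table_a VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- The third row is an illustrative addition: a NULL foreign key.
    INSERT INTO table_b VALUES (1, 2), (2, 3), (3, NULL);
""")

# NOT IN against a list containing NULL never evaluates to true.
not_in_rows = conn.execute("""
    SELECT a.name
    FROM table_a a
    WHERE a.id NOT IN (SELECT b.table_a_fk FROM table_b b)
""").fetchall()

# NOT EXISTS is unaffected by the NULL row.
not_exists_rows = conn.execute("""
    SELECT a.name
    FROM table_a a
    WHERE NOT EXISTS (SELECT * FROM table_b b WHERE a.id = b.table_a_fk)
""").fetchall()

print(not_in_rows)
print(not_exists_rows)
```

Because a comparison with NULL is unknown rather than false, the NOT IN query returns no rows at all here, while NOT EXISTS still returns A.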
Another, less common way to express an anti-join:
SELECT a.name
FROM table_a a
LEFT OUTER JOIN table_b b ON a.id = b.table_a_fk
WHERE b.id IS NULL
Use distinct:
SELECT distinct a.name
FROM table_a a, table_b b
WHERE a.id != b.table_a_fk
or better is...
Select distinct name
from table_a a
Where not exists (Select * from table_b
Where table_a_fk = a.id)
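As a check, the queries from this thread can be run against the question's sample rows. This is a sketch using Python's sqlite3 module as a stand-in engine (an assumption); the table contents are the ones given in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table_b (id INTEGER PRIMARY KEY, table_a_fk INTEGER);
    INSERT INTO table_a VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO table_b VALUES (1, 2), (2, 3);
""")

def names(sql):
    """Run a query and return the sorted list of names it yields."""
    return sorted(row[0] for row in conn.execute(sql))

# Comma join from the question: every a row is paired with every b row.
cross = names(
    "SELECT a.name FROM table_a a, table_b b WHERE a.id != b.table_a_fk")

# DISTINCT on top of the same comma join.
cross_distinct = names(
    "SELECT DISTINCT a.name FROM table_a a, table_b b WHERE a.id != b.table_a_fk")

# Anti-join written with NOT EXISTS.
anti_exists = names(
    "SELECT a.name FROM table_a a WHERE NOT EXISTS "
    "(SELECT * FROM table_b b WHERE a.id = b.table_a_fk)")

# Anti-join written with LEFT OUTER JOIN ... IS NULL.
anti_left = names(
    "SELECT a.name FROM table_a a LEFT OUTER JOIN table_b b "
    "ON a.id = b.table_a_fk WHERE b.id IS NULL")

print(cross, cross_distinct, anti_exists, anti_left)
```

On this data the comma join returns B and C as well as A, because each of them differs from at least one table_b row, and DISTINCT only removes the repeated A. Both anti-join forms return just A.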

Restrict many - many results in SQL join

The SQL below contains some DDL and a simple query.
The result I am getting is
a1|b1|c1
a1|b2|c3
a3|b3|c2
a3|b3|c3
a3|b3|c4
a3|b3|c5
a3|b5|c6
a3|b5|c7
The result I want is
a1 |b1 |c1
a1 |b2 |c3
a3 |b3 |c2
null |null |c4
null |null |c5
a3 |b5 |c6
null |null |c7
I tried using MAX, MIN, rownums and what not. I am at my wit's end. I am including only the base query I started with and not all the options I tried because they don't work at all. Any help is appreciated!
BEGIN TRANSACTION;
drop table if exists table_A;
drop table if exists table_B;
drop table if exists table_C;
/* Create the three tables */
CREATE TABLE table_A(a_Id text PRIMARY KEY, val_a text);
CREATE TABLE table_B(a_Id text, b_Id text, val_b text);
CREATE TABLE table_C(b_Id text, c_Id text, val_c text);
/* Insert a few records into each table */
INSERT INTO table_A VALUES('a1','va1');
INSERT INTO table_A VALUES('a2','va2');
INSERT INTO table_A VALUES('a3','va3');
INSERT INTO table_B VALUES('a1', 'b1','vb1');
INSERT INTO table_B VALUES('a1', 'b2','vb2');
INSERT INTO table_B VALUES('a3', 'b3','vb31');
INSERT INTO table_B VALUES('a2', 'b4','vb4');
INSERT INTO table_B VALUES('a3', 'b5','vb31');
INSERT INTO table_C VALUES('b1', 'c1','vc1');
INSERT INTO table_C VALUES('b3', 'c2','vc2');
INSERT INTO table_C VALUES('b3', 'c3','vc3');
INSERT INTO table_C VALUES('b2', 'c3','vc3');
INSERT INTO table_C VALUES('b3', 'c4','vc2');
INSERT INTO table_C VALUES('b3', 'c5','vc3');
INSERT INTO table_C VALUES('b5', 'c6','vc3');
INSERT INTO table_C VALUES('b5', 'c7','vc3');
COMMIT;
select
a.a_Id, b.b_Id, c.c_Id
from
table_A as a
join
table_B as b
on a.a_Id = b.a_Id
join
table_C as c
on b.b_Id = c.b_Id;
Something like this should work (I have tested it on PostgreSQL; it should work on Oracle too):
SELECT
case when row_number = 1 then a_id end as a_id,
case when row_number = 1 then b_id end as b_id,
c_id
FROM (
SELECT
a.a_Id,
b.b_Id,
c.c_Id,
row_number() OVER (partition by a.a_id, b.b_id order by c.c_id) as row_number, --for a_id, b_id
row_number() OVER (partition by c.c_id order by c.c_id) as row_number2 --to avoid c_id duplicates
FROM
table_A a
join
table_B b on a.a_Id = b.a_Id
join table_C c on b.b_Id = c.b_Id
) innerquery
WHERE
row_number2 = 1 --this is to avoid c_id duplicates
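Here is a sketch of the same two-ROW_NUMBER idea run against the question's data with Python's sqlite3 module (an assumption; it needs an SQLite build with window functions, 3.25 or later). Two things differ from the query above and are my own choices: the second window is ordered by a_Id, b_Id so that the choice between the two c3 rows is deterministic, and the outer query has an ORDER BY. The names rn, rn2, shown_a and shown_b are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_A(a_Id text PRIMARY KEY, val_a text);
    CREATE TABLE table_B(a_Id text, b_Id text, val_b text);
    CREATE TABLE table_C(b_Id text, c_Id text, val_c text);
    INSERT INTO table_A VALUES ('a1','va1'), ('a2','va2'), ('a3','va3');
    INSERT INTO table_B VALUES ('a1','b1','vb1'), ('a1','b2','vb2'),
                               ('a3','b3','vb31'), ('a2','b4','vb4'),
                               ('a3','b5','vb31');
    INSERT INTO table_C VALUES ('b1','c1','vc1'), ('b3','c2','vc2'),
                               ('b3','c3','vc3'), ('b2','c3','vc3'),
                               ('b3','c4','vc2'), ('b3','c5','vc3'),
                               ('b5','c6','vc3'), ('b5','c7','vc3');
""")

rows = conn.execute("""
    SELECT CASE WHEN rn = 1 THEN a_id END AS shown_a,  -- first row of each (a, b) group
           CASE WHEN rn = 1 THEN b_id END AS shown_b,
           c_id
    FROM (
        SELECT a.a_Id AS a_id, b.b_Id AS b_id, c.c_Id AS c_id,
               ROW_NUMBER() OVER (PARTITION BY a.a_Id, b.b_Id
                                  ORDER BY c.c_Id) AS rn,
               -- deterministic tiebreak when a c_Id occurs more than once
               ROW_NUMBER() OVER (PARTITION BY c.c_Id
                                  ORDER BY a.a_Id, b.b_Id) AS rn2
        FROM table_A a
        JOIN table_B b ON a.a_Id = b.a_Id
        JOIN table_C c ON b.b_Id = c.b_Id
    ) innerquery
    WHERE rn2 = 1            -- keep each c_Id once
    ORDER BY a_id, b_id, c_id
""").fetchall()

for row in rows:
    print(row)
```

With the tiebreak in place this produces the seven rows listed as the wanted result. Without it, which of the two c3 rows survives is up to the engine.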
First, I'd suggest that this is better handled in your presentation logic. However, it is possible to accomplish with SQL alone.
You can take advantage of Oracle's LAG() function along with CASE to check if the previous row had the same a and b id values.
Here's an example using a common table expression:
with cte as (
select
a.a_Id, b.b_Id, c.c_Id,
lag (a.a_Id,1) over (order by a.a_Id, b.b_Id) prev_a_Id,
lag (b.b_Id,1) over (order by a.a_Id, b.b_Id) prev_b_Id
from table_A a
join table_B b
on a.a_Id = b.a_Id
join table_C c
on b.b_Id = c.b_Id
order by
a.a_id, b.b_id
)
select
case
when prev_a_Id is null or
prev_a_Id <> a_Id or
prev_b_Id <> b_Id
then a_id
end new_a_Id,
case
when prev_a_Id is null or
prev_a_Id <> a_Id or
prev_b_Id <> b_Id
then b_id
end new_b_Id, c_Id
from cte;
select t1.a_id, t1.b_id, table_c.c_id
from table_c
left join
(
select a_Id, b_Id, c_Id
from
(
select a.a_Id as a_id, b.b_Id as b_id, c.c_Id as c_id,
ROW_NUMBER() OVER (PARTITION BY a.a_ID, b.b_id ORDER BY C_ID) as aNum
from table_A as a
join table_B as b on a.a_Id = b.a_Id
join table_C as c on b.b_Id = c.b_Id
) t2
where aNum = 1
) t1 on table_c.c_id = t1.c_id
order by table_c.c_id
fiddle:
http://sqlfiddle.com/#!3/6049b/1
I don't think I fully understood what you are trying to achieve, so please correct me if I'm wrong.
First, you are using an inner join to join the tables (at least that is what a plain join means in SQL Server, and it should be the same in Oracle). That means a row from the first table appears only if it has a corresponding row in the second table, and a row from the second table with no corresponding row in the first table never appears in the results either.
According to the description of the result you want, what you need is an outer join. Which one exactly (left, right, or full outer join) depends on what you are trying to achieve; it looks like you need a left outer join or a full outer join. I'm not quite sure what your exact aim is, because you show what the data should look like for this concrete example rather than describing the general case.
So please have a look at a description of the different SQL join types and choose the one that fits.
One more important remark: text is probably the last type on the list that I would consider for a primary key.
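The inner versus outer join behaviour described above can be seen on a cut-down version of the question's tables. This is a sketch using Python's sqlite3 module (an assumed stand-in engine), keeping only a few of the original rows: a1 has a full chain through table_B and table_C, while a2 reaches b4, which has no table_C rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_A(a_Id text PRIMARY KEY, val_a text);
    CREATE TABLE table_B(a_Id text, b_Id text, val_b text);
    CREATE TABLE table_C(b_Id text, c_Id text, val_c text);
    -- Subset of the question's rows, chosen for illustration.
    INSERT INTO table_A VALUES ('a1','va1'), ('a2','va2');
    INSERT INTO table_B VALUES ('a1','b1','vb1'), ('a2','b4','vb4');
    INSERT INTO table_C VALUES ('b1','c1','vc1');
""")

# Inner joins: a row survives only if every join finds a partner.
inner_rows = conn.execute("""
    SELECT a.a_Id, b.b_Id, c.c_Id
    FROM table_A a
    JOIN table_B b ON a.a_Id = b.a_Id
    JOIN table_C c ON b.b_Id = c.b_Id
    ORDER BY a.a_Id
""").fetchall()

# Left outer joins: rows from the left side are kept, padded with NULL.
left_rows = conn.execute("""
    SELECT a.a_Id, b.b_Id, c.c_Id
    FROM table_A a
    LEFT JOIN table_B b ON a.a_Id = b.a_Id
    LEFT JOIN table_C c ON b.b_Id = c.b_Id
    ORDER BY a.a_Id
""").fetchall()

print(inner_rows)
print(left_rows)
```

The a2/b4 pair disappears from the inner join result and comes back with a NULL c_Id under the left outer joins.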

Trouble with select statement on three table join

I'm trying to write a query that will return a set of columns from three different tables.
The table that links the other two is called Table_A; it contains the keys for the other two tables.
The second table is called Table_B and the last table is called Table_C.
Table_A columns.
| a_ID (primary key) | b_ID (foreign key) | c_ID (foreign key) | ..... |
Table_B columns
|b_ID (primary key)| b1 | b2 | ...... |
Table_C columns
| c_ID (primary key) | c1 | c2 | ...... |
This is my SQL query below. (I'm only concerned with the columns above, although there are more in each table.)
SELECT b.b_ID
, b.b1
, b.b2
, a.a_ID
, c.c1
, c.c2
FROM Table_A AS a
JOIN Table_B AS b ON a.b_ID = b.b_ID
JOIN Table_C AS c ON a.c_ID = c.c_ID
I'm using OpenOffice for my projects, and the error I'm getting is
"Table not found in statement [ SELECT b.b_ID
, b.b1
, b.b2
, a.a_ID
, c.c1
, c.c2
FROM Table_A AS a
JOIN Table_B AS b ON a.b_ID = b.b_ID
JOIN Table_C AS c ON a.c_ID = c.c_ID]"
For some reason, if I change the select statement to get all columns (*), it returns the correct results, but I need to narrow it down to the columns listed in my query.
SELECT *
FROM Table_A AS a
JOIN Table_B AS b ON a.b_ID = b.b_ID
JOIN Table_C AS c ON a.c_ID = c.c_ID
EDIT: I have removed the actual table and column names so that you don't have to understand the story to help with the issue.
Shouldn't your WHERE clause be:
WHERE b.eventStartDate > '2013-10-01'
?
This is a "spot the difference" type of question...