Records difference in three tables - MS Access

I have three tables, Tab1, Tab2 and Tab3, with almost the same structure (in MS Access). Tab2 and Tab3 have exactly the same structure as each other, with a few more columns than Tab1.
The joining keys are:
col1
col2
col3
Basically, the records in Tab1 should tally with the records in Tab2 and Tab3 taken together.
What is an efficient way to find the records that are missing from Tab2 and Tab3 when compared with Tab1?
I'd appreciate your response.

If you only care about the keys, here is a good approach:
select col, sum(isTab1) as numTab1, sum(isTab2) as numTab2, sum(isTab3) as numTab3
from ((select col1 as col, 1 as isTab1, 0 as isTab2, 0 as isTab3 from tab1
) union all
(select col2, 0 as isTab1, 1 as isTab2, 0 as isTab3 from tab2
) union all
(select col3, 0 as isTab1, 0 as isTab2, 1 as isTab3 from tab3
)
) t
group by col
having sum(isTab1)*sum(isTab2)*sum(isTab3) <> 1
This returns each of the key values and tells you which tables they are in and not in, for keys that are not in all three tables. As a bonus, this will also tell you if any of the tables have duplicate keys.
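The query above counts a single key column per table, while the question lists three joining keys. As a minimal sketch of the same UNION ALL counting idea over the composite key (col1, col2, col3), here is a version run through Python's sqlite3 module. SQLite stands in for MS Access here, the sample rows and the `extra` column are invented for illustration, and the HAVING clause is one possible reading of "Tab1 should tally with Tab2 and Tab3 together".

```python
import sqlite3

# In-memory SQLite database as a stand-in for MS Access (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (col1 INTEGER, col2 INTEGER, col3 INTEGER);
    CREATE TABLE tab2 (col1 INTEGER, col2 INTEGER, col3 INTEGER, extra TEXT);
    CREATE TABLE tab3 (col1 INTEGER, col2 INTEGER, col3 INTEGER, extra TEXT);
    -- Hypothetical sample data
    INSERT INTO tab1 VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO tab2 VALUES (1, 1, 1, 'a');
    INSERT INTO tab3 VALUES (2, 2, 2, 'b'), (2, 2, 2, 'c');
""")

# Tag each row with the table it came from, then count per composite key.
# Keep only keys whose tab1 count differs from the tab2 + tab3 count.
rows = conn.execute("""
    SELECT col1, col2, col3,
           SUM(isTab1) AS numTab1,
           SUM(isTab2) AS numTab2,
           SUM(isTab3) AS numTab3
    FROM (
        SELECT col1, col2, col3, 1 AS isTab1, 0 AS isTab2, 0 AS isTab3 FROM tab1
        UNION ALL
        SELECT col1, col2, col3, 0, 1, 0 FROM tab2
        UNION ALL
        SELECT col1, col2, col3, 0, 0, 1 FROM tab3
    ) AS t
    GROUP BY col1, col2, col3
    HAVING SUM(isTab1) <> SUM(isTab2) + SUM(isTab3)
    ORDER BY col1, col2, col3
""").fetchall()

for row in rows:
    print(row)
```

With this sample data, key (2, 2, 2) is reported because tab3 holds it twice, and key (3, 3, 3) because it is in neither tab2 nor tab3.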

Usually you would SELECT FROM tab1, LEFT JOINing it to tab2 and tab3 (themselves LEFT JOINed together). That way you get ALL records from tab1, and wherever a record is missing from tab2 or tab3 the columns from that table are null. You can check for those nulls in the WHERE clause.
So the query would look similar to this one (please note the brackets; MS Access requires them):
SELECT * FROM
tab1 LEFT JOIN (tab2 LEFT JOIN tab3 ON tab2.col1 = tab3.col1 AND tab2.col2 = tab3.col2)
ON tab1.col1 = tab2.col1 AND tab1.col2 = tab2.col2
WHERE tab2.col1 Is Null;
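As a rough illustration of the LEFT JOIN plus null check, the sketch below joins tab1 to tab2 and to tab3 separately on all three keys and keeps the tab1 rows found in neither. It runs on SQLite through Python's sqlite3 module, which is an assumption made for runnability; MS Access would additionally need the joins wrapped in brackets. The sample rows and the `extra` column are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MS Access (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (col1 INTEGER, col2 INTEGER, col3 INTEGER);
    CREATE TABLE tab2 (col1 INTEGER, col2 INTEGER, col3 INTEGER, extra TEXT);
    CREATE TABLE tab3 (col1 INTEGER, col2 INTEGER, col3 INTEGER, extra TEXT);
    -- Hypothetical sample data
    INSERT INTO tab1 VALUES (1, 1, 1), (2, 2, 2), (3, 3, 3);
    INSERT INTO tab2 VALUES (1, 1, 1, 'a');
    INSERT INTO tab3 VALUES (2, 2, 2, 'b');
""")

# Every tab1 row survives both LEFT JOINs; unmatched sides come back as NULL.
# A row is "missing" when neither tab2 nor tab3 supplied a match.
missing = conn.execute("""
    SELECT tab1.col1, tab1.col2, tab1.col3
    FROM tab1
    LEFT JOIN tab2
           ON tab1.col1 = tab2.col1
          AND tab1.col2 = tab2.col2
          AND tab1.col3 = tab2.col3
    LEFT JOIN tab3
           ON tab1.col1 = tab3.col1
          AND tab1.col2 = tab3.col2
          AND tab1.col3 = tab3.col3
    WHERE tab2.col1 IS NULL
      AND tab3.col1 IS NULL
""").fetchall()

print(missing)
```

Only (3, 3, 3) is returned here, since the other two tab1 rows each have a match in tab2 or tab3.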

Related

Update SQL query set

UPDATE tab1
SET col = 1
FROM tab1
LEFT JOIN tab2 ON tab2.ID = tab1.ID
WHERE tab2.ID IS NULL
Where do I put the ELSE col = 0 in this query?

UPDATE tab1
SET col = CASE WHEN tab2.ID IS NULL THEN 1 ELSE 0 END
FROM tab1
LEFT JOIN tab2 ON tab2.ID = tab1.ID
I assume you want col to be 1 when tab2.ID is NULL and 0 when it is not. So you need to do two things: use a CASE expression, and remove your WHERE clause so that the update is no longer limited to the tab1 rows that have no match in tab2.
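The UPDATE ... FROM syntax above is dialect-specific. A more portable way to get the same 1-or-0 result, sketched here as an alternative and not as the answer's own method, is a CASE expression around a correlated EXISTS. The example runs on SQLite via Python's sqlite3 module, and the sample IDs are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (ID INTEGER PRIMARY KEY, col INTEGER);
    CREATE TABLE tab2 (ID INTEGER PRIMARY KEY);
    -- Hypothetical sample data: ID 2 has no partner in tab2
    INSERT INTO tab1 (ID, col) VALUES (1, NULL), (2, NULL), (3, NULL);
    INSERT INTO tab2 (ID) VALUES (1), (3);
""")

# No WHERE clause: every tab1 row is updated.
# col becomes 0 when a matching tab2 row exists, otherwise 1.
conn.execute("""
    UPDATE tab1
    SET col = CASE
                  WHEN EXISTS (SELECT 1 FROM tab2 WHERE tab2.ID = tab1.ID) THEN 0
                  ELSE 1
              END
""")

result = conn.execute("SELECT ID, col FROM tab1 ORDER BY ID").fetchall()
print(result)
```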

SQL code performance

I have a SQL query which takes a lot of time to execute.
It goes like this
select
columns
from
tab1
where
tab1.id in (select col from tab2 where conditions) --32000 rows
or
tab1.id in (select col from tab3 where conditions) ---14000 rows
or
tab1.id in (select col from tab4 where conditions) --6000 rows
Is there any way I can increase the performance here?
I've tried using EXISTS() too but that did not help.

Oracle should be pretty good at optimizing queries that use IN with a subquery. Your best bet is adding indexes. However, your query is not detailed enough to suggest particular indexes; you need to be explicit about the WHERE clause.

option1:
select
columns
from
tab1
where
tab1.id in (select col from tab2 where conditions --32000 rows
union all
select col from tab3 where conditions ---14000 rows
union all
select col from tab4 where conditions --6000 rows
);
option2:
select
columns
from
tab1
inner join (select distinct col
from (select col from tab2 where conditions --32000 rows
union all
select col from tab3 where conditions ---14000 rows
union all
select col from tab4 where conditions --6000 rows
)
) x
on tab1.id = x.col;
option3:
select
columns
from
tab1
where
exists (select col from tab2 where conditions --??? rows
and col = tab1.id
union all
select col from tab3 where conditions ---??? rows
and col = tab1.id
union all
select col from tab4 where conditions --??? rows
and col = tab1.id
);
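Before comparing timings, it helps to confirm that the rewrites return the same rows as the original OR-of-INs query. The sketch below checks that on a tiny SQLite database through Python's sqlite3 module. The `ok = 1` filter is a hypothetical stand-in for the elided "conditions", and the data is invented. It says nothing about Oracle performance, which would have to be measured against the real tables and execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE tab2 (col INTEGER, ok INTEGER);
    CREATE TABLE tab3 (col INTEGER, ok INTEGER);
    CREATE TABLE tab4 (col INTEGER, ok INTEGER);
    -- Hypothetical sample data
    INSERT INTO tab1 VALUES (1, 'a'), (2, 'b'), (3, 'c'), (4, 'd'), (5, 'e');
    INSERT INTO tab2 VALUES (1, 1), (2, 0);
    INSERT INTO tab3 VALUES (3, 1), (1, 1);
    INSERT INTO tab4 VALUES (4, 1), (4, 1);
""")

# The three filtered lookups combined into one set of candidate ids.
union_sql = """
    SELECT col FROM tab2 WHERE ok = 1
    UNION ALL
    SELECT col FROM tab3 WHERE ok = 1
    UNION ALL
    SELECT col FROM tab4 WHERE ok = 1
"""

# Original shape: three IN subqueries joined by OR.
with_or = conn.execute("""
    SELECT id FROM tab1
    WHERE id IN (SELECT col FROM tab2 WHERE ok = 1)
       OR id IN (SELECT col FROM tab3 WHERE ok = 1)
       OR id IN (SELECT col FROM tab4 WHERE ok = 1)
    ORDER BY id
""").fetchall()

# Option 1 shape: a single IN over the UNION ALL.
with_union = conn.execute(
    f"SELECT id FROM tab1 WHERE id IN ({union_sql}) ORDER BY id"
).fetchall()

# Option 2 shape: inner join against the de-duplicated union.
with_join = conn.execute(
    f"""SELECT tab1.id FROM tab1
        JOIN (SELECT DISTINCT col FROM ({union_sql})) AS x ON tab1.id = x.col
        ORDER BY tab1.id"""
).fetchall()

print(with_or, with_union, with_join)
```

The DISTINCT in the join variant matters: without it, ids that appear in more than one lookup table (or twice in one) would be returned several times.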

Missing data from Left JOIN

I have two tables which I am left joining together.
SELECT Col3, tab1.Col1, tab1.Col2 FROM
(SELECT Col1,Col2
FROM Table1
GROUP BY Col1,Col2) tab1
LEFT JOIN
(SELECT Col3, Col1, Col2
FROM Table2
GROUP BY Col3, Col1, Col2) tab2
ON tab2.Col2 = tab1.Col2 AND tab2.Col1 = tab1.Col1
At the moment, for the rows in Table1 which do not exist in Table2, I get a row where Col3 is null. As I am grouping data based on Col3, it would be good if I could somehow get the value of Col3 instead of null.
Is this possible?
I am trying to return every possible combination of Col1 and Col2 for each value of Col3. The problem is that when a Col3 value has no row for a particular combination of Col1 and Col2, I get null for Col3.

Assuming Col3 is some kind of category, and is a primary key of a category table, you might do this:
select Category.Col3,
tab1.Col1,
tab1.Col2,
sum (tab2.YourAggregate) SumOfSomething
-- Take all categories
from Category
-- Cartesian product with Table1
cross join Table1 tab1
-- Find matching records in Table2, if they exist
left join Table2 tab2
on Category.Col3 = Tab2.Col3
and Tab1.Col1 = Tab2.Col1
and Tab1.Col2 = Tab2.Col2
group by Category.Col3,
tab1.Col1,
tab1.Col2
The cross join produces the Cartesian product of the tables involved, so every Col3 value is paired with every Table1 row, including combinations that are not found in Table2.
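Here is a small runnable sketch of the cross join approach, using SQLite through Python's sqlite3 module. The Category table is the answer's assumption; the `amount` column (standing in for YourAggregate) and all sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Category (Col3 TEXT PRIMARY KEY);
    CREATE TABLE Table1 (Col1 INTEGER, Col2 INTEGER);
    CREATE TABLE Table2 (Col3 TEXT, Col1 INTEGER, Col2 INTEGER, amount INTEGER);
    -- Hypothetical sample data
    INSERT INTO Category VALUES ('X'), ('Y');
    INSERT INTO Table1 VALUES (1, 1), (2, 2);
    INSERT INTO Table2 VALUES ('X', 1, 1, 10), ('X', 1, 1, 5), ('Y', 2, 2, 7);
""")

# Pair every category with every distinct (Col1, Col2) combination first,
# then attach matching Table2 rows where they exist.
rows = conn.execute("""
    SELECT c.Col3, t1.Col1, t1.Col2, SUM(t2.amount) AS total
    FROM Category AS c
    CROSS JOIN (SELECT DISTINCT Col1, Col2 FROM Table1) AS t1
    LEFT JOIN Table2 AS t2
           ON t2.Col3 = c.Col3
          AND t2.Col1 = t1.Col1
          AND t2.Col2 = t1.Col2
    GROUP BY c.Col3, t1.Col1, t1.Col2
    ORDER BY c.Col3, t1.Col1, t1.Col2
""").fetchall()

for row in rows:
    print(row)
```

Col3 is always populated because it comes from Category; only the aggregate is null (None in Python) for combinations that have no Table2 rows.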

A LEFT JOIN will produce nulls in the right table where no match could be found for Col1 and Col2 in the left table. This is the expected behavior.
If you need help writing a different query, you'll have to post your data structures and some sample data to play with.

Just switch your tables (or use right join):
SELECT tab2.Col3, tab2.Col1, tab2.Col2 FROM
(SELECT distinct Col3, Col1, Col2 FROM Table2) tab2
LEFT JOIN
(SELECT distinct Col1,Col2 FROM Table1) tab1
ON tab2.Col2 = tab1.Col2 AND tab2.Col1 = tab1.Col1

Simplifying a query with correlated subquery to simple joins

I need help simplifying the query below.
I was able to check for a count of 0 without using GROUP BY / HAVING COUNT, but only by using a correlated subquery.
Now I've been asked to rewrite the query using simple joins.
I tried merging it into a single query, but the output differs.
Could you please suggest another way of simplifying this query, which checks for a count of 0?
select distinct tab1.col1
from tab1
where tab1.col2 = 'A'
And 0 = (select count(tab2.col1)
from tab2
where tab2.col2 = 'B'
and tab2.col1 = tab1.col1)

This sort of thing would normally be written as a NOT EXISTS
SELECT distinct tab1.col1
FROM tab1
WHERE tab1.col2 = 'A'
AND NOT EXISTS(
SELECT 1
FROM tab2
WHERE tab2.col2 = 'B'
AND tab2.col1 = tab1.col1 )
However you could also write
SELECT tab1.col1, count(tab2.col1)
FROM (SELECT * FROM tab1 WHERE col2 = 'A') tab1,
(SELECT * FROM tab2 WHERE col2 = 'B') tab2
WHERE tab1.col1 = tab2.col1(+)
GROUP BY tab1.col1
HAVING count(tab2.col1) = 0

Try some of these.
If col1 is declared as not null, the first two queries have the same execution plan (anti-joins). The second alternative is my personal advice, since it matches your requirements the best.
-- Non-correlated subquery
select distinct col1
from tab1
where col2 = 'A'
and col1 not in(select col1
from tab2
where col2 = 'B');
-- Correlated subquery
select distinct col1
from tab1
where col2 = 'A'
and not exists(select 'x'
from tab2
where tab2.col2 = 'B'
and tab2.col1 = tab1.col1);
-- Using join
select distinct tab1.col1
from tab1
left join tab2 on(tab2.col2 = 'B' and tab2.col1 = tab1.col1)
where tab1.col2 = 'A'
and tab2.col1 is null;
-- Using aggregation
select tab1.col1
from tab1
left join tab2 on(tab2.col2 = 'B' and tab2.col1 = tab1.col1)
where tab1.col2 = 'A'
group
by tab1.col1
having count(tab2.col2) = 0;
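The note about col1 being declared NOT NULL matters for results as well as plans: NOT IN returns no rows once the subquery yields a NULL, while the other three forms are unaffected. The sketch below shows this on SQLite via Python's sqlite3 module, with invented sample data; it illustrates standard SQL three-valued logic and makes no claim about Oracle execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tab1 (col1 INTEGER, col2 TEXT);
    CREATE TABLE tab2 (col1 INTEGER, col2 TEXT);
    -- Hypothetical sample data
    INSERT INTO tab1 VALUES (1, 'A'), (2, 'A'), (3, 'C');
    INSERT INTO tab2 VALUES (1, 'B'), (2, 'X');
""")

queries = {
    "not_in": """
        SELECT DISTINCT col1 FROM tab1
        WHERE col2 = 'A'
          AND col1 NOT IN (SELECT col1 FROM tab2 WHERE col2 = 'B')""",
    "not_exists": """
        SELECT DISTINCT col1 FROM tab1
        WHERE col2 = 'A'
          AND NOT EXISTS (SELECT 1 FROM tab2
                          WHERE tab2.col2 = 'B' AND tab2.col1 = tab1.col1)""",
    "left_join": """
        SELECT DISTINCT tab1.col1 FROM tab1
        LEFT JOIN tab2 ON tab2.col2 = 'B' AND tab2.col1 = tab1.col1
        WHERE tab1.col2 = 'A' AND tab2.col1 IS NULL""",
    "aggregation": """
        SELECT tab1.col1 FROM tab1
        LEFT JOIN tab2 ON tab2.col2 = 'B' AND tab2.col1 = tab1.col1
        WHERE tab1.col2 = 'A'
        GROUP BY tab1.col1
        HAVING COUNT(tab2.col1) = 0""",
}


def run_all():
    """Run every variant and return its rows keyed by variant name."""
    return {name: conn.execute(sql).fetchall() for name, sql in queries.items()}


before = run_all()  # all four variants agree

# A NULL key in tab2 makes "col1 NOT IN (...)" evaluate to unknown for every row.
conn.execute("INSERT INTO tab2 VALUES (NULL, 'B')")
after = run_all()

print(before)
print(after)
```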

Fetching the first occurrence from the result

I have an Oracle SQL query:
select
distinct
tab1.col1,
tab2.col1
from
table1 tab1
join table2 tab2 on tab1.col1 = tab2.col1
Here I get the result I expect, with distinct values.
For example, the result rows are:
1 2
3 4
5 6
Now I want to add one more join, to table3, so my SQL is:
select
distinct
tab1.col1,
tab2.col1,
tab3.col1
from
table1 tab1
join table2 tab2 on tab1.col1 = tab2.col1
join table3 tab3 on tab1.col1 = tab3.col1
The problem is that table3 returns more than one value per key,
which results in duplicate rows caused by table3.
For example, the result rows are:
1 2 4
1 2 5
3 4 1
3 4 2
5 6 3
(Notice that rows 1 and 2 are duplicates, as are rows 3 and 4.)
What I am trying to do is, for the join to table3, fetch only the
first occurrence of the row.

This should work for you:
select
tab1.col1,
tab2.col1,
MIN(tab3.col1)
from
table1 tab1
join table2 tab2 on tab1.col1 = tab2.col1
join table3 tab3 on tab1.col1 = tab3.col1
GROUP BY tab1.col1, tab2.col1
Edit: some thoughts.
I am assuming the third column is an ever-increasing integer, in which case this works. If there is a date column, you can aggregate on it instead to pin down the "first occurrence" of your row more accurately.
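A minimal sketch of the MIN plus GROUP BY idea follows, run on SQLite through Python's sqlite3 module instead of Oracle. The `id`, `a`, `b` and `c` columns are hypothetical, chosen so the unaggregated join reproduces the five sample rows from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical schema: id is the join key; a, b, c are the displayed values
    CREATE TABLE table1 (id INTEGER, a INTEGER);
    CREATE TABLE table2 (id INTEGER, b INTEGER);
    CREATE TABLE table3 (id INTEGER, c INTEGER);
    INSERT INTO table1 VALUES (1, 1), (2, 3), (3, 5);
    INSERT INTO table2 VALUES (1, 2), (2, 4), (3, 6);
    INSERT INTO table3 VALUES (1, 4), (1, 5), (2, 1), (2, 2), (3, 3);
""")

# Collapse the multiple table3 matches per (a, b) pair to the smallest c.
# "First" here means "minimum value", which is an assumption about the data.
rows = conn.execute("""
    SELECT t1.a, t2.b, MIN(t3.c) AS first_c
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.id = t2.id
    JOIN table3 AS t3 ON t1.id = t3.id
    GROUP BY t1.a, t2.b
    ORDER BY t1.a, t2.b
""").fetchall()

for row in rows:
    print(row)
```

Because the query already groups by the first two columns, no DISTINCT is needed on top of the GROUP BY.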

select
distinct
tab1.col1,
tab2.col1,
t3.col1
from
table1 tab1
join table2 tab2 on tab1.col1 = tab2.col1
join (select distinct col1 from table3) t3 on tab1.col1 = t3.col1
