Join or Union All To Combine Records? - sql

I have 2 similar tables that contain campaign names. I know I can do a UNION ALL to combine the tables, but I was wondering if there is a way to do this using some form of join instead. I want to create a table Z with the campaign names from table A plus the campaign names from table B (which are not in A). Can I do this with a join, or is UNION ALL the only way?

UNION is the easier and correct way to do that. Purely as an exercise you can do it with a JOIN, but it is more complex, less readable, and the performance will likely be worse.

SELECT * INTO TABLEZ
FROM
(
-- UNION (rather than UNION ALL) drops rows from TABLEB that already exist in TABLEA
SELECT Column1, Column2, Column3.... FROM TABLEA
UNION
SELECT Column1, Column2, Column3.... FROM TABLEB
) Q

Here is how you would do this with a full outer join:
select distinct coalesce(a.campaign, b.campaign) as campaign
from a full outer join
b
on a.campaign = b.campaign;
The union/union all approach is totally reasonable. I'm just offering this as a join solution that you seem to be alluding to in the question.
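If you want to see the difference for yourself, here is a small sketch using Python's built-in sqlite3 module. The table names a and b, the campaign column, and the sample rows are all made up for illustration. It compares UNION ALL, UNION, and a join-based variant (all of A, plus an anti-join of B against A), which should match UNION as long as neither table contains duplicates of its own.

```python
import sqlite3

# Hypothetical sample data: 'summer' exists in both tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (campaign TEXT);
    CREATE TABLE b (campaign TEXT);
    INSERT INTO a VALUES ('spring'), ('summer');
    INSERT INTO b VALUES ('summer'), ('winter');
""")


def names(sql):
    """Run a query and return its first column as a sorted list."""
    return sorted(row[0] for row in conn.execute(sql))


# UNION ALL keeps the duplicate, UNION removes it.
union_all = names("SELECT campaign FROM a UNION ALL SELECT campaign FROM b")
union = names("SELECT campaign FROM a UNION SELECT campaign FROM b")

# Join-based variant: everything from a, plus rows of b with no match in a.
anti_join = names("""
    SELECT campaign FROM a
    UNION ALL
    SELECT b.campaign
    FROM b LEFT JOIN a ON a.campaign = b.campaign
    WHERE a.campaign IS NULL
""")

print(union_all)
print(union)
print(anti_join)
```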

Related

sql union and join combine

I checked multiple questions but could not find a topic similar to mine.
Please point me to the link if this is a duplicate.
Basically I have 2 tables. I need to join the 2 tables and then use, let's say, store_nr from table_b if it is not null. If it is null, then use id_nr from table_a.
I can achieve this using a union and a join, but it is not efficient as the table is quite big.
I'm looking for a faster solution.
SELECT a.month,
a.id_nr
from table_a a
where a.id_nr not in (select distinct to_char(b.group_id) from table_b)
union
SELECT a.month,
CASE
WHEN b.store_nr is not null
then b.store_nr
ELSE to_number(a.id_nr)
END id_nr
FROM table_a a
join table_b b
on a.id_nr = to_char(b.group_id)
I think you're really after a left outer join of your table_b to your table_a, plus a COALESCE to choose which column to display, something like:
select a.month,
coalesce(b.store_nr, to_number(a.id_nr)) id_nr
from table_a a
left outer join table_b b on a.id_nr = to_char(b.group_id);
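A quick way to sanity-check the left join plus COALESCE idea is the sketch below, run against an in-memory SQLite database from Python. The column types and sample rows are assumptions, and because SQLite has no to_char/to_number, CAST stands in for them here.

```python
import sqlite3

# Assumed schema: id_nr is stored as text, group_id and store_nr as integers.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_a (month TEXT, id_nr TEXT);
    CREATE TABLE table_b (group_id INTEGER, store_nr INTEGER);
    INSERT INTO table_a VALUES ('2020-01', '10'), ('2020-01', '20'), ('2020-02', '30');
    INSERT INTO table_b VALUES (10, 501), (20, NULL);
""")

# One pass over table_a: use store_nr when present, otherwise fall back to id_nr.
rows = conn.execute("""
    SELECT a.month,
           COALESCE(b.store_nr, CAST(a.id_nr AS INTEGER)) AS id_nr
    FROM table_a a
    LEFT OUTER JOIN table_b b ON a.id_nr = CAST(b.group_id AS TEXT)
    ORDER BY 1, 2
""").fetchall()

print(rows)
```

Note that, unlike the UNION in the question, this does not remove duplicate rows; add DISTINCT if that matters for your data.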

PostgreSQL union two tables and join with a third table

I want to union two tables and join them with a third metadata table, and I would like to know which approach is the best/fastest.
The database is PostgreSQL.
Below are my two suggestions, but other approaches are welcome.
To do the join before the union on both tables:
SELECT a.id, a.feature_type, b.datetime, b.file_path
FROM table1 a, metadata b WHERE a.metadata_id = b.id
UNION ALL
SELECT a.id, a.feature_type, b.datetime, b.file_path
FROM table2 a, metadata b WHERE a.metadata_id = b.id
Or to do the union first and then do the join:
SELECT a.id, a.feature_type, b.datetime, b.file_path
FROM
(
SELECT id, feature_type, metadata_id FROM table1
UNION ALL
SELECT id, feature_type, metadata_id FROM table2
)a, metadata b
WHERE a.metadata_id = b.id
Run EXPLAIN ANALYZE on both statements; then you will see which one is more efficient.
The result can be unpredictable because it depends on the SQL engine's optimizer, so it is better to look at the execution plan. In the end, both approaches may well be executed the same way.
As far as I can remember, running EXPLAIN will reveal that PostgreSQL treats the second query as the first, provided that there is no GROUP BY clause (explicit, or implicit due to UNION instead of UNION ALL) in any of the subqueries.
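To illustrate the "check the plan" advice, here is a rough sketch that runs both query shapes against an in-memory SQLite database from Python, confirms they return the same rows, and prints each plan. This is SQLite, not PostgreSQL, so the plans will look different from what EXPLAIN ANALYZE shows; the sample rows are invented, and the implicit joins are written as explicit JOIN ... ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE metadata (id INTEGER PRIMARY KEY, datetime TEXT, file_path TEXT);
    CREATE TABLE table1 (id INTEGER, feature_type TEXT, metadata_id INTEGER);
    CREATE TABLE table2 (id INTEGER, feature_type TEXT, metadata_id INTEGER);
    INSERT INTO metadata VALUES (1, '2020-01-01', '/data/a'), (2, '2020-01-02', '/data/b');
    INSERT INTO table1 VALUES (1, 'road', 1), (2, 'lake', 2);
    INSERT INTO table2 VALUES (3, 'road', 2), (4, 'park', 9);
""")

# Approach 1: join each table to metadata, then union the results.
join_first = """
    SELECT a.id, a.feature_type, b.datetime, b.file_path
    FROM table1 a JOIN metadata b ON a.metadata_id = b.id
    UNION ALL
    SELECT a.id, a.feature_type, b.datetime, b.file_path
    FROM table2 a JOIN metadata b ON a.metadata_id = b.id
"""

# Approach 2: union the two tables first, then join once.
union_first = """
    SELECT a.id, a.feature_type, b.datetime, b.file_path
    FROM (
        SELECT id, feature_type, metadata_id FROM table1
        UNION ALL
        SELECT id, feature_type, metadata_id FROM table2
    ) a JOIN metadata b ON a.metadata_id = b.id
"""

rows_join_first = sorted(conn.execute(join_first).fetchall())
rows_union_first = sorted(conn.execute(union_first).fetchall())
print(rows_join_first == rows_union_first)

# Print the plan for each statement; the output format is specific to SQLite.
for sql in (join_first, union_first):
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql):
        print(step)
```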

Use join with a table and SQL Statement

Joins are usually used to fetch data from 2 tables using a common column from both tables.
Is it possible to join a table with the results of another SQL statement, and if so, what is the syntax?
Sure, this is called a derived table
such as:
select a.column, b.column
from
table1 a
join (select statement) b
on b.column = a.column
Keep in mind that the select for the derived table may be run in its entirety, so it can help to select only the columns and rows you need.
EDIT: I've found that I rarely need to use this technique unless I am joining on some aggregated queries.... so I would carefully consider your design here.
For example, thus far most demonstrations in this thread have not required the use of a derived table.
It depends on what the other statement is, but one of the techniques you can use is common table expressions - this may not be available on your particular SQL platform.
In the case of SQL Server, if the other statement is a stored procedure, you may have to insert the results into a temporary table and join to that.
It's also possible in SQL Server (and some other platforms) to have table-valued functions which can be joined just like a view or table.
select *
from TableA a
inner join (select x from TableB) b
on a.x = b.x
Select c.CustomerCode, c.CustomerName, sq.AccountBalance
From Customers c
Join (
Select CustomerCode, AccountBalance
From Balances
)sq on c.CustomerCode = sq.CustomerCode
Sure, as an example:
SELECT *
FROM Employees E
INNER JOIN
(
SELECT EmployeeID, COUNT(EmployeeID) as ComplaintCount
FROM Complaints
GROUP BY EmployeeID
) C ON E.EmployeeID = C.EmployeeID
WHERE C.ComplaintCount > 3
It is. But what specifically are you looking to do?
That can be done with either a sub-select, a view or a temp table... More information would help us answer this question better, including which SQL software, and an example of what you'd like to do.
Try this:
SELECT t1.col1, t2.col2 FROM Table1 t1 INNER JOIN
(SELECT col1, col2, col3 FROM Table2) t2 ON t1.col1 = t2.col1
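For a runnable version of the derived-table idea, the sketch below uses Python's sqlite3 module with the Employees/Complaints shape from the example above. The Name and ComplaintID columns and the sample rows are assumptions. It also shows the same query written with a common table expression, which should return the same rows on platforms that support CTEs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employees (EmployeeID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Complaints (ComplaintID INTEGER PRIMARY KEY, EmployeeID INTEGER);
    INSERT INTO Employees VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO Complaints (EmployeeID) VALUES (1), (1), (1), (1), (2);
""")

# Join a table to an aggregated derived table (a subquery in the FROM clause).
derived = conn.execute("""
    SELECT E.Name, C.ComplaintCount
    FROM Employees E
    INNER JOIN (
        SELECT EmployeeID, COUNT(*) AS ComplaintCount
        FROM Complaints
        GROUP BY EmployeeID
    ) C ON E.EmployeeID = C.EmployeeID
    WHERE C.ComplaintCount > 3
""").fetchall()

# The same query with the subquery moved into a common table expression.
cte = conn.execute("""
    WITH C AS (
        SELECT EmployeeID, COUNT(*) AS ComplaintCount
        FROM Complaints
        GROUP BY EmployeeID
    )
    SELECT E.Name, C.ComplaintCount
    FROM Employees E
    INNER JOIN C ON E.EmployeeID = C.EmployeeID
    WHERE C.ComplaintCount > 3
""").fetchall()

print(derived, cte)
```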

When or why would you use a right outer join instead of left?

Wikipedia states:
"In practice, explicit right outer joins are rarely used, since they can always be replaced with left outer joins and provide no additional functionality."
Can anyone provide a situation where they have preferred to use the RIGHT notation, and why?
I can't think of a reason to ever use it. To me, it wouldn't ever make things more clear.
Edit:
I'm an Oracle veteran making the New Year's Resolution to wean myself from the (+) syntax. I want to do it right
The only reason I can think of to use RIGHT OUTER JOIN is to try to make your SQL more self-documenting.
You might possibly want to use left joins for queries that have null rows in the dependent (many) side of one-to-many relationships and right joins on those queries that generate null rows in the independent side.
This can also occur in generated code or if a shop's coding requirements specify the order of declaration of tables in the FROM clause.
B RIGHT JOIN A is the same as A LEFT JOIN B.
B RIGHT JOIN A reads: put B on the right, then join A. That means A ends up on the left (preserved) side of the data set, just the same as A LEFT JOIN B.
There is no performance to be gained by rearranging LEFT JOINs into RIGHT JOINs.
The only reason I can think of why one would use RIGHT JOIN is if you are the type of person who likes to think from the inside out (select * from detail right join header). Some people like little-endian, others big-endian; some like top-down design, others bottom-up design.
The other one is if you already have a humongous query to which you want to add another table, and it's a pain in the neck to rearrange the query, so you just plug the table into the existing query using RIGHT JOIN.
I've never used right join before and never thought I could actually need it, and it seems a bit unnatural. But after I thought about it, it could be really useful in the situation, when you need to outer join one table with intersection of many tables, so you have tables like this:
And want to get result like this:
Or, in SQL (MS SQL Server):
create table #temp_a (id int)
create table #temp_b (id int)
create table #temp_c (id int)
create table #temp_d (id int)
insert into #temp_a
select 1 union all
select 2 union all
select 3 union all
select 4
insert into #temp_b
select 2 union all
select 3 union all
select 5
insert into #temp_c
select 1 union all
select 2 union all
select 4
insert into #temp_d
select id from #temp_a
union
select id from #temp_b
union
select id from #temp_c
select *
from #temp_a as a
inner join #temp_b as b on b.id = a.id
inner join #temp_c as c on c.id = a.id
right outer join #temp_d as d on d.id = a.id
id id id id
----------- ----------- ----------- -----------
NULL NULL NULL 1
2 2 2 2
NULL NULL NULL 3
NULL NULL NULL 4
NULL NULL NULL 5
So if you switch to the left join, results will not be the same.
select *
from #temp_d as d
left outer join #temp_a as a on a.id = d.id
left outer join #temp_b as b on b.id = d.id
left outer join #temp_c as c on c.id = d.id
id id id id
----------- ----------- ----------- -----------
1 1 NULL 1
2 2 2 2
3 3 3 NULL
4 4 NULL 4
5 NULL 5 NULL
To do this without the right join, you have to wrap the inner joins, for example in a common table expression or subquery:
select *
from #temp_d as d
left outer join (
select a.id
from #temp_a as a
inner join #temp_b as b on b.id = a.id
inner join #temp_c as c on c.id = a.id
) as q on ...
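Here is a small sketch of that subquery rewrite, run in SQLite from Python with the same ids as the example above. The plain table names (temp_a and so on) replace the SQL Server temp tables, and the join condition q.id = d.id is my assumption for the elided ON clause. It uses only a LEFT JOIN, yet should produce the same shape as the right-join query: every id from temp_d, with a match only where the id is in all three other tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE temp_a (id INTEGER);
    CREATE TABLE temp_b (id INTEGER);
    CREATE TABLE temp_c (id INTEGER);
    CREATE TABLE temp_d (id INTEGER);
    INSERT INTO temp_a VALUES (1), (2), (3), (4);
    INSERT INTO temp_b VALUES (2), (3), (5);
    INSERT INTO temp_c VALUES (1), (2), (4);
    INSERT INTO temp_d
        SELECT id FROM temp_a UNION SELECT id FROM temp_b UNION SELECT id FROM temp_c;
""")

# Left join temp_d to the intersection of a, b and c (built in the subquery).
rows = conn.execute("""
    SELECT d.id, q.id
    FROM temp_d AS d
    LEFT OUTER JOIN (
        SELECT a.id
        FROM temp_a AS a
        INNER JOIN temp_b AS b ON b.id = a.id
        INNER JOIN temp_c AS c ON c.id = a.id
    ) AS q ON q.id = d.id  -- assumed join condition
    ORDER BY d.id
""").fetchall()

print(rows)
```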
The only time I would think of a right outer join is if I were fixing a full join, and it just so happened that I needed the result to contain all records from the table on the right. Even as lazy as I am, though, I would probably get so annoyed that I would rearrange it to use a left join.
This example from Wikipedia shows what I mean:
SELECT *
FROM employee
FULL OUTER JOIN department
ON employee.DepartmentID = department.DepartmentID
If you just replace the word FULL with RIGHT you have a new query, without having to swap the order of the ON clause.
SELECT * FROM table1 [BLANK] OUTER JOIN table2 ON table1.col = table2.col
Replace [BLANK] with:
LEFT - if you want all records from table1 even if they don't have a col that matches table2's (also included are table2 records with matches)
RIGHT - if you want all records from table2 even if they don't have a col that matches table1's (also included are table1 records with matches)
FULL - if you want all records from table1 and from table2
What is everyone talking about? They're the same? I don't think so.
SELECT * FROM table_a
INNER JOIN table_b ON ....
RIGHT JOIN table_c ON ....
How else could you quickly/easily inner join the first 2 tables and join with table_c while ensuring all rows in table_c are always selected?
I've not really had to think much about the right join, but in nearly 20 years of writing SQL queries I have not come across a sound justification for using one. I've certainly seen plenty of them, I'd guess arising from developers using built-in query builders.
Whenever I've encountered one, I've rewritten the query to eliminate it - I've found they just require too much additional mental energy to learn or re-learn if you haven't visited the query for some time and it hasn't been uncommon for the intent of the query to become lost or return incorrect results - and it's usually this incorrectness that has led to requests for me to review why the queries weren't working.
In thinking about it, once you introduce a right-join, you now have what I'd consider competing branches of logic which need to meet in the middle. If additional requirements/conditions are introduced, both of these branches may be further extended and you now have more complexity you're having to juggle to ensure that one branch isn't giving rise to incorrect results.
Further, once you introduce a right join, other less-experienced developers that work on the query later may simply bolt on additional tables to the right-join portion of the query and in doing so, expanding competing logic flows that still need to meet in the middle; or in some cases I've seen, start nesting views because they don't want to touch the original logic, perhaps in part, this is because they may not understand the query or the business rules that were in place that drove the logic.
SQL statements, in addition to being correct, should be as easy to read and expressively concise as possible (because they represent single atomic actions, and your mind needs to grok them completely to avoid unintended consequences.) Sometimes an expression is more clearly stated with a right outer join.
But one can always be transformed into the other, and the optimizer will do as well with one as the other.
For quite a while, at least one of the major rdbms products only supported LEFT OUTER JOIN. (I believe it was MySQL.)
The only times I've used a right join have been when I want to look at two sets of data and I already have the joins in a specific order for the left or inner join from a previously written query. In this case, say you want to see as one set of data the records not included in table a but in table b, and in another set the records not in table b but in table a. Even then I tend to do this only to save time while researching, and would change it if it were code that would be run more than once.
In some SQL databases, there are optimizer hints that tell the optimizer to join the tables in the order in which they appear in the FROM clause - e.g. /*+ORDERED */ in Oracle. In some simple implementations, this might even be the only execution plan available.
In such cases order of tables in the FROM clause matters so RIGHT JOIN could be useful.
I think it's difficult if you don't have a right join in this case. For example, with Oracle:
with a as(
select 1 id, 'a' name from dual union all
select 2 id, 'b' name from dual union all
select 3 id, 'c' name from dual union all
select 4 id, 'd' name from dual union all
select 5 id, 'e' name from dual union all
select 6 id, 'f' name from dual
), bx as(
select 1 id, 'fa' f from dual union all
select 3 id, 'fb' f from dual union all
select 6 id, 'f' f from dual union all
select 6 id, 'fc' f from dual
)
select a.*, b.f, x.f
from a left join bx b on a.id = b.id
right join bx x on a.id = x.id
order by a.id

How do I merge data from two tables in a single database call into the same columns?

If I run the two statements in a batch, will they return one table or two to my SqlCommand object, with the data merged? What I am trying to do is optimize a search by searching twice, the first time on one set of data and the second time on another. They have the same fields, and I'd like all the records from both tables to show and be added to each other. I need this so that I can sort across both sets of data, but short of writing a stored procedure I can't think of a way of doing this.
E.g. Table 1 has columns A and B; Table 2 has the same columns but a different data source. I then want to merge them so that if A exists in only one table it is added to the result set, and if it exists in both tables, column B is summed across the two.
Please note that this is not the same as a full outer join operation as that does not merge the data.
[EDIT]
Here's what the code looks like:
Select * From
(Select ID,COUNT(*) AS Count From [Table1] Group By ID) as T1
full outer join
(Select ID,COUNT(*) AS Count From [Table2] Group By ID) as T2
on T1.ID = T2.ID
Perhaps you're looking for UNION?
IE:
SELECT A, B FROM Table1
UNION
SELECT A, B FROM Table2
Possibly:
select table1.a, table1.b
from table1
where table1.a not in (select a from table2)
union all
select table2.a, table2.b
from table2
where table2.a not in (select a from table1)
union all
select table1.a, table1.b+table2.b as b
from table1
inner join table2 on table1.a = table2.a
edit: perhaps you would benefit from unioning the tables before counting, e.g.
select id, count(*) as count from
(select id from table1
union all
select id from table2) t
group by id
I'm not sure if I understand completely but you seem to be asking about a UNION
SELECT A,B
FROM tableX
UNION ALL
SELECT A,B
FROM tableY
To do it, you would go:
SELECT * INTO TABLE3 FROM TABLE1
UNION
SELECT * FROM TABLE2
Provided both tables have the same columns
I think what you are looking for is this, but I am not sure I am understanding your language correctly.
select id, sum(count) as count
from (
select id, count(*) as count
from table1
group by id
union all
select id, count(*) as count
from table2
group by id
) a
group by id
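As a rough check of the "union first, then aggregate" idea against the A/B example from the question, here is a sketch using Python's sqlite3 module. The column types and sample rows are made up. Keys that exist in only one table pass through unchanged, and keys that exist in both have B summed.

```python
import sqlite3

# Hypothetical data: 'y' exists in both tables, 'x' and 'z' in only one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a TEXT, b INTEGER);
    CREATE TABLE table2 (a TEXT, b INTEGER);
    INSERT INTO table1 VALUES ('x', 1), ('y', 2);
    INSERT INTO table2 VALUES ('y', 10), ('z', 20);
""")

# Stack both tables with UNION ALL, then sum b per key in a single result set.
merged = conn.execute("""
    SELECT a, SUM(b) AS b
    FROM (
        SELECT a, b FROM table1
        UNION ALL
        SELECT a, b FROM table2
    ) AS t
    GROUP BY a
    ORDER BY a
""").fetchall()

print(merged)
```

Because this comes back as one result set, it can be sorted with a single ORDER BY and read in one database call.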