I was going through union and union all logic and trying examples. What has puzzled me is why is it necessary to have same number of columns in both the tables to perform a union or union all operation?
Forgive me if my question's silly, but i couldn't get the exact answer anywhere. And conceptually thinking, as long as one common column is present, merging two tables should be easy right?(like a join operation). But this is not the case, and I want to know why?
JOIN operations do NOT require the same number of columns be selected in both tables. UNION operations are different that joins. Think of it as two separate lists of data that can be "pasted" together in one big block. You can't have columns that don't match.
Another way to look at it is a JOIN does this:
TableA.Col1, TableA.Col2 .... TableB.Col1, TableB.Col2
A UNION does this:
TableA.Col1, TableA.Col2
TableB.Col1, TableB.Col2
JOINS add columns to rows, UNIONS adds more rows to existing rows. That's why they must "match".
Join Operation combines columns from two tables.
Where as Union and Union all combines two results sets, so In order to combine two results you need to have same number of columns with compatible data types.
In real world e.g. In order to play a game of cricket you need 11 players in both team, 7 in one team and 11 in opposite team not allowed.
Lets say you have EMPTable with columns and values as below:
id, name, address, salary, DOB
1, 'SAM', '2 Merck Ln', 100000, '08/18/1980'
IF you want to UNION with only the column name (lets say value is 'TOBY'), it means you have to default other values to NULL (a smart software or db can implicitly do it for you), which in essence translates to below (to prevent the integrity of a relational table) ->
1, 'SAM', '2 Merck Ln', 100000, '08/18/1980'
UNION
NULL,'TOBY', NULL, NULL, NULL
A "union" by definition is a merger of (different) values of THE same class or type.
You need to research the difference between UNION and JOIN.
Basically, you use UNION when you want to get records from one source (table, view, group of tables, etc) and combine those results with records from another source. They have to have the same columns to get a common set of results.
You use JOIN when you want to join records from one source with another source. There are several types of JOINs (INNER, LEFT, RIGHT, etc) depending on your needs. When using JOINs, you can specify whichever columns you'd like.
Good luck.
join are basically used when we want to get data from two or more tables, and for this there is no need to fetch same number of columns, consider a situation where by using normalization concepts we have divided the table into two parts,
emp [eid, ename,edob,esal]
emp_info [eid,emob_no]
here i we want to know name and mobile no. of all employees then we will use the concept of join, because here information needed by us can't be provided by same table.
so we will use..
SELECT E.ENAME,EI.EMOB_NO FROM EMP E, EMP_INFO EI WHERE E.EID = EI.EID AND LOWER(E.ENAME)='john' ;
now consider about situation where we want to find employees those are having saving account and loan account in a bank.. here we want to find common tuples from two tables results. for this we will use set operations. [ intersection ]
for set operations 2 conditions MUST BE satisfied.
same no. of attributes must be fetches from each table.
domain of each attribute must be same of compatible to the higher one...
Related
I have two tables of world countries Independence Day and I wanted to combine them into one table with a distinct id, but they are using different Primary keys, any suggestions will be appreciated.
Summary of request: How to combine two tables with different Primary Keys but the other fields in common and removing duplicate fields ideally Hash Match and removing duplicates
Expected Results this will include all the unique countries in both tables, please one table may have more countries and we want to make sure we take all the distinct countries from each table. Ideally, the solution will be likely to be of like Hash Match operator in SQL which implements several different logical operations that all use an in-memory hash table for finding matching data. Many thanks in advance
The image of two tables which needs combining.
You seem to want full join:
select a.*, b.* -- select the columns you want
from a full join
b
on a.country = b.country;
If you want to assign a new unique id use row_number():
select row_number() over (order by coalesce(a.country, b.country)) as new_id,
a.*, b.* -- select the columns you want
from a full join
b
on a.country = b.country;
I'm working in a large access database (Access 2010) and am trying to return records where two locations are different.
In my case, I have a large number of birds that have been observed on multiple dates and potentially on different islands. Each bird has a unique BirdID and also an actual physical identifier (unfortunately that may have changed over time). [I'm going to try addressing the changing physical identifier issue later]. I currently want to query individual birds where one or more of their observations is different than the "IslandAlpha" (the first island where they were observed). Something along the lines of a criteria for BirdID: WHERE IslandID [doesn't equal] IslandAlpha.
I then need a separate query to find where all observations DO equal where they were first observed. So where IslandID = IslandAlpha
I'm new to Access, so let me know if you need more information on how my tables/relationships are set up! Thanks in advance.
Assuming the following tables:
Birds table in which all individual birds have records with a unique BirdID and IslandAlpha.
Sightings table in which individual sightings are recorded, including IslandID.
Your first query would look something like this:
SELECT *
FROM Birds
INNER JOIN Sightings ON Birds.BirdID=Sightings.BirdID
WHERE Sightings.IslandID <> Birds.IslandAlpha
You second query would be the same but with = instead of <> in the WHERE clause.
Please provide us information about the tables and columns you are using.
I will presume you are asking this question because a simple join of tables and filtering where IslandAlpha <> ObsLoc is not possible because IslandAlpha is derived from first observation record for each bird. Pulling first observation record for each bird requires a nested query. Need a unique record identifier in Observations - autonumber should serve. Assuming there is an observation date/time field, consider:
SELECT * FROM Observations WHERE ObsID IN
(SELECT TOP 1 ObsID FROM Observations AS Dupe
WHERE Dupe.ObsBirdID = Observations.ObsBirdID ORDER BY Dupe.ObsDateTime);
Now use that query for subsequent queries.
SELECT * FROM Observations
INNER JOIN Query1 ON Observations.ObsBirdID = Query1.ObsBirdID
WHERE Observations.ObsLocID <> Query1.ObsLocID;
I have two separate tables in my access database, which both use a third table as the reference for one particular field on each table. The data is entered onto the different tables by separate forms. Then I have several queries that then reference those particular fields that count and show unique values. Those queries show the actual values, then I created an sql query that does the same thing, only it shows the reference ID instead of the value in the actual field.
table ODI----------table CDN----------reference table
id RHA---------id CHA----------------id HA
1 blank----------1 radio---------------1 internet
2 internet-------2 tv------------------2 radio
3 referral-------3 radio---------------3 referral
4 tv-------------4 blank---------------4 repeat customer
5 blank----------5 internet------------5 tv
6 internet-------6 referral------------6 employee
7 referral-------7 referral------------7 social media
this is the code I am trying to make work.
SELECT m.[Marketing Results], Count(*) AS [Count]
FROM (SELECT RHA as [Marketing Results] FROM ODI
UNION ALL
SELECT CHA as [Marketing Results] FROM CDN) AS m
GROUP BY m.[Marketing Results]
HAVING (((m.[Marketing Results]) Is Not Null))
ORDER BY Count(*) DESC;
and what my desired result is,
Marketing Results--Counts
referral------------------4
internet------------------3
radio---------------------2
tv------------------------2
Lookup fields with alias don't show what is actually stored in table. ID is stored, not descriptive alias. Lookup alias will carry into regular queries but Union query only pulls actual stored values. At some point need to include reference table in query by joining on key fields in order to retrieve descriptive alias. Options:
in each UNION query SELECT line, join tables
join UNION query to reference table
join aggregate query to reference table
Most experienced developers will not build lookups in table because of confusion they cause. Also, they are not portable to other database platforms. http://access.mvps.org/Access/lookupfields.htm
I have a table which represents data for people that have applied. Each person has one PERSON_ID, but can have multiple APP_IDs. I want to select all of the columns except for APP_ID(because its values aren't distinct) for all of the distinct people in the table.
I can list every field individually in both the select and group by clause
This works:
select PERSON_ID, FIRST,LAST,MIDDLE,BIRTHDATE,SEX,EMAIL,PRIMARY_PHONE from
applications
where first = 'Rob' and last='Robot'
group by PERSON_ID,FIRST,LAST,MIDDLE,BIRTHDATE,SEX,EMAIL,PRIMARY_PHONE
But there are twenty more fields that I may or may not use at any given time
Is there any shorter way to achieve this sort of selection without being so verbose?
select distinct is shorter:
select distinct PERSON_ID, FIRST, LAST, MIDDLE, BIRTHDATE, SEX, EMAIL, PRIMARY_PHONE
from applications
where first = 'Rob' and last = 'Robot';
But you still have to list out the columns once.
Some more modern databases support an except clause that lets you remove columns from the wildcard list. To the best of my knowledge, Oracle has no similar concept.
You could write a query to bring the columns together from the system tables. That could simplify writing the query and help prevent misspellings.
http://www.codinghorror.com/blog/2007/10/a-visual-explanation-of-sql-joins.html
and
https://web.archive.org/web/20120621231245/http://www.khankennels.com/blog/index.php/archives/2007/04/20/getting-joins/
have been very helpful in learning the basics of joins using Venn diagrams. But I am wondering how you would apply that same thinking to a query that has more than one join.
Let's say I have 3 tables:
Employees
EmployeeID
FullName
EmployeeTypeID
EmployeeTypes (full time, part time, etc.)
EmployeeTypeID
TypeName
InsuranceRecords
InsuranceRecordID
EmployeeID
HealthInsuranceNumber
Now, I want my final result set to include data from all three tables, in this format:
EmployeeID | FullName | TypeName | HealthInsuranceNumber
Using what I learned from those two sites, I can use the following joins to get all employees, regardless of whether or not their insurance information exists or not:
SELECT
Employees.EmployeeID, FullName, TypeName, HealthInsuranceNumber
FROM Employees
INNER JOIN EmployeeTypes ON Employees.EmployeeTypeID = EmployeeTypes.EmployeeTypeID
LEFT OUTER JOIN InsuranceRecords ON Employees.EmployeeID = InsuranceRecords.EmployeeID
My question is, using the same kind of Venn diagram pattern, how would the above query be represented visually? Is this picture accurate?
I think it is not quite possible to map your example onto these types of diagrams for the following reason:
The diagrams here are diagrams used to describe intersections and unions in set theory. For the purpose of having an overlap as depicted in the diagrams, all three diagrams need to contain elements of the same type which by definition is not possible if we are dealing with three different tables where each contains a different type of (row-)object.
If all three tables would be joined on the same key then you could identify the values of this key as the elements the sets contain but since this is not the case in your example these kind of pictures are not applicable.
If we do assume that in your example both joins use the same key, then only the green area would be the correct result since the first join restricts you to the intersection of Employees and Employee types and the second join restricts you the all of Employees and since both join conditions must be true you would get the intersection of both of the aforementioned sections which is the green area.
Hope this helps.
That is not an accurate set diagram (either Venn or Euler). There are no entities which are members of both Employees and Employee Types. Even if your table schema represented some kind of table-inheritance, all the entities would still be in a base table.
Jeff's example on the Coding Horror blog only works with like entities i.e. two tables containing the same entities - technically a violation of normalization - or a self-join.
Venn diagrams can accurately depict scenarios like:
-- The intersection lens
SELECT *
FROM table
WHERE condition1
AND condition2
-- One of the lunes
SELECT *
FROM table
WHERE condition1
AND NOT condition2
-- The union
SELECT *
FROM table
WHERE condition1
OR condition2