I have the following:
TableA
ID | DocumentType | DocumentCode | DocumentDate | Warehouse | ReferenceCode
---+--------------+--------------+--------------+-----------+--------------
1 | DeliveryNote | DOC-001 | 2017-04-21 | 1 | NULL
2 | Invoice | DOC-002 | 2017-04-21 | 2 | DOC-001
As you can see, the warehouse is different on each document, and DOC-002 is related to DOC-001 through the ReferenceCode column (which means it was created from DOC-001 as its source document).
DOC-002 is supposed to carry the same information, but sometimes it differs. In that case I tried to create a query (I think a self join applies here) to check which information in DOC-002 differs from DOC-001, based on the reference code, but I couldn't manage to do it.
If someone could give me a hand, I'll be very grateful.
This is the SQL query:
select *
from TableA tbl
inner join TableA tbla on tbl.id = tbla.id
where tbla.ReferenceCode = tbl.DocumentCode
You indeed want to join the table to itself. But joining on the ID column won't work, because that column doesn't relate records to each other. Instead, you need to join on the DocumentCode and ReferenceCode fields. Then only include the records that have some difference (in this case, I'm only comparing the DocumentDate and Warehouse fields).
select tbla.*
from TableA tbl
join TableA tbla on tbl.DocumentCode = tbla.ReferenceCode
where tbla.DocumentDate != tbl.DocumentDate
or tbla.Warehouse != tbl.Warehouse
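To try the self join without a database server, here is a minimal sketch that loads the two sample rows into an in-memory SQLite database and runs the query above. The column types are my assumption (the question doesn't state them), and SQLite stands in for whatever RDBMS is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only shows the values.
conn.execute("""
    CREATE TABLE TableA (
        ID INTEGER PRIMARY KEY,
        DocumentType TEXT,
        DocumentCode TEXT,
        DocumentDate TEXT,
        Warehouse INTEGER,
        ReferenceCode TEXT
    )
""")
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "DeliveryNote", "DOC-001", "2017-04-21", 1, None),
        (2, "Invoice", "DOC-002", "2017-04-21", 2, "DOC-001"),
    ],
)

# tbl is the source document, tbla is the document created from it.
mismatches = conn.execute("""
    SELECT tbla.*
    FROM TableA tbl
    JOIN TableA tbla ON tbl.DocumentCode = tbla.ReferenceCode
    WHERE tbla.DocumentDate != tbl.DocumentDate
       OR tbla.Warehouse != tbl.Warehouse
""").fetchall()

print(mismatches)  # only DOC-002 is listed, because its Warehouse differs
```

If the compared columns can contain NULL, a plain != comparison will not flag a NULL on one side, so those columns would need an explicit IS NULL check.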
I have 2 tables where column X and Y are concatenated to represent a unique identifier. I want to find all rows in tableB that do not exist in tableA and add them into tableC.
-------tableA-------- // tableA is a master reference table with all names so far
|__X__|__Y__|_name__|
| 3 | 7 | Mary |
| 3 | 2 | Jaime |
-------tableB-------- // tableB is an input file with all daily names (some repeats already exist in tableA)
|__X__|__Y__|_name__|
| 2 | 5 | Smith |
| 3 | 7 | Mary |
-------tableC-------- // tableC is a temporary holding table for new names
|__X__|__Y__|_name__|
| | | |
DESIRED RESULT:
-------tableC-------- // tableB - tableA = tableC
|__X__|__Y__|_name__|
| 2 | 5 | Smith |
I want to match rows based on a concatenated X+Y value. My SQL query so far looks like this:
INSERT INTO tableC
SELECT * FROM tableA
LEFT JOIN tableB
ON tableA.X & table.B = tableB.X & tableB.Y
WHERE tableB.X & tableB.Y IS null
However, this does not give me the intended result. I cannot use EXISTS as my actual data set is very big. Could anyone give me suggestions?
I don't think the slowness is caused by EXISTS. Your query is probably slow because you're trying to use concatenation to match multiple columns. Use AND instead and make sure you have a composite index on (x,y):
select distinct x,y,name
from tableB b
where not exists (
select 1 from tableA a
where a.x = b.x
and a.y = b.y
)
This will select all unique rows in tableB that don't have the same (x,y) value in tableA. Note that any rows with the same x,y but a different name will show up in the result (i.e. if tableB also contained 2,5,Joe, that row would appear too). If you don't want that, then you have to group by x,y and decide which name you want in case of duplicate x,y but different name.
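Here is a small sketch of the whole flow, including the insert into tableC, using the sample rows from the question in an in-memory SQLite database. The index name idx_tableA_xy and the column types are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for table_name in ("tableA", "tableB", "tableC"):
    conn.execute(f"CREATE TABLE {table_name} (x INTEGER, y INTEGER, name TEXT)")

# Composite index on the lookup side; the index name is hypothetical.
conn.execute("CREATE INDEX idx_tableA_xy ON tableA (x, y)")

conn.executemany("INSERT INTO tableA VALUES (?, ?, ?)", [(3, 7, "Mary"), (3, 2, "Jaime")])
conn.executemany("INSERT INTO tableB VALUES (?, ?, ?)", [(2, 5, "Smith"), (3, 7, "Mary")])

# Copy into tableC every tableB row whose (x, y) pair is absent from tableA.
conn.execute("""
    INSERT INTO tableC
    SELECT DISTINCT x, y, name
    FROM tableB b
    WHERE NOT EXISTS (
        SELECT 1 FROM tableA a
        WHERE a.x = b.x
          AND a.y = b.y
    )
""")

new_rows = conn.execute("SELECT * FROM tableC").fetchall()
print(new_rows)  # [(2, 5, 'Smith')]
```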
A pretty basic question. But what is the difference between
SELECT t.col
FROM table t, other_table o
WHERE t.col NOT IN o.col
and
SELECT col
FROM table
WHERE col NOT IN (SELECT col FROM other_table)
Semantically this sounds pretty equal to me, but the first one creates duplicates. What am I understanding wrong?
The first one won't even run in most RDBMSs, but in Oracle it returns every combination of records except those where t.col = o.col. You'd see this if you added o.col to your SELECT.
The latter query returns records from table that don't share the col value with any records in other_table.
Best illustrated by example:
Table1
| ANIMAL |
|--------|
| dog |
| cat |
| horse |
Table2
| ANIMAL |
|--------|
| dog |
| fish |
Queries:
SELECT t."animal",o."animal"
FROM Table1 t, Table2 o
WHERE t."animal" NOT IN o."animal"
| ANIMAL | ANIMAL2 |
|--------|---------|
| cat | dog |
| horse | dog |
| dog | fish |
| cat | fish |
| horse | fish |
SELECT t."animal"
FROM Table1 t
WHERE t."animal" NOT IN (SELECT o."animal" FROM Table2 o)
| ANIMAL |
|--------|
| horse |
| cat |
Basically, you've got a Cartesian product in the first query, which would return every combination of records from the two tables, but your WHERE criteria filter out the combinations whose values match. The second query has no JOIN, implicit or explicit; it just takes records from one table and filters them based on criteria that happen to draw from another table.
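The same comparison can be reproduced locally with SQLite. This is only a sketch: SQLite does not accept the Oracle-style NOT IN o."animal", so I've assumed <> as the equivalent filter for the first query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (animal TEXT)")
conn.execute("CREATE TABLE Table2 (animal TEXT)")
conn.executemany("INSERT INTO Table1 VALUES (?)", [("dog",), ("cat",), ("horse",)])
conn.executemany("INSERT INTO Table2 VALUES (?)", [("dog",), ("fish",)])

# Cartesian product of both tables, minus the pairs whose values match.
pairs = conn.execute("""
    SELECT t.animal, o.animal
    FROM Table1 t, Table2 o
    WHERE t.animal <> o.animal
""").fetchall()

# No join at all: Table2 is only consulted to filter Table1's rows.
unmatched = conn.execute("""
    SELECT t.animal
    FROM Table1 t
    WHERE t.animal NOT IN (SELECT o.animal FROM Table2 o)
""").fetchall()

print(len(pairs))         # 5 of the 6 possible pairs survive
print(sorted(unmatched))  # [('cat',), ('horse',)]
```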
As far as I know, the query (slightly modified):
SELECT t.col
FROM table t, other_table o
WHERE t.col <> o.col
makes a cartesian product, then filters it.
The below example might not be the exact process that takes place, but it might give an abstract overview of the situation.
If the table named table had the following rows:
col
----
A
B
and in table other_table there would be:
col
---
B
E
the Cartesian product (FROM table t, other_table o) of the two tables would probably be:
table.col other_table.col
---------------------------
A B
A E
B B
B E
Then, applying the WHERE t.col <> o.col clause the above would be filtered, giving the results
table.col other_table.col
---------------------------
A B
A E
B E
Since the query selects only table.col for the output, the final result contains the value A twice:
table.col
---------
A
A
B
I hope it could help you some way.
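If you want to check the three stages yourself, here is a rough sketch with SQLite. I've named the first table my_table because table is a reserved word; that name is my own choice, everything else follows the example rows above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col TEXT)")  # stands in for the table called "table"
conn.execute("CREATE TABLE other_table (col TEXT)")
conn.executemany("INSERT INTO my_table VALUES (?)", [("A",), ("B",)])
conn.executemany("INSERT INTO other_table VALUES (?)", [("B",), ("E",)])

# Stage 1: the Cartesian product, every row paired with every row.
product = conn.execute(
    "SELECT t.col, o.col FROM my_table t, other_table o"
).fetchall()

# Stage 2: the WHERE clause drops the pair (B, B).
filtered = conn.execute(
    "SELECT t.col, o.col FROM my_table t, other_table o WHERE t.col <> o.col"
).fetchall()

# Stage 3: selecting only t.col leaves A in the output twice.
projected = conn.execute(
    "SELECT t.col FROM my_table t, other_table o WHERE t.col <> o.col"
).fetchall()

print(len(product), sorted(filtered), sorted(projected))
```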
# UPDATE
As for the query:
SELECT col
FROM table
WHERE col NOT IN (SELECT col FROM other_table)
Since there is no join, only the rows of the table table are taken into account when building the result.
As far as I understand, the condition WHERE col NOT IN (SELECT col FROM other_table) is evaluated against each row of the table.
The value of table.col is checked against the result set returned by the subquery on other_table. If the value is not in that set, the condition evaluates to TRUE and the row is included in the result; otherwise it is excluded.
Summing up, I think the first query duplicates the table.col values only because of the preparation phase, where the tables are joined (merged) together, whereas the second query takes only records from table into the result set and uses other_table only for validation. That follows from the query structure, if I'm right of course.
What is the difference between a natural join and an inner join?
One significant difference between INNER JOIN and NATURAL JOIN is the number of columns returned.
Consider:
TableA TableB
+------------+----------+ +--------------------+
|Column1 | Column2 | |Column1 | Column3 |
+-----------------------+ +--------------------+
| 1 | 2 | | 1 | 3 |
+------------+----------+ +---------+----------+
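The INNER JOIN of TableA and TableB on Column1 will return both copies of Column1 when written with ON (with USING, standard SQL returns the join column only once):
SELECT * FROM TableA AS a INNER JOIN TableB AS b USING (Column1);
SELECT * FROM TableA AS a INNER JOIN TableB AS b ON a.Column1 = b.Column1;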
+------------+-----------+---------------------+
| a.Column1 | a.Column2 | b.Column1| b.Column3|
+------------------------+---------------------+
| 1 | 2 | 1 | 3 |
+------------+-----------+----------+----------+
The NATURAL JOIN of TableA and TableB on Column1 will return:
SELECT * FROM TableA NATURAL JOIN TableB
+------------+----------+----------+
|Column1 | Column2 | Column3 |
+-----------------------+----------+
| 1 | 2 | 3 |
+------------+----------+----------+
The repeated column is avoided.
(AFAICT from the standard grammar, you can't specify the joining columns in a natural join; the join is strictly name-based. See also Wikipedia.)
(There's a cheat in the inner join output; the a. and b. parts would not be in the column names; you'd just have column1, column2, column1, column3 as the headings.)
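A quick way to see the difference in column count is to run both joins in SQLite and inspect the cursor metadata. This is just a sketch with the one-row tables from above; the INTEGER column types are assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (Column1 INTEGER, Column2 INTEGER)")
conn.execute("CREATE TABLE TableB (Column1 INTEGER, Column3 INTEGER)")
conn.execute("INSERT INTO TableA VALUES (1, 2)")
conn.execute("INSERT INTO TableB VALUES (1, 3)")

# INNER JOIN with ON: both copies of Column1 are returned.
inner = conn.execute(
    "SELECT * FROM TableA AS a INNER JOIN TableB AS b ON a.Column1 = b.Column1"
)
inner_row = inner.fetchone()
inner_width = len(inner.description)

# NATURAL JOIN: the shared column appears once.
natural = conn.execute("SELECT * FROM TableA NATURAL JOIN TableB")
natural_row = natural.fetchone()
natural_names = [column[0] for column in natural.description]

print(inner_width, inner_row)      # 4 (1, 2, 1, 3)
print(natural_names, natural_row)  # ['Column1', 'Column2', 'Column3'] (1, 2, 3)
```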
An inner join is one where the matching row in the joined table is required for a row from the first table to be returned
An outer join is one where the matching row in the joined table is not required for a row from the first table to be returned
A natural join is a join (you can have either natural left or natural right) that assumes the join criteria to be that same-named columns in both tables match
I would avoid using natural joins like the plague, because natural joins are:
not supported by every DBMS (NATURAL JOIN is in the SQL-92 standard, but SQL Server, for example, does not implement it) and therefore not portable, not particularly readable (by most SQL coders) and possibly not supported by various tools/libraries
not informative; you can't tell what columns are being joined on without referring to the schema
your join conditions are invisibly vulnerable to schema changes - if there are multiple natural join columns and one such column is removed from a table, the query will still execute, but probably not correctly and this change in behaviour will be silent
hardly worth the effort; you're only saving about 10 seconds of typing
A natural join is just a shortcut to avoid typing, with a presumption that the join is simple and matches fields of the same name.
SELECT
*
FROM
table1
NATURAL JOIN
table2
-- implicitly uses `room_number` to join
Is the same as...
SELECT
*
FROM
table1
INNER JOIN
table2
ON table1.room_number = table2.room_number
What you can't do with the shortcut format, however, is more complex joins...
SELECT
*
FROM
table1
INNER JOIN
table2
ON (table1.room_number = table2.room_number)
OR (table1.room_number IS NULL AND table2.room_number IS NULL)
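To make the three variants concrete, here is a small sketch in SQLite. The guest and floor columns and the sample rows are invented for illustration; only room_number comes from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# guest and floor are hypothetical columns; room_number is the only shared name.
conn.execute("CREATE TABLE table1 (room_number INTEGER, guest TEXT)")
conn.execute("CREATE TABLE table2 (room_number INTEGER, floor INTEGER)")
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(101, "Ann"), (None, "Bob")])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [(101, 1), (None, 0)])

natural = conn.execute("SELECT * FROM table1 NATURAL JOIN table2").fetchall()

inner = conn.execute("""
    SELECT table1.room_number, guest, floor
    FROM table1
    INNER JOIN table2
      ON table1.room_number = table2.room_number
""").fetchall()

# The explicit form can also pair up the rows whose room_number is NULL.
null_aware = conn.execute("""
    SELECT table1.room_number, guest, floor
    FROM table1
    INNER JOIN table2
      ON (table1.room_number = table2.room_number)
      OR (table1.room_number IS NULL AND table2.room_number IS NULL)
""").fetchall()

print(natural)     # [(101, 'Ann', 1)]
print(null_aware)  # also contains (None, 'Bob', 0)
```

The natural join and the plain equality join agree, and neither matches the NULL rows, since NULL = NULL is not true.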
SQL is not faithful to the relational model in many ways. The result of a SQL query is not a relation because it may have columns with duplicate names, 'anonymous' (unnamed) columns, duplicate rows, nulls, etc. SQL doesn't treat tables as relations because it relies on column ordering etc.
The idea behind NATURAL JOIN in SQL is to make it easier to be more faithful to the relational model. The result of the NATURAL JOIN of two tables will have columns de-duplicated by name, hence no anonymous columns. Similarly, UNION CORRESPONDING and EXCEPT CORRESPONDING are provided to address SQL's dependence on column ordering in the legacy UNION syntax.
However, as with all programming techniques it requires discipline to be useful. One requirement for a successful NATURAL JOIN is consistently named columns, because joins are implied on columns with the same names (it is a shame that the syntax for renaming columns in SQL is verbose but the side effect is to encourage discipline when naming columns in base tables and VIEWs :)
Note a SQL NATURAL JOIN is an equi-join**, however this is no bar to usefulness. Consider that if NATURAL JOIN was the only join type supported in SQL it would still be relationally complete.
While it is indeed true that any NATURAL JOIN may be written using INNER JOIN and projection (SELECT), it is also true that any INNER JOIN may be written using product (CROSS JOIN) and restriction (WHERE); further note that a NATURAL JOIN between tables with no column names in common will give the same result as CROSS JOIN. So if you are only interested in results that are relations (and why ever not?!) then NATURAL JOIN is the only join type you need. Sure, it is true that from a language design perspective shorthands such as INNER JOIN and CROSS JOIN have their value, but also consider that almost any SQL query can be written in 10 syntactically different, but semantically equivalent, ways and this is what makes SQL optimizers so very hard to develop.
Here are some example queries (using the usual parts and suppliers database) that are semantically equivalent:
SELECT *
FROM S NATURAL JOIN SP;
-- Must disambiguate and 'project away' duplicate SNO attribute
SELECT S.SNO, SNAME, STATUS, CITY, PNO, QTY
FROM S INNER JOIN SP
USING (SNO);
-- Alternative projection
SELECT S.*, PNO, QTY
FROM S INNER JOIN SP
ON S.SNO = SP.SNO;
-- Same columns, different order == equivalent?!
SELECT SP.*, S.SNAME, S.STATUS, S.CITY
FROM S INNER JOIN SP
ON S.SNO = SP.SNO;
-- 'Old school'
SELECT S.*, PNO, QTY
FROM S, SP
WHERE S.SNO = SP.SNO;
** Relational natural join is not an equijoin, it is a projection of one. – philipxy
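As a rough check of the equivalences claimed above, the following sketch runs four of the queries against a tiny suppliers-and-parts database in SQLite and compares the results. The sample rows and the extra table J (which shares no column name with S) are my own assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE S (SNO TEXT, SNAME TEXT, STATUS INTEGER, CITY TEXT)")
conn.execute("CREATE TABLE SP (SNO TEXT, PNO TEXT, QTY INTEGER)")
conn.execute("CREATE TABLE J (JNO TEXT)")  # hypothetical table with no column in common with S
conn.executemany("INSERT INTO S VALUES (?, ?, ?, ?)",
                 [("S1", "Smith", 20, "London"), ("S2", "Jones", 10, "Paris")])
conn.executemany("INSERT INTO SP VALUES (?, ?, ?)",
                 [("S1", "P1", 300), ("S1", "P2", 200), ("S2", "P1", 300)])
conn.executemany("INSERT INTO J VALUES (?)", [("J1",), ("J2",)])


def rows(sql):
    """Run a query and return its rows sorted, so results compare regardless of order."""
    return sorted(conn.execute(sql).fetchall())


natural = rows("SELECT * FROM S NATURAL JOIN SP")
using = rows("SELECT S.SNO, SNAME, STATUS, CITY, PNO, QTY FROM S INNER JOIN SP USING (SNO)")
on = rows("SELECT S.*, PNO, QTY FROM S INNER JOIN SP ON S.SNO = SP.SNO")
old_school = rows("SELECT S.*, PNO, QTY FROM S, SP WHERE S.SNO = SP.SNO")

# With no column names in common, NATURAL JOIN behaves like CROSS JOIN.
no_common = rows("SELECT * FROM S NATURAL JOIN J")
cross = rows("SELECT * FROM S CROSS JOIN J")

print(natural == using == on == old_school, no_common == cross)
```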
A NATURAL join is just short syntax for a specific INNER join -- or "equi-join" -- and, once the syntax is unwrapped, both represent the same Relational Algebra operation. It's not a "different kind" of join, as with the case of OUTER (LEFT/RIGHT) or CROSS joins.
See the equi-join section on Wikipedia:
A natural join offers a further specialization of equi-joins. The join predicate arises implicitly by comparing all columns in both tables that have the same column-names in the joined tables. The resulting joined table contains only one column for each pair of equally-named columns.
Most experts agree that NATURAL JOINs are dangerous and therefore strongly discourage their use. The danger comes from inadvertently adding a new column, named the same as another column ...
That is, all NATURAL joins may be written as INNER joins (but the converse is not true). To do so, just create the predicate explicitly -- e.g. USING or ON -- and, as Jonathan Leffler pointed out, select the desired result-set columns to avoid "duplicates" if desired.
Happy coding.
(The NATURAL keyword can also be applied to LEFT and RIGHT joins, and the same applies. A NATURAL LEFT/RIGHT join is just a short syntax for a specific LEFT/RIGHT join.)
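The danger mentioned in the quote (a new column that happens to share a name) is easy to reproduce. Below is a minimal sketch in SQLite; the orders and customers tables and the notes column are entirely made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the tables initially share only customer_id.
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT)")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute("INSERT INTO orders VALUES (1, 'book')")
conn.execute("INSERT INTO customers VALUES (1, 'Ann')")

query = "SELECT * FROM orders NATURAL JOIN customers"
before = conn.execute(query).fetchall()

# Later, both tables gain a same-named column for unrelated reasons.
conn.execute("ALTER TABLE orders ADD COLUMN notes TEXT DEFAULT 'gift wrap'")
conn.execute("ALTER TABLE customers ADD COLUMN notes TEXT DEFAULT 'VIP'")

# The unchanged query now also joins on notes, and silently matches nothing.
after = conn.execute(query).fetchall()

print(before)  # [(1, 'book', 'Ann')]
print(after)   # []
```

The query text never changed and no error is raised, which is why the problem tends to go unnoticed.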
Natural Join: It is the combined result of all the columns in the two tables, with same-named columns appearing once.
It returns the rows of the first table that match rows of the second table on those columns.
Inner Join: This join works even when none of the column names are the same in the two tables, because the join condition is stated explicitly.
A Natural Join is where 2 tables are joined on the basis of all common columns.
common column : is a column which has same name in both tables + has compatible datatypes in both the tables.
You can use only = operator
An Inner Join is where 2 tables are joined on the basis of common columns mentioned in the ON clause.
common column : is a column which has compatible datatypes in both the tables but need not have the same name.
You can use any comparison operator, such as =, <=, >=, <, >, <>
Natural Join : A SQL Join clause combines fields from 2 or more tables in a relational database. A natural join is based on all columns in two tables that have the same name and selected rows from the two tables that have equal values in all matched columns.
--- The names and data types of both columns must be the same.
Using Clause : In a natural join, if the tables have columns with the same names but different data types, the join causes an error. To avoid this situation, the join clause can be modified with a USING clause. The USING clause specifies the columns that should be used for the join.
The difference between the inner (equi/default) join and the natural join is that in the natural join the common column is displayed only once, whereas in the inner/equi/default/simple join the common column is displayed twice.
Inner join and natural join are almost the same, but there is a slight difference between them. In a natural join there is no need to specify a condition, whereas in an inner join a condition is expected. If we do not specify the condition in an inner join (which MySQL allows), the resulting table is like a Cartesian product.
mysql> SELECT * FROM tb1 ;
+----+------+
| id | num |
+----+------+
| 6 | 60 |
| 7 | 70 |
| 8 | 80 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
6 rows in set (0.00 sec)
mysql> SELECT * FROM tb2 ;
+----+------+
| id | num |
+----+------+
| 4 | 40 |
| 5 | 50 |
| 9 | 90 |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
6 rows in set (0.00 sec)
INNER JOIN :
mysql> SELECT * FROM tb1 JOIN tb2 ;
+----+------+----+------+
| id | num | id | num |
+----+------+----+------+
| 6 | 60 | 4 | 40 |
| 7 | 70 | 4 | 40 |
| 8 | 80 | 4 | 40 |
| 1 | 1 | 4 | 40 |
| 2 | 2 | 4 | 40 |
| 3 | 3 | 4 | 40 |
| 6 | 60 | 5 | 50 |
| 7 | 70 | 5 | 50 |
| 8 | 80 | 5 | 50 |
.......more......
return 36 rows in set (0.01 sec)
AND NATURAL JOIN :
mysql> SELECT * FROM tb1 NATURAL JOIN tb2 ;
+----+------+
| id | num |
+----+------+
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |
+----+------+
3 rows in set (0.01 sec)
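The same row counts can be reproduced outside MySQL. This is a sketch using SQLite, which also accepts a JOIN with no condition; the INTEGER column types are assumed from the output above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tb1 (id INTEGER, num INTEGER)")
conn.execute("CREATE TABLE tb2 (id INTEGER, num INTEGER)")
conn.executemany("INSERT INTO tb1 VALUES (?, ?)",
                 [(6, 60), (7, 70), (8, 80), (1, 1), (2, 2), (3, 3)])
conn.executemany("INSERT INTO tb2 VALUES (?, ?)",
                 [(4, 40), (5, 50), (9, 90), (1, 1), (2, 2), (3, 3)])

# JOIN without a condition: every tb1 row is paired with every tb2 row.
unconditioned = conn.execute("SELECT * FROM tb1 JOIN tb2").fetchall()

# NATURAL JOIN: rows must agree on both shared columns, id and num.
natural = conn.execute("SELECT * FROM tb1 NATURAL JOIN tb2").fetchall()

print(len(unconditioned))  # 36
print(sorted(natural))     # [(1, 1), (2, 2), (3, 3)]
```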
An inner join joins two tables where the column names are the same.
A natural join joins two tables where the column names and data types are the same.
I'm a SQL newbie, so please forgive the ignorance :)
Basically, I'm wondering what would be a good way of 'joining' 2 tables A and B wherein I just want to check if certain cases in A are in B. The thing is, not all entries in A need to have matches in B, just a few. For example, Table A
merchant_id | tablet_id | address
33232 | 1 | 83 abs
94732 | 2 | 92 bcu
47373 | 3 | dkid
48238 | 3 | kdid
has joins with other tables in a query. In this same query, I want to implement a condition wherein if tablet_id in B matches with that of A, then to ignore those cases.
merchant | tablet_id | incentive?
33232 | 1 | Yes
67382 | 2 | No
Like I said, A and B only have a few cases in common. I tried a query with a JOIN between A & B and got nothing returned since a join might not be possible if there are no intersecting values between A & B. I'm just looking to implement an IF condition kind of thing.
Hopefully I was articulate. Any help would be appreciated!
SELECT * FROM `A` WHERE `tablet_id` NOT IN (SELECT `tablet_id` FROM `B`)
SELECT
*
FROM
A LEFT JOIN B
ON A.tablet_id = B.tablet_id
WHERE
B.tablet_id is null
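Both forms can be checked against the sample rows from the question. Here is a minimal sketch in SQLite; I've renamed the "incentive?" column to incentive and guessed the column types, and I select A.* in the LEFT JOIN version so the two results are directly comparable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (merchant_id INTEGER, tablet_id INTEGER, address TEXT)")
conn.execute("CREATE TABLE B (merchant INTEGER, tablet_id INTEGER, incentive TEXT)")
conn.executemany("INSERT INTO A VALUES (?, ?, ?)", [
    (33232, 1, "83 abs"), (94732, 2, "92 bcu"), (47373, 3, "dkid"), (48238, 3, "kdid"),
])
conn.executemany("INSERT INTO B VALUES (?, ?, ?)", [(33232, 1, "Yes"), (67382, 2, "No")])

# Form 1: keep the rows of A whose tablet_id does not occur in B.
not_in = conn.execute(
    "SELECT * FROM A WHERE tablet_id NOT IN (SELECT tablet_id FROM B)"
).fetchall()

# Form 2: left join, then keep only the rows that found no partner in B.
left_null = conn.execute("""
    SELECT A.*
    FROM A LEFT JOIN B
      ON A.tablet_id = B.tablet_id
    WHERE B.tablet_id IS NULL
""").fetchall()

print(sorted(not_in))  # [(47373, 3, 'dkid'), (48238, 3, 'kdid')]
```

One caveat: if B.tablet_id can be NULL, the NOT IN form returns no rows at all, while the LEFT JOIN form is unaffected.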
You may be looking for the OUTER JOIN.
SELECT *
FROM TableA
LEFT OUTER JOIN TableB ON TableA.tablet_id = TableB.tablet_id
This will return all rows from table A, and join rows from Table B where they meet the criteria in the ON clause. If no row exists in Table B for a row in Table A, the Table B column values in the results will be NULL.
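To see the NULL padding in practice, here is a short sketch in SQLite with the rows from the question. The column name incentive (for "incentive?") and the column types are assumptions, and tablet_id is assumed to be the column name in both tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (merchant_id INTEGER, tablet_id INTEGER, address TEXT)")
conn.execute("CREATE TABLE TableB (merchant INTEGER, tablet_id INTEGER, incentive TEXT)")
conn.executemany("INSERT INTO TableA VALUES (?, ?, ?)", [
    (33232, 1, "83 abs"), (94732, 2, "92 bcu"), (47373, 3, "dkid"), (48238, 3, "kdid"),
])
conn.executemany("INSERT INTO TableB VALUES (?, ?, ?)", [(33232, 1, "Yes"), (67382, 2, "No")])

# Every TableA row is kept; TableB columns are NULL where nothing matches.
joined = conn.execute("""
    SELECT TableA.merchant_id, TableA.tablet_id, TableB.incentive
    FROM TableA
    LEFT OUTER JOIN TableB ON TableA.tablet_id = TableB.tablet_id
""").fetchall()

for row in joined:
    print(row)  # rows with tablet_id 3 show None (NULL) for incentive
```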