Joining two tables on columns with dissimilar (but connected) values - sql

How can I join two tables on columns whose values are linked but not identical?
For instance, I need to join tbl1 to tbl2 where tbl1.col=100 and tbl2.col=200. The only connection they have is one known to me/my company.
Is there a way to link the rows without an explicit shared value? I need every row of tbl1 with col value 100 to appear on the same row as the columns of every tbl2 row with col value 200.

You can put some logic in your join predicate, as in:
select *
from tbl1 as a
join tbl2 as b on a.col + 100 = b.col

Is there a way to link the rows without an explicit shared value?
Yes. You can write a custom JOIN to relate data yourself.
You didn't specify your specific DBMS, so the following examples contain generic SQL.
SELECT * FROM tbl1, tbl2 WHERE tbl1.col = 100 AND tbl2.col = 200;
Or, more dynamically:
SELECT * FROM tbl1, tbl2 WHERE tbl1.col + 100 = tbl2.col;
-- with JOIN
SELECT * FROM tbl1 JOIN tbl2 ON (tbl1.col + 100) = tbl2.col;
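To see the computed join predicate in action, here is a minimal sketch that runs the JOIN form against an in-memory SQLite database from Python. The table and column names come from the question; the label column and the sample rows are made up for illustration, and SQLite is only an assumption, since the question never names a DBMS.

```python
import sqlite3

# Hypothetical sample data; only tbl1, tbl2 and col come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (col INTEGER, label TEXT);
    CREATE TABLE tbl2 (col INTEGER, label TEXT);
    INSERT INTO tbl1 VALUES (100, 'a-100'), (150, 'a-150');
    INSERT INTO tbl2 VALUES (200, 'b-200'), (999, 'b-999');
""")

# The ON clause accepts any boolean expression, not just column = column.
rows = conn.execute("""
    SELECT tbl1.col, tbl1.label, tbl2.col, tbl2.label
    FROM tbl1
    JOIN tbl2 ON tbl1.col + 100 = tbl2.col
    ORDER BY tbl1.col
""").fetchall()

print(rows)
```

Only the tbl1 row with col 100 finds a partner here, because 150 + 100 has no match in tbl2.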

select
*
from
tbl1
inner join
tbl2
on tbl1.col = 100 and tbl2.col = 200
Weird, but it will work.

If I understand your problem correctly, you have two tables that logically relate to each other, but the current keys in the tables don't (you only have business rules that put them together). I think you need to create a cross-reference table that maps that relationship. The cross-reference table would map the primary keys of the two tables to each other to show the logical relationship between the data.
I think all of the other posters have assumed that the relationship is one you can calculate, but I don't think that is what you are asking. Correct me if I'm wrong.
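A cross-reference table along those lines could look roughly like the sketch below. It assumes both tables have an integer primary key called id, and the mapping table name tbl1_tbl2_xref is invented for the example; neither appears in the question. It is run through Python's sqlite3 module purely so it can be tried without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed schema: the id columns are hypothetical primary keys.
    CREATE TABLE tbl1 (id INTEGER PRIMARY KEY, col INTEGER);
    CREATE TABLE tbl2 (id INTEGER PRIMARY KEY, col INTEGER);

    -- Hypothetical mapping table: one row per pairing dictated by business rules.
    CREATE TABLE tbl1_tbl2_xref (
        tbl1_id INTEGER NOT NULL REFERENCES tbl1(id),
        tbl2_id INTEGER NOT NULL REFERENCES tbl2(id),
        PRIMARY KEY (tbl1_id, tbl2_id)
    );

    INSERT INTO tbl1 VALUES (1, 100), (2, 300);
    INSERT INTO tbl2 VALUES (10, 200), (11, 750);
    -- 100 goes with 200 and 300 goes with 750; no formula connects them.
    INSERT INTO tbl1_tbl2_xref VALUES (1, 10), (2, 11);
""")

# Join through the mapping table instead of comparing col values directly.
pairs = conn.execute("""
    SELECT tbl1.col, tbl2.col
    FROM tbl1
    JOIN tbl1_tbl2_xref AS x ON x.tbl1_id = tbl1.id
    JOIN tbl2 ON tbl2.id = x.tbl2_id
    ORDER BY tbl1.id
""").fetchall()

print(pairs)
```

The advantage over a computed predicate is that a new pairing becomes a row insert rather than a query change.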

Related

How to merge data from two tables with some common fields

I have two tables that hold similar data. Some columns in table A match columns in table B but follow different naming conventions, and each table also has columns with no equivalent in the other. No rows should be merged. I need a view (I think) with all the rows from both tables, but with some columns merged, so that data from table A.columnB and data from table B.columnF both end up in view C.columnD. Columns in the view that have a source in only one of the tables would be null in rows from the other table. I can't change any of the existing table structure because the database is shared across multiple apps. I think I need to use a bunch of FULL OUTER JOIN statements in the view, but I'm having trouble wrapping my mind around how to go about it. If anyone can provide a generic example of how this should look, I should be able to take it from there.
Here's an example of what doesn't work (there are a lot more columns on each side of the JOIN in the actual db, truncated for readability):
SELECT
schedule_block.id as vid,
schedule_block.reason as vreason,
schedule_block.when_ts as vwhen,
schedule_block.duration as vduration,
schedule_block.note as vnote,
schedule_block.deleted_ts as vdeleted_when,
schedule_block.deleted_user_id as vdeleted_user_id,
schedule_block.lastmodified_ts as vlastmodified_ts,
schedule_block.lastmodified_user_id as vlastmodified_user_id
FROM schedule_block
FULL OUTER JOIN appointment.appt_when as vwhen ON 1 = 1
FULL OUTER JOIN appointment.patient_id as vpatient ON 1 = 1
FULL OUTER JOIN appointment.duration as vduration on 1 = 1
FULL OUTER JOIN appointment.deleted_when as vdeleted_when ON 1 = 1
Correct me if I'm wrong, but I think I can't use a UNION because there are different numbers of columns on each side.
You could do something like:
SELECT ColA, CONVERT(DATE, NULL) AS ColB
FROM T1
UNION ALL
SELECT CONVERT(VARCHAR(10), NULL) AS ColA, ColB
FROM T2
Just make sure to match the datatypes.
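Applied to the tables in the question, the NULL-padding idea might look like the sketch below. Only a few of the question's columns are used, the id column on appointment and the extra source column are assumptions added for illustration, and the view name combined is made up. It runs on SQLite through Python, so the CONVERT calls from the answer are replaced by plain NULL, which SQLite accepts without a type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Trimmed-down, partly hypothetical versions of the question's tables.
    CREATE TABLE schedule_block (id INTEGER, when_ts TEXT, duration INTEGER, reason TEXT);
    CREATE TABLE appointment (id INTEGER, appt_when TEXT, duration INTEGER, patient_id INTEGER);
    INSERT INTO schedule_block VALUES (1, '2020-01-01 09:00', 30, 'lunch');
    INSERT INTO appointment VALUES (7, '2020-01-01 10:00', 15, 42);

    -- Each SELECT lists the same number of columns in the same order;
    -- a column with no source on one side is filled with NULL.
    CREATE VIEW combined AS
    SELECT 'schedule_block' AS source, id AS vid, when_ts AS vwhen,
           duration AS vduration, reason AS vreason, NULL AS vpatient
    FROM schedule_block
    UNION ALL
    SELECT 'appointment', id, appt_when, duration, NULL, patient_id
    FROM appointment;
""")

combined = conn.execute("SELECT * FROM combined ORDER BY vwhen").fetchall()
for row in combined:
    print(row)
```

UNION ALL keeps every row from both tables, which matches the requirement that no rows be merged.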

How does JOIN work exactly in SQL

I know that joins work by combining two or more tables by their attributes, so if you have two tables that both have three columns and both have column INDEX, if you use table1 JOIN table2 you will get a new table with 5 columns, but what if you do not have a column that is shared by both table1 and table2? Can you still use JOIN or do you have to use TIMES?
Join is not a method for combining tables. It is a method to select records (and selected fields) from 2 or more tables where every table in the query must carry a field that can be matched to a field in another table in the query. The matched fields need not have the same name, but must carry the same type of data. Lacking this would be like trying to create meaning from joining a list of license plates of cars in NYC, with height data from lumberjacks in Washington state -- not meaningful.
Example:
Select h.name, h.home_address, h.home_phone, w.work_address,
w.department
from home h, work w
where h.employee_id = w.emp_id
As long as both columns, employee_id and emp_id, carry the same information, this query will work.
In Microsoft Access, to get five columns from a three column table joined to a two column table, you'd use:
SELECT Table1.*, Table2.* FROM Table1 INNER JOIN Table2 ON Table1.Field1 = Table2.Field1;
You can query whatever you want, and join whatever you want, though.
If your one table is a list of people, and your other is a list of cars, and you want to see what people have names that are also models of cars, you can do:
SELECT Table1.Name, Table1.Age, Table2.Make, Table2.Year
FROM Table1 INNER JOIN Table2 ON Table1.Name = Table2.Model;
Only when Name is the same as Model will it show a record.
This is the same idea for joining tables in any relational DBMS I've used.
You are right: you can join two tables even if they do not have a shared column.
Primary keys are used to prevent mistakes on inserting or deleting, for example when a user tries to insert a record that does not have a parent record.
There are many types of join; you can view them here:
http://dev.mysql.com/doc/refman/5.7/en/join.html
LEFT JOIN: selects all records from the first table, then the records from the second table that fulfill the condition after the ON clause.
You can't join the tables if they do not share a common column. If you can find a third table that has common columns with table1 and table2, you can join them that way: join table2 and table3 on a common column, and then join table3 back to table1 on a common column.

Number of Records don't match when Joining three tables

Despite going through every material I could possibly find on the internet, I haven't been able to solve this issue myself. I am new to MS Access and would really appreciate any pointers.
Here's my problem - I have three tables
Source1084 with columns - Department, Sub-Dept, Entity, Account, +few more
R12CAOmappingTable with columns - Account, R12_Account
Table4 with columns - R12_Account, Department, Sub-Dept, Entity, New Dept, LOB +few more
I have a total of 1084 records in Source and the result table must also contain 1084 records. I need to draw a table with all the columns from Source + R12_account from R12CAOmappingTable + all columns from Table4.
Here is the query I wrote. It yields the right columns, but depending on which join options I use it returns either more or fewer records than expected.
SELECT rmt.r12_account,
srb.version,
srb.fy,
srb.joblevel,
srb.scenario,
srb.department,
srb.[sub-department],
srb.[job function],
srb.entity,
srb.employee,
table4.lob,
table4.product,
table4.newacct,
table4.newdept,
srb.[beg balance],
srb.jan,
srb.feb,
srb.mar,
srb.apr,
srb.may,
srb.jun,
srb.jul,
srb.aug,
srb.sep,
srb.oct,
srb.nov,
srb.dec,
rmt.r12_account
FROM (source1084 AS srb
LEFT JOIN r12caomappingtable AS rmt
ON srb.account = rmt.account)
LEFT JOIN table4
ON ( srb.department = table4.dept )
AND ( srb.[sub-department] = table4.subdept )
AND ( srb.entity = table4.entity )
WHERE ( ( ( srb.[sub-department] ) = table4.subdept )
AND ( ( srb.entity ) = table4.entity )
AND ( ( rmt.r12_account ) = table4.r12_account ) );
In this simple example, Table1 contains 3 rows with unique fld1 values. Table2 contains one row, and the fld1 value in that row matches one of those in Table1. Therefore this query returns 3 rows.
SELECT *
FROM
Table1 AS t1
LEFT JOIN Table2 AS t2
ON t1.fld1 = t2.fld1;
However if I add the WHERE clause as below, that version of the query returns only one row --- the row where the fld1 values match.
SELECT *
FROM
Table1 AS t1
LEFT JOIN Table2 AS t2
ON t1.fld1 = t2.fld1
WHERE t1.fld1 = t2.fld1;
In other words, that WHERE clause counteracts the LEFT JOIN because it excludes rows where t2.fld1 is Null. If that makes sense, notice that second query is functionally equivalent to this ...
SELECT *
FROM
Table1 AS t1
INNER JOIN Table2 AS t2
ON t1.fld1 = t2.fld1;
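The three row counts described above can be checked with a small script. This is only a sketch: it uses SQLite through Python rather than Access, and the fld1 values 1, 2, 3 are invented, but the LEFT JOIN versus WHERE behavior is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (fld1 INTEGER);
    CREATE TABLE Table2 (fld1 INTEGER);
    INSERT INTO Table1 VALUES (1), (2), (3);  -- three unique values
    INSERT INTO Table2 VALUES (2);            -- one row, matching one of them
""")

def count_rows(sql):
    """Return how many rows the given SELECT statement produces."""
    return conn.execute("SELECT COUNT(*) FROM (" + sql + ")").fetchone()[0]

left_join = ("SELECT t1.fld1 FROM Table1 AS t1 "
             "LEFT JOIN Table2 AS t2 ON t1.fld1 = t2.fld1")

left_count = count_rows(left_join)
# The WHERE clause discards the rows where t2.fld1 is NULL.
filtered_count = count_rows(left_join + " WHERE t1.fld1 = t2.fld1")
inner_count = count_rows("SELECT t1.fld1 FROM Table1 AS t1 "
                         "INNER JOIN Table2 AS t2 ON t1.fld1 = t2.fld1")

print(left_count, filtered_count, inner_count)
```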
Your situation is similar. I suggest you first eliminate the WHERE clause and confirm this query returns at least your expected 1084 rows.
SELECT Count(*) AS CountOfRows
FROM (source1084 AS srb
LEFT JOIN r12caomappingtable AS rmt
ON srb.account = rmt.account)
LEFT JOIN table4
ON ( srb.department = table4.dept )
AND ( srb.[sub-department] = table4.subdept )
AND ( srb.entity = table4.entity );
After you get the query returning the correct number of rows, you can alter the SELECT list to return the columns you want. But the columns aren't really the issue until you can get the correct rows.
Without knowing the values in your tables, it is hard to give a complete answer to your question. Based on how you described it, the issue is more than likely caused by the type of joins you are using.
The best way I have found to understand which type of join to use is to reference a Venn diagram explaining the different types of joins.
Jeff Atwood also has a really good explanation of SQL joins on his site using the above method as well.
Best to just use the query builder. Drop in your main table. Choose the columns you want. Now for any of the other lookup values then simply drop in the other tables, draw the join line(s), double click and use a left join. You can do this for 2 or 30 columns that need to "grab" or lookup other values from other tables. The number of ORIGINAL rows in the base table returned should ALWAYS remain the same.
So just use the query builder and follow the above.
The problem with your posted SQL is you NESTED the joins inside (). Don't do that. (or let the query builder do this for you – they tend to be quite messy but will also work).
Just use this:
FROM source1084 AS srb
LEFT JOIN r12caomappingtable AS rmt
ON srb.account = rmt.account
LEFT JOIN table4
ON ( srb.department = table4.dept )
AND ( srb.[sub-department] = table4.subdept )
AND ( srb.entity = table4.entity )
As noted, I don't see why you are "repeating" the conditions again in the where clause.
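One caveat on the advice that the base row count always stays the same: that holds only when each base row matches at most one row in the lookup table. The sketch below, with made-up account values and SQLite standing in for Access, shows a LEFT JOIN returning more rows than the source because one account is mapped twice, plus a query to find such duplicate keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table names are from the question; the rows are hypothetical.
    CREATE TABLE source1084 (account TEXT);
    CREATE TABLE r12caomappingtable (account TEXT, r12_account TEXT);
    INSERT INTO source1084 VALUES ('A1'), ('A2'), ('A3');
    -- 'A1' is mapped twice on purpose; 'A3' has no mapping at all.
    INSERT INTO r12caomappingtable VALUES ('A1', 'R-10'), ('A1', 'R-11'), ('A2', 'R-20');
""")

source_count = conn.execute("SELECT COUNT(*) FROM source1084").fetchone()[0]

joined_count = conn.execute("""
    SELECT COUNT(*)
    FROM source1084 AS srb
    LEFT JOIN r12caomappingtable AS rmt ON srb.account = rmt.account
""").fetchone()[0]

# Keys that appear more than once in the lookup table multiply base rows.
duplicate_keys = conn.execute("""
    SELECT account
    FROM r12caomappingtable
    GROUP BY account
    HAVING COUNT(*) > 1
""").fetchall()

print(source_count, joined_count, duplicate_keys)
```

If the duplicate check comes back empty for both lookup tables, a row count above 1084 has to come from somewhere else.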

SQL Join for cell content, not column name

I read up on SQL Join but as far as I understand it, you can only join tables which have a column name in common.
I have information in two different tables, but the column name is different in each. I need to pull information on something which is only in one of the tables, but also need information from the other. So was looking to join/merge them.
Here is what I mean..
TABLE1:
http://postimg.org/image/hnd63c2f5/
The cell content 18599 in column from_pin_id also pertains to content in another table:
TABLE2:
http://postimg.org/image/apmu26l5z/
My question is how do I merge the two table details so that it recognizes 18599 is referring to the same thing, so that I can pull content on it from other columns in TABLE2?
I've looked through the codes on W3 but cannot find anything to what I need, as mentioned above, it seems to be just for joining tables with a common column:
SELECT column_name(s)
FROM table1
JOIN table2
ON table1.column_name=table2.column_name;
You can write it as:
select * from table1
where from_pin_id in
(
select from_pin_id
from table1
intersect
select id
from table2
)
Intersect operator selects all elements that belong to both of the sets.
Change the table names and the columns that you select as needed.
SELECT table1.id, table1.owner_user_id, table1.from_pin_id, table2.board_id
FROM table1
JOIN table2 ON table1.from_pin_id = table2.id
GROUP BY table1.id, table1.owner_user_id, table1.from_pin_id, table2.board_id
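Here is a small runnable version of that join, assuming the columns visible in the question's screenshots (id, owner_user_id, from_pin_id in TABLE1; id, board_id in TABLE2). All values other than 18599 are invented, and SQLite via Python is just a convenient way to try it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed column layout; only 18599 is taken from the question.
    CREATE TABLE table1 (id INTEGER, owner_user_id INTEGER, from_pin_id INTEGER);
    CREATE TABLE table2 (id INTEGER, board_id INTEGER);
    INSERT INTO table1 VALUES (1, 7, 18599), (2, 8, 12345);
    INSERT INTO table2 VALUES (18599, 55), (20000, 56);
""")

# The join compares cell values; the column names on each side can differ.
rows = conn.execute("""
    SELECT t1.id, t1.owner_user_id, t1.from_pin_id, t2.board_id
    FROM table1 AS t1
    JOIN table2 AS t2 ON t1.from_pin_id = t2.id
""").fetchall()

print(rows)
```

Qualifying every column with its table alias avoids ambiguity, since both tables have a column named id.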

Find difference between two big tables in PostgreSQL

I have two similar tables in Postgres with just one 32-byte latin field (simple md5 hash).
Both tables have ~30,000,000 rows. Tables have little difference (10-1000 rows are different)
Is it possible with Postgres to find a difference between these tables, the result should be 10-1000 rows I described above.
This is not a real task, I just want to know about how PostgreSQL deals with JOIN-like logic.
EXISTS seems like the best option.
tbl1 is the table with surplus rows in this example:
SELECT *
FROM tbl1
WHERE NOT EXISTS (SELECT FROM tbl2 WHERE tbl2.col = tbl1.col);
If you don't know which table has surplus rows or both have, you can either repeat the above query after switching table names, or:
SELECT *
FROM tbl1
FULL OUTER JOIN tbl2 USING (col)
WHERE tbl2.col IS NULL OR
tbl1.col IS NULL;
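The "repeat the query after switching table names" variant can be written as a single statement with UNION ALL. The sketch below runs it on SQLite through Python rather than PostgreSQL, because older SQLite versions do not support FULL OUTER JOIN; the three-letter values stand in for md5 hashes and the only_in column is added for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl1 (col TEXT);
    CREATE TABLE tbl2 (col TEXT);
    -- Hypothetical stand-ins for the 32-character hashes.
    INSERT INTO tbl1 VALUES ('aaa'), ('bbb'), ('ccc');
    INSERT INTO tbl2 VALUES ('bbb'), ('ccc'), ('ddd');
""")

# Two anti-joins, one per direction, labelled with the table the row came from.
diff = conn.execute("""
    SELECT 'tbl1' AS only_in, col FROM tbl1
    WHERE NOT EXISTS (SELECT 1 FROM tbl2 WHERE tbl2.col = tbl1.col)
    UNION ALL
    SELECT 'tbl2' AS only_in, col FROM tbl2
    WHERE NOT EXISTS (SELECT 1 FROM tbl1 WHERE tbl1.col = tbl2.col)
    ORDER BY only_in
""").fetchall()

print(diff)
```

With tables of around 30,000,000 rows, an index on col in both tables is probably what decides whether either form is fast.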
Overview over basic techniques in a later post:
Select rows which are not present in other table
Aside: The data type uuid is efficient for md5 hashes:
Convert hex in text representation to decimal number
Would index lookup be noticeably faster with char vs varchar when all values are 36 chars
To augment the existing answers: I use the row() constructor in the join condition, which lets you compare entire rows. For example, my typical query to see the symmetric difference looks like this:
select *
from tbl1
full outer join tbl2
on row(tbl1) = row(tbl2)
where tbl1.col is null
or tbl2.col is null
If you want to find the difference without knowing which table has more rows than the other, you can try this option, which returns all rows present in only one of the two tables (col stands for the single hash column):
SELECT * FROM A
WHERE NOT EXISTS (SELECT * FROM B WHERE B.col = A.col)
UNION
SELECT * FROM B
WHERE NOT EXISTS (SELECT * FROM A WHERE A.col = B.col)
In my experience, NOT IN with a subquery takes a very long time. I'd do it with an inclusive join:
DELETE FROM table1 where ID IN (
SELECT table1.id FROM table1
LEFT OUTER JOIN table2 on table1.hashfield = table2.hashfield
WHERE table2.hashfield IS NULL)
And then do the same the other way around for the other table.