SQL: finding unmatched observations using multiple criteria - sql

As I am not very experienced with SQL, I am stuck on what is very likely a simple problem.
I have two tables t1 and t2 and I need to find observations in t2 that are not already in t1. The two tables contain lists of people by name, last name, email address...
The problem is that I can match some people by email address and others only by name + last name, and so on.
For the people in t1 who also appear in t2, I need to set a value to 1; then I need a list of the people in t2 who are unmatched.
So I've done:
UPDATE t1, t2 SET t1.value = "1"
WHERE (t1.Mail In t2.Mail);
which seems to have worked, but then when I try using name + last name it does not work:
UPDATE t1, t2 SET t1.value = "1"
WHERE (t1.Name AND t1.Surname In t2.Name AND t2.Surname);
Then, for the list of unmatched observations, I can find people that have no match by email and people that have no match by name + last name, but what I want is the people with no email match who also have no name + last name match.
What I've done:
SELECT t2.*
FROM t2 LEFT JOIN t1 ON t2.Mail = t1.Mail
WHERE (t1.Mail Is Null);
and
SELECT t2.*
FROM t2
LEFT JOIN t1 ON t2.Surname = t1.Surname AND t2.Name = t1.Name
WHERE (t1.ID is Null);

Your current update queries are referencing the tables using a cross join or cartesian product, that is: for each record in the first table, the query is iterating over every record in the second table, with all matching occurring through the use of the where clause criteria.
This is generally considered bad practice when left/right/inner joins can be used, per your second set of examples using the left join.
As such, assuming I have correctly understood your data, I might suggest the following:
update t1 inner join t2 on t1.mail = t2.mail set t1.value = "1"
update t1 inner join t2 on t1.name = t2.name and t1.surname = t2.surname set t1.value = "1"
Regarding your second question:
...I would like to find the people that have no match for name+last name of those who have no match for email address.
You could use something along the lines of the following:
select q.*
from
(
select t2.* from t2 left join t1 on t2.mail = t1.mail
where t1.mail is null
) q
left join t1 on q.name = t1.name and q.surname = t1.surname
where
t1.name is null
This uses a subquery to obtain an initial set of records for which the email address does not match, and then left joins the results of this subquery with table t1 to obtain the records for which there is no matching first & last name.
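Both steps can be tried end to end with a small, self-contained sketch. It uses Python's built-in sqlite3 module rather than Access, and the sample rows and the column types are invented for illustration. SQLite does not accept the `UPDATE ... INNER JOIN ... SET` form shown above, so the flagging step is written with `EXISTS` instead, which should flag the same rows; the final select is the chained anti-join from the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, name TEXT, surname TEXT, mail TEXT, value TEXT);
CREATE TABLE t2 (id INTEGER PRIMARY KEY, name TEXT, surname TEXT, mail TEXT);
-- Hypothetical sample data
INSERT INTO t1 (name, surname, mail) VALUES
    ('Ann', 'Lee',  'ann@example.com'),
    ('Bob', 'Ray',  'bob@old.example.com'),
    ('Cy',  'Dunn', 'cy@example.com');
INSERT INTO t2 (name, surname, mail) VALUES
    ('Ann',  'Lee', 'ann@example.com'),      -- matches t1 by mail
    ('Bob',  'Ray', 'bob@new.example.com'),  -- matches t1 by name + surname only
    ('Dana', 'Fox', 'dana@example.com');     -- matches nothing in t1
""")

# Step 1: flag t1 rows that match t2 on mail, or on name + surname.
# (SQLite stand-in for the two UPDATE ... INNER JOIN statements.)
conn.execute("""
UPDATE t1 SET value = '1'
WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.mail = t1.mail)
   OR EXISTS (SELECT 1 FROM t2
              WHERE t2.name = t1.name AND t2.surname = t1.surname)
""")
flagged = [r[0] for r in conn.execute(
    "SELECT name FROM t1 WHERE value = '1' ORDER BY name")]

# Step 2: t2 rows with no mail match, then of those, no name + surname match.
unmatched = [r[0] for r in conn.execute("""
SELECT q.name
FROM (
    SELECT t2.* FROM t2 LEFT JOIN t1 ON t2.mail = t1.mail
    WHERE t1.mail IS NULL
) q
LEFT JOIN t1 ON q.name = t1.name AND q.surname = t1.surname
WHERE t1.name IS NULL
""")]

print(flagged)    # ['Ann', 'Bob']
print(unmatched)  # ['Dana']
```

Bob is dropped from the unmatched list only in the second step, because his email differs between the two tables while his name and surname agree.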

Related

Join tables based on condition in Hive

I have two tables and have mentioned the layout of tables in the below screenshot.
Using Id column I need to join two tables. Whenever there is a match with Id (example Id is a2), I need to take the corresponding values from Table2. In some scenario, Id from Table1 will not be there in Table2 (example Id is a1). In that case, I have to take values from Table2 where the value of Id in Table2 is all.
For better clarification, I have mentioned the Expected output in the below screenshot.
I tried the code below, but it doesn't seem to be working.
select distinct t1.Id,t1.Name,t2.Phone1,t2.Phone2
from Table1 t1
left join Table1 t2 on t1.Id = case when t2.Id is null then t1.Id else t2.Id end
Any help is greatly appreciated.
You can use two left joins, where the second brings in the default values:
select t1.*, coalesce(t2.phone1, t2d.phone1) as phone1, coalesce(t2.phone2, t2d.phone2) as phone2
from table1 t1 left join
table2 t2
on t1.id = t2.id left join
table2 t2d
on t2d.id = 'All' and t2.id is null;
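As a rough check of the double left join, here is a sketch run against SQLite through Python's sqlite3 module instead of Hive. The screenshots are not available, so the rows below are made up: `a2` has its own row in Table2, `a1` does not, and the default row is assumed to have the Id `'All'`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id TEXT, name TEXT);
CREATE TABLE table2 (id TEXT, phone1 TEXT, phone2 TEXT);
-- Hypothetical sample data
INSERT INTO table1 VALUES ('a1', 'Ann'), ('a2', 'Bob');
INSERT INTO table2 VALUES ('a2', '111', '222'), ('All', '000', '999');
""")

rows = conn.execute("""
SELECT t1.id, t1.name,
       COALESCE(t2.phone1, t2d.phone1) AS phone1,
       COALESCE(t2.phone2, t2d.phone2) AS phone2
FROM table1 t1
LEFT JOIN table2 t2  ON t1.id = t2.id                      -- exact match
LEFT JOIN table2 t2d ON t2d.id = 'All' AND t2.id IS NULL   -- default row, only when no exact match
ORDER BY t1.id
""").fetchall()

for row in rows:
    print(row)
# ('a1', 'Ann', '000', '999')
# ('a2', 'Bob', '111', '222')
```

The `t2.id IS NULL` condition keeps the default row from being joined when an exact match exists, so each Table1 row appears once.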

Join two tables on two dimensions without cross-joining

I have a table with IDs and domains(T1). Another with Names and domains(T2). A third with names and IDs (T3).
In its simplified form, my query goes as follows :
SELECT *
FROM T2
LEFT JOIN T1
ON T2.domain = T1.domain
)
LEFT JOIN T3
ON T1.name = T3.name
The output I'm looking for is a list with columns : "ID", "Name" and "Domain" where either domains or Names match in order to get the IDs. The challenge I face is that one domain can match with two names, and this creates a set of false positives (because the name matches, the wrong ID is also attributed).
Any best practices I should follow when doing these kind of joins would be most helpful.
I think you want:
SELECT t2.name, t2.domain, coalesce(t1.id, t3.id) AS id
FROM T2 LEFT JOIN
T1
ON T2.domain = T1.domain LEFT JOIN
T3
ON T2.name = T3.name AND
t1.domain IS NULL; -- no match on T1
This matches on domain first. Then if there is no domain-match, it uses name.
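A small sketch of the domain-first, name-second fallback, run on SQLite through Python's sqlite3 module. The rows are invented, and the fallback ID is assumed to come from T3, since the question describes T2 as holding only names and domains.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE T1 (id INTEGER, domain TEXT);
CREATE TABLE T2 (name TEXT, domain TEXT);
CREATE TABLE T3 (name TEXT, id INTEGER);
-- Hypothetical sample data
INSERT INTO T1 VALUES (1, 'acme.example');
INSERT INTO T2 VALUES ('Acme', 'acme.example'),
                      ('Globex', 'globex.example'),
                      ('Initech', 'initech.example');
INSERT INTO T3 VALUES ('Acme', 99), ('Globex', 2);
""")

rows = conn.execute("""
SELECT T2.name, T2.domain, COALESCE(T1.id, T3.id) AS id
FROM T2
LEFT JOIN T1 ON T2.domain = T1.domain
LEFT JOIN T3 ON T2.name = T3.name
            AND T1.domain IS NULL   -- use the name match only when the domain did not match
ORDER BY T2.name
""").fetchall()

for row in rows:
    print(row)
# ('Acme', 'acme.example', 1)
# ('Globex', 'globex.example', 2)
# ('Initech', 'initech.example', None)
```

Acme keeps ID 1 from the domain match even though T3 lists a different ID (99) under the same name, which is the kind of false positive the question wants to avoid.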

One update SQL query from three tables

I have three tables and I have to write one query to update table 1 row from table 3 and the only matching columns I have is in table 2.
Table 1 which has incorrect data:
Table 3 has the correct data:
I tried to write a query and execute it, but it gives me an error saying there are too many rows to select. That is true, I do have many rows to correct, but it still won't correct them. What do you think I should do? This is my query so far:
UPDATE Table1
SET Table1.Number = (SELECT Table3.Number
FROM Table2
FULL OUTER JOIN Table1 ON Table1.ID = Table2.ID
FULL OUTER JOIN Table3 ON Table3.Signin = Table2.Signin
WHERE (Table2.ID = Table1.ID)
AND (Table1.Number = 'xxx'))
WHERE (Tale1.Number = 'xxx')
The WHERE clause of the subquery needs to change: with the current condition the subquery returns multiple rows. Try filtering on Table3 columns instead of Table1 columns in the subquery's WHERE clause.
UPDATE Table1
SET Table1.NUMBER = (SELECT table3.NUMBER FROM Table1 FULL OUTER JOIN Table2
ON Table1.ID = Table2.ID
FULL OUTER JOIN Table3
ON Table2.SIGNIN = Table3.SIGNIN
WHERE Table3.SIGNIN = 100) -- This is the point where you need to modify your code
WHERE Table1.ID = 1;
It actually worked after I removed this line from my query.
FULL OUTER JOIN Table1 ON table1.ID = Table2.ID
Thanks for the help.
You are fairly close. For the update, though, you will want an INNER JOIN unless you want to clear t1.number when no record matches in t3. A FULL OUTER JOIN would mean trying to update rows in t1 that don't exist, while a LEFT JOIN would set t1.number to NULL wherever no matching record exists in t3.
UPDATE t1
SET t1.Number = t3.Number
FROM
Table1 t1
INNER JOIN Table2 t2
ON t1.Id = t2.Id
INNER JOIN Table3 t3
ON t2.Signin = t3.Signin
WHERE
t1.number <> t3.number
--Or if you have nulls something like
--ISNULL(t1.number,'xxx') <> ISNULL(t3.number,'xxx')
-- if you only want to update when t1.number = 'xxx' then
--t1.number = 'xxx'
t1, t2 and t3 are table aliases, created by adding the alias after the table name. Using join syntax rather than a subselect simplifies your WHERE conditions. In SQL Server, if more than one record in t2 and t3 matches a row in t1 (a one-to-many relationship), one of the matching rows is used arbitrarily. If you want a specific record when the relationship is not one-to-one, you can use window functions and common table expressions (CTEs) to limit t3 to the exact record you want.
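The inner-join behaviour (rows without a match in Table3 are left alone) can be checked with a sketch on SQLite through Python's sqlite3 module. This is not the SQL Server `UPDATE ... FROM` statement above: to stay portable across SQLite versions it swaps in a correlated subquery guarded by `EXISTS`, which is assumed here to give the same result when each Table1 row has at most one match. The table contents are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Table1 (ID INTEGER, Number TEXT);
CREATE TABLE Table2 (ID INTEGER, Signin INTEGER);
CREATE TABLE Table3 (Signin INTEGER, Number TEXT);
-- Hypothetical sample data
INSERT INTO Table1 VALUES (1, 'xxx'), (2, 'xxx'), (3, '555');
INSERT INTO Table2 VALUES (1, 100), (2, 200), (3, 300);
INSERT INTO Table3 VALUES (100, '111'), (300, '333');  -- nothing for Signin 200
""")

conn.execute("""
UPDATE Table1
SET Number = (SELECT t3.Number
              FROM Table2 t2
              JOIN Table3 t3 ON t3.Signin = t2.Signin
              WHERE t2.ID = Table1.ID)
WHERE Number = 'xxx'                       -- only fix the placeholder rows
  AND EXISTS (SELECT 1                     -- inner-join semantics: skip rows with no match
              FROM Table2 t2
              JOIN Table3 t3 ON t3.Signin = t2.Signin
              WHERE t2.ID = Table1.ID)
""")

rows = conn.execute("SELECT ID, Number FROM Table1 ORDER BY ID").fetchall()
print(rows)  # [(1, '111'), (2, 'xxx'), (3, '555')]
```

Row 2 keeps 'xxx' because Table3 has no entry for its Signin; without the `EXISTS` guard the subquery would return no row and the value would be overwritten with NULL, which is the LEFT JOIN behaviour described above.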

Join where does not equal

I have two queries I have created in Access and I'm a step away from arriving at the data I need. I need to subset in some way.
Table1 and Table2 (Actually query1 and query2). They both have 3 fields: Email, Matcher and List.
I need to get all the results from Table2 where Email does not exist in Table1.
I found some posts about using an outer join and where null clause. I could not get it to work though. Didn't post what I tried here in case I was off course.
select t2.*
from table2 t2
left join table1 t1 on t2.email = t1.email
where t1.email is null
Alternatively, using NOT EXISTS:
SELECT t2.*
FROM table2 t2
WHERE NOT EXISTS (
SELECT *
FROM table1 t1
WHERE t1.email = t2.email
)
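The two forms can be compared side by side with a sketch on SQLite through Python's sqlite3 module (Access itself is not involved, and the rows are invented). On this data both queries return the same Table2 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (email TEXT, matcher TEXT, list TEXT);
CREATE TABLE table2 (email TEXT, matcher TEXT, list TEXT);
-- Hypothetical sample data
INSERT INTO table1 VALUES ('a@example.com', 'm1', 'l1'),
                          ('b@example.com', 'm2', 'l1');
INSERT INTO table2 VALUES ('b@example.com', 'm2', 'l2'),
                          ('c@example.com', 'm3', 'l2'),
                          ('d@example.com', 'm4', 'l2');
""")

# Anti-join written as LEFT JOIN ... IS NULL
via_left_join = conn.execute("""
SELECT t2.email
FROM table2 t2
LEFT JOIN table1 t1 ON t2.email = t1.email
WHERE t1.email IS NULL
ORDER BY t2.email
""").fetchall()

# The same anti-join written with NOT EXISTS
via_not_exists = conn.execute("""
SELECT t2.email
FROM table2 t2
WHERE NOT EXISTS (SELECT * FROM table1 t1 WHERE t1.email = t2.email)
ORDER BY t2.email
""").fetchall()

print(via_left_join)   # [('c@example.com',), ('d@example.com',)]
print(via_not_exists)  # [('c@example.com',), ('d@example.com',)]
```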

MySQL Join syntax for one to many relationship

I have a situation where I have one table of titles (t1) and another table with multiple links that reference these titles (t2) in a one to many relationship.
What I want is the full list of titles returned with a flag that indicates if there is a specific link associated with it.
Left Join and Group By:
SELECT
t1.id
, t1.title
, t2.link_id AS refId
FROM
t1
LEFT JOIN t2
ON (t1.id = t2.title_id)
GROUP BY t1.id;
This is close as it gives me either the first link_id or NULL in the refId column.
Now, how do I constrain the results to a specific link_id rather than letting t2 run through the whole data set?
If I add a WHERE clause, for example:
WHERE t2.link_id = 123
I only get the few records where the link_id matches but I still need the full set of titles returned with NULL in the refId column unless link_id = 123.
Hope someone can help
Instead of in the WHERE clause, put your criteria in the LEFT JOIN clause:
SELECT
t1.id
, t1.title
, t2.link_id AS refId
FROM
t1
LEFT JOIN t2
ON t1.id = t2.title_id AND t2.link_id = 123
GROUP BY t1.id;
Put it in the join condition for the second table
SELECT t1.id, t1.title, t2.link_id as refId
FROM t1
LEFT JOIN t2 ON t1.id = t2.title_id AND t2.link_id = 123
GROUP BY t1.id;
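The difference between filtering in WHERE and filtering in the join condition can be seen in a sketch on SQLite through Python's sqlite3 module rather than MySQL. The rows are invented, and link 123 is assumed to be attached to at most one title, so the GROUP BY is left out to keep the output unambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t1 (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE t2 (link_id INTEGER, title_id INTEGER);
-- Hypothetical sample data
INSERT INTO t1 VALUES (1, 'First'), (2, 'Second'), (3, 'Third');
INSERT INTO t2 VALUES (123, 1), (456, 1), (789, 2);
""")

# Filter in WHERE: rows where t2.link_id is NULL or different are discarded,
# so the LEFT JOIN effectively behaves like an INNER JOIN.
in_where = conn.execute("""
SELECT t1.id, t1.title, t2.link_id AS refId
FROM t1
LEFT JOIN t2 ON t1.id = t2.title_id
WHERE t2.link_id = 123
ORDER BY t1.id
""").fetchall()

# Filter in ON: every title is kept, refId is NULL unless link 123 is attached.
in_on = conn.execute("""
SELECT t1.id, t1.title, t2.link_id AS refId
FROM t1
LEFT JOIN t2 ON t1.id = t2.title_id AND t2.link_id = 123
ORDER BY t1.id
""").fetchall()

print(in_where)  # [(1, 'First', 123)]
print(in_on)     # [(1, 'First', 123), (2, 'Second', None), (3, 'Third', None)]
```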