find duplicate values between two tables

I have two tables: Customer1 and Customer2. Both tables have the same fields, but the employee names differ. Some of the employees are repeated in both tables.
I want to find the employees that are duplicated across both tables.
Sample data:
Customer1
ID  Name   Designation  salary
1   User1  Developer    5000
1   User2  Developer    5000
1   User5  Developer    5000
1   User1  Developer    5000
Customer2
ID  Name   Designation  salary
1   User1  Developer    5000
1   User2  Developer    5000
1   User3  Developer    5000
1   User1  Developer    5000
Result
ID  Name   Designation  salary
1   User1  Developer    5000
1   User2  Developer    5000
User1 and User2 are in both tables many times, but I want to count them only once. I really appreciate any help on this.

You can join the two tables and then print out the distinct records from Customer1 that remain:
SELECT DISTINCT customer1.*
FROM customer1
INNER JOIN customer2
    ON customer1.id = customer2.id
    AND customer1.name = customer2.name;
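
As a quick check of the join approach, here is a minimal sketch that loads the question's sample data into an in-memory SQLite database through Python's sqlite3 module. SQLite is only an assumed stand-in, since the question does not name a database engine, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer1 (id INTEGER, name TEXT, designation TEXT, salary INTEGER);
    CREATE TABLE customer2 (id INTEGER, name TEXT, designation TEXT, salary INTEGER);
    INSERT INTO customer1 VALUES
        (1, 'User1', 'Developer', 5000), (1, 'User2', 'Developer', 5000),
        (1, 'User5', 'Developer', 5000), (1, 'User1', 'Developer', 5000);
    INSERT INTO customer2 VALUES
        (1, 'User1', 'Developer', 5000), (1, 'User2', 'Developer', 5000),
        (1, 'User3', 'Developer', 5000), (1, 'User1', 'Developer', 5000);
""")

# DISTINCT collapses the repeated User1 matches produced by the join.
rows = conn.execute("""
    SELECT DISTINCT customer1.*
    FROM customer1
    INNER JOIN customer2
        ON customer1.id = customer2.id
        AND customer1.name = customer2.name
    ORDER BY customer1.name
""").fetchall()

print(rows)
# [(1, 'User1', 'Developer', 5000), (1, 'User2', 'Developer', 5000)]
```

Note that the join compares only id and name; if Designation or salary can differ between the tables, add them to the ON clause.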

You can use UNION ALL and do aggregation. De-duplicate each table first, so that a row repeated within only one table is not counted as a cross-table duplicate:
select id, name, Designation, salary
from (select distinct c1.id, c1.name, c1.Designation, c1.salary
      from Customer1 c1
      union all
      select distinct c2.id, c2.name, c2.Designation, c2.salary
      from Customer2 c2
     ) c
group by id, name, Designation, salary
having count(*) > 1;
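
One caveat with counting over UNION ALL is that a row repeated inside a single table also reaches count(*) > 1. The sketch below compares the query with and without DISTINCT in each branch, again using Python's sqlite3 module as an assumed stand-in engine. The second User5 row in Customer1 is hypothetical: it is not in the question's sample data and is added only to trigger the case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer1 (id INTEGER, name TEXT, Designation TEXT, salary INTEGER);
    CREATE TABLE Customer2 (id INTEGER, name TEXT, Designation TEXT, salary INTEGER);
    INSERT INTO Customer1 VALUES
        (1, 'User1', 'Developer', 5000), (1, 'User2', 'Developer', 5000),
        (1, 'User5', 'Developer', 5000), (1, 'User1', 'Developer', 5000),
        (1, 'User5', 'Developer', 5000);  -- hypothetical extra row
    INSERT INTO Customer2 VALUES
        (1, 'User1', 'Developer', 5000), (1, 'User2', 'Developer', 5000),
        (1, 'User3', 'Developer', 5000), (1, 'User1', 'Developer', 5000);
""")

# {kw} is either empty or DISTINCT, applied to both UNION ALL branches.
TEMPLATE = """
    SELECT name
    FROM (SELECT {kw} id, name, Designation, salary FROM Customer1
          UNION ALL
          SELECT {kw} id, name, Designation, salary FROM Customer2) c
    GROUP BY id, name, Designation, salary
    HAVING COUNT(*) > 1
    ORDER BY name
"""

naive = [r[0] for r in conn.execute(TEMPLATE.format(kw=""))]
deduped = [r[0] for r in conn.execute(TEMPLATE.format(kw="DISTINCT"))]

print(naive)    # ['User1', 'User2', 'User5']  (User5 is only in Customer1)
print(deduped)  # ['User1', 'User2']
```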

Related

count different column values after grouping by

Consider this table:
id  name  department  email
1   Alex  IT          blah@gmail.com
1   Alex  IT          blah@gmail.com
2   Jay   HR          jay@gmail.com
2   Jay   Marketing   zou@gmail.com
If I group by id, name and count, I get:
id  name  count(*)
1   Alex  2
2   Jay   2
With this query:
select id, name, count(*) from tb group by id, name;
However, I would like to count only the records that differ in department, email, so as to get:
id  name  count(*)
1   Alex  0
2   Jay   1
This time the count for the first group (1, Alex) is 0 because department, email have the same values (duplicates); the count for (2, Jay) is 1 because department, email has one differing value.

If you meant "two different values" for "Jay", you can use distinct:
select id, name, count(*) from (select distinct * from tb) t group by id, name;
Use count(*) - 1 instead of count(*) to get the results shown in your question. The derived table is given an alias (t) because some database engines require one.
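
Here is a small sketch of the count(*) - 1 variant, run against the question's table in an in-memory SQLite database through Python's sqlite3 module. SQLite is an assumed stand-in, since the question does not name an engine; the alias t and the column name differing are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tb (id INTEGER, name TEXT, department TEXT, email TEXT);
    INSERT INTO tb VALUES
        (1, 'Alex', 'IT',        'blah@gmail.com'),
        (1, 'Alex', 'IT',        'blah@gmail.com'),
        (2, 'Jay',  'HR',        'jay@gmail.com'),
        (2, 'Jay',  'Marketing', 'zou@gmail.com');
""")

# Collapse exact duplicate rows first, then count what is left per group.
# Subtracting 1 turns "number of distinct rows" into "number of extra variants".
rows = conn.execute("""
    SELECT id, name, COUNT(*) - 1 AS differing
    FROM (SELECT DISTINCT * FROM tb) t
    GROUP BY id, name
    ORDER BY id
""").fetchall()

print(rows)  # [(1, 'Alex', 0), (2, 'Jay', 1)]
```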

How to select rows with related values?

I have a table "company" containing employees and their companies, with data like the following:
EmployeeID  EmployeeName  CompanyName  CompanyID  ParentCoID
------------------------------------------------------------
100         John          A            500        NULL
100         John          A.1          600        500
250         Paul          B            800        NULL
250         Paul          ABC          2000       NULL
350         Joe           D            1000       5000
600         Tom           E            700        NULL
600         Tom           E.2          1500       700
I am trying to display rows such that, if a ParentCoID value appears as a CompanyID, both rows for that employee are shown. In other words, select the employees that are in both the parent company and the sub company, and show both of those rows.
I've tried
SELECT *
FROM company
WHERE CompanyID IN (SELECT ParentCoID FROM company)
but this only returns the row where the CompanyID value matches, and I'm not sure how to include the second row as well.
My desired output for the sample above would be:
EmployeeID  EmployeeName  CompanyName  CompanyID  ParentCoID
------------------------------------------------------------
100         John          A            500        NULL
100         John          A.1          600        500
600         Tom           E            700        NULL
600         Tom           E.2          1500       700
As the result above shows, company A.1 is a sub company of A, and likewise E.2 is a sub company of E. I am trying to select employees that are in both the main company and the sub company, and therefore need to refer to both the ParentCoID and the CompanyID columns.

SELECT * FROM company
WHERE EmployeeID IN (SELECT EmployeeID FROM company WHERE CompanyID IN (SELECT ParentCoID FROM company))

Please try this (updated query):
SELECT * FROM company c
WHERE CompanyID IN (SELECT ParentCoID FROM company c2 WHERE c2.EmployeeID = c.EmployeeID)
   OR ParentCoID IN (SELECT CompanyID FROM company c3 WHERE c3.EmployeeID = c.EmployeeID)

Using EXISTS
SELECT a.*
FROM company a
WHERE EXISTS (SELECT b.parentcoid
FROM company b
WHERE a.companyid = b.parentcoid
AND a.employeeid = b.employeeid)
UNION ALL
SELECT c.*
FROM company c
WHERE EXISTS (SELECT d.companyid
FROM company d
WHERE d.companyid = c.parentcoid
AND d.employeeid = c.employeeid)
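
To check the EXISTS query against the question's sample data, here is a minimal sketch using Python's sqlite3 module with an in-memory database. SQLite is an assumed stand-in engine, and the result is sorted in Python only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (
        EmployeeID INTEGER, EmployeeName TEXT, CompanyName TEXT,
        CompanyID INTEGER, ParentCoID INTEGER
    );
    INSERT INTO company VALUES
        (100, 'John', 'A',   500,  NULL),
        (100, 'John', 'A.1', 600,  500),
        (250, 'Paul', 'B',   800,  NULL),
        (250, 'Paul', 'ABC', 2000, NULL),
        (350, 'Joe',  'D',   1000, 5000),
        (600, 'Tom',  'E',   700,  NULL),
        (600, 'Tom',  'E.2', 1500, 700);
""")

# First branch: parent rows that have a child row for the same employee.
# Second branch: child rows whose parent row exists for the same employee.
result = conn.execute("""
    SELECT a.* FROM company a
    WHERE EXISTS (SELECT 1 FROM company b
                  WHERE a.CompanyID = b.ParentCoID
                    AND a.EmployeeID = b.EmployeeID)
    UNION ALL
    SELECT c.* FROM company c
    WHERE EXISTS (SELECT 1 FROM company d
                  WHERE d.CompanyID = c.ParentCoID
                    AND d.EmployeeID = c.EmployeeID)
""").fetchall()

# Sort by (EmployeeID, CompanyID) for stable display.
result.sort(key=lambda r: (r[0], r[3]))
names = [(r[1], r[2]) for r in result]
print(names)  # [('John', 'A'), ('John', 'A.1'), ('Tom', 'E'), ('Tom', 'E.2')]
```

With a hierarchy deeper than two levels, a middle row is both a parent and a child, so UNION ALL would return it twice; UNION would remove that repeat.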

A recursive CTE is useful in querying hierarchical data.
with cte As (
select * from company c
Where ParentCoID is null and CompanyID = Any(Select ParentCoID From company Where EmployeeID=c.EmployeeID)
union all
select c.*
from cte p inner join company c On (p.CompanyID=c.ParentCoID And p.EmployeeID=c.EmployeeID)
)
select * From cte order by EmployeeID, CompanyID, ParentCoID

Here's a slightly different approach using a simple window function and a self-join; it will likely be performant given a clustered index on (EmployeeId, ParentCoId).
with e as (
select EmployeeID,ParentCoID CoId,
Sum(case when ParentCoID is null then 1 end ) over(partition by EmployeeID) IsSub
from Company
)
select c.*
from e
join Company c on c.EmployeeID=e.EmployeeID
where (e.CoId=c.CompanyID or e.CoId=c.ParentCoID)
and e.CoId is not null and e.IsSub=1

unable to perform conditional insert into hive

I have two tables, Customer1 and Customer2.
Customer1
id  name
1   jack
2   john
3   jones
Customer2
id  name
The Customer2 table is empty. Now I have to check whether a particular name, say 'jack', is present in Customer2, and insert it if it is not.

The query below should serve the purpose. I am assuming that id is the key that links the two tables; if not, you can use name in the join condition.
insert into customer2
select customer1.*
from customer1
left join customer2
on (customer1.id=customer2.id)
where customer1.name='jack' and isnull(customer2.id);
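
The same insert-if-missing pattern can be tried outside Hive. The sketch below is not Hive: it uses Python's sqlite3 module as an assumed stand-in, and writes the null test as IS NULL because SQLite has no isnull() function. The name is passed as a query parameter rather than hard-coded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer1 (id INTEGER, name TEXT);
    CREATE TABLE customer2 (id INTEGER, name TEXT);
    INSERT INTO customer1 VALUES (1, 'jack'), (2, 'john'), (3, 'jones');
""")

# The LEFT JOIN yields NULL for customer2.id when no matching row exists,
# so the WHERE clause keeps only rows that are still missing from customer2.
INSERT_IF_MISSING = """
    INSERT INTO customer2
    SELECT customer1.*
    FROM customer1
    LEFT JOIN customer2 ON customer1.id = customer2.id
    WHERE customer1.name = ? AND customer2.id IS NULL
"""

conn.execute(INSERT_IF_MISSING, ("jack",))
conn.execute(INSERT_IF_MISSING, ("jack",))  # second run inserts nothing

rows = conn.execute("SELECT id, name FROM customer2").fetchall()
print(rows)  # [(1, 'jack')]
```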

Selecting Records Matching Two or More Related Tables

I have a 'persons' table:
person_id name
100 jack
125 jill
201 jane
And many sub-tables, that the person_id could be in:
'rowing'
id person_id
1 100
2 201
'swimming'
id person_id
1 125
2 201
'running'
id person_id
1 201
'throwing'
id person_id
1 125
2 201
I would like to be able to select all people who are involved in two activities, regardless of which two.

As the great @TimSchmelter (great first name) mentioned, you should really have a single PersonActivities table with an id corresponding to the particular activity.
That being said, if you must work with your current schema, one option would be to UNION together the activity tables and then count which persons have two or more records, meaning that they participated in two or more activities.
SELECT t1.person_id, t1.name
FROM persons t1
INNER JOIN
(
    SELECT t.person_id, COUNT(t.person_id) AS activityCount
    FROM
    (
        SELECT person_id FROM rowing
        UNION ALL
        SELECT person_id FROM swimming
        UNION ALL
        SELECT person_id FROM running
        UNION ALL
        SELECT person_id FROM throwing
    ) AS t
    GROUP BY t.person_id
    HAVING COUNT(t.person_id) > 1
) t2
    ON t1.person_id = t2.person_id
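
Here is a minimal sketch that runs this query on the question's sample data, using Python's sqlite3 module with an in-memory database as an assumed stand-in engine. The persons table carries the alias t1 that the select list refers to, and the ORDER BY is added only for predictable output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE persons  (person_id INTEGER, name TEXT);
    CREATE TABLE rowing   (id INTEGER, person_id INTEGER);
    CREATE TABLE swimming (id INTEGER, person_id INTEGER);
    CREATE TABLE running  (id INTEGER, person_id INTEGER);
    CREATE TABLE throwing (id INTEGER, person_id INTEGER);
    INSERT INTO persons  VALUES (100, 'jack'), (125, 'jill'), (201, 'jane');
    INSERT INTO rowing   VALUES (1, 100), (2, 201);
    INSERT INTO swimming VALUES (1, 125), (2, 201);
    INSERT INTO running  VALUES (1, 201);
    INSERT INTO throwing VALUES (1, 125), (2, 201);
""")

# Stack all activity rows, count them per person, keep those with 2 or more.
rows = conn.execute("""
    SELECT t1.person_id, t1.name
    FROM persons t1
    INNER JOIN (
        SELECT t.person_id, COUNT(t.person_id) AS activityCount
        FROM (
            SELECT person_id FROM rowing
            UNION ALL SELECT person_id FROM swimming
            UNION ALL SELECT person_id FROM running
            UNION ALL SELECT person_id FROM throwing
        ) AS t
        GROUP BY t.person_id
        HAVING COUNT(t.person_id) > 1
    ) t2 ON t1.person_id = t2.person_id
    ORDER BY t1.person_id
""").fetchall()

print(rows)  # [(125, 'jill'), (201, 'jane')]
```

This counts rows, so it assumes a person appears at most once per activity table; to require exactly two activities, change the HAVING condition to = 2.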

show result from one table

Good day. I have these three tables:
customer table
cust_id  cust_name  sales_employee
1        abc        1
2        cde        1
3        efg        2
transaction table
order_num  cust_id  sales_employee
1001       1        1
1002       2        2
sales_employee table
sales_employee  employee name
1               john doe
2               jane doe
How can I show the employee name for both the customer table and the transaction table?
Note that the sales_employee can change per transaction; it does not necessarily have to be the same as the customer's.
Please help.

To select customers with the salesperson's name:
select
C.*, E.employee_name
from
Customers as C
inner join Sales_Employees as E on E.sales_employee = C.sales_employee
To select transactions with the customer name and the salesperson's name (at the point in time of the transaction):
select
T.*,
E.employee_name as Trans_employee,
C.cust_name,
EC.employee_name as Cust_employee
from
Transactions as T
inner join Sales_Employees as E on E.sales_employee = T.sales_employee
inner join Customers as C on C.cust_id= T.cust_id
inner join Sales_Employees as EC on EC.sales_employee = C.sales_employee
This code is meant to guide you; you will need to adjust it to match your table and field names.
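
As a sketch of how the second query behaves on the question's sample data, the example below uses Python's sqlite3 module with an in-memory database as an assumed stand-in engine. The table names (Customers, Transactions, Sales_Employees) and the employee_name column follow the answer's query, not necessarily the asker's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customers (cust_id INTEGER, cust_name TEXT, sales_employee INTEGER);
    CREATE TABLE Transactions (order_num INTEGER, cust_id INTEGER, sales_employee INTEGER);
    CREATE TABLE Sales_Employees (sales_employee INTEGER, employee_name TEXT);
    INSERT INTO Customers VALUES (1, 'abc', 1), (2, 'cde', 1), (3, 'efg', 2);
    INSERT INTO Transactions VALUES (1001, 1, 1), (1002, 2, 2);
    INSERT INTO Sales_Employees VALUES (1, 'john doe'), (2, 'jane doe');
""")

# Sales_Employees is joined twice under different aliases:
# E resolves the transaction's employee, EC the customer's employee.
rows = conn.execute("""
    SELECT T.order_num,
           E.employee_name  AS Trans_employee,
           C.cust_name,
           EC.employee_name AS Cust_employee
    FROM Transactions AS T
    INNER JOIN Sales_Employees AS E  ON E.sales_employee  = T.sales_employee
    INNER JOIN Customers       AS C  ON C.cust_id         = T.cust_id
    INNER JOIN Sales_Employees AS EC ON EC.sales_employee = C.sales_employee
    ORDER BY T.order_num
""").fetchall()

for row in rows:
    print(row)
# (1001, 'john doe', 'abc', 'john doe')
# (1002, 'jane doe', 'cde', 'john doe')
```

Order 1002 shows the case from the question: the transaction was handled by jane doe even though the customer's assigned employee is john doe.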
