Join table on Count - sql

I have two tables in Access: one containing IDs (not unique) and a Name, and one containing IDs (not unique) and a Location. I would like to return a third table that contains only the IDs that appear more than once in either table.
Table 1
ID Name
1 Max
1 Bob
2 Jack
Table 2
ID Location
1 A
2 B
In this setup it should return only ID 1, because 1 appears twice in Table 1:
ID
1
I have tried to do a JOIN on the tables and then apply a COUNT but nothing came out.
Thanks in advance!

Here is one method that I think will work in MS Access:
(select id
from table1
group by id
having count(*) > 1
) union -- note: NOT union all
(select id
from table2
group by id
having count(*) > 1
);
MS Access does not allow union/union all in the from clause. Nor does it support full outer join. Note that the union will remove duplicates.
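
To try the UNION of the two grouped queries on the question's sample data, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Access (an assumption; Access SQL differs in places, but GROUP BY, HAVING and UNION behave the same way here). The lowercase table names and the `repeated_ids` variable are illustrative only.

```python
import sqlite3

# In-memory SQLite database standing in for the Access tables (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, location TEXT);
    INSERT INTO table1 VALUES (1, 'Max'), (1, 'Bob'), (2, 'Jack');
    INSERT INTO table2 VALUES (1, 'A'), (2, 'B');
""")

# IDs that occur more than once in either table.
# UNION (not UNION ALL) removes an ID that is repeated in both tables.
query = """
    SELECT id FROM table1 GROUP BY id HAVING COUNT(*) > 1
    UNION
    SELECT id FROM table2 GROUP BY id HAVING COUNT(*) > 1
"""
repeated_ids = sorted(row[0] for row in conn.execute(query))
print(repeated_ids)  # [1]
```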

A simple GROUP BY with a HAVING clause should help you:
select ID
From Table1
Group by ID
having count(1)>1
union
select ID
From Table2
Group by ID
having count(1)>1

Based on your description, you do not need to join the tables to find duplicate records. If your tables look like the ones above, simply use the following for each table:
With A
as
(
select ID, count(*) as Times from Table1 group by ID
)
select * From A where A.Times>1

Not sure I understand what query you already tried, but this should work:
select table1.ID
from table1 inner join table2 on table1.id = table2.id
group by table1.ID
having count(*) > 1
Or, if some IDs appear in one table but not the other:
select table1.ID
from table1 full outer join table2 on table1.id = table2.id
group by table1.ID
having count(*) > 1

Related

SQL comparing two sets of complex data

Sorry, the title is probably not the best one, but I hope the problem is clear.
I need to compare and analyse two sets of data, and I'm using MS Access for that. My data is organized in two tables. The following is not the real data I'm working with, but it will serve as an example:
TABLE 1
ID Name
1 Zoie
2 Rohan
2 Simon
3 Jerome
4 Jakob
4 Mathew
4 Cora
6 Keely
7 Aiyana
7 Jake
8 Reid
9 Emerson
TABLE 2
ID Name
1 Michael
2 Rohan
2 Simon
3 Jill
4 Jakob
4 Cora
5 Charlie
7 John
8 Reid
9 Yadiel
9 Emerson
9 Paris
So, I need to select only those IDs that fully correspond in both tables (all names under a given ID are the same). Here those are 2 and 8.
I would also like a separate SELECT statement that returns IDs 2 and 8 plus the IDs for which every name in table 1 also appears in table 2 (everything from table 1, possibly with some extra names in table 2 under the same ID). That would be: 2, 8, 9
I would also like a separate SELECT statement that returns IDs 2 and 8 plus the IDs for which every name in table 2 also appears in table 1 (everything from table 2, possibly with some extra names in table 1 under the same ID). That would be: 2, 4, 8
Finally, I would like a separate SELECT statement that combines the last two.
So the result would be: 2, 4, 8, 9
I would appreciate any suggestions.
Thanks in advance!
Best regards,
Mario
Q#1:
select id
from table1
group by id
having count(*) =
(
select count(*)
from table2
group by table2.id
having table2.id = table1.id
)
and count(*) =
(
select count(*)
from table1 table1_1
inner join table2 on table1_1.id = table2.id and table1_1.name = table2.name
group by table1_1.id
having table1_1.id = table1.id
)
Explanation of this query:
It is grouping table1 by ID
For each group (for each ID), it is counting the number of rows in table1 that have this ID.
For each group, it is counting the number of rows in table2 that have this ID.
For each group, it is counting the number of rows where the name appears in both tables for this ID (it does that by inner joining table1 and table2 on the ID and Name which means only rows where both ID and Name match in both tables will be counted, for each ID).
It then returns the IDs (from table1) for which all three of the above counts are equal. This is what results in returning IDs where all names are in both tables (no more, no less).
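The three counts described above can be inspected side by side. The following is a sketch only: it uses Python's sqlite3 module rather than Access, computes the counts with scalar subqueries in the select list (a different formulation from the HAVING-based query above), and assumes that a name is not repeated under the same ID within one table. The column aliases `in_table1`, `in_table2` and `in_both` are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES
        (1,'Zoie'),(2,'Rohan'),(2,'Simon'),(3,'Jerome'),(4,'Jakob'),(4,'Mathew'),
        (4,'Cora'),(6,'Keely'),(7,'Aiyana'),(7,'Jake'),(8,'Reid'),(9,'Emerson');
    INSERT INTO table2 VALUES
        (1,'Michael'),(2,'Rohan'),(2,'Simon'),(3,'Jill'),(4,'Jakob'),(4,'Cora'),
        (5,'Charlie'),(7,'John'),(8,'Reid'),(9,'Yadiel'),(9,'Emerson'),(9,'Paris');
""")

# For every ID in table1: rows in table1, rows in table2, and rows whose
# (id, name) pair exists in both tables.
query = """
    SELECT ids.id,
           (SELECT COUNT(*) FROM table1 a WHERE a.id = ids.id) AS in_table1,
           (SELECT COUNT(*) FROM table2 b WHERE b.id = ids.id) AS in_table2,
           (SELECT COUNT(*)
              FROM table1 a
              JOIN table2 b ON a.id = b.id AND a.name = b.name
             WHERE a.id = ids.id) AS in_both
    FROM (SELECT DISTINCT id FROM table1) ids
    ORDER BY ids.id
"""
counts = conn.execute(query).fetchall()

# An ID matches fully only when all three counts agree.
full_matches = [id_ for id_, n1, n2, both in counts if n1 == n2 == both]
print(full_matches)  # [2, 8]
```

For ID 4, for example, the counts are 3, 2 and 2, so it is rejected because table1 has a name (Mathew) that table2 lacks.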
Q#2 - In this case you don't care that table2 has the same number of names per ID. So remove the first sub-query (that counts matching rows in table2).
select id
from table1
group by id
having count(*) =
(
select count(*)
from table1 table1_1
inner join table2 on table1_1.id = table2.id and table1_1.name = table2.name
group by table1_1.id
having table1_1.id = table1.id
)
Although the above is easy to understand because it follows the same logic as Q#1, the following is more straightforward and probably more efficient. The difference only matters if the first version runs too slowly on your data (which is subjective and context dependent).
select table1.id
from table1
left join table2 on table1.id = table2.id and table1.name = table2.name
group by table1.id
having count(table1.id) = count(table2.id)
Here, the two tables are LEFT (outer) joined which means all records from table1 are gathered and records in table2 that match by ID and Name are also included alongside. Then, we group them by ID and we compare the count of each group in table1 with those that had matching names in table2.
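As a quick check of the LEFT JOIN version against the sample data, here is a small sketch using Python's sqlite3 module (a stand-in for Access, so treat it as an approximation of how the query behaves there). `COUNT(table2.id)` skips the NULLs produced for unmatched rows, which is what makes the comparison work.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES
        (1,'Zoie'),(2,'Rohan'),(2,'Simon'),(3,'Jerome'),(4,'Jakob'),(4,'Mathew'),
        (4,'Cora'),(6,'Keely'),(7,'Aiyana'),(7,'Jake'),(8,'Reid'),(9,'Emerson');
    INSERT INTO table2 VALUES
        (1,'Michael'),(2,'Rohan'),(2,'Simon'),(3,'Jill'),(4,'Jakob'),(4,'Cora'),
        (5,'Charlie'),(7,'John'),(8,'Reid'),(9,'Yadiel'),(9,'Emerson'),(9,'Paris');
""")

# Keep an ID when every table1 row for it found a partner in table2.
# COUNT(table2.id) ignores NULLs, i.e. table1 rows without a match.
query = """
    SELECT table1.id
    FROM table1
    LEFT JOIN table2
           ON table1.id = table2.id AND table1.name = table2.name
    GROUP BY table1.id
    HAVING COUNT(table1.id) = COUNT(table2.id)
    ORDER BY table1.id
"""
q2_ids = [row[0] for row in conn.execute(query)]
print(q2_ids)  # [2, 8, 9]
```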
Q#3 - This case is the same as Q#2 except table1 and table2 are swapped.
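To make the swap concrete, here is a sketch in Python with sqlite3 (an assumption, not Access). The helper `contained_ids` is hypothetical: it builds the LEFT JOIN query for either direction by substituting the two table names, which is acceptable here only because the names are hard-coded rather than user input.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES
        (1,'Zoie'),(2,'Rohan'),(2,'Simon'),(3,'Jerome'),(4,'Jakob'),(4,'Mathew'),
        (4,'Cora'),(6,'Keely'),(7,'Aiyana'),(7,'Jake'),(8,'Reid'),(9,'Emerson');
    INSERT INTO table2 VALUES
        (1,'Michael'),(2,'Rohan'),(2,'Simon'),(3,'Jill'),(4,'Jakob'),(4,'Cora'),
        (5,'Charlie'),(7,'John'),(8,'Reid'),(9,'Yadiel'),(9,'Emerson'),(9,'Paris');
""")


def contained_ids(inner, outer):
    """IDs for which every name in `inner` also exists in `outer`.

    Hypothetical helper; `inner` and `outer` are trusted table names.
    """
    query = f"""
        SELECT {inner}.id
        FROM {inner}
        LEFT JOIN {outer}
               ON {inner}.id = {outer}.id AND {inner}.name = {outer}.name
        GROUP BY {inner}.id
        HAVING COUNT({inner}.id) = COUNT({outer}.id)
        ORDER BY {inner}.id
    """
    return [row[0] for row in conn.execute(query)]


q2_ids = contained_ids("table1", "table2")  # table1 names all in table2
q3_ids = contained_ids("table2", "table1")  # swapped: table2 names all in table1
print(q2_ids)  # [2, 8, 9]
print(q3_ids)  # [2, 4, 8]

# The "combination of the last two" from the question is the set union.
combined = sorted(set(q2_ids) | set(q3_ids))
print(combined)  # [2, 4, 8, 9]
```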
Q#4 - In this case you only care about IDs that have at least one name that appears in both tables. So join the tables and return the distinct IDs:
select distinct table1.id
from table1
inner join table2 on table1.id = table2.id and table1.name = table2.name
Here is a SQLFiddle to play with containing the four queries: http://www.sqlfiddle.com/#!18/3fc71/22

SQL Subquery Join and Sum

I have Table 1 & 2 with common Column name ID in both.
Table 1 has duplicate entries of rows which I was able to trim using:
SELECT DISTINCT
Table 2 has duplicate numeric entries(dollarspent) for ID's which I needed and was able to sum up:
Table 1          Table 2
------------     ------------------
ID   spec        ID   Dol1   Dol2
54   A           54   1      0
54   A           54   2      1
55   B           55   0      2
56   C           55   3      0
I need to combine these two queries into one, so that I get a JOIN of Table 1 and Table 2 ON column ID with (a) no duplicates from Table 1 and (b) summed $ values from Table 2.
For example:
NewTable
----------------------------------------
ID Spec Dol1 Dol2
54 A 3 1
55 B 3 2
Notes : No. of rows in Table 1 and 2 are not the same.
Thanks
Use a derived table to get the distinct values from table1 and simply join to table 2 and use aggregation.
The issue is that table1 and table2 are in a many-to-many (M:M) relationship on ID. It needs to be one-to-many (1:M) for the sums to be accurate. Thus we derive T1 from table1 with SELECT DISTINCT, which gives the unique records on the "one" side (assuming the spec is the same for every row of a given ID):
SELECT T1.ID, T1.Spec, Sum(T2.Dol1) as Dol1, sum(T2.Dol2) as Dol2
FROM (SELECT distinct ID, spec
FROM table1) T1
INNER JOIN table2 T2
on t2.ID = T1.ID
GROUP BY T1.ID, T1.Spec
This does assume you only want records that exist in both tables. Otherwise we may need a LEFT, RIGHT, or FULL outer join, depending on the desired results.
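The effect of the many-to-many relationship is easy to see on the sample rows. This sketch uses Python's sqlite3 module (an assumption; the question does not name a database) to run a naive join next to the derived-table version. The variable names `naive` and `deduped` are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, spec TEXT);
    CREATE TABLE table2 (id INTEGER, dol1 INTEGER, dol2 INTEGER);
    INSERT INTO table1 VALUES (54,'A'),(54,'A'),(55,'B'),(56,'C');
    INSERT INTO table2 VALUES (54,1,0),(54,2,1),(55,0,2),(55,3,0);
""")

# Naive join: the duplicated (54, 'A') row pairs with every table2 row
# twice, so the sums for ID 54 are doubled.
naive = conn.execute("""
    SELECT t1.id, t1.spec, SUM(t2.dol1), SUM(t2.dol2)
    FROM table1 t1
    INNER JOIN table2 t2 ON t2.id = t1.id
    GROUP BY t1.id, t1.spec
    ORDER BY t1.id
""").fetchall()

# Derived table with SELECT DISTINCT: one row per (id, spec) before joining.
deduped = conn.execute("""
    SELECT t1.id, t1.spec, SUM(t2.dol1), SUM(t2.dol2)
    FROM (SELECT DISTINCT id, spec FROM table1) t1
    INNER JOIN table2 t2 ON t2.id = t1.id
    GROUP BY t1.id, t1.spec
    ORDER BY t1.id
""").fetchall()

print(naive)    # [(54, 'A', 6, 2), (55, 'B', 3, 2)]
print(deduped)  # [(54, 'A', 3, 1), (55, 'B', 3, 2)]
```

ID 56 is absent from both results because the inner join drops IDs that have no rows in table2.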
I can't really see your data, but you might want to try:
SELECT DISTINCT ID, NULL AS Dol
FROM TblOne
UNION ALL
SELECT ID, SUM(Dol)
FROM TblTwo
GROUP BY ID
Pre-aggregate table 2 and then join:
select t1.id, t1.spec, t2.dol1, t2.dol2
from (select t2.id, sum(dol1) as dol1, sum(dol2) as dol2
from table2 t2
group by t2.id
) t2 join
(select distinct t1.id, t1.spec
from table1 t1
) t1
on t1.id = t2.id;
For your data examples, you don't need to pre-aggregate table 2. This gives the correct sums -- albeit in multiple rows -- if table1 has multiple specs for a given id.

Remove Duplicate Data Based on Three Fields and Two tables

I have two tables that have more than three fields each. A group of records appears in both tables; below is a mock example:
Table 1:
ID Name Town State
1 Dave Chicago IL
2 Mark Tea MD
Table 2:
ID Name State Job Married
1 Dave IL Manager Yes
2 Mark MD Driver No
For my purpose duplicates exist if ID, Name, and State are the same. So the above data are duplicates. How do I delete them from one table (I have over 900 duplicates so deleting one by one is not possible)?
I don't understand which table has the duplicates, but if you want to delete the duplicate rows from table1 you can use this query (note that it matches on ID only):
delete from table1
where ID in (select ID from table2)
This query will produce a de-duplicated result set:
SELECT Table1.ID,
Table1.NAME,
Table1.Town,
Table1.STATE,
NULL AS Job,
NULL AS Married
FROM Table1
WHERE Table1.ID NOT IN (
SELECT Table1.ID
FROM Table1
INNER JOIN Table2 ON (Table2.STATE = Table1.STATE)
AND (Table2.NAME = Table1.NAME)
AND (Table1.ID = Table2.ID)
)
UNION
SELECT Table2.ID,
Table2.NAME,
NULL AS Town,
Table2.STATE,
Table2.Job,
Table2.Married
FROM Table2
This is the most straightforward way to do it, assuming that you want to delete from Table1. I'm a little rusty with Access SQL syntax, but I believe this works:
DELETE FROM [Table1]
WHERE EXISTS (
SELECT 1
FROM [Table2]
WHERE [Table2].[ID] = [Table1].[ID]
AND [Table2].[Name] = [Table1].[Name]
AND [Table2].[State] = [Table1].[State]
)
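
A rough way to check the EXISTS-based delete before running it on real data is to replay it on a copy. The sketch below uses Python's sqlite3 module instead of Access (an assumption, so the bracket quoting is dropped). The third row in each table is hypothetical: it shares ID and Name but not State, so under the question's definition it is not a duplicate and should survive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (ID INTEGER, Name TEXT, Town TEXT, State TEXT);
    CREATE TABLE Table2 (ID INTEGER, Name TEXT, State TEXT, Job TEXT, Married TEXT);
    INSERT INTO Table1 VALUES
        (1,'Dave','Chicago','IL'),
        (2,'Mark','Tea','MD'),
        (3,'Anna','Austin','TX');   -- hypothetical row, not in the question
    INSERT INTO Table2 VALUES
        (1,'Dave','IL','Manager','Yes'),
        (2,'Mark','MD','Driver','No'),
        (3,'Anna','CA','Pilot','No');  -- hypothetical: same ID and Name, other State
""")

# Delete Table1 rows that have a Table2 row with the same ID, Name and State.
cursor = conn.execute("""
    DELETE FROM Table1
    WHERE EXISTS (
        SELECT 1
        FROM Table2
        WHERE Table2.ID = Table1.ID
          AND Table2.Name = Table1.Name
          AND Table2.State = Table1.State
    )
""")
deleted = cursor.rowcount
remaining = conn.execute("SELECT * FROM Table1 ORDER BY ID").fetchall()

print(deleted)    # 2
print(remaining)  # [(3, 'Anna', 'Austin', 'TX')]
```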

DISTINCT based on ID between Two table and SUM of the other exiting column in SQL

SELECT
SUM(amount), (DISTINCT table1.id)
FROM
table1
INNER JOIN
table2 ON table1.id = table2.id IS NULL;
Table1            Table2
id   amount       id   product
1    40           5    10
2    364.25       2    20
3    704.5        8    30
4    404.5        3    40
5    580.5        2    20
The id is not unique and is not a primary key. First I need to ignore all duplicate IDs in table2, then match the IDs from table2 against table1, and finally total the amount column for the IDs that match. I cannot add the values up one by one because there will be more than 20,000 rows. Please help me if you can.
First, match each table2 id against the table1 ids; for every id that matches, the amounts need to be summed. Here the matching ids are 2 and 3 according to table2, so we add the amounts for ids 2 and 3 and the result will be 364.25 + 704.5 = 1068.75. How can I do this using MySQL?
I am trying to apply DISTINCT to the IDs shared between the two tables and SUM the other existing column in table1. Can somebody help me with how to do it?
Based on your latest comment try this
SELECT
SUM(amount)
FROM
table1
INNER JOIN
(Select Distinct ID From table2) T2 ON table1.id = T2.id
Let's leave performance out of the discussion :)
Since the requirement is to match table2 ids against table1 and total the amounts of the ids that match, it is possible to use a correlated subquery:
select sum(t1.amount)
from table1 t1
where exists (select 1 from table2 t2 where t2.id = t1.id)
From what I understood, you need the IDs common to both tables and the corresponding sum of amount:
SELECT
SUM(amount)
FROM
table1
WHERE
id IN (SELECT id
FROM table2);
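
The three answers above (join against distinct IDs, EXISTS, and IN) can be compared with a plain join to see why the duplicates in table2 matter. This is a sketch using Python's sqlite3 module rather than MySQL, and the rows are hypothetical integers chosen for easy arithmetic, not the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, amount INTEGER);
    CREATE TABLE table2 (id INTEGER, product INTEGER);
    -- Hypothetical rows: id 2 appears twice in table2, id 8 has no match.
    INSERT INTO table1 VALUES (1, 10), (2, 20), (3, 30);
    INSERT INTO table2 VALUES (2, 100), (2, 100), (3, 200), (8, 300);
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


# Plain join: id 2 matches two table2 rows, so its amount is counted twice.
naive_total = scalar("""
    SELECT SUM(amount) FROM table1
    INNER JOIN table2 ON table1.id = table2.id
""")

# Join against the distinct table2 ids.
distinct_total = scalar("""
    SELECT SUM(amount) FROM table1
    INNER JOIN (SELECT DISTINCT id FROM table2) t2 ON table1.id = t2.id
""")

# Correlated EXISTS and IN never multiply table1 rows.
exists_total = scalar("""
    SELECT SUM(t1.amount) FROM table1 t1
    WHERE EXISTS (SELECT 1 FROM table2 t2 WHERE t2.id = t1.id)
""")
in_total = scalar("""
    SELECT SUM(amount) FROM table1
    WHERE id IN (SELECT id FROM table2)
""")

print(naive_total)                             # 70
print(distinct_total, exists_total, in_total)  # 50 50 50
```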

mysql - union tables by unique field

I have two tables with the same structure:
id name
1 Merry
2 Mike
and
id name
1 Mike
2 Alis
I need to union the second table into the first while keeping the names unique, so that the result is:
id name
1 Merry
2 Mike
3 Alis
Is it possible to do this with a MySQL query, without using a PHP script?
This is not a join (set multiplication), this is a union (set addition).
SELECT @r := @r + 1 AS id, name
FROM (
SELECT @r := 0
) vars,
(
SELECT name
FROM table1
UNION
SELECT name
FROM table2
) q
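For readers without MySQL at hand, the same idea (take the distinct names, then number them) can be sketched with Python's sqlite3 module. SQLite has no user variables, so this version does the numbering in Python with `enumerate`; that substitution, the `src` ordering column and the assumption that names are unique within table1 are all mine, not part of the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'Merry'), (2, 'Mike');
    INSERT INTO table2 VALUES (1, 'Mike'), (2, 'Alis');
""")

# All table1 names first, then the table2 names that table1 does not have.
# The src column exists only to make the ordering deterministic.
query = """
    SELECT name
    FROM (
        SELECT 1 AS src, id, name FROM table1
        UNION ALL
        SELECT 2 AS src, id, name FROM table2
        WHERE name NOT IN (SELECT name FROM table1)
    ) q
    ORDER BY src, id
"""
names = [row[0] for row in conn.execute(query)]

# Assign fresh consecutive ids, starting at 1.
numbered = list(enumerate(names, start=1))
print(numbered)  # [(1, 'Merry'), (2, 'Mike'), (3, 'Alis')]
```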
This will select all names from table1 and combine those with all the names from table2 which are not in table1.
(
select *
from table1
)
union
(
select t2.*
from table2 t2
left join table1 t1 on t2.name = t1.name
where t1.id is null
)
Use:
SELECT a.id,
a.name
FROM TABLE_A a
UNION
SELECT b.id,
b.name
FROM TABLE_B b
UNION will remove duplicates.
As commented, it all depends on what your 'id' means, because in the example it means nothing.
SELECT DISTINCT(name) FROM t1 JOIN t2 ON something
if you only want the names
SELECT SUM(something), name FROM t1 JOIN t2 ON something GROUP BY name
if you want to do some group by
SELECT DISTINCT(name) FROM t1 JOIN t2 ON t1.id = t2.id
if the id's are the same
SELECT DISTINCT COALESCE(t1.name,t2.name) FROM
mytable t1 LEFT JOIN mytable t2 ON (t1.name=t2.name);
will get you a list of unique names from the 2 tables. If you want them to get new ids (like Alis does in your desired results), that's something else and requires the answers to a couple of questions:
Do any of the names need to keep their previous id? If so, which table's id should be preferred?
Why do you have 2 tables with the same structure? That is, what are you trying to accomplish when you generate the unique name list?