mysql - union tables by unique field - sql

I have two tables with the same structure:
id name
1 Merry
2 Mike
and
id name
1 Mike
2 Alis
I need to merge the second table into the first while keeping names unique, so that the result is:
id name
1 Merry
2 Mike
3 Alis
Is it possible to do this with a MySQL query, without using a PHP script?

This is not a join (set multiplication), this is a union (set addition).
SELECT @r := @r + 1 AS id, name
FROM (
SELECT @r := 0
) vars,
(
SELECT name
FROM table1
UNION
SELECT name
FROM table2
) q
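To try the UNION-then-renumber idea without a MySQL server, here is a minimal sketch that assumes Python's built-in sqlite3 module as a stand-in. The sample rows come from the question; numbering is done on the client with enumerate instead of a user variable, and the ORDER BY name is my addition to make the output deterministic, so the ids follow alphabetical order rather than the order the asker wanted.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'Merry'), (2, 'Mike');
    INSERT INTO table2 VALUES (1, 'Mike'), (2, 'Alis');
""")

# UNION (without ALL) drops the duplicate 'Mike'.
names = [row[0] for row in conn.execute(
    "SELECT name FROM table1 UNION SELECT name FROM table2 ORDER BY name"
)]

# Assign fresh ids on the client side, starting at 1.
numbered = list(enumerate(names, start=1))
print(numbered)
```

With the sample data this prints [(1, 'Alis'), (2, 'Merry'), (3, 'Mike')].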

This will select all names from table1 and combine those with all the names from table2 which are not in table1.
(
select *
from table1
)
union
(
select t2.*
from table2 t2
left join table1 t1 on t2.name = t1.name
where t1.id is null
)
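The same anti-join can also feed an INSERT, which produces exactly the numbering the question asks for. This is a sketch of a different technique from the query above, and it rests on assumptions: SQLite via Python's sqlite3 stands in for MySQL, and id is declared INTEGER PRIMARY KEY so the database assigns the next id (in MySQL that would be an AUTO_INCREMENT column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- INTEGER PRIMARY KEY lets SQLite assign the next id automatically.
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO table1 VALUES (1, 'Merry'), (2, 'Mike');
    INSERT INTO table2 VALUES (1, 'Mike'), (2, 'Alis');
""")

# Copy only the names from table2 that have no match in table1.
conn.execute("""
    INSERT INTO table1 (name)
    SELECT t2.name
    FROM table2 t2
    LEFT JOIN table1 t1 ON t2.name = t1.name
    WHERE t1.id IS NULL
""")

merged = conn.execute("SELECT id, name FROM table1 ORDER BY id").fetchall()
print(merged)
```

With the sample data, table1 ends up as [(1, 'Merry'), (2, 'Mike'), (3, 'Alis')]. Note that this modifies table1 rather than just selecting from it.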

Use:
SELECT a.id,
a.name
FROM TABLE_A a
UNION
SELECT b.id,
b.name
FROM TABLE_B b
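UNION will remove duplicate rows, but only rows where both id and name match. With the sample data, (2, Mike) and (1, Mike) are different rows, so both are kept.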
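A quick check of what UNION does when id is part of the select list. This is a sketch using Python's sqlite3 as a stand-in for MySQL (an assumption), with the question's sample rows under the table names table1 and table2 from the question rather than TABLE_A and TABLE_B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, name TEXT);
    INSERT INTO table1 VALUES (1, 'Merry'), (2, 'Mike');
    INSERT INTO table2 VALUES (1, 'Mike'), (2, 'Alis');
""")

# UNION compares whole rows, so (2, 'Mike') and (1, 'Mike') are distinct.
rows = conn.execute("""
    SELECT id, name FROM table1
    UNION
    SELECT id, name FROM table2
    ORDER BY name, id
""").fetchall()
print(rows)
```

All four rows come back, including both Mike rows, because no two rows agree on both columns.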

As commented, it all depends on what your 'id' means, because in the example it means nothing.
SELECT DISTINCT(name) FROM t1 JOIN t2 ON something
if you only want the names
SELECT SUM(something), name FROM t1 JOIN t2 ON something GROUP BY name
if you want to do some group by
SELECT DISTINCT(name) FROM t1 JOIN t2 ON t1.id = t2.id
if the id's are the same

SELECT DISTINCT COALESCE(t1.name,t2.name) FROM
mytable t1 LEFT JOIN mytable t2 ON (t1.name=t2.name);
will get you a list of unique names from the 2 tables. If you want them to get new ids (like Alis does in your desired results), that's something else and requires the answers to a couple of questions:
Do any of the names need to keep their previous id? If they do, which table's id should be preferred?
Why do you have 2 tables with the same structure? That is, what are you trying to accomplish when you generate the unique name list?

Related

How join 2 table

I'm having trouble joining two tables in a query. Given these tables:
Table 1:
CustomerID : 1,2,3,4,5
Customercode : cus1,cus9,cus4,null,null
Customername: roya,almudena,jack,jane, Francisco
Table 2 :
CustomerID : 1,2,3,4
Customercode : cus1,cus2,cus9,null
Customername: roya,jose,almudena,jane
Q: What query shows all names from both tables, with no duplicate names?
Thank you for your answer.
You don't need a JOIN for this, you need a UNION statement
select distinct name from table1
union
select distinct name from table2
If you use UNION ALL it will keep duplicates, but UNION on its own will not.
You could wrap it in select distinct name from (...) as well if you want to be extra safe.
If you have no duplicate names within each table (as in your sample data), I strongly suggest:
select t1.customername
from table1 t1
union all
select t2.customername
from table2 t2
where not exists (select 1 from table1 t1 where t1.customername = t2.customername);
This should have better performance.
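To check that UNION ALL with NOT EXISTS returns the same names as a plain UNION on the question's data, here is a small sketch. It assumes Python's sqlite3 as the engine, so it says nothing about the performance claim, only about the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (customername TEXT);
    CREATE TABLE table2 (customername TEXT);
    INSERT INTO table1 VALUES ('roya'), ('almudena'), ('jack'), ('jane'), ('Francisco');
    INSERT INTO table2 VALUES ('roya'), ('jose'), ('almudena'), ('jane');
""")

# UNION ALL keeps everything from table1, then adds only unseen names from table2.
union_all_names = sorted(row[0] for row in conn.execute("""
    SELECT t1.customername
    FROM table1 t1
    UNION ALL
    SELECT t2.customername
    FROM table2 t2
    WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.customername = t2.customername)
"""))

# Plain UNION for comparison.
union_names = sorted(row[0] for row in conn.execute(
    "SELECT customername FROM table1 UNION SELECT customername FROM table2"
))
print(union_all_names)
```

Both queries yield the same six names here. They would differ if a name were repeated inside one of the tables, which is why the answer conditions its advice on there being no duplicates within each table.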
Use a UNION statement:
select distinct Customername from table1
union
select distinct Customername from table2

Join table on Count

I have two tables in Access, one containing IDs (not unique) and some Name and one containing IDs (not unique) and Location. I would like to return a third table that contains only the IDs of the elements that appear more than 1 time in either Names or Location.
Table 1
ID Name
1 Max
1 Bob
2 Jack
Table 2
ID Location
1 A
2 B
Basically in this setup it should return only ID 1 because 1 appears twice in Table 1 :
ID
1
I have tried to do a JOIN on the tables and then apply a COUNT but nothing came out.
Thanks in advance!
Here is one method that I think will work in MS Access:
(select id
from table1
group by id
having count(*) > 1
) union -- note: NOT union all
(select id
from table2
group by id
having count(*) > 1
);
MS Access does not allow union/union all in the from clause. Nor does it support full outer join. Note that the union will remove duplicates.
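The query can be checked against the question's sample rows. The sketch below assumes SQLite (via Python's sqlite3) instead of MS Access, and drops the parentheses around each SELECT because SQLite does not accept them in a compound query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER, name TEXT);
    CREATE TABLE table2 (id INTEGER, location TEXT);
    INSERT INTO table1 VALUES (1, 'Max'), (1, 'Bob'), (2, 'Jack');
    INSERT INTO table2 VALUES (1, 'A'), (2, 'B');
""")

# Ids that occur more than once in either table; UNION removes repeats
# when an id is duplicated in both tables.
duplicate_ids = [row[0] for row in conn.execute("""
    SELECT id FROM table1 GROUP BY id HAVING COUNT(*) > 1
    UNION
    SELECT id FROM table2 GROUP BY id HAVING COUNT(*) > 1
""")]
print(duplicate_ids)
```

Only id 1 is returned, since it appears twice in table1 and every id in table2 appears once.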
Simple Group By and Having clause should help you
select ID
From Table1
Group by ID
having count(1)>1
union
select ID
From Table2
Group by ID
having count(1)>1
Based on your description, you do not need to join tables to find duplicate records, if your table is what you gave above, simply use:
With A
as
(
select ID,count(*) as Times From table group by ID
)
select * From A where A.Times>1
Not sure I understand what query you already tried, but this should work:
select table1.ID
from table1 inner join table2 on table1.id = table2.id
group by table1.ID
having count(*) > 1
Or if you have ID's in one table but not the other
select table1.ID
from table1 full outer join table2 on table1.id = table2.id
group by table1.ID
having count(*) > 1

SQL Server: Cross Referencing multiple columns in 1 table with multiple columns in another

I have 2 tables each containing 10 integer fields. I need to retrieve records from table2 where if one of the field values exists in any one of the columns in table1. For example;
So if I have a variable containing 1 (the id of record in table1) I need sql which will retrieve records 2 & 3 from table 2 because either 3400 or 3500 exists in any of the cat fields in table 2. Hope that makes sense ;-)
This might be the most efficient way, although a bit cumbersome to write:
select t2.*
from table2 t2
where exists (select 1 from table1 t1 where t2.cat1 in (t1.cat1, t1.cat2, t1.cat3, t1.cat4, t1.cat5)) or
exists (select 1 from table1 t1 where t2.cat2 in (t1.cat1, t1.cat2, t1.cat3, t1.cat4, t1.cat5)) or
exists (select 1 from table1 t1 where t2.cat3 in (t1.cat1, t1.cat2, t1.cat3, t1.cat4, t1.cat5)) or
exists (select 1 from table1 t1 where t2.cat4 in (t1.cat1, t1.cat2, t1.cat3, t1.cat4, t1.cat5)) or
exists (select 1 from table1 t1 where t2.cat5 in (t1.cat1, t1.cat2, t1.cat3, t1.cat4, t1.cat5));
More importantly, though, your data structure is bad. Trying to use multiple columns as an array is usually a bad idea in SQL. You want a junction table for each of these, where you have one row per id and category. If you had such a data structure, the query would be much simpler to write and probably have much better performance.
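As a rough illustration of the junction-table layout, here is a sketch with one row per (id, category). The table names table1_cat and table2_cat, the helper matching_table2_ids, and all rows other than the 3400 and 3500 values mentioned in the question are hypothetical, and SQLite via Python's sqlite3 stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per record id and category (hypothetical junction tables).
    CREATE TABLE table1_cat (id INTEGER, cat INTEGER);
    CREATE TABLE table2_cat (id INTEGER, cat INTEGER);
    INSERT INTO table1_cat VALUES (1, 3400), (1, 3500), (2, 1200);
    INSERT INTO table2_cat VALUES (1, 1000), (2, 3400), (2, 7700), (3, 3500), (3, 3400);
""")

def matching_table2_ids(conn, table1_id):
    """Return table2 ids sharing at least one category with the given table1 id."""
    rows = conn.execute("""
        SELECT DISTINCT t2.id
        FROM table2_cat t2
        JOIN table1_cat t1 ON t1.cat = t2.cat
        WHERE t1.id = ?
        ORDER BY t2.id
    """, (table1_id,))
    return [row[0] for row in rows]

print(matching_table2_ids(conn, 1))
```

For table1 record 1 this returns records 2 and 3, matching the outcome described in the question. The whole cross-reference collapses to a single equality join.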
WITH Table1Ids AS (
SELECT ID, cat
FROM Table1
CROSS APPLY (
VALUES (cat0), (cat1), (cat2), (cat3), (cat4)
) AS CA1(cat)
)
,Table2Ids AS (
SELECT ID, cat
FROM Table2
CROSS APPLY (
VALUES (cat0), (cat1), (cat2), (cat3), (cat4)
) AS CA1(cat)
)
,MatchingRecords AS (
SELECT DISTINCT Table1Ids.ID AS Table1ID
,Table2Ids.ID AS Table2ID
FROM Table1Ids
INNER JOIN Table2Ids
ON Table2Ids.cat = Table1Ids.cat
)
SELECT Table2.*
FROM MatchingRecords
INNER JOIN Table2
ON Table2.ID = MatchingRecords.Table2ID
WHERE MatchingRecords.Table1ID = @TheIDYouAreLookingFor

Compare two tables and give the output record which does not exist in 1st table

I want SQL code that performs a data-scrubbing task.
I have two tables that both contain names. I want to compare them and list only the names that are in table 2 but not in table 1.
Example:
Table 1 = A ,B,C
Table 2 = C,D,E
The result must be D and E.
SELECT t2.name
FROM table2 t2
LEFT JOIN
table1 t1 ON t1.name=t2.name
WHERE t1.name IS NULL
select T2.Name
from Table2 as T2
where not exists (select * from Table1 as T1 where T1.Name = T2.Name)
See this article about performance of different implementations of anti-join (for SQL Server).
select t2.name
from t2,t1
where t2.name<>t1.name -- ( or t2.name!=t1.name)
If the DBMS supports it:
select name from table2
minus
select name from table1
A more portable solution could also be:
select name from table2
where name not in (select name from table1)
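One difference between the NOT EXISTS and NOT IN forms shows up when table1 contains a NULL name. The sketch below assumes SQLite (via Python's sqlite3) as the engine; the NULL row is added for illustration and is not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (name TEXT);
    CREATE TABLE table2 (name TEXT);
    INSERT INTO table1 VALUES ('A'), ('B'), ('C');
    INSERT INTO table2 VALUES ('C'), ('D'), ('E');
""")

NOT_EXISTS = """
    SELECT name FROM table2 t2
    WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.name = t2.name)
    ORDER BY name
"""
NOT_IN = """
    SELECT name FROM table2
    WHERE name NOT IN (SELECT name FROM table1)
    ORDER BY name
"""

def run(sql):
    """Run a query and return the first column as a list."""
    return [row[0] for row in conn.execute(sql)]

before = (run(NOT_EXISTS), run(NOT_IN))

# A NULL in the subquery makes every NOT IN comparison unknown.
conn.execute("INSERT INTO table1 VALUES (NULL)")
after = (run(NOT_EXISTS), run(NOT_IN))
print(before, after)
```

Both forms return D and E on the original data. After the NULL is inserted, NOT EXISTS still returns D and E, while NOT IN returns no rows.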

select first N distinct rows without inner select in oracle

I have something like the following structure: Table1 -> Table2 relationship is 1:m
I need to perform queries similar to the next one:
select Table1.id from Table1 left outer join Table2 on (Table1.id1 = Table2.id2) where Table2.name like '%a%' and rownum < 11
i.e. I want first 10 ids from Table 1 which fulfils conditions in Table2. The problem is that I've to use distinct, but the distinct clause applies after 'rownum < 11', so the result could be e.g. 5 records even if their number is more than 10.
The apparent solution is to use the following:
select id from ( select Table1.id from Table1 left outer join Table2 on (Table1.id1 = Table2.id2) where Table2.name like '%a%' ) where rownum < 11
But I'm afraid of performance of such a query. If Table1 contains about 300k records, and Table2 contains about 700k records, wouldn't such a query be really slow?
Is there another query, but without inner select? Unluckily, I want to avoid using inner selects.
"Unluckily, I want to avoid using inner selects"
Because the WHERE clause filters on Table2, the LEFT JOIN effectively becomes an INNER JOIN: for unmatched rows Table2.name is NULL, and NULL never satisfies Table2.name like '%a%', so only matched rows survive. Also, without a function-based index, the '%a%' pattern will result in a full table scan.
But @lweller is completely correct: to do the query correctly you will need to use a subquery. Keep in mind that without an ORDER BY you have no guarantee of the order of your top X records (it may always 'appear' that the values conform to the primary key or whatnot, but there is no guarantee).
WITH TABLE1 AS(SELECT 1 ID FROM DUAL
UNION ALL
SELECT 2 ID FROM DUAL
UNION ALL
SELECT 3 ID FROM DUAL
UNION ALL
SELECT 4 ID FROM DUAL
UNION ALL
SELECT 5 ID FROM DUAL) ,
TABLE2 AS(SELECT 1 ID, 'AAA' NAME FROM DUAL
UNION ALL
SELECT 2 ID, 'ABB' NAME FROM DUAL
UNION ALL
SELECT 3 ID, 'ACC' NAME FROM DUAL
UNION ALL
SELECT 4 ID, 'ADD' NAME FROM DUAL
UNION ALL
SELECT 1 ID, 'BBB' NAME FROM DUAL
) ,
sortable as( --here is the subquery
SELECT
Table1.ID ,
ROW_NUMBER( ) OVER (ORDER BY Table2.NAME NULLS LAST) ROWOverName , --this will handle the sort
table2.name
from
Table1
LEFT OUTER JOIN --this left join is moot, pull the WHERE table2.name into the join to have it LEFT join as expected
Table2
on
(
Table1.id = Table2.id
)
WHERE
Table2.NAME LIKE '%A%')
SELECT *
FROM sortable
WHERE ROWOverName <= 2;
-- you can drop the ROW_NUMBER( ) analytic function and replace the final query as such (as you initially indicated)
SELECT *
FROM sortable
WHERE
ROWNUM <= 2
ORDER BY sortable.NAME --make sure to put in an order by!
;
You don't need DISTINCT here at all, and there is nothing bad in subqueries as such.
SELECT id
FROM Table1
WHERE id IN
(
SELECT id
FROM Table2
WHERE name LIKE '%a%'
)
AND rownum < 11
Note that the order is not guaranteed. To guarantee order, you have to use a nested query:
SELECT id
FROM (
SELECT id
FROM Table1
WHERE id IN
(
SELECT id
FROM Table2
WHERE name LIKE '%a%'
)
ORDER BY
id -- or whatever else
)
WHERE rownum < 11
There is no way to do it without nested queries (or the CTE).
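To see why the IN form needs no DISTINCT while the join form does, here is a small comparison. It is only a sketch: SQLite (via Python's sqlite3) has no ROWNUM, so LIMIT stands in for the row cap, and the sample names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER);
    CREATE TABLE Table2 (id INTEGER, name TEXT);
    INSERT INTO Table1 VALUES (1), (2), (3), (4), (5);
    -- id 1 has two matching children, which duplicates it in a join.
    INSERT INTO Table2 VALUES (1, 'anna'), (1, 'adam'), (2, 'clara'), (3, 'bob'), (4, 'dave');
""")

# Join form: the row cap is applied to joined rows, so id 1 uses up both slots.
join_ids = [row[0] for row in conn.execute("""
    SELECT t1.id
    FROM Table1 t1
    JOIN Table2 t2 ON t1.id = t2.id
    WHERE t2.name LIKE '%a%'
    ORDER BY t1.id
    LIMIT 2
""")]

# IN form: each Table1 row is tested once, so no duplicates arise.
in_ids = [row[0] for row in conn.execute("""
    SELECT id
    FROM Table1
    WHERE id IN (SELECT id FROM Table2 WHERE name LIKE '%a%')
    ORDER BY id
    LIMIT 2
""")]
print(join_ids, in_ids)
```

The join form returns id 1 twice, so applying DISTINCT afterwards would leave a single id, which is the problem described in the question. The IN form returns ids 1 and 2.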
For me there is no reason to be afraid of the performance. I think the subselect is the best way to solve your problem. And if you don't trust me, take a look at the explain plan of your query and you will see that it does not behave as badly as you might think.