I have a table that holds a link relationship. Field1 is an ID and Field2 is an ID. So far I've eliminated duplicate Field1\Field2 combinations. However, I still have cases where the inverse occurs, meaning the Field1 value of one record appears as Field2 of another record and its Field2 value appears as that record's Field1. I tried a subquery within an inner join, but it returns way too many rows. Thanks in advance!
select a.field1, a.field2 from linktable a
inner join (select field1, field2 from linktable) b on a.field1=b.field2 and a.field2=b.field1;
Sample data:
insert into linktable(field1, field2) values ('ABC', '123');
insert into linktable(field1, field2) values ('123', 'ABC');
I want to identify and remove cases where the above sample data occurs.
Something like this should get you the pairs:
SELECT a.field1, a.field2
FROM linktable a
JOIN linktable b ON (b.field1 = a.field2 AND b.field2 = a.field1)
Edit: An example of finding one of the pairs:
create table linktable(id number, field1 varchar2(32), field2 varchar2(32));
insert into linktable(id, field1, field2) values (1, 'ABC', '123');
insert into linktable(id, field1, field2) values (2, '123', 'ABC');
SELECT a.field1, a.field2, LEAST(a.id, b.id) AS id_to_delete
FROM linktable a
JOIN linktable b ON (b.field1 = a.field2 AND b.field2 = a.field1);
Result:
FIELD1  FIELD2  ID_TO_DELETE
123     ABC     1
ABC     123     1
You should post sample data and expected result.
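I think this query may work for you. It keeps only one orientation of each pair, so (x, y) and (y, x) never both appear. Be aware that it also drops any row where field1 < field2, even when no inverse row exists.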
SELECT field1, field2
FROM linktable
WHERE field1 >= field2;
Is this what you want?
select field1, field2
from (select field1, field2,
row_number() over (partition by least(field1, field2), greatest(field1, field2) order by field1) as seqnum
from t
) t
where seqnum = 1;
This returns one instance of field1/field2 regardless of ordering.
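Since the question also asks to remove the inverse rows, here is a minimal sketch of one way to do that, run through Python's built-in sqlite3 module so it can be tried end to end. SQLite, the extra ('DEF', '456') row, and the rule "keep the row whose field1 sorts lower" are assumptions for illustration, not something the asker specified.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE linktable (field1 TEXT, field2 TEXT);
    INSERT INTO linktable VALUES ('ABC', '123');
    INSERT INTO linktable VALUES ('123', 'ABC');
    INSERT INTO linktable VALUES ('DEF', '456');  -- hypothetical row with no inverse
""")

# Delete a row only when its mirror image exists, and only the copy whose
# field1 sorts higher, so exactly one row of each mirrored pair survives.
conn.execute("""
    DELETE FROM linktable
    WHERE field1 > field2
      AND EXISTS (SELECT 1
                  FROM linktable b
                  WHERE b.field1 = linktable.field2
                    AND b.field2 = linktable.field1)
""")

remaining = sorted(conn.execute("SELECT field1, field2 FROM linktable").fetchall())
print(remaining)
```

The EXISTS check is what keeps unpaired rows such as ('DEF', '456') in place, which a plain `WHERE field1 > field2` filter would not do.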
Related
I have a query like this:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) from ... where ...=field1) as field3
FROM
...
And it works fine - and I see 3 columns in results
Then I need to add one more column from the inner query:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) as my_count, sum(*) as my_sum from ...where ...=field1 ) as field3
FROM
...
this syntax doesn't work.
How can I achieve it ?
The partial query makes it unclear what you really want, but I would expect that the subquery actually correlates to the outer query (otherwise, you could just cross join). If so, a typical solution is a lateral join.
In Postgres:
select
field1 as field1,
field2 as field2,
x.*
from ...
left join lateral (
select count(*) as my_count, sum(*) as my_sum from ...
) x on true
Oracle supports lateral joins starting version 12. You just need to replace left join lateral with outer apply.
The following would seem to do what you want, and it should work fine in Oracle 9i:
SELECT t.field1,
t.field2,
x.my_count,
x.my_sum
FROM SOME_TABLE t
LEFT OUTER JOIN (select FIELD1,
count(*) as my_count,
sum(SOME_FIELD) as my_sum
from SOME_OTHER_TABLE
GROUP BY FIELD1) x
ON x.FIELD1 = t.FIELD1
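To make the derived-table approach above concrete, here is a small sketch run through Python's sqlite3 module. The table contents are invented for illustration, and SQLite is an assumption chosen only so the query can be executed as-is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE some_table (field1 TEXT, field2 TEXT);
    CREATE TABLE some_other_table (field1 TEXT, some_field INTEGER);
    INSERT INTO some_table VALUES ('a', 'x'), ('b', 'y');   -- hypothetical data
    INSERT INTO some_other_table VALUES ('a', 10), ('a', 5);
""")

# Aggregate once per field1 in a derived table, then join it back,
# so both aggregates come from a single pass over some_other_table.
rows = conn.execute("""
    SELECT t.field1, t.field2, x.my_count, x.my_sum
    FROM some_table t
    LEFT OUTER JOIN (SELECT field1,
                            COUNT(*)        AS my_count,
                            SUM(some_field) AS my_sum
                     FROM some_other_table
                     GROUP BY field1) x
      ON x.field1 = t.field1
    ORDER BY t.field1
""").fetchall()
print(rows)
```

One difference from the scalar-subquery version: for a row with no match, my_count comes back as NULL rather than 0, so wrap it in COALESCE if a zero is needed.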
You can use a CTE (Common Table Expression) to precompute the values:
WITH
q as (select count(*) as my_count, sum(*) as my_sum from ... )
SELECT
field1 as field1 ,
field2 as field2 ,
q.my_count as field3,
q.my_sum as field4
FROM
...
CROSS JOIN q
Or... you can always use the less performant, simpler way:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) from ... ) as field3,
(select sum(*) from ... ) as field4
FROM
...
With your limited (& a bit confusing - 2 databases, sum(*) ...) info,
here is the logic:
SELECT
field1 as field1 ,
field2 as field2 ,
(select count(*) from ... ) as my_count,
(Select sum(<my field>) from ...) as my_sum
FROM
...
Given this setup:
CREATE TABLE table1 (column1 text, column2 text);
CREATE TABLE table2 (column1 text, column2 text);
INSERT INTO table1 VALUES
('A', 'A')
, ('B', 'N')
, ('C', 'C')
, ('B', 'A');
INSERT INTO table2 VALUES
('A', 'A')
, ('B', 'N')
, ('C', 'X')
, ('B', 'Y');
How can I find missing combinations of (column1, column2) between these two tables? Rows not matched in the other table.
The desired result for the given example would be:
C | C
B | A
C | X
B | Y
There can be duplicate entries so we'd want to omit those.
One method is union all:
select t1.col1, t1.col2
from t1
where (t1.col1, t1.col2) not in (select t2.col1, t2.col2 from t2)
union all
select t2.col1, t2.col2
from t2
where (t2.col1, t2.col2) not in (select t1.col1, t1.col2 from t1);
If there are duplicates within a table, you can remove them by using select distinct. There is no danger of duplicates between the tables.
Seems to be a perfect task for set operations:
( --all rows from table 1 missing in table 2
select *
from table1
except
select *
from table2
)
union all -- both select return distinct rows
( -- all rows in table 2 missing in table 1
select *
from table2
except
select *
from table1
)
You can use NOT EXISTS with a correlated subquery in each direction, then combine the two results with UNION ALL. Both columns must match, so the conditions are joined with AND:
select Column1, Column2 from table1 t1
where NOT exists
(
select 1
FROM table2 t2
where t1.Column1 = t2.Column1 and t1.Column2 = t2.Column2
)
UNION ALL
select Column1, Column2 from table2 t2
where NOT exists
(
select 1
FROM table1 t1
where t1.Column1 = t2.Column1 and t1.Column2 = t2.Column2
)
You can try set operations: EXCEPT to find the rows that are in one table but not in the other, and UNION to combine the two partial results into one.
(SELECT column1,
column2
FROM table1
EXCEPT
SELECT column1,
column2
FROM table2)
UNION
(SELECT column1,
column2
FROM table2
EXCEPT
SELECT column1,
column2
FROM table1);
If you don't need duplicate elimination you can try to use the ALL variants (EXCEPT ALL and UNION ALL). They are generally faster, as the DBMS doesn't have to look for and eliminate duplicates.
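As a quick check of the EXCEPT/UNION idea against the sample data from the question, here is a sketch using Python's sqlite3 module. SQLite is an assumption made for runnability; it does not accept parenthesized operands in a compound SELECT, so each EXCEPT is wrapped in a derived table instead. The duplicated ('C', 'C') row is added hypothetically to show that duplicates are folded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (column1 TEXT, column2 TEXT);
    CREATE TABLE table2 (column1 TEXT, column2 TEXT);
    INSERT INTO table1 VALUES ('A', 'A'), ('B', 'N'), ('C', 'C'), ('B', 'A'),
                              ('C', 'C');  -- hypothetical duplicate
    INSERT INTO table2 VALUES ('A', 'A'), ('B', 'N'), ('C', 'X'), ('B', 'Y');
""")

# Rows only in table1, plus rows only in table2 (symmetric difference).
rows = conn.execute("""
    SELECT column1, column2
    FROM (SELECT column1, column2 FROM table1
          EXCEPT
          SELECT column1, column2 FROM table2)
    UNION
    SELECT column1, column2
    FROM (SELECT column1, column2 FROM table2
          EXCEPT
          SELECT column1, column2 FROM table1)
    ORDER BY column1, column2
""").fetchall()
print(rows)
```

The four rows returned are the ones listed as the desired result in the question, in sorted order.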
The devil is in the details with this seemingly simple task.
Short and among the fastest:
SELECT col1, col2
FROM (SELECT col1, col2, TRUE AS x1 FROM t1) t1
FULL JOIN (SELECT col1, col2, TRUE AS x2 FROM t2) t2 USING (col1, col2)
WHERE (x1 AND x2) IS NULL;
The FULL [OUTER] JOIN includes all rows from both sides, but fills in NULL values for columns of missing rows. The WHERE condition (x1 AND x2) IS NULL identifies these unmatched rows. Equivalent: WHERE x1 IS NULL OR x2 IS NULL.
To eliminate duplicate pairs, add DISTINCT (or GROUP BY) at the end - cheaper for few dupes:
SELECT DISTINCT col1, col2
FROM ...
If you have many dupes on either side, it's cheaper to fold before the join:
SELECT col1, col2
FROM (SELECT DISTINCT col1, col2, TRUE AS x1 FROM t1) t1
FULL JOIN (SELECT DISTINCT col1, col2, TRUE AS x2 FROM t2) t2 USING (col1, col2)
WHERE (x1 AND x2) IS NULL;
It's more complicated if there can be NULL values. DISTINCT / DISTINCT ON or GROUP BY treat them as equal (so dupes with NULL values are folded in the subqueries above). But JOIN or WHERE conditions must evaluate to TRUE for rows to pass. NULL values are not considered equal in this, the FULL [OUTER] JOIN never finds a match for pairs containing NULL. This may or may not be desirable. You just have to be aware of the difference and define your requirements.
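The NULL difference described above can be seen in a tiny sketch, run here through Python's sqlite3 module (SQLite and the one-row tables are assumptions for illustration). It uses EXCEPT as a stand-in for the "DISTINCT-style" comparison and a NOT EXISTS with equality as a stand-in for the join-style comparison.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (col1 TEXT, col2 TEXT);
    CREATE TABLE t2 (col1 TEXT, col2 TEXT);
    INSERT INTO t1 VALUES ('A', NULL);
    INSERT INTO t2 VALUES ('A', NULL);
""")

# Set operations treat two NULLs as the same value, so the row "matches".
via_except = conn.execute("""
    SELECT col1, col2 FROM t1
    EXCEPT
    SELECT col1, col2 FROM t2
""").fetchall()

# An equality predicate with NULL is never TRUE, so the row counts as unmatched.
via_equality = conn.execute("""
    SELECT t1.col1, t1.col2
    FROM t1
    WHERE NOT EXISTS (SELECT 1 FROM t2
                      WHERE t2.col1 = t1.col1 AND t2.col2 = t1.col2)
""").fetchall()

print(via_except, via_equality)
```

Which of the two behaviours is correct depends on whether a pair containing NULL should count as present in the other table.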
Consider the added demo in the SQL Fiddle
If there are no NULL values, no duplicates, but an additional column defined NOT NULL in each table, like the primary key, let's name each id, then it can be as simple as:
SELECT col1, col2
FROM t1
FULL JOIN t2 USING (col1, col2)
WHERE t1.id IS NULL OR t2.id IS NULL;
Related:
Select rows which are not present in other table
PostgreSQL - Create table as select with distinct on specific columns
I have this table named table1
id uniquefield field1 field2
1 11 test test2
2 12 test2 test3
and I have this value in my temp table #temp1
id uniquefield field1 field2
1 11 test test2
2 12 test2 test3
3 13 test4 test5
4 14 test5 test6
Now I want to transfer all data from the #temp1 table into table1: a row should be inserted if it does not exist in table1 and updated if it does.
Does anybody know how to do this using SQL Server or dynamic SQL?
Hope to find some response from you.
Temp tables are no different from regular tables in a case like this. The differences are that they are visible only to the connection that created them and are dropped automatically when that connection closes. So you can treat them like any other SQL table and use a MERGE statement for this data manipulation.
Assuming the uniquefield column can be treated as link between these tables.
MERGE table1 t
USING #temp1 t1
ON t.uniquefield = t1.uniquefield
WHEN MATCHED THEN
UPDATE
SET t.id = t1.id,
t.field1 = t1.field1,
t.field2 = t1.field2
WHEN NOT MATCHED BY TARGET THEN
INSERT (id, uniquefield, field1, field2)
VALUES (t1.id, t1.uniquefield, t1.field1, t1.field2 );
You can DROP #temp1 after this and do a SELECT * FROM table1 to check the updated/ inserted data.
Assuming rows in the two tables are matched by id, insert the missing ones with NOT EXISTS:
-- Append missing row to table1
INSERT table1
SELECT * FROM #temp1 t WHERE NOT EXISTS(SELECT * FROM table1 WHERE id = t.id)
Temp tables are no different from regular tables in a case like this. The differences are that they are visible only to the connection that created them and are dropped automatically when that connection closes. So you can treat them like any other SQL table.
Assuming the uniquefield column can be treated as link between these tables.
Update statement:
update t
set
t.id = t1.id,
t.field1 = t1.field1,
t.field2 = t1.field2
from table1 t
join #temp1 t1
on t.uniquefield = t1.uniquefield
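Insert statement (a left join that keeps only the #temp1 rows with no match in table1):
insert into table1(id, uniquefield, field1, field2)
select t1.id, t1.uniquefield, t1.field1, t1.field2
from #temp1 t1
left join table1 t
on t.uniquefield = t1.uniquefield
where t.uniquefield is null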
This is assuming that id is the table primary key. Otherwise replace appropriately.
UPDATE Table1
SET id= T.id,
Uniquefield = T.Uniquefield,
Field1 = T.field1,
Field2 = T.field2
FROM Table1
INNER JOIN #temp1 T ON T.id = Table1.id;
INSERT INTO Table1 (id, uniquefield, field1, field2)
SELECT id, uniquefield, field1, field2
FROM #temp1
WHERE id NOT IN (SELECT id FROM Table1)
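The update-then-insert pattern above can be sketched in a runnable form with Python's sqlite3 module. This is not SQL Server: SQLite is an assumption made for runnability, `temp1` stands in for `#temp1`, correlated subqueries replace `UPDATE ... FROM`, and the 'stale' values in table1 are hypothetical so that the update has a visible effect.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, uniquefield INTEGER,
                         field1 TEXT, field2 TEXT);
    CREATE TEMP TABLE temp1 (id INTEGER, uniquefield INTEGER,
                             field1 TEXT, field2 TEXT);
    INSERT INTO table1 VALUES (1, 11, 'test', 'test2'),
                              (2, 12, 'stale', 'stale');  -- hypothetical old values
    INSERT INTO temp1 VALUES (1, 11, 'test', 'test2'), (2, 12, 'test2', 'test3'),
                             (3, 13, 'test4', 'test5'), (4, 14, 'test5', 'test6');
""")

with conn:  # run both statements in one transaction
    # Step 1: refresh rows that already exist in table1.
    conn.execute("""
        UPDATE table1
        SET uniquefield = (SELECT s.uniquefield FROM temp1 s WHERE s.id = table1.id),
            field1      = (SELECT s.field1      FROM temp1 s WHERE s.id = table1.id),
            field2      = (SELECT s.field2      FROM temp1 s WHERE s.id = table1.id)
        WHERE EXISTS (SELECT 1 FROM temp1 s WHERE s.id = table1.id)
    """)
    # Step 2: add rows that are still missing.
    conn.execute("""
        INSERT INTO table1 (id, uniquefield, field1, field2)
        SELECT s.id, s.uniquefield, s.field1, s.field2
        FROM temp1 s
        WHERE NOT EXISTS (SELECT 1 FROM table1 t WHERE t.id = s.id)
    """)

rows = conn.execute(
    "SELECT id, uniquefield, field1, field2 FROM table1 ORDER BY id"
).fetchall()
print(rows)
```

Running the update first means the insert never sees rows it would need to touch twice; wrapping both in one transaction keeps other readers from observing the half-finished state.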
I think the elegant way is to use MERGE here:
SET IDENTITY_INSERT table1 ON;
MERGE INTO table1 AS T
USING #temp1 AS S
ON S.id = T.id
WHEN MATCHED THEN
UPDATE SET
T.uniquefield = S.uniquefield,
T.field1 = S.field1,
T.field2 = S.field2
WHEN NOT MATCHED BY TARGET THEN
INSERT (id, uniquefield, field1, field2)
VALUES (S.id, S.uniquefield, S.field1, S.field2);
SET IDENTITY_INSERT table1 OFF;
I have added the IDENTITY_INSERT statements in case the id column in table1 is an IDENTITY column and you want to keep the values from the #temp1 table. If you don't need or have an IDENTITY column, just remove those lines.
You can simply use this query:
insert into table1
select * from (
select #temp.workersID,#temp.W_name,#temp.salary,#temp.joining_year,#temp.city,#temp.id
from #temp
full join workers
on #temp.WorkersID = workers.WorkersID
where workers.WorkersID is null
) ds
I would like to expand on this simple sub select:
Select * from table1 where pkid in (select fkid from table2 where clause...)
The logic above is fairly simple - get me all rows in table1 where the pkid is contained in the subset returned from the sub select query that has a where clause. It works well because there is only 1 field being returned.
Now I want to expand on this.
In table 1 I want to return results where field1 and field2 and field3 in select (field1, field2, field3 from table2 where clause...)
How is this possible?
Thanks in advance.
Example.
TABLE1
FIELD1 FIELD2 FIELD3
1 2 3
2 3 4
4 5 6
TABLE 2
2 3 4
4 5 6
I want to return 2 results.
If I understand what you need you can try:
SELECT t1.field1, t1.field2, t1.field3 FROM table1 t1
INNER JOIN table2 t2
ON t1.field1 = t2.field1
AND t1.field2 = t2.field2
AND t1.field3 = t2.field3
AND t2.... -- put the condition from your WHERE clause here
Like Marco pointed out, what you want to do is an INNER JOIN.
But (this is just FYI, you should definitely use Marco's solution) it's also possible to compare all three columns at once by wrapping them in parentheses.
Select *
from table1
where (field1, field2, field3) in (select field1, field2, field3 from table2 where clause...)
At least in MySQL (wasn't this question tagged with MySQL?)
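You can use a temporary table:
select field1, field2, field3 into #tempTable from table2 where clause...
select * from table1
where field1 in (select field1 from #tempTable)
and field2 in (select field2 from #tempTable)
and field3 in (select field3 from #tempTable)
Note that this checks each column independently, so a row can match even when its three values never occur together in a single row of #tempTable.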
Avoid using IN for most cases like this. It's very limiting.
I prefer to use a JOIN in most cases.
SELECT
*
FROM
yourTable
INNER JOIN
(SELECT c1, c2, c3 FROM anotherQuery) AS filter
ON yourTable.c1 = filter.c1
AND yourTable.c2 = filter.c2
AND yourTable.c3 = filter.c3
(Ensure the filter returns unique combinations of c1, c2, c3 using DISTINCT or GROUP BY if necessary)
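To illustrate why the join on all three columns is not the same as testing each column with its own IN, here is a small sketch using Python's sqlite3 module. SQLite is an assumption for runnability, and the (2, 5, 6) row is a hypothetical addition to the question's sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (field1 INTEGER, field2 INTEGER, field3 INTEGER);
    CREATE TABLE table2 (field1 INTEGER, field2 INTEGER, field3 INTEGER);
    INSERT INTO table1 VALUES (1, 2, 3), (2, 3, 4), (4, 5, 6),
                              (2, 5, 6);  -- hypothetical mixed row
    INSERT INTO table2 VALUES (2, 3, 4), (4, 5, 6);
""")

# Join on the whole combination: a row matches only if all three
# values occur together in one row of table2.
joined = conn.execute("""
    SELECT t1.field1, t1.field2, t1.field3
    FROM table1 t1
    INNER JOIN (SELECT DISTINCT field1, field2, field3 FROM table2) f
      ON t1.field1 = f.field1
     AND t1.field2 = f.field2
     AND t1.field3 = f.field3
    ORDER BY t1.field1, t1.field2
""").fetchall()

# Independent IN tests: each value only has to occur somewhere in its column.
per_column = conn.execute("""
    SELECT field1, field2, field3
    FROM table1
    WHERE field1 IN (SELECT field1 FROM table2)
      AND field2 IN (SELECT field2 FROM table2)
      AND field3 IN (SELECT field3 FROM table2)
    ORDER BY field1, field2
""").fetchall()

print(joined)
print(per_column)
```

The join returns the two rows the question asks for, while the per-column version also lets the mixed row (2, 5, 6) through.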
You didn't mention the database engine, so I'll assume SQL Server.
This query will show you what's on both tables
select FIELD1, FIELD2 from table1
intersect
select FIELD1, FIELD2 from table2
I want to insert only Distinct Records from Table "A" to Table "B". Assume both the tables has same structure.
If by DISTINCT you mean unique records that are in TableA but aren't already in TableB, then do the following:
INSERT INTO TableB(Col1, Col2, Col3, ... , Coln)
SELECT DISTINCT A.Col1, A.Col2, A.Col3, ... , A.Coln
FROM TableA A
LEFT JOIN TableB B
ON A.KeyOfTableA = B.KeyOfTableB
WHERE B.KeyOfTableB IS NULL
INSERT INTO B SELECT DISTINCT * FROM A
You might not want the id column of the table to be part of the distinct check, so use this solution if that's the case: https://stackoverflow.com/a/5171345/453673
INSERT INTO TableB
(Col1, Col2, ...)
SELECT DISTINCT Col1, Col2, ...
FROM TableA
INSERT INTO TableB
SELECT *
FROM TableA AS A
WHERE NOT EXISTS(SELECT * FROM TableB AS B WHERE B.Field1 = A.Field1)
-- If need: B.Field2 = A.Field2 and B.Field3 = A.Field3
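Combining the two ideas from the answers above (DISTINCT on the source plus NOT EXISTS against the target) might look like the following sketch, run through Python's sqlite3 module. SQLite, the two-column layout, and the sample rows are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE TableA (Field1 TEXT, Field2 TEXT);
    CREATE TABLE TableB (Field1 TEXT, Field2 TEXT);
    INSERT INTO TableA VALUES ('k1', 'v1'), ('k1', 'v1'), ('k2', 'v2');  -- hypothetical
    INSERT INTO TableB VALUES ('k2', 'v2');
""")

with conn:
    # DISTINCT folds duplicates inside TableA; NOT EXISTS skips rows
    # that TableB already contains.
    conn.execute("""
        INSERT INTO TableB (Field1, Field2)
        SELECT DISTINCT A.Field1, A.Field2
        FROM TableA AS A
        WHERE NOT EXISTS (SELECT 1
                          FROM TableB AS B
                          WHERE B.Field1 = A.Field1
                            AND B.Field2 = A.Field2)
    """)

rows = sorted(conn.execute("SELECT Field1, Field2 FROM TableB").fetchall())
print(rows)
```

Without the DISTINCT, the duplicated ('k1', 'v1') row would be inserted twice, because NOT EXISTS only checks what is already in TableB.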