Create table with duplicate records? - sql

I have identified duplicate records in my staging environment
SQL> SELECT COUNT(*)
2 FROM MASTER_CHILD_MERGE A
3 WHERE A.CAM_ID IN (SELECT B.CAM_ID FROM CAM_DIM B);
SQL> 703022
For the analyst on our team, I want to create a table that has both A's and B's columns. I tried this:
SQL> CREATE TABLE DUPES AS
2 SELECT * FROM NDS_MASTER_CHILD_MERGE A
3 WHERE A.CAM_ID IN (SELECT B.CAM_ID FROM CAM_DIM B);
but I realized that it only gives me A's columns. How do I add B's columns as well? I am pretty sure there is an obvious solution, but I am not seeing it...
I am on Oracle 10g.

Just join those two tables:
create table dupes as
select *
from nds_master_child_merge a
join cam_dim b
on a.cam_id = b.cam_id

SELECT A.*, B.* INTO DUPES
FROM NDS_MASTER_CHILD_MERGE AS A INNER JOIN CAM_DIM AS B ON A.CAM_ID = B.CAM_ID
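
One caveat with the join answers: both tables contain a CAM_ID column, and as far as I know Oracle rejects a CREATE TABLE ... AS SELECT * that would produce two columns with the same name (ORA-00957). A minimal sketch of the usual workaround, listing the columns explicitly, is shown below. It uses Python's sqlite3 module as a stand-in for Oracle, and the columns merge_note and cam_name are hypothetical, since the question does not show the real table definitions.

```python
import sqlite3

# SQLite stands in for Oracle here. Table names follow the question;
# merge_note and cam_name are hypothetical columns added for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE nds_master_child_merge (cam_id INTEGER, merge_note TEXT);
    CREATE TABLE cam_dim (cam_id INTEGER, cam_name TEXT);
    INSERT INTO nds_master_child_merge VALUES (1, 'm1'), (2, 'm2'), (3, 'm3');
    INSERT INTO cam_dim VALUES (1, 'alpha'), (3, 'gamma');

    -- List the columns explicitly so the shared key appears only once.
    CREATE TABLE dupes AS
    SELECT a.cam_id, a.merge_note, b.cam_name
    FROM nds_master_child_merge a
    JOIN cam_dim b ON a.cam_id = b.cam_id;
""")

# Column names of the new table, then its rows.
dupes_columns = [row[1] for row in conn.execute("PRAGMA table_info(dupes)")]
dupes_rows = conn.execute("SELECT * FROM dupes ORDER BY cam_id").fetchall()
print(dupes_columns)
print(dupes_rows)
```

Only the rows whose cam_id exists in both tables end up in dupes, and the table carries columns from both sides.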

Related

How to prove that ALL records from Query A are present in Table B for millions of records?

I have a query (written in Microsoft SQL Server Management Studio V18) that does multiple inner joins to produce a result set with 3 columns: ZipCode, ID, Income.
This result set contains 118 million records.
I have Table B with 2 columns: ZipCode, ID.
Table B contains 123 million records.
These 118M records are present in Table B and I want to prove that.
How do I do this? I don't want another result set that displays all these 118M records on the output console.
I can load the first result set into a temp table, but I am stuck after that.
Ideally I would like to see something printed on the console that will say that "All the records from temp table are present in target table<Table_Name>"
If not, what could be an ideal way to prove that all these records are present in target table?
Once you have loaded your query results into a temp table, say #a, you can proceed as below:
IF NOT EXISTS
(
SELECT TOP 1 1
FROM #a A
WHERE NOT EXISTS
(
SELECT 1 FROM TableB B
WHERE A.ZipCode=B.ZipCode
AND A.Id= B.ID
)
)
BEGIN
PRINT 'All the records from temp table are present in target table<Table_Name>'
END
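Another option is to compare the two result sets with EXCEPT in both directions; every row returned exists in only one of them:
WITH
TA AS (SELECT ...), --> first query
TB AS (SELECT ...) --> second query
(SELECT * FROM TA
EXCEPT
SELECT * FROM TB)
UNION ALL
(SELECT * FROM TB
EXCEPT
SELECT * FROM TA);
-- The parentheses are required: EXCEPT and UNION have equal precedence and are otherwise evaluated left to right.
Replace the TA and TB placeholders with your own queries; both must return the same columns.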
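
If a single number is preferred over a printed message, the same NOT EXISTS test can be wrapped in a COUNT(*): a result of 0 means every (ZipCode, ID) pair from the query is present in the target table. The sketch below is only an illustration in Python with sqlite3 standing in for SQL Server; the table name query_result (playing the role of the temp table) and the sample rows are assumptions.

```python
import sqlite3

# SQLite stands in for SQL Server; query_result plays the role of the temp
# table, and all sample rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE query_result (ZipCode TEXT, ID INTEGER, Income INTEGER);
    CREATE TABLE TableB (ZipCode TEXT, ID INTEGER);
    INSERT INTO query_result VALUES ('10001', 1, 50000), ('10002', 2, 60000);
    INSERT INTO TableB VALUES ('10001', 1), ('10002', 2), ('10003', 3);
""")


def count_missing(connection):
    """Count (ZipCode, ID) pairs in query_result that have no match in TableB."""
    sql = """
        SELECT COUNT(*)
        FROM query_result a
        WHERE NOT EXISTS (
            SELECT 1 FROM TableB b
            WHERE b.ZipCode = a.ZipCode AND b.ID = a.ID
        )
    """
    return connection.execute(sql).fetchone()[0]


missing_before = count_missing(conn)   # every pair is present in TableB
conn.execute("INSERT INTO query_result VALUES ('99999', 9, 1)")
missing_after = count_missing(conn)    # the new pair has no match
print(missing_before, missing_after)
```

Only one row comes back regardless of table size, which avoids sending millions of rows to the console.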

duplicate query result when join table

I am getting duplicate data when I join two tables. Here is the sample data I have:
-- Table A
I want to join it with
-- Table B
This is the query I use to join both tables:
select a.trans_id, name
from tableA a
inner join tableB b
on a.ID_Trans = b.trans_id
And this is the result. Why do I get duplicated data when the result should contain only two rows? Please help me solve this.
Firstly, as you have been told multiple times in the comments, this is working exactly as you have written it and, more importantly, as intended. You have 2 rows in tableA, and each of them matches 2 rows in tableB according to the ON clause. The join therefore returns 2 rows for each of the rows in tableA; thus 4 rows (2 * 2 = 4).
Considering that TableA only has one column, it seems that you should be cleaning up that data and deleting the duplicates. There are plenty of examples of how to do that already.
Perhaps the column you show us in TableA is one of many, in which case you have a denormalisation issue, and there should be another table holding the details of Id_trans with a PRIMARY KEY or UNIQUE CONSTRAINT/INDEX on it. Then you would join from that table to TableB.
Finally, what you might be after is an EXISTS, which would look like this:
SELECT B.trans_id, B.[name]
FROM dbo.TableB B
WHERE EXISTS(SELECT 1
FROM dbo.TableA A
WHERE A.ID_Trans = B.trans_id); --Odd that it's called ID_Trans in one table, and Trans_ID in another
As the comments mentioned, your query does exactly what you asked it to do, but I think you wanted something like:
select a.trans_id, a.name, b.name
from tableA a
inner join tableB b on a.trans_id = b.trans_id
group by a.trans_id, a.name, b.name
Since there are two rows with the same ID in both tables, the join turns them into four. You can use distinct to remove the duplicates:
select distinct a.trans_id, name
from tableA a
inner join tableB b
on a.id_trans = b.trans_id
But I would suggest using exists:
select trans_id, name
from tableB b
where exists (select 1 from tableA a where a.id_trans=b.trans_id)
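
To make the 2 * 2 = 4 effect concrete, here is a small sketch in Python with sqlite3. The original sample tables did not survive in the question, so the data below is hypothetical: two identical keys in tableA and two matching rows in tableB.

```python
import sqlite3

# Hypothetical sample data: the question's tables were not preserved.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (ID_Trans INTEGER);
    CREATE TABLE tableB (trans_id INTEGER, name TEXT);
    INSERT INTO tableA VALUES (1), (1);
    INSERT INTO tableB VALUES (1, 'first'), (1, 'second');
""")

# Each of the 2 rows in tableA matches both rows in tableB: 2 * 2 = 4 rows.
joined = conn.execute("""
    SELECT b.trans_id, b.name
    FROM tableA a
    INNER JOIN tableB b ON a.ID_Trans = b.trans_id
""").fetchall()

# EXISTS only checks whether a match is present, so each tableB row
# is returned at most once.
with_exists = conn.execute("""
    SELECT b.trans_id, b.name
    FROM tableB b
    WHERE EXISTS (SELECT 1 FROM tableA a WHERE a.ID_Trans = b.trans_id)
    ORDER BY b.name
""").fetchall()

print(len(joined), with_exists)
```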

Same Data In Multiple Column Or Table

I have 3 columns in Table 1 (A, B, C) and User_Column in Table 2.
I want to find the users who are present in all three columns of Table 1.
Experts, please guide me; I'm a beginner.
It seems you want an intersection of A, B, and C users:
select a as usr from table1
intersect
select b as usr from table1
intersect
select c as usr from table1;
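
A quick way to check the INTERSECT approach is to run it on a few rows. The sketch below uses Python with sqlite3, and the user names are invented for illustration; only values that occur in column a, in column b, and in column c survive.

```python
import sqlite3

# Invented sample data: 'ann' and 'bob' occur in all three columns,
# 'cat' is missing from column b, and 'dan' occurs only in column b.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a TEXT, b TEXT, c TEXT);
    INSERT INTO table1 VALUES
        ('ann', 'bob', 'cat'),
        ('bob', 'ann', 'ann'),
        ('cat', 'dan', 'bob');
""")

rows = conn.execute("""
    SELECT a AS usr FROM table1
    INTERSECT
    SELECT b AS usr FROM table1
    INTERSECT
    SELECT c AS usr FROM table1
""").fetchall()

# Sort in Python so the result does not depend on the engine's output order.
users_in_all_columns = sorted(row[0] for row in rows)
print(users_in_all_columns)
```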

Oracle SQL to subtract 2 values from different table joins

I am trying to subtract sequences MN_SEQ from Table C generated based on join with other tables.
Here is the problem.
Query 1 -
Select M_Seq from Table C, Table A, Table B where C.date_sk=A.MTH_END_DT
and B.Loan_seq=A.Loan_seq
Query 2 -
Select M_Seq from Table C, Table B where C.date_sk=B.ORIG_DT
I have to get the difference between the two M_SEQ values produced by the result sets of Query 1 and Query 2.
Below is what I tried, but I am getting an error.
select mn_seq -mn_seq from
((select mn_seq from Table C, Table A, Table B where B.MTH_END_DT=C.DATE_SK and B.LOAN_SEQ=A.LOAN_SEQ)a,
(select mn_seq from Table C , Table B where B.ORIG_DT=C.DATE_SK
)b)
T
Kindly provide input. I am not sure if this is the right way to do it. I tried just using "-" between the queries, but that didn't work. Thanks!
Try this..
SELECT (SELECT mn_seq
FROM TABLE c, TABLE a, TABLE b
WHERE b.mth_end_dt = c.date_sk
AND b.loan_seq = a.loan_seq) -
(SELECT mn_seq FROM TABLE c, TABLE b WHERE b.orig_dt = c.date_sk)
FROM dual
This assumes that both mn_seq values are of type NUMBER and that the WHERE clause of each inner query returns exactly one record.
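
The scalar-subquery subtraction can be tried on toy data. The sketch below uses Python with sqlite3 in place of Oracle; the table names table_a, table_b, table_c and the sample rows are assumptions, and the join conditions follow Query 1 and Query 2 from the question. Note that SQLite quietly takes the first row if a scalar subquery returns several, whereas Oracle raises ORA-01427, so the one-row assumption matters more on Oracle.

```python
import sqlite3

# table_a, table_b, table_c and their rows are hypothetical stand-ins for
# the question's "Table A", "Table B" and "Table C".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table_c (date_sk TEXT, mn_seq INTEGER);
    CREATE TABLE table_a (loan_seq INTEGER, mth_end_dt TEXT);
    CREATE TABLE table_b (loan_seq INTEGER, orig_dt TEXT);
    INSERT INTO table_c VALUES ('2020-01-31', 100), ('2020-06-30', 105);
    INSERT INTO table_a VALUES (1, '2020-06-30');
    INSERT INTO table_b VALUES (1, '2020-01-31');
""")

# Each parenthesised subquery must yield exactly one value; the outer
# SELECT subtracts the second value from the first.
mn_seq_diff = conn.execute("""
    SELECT (SELECT c.mn_seq
            FROM table_c c
            JOIN table_a a ON c.date_sk = a.mth_end_dt
            JOIN table_b b ON b.loan_seq = a.loan_seq)
         - (SELECT c.mn_seq
            FROM table_c c
            JOIN table_b b ON c.date_sk = b.orig_dt)
""").fetchone()[0]

print(mn_seq_diff)
```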

MySQL how to remove records in a table that are in another table

I have table A with close to 15000 entries. I have a second table B with 7900 entries with a common field with table A.
I need to extract into a third, temporary tableC all entries from table A except the ones that also appear in table B. Simple as it may sound, I haven't found a way to do it. The closest I got was this:
INSERT INTO tableC
SELECT *
FROM tableA
INNER JOIN tableB
ON tableA.field IS NOT tableB.field
This SQL just selects everything in tableA, even entries that are in tableB.
Any ideas where I'm going wrong?
What if you try this?
INSERT INTO tableC
SELECT *
FROM tableA
WHERE tableA.field NOT IN (SELECT tableB.field FROM tableB)
Or you can try the alternate EXISTS syntax
INSERT INTO tableC
SELECT *
FROM tableA
WHERE NOT EXISTS (SELECT * FROM tableB WHERE tableB.field = tableA.field)
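
The two variants behave differently when tableB.field contains NULL: NOT IN then returns no rows at all, because the comparison against NULL is neither true nor false, while NOT EXISTS is unaffected. The sketch below illustrates this in Python with sqlite3 standing in for MySQL; the integer values are made up.

```python
import sqlite3

# Made-up data; tableB deliberately contains a NULL in the compared column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (field INTEGER);
    CREATE TABLE tableB (field INTEGER);
    INSERT INTO tableA VALUES (1), (2), (3);
    INSERT INTO tableB VALUES (2), (NULL);
""")

# NOT IN: "1 NOT IN (2, NULL)" evaluates to unknown, so the row is filtered out.
not_in_rows = conn.execute("""
    SELECT field FROM tableA
    WHERE field NOT IN (SELECT field FROM tableB)
    ORDER BY field
""").fetchall()

# NOT EXISTS: the NULL row in tableB never matches, so it has no effect.
not_exists_rows = conn.execute("""
    SELECT field FROM tableA
    WHERE NOT EXISTS (SELECT 1 FROM tableB WHERE tableB.field = tableA.field)
    ORDER BY field
""").fetchall()

print(not_in_rows, not_exists_rows)
```

If the common field is declared NOT NULL in tableB, both forms return the same rows.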