Comparing two columns in two different tables - sql

I have two tables A and B where table A has a column X, table B has column Y. These columns contain account number information. I want to check whether the account number information in column A.X is present in B.Y.
Note: Both column X and Y can have duplicates because these are like composite primary keys.
How can I solve this?

You can use an INNER JOIN, like this:
SELECT *
FROM table1 a
INNER JOIN table2 b
ON a.X = b.Y
Or you can use EXISTS, like this:
SELECT *
FROM table1 a
WHERE EXISTS(
SELECT 1
FROM table2 b
WHERE a.x=b.Y )
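Since both columns can hold duplicates, the two queries above do not return the same number of rows. Here is a minimal sketch of the difference, using Python's built-in sqlite3 module and made-up sample data (the account numbers and the in-memory tables are assumptions for illustration only):

```python
import sqlite3

# Hypothetical sample data: both X and Y contain duplicates.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (X TEXT);
    CREATE TABLE B (Y TEXT);
    INSERT INTO A VALUES ('100'), ('100'), ('200'), ('300');
    INSERT INTO B VALUES ('100'), ('100'), ('300'), ('400');
""")

# INNER JOIN: one output row per matching (A row, B row) pair.
join_rows = conn.execute(
    "SELECT a.X FROM A a INNER JOIN B b ON a.X = b.Y ORDER BY a.X"
).fetchall()

# EXISTS: each A row appears at most once, however many B rows match it.
exists_rows = conn.execute(
    "SELECT a.X FROM A a "
    "WHERE EXISTS (SELECT 1 FROM B b WHERE b.Y = a.X) ORDER BY a.X"
).fetchall()

print(len(join_rows), len(exists_rows))  # prints: 5 3
```

With this data the join returns '100' four times (2 rows in A times 2 rows in B), while EXISTS returns it only twice, once per row in A.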

This will give you account information in table A that is also present in table B. The check is done by a correlated subquery that checks for each row in A whether the account information is present in B.
SELECT DISTINCT
X
FROM
A
WHERE
EXISTS (
SELECT
*
FROM
B
WHERE
B.Y=A.X
);

select distinct(X)
from A,B
WHERE A.X=B.Y

This will give you a list of account numbers in A that are not in B:
SELECT X FROM (
SELECT DISTINCT X FROM A
) A
LEFT JOIN B ON Y = X
WHERE Y IS NULL
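The same "in A but not in B" list can also be written with NOT EXISTS. A small sketch using Python's sqlite3 module and invented sample data (table contents are assumptions) that runs both forms side by side:

```python
import sqlite3

# Hypothetical sample data with duplicates on both sides.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (X TEXT);
    CREATE TABLE B (Y TEXT);
    INSERT INTO A VALUES ('100'), ('200'), ('200'), ('300');
    INSERT INTO B VALUES ('100'), ('100'), ('300');
""")

# Anti-join written as LEFT JOIN ... IS NULL (the query from the answer above).
left_join_missing = conn.execute("""
    SELECT X FROM (
        SELECT DISTINCT X FROM A
    ) A
    LEFT JOIN B ON Y = X
    WHERE Y IS NULL
""").fetchall()

# The same anti-join written with NOT EXISTS.
not_exists_missing = conn.execute("""
    SELECT DISTINCT X
    FROM A
    WHERE NOT EXISTS (SELECT 1 FROM B WHERE B.Y = A.X)
""").fetchall()

print(left_join_missing, not_exists_missing)
```

Both queries return only '200' for this data; which one the optimizer handles better depends on the database.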

Related

SQL query to append values not contained in second table

I have table A and table B with different number of columns but both containing a column with IDs. Table A contains more complete list of IDs and table B contains some of the IDs from the table A.
I would like to return resulting table B with original information plus appended IDs that are missing in B but contained in A. For these appended rows, other columns should be blank while column with IDs in B should just contain missing ID values.
A simple solution is UNION ALL with NOT EXISTS:
select b.id, b.c1, ..., b.cn
from b
UNION ALL
select distinct a.id, null, ..., null -- should be same number of columns as in the above select
from a
where not exists (select 1 from b where b.id = a.id)
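To see what the UNION ALL with NOT EXISTS query produces, here is a rough sketch run through Python's sqlite3 module. The column names `other` and `c1` and all the row values are made up for the example:

```python
import sqlite3

# Hypothetical tables: a has the fuller ID list, b has extra columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (id INTEGER, other TEXT);
    CREATE TABLE b (id INTEGER, c1 TEXT);
    INSERT INTO a VALUES (1, 'p'), (2, 'q'), (3, 'r'), (3, 's');
    INSERT INTO b VALUES (1, 'x'), (3, 'y');
""")

# All rows of b, followed by the IDs that exist only in a, padded with NULL.
appended = conn.execute("""
    SELECT b.id, b.c1
    FROM b
    UNION ALL
    SELECT DISTINCT a.id, NULL
    FROM a
    WHERE NOT EXISTS (SELECT 1 FROM b WHERE b.id = a.id)
    ORDER BY id
""").fetchall()

print(appended)  # prints: [(1, 'x'), (2, None), (3, 'y')]
```

ID 2 is missing from b, so it is appended once with NULL in the other column; ID 3 appears twice in a but is not appended because b already has it.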
I think you are describing a left join. Since a has the more complete list of IDs, it goes on the left:
select *
from a left join
b
using (id)

Does wrapping my Coalesce in a subquery make my query more efficient or does it do nothing?

Let's say I have a query where one field can appear in either Table A or Table B but not both. So to retrieve it I use Coalesce.
Something like
Select
...
Coalesce(A.Number,B.Number) Number
...
From Table A
Left Join Table B on A.C= B.C
Now let's say I want to join another table to that Number field.
Should I just do
Join Table Z on Z.Z = Coalesce(A.Number,B.Number)
Or is it better to wrap my original query in a subquery and join on the definite result? Something like
Select * from (
Select
...
Coalesce(A.Number,B.Number) Number
...
From Table A
Left Join Table B on A.C= B.C
) T
left join Table Z on Z.Number= T.Number
Does this make a difference?
If I were joining another table to the result of the first query, I would put the first part in a CTE instead of a subquery whenever possible. I believe the performance would be the same as with a subquery, but CTEs are more readable in my opinion.
with cte1 as
(
Select
...
Coalesce(A.Number,B.Number) Number
...
From Table A
Left Join Table B
on A.C= B.C
)
select *
from cte1 a
Join Table Z
on Z.Z = a.number

Combining four tables in SQL Server

I have four tables: Table A, Table B, Table C and Table D. The schema of all four tables is identical. I need to union these four tables in the following way:
If a record is present in Table A then that is considered in the output table.
If a record is present in Table B then it is considered in the output table ONLY if it is not present in Table A.
If a record is present in Table C then it is considered ONLY if it is not present in Table A and Table B.
If a record is present in Table D then it is considered ONLY if it is not present in Table A, Table B, and Table C.
Note -
Every table has a column which identifies the table itself for every record (I don't know if this is of any importance)
Records are identified based on a particular column - Column X which is not unique even within each table
You could do something like (only two cases shown but you should see how to extend this)
WITH CTE1 AS
(
SELECT 't1' as Source, X, Y
FROM t1
UNION ALL
SELECT 't2' as Source, X, Y
FROM t2
), CTE2 AS
(
SELECT *,
RANK() OVER (PARTITION BY X
ORDER BY CASE Source
WHEN 't1' THEN 1
WHEN 't2' THEN 2
END) As RN
FROM CTE1
)
SELECT X,Y
FROM CTE2
WHERE RN=1
I would be inclined to do this using not exists:
select a.*
from a
union all
select b.*
from b
where not exists (select 1 from a where a.x = b.x)
union all
select c.*
from c
where not exists (select 1 from a where a.x = c.x) and
not exists (select 1 from b where b.x = c.x)
union all
select d.*
from d
where not exists (select 1 from a where a.x = d.x) and
not exists (select 1 from b where b.x = d.x) and
not exists (select 1 from c where c.x = d.x);
If you have an index on the x column in each table, then this should be the fastest method.
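Here is a small sketch of the NOT EXISTS precedence logic, cut down to three tables and run with Python's sqlite3 module. The `src` column stands in for the table-identifying column from the question, and all the data is invented:

```python
import sqlite3

# Hypothetical data: x is not unique, src identifies the source table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (src TEXT, x INTEGER);
    CREATE TABLE b (src TEXT, x INTEGER);
    CREATE TABLE c (src TEXT, x INTEGER);
    INSERT INTO a VALUES ('a', 1), ('a', 1);
    INSERT INTO b VALUES ('b', 1), ('b', 2);
    INSERT INTO c VALUES ('c', 2), ('c', 3);
""")

# A row from a lower-precedence table is kept only if no
# higher-precedence table contains the same x.
merged = conn.execute("""
    SELECT src, x FROM a
    UNION ALL
    SELECT src, x FROM b
    WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.x = b.x)
    UNION ALL
    SELECT src, x FROM c
    WHERE NOT EXISTS (SELECT 1 FROM a WHERE a.x = c.x)
      AND NOT EXISTS (SELECT 1 FROM b WHERE b.x = c.x)
    ORDER BY x, src
""").fetchall()

print(merged)  # prints: [('a', 1), ('a', 1), ('b', 2), ('c', 3)]
```

Both rows for x = 1 come from a (duplicates within the winning table are kept), x = 2 comes from b, and x = 3 from c.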
This will work as long as there are no NULL columns, or, if a column is NULL for a record in a table with higher precedence, as long as you can assume the same column will also be NULL in the tables with lower precedence.
SELECT coalesce(a.column1, b.column1, c.column1, d.column1) column1
,coalesce(a.column2, b.column2, c.column2, d.column2) column2
,coalesce(a.column3, b.column3, c.column3, d.column3) column3
--...
,coalesce(a.columnN, b.columnN, c.columnN, d.columnN) columnN
FROM TableA a
FULL JOIN TableB b on b.ColumnX = a.ColumnX
FULL JOIN TableC c on c.ColumnX = a.ColumnX or c.ColumnX = b.ColumnX
FULL JOIN TableD d on d.ColumnX = a.ColumnX or d.ColumnX = b.ColumnX or d.ColumnX = c.ColumnX
If the NULL values matter, you can switch to a more-complicated (and likely slower) CASE version:
CASE WHEN a.columnX IS NOT NULL THEN a.column1
WHEN b.columnX IS NOT NULL THEN b.column1
WHEN c.columnX IS NOT NULL THEN c.column1
WHEN d.columnX IS NOT NULL THEN d.column1 END column1
Of course, you can mix and match, so columns that are not nullable can use the former syntax, and columns where NULL values matter use the latter.
Hopefully the purpose of this is to fix the broken schema and put this data all in the same table, where it belongs.
This might seem stupid, but if, by any chance, you can leave out the table-identifying column and you also want to eliminate duplicate records (from within one table) too then the most straightforward answer would be
select <all columns without table identifier> from tableA
union
select <all columns without table identifier> from tableB
union
select <all columns without table identifier> from tableC
...
This is exactly what UNION was designed to do: add rows only if they do not already exist in the result.
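One thing to keep in mind with plain UNION is that it compares entire rows, not just column X. A quick sketch with Python's sqlite3 module and invented data (the two-column tables are an assumption) shows which rows survive:

```python
import sqlite3

# Hypothetical data: same x, but the second column sometimes differs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (x INTEGER, y TEXT);
    CREATE TABLE tableB (x INTEGER, y TEXT);
    INSERT INTO tableA VALUES (1, 'p'), (1, 'p');
    INSERT INTO tableB VALUES (1, 'p'), (1, 'q'), (2, 'r');
""")

# UNION removes rows that are identical in every column.
unioned = conn.execute("""
    SELECT x, y FROM tableA
    UNION
    SELECT x, y FROM tableB
    ORDER BY x, y
""").fetchall()

print(unioned)  # prints: [(1, 'p'), (1, 'q'), (2, 'r')]
```

The exact duplicates of (1, 'p') collapse into one row, but (1, 'q') from tableB is still returned even though tableA already has a record with x = 1. So this only fits the question if matching records are identical in all the selected columns.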

Oracle SQL to subtract 2 values from different table joins

I am trying to subtract sequences MN_SEQ from Table C generated based on join with other tables.
Here is the problem.
Query 1 -
Select M_Seq from Table C, Table A, Table B where C.date_sk=A.MTH_END_DT
and B.Loan_seq=A.Loan_seq
Query 2 -
Select M_Seq from Table C, Table B where C.date_sk=B.ORIG_DT
I have to get difference between 2 M_SEQ generated from the result set of query 1 and Query 2.
Below is what I tried, but I am getting an error.
select mn_seq -mn_seq from
((select mn_seq from Table C, Table A, Table B where B.MTH_END_DT=C.DATE_SK and B.LOAN_SEQ=A.LOAN_SEQ)a,
(select mn_seq from Table C , Table B where B.ORIG_DT=C.DATE_SK
)b)
T
Kindly provide inputs. I am not sure if this is the right way to do it. I tried just using "-" between the queries, but it didn't work. Thanks!
Try this:
SELECT (SELECT mn_seq
FROM TABLE c, TABLE a, TABLE b
WHERE b.mth_end_dt = c.date_sk
AND b.loan_seq = a.loan_seq) -
(SELECT mn_seq FROM TABLE c, TABLE b WHERE b.orig_dt = c.date_sk)
FROM dual
I assume both the mn_seq are NUMBER and also your WHERE clause returns only one record in each of the inner queries.
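The idea of subtracting two scalar subqueries can be tried out quickly with Python's sqlite3 module. This is only a sketch under simplifying assumptions: table A is left out, the table contents are invented, and SQLite does not need the `FROM dual` that Oracle requires:

```python
import sqlite3

# Hypothetical data: one loan, with a sequence number per date key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE c (date_sk TEXT, mn_seq INTEGER);
    CREATE TABLE b (loan_seq INTEGER, mth_end_dt TEXT, orig_dt TEXT);
    INSERT INTO c VALUES ('2020-01-31', 10), ('2020-06-30', 15);
    INSERT INTO b VALUES (1, '2020-06-30', '2020-01-31');
""")

# Each parenthesised subquery must yield a single value; the outer
# SELECT then subtracts one from the other.
diff = conn.execute("""
    SELECT (SELECT c.mn_seq FROM c JOIN b ON b.mth_end_dt = c.date_sk)
         - (SELECT c.mn_seq FROM c JOIN b ON b.orig_dt = c.date_sk)
""").fetchone()[0]

print(diff)  # prints: 5
```

If either inner query can return more than one row, Oracle raises ORA-01427 (single-row subquery returns more than one row), so with several loans the two result sets would have to be joined on the loan key instead.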

SQL Select Top and Random Fill

I would like to select and insert the top 4650 fields from table a column g into table b column e. How can I randomly fill this column with data from table a column g? How do I replace the data that already exist in column e? Would this be easier to do in multiple parts?
If this is just for a single column, then this should work for you:
insert into tableB (columnE)
select top(4650) columnG from tableA
If there is a relationship between the tables, then you could do something like this:
Update x
set columnE = y.columnG
from tableB as x
inner join (select top(4650) ID, columnG from tableA) as y
on x.ID = y.ID
You could also use a CTE:
;with Y as
(
select top (4650) id,ColumnG
from TableA
)
update X
set columnE = Y.ColumnG
from TableB as X
inner join Y on x.ID = y.ID
We will need the structure of the tables to fully answer your question
This answer assumes two things: a) your Products table has an unbroken auto-incrementing Id column (by unbroken I mean no deleted rows that would cause the int to jump), and b) your Orders table has an int PK Id column.
DECLARE @MaxProductId int
SELECT @MaxProductId = MAX(Id) FROM Product
SELECT p.Description, o.Id
FROM Product p
JOIN [Order] o -- Order is a reserved word, so it needs brackets
ON p.Id = (o.Id % @MaxProductId) + 1 -- assuming Ids start at 1: maps remainders 0..Max-1 onto Ids 1..Max
This will haphazardly join a product to an order. If you like what you see, you can change the p's column to whatever your g column is. You can then use UPDATE and this select to join on your Order's Id.
Well, I actually went about this completely differently. I made a temp table with two fields, Temp1 and Temp2. I pulled all the Products and put them in Temp1, then used two different updates: one to add random digits to Temp2, and one to change my TableToUpdate to those temp digits. This way I can do this to other tables, and the data looks different but keeps the sales forecasting structure.
UPDATE TempTable
SET [Temp2] = abs(checksum(NewId())) % 5000 + 10000
UPDATE TableToUpdate
SET Product = ( SELECT Temp2
FROM TempTable
WHERE TableToUpdate.Product = TempTable.Temp1)
Thank you everyone for your input.