SQL Server query for finding duplicate rows from identical tables

I have a question about finding rows in one table that are identical to a row in another. Users enter a search into one table, and that search is then run against another table. Both tables are identical except for the ID columns.
The ID columns are not involved in the query except for the
SELECT TOP 1 *
FROM searchTable
ORDER BY searchid DESC
part.
My query looks like this
SELECT TOP 1 *
FROM searchTable
ORDER BY searchid DESC
(SELECT A.column1, A.column2,..............
FROM dbo.searchTable A
WHERE EXISTS (SELECT * FROM realTable B
WHERE A.Column1 = B.Column1
AND A.Column2 = B.Column2,
.......
AND A.lastColumn = B.lastColumn))
When I run the query I get the most recently entered search from searchTable, which is correct, but I also get every row from realTable, as if everything after WHERE EXISTS were ignored. What I need is for that single search row from searchTable to list only the rows in realTable that are identical to it, not every row realTable has.

You can use an INNER JOIN instead of EXISTS:
select B.* from searchTable A
inner join realTable B
on A.Column1 = B.Column1
and A.Column2 = B.Column2
.
.
.
This returns all the records in realTable whose columns are identical to those of a row in searchTable.

I was able to get it to work the way I needed by understanding the logic in the last suggestion.
It looks like this:
DECLARE @searchID int = (SELECT MAX(searchID) FROM searchTable)
SELECT Column1, Column2.............LastColumn FROM realTable B
WHERE EXISTS(SELECT * FROM searchTable A WHERE A.searchID = @searchID AND A.Column1=B.Column1 AND A.Column2=B.Column2................A.LastColumn=B.LastColumn)
Now the last search in searchTable gives me all the rows in realTable that match that search.
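The two-step approach above (pick the newest searchID, then filter realTable with EXISTS) can be tried out with a small sketch. It is only an illustration under assumptions: it uses Python's built-in sqlite3 module in place of SQL Server, cuts the tables down to two data columns, and the realID column and the sample rows are made up.

```python
import sqlite3

# Hypothetical two-column versions of the tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE searchTable (searchID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT);
    CREATE TABLE realTable   (realID   INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT);
    INSERT INTO searchTable (Column1, Column2) VALUES ('red', 'small'), ('blue', 'large');
    INSERT INTO realTable (Column1, Column2) VALUES
        ('blue', 'large'), ('red', 'small'), ('blue', 'large'), ('green', 'large');
""")

def rows_matching_latest_search(conn):
    """Return realTable rows whose data columns equal those of the newest search row."""
    # Step 1: find the newest search (the DECLARE ... MAX(searchID) step).
    (search_id,) = conn.execute("SELECT MAX(searchID) FROM searchTable").fetchone()
    # Step 2: keep only realTable rows that match that single search row.
    query = """
        SELECT B.realID, B.Column1, B.Column2
        FROM realTable B
        WHERE EXISTS (SELECT 1 FROM searchTable A
                      WHERE A.searchID = ?
                        AND A.Column1 = B.Column1
                        AND A.Column2 = B.Column2)
        ORDER BY B.realID
    """
    return conn.execute(query, (search_id,)).fetchall()

matches = rows_matching_latest_search(conn)
print(matches)
```

Because the columns are compared with =, a row never matches on a column where either side is NULL, so searches that leave columns empty would need extra handling.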

Related

Can you use a SELECT statement in a join's ON clause?

I am trying to use a subquery to pick the column to join on. Is this even possible
if, say, table b holds a value that equals the name of a column in table a?
Please note that the example below hard-codes table_b.Column_A; this is only to make my question clearer and less cluttered. The WHERE condition will always return a single value/record.
EDIT: I am basically trying to create a dynamic ON clause, if that makes any sense.
Furthermore, the only relationship between the tables is that Table_b contains Table_a's column names as values.
SELECT *
FROM table_a a
INNER JOIN table_b b
ON a.(select column1 FROM table_b WHERE Column1 ='Column_A') = b.Column_A

Is this what you want?
SELECT *
FROM table_a a INNER JOIN
table_b b
ON (b.Column1 = 'Column_A' AND a.column1 = b.column_A) OR
(b.Column1 = 'Column_B' AND a.column1 = b.column_B) OR
(b.Column1 = 'Column_C' AND a.column1 = b.column_C)
You would have to list out all the columns directly. Also, JOINs with ORs generally have very poor performance.
You can express this more concisely using APPLY:
SELECT *
FROM table_b b CROSS APPLY
(SELECT 'Column_A' as colname, b.Column_A as colval FROM DUAL UNION ALL
SELECT 'Column_B', b.Column_B FROM DUAL UNION ALL
SELECT 'Column_C', b.Column_C FROM DUAL
) v JOIN
table_a a
ON a.column1 = v.colval
WHERE v.colname = b.Column1
Note that this version works in Oracle 12C+.

You can't use a SELECT statement the way you are trying to, but looking at your code you could use an AND condition, e.g.:
SELECT *
FROM table_a a
INNER JOIN table_b b ON a.column1 = b.Column_A
and b.Column1 ='Column_A'
Otherwise, if you want to build the query dynamically, you should build a string server-side and then execute that string as a command.
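A rough sketch of that build-a-string approach follows. It is an assumption-laden illustration, not a drop-in solution: it runs against SQLite through Python's sqlite3 module, the table layouts and sample rows are invented, and the helper build_join_query is hypothetical. The key point is that a column name cannot be passed as a bound parameter, so the sketch checks it against the real columns of table_a before splicing it into the SQL text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Invented layouts: table_b.Column1 stores the NAME of a table_a column.
    CREATE TABLE table_a (Column_A TEXT, Column_B TEXT);
    CREATE TABLE table_b (Column1 TEXT, Column_A TEXT);
    INSERT INTO table_a VALUES ('k1', 'z'), ('q', 'k1');
    INSERT INTO table_b VALUES ('Column_B', 'k1');
""")

def build_join_query(conn):
    """Build the join SQL using the column name stored as data in table_b."""
    # Assumes the lookup yields exactly one value, as the question states.
    (col_name,) = conn.execute("SELECT Column1 FROM table_b LIMIT 1").fetchone()
    # Validate the name against table_a's actual columns to avoid SQL injection.
    valid = {row[1] for row in conn.execute("PRAGMA table_info(table_a)")}
    if col_name not in valid:
        raise ValueError(f"{col_name!r} is not a column of table_a")
    return (
        "SELECT a.Column_A, a.Column_B "
        "FROM table_a a "
        f'INNER JOIN table_b b ON a."{col_name}" = b.Column_A'
    )

sql = build_join_query(conn)
rows = conn.execute(sql).fetchall()
print(sql)
print(rows)
```

In SQL Server the same idea would typically go through dynamic SQL executed on the server, with the identifier validated or quoted before it is concatenated.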

Delete distinct multiple columns in sql

I have a query in Access that I'm rebuilding in SQL Server.
Access:
DELETE DISTINCT * from [TableA] INNER JOIN TableB
ON [TableA].[Column1]=[TableB].[column1]
AND [TableA].[Column2]=[TableB].[column2]
I know I could use
Delete from tableA where ID in (
Select * from [TableA] INNER JOIN TableB
ON [TableA].[Column1]=[TableB].[column1]
AND [TableA].[Column2]=[TableB].[column2])
But I get an error saying "Only one expression can be specified in the select list when the subquery is not introduced with EXISTS"
My goal is to delete the distinct records returned by the Access query at the top.

You want to delete the rows in TableA that are in TableB, according to the column matches. How about doing this:
delete from tableA
where exists (select 1
from tableB
where tableA.Column1 = tableB.Column1 and tableA.column2 = tableB.column2
);
This seems to be the intent of what you are trying to do.
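To see why no DISTINCT is needed with this form, here is a small sketch, run against SQLite via Python's sqlite3 module rather than SQL Server, with invented sample rows. EXISTS only asks whether at least one match exists, so each tableA row is deleted at most once even when tableB contains duplicate matches.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tableA (ID INTEGER PRIMARY KEY, Column1 TEXT, Column2 TEXT);
    CREATE TABLE tableB (Column1 TEXT, Column2 TEXT);
    INSERT INTO tableA (Column1, Column2) VALUES ('a', '1'), ('a', '1'), ('b', '2'), ('c', '3');
    -- tableB holds the ('a', '1') pair twice and a near miss for 'c'.
    INSERT INTO tableB VALUES ('a', '1'), ('a', '1'), ('c', '9');
""")

# Delete every tableA row that has at least one match in tableB on both columns.
deleted = conn.execute("""
    DELETE FROM tableA
    WHERE EXISTS (SELECT 1
                  FROM tableB
                  WHERE tableA.Column1 = tableB.Column1
                    AND tableA.Column2 = tableB.Column2)
""").rowcount

remaining = conn.execute(
    "SELECT ID, Column1, Column2 FROM tableA ORDER BY ID"
).fetchall()
print(deleted, remaining)
```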

In the subquery you have to select only the ID column from the respective table; that is the only column you need.
DELETE a
FROM tableA a
JOIN (SELECT DISTINCT Column1 ,column2
FROM tableA
WHERE EXISTS (SELECT 1
FROM tableB
WHERE tableA.Column1 = tableB.Column1
AND tableA.column2 = tableB.column2)) b
ON A.Column1 = B.Column1
AND A.column2 = B.column2

Is there a way to do a multi table query and get result just from specific tables?

I am trying to do a multi-table query, but I don't want to use subqueries, i.e.:
SELECT column1
FROM table1
WHERE
EXISTS (SELECT column1 FROM table2 WHERE table1.column1 = table2.column1);
I thought of using a JOIN but so far my best result was this:
SELECT *
FROM table1
JOIN table2 ON table1.t1id = table2.t2id
WHERE table1.id = 5;
This would be good except for the fact that I get a duplicate column (the ids in table 1 and 2 are foreign keys).
How do I remove the duplicate column if possible?
UPDATE:
Table1:
tableA_ID, TABLEB_ID
1, 1
1, 4
3, 2
4, 3
TableA: ID, COL1, COL2
1, A, B
2, A, B
3, A, B
4, A, B
TableB: ID, Col3, COL4
1, C, D
2, C, D
3, C, D
4, C, D
I want to get all or some of the columns from TableA according to a condition.
Sample: let's say the condition is tableA_ID = 1, which matches the first two rows of Table1. I then want all or some of the columns in TableA that correspond to the IDs I got from Table1.
Sample: the result from before was [{1,1},{1,4}], which means I want these results from TableA:
TableA.ID, TableA.COL1, TableA.COL2
1,A,B
4,A,B
The actual results I get are:
Table1.tableA_ID, Table1.TABLEB_ID, TableA.ID, TableA.COL1, TableA.COL2
1,1,1,A,B
1,4,4,A,B

Is this what you're looking for?
select a.id, a.column1, b.column2
from table1 a
left join table2 b on a.id = b.otherid;

You can't change the column list of a query based on the values it returns. It just isn't the way that SQL is designed to operate. At best, you can return all of the columns from the second table and ignore the ones that aren't relevant based on other values in that row.
I'm not even sure how a variable column list would work. In your scenario, you're looking for two discrete values separately. But that's not the only scenario: what if the condition is tableA_ID in (1,2)? Would you want different numbers of columns in different rows as part of a single result set?

Getting just the columns you want (just from specific tables, as you say) is the easy part (btw -- don't use '*' if you can help it -- topic for another discussion):
SELECT
A.ID,
A.COL1,
A.COL2
FROM
TABLE1 Bridge
LEFT JOIN TABLEA A
ON Bridge.TABLEA_ID = A.ID
LEFT JOIN TABLEB B
ON Bridge.TABLEB_ID = B.ID
Getting the rows you want will be the harder part (influenced by your choice of joins, among several other things).
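As a sketch of the row-selection part, the following uses the sample data from the question, run in SQLite through Python's sqlite3 module. One assumption is worth flagging: the join condition a.ID = bridge.TABLEB_ID is inferred from the question's sample output (IDs 1 and 4 for tableA_ID = 1), not from a stated schema, so it may not reflect the real key relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (tableA_ID INTEGER, TABLEB_ID INTEGER);
    CREATE TABLE TableA (ID INTEGER PRIMARY KEY, COL1 TEXT, COL2 TEXT);
    INSERT INTO Table1 VALUES (1, 1), (1, 4), (3, 2), (4, 3);
    INSERT INTO TableA VALUES (1, 'A', 'B'), (2, 'A', 'B'), (3, 'A', 'B'), (4, 'A', 'B');
""")

# List only TableA's columns in the SELECT; the bridge table is used for
# filtering and joining but contributes no output columns.
rows = conn.execute("""
    SELECT a.ID, a.COL1, a.COL2
    FROM Table1 bridge
    JOIN TableA a ON a.ID = bridge.TABLEB_ID   -- join key inferred from the sample output
    WHERE bridge.tableA_ID = ?
    ORDER BY a.ID
""", (1,)).fetchall()
print(rows)
```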

I think you'll need to select only the fields of the first table and use DISTINCT. The rest of your query stays as it is, i.e.:
SELECT distinct table1.*
FROM table1
JOIN table2 ON table1.t1id = table2.t2id
WHERE table1.id = 5;

Fastest SQL & HQL Query for two tables

Table1: Columns A, B, C
Table2: Columns A, B, C
Table 2 is a copy of Table 1 with different data. Assume all columns to be varchar
Looking for a single efficient query which can fetch:
Columns A, B, C from Table1
Additional Rows from Table2 where values of Table2.A are not present in Table1.A
Any differences between the Oracle SQL & HQL for the same query will be appreciated.
I'm fiddling with Joins, Unions & Minus but not able to get the correct combination.

SQL:
SELECT *
FROM Table1
UNION ALL
SELECT *
FROM Table2 T2
WHERE NOT EXISTS(
SELECT 'X' FROM Table1 T1
WHERE T1.A = T2.A
)
HQL:
In HQL you must execute two separate queries and discard the unwanted Table2 rows in a Java loop, because HQL has no UNION command.
Alternatively, you can write the first query for Table1 and give the second query a NOT IN clause that discards the values already present in Table1.A.
Solution 1:
Query 1:
SELECT * FROM Table1
Query 2:
SELECT * FROM Table2
and then apply a discard loop in Java code.
Solution 2:
Query 1:
SELECT * FROM Table1
Query 2:
SELECT * FROM Table2 WHERE Table2.A not in (SELECT Table1.A from Table1)

This query returns all rows in table1, plus all rows in table2 that do not exist in table1, given that column a is the common key.
select a,b,c
from table1
union all
select a,b,c
from table2
where a not in(select a from table1);
There may be different options available depending on the relative sizes of table1 and table2 and the expected overlap.
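One difference between the NOT IN and NOT EXISTS variants shown in this thread is how they treat NULLs: if the subquery behind NOT IN returns even one NULL, the NOT IN test can never be true, so no extra rows come back from table2. The sketch below demonstrates this with invented rows, using SQLite via Python's sqlite3 module as a stand-in for Oracle.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (a TEXT, b TEXT, c TEXT);
    CREATE TABLE table2 (a TEXT, b TEXT, c TEXT);
    -- table1 contains a NULL key, which is what trips up NOT IN.
    INSERT INTO table1 VALUES ('k1', 'x', 'y'), (NULL, 'x', 'y');
    INSERT INTO table2 VALUES ('k1', 'p', 'q'), ('k2', 'p', 'q');
""")

NOT_IN_QUERY = """
    SELECT a, b, c FROM table1
    UNION ALL
    SELECT a, b, c FROM table2
    WHERE a NOT IN (SELECT a FROM table1)
"""

NOT_EXISTS_QUERY = """
    SELECT a, b, c FROM table1
    UNION ALL
    SELECT a, b, c FROM table2 t2
    WHERE NOT EXISTS (SELECT 1 FROM table1 t1 WHERE t1.a = t2.a)
"""

not_in_rows = conn.execute(NOT_IN_QUERY).fetchall()
not_exists_rows = conn.execute(NOT_EXISTS_QUERY).fetchall()
print(len(not_in_rows), len(not_exists_rows))
```

If column a is declared NOT NULL in table1, the two forms return the same rows and the choice comes down to performance on the actual data.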

SQL Case statement to check for NULLS and Non-existent records

I am doing a join between two tables and want to select the columns based on whether they have a record or not. I'm trying to avoid having multiple of the same field and am trying to condense them into single columns. Something like:
Select
id = (CASE WHEN a.id IS NULL THEN b.id ELSE a.id END),
name = (CASE WHEN a.name IS NULL THEN b.name ELSE a.name END)
From Table1 a
Left Join Table2 b
On a.id = b.id
Where a.id = @id
I'd like id to be populated from Table1 if a record exists there, and pulled from Table2 if not. The previous code returns no records because there are no NULL values in Table1, so my question is: how do I check whether any records even exist? Also, if anyone knows of a better way to accomplish what I am trying to do, I'd appreciate guidance and constructive criticism.
EDIT
It looks like COALESCE will work for what I'm trying to accomplish. I'd like to give a little more info on exactly what I am working with and get some advice on whether I am using the best method.
I have a bloated table Table2 and it is in production. I'm working on building new web applications for this system but can't justify a complete database redesign so I am trying to do one "on the fly". I've created a new table Table1 and I am writing stored procedures for the following methods Get(Select), Set(Update), Add(Insert), Remove(Delete). This way, to my code, it will seem that I am working with a single table that is not bloated. My code will simply call one of the SP methods and then the stored procedure will handle the data between the old table and the new. I am currently working on the Get method and I need to check the old table Table2 for a record if it doesn't exist in Table1.
Thanks to the suggestions here my query currently looks like this:
Select
id = coalesce(a.id, b.student_number),
first_name = coalesce(a.first_name, b.first_name),
last_name = coalesce(a.last_name, b.last_name),
-- etc.
From Table1 a
Full Outer Join Table2 b
On a.id = b.student_number
Where (a.id = @id Or b.student_number = @id)
This works for what I'm trying to accomplish, but I'd like to throw it out there to the experienced crowd for any tips or suggestions, in case there are better or more correct ways to go about this.
Thanks

I suspect your problem may come from doing a left join. Try again using a full outer join, like this:
Select
id = coalesce(a.id, b.id),
name = coalesce(a.name, b.name)
From Table1 a
full outer Join Table2 b
On a.id = b.id
Where a.id = @id

Select id = coalesce(a.id, b.id),
name = coalesce(a.name, b.name)
From Table2 b
Left Join Table1 a On a.id = b.id
Where b.id = @id
You may need to use ISNULL or CASE instead of COALESCE depending on your database platform.
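The fallback behaviour of this query can be checked with a short sketch. It is illustrative only: it runs on SQLite via Python's sqlite3 module, the sample rows and the get_record helper are made up, and it assumes every record exists in the legacy Table2, with Table1 holding newer values for some of them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table1 (id INTEGER PRIMARY KEY, name TEXT);  -- new table
    CREATE TABLE Table2 (id INTEGER PRIMARY KEY, name TEXT);  -- legacy table
    INSERT INTO Table1 VALUES (1, 'Ann (new)');
    INSERT INTO Table2 VALUES (1, 'Ann (old)'), (2, 'Bob (old)');
""")

def get_record(conn, record_id):
    """Prefer Table1's values; fall back to Table2 when Table1 has no row."""
    return conn.execute("""
        SELECT COALESCE(a.id, b.id)     AS id,
               COALESCE(a.name, b.name) AS name
        FROM Table2 b
        LEFT JOIN Table1 a ON a.id = b.id
        WHERE b.id = ?
    """, (record_id,)).fetchone()

print(get_record(conn, 1))  # row present in both tables
print(get_record(conn, 2))  # row present only in the legacy table
print(get_record(conn, 3))  # row present in neither table
```

A record that exists only in Table1 would not be found by this form, since the query starts from Table2; covering that case is what the FULL OUTER JOIN variant is for.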

First, you don't need a case statement for that:
Select ISNULL(a.id,b.id) AS id, ISNULL(a.name,b.name) AS name
From Table1 a
Left Join Table2 b
On a.id = b.id
Where a.id = @id
Second, if I get it right, the id field can contain nulls, and in that case you are screwed. I mean, the ID is a unique value that identifies a row; if it can be null, you can't identify that row.
But if what you want is getting records from Table1 and Table2 and avoid duplicates, a simple UNION will work fine, since it discards duplicates:
select id, name
from Table1
where id = @id
union
select id, name
from Table2
where id = @id

You could do something like:
select id, name from Table1 a where a.id not in (select id from Table2)
UNION
select id, name from Table2 b
This would give you all the records from table1 that didn't have a corresponding match in table2 plus all of table2's records. The union would then combine the results.

In your first CASE statement, a.id and b.id will always be same value, except for instances in which a.id has a value and b.id generates a NULL value because of the LEFT JOIN. There will never be a row in the result set with a NULL a.id value and a non-NULL b.id value. You could just use a.id for this column.
For the second CASE statement, you may find the name column in either or both tables with a value (and, of course, the values may be different). You said you want to "condense" these column values; the SQL function for that is COALESCE:
COALESCE(a.id, b.id)
which returns the first non-NULL value (a.id if it isn't NULL, otherwise b.id). It won't tip you off to different names in the two tables.
