Delete duplicate records in SQL Server if 2 columns match - sql

Col1 Col2 Col3
A B 1
A B 1
A B 2
A B 2
A c 1
When the Col1 and Col2 values are the same but the Col3 values differ, I don't want those rows in the result set.
I want the result shown below. I tried row_number, group by, and many other things, but nothing worked. Please help me here.
Col1 Col2 Col3
A c 1

You can use exists:
delete from t
where exists (select 1
from t t2
where t2.col1 = t.col1 and t2.col2 = t.col2 and
t2.col3 <> t.col3
);
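To check the exists approach against the question's sample rows, here is a minimal sketch that runs it in an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here (an assumption made so the example is self-contained), and the table name t follows the answer.

```python
import sqlite3

# SQLite stands in for SQL Server; the rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("A", "B", 1), ("A", "B", 1), ("A", "B", 2), ("A", "B", 2), ("A", "c", 1)],
)

# Delete every row whose (col1, col2) pair also appears with a different col3.
conn.execute(
    """
    DELETE FROM t
    WHERE EXISTS (SELECT 1
                  FROM t AS t2
                  WHERE t2.col1 = t.col1
                    AND t2.col2 = t.col2
                    AND t2.col3 <> t.col3)
    """
)

remaining = conn.execute("SELECT col1, col2, col3 FROM t").fetchall()
print(remaining)
```

On the sample rows only the (A, c, 1) row should survive, because every (A, B) row has a sibling with a different col3.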
You can also use window functions:
with todelete as (
select t.*,
min(col3) over (partition by col1, col2) as min_col3,
max(col3) over (partition by col1, col2) as max_col3
from t
)
delete from todelete
where min_col3 <> max_col3;

The best way is to make these columns a unique composite key. But here is a query to delete all records other than your desired result.
delete from Table_1
where
-- IN rather than =, because each subquery returns one row per duplicated group
Col1 IN (SELECT Col1
FROM table_1
GROUP BY Col1, Col2
HAVING Count(*) > 1)
And
Col2 IN (SELECT Col2
FROM table_1
GROUP BY Col1, Col2
HAVING Count(*) > 1)
This might not be the most optimized or efficient query, but it works on the sample data. If you don't want to delete the duplicated records and just want to retrieve the unique ones:
SELECT Col1,Col2
FROM table_1
GROUP BY Col1, Col2
HAVING Count(*) = 1
To get the duplicated records:
SELECT Col2,Col1
FROM table_1
GROUP BY Col1, Col2
HAVING Count(*) > 1
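As a quick check of the two GROUP BY / HAVING queries, here is a small sketch that runs them on the question's rows. It assumes an in-memory SQLite database instead of SQL Server, and the helper name groups is made up for this example.

```python
import sqlite3

# SQLite stands in for SQL Server; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (Col1 TEXT, Col2 TEXT, Col3 INTEGER)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [("A", "B", 1), ("A", "B", 1), ("A", "B", 2), ("A", "B", 2), ("A", "c", 1)],
)


def groups(having):
    # `having` is a fixed comparison written in this script, never user input.
    sql = (
        "SELECT Col1, Col2 FROM table_1 "
        f"GROUP BY Col1, Col2 HAVING COUNT(*) {having}"
    )
    return sorted(conn.execute(sql).fetchall())


unique_groups = groups("= 1")      # pairs that occur exactly once
duplicated_groups = groups("> 1")  # pairs that occur more than once
print(unique_groups, duplicated_groups)
```

Note that both queries count rows per (Col1, Col2) pair and ignore whether Col3 differs, so a pair repeated with the same Col3 is also reported as duplicated.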

Related

How to use and "in" clause in "having" in HIVE?

I have my data in sometable like this:
col1 col2 col3
A B 3
A B 1
A B 2
C B 1
And I want to get all of the unique groups of col1 and col2 that contain certain rows of col3. Like, all groups of col1 and col2 that contain a "2".
I wanted to do something like this:
select col1, col2 from sometable
group by col1, col2
having col3=1 and col3=2
But I want it to only return groups that have an instance of both 1 and 2 in col3. so, the result after the query should return this:
col1 col2
A B
How do I express this in HIVE? THANK YOU.
I don't know why others deleted answers that were correct, and then almost correct, but I will put theirs back up.
SELECT col1, col2, COUNT(DISTINCT col3)
FROM
sometable
WHERE
col3 IN (1,2)
GROUP BY col1, col2
HAVING
COUNT(DISTINCT col3) > 1
If you actually want to return all of the records that meet your criteria you need to do a sub select and join back to the main table to get them.
SELECT s.*
FROM
sometable s
INNER JOIN (
SELECT col1, col2, COUNT(DISTINCT col3)
FROM
sometable
WHERE
col3 IN (1,2)
GROUP BY col1, col2
HAVING
COUNT(DISTINCT col3) > 1
) t
ON s.Col1 = t.Col1
AND s.Col2 = t.Col2
AND s.col3 IN (1,2)
The gist of this is to narrow/filter your rowset to the rows that you want to test (col3 IN (1,2)), then count the DISTINCT values of col3 to make sure both 1 and 2 exist and not just 1 & 1 or 2 & 2.
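The filter-then-count idea can be tried outside Hive. The sketch below uses an in-memory SQLite database from Python as a stand-in (an assumption, not Hive itself), with the question's data. It compares the distinct count to the number of wanted values, which is a small generalization of the "> 1" test for the two-value case.

```python
import sqlite3

# SQLite stands in for Hive; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sometable (col1 TEXT, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO sometable VALUES (?, ?, ?)",
    [("A", "B", 3), ("A", "B", 1), ("A", "B", 2), ("C", "B", 1)],
)

wanted = (1, 2)  # the col3 values every returned group must contain

# Keep only rows with a wanted col3, then require all wanted values per group.
groups_with_both = conn.execute(
    """
    SELECT col1, col2
    FROM sometable
    WHERE col3 IN (?, ?)
    GROUP BY col1, col2
    HAVING COUNT(DISTINCT col3) = ?
    """,
    (*wanted, len(wanted)),
).fetchall()
print(groups_with_both)
```

Only the (A, B) group contains both 1 and 2; (C, B) has just a 1, so its distinct count is 1 and it is filtered out.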
I think the query below will be useful for your question.
select col1,col2
from Abc
group by col1,col2
having count(col1) >1 AND COUNT(COL2)>2

SQL - Delete repeated rows in table

I need to delete the repeated row-
I have this table-
source
The result that I need-
result
*keep only one combination of 2 column (the order is not important)
Thanks! (:
Here is one method that should be efficient:
select col1, col2
from t
where col1 <= col2
union all
select col1, col2
from t
where col1 > col2 and
not exists (select 1 from t t2 where t2.col1 = t.col2 and t2.col2 = t.col1);
Note: This is a SQL select statement, so it does not delete rows in the table. You seem to want the results from a query, not to modify the underlying table.
My interpretation of the spec "keep only one combination of 2 column, [column] order not important":
SELECT col1, col2
FROM t
WHERE col1 <= col2
UNION
SELECT col2, col1
FROM t
WHERE col1 > col2;
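Since the question's table was posted as an image, here is a sketch of the UNION version on made-up rows. The data, the integer column types, and the use of SQLite from Python are all assumptions for illustration.

```python
import sqlite3

# Hypothetical data: (1, 2) and (2, 1) are the same pair in either order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 2), (2, 1), (3, 4), (6, 5)],
)

# Normalise every pair so the smaller value comes first; UNION removes repeats.
pairs = sorted(
    conn.execute(
        """
        SELECT col1, col2 FROM t WHERE col1 <= col2
        UNION
        SELECT col2, col1 FROM t WHERE col1 > col2
        """
    ).fetchall()
)
print(pairs)
```

One consequence of this version is that a row such as (6, 5) comes back as (5, 6), which is fine only if column order really does not matter.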

Dynamic Group By in a Query

Is there a way to apply or not a group by into a query? For example, I have this:
Col1 Col2 Col3
A 10 X
A 10 NULL
B 12 NULL
B 12 NULL
I have to group by Col1 and Col2 only when I have a value in Col3, if Col3 is null, I don't need to group it. The result should be:
Col1 Col2
A 20
B 12
B 12
Maybe it is not an elegant example, but this is the idea.
Thank you.
Here's a SQL Fiddle that does what you want:
http://sqlfiddle.com/#!3/b7f07/2
Here's the SQL itself:
SELECT col1, sum(col2) as col2 FROM dataTable WHERE
col1 in (SELECT col1 from dataTable WHERE col3 IS NOT NULL)
GROUP BY col1
UNION ALL
SELECT col1, col2 FROM dataTable WHERE
(col1 not in
(SELECT col1 from dataTable WHERE col3 IS NOT NULL and col1 is not null))
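If the fiddle link is unavailable, the same UNION ALL query can be tried locally. This is a minimal sketch using an in-memory SQLite database from Python (an assumption; the fiddle targets SQL Server), loaded with the rows from the question.

```python
import sqlite3

# SQLite stands in for SQL Server; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dataTable (col1 TEXT, col2 INTEGER, col3 TEXT)")
conn.executemany(
    "INSERT INTO dataTable VALUES (?, ?, ?)",
    [("A", 10, "X"), ("A", 10, None), ("B", 12, None), ("B", 12, None)],
)

query = """
SELECT col1, SUM(col2) AS col2 FROM dataTable
WHERE col1 IN (SELECT col1 FROM dataTable WHERE col3 IS NOT NULL)
GROUP BY col1
UNION ALL
SELECT col1, col2 FROM dataTable
WHERE col1 NOT IN (SELECT col1 FROM dataTable
                   WHERE col3 IS NOT NULL AND col1 IS NOT NULL)
"""
# Sort in Python because UNION ALL gives no ordering guarantee.
result = sorted(conn.execute(query).fetchall())
print(result)
```

The A rows collapse into one summed row because one of them has a non-NULL col3, while the two B rows stay separate.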
It sounds like you want all unique values of col1 when col3 is not null. Otherwise, you want all values of col1.
Assuming you have a SQL engine that supports window functions, you can do this as:
select col1, sum(col2)
from (select t.*,
count(col3) over (partition by col1) as NumCol3Values,
row_number() over (partition by col1 order by col1) as seqnum
from t
) t
group by col1,
(case when NumCol3Values > 0 then NULL else seqnum end)
The logic is pretty much as you state it. If there is any non-NULL value, then the second clause of the group by always evaluates to NULL, so everything goes in the same group. If things are all NULL, then the clause evaluates to a sequence number, which puts each value on a separate row.
This is a bit more difficult without window functions. If I assume that the minimum value of column 3 (when not NULL) is unique, then the following would work:
select t.col1,
(case when minCol3 is null then t.col2 else tsum.col2 end) as col2
from t left outer join
(select col1, sum(col2) as col2,
min(col3) as minCol3
from t
group by col1
) tsum
on t.col1 = tsum.col1
where minCol3 is NULL or t.col3 = MinCol3
re: Is there a way to apply or not a group by into a query?
Not directly, but you can break it down by groupings and then UNION the results together.
Does this work?
Select col1, sum(col2)
from table
group by col1, col2
having max(col3) is not null
union all
select t.col1, t.col2
from table t left outer join
(Select col1, col2
from table
group by col1, col2
having max(col3) is not null) g
on t.col1 = g.col1 and t.col2 = g.col2
where g.col1 is null

select all columns with one column has different value

In my table, some records have all column values the same except one. I need to write a query to get those records. What's the best way to do it? The table is like this:
colA colB colC
a b c
a b d
a b e
What's the best way to get all records with all the columns? Thanks for everyone's help.
Assuming you know that column3 will always be different, to get the rows that have more than one value:
SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
If you need all the values in the three columns, then you can join this back to the original table:
SELECT t.*
FROM table t join
(SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
) cols
on t.col1 = cols.col1 and t.col2 = cols.col2
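To see the join-back in action, here is a rough sketch on the question's rows plus one extra row, (x, y, z), added as a hypothetical group that should not be returned. The table name my_table and the use of SQLite from Python are assumptions for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col1 TEXT, col2 TEXT, col3 TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [
        ("a", "b", "c"),
        ("a", "b", "d"),
        ("a", "b", "e"),
        ("x", "y", "z"),  # hypothetical row: its group has a single col3 value
    ],
)

# Find (col1, col2) groups with more than one distinct col3, then join back
# to the table to return every full row belonging to those groups.
rows = conn.execute(
    """
    SELECT t.col1, t.col2, t.col3
    FROM my_table AS t
    JOIN (SELECT col1, col2
          FROM my_table
          GROUP BY col1, col2
          HAVING COUNT(DISTINCT col3) > 1) AS cols
      ON t.col1 = cols.col1 AND t.col2 = cols.col2
    ORDER BY t.col3
    """
).fetchall()
print(rows)
```

The three (a, b) rows come back with all their columns, and the lone (x, y, z) row is dropped.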
Just select those rows that have the different values:
SELECT col1, col2
FROM myTable
WHERE colWanted != knownValue
If this is not what you are looking for, please post examples of the data in the table and the wanted output.
How about something like
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) = 1
This will give you Col1, Col2 that have unique data.
Assuming col3 has the diffs:
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) > 1
OR TO SHOW ALL 3 COLS
SELECT Col1, Col2, Col3
FROM Table1
GROUP BY Col1, Col2, Col3
HAVING COUNT(Col3) > 1

SQL query to simulate distinct

SELECT DISTINCT col1, col2 FROM table t ORDER BY col1;
This gives me the distinct combinations of col1 & col2. Is there an alternative way of writing the Oracle SQL query to get the unique combinations of col1 & col2 records without using the keyword distinct?
Use the UNIQUE keyword which is a synonym for DISTINCT:
SELECT UNIQUE col1, col2 FROM table t ORDER BY col1;
I don't see why you would want to but you could do
SELECT col1, col2 FROM table_t GROUP BY col1, col2 ORDER BY col1
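A quick way to convince yourself that GROUP BY on all selected columns returns the same rows as DISTINCT is to run both side by side. The sketch below does that in an in-memory SQLite database from Python, with made-up rows; SQLite rather than Oracle is an assumption made so the example is self-contained.

```python
import sqlite3

# Hypothetical data with one repeated (col1, col2) combination.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_t (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO table_t VALUES (?, ?)",
    [(1, "x"), (1, "x"), (2, "y"), (1, "z")],
)

# ORDER BY both columns so the two result lists can be compared directly.
with_distinct = conn.execute(
    "SELECT DISTINCT col1, col2 FROM table_t ORDER BY col1, col2"
).fetchall()
with_group_by = conn.execute(
    "SELECT col1, col2 FROM table_t GROUP BY col1, col2 ORDER BY col1, col2"
).fetchall()
print(with_distinct == with_group_by, with_group_by)
```

Both queries return the three combinations (1, x), (1, z) and (2, y) on this data.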
Another - yet overly complex and somewhat useless - solution:
select *
from (
select col1,
col2,
row_number() over (partition by col1, col2 order by col1, col2) as rn
from the_table
)
where rn = 1
order by col1
select col1, col2
from table
group by col1, col2
order by col1
or a less elegant way:
select col1,col2 from table
UNION
select col1,col2 from table
order by col1;
or an even less elegant way:
select a.col1, a.col2
from (select col1, col2 from table
UNION
select NULL, NULL) a
where a.col1 is not null
order by a.col1
Yet another ...
select
col1,
col2
from
table t1
where
not exists (select *
from table t2
where t2.col1 = t1.col1 and
t2.col2 = t1.col2 and
t2.rowid > t1.rowid)
order by
col1;
Variations on the UNION solution by @aF. :
INTERSECT
SELECT col1, col2 FROM tableX
INTERSECT
SELECT col1, col2 FROM tableX
ORDER BY col1;
MINUS
SELECT col1, col2 FROM tableX
MINUS
SELECT col1, col2 FROM tableX WHERE 0 = 1
ORDER BY col1;
MINUS (2nd version; it will return one row fewer than the other versions if there is a (NULL, NULL) group)
SELECT col1, col2 FROM tableX
MINUS
SELECT NULL, NULL FROM dual
ORDER BY col1;
Another ...
select col1,
col2
from (
select col1,
col2,
rowid,
min(rowid) over (partition by col1, col2) min_rowid
from table)
where rowid = min_rowid
order by col1;