SQL: find duplicates in field two if field one is unique


I'm trying to wrap my head around this, but I keep going in circles.
I have a SQL query that gets me to the point where I have values like these:
select T1.col1, T2.col2
from T2, T1
where T2.recNo = T1.recNo
  AND T2.id = 3
  AND T1.recNo IN (
    select recNo from T1 where col1 IN (
      select col1 from T1 group by col1 having COUNT(*) > 2))
col1|col2
111|123
111|123
222|456
222|456
222|456
333|789
333|700
etc...
This is a pretty large output. What I am trying to find is whether any values in col2 are NOT the same within each group of col1 values (I know for certain col1 is unique). I dumped the output to a file and will try to figure it out in Perl next.
The output I am trying to get is:
col1|col2
333|789
333|700

You can do this with aggregation:
select col1
from (<your query here>) s
group by col1
having min(col2) <> max(col2);
This will return all col1 values that have more than one col2 value.
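A minimal sketch of how that aggregation could be extended to return the mismatched rows themselves, as the question asks. It uses Python's built-in sqlite3 as a stand-in database; the table name s and the sample rows are assumptions that simply mirror the output shown in the question.

```python
import sqlite3

# Hypothetical stand-in for the output of the original join (col1, col2 pairs).
rows = [(111, 123), (111, 123), (222, 456), (222, 456), (222, 456),
        (333, 789), (333, 700)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE s (col1 INTEGER, col2 INTEGER)")
conn.executemany("INSERT INTO s VALUES (?, ?)", rows)

# Step 1: col1 groups whose col2 values are not all identical.
mismatched_keys = conn.execute("""
    SELECT col1
    FROM s
    GROUP BY col1
    HAVING MIN(col2) <> MAX(col2)
""").fetchall()

# Step 2: join back to the source to list the offending rows themselves.
mismatched_rows = conn.execute("""
    SELECT s.col1, s.col2
    FROM s
    JOIN (SELECT col1 FROM s
          GROUP BY col1
          HAVING MIN(col2) <> MAX(col2)) d ON d.col1 = s.col1
    ORDER BY s.col1, s.col2 DESC
""").fetchall()

print(mismatched_keys)
print(mismatched_rows)
```

MIN and MAX ignore NULLs, so a group in which col2 is sometimes NULL and otherwise constant would not be reported by this check.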

Related

SQL - Delete repeated rows in table

I need to delete the repeated rows.
I have this table:
source
The result that I need:
result
* Keep only one combination of the 2 columns (the order is not important).
Thanks! (:

Here is one method that should be efficient:
select col1, col2
from t
where col1 <= col2
union all
select col1, col2
from t
where col1 > col2 and
not exists (select 1 from t t2 where t2.col1 = t.col2 and t2.col2 = t.col1);
Note: This is a SQL select statement, so it does not delete rows in the table. You seem to want the results from a query, not to modify the underlying table.

My interpretation of the spec "keep only one combination of 2 column, [column] order not important":
SELECT col1, col2
FROM t
WHERE col1 <= col2
UNION
SELECT col2, col1
FROM t
WHERE col1 > col2;
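The two queries above behave slightly differently, which a small run can illustrate. This is only a sketch using Python's sqlite3 module; the sample data is hypothetical, since the original source and result tables are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 INTEGER)")
# Hypothetical data: (1,2)/(2,1) and (3,4)/(4,3) are mirrored pairs,
# while (6,5) has no mirrored counterpart.
conn.executemany("INSERT INTO t VALUES (?, ?)",
                 [(1, 2), (2, 1), (3, 4), (4, 3), (6, 5)])

# First approach: keep rows as stored, dropping a "reversed" row only
# when its mirror image also exists.
keep_original = sorted(conn.execute("""
    SELECT col1, col2 FROM t WHERE col1 <= col2
    UNION ALL
    SELECT col1, col2 FROM t
    WHERE col1 > col2
      AND NOT EXISTS (SELECT 1 FROM t t2
                      WHERE t2.col1 = t.col2 AND t2.col2 = t.col1)
""").fetchall())

# Second approach: normalize every pair to (smaller, larger) and let
# UNION remove the duplicates.
normalized = sorted(conn.execute("""
    SELECT col1, col2 FROM t WHERE col1 <= col2
    UNION
    SELECT col2, col1 FROM t WHERE col1 > col2
""").fetchall())

print(keep_original)
print(normalized)
```

The first form returns an unmatched pair in its stored order, (6, 5), while the second swaps it to (5, 6). The second form also collapses exact duplicate rows, which UNION ALL does not.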

db2 select distinct rows, but select all columns

Experts, I have a single table with multiple columns: col1, col2, col3, col4, col5, col6.
I need to select distinct col4 values, but I also need all the other columns in my output.
If I run select distinct(col4) from table1, I get only col4 in my output.
How can I do this in DB2?
Thank you

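You simply do this...
Select * From Table1 Where col4 In (Select Distinct(col4) From Table1)
Note that this returns every row whose col4 is not NULL, so it does not reduce the output to one row per col4 value.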

I'm not sure if you will be able to do this.
You might try to run group by on this column. You will be able to run some aggregate functions on other columns.
select count(col1), col4 from table1 group by (col4);

None of the answers worked for me, so here is one that I got working: use GROUP BY on col4 while taking the max values of the other columns.
select max(col1) as col1,max(col2) as col2,max(col3) as col3
, col4
from
table1
group by col4
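One thing worth checking with this approach is that each MAX is computed independently, so an output row can combine values from different source rows. The sketch below shows this with Python's sqlite3 as a stand-in for DB2 (an assumption; the SQL used is plain GROUP BY and JOIN). The sample data and the second query, which picks one whole row per col4 by its largest col1, are illustrative assumptions rather than part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (col1 INTEGER, col2 TEXT, col3 INTEGER, col4 TEXT)")
# Hypothetical sample data: two rows share col4 = 'A'.
conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)",
                 [(1, "z", 10, "A"), (2, "a", 5, "A"), (3, "m", 7, "B")])

# Per-column MAX: one row per col4, but values may come from different rows.
per_column_max = conn.execute("""
    SELECT MAX(col1) AS col1, MAX(col2) AS col2, MAX(col3) AS col3, col4
    FROM table1
    GROUP BY col4
    ORDER BY col4
""").fetchall()

# Alternative: keep one real row per col4, here the one with the largest
# col1 (assumes col1 is unique within each col4 group).
whole_rows = conn.execute("""
    SELECT t.col1, t.col2, t.col3, t.col4
    FROM table1 t
    JOIN (SELECT col4, MAX(col1) AS col1
          FROM table1
          GROUP BY col4) k
      ON k.col4 = t.col4 AND k.col1 = t.col1
    ORDER BY t.col4
""").fetchall()

print(per_column_max)
print(whole_rows)
```

For col4 = 'A' the first query yields (2, 'z', 10, 'A'), which is not a row that exists in the table, while the second returns the stored row (2, 'a', 5, 'A').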

At least in DB2, you can execute
SELECT
DISTINCT *
FROM
<YOUR TABLE>
Which will give you every distinct combination of your (in this case) 6 columns.
Otherwise, you'll have to specify what columns you want to include. If you do that, you can either use select distinct or group by.

Get row where column2 is X and column1 is max of column1

I have a SQLite table like this:
Col1 Col2 Col3
1 ABC Bill
2 CDE Fred
3 FGH Jack
4 CDE June
I would like to find the row containing a Col2 value of CDE which has the max Col1 value i.e. in this case June. Or, put another way, the most recently added row with a col2 value of CDE, as Col1 is an auto increment column. What is an SQL query string to achieve this? I need this to be efficient as the query will run many iterations in a loop.
Thanks.

SELECT * FROM table WHERE col2='CDE' ORDER BY col1 DESC LIMIT 1
If col1 were not an auto-increment column, it would look something like this:
SELECT *,MAX(col1) AS max_col1 FROM table WHERE col2='CDE' GROUP BY col2 LIMIT 1

Try this:
SELECT t1.*
FROM table1 t1
INNER JOIN
(
SELECT MAX(col1) MAXID, col2
FROM table1
GROUP BY col2
) t2 ON t1.col1 = t2.maxID AND t1.col2 = t2.col2
WHERE t1.col2 = 'CDE';
SQL Fiddle Demo (the demo uses MySQL, but the same syntax should work fine in SQLite).

Use a subquery such as:
SELECT Col1, Col2, Col3
FROM table
WHERE Col1 = (SELECT MAX(Col1) FROM table WHERE Col2='CDE')
Add indexes as appropriate, e.g. clustered index on Col1 and another nonclustered index on Col2 to speed up the subquery.
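A rough sketch of the indexed lookup in SQLite, using Python's sqlite3 module. The table name t and index name idx_t_col2 are assumptions (table itself is a reserved word). SQLite has no CLUSTERED keyword: an INTEGER PRIMARY KEY column is an alias for the rowid, so only Col2 needs an explicit index here. The query is parameterized, which suits the case where it runs many times in a loop.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE t (
        Col1 INTEGER PRIMARY KEY AUTOINCREMENT,  -- alias for the rowid
        Col2 TEXT,
        Col3 TEXT
    )
""")
conn.executemany("INSERT INTO t (Col2, Col3) VALUES (?, ?)",
                 [("ABC", "Bill"), ("CDE", "Fred"),
                  ("FGH", "Jack"), ("CDE", "June")])

# Index the filter column so the lookup does not have to scan the table.
conn.execute("CREATE INDEX idx_t_col2 ON t (Col2)")

# Form 1: filter, sort descending on the key, take the first row.
by_order = conn.execute(
    "SELECT Col1, Col2, Col3 FROM t "
    "WHERE Col2 = ? ORDER BY Col1 DESC LIMIT 1",
    ("CDE",),
).fetchone()

# Form 2: scalar subquery for the largest key among matching rows.
by_subquery = conn.execute(
    "SELECT Col1, Col2, Col3 FROM t "
    "WHERE Col1 = (SELECT MAX(Col1) FROM t WHERE Col2 = ?)",
    ("CDE",),
).fetchone()

print(by_order)
print(by_subquery)
```

Both forms return the June row for this data. Running EXPLAIN QUERY PLAN on either statement is a reasonable way to confirm which index your SQLite version actually chooses.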

In SQLite 3.7.11 and later, the simplest query would be:
SELECT *, max(Col1) FROM MyTable WHERE Col2 = 'CDE'
As shown by EXPLAIN QUERY PLAN, both this and passingby's query are most efficient if there is an index on Col2.
If you want to see the corresponding values for all Col2 values, use a query like this instead:
SELECT *, max(Col1) FROM MyTable GROUP BY Col2
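These queries rely on SQLite's handling of bare columns next to a single min() or max() aggregate: the non-aggregated columns take their values from the row that holds the maximum. A small sketch with Python's sqlite3 module and the sample data from the question follows; the ORDER BY is added only to make the output order deterministic, and the behavior is SQLite-specific, so it should not be assumed for other databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable (Col1 INTEGER PRIMARY KEY, Col2 TEXT, Col3 TEXT)")
conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)",
                 [(1, "ABC", "Bill"), (2, "CDE", "Fred"),
                  (3, "FGH", "Jack"), (4, "CDE", "June")])

# Bare columns (*) are taken from the row that supplies max(Col1).
latest_cde = conn.execute(
    "SELECT *, max(Col1) FROM MyTable WHERE Col2 = 'CDE'"
).fetchone()

# Same idea per group: one row for each Col2 value.
latest_per_col2 = conn.execute(
    "SELECT *, max(Col1) FROM MyTable GROUP BY Col2 ORDER BY Col2"
).fetchall()

print(latest_cde)
print(latest_per_col2)
```

The extra trailing column in each result is the max(Col1) value itself; select explicit columns instead of * if it is not wanted.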

select all columns with one column has different value

In my table, some records have the same values in every column except one. I need to write a query to get those records. What's the best way to do it? The table looks like this:
colA colB colC
a b c
a b d
a b e
What's the best way to get all records with all the columns? Thanks for everyone's help.

Assuming you know that col3 is the column that differs, this returns the (Col1, Col2) pairs that have more than one col3 value:
SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
If you need all the values in the three columns, then you can join this back to the original table:
SELECT t.*
FROM table t join
(SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
) cols
on t.col1 = cols.col1 and t.col2 = cols.col2
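A minimal runnable sketch of that join-back, using Python's sqlite3 module and the column names from the question (colA, colB, colC). The table name t and the extra (x, y, z) row are assumptions, added to show that a group with a single colC value is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (colA TEXT, colB TEXT, colC TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)",
                 [("a", "b", "c"), ("a", "b", "d"), ("a", "b", "e"),
                  ("x", "y", "z")])  # hypothetical row with no variation

# Inner query: (colA, colB) groups with more than one distinct colC.
# Outer query: join back to return every full row in those groups.
differing_rows = conn.execute("""
    SELECT t.colA, t.colB, t.colC
    FROM t
    JOIN (SELECT colA, colB
          FROM t
          GROUP BY colA, colB
          HAVING COUNT(DISTINCT colC) > 1) g
      ON t.colA = g.colA AND t.colB = g.colB
    ORDER BY t.colA, t.colB, t.colC
""").fetchall()

print(differing_rows)
```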

Just select those rows that have the different values:
SELECT col1, col2
FROM myTable
WHERE colWanted != knownValue
If this is not what you are looking for, please post examples of the data in the table and the wanted output.

How about something like
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) = 1
This will give you Col1, Col2 that have unique data.

Assuming col3 holds the differences:
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) > 1
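Or, to show all 3 columns:
SELECT Col1, Col2, Col3
FROM Table1
GROUP BY Col1, Col2, Col3
HAVING COUNT(Col3) > 1
Note that grouping by all three columns returns only combinations that occur more than once, so this finds exact duplicate rows rather than rows that differ in Col3.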

delete all but minimal values, based on two columns in SQL Server table

How do I write a statement to accomplish the following?
Let's say a table has 2 columns (both are nvarchar) with the following data:
col1   col2
10000  10
10000  20
10001  10
10002  30
10002  40
10002  50
I'd like to keep only the following data:
col1   col2
10000  10
10001  10
10002  30
That is, remove the duplicates in col1 based on the second column's values (neither of the columns is a primary key), keeping only the records with the minimal value in the second column.
How can I accomplish this?

This should work for you:
;
WITH NotMin AS
(
SELECT Col1, Col2, MIN(Col2) OVER(Partition BY Col1) AS TheMin
FROM Table1
)
DELETE Table1
--SELECT *
FROM Table1
INNER JOIN NotMin
ON Table1.Col1 = NotMin.Col1 AND Table1.Col2 = NotMin.Col2
AND Table1.Col2 != TheMin
This uses a CTE (like a derived table, but cleaner) and the OVER clause as a shortcut for less code. I also added a commented-out SELECT so you can see the matching rows (verify before deleting). This will work in SQL Server 2005/2008.
Thanks,
Eric

Ideally, you'd like to be able to say:
DELETE
FROM tbl
WHERE (col1, col2) NOT IN (SELECT col1, MIN(col2) AS col2 FROM tbl GROUP BY col1)
Unfortunately, that's not allowed in T-SQL, but its proprietary DELETE ... FROM extension (a second FROM clause with a join) can express the same thing:
DELETE t
FROM tbl AS t
LEFT JOIN (SELECT col1, MIN(col2) AS col2 FROM tbl GROUP BY col1) AS k
ON k.col1 = t.col1 AND k.col2 = t.col2
WHERE k.col1 IS NULL
In general:
DELETE
FROM tbl
WHERE col1 + '|' + col2 NOT IN (SELECT col1 + '|' + MIN(col2) FROM tbl GROUP BY col1)
Or other workarounds.

Sorry, I misunderstood the question.
SELECT col1, MIN(col2) as col2
FROM table
GROUP BY col1
That returns the rows you want to keep, of course, but assuming you can't alter the table to add a unique identifier, you would need to do something like:
DELETE FROM test
WHERE col1 + '|' + col2 NOT IN
(SELECT col1 + '|' + MIN(col2)
FROM test
GROUP BY col1)
Which should work assuming that the pipe character never appears in your set.
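As a quick check of the concatenation workaround, here is a sketch that runs it against the question's data using Python's sqlite3 module. SQLite stands in for SQL Server here (an assumption), so string concatenation is written as || instead of +. Both columns are stored as text, as in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (col1 TEXT, col2 TEXT)")
conn.executemany("INSERT INTO test VALUES (?, ?)",
                 [("10000", "10"), ("10000", "20"), ("10001", "10"),
                  ("10002", "30"), ("10002", "40"), ("10002", "50")])

# Delete every row whose (col1, col2) pair is not the per-col1 minimum.
# The pair is encoded as 'col1|col2', so '|' must not occur in the data.
conn.execute("""
    DELETE FROM test
    WHERE col1 || '|' || col2 NOT IN
          (SELECT col1 || '|' || MIN(col2)
           FROM test
           GROUP BY col1)
""")
conn.commit()

remaining = conn.execute(
    "SELECT col1, col2 FROM test ORDER BY col1, col2"
).fetchall()
print(remaining)
```

Because col2 is text, MIN compares strings, so '9' would sort after '10'. If numeric ordering is intended, casting col2 to a number inside MIN is worth considering.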
