SQL filtering out less specific rows - sql

My Table data looks like
Col1 | Col2 | Col3
1 | 2 | NULL
1 | 2 | 3
1 | NULL | NULL
1 | 5 | NULL
2 | NULL | NULL
I want to write a query that returns only the most specific entries. For example, in the data above row 1 is more specific than row 3: the value of "Col1" is the same in both, but the value in "Col2" is more specific (not null) in row 1. Similarly, row 2 is more specific than row 1.
For the above dataset the result should look like:
Col1 | Col2 | Col3
1 | 2 | 3
1 | 5 | NULL
2 | NULL | NULL
NOTE: Datatype of column can be anything.

I am assuming that the columns are "ordered" as they are in your query, so you don't have a case where col2 is null and col3 is not null:
select col1, col2, col3
from table t
where (col3 is not null) or
(col3 is null and col2 is not null and
not exists (select 1
from table t2
where t2.col1 = t.col1 and t2.col2 = t.col2 and t2.col3 is not null
)
) or
(col2 is null and col1 is not null and
not exists (select 1
from table t2
where t2.col1 = t.col1 and t2.col2 is not null
)
);
The logic behind this is:
Take all rows where col3 is not null.
Take all rows where col2 is not null and there are no similar rows with a value in col3.
Take all rows where col1 is not null and there are no similar rows with a value in col2.
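As a quick check of the three rules above, here is a minimal sketch that runs the same NOT EXISTS query against the question's sample rows using Python's built-in sqlite3 module. The table name specifics and the helper most_specific are made up for the demo (table itself is a reserved word), and the order by is only there to make the output predictable.

```python
import sqlite3

# Sample rows from the question; the table name "specifics" is hypothetical.
ROWS = [(1, 2, None), (1, 2, 3), (1, None, None), (1, 5, None), (2, None, None)]

QUERY = """
select col1, col2, col3
from specifics t
where (col3 is not null) or
      (col3 is null and col2 is not null and
       not exists (select 1 from specifics t2
                   where t2.col1 = t.col1 and t2.col2 = t.col2
                     and t2.col3 is not null)) or
      (col2 is null and col1 is not null and
       not exists (select 1 from specifics t2
                   where t2.col1 = t.col1 and t2.col2 is not null))
order by col1, col2
"""

def most_specific(rows):
    """Load rows into an in-memory table and keep only the most specific ones."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table specifics (col1, col2, col3)")
    conn.executemany("insert into specifics values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(most_specific(ROWS))  # [(1, 2, 3), (1, 5, None), (2, None, None)]
```

This matches the expected result in the question, under the stated assumption that a null never appears to the left of a non-null value.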
EDIT:
In Oracle, you can do this more simply:
select col1, col2, col3
from (select t.*,
max(col3) over (partition by col1, col2) as maxcol3,
max(col2) over (partition by col1) as maxcol2
from table t
) t
where (col3 is not null) or
(col2 is not null and maxcol3 is null) or
(col1 is not null and maxcol2 is null);
EDIT II:
(With a clarified definition of "more specific".)
I think this is the extrapolation of the logic. It requires looking at all combinations:
select col1, col2, col3
from (select t.*,
max(col3) over (partition by col1, col2) as maxcol3_12,
max(col2) over (partition by col1, col3) as maxcol2_13,
max(col1) over (partition by col2, col3) as maxcol1_23,
max(col1) over (partition by col2) as maxcol1_2,
max(col1) over (partition by col3) as maxcol1_3,
max(col2) over (partition by col1) as maxcol2_1,
max(col2) over (partition by col3) as maxcol2_3,
max(col3) over (partition by col1) as maxcol3_1,
max(col3) over (partition by col2) as maxcol3_2
from table t
) t
where (col1 is not null and col2 is not null and col3 is not null) or
(col1 is not null and col2 is not null and maxcol3_12 is null) or
(col1 is not null and col3 is not null and maxcol2_13 is null) or
(col2 is not null and col3 is not null and maxcol1_23 is null) or
(col1 is not null and maxcol2_1 is null and maxcol3_1 is null) or
(col2 is not null and maxcol1_2 is null and maxcol3_2 is null) or
(col3 is not null and maxcol1_3 is null and maxcol2_3 is null);
The first combination says "keep this row if all values are not null". The second says: "keep this row if col1 and col2 are not null and col3 never has a value". And so on to the last one, which says: "keep this row if col3 is not null and col1 and col2 never have values".
This might simplify to:
where not ((col1 is null and maxcol1 is not null) or
(col2 is null and maxcol2 is not null) or
(col3 is null and maxcol3 is not null)
);
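For comparison, the clarified definition can also be written as an anti-join: drop a row when some other row agrees with it on every non-null column and fills in at least one of its nulls. This is a different formulation from the window-function queries above, not the answer's own method, and it is only a sketch. It is run here through Python's sqlite3 module; the table name specifics and the helper keep_undominated are made up for the demo.

```python
import sqlite3

QUERY = """
select col1, col2, col3
from specifics t
where not exists (
    select 1
    from specifics s
    -- s agrees with t wherever t has a value ...
    where (t.col1 is null or s.col1 = t.col1)
      and (t.col2 is null or s.col2 = t.col2)
      and (t.col3 is null or s.col3 = t.col3)
      -- ... and s fills in at least one column that t leaves null
      and ((t.col1 is null and s.col1 is not null) or
           (t.col2 is null and s.col2 is not null) or
           (t.col3 is null and s.col3 is not null))
)
order by col1, col2, col3
"""

def keep_undominated(rows):
    """Return the rows for which no strictly more specific row exists."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table specifics (col1, col2, col3)")  # hypothetical name
    conn.executemany("insert into specifics values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

sample = [(1, 2, None), (1, 2, 3), (1, None, None), (1, 5, None), (2, None, None)]
print(keep_undominated(sample))  # [(1, 2, 3), (1, 5, None), (2, None, None)]
```

Because it does not depend on column order, this version also handles a row such as (NULL, 7, NULL) being superseded by (4, 7, NULL). The self-join compares every pair of rows, so it may be slow on large tables.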

Divide n Conquer kind of Approach!
Demo : SQL Fiddle
SELECT col1,col2,MAX(col3)
FROM test
WHERE col1 is NOT NULL AND col2 is NOT NULL
GROUP BY col1,col2
UNION
SELECT col1,MAX(col2),col3
FROM test
WHERE col1 is NOT NULL AND col3 is NOT NULL
GROUP BY col1,col3
UNION
SELECT MAX(col1),col2,col3
FROM test
WHERE col2 is NOT NULL AND col3 is NOT NULL
GROUP BY col2,col3
UNION
SELECT col1,NULL,NULL
FROM test
GROUP BY COL1
HAVING COUNT(COL2) = 0 AND COUNT(COL3) = 0

Related

How to get Distinct value for a column on the basis of other column in Oracle

I want to get the distinct values from COL1 along with the corresponding COL3 value. The condition is: if COL1 = COL2, it should pick the matching COL3 value; otherwise, if they are not the same, pick the COL1 value. I'm stuck on the logic; any help will be appreciated!
Please see the below image for more detail:
select DISTINCT COL1,
CASE WHEN COL1 = COL2 THEN COL3 END COL3 from TABLE1
WHERE COL1 IS NOT NULL;
Do a GROUP BY to get distinct COL1 values.
Use COALESCE() to return the COL3 value if there exists a COL1 = COL2 row, otherwise return the max COL3 value for the COL1. (Could use MIN() too, if that's better.)
select COL1,
COALESCE( MAX(CASE WHEN COL1 = COL2 THEN COL3 END), MAX(COL3) )
FROM table1
WHERE COL1 IS NOT NULL
GROUP BY COL1
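A minimal sketch of how the COALESCE fallback behaves, run through Python's sqlite3 module. The question's data was only shown in an image, so the rows below are invented for illustration, and the helper distinct_col1 is hypothetical too.

```python
import sqlite3

# Hypothetical rows (COL1, COL2, COL3); the original data is not available.
ROWS = [
    (11, 11, "ABC"),    # COL1 = COL2, so this COL3 wins for 11
    (11, 12, "ZZZ"),
    (12, 10, "ABC"),    # no COL1 = COL2 row for 12, so MAX(COL3) is used
    (12, 13, "AAA"),
    (None, 5, "QQQ"),   # dropped by the WHERE clause
]

QUERY = """
select col1,
       coalesce(max(case when col1 = col2 then col3 end), max(col3)) as col3
from table1
where col1 is not null
group by col1
order by col1
"""

def distinct_col1(rows):
    """One row per COL1: the matching COL3 if any, else the largest COL3."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (col1, col2, col3)")
    conn.executemany("insert into table1 values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(distinct_col1(ROWS))  # [(11, 'ABC'), (12, 'ABC')]
```

The CASE yields NULL for non-matching rows, MAX ignores those NULLs, and COALESCE only falls through to MAX(COL3) when no matching row exists in the group.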
Use a correlated subquery:
select col1,col3
from TABLE1 a
where col2 in (select min(col2) from table1 b where a.col1=b.col1)
select distinct COL1, if(COL1 = COL2, COL3, COL1) as result
from table1
I think you can join the table with itself, use the join condition to keep only the rows where COL2 = COL1, and then decide in the select whether such a row exists and choose the appropriate COL3:
SELECT DISTINCT a.COL1, CASE WHEN b.COL1 IS NULL THEN a.COL3 ELSE b.COL3 END as COL3
FROM TABLE1 a
LEFT JOIN TABLE1 b
on a.COL1 = b.COL2
and a.COL1 = b.COL1
This way table a holds all the data, and table b holds data if and only if COL1 matches COL2. Then you select whichever COL3 is not null, preferably the one from table b. Oracle's coalesce function does just that.
With a self join:
select distinct
t.col1,
case
when tt.col1 is null then t.col3
else tt.col3
end col3
from tablename t left join tablename tt
on tt.col1 = t.col1 and tt.col2 = t.col1
See the demo.
Results:
> COL1 | COL3
> ---: | :---
> 11 | ABC
> 12 | ABC
> 13 | BDG
> 14 | DEF
> 15 | CEG

Replace Value in Column with Value in Different Column

I have a dataset that looks like this in SQL.
Col1 Col2 Col3
A 4 1
B 5 NULL
C 6 1
D 7 NULL
E 8 NULL
How do I add a new column that takes the value in Col3 if Col3 = 1, or else keeps the existing value in Col2?
Final Expected Output:
Col1 Col2 Col3 Col4
A 4 1 1
B 5 NULL 5
C 6 1 1
D 7 NULL 7
E 8 NULL 8
I tried the coalesce function but I don't think that worked:
SELECT
Col1,
Col2,
Col3,
coalesce(Col3, Col2) AS Col4
FROM table1
Your description suggests a case expression :
select . . .
(case when col3 = 1 then col3 else col2 end) as col4
You could also express the above as
select . . .
(case when col3 = 1 then 1 else col2 end) as col4
For the data you provided, coalesce() should also work.
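To see where the two expressions diverge, here is a small sketch using Python's sqlite3 module. It loads the question's rows plus one extra, hypothetical row F whose Col3 is 2, a value that does not occur in the original data. The helper compare is made up for the demo.

```python
import sqlite3

ROWS = [
    ("A", 4, 1), ("B", 5, None), ("C", 6, 1), ("D", 7, None), ("E", 8, None),
    ("F", 9, 2),  # hypothetical row: Col3 is non-null but not 1
]

QUERY = """
select col1,
       case when col3 = 1 then col3 else col2 end as case_col4,
       coalesce(col3, col2)                       as coalesce_col4
from table1
order by col1
"""

def compare(rows):
    """Return (col1, CASE result, COALESCE result) for every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table table1 (col1, col2, col3)")
    conn.executemany("insert into table1 values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

for row in compare(ROWS):
    print(row)
# ('A', 1, 1)
# ('B', 5, 5)
# ('C', 1, 1)
# ('D', 7, 7)
# ('E', 8, 8)
# ('F', 9, 2)
```

The two columns agree for rows A to E, where Col3 is either 1 or NULL. They differ only for F, so coalesce() is safe only if Col3 can never hold a value other than 1.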
Try this:
SELECT
Col1,
Col2,
Col3,
CASE WHEN Col3 IS NOT NULL THEN Col3 ELSE Col2 END AS Col4
FROM table1
-- this should do what you want
SELECT
Col1,
Col2,
Col3,
CASE WHEN Col3 = 1 THEN Col3 ELSE Col2 END AS NewCOL
FROM table1
Insert into table2
select
Col1,
Col2,
Col3,
(case when col3 = 1 then 1 else col2 end) as col4
FROM table1

How can I get data in single row when multiple columns data have null in some columns?

Following is the scenario
col1 col2 col3 col4
----- ------ ---------------
1 NULL NULL NULL
NULL 2 NULL NULL
NULL NULL 3 NULL
NULL NULL NULL 4
I want output like this
col1 col2 col3 col4
----- ------ ---------------
1 2 3 4
You can use aggregate functions as below:
select min(col1) as col1,min(col2) as col2,min(col3) as col3,min(col4) as col4 from t
select max(col1) as col1,max(col2) as col2,max(col3) as col3,max(col4) as col4 from t
select sum(col1) as col1,sum(col2) as col2,sum(col3) as col3,sum(col4) as col4 from t
select avg(col1) as col1,avg(col2) as col2,avg(col3) as col3,avg(col4) as col4 from t
However, Min or Max are more meaningful than Avg and Sum in this scenario.
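These queries work because aggregate functions skip NULLs. The sketch below runs MIN and MAX through Python's sqlite3 module against the question's table, and against a second table where some columns hold more than one non-null value. The helper name collapse is made up for the demo.

```python
import sqlite3

EXAMPLE_01 = [(1, None, None, None), (None, 2, None, None),
              (None, None, 3, None), (None, None, None, 4)]
EXAMPLE_02 = [(1, None, None, None), (None, 2, None, 2),
              (5, None, 3, None), (None, None, None, 4)]

def collapse(rows, agg="max"):
    """Collapse all rows into a single row using MIN or MAX per column."""
    if agg not in ("min", "max"):
        raise ValueError("agg must be 'min' or 'max'")
    conn = sqlite3.connect(":memory:")
    conn.execute("create table t (col1, col2, col3, col4)")
    conn.executemany("insert into t values (?, ?, ?, ?)", rows)
    query = f"select {agg}(col1), {agg}(col2), {agg}(col3), {agg}(col4) from t"
    row = conn.execute(query).fetchone()
    conn.close()
    return row

print(collapse(EXAMPLE_01, "max"))  # (1, 2, 3, 4)
print(collapse(EXAMPLE_02, "min"))  # (1, 2, 3, 2)
print(collapse(EXAMPLE_02, "max"))  # (5, 2, 3, 4)
```

When each column has exactly one non-null value, MIN and MAX return the same row. Once a column has several, the choice of aggregate decides which value survives.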
select max(col1) as col1,
max(col2) as col2,
max(col3) as col3,
max(col4) as col4
from your_table
Try this way.
SELECT DISTINCT
(SELECT TOP 1 Col1 FROM TestTable WHERE Col1 IS NOT NULL) AS 'Column1',
(SELECT TOP 1 Col2 FROM TestTable WHERE Col2 IS NOT NULL) AS 'Column2',
(SELECT TOP 1 Col3 FROM TestTable WHERE Col3 IS NOT NULL) AS 'Column3',
(SELECT TOP 1 Col4 FROM TestTable WHERE Col4 IS NOT NULL) AS 'Column4'
From TestTable
Example 01 
Col1 Col2 Col3 Col4
----- ------ ---------------
1 NULL NULL NULL
NULL 2 NULL NULL
NULL NULL 3 NULL
NULL NULL NULL 4
Result
Column1 Column2 Column3 Column4
-------------------------------
1 2 3 4
Example 02
Col1 Col2 Col3 Col4
----- ------ ---------------
1 NULL NULL NULL
NULL 2 NULL 2
5 NULL 3 NULL
NULL NULL NULL 4
Result
Column1 Column2 Column3 Column4
-------------------------------
1 2 3 2

How to get the minimum value in a set of columns while excluding a certain value?

I have been trying a query to select the minimum value in a row but also exclude a certain value (-998).
The table looks like this:
col1 col2 col3
----------------------------------
1 1 -998
2 -998 2
3 2 1
-998 1 3
So in the first row, the minimum value would be 1; in the second row, it would be 2; and in the third row, it would be 1 again.
I tried using a case statement and excluding -998 in each condition, but it keeps grabbing -998 for some reason.
SELECT
CASE
WHEN (col1 <= col2 and col1 <= col3) and col1 != -998 THEN col1
WHEN (col2 <= col1 and col2 <= col3) and col2 != -998 THEN col2
WHEN (col3 <= col1 and col3 <= col2) and col3 != -998 THEN col3
END AS [MIN_VAL]
FROM myTable
If anyone can point me in the right direction, that would be awesome.
Use the table value constructor to unpivot your column values and exclude values from there.
SQL Fiddle
MS SQL Server 2012 Schema Setup:
create table YourTable
(
col1 int,
col2 int,
col3 int
);
insert into YourTable values
(1 , 1 , -998),
(2 , -998 , 2 ),
(3 , 2 , 1 ),
(-998 , 1 , 3 );
Query 1:
select (
select min(R.Value)
from (values(T.col1),
(T.col2),
(T.col3)) as R(Value)
where R.Value <> -998
) as min_val
from YourTable as T;
Results:
| MIN_VAL |
|---------|
| 1 |
| 2 |
| 1 |
| 1 |
How about this:
use tempdb
create table myTable(
col1 int,
col2 int,
col3 int
)
insert into myTable values
(1, 1, -998),
(2, -998, 2),
(3, 2, 1),
(-998, 1, 3)
;with cte as(
select
rn = row_number() over(order by (select null)),
col = col1
from myTable
union all
select
rn = row_number() over(order by (select null)),
col = col2
from myTable
union all
select
rn = row_number() over(order by (select null)),
col = col3
from myTable
)
select
minimum = min(col)
from cte
where col <> - 998
group by rn
drop table mytable
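The CTE above numbers the rows with row_number() over (order by (select null)) in three separate scans, which only lines up if every scan returns the rows in the same order. Below is a sketch of the same unpivot idea with an explicit key instead, run through Python's sqlite3 module. The id column, the table name my_table and the helper row_minimums are assumptions added for the demo; the original table has no key.

```python
import sqlite3

ROWS = [(1, 1, -998), (2, -998, 2), (3, 2, 1), (-998, 1, 3)]

QUERY = """
with unpivoted as (
    select id, col1 as val from my_table
    union all
    select id, col2 from my_table
    union all
    select id, col3 from my_table
)
select id, min(val) as min_val
from unpivoted
where val <> -998      -- drop the sentinel before aggregating
group by id
order by id
"""

def row_minimums(rows):
    """Return (id, minimum) per row, ignoring the -998 sentinel."""
    conn = sqlite3.connect(":memory:")
    # "id" is a hypothetical surrogate key that ties unpivoted values to a row.
    conn.execute(
        "create table my_table (id integer primary key, col1, col2, col3)")
    conn.executemany(
        "insert into my_table (col1, col2, col3) values (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

print(row_minimums(ROWS))  # [(1, 1), (2, 2), (3, 1), (4, 1)]
```

A row in which all three columns are -998 produces no output row at all with this approach; a left join back to my_table would be needed to return NULL for it.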
SELECT
CASE
WHEN (col1 <= col2 or col2 = -998)
and (col1 <= col3 or col3 = -998)
and col1 != -998
THEN col1
WHEN (col2 <= col1 or col1 = -998)
and (col2 <= col3 or col3 = -998)
and col2 != -998
THEN col2
WHEN (col3 <= col1 or col1 = -998)
and (col3 <= col2 or col2 = -998)
and col3 != -998
THEN col3
END AS [MIN_VAL]
FROM myTable;

Sql query to group by and merge rows

I am working on a SQL query against a table structured like this:
col1 col2 col3
1 nik NULL
1 nik1 NULL
1 NULL mah
1 NULL mah1
Now I want output like this:
col1 col2 col3
1 nik mah
1 nik1 mah1
So I want to merge the null values wherever there is a value in col2 or col3.
How can I achieve this?
EDIT: The main structure is that if col2 has a value then col3 will be null, and if col3 has a value then col2 will be null.
So I want to reduce the total number of rows by filling in the null values.
Try this:
SELECT T1.Col1,T1.Col2,T2.Col3
FROM
(SELECT Col1,Col2,ROW_NUMBER()OVER(ORDER BY Col1) as RN
FROM TableName
WHERE Col2 IS NOT NULL) T1 FULL OUTER JOIN
(SELECT Col1,Col3,ROW_NUMBER()OVER(ORDER BY Col1) as RN
FROM TableName
WHERE Col3 IS NOT NULL) T2 ON T1.Col1=T2.Col1 AND T1.RN=T2.RN
See result in SQL Fiddle.