SQL Server: remove multiple row null value - sql

I have a table like this:
Result Col1 Col2 Col3
-----------------------------
Row1 null 1 null
Row1 2 null null
Row1 null null 3
Row1 1 null null
Row1 null 2 null
Row1 null null 3
and I would like to get the result like
Result Col1 Col2 Col3
-----------------------------
Row1 2 1 3
Row1 1 2 3
How to get this done in the SQL Server table? I know that if I use the MAX of Col1, Col2, Col3 I will get only one row. But I need to get the two rows.
How can I do this?

This is tricky. You can assign a sequential value using row_number() to each value and then aggregate.
Your data lacks ordering -- SQL tables represent unordered sets. Assuming you have an ordering column and you have only one non-NULL value per row:
select t.result, max(col1) as col1, max(col2) as col2, max(col3) as col3
from (select t.*,
             row_number() over (partition by case when col1 is not null then 1
                                                  when col2 is not null then 2
                                                  when col3 is not null then 3
                                             end
                                order by ? -- the ordering column
                               ) as seqnum
      from t
     ) t
group by t.result, seqnum;
If you can have multiple non-NULL values per row, then the question is ill-defined. Ask another question and provide sample data and desired results.
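As a minimal sketch of the approach above, the query can be tried against the question's sample data using Python's built-in sqlite3 module (window functions need SQLite 3.25 or later). The `id` column is an assumption: it stands in for the ordering column that the question's table does not show, and here it simply records insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is a hypothetical ordering column; the original table does not have one.
conn.execute(
    "create table t (id integer primary key, result text, col1 int, col2 int, col3 int)"
)
conn.executemany(
    "insert into t (result, col1, col2, col3) values (?, ?, ?, ?)",
    [
        ("Row1", None, 1, None),
        ("Row1", 2, None, None),
        ("Row1", None, None, 3),
        ("Row1", 1, None, None),
        ("Row1", None, 2, None),
        ("Row1", None, None, 3),
    ],
)

# Number the non-NULL values of each column separately, then collapse
# all rows that share the same sequence number into one output row.
query = """
select t.result, max(col1) as col1, max(col2) as col2, max(col3) as col3
from (select t.*,
             row_number() over (partition by case when col1 is not null then 1
                                                  when col2 is not null then 2
                                                  when col3 is not null then 3
                                             end
                                order by id) as seqnum
      from t
     ) t
group by t.result, seqnum
order by seqnum
"""
merged_rows = conn.execute(query).fetchall()
print(merged_rows)
```

The first value seen in each column lands in the first output row, the second value in the second row, and so on, which is why the ordering column matters.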

Related

Oracle query - Selecting unique row number based on order of another column

I'm trying to find the best way to make a query with two columns, one a number and the other a date:
Doing a select and ordering by the date column.
Table1:
col1 (NUMBER)   col2 (DATE)
1               02/2019
2               02/2019
3               02/2019
4               03/2019
2               04/2019
3               05/2019
I'm doing a query like this:
select col1, col2
from table1
order by col2 asc, col1 asc
fetch first 10 rows only;
The result also includes values from later dates and repeats col1 values, like this:
col1 (NUMBER)   col2 (DATE)
1               02/2019
2               02/2019
3               02/2019
4               03/2019
2               04/2019
3               05/2019
But I would like a filter to limit to only a sequential col1 value like this:
col1 (NUMBER)   col2 (DATE)
1               02/2019
2               02/2019
3               02/2019
4               03/2019
ignoring values that would come in a "next batch", without the risk of repeating col1 values or getting col1 values that have a bigger col2 value than a previous result.
Any ideas on the best way to do this?
If I understand correctly, you can use a cumulative max():
select col1, col2
from (select t1.*,
max(col1) over (order by col2, col1 rows between unbounded preceding and 1 preceding) as running_max
from table1 t1
) t1
where running_max is null or col1 > running_max;
This returns rows whose value is greater than the values on the preceding rows.
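A minimal sketch of the cumulative max() query, run against the question's sample data with Python's built-in sqlite3 module (SQLite 3.25 or later for window functions). The dates are stored as `YYYY-MM` text, which is an assumption made so that they sort correctly in SQLite, and the final `order by` is added only to make the output order deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (col1 integer, col2 text)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [
        (1, "2019-02"),
        (2, "2019-02"),
        (3, "2019-02"),
        (4, "2019-03"),
        (2, "2019-04"),
        (3, "2019-05"),
    ],
)

# running_max is the largest col1 seen on all earlier rows (NULL on the first row).
query = """
select col1, col2
from (select t1.*,
             max(col1) over (order by col2, col1
                             rows between unbounded preceding and 1 preceding) as running_max
      from table1 t1
     ) t1
where running_max is null or col1 > running_max
order by col2, col1
"""
increasing_rows = conn.execute(query).fetchall()
print(increasing_rows)
```

The two later rows with col1 values 2 and 3 are dropped because the running maximum has already reached 4 by then.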
EDIT:
If you want to return rows only up to the first time there is a decline, then:
select t1.*
from (select t1.*,
             sum(case when prev_col1 > col1 then 1 else 0 end) over (order by col2, col1) as num_decreases
      from (select t1.*,
                   lag(col1) over (order by col2, col1) as prev_col1
            from table1 t1
           ) t1
     ) t1
where num_decreases = 0;
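The two queries above agree on the question's sample data, so the following sketch adds one hypothetical row, (5, 06/2019), to show where they differ. It uses Python's built-in sqlite3 module (SQLite 3.25 or later); storing dates as `YYYY-MM` text and selecting only col1 and col2 are assumptions made for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (col1 integer, col2 text)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [
        (1, "2019-02"), (2, "2019-02"), (3, "2019-02"), (4, "2019-03"),
        (2, "2019-04"), (3, "2019-05"),
        (5, "2019-06"),  # hypothetical extra row, not in the question
    ],
)

# Variant 1: keep every row that exceeds all earlier values.
running_max_query = """
select col1, col2
from (select t1.*,
             max(col1) over (order by col2, col1
                             rows between unbounded preceding and 1 preceding) as running_max
      from table1 t1) t1
where running_max is null or col1 > running_max
order by col2, col1
"""

# Variant 2: stop at the first decrease and keep nothing after it.
first_decline_query = """
select col1, col2
from (select t1.*,
             sum(case when prev_col1 > col1 then 1 else 0 end)
                 over (order by col2, col1) as num_decreases
      from (select t1.*,
                   lag(col1) over (order by col2, col1) as prev_col1
            from table1 t1) t1) t1
where num_decreases = 0
order by col2, col1
"""

running_max_rows = conn.execute(running_max_query).fetchall()
first_decline_rows = conn.execute(first_decline_query).fetchall()
print(running_max_rows)
print(first_decline_rows)
```

The cumulative max() version picks the sequence up again once a new maximum appears, whereas the lag() version never returns anything after the first decline.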

Print value in SQL depending on its presence in another column

I have a table of the form
Col1 | Col2
-------------
A | C
B | A
C | X
D | A
E | NULL
If an element of Col1 is present anywhere in Col2, it should be printed as
Element, YES.
If it is not present in Col2, it should be printed as Element, NO, and if the corresponding Col2 value is NULL, it should be printed as Element, NULL.
So final output should look like
A YES
B NO
C YES
D NO
E NULL
I was able to write three individual queries for this, but at the moment I am struggling with how to put them inside CASE statements in SQL.
SELECT Col1 FROM table WHERE col1 IN (SELECT col2 FROM table)
Select col1 FROM table where Col2 is NULL
SELECT Col1 FROM table WHERE col1 NOT IN (SELECT col2 FROM table)
I tried putting them inside case statements
Select col1, Case
when (SELECT Col1 FROM table WHERE col1 IN (SELECT col2 FROM table))
then "YES"
when (Select col1 FROM table where Col2 is NULL)
then "NULL"
else
"NO"
But I was getting an error. How should I fix this?
I would expect the query to look like this:
select col1,
(case when col2 is null then NULL
when col1 in (select t2.col2 from t t2)
then 'YES'
else 'NO'
end)
from t;
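A minimal sketch that runs this CASE expression against the question's sample data, using Python's built-in sqlite3 module. The table name `t` follows the answer; the column alias `present` and the `order by` are assumptions added for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (col1 text, col2 text)")
conn.executemany(
    "insert into t values (?, ?)",
    [("A", "C"), ("B", "A"), ("C", "X"), ("D", "A"), ("E", None)],
)

# The NULL check comes first, so a row with a NULL col2 yields NULL
# regardless of whether its col1 appears elsewhere in col2.
query = """
select col1,
       (case when col2 is null then NULL
             when col1 in (select t2.col2 from t t2)
             then 'YES'
             else 'NO'
        end) as present
from t
order by col1
"""
presence_rows = conn.execute(query).fetchall()
for col1, present in presence_rows:
    print(col1, present)
```

Note that the result for E is a real SQL NULL (None in Python), not the string 'NULL'.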

select query to fetch rows corresponding to all values in a column

Consider this example table "Table1".
Col1 Col2
A 1
B 1
A 4
A 5
A 3
A 2
D 1
B 2
C 3
B 4
I am trying to fetch those values from Col1 which corresponds to all values (in this case, 1,2,3,4,5). Here the result of the query should return 'A' as none of the others have all values 1,2,3,4,5 in Col2.
Note that the values in Col2 are decided by other parameters in the query and they will always return some numeric values. Out of those values the query needs to fetch values from Col1 corresponding to all in Col2. The values in Col2 could be 11,12,1,2,3,4 for instance (meaning not necessarily in sequence).
I have tried the following select query:
select distinct Col1 from Table1 where Col1 in (1,2,3,4,5);
select distinct Col1 from Table1 where Col1 exists (select distinct Col2 from Table1);
and its different variations. But the problem is that I need to apply an 'and' for Col2 not an 'or'.
like Return a value from Col1 where Col2 'contains' all values between 1 and 5.
Appreciate any suggestion.
You could use the analytic ROW_NUMBER() function.
SQL Fiddle for a setup and working demonstration.
SELECT col1
FROM
(SELECT col1,
col2,
row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
FROM your_table
WHERE col2 IN (1,2,3,4,5)
)
WHERE rn =5;
UPDATE As requested by OP, some explanation about how the query works.
The inner sub-query gives you the following resultset:
SQL> SELECT col1,
2 col2,
3 row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
4 FROM t
5 WHERE col2 IN (1,2,3,4,5);
C COL2 RN
- ---------- ----------
A 1 1
A 2 2
A 3 3
A 4 4
A 5 5
B 1 1
B 2 2
B 4 3
C 3 1
D 1 1
10 rows selected.
The PARTITION BY clause groups the rows by col1, and ORDER BY sorts col2 within each group, so the sub-query numbers the rows of each col1 group in order. Because the WHERE clause keeps only the five values being tested, a group reaches row number 5 only if it contains all five of them (assuming no duplicate col1/col2 pairs). So, in the outer query, all you need to do is filter with WHERE rn = 5.
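The ROW_NUMBER() approach relies on each (col1, col2) pair appearing only once. The sketch below, using Python's built-in sqlite3 module (SQLite 3.25 or later), runs it on the question's data and then on the same data with two hypothetical duplicate rows added. For comparison it also runs a `count(distinct col2)` variant, which is a different technique from the answer's and is included only to illustrate the duplicate case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (col1 text, col2 integer)")
conn.executemany(
    "insert into table1 values (?, ?)",
    [("A", 1), ("B", 1), ("A", 4), ("A", 5), ("A", 3),
     ("A", 2), ("D", 1), ("B", 2), ("C", 3), ("B", 4)],
)

ROW_NUMBER_QUERY = """
select col1
from (select col1, col2,
             row_number() over (partition by col1 order by col2) as rn
      from table1
      where col2 in (1, 2, 3, 4, 5))
where rn = 5
order by col1
"""

# Alternative (not from the answer): count distinct matching values per group.
COUNT_DISTINCT_QUERY = """
select col1
from table1
where col2 in (1, 2, 3, 4, 5)
group by col1
having count(distinct col2) = 5
order by col1
"""

def matches(query):
    """Return the col1 values selected by the given query."""
    return [row[0] for row in conn.execute(query)]

clean_row_number = matches(ROW_NUMBER_QUERY)
clean_count_distinct = matches(COUNT_DISTINCT_QUERY)

# Hypothetical duplicates: B now has five qualifying rows but only three distinct values.
conn.executemany("insert into table1 values (?, ?)", [("B", 1), ("B", 2)])
dup_row_number = matches(ROW_NUMBER_QUERY)
dup_count_distinct = matches(COUNT_DISTINCT_QUERY)

print(clean_row_number, clean_count_distinct)
print(dup_row_number, dup_count_distinct)
```

On the original data both queries return only A. With the duplicates added, the row-number version also returns B, so it is worth confirming that the pairs are unique before relying on it.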
You can use listagg function, like
SELECT Col1
FROM
(select Col1,listagg(Col2,',') within group (order by Col2) Col2List from Table1
group by Col1)
WHERE Col2List = '1,2,3,4,5'
You can also use the query below:
SELECT COL1
FROM TABLE_NAME
GROUP BY COL1
HAVING COUNT(COL1) = 5
   AND SUM((CASE WHEN COL2 = 1 THEN 1 ELSE 0 END)
         + (CASE WHEN COL2 = 2 THEN 1 ELSE 0 END)
         + (CASE WHEN COL2 = 3 THEN 1 ELSE 0 END)
         + (CASE WHEN COL2 = 4 THEN 1 ELSE 0 END)
         + (CASE WHEN COL2 = 5 THEN 1 ELSE 0 END)) = 5

Removing rows in SQL that have a duplicate column value

I have looked high and low on SO over the last couple of hours for an answer to this question (subqueries, CTEs, left joins with derived tables), but none of the solutions really meet my criteria.
I have a table with data like this :
COL1 COL2 COL3
1 A 0
2 A 1
3 A 1
4 B 0
5 B 0
6 B 0
7 B 0
8 B 1
Column 1 is the primary key and is an int, column 2 is nvarchar(max), and column 3 is an int. I have determined that by using this query:
select name, COUNT(name) as 'count'
FROM [dbo].[AppConfig]
group by Name
having COUNT(name) > 3
I can return the total counts of "A, B and C" only if they have an occurrence of column C more than 3 times. I am now trying to remove all the rows that occur after the initial value of column 3. The sample table I provided would look like this now:
COL1 COL2 COL3
1 A 0
2 A 1
4 B 0
8 B 1
Could anyone assist me with this?
If all you want is the first row with a ColB-ColC combination, the following will do it:
select min(id) as id, colB, colC
from tbl
group by colB, colC
order by id
This should work:
;WITH numbered_rows as (
    SELECT
        Col1,
        Col2,
        Col3,
        -- order by the key so that row 1 is the earliest row of each Col2/Col3 pair
        ROW_NUMBER() OVER(PARTITION BY Col2, Col3 ORDER BY Col1) as row
    FROM AppConfig)
SELECT
    Col1,
    Col2,
    Col3
FROM numbered_rows
WHERE row = 1
SELECT DISTINCT MIN(COL1) AS COL1,COL2,COL3
FROM TABLE
GROUP BY COL2,COL3
ORDER BY COL1
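A minimal sketch that checks the MIN()/GROUP BY and ROW_NUMBER() approaches against the question's sample data, using Python's built-in sqlite3 module (SQLite 3.25 or later). The table name `AppConfig` comes from the question; the alias `rn` and the ordering by Col1 inside the window are assumptions chosen so that the earliest row of each Col2/Col3 pair is the one kept.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table AppConfig (Col1 integer primary key, Col2 text, Col3 integer)")
conn.executemany(
    "insert into AppConfig values (?, ?, ?)",
    [(1, "A", 0), (2, "A", 1), (3, "A", 1), (4, "B", 0),
     (5, "B", 0), (6, "B", 0), (7, "B", 0), (8, "B", 1)],
)

# Approach 1: the smallest key of each Col2/Col3 combination.
group_by_query = """
select min(Col1) as Col1, Col2, Col3
from AppConfig
group by Col2, Col3
order by Col1
"""

# Approach 2: number the rows of each combination by key and keep the first.
row_number_query = """
with numbered_rows as (
    select Col1, Col2, Col3,
           row_number() over (partition by Col2, Col3 order by Col1) as rn
    from AppConfig)
select Col1, Col2, Col3
from numbered_rows
where rn = 1
order by Col1
"""

group_by_rows = conn.execute(group_by_query).fetchall()
row_number_rows = conn.execute(row_number_query).fetchall()
print(group_by_rows)
print(row_number_rows)
```

Both return the four rows the question asks for. The ROW_NUMBER() form is the easier one to extend if the table has further columns that should come from the same kept row.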

T-SQL Eliminating duplicate rows while ignoring certain columns

I'm struggling to find the proper statements to select non-duplicate entries that are duplicates only for particular columns. As an example, in the following table I only care about rows that have unique values in col1, col2, and col3 and the values in col4 and col5 do not matter. This means I would consider row 1 and row 2 to be duplicates and row 4 and row 5 to be duplicates:
col1 col2 col3 col4 col5
A 2 p 0 2
A 2 p 1 8
A 3 r 4 12
B 0 f 3 1
B 0 f 6 5
And I would want to select only the following:
col1 col2 col3 col4 col5
A 2 p 0 2
A 3 r 4 12
B 0 f 3 1
Is there a way to combine multiple DISTINCT statements to achieve this or specify certain columns to ignore when comparing rows for duplicates?
You have to choose which lines you want to keep, you can use the ROW_NUMBER() function for this:
SELECT col1, col2, col3, col4, col5
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY col1, col2, col3 ORDER BY col4 DESC) 'RowRank'
FROM table
)sub
WHERE RowRank = 1
You can change the ORDER BY section to control which row you keep and which you toss; with the sample data, ORDER BY col4 ASC reproduces the desired output. The ROW_NUMBER() function assigns a number to each row. In this example, you want to preserve each combination of col1, col2, col3, so you PARTITION BY them, meaning that numbering starts at 1 for each combination. You can run just the inner query to get the idea.
Alternatively, you could use GROUP BY and aggregate functions, e.g.:
SELECT col1, col2, col3, MAX(col4), MAX(col5)
FROM table
GROUP BY col1, col2, col3
The downside here is that the MAX() of col4 and col5 might come from different rows, so you're not necessarily returning one single row from your original table, but if you don't care which row you return then it doesn't matter.
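To see how the ORDER BY direction decides which duplicate survives, here is a minimal sketch using Python's built-in sqlite3 module (SQLite 3.25 or later) and the question's sample data. The table name `items` and the helper `keep_one_per_group` are hypothetical names introduced for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table items (col1 text, col2 integer, col3 text, col4 integer, col5 integer)")
conn.executemany(
    "insert into items values (?, ?, ?, ?, ?)",
    [("A", 2, "p", 0, 2), ("A", 2, "p", 1, 8), ("A", 3, "r", 4, 12),
     ("B", 0, "f", 3, 1), ("B", 0, "f", 6, 5)],
)

def keep_one_per_group(direction):
    """Keep one whole row per (col1, col2, col3), chosen by col4 in the given direction."""
    # direction is limited to these two literals, so formatting it into the SQL is safe.
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    query = f"""
    select col1, col2, col3, col4, col5
    from (select *,
                 row_number() over (partition by col1, col2, col3
                                    order by col4 {direction}) as row_rank
          from items) sub
    where row_rank = 1
    order by col1, col2
    """
    return conn.execute(query).fetchall()

lowest_col4_rows = keep_one_per_group("asc")
highest_col4_rows = keep_one_per_group("desc")
print(lowest_col4_rows)
print(highest_col4_rows)
```

Ascending order keeps the rows listed in the question's desired output, while descending order keeps the other member of each duplicate pair. In both cases col4 and col5 come from the same original row.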