Summarise null rows in Oracle - sql

I have a dataset like this:
+---------------+-------+
| SAMPLE_NUMBER | SCORE |
+---------------+-------+
| 1 | 100 |
| 2 | 97 |
| 3 | 124 |
| 4 | 762 |
| 5 | 999 |
| 6 | 1200 |
| 7 | NULL |
| 8 | NULL |
| 9 | NULL |
| 10 | NULL |
+---------------+-------+
I want to be able to summarise the NULL rows instead of displaying them all. So ideally, I would want the above to look like this:
+---------------+-------+
| SAMPLE_NUMBER | SCORE |
+---------------+-------+
| 1 | 100 |
| 2 | 97 |
| 3 | 124 |
| 4 | 762 |
| 5 | 999 |
| 6 | 1200 |
| 7-10 | NULL |
+---------------+-------+
Is there any way in Oracle to do this? Or is it something I will have to do post-query?

Yes. For your sample data:
select (case when score is null then min(sample_number) || '-' || max(sample_number)
else to_char(min(sample_number))
end) as sample_number,
score
from your_table t
group by score
order by min(t.sample_number)
In other words, group by score and then fiddle with the sample number. Both branches of the case must return a string, hence the to_char(). Note: this assumes that you do not have duplicate scores. If you do, use a more complicated version:
select (case when score is null then min(sample_number) || '-' || max(sample_number)
else to_char(min(sample_number))
end) as sample_number,
score
from (select t.*,
row_number() over (partition by score order by sample_number) as seqnum
from your_table t
) t
group by score, (case when score is not null then seqnum end);
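The grouping idea can be checked end to end with a small script. This is a minimal sketch that uses Python's built-in sqlite3 module rather than Oracle, so the table name samples, the alias sample_label, and the CAST (standing in for Oracle's to_char) are assumptions made for illustration only.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample_number INTEGER, score INTEGER)")
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, 100), (2, 97), (3, 124), (4, 762), (5, 999), (6, 1200),
     (7, None), (8, None), (9, None), (10, None)],
)

# GROUP BY puts all NULL scores into one group, so MIN and MAX give the range.
query = """
SELECT CASE WHEN score IS NULL
            THEN MIN(sample_number) || '-' || MAX(sample_number)
            ELSE CAST(MIN(sample_number) AS TEXT)
       END AS sample_label,
       score
FROM samples
GROUP BY score
ORDER BY MIN(sample_number)
"""
rows = conn.execute(query).fetchall()
for label, score in rows:
    print(label, score)
```

As in the answer, this simple form only works while the non-NULL scores are unique.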

My guess is that this should be part of your presentation layer, since you will have to cast sample_number to a string (assuming it is a numeric type). An alternative to your requirements is to return the min and max consecutive sample_number:
with t (SAMPLE_NUMBER, SCORE) as (
values (1, 100)
, (2, 97)
, (3, 124)
, (4, 762)
, (5, 999)
, (6, 1200)
, (7, NULL)
, (8, NULL)
, (9, NULL)
, (10, NULL)
)
select min(sample_number), max(sample_number), grp, score
from (
select SAMPLE_NUMBER, SCORE
, row_number() over (order by SAMPLE_NUMBER)
- row_number() over (partition by SCORE
order by SAMPLE_NUMBER) as grp
from t
) group by grp, score
order by grp;
1 2 GRP SCORE
----------- ----------- -------------------- -----------
1 1 0 100
2 2 1 97
3 3 2 124
4 4 3 762
5 5 4 999
6 6 5 1200
7 10 6 -
Tried against db2, so you may have to adjust it slightly.
Edit: treat rows as individuals when score is not null
with t (SAMPLE_NUMBER, SCORE) as (
values (1, 100)
, (2, 97)
, (3, 97)
, (4, 762)
, (5, 999)
, (6, 1200)
, (7, NULL)
, (8, NULL)
, (9, NULL)
, (10, NULL)
)
select min(sample_number), max(sample_number), grp, score
from (
select SAMPLE_NUMBER, SCORE
, row_number() over (order by SAMPLE_NUMBER)
- row_number() over (partition by SCORE
order by SAMPLE_NUMBER) as grp
from t
) group by grp, score
, case when score is not null then sample_number end
order by grp;
1 2 GRP SCORE
----------- ----------- -------------------- -----------
1 1 0 100
2 2 1 97
3 3 1 97
4 4 3 762
5 5 4 999
6 6 5 1200
7 10 6 -
You may want to map max to null in case it is the same as min:
[...]
select min(sample_number)
, nullif(max(sample_number), min(sample_number))
, grp
, score
from ...
1 2 GRP SCORE
----------- ----------- -------------------- -----------
1 - 0 100
2 - 1 97
3 - 1 97
4 - 3 762
5 - 4 999
6 - 5 1200
7 10 6 -
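The difference of the two row_number() values stays constant within a run of consecutive rows that share a score, which is what makes grp usable as a grouping key. Below is a minimal sketch of that behaviour using Python's sqlite3 module (assuming SQLite 3.25 or later for window functions). The table name samples and the data, which contains two separate NULL runs, are made up for illustration and are not from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample_number INTEGER, score INTEGER)")
# Hypothetical data: NULL scores appear in two separate consecutive runs.
conn.executemany(
    "INSERT INTO samples VALUES (?, ?)",
    [(1, 100), (2, None), (3, None), (4, 97),
     (5, None), (6, None), (7, None)],
)

query = """
SELECT MIN(sample_number) AS first_sample,
       MAX(sample_number) AS last_sample,
       score
FROM (
    SELECT sample_number, score,
           ROW_NUMBER() OVER (ORDER BY sample_number)
         - ROW_NUMBER() OVER (PARTITION BY score ORDER BY sample_number) AS grp
    FROM samples
)
GROUP BY grp, score,
         -- keep non-NULL rows as individuals, as in the edited query
         CASE WHEN score IS NOT NULL THEN sample_number END
ORDER BY first_sample
"""
ranges = conn.execute(query).fetchall()
for first_sample, last_sample, score in ranges:
    print(first_sample, last_sample, score)
```

Unlike a plain group by score, each NULL run is reported as its own range here (2 to 3 and 5 to 7).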

SELECT DISTINCT
DECODE(SCORE
,NULL
,(SELECT COUNT(*)+1
FROM TAB_NAME
WHERE SCORE IS NOT NULL)
|| '-'
|| (SELECT COUNT(*)
FROM TAB_NAME)
,SAMPLE_NUMBER) NUM
, NVL(TO_CHAR(SCORE),'NULL') SCRE
FROM TAB_NAME
ORDER BY 1 ASC;

Related

Condense or merge rows with null values not using group by

Let's say I have a select which returns the following Data:
select nr, name, val_1, val_2, val_3
from table
Nr. | Name | Value 1 | Value 2 | Value 3
-----+------------+---------+---------+---------
1 | Max | 123 | NULL | NULL
1 | Max | NULL | 456 | NULL
1 | Max | NULL | NULL | 789
9 | Lisa | 1 | NULL | NULL
9 | Lisa | 3 | NULL | NULL
9 | Lisa | NULL | NULL | Hello
9 | Lisa | 9 | NULL | NULL
I'd like to condense the rows down to the bare minimum.
I want the following result:
Nr. | Name | Value 1 | Value 2 | Value 3
-----+------------+---------+---------+---------
1 | Max | 123 | 456 | 789
9 | Lisa | 1 | NULL | Hello
9 | Lisa | 3 | NULL | NULL
9 | Lisa | 9 | NULL | NULL
For condensing the rows with Max (Nr. 1) a group by of the max values would help.
select nr, name, max(val_1), max(val_2), max(val_3)
from table
group by nr, name
But I am unsure how to get the desired results for Lisa (Nr. 9). The row for Lisa contains a value in the Value 3 column, in this example it's condensed with the first row that matches Nr and Name and has a Null value in Value 3.
I'm thankful for every input!
The basic principle is the same as Vladimir's solution, but this uses UNPIVOT and PIVOT:
with cte as
(
select nr, name, col, val,
rn = row_number() over(partition by nr, name, col order by val)
from [table]
unpivot
(
val
for col in (val_1, val_2, val_3)
) u
)
select *
from (
select nr, name, rn, col, val
from cte
) d
pivot
(
max (val)
for col in ([val_1], [val_2], [val_3])
) p
Here is one way to do it. Assign a unique row number for each column by sorting them in such a way that NULLs come last and then join them back together using these row numbers and remove rows with all NULLs.
Run just the CTE first and examine the intermediate result to understand how it works.
Sample data
DECLARE @T TABLE (Nr varchar(10), Name varchar(10), V1 varchar(10), V2 varchar(10), V3 varchar(10));
INSERT INTO @T VALUES
('1', 'Max ', '123' , NULL , NULL ),
('1', 'Max ', NULL , '456', NULL ),
('1', 'Max ', NULL , NULL , '789'),
('9', 'Lisa', '1' , NULL , NULL ),
('9', 'Lisa', '3' , NULL , NULL ),
('9', 'Lisa', NULL , NULL , 'Hello'),
('9', 'Lisa', '9' , NULL , NULL );
Query
WITH CTE
AS
(
SELECT
Nr
,Name
,V1
,V2
,V3
-- here we use CASE WHEN V1 IS NULL THEN 1 ELSE 0 END to put NULLs last
,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V1 IS NULL THEN 1 ELSE 0 END, V1) AS rn1
,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V2 IS NULL THEN 1 ELSE 0 END, V2) AS rn2
,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V3 IS NULL THEN 1 ELSE 0 END, V3) AS rn3
FROM @T AS T
)
SELECT
T1.Nr
,T1.Name
,T1.V1
,T2.V2
,T3.V3
FROM
CTE AS T1
INNER JOIN CTE AS T2 ON T2.Nr = T1.Nr AND T2.rn2 = T1.rn1
INNER JOIN CTE AS T3 ON T3.Nr = T1.Nr AND T3.rn3 = T1.rn1
WHERE
T1.V1 IS NOT NULL
OR T2.V2 IS NOT NULL
OR T3.V3 IS NOT NULL
ORDER BY
T1.Nr, T1.rn1
;
Result
+----+------+-----+------+-------+
| Nr | Name | V1 | V2 | V3 |
+----+------+-----+------+-------+
| 1 | Max | 123 | 456 | 789 |
| 9 | Lisa | 1 | NULL | Hello |
| 9 | Lisa | 3 | NULL | NULL |
| 9 | Lisa | 9 | NULL | NULL |
+----+------+-----+------+-------+
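The principle behind the row numbers (pack the non-NULL values of each column to the top of its group, then line the columns up again position by position) can also be sketched outside SQL. The following Python snippet is only an illustration of that idea, not a translation of the query: the function name condense is hypothetical, and it groups by both Nr and Name, whereas the query partitions by Nr alone.

```python
from itertools import zip_longest

# Sample data from the question; None stands for NULL.
rows = [
    ("1", "Max", "123", None, None),
    ("1", "Max", None, "456", None),
    ("1", "Max", None, None, "789"),
    ("9", "Lisa", "1", None, None),
    ("9", "Lisa", "3", None, None),
    ("9", "Lisa", None, None, "Hello"),
    ("9", "Lisa", "9", None, None),
]


def condense(rows):
    """Pack the non-NULL values of each column upwards within every group."""
    groups = {}
    for nr, name, *values in rows:
        columns = groups.setdefault((nr, name), [[] for _ in values])
        for column, value in zip(columns, values):
            if value is not None:
                column.append(value)

    condensed = []
    for (nr, name), columns in groups.items():
        # Sorting mirrors the ORDER BY V1 / V2 / V3 inside ROW_NUMBER().
        packed = [sorted(column) for column in columns]
        # zip_longest pads shorter columns with None; rows where every
        # column is exhausted are never produced, like the WHERE filter.
        for combined in zip_longest(*packed):
            condensed.append((nr, name, *combined))
    return condensed


result = condense(rows)
for row in result:
    print(row)
```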

SQL first order, then partition in over clause

I have a problem, that I want to partition over a sorted table. Is there a way I can do that?
I am using SQL Server 2016.
Input Table:
|---------|-----------------|-----------|------------|
| prod | sortcolumn | type | value |
|---------|-----------------|-----------|------------|
| X | 1 | P | 12 |
| X | 2 | P | 23 |
| X | 3 | E | 34 |
| X | 4 | P | 45 |
| X | 5 | E | 56 |
| X | 6 | E | 67 |
| Y | 1 | P | 78 |
|---------|-----------------|-----------|------------|
Desired Output
|---------|-----------------|-----------|------------|------------|
| prod | sortcolumn | type | value | rowNr |
|---------|-----------------|-----------|------------|------------|
| X | 1 | P | 12 | 1 |
| X | 2 | P | 23 | 2 |
| X | 3 | E | 34 | 1 |
| X | 4 | P | 45 | 1 |
| X | 5 | E | 56 | 1 |
| X | 6 | E | 67 | 2 |
| Y | 1 | P | 78 | 1 |
|---------|-----------------|-----------|------------|------------|
I am this far:
SELECT
table.*,
ROW_NUMBER() OVER(PARTITION BY table.prod, table.type ORDER BY table.sortColumn) rowNr
FROM table
But this does not restart the row number on the 4th row, since it is the same prod and type.
How could I restart on every prod and also on every type change based on the sort criteria, even if the type changes back to something it already was previously? Is this even possible with a ROW_NUMBER function or do I have to work with LEAD and LAG and CASES (which would probably make it very slow, right?)
Thanks!
This is a gaps and islands problem. You can use the following query:
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY prod ORDER BY sortcolumn)
-
ROW_NUMBER() OVER (PARTITION BY prod, type ORDER BY sortcolumn) AS grp
FROM mytable t
to get:
prod sortcolumn type value grp
----------------------------------------
X 1 P 12 0
X 2 P 23 0
X 3 E 34 2
X 4 P 45 1
X 5 E 56 3
X 6 E 67 3
Y 1 P 78 0
Now, field grp can be used for partitioning:
;WITH IslandsCTE AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY prod ORDER BY sortcolumn)
-
ROW_NUMBER() OVER (PARTITION BY prod, type ORDER BY sortcolumn) AS grp
FROM mytable t
)
SELECT prod, sortcolumn, type, value,
ROW_NUMBER() OVER (PARTITION BY prod, type, grp ORDER BY sortcolumn) AS rowNr
FROM IslandsCTE
ORDER BY prod, sortcolumn
This is a classic 'islands' problem, in that you need to find the 'islands' of records related by prod and type, but without grouping together all records matching on prod and type.
Here's one way this is typically solved. Set up:
DECLARE @t TABLE (
prod varchar(1),
sortcolumn int,
type varchar(1),
value int
);
INSERT @t VALUES
('X', 1, 'P', 12),
('X', 2, 'P', 23),
('X', 3, 'E', 34),
('X', 4, 'P', 45),
('X', 5, 'E', 56),
('X', 6, 'E', 67),
('Y', 1, 'P', 78)
;
Get some row numbers in place:
;WITH numbered AS (
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY prod, type ORDER BY sortcolumn) as rnX,
ROW_NUMBER() OVER (PARTITION BY prod ORDER BY sortcolumn) as rn
FROM
@t
)
numbered now looks like this:
prod sortcolumn type value rnX rn
---- ----------- ---- ----------- -------------------- --------------------
X 1 P 12 1 1
X 2 P 23 2 2
X 3 E 34 1 3
X 4 P 45 3 4
X 5 E 56 2 5
X 6 E 67 3 6
Y 1 P 78 1 1
Why is this useful? Well, look at the difference between rnX and rn:
prod sortcolumn type value rnX rn rn - rnX
---- ----------- ---- ----------- -------------------- -------------------- --------------------
X 1 P 12 1 1 0
X 2 P 23 2 2 0
X 3 E 34 1 3 2
X 4 P 45 3 4 1
X 5 E 56 2 5 3
X 6 E 67 3 6 3
Y 1 P 78 1 1 0
As you can see, each 'group' shares a rn - rnX value, and this changes from one group to the next.
So now if we partition by prod, type, and group number, then number within that:
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY prod, type, rn - rnX ORDER BY sortcolumn) rowNr
FROM
numbered
ORDER BY
prod, sortcolumn
we're done:
prod sortcolumn type value rnX rn rowNr
---- ----------- ---- ----------- -------------------- -------------------- --------------------
X 1 P 12 1 1 1
X 2 P 23 2 2 2
X 3 E 34 1 3 1
X 4 P 45 3 4 1
X 5 E 56 2 5 1
X 6 E 67 3 6 2
Y 1 P 78 1 1 1
Related reading: Things SQL needs: SERIES()
Try this
select prod, sortcolumn, type, value, row_number() over (partition by prod, sortcolumn, type order by value) rowNr
from table_name

Hide the same cells in SQL Server result

I have a result from tables similar to this:
id | name | type
---+------+------
1 | John | 1
1 | John | 34
2 | Jane | 2
1 | John | 12
2 | Jane | 168
I need to hide repeated values and show only the first occurrence of each. I need to get something like this:
id | name | type
---+------+------
1 | John | 1
| | 34
| | 12
2 | Jane | 2
| | 168
How can I do that in SQL Server 2012?
This is something that your presentation layer ought to handle generally but...
WITH T AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Name, Type) AS RN1,
ROW_NUMBER() OVER (PARTITION BY ID, Name ORDER BY Type) AS RN2
FROM YourTable
)
SELECT
CASE WHEN RN1 = 1 THEN ID END AS ID,
CASE WHEN RN2 = 1 THEN Name END AS Name,
Type
FROM T
ORDER BY ID, Name, Type
If you need to return NULL for the repeated values, you can use the following query:
WITH Src AS
(
SELECT * FROM (VALUES
(1, 'John', 1 ),
(1, 'John', 34 ),
(2, 'Jane', 2 ),
(1, 'John', 12 ),
(2, 'Jane', 168)
)T(id, name, [type])
)
SELECT
CASE WHEN RN=1 THEN id END id,
CASE WHEN RN=1 THEN name END name,
[Type]
FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY id, name ORDER BY [Type]) RN FROM Src) T
It returns:
id name Type
----------- ---- -----------
1 John 1
NULL NULL 12
NULL NULL 34
2 Jane 2
NULL NULL 168
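Without an outer ORDER BY, the row order shown above is not guaranteed. Here is a minimal sketch of the same technique with an explicit ordering, using Python's sqlite3 module (assuming SQLite 3.25 or later) instead of SQL Server. The table name people and the aliases shown_id and shown_name are assumptions for illustration; the aliases differ from the column names so that ORDER BY can still sort on the underlying id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER, name TEXT, type INTEGER)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "John", 1), (1, "John", 34), (2, "Jane", 2),
     (1, "John", 12), (2, "Jane", 168)],
)

query = """
SELECT CASE WHEN rn = 1 THEN id END   AS shown_id,
       CASE WHEN rn = 1 THEN name END AS shown_name,
       type
FROM (
    SELECT id, name, type,
           ROW_NUMBER() OVER (PARTITION BY id, name ORDER BY type) AS rn
    FROM people
)
ORDER BY id, rn  -- sort on the real id, not on the blanked-out value
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```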

Select N Rows With Mixed Values

I have a table with columns like
insertTimeStamp, port, data
1 , 20 , 'aaa'
2 , 20 , 'aba'
3 , 20 , '3aa'
4 , 20 , 'aab'
2 , 21 , 'aza'
5 , 21 , 'aha'
8 , 21 , 'aaa'
15 , 22 , '2aa'
Now I need N Rows (Say 4) from that table, ordered asc by insertTimeStamp.
But if possible, I want to get them from different ports.
So the result should be:
1 , 20 , 'aaa'
2 , 20 , 'aba'
2 , 21 , 'aza'
15 , 22 , '2aa'
If there are not enough different values in port I would like select the remaining ones with the lowest insertTimeStamp.
I create a group_id so that group_id = 1 marks the row with the smallest timestamp for each port.
The second field, time_id, numbers all rows by timestamp, so the ORDER BY returns every group_id = 1 row first and then the remaining rows in timestamp order, regardless of port.
SELECT *
FROM (
SELECT *,
row_number() over (partition by "port" order by "insertTimeStamp") group_id,
row_number() over (order by "insertTimeStamp") time_id
FROM Table1 T
) as T
ORDER BY CASE
WHEN group_id = 1 THEN group_id
ELSE time_id
END
LIMIT 4
OUTPUT
| insertTimeStamp | port | data | group_id | time_id |
|-----------------|------|------|----------|---------|
| 1 | 20 | aaa | 1 | 1 |
| 2 | 21 | aza | 1 | 3 |
| 15 | 22 | 2aa | 1 | 8 |
| 2 | 20 | aba | 2 | 2 |
Use row_number():
select *
from (
select insertTimeStamp, port, data
from (
select *, row_number() over (partition by port order by insertTimeStamp) rn
from a_table
) alias
order by rn, insertTimeStamp
limit 4
) alias
order by 1, 2;
inserttimestamp | port | data
-----------------+------+------
1 | 20 | aaa
2 | 20 | aba
2 | 21 | aza
15 | 22 | 2aa
(4 rows)
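The pick-then-resort logic can be checked with a short script. This is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or later) rather than PostgreSQL; the extra port tiebreaker in the ORDER BY is an assumption added to make the pick deterministic, and the final re-sort is done in Python instead of an outer query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a_table (insertTimeStamp INTEGER, port INTEGER, data TEXT)")
conn.executemany(
    "INSERT INTO a_table VALUES (?, ?, ?)",
    [(1, 20, "aaa"), (2, 20, "aba"), (3, 20, "3aa"), (4, 20, "aab"),
     (2, 21, "aza"), (5, 21, "aha"), (8, 21, "aaa"), (15, 22, "2aa")],
)

query = """
SELECT insertTimeStamp, port, data
FROM (
    SELECT insertTimeStamp, port, data,
           ROW_NUMBER() OVER (PARTITION BY port ORDER BY insertTimeStamp) AS rn
    FROM a_table
)
-- rn = 1 rows (one per port) come first, then the oldest remaining rows
ORDER BY rn, insertTimeStamp, port
LIMIT 4
"""
picked = conn.execute(query).fetchall()
# Present the chosen rows in timestamp order, as the outer query does.
result = sorted(picked)
for row in result:
    print(row)
```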

SQL Query to find winners within a contest

I have a query which returns to me results as follows:
Race | Candidate | Total Votes | MaxNoOfWinners
---------------------------------------------------
1 | 1 | 5000 | 3
1 | 2 | 6700 | 3
2 | 1 | 100 | 3
2 | 2 | 200 | 3
2 | 3 | 300 | 3
2 | 4 | 400 | 3
...
I was wondering if there was a query that could be written to return only the winners (based on the MaxNoOfWinners and TotalVotes) for a certain race. So for the above I would only get back
Race | Candidate | Total Votes | MaxNoOfWinners
---------------------------------------------------
1 | 1 | 5000 | 3
1 | 2 | 6700 | 3
2 | 2 | 200 | 3
2 | 3 | 300 | 3
2 | 4 | 400 | 3
...
Here is a solution... I did not test so there may be typos. The idea is to use the RANK() function of SQL Server to rank the candidates within each Race by votes and exclude those that don't meet the criteria. Note, using RANK() and not ROW_NUMBER() will include ties in the result.
WITH RankedResult AS
(
SELECT Race, Candidate, [Total Votes], MaxNoOfWinners, RANK ( ) OVER (PARTITION BY Race ORDER BY [Total Votes] DESC) AS aRank
FROM Results
)
SELECT Race, Candidate, [Total Votes], MaxNoOfWinners
FROM RankedResult
WHERE aRank <= MaxNoOfWinners
Here's a complete working sample that assumes two tables, Race and Candidate:
Create Table #Race(Race_id int , MaxNoOfwinners int )
INSERT INTO #Race (Race_id , MaxNoOfwinners)
VALUES (1,3),
(2,3),
(3,1)
CREATE TABLE #Candidate (CandidateID int , Race_ID int , Total_Votes int )
INSERT INTO #Candidate (CandidateID , Race_ID , Total_Votes )
VALUES (1,1,5000),
(2,1,6700),
(1,2,100),
(2,2,200),
(3,2,300),
(4,2,400),
(1,3,42),
(2,3,22)
;WITH CTE as (
SELECT
RANK() OVER(PARTITION BY race_id ORDER BY race_id, total_votes DESC ) num,
CandidateID , Race_ID , Total_Votes
From
#Candidate)
SELECT * FROM cte inner join #Race r
on CTE.Race_ID = r.Race_id
and num <= r.MaxNoOfwinners
DROP TABLE #Race
DROP TABLE #Candidate
With the following results
num CandidateID Race_ID Total_Votes Race_id MaxNoOfwinners
-------------------- ----------- ----------- ----------- ----------- --------------
1 2 1 6700 1 3
2 1 1 5000 1 3
1 4 2 400 2 3
2 3 2 300 2 3
3 2 2 200 2 3
1 1 3 42 3 1
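To see how RANK() treats ties, here is a minimal sketch using Python's sqlite3 module (assuming SQLite 3.25 or later) instead of SQL Server. The single table results, its column names, and the tied race 3 are assumptions made up for illustration; they are not part of the question's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE results (race INTEGER, candidate INTEGER, "
    "total_votes INTEGER, max_winners INTEGER)"
)
conn.executemany(
    "INSERT INTO results VALUES (?, ?, ?, ?)",
    [(1, 1, 5000, 3), (1, 2, 6700, 3),
     (2, 1, 100, 3), (2, 2, 200, 3), (2, 3, 300, 3), (2, 4, 400, 3),
     # Hypothetical race with one seat and a tie for first place.
     (3, 1, 42, 1), (3, 2, 42, 1), (3, 3, 10, 1)],
)

query = """
WITH ranked AS (
    SELECT race, candidate, total_votes, max_winners,
           RANK() OVER (PARTITION BY race ORDER BY total_votes DESC) AS rnk
    FROM results
)
SELECT race, candidate, total_votes
FROM ranked
WHERE rnk <= max_winners
ORDER BY race, rnk, candidate
"""
winners = conn.execute(query).fetchall()
for row in winners:
    print(row)
```

Both tied candidates in race 3 get rank 1, so both are returned even though the race has a single seat; ROW_NUMBER() would keep only one of them, chosen arbitrarily.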
WITH q0 AS (SELECT qry.*,
rank() OVER (PARTITION BY race ORDER BY total_votes DESC) AS r
FROM qry)
SELECT q0.race, q0.candidate, q0.total_votes FROM q0 WHERE r<=q0.max_winners;