Select distinct one field other first non empty or null


I have a table:
| Id | val |
| --- | ---- |
| 1 | null |
| 1 | qwe1 |
| 1 | qwe2 |
| 2 | null |
| 2 | qwe4 |
| 3 | qwe5 |
| 4 | qew6 |
| 4 | qwe7 |
| 5 | null |
| 5 | null |
Is there an easy way to select the distinct 'id' values, each with its first non-null 'val' value? If no non-null value exists for an id, the result should be null. For example,
the result should be:
| Id | val |
| --- | ---- |
| 1 | qwe1 |
| 2 | qwe4 |
| 3 | qwe5 |
| 4 | qew6 |
| 5 | null |

In your case a simple GROUP BY should be the solution:
SELECT Id
,MIN(val)
FROM dbo.mytable
GROUP BY Id
Whenever you use GROUP BY, every column that is not listed in the GROUP BY must be wrapped in an aggregate function.
MIN() ignores NULLs, so if an Id has at least one value (val) other than NULL, the smallest such value will be returned.
If there are only NULLs for the Id, NULL will be returned.
As far as I understood (regarding your comment), this is exactly what you are trying to achieve.
If you always want "the first" non-NULL value, you'll need another sort criterion (such as a timestamp column) and can then solve it with a window function.

If you want the first non-NULL value (where "first" is based on id), then MIN() doesn't quite do it. Window functions do:
select t.*
from (select t.*,
row_number() over (partition by id
order by (case when val is not null then 1 else 2 end),
id
) as seqnum
from t
) t
where seqnum = 1;
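The table in the question has no column that says which row comes "first" within an id, so the tie-break needs some ordering column. The following is a minimal sketch, assuming a column such as the pid from the tab1 example further down. It is run through Python's sqlite3 module only to keep the example self-contained (window functions need SQLite 3.25 or later); the function and alias names are illustrative.

```python
import sqlite3

# Sample rows as (pid, id, val); pid is an assumed ordering column.
ROWS = [
    (1, 1, None), (2, 1, "qwe1"), (3, 1, "qwe2"),
    (4, 2, None), (5, 2, "qwe4"), (6, 3, "qwe5"),
    (7, 4, "qew6"), (8, 4, "qwe7"), (9, 5, None), (10, 5, None),
]

QUERY = """
SELECT id, val
FROM (
    SELECT id, val,
           ROW_NUMBER() OVER (
               PARTITION BY id
               -- non-NULL rows first, then the assumed pid column as tie-break
               ORDER BY CASE WHEN val IS NOT NULL THEN 1 ELSE 2 END, pid
           ) AS seqnum
    FROM tab1
) AS ranked
WHERE seqnum = 1
ORDER BY id
"""

def first_non_null_per_id(rows):
    """Return one (id, val) pair per id: the first non-NULL val by pid, else NULL."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tab1 (pid INTEGER, id INTEGER, val TEXT)")
    conn.executemany("INSERT INTO tab1 VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

Unlike MIN(val), this keeps the earliest non-NULL value even when it is not the smallest one.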

Create the table (as in the SQL Fiddle):
CREATE TABLE tab1(pid integer, id integer, val varchar(25));
Insert dummy records:
insert into tab1
values (1, 1 , null),
(2, 1 , 'qwe1' ),
(3, 1 , 'qwe2'),
(4, 2 , null ),
(5, 2 , 'qwe4' ),
(6, 3 , 'qwe5' ),
(7, 4 , 'qew6' ),
(8, 4 , 'qwe7' ),
(9, 5 , null ),
(10, 5 , null );
Then run the query below:
SELECT Id ,MIN(val) as val FROM tab1 GROUP BY Id;

Related

Condense or merge rows with null values not using group by

Let's say I have a select which returns the following Data:
select nr, name, val_1, val_2, val_3
from table
Nr. | Name | Value 1 | Value 2 | Value 3
-----+------------+---------+---------+---------
1 | Max | 123 | NULL | NULL
1 | Max | NULL | 456 | NULL
1 | Max | NULL | NULL | 789
9 | Lisa | 1 | NULL | NULL
9 | Lisa | 3 | NULL | NULL
9 | Lisa | NULL | NULL | Hello
9 | Lisa | 9 | NULL | NULL
I'd like to condense the rows down to the bare minimum.
I want the following result:
Nr. | Name | Value 1 | Value 2 | Value 3
-----+------------+---------+---------+---------
1 | Max | 123 | 456 | 789
9 | Lisa | 1 | NULL | Hello
9 | Lisa | 3 | NULL | NULL
9 | Lisa | 9 | NULL | NULL
For condensing the rows with Max (Nr. 1) a group by of the max values would help.
select nr, name, max(val_1), max(val_2), max(val_3)
from table
group by nr, name
But I am unsure how to get the desired result for Lisa (Nr. 9). One of Lisa's rows contains a value in the Value 3 column; in this example it is merged into the first row that matches Nr and Name and has NULL in Value 3.
I'm thankful for every input!

The basic principle is the same as in Vladimir's solution. This version uses UNPIVOT and PIVOT:
with cte as
(
select nr, name, col, val,
rn = row_number() over(partition by nr, name, col order by val)
from [table]
unpivot
(
val
for col in (val_1, val_2, val_3)
) u
)
select *
from (
select nr, name, rn, col, val
from cte
) d
pivot
(
max (val)
for col in ([val_1], [val_2], [val_3])
) p

Here is one way to do it. Assign a unique row number for each column by sorting them in such a way that NULLs come last and then join them back together using these row numbers and remove rows with all NULLs.
Run just the CTE first and examine the intermediate result to understand how it works.
Sample data
DECLARE @T TABLE (Nr varchar(10), Name varchar(10), V1 varchar(10), V2 varchar(10), V3 varchar(10));
INSERT INTO @T VALUES
('1', 'Max ', '123' , NULL , NULL ),
('1', 'Max ', NULL , '456', NULL ),
('1', 'Max ', NULL , NULL , '789'),
('9', 'Lisa', '1' , NULL , NULL ),
('9', 'Lisa', '3' , NULL , NULL ),
('9', 'Lisa', NULL , NULL , 'Hello'),
('9', 'Lisa', '9' , NULL , NULL );
Query
WITH CTE
AS
(
SELECT
Nr
,Name
,V1
,V2
,V3
-- here we use CASE WHEN V1 IS NULL THEN 1 ELSE 0 END to put NULLs last
,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V1 IS NULL THEN 1 ELSE 0 END, V1) AS rn1
,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V2 IS NULL THEN 1 ELSE 0 END, V2) AS rn2
,ROW_NUMBER() OVER (PARTITION BY Nr ORDER BY CASE WHEN V3 IS NULL THEN 1 ELSE 0 END, V3) AS rn3
FROM @T AS T
)
SELECT
T1.Nr
,T1.Name
,T1.V1
,T2.V2
,T3.V3
FROM
CTE AS T1
INNER JOIN CTE AS T2 ON T2.Nr = T1.Nr AND T2.rn2 = T1.rn1
INNER JOIN CTE AS T3 ON T3.Nr = T1.Nr AND T3.rn3 = T1.rn1
WHERE
T1.V1 IS NOT NULL
OR T2.V2 IS NOT NULL
OR T3.V3 IS NOT NULL
ORDER BY
T1.Nr, T1.rn1
;
Result
+----+------+-----+------+-------+
| Nr | Name | V1 | V2 | V3 |
+----+------+-----+------+-------+
| 1 | Max | 123 | 456 | 789 |
| 9 | Lisa | 1 | NULL | Hello |
| 9 | Lisa | 3 | NULL | NULL |
| 9 | Lisa | 9 | NULL | NULL |
+----+------+-----+------+-------+
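To see why numbering each column separately and joining on the row numbers condenses the rows, the same idea can be written outside SQL. The following is a rough Python sketch, not the answer's method itself: within each (Nr, Name) group it sorts the non-NULL values of every column and lines the columns up by position. The function name condense is hypothetical.

```python
from itertools import groupby, zip_longest

def condense(rows):
    """rows: iterable of (nr, name, v1, v2, v3) tuples, with None standing in for NULL."""
    def group_key(row):
        return (row[0], row[1])

    out = []
    for (nr, name), grp in groupby(sorted(rows, key=group_key), key=group_key):
        grp = list(grp)
        # Per value column, keep only the non-NULL values in sorted order.
        # This mirrors the "NULLs last" ROW_NUMBER ordering in the CTE.
        columns = [sorted(r[i] for r in grp if r[i] is not None) for i in (2, 3, 4)]
        # Align the columns by position (the join on rn1 = rn2 = rn3);
        # shorter columns are padded with None, all-NULL rows never appear.
        for v1, v2, v3 in zip_longest(*columns):
            out.append((nr, name, v1, v2, v3))
    return out
```

Each group ends up with as many rows as its longest column of non-NULL values, which is why Max collapses to one row and Lisa to three.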

Replace value in column based on another column

I have the following table:
+----+--------+------------+----------------------+
| ID | Name | To_Replace | Replaced |
+----+--------+------------+----------------------+
| 1 | Fruits | 1 | Fruits |
| 2 | Apple | 1-2 | Fruits-Apple |
| 3 | Citrus | 1-3 | Fruits-Citrus |
| 4 | Orange | 1-3-4 | Fruits-Citrus-Orange |
| 5 | Empire | 1-2-5 | Fruits-Apple-Empire |
| 6 | Fuji | 1-2-6 | Fruits-Apple-Fuji |
+----+--------+------------+----------------------+
How can I create the column Replaced ? I thought of creating 10 maximum columns (I know there are no more than 10 nested levels) and fetch the ID from every substring split by '-', and then concatenating them if not null into Replaced, but I think there is a simpler solution.

While what you ask for is technically feasible (probably using a recursive query or a tally), I will take a different stance and suggest that you fix your data model instead.
You should not be storing multiple values as a delimited list in a single database column. This defeats the purpose of a relational database, and makes simple things both unnecessarily complicated and inefficient.
Instead, you should have a separate table to store that data, with each replacement id on a separate row, and possibly a column that indicates the sequence of each element in the list.
For your sample data, this would look like:
id replace_id seq
1 1 1
2 1 1
2 2 2
3 1 1
3 3 2
4 1 1
4 3 2
4 4 3
5 1 1
5 2 2
5 5 3
6 1 1
6 2 2
6 6 3
Now you can efficiently generate the expected result with either a join, a subquery, or a lateral join. Assuming that your table is called mytable and that the mapping table is mymapping, the lateral join solution would be:
select t.*, x.replaced
from mytable t
outer apply (
select string_agg(t1.name, '-') within group(order by m.seq) replaced
from mymapping m
inner join mytable t1 on t1.id = m.replace_id
where m.id = t.id
) x
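The normalized design can be tried out without SQL Server. The following is a minimal sketch using Python's sqlite3 module, with the table names mytable and mymapping taken from the answer. Since ordered string aggregation differs between engines, the sketch only joins and sorts in SQL and concatenates the names in Python; the function name replaced_paths is hypothetical.

```python
import sqlite3
from itertools import groupby

NAMES = [(1, "Fruits"), (2, "Apple"), (3, "Citrus"),
         (4, "Orange"), (5, "Empire"), (6, "Fuji")]

# (id, replace_id, seq): one row per element of the former delimited list.
MAPPING = [
    (1, 1, 1),
    (2, 1, 1), (2, 2, 2),
    (3, 1, 1), (3, 3, 2),
    (4, 1, 1), (4, 3, 2), (4, 4, 3),
    (5, 1, 1), (5, 2, 2), (5, 5, 3),
    (6, 1, 1), (6, 2, 2), (6, 6, 3),
]

def replaced_paths(names, mapping):
    """Return {id: 'Name-Name-...'} built from the mapping rows in seq order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE mymapping (id INTEGER, replace_id INTEGER, seq INTEGER)")
    conn.executemany("INSERT INTO mytable VALUES (?, ?)", names)
    conn.executemany("INSERT INTO mymapping VALUES (?, ?, ?)", mapping)
    rows = conn.execute("""
        SELECT m.id, t1.name
        FROM mymapping AS m
        INNER JOIN mytable AS t1 ON t1.id = m.replace_id
        ORDER BY m.id, m.seq
    """).fetchall()
    conn.close()
    # Rows arrive sorted by id and seq, so joining each group keeps list order.
    return {key: "-".join(name for _, name in grp)
            for key, grp in groupby(rows, key=lambda r: r[0])}
```

Because seq is stored explicitly, the order of the names no longer depends on how a string happens to be split.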

You can try something like this:
DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
( 1, 'Fruits', '1' ),
( 2, 'Apple', '1-2' ),
( 3, 'Citrus', '1-3' ),
( 4, 'Orange', '1-3-4' ),
( 5, 'Empire', '1-2-5' ),
( 6, 'Fuji', '1-2-6' );
SELECT
*
FROM @Data AS d
OUTER APPLY (
SELECT STRING_AGG ( [Name], '-' ) AS Replaced FROM @Data WHERE ID IN (
SELECT CAST ( [value] AS INT ) FROM STRING_SPLIT ( d.To_Replace, '-' )
)
) List
ORDER BY ID;
Returns
+----+--------+------------+----------------------+
| ID | Name | To_Replace | Replaced |
+----+--------+------------+----------------------+
| 1 | Fruits | 1 | Fruits |
| 2 | Apple | 1-2 | Fruits-Apple |
| 3 | Citrus | 1-3 | Fruits-Citrus |
| 4 | Orange | 1-3-4 | Fruits-Citrus-Orange |
| 5 | Empire | 1-2-5 | Fruits-Apple-Empire |
| 6 | Fuji | 1-2-6 | Fruits-Apple-Fuji |
+----+--------+------------+----------------------+
UPDATE
Ensure the id list order is maintained when aggregating names.
DECLARE @Data TABLE ( ID INT, [Name] VARCHAR(10), To_Replace VARCHAR(10) );
INSERT INTO @Data ( ID, [Name], To_Replace ) VALUES
( 1, 'Fruits', '1' ),
( 2, 'Apple', '1-2' ),
( 3, 'Citrus', '1-3' ),
( 4, 'Orange', '1-3-4' ),
( 5, 'Empire', '1-2-5' ),
( 6, 'Fuji', '1-2-6' ),
( 7, 'Test', '6-2-7' );
SELECT
*
FROM @Data AS d
OUTER APPLY (
SELECT STRING_AGG ( [Name], '-' ) AS Replaced FROM (
SELECT TOP 100 PERCENT
Names.[Name]
FROM ( SELECT CAST ( '<ids><id>' + REPLACE ( d.To_Replace, '-', '</id><id>' ) + '</id></ids>' AS XML ) AS id_list ) AS xIds
CROSS APPLY (
SELECT
x.f.value('.', 'INT' ) AS name_id,
ROW_NUMBER() OVER ( ORDER BY ( SELECT NULL ) ) AS row_id
FROM xIds.id_list.nodes('//ids/id') x(f)
) AS ids
INNER JOIN @Data AS Names ON Names.ID = ids.name_id
ORDER BY row_id
) AS x
) List
ORDER BY ID;
Returns
+----+--------+------------+----------------------+
| ID | Name | To_Replace | Replaced |
+----+--------+------------+----------------------+
| 1 | Fruits | 1 | Fruits |
| 2 | Apple | 1-2 | Fruits-Apple |
| 3 | Citrus | 1-3 | Fruits-Citrus |
| 4 | Orange | 1-3-4 | Fruits-Citrus-Orange |
| 5 | Empire | 1-2-5 | Fruits-Apple-Empire |
| 6 | Fuji | 1-2-6 | Fruits-Apple-Fuji |
| 7 | Test | 6-2-7 | Fuji-Apple-Test |
+----+--------+------------+----------------------+
I'm sure there's optimization that can be done here, but this solution seems to guarantee the list order is kept.

SQL Combining multiple rows into one

I want to merge multiple rows into one and keep only the values that are not NULL.
Here is what I want to achieve.
I want to go from this:
+----+-----------------+-----------------+-----------------+--------------------+
| ID | 1stNotification | 2ndNotification | 3rdNotification | NotificationNumber |
+----+-----------------+-----------------+-----------------+--------------------+
| 1 | 01.01.2019 | NULL | NULL | 1 |
+----+-----------------+-----------------+-----------------+--------------------+
| 1 | NULL | 02.02.2019 | NULL | 2 |
+----+-----------------+-----------------+-----------------+--------------------+
| 1 | NULL | NULL | 03.03.2019 | 3 |
+----+-----------------+-----------------+-----------------+--------------------+
| 2 | 06.01.2019 | NULL | NULL | 1 |
+----+-----------------+-----------------+-----------------+--------------------+
| 2 | NULL | 09.02.2019 | NULL | 2 |
+----+-----------------+-----------------+-----------------+--------------------+
| 2 | NULL | NULL | 11.03.2019 | 3 |
+----+-----------------+-----------------+-----------------+--------------------+
to this:
+----+-----------------+-----------------+-----------------+
| ID | 1stNotification | 2ndNotification | 3rdNotification |
+----+-----------------+-----------------+-----------------+
| 1 | 01.01.2019 | 02.02.2019 | 03.03.2019 |
+----+-----------------+-----------------+-----------------+
| 2 | 06.01.2019 | 09.02.2019 | 11.03.2019 |
+----+-----------------+-----------------+-----------------+
I tried something like:
SELECT
ID,
MAX(CASE WHEN a.NotificationNumber = 1 THEN 1stNotification END)1stNotification,
MAX(CASE WHEN a.NotificationNumber = 2 THEN 2ndNotification END)2ndNotification,
MAX(CASE WHEN a.NotificationNumber = 3 THEN 3rdNotification END)3rdNotification
FROM Notifications
GROUP BY ID
But that did not give me my expected results unfortunately.
Would really appreciate if someone could help me out :)

You just need to use MAX without any CASE. MAX ignores NULLs, so each column keeps its single non-NULL value per ID:
SELECT
ID,
MAX(1stNotification) AS 1stNotification,
MAX(2ndNotification) AS 2ndNotification,
MAX(3rdNotification) AS 3rdNotification
FROM Notifications
GROUP BY ID
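The behaviour can be checked quickly with Python's sqlite3 module. This is only a sketch: the column names n1, n2 and n3 are stand-ins chosen here for the three notification columns, so that no identifier starts with a digit, and merge_notifications is a hypothetical helper.

```python
import sqlite3

# (ID, n1, n2, n3, NotificationNumber); None stands in for NULL.
ROWS = [
    (1, "01.01.2019", None, None, 1),
    (1, None, "02.02.2019", None, 2),
    (1, None, None, "03.03.2019", 3),
    (2, "06.01.2019", None, None, 1),
    (2, None, "09.02.2019", None, 2),
    (2, None, None, "11.03.2019", 3),
]

def merge_notifications(rows):
    """Collapse the rows of each ID into one row, keeping the non-NULL values."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Notifications "
        "(ID INTEGER, n1 TEXT, n2 TEXT, n3 TEXT, NotificationNumber INTEGER)"
    )
    conn.executemany("INSERT INTO Notifications VALUES (?, ?, ?, ?, ?)", rows)
    # MAX skips NULLs, so with one non-NULL value per column and ID
    # the aggregate simply returns that value.
    result = conn.execute("""
        SELECT ID, MAX(n1), MAX(n2), MAX(n3)
        FROM Notifications
        GROUP BY ID
        ORDER BY ID
    """).fetchall()
    conn.close()
    return result
```

If a column held several non-NULL values for the same ID, MAX would pick the largest by string comparison, which for dd.mm.yyyy text is not necessarily the latest date.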

I think you need something like this...
; with cte as (
select 1 as id, 'dd' as not1, null as not2, null as not3 , 1 as notifications
union all
select 1, null, 'df', null , 2
union all
select 1, null, null, 'vc', 3
union all
select 2, 'ws', null, null, 1
union all
select 2, null, 'xs', null, 2
union all
select 2, null, null, 'nm', 3
)
, ct as (
select id, coalesce(not1, not2, not3) as ol, notifications ,
'notification' + cast(notifications as varchar(5)) as Col
from cte
)
select * from (
select id, ol, col from ct )
as d
pivot (
max(ol) for col in ( [notification1], [notification2], [notification3] )
) as P
As I understand it, the notification columns in your result correspond to the notification number stored in each row.

Postgresql : Mark the first row of a group

I have a table t like this :
id | group_id | name
------------------------
1 | 1 | richard
2 | 1 | ray
3 | 2 | enzo
4 | 2 | shiela
5 | 2 | anne
I have no problem selecting each group, however I want to mark the first occurrence for each group by group_id. Then add it as column to mark that the row is the first occurrence of that group.
E.g, Richard for group 1, or Enzo for group 2 and so on.
I should be able to use:
select
t.*,
case
when (condition)
...(boolean result here)
end as is_first_row
from t
and get this result:
id | group_id | name |is_first_row
-------------------------------
1 | 1 | richard | t
2 | 1 | ray | f
3 | 2 | enzo | t
4 | 2 | shiela | f
5 | 2 | anne | f
How do I formulate the condition statement for the select query?

Use row_number():
with my_table(id, group_id, name) as (
values
(1, 1, 'richard'),
(2, 1, 'ray'),
(3, 2, 'enzo'),
(4, 2, 'shiela'),
(5, 2, 'anne')
)
select *, row_number() over w = 1 as is_first_row
from my_table
window w as (partition by group_id order by id);
id | group_id | name | is_first_row
----+----------+---------+--------------
1 | 1 | richard | t
2 | 1 | ray | f
3 | 2 | enzo | t
4 | 2 | shiela | f
5 | 2 | anne | f
(5 rows)
Select row_number() to see how it works. Row numbers are calculated in partitions by group_id i.e. for every group_id separately, in order by id:
with my_table(id, group_id, name) as (
values
(1, 1, 'richard'),
(2, 1, 'ray'),
(3, 2, 'enzo'),
(4, 2, 'shiela'),
(5, 2, 'anne')
)
select *, row_number() over w
from my_table
window w as (partition by group_id order by id);
id | group_id | name | row_number
----+----------+---------+------------
1 | 1 | richard | 1
2 | 1 | ray | 2
3 | 2 | enzo | 1
4 | 2 | shiela | 2
5 | 2 | anne | 3
(5 rows)
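The same comparison works in other engines that support window functions. The following is a small sketch with Python's sqlite3 module (SQLite 3.25 or later), assuming a table named t as in the question; there the flag comes back as the integer 1 or 0 rather than PostgreSQL's t and f. The function name mark_first_rows is hypothetical.

```python
import sqlite3

ROWS = [
    (1, 1, "richard"),
    (2, 1, "ray"),
    (3, 2, "enzo"),
    (4, 2, "shiela"),
    (5, 2, "anne"),
]

QUERY = """
SELECT id, group_id, name,
       -- numbering restarts at 1 for every group_id, so "= 1" marks the first row
       ROW_NUMBER() OVER (PARTITION BY group_id ORDER BY id) = 1 AS is_first_row
FROM t
ORDER BY id
"""

def mark_first_rows(rows):
    """Return (id, group_id, name, is_first_row) with is_first_row as 1 or 0."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER, group_id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```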

Please check my answer and let me know if there is any error in the logic:
Create Table #Temp(id int,group_id int,name nvarchar(max))
Insert into #Temp values
(1,1,'richard')
,(2,1,'ray')
,(3,2,'enzo')
,(4,2,'shiela')
,(5,2,'anne')
Select t2.id,t2.group_id,t2.name,t1.group_id_c, case
when t1.group_id_c=1 then 't'
else 'f'
end AS is_firstrow from #temp t2 join
(Select t.*, row_number() over (partition by group_id order by id) as group_id_c from #Temp t ) t1
on t1.id=t2.id

Querying data groups with total row before starting next group?

I need to query some data in the below format in SQL Server:
Id Group Price
1 A 10
2 A 20
Sum 30
1 B 6
2 B 4
Sum 10
1 C 100
2 C 200
Sum 300
I was thinking of doing it in the following steps:
1. Query one group.
2. Compute the sum in another query.
3. Use the UNION operator to combine these result sets.
4. Repeat steps 1-3 for all groups and finally return all subsets of data using UNION.
Is there a better way to do this? Maybe using some out-of-the-box feature? Please advise.
Edit:
As per suggestions and code sample I tried this code:
Select
Case
when id is null then 'SUM'
else CAST(id as Varchar(10)) end as ID,
Case when [group] is null then 'ALL' else CAST([group] as Varchar(50)) end as [group]
,Price from
(
SELECT Id, [Group],BGAApplicationID,Section, SUM(PrimaryTotalArea) AS price
FROM vwFacilityDetails
where bgaapplicationid=1102
GROUP BY Id, [Group],BGAApplicationID,Section WITH ROLLUP
) a
I also tried this code:
Select Id, [Group],BGAApplicationID,Section, SUM(PrimaryTotalArea) AS price
From vwFacilityDetails
Where Not ([group] Is Null And id Is Null And BGAApplicationId is null and section is null) and BGAApplicationId=1102
Group By Id, [Group],BGAApplicationID,Section
With Rollup
The results are grouped, but every record is shown 3 times (with both queries above), like this:
2879 Existing Facilities Whole School 25.00
2879 Existing Facilities Whole School 25.00
2879 Existing Facilities Whole School 25.00
2879 ALL 25.00
I guess there is some issue in my query; please guide me here as well.
Thanks

SQL Server introduced GROUPING SETS which is what you should be looking to use.
MS SQL Server 2008 Schema Setup:
Create Table vwFacilityDetails (
id int not null,
[group] char(1) not null,
PrimaryTotalArea int not null,
Section int,
bgaapplicationid int
);
Insert Into vwFacilityDetails (id, [group], Section,bgaapplicationid,PrimaryTotalArea) values
(1, 'A', 1,1102,2),
(1, 'A', 1,1102,1),
(1, 'A', 1,1102,7),
(2, 'A', 1,1102,20),
(1, 'B', 1,1102,6),
(2, 'B', 1,1102,4),
(1, 'C', 1,1102,100),
(2, 'C', 1,1102,200);
Query 1:
SELECT CASE WHEN Id is null then 'SUM'
ELSE Right(Id,10) end Id,
[Group],BGAApplicationID,Section,
SUM(PrimaryTotalArea) price
FROM vwFacilityDetails
where bgaapplicationid=1102
GROUP BY GROUPING SETS (
(Id,[Group],BGAApplicationID,Section),
([Group])
)
ORDER BY [GROUP],
ID;
Results:
| ID | GROUP | BGAAPPLICATIONID | SECTION | PRICE |
----------------------------------------------------
| 1 | A | 1102 | 1 | 10 |
| 2 | A | 1102 | 1 | 20 |
| SUM | A | (null) | (null) | 30 |
| 1 | B | 1102 | 1 | 6 |
| 2 | B | 1102 | 1 | 4 |
| SUM | B | (null) | (null) | 10 |
| 1 | C | 1102 | 1 | 100 |
| 2 | C | 1102 | 1 | 200 |
| SUM | C | (null) | (null) | 300 |
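On an engine without GROUPING SETS the same layout can be approximated with UNION ALL, which is roughly the plan the question started from. The following is a sketch with Python's sqlite3 module, since SQLite has neither GROUPING SETS nor ROLLUP. The table facility and its columns grp and area are simplified stand-ins for the view in the answer, and rows_with_group_totals is a hypothetical name.

```python
import sqlite3

# (id, grp, area): a trimmed-down version of the sample data above.
ROWS = [
    (1, "A", 2), (1, "A", 1), (1, "A", 7), (2, "A", 20),
    (1, "B", 6), (2, "B", 4),
    (1, "C", 100), (2, "C", 200),
]

QUERY = """
SELECT label, grp, price
FROM (
    -- detail rows: one per (grp, id)
    SELECT CAST(id AS TEXT) AS label, grp, SUM(area) AS price,
           0 AS is_total, id AS sort_id
    FROM facility
    GROUP BY grp, id
    UNION ALL
    -- one total row per grp
    SELECT 'SUM', grp, SUM(area), 1, NULL
    FROM facility
    GROUP BY grp
) AS u
-- is_total pushes each total row behind the detail rows of its group
ORDER BY grp, is_total, sort_id
"""

def rows_with_group_totals(rows):
    """Return detail rows per group, each group followed by its 'SUM' row."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facility (id INTEGER, grp TEXT, area INTEGER)")
    conn.executemany("INSERT INTO facility VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result
```

GROUPING SETS produces both levels in a single aggregation step, so it remains the better fit where the engine supports it.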

Select
id,
[Group],
SUM(price) AS price
From
Test
Group By
[group],
id
With
Rollup
http://sqlfiddle.com/#!3/080cd/8

Select Case when id is null then 'SUM' else CAST(id as Varchar(10)) end as ID
,Case when [group] is null then 'ALL' else CAST([group] as Varchar(10)) end as [group]
,Price from
(
SELECT id, [group], SUM(price) AS Price
FROM IG
GROUP BY [GROUP],ID WITH ROLLUP
) a
