Fill NULL value with progressive row_number over partition function - sql

What I have
From the following #MyTable I just have Name and Number columns.
My goal is to fill the valus where Number = NULL with a progressive number and get the values I have wrote into the Desidered_col column.
+------+--------+---------------+
| Name | Number | Desidered_col |
+------+--------+---------------+
| John | 1 | 1 |
| John | 2 | 2 |
| John | 3 | 3 |
| John | NULL | 4 |
| John | NULL | 5 |
| John | 6 | 6 |
| Mike | 1 | 1 |
| Mike | 2 | 2 |
| Mike | NULL | 3 |
| Mike | 4 | 4 |
| Mike | 5 | 5 |
| Mike | 6 | 6 |
+------+--------+---------------+
What I have tried
I have tried with the following query
SELECT Name, Number, row_number() OVER(PARTITION BY [Name] ORDER BY Number ASC) AS rn
FROM #MyTable
but it put all the NULL values first and then count the rows.
How can I fill the empty values?
Why I don't think is a duplicate question
I have read this question and this question but I don't think it is duplicate because they don't consider the PARTITION BY construct.
This is the script to create and populate the table
SELECT *
INTO #MyTable
FROM (
SELECT 'John' AS [Name], 1 AS [Number], 1 AS [Desidered_col] UNION ALL
SELECT 'John' AS [Name], 2 AS [Number], 2 AS [Desidered_col] UNION ALL
SELECT 'John' AS [Name], 3 AS [Number], 3 AS [Desidered_col] UNION ALL
SELECT 'John' AS [Name], NULL AS [Number], 4 AS [Desidered_col] UNION ALL
SELECT 'John' AS [Name], NULL AS [Number], 5 AS [Desidered_col] UNION ALL
SELECT 'John' AS [Name], 6 AS [Number], 6 AS [Desidered_col] UNION ALL
SELECT 'Mike' AS [Name], 1 AS [Number], 1 AS [Desidered_col] UNION ALL
SELECT 'Mike' AS [Name], 2 AS [Number], 2 AS [Desidered_col] UNION ALL
SELECT 'Mike' AS [Name], NULL AS [Number], 3 AS [Desidered_col] UNION ALL
SELECT 'Mike' AS [Name], 4 AS [Number], 4 AS [Desidered_col] UNION ALL
SELECT 'Mike' AS [Name], 5 AS [Number], 5 AS [Desidered_col] UNION ALL
SELECT 'Mike' AS [Name], 6 AS [Number], 6 AS [Desidered_col]
) A

This query is a bit complicated but seems to return your expected result. The only case it may be wrong is when someone does not have Number = 1.
The idea is that you must find gaps between numbers and count how many null values can be used to fill them.
Sample data
create table #myTable (
[Name] varchar(20)
, [Number] int
)
insert into #myTable
insert into #myTable
SELECT 'John' AS [Name], 1 AS [Number] UNION ALL
SELECT 'John' AS [Name], 2 AS [Number]UNION ALL
SELECT 'John' AS [Name], 3 AS [Number] UNION ALL
SELECT 'John' AS [Name], NULL AS [Number] UNION ALL
SELECT 'John' AS [Name], NULL AS [Number] UNION ALL
SELECT 'John' AS [Name], 6 AS [Number] UNION ALL
SELECT 'Mike' AS [Name], 1 AS [Number] UNION ALL
SELECT 'Mike' AS [Name], 2 AS [Number] UNION ALL
SELECT 'Mike' AS [Name], NULL AS [Number] UNION ALL
SELECT 'Mike' AS [Name], 4 AS [Number] UNION ALL
SELECT 'Mike' AS [Name], 5 AS [Number] UNION ALL
SELECT 'Mike' AS [Name], 6 AS [Number]
Query
;with gaps_between_numbers as (
select
t.Name, cnt = t.nextNum - t.Number - 1, dr = dense_rank() over (partition by t.Name order by t.Number)
, rn = row_number() over (partition by t.Name order by t.Number)
from (
select
Name, Number, nextNum = isnull(lead(Number) over (partition by Name order by number), Number + 1)
from
#myTable
where
Number is not null
) t
join master.dbo.spt_values v on t.nextNum - t.Number - 1 > v.number
where
t.nextNum - t.Number > 1
and v.type = 'P'
)
, ordering_nulls as (
select
t.Name, dr = isnull(q.dr, 2147483647)
from (
select
Name, rn = row_number() over (partition by Name order by (select 1))
from
#myTable
where
Number is null
) t
left join gaps_between_numbers q on t.Name = q.Name and t.rn = q.rn
)
, ordering_not_null_numbers as (
select
Name, Number, rn = dense_rank() over (partition by Name order by gr)
from (
select
Name, Number, gr = sum(lg) over (partition by Name order by Number)
from (
select
Name, Number, lg = iif(Number - lag(Number) over (partition by Name order by Number) = 1, 0, 1)
from
#myTable
where
Number is not null
) t
) t
)
select
Name, Number
, Desidered_col = row_number() over (partition by Name order by rn, isnull(Number, 2147483647))
from (
select * from ordering_not_null_numbers
union all
select Name, null, dr from ordering_nulls
) t
CTE gaps_between_numbers is seeking for numbers that are not consecutive. Number difference between current and next row shows how many NULL values can be used to fill the gaps. Then master.dbo.spt_values is used to multiply each row by that amount. In gaps_between_numbers dr column is gap number and cnt is amount of NULL values that need to used.
ordering_nulls orders only NULL values and is joined with CTE gaps_between_numbersto know in which position each row should appear.
ordering_not_null_numbers orders values that are not NULL. Consecutive Numbers will have same row number
And last step is to union CTE's ordering_not_null_numbers and ordering_nulls and make desired ordering
Rextester DEMO

In order to do this, you need a column that specifies the order of the rows in the table. You can do this using the identity() function:
SELECT identity(int, 1, 1) as MyTableId, a.*
INTO #MyTable
. . .
I'm pretty sure SQL Server will follow the ordering of a values() statement and in practice will follow the ordering of a union all. You can explicitly put this column in each row, if you prefer.
Then you can use this to assign your value:
select t.*,
row_number() over (partition by name order by mytableid) as desired_col
from #MyTable

You could also assign the new ranking based on Desidered_col using row_number() function with ORDER BY clause (select 1 or select null)
select *,
row_number() over (partition by Name order by (select 1)) New_Desidered_col
from #MyTable

Related

delete records if 2 counts match

I have the following table with data
+---------+--------+--------+
| Trans | Status | Step |
+---------+--------+--------+
| ABCDE | Comple | 1
+---------+--------+--------+
| ABCDE | Error | 2
+---------+--------+--------+
| FGHIJ | Comple | 1
+---------+--------+--------+
| FGHIJ | Comple | 2
+---------+--------+--------+
| KLMNO | Comple | 1
+---------+--------+--------+
| KLMNO | Comple | 2
+---------+--------+--------+
I only want to delete the records where the count of trans and status = 'comple' is the same as count of trans
I could probably do a cursor and loop but that is something I don't want to do.
Thinking along the lines of a having count but probably miles off.
Thanks
I just want to delete FGHIJ and KLMNO as I know all the steps completed.
I want to keep abcde as not all steps completed
Is this what you want?
--delete
select *
from yourtable t
where not exists
(
select 1
from yourtable t2
where t1.Trans=t2.Trans
and status<>'Comple'
)
Use it as it is first to make sure this is what you want to remove, then comment out the SELECT and uncomment the delete
DELETE FROM tbl
WHERE Trans IN ( SELECT trans
FROM tbl
GROUP BY Trans
HAVING COUNT(*) = SUM(CASE WHEN status = 'Comple' THEN 1 ELSE 0 END) )
It deletes rows (by filtering on Trans), where the number of rows in a group of trans is equal to the number of rows in this group with the Comple status
http://sqlfiddle.com/#!18/1768b/10
Try this
DECLARE #Data AS TABLE (Trans Varchar(10) , Status Varchar(10) , Step INT )
INSERT INTO #Data
SELECT 'ABCDE', 'Comple' , 1 UNION ALL
SELECT 'ABCDE', 'Error' , 2 UNION ALL
SELECT 'FGHIJ', 'Comple' , 1 UNION ALL
SELECT 'FGHIJ', 'Comple' , 2 UNION ALL
SELECT 'KLMNO', 'Comple' , 1 UNION ALL
SELECT 'KLMNO', 'Comple' , 2
SELECT * FROM #Data
;WITH CTe
AS
(
SELECT * , ROW_NUMBER()OVER(PArtition by Trans , [Status] ORDER BY Trans) AS TwocountsMatch
FROM #Data
)
DELETE FROM CTe WHERE TwocountsMatch=2 AND [Status]='Comple'
SELECT * FROM #Data
Demo Result :http://rextester.com/TYK92230
If I understand you correctly, you want to delete all of a [Trans] rows, if all of those rows [status] are 'Comple'?
DECLARE #Data AS TABLE (Trans Varchar(10) , Status Varchar(10) , Step INT )
INSERT INTO #Data
SELECT 'ABCDE', 'Comple' , 1 UNION ALL
SELECT 'ABCDE', 'Error' , 2 UNION ALL
SELECT 'FGHIJ', 'Comple' , 1 UNION ALL
SELECT 'FGHIJ', 'Comple' , 2 UNION ALL
SELECT 'KLMNO', 'Comple' , 1 UNION ALL
SELECT 'KLMNO', 'Comple' , 2
SELECT * FROM #Data;
WITH
summarised
AS
(
SELECT
* ,
COUNT(*) OVER (PARTITION BY [trans] ) AS count_trans,
COUNT(*) OVER (PARTITION BY [trans], [status]) AS count_trans_status
FROM
#Data
)
DELETE
summarised
WHERE
[status] = 'Comple'
AND count_trans = count_trans_status
;
SELECT * FROM #Data

Order rows by values

I trying order table by rank, but rows which have position value - have to have position according to value in position field. It is possible do it without additional tables, views etc?
I have table like this:
rank | position | name
999 | 10 | txt1
200 | 4 | txt2
32 | 1 | txt3
1200 | 2 | txt4
123 | null | txt5
234 | null | txt6
567 | null | txt7
234 | null | txt8
432 | null | txt9
877 | null | txt10
Desired output have to look like this:
rank | position | name
32 | 1 | txt3
1200 | 2 | txt4
877 | null | txt10
200 | 4 | txt2
567 | null | txt7
432 | null | txt9
345 | null | txt8
234 | null | txt6
123 | null | txt5
999 | 10 | txt1
Here is an idea. Assign the proper ordering to each row. Then, if the position is available use that instead. When there are ties, put the position value first:
select t.*
from (select t.*, row_number() over (order by rank desc) as seqnum
from t
) t
order by (case when position is not null then position else seqnum end),
(case when position is not null then 1 else 2 end);
SQL Fiddle doesn't seem to be working these days, but this query demonstrates the results:
with t(rank, position, t) as (
select 999, 10, 'txt1' union all
select 200, 4, 'txt2' union all
select 32 , 1, 'txt3' union all
select 1200, 2, 'txt4' union all
select 123, null, 'txt5' union all
select 234, null, 'txt6' union all
select 567, null, 'txt7' union all
select 234, null, 'txt8' union all
select 432, null, 'txt9' union all
select 877, null , 'txt10'
)
select t.*
from (select t.*, row_number() over (order by rank desc) as seqnum
from t
) t
order by (case when position is not null then position else seqnum end),
(case when position is not null then 1 else 2 end);
EDIT;
When I wrote the above, I had a nagging suspicion of a problem. Here is a solution that should work. It is more complicated but it does produce the right numbers:
with t(rank, position, t) as (
select 999, 10, 'txt1' union all
select 200, 4, 'txt2' union all
select 32 , 1, 'txt3' union all
select 1200, 2, 'txt4' union all
select 123, null, 'txt5' union all
select 234, null, 'txt6' union all
select 567, null, 'txt7' union all
select 234, null, 'txt8' union all
select 432, null, 'txt9' union all
select 877, null , 'txt10'
)
select *
from (select t.*, g.*,
row_number() over (partition by t.position order by t.rank) gnum
from generate_series(1, 10) g(n) left join
t
on t.position = g.n
) tg left join
(select t.*,
row_number() over (partition by t.position order by t.rank) as tnum
from t
) t
on tg.gnum = t.tnum and t.position is null
order by n;
This is a weird sort of interleaving problem. The idea is to create slots (using generate series) for the positions. Then, assign the known positions to the slots. Finally, enumerate the remaining slots and assign the values there.
Note: I hard-coded 10, but it is easy enough to put in count(*) from the table there.
suppose you stored data in table1.
Then you should update column "position" as follows:
update a
set position = x.pos_null
from table1
a
inner join
(
select
a.name,
COUNT(a.rank) as pos_null
from
(
select
*
from table1
where position is null
)
a
left join
(
select
*
from table1
)
b
on a.rank <= b.rank
group by
a.name
)
x
on a.name = x.name
select * from table1 order by position
Bye,
Angelo.

select based on specific values

I have this table:
ID NO.
111 6
222 7
333 9
111 8
333 4
222 3
111 7
222 5
333 2
I want to select only 2 ID numbers from table where NO. column equal specific values.
For example i tried this query but i didn't get the expected result:
SELECT top 2 * FROM mytable where NO. in
(select NO. from mytable )
Expected result:
111 6
111 8
222 7
222 3
333 9
333 3
You seem to want to select two rows in the table for each id, based on a condition on the No column. For this, one method uses row_number():
select t.*
from (select t.*, row_number() over (partition by id order by id) as seqnum
from mytable t
where <condition goes here>
) t
where seqnum <= 2;
I'm guessing (333,3) is a mistake and you expect (333,2). If not I have no idea.
SELECT
ua.ID
, ua.[NO.]
FROM (
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY t.[NO.] ASC) AS RowNum
, t.ID
, t.[NO.]
FROM dbo.t1 AS t
UNION ALL
SELECT
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY t.[NO.] DESC)
, ID
, t.[NO.]
FROM dbo.t1 AS t
) ua
WHERE ua.RowNum = 1
ORDER BY ID, ua.[NO.] DESC
If you're just trying to get top 2 values for each group, you need something to define the order, ie. a third column. Then you don't need UNION ALL, just use WHERE ua.RowNum < 3.
/*Select 2 random rows per id where the number of rows per id can vary between 1 and infinity
A good article for this:-*/
--https://www.mssqltips.com/sqlservertip/3157/different-ways-to-get-random-data-for-sql-server-data-sampling/
DECLARE #TABLE TABLE(ID INT,NO INT)
INSERT INTO #TABLE
VALUES
(111, 6),
(222, 7),
(333 , 9),
(111 , 8),
(333 , 4),
(222 , 3),
(111 , 7),
(222 , 5),
(333 , 2)
select t.* from
(
Select s.* ,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY randomnumber) ROWNUMBER
from
(
SELECT ID,NO,
(ABS(CHECKSUM(NEWID())) % 100001) + ((ABS(CHECKSUM(NEWID())) % 100001) * 0.00001) [randomnumber]
FROM #TABLE
) s
) t
where t.rownumber < 3

How to select distinct records from individual column in sql table

I want to retrieve distinct rows from each column in this sql table
My table
1 Apple
2 Banana
3 Apple
2 Apple
1 Orange
I want the result like this:
1 Apple
2 Banana
3 Orange
Please help me with this
You can get the distinct names by doing:
select distinct name
from table t;
You can add the first column by doing:
select row_number() over (order by name) as id, name
from (select distinct name
from table t
) t;
Most databases support the ANSI standard row number. You haven't tagged this with the database, so that is the most general solution.
EDIT:
Oh, you want two columns each with values. I would approach this as a full outer join:
select nu.num, na.name
from (select num, row_number() over (order by num) as seqnum
from table
group by num
) nu full outer join
(select name, row_number() over (order by name) as seqnum
from table t
group by name
) na
on nu.seqnum = na.seqnum;
Each subquery enumerates the values in each column. The full outer join makes sure that you have values even when they are missing on one side or the other.
Please try it
select *,row=rank() over(order by name) from (SELECT distinct name FROM abc) as cte
or
with cte as
(
SELECT distinct name FROM abc
)
select *,row=rank() over(order by name) from cte
Output
| row | Name |
|-----------|----------|
| 1 | Apple |
| 2 | Banana |
| 3 | Orange |
Proof of concept.
Tested on Oracle 11.2
WITH
MyTable (firstName, lastName) AS (
SELECT '1', 'Apple' FROM DUAL UNION ALL
SELECT '2', 'Banana' FROM DUAL UNION ALL
SELECT '3', 'Apple' FROM DUAL UNION ALL
SELECT '2', 'Apple' FROM DUAL UNION ALL
SELECT '4', 'Apple' FROM DUAL UNION ALL
SELECT '1', 'Orange' FROM DUAL),
Ranked AS (
SELECT DISTINCT
firstName
, DENSE_RANK() OVER (ORDER BY firstName) AS fnRnk
, lastName
, DENSE_RANK() OVER (ORDER BY lastName) AS lnRnk
FROM MyTable
)
SELECT
DISTINCT R1.firstName, R2.lastName FROM Ranked R1 FULL OUTER JOIN Ranked R2 ON R1.fnRnk = R2.lnRnk ORDER BY lastName NULLS LAST, firstName
;
Returns
| FIRSTNAME | LASTNAME |
|-----------|----------|
| 1 | Apple |
| 2 | Banana |
| 3 | Orange |
| 4 | (null) |
Added a row to the original data to demonstrate columns of different length.
SQL Fiddle

tSQL UNPIVOT of comma concatenated column into multiple rows

I have a table that has a value column. The value could be one value or it could be multiple values separated with a comma:
id | assess_id | question_key | item_value
---+-----------+--------------+-----------
1 | 859 | Cust_A_1 | 1,5
2 | 859 | Cust_B_1 | 2
I need to unpivot the data based on the item_value to look like this:
id | assess_id | question_key | item_value
---+-----------+--------------+-----------
1 | 859 | Cust_A_1 | 1
1 | 859 | Cust_A_1 | 5
2 | 859 | Cust_B_1 | 2
How does one do that in tSQL on SQL Server 2012?
We have a user defined function that we use for stuff like this that we called "split_delimiter":
CREATE FUNCTION [dbo].[split_delimiter](#delimited_string VARCHAR(8000), #delimiter_type CHAR(1))
RETURNS TABLE AS
RETURN
WITH cte10(num) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
)
,cte100(num) AS
(
SELECT 1
FROM cte10 t1, cte10 t2
)
,cte10000(num) AS
(
SELECT 1
FROM cte100 t1, cte100 t2
)
,cte1(num) AS
(
SELECT TOP (ISNULL(DATALENGTH(#delimited_string),0)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
FROM cte10000
)
,cte2(num) AS
(
SELECT 1
UNION ALL
SELECT t.num+1
FROM cte1 t
WHERE SUBSTRING(#delimited_string,t.num,1) = #delimiter_type
)
,cte3(num,[len]) AS
(
SELECT t.num
,ISNULL(NULLIF(CHARINDEX(#delimiter_type,#delimited_string,t.num),0)-t.num,8000)
FROM cte2 t
)
SELECT delimited_item_num = ROW_NUMBER() OVER(ORDER BY t.num)
,delimited_value = SUBSTRING(#delimited_string, t.num, t.[len])
FROM cte3 t;
GO
It will take a varchar value up to 8000 characters and will return a table with the delimited elements broken into rows. In your example, you'll want to use an outer apply to turn those delimited values into separate rows:
SELECT my_table.id, my_table.assess_id, question_key, my_table.delimited_items.item_value
FROM my_table
OUTER APPLY(
SELECT delimited_value AS item_value
FROM my_database.dbo.split_delimiter(my_table.item_value, ',')
) AS delimited_items