Sequencing and re-setting in SQL Server 2008 - sql

I am actually new to SQL server 2008, and I am trying to sequence and re-set a number in a table. The source is something like:
Row Refrec FLAG
1 5 NULL
2 4 X
3 3 NULL
4 2 NULL
5 1 Y
6 5 A
7 4 B
8 3 NULL
9 2 NULL
10 1 NULL
The result should look like:
Row Refrec FLAG SEQUENCE
1 5 NULL NULL
2 4 X 0
3 3 NULL 1
4 2 NULL 2
5 1 Y 0
6 5 A 0
7 4 B 0
8 3 NULL 1
9 2 NULL 2
10 1 NULL 3
Thanks!

It looks like you want to enumerate the sequence values for NULL values, setting all the other values to 0. I'm not sure why the first value is NULL, but that is easily fixed.
The following may do what you want:
select t.*,
(case when flag is not null then 0
else row_number() over (partition by seqnum - row order by row)
end) as Sequence
from (select t.*, row_number() over (partition by flag order by row) as seqnum
from table t
);
If you really care about the first value:
select t.*,
(case when row = 1 then NULL
when flag is not null then 0
else row_number() over (partition by seqnum - row order by row)
end) as Sequence
from (select t.*, row_number() over (partition by flag order by row) as seqnum
from table t
);

Related

row_number() but only increment value after a specific value in a column

Query: SELECT (row_number() OVER ()) as grp, * from tbl
Edit: the rows below are returned by a pgrouting shortest path function and it does have a sequence.
seq grp id
1 1 8
2 2 3
3 3 2
4 4 null
5 5 324
6 6 82
7 7 89
8 8 null
9 9 1
10 10 2
11 11 90
12 12 null
How do I make it so that the grp column is only incremented after a null value on id - and also keep the same order of rows
seq grp id
1 1 8
2 1 3
3 1 2
4 1 null
5 2 324
6 2 82
7 2 89
8 2 null
9 3 1
10 3 2
11 3 90
12 3 null
demo:db<>fiddle
Using a cumulative SUM aggregation is a possible approach:
SELECT
SUM( -- 2
CASE WHEN id IS NULL THEN 1 ELSE 0 END -- 1
) OVER (ORDER BY seq) as grp,
id
FROM mytable
If the current (ordered!) value is NULL, then make it 1, else 0. Now you got a bunch of zeros, delimited by a 1 at each NULL record. If you'd summerize these values cumulatively, at each NULL record, the sum increased.
Execution of the cumulative SUM() using window functions
This yields:
0 8
0 3
0 2
1 null
1 324
1 82
1 89
2 null
2 1
2 2
2 90
3 null
As you can see, the groups start with the NULL records, but you are expecting to end it.
This can be achieved by adding another window function: LAG(), which moves the records to the next row:
SELECT
SUM(
CASE WHEN next_id IS NULL THEN 1 ELSE 0 END
) OVER (ORDER BY seq) as grp,
id
FROM (
SELECT
LAG(id) OVER (ORDER BY seq) as next_id,
seq,
id
FROM mytable
) s
The result is your expected one:
1 8
1 3
1 2
1 null
2 324
2 82
2 89
2 null
3 1
3 2
3 90
3 null

Select top rows until value in specific column has appeared twice

I have the following query where I am trying to select all records, ordered by date, until the second time EmailApproved = 1 is found. The second record where EmailApproved = 1 should not be selected.
declare #Test table (id int, EmailApproved bit, Created datetime);
insert into #Test (id, EmailApproved, Created)
values
(1,0,'2011-03-07 03:58:58.423')
, (2,0,'2011-02-21 04:55:52.103')
, (3,0,'2011-01-29 13:24:02.103')
, (4,1,'2010-10-12 14:41:54.217')
, (5,0,'2010-10-12 14:34:15.903')
, (6,0,'2010-10-12 10:10:19.123')
, (7,1,'2010-08-27 12:07:16.073')
, (8,1,'2010-08-25 12:15:49.413')
, (9,0,'2010-08-25 12:14:51.970')
, (10,1,'2010-04-12 16:43:44.777');
select *
, case when Row1 = Row2 then 1 else 0 end Row1EqualRow2
from (
select id, EmailApproved, Created
, row_number() over (partition by EmailApproved order by Created desc) Row1
, row_number() over (order by Created desc) Row2
from #Test
) X
--where Row1 = Row2
order by Created desc;
Which produces the following results:
id EmailApproved Created Row1 Row2 Row1EqualsRow2
1 0 2011-03-07 03:58:58.423 1 1 1
2 0 2011-02-21 04:55:52.103 2 2 1
3 0 2011-01-29 13:24:02.103 3 3 1
4 1 2010-10-12 14:41:54.217 1 4 0
5 0 2010-10-12 14:34:15.903 4 5 0
6 0 2010-10-12 10:10:19.123 5 6 0
7 1 2010-08-27 12:07:16.073 2 7 0
8 1 2010-08-25 12:15:49.413 3 8 0
9 0 2010-08-25 12:14:51.970 6 9 0
10 1 2010-04-12 16:43:44.777 4 10 0
What I actually want is:
id EmailApproved Created Row1 Row2 Row1EqualsRow2
1 0 2011-03-07 03:58:58.423 1 1 1
2 0 2011-02-21 04:55:52.103 2 2 1
3 0 2011-01-29 13:24:02.103 3 3 1
4 1 2010-10-12 14:41:54.217 1 4 0
5 0 2010-10-12 14:34:15.903 4 5 0
6 0 2010-10-12 10:10:19.123 5 6 0
Note: Row, Row2 & Row1EqualsRow2 are just working columns to show my calculations.
Steps:
Create a row number, rn, over all rows in case id is not in sequence.
Create a row number, approv_rn, partitioned by EmailApproved so we know when EmailApproved = 1 for the second time
Use a outer apply to find the row number of the second instance of EmailApproved = 1
In the where clause filter out all rows where the row number is >= the value found in step 3.
If there is 1 or 0 EmailApproved records available then the outer apply will return null, in which case return all available rows.
with test as
(
select *,
rn = row_number() over (order by Created desc),
approv_rn = row_number() over (partition by EmailApproved
order by Created desc)
from #Test
)
select *
from test t
outer apply
(
select x.rn
from test x
where x.EmailApproved = 1
and x.approv_rn = 2
) x
where t.rn < x.rn or x.rn is null
order by t.Created desc;

Can I start a new group when value changes from 0 to 1?

Can I somehow assign a new group to a row when a value in a column changes in T-SQL?
I would be grateful if you can provide solution that will work on unlimited repeating numbers without CTE and functions. I made a solution that work in sutuation with 100 consecutive identical numbers(with
coalesce(lag()over(), lag() over(), lag() over() ) - it is too bulky
but can not make a solution for a case with unlimited number of consecutive identical numbers.
Data
id somevalue
1 0
2 1
3 1
4 0
5 0
6 1
7 1
8 1
9 0
10 0
11 1
12 0
13 1
14 1
15 0
16 0
Expected
id somevalue group
1 0 1
2 1 2
3 1 2
4 0 3
5 0 3
6 1 4
7 1 4
8 1 4
9 0 5
10 0 5
11 1 6
12 0 7
13 1 8
14 1 8
15 0 9
16 0 9
If you just want a group identifier, you can use:
select t.*,
min(id) over (partition by some_value, seqnum - seqnum_1) as grp
from (select t.*,
row_number() over (order by id) as seqnum,
row_number() over (partition by somevalue order by id) as sequm_1
from t
) t;
If you want them enumerated . . . well, you can enumerate the id above using dense_rank(). Or you can use lag() and a cumulative sum:
select t.*,
sum(case when some_value = prev_sv then 0 else 1 end) over (order by id) as grp
from (select t.*,
lag(somevalue) over (order by id) as prev_sv
from t
) t;
Here's a different approach:
First I created a view to provide the group increment on each row:
create view increments as
select
n2.id,n2.somevalue,
case when n1.somevalue=n2.somevalue then 0 else 1 end as increment
from
(select 0 as id,1 as somevalue union all select * from mytable) n1
join mytable n2
on n2.id = n1.id+1
Then I used this view to produce the group values as cumulative sums of the increments:
select id, somevalue,
(select sum(increment) from increments i1 where i1.id <= i2.id)
from increments i2

Update table records with accumulated result

Lets say I have a table Tbl (Represents simple timelogs for work made on different customers)
Five columns
Id: int
TimeUse: float
IdCustomer: int
Created: DateTime
TimeCalc: float
I have a number of records in this table, (TimeCalc is initialized to value = 0)
What I want my SQL to do is:
when TimeUse for all foregoing records on a specific customer accumulates to a value < 10 then the value in TimeCalc should be 0
when TimeUse for all foregoing records on a specific customer accumulates to a value >= 10 then the value in TimeCalc should be = TimeUse for the record...
I have messed around with Case routines with subqueries, but can't get it working.
BEFORE
Id TimeUse IdCustomer Created TimeCalc
1 2 1 14/09/09 0
2 5 2 14/09/10 0
3 2 1 14/09/11 0
4 5 2 14/09/12 0
5 4 1 14/09/13 0
6 2 2 14/09/14 0
7 4 1 14/09/15 0
8 1 1 14/09/16 0
9 3 2 14/09/17 0
10 2 1 14/09/18 0
11 4 2 14/09/19 0
AFTER
Id TimeUse IdCustomer Created TimeCalc
1 2 1 14/09/09 0
2 5 2 14/09/10 0
3 2 1 14/09/11 0
4 5 2 14/09/12 0
5 4 1 14/09/13 0
6 2 2 14/09/14 2
7 4 1 14/09/15 0
8 1 1 14/09/16 1
9 3 2 14/09/17 3
10 2 1 14/09/18 2
11 4 2 14/09/19 4
Can this be solved in an SQL update?
In SQL Server 2012+, you can do this with a cumulative sum:
select Id, TimeUse, IdCustomer, Created,
(case when sum(timeuse) over (partition by idcustomer order by id) < 10 then 0
else timeuse
end) as timecalc
from table t;
You can do the same thing in earlier versions using outer apply or a subquery.
If you want an update, just use a CTE:
with toupdate as (
select t.*,
(case when sum(timeuse) over (partition by idcustomer order by id) < 10 then 0
else timeuse
end) as new_timecalc
from table t
)
update toupdate
set timecalc = new_timecalc;
EDIT:
The following will work in any version of SQL Server:
with toupdate as (
select t.*,
(case when (select sum(t2.timeuse)
from table t2
where t2.idcustomer = t.idcustomer and
t2.id <= t.id
) < 10 then 0
else timeuse
end) as new_timecalc
from table t
)
update toupdate
set timecalc = new_timecalc;

SQL Query to filter record for particular record count

I have a table which have Identity, RecordId, Type, Reading And IsDeleted columns. Identity is primary key that is auto increment, RecordId is integer that can have duplicate values, Type is a type of reading that can be either 'one' or 'average', Reading is integer that contains any integer value, and IsDeleted is bit that can be 0 or 1 i.e. false or true.
Now, I want the query that contains all the records of table in such a manner that if COUNT(Id) for each RecordId is greater than 2 then display all the records of that RecordId.
If COUNT(Id) == 2 for that specific RecordId and Reading value of both i.e. 'one' or 'average' type of the records are same then display only average record.
If COUNT(Id) ==1 then display only that record.
For example :
Id RecordId Type Reading IsDeleted
1 1 one 4 0
2 1 one 5 0
3 1 one 6 0
4 1 average 5 0
5 2 one 1 0
6 2 one 3 0
7 2 average 2 0
8 3 one 2 0
9 3 average 2 0
10 4 one 5 0
11 4 average 6 0
12 5 one 7 0
Ans result can be
Id RecordId Type Reading IsDeleted
1 1 one 4 0
2 1 one 5 0
3 1 one 6 0
4 1 average 5 0
5 2 one 1 0
6 2 one 3 0
7 2 average 2 0
9 3 average 2 0
10 4 one 5 0
11 4 average 6 0
12 5 one 7 0
In short I want to skip the 'one' type reading which have an average reading with same value and its count for 'one' type reading not more than one.
Check out this program
DECLARE #t TABLE(ID INT IDENTITY,RecordId INT,[Type] VARCHAR(10),Reading INT,IsDeleted BIT)
INSERT INTO #t VALUES
(1,'one',4,0),(1,'one',5,0),(1,'one',6,0),(1,'average',5,0),(2,'one',1,0),(2,'one',3,0),
(2,'average',2,0),(3,'one',2,0),(3,'average',2,0),(4,'one',5,0),(4,'average',6,0),(5,'one',7,0),
(6,'average',6,0),(6,'average',6,0),(7,'one',6,0),(7,'one',6,0)
--SELECT * FROM #t
;WITH GetAllRecordsCount AS
(
SELECT *,Cnt = COUNT(RecordId) OVER(PARTITION BY RecordId ORDER BY RecordId)
FROM #t
)
-- Condition 1 : When COUNT(RecordId) for each RecordId is greater than 2
-- then display all the records of that RecordId.
, GetRecordsWithCountMoreThan2 AS
(
SELECT * FROM GetAllRecordsCount WHERE Cnt > 2
)
-- Get all records where count = 2
, GetRecordsWithCountEquals2 AS
(
SELECT * FROM GetAllRecordsCount WHERE Cnt = 2
)
-- Condition 3 : When COUNT(RecordId) == 1 then display only that record.
, GetRecordsWithCountEquals1 AS
(
SELECT * FROM GetAllRecordsCount WHERE Cnt = 1
)
-- Condition 1: When COUNT(RecordId) > 2
SELECT * FROM GetRecordsWithCountMoreThan2 UNION ALL
-- Condition 2 : When COUNT(RecordId) == 2 for that specific RecordId and Reading value of
-- both i.e. 'one' or 'average' type of the records are same then display only
-- average record.
SELECT t1.* FROM GetRecordsWithCountEquals2 t1
JOIN (Select RecordId From GetRecordsWithCountEquals2 Where [Type] = ('one') )X
ON t1.RecordId = X.RecordId
AND t1.Type = 'average' UNION ALL
-- Condition 2: When COUNT(RecordId) = 1
SELECT * FROM GetRecordsWithCountEquals1
Result
ID RecordId Type Reading IsDeleted Cnt
1 1 one 4 0 4
2 1 one 5 0 4
3 1 one 6 0 4
4 1 average5 0 4
5 2 one 1 0 3
6 2 one 3 0 3
7 2 average2 0 3
9 3 average2 0 2
11 4 average6 0 2
12 5 one 7 0 1
;with a as
(
select Id,RecordId,Type,Reading,IsDeleted, count(*) over (partition by RecordId, Reading) cnt,
row_number() over (partition by RecordId, Reading order by Type, RecordId) rn
from table
)
select Id,RecordId,Type,Reading,IsDeleted
from a where cnt <> 2 or rn = 1
Assuming your table is named the_table, let's do this:
select main.*
from the_table as main
inner join (
select recordId, count(Id) as num, count(distinct Reading) as reading_num
from the_table
group by recordId
) as counter on counter.recordId=main.recordId
where num=1 or num>2 or reading_num=2 or main.type='average';
Untested, but it should be some variant of that.
EDIT TEST HERE ON FIDDLE
The short summary is that we want to join the table with an aggregated version of o=itself, then filter it based in the count criteria you mentioned (num=1, then show it; num=2, show just average record if reading numbers are the same otherwise show both; num>2, show all records).