Missing gaps in recurring series within a group

We have a table with the following data:
Id,ItemId,SeqNumber,DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00
Id is an identity column.
The sequence for an ItemId starts at 0, goes up to 255 and then resets to 0. All this information is stored in a table called Item. The order of the sequence numbers is determined by DateTimeTrx, but such rows can enter the system at any time. The expected output is shown below:
ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 11:00:00,2
We need to get the rows immediately before and after each missing sequence number. In the example above, for ItemId 300 the record with sequence 1 entered first (2011-12-03 10:30:00), followed by 255 (2011-12-03 11:00:00), so 1 is previous, 255 is next and 2 is the first missing number. For ItemId 100, the record with sequence 255 entered first (2011-12-01 09:05:00), followed by 1 (2011-12-01 09:10:00), so 255 is previous, 1 is next and 0 is the first missing number.
In the expected result above, the MissingNumber column shows the first occurring missing number, just to illustrate the example.
We will not have a case where a complete series resets at one time, i.e. it can be either a series rundown from 255 to 0, as for ItemId 100, or 0 to 255, as for ItemId 300. Hence we need to identify missing sequence numbers whether the series is in ascending order (0,1,...255) or in descending order (254,254,0,2), etc.
How can we accomplish this in T-SQL?

Could work like this:
;WITH b AS (
    SELECT *
          ,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
    FROM   tbl
    ), x AS (
    SELECT
           b.Id
          ,b.ItemId AS prev_Itm
          ,b.SeqNumber AS prev_Seq
          ,c.ItemId AS next_Itm
          ,c.SeqNumber AS next_Seq
    FROM   b
    JOIN   b c ON c.rn = b.rn + 1                   -- next row
    WHERE  c.ItemId = b.ItemId                      -- only with same ItemId
    AND    c.SeqNumber <> (b.SeqNumber + 1)%256     -- Seq cycles modulo 256
    )
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM   x
UNION  ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM   x
ORDER  BY Id, PrevNext DESC
Produces exactly the requested result.
See a complete working demo on data.SE.
This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.
Edit2: Answer to updated question:
I updated the CTE in the query above to match your latest version, or so I think.
Use those columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.
The explanation to your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I have SeqNumber in the ORDER BY additionally to break ties left by identical DateTimeTrx. I edited the query above.
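The comparison the query relies on (each row against the next row of the same ItemId, with the sequence cycling modulo 256) can be sketched outside SQL as well. The following is only an illustration in Python using the sample rows from the question; the names ROWS and find_gaps are made up for this sketch and are not part of the original answer.

```python
from itertools import groupby

# Sample rows from the question: (Id, ItemId, SeqNumber, DateTimeTrx)
ROWS = [
    (1, 100, 254, "2011-12-01 09:00:00"),
    (2, 100, 1, "2011-12-01 09:10:00"),
    (3, 200, 7, "2011-12-02 11:00:00"),
    (4, 200, 5, "2011-12-02 10:00:00"),
    (5, 100, 255, "2011-12-01 09:05:00"),
    (6, 200, 3, "2011-12-02 09:00:00"),
    (7, 300, 0, "2011-12-03 10:00:00"),
    (8, 300, 255, "2011-12-03 11:00:00"),
    (9, 300, 1, "2011-12-03 10:30:00"),
]


def find_gaps(rows, modulus=256):
    """Return (ItemId, prev_seq, next_seq, first_missing) for every break in the cycle.

    Hypothetical helper: mirrors ORDER BY ItemId, DateTimeTrx, SeqNumber
    followed by a comparison of each row with the next one.
    """
    # ISO-formatted timestamps sort chronologically as plain strings.
    ordered = sorted(rows, key=lambda r: (r[1], r[3], r[2]))
    gaps = []
    for item_id, group in groupby(ordered, key=lambda r: r[1]):
        group = list(group)
        for prev, nxt in zip(group, group[1:]):
            expected = (prev[2] + 1) % modulus  # sequence wraps from 255 to 0
            if nxt[2] != expected:
                gaps.append((item_id, prev[2], nxt[2], expected))
    return gaps


print(find_gaps(ROWS))
```

For the sample data this reports one gap for ItemId 100 (255 then 1, missing 0), two for ItemId 200 (missing 4 and 6) and one for ItemId 300 (1 then 255, missing 2).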

Related

need to pull a specific record

There is 1 record having duplicate values except in 1 column having x and y
record status
XXXXXXXXXX A
XXXXXXXXXX B
Need to pull A only and remove the other duplicate B
Select record
case
when status in ("'a', 'b'") then ('a')
from xyz

Let's suppose you have data like the sample below, where several Status values exist for the same value of the first column,
but you are interested only in the row with the lowest Status value.
In this case the following SQL may help. Here, we partition on the key field and order by Status so that we can filter on the rank to get the desired result.
WITH sampleData AS
 (SELECT '1234' as Field1,  'A' as STATUS UNION ALL 
  SELECT '1234',  'C' UNION ALL
  SELECT '5678', 'A' UNION ALL 
  SELECT '5678',  'B' )
 select * except(rank) from (
 select *, rank() over (partition by Field1 order by STATUS ASC) rank from sampleData)
 where rank = 1
 order by Field1
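As a rough illustration of what the rank filter keeps, the same selection can be written as a plain loop. This Python sketch assumes the sample data from the query above; the function name lowest_status_per_key is invented for the example.

```python
# Sample data from the query above: (Field1, STATUS)
SAMPLE = [
    ("1234", "A"),
    ("1234", "C"),
    ("5678", "A"),
    ("5678", "B"),
]


def lowest_status_per_key(rows):
    """Keep, for each key, the lowest status value (what rank = 1 selects)."""
    best = {}
    for key, status in rows:
        if key not in best or status < best[key]:
            best[key] = status
    # Sorted by key, like the final ORDER BY Field1.
    return sorted(best.items())


print(lowest_status_per_key(SAMPLE))
```

One difference worth noting: rank() returns every row tied for first place, while this sketch collapses ties into a single entry per key.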

Consider below approach
select * from sampledata
qualify 1 = row_number() over win
window win as (partition by field1 order by if(status='A',1,2) )

SQL Server query order by sequence serie

I am writing a query and I want it to order by a repeating series. The first seven records should be ordered by 1,2,3,4,5,6 and 7. And then it should start all over.
I have tried OVER with PARTITION BY and LAST_VALUE, but I can't figure it out.
This is the SQL code:
set language swedish;
select
tblridgruppevent.id,
datepart(dw,date) as daynumber,
tblRidgrupper.name
from
tblRidgruppEvent
join
tblRidgrupper on tblRidgrupper.id = tblRidgruppEvent.ridgruppid
where
ridgruppid in (select id from tblRidgrupper
where corporationID = 309 and Removeddate is null)
and tblridgruppevent.terminID = (select id from tblTermin
where corporationID = 309 and removedDate is null and isActive = 1)
and tblridgrupper.removeddate is null
order by
datepart(dw, date)
and this is an example of the result:
5887 1 J2
5916 1 J5
6555 2 Junior nybörjare
6004 2 Morgonridning
5911 3 J2
6467 3 J5
and this is what I would expect:
5887 1 J2
6555 2 Junior nybörjare
5911 3 J2
5916 1 J5
6004 2 Morgonridning
6467 3 J5

You might get some value by zooming out a little further and considering what you're trying to do and how else you might do it. SQL tends to perform very poorly with row-by-row processing, as well as with operations where a row borrows details from the row before it. You could also run into problems if you need to change the range you repeat at (switching from 7 to 10 or 4, etc.).
If you still need a somewhat arbitrary number there, you could combine ROW_NUMBER with a modulo to get a repeating increment, then add it to your select/where criteria. It would look something like this:
((ROW_NUMBER() OVER(ORDER BY column ASC) -1) % 7) + 1 AS Number
The outer +1 is to display the results as 1-7 instead of 0-6, and the inner -1 deals with the off by one issue (the column starting at 2 instead of 1). I feel like there's a better way to deal with that, but it's not coming to me at the moment.
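To see why both adjustments are needed, the arithmetic can be checked on its own. This is a small Python sketch of the same expression; cycle_number is a name chosen for the illustration.

```python
def cycle_number(row_number, period=7):
    """Map 1, 2, 3, ... onto a repeating 1..period series.

    The inner -1 shifts to zero-based before the modulo;
    the outer +1 shifts the result back to 1..period.
    """
    return (row_number - 1) % period + 1


print([cycle_number(n) for n in range(1, 16)])
```

Without the inner -1, row 7 would map to 0 + 1 = 1 and the first cycle would be one row short.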
edit: Looking over your post again, it looks like you're dealing with days of the week. You can order by Date even if it's not shown in the select statement, that might be all you need to get this working.

The first seven records should be ordered by 1,2,3,4,5,6 and 7. And then it should start all over.
You can use row_number():
order by row_number() over (partition by DATEPART(dw, date) order by tblridgruppevent.id),
datepart(dw, date)
The second key keeps the order within a group.
You don't specify how the rows should be chosen for each group. It is not clear from the question.
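The effect of sorting by a per-day row number first and the day number second can be sketched as follows. This Python illustration uses the six rows from the question and assumes id as the ordering inside each day, as in the ORDER BY above; interleave_by_day is a hypothetical name.

```python
from collections import defaultdict

# Rows from the question: (id, daynumber, name)
ROWS = [
    (5887, 1, "J2"),
    (5916, 1, "J5"),
    (6555, 2, "Junior nybörjare"),
    (6004, 2, "Morgonridning"),
    (5911, 3, "J2"),
    (6467, 3, "J5"),
]


def interleave_by_day(rows):
    """Order rows by (position within the day, day number)."""
    by_day = defaultdict(list)
    for row in rows:
        by_day[row[1]].append(row)
    keyed = []
    for day, members in by_day.items():
        # Position within the day, ordered by id (row_number ... order by id).
        for position, row in enumerate(sorted(members), start=1):
            keyed.append((position, day, row))
    keyed.sort(key=lambda k: (k[0], k[1]))
    return [row for _, _, row in keyed]


for row in interleave_by_day(ROWS):
    print(row)
```

With id as the tie-breaker the ids come out as 5887, 6004, 5911, 5916, 6555, 6467. Day 2 is led by 6004 rather than 6555 as in the expected listing, which is exactly the ambiguity about how rows are chosen for each group.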

where clause with = sign matches multiple records while expected just one record

I have a simple inline view that contains 2 columns.
-----------------
rn | val
-----------------
0 | A
... | ...
25 | Z
I am trying to select a val by matching the rn randomly by using the dbms_random.value() method as in
with d (rn, val) as
(
select level-1, chr(64+level) from dual connect by level <= 26
)
select * from d
where rn = floor(dbms_random.value()*25)
;
My expectation is it should return one row only without failing.
But now and then I get multiple rows returned or no rows at all.
On the other hand,
select floor(dbms_random.value()*25) from dual connect by level <1000
returns a whole number for each row and I failed to see any abnormality.
What am I missing here?

The problem is that the random value is recalculated for each row. So, you might get two random values that match the value -- or go through all the values and never get a hit.
One way to get around this is:
select d.*
from (select d.*
from d
order by dbms_random.value()
) d
where rownum = 1;
There are more efficient ways to calculate a random number, but this is intended to be a simple modification to your existing query.
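The per-row recalculation is easy to reproduce outside the database. The sketch below is a Python simulation, not Oracle code: pick_per_row draws a new random number for every row, as the original WHERE clause does; pick_once draws it a single time; pick_by_shuffle imitates ordering by a random key and taking the first row. All three function names are invented for the illustration.

```python
import random

# The inline view from the question: rn 0..25, val 'A'..'Z'
VALUES = [(i, chr(65 + i)) for i in range(26)]


def pick_per_row(rng):
    """A fresh random number per row: may match zero, one or several rows."""
    # Like the question's expression, int(x * 25) yields 0..24.
    return [(rn, val) for rn, val in VALUES if rn == int(rng.random() * 25)]


def pick_once(rng):
    """The random number is fixed before the rows are compared."""
    target = int(rng.random() * 25)
    return [(rn, val) for rn, val in VALUES if rn == target]


def pick_by_shuffle(rng):
    """Give each row a random sort key and keep the row with the smallest one."""
    return min(VALUES, key=lambda _: rng.random())


rng = random.Random(0)
print(sorted({len(pick_per_row(rng)) for _ in range(2000)}))  # several distinct row counts
print({len(pick_once(rng)) for _ in range(2000)})             # always exactly one row
print(pick_by_shuffle(rng))
```

Over many runs the per-row version returns zero rows roughly a third of the time, while the other two always return exactly one row.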
You also might want to ask another question. This question starts with a description of a table that is not used, and then the question is about a query that doesn't use the table. Ask another question, describing the table and the real problem you are having -- along with sample data and desired results.

Add counter column to table for every n rows

I am looking to add a column like CustID that is essentially a counter whose group size is set by a variable #nrows. In this case #nrows is 3: the counter simply goes down the table by date added and assigns the same number to each group of three rows.
CustID --- DateAdded ---
1 2012-02-09
1 2012-02-09
1 2012-02-08
2 2012-02-07
2 2012-02-07
2 2012-02-07
3 2012-02-06
3 2012-02-06
If someone could tell me how to do that in MSSQL, it would be greatly appreciated.

This can be done in Excel with two formulas. The first one counts rows and compares the count to #nrows
Location A3 in screen shot
=IF(B3=B2,(A2+1),1)
Second places the ID, location B4 in the screen shot
=IF(A3=$B$1,B3+1,B3)
The value in B1 is the variable "#nrows"
The value in B3 is the starter ID, so you can start at any value you want.

What about
=MAX(1,ROUNDUP(ROW()/#NROWS,0))
which I believe produces the result you want.
One reason it might not work is the "#NROWS" variable, which OP indicated he wanted to use. I confess that in my testing I used
=MAX(1,ROUNDUP(ROW()/3,0))

I don't know how to do it in Excel, but you can first load the data into SQL Server; then the following syntax will help you
select NTILE(#NRows) over (order by DateAdded desc), DateAdded from tablename

Apply the ROW_NUMBER() function to the row set. It will produce sequential numbers starting from 1. Modify those by adding #nrows - 1 to them and dividing the results by #nrows:
SELECT
CustID = (ROW_NUMBER() OVER (ORDER BY DateAdded) + #nrows - 1) / #nrows,
DateAdded
FROM atable
;
See a demo at SQL Fiddle.
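The arithmetic behind that expression is a ceiling division of the row number by the group size. A minimal Python check, with counter_ids as a made-up name and 8 rows as in the question's sample:

```python
def counter_ids(row_count, nrows):
    """Return the counter for rows 1..row_count, nrows rows per counter value.

    (row_number + nrows - 1) // nrows is integer ceiling division,
    matching the integer division in the SQL expression above.
    """
    return [(row_number + nrows - 1) // nrows for row_number in range(1, row_count + 1)]


print(counter_ids(8, 3))  # [1, 1, 1, 2, 2, 2, 3, 3]
```

Adding nrows - 1 before dividing is what makes rows 1 to 3 land on 1 instead of rows 1 and 2 landing on 0.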

SQL Server 2005 - SUM'ing one field, but only for the first occurence of a second field

Platform: SQL Server 2005 Express
Disclaimer: I’m quite a novice to SQL and so if you are happy to help with what may be a very simple question, then I won’t be offended if you talk slowly and use small words :-)
I have a table where I want to SUM the contents of multiple rows. However, I want to SUM one column only for the first occurrence of text in a different column.
Table schema for table 'tblMain'
fldOne {varchar(100)} Example contents: “Dandelion“
fldTwo {varchar(8)} Example contents: “01:00:00” (represents hh:mm:ss)
fldThree {numeric(10,0)} Example contents: “65”
Contents of table:
Row number  fldOne     fldTwo    fldThree
------------------------------------------------
1           Dandelion  01:00:00  99
2           Daisy      02:15:00  88
3           Dandelion  00:45:00  77
4           Dandelion  00:30:00  10
5           Dandelion  00:15:00  200
6           Rose       01:30:00  55
7           Daisy      01:00:00  22
etc. ad nauseam
If I use:
Select * from tblMain where fldTwo < '05:00:00' order by fldOne, fldTwo desc
Then all rows are correctly returned, ordered by fldOne and then fldTwo in descending order (although in the example data I've shown, all the data is already in the correct order!)
What I’d like to do is get the SUM of each fldThree, but only from the first occurrence of each fldOne.
So, SUM the first Dandelion, Daisy and Rose that I come across. E.g.
99+88+55
At the moment, I’m doing this programmatically; return a RecordSet from the Select statement above, and MoveNext through each returned row, only adding fldThree to my ‘total’ if I’ve never seen the text from fldOne before. It works, but most of the Select queries return over 100k rows and so it’s quite slow (slow being a relative term – it takes about 50 seconds on my setup).
The actual select statement (selecting about 100k rows from 1.5m total rows) completes in under a second, which is fine. The current programmatic loop is quite small and tight; it's just the number of loops through the RecordSet that takes time. I'm using adOpenForwardOnly and adLockReadOnly when I open the record set.
This is a routine that basically runs continuously as more data is added, and also the fldTwo 'times' vary, so I can't be more specific with the Select statement.
Everything that I’ve so far managed to do natively with SQL seems to run quickly and I’m hoping I can take the logic (and work) away from my program and get SQL to take the strain.
Thanks in advance

The best way to approach this is with window functions. These let you enumerate the rows within a group. However, you need some way to identify the first row. SQL tables are inherently unordered, so you need a column to specify the ordering. Here are some ideas.
If you have an id column, which is defined as an identity so it is autoincremented:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by id) as seqnum
from tblMain m
) m
where seqnum = 1
To get an arbitrary row, you could use:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by (select NULL as noorder)) as seqnum
from tblMain m
) m
where seqnum = 1
Or, if FldTwo has the values in reverse order:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by FldTwo desc) as seqnum
from tblMain m
) m
where seqnum = 1
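For comparison with the loop described in the question, here is what the last variant computes, written as a short Python sketch over the sample rows. It assumes "first" means the highest fldTwo per fldOne, as in the ORDER BY fldTwo DESC; sum_first_per_group is a hypothetical name.

```python
# Sample rows from the question: (fldOne, fldTwo, fldThree)
ROWS = [
    ("Dandelion", "01:00:00", 99),
    ("Daisy", "02:15:00", 88),
    ("Dandelion", "00:45:00", 77),
    ("Dandelion", "00:30:00", 10),
    ("Dandelion", "00:15:00", 200),
    ("Rose", "01:30:00", 55),
    ("Daisy", "01:00:00", 22),
]


def sum_first_per_group(rows):
    """Sum fldThree over the row with the highest fldTwo for each fldOne."""
    # hh:mm:ss strings of equal width sort correctly as text.
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    seen = set()
    total = 0
    for name, _, value in ordered:
        if name not in seen:  # equivalent to seqnum = 1 within the partition
            seen.add(name)
            total += value
    return total


print(sum_first_per_group(ROWS))  # 99 + 88 + 55 = 242
```

The window-function queries do the same selection inside the database, so only a single number has to travel back to the client.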

Maybe this?
SELECT SUM(fldThree) as ExpectedSum
FROM
(SELECT *, ROW_NUMBER() OVER (PARTITION BY fldOne ORDER BY fldTwo DESC) Rn
FROM tblMain) as A
WHERE Rn = 1
