Add counter column to table for every n rows - sql

I am looking to add a column like CustID that is essentially a counter whose group size is set by a variable #nrows. In this case #nrows is 3: the counter goes down the table in DateAdded order, gives the same value to each run of 3 rows, and then increases by 1.
CustID DateAdded
1 2012-02-09
1 2012-02-09
1 2012-02-08
2 2012-02-07
2 2012-02-07
2 2012-02-07
3 2012-02-06
3 2012-02-06
If someone could tell me how to do that in MSSQL, it would be greatly appreciated.

This can be done in Excel with two formulas. The first one counts rows and compares the count to #nrows.
It goes in location A3 in the screenshot:
=IF(B3=B2,(A2+1),1)
The second one places the ID and goes in location B4 in the screenshot:
=IF(A3=$B$1,B3+1,B3)
The value in B1 is the variable "#nrows"
The value in B3 is the starter ID, so you can start at any value you want.

What about
=MAX(1,ROUNDUP(ROW()/#NROWS,0))
which I believe produces the result you want.
One reason it might not work is the "#NROWS" variable, which OP indicated he wanted to use. I confess that in my testing I used
=MAX(1,ROUNDUP(ROW()/3,0))

I don't know how to do it in Excel, but you can first load the data into SQL Server; then the following syntax will help you (with the question's #nrows declared as the T-SQL variable @NRows):
select NTILE(@NRows) over (order by DateAdded desc), DateAdded from tablename
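One thing worth checking before relying on NTILE: NTILE(n) splits the rows into n groups, which is not the same as starting a new group every n rows. The sketch below uses a hypothetical 12-row table variable (@t and its column n are made up for illustration) and assumes SQL Server 2008 or later for the multi-row VALUES syntax.

```sql
-- Hypothetical input: 12 rows, n is only used for ordering.
DECLARE @t TABLE (n int);
INSERT INTO @t (n)
VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12);

-- NTILE(3) makes 3 buckets of 4 rows each:
-- rows 1-4 get 1, rows 5-8 get 2, rows 9-12 get 3.
SELECT n,
       NTILE(3) OVER (ORDER BY n) AS ntile_3
FROM @t
ORDER BY n;
```

With the 8 sample rows from the question the two interpretations happen to give the same grouping (3, 3, 2), so the difference only shows up on larger tables.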

Apply the ROW_NUMBER() function to the row set. It produces sequential numbers starting from 1. Add @nrows - 1 to each of them (@nrows being the question's #nrows written as a T-SQL variable) and divide the result by @nrows; because both operands are integers, the division truncates:
SELECT
  CustID = (ROW_NUMBER() OVER (ORDER BY DateAdded) + @nrows - 1) / @nrows,
  DateAdded
FROM atable
;
See a demo at SQL Fiddle.
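For anyone who wants to try the arithmetic without the fiddle, here is a minimal self-contained sketch using the question's sample dates. The table variable @atable and the DESC sort are my assumptions (the question's sample lists the newest dates first), and it assumes SQL Server 2008 or later for the date type and multi-row VALUES.

```sql
DECLARE @nrows int;
SET @nrows = 3;

-- Sample dates copied from the question.
DECLARE @atable TABLE (DateAdded date);
INSERT INTO @atable (DateAdded)
VALUES ('2012-02-09'), ('2012-02-09'), ('2012-02-08'),
       ('2012-02-07'), ('2012-02-07'), ('2012-02-07'),
       ('2012-02-06'), ('2012-02-06');

-- Row numbers 1..8 become (1+2)/3 = 1, (2+2)/3 = 1, (3+2)/3 = 1, (4+2)/3 = 2, ...
-- Integer division does the rounding down.
SELECT (ROW_NUMBER() OVER (ORDER BY DateAdded DESC) + @nrows - 1) / @nrows AS CustID,
       DateAdded
FROM @atable
ORDER BY DateAdded DESC;
-- CustID comes out as 1,1,1,2,2,2,3,3, matching the table in the question.
```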

Related

SQL Server query order by sequence serie

I am writing a query and I want it to order by a repeating series. The first seven records should be ordered by 1,2,3,4,5,6 and 7, and then it should start all over.
I have tried OVER with PARTITION BY and LAST_VALUE, but I can't figure it out.
This is the SQL code:
set language swedish;
select
    tblridgruppevent.id,
    datepart(dw, date) as daynumber,
    tblRidgrupper.name
from
    tblRidgruppEvent
join
    tblRidgrupper on tblRidgrupper.id = tblRidgruppEvent.ridgruppid
where
    ridgruppid in (select id from tblRidgrupper
                   where corporationID = 309 and Removeddate is null)
    and tblridgruppevent.terminID = (select id from tblTermin
                                     where corporationID = 309 and removedDate is null and isActive = 1)
    and tblridgrupper.removeddate is null
order by
    datepart(dw, date)
And this is an example of the result:
5887 1 J2
5916 1 J5
6555 2 Junior nybörjare
6004 2 Morgonridning
5911 3 J2
6467 3 J5
and this is what I would expect:
5887 1 J2
6555 2 Junior nybörjare
5911 3 J2
5916 1 J5
6004 2 Morgonridning
6467 3 J5
You might get some value from zooming out a little further and considering what you're trying to do and how else you might do it. SQL tends to perform very poorly with row-by-row processing, as well as with operations where a row borrows details from the row before it. You could also run into problems if you need to change the range you repeat at (switching from 7 to 10 or 4, etc.).
If you still need a somewhat arbitrary number there, you could combine ROW_NUMBER with a modulo to get a repeating increment, then add it to your select/where criteria. It would look something like this:
((ROW_NUMBER() OVER(ORDER BY column ASC) -1) % 7) + 1 AS Number
The outer +1 is to display the results as 1-7 instead of 0-6, and the inner -1 deals with the off by one issue (the column starting at 2 instead of 1). I feel like there's a better way to deal with that, but it's not coming to me at the moment.
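To see what the two adjustments do, here is a small sketch on a hypothetical nine-row table variable (@t and id are made-up names; multi-row VALUES assumes SQL Server 2008 or later). It shows the expression above next to the version without the inner -1.

```sql
-- Hypothetical input: nine rows, so the cycle of 7 wraps once.
DECLARE @t TABLE (id int);
INSERT INTO @t (id)
VALUES (10),(20),(30),(40),(50),(60),(70),(80),(90);

SELECT id,
       -- 1,2,3,4,5,6,7,1,2
       ((ROW_NUMBER() OVER (ORDER BY id ASC) - 1) % 7) + 1 AS Number,
       -- Without the inner -1 the cycle is shifted: 2,3,4,5,6,7,1,2,3
       (ROW_NUMBER() OVER (ORDER BY id ASC) % 7) + 1 AS ShiftedNumber
FROM @t
ORDER BY id;
```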
edit: Looking over your post again, it looks like you're dealing with days of the week. You can order by Date even if it's not shown in the select statement, that might be all you need to get this working.
The first seven records should be ordered by 1,2,3,4,5,6 and 7. And then it should start all over.
You can use row_number():
order by row_number() over (partition by DATEPART(dw, date) order by tblridgruppevent.id),
datepart(dw, date)
The second key keeps the order within a group.
You don't specify how the rows should be chosen for each group. It is not clear from the question.
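As a rough illustration of that ORDER BY, the sketch below loads the six sample rows from the question into a hypothetical table variable (@events, with daynumber stored as a plain column instead of being computed with DATEPART) and assumes SQL Server 2008 or later for the multi-row VALUES.

```sql
-- Sample rows copied from the question's result listing.
DECLARE @events TABLE (id int, daynumber int, name nvarchar(50));
INSERT INTO @events (id, daynumber, name)
VALUES (5887, 1, N'J2'), (5916, 1, N'J5'),
       (6555, 2, N'Junior nybörjare'), (6004, 2, N'Morgonridning'),
       (5911, 3, N'J2'), (6467, 3, N'J5');

-- The first key is each row's position within its day;
-- the second key sorts the days inside each "round".
SELECT id, daynumber, name
FROM @events
ORDER BY ROW_NUMBER() OVER (PARTITION BY daynumber ORDER BY id),
         daynumber;
-- daynumber comes out as 1,2,3,1,2,3.
-- Ids come out as 5887, 6004, 5911, 5916, 6555, 6467.
```

Note that with id as the tie-breaker, 6004 comes before 6555 on day 2, which differs from the listing in the question; a different inner ORDER BY would be needed to reproduce that exact choice of rows.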

SQL Server 2005 - SUM'ing one field, but only for the first occurence of a second field

Platform: SQL Server 2005 Express
Disclaimer: I’m quite a novice to SQL and so if you are happy to help with what may be a very simple question, then I won’t be offended if you talk slowly and use small words :-)
I have a table where I want to SUM the contents of multiple rows. However, I want to SUM one column only for the first occurrence of text in a different column.
Table schema for table 'tblMain'
fldOne {varchar(100)} Example contents: “Dandelion“
fldTwo {varchar(8)} Example contents: “01:00:00” (represents hh:mm:ss)
fldThree {numeric(10,0)} Example contents: “65”
Contents of table:
Row number  fldOne     fldTwo    fldThree
------------------------------------------------
1           Dandelion  01:00:00  99
2           Daisy      02:15:00  88
3           Dandelion  00:45:00  77
4           Dandelion  00:30:00  10
5           Dandelion  00:15:00  200
6           Rose       01:30:00  55
7           Daisy      01:00:00  22
etc. ad nauseam
If I use:
Select * from tblMain where fldTwo < '05:00:00' order by fldOne, fldTwo desc
Then all rows are correctly returned, ordered by fldOne and then fldTwo in descending order (although in the example data I've shown, all the data is already in the correct order!)
What I’d like to do is get the SUM of each fldThree, but only from the first occurrence of each fldOne.
So, SUM the first Dandelion, Daisy and Rose that I come across. E.g.
99+88+55
At the moment, I’m doing this programmatically; return a RecordSet from the Select statement above, and MoveNext through each returned row, only adding fldThree to my ‘total’ if I’ve never seen the text from fldOne before. It works, but most of the Select queries return over 100k rows and so it’s quite slow (slow being a relative term – it takes about 50 seconds on my setup).
The actual select statement (selecting about 100k rows from 1.5m total rows) completes in under a second, which is fine. The current programmatic loop is quite small and tight; it's just the number of loops through the RecordSet that takes time. I'm using adOpenForwardOnly and adLockReadOnly when I open the record set.
This is a routine that basically runs continuously as more data is added, and also the fldTwo 'times' vary, so I can't be more specific with the Select statement.
Everything that I’ve so far managed to do natively with SQL seems to run quickly and I’m hoping I can take the logic (and work) away from my program and get SQL to take the strain.
Thanks in advance
The best way to approach this is with window functions. These let you enumerate the rows within a group. However, you need some way to identify the first row. SQL tables are inherently unordered, so you need a column to specify the ordering. Here are some ideas.
If you have an id column, which is defined as an identity so it is autoincremented:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by id) as seqnum
from tblMain m
) m
where seqnum = 1
To get an arbitrary row, you could use:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by (select NULL as noorder)) as seqnum
from tblMain m
) m
where seqnum = 1
Or, if FldTwo has the values in reverse order:
select sum(fldThree)
from (select m.*,
row_number() over (partition by fldOne order by FldTwo desc) as seqnum
from tblMain m
) m
where seqnum = 1
Maybe this?
SELECT SUM(fldThree) as ExpectedSum
FROM
(SELECT *, ROW_NUMBER() OVER (PARTITION BY fldOne ORDER BY fldTwo DESC) Rn
FROM tblMain) as A
WHERE Rn = 1
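Here is a self-contained sketch of the ROW_NUMBER() approach run against the seven sample rows from the question. The table variable @tblMain stands in for the real tblMain, and the inserts use SELECT ... UNION ALL so that it should also run on SQL Server 2005; the WHERE filter is the one from the question's own select.

```sql
-- Sample rows copied from the question.
DECLARE @tblMain TABLE (fldOne varchar(100), fldTwo varchar(8), fldThree numeric(10,0));
INSERT INTO @tblMain (fldOne, fldTwo, fldThree)
SELECT 'Dandelion', '01:00:00', 99  UNION ALL
SELECT 'Daisy',     '02:15:00', 88  UNION ALL
SELECT 'Dandelion', '00:45:00', 77  UNION ALL
SELECT 'Dandelion', '00:30:00', 10  UNION ALL
SELECT 'Dandelion', '00:15:00', 200 UNION ALL
SELECT 'Rose',      '01:30:00', 55  UNION ALL
SELECT 'Daisy',     '01:00:00', 22;

-- rn = 1 marks the row with the largest fldTwo for each fldOne.
SELECT SUM(fldThree) AS ExpectedSum
FROM (SELECT fldThree,
             ROW_NUMBER() OVER (PARTITION BY fldOne ORDER BY fldTwo DESC) AS rn
      FROM @tblMain
      WHERE fldTwo < '05:00:00') AS firsts
WHERE rn = 1;
-- 99 (Dandelion) + 88 (Daisy) + 55 (Rose) = 242
```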

Missing gaps in recurring series within a group

We have a table with following data
Id,ItemId,SeqNumber,DateTimeTrx
1,100,254,2011-12-01 09:00:00
2,100,1,2011-12-01 09:10:00
3,200,7,2011-12-02 11:00:00
4,200,5,2011-12-02 10:00:00
5,100,255,2011-12-01 09:05:00
6,200,3,2011-12-02 09:00:00
7,300,0,2011-12-03 10:00:00
8,300,255,2011-12-03 11:00:00
9,300,1,2011-12-03 10:30:00
Id is an identity column.
The sequence for an ItemId starts from 0 and goes till 255 and then resets to 0. All this information is stored in a table called Item. The order of sequence number is determined by the DateTimeTrx but such data can enter any time into the system. The expected output is as shown below-
ItemId,PrevorNext,SeqNumber,DateTimeTrx,MissingNumber
100,Previous,255,2011-12-01 09:05:00,0
100,Next,1,2011-12-01 09:10:00,0
200,Previous,3,2011-12-02 09:00:00,4
200,Next,5,2011-12-02 10:00:00,4
200,Previous,5,2011-12-02 10:00:00,6
200,Next,7,2011-12-02 11:00:00,6
300,Previous,1,2011-12-03 10:30:00,2
300,Next,255,2011-12-03 16:30:00,2
We need to get those rows one before and one after the missing sequence. In the above example for ItemId 300 - the record with sequence 1 has entered first (2011-12-03 10:30:00) and then 255(2011-12-03 16:30:00), hence the missing number here is 2. So 1 is previous and 255 is next and 2 is the first missing number. Coming to ItemId 100, the record with sequence 255 has entered first (2011-12-02 09:05:00) and then 1 (2011-12-02 09:10:00), hence 255 is previous and then 1, hence 0 is the first missing number.
In the above expected result, the MissingNumber column is the first occurring missing number, included just to illustrate the example.
We will not have a case where a complete series resets at one time, i.e. it can be either a series running down from 255 to 0, as for ItemId 100, or going from 0 to 255, as for ItemId 300. Hence we need to identify missing sequence numbers whether the series is in ascending order (0,1,...255) or in descending order (254,254,0,2) etc.
How can we accomplish this in T-SQL?
Could work like this:
;WITH b AS (
SELECT *
,row_number() OVER (ORDER BY ItemId, DateTimeTrx, SeqNumber) AS rn
FROM tbl
), x AS (
SELECT
b.Id
,b.ItemId AS prev_Itm
,b.SeqNumber AS prev_Seq
,c.ItemId AS next_Itm
,c.SeqNumber AS next_Seq
FROM b
JOIN b c ON c.rn = b.rn + 1 -- next row
WHERE c.ItemId = b.ItemId -- only with same ItemId
AND c.SeqNumber <> (b.SeqNumber + 1)%256 -- Seq cycles modulo 256
)
SELECT Id, prev_Itm, 'Previous' AS PrevNext, prev_Seq
FROM x
UNION ALL
SELECT Id, next_Itm ,'Next', next_Seq
FROM x
ORDER BY Id, PrevNext DESC
Produces exactly the requested result.
See a complete working demo on data.SE.
This solution takes gaps in the Id column into consideration, as there is no mention of a gapless sequence of Ids in the question.
Edit2: Answer to updated question:
I updated the CTE in the query above to match your latest version - or so I think.
Use those columns that define the sequence of rows. Add as many columns to your ORDER BY clause as necessary to break ties.
The explanation to your latest update is not entirely clear to me, but I think you only need to squeeze in DateTimeTrx to achieve what you want. I have SeqNumber in the ORDER BY additionally to break ties left by identical DateTimeTrx. I edited the query above.
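The query above returns the rows on either side of each gap but not the MissingNumber column from the expected output. A possible way to derive it, sketched here on a hypothetical table variable (@gaps) that holds just the previous/next pairs from the question, is to reuse the same modulo expression. Multi-row VALUES assumes SQL Server 2008 or later.

```sql
-- Previous/next pairs taken from the question's expected output.
DECLARE @gaps TABLE (ItemId int, prev_Seq int, next_Seq int);
INSERT INTO @gaps (ItemId, prev_Seq, next_Seq)
VALUES (100, 255, 1), (200, 3, 5), (200, 5, 7), (300, 1, 255);

-- The number expected right after prev_Seq is (prev_Seq + 1) % 256.
-- If next_Seq is something else, that expected number is the first missing one.
SELECT ItemId, prev_Seq, next_Seq,
       (prev_Seq + 1) % 256 AS MissingNumber
FROM @gaps
WHERE next_Seq <> (prev_Seq + 1) % 256;
-- MissingNumber: 0, 4, 6, 2 (255 wraps around to 0)
```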

SQL Update each record with its position in an ordered select

I'm using Access via OleDb. I have a table with columns ID, GroupID, Time and Place. An application inserts new records into the table, but unfortunately the Place isn't calculated correctly.
I want to update each record in a group with its correct place according to its time ascending.
So assume the following data:
ID GroupId Time Place
Chuck 1 10:01 2
Alice 1 09:01 3
Bob 1 09:31 1
should result in:
ID GroupId Time Place
Chuck 1 10:01 3
Alice 1 09:01 1
Bob 1 09:31 2
I could come up with a solution using a cursor but that's AFAIK not possible in Access.
I just did a search on performing "ranking in Access" and I got this support.microsoft result.
It seems you create a query with a field that has the following expression:
Place: (Select Count(*) from table1 Where [Time] < [table1alias].[Time]) + 1
I can't test this, so I hope it works.
Using this you may be able to do (where queryAbove is the above query):
UPDATE table1
SET [Place] = queryAbove.[Place]
FROM queryAbove
WHERE table1.ID = queryAbove.ID
It's a long shot but please give it a go.
I don't think Time is a number or time-formatted column; unfortunately it is a text string containing the digits and delimiters of the time format. This is why sorting on the Time column does not work. Removing the delimiters ":" and ",", casting to integer and then sorting numerically could do the job.

How to get a value from previous result row of a SELECT statement?

Suppose we have a table called FollowUp with columns [ ID (int), Value (money) ]
and it contains some rows, for example:
ID --Value
1------70
2------100
3------150
8------200
20-----250
45-----280
We want a single SQL query that returns each row's ID and Value together with the previous row's Value, so that the data appears as follows:
ID --- Value ---Prev_Value
1 ----- 70 ---------- 0
2 ----- 100 -------- 70
3 ----- 150 -------- 100
8 ----- 200 -------- 150
20 ---- 250 -------- 200
45 ---- 280 -------- 250
I wrote the following query, but I think it performs badly on a large amount of data:
SELECT FollowUp.ID, FollowUp.Value,
    (
        SELECT F1.Value
        FROM FollowUp as F1 where
        F1.ID =
            (
                SELECT Max(F2.ID)
                FROM FollowUp as F2 where F2.ID < FollowUp.ID
            )
    ) AS Prev_Value
FROM FollowUp
So can anyone help me to get the best solution for such a problem ?
This SQL should perform better than the one you have above, although these types of queries tend to be a little performance intensive, so anything you can put in them to limit the size of the dataset you are looking at will help tremendously. For example, if you are looking at a specific date range, put that in.
SELECT followup.value,
( SELECT TOP 1 f1.VALUE
FROM followup as f1
WHERE f1.id<followup.id
ORDER BY f1.id DESC
) AS Prev_Value
FROM followup
HTH
You can use ROW_NUMBER() with an OVER clause to generate nicely increasing row numbers.
select
rownr = row_number() over (order by id)
, value
from your_table
With the numbers, you can easily look up the previous row:
with numbered_rows
as (
select
rownr = row_number() over (order by id)
, value
from your_table
)
select
cur.value
, IsNull(prev.value,0)
from numbered_rows cur
left join numbered_rows prev on cur.rownr = prev.rownr + 1
Hope this is useful.
This is not an answer to your actual question.
Instead, I feel that you are approaching the problem from a wrong direction:
In properly normalized relational databases the tuples ("rows") of each table should contain references to other db items instead of the actual values. Maintaining these relations between tuples belongs to the data insertion part of the codebase.
That is, if storing the value of the tuple with the closest smaller id number really belongs in your data model.
If the requirement to know the previous value comes from the view part of the application - that is, a single view into the data that needs to format it in certain way - you should pull the contents out, sorted by id, and handle the requirement in view specific code.
In your case, I would assume that knowing the previous tuples' value really would belong in the view code instead of the database.
EDIT: You did mention that you store them separately and just want to make a query for it. Even still, application code would probably be the more logical place to do this combining.
What about pulling the lines into your application and computing the previous value there?
Create a stored procedure and use a cursor to iterate and produce rows.
You could use the function 'LAG'.
SELECT ID,
Value,
LAG(value) OVER(ORDER BY ID) AS Prev_Value
FROM FOLLOWUP;
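Two things to keep in mind with LAG: it is only available from SQL Server 2012 onwards, and by default it returns NULL for the first row. The sketch below passes 0 as the third argument to get the 0 shown in the question's expected output; the table variable @FollowUp is a stand-in for the real table, filled with the question's sample rows.

```sql
-- Sample rows copied from the question.
DECLARE @FollowUp TABLE (ID int, Value money);
INSERT INTO @FollowUp (ID, Value)
VALUES (1, 70), (2, 100), (3, 150), (8, 200), (20, 250), (45, 280);

-- LAG(Value, 1, 0): look one row back in ID order, use 0 when there is no previous row.
SELECT ID,
       Value,
       LAG(Value, 1, 0) OVER (ORDER BY ID) AS Prev_Value
FROM @FollowUp
ORDER BY ID;
-- Prev_Value: 0, 70, 100, 150, 200, 250
```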