Add order within group and mark whether a row is the last in its group - sql

I have a table in a SQL Server database in the following form, sorted by id.
id group
1 10
17 10
24 10
2 20
16 20
72 20
104 20
8 30
9 30
I would like to select every row grouped according to the row group and add the following information to this table: the order (as sorted) within the group and whether the row is the last row in the group. In other words, something similar to this:
id group order last
1 10 1 0
17 10 2 0
24 10 3 1
2 20 1 0
16 20 2 0
72 20 3 0
104 20 4 1
8 30 1 0
9 30 2 1
I've tried fiddling around with ROW_NUMBER, but I'm not all that experienced with SQL Server and I can't get it to work. Does anyone have a suggestion?

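Use the ROW_NUMBER window function: once in ascending order to get the position within the group, and once in descending order to detect the last row of the group.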
select id,[group],
row_number()over(partition by [group] order by id) as [order],
case when row_number()over(partition by [group] order by id desc) = 1 then 1 else 0 end as Last
From yourtable
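To try the idea without a SQL Server instance, here is a minimal sketch that runs the same query shape from Python against an in-memory SQLite database. SQLite standing in for SQL Server is an assumption made only for the demo, and it needs a SQLite build with window functions (3.25 or later). The alias is_last replaces Last to stay clear of keyword clashes.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE yourtable (id INTEGER, "group" INTEGER)')
conn.executemany(
    "INSERT INTO yourtable VALUES (?, ?)",
    [(1, 10), (17, 10), (24, 10), (2, 20), (16, 20),
     (72, 20), (104, 20), (8, 30), (9, 30)],
)

query = """
SELECT id, "group",
       -- position within the group, counted from the smallest id
       ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id) AS "order",
       -- the row that is first when counting from the largest id is the last row
       CASE WHEN ROW_NUMBER() OVER (PARTITION BY "group" ORDER BY id DESC) = 1
            THEN 1 ELSE 0 END AS is_last
FROM yourtable
ORDER BY "group", id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The second ROW_NUMBER numbers each group backwards, so the row numbered 1 there is the last row in the forward order.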

Related

If value is 0 (zero) then increment it with max number +1

I need to UPDATE every newly inserted value of 0 to the highest value in the same column + 1. If the highest value in the "Preference" column below is 30, then the next values should be 31 for Id 11 and 32 for Id 12. New rows (possibly several at a time) are inserted every 30 seconds from a source table that I have no access to into the table below (table 1).
The UPDATE statement is executed when a user drags and drops a row in the web app.
UPDATE [DB].[dbo].[tbl1] SET
Preference = @Preference
WHERE Id = @Id
I need to somehow add the logic described above to this statement. This is where I am lost.
Any ideas? Thank you for the help!!
For example:
ID Preference Account
3 7 22
6 8 33
7 9 44
9 0 55
11 0 66
Required results:
ID Preference Account
3 7 22
6 8 33
7 9 44
9 10 55
11 11 66
Gather the current maximum preference using a cross apply (or you could use a cross join) and together with row_number() ordered by ID you will increment preference as described:
with CTE as (
select id, preference, cp.maxpref, row_number() over(order by id) rn
from mytable
cross apply (select max(preference) maxpref
from mytable p
) cp
where preference = 0
)
update cte
set preference = maxpref + rn
where preference = 0
see db<>fiddle here
select *
from mytable
order by id
ID Preference Account
3 7 22
6 8 33
7 9 44
9 10 55
11 11 66
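For anyone who wants to check the numbering logic outside SQL Server, here is a rough sketch in Python with an in-memory SQLite database. SQLite cannot update through a CTE the way the statement above does, so this swaps in a two-step variant (an assumption of the sketch, not the answer's method): first compute max(preference) + row_number() for the zero rows, then write the values back by id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER PRIMARY KEY, preference INTEGER, account INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(3, 7, 22), (6, 8, 33), (7, 9, 44), (9, 0, 55), (11, 0, 66)],
)

# Step 1: for each zero row, current maximum + its rank by id among the zero rows.
new_values = conn.execute(
    """
    SELECT (SELECT MAX(preference) FROM mytable)
           + ROW_NUMBER() OVER (ORDER BY id) AS new_pref,
           id
    FROM mytable
    WHERE preference = 0
    """
).fetchall()

# Step 2: write the computed values back, one row per id.
conn.executemany("UPDATE mytable SET preference = ? WHERE id = ?", new_values)
conn.commit()

result = conn.execute(
    "SELECT id, preference, account FROM mytable ORDER BY id"
).fetchall()
for row in result:
    print(row)
```

Because the maximum is read once before any row changes, each zero row gets a distinct value. If rows can be inserted concurrently, both steps would need to run inside one transaction.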

calculate avg(value) for last 10 records postgresql

I have a tricky task.
Let's assume we have a table "Racings" with the columns TRACK, CAR and CIRCLE_TIME.
Here is an example of what the data could look like:
id track car circle_time
10 1 10 15
9 1 10 14
8 1 10 16
7 1 10 15
6 1 10 13
5 2 10 7
4 2 10 4
3 2 10 5
2 3 10 8
1 3 10 10
What I need is to add one more column, avg3_circle_time, which shows the average of the last 3 circle_time values for each track. Example:
id track car circle_time avg3_circle_time
10 1 10 15 15
9 1 10 14 15
8 1 10 16 14.6
7 1 10 15 null
6 1 10 13 null
5 2 10 7 5.3
4 2 10 4 null
3 2 10 5 null
2 3 10 8 null
1 3 10 10 null
I know how this could work in Oracle, where you could use something like rowid, but I don't know how to do it in PostgreSQL. I have a draft like .....avg(circle_time) OVER(PARTITION BY track,car.....) as avg3_circle_time..... Please help me solve this task.
You can use window functions to calculate moving averages:
SELECT track, id, car, circle_time, AVG(circle_time) OVER (
PARTITION BY track
ORDER BY id
ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
)
FROM t
ORDER BY track, id
Depending on your definition of previous three, the window could be ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING.
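To see how the two frame definitions differ, here is a small sketch run from Python against an in-memory SQLite database. SQLite is only a stand-in for PostgreSQL here, and the table name racings and the aliases avg_incl and avg_prev are assumptions for the demo. The frame syntax is the same in both databases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE racings (id INTEGER, track INTEGER, car INTEGER, circle_time INTEGER)"
)
conn.executemany(
    "INSERT INTO racings VALUES (?, ?, ?, ?)",
    [(10, 1, 10, 15), (9, 1, 10, 14), (8, 1, 10, 16), (7, 1, 10, 15),
     (6, 1, 10, 13), (5, 2, 10, 7), (4, 2, 10, 4), (3, 2, 10, 5),
     (2, 3, 10, 8), (1, 3, 10, 10)],
)

query = """
SELECT track, id, circle_time,
       -- current row plus the two rows before it
       AVG(circle_time) OVER (PARTITION BY track ORDER BY id
                              ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS avg_incl,
       -- the three rows before the current row, excluding the current row
       AVG(circle_time) OVER (PARTITION BY track ORDER BY id
                              ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING) AS avg_prev
FROM racings
ORDER BY track, id
"""
frames = conn.execute(query).fetchall()
for row in frames:
    print(row)
```

Note that neither frame returns NULL just because fewer than three rows are available: the average is taken over whatever rows fall inside the frame. Only avg_prev is NULL on the first row of a track, because its frame is empty there.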
If you want values only when at least 3 circles are available:
select *
, case when lag(id, 2) over(partition by TRACK, CAR order by id) is not null then
avg(CIRCLE_TIME) over(partition by TRACK, CAR order by id rows between 2 preceding and current row) end a
from Racing
order by id desc;
db<>fiddle
Output
id track car circle_time a
10 1 10 15 15.0000000000000000
9 1 10 14 15.0000000000000000
8 1 10 16 14.6666666666666667
7 1 10 15 null
6 1 10 13 null
5 2 10 7 5.3333333333333333
4 2 10 4 null
3 2 10 5 null
2 3 10 8 null
1 3 10 10 null
Use LEAD() to check whether either of the next 2 rows is NULL. If neither is, sum the three values to calculate the average.
-- PostgreSQL
SELECT *
, CASE WHEN next_circle_time IS NULL OR next_next_circle_time IS NULL
THEN NULL
ELSE ((t.circle_time + COALESCE(next_circle_time, 0) + COALESCE(next_next_circle_time, 0)) / 3 :: DECIMAL) :: DECIMAL(10, 1)
END avg_circle_time
FROM (SELECT *
, LEAD(circle_time, 1) OVER (PARTITION BY track ORDER BY id DESC) next_circle_time
, LEAD(circle_time, 2) OVER (PARTITION BY track ORDER BY id DESC) next_next_circle_time
FROM Racings) t
Another way is to use AVG():
SELECT *
, CASE WHEN LEAD(circle_time, 2) OVER (PARTITION BY track ORDER BY id DESC) IS NULL
OR LEAD(circle_time, 1) OVER (PARTITION BY track ORDER BY id DESC) IS NULL
THEN NULL
ELSE AVG(circle_time) OVER (PARTITION BY track ORDER BY id DESC ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
END :: DECIMAL(10, 2) avg_circle_time
FROM Racings
Both queries can be checked at this URL: https://dbfiddle.uk/?rdbms=postgres_11&fiddle=f0cd868623725a1b92bf988cfb2deba3
Several of the posted answers end up repeating the window definition. You can avoid this with the window clause:
select *,
case when row_number() over(track_window) > 2
then trunc(avg(CIRCLE_TIME) over(track_window rows 2 preceding), 1)
end a
from Racing
window track_window as (partition by track order by id)
order by id desc
Note how, in this sample, track_window is defined once, then reused for both row_number and avg. In the latter case, the window clause is embellished with a frame as well (rows 2 preceding).
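The named-window version can be tried from Python as well. This is a sketch under a few assumptions: SQLite stands in for PostgreSQL, the build is 3.28 or later (needed to extend a named window with a frame), and trunc is left out because not every SQLite build ships the math functions, so the raw averages are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE racing (id INTEGER, track INTEGER, car INTEGER, circle_time INTEGER)"
)
conn.executemany(
    "INSERT INTO racing VALUES (?, ?, ?, ?)",
    [(10, 1, 10, 15), (9, 1, 10, 14), (8, 1, 10, 16), (7, 1, 10, 15),
     (6, 1, 10, 13), (5, 2, 10, 7), (4, 2, 10, 4), (3, 2, 10, 5),
     (2, 3, 10, 8), (1, 3, 10, 10)],
)

query = """
SELECT id, track,
       -- only the third and later rows of a track get an average
       CASE WHEN ROW_NUMBER() OVER track_window > 2
            THEN AVG(circle_time) OVER (track_window ROWS 2 PRECEDING)
       END AS a
FROM racing
WINDOW track_window AS (PARTITION BY track ORDER BY id)
ORDER BY id DESC
"""
averages = conn.execute(query).fetchall()
for row in averages:
    print(row)
```

The partitioning and ordering are written once in the WINDOW clause. The AVG call only adds its frame on top of that definition.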

Resetting a Count in SQL

I have data that looks like this:
ID num_of_days
1 0
2 0
2 8
2 9
2 10
2 15
3 10
3 20
I want to add another column that increments in value only if the num_of_days column is divisible by 5 or the ID number increases so my end result would look like this:
ID num_of_days row_num
1 0 1
2 0 2
2 8 2
2 9 2
2 10 3
2 15 4
3 10 5
3 20 6
Any suggestions?
Edit #1:
num_of_days represents the number of days since the customer last saw a doctor between 1 visit and the next.
A customer can see a doctor 1 time or they can see a doctor multiple times.
If it's the first time visiting, the num_of_days = 0.
SQL tables represent unordered sets. Based on your question, I'll assume that the combination of id/num_of_days provides the ordering.
You can use a cumulative sum . . . with lag():
select t.*,
sum(case when prev_id = id and num_of_days % 5 <> 0
then 0 else 1
end) over (order by id, num_of_days)
from (select t.*,
lag(id) over (order by id, num_of_days) as prev_id
from t
) t;
Here is a db<>fiddle.
If you have a different ordering column, then just use that in the order by clauses.
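Here is a quick way to check the lag-plus-cumulative-sum query against the sample data, sketched in Python with an in-memory SQLite database (the choice of SQLite and the subquery alias x are assumptions for the demo). Each row contributes 1 when the id changes or num_of_days is divisible by 5, and the running sum of those flags gives row_num.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, num_of_days INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, 0), (2, 0), (2, 8), (2, 9), (2, 10), (2, 15), (3, 10), (3, 20)],
)

query = """
SELECT id, num_of_days,
       -- 0 keeps the counter unchanged, 1 increments it
       SUM(CASE WHEN prev_id = id AND num_of_days % 5 <> 0
                THEN 0 ELSE 1
           END) OVER (ORDER BY id, num_of_days) AS row_num
FROM (SELECT t.*,
             LAG(id) OVER (ORDER BY id, num_of_days) AS prev_id
      FROM t) AS x
ORDER BY id, num_of_days
"""
numbered = conn.execute(query).fetchall()
for row in numbered:
    print(row)
```

On the very first row prev_id is NULL, so the comparison is not true and the ELSE branch starts the count at 1.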

sql best strategy to partition same values based on temporal sequence

I have data that looks like this, where there are multiple values for each ID that correspond to an ascending date variable:
ID LEVEL DATE
1 10 10/1/2000
1 10 11/20/2001
1 10 12/01/2001
1 30 02/15/2002
1 30 02/15/2002
1 20 05/17/2002
1 20 01/04/2003
1 30 07/20/2003
1 30 03/16/2004
1 30 04/15/2004
I want to acquire a count per each ID/LEVEL/DATE block that looks like this:
ID LEVEL COUNT
1 10 3
1 30 2
1 20 2
1 30 3
The problem is that if I use the count windows function and partition by level, it groups 30 together regardless of the temporal sequence. I want the count for level 30 both before and after 20 to be distinct. Does anyone know how to do that?
A standard gaps and islands solution using ROW_NUMBER(), if it's available on your particular DBMS...
WITH
ordered AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date) AS set_ordinal,
ROW_NUMBER() OVER (PARTITION BY id, level ORDER BY date) AS grp_ordinal
FROM
yourData
)
SELECT
id,
level,
set_ordinal - grp_ordinal,
MIN(date),
COUNT(*)
FROM
ordered
GROUP BY
id,
level,
set_ordinal - grp_ordinal
ORDER BY
id,
MIN(date)
Visualising the effect of the two row numbers...
ID LEVEL DATE set_ordinal grp_ordinal set-grp GROUP
-- ----- ---------- ----------- ----------- ------- --------
1 10 10/01/2000 1 1 0 1,10,0
1 10 11/20/2001 2 2 0 1,10,0
1 10 12/01/2001 3 3 0 1,10,0
1 30 02/15/2002 4 1 3 1,30,3
1 30 02/15/2002 5 2 3 1,30,3
1 20 05/17/2002 6 1 5 1,20,5
1 20 01/04/2003 7 2 5 1,20,5
1 30 07/20/2003 8 3 5 1,30,5
1 30 03/16/2004 9 4 5 1,30,5
1 30 04/15/2004 10 5 5 1,30,5
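The same query can be run end to end with a short Python sketch against an in-memory SQLite database. The assumptions here are the choice of SQLite, the column name dt in place of DATE, and dates stored as ISO strings so that they sort correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourData (id INTEGER, level INTEGER, dt TEXT)")
conn.executemany(
    "INSERT INTO yourData VALUES (?, ?, ?)",
    [(1, 10, "2000-10-01"), (1, 10, "2001-11-20"), (1, 10, "2001-12-01"),
     (1, 30, "2002-02-15"), (1, 30, "2002-02-15"),
     (1, 20, "2002-05-17"), (1, 20, "2003-01-04"),
     (1, 30, "2003-07-20"), (1, 30, "2004-03-16"), (1, 30, "2004-04-15")],
)

query = """
WITH ordered AS (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt)        AS set_ordinal,
           ROW_NUMBER() OVER (PARTITION BY id, level ORDER BY dt) AS grp_ordinal
    FROM yourData
)
SELECT id, level, COUNT(*) AS cnt
FROM ordered
-- the difference is constant within one consecutive run of the same level
GROUP BY id, level, set_ordinal - grp_ordinal
ORDER BY id, MIN(dt)
"""
islands = conn.execute(query).fetchall()
for row in islands:
    print(row)
```

The two level 30 runs share the level but have different set_ordinal - grp_ordinal values (3 and 5), which is what keeps them in separate groups.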

How to get a correct row_number reset using row_number()?

Here's my Fiddle.
Requirement: Every time Entry Form changes, then reset numbering on new_form_line_no
The last column New_form_line_no correctly resets as expected on Line_number=7.
But I also want it to reset on line #3, because Entry_form changes from PR to OM.
Where should I make my correction to get the following results?
20 1 1 R OM 1
20 2 1 N PR 1
20 3 2 R OM 1 --This should reset to 1
20 4 3 A OM 2
20 5 4 2 OM 3
20 6 5 P OM 4
20 7 47 S OL 1
20 8 48 A OL 2
20 9 49 T OL 3
20 10 50 2 OL 4
20 11 51 T OL 5
20 12 52 L OL 6
20 13 53 S OL 7
20 14 54 O OL 8
The problem here is that the first two records can appear in either order, since they have the same values in the columns used in the ORDER BY clause. If you don't mind that, or you have another way to determine which should come first (in which case you can alter the ORDER BY in the analytic functions below), you can try the following solution:
SELECT
data.*,
row_number() OVER (PARTITION BY entry_id, entry_form, same_as_prev_or_next
ORDER BY entry_seq) AS new_form_line_no_2
FROM (
SELECT
entry_id,
row_number() OVER (ORDER BY entry_id, entry_seq) AS line_number,
entry_seq,
entry_text,
entry_form,
CASE
WHEN lag(entry_form, 1, 0) OVER (PARTITION BY entry_id ORDER BY entry_seq) = entry_form
OR lead(entry_form, 1, 0) OVER (PARTITION BY entry_id ORDER BY entry_seq) = entry_form THEN 1
ELSE 0
END AS same_as_prev_or_next
FROM entrants
) data
ORDER BY entry_seq
;
It does not return what you expect, but that is due to the fact I mentioned: the order of the first two rows is indeterminate.
SQLFiddle: http://sqlfiddle.com/#!4/abb58/18
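If a deterministic tie-breaker is available, a different technique also gives the requested reset: flag each change of entry_form with LAG, turn the flags into a run id with a cumulative SUM, and number the rows within each run. The sketch below is not the query above. It runs in Python against an in-memory SQLite database and assumes the line_number shown in the question is stored as a column and used as the ordering, which is a hypothetical choice for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entrants (entry_id INTEGER, line_number INTEGER, "
    "entry_seq INTEGER, entry_text TEXT, entry_form TEXT)"
)
conn.executemany(
    "INSERT INTO entrants VALUES (?, ?, ?, ?, ?)",
    [(20, 1, 1, "R", "OM"), (20, 2, 1, "N", "PR"), (20, 3, 2, "R", "OM"),
     (20, 4, 3, "A", "OM"), (20, 5, 4, "2", "OM"), (20, 6, 5, "P", "OM"),
     (20, 7, 47, "S", "OL"), (20, 8, 48, "A", "OL"), (20, 9, 49, "T", "OL"),
     (20, 10, 50, "2", "OL"), (20, 11, 51, "T", "OL"), (20, 12, 52, "L", "OL"),
     (20, 13, 53, "S", "OL"), (20, 14, 54, "O", "OL")],
)

query = """
WITH flagged AS (
    -- 1 when entry_form differs from the previous row (or there is none)
    SELECT *,
           CASE WHEN entry_form = LAG(entry_form)
                     OVER (PARTITION BY entry_id ORDER BY line_number)
                THEN 0 ELSE 1 END AS is_new_run
    FROM entrants
),
runs AS (
    -- the running total of the flags identifies each consecutive run
    SELECT *,
           SUM(is_new_run) OVER (PARTITION BY entry_id ORDER BY line_number) AS run_id
    FROM flagged
)
SELECT line_number, entry_form,
       ROW_NUMBER() OVER (PARTITION BY entry_id, run_id
                          ORDER BY line_number) AS new_form_line_no
FROM runs
ORDER BY line_number
"""
renumbered = conn.execute(query).fetchall()
for row in renumbered:
    print(row)
```

With this ordering the numbering restarts on line 3 as requested. Without a reliable tie-breaker for the first two rows, the result would still depend on which of them the database happens to return first.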