get intervals of nonchanging value from a sequence of numbers - sql

I need to sumarize a sequence of values into intervals of nonchanging values - begin, end and value for each such interval. I can easily do it in plsql but would like a pure sql solution for both performance and educational reasons. I have been trying for some time to solve it with analytical functions, but can't figure how to properly define windowing clause. The problem I am having is with a repeated value.
Simplified example -
given input:
id value
1 1
2 1
3 2
4 2
5 1
I'd like to get output
from to val
1 2 1
3 4 2
5 5 1

You want to identify groups of adjacent values. One method is to use lag() to find the beginning of the sequence, then a cumulative sum to identify the groups.
Another method is the difference of row number:
select value, min(id) as from_id, max(id) as to_id
from (select t.*,
(row_number() over (order by id) -
row_number() over (partition by val order by id
) as grp
from table t
) t
group by grp, value;

Using a CTE to collect all the rows and identifying them into changing values, then finally grouping together for the changing values.
CREATE TABLE #temp (
ID INT NOT NULL IDENTITY(1,1),
[Value] INT NOT NULL
)
GO
INSERT INTO #temp ([Value])
SELECT 1 UNION ALL
SELECT 1 UNION ALL
SELECT 2 UNION ALL
SELECT 2 UNION ALL
SELECT 1;
WITH Marked AS (
SELECT
*,
grp = ROW_NUMBER() OVER (ORDER BY ID)
- ROW_NUMBER() OVER (PARTITION BY Value ORDER BY ID)
FROM #temp
)
SELECT MIN(ID) AS [From], MAX(ID) AS [To], [VALUE]
FROM Marked
GROUP BY grp, Value
ORDER BY MIN(ID)
DROP TABLE #temp;

Related

Group sequential integers postgres

I have a query that returns:
1
2
5
7
8
I'd like to group the rows like:
{1,2}
{5}
{7,8}
In other words I need to group the sequential values.
Any idea?
This is a Gaps and Islands problem.
You can try to make row number then do some Calculate make the group by number.
CREATE TABLE T(
ID INT
);
INSERT INTO T VALUES (1);
INSERT INTO T VALUES (2);
INSERT INTO T VALUES (5);
INSERT INTO T VALUES (7);
INSERT INTO T VALUES (8);
Query 1:
WITH CTE AS (
SELECT Min(id) minid,MAX(ID) maxid
FROM (
SELECT ID,ID - ROW_NUMBER() OVER(ORDER BY ID) rn
FROM T
)t1
group by rn
)
SELECT (CASE WHEN minid = maxid
THEN CAST(maxid AS VARCHAR(50))
ELSE CONCAT(minid,',',maxid)
END) ID
FROM CTE
ORDER BY minid
Results:
| id |
|-----|
| 1,2 |
| 5 |
| 7,8 |
Based on your response to the reason for the need, that you are trying to locate broken sequences in a table. I would identify gaps like this. It doesn't give you explicitly what you requested, but what you requested is quite a bit more complex.
select col1
from mytable S
where not exists
(select col1
from mytable T
where T.col1 = S.col1 + 1)
This would return the last number that was sequential in each set prior to the gap.
To get to exactly what you wanted you could use the results of this table and some between statements to ultimately get to your explicit request or something more complex. You may be over-thinking it though.
This looks like an array would be suitable, so:
select array_agg(col order by col)
from (select t.*, row_number() over (order by col) as seqnum
from t
) t
group by (col - seqnum);
EDIT:
If you just want the start and end, use max() and min():
select min(col), max(col)
from (select t.*, row_number() over (order by col) as seqnum
from t
) t
group by (col - seqnum);

SQL Server Row_number in ORDER BY CASE

EDITED THE WHOLE TOPIC.
I need to create a view that sort article per type.
If I only have the type : *VALUE -> I need to show this line only.
If I have the type : *VALUE & 2 -> Still showing row accordingly to *VALUE type only.
If I only have the type : 2 -> Showing this one.
I already did somethink like this :
VALUE* is a value that should come from an another table with a Join.
SELECT Id_item ,Name_item , Type_item , Id_type_item FROM ITEM
WHERE Name_item = 'Gillette' AND (Id_Type_item = VALUE* OR Id_Type_item ='10')
ORDER BY CASE
WHEN row_number() OVER(ORDER BY Id_item DESC , Id_Type_Item DESC) <= 1 THEN 0
ELSE 1
END;
But it does that in the case where we've got both row for the types(*VALUE & 10):
Id_item / Name_item / Type_item / Id_Type_Item
1 Gillette 45 30 (*VALUE)
1 Gillette 2 10
So I think that the order by on the Over() could be useful to always sort by *VALUE (which are in reality another column from another table)
I always want to select 1 row of data only ! :)
I'm guessing, that what you want is the "first" row returned from each SELECT? There's no need to use a separate SELECT statement for each variable on the same table, you can use a window function to do so. I believe this is what you might be after.
WITH CTE AS(
--The following assumes table A and B have the same DDL (which begs the question, why are they different tables?)
SELECT *,
ROW_NUMBER() OVER (PARTITION BY var
ORDER BY (SELECT NULL)) AS RN --Replace SELECT(NULL) with your actual ordering criteria
FROM A
WHERE var IN (1,2)
UNION --ALL(?)
SELECT *
ROW_NUMBER() OVER (PARTITION BY var
ORDER BY (SELECT NULL)) AS RN --Replace SELECT(NULL) with your actual ordering criteria
FROM B
WHERE var IN (3))
SELECT *
FROM CTE
WHERE RN = 1;
Here is a possible solution. In this case ROW_NUMBER, RANK and DENSE_RANK would all work. However, ROW_COUNT is not a valid window function in sql server.
DECLARE #A TABLE(ID INT, Value INT)
DECLARE #B TABLE(ID INT,Value INT)
INSERT INTO #A VALUES (1,1),(2,1),(3,2),(4,3),(5,2),(6,1),(7,3)
INSERT INTO #B VALUES (1,1),(2,1),(3,1),(4,2),(5,3),(6,2),(7,1),(8,3)
;WITH D AS
(
SELECT ID,Value FROM #A WHERE Value IN(1,2)
UNION ALL
SELECT ID,Value FROM #B WHERE Value IN (3)
)
SELECT * FROM
(
SELECT
ID, Value,
ValueRankInSet = DENSE_RANK() OVER(PARTITION BY VALUE ORDER BY ID) -- <-- If you do not have an ID field you can subst ID with NEWID() as order is not important
FROM D
)AS X
WHERE ValueRankInSet = 1
Assign the priority within your Select(s) and then order by it in the row_number:
with cte as
(
SELECT *,
row_number()
over (-- partition by ???
order by prio) as Position
FROM
(
SELECT 1 as prio, * FROM A WHERE var = 1
UNION -- probably a more efficient UNION ALL
SELECT 2 as prio, * FROM A WHERE var = 2
UNION -- probably a more efficient UNION ALL
SELECT 3 as prio, * FROM B WHERE var = 3
)
)
select *
from cte
WHERE Position = 1

How to create "subsets" as a result from a column in SQL

Let's suppose that I've got as a result from one query the next set of values of one column:
Value
1 A
2 B
3 C
4 D
5 E
6 F
7 G
8 H
9 I
10 J
Now, I would like to see this information with another order, establishing a limit to the number of values of every single subset. Now suppose that I choose 3 as a limit,the information will be given like this (one column for all the subsets):
Values
1 A, B, C
2 D, E, F
3 G, H, I
4 J,
Obviously, the last row will contain the remaining values when their number is smaller than the limit established.
Is it possible to perform a query like this in SQL?
What about if the limit is dynamic?. It can be chosen randomly.
create table dee_t (id int identity(1,1),name varchar(10))
insert into dee_t values ('A'),('B'),('c'),('D'),('E'),('F'),('g'),('H'),('I'),('J')
;with cte as
(
select (id-1)/3 +1 rno ,* from dee_t
) select rno ,
(select name+',' from cte where rno = c.rno for xml path('') )
from cte c group by rno
You can do this by using few calculations with row_number, like this:
select
GRP,
max(case when RN = 1 then Value end),
max(case when RN = 2 then Value end),
max(case when RN = 0 then Value end)
from (
select
row_number() over (order by Value) % 3 as RN,
(row_number() over (order by Value)+2) / 3 as GRP,
Value
from Table1
) X
group by GRP
The first row_number creates numbers for the columns (1,2,0,1,2,0...) and the second one creates numbers for the rows (1,1,1,2...). Those are then used to group the values into correct place using case, but you can also use pivot instead of it if you like it more.
If you want them into same column, of course just concatenate the cases instead of selecting them on different columns, but beware of nulls.
Example in SQL Fiddle
Thanks a lot for all your reply. Finally I've got a Solution with the help of Rajen Singh
This is the code than can be used:
WITH CTE_0 AS
(
SELECT DISTINCT column_A_VALUE AS id
FROM Table
WHERE column_A_VALUE IS NOT NULL
), CTE_1 AS
(
SELECT ROW_NUMBER() OVER (ORDER BY id) RN, id
FROM CTE_0
), CTE_2 AS
(
SELECT RN%30 GROUP, ID
FROM CTE_1
)SELECT STUFF(( SELECT ','''+CAST(ID AS NVARCHAR(20))+''''
FROM CTE_2
WHERE GROUP = A.GROUP
FOR XML PATH('')),1,1,'') IDS
FROM CTE_2 A
GROUP BY GROUP

t-SQL Use row number, but on duplicate rows, use the same number

I have some data and want to be able to number each row sequentially, but rows with the same type consecutively, number the same number, and when it's a different type continue numbering.
There will only be types 5 and 6, ID is actually more complex than abc123. I've tried rank but I seem to get two different row counts - in the example instead of 1 2 2 3 4 it would be 1 1 2 2
original image
dense rank result
MS SQL 2008 R2
As far as I understand, you want to number your continous groups
declare #Temp table (id1 bigint identity(1, 1), ID nvarchar(128), Date date, Type int)
insert into #Temp
select 'abc123', '20130101', 5 union all
select 'abc124', '20130102', 6 union all
select 'abc125', '20130103', 6 union all
select 'abc126', '20130104', 5 union all
select 'abc127', '20130105', 6 union all
select 'abc128', '20130106', 6 union all
select 'abc129', '20130107', 6 union all
select 'abc130', '20130108', 6 union all
select 'abc131', '20130109', 5
;with cte1 as (
select
*,
row_number() over (order by T.Date) - row_number() over (order by T.Type, T.Date) as grp
from #Temp as T
), cte2 as (
select *, min(Date) over (partition by grp) as grp2
from cte1
)
select
T.ID, T.Date, T.Type,
dense_rank() over (order by grp2)
from cte2 as T
order by id1

Oracle SQL -- Analytic functions OVER a group?

My table:
ID NUM VAL
1 1 Hello
1 2 Goodbye
2 2 Hey
2 4 What's up?
3 5 See you
If I want to return the max number for each ID, it's really nice and clean:
SELECT MAX(NUM) FROM table GROUP BY (ID)
But what if I want to grab the value associated with the max of each number for each ID?
Why can't I do:
SELECT MAX(NUM) OVER (ORDER BY NUM) FROM table GROUP BY (ID)
Why is that an error? I'd like to have this select grouped by ID, rather than partitioning separately for each window...
EDIT: The error is "not a GROUP BY expression".
You could probably use the MAX() KEEP(DENSE_RANK LAST...) function:
with sample_data as (
select 1 id, 1 num, 'Hello' val from dual union all
select 1 id, 2 num, 'Goodbye' val from dual union all
select 2 id, 2 num, 'Hey' val from dual union all
select 2 id, 4 num, 'What''s up?' val from dual union all
select 3 id, 5 num, 'See you' val from dual)
select id, max(num), max(val) keep (dense_rank last order by num)
from sample_data
group by id;
When you use windowing function, you don't need to use GROUP BY anymore, this would suffice:
select id,
max(num) over(partition by id)
from x
Actually you can get the result without using windowing function:
select *
from x
where (id,num) in
(
select id, max(num)
from x
group by id
)
Output:
ID NUM VAL
1 2 Goodbye
2 4 What's up
3 5 SEE YOU
http://www.sqlfiddle.com/#!4/a9a07/7
If you want to use windowing function, you might do this:
select id, val,
case when num = max(num) over(partition by id) then
1
else
0
end as to_select
from x
where to_select = 1
Or this:
select id, val
from x
where num = max(num) over(partition by id)
But since it's not allowed to do those, you have to do this:
with list as
(
select id, val,
case when num = max(num) over(partition by id) then
1
else
0
end as to_select
from x
)
select *
from list
where to_select = 1
http://www.sqlfiddle.com/#!4/a9a07/19
If you're looking to get the rows which contain the values from MAX(num) GROUP BY id, this tends to be a common pattern...
WITH
sequenced_data
AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY id ORDER BY num DESC) AS sequence_id,
*
FROM
yourTable
)
SELECT
*
FROM
sequenced_data
WHERE
sequence_id = 1
EDIT
I don't know if TeraData will allow this, but the logic seems to make sense...
SELECT
*
FROM
yourTable
WHERE
num = MAX(num) OVER (PARTITION BY id)
Or maybe...
SELECT
*
FROM
(
SELECT
*,
MAX(num) OVER (PARTITION BY id) AS max_num_by_id
FROM
yourTable
)
AS sub_query
WHERE
num = max_num_by_id
This is slightly different from my previous answer; if multiple records are tied with the same MAX(num), this will return all of them, the other answer will only ever return one.
EDIT
In your proposed SQL the error relates to the fact that the OVER() clause contains a field not in your GROUP BY. It's like trying to do this...
SELECT id, num FROM yourTable GROUP BY id
num is invalid, because there can be multiple values in that field for each row returned (with the rows returned being defined by GROUP BY id).
In the same way, you can't put num inside the OVER() clause.
SELECT
id,
MAX(num), <-- Valid as it is an aggregate
MAX(num) <-- still valid
OVER(PARTITION BY id), <-- Also valid, as id is in the GROUP BY
MAX(num) <-- still valid
OVER(PARTITION BY num) <-- Not valid, as num is not in the GROUP BY
FROM
yourTable
GROUP BY
id
See this question for when you can't specify something in the OVER() clause, and an answer showing when (I think) you can: over-partition-by-question