In SQL, how to combine two sequential rows into one when no common ID? - sql-server-2016

I have values split across multiple records without a key to join them. The order is sequential.
Current SQL table contents:
RowNum Value
1 10343
2 20784
3 34523
4 22415
5 31245
6 11345
7 24588
8 32946
I want to return rows that combine two consecutive records, but only those pairs where the first Value starts with 2 and the second starts with 3. There is no common column value to group on.
Desired result:
RowNums Values
2-3 20784, 34523
4-5 22415, 31245
7-8 24588, 32946

You can use lead():
select concat(rownum, '-', next_rownum) as rownums,
       concat(value, ', ', next_value) as "values"
from (select t.*,
             lead(rownum) over (order by rownum) as next_rownum,
             lead(value) over (order by rownum) as next_value
      from t
     ) t
where value like '2%' and next_value like '3%';
This uses Standard SQL syntax. There might be variations depending on your database.
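A minimal self-contained sketch for SQL Server 2016 against the sample data above (the table and column names t, rownum, and value follow the answer and are assumptions):
-- Hedged sketch: builds the question's sample rows inline, then applies lead()
with t(rownum, value) as (
    select * from (values
        (1, 10343), (2, 20784), (3, 34523), (4, 22415),
        (5, 31245), (6, 11345), (7, 24588), (8, 32946)
    ) v(rownum, value)
)
select concat(x.rownum, '-', x.next_rownum) as rownums,
       concat(x.value, ', ', x.next_value) as "values"
from (select t.rownum, t.value,
             lead(t.rownum) over (order by t.rownum) as next_rownum,
             lead(t.value) over (order by t.rownum) as next_value
      from t
     ) x
where x.value like '2%' and x.next_value like '3%';
-- Expected output: 2-3 | 20784, 34523; 4-5 | 22415, 31245; 7-8 | 24588, 32946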

Related

Snowflake: Repeating rows based on column value

How to repeat rows based on a column value in Snowflake using SQL?
I tried a few methods, such as dual and connect by, but they are not working.
I have two columns: Id and Quantity.
For each ID, there are different values of Quantity.
So if you have a count, you can use a generator:
with ten_rows as (
    select row_number() over (order by null) as rn
    from table(generator(ROWCOUNT=>10))
), data(id, count) as (
    select * from values
        (1,2),
        (2,4)
)
select
    d.*
    ,r.rn
from data as d
join ten_rows as r
    on d.count >= r.rn
order by 1,3;
ID  COUNT  RN
-------------
1   2      1
1   2      2
2   4      1
2   4      2
2   4      3
2   4      4
OK, let's start by generating some data. We will create 10 rows, each with a QTY randomly chosen as 1 or 2.
Next we want to duplicate the rows with a QTY of 2 and leave the rows with QTY = 1 as they are.
Simply stack SPLIT_TO_TABLE() and REPEAT() with a LATERAL join and voilà.
Obviously you can change all the parameters to suit your needs; this solution works super fast and is, in my opinion, way better than table generation.
WITH TEN_ROWS AS (
    SELECT ROW_NUMBER() OVER (ORDER BY NULL) SOME_ID,
           UNIFORM(1, 2, RANDOM()) QTY
    FROM TABLE(GENERATOR(ROWCOUNT => 10))
)
SELECT
    TEN_ROWS.*
FROM
    TEN_ROWS,
    LATERAL SPLIT_TO_TABLE(REPEAT('x', QTY - 1), 'x') ALTERNATIVE_APPROACH;
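As a hedged sketch, the same SPLIT_TO_TABLE()/REPEAT() trick applied directly to a table with the question's Id and Quantity columns (the inline data and the names data, id, quantity are assumptions):
with data(id, quantity) as (
    select * from values (1, 2), (2, 4)
)
select d.id, d.quantity
from data as d,
     lateral split_to_table(repeat('x', d.quantity - 1), 'x') g;
-- REPEAT() builds quantity - 1 copies of the delimiter, so SPLIT_TO_TABLE()
-- returns exactly quantity parts and each source row is emitted quantity times.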

SQL group table by "leading rows" without pl/sql

I have this table (short example) with two columns
1 a
2 a
3 a3
4 a
5 a
6 a6
7 a
8 a8
9 a
and I would like to group/partition them into groups separated by those leading "a", ideally to add another column like this, so I can address those groups easily.
1 a 0
2 a 0
3 a3 3
4 a 3
5 a 3
6 a6 6
7 a 6
8 a8 8
9 a 8
The problem is that the setup of the table is dynamic, so I can't use lag or lead functions statically. Any ideas how to do this without pl/sql in Postgres 9.5?
Assuming the leading part is a single character. Hence the expression right(data, -1) works to extract the group name. Adapt to your actual prefix.
The solution uses two window functions, which can't be nested. So we need a subquery or a CTE.
SELECT id, data
, COALESCE(first_value(grp) OVER (PARTITION BY grp_nr ORDER BY id), '0') AS grp
FROM (
SELECT *, NULLIF(right(data, -1), '') AS grp
, count(NULLIF(right(data, -1), '')) OVER (ORDER BY id) AS grp_nr
FROM tbl
) sub;
Produces your desired result exactly.
NULLIF(right(data, -1), '') to get the effective group name or NULL if none.
count() only counts non-null values, so we get a higher count for every new group in the subquery.
In the outer query, we take the first grp value per grp_nr as group name and default to '0' with COALESCE for the first group without name (which has a NULL as group name so far).
We could use min() or max() as outer window function as well, since there is only one non-null value per partition anyway. first_value() is probably cheapest since the rows are sorted already.
Note the group name grp is data type text. You may want to cast to integer, if those are clean (and reliably) integer numbers.
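If you do want an integer, a hedged variant of the same query with the cast added (it assumes every non-empty suffix really is a clean integer):
SELECT id, data
     , COALESCE(first_value(grp) OVER (PARTITION BY grp_nr ORDER BY id), '0')::int AS grp
FROM (
   SELECT *, NULLIF(right(data, -1), '') AS grp
        , count(NULLIF(right(data, -1), '')) OVER (ORDER BY id) AS grp_nr
   FROM tbl
   ) sub;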
This can be achieved by setting rows whose val is exactly a to 0 and all other rows to 1, then using a cumulative sum to get the desired number for each row. The group number increments whenever a new value in the val column is encountered, and all following rows with a plain a keep the same group number as the row before, and so on.
I assume that you just need a distinct number for each group and that the actual number doesn't matter.
select id, val, sum(ex) over(order by id) cm_sum
from (select t.*
,case when val = 'a' then 0 else 1 end ex
from t) x
The result for the query above with the data in question, would be
id val cm_sum
--------------
1 a 0
2 a 0
3 a3 1
4 a 1
5 a 1
6 a6 2
7 a 2
8 a8 3
9 a 3
With the given data, you can use a cumulative max:
select . . .,
coalesce(max(substr(col2, 2)) over (order by col1), 0)
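A minimal runnable sketch of that idea, assuming the id/data column names used in the other answers (substitute col1/col2 as appropriate):
select id, data,
       coalesce(max(nullif(right(data, -1), '')::int) over (order by id), 0) as grp
from tbl;
-- The running max carries the last seen group number forward;
-- rows before the first group default to 0.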
If you don't strictly want the maximum, then it gets a bit more difficult. The ANSI solution is to use the IGNORE NULLs option on LAG(). However, Postgres does not (yet) support that. An alternative is:
select . . ., coalesce(substr(reft.col2, 2), 0)
from (select . . .,
max(case when col2 like 'a_%' then col1 end) over (order by col1) as ref_col1
from t
) tt join
t reft
on tt.ref_col1 = reft.col1
You can also try this:
with mytable as (
    select split_part(t, ' ', 1)::integer id, split_part(t, ' ', 2) myvalue
    from (select unnest(string_to_array($$1 a;2 a;3 a3;4 a;5 a;6 a6;7 a;8 a8;9 a$$,
                                         ';')) t) a
)
select id, myvalue, myresult
from mytable
join (
    select COALESCE(NULLIF(substr(myvalue, 2), ''), '0') myresult, idmin id_down,
           COALESCE(lead(idmin) over (order by myvalue), 999999999999) id_up
    from (
        select myvalue, min(id) idmin from mytable group by 1
    ) a
) b
on id between id_down and id_up - 1

columns to rows change in oracle sql

I have columns a and b in table x, and I want to change this column data into rows.
It is possible to have duplicate values in the table, but in the columns-to-rows change only distinct values should appear.
E.G:
a b
1 2
1 11
3 4
5 6
7 8
9 10
......etc
Result 1 (query 1) should be 1-2,1-11,3-4,5-6,7-8,9-10, ...etc.
Result 2 (query 2) should be 1,3,5,7,9, ...etc (only one 1 must appear, as we have duplicate data in column a).
How can I achieve this in Oracle SQL?
Please help.
For Oracle 11 and later, use the listagg() function: in the first query concatenate the columns, and in the second select the distinct values first.
Query 1:
select listagg(a||'-'||b, ',') within group (order by a, b) result from t
RESULT
------------------------------
1-2,1-11,3-4,5-6,7-8,9-10
Query 2:
select listagg(a, ',') within group (order by a) result
from (select distinct a from t)
RESULT
------------------------------
1,3,5,7,9
For older versions you can use wmsys.wm_concat.
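For example, a hedged sketch with wm_concat (an undocumented function, removed in Oracle 12c; it takes a single argument and does not guarantee element order):
select wmsys.wm_concat(a || '-' || b) as result from t;
select wmsys.wm_concat(a) as result from (select distinct a from t);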

Joining next Sequential Row

I am planning an SQL statement right now and would need someone to look over my thoughts.
This is my Table:
id stat period
--- ------- --------
1 10 1/1/2008
2 25 2/1/2008
3 5 3/1/2008
4 15 4/1/2008
5 30 5/1/2008
6 9 6/1/2008
7 22 7/1/2008
8 29 8/1/2008
Create Table
CREATE TABLE tbstats
(
id INT IDENTITY(1, 1) PRIMARY KEY,
stat INT NOT NULL,
period DATETIME NOT NULL
)
go
INSERT INTO tbstats
(stat,period)
SELECT 10,CONVERT(DATETIME, '20080101')
UNION ALL
SELECT 25,CONVERT(DATETIME, '20080102')
UNION ALL
SELECT 5,CONVERT(DATETIME, '20080103')
UNION ALL
SELECT 15,CONVERT(DATETIME, '20080104')
UNION ALL
SELECT 30,CONVERT(DATETIME, '20080105')
UNION ALL
SELECT 9,CONVERT(DATETIME, '20080106')
UNION ALL
SELECT 22,CONVERT(DATETIME, '20080107')
UNION ALL
SELECT 29,CONVERT(DATETIME, '20080108')
go
I want to calculate the difference between each statistic and the next, and then calculate the mean value of the 'gaps.'
Thoughts:
I need to join each record with its subsequent row. I can do that using the ever-flexible joining syntax, thanks to the fact that I know the id field is an integer sequence with no gaps.
By aliasing the table I could incorporate it into the SQL query twice, then join them together in a staggered fashion by adding 1 to the id of the first aliased table. The first record in the table has an id of 1. 1 + 1 = 2 so it should join on the row with id of 2 in the second aliased table. And so on.
Now I would simply subtract one from the other.
Then I would use the ABS function to ensure that I always get positive integers as a result of the subtraction regardless of which side of the expression is the higher figure.
Is there an easier way to achieve what I want?
The lead analytic function should do the trick:
SELECT period, stat, stat - LEAD(stat) OVER (ORDER BY period) AS gap
FROM tbstats
The average value of the gaps can be done by calculating the difference between the first value and the last value and dividing by one less than the number of elements:
select sum(case when seqnum = num then stat else - stat end) * 1.0 / (max(num) - 1)
from (select period, stat, row_number() over (order by period) as seqnum,
             count(*) over () as num
      from tbstats
     ) t
where seqnum = num or seqnum = 1;
Of course, you can also do the calculation using lead(), but this will also work in SQL Server 2005 and 2008.
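A hedged sketch of that lead() variant (SQL Server 2012+; AVG ignores the NULL gap produced for the last row, so it averages the n - 1 differences):
select avg(1.0 * gap) as avg_gap
from (select lead(stat) over (order by period) - stat as gap
      from tbstats
     ) t;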
You can also achieve this by using a join:
SELECT t1.period,
t1.stat,
t1.stat - t2.stat gap
FROM tbstats t1
LEFT JOIN tbstats t2
ON t1.id + 1 = t2.id
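A hedged follow-up that wraps the join in ABS() and AVG() to get the mean gap the question asks for (a sketch against the tbstats table created above):
SELECT AVG(1.0 * ABS(t1.stat - t2.stat)) AS avg_gap
FROM tbstats t1
JOIN tbstats t2
  ON t1.id + 1 = t2.id;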
To calculate the difference between each statistic and the next, LEAD() and LAG() may be the simplest option. You provide an ORDER BY, and LEAD(something) returns the next something and LAG(something) returns the previous something in the given order.
select
x.id thisStatId,
LAG(x.id) OVER (ORDER BY x.id) lastStatId,
x.stat thisStatValue,
LAG(x.stat) OVER (ORDER BY x.id) lastStatValue,
x.stat - LAG(x.stat) OVER (ORDER BY x.id) diff
from tbStats x

Grouping values based on sequence in SQL

Is there a way, using just select statements, to create a table that contains in one column the range of the repeating values, as in this example?
Example:
from the input table:
id value:
1 25
2 25
3 24
4 25
5 25
6 25
7 28
8 28
9 11
should result:
range value
1-2 25
3-3 24
4-6 25
7-8 28
9-9 11
Note: The id values from the input table are always in order and the difference between 2 values ordered by id is always equal to 1
You want to find sequences of consecutive values. Here is one approach, using window functions:
select min(id), max(id), value
from (select id, value,
row_number() over (order by id) as rownum,
row_number() over (partition by value order by id) as seqnum
from t
) t
group by (rownum - seqnum), value;
This takes into account that a value might appear in different places among the rows. The idea is simple. rownum is a sequential number. seqnum is a sequential number that increments for a given value. The difference between these is constant for values that are in a row.
Let me add, if you actually want the expression as "1-2", we need more information. Assuming the id is a character string, one of the following would work, depending on the database:
select min(id)+'-'+max(id), . . .
select concat(min(id), '-', max(id)), . . .
select min(id)||'-'||max(id), . . .
If the id is an integer (as I suspect), then you need to replace the ids in the above expressions with cast(id as varchar(32)), except in Oracle, where you would use cast(id as varchar2(32)).
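For instance, a hedged end-to-end version for SQL Server, combining the grouping query with the concat()/cast() suggestions above (the table name t is carried over from the answer):
select concat(cast(min(id) as varchar(32)), '-', cast(max(id) as varchar(32))) as id_range,
       value
from (select id, value,
             row_number() over (order by id) as rownum,
             row_number() over (partition by value order by id) as seqnum
      from t
     ) t
group by (rownum - seqnum), value
order by min(id);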
SELECT CONCAT(MIN(id), '-', MAX(id)) AS id_range, value
FROM input_table
GROUP BY value
Maybe this:
SELECT MIN(ID), MAX(ID), VALUE FROM TABLE GROUP BY VALUE