SQL: make a geometric sequence from a series of bit values

I have this table:
declare @Table table (value int)
insert @Table select 0
insert @Table select 1
insert @Table select 1
insert @Table select 1
insert @Table select 0
insert @Table select 1
insert @Table select 1
Now I need to write a SELECT query that adds a column. This column should form a geometric sequence wherever there is a series of 1s in the value column.
This would be the result:

I would phrase this as an arithmetic problem. First, your problem suggests that the ordering of rows is important. Hence, you need a column to specify the ordering. I assume there is an id column with this information.
Then, to create the groups where the sequences start, do a cumulative sum of the 0s, so that all the 1s following a 0 are in the same group. Given the data, you can express this as sum(1 - value) over (order by id).
Then just use arithmetic:
select t.*,
value * power(2, row_number() over (partition by grp order by id) - 1) as generatedsequence
from (select t.*, sum(1 - value) over (order by id) as grp
from @Table t
) t;
Here is a db<>fiddle.
The arithmetic is that you want to enumerate the values in the group and then raise 2 to that power (except when value is 0). So the subquery returns:
id value grp
1 0 1
2 1 1
3 1 1
4 1 1
5 0 2
6 1 2
7 1 2
The row_number() then enumerates the values within each grp.
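As a rough illustration only, here is a minimal Python sketch of the same arithmetic (it is not T-SQL, and the function name generated_sequence is hypothetical). It assumes the values are already in id order, builds grp as a running sum of (1 - value), and then numbers the rows within each grp:

```python
from itertools import accumulate

def generated_sequence(values, base=2):
    """Mimic sum(1 - value) over (order by id) plus row_number() per grp."""
    # Running sum of (1 - value): every 0 starts a new group.
    grps = list(accumulate(1 - v for v in values))
    rows_seen = {}  # grp -> number of rows seen so far (row_number)
    result = []
    for v, g in zip(values, grps):
        rows_seen[g] = rows_seen.get(g, 0) + 1
        # value * power(base, row_number - 1); a 0 value always yields 0.
        result.append(v * base ** (rows_seen[g] - 1))
    return grps, result

grps, seq = generated_sequence([0, 1, 1, 1, 0, 1, 1])
print(grps)  # group number per row
print(seq)   # generated sequence per row
```

Because each 0 row takes row number 1 in its group, the first 1 after it gets 2 rather than 1.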

OK.. first things first, in a database there is no inherent ordering of the data within a table. Therefore, to do what you want, you will need to make a field to sort/order on. In this case, I'm using an IDENTITY field called 'SortID'.
CREATE TABLE #Table (SortID int IDENTITY(1,1), BitValue bit);
INSERT INTO #Table (BitValue)
VALUES (0), (1), (1), (1), (0), (1), (1);
This gives a table with the following starting data
SortID BitValue
1 0
2 1
3 1
4 1
5 0
6 1
7 1
Now, to solve the problem
One way to do it is via a recursive CTE - where the value of the current row is based on the values of the previous rows.
However, recursive CTEs can have performance issues (they're loops, basically) so it's better to do a set-based approach if possible.
In this case, as you want a geometric sequence which is 2 to the power of the relevant row number, we don't need the previous rows to calculate this row - we only need to know the row number
The following approach
Uses a CTE to make a new field called 'GroupNum' which is used to group the rows together. Every time a row has a BitValue of 0, it increments the GroupNum by 1.
In your example, the first four rows would have GroupNum = 1, the remaining three would have GroupNum = 2
Follows the above with a window function - partitioning by those group numbers, and getting the row_number (minus one) within each group.
The final result is set as the power of a variable @a to the relevant row_number.
To match your example, I have used @a = 2 as the base for the POWER function.
DECLARE @a int;
SET @a = 2;
WITH Grouped_BitValues AS
(SELECT SortID, BitValue,
CASE WHEN BitValue = 0 THEN 1 ELSE 0 END AS NewGrpFlag,
SUM(CASE WHEN BitValue = 0 THEN 1 ELSE 0 END) OVER (ORDER BY SortID) AS GroupNum
FROM #Table
)
SELECT BitValue, POWER(@a, ROW_NUMBER() OVER (PARTITION BY GroupNum ORDER BY SortID) -1) AS Geometric_Sequence
FROM Grouped_BitValues
ORDER BY SortID;
And here are the results
BitValue Geometric_Sequence
0 1
1 2
1 4
1 8
0 1
1 2
1 4
Note that in your question, 2^0 should be 1, not 0, for a proper geometric sequence. If instead you wanted 0, you'd need to wrap Geometric_Sequence in a CASE expression (e.g., CASE WHEN BitValue = 0 THEN 0 ELSE POWER(...) END AS Geometric_Sequence).
Here is a db<>fiddle with the setup, the answer, and the components of the answer (e.g., the CTE and calculations) to demonstrate how it's calculated.

Related

Repeating a SQL INSERT query a particular number of times with an incremental value

I've got a SQL query for inserting data into a table consisting of values for StorageRowNo and StorageID. The goal is to repeat this query a specified number of times with StorageRowNo increasing by 1 every time the query is repeated. Any input would be greatly appreciated!
INSERT [dbo].[StorageRow]
SELECT StorageRowNo, StorageID
FROM (VALUES (1, 2)) V (StorageRowNo, StorageID)
WHERE NOT EXISTS (SELECT 1
FROM [dbo].[StorageRow] C
WHERE C.StorageRowNo = V.StorageRowNo
AND C.StorageID = V.StorageID);
Expected output would be something like this if the specified number were 3.
StorageRowNo StorageID
1 2
2 2
3 2
Use straight SQL if you can:
DECLARE @numRepeats INT = 10; -- number of times to repeat the query
INSERT [dbo].[StorageRow]
SELECT V.StorageRowNo, V.StorageID
FROM (select row_number() over(order by (select null)) as StorageRowNo
,2 as StorageID
from master..spt_values
) V
WHERE NOT EXISTS (SELECT 1
FROM [dbo].[StorageRow] C
WHERE C.StorageRowNo = V.StorageRowNo
AND C.StorageID = V.StorageID)
AND V.StorageRowNo <= @numRepeats;
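To make the set-based idea concrete, here is a small Python sketch (not SQL; rows_to_insert is a hypothetical name). It assumes the intent is to generate row numbers 1 through the repeat count in one pass and keep only the pairs that are not already stored, instead of re-running the INSERT in a loop:

```python
def rows_to_insert(existing, storage_id, num_repeats):
    """Return the (StorageRowNo, StorageID) pairs that still need inserting."""
    # Generate 1..num_repeats, like row_number() over a numbers source.
    candidates = [(n, storage_id) for n in range(1, num_repeats + 1)]
    # Keep only pairs not already present, like the NOT EXISTS filter.
    return [row for row in candidates if row not in existing]

# Example: row 2 already exists for StorageID 2, so only 1 and 3 are new.
print(rows_to_insert({(2, 2)}, storage_id=2, num_repeats=3))
```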

renumbering in a column when adding a row sql

For a table like
create table Stations_in_route
(
ID_station_in_route int primary key,
ID_route int,
ID_station int,
Number_in_route int not null
)
There is the following trigger that changes the values in the Number_in_route column after a new row is added to the route. The list of numbers in the route must remain consistent.
create trigger stations_in_route_after_insert on Stations_in_route
after insert
as
if exists
(select *from Stations_in_route
where Stations_in_route.ID_station_in_route not in (select ID_station_in_route from inserted)
and Stations_in_route.ID_route in (select ID_route from inserted)
and Stations_in_route.Number_in_route in (select Number_in_route from inserted))
begin
update Stations_in_route
set Number_in_route = Number_in_route + 1
where Stations_in_route.ID_station_in_route not in (select ID_station_in_route from inserted)
and Stations_in_route.ID_route in (select ID_route from inserted)
and Stations_in_route.Number_in_route >= (select Number_in_route from inserted where Stations_in_route.ID_route = inserted.ID_route)
end
This trigger throws an error when several rows are inserted into the same ID_route in one statement:
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
For example,
Insert into Stations_in_route values(25, 4, 11, 3),(26, 4, 10, 5)
How to fix?
ID_station_in_route ID_route ID_station Number_in_route
1 4 1 1
2 4 2 2
3 4 3 3
4 4 4 4
5 4 5 5
6 4 6 6
7 4 7 7
8 4 8 8
I expect the list to look like this after the insert:
ID_station_in_route ID_route ID_station Number_in_route
1 4 1 1
2 4 2 2
25 4 11 3
3 4 3 4
26 4 10 5
4 4 4 6
5 4 5 7
6 4 6 8
7 4 7 9
8 4 8 10
This is not the whole table, as there are other routes too.
Based on the requirements, when you add new stops to the route, you need to insert them into their desired sequence correctly, and push all existing stops from that point forward so that a contiguous sequence is maintained. When you insert one row this isn't very hard (just number_in_route + 1 where number_in_route > new_number_in_route), but when you insert more rows, you need to basically push the entire set of subsequent stops by 1 for each new row. To illustrate, let's say you start with this:
If we insert two new rows, such as:
INSERT dbo.Stations_in_route
(
ID_station_in_route,
ID_route,
ID_station,
Number_in_route
)
VALUES (25, 4, 11, 3),(26, 4, 10, 5);
-- add a stop at 3 ^ ^
----------------- add a stop at 5 ^
We can illustrate this by slowing it down into separate steps. First, we need to add this row at position #3:
And we do this by pushing all the rows > 3 down by 1:
But now when we add this row at position #5:
That's the new position #5, after the previous shift, so it looks like this:
We can do this with the following trigger, which is possibly a little more complicated than it has to be, but is better IMHO than tedious loops which might otherwise be required.
CREATE TRIGGER dbo.tr_ins_Stations_in_route ON dbo.Stations_in_route
FOR INSERT AS
BEGIN
;WITH x AS
(
SELECT priority = 1, *, offset = ROW_NUMBER() OVER
(PARTITION BY ID_route ORDER BY Number_in_route)
FROM inserted AS i
UNION ALL
SELECT priority = 2, s.*, offset = NULL FROM dbo.Stations_in_route AS s
WHERE s.ID_route IN (SELECT ID_route FROM inserted)
),
y AS
(
SELECT *, rough_rank = Number_in_route
+ COALESCE(MAX(offset) OVER (PARTITION BY ID_Route
ORDER BY Number_in_route ROWS UNBOUNDED PRECEDING),0)
- COALESCE(offset, 0),
tie_break = ROW_NUMBER() OVER
(PARTITION BY ID_route, ID_station_in_route ORDER BY priority)
FROM x
),
z AS
(
SELECT *, new_Number_in_route = ROW_NUMBER() OVER
(PARTITION BY ID_Route ORDER BY rough_rank, priority)
FROM y WHERE tie_break = 1
)
UPDATE s SET s.Number_in_route = z.new_Number_in_route
FROM dbo.Stations_in_route AS s
INNER JOIN z ON s.ID_route = z.ID_route
AND s.ID_station_in_route = z.ID_station_in_route;
END
Working example db<>fiddle
I've mentioned a couple of times that you might want to handle ties for new rows, e.g. if the insert happened to be:
Insert into Stations_in_route values(25, 4, 11, 3),(26, 4, 10, 3)
For that you can add additional tie-breaking criteria to this clause:
new_Number_in_route = ROW_NUMBER() OVER
(PARTITION BY ID_Route ORDER BY rough_rank, priority)
e.g.:
new_Number_in_route = ROW_NUMBER() OVER
(PARTITION BY ID_Route ORDER BY rough_rank, priority,
ID_station_in_route DESC)
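The step-by-step shifting described above (add a stop at position 3, then another at the new position 5) can be sketched in Python. This is only an illustration of the intended outcome for a single route, not of how the trigger computes it; insert_stops is a hypothetical helper, and it assumes new stops are applied in ascending order of their requested position:

```python
def insert_stops(route, new_stops):
    """route: station-in-route IDs in current order (Number_in_route 1..n).
    new_stops: (ID_station_in_route, requested Number_in_route) pairs."""
    result = list(route)
    # Apply the new stops one at a time, lowest requested position first.
    # Each insert pushes every later stop down by one.
    for stop_id, position in sorted(new_stops, key=lambda s: s[1]):
        result.insert(position - 1, stop_id)
    # Renumber contiguously from 1.
    return [(stop_id, n) for n, stop_id in enumerate(result, start=1)]

for row in insert_stops([1, 2, 3, 4, 5, 6, 7, 8], [(25, 3), (26, 5)]):
    print(row)  # (ID_station_in_route, Number_in_route)
```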
I'm unable to repro the exception with the test code/data in the question, however I'm gonna guess that the issue is with this bit of the code in the trigger:
AND Stations_in_route.Number_in_route >=
(
SELECT Number_in_route
FROM inserted
WHERE Stations_in_route.ID_route = inserted.ID_route
)
The engine there will implicitly expect that subquery on the right-side of the >= operator to return a scalar result (single row, single column result), however the inserted table is in fact, a table...which may contain multiple records (as would be the case in a multi-row insert/update/etc. type statement as outlined in your example). Given that the filter (i.e. WHERE clause) in that subquery isn't guaranteed to be unique (ID_route doesn't appear to be unique, and in your example you have an insert statement that actually inserts multiple rows with the same ID_route value), then it's certainly possible that query will return a non-scalar result.
To fix that, you'd need to adjust that subquery to guarantee a result of a scalar value (single row and single column). You've guaranteed the single column already with the selector...now you need to add logic to guarantee a single result/record as well. That could include one or more of the following (or possibly other things also):
Wrap the selected Number_in_route column in an aggregate function (i.e. a MAX() perhaps?)
Add a TOP 1 with an ORDER BY to get the record you want to compare with
Add additional filters to the WHERE clause to ensure a single result is returned

Count length of consecutive duplicate values for each id

I have a table as shown in the screenshot (first two columns) and I need to create a column like the last one. I'm trying to calculate the length of each sequence of consecutive values for each id.
For this, the last column is required. I played around with
row_number() over (partition by id, value)
but did not have much success, since the circled number was (quite predictably) computed as 2 instead of 1.
Please help!
First of all, we need a way to define how the rows are ordered. For example, in your sample data there is no way to be sure that the 'first' row (1, 1) will always be displayed before the 'second' row (1, 0).
That's why I have added an identity column to my sample data. In your real case, the rows can be ordered by a row ID, a date column or something else, but you need to ensure they can be sorted by a unique criterion.
So, the task is pretty simple:
calculate trigger switch - when value is changed
calculate groups
calculate rows
That's it. I have used common table expressions and left all the columns in so that the logic is easy to follow. You are free to break this into separate statements and remove some of the columns.
DECLARE @DataSource TABLE
(
[RowID] INT IDENTITY(1, 1)
,[ID] INT
,[value] INT
);
INSERT INTO @DataSource ([ID], [value])
VALUES (1, 1)
,(1, 0)
,(1, 0)
,(1, 1)
,(1, 1)
,(1, 1)
--
,(2, 0)
,(2, 1)
,(2, 0)
,(2, 0);
WITH DataSourceWithSwitch AS
(
SELECT *
,IIF(LAG([value]) OVER (PARTITION BY [ID] ORDER BY [RowID]) = [value], 0, 1) AS [Switch]
FROM @DataSource
), DataSourceWithGroup AS
(
SELECT *
,SUM([Switch]) OVER (PARTITION BY [ID] ORDER BY [RowID]) AS [Group]
FROM DataSourceWithSwitch
)
SELECT *
,ROW_NUMBER() OVER (PARTITION BY [ID], [Group] ORDER BY [RowID]) AS [GroupRowID]
FROM DataSourceWithGroup
ORDER BY [RowID];
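The three steps (switch, group, row number) can also be traced in a short Python sketch. This is an illustration of the logic only, assuming the rows are already in RowID order; group_row_ids is a hypothetical name:

```python
def group_row_ids(rows):
    """rows: (ID, value) pairs in RowID order.
    Returns (ID, value, Switch, Group, GroupRowID) tuples."""
    out = []
    prev_id = prev_value = None
    group = 0
    row_in_group = 0
    for id_, value in rows:
        if id_ != prev_id:
            group = 0  # PARTITION BY [ID]: restart the running sum
        # Switch is 0 only when the previous row of the same ID has the same value.
        switch = 0 if (id_ == prev_id and value == prev_value) else 1
        group += switch  # running SUM([Switch])
        row_in_group = 1 if switch else row_in_group + 1  # ROW_NUMBER per group
        out.append((id_, value, switch, group, row_in_group))
        prev_id, prev_value = id_, value
    return out

sample = [(1, 1), (1, 0), (1, 0), (1, 1), (1, 1), (1, 1),
          (2, 0), (2, 1), (2, 0), (2, 0)]
for row in group_row_ids(sample):
    print(row)
```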
You want results that depend on the actual ordering of the data in the data source. In SQL you operate on relations, and only sometimes on an ordered set of a relation's rows. Your desired end result is not well-defined in terms of SQL unless you introduce an additional column in your source table by which your data is ordered (e.g. an auto-increment or timestamp column).
Note: this answers the original question and doesn't take into account additional timestamp column mentioned in the comment. I'm not updating my answer since there is already an accepted answer.
One way to solve it could be through a recursive CTE:
create table #tmp (i int identity,id int, value int, rn int);
insert into #tmp (id,value) VALUES
(1,1),(1,0),(1,0),(1,1),(1,1),(1,1),
(2,0),(2,1),(2,0),(2,0);
WITH numbered AS (
SELECT i,id,value, 1 seq FROM #tmp WHERE i=1 UNION ALL
SELECT a.i,a.id,a.value, CASE WHEN a.id=b.id AND a.value=b.value THEN b.seq+1 ELSE 1 END
FROM #tmp a INNER JOIN numbered b ON a.i=b.i+1
)
SELECT * FROM numbered -- OPTION (MAXRECURSION 1000)
This will return the following:
i id value seq
1 1 1 1
2 1 0 1
3 1 0 2
4 1 1 1
5 1 1 2
6 1 1 3
7 2 0 1
8 2 1 1
9 2 0 1
10 2 0 2
See my little demo here: https://rextester.com/ZZEIU93657
A prerequisite for the CTE to work is a sequenced table (e.g. a table with an identity column in it) as a source. In my example I introduced the column i for this. As a starting point I need to find the first entry of the source table. In my case this was the entry with i=1.
For a longer source table you might run into a recursion-limit error as the default for MAXRECURSION is 100. In this case you should uncomment the OPTION setting behind my SELECT clause above. You can either set it to a higher value (like shown) or switch it off completely by setting it to 0.
IMHO, this is easier to do with a cursor and a loop.
Maybe there is a way to do the job with a self-join:
declare @t table (id int, val int)
insert into @t (id, val)
select 1 as id, 1 as val
union all select 1, 0
union all select 1, 0
union all select 1, 1
union all select 1, 1
union all select 1, 1
;with cte1 (id , val , num ) as
(
select id, val, row_number() over (ORDER BY (SELECT 1)) as num from @t
)
, cte2 (id, val, num, N) as
(
select id, val, num, 1 from cte1 where num = 1
union all
select t1.id, t1.val, t1.num,
case when t1.id=t2.id and t1.val=t2.val then t2.N + 1 else 1 end
from cte1 t1 inner join cte2 t2 on t1.num = t2.num + 1 where t1.num > 1
)
select * from cte2

How can I put a column in SQL that shows a 1 if the value in another column is unique and 0 if it's duplicate?

I would like to build a column that puts the value 1 if it's the first occurrence of a value in one row and 0 if it's not the first occurrence.
You could use a window function and CTE to assign a rownumber to each partition of "data" and then have flag set to 1 when rownumber is 1 else 0.
Rextester Demo
This assumes:
the field to evaluate is called "Data"
the 1st entry isn't considered a duplicate, but all later ones are. "1st" is tricky here because no order is defined: whichever entry the system encounters first is treated as the non-duplicate, and unless we refine the ORDER BY in the window function, that non-duplicate could change from run to run.
With CTE AS (SELECT Data, Row_Number() over (partition by Data order by Data) RN
FROM TableName T)
SELECT Data, case when RN = 1 then 1 else 0 end as Flg
FROM CTE
Returns something like this given my sample data used:
Data Flg
1 A 1
2 A 0
3 B 1
4 B 0
5 B 0
6 C 1
7 C 0
First of all it's not possible to automagically set and update this value in the table. They'd have to be recalculated after every modification to the table and calculated columns can't refer to other rows, or execute full queries.
You can use a window function like LAG to check whether a previous row exists in a set, according to the sort order you provide, e.g.:
declare @t table (data varchar(10))
insert into @t
values ('A'),
('B'),
('A'),
('B'),
('B'),
('C'),
('C')
SELECT
Data,
iif( lag(1) over (partition by Data order by Data) is null,1,0) as Value
FROM @t
This will return :
Data Value
A 1
A 0
B 1
B 0
B 0
C 1
C 0
lag(1) over (partition by Data order by Data) partitions the rows by the values in the data column and sorts them by the same value, essentially producing a random order. It then returns the "previous" value for that partition. For the first row, there is none and LAG returns null.
iif( ... is null,1,0) as Value checks the result and returns 1 if it's null, 0 otherwise.
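For comparison, here is a minimal Python sketch of a first-occurrence flag (the function name is hypothetical). Unlike the query, which has no defined order within each partition, this sketch assumes the input list order is the order that matters, so "first" is well defined:

```python
def first_occurrence_flags(values):
    """Return 1 for the first time a value appears in the list, else 0."""
    seen = set()
    flags = []
    for v in values:
        flags.append(0 if v in seen else 1)
        seen.add(v)
    return flags

data = ['A', 'B', 'A', 'B', 'B', 'C', 'C']
print(list(zip(data, first_occurrence_flags(data))))
```

In SQL the equivalent guarantee needs an explicit ordering column in the window's ORDER BY.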
If you want to store the changes to the table, you can use an updatable CTE :
declare @t table (data varchar(10),value bit)
insert into @t(data)
values ('A'),
('B'),
('A'),
('B'),
('B'),
('C'),
('C');
with x as ( SELECT
Data,value,
iif( lag(1) over (partition by Data order by Data) is null,1,0) as Val
FROM @t)
update x
set value=val;
select *
from @t
order by data
You could use a trigger on INSERT, UPDATE, DELETE to run that query on every modification and update the table. It would be better, though, to make all modifications first and execute the UPDATE only at the end.
SQL tables represent unordered sets. Once the table has been created, it does not hold enough information to answer this question unless it has a column that records the ordering.
If you do have such a column, you can use a query such as this to assign the value:
select value,
(case when row_number() over (partition by value order by ?) = 1
then 1 else 0
end) as flag
from t;
The ? is a placeholder for a column such as createdAt or id, which specifies the ordering of the values in the table.
If you don't have a column that specifies the ordering, you could do this with a trigger that checks if the value already exists. However, rather than writing a trigger, I would suggest creating the table with an identity() column or createdAt column (or both) to capture insertion order.

Teradata: Recursively Subtract

I have a set of data as follows:
Product Customer Sequence Amount
A 123 1 928.69
A 123 2 5032.81
A 123 3 6499.19
A 123 4 7908.57
What I want to do is recursively subtract the amounts based on the result of the previous subtraction (keeping the first amount as-is), and put the outcome into a 'Result' column
e.g. Subtract 0 from 928.69 = 928.69, subtract 928.69 from 5032.81 = 4104.12, subtract 4104.12 from 6499.19 = 2395.07, etc (for each product/customer)
The results I'm trying to achieve are:
Product Customer Sequence Amount Result
A 123 1 928.69 928.69
A 123 2 5032.81 4104.12
A 123 3 6499.19 2395.07
A 123 4 7908.57 5513.50
I had been trying to achieve this using combinations of LEAD & LAG, but couldn't figure out how to use the result in the next row.
I'm thinking it's possible using a recursive statement, iterating over the sequence, however I'm not familiar with teradata recursion and couldn't successfully adapt the samples I found.
Can anyone please direct me on how to format a recursive teradata SQL statement to achieve the above result? I'm also open to non-recursive options if there are any.
CREATE VOLATILE TABLE MY_TEST (Product CHAR(1), Customer INTEGER, Sequence INTEGER, Amount DECIMAL(16,2)) ON COMMIT PRESERVE ROWS;
INSERT INTO MY_TEST VALUES ('A', 123, 1, 928.69);
INSERT INTO MY_TEST VALUES ('A', 123, 2, 5032.81);
INSERT INTO MY_TEST VALUES ('A', 123, 3, 6499.19);
INSERT INTO MY_TEST VALUES ('A', 123, 4, 7908.57);
This is really weird because of the alternation of the + and -.
If you know the result is always positive, then this works (using the MY_TEST table from the question):
select t.*,
abs(sum(case when sequence mod 2 = 1 then -amount else amount end)
over (partition by product, customer order by sequence rows unbounded preceding)) as result_amount
from MY_TEST t;
The abs() is really a shortcut. If the resulting value could be negative, you can have an outer case expression to determine if the result should be multiplied by -1 or 1:
select t.*,
((case when sequence mod 2 = 1 then -1 else 1 end) *
sum(case when sequence mod 2 = 1 then -amount else amount end)
over (partition by product, customer order by sequence rows unbounded preceding)
) as result_amount
from MY_TEST t;
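To see why the alternating signs work, note that Result(n) = Amount(n) - Result(n-1) expands to Amount(n) - Amount(n-1) + Amount(n-2) - ..., which is a running sum with alternating signs. The following Python sketch (an illustration only, with hypothetical function names) computes the result both ways for one product/customer and uses Decimal to avoid floating-point noise:

```python
from decimal import Decimal

def recursive_results(amounts):
    """Direct definition: result = amount - previous result."""
    results, prev = [], Decimal("0")
    for a in amounts:
        prev = a - prev
        results.append(prev)
    return results

def alternating_sum_results(amounts):
    """Running sum with odd sequence numbers negated, then sign-corrected."""
    results, running = [], Decimal("0")
    for seq, a in enumerate(amounts, start=1):
        odd = seq % 2 == 1
        running += -a if odd else a
        results.append(-running if odd else running)
    return results

amounts = [Decimal("928.69"), Decimal("5032.81"),
           Decimal("6499.19"), Decimal("7908.57")]
print(recursive_results(amounts))
print(alternating_sum_results(amounts))
```

Both functions should agree on every row, which is what lets the window-function form stand in for the recursion.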
select A.col_A - B.der_col_A
from your_table A
join (select col_B,
coalesce(min(col_A) over (partition by col_B order by col_A rows between 1 following and 1 following), 0) as der_col_A
from your_table) B
on (A.col_B = B.col_B);
Replace col_A and col_B with your key columns: Product, Customer and Sequence in your case.