Count rows with equal values in a window function - sql

I have a time series in a SQLite Database and want to analyze it.
The important part of the time series consists of a column with different but not unique string values.
I want to do something like this:
Value concat countValue
A A 1
A A,A 2
B A,A,B 1
B A,B,B 2
B B,B,B 3
C B,B,C 1
B B,C,B 2
I don't know how to get the countValue column. It should count all Values in the window that are equal to the current row's Value.
I tried this, but it counts all Values in the window rather than only the Values equal to the current row's Value.
SELECT
Value,
group_concat(Value) OVER wind AS concat,
Sum(Case When Value Like Value Then 1 Else 0 End) OVER wind AS countValue
FROM TimeSeries
WINDOW
wind AS (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
ORDER BY
date
;
The query is also limited by these factors:
The query should work with any amount of unique Values
The query should work with any Partition Size (ROWS BETWEEN n PRECEDING AND CURRENT ROW)
Is this even possible using only SQL?

Here is an approach using string functions:
select
value,
group_concat(value) over wind as concat,
(
length(group_concat(value) over wind) - length(replace(group_concat(value) over wind, value, ''))
) / length(value) cnt_value
from timeseries
window wind as (order by date rows between 2 preceding and current row)
order by date;
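
The query above can be tried end to end with a small sketch that runs it through Python's sqlite3 module. The dates below are invented sample data that reproduce the Value sequence from the question, and the snippet assumes an SQLite build with window function support (3.25 or later).

```python
import sqlite3

# Hypothetical sample data: the dates are made up, the values follow the question.
rows = [
    ("2021-01-01", "A"),
    ("2021-01-02", "A"),
    ("2021-01-03", "B"),
    ("2021-01-04", "B"),
    ("2021-01-05", "B"),
    ("2021-01-06", "C"),
    ("2021-01-07", "B"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE timeseries (date TEXT, value TEXT)")
conn.executemany("INSERT INTO timeseries VALUES (?, ?)", rows)

# Occurrences of value = (length of the concatenation
#   - length of the concatenation with value removed) / length(value)
query = """
SELECT
    value,
    group_concat(value) OVER wind AS concat,
    (length(group_concat(value) OVER wind)
     - length(replace(group_concat(value) OVER wind, value, '')))
    / length(value) AS cnt_value
FROM timeseries
WINDOW wind AS (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW)
ORDER BY date
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

One caveat with this approach: replace() removes every substring match, so if one value is contained in another (for example 'A' and 'AB'), the count can come out too high.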

Related

Row numbering based on contiguous data?

I need to assign numbers to rows based on a date. The rule is that the same number is assigned to multiple contiguous rows with the same date. When a row's date value differs from the previous row's date value, the number is incremented. The result set would look something like this (the first column would be used to determine row order):
1 7/1/2021 1
2 7/2/2021 2
3 7/2/2021 2
4 7/1/2021 3
5 7/2/2021 4
The value of the date is not what's relevant in this case. As you can see, there are repeats of the same date that get assigned different numeric values because they are not contiguous. I'm struggling to figure out how I would accomplish this.
This is a Gaps & Islands problem. You need to provide the extra ordering columns for the query to make sense.
If you added these, the solution would go along the lines of:
select
d,
1 + sum(inc) over(order by ordering_columns) as grp
from (
select d, ordering_columns,
case when d <> lag(d) over(order by ordering_columns) then 1 else 0 end as inc
from t
) x
order by ordering_columns
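
As a minimal sketch of the query above, the following runs it in SQLite through Python's sqlite3 module. The id column is a hypothetical stand-in for ordering_columns, and the rows mirror the example in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "id" is an assumed ordering column; the question does not name one.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, d TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        (1, "2021-07-01"),
        (2, "2021-07-02"),
        (3, "2021-07-02"),
        (4, "2021-07-01"),
        (5, "2021-07-02"),
    ],
)

# inc is 1 whenever the date differs from the previous row's date;
# the running sum of inc then numbers each contiguous island.
query = """
SELECT id, d, 1 + SUM(inc) OVER (ORDER BY id) AS grp
FROM (
    SELECT id, d,
           CASE WHEN d <> LAG(d) OVER (ORDER BY id) THEN 1 ELSE 0 END AS inc
    FROM t
) x
ORDER BY id
"""

groups = [row[2] for row in conn.execute(query)]
print(groups)
```

On the first row LAG() returns NULL, so the comparison is NULL and the CASE falls through to 0, which is why numbering starts at 1.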

SQLite Select SUM Rolling Window

I have this table in SQLite
Table [Ticks]
Fields: 2
[Value]: INT
[Time]: DATETIME
And I want to select a window or partition of 10 hours of values and make a sum of those values then move one row forward and do the same for last 10 hours through the whole range of records.
The value field contains -1 or 1
How can I achieve this? Is this possible with the WINDOW, PARTITION query?
You can convert the time to seconds and then use a RANGE window frame:
select t.*,
sum(value) over (order by strftime('%s', time) + 0
range between 35999 preceding and current row
) as sum_10hours
from ticks t;
The strftime() expression converts the value to seconds. The range covers the (10*60*60 - 1) = 35999 seconds before the current row, plus the current row itself.
Here is a db-fiddle.
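
To see how the frame behaves at the 10-hour boundary, here is a small sketch using Python's sqlite3 module. The timestamps are invented for illustration, and the snippet assumes an SQLite version that supports RANGE frames with numeric offsets (3.28 or later).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ticks (value INT, time DATETIME)")
# Hypothetical ticks; the fourth row is exactly 10 hours after the first.
conn.executemany(
    "INSERT INTO ticks VALUES (?, ?)",
    [
        (1, "2021-01-01 00:00:00"),
        (1, "2021-01-01 05:00:00"),
        (-1, "2021-01-01 09:59:59"),
        (1, "2021-01-01 10:00:00"),
        (-1, "2021-01-01 20:00:00"),
    ],
)

# strftime('%s', ...) yields Unix seconds as text; "+ 0" makes it numeric,
# which a RANGE frame with an offset requires.
query = """
SELECT t.time,
       SUM(t.value) OVER (
           ORDER BY strftime('%s', t.time) + 0
           RANGE BETWEEN 35999 PRECEDING AND CURRENT ROW
       ) AS sum_10hours
FROM ticks t
ORDER BY t.time
"""

sums = [row[1] for row in conn.execute(query)]
print(sums)
```

The row at 10:00:00 does not include the row at 00:00:00, because that row lies 36000 seconds back and the frame only reaches 35999 seconds.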

SQL - Recursive average based on preceding row (AR model)

I was wondering how to use either While or Recursion to create an AR(1) model.
In my database I have the following variables in one table (Y is a value):
Period  Values
20171   Y_0
20172   Y_1
20173   Y_2
20174   Y_3
20181   Y_4
I'm trying to create a query that will create a new column AR which is defined as:
Period  Value  AR
20171   Y_0    Y_0
20172   Y_1    AVG(AR_0 & Y_1)
20173   Y_2    AVG(AR_1 & Y_2)
such as the following:
Image of desired dataflow from excel
I tried the following:
SELECT Period , Values, Values as AR,
INTO #Beginning
FROM table
WHERE Period = (SELECT MIN(PERIOD) FROM table)
SELECT Period , Values, Values as AR,
FROM #Beginning
UNION ALL
SELECT Period , Values, NULL as AR,
FROM table
WHERE Period >(SELECT MIN(PERIOD) FROM table)
Which results in a table with the first row in the desired result. However I can't seem to get the rest of the AR column, since these are dependent on one another. As of this moment these are null.
Is it possible to use recursion in SQL to create a column, where each row is dependent on one column in the same row, and one column in the preceding row?
You would use window functions. For instance:
select period, value,
avg(value) over (order by period rows between 1 preceding and current row)
from t;
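
Note that the window version averages the previous row's raw value with the current one, not the previous AR with the current value, so the two diverge from the third row on. If the recursive definition from the question is what is needed, a recursive CTE is one option. The sketch below is written for SQLite and run through Python's sqlite3 module; the table t, the numeric sample values, and the names ordered and chain are assumptions for illustration, and other databases need dialect adjustments.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (period INTEGER, value REAL)")
# Hypothetical numbers standing in for Y_0 .. Y_4.
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(20171, 10.0), (20172, 20.0), (20173, 40.0), (20174, 0.0), (20181, 8.0)],
)

# Window version: averages the previous raw value with the current one.
window_query = """
SELECT AVG(value) OVER (ORDER BY period ROWS BETWEEN 1 PRECEDING AND CURRENT ROW)
FROM t
ORDER BY period
"""

# Recursive version: AR_n = (AR_{n-1} + Y_n) / 2, seeded with AR_0 = Y_0.
recursive_query = """
WITH RECURSIVE
ordered AS (
    SELECT period, value, ROW_NUMBER() OVER (ORDER BY period) AS rn
    FROM t
),
chain AS (
    SELECT rn, period, value, value AS ar
    FROM ordered
    WHERE rn = 1
    UNION ALL
    SELECT o.rn, o.period, o.value, (c.ar + o.value) / 2.0
    FROM chain c
    JOIN ordered o ON o.rn = c.rn + 1
)
SELECT period, value, ar FROM chain ORDER BY rn
"""

window_avg = [row[0] for row in conn.execute(window_query)]
ar_values = [row[2] for row in conn.execute(recursive_query)]
print(window_avg)
print(ar_values)
```

The row number is used to link each period to its predecessor because the period codes (20174 followed by 20181) are not consecutive integers.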

Comparing time difference for every other row

I'm trying to determine the number of days between the AR_Event_Creation_Date_Time values of every other pair of rows: for example, between the 1st and 2nd rows, the 3rd and 4th, the 5th and 6th, etc. In other words, there should be a number-of-days value for every even row and NULL for every odd row. My code below works if there are only two rows per borrower number but falls down when there are more than two. In the results, notice the change in 1002092539
SELECT Borrower_Number,
Workgroup_Name,
FORMAT(AR_Event_Creation_Date_Time,'d','en-us') AS Tag_Date,
Usr_Usrnm,
DATEDIFF(day, LAG(AR_Event_Creation_Date_Time,1) OVER(PARTITION BY
Borrower_Number Order By Borrower_Number), AR_Event_Creation_Date_Time) Diff
FROM Control_Mail
You need to add in a row number. Also, your ORDER BY is non-deterministic, because it sorts by the same column you partition on:
SELECT Borrower_Number,
Workgroup_Name,
FORMAT(AR_Event_Creation_Date_Time,'d','en-us') AS Tag_Date,
Usr_Usrnm,
DATEDIFF(day, LAG(AR_Event_Creation_Date_Time,1) OVER(PARTITION BY Borrower_Number, (rn - 1) / 2 ORDER BY AR_Event_Creation_Date_Time),
AR_Event_Creation_Date_Time) Diff
FROM (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY Borrower_Number ORDER BY AR_Event_Creation_Date_Time) AS rn
FROM Control_Mail
) C
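
The pairing trick can be checked with a small sketch in SQLite, run through Python's sqlite3 module. The table and column names below are simplified, hypothetical versions of the ones in the question, and julianday() differences stand in for SQL Server's DATEDIFF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, hypothetical schema: one borrower with four events, one with two.
conn.execute("CREATE TABLE control_mail (borrower_number INTEGER, created TEXT)")
conn.executemany(
    "INSERT INTO control_mail VALUES (?, ?)",
    [
        (1, "2021-01-01"),
        (1, "2021-01-05"),
        (1, "2021-01-10"),
        (1, "2021-01-12"),
        (2, "2021-02-01"),
        (2, "2021-02-03"),
    ],
)

# (rn - 1) / 2 is integer division, so rows 1-2 land in pair 0, rows 3-4 in
# pair 1, and so on. LAG() then only sees the other row of the same pair.
query = """
SELECT borrower_number,
       created,
       CAST(julianday(created) - julianday(
                LAG(created) OVER (
                    PARTITION BY borrower_number, (rn - 1) / 2
                    ORDER BY created)
            ) AS INTEGER) AS diff
FROM (
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY borrower_number ORDER BY created) AS rn
    FROM control_mail
) c
ORDER BY borrower_number, created
"""

diffs = [row[2] for row in conn.execute(query)]
print(diffs)
```

The first row of each pair has no predecessor inside its partition, so its diff is NULL (None in Python).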

SQL Server need the total of the previous 6 rows

I'm using SQL Server and I need to get the sum of the previous 6 rows of my table and place the results in its own column.
I'm able to get the 6th row back with the below query:
SELECT id
,FileSize
,LAG(FileSize,6) OVER (ORDER BY DAY(CompleteTime)) previous
FROM Jobs_analytics
group by id, CompleteTime, Jobs_analytics.FileSize
which gives me the sixth row back, but what I need is the sum of all six previous rows.
Any help would be appreciated.
Mike
You can use:
SELECT ja.id, ja.FileSize, CompleteTime,
SUM(FileSize) OVER (ORDER BY CompleteTime ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) as previous
FROM Jobs_analytics ja;
I don't see why GROUP BY is necessary. There are no aggregation functions.
Note that this sums six rows, including the current row. If you want the six preceding rows:
SELECT ja.id, ja.FileSize, ja.CompleteTime,
SUM(ja.FileSize) OVER (ORDER BY ja.CompleteTime, ja.id ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING) as previous
FROM Jobs_analytics ja;
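
The difference between the two frames is easy to see on a toy table. The sketch below uses SQLite through Python's sqlite3 module rather than SQL Server, with made-up rows whose FileSize values are 1 through 8; the frame clauses themselves are the same as in the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Jobs_analytics (id INTEGER PRIMARY KEY, FileSize INTEGER, CompleteTime TEXT)"
)
# Hypothetical rows: FileSize 1..8 on consecutive days.
conn.executemany(
    "INSERT INTO Jobs_analytics VALUES (?, ?, ?)",
    [(i, i, f"2021-01-{i:02d}") for i in range(1, 9)],
)

query = """
SELECT id,
       SUM(FileSize) OVER (ORDER BY CompleteTime, id
                           ROWS BETWEEN 5 PRECEDING AND CURRENT ROW) AS last_six_incl,
       SUM(FileSize) OVER (ORDER BY CompleteTime, id
                           ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING) AS prev_six
FROM Jobs_analytics
ORDER BY CompleteTime, id
"""

rows = conn.execute(query).fetchall()
including_current = [r[1] for r in rows]
previous_only = [r[2] for r in rows]
print(including_current)
print(previous_only)
```

For the first row the "previous six" frame is empty, so SUM returns NULL there; wrap it in COALESCE(..., 0) if a zero is preferred.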