Running sum with max and min cap in SQL Server

I have a table that looks like this
|ID1| ID2| Date |count |
+---+----+------------+------+
|1 | 1 | 2019-07-24 | 3 |
|1 | 1 | 2019-07-25 | 3 |
|1 | 1 | 2019-07-26 | 3 |
|1 | 1 | 2019-07-27 | 1 |
|1 | 1 | 2019-07-28 | -3 |
|1 | 2 | 2019-07-24 | 1 |
|1 | 2 | 2019-07-25 | -3 |
|1 | 2 | 2019-07-26 | 3 |
|1 | 2 | 2019-07-27 | 3 |
|1 | 2 | 2019-07-28 | 3 |
I am interested in calculating the running sum with a min cap of 0 and a max cap of 8. Resulting table would look like this.
|ID1| ID2| Date |count |runningSum|
+---+----+------------+------+----------+
|1 | 1 | 2019-07-24 | 3 | 3 |
|1 | 1 | 2019-07-25 | 3 | 6 |
|1 | 1 | 2019-07-26 | 3 | 8 |
|1 | 1 | 2019-07-27 | 1 | 8 |
|1 | 1 | 2019-07-28 | -3 | 5 |
|1 | 2 | 2019-07-24 | 1 | 1 |
|1 | 2 | 2019-07-25 | -3 | 0 |
|1 | 2 | 2019-07-26 | 3 | 3 |
|1 | 2 | 2019-07-27 | 3 | 6 |
|1 | 2 | 2019-07-28 | 3 | 8 |
I know that Oracle has many different solutions to this problem, such as the one described in number 7 here:
https://blog.jooq.org/2016/04/25/10-sql-tricks-that-you-didnt-think-were-possible/. Does anything as simple as this exist for Microsoft SQL Server?
Note that I am not allowed to create tables, temporary tables or table variables.
EDIT I am using Azure Datawarehouse where recursive CTE and cursor statements are not available. Are there really not any other ways to solve this problem in SQL Server?

I don't think you can do this with window functions, alas. The problem is that the caps introduce a state change, so you have to process the rows incrementally to get the value for a given row.
A recursive CTE does iteration, so it can do what you want:
with t as (
      select t.*,
             row_number() over (partition by id1, id2 order by date) as seqnum
      from <yourtable> t
     ),
     cte as (
      select id1, id2, date, count,
             (case when count < 0 then 0
                   when count > 8 then 8
                   else count
              end) as runningsum,
             seqnum
      from t
      where seqnum = 1
      union all
      select cte.id1, cte.id2, t.date, t.count,
             (case when t.count + cte.runningsum < 0 then 0
                   when t.count + cte.runningsum > 8 then 8
                   else t.count + cte.runningsum
              end) as runningsum,
             t.seqnum
      from cte join
           t
           on t.seqnum = cte.seqnum + 1 and
              t.id1 = cte.id1 and t.id2 = cte.id2
     )
select *
from cte
order by id1, id2, date;
Here is a db<>fiddle.
Note that very similar code will work in Oracle 12C, which supports recursive CTEs. In earlier versions of Oracle, you can use connect by.
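To see why the caps force row-by-row processing, here is a minimal Python sketch of the same clamped running sum (Python is used only for illustration; it is not part of the original answer). Each step needs the already-clamped total of the previous row, which is exactly the state that a plain windowed SUM cannot carry. The names LOW, HIGH and clamped_add are made up for this example, and the rows are the sample data from the question.

```python
from itertools import accumulate, groupby

# Sample rows from the question: (id1, id2, date, count),
# already sorted by id1, id2, date.
rows = [
    (1, 1, "2019-07-24", 3),
    (1, 1, "2019-07-25", 3),
    (1, 1, "2019-07-26", 3),
    (1, 1, "2019-07-27", 1),
    (1, 1, "2019-07-28", -3),
    (1, 2, "2019-07-24", 1),
    (1, 2, "2019-07-25", -3),
    (1, 2, "2019-07-26", 3),
    (1, 2, "2019-07-27", 3),
    (1, 2, "2019-07-28", 3),
]

LOW, HIGH = 0, 8  # the min and max caps from the question


def clamped_add(total, delta):
    """Add delta to the running total, then clamp the result into [LOW, HIGH]."""
    return max(LOW, min(HIGH, total + delta))


result = []
for _, group in groupby(rows, key=lambda r: (r[0], r[1])):
    group = list(group)
    # initial=0 sends the first row through the same clamp as every other row;
    # [1:] drops that seed value again.
    sums = list(accumulate((r[3] for r in group), clamped_add, initial=0))[1:]
    result.extend(r + (s,) for r, s in zip(group, sums))

for row in result:
    print(row)
```

The running sums come out as 3, 6, 8, 8, 5 for the first group and 1, 0, 3, 6, 8 for the second, matching the table in the question.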

Related

Query for splitting into 2 columns

I have the following table:
table1
----------------------------
| id | desc | dc | amount |
----------------------------
|1 | trx 1 | d | 100000 |
|2 | trx 2 | d | 500000 |
|3 | trx 3 | c | 800000 |
|4 | trx 4 | d | 100000 |
|5 | trx 5 | c | 900000 |
|6 | trx 6 | d | 700000 |
----------------------------
I need to query from table1 above to have the following output :
----------------------------------
| id | desc | d | c |
----------------------------------
|1 | trx 1 | 100000 | |
|2 | trx 2 | 500000 | |
|3 | trx 3 | | 800000 |
|4 | trx 4 | 100000 | |
|5 | trx 5 | | 900000 |
|6 | trx 6 | 700000 | |
----------------------------------
total | 1400000 | 1700000|
----------------------------------
Please advise which SQL command should be executed.
Try this:
SELECT
    id, desc,
    CASE WHEN dc = 'c' THEN amount ELSE NULL END AS c,
    CASE WHEN dc = 'd' THEN amount ELSE NULL END AS d
FROM table1
I suggest something like:
SELECT id, desc,
    CASE WHEN dc='d' then amount else null end as d,
    CASE WHEN dc='c' then amount else null end as c
FROM table1
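Neither query above produces the total row from the question. As a rough sketch, the conditional columns and the totals can be tried out with Python's built-in sqlite3 module. This is an assumption made for the demo, since the question does not name a database; desc is double-quoted here because it is a reserved word, and the quoting syntax differs between engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "desc" is a reserved word, so it is quoted as an identifier.
conn.execute(
    'CREATE TABLE table1 (id INTEGER, "desc" TEXT, dc TEXT, amount INTEGER)'
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        (1, "trx 1", "d", 100000),
        (2, "trx 2", "d", 500000),
        (3, "trx 3", "c", 800000),
        (4, "trx 4", "d", 100000),
        (5, "trx 5", "c", 900000),
        (6, "trx 6", "d", 700000),
    ],
)

# Detail rows: amount lands in column d or c, the other column stays NULL.
detail = conn.execute(
    """
    SELECT id, "desc",
           CASE WHEN dc = 'd' THEN amount END AS d,
           CASE WHEN dc = 'c' THEN amount END AS c
    FROM table1
    ORDER BY id
    """
).fetchall()

# Totals: the same CASE expressions wrapped in SUM (NULLs are ignored).
totals = conn.execute(
    """
    SELECT SUM(CASE WHEN dc = 'd' THEN amount END) AS d,
           SUM(CASE WHEN dc = 'c' THEN amount END) AS c
    FROM table1
    """
).fetchone()

for row in detail:
    print(row)
print("total", totals)
```

Keeping the totals in a second query avoids mixing detail and summary rows in one result set; if a single result is required, the two SELECTs could be combined with UNION ALL.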

How to find overlapping time slices of several key-value elements

I would like to find out if I have overlapping time slices that have the same id and the same name.
In the following example, the entries with id=2 and name=c overlap.
Entry with id=1 is just for demonstration of a good case.
Given table:
+---+------+-------+------------+--------------+
|id | name | value | validFrom | validTo |
+---+------+-------+------------+--------------+
|1 | a | 12 | 2019-01-01 | 9999-12-31 |
|1 | b | 34 | 2019-01-01 | 2019-10-31 |
|1 | b | 35 | 2019-11-01 | 9999-12-31 |
|1 | c | 13 | 2019-01-01 | 2025-12-31 |
|2 | a | 49 | 2019-01-01 | 9999-12-31 |
|2 | b | 99 | 2019-01-01 | 2034-12-31 |
|2 | c | 75 | 2019-01-01 | 2019-10-31 |
|2 | c | 84 | 2019-10-28 | 9999-12-31 |
|n | ... | ... | ... | ... |
+---+------+-------+------------+--------------+
expected output:
+---+------+
|id | name |
+---+------+
|2 | c |
+---+------+
Thanks for your help in advance!
You can get the overlapping rows using exists:
select t.*
from t
where exists (select 1
              from t t2
              where t2.id = t.id and
                    t2.name = t.name and
                    t2.value <> t.value and
                    t2.validTo > t.validFrom and
                    t2.validFrom < t.validTo
             );
If you just want the id/name combinations:
select distinct t.id, t.name
from t
where exists (select 1
              from t t2
              where t2.id = t.id and
                    t2.name = t.name and
                    t2.value <> t.value and
                    t2.validTo > t.validFrom and
                    t2.validFrom < t.validTo
             );
You can also do this with a cumulative max:
select t.*
from (select t.*,
             max(validTo) over (partition by id, name
                                order by validFrom
                                rows between unbounded preceding and 1 preceding
                               ) as prev_validTo
      from t
     ) t
where prev_validTo >= validFrom;
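The cumulative-max idea can be sketched in plain Python to show what the window frame is doing: within each id/name group, walk the slices in validFrom order and compare each start date with the latest end date seen so far. The variable names are made up for this illustration, and the rows are the sample data from the question.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (id, name, value, validFrom, validTo)
rows = [
    (1, "a", 12, date(2019, 1, 1), date(9999, 12, 31)),
    (1, "b", 34, date(2019, 1, 1), date(2019, 10, 31)),
    (1, "b", 35, date(2019, 11, 1), date(9999, 12, 31)),
    (1, "c", 13, date(2019, 1, 1), date(2025, 12, 31)),
    (2, "a", 49, date(2019, 1, 1), date(9999, 12, 31)),
    (2, "b", 99, date(2019, 1, 1), date(2034, 12, 31)),
    (2, "c", 75, date(2019, 1, 1), date(2019, 10, 31)),
    (2, "c", 84, date(2019, 10, 28), date(9999, 12, 31)),
]

# partition by id, name
slices_by_key = defaultdict(list)
for id_, name, _value, valid_from, valid_to in rows:
    slices_by_key[(id_, name)].append((valid_from, valid_to))

overlapping = []
for key, slices in slices_by_key.items():
    slices.sort()      # order by validFrom
    latest_end = None  # max(validTo) over all preceding rows
    for valid_from, valid_to in slices:
        if latest_end is not None and latest_end >= valid_from:
            overlapping.append(key)
            break
        latest_end = valid_to if latest_end is None else max(latest_end, valid_to)

print(overlapping)  # [(2, 'c')]
```

Like the SQL above, the comparison is inclusive (>=), so a slice that starts on the same day the previous one ends counts as overlapping.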

SQL - Identify consecutive numbers in a table

Is there a way to flag consecutive numbers in an SQL table?
Based on the values in the 'value_group_4' column, is it possible to tag continuous values? This needs to be done within each 'date_group_1' group.
I tried using row_number, rank, and dense_rank but was unable to come up with a foolproof way.
This has nothing to do with consecutiveness. You simply want to mark all rows where date_group_1 and value_group_4 are not unique.
One way:
select
  mytable.*,
  case when exists
  (
    select null
    from mytable agg
    where agg.date_group_1 = mytable.date_group_1
    and agg.value_group_4 = mytable.value_group_4
    group by agg.date_group_1, agg.value_group_4
    having count(*) > 1
  ) then 1 else 0 end as flag
from mytable
order by date_group_1, value_group_4;
In a later version of SQL Server you'd use COUNT OVER instead.
SQL tables represent unordered sets. There is no such thing as consecutive values, unless a column specifies the ordering. Your data does not have such an obvious column, but I'll assume one exists and just call it id for convenience.
With such a column, lag()/lead() does what you want:
select t.*,
       (case when lag(value_group_4) over (partition by date_group_1 order by id) = value_group_4
             then 1
             when lead(value_group_4) over (partition by date_group_1 order by id) = value_group_4
             then 1
             else 0
        end) as flag
from t;
On close inspection, value_group_3 may do what you want. So you can use that for the id.
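As a small illustration of the lag()/lead() comparison, the following Python sketch flags a value when it equals its previous or next neighbour within one partition. The list of values is hypothetical sample data for a single date_group_1, already in the chosen order.

```python
# Hypothetical values for one partition, already ordered by the id column.
values = [15.3, 17.3, 17.3, 21.0]

flags = []
for i, v in enumerate(values):
    prev_v = values[i - 1] if i > 0 else None                # lag()
    next_v = values[i + 1] if i + 1 < len(values) else None  # lead()
    flags.append(1 if v == prev_v or v == next_v else 0)

print(flags)  # [0, 1, 1, 0]
```

The first and last rows have no previous or next neighbour, which corresponds to lag() and lead() returning NULL at the partition edges.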
If your version of SQL Server doesn't have a full suite of windowing functions, it should still be possible. This looks like a last-non-null problem, for which Itzik Ben-Gan has a good example here: http://www.itprotoday.com/software-development/last-non-null-puzzle
Also, look at Mikael Eriksson's answer here which uses no windowing functions.
If the order of your data is determined by the date_group_1, value_group_3 column values, then why not make it as simple as the following query:
select
  *,
  rank() over(partition by date_group_1 order by value_group_3) - 1 value_group_3,
  case
    when count(*) over(partition by date_group_1, value_group_3) > 1 then 1
    else 0
  end expected_result
from data;
Output:
| date_group_1 | category_group_2 | value_group_3 | value_group_3 | expected_result |
+--------------+------------------+---------------+---------------+-----------------+
| 2018-01-11 | A | 15.3 | 0 | 0 |
| 2018-01-11 | B | 17.3 | 1 | 1 |
| 2018-01-11 | A | 17.3 | 1 | 1 |
| 2018-01-11 | B | 21 | 3 | 0 |
| 2018-01-22 | A | 15.3 | 0 | 0 |
| 2018-01-22 | B | 17.3 | 1 | 0 |
| 2018-01-22 | A | 21 | 2 | 0 |
| 2018-01-22 | B | 23 | 3 | 0 |
| 2018-03-13 | A | 15.3 | 0 | 0 |
| 2018-03-13 | B | 17.3 | 1 | 1 |
| 2018-03-13 | A | 17.3 | 1 | 1 |
| 2018-03-13 | B | 23 | 3 | 0 |
| 2018-05-15 | A | 6 | 0 | 0 |
| 2018-05-15 | B | 6.3 | 1 | 0 |
| 2018-05-15 | A | 15 | 2 | 0 |
| 2018-05-15 | B | 16.3 | 3 | 1 |
| 2018-05-15 | A | 16.3 | 3 | 1 |
| 2018-05-15 | B | 22 | 5 | 0 |
| 2019-05-04 | A | 0 | 0 | 0 |
| 2019-05-04 | B | 7 | 1 | 0 |
| 2019-05-04 | A | 15.3 | 2 | 0 |
| 2019-05-04 | B | 17.3 | 3 | 0 |
Test it online with SQL Fiddle.

Need a simple query to calculate sequence length in SQL Server

I have a view that represents the status of each user's connections to a system, as shown below:
---------------------------------------
|id | date | User | Connexion |
|1 | 01/01/2018 | A | 1 |
|2 | 02/01/2018 | A | 0 |
|3 | 03/01/2018 | A | 1 |
|4 | 04/01/2018 | A | 1 |
|5 | 05/01/2018 | A | 0 |
|6 | 06/01/2018 | A | 0 |
|7 | 07/01/2018 | A | 0 |
|8 | 08/01/2018 | A | 1 |
|9 | 09/01/2018 | A | 1 |
|10 | 10/01/2018 | A | 1 |
|11 | 11/01/2018 | A | 1 |
---------------------------------------
The target output is the length of each run of consecutive succeeded or failed connections, ordered by date, so the output would look like this:
---------------------------------------------------------------
|StartDate  | EndDate    | User | Connexion | Length|
|01/01/2018 | 01/01/2018 | A | 1 | 1 |
|02/01/2018 | 02/01/2018 | A | 0 | 1 |
|03/01/2018 | 04/01/2018 | A | 1 | 2 |
|05/01/2018 | 07/01/2018 | A | 0 | 3 |
|08/01/2018 | 11/01/2018 | A | 1 | 4 |
---------------------------------------------------------------
This is what is called a gaps-and-islands problem. The best solution for your version is a difference of row numbers:
select [user], min([date]) as StartDate, max([date]) as EndDate, connexion, count(*) as length
from (select t.*,
             row_number() over (partition by [user] order by [date]) as seqnum,
             row_number() over (partition by [user], connexion order by [date]) as seqnum_uc
      from t
     ) t
group by [user], connexion, (seqnum - seqnum_uc)
order by [user], StartDate;
Why this works is a little tricky to explain. Generally, I find that if you stare at the results of the subquery, you'll see how the difference is constant for the groups that you care about.
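A short Python sketch may make the trick easier to see. It computes the two row numbers by hand and groups on their difference. The dates are written in ISO form (an assumption, so that string order equals date order), and the variable names are invented for this example.

```python
from collections import defaultdict

# Connexion values from the question, one per day, for user A.
connexions = [1, 0, 1, 1, 0, 0, 0, 1, 1, 1, 1]
rows = [(f"2018-01-{d:02d}", "A", c) for d, c in enumerate(connexions, start=1)]

seq_user = defaultdict(int)       # row_number() over (partition by user order by date)
seq_user_conn = defaultdict(int)  # row_number() over (partition by user, connexion order by date)
islands = defaultdict(list)

for day, user, conn in sorted(rows, key=lambda r: (r[1], r[0])):
    seq_user[user] += 1
    seq_user_conn[(user, conn)] += 1
    # Within one unbroken run both counters advance together, so the
    # difference stays constant; it changes whenever the other value interrupts.
    diff = seq_user[user] - seq_user_conn[(user, conn)]
    islands[(user, conn, diff)].append(day)

result = sorted(
    (min(days), max(days), user, conn, len(days))
    for (user, conn, _diff), days in islands.items()
)
for row in result:
    print(row)
```

For the sample data the differences are 0, 1, 1, 1, 3, 3, 3, 4, 4, 4, 4 in date order. Combined with the connexion value, that gives five groups with lengths 1, 1, 2, 3 and 4, which is the expected output.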
Note: You should not use user or date for the names of columns. These are keywords in SQL (of one type or another). If you do use them, you have to clutter up your SQL with escape characters, which just makes the code harder to write, read, and debug.

How to add multiple rows with the same value and ID in one row

I have data like this:
|ID|partner_name|quantity|Price|Period |
|1 |partner 1 | 1 | 100 |01/2017|
|2 |partner 1 | 2 | 200 |01/2017|
|3 |partner 1 | 4 | 400 |01/2017|
|4 |partner 1 | 1 | 100 |02/2017|
I want the data to be like this:
|ID|partner_name|quantity|Price|Period |
|1 |partner 1 | 7 | 700 |01/2017|
|2 |partner 1 | 1 | 100 |02/2017|
How can I create that with SQL?
Thanks,
You should group your query:
SELECT partner_name, SUM(quantity), SUM(price), period FROM your_table
GROUP BY partner_name, period;
This will merge rows with same partner_name and period together.
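For a quick check of the grouping, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module (an assumption for the demo; the question does not say which database is in use). The column aliases and the ORDER BY are additions for readability.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE your_table ("
    "id INTEGER, partner_name TEXT, quantity INTEGER, price INTEGER, period TEXT)"
)
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?, ?, ?)",
    [
        (1, "partner 1", 1, 100, "01/2017"),
        (2, "partner 1", 2, 200, "01/2017"),
        (3, "partner 1", 4, 400, "01/2017"),
        (4, "partner 1", 1, 100, "02/2017"),
    ],
)

# One output row per (partner_name, period) pair, with summed quantity and price.
grouped = conn.execute(
    """
    SELECT partner_name, SUM(quantity) AS quantity, SUM(price) AS price, period
    FROM your_table
    GROUP BY partner_name, period
    ORDER BY period
    """
).fetchall()

for row in grouped:
    print(row)
```

The original ID column is left out because it has no single value per group. If a new sequential ID is needed, it would have to be generated separately, for example with row_number() where the database supports it.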