Cumulative sum of multiple window functions V3 - sql

I have this table:
id | date | player_id | score | all_games | all_wins | n_games | n_wins
============================================================================================
6747 | 2018-08-10 | 1 | 0 | 1 | | 1 |
6751 | 2018-08-10 | 1 | 0 | 2 | 0 | 2 |
6764 | 2018-08-10 | 1 | 0 | 3 | 0 | 3 |
6783 | 2018-08-10 | 1 | 0 | 4 | 0 | 4 |
6804 | 2018-08-10 | 1 | 0 | 5 | 0 | 5 |
6821 | 2018-08-10 | 1 | 0 | 6 | 0 | 6 |
6828 | 2018-08-10 | 1 | 0 | 7 | 0 | 7 |
17334 | 2018-08-23 | 1 | 0 | 8 | 0 | 8 | 0
17363 | 2018-08-23 | 1 | 0 | 9 | 0 | 9 | 0
17398 | 2018-08-23 | 1 | 0 | 10 | 0 | 10 | 0
17403 | 2018-08-23 | 1 | 0 | 11 | 0 | 11 | 0
17409 | 2018-08-23 | 1 | 0 | 12 | 0 | 12 | 0
33656 | 2018-09-13 | 1 | 0 | 13 | 0 | 13 | 0
33687 | 2018-09-13 | 1 | 0 | 14 | 0 | 14 | 0
45393 | 2018-09-27 | 1 | 0 | 15 | 0 | 15 | 0
45402 | 2018-09-27 | 1 | 0 | 16 | 0 | 16 | 0
45422 | 2018-09-27 | 1 | 1 | 17 | 0 | 17 | 0
45453 | 2018-09-27 | 1 | 0 | 18 | 1 | 18 | 0
45461 | 2018-09-27 | 1 | 0 | 19 | 1 | 19 | 0
45474 | 2018-09-27 | 1 | 0 | 20 | 1 | 20 | 0
57155 | 2018-10-11 | 1 | 0 | 21 | 1 | 21 | 1
57215 | 2018-10-11 | 1 | 0 | 22 | 1 | 22 | 1
57225 | 2018-10-11 | 1 | 0 | 23 | 1 | 23 | 1
69868 | 2018-10-25 | 1 | 0 | 24 | 1 | 24 | 1
The issue that I now need to solve is that I need n_games to be a rolling count of the last number of games per day, i.e. a user can play multiple games per day, as present it is just the same as row_number(*) OVER all_games
The other issues is that the column n_wins only does a sum(*) of the rolling windows wins for the day, so if a user wins a couple of games early on in day, that will not be added to the n_wins column until the next day.
I have the example DEMO:
I have tried this query
SELECT id,
date,
player_id,
score,
row_number(*) OVER all_races AS all_games,
sum(score) OVER all_races AS all_wins,
row_number(*) OVER last_n AS n_games,
sum(score) OVER last_n AS n_wins
FROM scores
WINDOW
all_races AS (PARTITION BY player_id ORDER BY id ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING),
last_n AS (PARTITION BY player_id ORDER BY date ASC RANGE BETWEEN interval '7 days' PRECEDING AND interval '1 day' PRECEDING);
Ideally I need a query that will output something like this table
id | date | player_id | score | all_games | all_wins | n_games | n_wins
============================================================================================
6747 | 2018-08-10 | 1 | 0 | 1 | | 1 |
6751 | 2018-08-10 | 1 | 0 | 2 | 0 | 2 |
6764 | 2018-08-10 | 1 | 0 | 3 | 0 | 3 |
6783 | 2018-08-10 | 1 | 0 | 4 | 0 | 4 |
6804 | 2018-08-10 | 1 | 0 | 5 | 0 | 5 |
6821 | 2018-08-10 | 1 | 0 | 6 | 0 | 6 |
6828 | 2018-08-10 | 1 | 0 | 7 | 0 | 7 |
17334 | 2018-08-23 | 1 | 0 | 8 | 0 | 1 | 0
17363 | 2018-08-23 | 1 | 0 | 9 | 0 | 2 | 0
17398 | 2018-08-23 | 1 | 0 | 10 | 0 | 3 | 0
17403 | 2018-08-23 | 1 | 0 | 11 | 0 | 4 | 0
17409 | 2018-08-23 | 1 | 0 | 12 | 0 | 5 | 0
33656 | 2018-09-13 | 1 | 1 | 13 | 1 | 6 | 0
33687 | 2018-09-13 | 1 | 0 | 14 | 1 | 7 | 1
45393 | 2018-09-27 | 1 | 0 | 15 | 1 | 1 | 1
45402 | 2018-09-27 | 1 | 0 | 16 | 1 | 2 | 1
45422 | 2018-09-27 | 1 | 1 | 17 | 1 | 3 | 1
45453 | 2018-09-27 | 1 | 0 | 18 | 2 | 4 | 2
45461 | 2018-09-27 | 1 | 0 | 19 | 2 | 5 | 2
45474 | 2018-09-27 | 1 | 0 | 20 | 2 | 6 | 1
57155 | 2018-10-11 | 1 | 0 | 21 | 2 | 7 | 1
57215 | 2018-10-11 | 1 | 0 | 22 | 2 | 1 | 1
57225 | 2018-10-11 | 1 | 0 | 23 | 2 | 2 | 1
69868 | 2018-10-25 | 1 | 0 | 24 | 2 | 3 | 1

Related

dense_rank over boolean column

Good day. I have a permutated table with condition and I am running redshift DB. This is a table with events log and I splitted it into session start (bool = 1) and session continue (bool = 0) like this:
=======================
| ID | BOOL |
=======================
| 1 | 0 |
| 2 | 1 |
| 3 | 0 |
| 4 | 0 |
| 5 | 0 |
| 6 | 0 |
| 7 | 0 |
| 8 | 0 |
| 9 | 0 |
| 10 | 0 |
| 11 | 1 |
| 12 | 0 |
| 13 | 0 |
| 14 | 1 |
| 15 | 0 |
| 16 | 0 |
=======================
I need to create sesssion_id column with something like dense_rank:
================================
| ID | BOOL | D_RANK |
================================
| 1 | 0 | 1 |
| 2 | 1 | 2 |
| 3 | 0 | 2 |
| 4 | 0 | 2 |
| 5 | 0 | 2 |
| 6 | 0 | 2 |
| 7 | 0 | 2 |
| 8 | 0 | 2 |
| 9 | 0 | 2 |
| 10 | 0 | 2 |
| 11 | 1 | 3 |
| 12 | 0 | 3 |
| 13 | 0 | 3 |
| 14 | 1 | 4 |
| 15 | 0 | 4 |
| 16 | 0 | 4 |
================================
Is there any option to do this? Would appreciate any help.
Use a cumulative sum. Assuming that bool is the start of a new session:
select t.*,
sum(bool) over (order by id) as session_id
from t;
Note: This will start at 0. You can add 1 if you need.

SQL - Partition restarted based on a column value

I need to create a new column that restarts at every 0 value of Column Repeated Call of each Customer_ID:
+-------------+---------+----------------------+---------------+
| Customer_ID | Call_ID | Days Since Last Call | Repeated Call |
+-------------+---------+----------------------+---------------+
| 1 | 1 | Null | 0 |
| 1 | 2 | 45 | 0 |
| 1 | 3 | 0 | 1 |
| 1 | 4 | 0 | 1 |
| 1 | 5 | 0 | 1 |
| 1 | 6 | 48 | 0 |
| 1 | 7 | 1 | 1 |
| 2 | 8 | Null | 0 |
| 2 | 9 | 1 | 1 |
+-------------+---------+----------------------+---------------+
In to something like this:
+-------------+---------+----------------------+---------------+-------------+
| Customer_ID | Call_ID | Days Since Last Call | Repeated Call | Order_Group |
+-------------+---------+----------------------+---------------+-------------+
| 1 | 1 | Null | 0 | 1 |
| 1 | 2 | 45 | 0 | 2 |
| 1 | 3 | 0 | 1 | 2 |
| 1 | 4 | 0 | 1 | 2 |
| 1 | 5 | 0 | 1 | 2 |
| 1 | 6 | 48 | 0 | 3 |
| 1 | 7 | 1 | 1 | 3 |
| 2 | 8 | Null | 0 | 1 |
| 2 | 9 | 1 | 1 | 1 |
+-------------+---------+----------------------+---------------+-------------+
Appreciate your suggestion, thanks!
You can use SUM() window function:
select t.*,
sum(case when Repeated_Call = 0 then 1 else 0 end)
over (partition by Customer_ID order by Call_Id) Order_Group
from tablename t
See the demo (for MySql but it is standard SQL).
Results:
| Customer_ID | Call_ID | Days Since Last Call | Repeated_Call | Order_Group |
| ----------- | ------- | -------------------- | ------------- | ----------- |
| 1 | 1 | | 0 | 1 |
| 1 | 2 | 45 | 0 | 2 |
| 1 | 3 | 0 | 1 | 2 |
| 1 | 4 | 0 | 1 | 2 |
| 1 | 5 | 0 | 1 | 2 |
| 1 | 6 | 48 | 0 | 3 |
| 1 | 7 | 1 | 1 | 3 |
| 2 | 8 | | 0 | 1 |
| 2 | 9 | 1 | 1 | 1 |
You can calculation every 0 value in column Repeated Call (for each customer) using window analytic function COUNT with ROWS UNBOUNDED PRECEDING:
SELECT *,
COUNT(CASE WHEN Repeated Call=0 THEN 1 ELSE NULL END )OVER(PARTITION BY Customer_ID
ORDER BY Call_ID ROWS UNBOUNDED PRECEDING)Order_Gr FROM Table

How do I conditionally increase the value of the proceeding row number by 1

I need to increase the value of the proceeding row number by 1. When the row encounters another condition I then need to reset the counter. This is probably easiest explained with an example:
+---------+------------+------------+-----------+----------------+
| Acct_ID | Ins_Date | Acct_RowID | indicator | Desired_Output |
+---------+------------+------------+-----------+----------------+
| 5841 | 07/11/2019 | 1 | 1 | 1 |
| 5841 | 08/11/2019 | 2 | 0 | 2 |
| 5841 | 09/11/2019 | 3 | 0 | 3 |
| 5841 | 10/11/2019 | 4 | 0 | 4 |
| 5841 | 11/11/2019 | 5 | 1 | 1 |
| 5841 | 12/11/2019 | 6 | 0 | 2 |
| 5841 | 13/11/2019 | 7 | 1 | 1 |
| 5841 | 14/11/2019 | 8 | 0 | 2 |
| 5841 | 15/11/2019 | 9 | 0 | 3 |
| 5841 | 16/11/2019 | 10 | 0 | 4 |
| 5841 | 17/11/2019 | 11 | 0 | 5 |
| 5841 | 18/11/2019 | 12 | 0 | 6 |
| 5132 | 11/03/2019 | 1 | 1 | 1 |
| 5132 | 12/03/2019 | 2 | 0 | 2 |
| 5132 | 13/03/2019 | 3 | 0 | 3 |
| 5132 | 14/03/2019 | 4 | 1 | 1 |
| 5132 | 15/03/2019 | 5 | 0 | 2 |
| 5132 | 16/03/2019 | 6 | 0 | 3 |
| 5132 | 17/03/2019 | 7 | 0 | 4 |
| 5132 | 18/03/2019 | 8 | 0 | 5 |
| 5132 | 19/03/2019 | 9 | 1 | 1 |
| 5132 | 20/03/2019 | 10 | 0 | 2 |
+---------+------------+------------+-----------+----------------+
The column I want to create is 'Desired_Output'. It can be seen from this table that I need to use the column 'indicator'. I want the following row to be n+1; unless the next row is 1. The counter needs to reset when the value 1 is encountered again.
I have tried to use a loop method of some sort but this did not produce the desired results.
Is this possible in some way?
The trick is to identify the group of consecutive rows starts from indicator 1 to the next 1. This is achieve by using the cross apply finding the Acct_RowID with indicator = 1 and use that as a Grp_RowID to use as partition by in the row_number() window function
select *,
Desired_Output = row_number() over (partition by t.Acct_ID, Grp_RowID
order by Acct_RowID)
from your_table t
cross apply
(
select Grp_RowID = max(Acct_RowID)
from your_table x
where x.Acct_ID = t.Acct_ID
and x.Acct_RowID <= t.Acct_RowID
and x.indicator = 1
) g

SQL how to force to display row with 0 if no data available?

My table returns results as following (skips row if HourOfDay does not have data for particular ID)
ID HourOfDay Counts
--------------------------
1 5 5
1 13 10
1 23 3
..........................HourOfDay up till 23
2 9 1
and so on.
What I am trying to achieve is to force showing rows displaying 0 for HoursOfDay, which don't have data, like following:
ID HourOfDay Counts
--------------------------
1 0 0
1 1 0
1 2 0
1......................
1 5 5
1 6 0
1......................
1 23 3
2 0 0
2 1 0
etc.
I have researched around about it. It looks like I can achieve this result if I create an extra table and outer join it. So I have created table variable in SP (as a temp workaround)
DECLARE #Hours TABLE
(
[Hour] INT NULL
);
INSERT INTO #Hours VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)
,(13),(14),(15),(16),(17),(18),(19),(20),(21),(22),(23);
However, no matter how I join it, it does not achieve desired result.
How do I proceed? Do I add extra columns to join on? Completely different approach? Any hint in the right direction is appreciated!
Using a derived table for the distinct Ids cross joined to #Hours, left joined to your table:
select
i.Id
, h.Hour
, coalesce(t.Counts,0) as Counts
from (select distinct Id from t) as i
cross join #Hours as h
left join t
on i.Id = t.Id
and h.Hour = t.HourOfDay
rextester demo: http://rextester.com/XFZYX88502
returns:
+----+------+--------+
| Id | Hour | Counts |
+----+------+--------+
| 1 | 0 | 0 |
| 1 | 1 | 0 |
| 1 | 2 | 0 |
| 1 | 3 | 0 |
| 1 | 4 | 0 |
| 1 | 5 | 5 |
| 1 | 6 | 0 |
| 1 | 7 | 0 |
| 1 | 8 | 0 |
| 1 | 9 | 0 |
| 1 | 10 | 0 |
| 1 | 11 | 0 |
| 1 | 12 | 0 |
| 1 | 13 | 10 |
| 1 | 14 | 0 |
| 1 | 15 | 0 |
| 1 | 16 | 0 |
| 1 | 17 | 0 |
| 1 | 18 | 0 |
| 1 | 19 | 0 |
| 1 | 20 | 0 |
| 1 | 21 | 0 |
| 1 | 22 | 0 |
| 1 | 23 | 3 |
| 2 | 0 | 0 |
| 2 | 1 | 0 |
| 2 | 2 | 0 |
| 2 | 3 | 0 |
| 2 | 4 | 0 |
| 2 | 5 | 0 |
| 2 | 6 | 0 |
| 2 | 7 | 0 |
| 2 | 8 | 0 |
| 2 | 9 | 1 |
| 2 | 10 | 0 |
| 2 | 11 | 0 |
| 2 | 12 | 0 |
| 2 | 13 | 0 |
| 2 | 14 | 0 |
| 2 | 15 | 0 |
| 2 | 16 | 0 |
| 2 | 17 | 0 |
| 2 | 18 | 0 |
| 2 | 19 | 0 |
| 2 | 20 | 0 |
| 2 | 21 | 0 |
| 2 | 22 | 0 |
| 2 | 23 | 0 |
+----+------+--------+

writing SQL query to show result in specific order

I have this table
+----+--------+------------+-----------+
| Id | day_id | subject_id | period_Id |
+----+--------+------------+-----------+
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 8 | 2 | 6 | 1 |
| 9 | 2 | 7 | 2 |
| 15 | 3 | 3 | 1 |
| 16 | 3 | 4 | 2 |
| 22 | 4 | 5 | 1 |
| 23 | 4 | 5 | 2 |
| 24 | 4 | 6 | 3 |
| 29 | 5 | 8 | 1 |
| 30 | 5 | 1 | 2 |
to something like this
| Id | day_id | subject_id | period_Id |
| 1 | 1 | 1 | 1 |
| 8 | 2 | 6 | 1 |
| 15 | 3 | 3 | 1 |
| 22 | 4 | 5 | 1 |
| 29 | 5 | 8 | 1 |
| 2 | 1 | 2 | 2 |
| 2 | 1 | 2 | 2 |
| 16 | 3 | 4 | 2 |
| 23 | 4 | 5 | 2 |
| 30 | 5 | 1 | 2 |
+----+--------+------------+-----------+
SO, I want to choose one period with a different subject each day and doing this for number of weeks. so first subject dose not come until all subject have been chosen.
You can ORDER BY period_id first and then by day_id:
SELECT *
FROM your_table
ORDER BY period_Id, day_Id
LiveDemo