Get number of times a user has availed a particular offer - sql

I have a table which gives information about when a particular user has used an offer. It has 3 columns
Date: Date at which the offer was used
user_id: Identifier for a particular user
txn_id: Transaction id when a user uses an offer. It is always unique in the table.
A particular user can use the offer at most 5 times.
For each date, I want to know how many users are at each stage of offer usage.
For example
On Day 1 there could be 3 users who have used the offer once (redemption_1) and 2 users who have used it twice (redemption_2).
On Day 2 there could be users from Day 1 (repeat users) as well as users who are using the offer for the first time (new users).
For the new users of Day 2 the logic is the same as for Day 1 users (maybe 2 new users use the offer once (redemption_1) and 3 new users use it 3 times (redemption_3)).
But for the repeat users I want to add the new usage to their previous days' usage.
For example
On Day 1, 3 users had used the offer once (redemption_1), but if they use it one more time on Day 2 they should be counted in redemption_2 (and not in redemption_1, since they are using it for the second time since the offer started).
In this way I want to keep a cumulative count of how many times each user has used the offer and, for each date, count the number of users who have used it 1 time (redemption_1), 2 times (redemption_2) and so on.
Table
+------------+---------+------------+
| Date | user_id | txn_id |
+------------+---------+------------+
| 2019-06-04 | 1 | 1ACSA0-ABA |
| 2019-06-04 | 2 | 1BEAA0-CSC |
| 2019-06-04 | 3 | 1AGHF0-CBA |
| 2019-06-04 | 1 | 1AVFA0-GAA |
| 2019-06-05 | 1 | 1BCFA0-AAA |
| 2019-06-05 | 1 | 1AVFB0-GAC |
| 2019-06-05 | 2 | 1AVFA0-GVA |
| 2019-06-05 | 4 | 1AVFA0-GVB |
| 2019-06-05 | 5 | 1AVFA0-BCF |
| 2019-06-06 | 6 | 1AGHF0-CCA |
| 2019-06-06 | 1 | 1BXHF0-CCA |
| 2019-06-06 | 2 | 1AGHF0-CBG |
| 2019-06-06 | 3 | 1AGHF0-CAW |
| 2019-06-06 | 2 | 1AGHF0-CTU |
+------------+---------+------------+
Desired Output
+------------+--------------+--------------+--------------+--------------+--------------+
| Date | redemption_1 | redemption_2 | redemption_3 | redemption_4 | redemption_5 |
+------------+--------------+--------------+--------------+--------------+--------------+
| 2019-06-04 | 2 | 1 | 0 | 0 | 0 |
| 2019-06-05 | 2 | 1 | 0 | 1 | 0 |
| 2019-06-06 | 1 | 1 | 0 | 1 | 1 |
+------------+--------------+--------------+--------------+--------------+--------------+
I will walk you through the rows of the output for better understanding.
In row one, with date 2019-06-04, there are two users who used the offer once (2, 3) and one user who used it twice (1).
In the row with date 2019-06-05 there are 2 users who used the offer once (4, 5). Note that they had never used the offer before, so they are counted in redemption_1.
In the same row there is 1 user who has used the offer 2 times (2: once on 2019-06-04 and then once on 2019-06-05), so he is counted in redemption_2.
In the same row there is 1 user who has used the offer 4 times (1: twice on 2019-06-04 and then twice again on 2019-06-05), so he is counted in redemption_4.
And so on for the row with date 2019-06-06.
Please let me know if anything needs clarification.

Not a paragon of efficiency, but it works.
Test data:
Create Table offer_used(date DateTime, user_id Int, txn_id Varchar(50))
Insert Into dbo.offer_used (date,
user_id,
txn_id)
Values
('2019-06-04', 1, '1ACSA0-ABA'),
('2019-06-04', 2, '1BEAA0-CSC'),
('2019-06-04', 3, '1AGHF0-CBA'),
('2019-06-04', 1, '1AVFA0-GAA'),
('2019-06-05', 1, '1BCFA0-AAA'),
('2019-06-05', 1, '1AVFB0-GAC'),
('2019-06-05', 2, '1AVFA0-GVA'),
('2019-06-05', 4, '1AVFA0-GVB'),
('2019-06-05', 5, '1AVFA0-BCF'),
('2019-06-06', 6, '1AGHF0-CCA'),
('2019-06-06', 1, '1BXHF0-CCA'),
('2019-06-06', 2, '1AGHF0-CBG'),
('2019-06-06', 3, '1AGHF0-CAW'),
('2019-06-06', 2, '1AGHF0-CTU')
Query:
; With
    Dates As (Select Distinct date From dbo.offer_used OU),
    Users As (Select user_id, FirstTime = Min(date) From dbo.offer_used OU Group By user_id),
    UserCounts As (Select
                       Dates.date,
                       Users.user_id,
                       Users.FirstTime,
                       UsedCount = (Select Count(*) From dbo.offer_used As Used
                                    Where Used.date <= Dates.date
                                      And Used.user_id = Users.user_id)
                   From
                       Dates
                       Cross Join Users)
Select
    date = UserCounts.date,
    [first time today] = Sum(Case When UserCounts.date = UserCounts.FirstTime
                                   And UserCounts.UsedCount = 1 Then 1 Else 0 End),
    [2 times total] = Sum(Case When UserCounts.UsedCount = 2 Then 1 Else 0 End),
    [3 times total] = Sum(Case When UserCounts.UsedCount = 3 Then 1 Else 0 End),
    [4 times total] = Sum(Case When UserCounts.UsedCount = 4 Then 1 Else 0 End),
    [5 times total] = Sum(Case When UserCounts.UsedCount = 5 Then 1 Else 0 End),
    [bonus: never] = Sum(Case When UserCounts.UsedCount = 0 Then 1 Else 0 End)
From UserCounts
Group By UserCounts.date
Order By UserCounts.date
Results:
date        first time today  2 times total  3 times total  4 times total  5 times total  bonus: never
----------  ----------------  -------------  -------------  -------------  -------------  ------------
2019-06-04  2                 1              0              0              0              3
2019-06-05  2                 1              0              1              0              1
2019-06-06  1                 1              0              1              1              0

I think you want conditional aggregation over each user's running total, so that a user is counted once per day at the stage they have reached by then:
select t.date,
       sum(case when seqnum = 1 then 1 else 0 end) as redemption_1,
       sum(case when seqnum = 2 then 1 else 0 end) as redemption_2,
       sum(case when seqnum = 3 then 1 else 0 end) as redemption_3,
       sum(case when seqnum = 4 then 1 else 0 end) as redemption_4,
       sum(case when seqnum = 5 then 1 else 0 end) as redemption_5
from (select date, user_id,
             -- running number of uses per user, one row per user per active day
             sum(count(*)) over (partition by user_id order by date) as seqnum
      from offer_used
      group by date, user_id
     ) t
group by t.date
order by t.date
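To check a query of this shape against the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module (chosen only so the example is runnable; it needs an SQLite build with window functions, 3.25 or later). It counts each user once per active day at their running total, which is the reading implied by the desired output. The function name redemption_stages and the CTE names are made up for this illustration.

```python
import sqlite3

# Sample rows copied from the question: (date, user_id, txn_id).
ROWS = [
    ("2019-06-04", 1, "1ACSA0-ABA"), ("2019-06-04", 2, "1BEAA0-CSC"),
    ("2019-06-04", 3, "1AGHF0-CBA"), ("2019-06-04", 1, "1AVFA0-GAA"),
    ("2019-06-05", 1, "1BCFA0-AAA"), ("2019-06-05", 1, "1AVFB0-GAC"),
    ("2019-06-05", 2, "1AVFA0-GVA"), ("2019-06-05", 4, "1AVFA0-GVB"),
    ("2019-06-05", 5, "1AVFA0-BCF"), ("2019-06-06", 6, "1AGHF0-CCA"),
    ("2019-06-06", 1, "1BXHF0-CCA"), ("2019-06-06", 2, "1AGHF0-CBG"),
    ("2019-06-06", 3, "1AGHF0-CAW"), ("2019-06-06", 2, "1AGHF0-CTU"),
]

SQL = """
WITH daily AS (      -- number of uses per user per day
    SELECT date, user_id, COUNT(*) AS n
    FROM offer_used
    GROUP BY date, user_id
),
running AS (         -- running total per user, only on days the user was active
    SELECT date, user_id,
           SUM(n) OVER (PARTITION BY user_id ORDER BY date) AS used_so_far
    FROM daily
)
SELECT date,
       SUM(CASE WHEN used_so_far = 1 THEN 1 ELSE 0 END) AS redemption_1,
       SUM(CASE WHEN used_so_far = 2 THEN 1 ELSE 0 END) AS redemption_2,
       SUM(CASE WHEN used_so_far = 3 THEN 1 ELSE 0 END) AS redemption_3,
       SUM(CASE WHEN used_so_far = 4 THEN 1 ELSE 0 END) AS redemption_4,
       SUM(CASE WHEN used_so_far = 5 THEN 1 ELSE 0 END) AS redemption_5
FROM running
GROUP BY date
ORDER BY date
"""


def redemption_stages(rows=ROWS):
    """Load the rows into an in-memory table and return one tuple per date."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE offer_used (date TEXT, user_id INTEGER, txn_id TEXT)")
    con.executemany("INSERT INTO offer_used VALUES (?, ?, ?)", rows)
    result = con.execute(SQL).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in redemption_stages():
        print(row)
```

On the sample data this returns the three rows of the desired output table. A user who is idle on a given day is not counted on that day, because the running CTE only has rows for days on which the user used the offer.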

Related

SQL - How to turn the rows of a record into columns?

I need some help here.
I have a "users" table within my platform. The table holds information such as:
id = the user ID
created_at = the date on which the user created an agreement within the platform
agent = the agent responsible for serving the user
The data is in the following format:
id | created_at | deal_id | agent (columns of the table)
1 | 2020-08-01 | 1 | 123456
1 | 2020-09-01 | 2 | 123456
1 | 2020-09-10 | 3 | 345676
1 | 2020-10-29 | 4 | 456677
I would like to return this data as follows:
id | created_at1 | created_at2 | created_at3 | created_at4 | agent1 | agent2 | agent3 | agent4
1 | 2020-08-01 | 2020-09-01 | 2020-09-10 | 2020-10-29 | 123456 | 123456 | 345676 | 456677
Is it possible?
I tried to do it with MIN and MAX, but that only returns two of the values.
Note that the example shows a single user; I want the result for all IDs.
You can use conditional aggregation as follows:
Select t.id,
Max(case when deal_id = 1 then created_at end) as created_at1,
Max(case when deal_id = 2 then created_at end) as created_at2,
Max(case when deal_id = 3 then created_at end) as created_at3,
Max(case when deal_id = 4 then created_at end) as created_at4,
Max(case when deal_id = 1 then agent end) as agent1,
Max(case when deal_id = 2 then agent end) as agent2,
Max(case when deal_id = 3 then agent end) as agent3,
Max(case when deal_id = 4 then agent end) as agent4
From your_table t
Group by id
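The query above matches on deal_id = 1 to 4, which fits the sample user. If deal_id does not restart at 1 for every id (an assumption about the real data, since the question only shows one user), one option is to number each user's rows with row_number() and pivot on that position instead. A minimal sketch with Python's sqlite3 (SQLite 3.25+ for window functions); pivot_deals, slots and the second user in the usage note are hypothetical.

```python
import sqlite3


def pivot_deals(rows, slots=4):
    """Pivot each id's deals into created_at1..N and agent1..N columns."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE users (id INTEGER, created_at TEXT, deal_id INTEGER, agent INTEGER)"
    )
    con.executemany("INSERT INTO users VALUES (?, ?, ?, ?)", rows)
    # One MAX(CASE ...) per output column; rn is the deal's position within its id.
    cols = ",\n           ".join(
        f"MAX(CASE WHEN rn = {i} THEN {col} END) AS {col}{i}"
        for col in ("created_at", "agent")
        for i in range(1, slots + 1)
    )
    sql = f"""
    SELECT id,
           {cols}
    FROM (SELECT u.*,
                 ROW_NUMBER() OVER (PARTITION BY id
                                    ORDER BY created_at, deal_id) AS rn
          FROM users u) t
    GROUP BY id
    ORDER BY id
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [
        (1, "2020-08-01", 1, 123456),
        (1, "2020-09-01", 2, 123456),
        (1, "2020-09-10", 3, 345676),
        (1, "2020-10-29", 4, 456677),
    ]
    print(pivot_deals(sample))
```

Users with fewer than four deals get NULL in the unused columns, and deals beyond the fourth are ignored; a variable number of columns would need dynamic SQL.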

SQL "Group" and "Count" categories

Edit: this is a follow-up to another question. To simplify the question, assume a table:
date | id | type
01/01 | 1 | F
02/01 | 1 | F
02/01 | 1 | F
03/01 | 1 | S
03/01 | 1 | S
04/01 | 1 | F
04/01 | 1 | S
05/01 | 1 | S
I am looking for a way to summarise the above table by combination of transaction types per day. If a person (id) has only one transaction per day it counts as a Single type. If they have more than one it counts as a Multiple one. I've done that with my original query and it works. The output from the above table would be:
date | Single | Multiple
01/01 | 1 | 0
02/01 | 0 | 1
03/01 | 0 | 1
04/01 | 0 | 1
05/01 | 1 | 0
I got that far and it works. What I'm struggling with (i.e. I don't have a clue how to start) is how to set up a query that shows all possible combinations of type (SS, FF, FS) instead of just counting the multiple transactions. The desired output would be like:
date | Single | # FF | # FS | # SS
01/01 | 1 | 0 | 0 | 0
02/01 | 0 | 1 | 0 | 0
03/01 | 0 | 0 | 0 | 1
04/01 | 0 | 0 | 1 | 0
05/01 | 1 | 0 | 0 | 0
Any constructive hints or ideas will be much appreciated.
This assumes that there are at most 2 distinct types (F and S) per date.
You can use CASE WHEN with MIN() and MAX() to check for the combinations FF, FS or SS:
select [date],
case when count(*) = 1 then 1 else 0 end as Single,
case when count(*) >= 2
and min([type]) = 'F'
and max([type]) = 'F'
then 1
else 0
end as [# FF],
case when count(*) >= 2
and min([type]) = 'F'
and max([type]) = 'S'
then 1
else 0
end as [# FS],
case when count(*) >= 2
and min([type]) = 'S'
and max([type]) = 'S'
then 1
else 0
end as [# SS]
from yourtable
group by [date]
EDIT:
To handle more than 2 transactions per date, the check is count(*) >= 2 rather than count(*) = 2, as in the query above. This works as long as the types are only F or S.
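The query above groups by date only, which matches the single-id sample. With several ids on the same date (an assumption that goes beyond the sample), one approach is to classify each (date, id) pair first and then sum the flags per date. A minimal sketch with Python's sqlite3; the table name txns, the function combos_per_day and the aliases n, lo, hi are made up for this illustration.

```python
import sqlite3

SQL = """
SELECT date,
       SUM(CASE WHEN n = 1 THEN 1 ELSE 0 END)                              AS single,
       SUM(CASE WHEN n >= 2 AND lo = 'F' AND hi = 'F' THEN 1 ELSE 0 END)   AS ff,
       SUM(CASE WHEN n >= 2 AND lo = 'F' AND hi = 'S' THEN 1 ELSE 0 END)   AS fs,
       SUM(CASE WHEN n >= 2 AND lo = 'S' AND hi = 'S' THEN 1 ELSE 0 END)   AS ss
FROM (-- one row per person per day: transaction count, smallest and largest type
      SELECT date, id, COUNT(*) AS n, MIN(type) AS lo, MAX(type) AS hi
      FROM txns
      GROUP BY date, id) per_person
GROUP BY date
ORDER BY date
"""


def combos_per_day(rows):
    """rows are (date, id, type) tuples; returns (date, single, ff, fs, ss)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE txns (date TEXT, id INTEGER, type TEXT)")
    con.executemany("INSERT INTO txns VALUES (?, ?, ?)", rows)
    result = con.execute(SQL).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [
        ("01/01", 1, "F"), ("02/01", 1, "F"), ("02/01", 1, "F"),
        ("03/01", 1, "S"), ("03/01", 1, "S"), ("04/01", 1, "F"),
        ("04/01", 1, "S"), ("05/01", 1, "S"),
    ]
    for row in combos_per_day(sample):
        print(row)
```

Because the flags are summed per date, two people with multiple transactions on the same day each contribute to their own combination column.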

SQL sum total each column in last row

I would like SQL that sums each column (IPO and UOR) in a TOTAL row second from last, and shows a GRAND TOTAL (sum of IPO + UOR) in the last row. Thank you so much.
No Code IPO UOR
----------------------
1 D173 1 0
2 D176 3 0
3 D184 1 1
4 D185B 1 0
5 D187 1 2
6 F042 3 0
7 ML004 12 3
8 TTPMC 2 0
9 Z00204 1 0
------------------
TOTAL (NOS) 25 6
-------------------------
GRAND TOTAL (NOS) 31
Here is my code:
SELECT
SUM(CASE WHEN IPOType = 'IPO' THEN 1 ELSE 0 END) as IPO,
SUM(CASE WHEN IPOType = 'UOR' THEN 1 ELSE 0 END) as UOR
FROM IPO2018
GROUP BY OriProjNo
It produces output like this:
No Code IPO UOR
----------------------
1 D173 1 0
2 D176 3 0
3 D184 1 1
4 D185B 1 0
5 D187 1 2
6 F042 3 0
7 ML004 12 3
8 TTPMC 2 0
9 Z00204 1 0
------------------
Generally speaking, you want to leave totals and sub-totals to whatever tool you are presenting your data in, as they will be able to handle the formatting with significantly more ease. In addition, your desired output does not have the same number of columns (Grand Total row only has one numeric) so even if you did shoehorn this in to the same dataset, the column headings wouldn't make sense.
That said, you can return group totals via the with rollup clause of group by. This provides an additional row with the aggregate totals for the group. Where you group by more than one column, you will get a sub-total row for each group and a total row for the entire dataset:
-- Table variables are declared with @ (the # prefix is for temp tables created with create table).
declare @t table(c nvarchar(10),t nvarchar(3));
insert into @t values ('D173','IPO'),('D176','IPO'),('D176','IPO'),('D176','IPO'),('D184','IPO'),('D184','UOR'),('D185B','IPO'),('D187','IPO'),('D187','UOR'),('D187','UOR'),('F042','IPO'),('F042','IPO'),('F042','IPO'),('TTPMC','IPO'),('TTPMC','IPO'),('Z00204','IPO'),('ML004','UOR'),('ML004','UOR'),('ML004','UOR'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO'),('ML004','IPO');
select row_number() over (order by grouping(c),c) as n
      ,case when grouping(c) = 1 then 'TOTAL (NOS)' else c end as c
      ,sum(case when t = 'IPO' then 1 else 0 end) as IPO
      ,sum(case when t = 'UOR' then 1 else 0 end) as UOR
from @t
group by c
with rollup
order by grouping(c)
        ,c;
Output:
+----+-------------+-----+-----+
| n | c | IPO | UOR |
+----+-------------+-----+-----+
| 1 | D173 | 1 | 0 |
| 2 | D176 | 3 | 0 |
| 3 | D184 | 1 | 1 |
| 4 | D185B | 1 | 0 |
| 5 | D187 | 1 | 2 |
| 6 | F042 | 3 | 0 |
| 7 | ML004 | 12 | 3 |
| 8 | TTPMC | 2 | 0 |
| 9 | Z00204 | 1 | 0 |
| 10 | TOTAL (NOS) | 25 | 6 |
+----+-------------+-----+-----+
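If the grand total does have to come back from the query, one option that keeps the column count consistent is an extra per-row total column, so the last row carries IPO + UOR for the whole table. The sketch below is not the rollup query: it uses Python's sqlite3 so it can be run as-is, and because SQLite has no with rollup it appends the totals row with UNION ALL. Table and column names (IPO2018, OriProjNo, IPOType) come from the question; totals_report, is_total and the total column are made up here.

```python
import sqlite3

# Per-code (IPO, UOR) counts taken from the output table above.
COUNTS = {
    "D173": (1, 0), "D176": (3, 0), "D184": (1, 1), "D185B": (1, 0),
    "D187": (1, 2), "F042": (3, 0), "ML004": (12, 3), "TTPMC": (2, 0),
    "Z00204": (1, 0),
}

SQL = """
SELECT code, ipo, uor, ipo + uor AS total
FROM (
    SELECT OriProjNo AS code, 0 AS is_total,
           SUM(CASE WHEN IPOType = 'IPO' THEN 1 ELSE 0 END) AS ipo,
           SUM(CASE WHEN IPOType = 'UOR' THEN 1 ELSE 0 END) AS uor
    FROM IPO2018
    GROUP BY OriProjNo
    UNION ALL
    -- totals row over the whole table; is_total = 1 sorts it last
    SELECT 'TOTAL (NOS)', 1,
           SUM(CASE WHEN IPOType = 'IPO' THEN 1 ELSE 0 END),
           SUM(CASE WHEN IPOType = 'UOR' THEN 1 ELSE 0 END)
    FROM IPO2018
) AS combined
ORDER BY is_total, code
"""


def totals_report(counts=COUNTS):
    """Expand the counts into one row per record, then run the report query."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE IPO2018 (OriProjNo TEXT, IPOType TEXT)")
    rows = [
        (code, kind)
        for code, (ipo, uor) in counts.items()
        for kind, n in (("IPO", ipo), ("UOR", uor))
        for _ in range(n)
    ]
    con.executemany("INSERT INTO IPO2018 VALUES (?, ?)", rows)
    result = con.execute(SQL).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in totals_report():
        print(row)
```

The last row comes out as ('TOTAL (NOS)', 25, 6, 31), so the grand total of 31 sits in the total column of the totals row rather than on a separate, differently shaped line.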

Generate time series with daily statistics using a PostgreSQL query

I am finding myself in the position of having to formulate a (to me) rather complex SQL query and I can't seem to get my head around it.
I have a table called orders and a related table order_state_history that logs the state of those orders over time (see below).
I now need to generate a series of rows, one row per day, containing the number of orders that were in particular states at the end of that day (see the expected result below). Also, I want to consider only orders with orders.type = 1.
The data resides in a PostgreSQL database. I already found out how to generate a time series using GENERATE_SERIES(DATE '2001-01-01', CURRENT_DATE, '1 DAY'::INTERVAL) days which allows me to generate rows for days on which no state changes were recorded.
My current approach is to join orders, order_state_history and the generated series of days all together and try to filter out all the rows that have DATE(order_state_history.timestamp) > DATE(days) and then somehow get the final state of each order on that day by first_value(order_state_history.new_state) OVER (PARTITION_BY(orders.id) ORDER BY order_state_history.timestamp DESC), but this is where my tiny bit of SQL experience abandons me.
I just can't wrap my head around the problem.
Can this even be solved in a single query, or would I be better advised to compute the data with some kind of script that performs one query per day?
What would be a reasonable approach to the problem?
orders===
id type
10000 1
10001 1
10002 2
10003 2
10004 1
order_state_history===
order_id index timestamp new_state
10000 1 01.01.2001 12:00 NEW
10000 2 02.01.2001 13:00 ACTIVE
10000 3 03.01.2001 14:00 DONE
10001 1 02.01.2001 13:00 NEW
10002 1 03.01.2001 14:00 NEW
10002 2 05.01.2001 10:00 ACTIVE
10002 3 05.01.2001 14:00 DONE
10003 1 07.01.2001 04:00 NEW
10004 1 05.01.2001 14:00 NEW
10004 2 10.01.2001 17:30 DONE
Expected result===
date new_orders active_orders done_orders
01.01.2001 1 0 0
02.01.2001 1 1 0
03.01.2001 1 0 1
04.01.2001 1 0 1
05.01.2001 2 0 1
06.01.2001 2 0 1
07.01.2001 2 0 1
08.01.2001 2 0 1
09.01.2001 2 0 1
10.01.2001 1 0 2
Step 1. Calculate a cumulative sum of state for each order, using values NEW = 1, ACTIVE = 1, DONE = 2:
select
order_id, timestamp::date as day,
sum(case new_state when 'DONE' then 2 else 1 end) over w as state
from order_state_history h
join orders o on o.id = h.order_id
where o.type = 1
window w as (partition by order_id order by timestamp)
order_id | day | state
----------+------------+-------
10000 | 2001-01-01 | 1
10000 | 2001-01-02 | 2
10000 | 2001-01-03 | 4
10001 | 2001-01-02 | 1
10004 | 2001-01-05 | 1
10004 | 2001-01-10 | 3
(6 rows)
Step 2. Calculate a transition matrix for each order based on states from step 1 (2 means NEW->ACTIVE, 3 means NEW->DONE, 4 means ACTIVE->DONE):
select
order_id, day, state,
case when state = 1 then 1 when state = 2 or state = 3 then -1 else 0 end as new,
case when state = 2 then 1 when state = 4 then -1 else 0 end as active,
case when state > 2 then 1 else 0 end as done
from (
select
order_id, timestamp::date as day,
sum(case new_state when 'DONE' then 2 else 1 end) over w as state
from order_state_history h
join orders o on o.id = h.order_id
where o.type = 1
window w as (partition by order_id order by timestamp)
) s
order_id | day | state | new | active | done
----------+------------+-------+-----+--------+------
10000 | 2001-01-01 | 1 | 1 | 0 | 0
10000 | 2001-01-02 | 2 | -1 | 1 | 0
10000 | 2001-01-03 | 4 | 0 | -1 | 1
10001 | 2001-01-02 | 1 | 1 | 0 | 0
10004 | 2001-01-05 | 1 | 1 | 0 | 0
10004 | 2001-01-10 | 3 | -1 | 0 | 1
(6 rows)
Step 3. Calculate a cumulative sum of each state for a series of days:
select distinct
day::date,
sum(new) over w as new,
sum(active) over w as active,
sum(done) over w as done
from generate_series('2001-01-01'::date, '2001-01-10', '1d'::interval) day
left join (
select
order_id, day, state,
case when state = 1 then 1 when state = 2 or state = 3 then -1 else 0 end as new,
case when state = 2 then 1 when state = 4 then -1 else 0 end as active,
case when state > 2 then 1 else 0 end as done
from (
select
order_id, timestamp::date as day,
sum(case new_state when 'DONE' then 2 else 1 end) over w as state
from order_state_history h
join orders o on o.id = h.order_id
where o.type = 1
window w as (partition by order_id order by timestamp)
) s
) s
using(day)
window w as (order by day)
order by 1
day | new | active | done
------------+-----+--------+------
2001-01-01 | 1 | 0 | 0
2001-01-02 | 1 | 1 | 0
2001-01-03 | 1 | 0 | 1
2001-01-04 | 1 | 0 | 1
2001-01-05 | 2 | 0 | 1
2001-01-06 | 2 | 0 | 1
2001-01-07 | 2 | 0 | 1
2001-01-08 | 2 | 0 | 1
2001-01-09 | 2 | 0 | 1
2001-01-10 | 1 | 0 | 2
(10 rows)
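The three steps above can be sketched end to end against the question's sample data. This is a port to Python's sqlite3 for the sake of a runnable example, not the PostgreSQL query itself: a recursive CTE stands in for generate_series, the timestamp column is renamed ts, and the timestamps are rewritten in ISO format. Those changes, the function name daily_state_counts and the CTE names are assumptions of this sketch. SQLite 3.25+ is needed for window functions.

```python
import sqlite3

ORDERS = [(10000, 1), (10001, 1), (10002, 2), (10003, 2), (10004, 1)]
HISTORY = [  # (order_id, ts, new_state), from the question's sample table
    (10000, "2001-01-01 12:00", "NEW"), (10000, "2001-01-02 13:00", "ACTIVE"),
    (10000, "2001-01-03 14:00", "DONE"), (10001, "2001-01-02 13:00", "NEW"),
    (10002, "2001-01-03 14:00", "NEW"), (10002, "2001-01-05 10:00", "ACTIVE"),
    (10002, "2001-01-05 14:00", "DONE"), (10003, "2001-01-07 04:00", "NEW"),
    (10004, "2001-01-05 14:00", "NEW"), (10004, "2001-01-10 17:30", "DONE"),
]

SQL = """
WITH RECURSIVE days(day) AS (   -- stand-in for generate_series
    SELECT '2001-01-01'
    UNION ALL
    SELECT date(day, '+1 day') FROM days WHERE day < '2001-01-10'
),
states AS (                     -- step 1: cumulative state code per order
    SELECT h.order_id, date(h.ts) AS day,
           SUM(CASE h.new_state WHEN 'DONE' THEN 2 ELSE 1 END)
               OVER (PARTITION BY h.order_id ORDER BY h.ts) AS state
    FROM order_state_history h
    JOIN orders o ON o.id = h.order_id
    WHERE o.type = 1
),
deltas AS (                     -- step 2: per-day change in each counter
    SELECT day,
           SUM(CASE WHEN state = 1 THEN 1
                    WHEN state IN (2, 3) THEN -1 ELSE 0 END) AS new_orders,
           SUM(CASE WHEN state = 2 THEN 1
                    WHEN state = 4 THEN -1 ELSE 0 END)       AS active_orders,
           SUM(CASE WHEN state > 2 THEN 1 ELSE 0 END)        AS done_orders
    FROM states
    GROUP BY day
)
SELECT d.day,                   -- step 3: running totals over the day series
       SUM(COALESCE(x.new_orders, 0))    OVER (ORDER BY d.day) AS new_orders,
       SUM(COALESCE(x.active_orders, 0)) OVER (ORDER BY d.day) AS active_orders,
       SUM(COALESCE(x.done_orders, 0))   OVER (ORDER BY d.day) AS done_orders
FROM days d
LEFT JOIN deltas x ON x.day = d.day
ORDER BY d.day
"""


def daily_state_counts():
    """Return (day, new, active, done) for each day of the sample range."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE orders (id INTEGER, type INTEGER)")
    con.execute(
        "CREATE TABLE order_state_history (order_id INTEGER, ts TEXT, new_state TEXT)"
    )
    con.executemany("INSERT INTO orders VALUES (?, ?)", ORDERS)
    con.executemany("INSERT INTO order_state_history VALUES (?, ?, ?)", HISTORY)
    result = con.execute(SQL).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in daily_state_counts():
        print(row)
```

Summing the deltas per day before taking the running total means each day appears once, so no select distinct is needed in this variant.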

SQLite: Multiple aggregate columns

I'm a little new to SQL world and still learning the ins and outs of the language.
I have a table with an id, dayOfWeek, and a count for each day. So any given id might appear in the table up to seven times, with a count of events for each day for each id. I'd like to restructure the table to have a single row for each id with a column for each day of the week, something like the following obviously incorrect query:
SELECT id, sum(numEvents where dayOfWeek = 0), sum(numEvents where dayOfWeek = 1) ... from t;
Is there a solid way to approach this?
EDIT:
I'm worried I may not have been very clear. The table would ideally be structured something like this:
id | Sunday | Monday | Tuesday | Wednesday | Thursday | Friday | Saturday
0 | 13 | 45 | 142 | 3 | 36 | 63 | 15336
1 | 17 | 25 | 45 | 364 | 37 | 540 | 0
So event 0 occurred 13 times on Sunday, 45 on Monday, etc... My current table looks like this:
id | dayOfWeek | count
0 | 0 | 13
0 | 1 | 45
0 | 2 | 142
0 | 3 | 3
0 | 4 | 36
0 | 5 | 63
0 | 6 | 15336
1 | 0 | 17
1 | 1 | 25
...
Hope that helps clear up what I'm after.
The following is verbose, but should work:
SELECT id,
SUM(case when dayofweek = 1 then numevents else 0 end) as Day1Events,
SUM(case when dayofweek = 2 then numevents else 0 end) as Day2Events,
SUM(case when dayofweek = 3 then numevents else 0 end) as Day3Events
--, etc...
FROM EventTable
GROUP BY ID;
SELECT dayOfWeek, sum(numEvents) as numberOfEvents
FROM t
GROUP BY dayOfWeek;
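The conditional-aggregation answer above stops at day 3 with "etc.". A minimal sketch covering all seven days, using Python's sqlite3 and assuming dayOfWeek runs from 0 (Sunday) to 6 (Saturday) as in the question's table, with the counts in a column named numEvents as in the question's pseudo-query. The function name events_by_weekday is made up for this illustration.

```python
import sqlite3

DAYS = ["Sunday", "Monday", "Tuesday", "Wednesday",
        "Thursday", "Friday", "Saturday"]


def events_by_weekday(rows):
    """rows are (id, dayOfWeek, numEvents); returns one row per id, one column per day."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (id INTEGER, dayOfWeek INTEGER, numEvents INTEGER)")
    con.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
    # One SUM(CASE ...) per weekday; ELSE 0 turns a missing day into 0, not NULL.
    cols = ",\n       ".join(
        f"SUM(CASE WHEN dayOfWeek = {i} THEN numEvents ELSE 0 END) AS {name}"
        for i, name in enumerate(DAYS)
    )
    sql = f"SELECT id,\n       {cols}\nFROM t\nGROUP BY id\nORDER BY id"
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [
        (0, 0, 13), (0, 1, 45), (0, 2, 142), (0, 3, 3),
        (0, 4, 36), (0, 5, 63), (0, 6, 15336),
        # id 1 has no Saturday row here; it should show up as 0
        (1, 0, 17), (1, 1, 25), (1, 2, 45), (1, 3, 364),
        (1, 4, 37), (1, 5, 540),
    ]
    for row in events_by_weekday(sample):
        print(row)
```

With the values from the question's ideal table this gives (0, 13, 45, 142, 3, 36, 63, 15336) and (1, 17, 25, 45, 364, 37, 540, 0).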