Find Active customers in past X days - sql

I am stuck on a query and would appreciate some help.
Thanks a lot in advance :)
I have 2 tables.
1st table: daily_customer_snapshot: the daily snapshot of each customer, which looks like this:
c_id date state location
b1 2020-12-01 Active OOW
b1 2020-12-02 Active OOW
b1 2020-12-03 Active OOW
b1 2020-12-04 Active OOW
b1 2020-12-05 Active OOW
b3 2020-12-06 Active OOW
b3 2020-12-07 Active OOW
b3 2020-12-08 Active OOW
b1 2020-12-09 Decay IW
b2 2020-12-15 Active OOW
2nd table: customer_date_series: contains a date series for each customer, starting from the day the user became our customer.
Ex: user b1 became our customer on '2020-12-01', user b3 on '2020-12-06',
and user b2 on '2020-12-15'. I generated the date series per customer_id so that I can count how many customers we had on any given date.
c_id date
b1 2020-12-01
b1 2020-12-02
b1 2020-12-03
b1 2020-12-04
b1 2020-12-05
b1 2020-12-06
b1 2020-12-07
b1 2020-12-08
b1 2020-12-09
b1 2020-12-10
b1 2020-12-11
b1 2020-12-12
b1 2020-12-13
b1 2020-12-14
b1 2020-12-15
b1 2020-12-16
b3 2020-12-06
b3 2020-12-07
b3 2020-12-08
b3 2020-12-09
b3 2020-12-10
b3 2020-12-11
b3 2020-12-12
b3 2020-12-13
b3 2020-12-14
b3 2020-12-15
b3 2020-12-16
b2 2020-12-15
b2 2020-12-16
I left joined customer_date_series with daily_customer_snapshot to get an overview of the customer behavior on any given date.
The result is shown after the query.
Query to Join:
SELECT
    bds.date,
    bds.c_id,
    b.state,
    b.location
FROM customer_date_series bds
LEFT JOIN daily_customer_snapshot b
    ON bds.c_id = b.c_id AND bds.date = b.date
ORDER BY 1, 2;
date c_id state location
2020-12-01 b1 Active OOW
2020-12-02 b1 Active OOW
2020-12-03 b1 Active OOW
2020-12-04 b1 Active OOW
2020-12-05 b1 Active OOW
2020-12-06 b1
2020-12-06 b3 Active OOW
2020-12-07 b1
2020-12-07 b3 Active OOW
2020-12-08 b1
2020-12-08 b3 Active OOW
2020-12-09 b1 Decay IW
2020-12-09 b3
2020-12-10 b1
2020-12-10 b3
2020-12-11 b1
2020-12-11 b3
2020-12-12 b1
2020-12-12 b3
2020-12-13 b1
2020-12-13 b3
2020-12-14 b1
2020-12-14 b3
2020-12-15 b1
2020-12-15 b2 Active OOW
2020-12-15 b3
2020-12-16 b1
2020-12-16 b2
2020-12-16 b3
This is where I am struggling.
I want to create a new column called 'status': if the customer's data in daily_customer_snapshot was updated in the past 5 days from the current_date,
I want to set the status to 'Active', else 'Inactive'.

If I follow you correctly, you can use boolean window aggregation:
select
    bds.date,
    bds.c_id,
    b.state,
    b.location,
    bool_or(b.state = 'Active') over(
        partition by bds.c_id
        order by bds.date
        range between interval '5 days' preceding and current row
    ) as is_active
from customer_date_series bds
left join daily_customer_snapshot b
    on bds.c_id = b.c_id and bds.date = b.date
order by 1, 2;
This sets a boolean flag on rows where the same customer was active at least once within the last 5 days (or in the current day).
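If you do want to see 'Active'/'InActive' instead (which I find less useful than a boolean), you can do:
    min(b.state) over(
        partition by bds.c_id
        order by bds.date
        range between interval '5 days' preceding and current row
    ) as status
... which works because, string-wise, 'Active' sorts before the other state values ('Decay' in the sample data, or 'InActive' if the column held it). Note that this returns whichever state is smallest in the window, or NULL when the window has no snapshot rows, so on its own it does not produce a literal 'Inactive'.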
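To make the window semantics concrete, here is a minimal Python sketch of the same rule: a customer counts as 'Active' on a given day if there is an 'Active' snapshot row within the 5 days before that day, or on the day itself. The function name status_on, the window_days parameter, and the trimmed sample rows are made up for illustration; this is not a replacement for the SQL above.

```python
from datetime import date, timedelta

# A few rows in the shape of daily_customer_snapshot: (c_id, date, state).
snapshot = [
    ("b1", date(2020, 12, 5), "Active"),
    ("b1", date(2020, 12, 9), "Decay"),
    ("b2", date(2020, 12, 15), "Active"),
]


def status_on(c_id, day, rows, window_days=5):
    """Return 'Active' if c_id has an 'Active' row in [day - window_days, day]."""
    start = day - timedelta(days=window_days)
    for row_id, row_date, state in rows:
        if row_id == c_id and start <= row_date <= day and state == "Active":
            return "Active"
    # No 'Active' row in the window (including the case of no rows at all).
    return "Inactive"


for d in (date(2020, 12, 10), date(2020, 12, 11)):
    print(d, status_on("b1", d, snapshot))
```

Unlike the bare bool_or() / min() expressions, this sketch maps "no rows in the window" to 'Inactive' explicitly; in SQL the same effect would need a case expression or coalesce around the window function.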

If you want to use both tables, then a lateral join does what you want:
SELECT bds.date, bds.c_id, b.state, b.location
    --CASE WHEN b.state = '%ActiveDecay%' between current_date- 10 and current_date THEN 'ActIve' ELSE 'DECAY' END as STATUS
FROM customer_date_series bds
LEFT JOIN LATERAL (
    -- latest snapshot row for this customer on or before the series date
    SELECT b.*
    FROM daily_customer_snapshot b
    WHERE bds.c_id = b.c_id AND b.date <= bds.date
    ORDER BY b.date DESC
    LIMIT 1
) b ON 1=1
ORDER BY 1, 2;
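As a rough illustration of what the lateral subquery does for each row of the date series, the Python sketch below picks the most recent snapshot row on or before a given date. The helper name latest_snapshot and the shortened sample data are assumptions made for this example only.

```python
from datetime import date

# Rows in the shape of daily_customer_snapshot: (c_id, date, state, location).
snapshot = [
    ("b1", date(2020, 12, 8), "Active", "OOW"),
    ("b1", date(2020, 12, 9), "Decay", "IW"),
    ("b3", date(2020, 12, 8), "Active", "OOW"),
]


def latest_snapshot(c_id, day, rows):
    """Most recent row for c_id dated on or before day, or None if there is none."""
    candidates = [r for r in rows if r[0] == c_id and r[1] <= day]
    # Mirrors ORDER BY date DESC LIMIT 1; None mirrors the NULLs of the LEFT JOIN.
    return max(candidates, key=lambda r: r[1], default=None)


print(latest_snapshot("b1", date(2020, 12, 12), snapshot))
```

Note that this carries the last known state forward indefinitely; neither the sketch nor the query above applies a 5-day cutoff.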

Related

Splunk: Use output of search A row by row as input for search B, then produce common result table

In Splunk, I have a search producing a result table like this:
_time A B C
2022-10-19 09:00:00 A1 B1 C1
2022-10-19 09:00:00 A2 B2 C2
2022-10-19 09:10:20 A3 B3 C3
Now, for each row, I want to run a second search, using the _time value as input parameter.
For above row 1 and 2 (same _time value), the result of the second search would be:
_time D E
2022-10-19 09:00:00 D1 E1
For above row 3, the result of the second search would be:
_time D E
2022-10-19 09:10:20 D3 E3
And now I want to output the results in a common table, like this:
_time A B C D E
2022-10-19 09:00:00 A1 B1 C1 D1 E1
2022-10-19 09:00:00 A2 B2 C2 D1 E1
2022-10-19 09:10:20 A3 B3 C3 D3 E3
I experimented with join, append, map, appendcols and subsearch, but I am struggling both with the row-by-row character of the second search and with pulling the data together into one common table.
For example, appendcols simply tacks one result table onto another, even if they are completely unrelated and differently shaped. Like so:
_time A B C D E
2022-10-19 09:00:00 A1 B1 C1 D1 E1
2022-10-19 09:00:00 A2 B2 C2 - -
2022-10-19 09:10:20 A3 B3 C3 - -
Can anybody please point me in the right direction?

Compare values from one column in table A and another column in table B

I need to create a NeedDate column in the expected output by comparing QtyShort from Table B with QtyReceive from Table A.
In the expected output, if QtyShort = 0, NeedDate = MatlDueDate.
For the first row of Table A, if 0 < QtyShort (in Table B) <= QtyReceive (=6), NeedDate = 10/08/2021 (DueDate from Table A).
If 6 < QtyShort <= 10 (QtyReceive), move to the second row, NeedDate = 10/22/2021 (DueDate from Table A).
If 10 < QtyShort <= 20 (QtyReceive), move to the third row, NeedDate = 02/01/2022 (DueDate from Table A).
If QtyShort > QtyReceive (=20), NeedDate = 09/09/9999.
This should continue in a loop until the last row of Table B has been compared.
How could we do this? Any help will be appreciated. Thank you in advance!
Table A
Item DueDate QtyReceive
A1 10/08/2021 6
A1 10/22/2021 10
A1 02/01/2022 20
Table B
Item MatlDueDate QtyShort
A1 06/01/2022 0
A1 06/02/2022 0
A1 06/03/2022 1
A1 06/04/2022 2
A1 06/05/2022 5
A1 06/06/2022 7
A1 06/07/2022 10
A1 06/08/2022 15
A1 06/09/2022 25
Expected Output:
Item MatlDueDate QtyShort NeedDate
A1 06/01/2022 0 06/01/2022
A1 06/02/2022 0 06/02/2022
A1 06/03/2022 1 10/08/2021
A1 06/04/2022 2 10/08/2021
A1 06/05/2022 5 10/08/2021
A1 06/06/2022 7 10/22/2021
A1 06/07/2022 10 10/22/2021
A1 06/08/2022 15 02/01/2022
A1 06/09/2022 25 09/09/9999
Use the OUTER APPLY operator to find the earliest DueDate in TableA whose QtyReceive is enough to cover the QtyShort:
select b.Item, b.MatlDueDate, b.QtyShort,
       NeedDate = case when b.QtyShort = 0
                       then b.MatlDueDate
                       else isnull(a.DueDate, '9999-09-09')
                  end
from TableB b
outer apply
(
    select DueDate = min(a.DueDate)
    from TableA a
    where a.Item = b.Item
      and a.QtyReceive >= b.QtyShort
) a
Result:
Item MatlDueDate QtyShort NeedDate
A1 2022-06-01 0 2022-06-01
A1 2022-06-02 0 2022-06-02
A1 2022-06-03 1 2021-10-08
A1 2022-06-04 2 2021-10-08
A1 2022-06-05 5 2021-10-08
A1 2022-06-06 7 2021-10-22
A1 2022-06-07 10 2021-10-22
A1 2022-06-08 15 2022-02-01
A1 2022-06-09 25 9999-09-09
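For readers who want to check the rule outside the database, the following Python sketch applies the same logic as the OUTER APPLY query: take the earliest DueDate whose QtyReceive covers the QtyShort, and fall back to 9999-09-09 when none does. The names need_date and FALLBACK are invented for this example.

```python
from datetime import date

# Table A from the question: (Item, DueDate, QtyReceive).
table_a = [
    ("A1", date(2021, 10, 8), 6),
    ("A1", date(2021, 10, 22), 10),
    ("A1", date(2022, 2, 1), 20),
]

FALLBACK = date(9999, 9, 9)  # used when no receipt is large enough


def need_date(item, matl_due_date, qty_short, receipts):
    """Earliest DueDate whose QtyReceive covers qty_short, per the rules above."""
    if qty_short == 0:
        return matl_due_date
    dates = [due for it, due, qty in receipts if it == item and qty >= qty_short]
    # min() over the matching rows plays the role of min(a.DueDate);
    # default=FALLBACK plays the role of isnull(..., '9999-09-09').
    return min(dates, default=FALLBACK)


print(need_date("A1", date(2022, 6, 6), 7, table_a))
```

Like the query, this compares each QtyShort against every row of Table A independently, so it relies on QtyReceive increasing with DueDate, as it does in the sample data.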

Related ids per row in SAS

Given the following table with two columns:
ID ACC
A1 ACC1
A2 ACC1
A3 ACC1
B1 ACC2
B2 ACC2
All rows are related based on the ACC column. So my goal is to have the following table:
ID ID2 ACC
A1 A2 ACC1
A1 A3 ACC1
A2 A1 ACC1
A2 A3 ACC1
A3 A1 ACC1
A3 A2 ACC1
B1 B2 ACC2
B2 B1 ACC2
proc sql;
    /* self-join on ACC; a and b replace the aliases left/right, which PROC SQL may not accept as table aliases */
    create table want as
    select a.ID, b.ID as ID2, a.ACC
    from have as a, have as b
    where a.ACC eq b.ACC
      and a.ID ne b.ID;
quit;
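The self-join amounts to listing every ordered pair of distinct IDs that share an ACC. A small Python sketch of that idea, assuming IDs are unique within an ACC (the function name related_pairs is made up for this example):

```python
from collections import defaultdict
from itertools import permutations

# The "have" table from the question: (ID, ACC).
have = [
    ("A1", "ACC1"),
    ("A2", "ACC1"),
    ("A3", "ACC1"),
    ("B1", "ACC2"),
    ("B2", "ACC2"),
]


def related_pairs(rows):
    """Return (ID, ID2, ACC) for every ordered pair of different IDs sharing an ACC."""
    by_acc = defaultdict(list)
    for id_, acc in rows:
        by_acc[acc].append(id_)
    return [
        (a, b, acc)
        for acc, ids in by_acc.items()
        for a, b in permutations(ids, 2)  # ordered pairs, never an ID with itself
    ]


for row in related_pairs(have):
    print(*row)
```

With the sample data this gives 6 rows for ACC1 and 2 rows for ACC2, the same 8 rows as the desired table.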

DISTINCT columns from 2 different tables

I have 2 tables with similar information. Let's call them DAILYROWDATA and SUMMARYDATA.
Table DAILYROWDATA
NIP NAME DEPARTMENT
A1 ARIA BB
A2 CHLOE BB
A3 RYAN BB
A4 STEVE BB
Table SUMMARYDATA
NIP NAME DEPARTMENT STATUSIN STATUSOUT
A1 ARIA BB 1/21/2020 8:06:23 AM 1/21/2020 8:07:53 AM
A2 CHLOE BB 1/21/2020 8:16:07 AM 1/21/2020 9:51:21 AM
A1 ARIA BB 1/22/2020 9:06:23 AM 1/22/2020 10:07:53 AM
A2 CHLOE BB 1/22/2020 9:16:07 AM 1/22/2020 10:51:21 AM
A3 RYAN BB 1/22/2020 8:15:03 AM 1/22/2020 9:12:03 AM
I need to combine these two tables and show all rows from table DAILYROWDATA; where STATUSIN and STATUSOUT are NULL, the output should show 'NA' instead. This is the output I am after:
NIP NAME DEPARTMENT STATUSIN STATUSOUT
A1 ARIA BB 1/21/2020 8:06:23 AM 1/21/2020 8:07:53 AM
A2 CHLOE BB 1/21/2020 8:16:07 AM 1/21/2020 9:51:21 AM
A3 RYAN BB NA NA
A4 STEVE BB NA NA
A1 ARIA BB 1/22/2020 9:06:23 AM 1/22/2020 10:07:53 AM
A2 CHLOE BB 1/22/2020 9:16:07 AM 1/22/2020 10:51:21 AM
A3 RYAN BB 1/22/2020 8:15:03 AM 1/22/2020 9:12:03 AM
A4 STEVE BB NA NA
I need to add a condition: STATUSIN should be NULL (shown as 'NA') only when there is no row for that NIP, NAME, DEPARTMENT, STATUSIN, STATUSOUT on a given date, so there can be multiple rows per person.
You want a left join to bring the two tables together. The trickier part is that you need strings in order to represent the 'NA':
select drd.*,
       coalesce(cast(statusin as varchar(255)), 'NA') as statusin,
       coalesce(cast(statusout as varchar(255)), 'NA') as statusout
from DAILYROWDATA drd
left join SUMMARYDATA sd
    on drd.nip = sd.nip;
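The left join above matches on NIP only, so it does not by itself produce one row per person per date as in the desired output. One possible way to get there is to cross join the people with the distinct dates first and then left join the summary rows. The sketch below tries this in SQLite through Python's sqlite3 module; the ISO-formatted timestamps, the derived table alias days, and the use of SQLite's date() function are assumptions for this example, and the syntax may need adjusting for another database.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE DAILYROWDATA (NIP TEXT, NAME TEXT, DEPARTMENT TEXT);
CREATE TABLE SUMMARYDATA (NIP TEXT, NAME TEXT, DEPARTMENT TEXT,
                          STATUSIN TEXT, STATUSOUT TEXT);
INSERT INTO DAILYROWDATA VALUES
  ('A1','ARIA','BB'), ('A2','CHLOE','BB'), ('A3','RYAN','BB'), ('A4','STEVE','BB');
-- Timestamps rewritten in ISO format so that SQLite's date() can parse them.
INSERT INTO SUMMARYDATA VALUES
  ('A1','ARIA','BB','2020-01-21 08:06:23','2020-01-21 08:07:53'),
  ('A2','CHLOE','BB','2020-01-21 08:16:07','2020-01-21 09:51:21'),
  ('A1','ARIA','BB','2020-01-22 09:06:23','2020-01-22 10:07:53'),
  ('A2','CHLOE','BB','2020-01-22 09:16:07','2020-01-22 10:51:21'),
  ('A3','RYAN','BB','2020-01-22 08:15:03','2020-01-22 09:12:03');
""")

rows = con.execute("""
SELECT drd.NIP, drd.NAME, drd.DEPARTMENT,
       COALESCE(sd.STATUSIN, 'NA')  AS STATUSIN,
       COALESCE(sd.STATUSOUT, 'NA') AS STATUSOUT
FROM DAILYROWDATA drd
CROSS JOIN (SELECT DISTINCT date(STATUSIN) AS d FROM SUMMARYDATA) days
LEFT JOIN SUMMARYDATA sd
       ON sd.NIP = drd.NIP AND date(sd.STATUSIN) = days.d
ORDER BY days.d, drd.NIP
""").fetchall()

for row in rows:
    print(row)
```

Because the columns are stored as text here, no cast is needed before COALESCE; with real datetime columns the cast to a string type shown in the answer would still apply.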

Secondary Sorting (individually)

How would I do secondary sorting on a bar chart for each individual date?
For example, I have the following data:
Date Type Value
1/1/2020 A1 4
1/1/2020 A2 2
1/1/2020 A3 9
1/1/2020 A4 5
1/1/2020 A5 7
2/1/2020 A1 7
2/1/2020 A2 5
2/1/2020 A3 0
2/1/2020 A4 3
2/1/2020 A5 1
3/1/2020 A1 3
3/1/2020 A2 5
3/1/2020 A3 7
3/1/2020 A4 9
3/1/2020 A5 8
Now I need to plot a daily bar chart showing only the top three values for each individual date, i.e. the chart would show:
Date Type Value
1/1/2020 A3 9
1/1/2020 A4 5
1/1/2020 A5 7
2/1/2020 A1 7
2/1/2020 A2 5
2/1/2020 A4 3
3/1/2020 A3 7
3/1/2020 A4 9
3/1/2020 A5 8
That is, I want the top three for each individual date, not a ranking obtained by first summing up A1, A2, A3, A4, A5 and then sorting by the cumulative sum.
You should be able to achieve the sorting you need through having Date as the dimension and Type as the breakdown dimension.
You should be able to then sort by Date and then secondary sort by type.
Restricting to 3 per date however is something you'd need to do in your data source as Data Studio can't currently do that.
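If the restriction has to happen before the data reaches the chart, a small preprocessing step can do it. The Python sketch below keeps the three largest values per date from the sample data; the function name top_n_per_date is hypothetical, and how the filtered rows are loaded into the data source is left open.

```python
from collections import defaultdict

# Sample data from the question: (Date, Type, Value).
data = [
    ("1/1/2020", "A1", 4), ("1/1/2020", "A2", 2), ("1/1/2020", "A3", 9),
    ("1/1/2020", "A4", 5), ("1/1/2020", "A5", 7),
    ("2/1/2020", "A1", 7), ("2/1/2020", "A2", 5), ("2/1/2020", "A3", 0),
    ("2/1/2020", "A4", 3), ("2/1/2020", "A5", 1),
    ("3/1/2020", "A1", 3), ("3/1/2020", "A2", 5), ("3/1/2020", "A3", 7),
    ("3/1/2020", "A4", 9), ("3/1/2020", "A5", 8),
]


def top_n_per_date(rows, n=3):
    """Keep the n rows with the largest Value for each Date, dates in input order."""
    by_date = defaultdict(list)
    for row in rows:
        by_date[row[0]].append(row)
    result = []
    for group in by_date.values():
        # Sort each date's rows by Value, largest first, and keep the first n.
        result.extend(sorted(group, key=lambda r: r[2], reverse=True)[:n])
    return result


for row in top_n_per_date(data):
    print(*row)
```

Within each date the rows come out ordered by Value, largest first, so the chart can then sort by Date and use that order as the secondary sort.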