How to display monthly revenue splitting between new users, resurrected users, and churned etc users in sql? - sql

So say I have 3 fields in my table - month, user, and revenue. I am trying to build a table that shows me total revenue by month broken down by the revenue type - new (which means it's a user that spent for the first time)
- returning (which means it's a user that did not spend in the previous month but that spent in the past and spent in the month in question)
- expansion (which means it's a user that spent in the previous month but spent more this time)
- contraction (which means it's a user that spent in the previous month but spent less this time)
- churned (which means it's a user that spent in the previous month but not in this month...so this will actually be negative and will not add to or take away from the total revenue). So basically:
New+
Expansion+Resurrected+Contraction+Remaining+= Total then (Churned)
Exemplary table with the added fields to give a sense of the math
I'm ultimately looking to build a stacked column chart like this one:

So it looks like you need to solve for
2+agoMo PriorMo ThisMo
0 0 x New
x 0 x Returning/Resurected
? w < x Expansion
? z > x Contraction
? y 0 Churned
? x = x >0 Remaining
? 0 0 Sleeping
What have you coded already? what results are you getting?

Related

How to calculated on created fields? Why the calculation is wrong?

I am working on the workforce analysis project. And I did some case when conditional calculations in Google Data Studio. However, when I successfully conducted the creation of the new field, I couldn't do the calculation again based on the fields I created.
Based on my raw data, I generated the start_headcount, new_hires, terminated, end_headcount by applying the Case When conditional calculations. However, I failed in the next step to calculate the Turnover rate and Retention rate.
The formula for Turnover rate is
terms/((start_headcount+end_headcount)/2)
for retention is
end_headcount/start_headcount
However, the result is wrong. Part of my table is as below:
Supervisor sheadcount newhire terms eheadcount turnover Retention
A 1 3 1 3 200% 0%
B 6 2 2 6 200% 500%
C 6 1 3 4 600% 300%
So the result is wrong. The turnover rate for A should be 1/((1+3)/2)=50%; For B should be 2/((6+6)/2)=33.33%.
I don't know why it is going wrong. Can anyone help?
For example, I wrote below for start_headcount for each employee
CASE
WHEN Last Hire Date<'2018-01-01' AND Termination Date>= '2018-01-01'
OR Last Hire Date<'2018-01-01' AND Termination Date IS NULL
THEN 1
ELSE 0
END
which means if an employee meets the above standard, will get 1. And then they all grouped under a supervisor. I think it might be the problem why the turnover rate in sum is wrong since it is not calculated on the grouped date but on each record and then summed up.
Most likely you are trying to do both steps within the same query and thus newly created fields like start_headcount, etc. not visible yet within the same select statement - instead you need to put first calculation as a subquery as in example below
#standardSQL
SELECT *, terms/((start_headcount+end_headcount)/2) AS turnover
FROM (
<query for your first step>
)

Calculating the number of new ID numbers per month in powerpivot

My dataset provides a monthly snapshot of customer accounts. Below is a very simplified version:
Date_ID | Acc_ID
------- | -------
20160430| 1
20160430| 2
20160430| 3
20160531| 1
20160531| 2
20160531| 3
20160531| 4
20160531| 5
20160531| 6
20160531| 7
20160630| 4
20160630| 5
20160630| 6
20160630| 7
20160630| 8
Customers can open or close their accounts, and I want to calculate the number of 'new' customers every month. The number of 'exited' customers will also be helpful if this is possible.
So in the above example, I should get the following result:
Month | New Customers
------- | -------
20160430| 3
20160531| 4
20160630| 1
Basically I want to compare distinct account numbers in the selected and previous month, any that exist in the selected month and not previous are new members, any that were there last month and not in the selected are exited.
I've searched but I can't seem to find any similar problems, and I hardly know where to start myself - I've tried using CALCULATE and FILTER along with DATEADD to filter the data to get two months, and then count the unique values. My PowerPivot skills aren't up to scratch to solve this on my own however!
Getting the new users is relatively straightforward - I'd add a calculated column which counts rows for that user in earlier months and if they don't exist then they are a new user:
=IF(CALCULATE(COUNTROWS(data),
FILTER(data, [Acc_ID] = EARLIER([Acc_ID])
&& [Date_ID] < EARLIER([Date_ID]))) = BLANK(),
"new",
"existing")
Once this is in place you can simply write a measure for new_users:
=CALCULATE(COUNTROWS(data), data[customer_type] = "new")
Getting the cancelled users is a little harder because it means you have to be able to look backwards to the prior month - none of the time intelligence stuff in PowerPivot will work out of the box here as you don't have a true date column.
It's nearly always good practice to have a separate date table in your PowerPivot models and it is a good way to solve this problem - essentially the table should be 1 record per date with a unique key that can be used to create a relationship. Perhaps post back with a few more details.
This is an alternative method to Jacobs which also works. It avoids creating a calculated column, but I actually find the calculated column useful to use as a flag against other measures.
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, LASTDATE('Dates'[Date])
)
) - CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, FIRSTDATE('Dates'[Date]) - 1
)
)
It basically uses the dates table to make a distinct count of all Acc_ID from the beginning of time until the first day of the period of time selected, and subtracts that from the distinct count of all Acc_ID from the beginning of time until the last day of the period of time selected. This is essentially the number of new distinct Acc_ID, although you can't work out which Acc_ID's these are using this method.
I could then calculate 'exited accounts' by taking the previous months total as 'existing accounts':
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATEADD('Dates'[Date], -1, MONTH)
)
Then adding the 'new accounts', and subtracting the 'total accounts':
=DISTINCTCOUNT('Accounts'[Acc_ID])

SQL Conditional select - calculate running total

I have a stored procedure that calculates requirements for customers based on input that we receive from them.
Displaying this information is not a problem.
What I'd like to do is show the most recent received amount and subtract that from the weekly requirements.
So if last Friday I shipped 150 items and this weeks requirements are 100 items for each day then I'd like the data grid to show 0 for Monday, 50 for Tuesday, 100 for Wednesday - Friday.
I have currently tried using with limited success the sample select statement -
Select Customer, PartNumber, LastReceivedQty, Day1Qty, Day2Qty, Day3Qty, Day4Qty, Day5Qty,
TotalRequired
FROM Requirements
Obviously the above select statement does nothing but display data as it is in the table. So when I add the case state as follows I get a bit closer to what I need but not fully and I'm unsure how to proceed.
Select Customer, PartNumber, LastReceivedQty,
"Day1Qty" = case When Day1Qty > 0 then Day1Qty - LastReceivedQty end
...
This method works ok as long as the LastReceivedQty is less than the Day1 requirements but it's incorrect because it allows a negative number to be displayed in day one rather than pulling the remainder from day2.
Sample Data looks like the following:
Customer PartNumber LastReceivedQty Day1Qty Day2Qty Day3Qty Day4Qty Day5Qty TotalRqd
45Bi 2526 150 -50 100 100 100
In the sample above the requirements for part number 2526 Day 1 are 100 and the last received qty is 150
The day1qty shows -50 as opposed to zeroing out day 1 and subtract from day2, 3, etc.
How do I display those figures without showing a negative balance on the requirement dates?
Any help/suggestions on this is greatly appreciated.

SQL - Order/Delivery manipulation

Figured this out in excel - just need to convert it to SQL - thought I would write this here in case anyone has looked at this and started to reply.
I'm currently looking at outstanding orders and future estimated deliveries for a range of products where there can be multiple orders and deliveries. I have a large table — see image:
I have no reputation so unable to paste an image in here and I'm unable to draw it out using spaces,
A positive in the Quantity column represents a reserved order from a priority area that has first pick when any future order comes in. Similarly a negative represents a delivery (For example if we look at product A;
Week 1 – There is a priority order for 60.
Week 2 – 40 are delivered meaning 40 are allocated to the 60 in priority order week 1 (still 20 outstanding).
Week 3 – A New Priority order takes effect for 20 (combining with the 20 outstanding from Week 2 to create 40)
Week 3 – at the same time in week 3 an order comes in for 50 – this can satisfy the current outstanding request for 40 leaving 10 left over
Week 5 – A new priority order take effect for 20, taking the 10 remaining and creating an outstanding order of 10.
I’ve been looking for a way to nicely look at the effect of the priority orders such that the estimated quantity and therefore cost can be seen. i.e. for product A
Week 1 - Initial Demand for 60 - can be ignored as nothing delivered
Week 2 - 40 delivered - 40 at cost
Week 3 - 40 delivered - 40 at cost
Week 5 – 10 delivered - 10 at cost
I think there may be an easy solution but having been looking at it for a while now I can’t see the wood from the trees. I think there is an issue with when a large enough order comes in and there is sufficient quantity to cover the priority order yet the remaining has effectively been ‘reserved’ by the priority department and needs to be ‘rolled over’.
Any help or prompts much appreciated
When I first read your question it sounded like a stock-inventory problem, but based on your example data it seems to be a simple cumulative sum (at least for the first part):
SELECT
product, week, quantity,
SUM(quantity)
OVER (PARTITION BY product
ORDER BY week
ROWS UNBOUNDED PRECEDING) AS "cumulative quantity"
FROM tab
Regarding the second part I'm not shure what you expect as result, could you elaborate on that?

In Crystal Report print only first record in group and leave it summable

I have a table that lists every task an operator completed during a day. This is gathered by a Shop Floor Control program. There is also a column that has the total hours worked that day, this field comes from their time punches. The table looks something like this:
Operator 1 Bestupid 0.5 8 5/12/1986
Operator 1 BeProductive 0.1 8 5/12/1986
Operator 1 Bestupidagain 3.2 8 5/12/1986
Operator 1 Belazy 0.7 8 5/13/1986
Operator 2 BetheBest 1.7 9.25 5/12/1986
I am trying to get an efficiency out of this by summing the process hours and comparing it to the hours worked. The problem is that when I do any kind of summary on the hours worked column it sums EVERY DETAIL LINE.
I have tried:
If Previous (groupingfield) = (groupingfield) Then
HoursWorked = 0
Else
HoursWorked = HoursWorked
I have tried a global three formula trick, but neither of the above leave me with a summable field, I get "A summary has been specified on a non-recurring field"
I currently use a global variable, reset in the group header, but not WhilePrintinganything. However it is missing some records and upon occasion I will get two hoursworked > 0 in the same group :(
Any ideas?
I just want to clarify, I have three groups:
Groups: Work Center --> Operator --> Date
I can summarize the process hours across any group and that's fine. However, the hours worked prints on every detail line even though it really should only print once per Date. Therefore when I summarize the Hours Worked for an operator the total is WAY off because it is adding up 8hours for each entry instead of 8 hours for each day.
Try grouping by the operators. Then create a running total for the process hours that sum for each record and reset on change of group. In the group footer you can display the running total and any other stats for that operator you care to.
Try another running total for the daily hours but pick maximum as the type of summary. Since all the records for the day will have the same hours work the maximum will be correct. Reset with the change of the date group and you should be good to go.