SQL Conditional select - calculate running total


I have a stored procedure that calculates requirements for customers based on input that we receive from them.
Displaying this information is not a problem.
What I'd like to do is show the most recent received amount and subtract that from the weekly requirements.
So if last Friday I shipped 150 items and this week's requirements are 100 items for each day, then I'd like the data grid to show 0 for Monday, 50 for Tuesday, and 100 for Wednesday through Friday.
So far I have tried the following select statement, with limited success:
Select Customer, PartNumber, LastReceivedQty, Day1Qty, Day2Qty, Day3Qty, Day4Qty, Day5Qty,
TotalRequired
FROM Requirements
Obviously the above select statement does nothing but display the data as it is in the table. When I add a CASE expression as follows I get a bit closer to what I need, but not fully, and I'm unsure how to proceed.
Select Customer, PartNumber, LastReceivedQty,
"Day1Qty" = case When Day1Qty > 0 then Day1Qty - LastReceivedQty end
...
This method works ok as long as the LastReceivedQty is less than the Day1 requirements but it's incorrect because it allows a negative number to be displayed in day one rather than pulling the remainder from day2.
Sample Data looks like the following:
Customer PartNumber LastReceivedQty Day1Qty Day2Qty Day3Qty Day4Qty Day5Qty TotalRqd
45Bi 2526 150 -50 100 100 100
In the sample above, the Day 1 requirement for part number 2526 is 100 and the last received qty is 150.
Day1Qty shows -50, as opposed to zeroing out day 1 and subtracting the remainder from day 2, 3, etc.
How do I display those figures without showing a negative balance on the requirement dates?
Any help/suggestions on this is greatly appreciated.
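The carry-over described in the question can be sketched outside SQL to make the intended arithmetic explicit. This is only an illustration in Python, not the stored procedure itself; the function name net_requirements is hypothetical, and it assumes the daily requirements are non-negative and are consumed in day order.

```python
def net_requirements(last_received, daily_qty):
    """Consume last_received against each day's requirement, in day order."""
    remaining = last_received
    net = []
    for qty in daily_qty:
        used = min(remaining, qty)  # never take more than the day requires
        net.append(qty - used)      # what is still needed for that day
        remaining -= used           # carry the rest over to the next day
    return net


# Figures from the question: 150 received, 100 required on each of 5 days.
print(net_requirements(150, [100, 100, 100, 100, 100]))
# [0, 50, 100, 100, 100]
```

The same idea in SQL amounts to comparing LastReceivedQty with the running total of the day columns up to each day, rather than with a single day's quantity.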

Related

How to calculate on created fields? Why is the calculation wrong?

I am working on a workforce analysis project, and I did some CASE WHEN conditional calculations in Google Data Studio. I created the new fields successfully, but I couldn't then do further calculations based on the fields I had created.
From my raw data, I generated start_headcount, new_hires, terminated and end_headcount by applying CASE WHEN conditional calculations. However, I failed in the next step, calculating the turnover rate and the retention rate.
The formula for Turnover rate is
terms/((start_headcount+end_headcount)/2)
for retention is
end_headcount/start_headcount
However, the result is wrong. Part of my table is as below:
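Supervisor | sheadcount | newhire | terms | eheadcount | turnover | Retention
---------- | ---------- | ------- | ----- | ---------- | -------- | ---------
A          | 1          | 3       | 1     | 3          | 200%     | 0%
B          | 6          | 2       | 2     | 6          | 200%     | 500%
C          | 6          | 1       | 3     | 4          | 600%     | 300%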
So the result is wrong. The turnover rate for A should be 1/((1+3)/2)=50%; For B should be 2/((6+6)/2)=33.33%.
I don't know why it is going wrong. Can anyone help?
For example, I wrote below for start_headcount for each employee
CASE
WHEN Last Hire Date<'2018-01-01' AND Termination Date>= '2018-01-01'
OR Last Hire Date<'2018-01-01' AND Termination Date IS NULL
THEN 1
ELSE 0
END
This means that an employee who meets the above condition gets 1, and the employees are then grouped under a supervisor. I think this might be why the summed turnover rate is wrong: it is not calculated on the grouped data but on each record, and then summed up.

Most likely you are trying to do both steps within the same query, so the newly created fields like start_headcount, etc. are not yet visible within the same select statement. Instead, you need to put the first calculation in a subquery, as in the example below:
#standardSQL
SELECT *, terms/((start_headcount+end_headcount)/2) AS turnover
FROM (
<query for your first step>
)
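The asker's suspicion, that the rate is computed per record and then summed, can be checked with a small Python sketch. The per-employee rows below are hypothetical; they are chosen only so that they add up to supervisor A's totals in the question (start 1, terms 1, end 3), and the helper name turnover is an assumption for illustration.

```python
# Hypothetical per-employee rows for supervisor A (not the asker's real data).
rows = [
    {"start": 1, "terms": 1, "end": 0},  # present at the start, then terminated
    {"start": 0, "terms": 0, "end": 1},  # new hire
    {"start": 0, "terms": 0, "end": 1},  # new hire
    {"start": 0, "terms": 0, "end": 1},  # new hire
]


def turnover(start, terms, end):
    """terms / ((start + end) / 2), returning 0.0 when the average is zero."""
    average = (start + end) / 2
    return terms / average if average else 0.0


# Rate computed on every record, then summed (what the report appears to do).
per_row_sum = sum(turnover(r["start"], r["terms"], r["end"]) for r in rows)

# Fields summed first, rate computed once on the grouped totals.
totals = {key: sum(r[key] for r in rows) for key in ("start", "terms", "end")}
grouped = turnover(totals["start"], totals["terms"], totals["end"])

print(per_row_sum)  # 2.0, i.e. the 200% shown in the table
print(grouped)      # 0.5, i.e. the expected 50%
```

Under these assumed rows, summing per-record rates reproduces the 200% from the table, while aggregating first gives the expected 50%, which is what the subquery arrangement achieves.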

DAX - Need column with row count within past year

I have a table with sales information at the transaction level. We want to institute a new model where we compensate sales reps if a customer makes a purchase after more than a year of dormancy. To figure out how much this would have cost historically, I want to add a column with a flag for whether or not each purchase was the Buyer's first in the past 365 days. What I'd like to do is a rowcount in Powerpivot, for all sales made by that customer in the past 365 days, and wrap it in an IF to set the result to 0 or 1.
Example:
Order Date | Buyer | First Purchase in Year?
---------- | ----- | -----------------------
1/1/2015   | 1     | 1
1/2/2015   | 2     | 1
2/1/2015   | 1     | 0
4/1/2015   | 2     | 0
3/1/2016   | 2     | 1
5/1/2017   | 2     | 1
Any assistance would be greatly appreciated.

Excellent business use case! It's quite relevant in the business world.
To break this down for you, I will create 3 columns: 2 with some calculations, and 1 with the result. Once you understand how I did this, you can combine all 3 column formulas into a single column for your dataset, if you like.
Here are the 3 columns that I created:
Last Purchase - in order to run this calculation, you need to know when the buyer made their last purchase.
CALCULATE(MAX([Order Date]),FILTER(Table1,[Order Date]<EARLIER([Order Date]) && [Buyer]=EARLIER([Buyer])))
Days Since Last Purchase - now you can compare the Last Purchase date to the current Order Date.
DATEDIFF([Last Purchase],[Order Date],DAY)
First Purchase in 1 Year - finally, the results column. This simply checks to see if it has been more than 365 days since the last purchase OR if the last purchase column is blank (which means it was the first purchase), and creates the flag you want.
IF([Days Since Last Purchase]>365 || ISBLANK([Days Since Last Purchase]),1,0)
Now, you can easily combine the logic of these 3 columns into a single column and get what you want. Hope this helps!
One note I wanted to add is that for this type of analysis it's not a wise move to do row counts as you had originally suggested, as your dataset can easily expand later on (what if you wanted to add more attribute columns?) and then you would have problems. So this solution that I shared with you is much more robust.
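For readers who want to check the logic of the three columns outside PowerPivot, here is a rough Python equivalent. It is a sketch under assumptions: the function name first_purchase_flags is hypothetical, the orders are assumed to be sorted by date with at most one order per buyer per day, and the sample rows are a subset of the question's rows read as month/day/year.

```python
from datetime import date


def first_purchase_flags(orders):
    """orders: list of (order_date, buyer), sorted by date. Returns 0/1 flags."""
    last_purchase = {}  # buyer -> date of that buyer's most recent earlier order
    flags = []
    for order_date, buyer in orders:
        previous = last_purchase.get(buyer)
        # Flag when there is no earlier purchase, or it is more than 365 days old.
        is_first = previous is None or (order_date - previous).days > 365
        flags.append(1 if is_first else 0)
        last_purchase[buyer] = order_date
    return flags


orders = [
    (date(2015, 1, 1), 1),
    (date(2015, 1, 2), 2),
    (date(2015, 2, 1), 1),
    (date(2015, 4, 1), 2),
    (date(2017, 5, 1), 2),
]
flags = first_purchase_flags(orders)
print(flags)  # [1, 1, 0, 0, 1]
```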

Calculating the number of new ID numbers per month in powerpivot

My dataset provides a monthly snapshot of customer accounts. Below is a very simplified version:
Date_ID | Acc_ID
------- | -------
20160430| 1
20160430| 2
20160430| 3
20160531| 1
20160531| 2
20160531| 3
20160531| 4
20160531| 5
20160531| 6
20160531| 7
20160630| 4
20160630| 5
20160630| 6
20160630| 7
20160630| 8
Customers can open or close their accounts, and I want to calculate the number of 'new' customers every month. The number of 'exited' customers will also be helpful if this is possible.
So in the above example, I should get the following result:
Month | New Customers
------- | -------
20160430| 3
20160531| 4
20160630| 1
Basically I want to compare distinct account numbers in the selected and previous month, any that exist in the selected month and not previous are new members, any that were there last month and not in the selected are exited.
I've searched but I can't seem to find any similar problems, and I hardly know where to start myself - I've tried using CALCULATE and FILTER along with DATEADD to filter the data to get two months, and then count the unique values. My PowerPivot skills aren't up to scratch to solve this on my own however!

Getting the new users is relatively straightforward - I'd add a calculated column which counts rows for that user in earlier months and if they don't exist then they are a new user:
=IF(CALCULATE(COUNTROWS(data),
FILTER(data, [Acc_ID] = EARLIER([Acc_ID])
&& [Date_ID] < EARLIER([Date_ID]))) = BLANK(),
"new",
"existing")
Once this is in place you can simply write a measure for new_users:
=CALCULATE(COUNTROWS(data), data[customer_type] = "new")
Getting the cancelled users is a little harder because it means you have to be able to look backwards to the prior month - none of the time intelligence stuff in PowerPivot will work out of the box here as you don't have a true date column.
It's nearly always good practice to have a separate date table in your PowerPivot models and it is a good way to solve this problem - essentially the table should be 1 record per date with a unique key that can be used to create a relationship. Perhaps post back with a few more details.
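The calculated-column idea (an account is "new" in the first month it appears) can be sanity-checked against the sample data with a short Python sketch. This only mirrors the logic, not the DAX itself, and the variable names are illustrative.

```python
from collections import Counter

# (Date_ID, Acc_ID) rows from the question's sample.
snapshots = [
    (20160430, 1), (20160430, 2), (20160430, 3),
    (20160531, 1), (20160531, 2), (20160531, 3), (20160531, 4),
    (20160531, 5), (20160531, 6), (20160531, 7),
    (20160630, 4), (20160630, 5), (20160630, 6), (20160630, 7), (20160630, 8),
]

# Record the earliest month in which each account appears.
first_seen = {}
for date_id, acc_id in sorted(snapshots):
    first_seen.setdefault(acc_id, date_id)

# An account counts as "new" only in its first month.
new_per_month = Counter(first_seen.values())
print(sorted(new_per_month.items()))
# [(20160430, 3), (20160531, 4), (20160630, 1)]
```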

This is an alternative method to Jacob's which also works. It avoids creating a calculated column, but I actually find the calculated column useful to use as a flag against other measures.
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, LASTDATE('Dates'[Date])
)
) - CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATESBETWEEN(
'Dates'[Date], 0, FIRSTDATE('Dates'[Date]) - 1
)
)
It basically uses the dates table to make a distinct count of all Acc_ID from the beginning of time until the first day of the period of time selected, and subtracts that from the distinct count of all Acc_ID from the beginning of time until the last day of the period of time selected. This is essentially the number of new distinct Acc_ID, although you can't work out which Acc_ID's these are using this method.
I could then calculate 'exited accounts' by taking the previous month's total as 'existing accounts':
=CALCULATE(
DISTINCTCOUNT('Accounts'[Acc_ID]),
DATEADD('Dates'[Date], -1, MONTH)
)
Then adding the 'new accounts', and subtracting the 'total accounts':
=DISTINCTCOUNT('Accounts'[Acc_ID])
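The arithmetic behind these measures (exited = previous month's total + new accounts - current total) can be verified on the question's sample data. The following Python sketch is illustrative only; it assumes an account that closes does not reappear in a later month, otherwise the identity would not hold.

```python
# Distinct Acc_ID values per monthly snapshot, from the question's sample.
by_month = {
    20160430: {1, 2, 3},
    20160531: {1, 2, 3, 4, 5, 6, 7},
    20160630: {4, 5, 6, 7, 8},
}

seen = set()      # every account seen in any earlier month
previous = set()  # accounts in the immediately preceding month
result = {}
for month in sorted(by_month):
    current = by_month[month]
    new = len(current - seen)                    # all-time count now minus before
    exited = len(previous) + new - len(current)  # previous total + new - total
    result[month] = (new, exited)
    seen |= current
    previous = current

print(result)
# {20160430: (3, 0), 20160531: (4, 0), 20160630: (1, 3)}
```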

SQL - Order/Delivery manipulation

Figured this out in excel - just need to convert it to SQL - thought I would write this here in case anyone has looked at this and started to reply.
I'm currently looking at outstanding orders and future estimated deliveries for a range of products where there can be multiple orders and deliveries. I have a large table, but I have no reputation so I'm unable to paste an image in here, and I'm unable to draw it out using spaces.
A positive in the Quantity column represents a reserved order from a priority area that has first pick when any future order comes in. Similarly, a negative represents a delivery. For example, if we look at product A:
Week 1 – There is a priority order for 60.
Week 2 – 40 are delivered meaning 40 are allocated to the 60 in priority order week 1 (still 20 outstanding).
Week 3 – A New Priority order takes effect for 20 (combining with the 20 outstanding from Week 2 to create 40)
Week 3 – at the same time in week 3 an order comes in for 50 – this can satisfy the current outstanding request for 40 leaving 10 left over
Week 5 – A new priority order takes effect for 20, taking the 10 remaining and creating an outstanding order of 10.
I’ve been looking for a way to nicely look at the effect of the priority orders such that the estimated quantity and therefore cost can be seen. i.e. for product A
Week 1 - Initial Demand for 60 - can be ignored as nothing delivered
Week 2 - 40 delivered - 40 at cost
Week 3 - 40 delivered - 40 at cost
Week 5 – 10 delivered - 10 at cost
I think there may be an easy solution, but having looked at it for a while now I can’t see the wood for the trees. I think there is an issue when a large enough order comes in and there is sufficient quantity to cover the priority order, yet the remainder has effectively been ‘reserved’ by the priority department and needs to be ‘rolled over’.
Any help or prompts much appreciated

When I first read your question it sounded like a stock-inventory problem, but based on your example data it seems to be a simple cumulative sum (at least for the first part):
SELECT
product, week, quantity,
SUM(quantity)
OVER (PARTITION BY product
ORDER BY week
ROWS UNBOUNDED PRECEDING) AS "cumulative quantity"
FROM tab
Regarding the second part, I'm not sure what you expect as the result. Could you elaborate on that?
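To see what the cumulative sum produces, the window function can be mimicked in Python on the product A figures. The rows are read from the question's narrative, treating the week 3 arrival of 50 as a delivery (negative quantity); that reading, and the function name cumulative_by_product, are assumptions for illustration.

```python
from itertools import accumulate, groupby

# (product, week, quantity): positive = priority order, negative = delivery.
rows = [
    ("A", 1, 60),
    ("A", 2, -40),
    ("A", 3, 20),
    ("A", 3, -50),
    ("A", 5, 20),
]


def cumulative_by_product(rows):
    """Running total of quantity per product, ordered by week."""
    # sorted() is stable, so rows within the same week keep their input order.
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    out = []
    for _, group in groupby(ordered, key=lambda r: r[0]):
        group = list(group)
        totals = accumulate(quantity for _, _, quantity in group)
        out.extend((p, w, q, total) for (p, w, q), total in zip(group, totals))
    return out


for row in cumulative_by_product(rows):
    print(row)
# ('A', 1, 60, 60)
# ('A', 2, -40, 20)
# ('A', 3, 20, 40)
# ('A', 3, -50, -10)
# ('A', 5, 20, 10)
```

A positive running total corresponds to an outstanding priority order and a negative one to stock left over, which matches the week-by-week description in the question.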

SQL change over time query

I have created 2 tables. One table has 4 fields (a unique name, a date and 3 figures). The second table contains the same fields but records the output of a merge function, so it also has the date on which the update or insert happened. What I want to do is retrieve either the difference between 2 days or the totals of the 2 days, to work out how much the value has changed over the day. The merge function only updates if a value has changed, or inserts if there is a new value.
So far I have this:
select sum(Change_Table_1.Disk_Space) as total,
Change_Table_1.Date_Updated
from VM_Info
left join Change_Table_1
on VM_Info.VM_Unique = Change_Table_1.VM_Unique
where VM_Info.Agency = 'test'
group by Change_Table_1.Date_Updated
but this would just return the sum of that day's updated values rather than the difference between the two days. One answer to this question would be to add all new records to the table, but this would contain a number of duplicates. In my head, what I want it to do is loop over the current figures for the day, then loop over the next day, but also include all values that haven't been updated. Sorry if I haven't explained this well. What I want to achieve is to get some sort of change of the total over time. If it's poor design, I'm in a position to accept that too.
Any help is much appreciated.
Maybe this explains it better: show me the total for day 1; for day 2, if a value hasn't changed then show me the same value, and if it has changed show me the new value, and so on.
To elaborate further, the Change_Table looks like this:
vm  | date created | action | value_Field1 | value_field_2 | Disk_Space
--- | ------------ | ------ | ------------ | ------------- | ----------
abc | 14/10/2013   | insert | 5            | 5             | 30
def | 14/10/2013   | insert | 5            | 5             | 75
abc | 15/10/2013   | update | 5            | 5             | 75
So the output I want for the 14th is the total of the last column, 105. On the 15th, abc has changed from 30 to 75; def hasn't changed but still needs to be included, giving 150.
So the output would look like this:
date       | disk_Space
---------- | ----------
14/10/2013 | 105
15/10/2013 | 150

Does this help? If not, can you provide a few rows of sample data, and an example of the desired result?
select
(VM_Info.Disk_Space - Change_Table_1.Disk_Space) as DiskSpaceChange,
Change_Table_1.Date_Updated
from
VM_Info
left join Change_Table_1 on VM_Info.VM_Unique = Change_Table_1.VM_Unique and VM_Info.Date = Change_Table_1.Date_Updated
where
VM_Info.Agency = 'test'
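Separately from the query above, the "carry unchanged values forward" behaviour described in the question can be sketched in Python to pin down the expected numbers. This is only an illustration of the logic: the function name daily_totals is hypothetical, and the dates from the sample are rewritten in ISO format so that they sort correctly as strings.

```python
# (vm, date, Disk_Space) rows from the question's Change_Table sample.
changes = [
    ("abc", "2013-10-14", 30),
    ("def", "2013-10-14", 75),
    ("abc", "2013-10-15", 75),
]


def daily_totals(changes):
    """Total disk space per day, reusing each VM's latest known value."""
    latest = {}  # vm -> most recent Disk_Space seen so far
    totals = {}  # date -> total across all VMs as of that date
    for vm, day, disk_space in sorted(changes, key=lambda c: c[1]):
        latest[vm] = disk_space
        totals[day] = sum(latest.values())  # unchanged VMs keep their old value
    return totals


print(daily_totals(changes))
# {'2013-10-14': 105, '2013-10-15': 150}
```

Days on which no VM changes produce no entry here; a calendar of dates would be needed to fill those in.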
