Sum a field for each month cumulatively and dynamically - sql

I have a table containing around 50,000 records, set up to look back as far as the start of the current financial year.
I have not updated this table since last month, so the data currently in it assumes we are still looking back to April 1st 2011.
(Note: when I refresh the data, there will only be April 2012's data in there, as we are now in April; in May it will have April 2012 and May 2012, and so on.)
Each record has 5 columns I am concerned with:
Department,
Incident date,
month,
year,
reduced
Both the month and year columns have been generated from the incident date field which is in this format:
2011-06-29 00:00:00.000
For each department, I need to sum reduced in a cumulative fashion.
E.g. since April 2011 is the earliest month/year I have data for at the moment, I want to know the total reduced for every department for April.
Then for May I want April and May combined, for June I need April, May and June, and so on.
Is there an intelligent way to do this, so that as soon as I reimport data into this table the query realises there is now only one month and that the year has changed, and until next month displays only April's sum(reduced)?

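The following returns the cumulative totals grouped by Department, Year and Month. It first reduces the table to one row per department and month, then joins back to every row up to the end of that month; joining the raw rows to each other would count the earlier rows once for every row in the later month. If you're clearing out the previous tax year's data when refreshing, you can omit both date filters.
SELECT T1.[Year],
       T1.[Month],
       T1.Department,
       SUM(T2.Reduced) AS ReducedTotals
FROM   (SELECT DISTINCT [Year],
               [Month],
               Department,
               -- first day of the month after IncidentDate
               DATEADD(month, DATEDIFF(month, 0, IncidentDate) + 1, 0) AS NextMonthStart
        FROM   [TABLENAME]
        WHERE  IncidentDate >= '2012-04-01') T1
       INNER JOIN [TABLENAME] T2
               ON T1.Department = T2.Department
              AND T2.IncidentDate >= '2012-04-01'
              AND T2.IncidentDate < T1.NextMonthStart
GROUP  BY T1.[Year],
          T1.[Month],
          T1.Department
ORDER  BY T1.[Year],
          T1.[Month],
          T1.Department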

select t1.id, t1.singlenum, SUM(t2.singlenum) as sum
from #t t1
inner join #t t2 on t1.id >= t2.id
group by t1.id, t1.singlenum
order by t1.id
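For comparison, the cumulative per-department totals asked for above can also be sketched with a window function instead of a self-join. This is only a minimal sketch: it runs SQLite-dialect SQL through Python's sqlite3 module so it can be tried without a server, and the table name incidents, its column names, and the function cumulative_reduced are assumptions made for the illustration. It also assumes a SQLite build with window-function support (3.25 or later).

```python
import sqlite3


def cumulative_reduced(rows):
    """rows: iterable of (department, incident_date 'YYYY-MM-DD', reduced)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical stand-in for the asker's table.
    con.execute(
        "CREATE TABLE incidents (department TEXT, incident_date TEXT, reduced INTEGER)"
    )
    con.executemany("INSERT INTO incidents VALUES (?, ?, ?)", rows)
    sql = """
        WITH monthly AS (
            -- one row per department and calendar month
            SELECT department,
                   strftime('%Y-%m', incident_date) AS ym,
                   SUM(reduced) AS reduced
            FROM incidents
            GROUP BY department, ym
        )
        SELECT department,
               ym,
               -- running sum of the monthly totals within each department
               SUM(reduced) OVER (PARTITION BY department ORDER BY ym) AS running_reduced
        FROM monthly
        ORDER BY department, ym
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [
        ("A", "2011-04-03", 2),
        ("A", "2011-04-20", 1),
        ("A", "2011-05-02", 4),
        ("B", "2011-04-10", 5),
        ("B", "2011-06-01", 1),
    ]
    for row in cumulative_reduced(sample):
        print(row)
```

Because the query only looks at whatever months are present in the table, it needs no hard-coded start date: after a reimport that leaves only April, it returns only April.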

Related

Running Total - Create row for months that don't have any sales in the region (1 row for each region in each month)

I am working on the below query that I will use inside Tableau to create a line chart that will be color-coded by year and will use the region as a filter for the user. The query works, but I found there are months in regions that don't have any sales. These sections break up the line chart and I am not able to fill in the missing spaces (I am using a non-date dimension on the X-Axis - Number of months until the end of its fiscal year).
I am looking for some help altering my query to create a row for every month and every region in my dataset, so that my running total always has a value to display in the line chart. If a month has no values in my table for a region, the row should show 0 and the running total for the region should carry forward.
I have a dimDate table and also a Regions table I can use in the query.
Results of my query now (sorted in Excel for easier viewing): [image: Results Table Now]
What I want to do (new rows highlighted in yellow): [image: What I want to do]
My Code using SQL Server:
SELECT b.gy,
       b.sales_month,
       b.region,
       b.gs_year_total,
       b.months_away,
       Sum(b.gs_year_total)
         OVER (
           partition BY b.gy, b.region
           ORDER BY b.months_away DESC) RT_by_Region_GY
FROM   (SELECT a.gy,
               a.region,
               a.sales_month,
               Sum(a.gy_total) Gs_Year_Total,
               a.months_away
        FROM   (SELECT g.val_id,
                       g.[gs year] AS GY,
                       g.sales_month AS Sales_Month,
                       g.gy_total,
                       Datediff(month, g.sales_month, dt.lastdayofyear) AS months_away,
                       g.value_type,
                       val.region
                FROM   uv_sales g
                       JOIN dbo.dimdate AS dt
                         ON g.[gs year] = dt.gsyear
                       JOIN dimvalsummary val
                         ON g.val_id = val.val_id
                WHERE  g.[gs year] IN ( 2017, 2018, 2019, 2020, 2021 )
                GROUP  BY g.val_id,
                          g.[gs year],
                          val.region,
                          g.sales_month,
                          dt.lastdayofyear,
                          g.gy_total,
                          g.value_type) a
        WHERE  a.months_away >= 0
               AND sales_month < Dateadd(month, -1, Getdate())
        GROUP  BY a.gy,
                  a.region,
                  a.sales_month,
                  a.months_away) b
It's tough to envision the best method without data and without knowing the meaning of all those fields. Here's a rough sketch of how one might attempt to solve it. It is not complete or tested, sorry.
Create a table called all_months and insert all the months from oldest to whatever date in the future you need.
01/01/2017
02/01/2017
...
12/01/2049
May need one query per region and union them together. Select the year & month from that all_months table, and left join to your other table on month. Coalesce your dollar values.
select 'East' as region,
       year(m.month) as gy_year,
       m.month as sales_month,
       coalesce(g.gy_total, 0) as gy_total,
       datediff(month, m.month, dt.lastdayofyear) as months_away
from all_months m
left join uv_sales g on g.sales_month = m.month
--dt still has to be joined in (e.g. dbo.dimdate, as in your query)
--and so on
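Instead of one query per region, the same gap-filling idea can be sketched with a cross join of months and regions. This is a minimal sketch in SQLite-dialect SQL run through Python's sqlite3 module, not the asker's SQL Server schema: the tables all_months, regions and sales, their columns, and the function name are all assumptions for illustration, and the running total is simplified to one partition per region ordered by month (the original partitions by fiscal year and orders by months_away).

```python
import sqlite3


def running_totals_with_gaps_filled(months, regions, sales):
    """months: list of 'YYYY-MM-DD'; regions: list of names;
    sales: list of (region, sales_month, total)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical stand-ins for the date and region dimension tables.
    con.executescript(
        """
        CREATE TABLE all_months (month TEXT);
        CREATE TABLE regions (region TEXT);
        CREATE TABLE sales (region TEXT, sales_month TEXT, total INTEGER);
        """
    )
    con.executemany("INSERT INTO all_months VALUES (?)", [(m,) for m in months])
    con.executemany("INSERT INTO regions VALUES (?)", [(r,) for r in regions])
    con.executemany("INSERT INTO sales VALUES (?, ?, ?)", sales)
    sql = """
        WITH filled AS (
            -- every region paired with every month; missing sales become 0
            SELECT r.region,
                   m.month,
                   COALESCE(SUM(s.total), 0) AS month_total
            FROM regions r
            CROSS JOIN all_months m
            LEFT JOIN sales s
                   ON s.region = r.region
                  AND s.sales_month = m.month
            GROUP BY r.region, m.month
        )
        SELECT region,
               month,
               month_total,
               SUM(month_total) OVER (PARTITION BY region ORDER BY month) AS running_total
        FROM filled
        ORDER BY region, month
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    rows = running_totals_with_gaps_filled(
        ["2017-01-01", "2017-02-01", "2017-03-01"],
        ["East", "West"],
        [("East", "2017-01-01", 10), ("East", "2017-03-01", 5), ("West", "2017-02-01", 7)],
    )
    for row in rows:
        print(row)
```

Months with no sales appear with a month total of 0 and simply carry the previous running total forward, which is what keeps the line in the chart unbroken.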

how to produce a customer retention table /cohort analysis with SQL

I'm trying to write an SQL query (Presto SQL syntax) to produce a customer retention table (see sample below).
A customer who makes at least one transaction in a month is considered as retained for that month.
This is the table:
user_id      transaction_date
bdcff651-    2018-01-01
bdcff641     2018-03-15
This is the result I would like to get:
The first row should be understood as follows:
Out of all customers who made their first transaction in the month of Jan 2018 (defined as “Jan Activation Cohort”), 35% subsequently made a transaction during the one month period following their first transaction date, 23% in the next month, 15% in the next month and so on.
Date        1st Month  2nd Month  3rd Month
2018-01-01  35%        23%        15%
2018-02-0   33%        26%        13%
2018-03-0   36%        27%        12%
As an example, if person XYZ makes his first transaction on 10th February 2018, his 1st month will be from 11th February 2018 to 10th March 2018, 2nd month will be from 11th March 2018 to 10th April 2018 and so on. This person’s details need to appear in the Feb 2018 cohort in the Customer Retention Table.
Would appreciate any help! Thanks.
You can use conditional aggregation. However, I am not sure what your real calculations are.
If I just use the built-in definitions of date_diff(), then the logic looks like:
select date_trunc('month', first_td) as yyyymm,
       count(distinct user_id) as cnt,
       -- cast to double so the ratio is not truncated by integer division
       (cast(count(distinct case when date_diff('month', first_td, transaction_date) = 1
                                 then user_id
                            end) as double) /
        count(distinct user_id)
       ) as month_1_ratio,
       (cast(count(distinct case when date_diff('month', first_td, transaction_date) = 2
                                 then user_id
                            end) as double) /
        count(distinct user_id)
       ) as month_2_ratio
from (select t.*,
             min(transaction_date) over (partition by user_id) as first_td
      from t
     ) t
group by date_trunc('month', first_td)
order by yyyymm;
I am not familiar with Presto exactly, and do not have a way to test Presto code. However, it looks like from searching around a bit that it wouldn't be too hard to convert to Presto syntax from something like SQL Server syntax. Here is what I would do in SQL Server and you should be able to carry the concept over to Presto:
with transactions_info_per_user as (
    select user_id, min(transaction_date) as first_transaction,
           cast(datepart(year, min(transaction_date)) as varchar(4)) + cast(datepart(month, min(transaction_date)) as varchar(2)) as activation_cohort
    from my_table
    group by user_id
),
users_per_activation_cohort as (
    select activation_cohort, count(*) as number_of_users
    from transactions_info_per_user
    group by activation_cohort
),
months_after_activation_per_purchase as (
    -- start date first, so later purchases give a positive month count
    select distinct mt.user_id, ti.activation_cohort, datediff(month, ti.first_transaction, mt.transaction_date) as months_after_activation
    from my_table mt
    left join transactions_info_per_user as ti
        on mt.user_id = ti.user_id
),
final as (
    select activation_cohort, months_after_activation, count(*) as user_count_per_cohort_with_purchase_per_month_after_activation
    from months_after_activation_per_purchase
    group by activation_cohort, months_after_activation
)
select f.activation_cohort, f.months_after_activation,
       cast(f.user_count_per_cohort_with_purchase_per_month_after_activation as decimal(9,2)) / cast(u.number_of_users as decimal(9,2)) * 100 as retention_pct
from final f
join users_per_activation_cohort u
    on u.activation_cohort = f.activation_cohort
--Then pivot months_after_activation into columns
I was very explicit with the naming of things so you could follow the thought process. Here is an example of how to pivot in Presto. Hopefully this helps you!
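To make the conditional-aggregation idea from the answers above concrete, here is a minimal sketch that can be run locally. It uses SQLite-dialect SQL through Python's sqlite3 module rather than Presto or SQL Server, so the date arithmetic is an assumption: the month offset is computed as a plain calendar-month difference, which is not the "11th to 10th" rolling window described in the question. The table name t follows the first answer; the function name and the output column names are made up for the example.

```python
import sqlite3


def cohort_retention(transactions):
    """transactions: iterable of (user_id, transaction_date 'YYYY-MM-DD')."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (user_id TEXT, transaction_date TEXT)")
    con.executemany("INSERT INTO t VALUES (?, ?)", transactions)
    sql = """
        WITH firsts AS (
            -- first transaction per user defines the activation cohort
            SELECT user_id, MIN(transaction_date) AS first_td
            FROM t
            GROUP BY user_id
        ),
        offsets AS (
            -- calendar months between the first and each later transaction
            SELECT DISTINCT
                   t.user_id,
                   strftime('%Y-%m', f.first_td) AS cohort,
                   (CAST(strftime('%Y', t.transaction_date) AS INTEGER) * 12
                    + CAST(strftime('%m', t.transaction_date) AS INTEGER))
                 - (CAST(strftime('%Y', f.first_td) AS INTEGER) * 12
                    + CAST(strftime('%m', f.first_td) AS INTEGER)) AS month_offset
            FROM t
            JOIN firsts f ON f.user_id = t.user_id
        )
        SELECT cohort,
               COUNT(DISTINCT user_id) AS users,
               100.0 * COUNT(DISTINCT CASE WHEN month_offset = 1 THEN user_id END)
                     / COUNT(DISTINCT user_id) AS month_1_pct,
               100.0 * COUNT(DISTINCT CASE WHEN month_offset = 2 THEN user_id END)
                     / COUNT(DISTINCT user_id) AS month_2_pct
        FROM offsets
        GROUP BY cohort
        ORDER BY cohort
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [
        ("a", "2018-01-05"), ("a", "2018-02-10"), ("a", "2018-03-01"),
        ("b", "2018-01-20"), ("b", "2018-03-15"),
        ("c", "2018-02-02"), ("c", "2018-03-03"),
        ("d", "2018-01-09"),
        ("e", "2018-01-30"),
    ]
    for row in cohort_retention(sample):
        print(row)
```

Each additional month column is one more CASE expression, so the pivot is spelled out by hand rather than produced by a PIVOT operator.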

How to calculate a difference between rows in a select in Oracle SQL?

I would like your help with an Oracle SQL query.
Let's say I have the database tables described below:
Store: stores the information related to all new customers
country can take two values: “Argentina” / “Brasil”
CustomerBOM: stores historical clients
Date: first date of each month (1/01/2017, 1/02/2017…)
Client: a customer whose subscription is up to date.
Now I have to answer this question:
What is the quantity of clients for each month?
The code I've made so far is shown below:
select date,
from CustomerBOM t1, Store t2
where t1.ID = t2.ID
group by date
having
order by date asc
Can you please guide me on how to get a full list with the difference between each month?
Edit:
This is what the output of the statement should look like:
Month Difference
January 100 (total clients)
February 20 (120 clients from February - 100 clients from January)
March 60 (180 clients from March - 120 clients from February)
Thanks,
Nicolás.
Try this:
select to_char(t1.date, 'YYYY/MM/DD') customer_month, count(1) number_of_customers
from CustomerBOM t1,
     Store t2
where t1.ID = t2.ID
group by to_char(t1.date, 'YYYY/MM/DD')
order by to_char(t1.date, 'YYYY/MM/DD')
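The query above gives the count per month; the month-over-month difference the question asks for can then be taken with the LAG window function, which Oracle also provides. The following is only a sketch in SQLite-dialect SQL run through Python's sqlite3 module: the table customer_bom with columns month_date and id is an assumed simplification of CustomerBOM, the join to Store is left out, and the function name is hypothetical.

```python
import sqlite3


def monthly_differences(rows):
    """rows: iterable of (month_date 'YYYY-MM-DD', client_id)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical simplified version of the CustomerBOM table.
    con.execute("CREATE TABLE customer_bom (month_date TEXT, id INTEGER)")
    con.executemany("INSERT INTO customer_bom VALUES (?, ?)", rows)
    sql = """
        WITH monthly AS (
            SELECT month_date, COUNT(*) AS clients
            FROM customer_bom
            GROUP BY month_date
        )
        SELECT month_date,
               -- LAG(..., 1, 0): previous month's count, or 0 for the first month
               clients - LAG(clients, 1, 0) OVER (ORDER BY month_date) AS difference
        FROM monthly
        ORDER BY month_date
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    counts = {"2017-01-01": 100, "2017-02-01": 120, "2017-03-01": 180}
    rows = [(month, i) for month, n in counts.items() for i in range(n)]
    for row in monthly_differences(rows):
        print(row)
```

With the counts from the question (100, 120 and 180 clients) this yields 100, 20 and 60, matching the expected output table.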

How to run sql n times increasing variable and after joining results

I have a historical transact table with a CreatedDate column; it is related to an employee transact table (inner join on transact_id).
That said, here is the problem: I need to query these tables and get the state by month, because during the year the CreatedDate can change. E.g. an employee update in July will create a new row, but this shouldn't affect the March total.
The solution looks like a foreach, but how can I join all the rows at the end? The result should be something like:
January - $123
February - $234
March - $123
...
I get the last state of each employee with this:
select AllTransact.id_employee, AllTransact.id_department from (
    select id_employee, id_department, rank() over (partition by id_employee order by created_date desc) desc_rank
    from Transact_Employee TransEmployee
    inner join Transact on TransEmployee.ID_Transact = Transact.ID_Transact
        and Transact.Status = 8
        and Transact.Created_Date < #currentMonth) AllTransact
where desc_rank = 1
*I don't want to copy and paste all the code 12 times. :)
You can partition over several columns. rank() OVER (partition BY [id_employee], datepart(month, [Created_Date]) ORDER BY [Created_Date] DESC) will give you what you have now, but for each month. It doesn't care which year that month is in, so you need to either partition by year too or add a limit on created_date.
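A minimal sketch of that multi-column partition, written in SQLite-dialect SQL and run through Python's sqlite3 module so it can be tried locally. The single table transact_employee and its columns are an assumed flattening of the two tables in the question, the function name is hypothetical, and strftime('%Y-%m', ...) stands in for partitioning by both year and month.

```python
import sqlite3


def last_state_per_month(rows):
    """rows: iterable of (id_employee, id_department, created_date 'YYYY-MM-DD')."""
    con = sqlite3.connect(":memory:")
    # Hypothetical flattened table; the question joins two tables instead.
    con.execute(
        "CREATE TABLE transact_employee "
        "(id_employee INTEGER, id_department INTEGER, created_date TEXT)"
    )
    con.executemany("INSERT INTO transact_employee VALUES (?, ?, ?)", rows)
    sql = """
        SELECT ym, id_employee, id_department
        FROM (
            SELECT strftime('%Y-%m', created_date) AS ym,
                   id_employee,
                   id_department,
                   -- rank restarts for every employee and every year-month
                   RANK() OVER (
                       PARTITION BY id_employee, strftime('%Y-%m', created_date)
                       ORDER BY created_date DESC
                   ) AS desc_rank
            FROM transact_employee
        )
        WHERE desc_rank = 1
        ORDER BY ym, id_employee
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [
        (1, 10, "2019-03-01"),
        (1, 20, "2019-03-15"),
        (1, 30, "2019-07-02"),
        (2, 10, "2019-03-05"),
    ]
    for row in last_state_per_month(sample):
        print(row)
```

Note that this returns a row only for months in which an employee actually has a record; carrying the last known state forward into months without any record would need an extra join against a list of months.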

TSQL Running Totals aggregate from sum of previous rows

Not sure how to word this. Say I have a select returning this:
Name, month, amount
John, June, 5
John, July, 6
John, July, 3
John, August, 10
and I want to aggregate and report the beginning balance for each month:
name, month, beginning balance
john, may, 0
john, june, 0
john, july, 5
john, august, 14
john, September, 24
I can do this in Excel with cell formulas, but how can I do it in SQL without storing values somewhere? I have another table with fiscal months that I can left outer join to so all months are reported; I'm just not sure how to aggregate from prior months in SQL.
select
    name
  , month
  , (select sum(amount) from mytable
     where mytable.month < m.month and mytable.name = m.name) as starting_balance
from mytable m
group by name, month
This is not as nice as windowing functions, but since they vary from database to database, you'd need to tell us which system you are using.
It's also a correlated subquery, which is not very performant. But at least it's easy to understand what's going on!
Use Grouping like this
SELECT NAME, MONTH , SUM(Balance) FROM table GROUP BY NAME, MONTH
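Assuming your months are represented as dates, this will give you the running total of all prior months. The derived table keeps one row per name and month, so a month with several rows (like July above) does not multiply the sum.
select t1.name, t1.month, sum(t2.amount)
from (select distinct name, month from yourtable) t1
left join yourtable t2
    on t1.name = t2.name
    and t1.month > t2.month
group by t1.name, t1.month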
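For databases with window functions, the beginning balance can also be sketched as a running sum whose frame stops one row before the current month. This is a minimal sketch in SQLite-dialect SQL run through Python's sqlite3 module; the table name ledger, the 'YYYY-MM' month format, and the function name are assumptions for the example. The same OVER clause is standard SQL, but it has not been tested here on SQL Server.

```python
import sqlite3


def beginning_balances(rows):
    """rows: iterable of (name, month 'YYYY-MM', amount)."""
    con = sqlite3.connect(":memory:")
    # Hypothetical table holding the rows from the question.
    con.execute("CREATE TABLE ledger (name TEXT, month TEXT, amount INTEGER)")
    con.executemany("INSERT INTO ledger VALUES (?, ?, ?)", rows)
    sql = """
        WITH monthly AS (
            -- collapse duplicate months (e.g. two July rows) first
            SELECT name, month, SUM(amount) AS amount
            FROM ledger
            GROUP BY name, month
        )
        SELECT name,
               month,
               -- sum of all earlier months; NULL for the first month becomes 0
               COALESCE(
                   SUM(amount) OVER (
                       PARTITION BY name
                       ORDER BY month
                       ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
                   ),
                   0
               ) AS beginning_balance
        FROM monthly
        ORDER BY name, month
    """
    result = con.execute(sql).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    sample = [
        ("John", "2020-06", 5),
        ("John", "2020-07", 6),
        ("John", "2020-07", 3),
        ("John", "2020-08", 10),
    ]
    for row in beginning_balances(sample):
        print(row)
```

Months with no rows at all (May and September in the question) would still have to come from a left join against the fiscal-months table.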