SQL Method to report next period value for this period - sql

This may already be answered, but I can't figure out the correct search terms for what I need. We store values by Year / Period for the Beginning of Month (BOM). The BOM for one month is the same value as End of Month (EOM) for the previous month. I need a way to report this as such.
So 2018-02 BOM = 2018-01 EOM.
I thought I might be able to use something simple, but it does not account for the month/year wrap at 12 months as those fields are numerical.
select yr as YEAR, (pd-1) as PERIOD, sum(BOM) as EOM
from Table1
where type = '3'
group by yr, pd
order by yr desc, pd desc
This works for the middle months, but not for January, which becomes 2018-0 instead of 2017-12.
Example Data
Yr Pd Type BOM
18 02 3 100
18 02 3 100
18 02 2 200
18 02 2 100
18 01 3 100
18 01 3 100
18 01 2 200
18 01 2 100
18 01 3 100
18 01 2 300
17 12 3 100
17 12 3 200
17 12 2 300
17 12 3 200
17 12 2 100
17 11 3 300
17 11 2 400
17 11 3 400
17 11 2 100
So the results I am looking for would be:
Yr Pd EOM
18 01 200
17 12 300
17 11 500
17 10 700
I'm working in System iNavigator currently, but hoping to move this into an externally connected Excel query at some point.

Your DB2 database should support CASE WHEN, which you can use to derive both the year and the period from the current period: period 1 rolls back to period 12 of the previous year, and every other period simply drops by one.
For example:
select
CASE WHEN pd = 1 THEN yr - 1 ELSE yr END as Yr,
CASE WHEN pd = 1 THEN 12 ELSE pd - 1 END as Pd,
SUM(BOM) as EOM
from Table1
where type = '3'
group by yr, pd
order by yr desc, pd desc
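To check the wrap-around logic against the sample data from the question, here is a minimal sketch that runs essentially the same query through Python's built-in sqlite3 module. SQLite stands in for DB2 here, which is an assumption made only so the example is self-contained. The table layout, the helper name eom_report, and the aliases eom_yr and eom_pd are likewise illustrative; the aliases are renamed so that ORDER BY clearly refers to the shifted values.

```python
import sqlite3

# Sample rows from the question: (yr, pd, type, bom).
ROWS = [
    (18, 2, "3", 100), (18, 2, "3", 100), (18, 2, "2", 200), (18, 2, "2", 100),
    (18, 1, "3", 100), (18, 1, "3", 100), (18, 1, "2", 200), (18, 1, "2", 100),
    (18, 1, "3", 100), (18, 1, "2", 300),
    (17, 12, "3", 100), (17, 12, "3", 200), (17, 12, "2", 300),
    (17, 12, "3", 200), (17, 12, "2", 100),
    (17, 11, "3", 300), (17, 11, "2", 400), (17, 11, "3", 400), (17, 11, "2", 100),
]

# Period 1 wraps to period 12 of the previous year; other periods drop by one.
QUERY = """
select case when pd = 1 then yr - 1 else yr end as eom_yr,
       case when pd = 1 then 12 else pd - 1 end as eom_pd,
       sum(bom) as eom
from table1
where type = '3'
group by yr, pd
order by eom_yr desc, eom_pd desc
"""


def eom_report(rows):
    """Load the rows into an in-memory table and return the shifted EOM totals."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "create table table1 (yr integer, pd integer, type text, bom integer)"
        )
        conn.executemany("insert into table1 values (?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in eom_report(ROWS):
        print(row)
```

With the question's data this should reproduce the requested rows, including the 18/01 BOM being reported as the 17/12 EOM.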

Related

running total starting from a date column

I'm trying to get a running total as of a date. This is the data I have
Date     transaction Amount   End of Week Balance
jan 1    5                    100
jan 2    3                    100
jan 3    4                    100
jan 4    3                    100
jan 5    1                    100
jan 6    3                    100
I would like to find out what the daily end balance is. My thought is to get a running total from each day to the end of the week and subtract it from the end of week balance, like below
Date     transaction Amount   Running total   End of Week Balance   Balance - Running total
jan 1    5                    19              100                   86
jan 2    3                    14              100                   89
jan 3    4                    11              100                   93
jan 4    3                    7               100                   96
jan 5    1                    4               100                   97
jan 6    3                    3               100                   100
I can use
SUM(transactionAmount) OVER (Order by Date)
to get a running total. Is there a way to specify that I only want the total of the transactions that take place after the date?
You can use sum() as a window function, but accumulate in reverse:
select t.*,
(end_of_week_balance -
sum(transactionAmount) over (order by date desc)
)
from t;
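To make the arithmetic of the reverse accumulation concrete, here is a small sketch in plain Python using the figures from the question. The function names are made up for illustration. Note one assumption: the last column in the question (86, 89, ...) matches a total of the transactions strictly after each date, so the sketch shows both the inclusive variant, which mirrors the windowed sum, and the exclusive variant, which would correspond to subtracting the current row's amount from that sum.

```python
from itertools import accumulate


def reverse_running_totals(amounts):
    """Running total from each row to the last row, current row included.

    This mirrors sum(...) over (order by date desc) when dates are unique.
    """
    return list(accumulate(reversed(amounts)))[::-1]


def daily_balances(amounts, end_balance, include_current_day=True):
    """Subtract the reverse running total from the end-of-week balance."""
    totals = reverse_running_totals(amounts)
    if not include_current_day:
        # Only count transactions that happen after each date.
        totals = [total - amount for total, amount in zip(totals, amounts)]
    return [end_balance - total for total in totals]


if __name__ == "__main__":
    amounts = [5, 3, 4, 3, 1, 3]  # jan 1 through jan 6
    print(reverse_running_totals(amounts))
    print(daily_balances(amounts, 100))
    print(daily_balances(amounts, 100, include_current_day=False))
```

Which variant is right depends on whether the balance is meant as of the start or the end of each day.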
If you have this example:
1> select i, sum(i) over (order by i) S from integers where i<10;
2> go
i S
----------- -----------
1 1
2 3
3 6
4 10
5 15
6 21
7 28
8 36
9 45
you can also do:
1> select i, sum(case when i>3 then i else 0 end) over (order by i) S from integers where i<10;
2> go
i S
----------- -----------
1 0
2 0
3 0
4 4
5 9
6 15
7 22
8 30
9 39

pandas- return Month containing Max value for each year

I have a dataframe like:
Year Month Value
2017 1 100
2017 2 1
2017 4 2
2018 3 88
2018 4 8
2019 5 87
2019 6 1
I'd like the dataframe to return the Month and Value for each year where the value is the maximum:
year month value
2017 1 100
2018 3 88
2019 5 87
I've attempted something like df = df.groupby(["Year","Month"])['Value'].max(), but it returns the full data set because each Year / Month pair is unique (I believe).
You can get the index where the top Value occurs with .groupby(...).idxmax() and use that to index into the original dataframe:
In [28]: df.loc[df.groupby("Year")["Value"].idxmax()]
Out[28]:
Year Month Value
0 2017 1 100
3 2018 3 88
5 2019 5 87
Here is a solution that also handles ties, where several rows in a year share the maximum value:
m = df.groupby('Year')['Value'].transform('max') == df['Value']
dfmax = df.loc[m]
Full example:
import io
import pandas as pd
data = '''\
Year Month Value
2017 1 100
2017 2 1
2017 4 2
2018 3 88
2018 4 88
2019 5 87
2019 6 1'''
# Parse the whitespace-separated sample text into a dataframe.
df = pd.read_csv(io.StringIO(data), sep=r'\s+')
# True for every row whose Value equals its year's maximum, so ties are kept.
m = df.groupby('Year')['Value'].transform('max') == df['Value']
print(df[m])
Year Month Value
0 2017 1 100
3 2018 3 88
4 2018 4 88
5 2019 5 87

I have a set of job records for a whole month and want to display the quantity of jobs date-wise in SQL

We have thousands of records in our data and want to count jobs per date and category in a single query. Is it possible?
The required display is as follows:
TypesJobs 01 02 03 04 05 06 07
A 2 1 6 4 1 3 4
B 10 12 8 10 12 9 13
C 3 5 4 3 2 5 4
Each cell is the number of jobs of that type on that day; the columns 01, 02, 03, ... are the days of the month.
You can use conditional aggregation, something like this:
select typesjobs,
sum(case when month(datecol) = 1 then 1 else 0 end) as month_01,
sum(case when month(datecol) = 2 then 1 else 0 end) as month_02,
. . .
from t
where <date condition here>
group by typesjobs;
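Since the question asks for one column per day of the month, the same conditional-aggregation pattern can be keyed on the day rather than the month. The following is a rough sketch using Python's built-in sqlite3 module, so it uses SQLite's strftime('%d', ...) where other databases would use their own day-extraction function. The helper name jobs_per_day, the day_NN column names, the ISO date strings, and the sample rows are all made up for illustration.

```python
import sqlite3


def jobs_per_day(rows, days, start, end):
    """Count jobs per type, with one column for each requested day of the month."""
    # One sum(case ...) expression per day; days are integers such as 1, 2, 3.
    day_columns = ",\n       ".join(
        f"sum(case when strftime('%d', datecol) = '{day:02d}' then 1 else 0 end)"
        f" as day_{day:02d}"
        for day in days
    )
    query = f"""
select typesjobs,
       {day_columns}
from t
where datecol >= ? and datecol < ?
group by typesjobs
order by typesjobs
"""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("create table t (typesjobs text, datecol text)")
        conn.executemany("insert into t values (?, ?)", rows)
        return conn.execute(query, (start, end)).fetchall()
    finally:
        conn.close()


# Made-up rows for illustration only.
SAMPLE = [
    ("A", "2024-03-01"), ("A", "2024-03-01"), ("A", "2024-03-02"),
    ("B", "2024-03-01"), ("B", "2024-03-03"), ("B", "2024-03-03"),
    ("B", "2024-04-01"),  # outside the date range, so it is not counted
]

if __name__ == "__main__":
    for row in jobs_per_day(SAMPLE, [1, 2, 3], "2024-03-01", "2024-04-01"):
        print(row)
```

The date filter matters here: without it, the first day of every month in the table would be counted in the same column.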

Aggregate result from query by quarter SQL

Let's say I have a table in a Microsoft SQL Server database that holds all exports going back some time:
Name:
ExportTable
Columns:
id - numeric(18)
exportdate - datetime
In order to get the number of exports per week I can run the following query:
SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate', count(exportdate) as 'totalExports'
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;
Returns:
exportdate totalExports
---------- ------------
27 13
28 12
29 15
30 8
31 17
32 10
33 7
34 15
35 4
36 18
37 10
38 14
39 14
40 21
41 19
Would it be possible to aggregate the week results by quarter so the output becomes something like the one below?
UPDATE
Sorry for not being crystal clear. I would like each week's result to add up with the previous results until a new quarter starts.
Note that week 41 contains 21+19 = 40.
Week 39 contains 157 (13+12+15+8+17+10+7+15+4+18+10+14+14).
exportdate totalExports Quarter
---------- ------------ -------
27 13 3
28 25 3
29 40 3
30 48 3
31 65 3
32 75 3
33 82 3
34 97 3
35 101 3
36 119 3
37 129 3
38 143 3
39 157 3 -- Sum of 3 Quarter values.
40 21 4 -- New Quarter show current week value
41 40 4 -- (21+19)
You can use this.
SELECT
DATEPART(ISO_WEEK,[exportdate]) as 'exportdate'
, SUM( count(exportdate) ) OVER ( PARTITION BY DATEPART(QUARTER,MIN([exportdate])) ORDER BY DATEPART(ISO_WEEK,[exportdate]) ROWS UNBOUNDED PRECEDING ) as 'totalExports'
, DATEPART(QUARTER,MIN([exportdate])) [Quarter]
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate])
order by exportdate;
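To illustrate what the PARTITION BY ... ORDER BY ... ROWS UNBOUNDED PRECEDING window computes, here is a small plain-Python sketch that restarts a running total at each quarter boundary, using the weekly counts from the question. The function name and the tuple layout are assumptions for illustration, and the quarter for each week is taken from the question's expected output rather than derived from real dates.

```python
from itertools import accumulate, groupby


def running_total_by_quarter(weekly):
    """Accumulate counts week by week, restarting whenever the quarter changes.

    weekly: (iso_week, count, quarter) tuples, already ordered by week.
    """
    result = []
    for quarter, group in groupby(weekly, key=lambda row: row[2]):
        rows = list(group)
        totals = accumulate(count for _, count, _ in rows)
        result.extend(
            (week, total, quarter) for (week, _, _), total in zip(rows, totals)
        )
    return result


# Weekly counts for ISO weeks 27 through 41, as listed in the question.
counts = [13, 12, 15, 8, 17, 10, 7, 15, 4, 18, 10, 14, 14, 21, 19]
# Weeks up to 39 are treated as quarter 3, later weeks as quarter 4.
weekly = [
    (week, count, 3 if week <= 39 else 4)
    for week, count in zip(range(27, 42), counts)
]

if __name__ == "__main__":
    for row in running_total_by_quarter(weekly):
        print(row)
```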
You could use a case statement to separate the dates into quarters.
e.g.
CASE
WHEN EXPORT_DATE BETWEEN '1' AND '4' THEN 1
WHEN Export_Date BETWEEN '5' and '9' THEN 2
ELSE 0
END AS [Quarter]
It's just an example, but you get the idea.
You could then use the alias from the case
SELECT DATEPART(ISO_WEEK,[exportdate]) as 'exportdate',
count(exportdate) as 'totalExports',
DATEPART(quarter,[exportdate]) as quarter
FROM [ExportTable]
Group By DATEPART(ISO_WEEK,[exportdate]), DATEPART(quarter,[exportdate])
order by exportdate;

Pandas: Group by two columns to get sum of another column

I looked through most of the previously asked questions but could not find an answer to mine.
I have the following dataframe:
id year month score num_attempts
0 483625 2010 01 50 1
1 967799 2009 03 50 1
2 213473 2005 09 100 1
3 498110 2010 12 60 1
5 187243 2010 01 100 1
6 508311 2005 10 15 1
7 486688 2005 10 50 1
8 212550 2005 10 500 1
10 136701 2005 09 25 1
11 471651 2010 01 50 1
I want to get the following dataframe:
year month sum_score sum_num_attempts
2009 03 50 1
2005 09 125 2
2010 12 60 1
2010 01 200 2
2005 10 565 3
Here is what I tried:
sum_df = df.groupby(by=['year','month'])['score'].sum()
But this doesn't look efficient or correct. If more than one column needs to be aggregated, this seems like a very expensive call. For example, I have another column, num_attempts, that I want to sum by year and month in the same way as score.
This should be an efficient way:
sum_df = df.groupby(['year','month']).agg({'score': 'sum', 'num_attempts': 'sum'})