Pandas :: How to plot based on sum amount each month - pandas

I made data frame shown below which has 3 companies A,B and C. Companies buy certain amount of vouchers during 2016 to 2018 period. Some days eg. Company A buys 100 pieces for $3000 other days no company buys any.
I'd like see how these three companies compare for last two years when it comes to money spent for vouchers so my ideas was following:
Sum all money spent each month for each company, and plot them in bar graph or just standard line - so three lines each with different color.
Since its 2 years of data, there would be roughly 24 date-points on x axis
I tried something like: plt.bar(A['Datetime'], A['PaidTotal'])
But get: ufunc subtract cannot use operands with types dtype('
But this is just for one company anyway not for all 3 in one graph
(I can sort those dates, thats not a problem)
Company Name PaidTotal Datetime
585 CompanyA 218916.0 2016-10-14 10:51:07
586 CompanyB 430000.0 2017-01-23 11:05:08
591 CompanyB 546217.0 2016-09-26 14:20:00
592 CompanyC 73780.0 2016-12-07 07:52:01
593 CompanyA 132720.0 2016-10-04 16:14:10
595 CompanyC 52065.0 2016-11-12 14:32:40

For a bar chart you can call df.groupby('Company Name')['PaidTotal'].sum().plot.bar():
To see a line chart of all three over time, you can try this (the axes are wrong, but this is the general idea):
sums = df.groupby(['Company Name', 'Datetime'])['PaidTotal'].sum().reset_index(level=0)
for company in sums['Company Name'].unique():
sums[sums['Company Name'] == company]['PaidTotal'].plot();

Related

How can I convert monthly data to yearly data in pandas dataframe?

My data looks like that:
data_dte
Year
Month
usg_apt
Total
01/1990
1990
1
JFK
80
01/1990
1990
1
MIA
100
01/1990
1990
1
ORD
58
I want to have a yearly total for each "usg_apt" instead of monthly.
"usg_apt" stands for "US Gateway Airport Code".
Assuming that the dataframe in question is called df, maybe you could try df.groupby(["usg_apt", "Year"])['Total'].agg('sum').

Smoothed Average over rows and columns with pandas

I am trying to create a function that averages over both row and column. For example:
**State** **1943 1944 1945 1946 1947 (1947_AVG) 1948 (1948_AVG)**
Alaska 1 2 3 4 5 2 6 3
CA 234 234 234 6677 34
I want a code that will give me an average for 1947 using 1943, 1944, and 1945. Something that gives me 1948 using 1944, 1945, 1946, ect, ect.
I currently have:
d3['pandas_SMA_Year'] = d3.iloc[:,1].rolling(window=3).mean()
But this is simply working over the rows, not the columns, and it doesn't take into account the fact that I'm looking 2 years back. Please and thank you for any guidance!

Pandas/MLP :: plot 36 months instead of 12 only on 3 year period

Below code snippet shows show revenue growth/decline of 3 companies for 3 year period however in this format I'm not able to find out how to add following:
1) instead of either seeing only 12 months out of 3 years (df.PublishedAtUtc.dt.month) or seeing very rugged graph based on year(df.PublishedAtUtc.dt.year), how do I see 36 or any amount month period(I tried different parameters with no luck)
https://i.imgur.com/beiO8w2.png
sums = df.groupby(['Company Name', df.Datetime.dt.month])['PaidTotal'].sum().reset_index(level=0)
for company in sums['Company Name'].unique():
sums[sums['Company Name'] == company]['PaidTotal'].plot();
Data sample only - original has thousands of lines:
Company Name PaidTotal Datetime
585 CompanyA 218916.0 2016-10-14 10:51:07
586 CompanyB 430000.0 2016-01-23 11:05:08
591 CompanyB 546217.0 2016-09-26 14:20:00
592 CompanyC 73780.0 2016-12-07 07:52:01
593 CompanyA 132720.0 2017-10-04 16:14:10
595 CompanyC 52065.0 2017-11-12 14:32:40
585 CompanyA 234566.0 2017-10-14 10:51:07
586 CompanyB 252325.0 2017-01-23 11:05:08
591 CompanyB 546217.0 2018-09-26 14:20:00
592 CompanyC 745780.0 2018-12-07 07:52:01
593 CompanyA 1322320.0 2018-10-04 16:14:10
595 CompanyC 5432065.0 2018-11-12 14:32:40
It looks like you're using the Datetime object to store your dates, so you will be able to use the Matplotlib's plt.plot_date() function documented here to solve your problem.
This function assumes that you're storing the dates/times as Datetime objects. So, you must specify this by passing xdate=True (see the code below).
To plot any range of Datetime objects datetimes with their paid totals totals, use something like this:
plt.plot_date(dates,totals,xdate=True)
plt.xlabel('date')
plt.ylabel('totals')
This should give you what you're looking for, something like this using your sample data:

Normalize monthly payments

First, sorry for my bad English. I'm trying to normalize a table in a pension system where subscribers are paid monthly. I need to know who has been paid and who has not and how much they've been paid. I believe I'm using SQL Server. Here's an example:
id_subscriber id_receipt year month pay_value payment type_pay
12 1 2016 January 100 80 1
13 1 2016 January 100 100 1
14 1 2016 January 100 100 1
12 2 2016 February 100 100 2
13 2 2016 February 100 80 1
But I'm not happy repeating the year and the month for every single subscriber. It doesn't seem right. Is there a better way to store this data?
EDIT:
The case is as follows: this company has many subscribers who must pay monthly and payment can be in various ways. They produce a single receipt for many customers, and each customer that receipt may be paying one or more installments.
These are my other tables:
tbl_subscriber
id_suscriber(PK) first_name last_name address tel_1 tel_2
12 Juan Perez xxx xxx xxx
13 Pedro Lainez xxx xxx xxx
14 Maria Lopez xxx xxx xxx
tbl_receipt
id_receipt(PK) value elaboration_date deposit_date
1 1,000.00 2015-09-16 2015-09-20
2 890.00 2015-12-01 2015-12-18
tbl_type_paym
id type description
1 bank xxxx
2 ventanilla xxx
This basically seems fine. You could split dates out into a separate table and reference that, but that strikes me as a kind of silly way to do it. I would recommend storing the month as an integer instead of a varchar column though. Besides not storing the same string over and over you can more reasonably do comparisons.
You could also use date values, although that might not be worth the trouble when you don't want greater granularity than the month.

Create all combinations of summations given criteria in Access VBA

I have a subset summation problem I cannot find the answer to. I am trying to write something in VBA for access that will take all combinations of summations within a certain criteria and place them in a table so I can match a different table to it. Right now I am more concerned with creating the table of combinations. First time I have asked a question sorry if I mess something up.
Example:
Access Table: ImpTable
Fields: ID, Year-Month, Name, Country, Quantity
I need to make every combination of summations where the country and Year-Month are the same. Yet keep track of what was included in the formula. If the new table was created and kept track of which ID's were included in the combination I can reference the original table for the name.
Expected Ending Table Results:
NewID, Year-Month, Country, SumQuantity, ComboName (ID's from original table)
Any help is appreciated.
Raw Data:
ID Year-Month Name Country Quantity
1 2016-06 Person1 US 10
2 2016-06 Person2 US 12
3 2016-10 Person3 US 4
4 2016-06 Person4 UK 5
5 2016-06 Person5 UK 6
6 2016-06 Person6 US 3
Desired Results:
NewID Year-Month Country SumQuantity ComboName
1 2016-06 US 22 1,2
2 2016-06 US 13 1,6
3 2016-06 US 25 1,2,6
4 2016-06 US 15 2,6
5 2016-06 UK 11 4,5
6 2016-10 US 4 3