Is it possible to change a graph's x- and y-axis major/minor units using openpyxl?

I have tried the following, but none of them work:
chart.auto_axis = False
chart.x_axis.unit = 365
chart.set_y_axis({'minor_unit': 100, 'major_unit':365})
Changing the min and max of the scale for both axes is straightforward:
chart.x_axis.scaling.min = 0
chart.x_axis.scaling.max = 2190
chart.y_axis.scaling.min = 0
chart.y_axis.scaling.max = 2
So I'm hoping there is a straightforward solution to this. Here is an MCVE:
from openpyxl import load_workbook, Workbook
import datetime
from openpyxl.chart import ScatterChart, Reference, Series
wb = Workbook()
ws = wb.active
rows = [
    ['data point 1', 'data point2'],
    [25, 1],
    [100, 2],
    [500, 3],
    [800, 4],
    [1200, 5],
    [2100, 6],
]
for row in rows:
    ws.append(row)
chart = ScatterChart()
chart.title = "Example Chart"
chart.style = 18
chart.y_axis.title = 'y'
chart.x_axis.title = 'x'
chart.x_axis.scaling.min = 0
chart.y_axis.scaling.min = 0
chart.x_axis.scaling.max = 2190
chart.y_axis.scaling.max = 6
xvalues = Reference(ws, min_col=1, min_row=2, max_row=7)
yvalues = Reference(ws, min_col=2, min_row=2, max_row=7)
series = Series(values=yvalues, xvalues=xvalues, title="DP 1")
chart.series.append(series)
ws.add_chart(chart, "D2")
wb.save("chart.xlsx")
I need to automate changing the axis units to 365, or any other value.

Very late answer, but I figured out how to do this just after finding this question.
You need to set the axis's major unit to 365.25 and the number format to show just the year:
chart.x_axis.number_format = 'yyyy'
chart.x_axis.majorUnit = 365.25
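For what it's worth, the 365.25 figure is just the average length of a year in days (three 365-day years plus one leap year), and the major unit only controls the spacing between major gridlines. The sketch below is plain Python arithmetic, not openpyxl calls. `major_ticks` is a hypothetical helper, and it assumes the axis places a tick at the minimum and then at every further multiple of the major unit, using the 0 to 2190 range from the question.

```python
from datetime import date


def major_ticks(axis_min, axis_max, major_unit):
    """Hypothetical helper: tick positions at axis_min + k * major_unit."""
    ticks = []
    k = 0
    while axis_min + k * major_unit <= axis_max:
        ticks.append(axis_min + k * major_unit)
        k += 1
    return ticks


# Four consecutive calendar years that include one leap year (2020).
days_in_four_years = (date(2024, 1, 1) - date(2020, 1, 1)).days
average_year = days_in_four_years / 4

# Axis range taken from the question: 0 to 2190 days.
ticks = major_ticks(0, 2190, 365.25)
print(average_year)
print(ticks)
```

Under that assumption the question's axis gets six major ticks, the last one at 1826.25, because the next multiple (2191.5) falls just past the maximum of 2190.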

Related

Automatically assigning p-value position in ggplot loop

I am running an mapply loop over a large data set to plot 13 parameters for 19 groups. This works well except for the position of the p-value. Because the data vary from plot to plot, I cannot assign a fixed position with, for example, label.y = 125: in some plots the label then lands in the middle of a bar or error bar. However, I can't set it higher without it ending up far too high on other plots. Is there a way to make the position adapt to the data and error bars?
This is my plotting function. As I said, the plots are fine except for the p-value position, specifically the stat_compare_means ANOVA line.
ANOVA_plotter <- function(Variable, treatment, Grouping, df){
  Inputdf <- df %>%
    filter(Media == treatment, Group == Grouping) %>%
    ggplot(aes_(x = ~ID, y = as.name(Variable))) +
    geom_bar(aes(fill = ANOVA_Status), stat = "summary", fun = "mean", width = 0.9) +
    stat_summary(geom = "errorbar", fun.data = "mean_sdl", fun.args = list(mult = 1), size = 1) +
    labs(title = paste(Variable, "in", treatment, "in Group", Grouping, sep = " ")) +
    theme(legend.position = "none", axis.title.x = element_blank(), axis.text = element_text(face = "bold", size = 18), axis.text.x = element_text(angle = 45, hjust = 1)) +
    stat_summary(geom = "errorbar", fun.data = "mean_sdl", fun.args = list(mult = 1), width = 0.2) +
    stat_compare_means(method = "anova", label.y = 125) +
    stat_compare_means(label = "p.signif", method = "t.test", paired = FALSE, ref.group = "Control")
}
I get graphs that look like this
(https://i.stack.imgur.com/hV9Ad.jpg)
But I can't assign it to label.y = 200 because of plots like this
(https://i.stack.imgur.com/uStez.jpg)

Speeding up a dataframe computation in Python 3

I have a dataframe (df) as follows:
id  date        t_slot  dayofweek  label
1   2021-01-01  2       0          1
1   2021-01-02  3       1          0
2   2021-01-01  4       6          1
.......
The dataframe is very large (6 million rows). t_slot takes values from 1 to 6 and dayofweek takes values from 0 to 6.
For each row, I want to compute the following rates for that row's id over the 3 months before the row's date:
- the rate of label == 1 when t_slot is 1 to 4 and dayofweek is 0-4;
- the rate of label == 1 when t_slot is 5 to 6 and dayofweek is 0-4;
- the rate of label == 1 when t_slot is 1 to 4 and dayofweek is 5-6;
- the rate of label == 1 when t_slot is 5 to 6 and dayofweek is 5-6.
I have used a loop to compute the rates, but it is very slow. Is there a faster way to compute them? My code is as follows:
def get_time_slot_rate(df):
    import numpy as np
    if len(df)==0:
        return np.nan, np.nan, np.nan, np.nan
    else:
        work = df.loc[df['dayofweek']<5]
        weekend = df.loc[df['dayofweek']>=5]
        if len(work)==0:
            work_14, work_56 = np.nan, np.nan
        else:
            work_14 = len(work.loc[(work['time_slot']<5)*(work['label']==1)])/len(work)
            work_56 = len(work.loc[(work['time_slot']>5)*(work['label']==1)])/len(work)
        if len(weekend)==0:
            weekend_14, weekend_56 = np.nan, np.nan
        else:
            weekend_14 = len(weekend.loc[(weekend['time_slot']<5)*(weekend['label']==1)])/len(weekend)
            weekend_56 = len(weekend.loc[(weekend['time_slot']>5)*(weekend['label']==1)])/len(weekend)
        return work_14, work_56, weekend_14, weekend_56
import datetime as d_t
lst_id = list(df['id'])
lst_date = list(df['date'])
lst_t14_work = []
lst_t56_work = []
lst_t14_weekend = []
lst_t56_weekend = []
for i in range(len(lst_id)):
    if i%100==0:
        print(i)
    d_date = lst_date[i]
    dt = d_t.datetime.strptime(d_date, '%Y-%m-%d')
    month_step = relativedelta(months=3)
    pre_date = str(dt - month_step).split(' ')[0]
    df_s = df.loc[(df['easy_id']==lst_easy[i])
                  & ((df['delivery_date']>=pre_date)
                  &(df['delivery_date']< d_date))].reset_index(drop=True)
    work_14_rate, work_56_rate, weekend_14_rate, weekend_56_rate = get_time_slot_rate(df_s)
    lst_t14_work.append(work_14_rate)
    lst_t56_work.append(work_56_rate)
    lst_t14_weekend.append(weekend_14_rate)
    lst_t56_weekend.append(weekend_56_rate)
I could only fix your function, and it's completely untested, but here we go:
- Import only once by putting the imports at the top of your .py file.
- A try/except block is cheaper than an if/else check when the exceptional case is rare.
- True and False are equal to 1 and 0 respectively in Python.
- Don't multiply boolean selectors; combine them with & and invert them with the ~ operator.
- Create as few copies as possible.
import numpy as np
def get_time_slot_rate(df):
    # df.empty is cheaper than comparing len(df) with 0
    if df.empty:
        return np.nan, np.nan, np.nan, np.nan
    # assuming df['label'] is either 0 or 1
    hit = df['label'] == 1
    # create boolean selectors to be inverted with '~'
    weekdays = df['dayofweek'] < 5
    slot_selector = df['time_slot'] < 5
    # int() yields a plain Python int, so dividing by zero raises ZeroDivisionError
    weekday_count = int(weekdays.sum())
    try:
        work_14 = int((hit & weekdays & slot_selector).sum()) / weekday_count
        work_56 = int((hit & weekdays & ~slot_selector).sum()) / weekday_count
    except ZeroDivisionError:
        work_14 = work_56 = np.nan
    weekend_count = int((~weekdays).sum())
    try:
        weekend_14 = int((hit & ~weekdays & slot_selector).sum()) / weekend_count
        weekend_56 = int((hit & ~weekdays & ~slot_selector).sum()) / weekend_count
    except ZeroDivisionError:
        weekend_14 = weekend_56 = np.nan
    return work_14, work_56, weekend_14, weekend_56
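To illustrate the points about booleans and try/except without needing pandas, here is a minimal pure-Python sketch. The tuple layout `(dayofweek, time_slot, label)` and the names `safe_rate` and `time_slot_rates` are assumptions made for this example, not part of the original code.

```python
import math


def safe_rate(hits, total):
    """Return hits / total, or NaN when there is nothing to divide by."""
    try:
        return hits / total
    except ZeroDivisionError:
        return math.nan


def time_slot_rates(rows):
    """rows: list of (dayofweek, time_slot, label) tuples (hypothetical layout)."""
    work = [r for r in rows if r[0] < 5]
    weekend = [r for r in rows if r[0] >= 5]
    rates = []
    for group in (work, weekend):
        # Each condition is a bool; sum() counts True as 1 and False as 0.
        early_hits = sum(slot < 5 and label == 1 for _, slot, label in group)
        late_hits = sum(slot >= 5 and label == 1 for _, slot, label in group)
        rates.append(safe_rate(early_hits, len(group)))
        rates.append(safe_rate(late_hits, len(group)))
    return tuple(rates)


# Made-up sample: four workday rows and no weekend rows.
sample = [(0, 2, 1), (1, 3, 0), (2, 5, 1), (3, 1, 1)]
work_14, work_56, weekend_14, weekend_56 = time_slot_rates(sample)
print(work_14, work_56, weekend_14, weekend_56)
```

The empty weekend group goes through the except branch, so both weekend rates come back as NaN rather than raising.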
The rest of your script doesn't really make sense; see my comments:
for i in range(len(lst_id)):
    if i%100==0:
        print(i)
    d_date = date[i]
    # what is d_t ?
    dt = d_t.datetime.strptime(d_date, '%Y-%m-%d')
    month_step = relativedelta(months=3)
    pre_date = str(dt - month_step).split(' ')[0]
    df_s = df.loc[(df['easy_id']==lst_easy[i])
                  & (df['delivery_date']>=pre_date)
                  &(df['delivery_date']< d_date)].reset_index(drop=True)
    # is it df or df_s ?
    work_14_rate, work_56_rate, weekend_14_rate, weekend_56_rate = get_time_slot_rate(df)
If your date column holds datetime objects, then you can compare dates directly (no need for strings).
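As a small illustration of comparing dates directly, the sketch below parses each string once and then compares `datetime.date` objects. It uses only the standard library. `months_before` is a hypothetical, rough stand-in for subtracting `relativedelta(months=3)` (it clamps the day when the target month is shorter), and the sample rows are made up.

```python
import calendar
from datetime import date


def months_before(d, months):
    """Hypothetical helper: the date `months` calendar months before `d`."""
    total = d.year * 12 + (d.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    # Clamp the day so that e.g. May 31 minus 3 months gives Feb 28.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


# Made-up rows: (id, date string, label). Parse each date string only once.
raw_rows = [
    (1, "2020-09-30", 1),
    (1, "2020-10-15", 0),
    (1, "2020-12-31", 1),
    (2, "2020-12-01", 1),
]
rows = [(row_id, date.fromisoformat(day), label) for row_id, day, label in raw_rows]


def rows_in_window(rows, row_id, current):
    """Rows for `row_id` dated within the 3 months before `current` (exclusive)."""
    start = months_before(current, 3)
    return [r for r in rows if r[0] == row_id and start <= r[1] < current]


window = rows_in_window(rows, 1, date(2021, 1, 1))
print([r[1].isoformat() for r in window])
```

Because the comparison operators work on date objects, there is no need to format the window bounds back into strings on every iteration.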

Data-Visualization Python

Plot 4 different line plots for the 4 companies in the dataframe open_prices. The year should be on the X-axis and the stock price on the Y-axis, and you will need a (2, 2) grid of plots. Set the figure size to (10, 8) and share the X-axis for better visualization.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nsepy import get_history
import datetime as dt
%matplotlib inline
start = dt.datetime(2015, 1, 1)
end = dt.datetime.today()
infy = get_history(symbol='INFY', start = start, end = end)
infy.index = pd.to_datetime(infy.index)
hdfc = get_history(symbol='HDFC', start = start, end = end)
hdfc.index = pd.to_datetime(hdfc.index)
reliance = get_history(symbol='RELIANCE', start = start, end = end)
reliance.index = pd.to_datetime(reliance.index)
wipro = get_history(symbol='WIPRO', start = start, end = end)
wipro.index = pd.to_datetime(wipro.index)
open_prices = pd.concat([infy['Open'], hdfc['Open'], reliance['Open'],
                         wipro['Open']], axis = 1)
open_prices.columns = ['Infy', 'Hdfc', 'Reliance', 'Wipro']
f, (ax1, ax2) = plt.subplots(1, 2, sharey=True)
axes[0, 0].plot(open_prices.index.year,open_prices.INFY)
axes[0, 1].plot(open_prices.index.year,open_prices.HDB)
axes[1, 0].plot(open_prices.index.year,open_prices.TTM)
axes[1, 1].plot(open_prices.index.year,open_prices.WIT)
I get a blank graph. Please help.
The code below works fine. I changed the following things:
a) axes should be ax; b) the dataframe column names were incorrect; c) anyone who wants to try this example also needs to install the lxml library.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from nsepy import get_history
import datetime as dt
start = dt.datetime(2015, 1, 1)
end = dt.datetime.today()
infy = get_history(symbol='INFY', start = start, end = end)
infy.index = pd.to_datetime(infy.index)
hdfc = get_history(symbol='HDFC', start = start, end = end)
hdfc.index = pd.to_datetime(hdfc.index)
reliance = get_history(symbol='RELIANCE', start = start, end = end)
reliance.index = pd.to_datetime(reliance.index)
wipro = get_history(symbol='WIPRO', start = start, end = end)
wipro.index = pd.to_datetime(wipro.index)
open_prices = pd.concat([infy['Open'], hdfc['Open'], reliance['Open'],
                         wipro['Open']], axis = 1)
open_prices.columns = ['Infy', 'Hdfc', 'Reliance', 'Wipro']
print(open_prices.columns)
# figure size and shared X-axis as the task asks
f, ax = plt.subplots(2, 2, figsize=(10, 8), sharex=True)
ax[0,0].plot(open_prices.index.year,open_prices.Infy)
ax[1,0].plot(open_prices.index.year,open_prices.Hdfc)
ax[0,1].plot(open_prices.index.year,open_prices.Reliance)
ax[1,1].plot(open_prices.index.year,open_prices.Wipro)
plt.show()

How to add a data table to the legend in openpyxl

How can I use openpyxl to add a data table to the legend area, as in the picture below?
There is an openpyxl.chart.chartspace.DataTable class in openpyxl, but I can't find any examples of how to use it.
Maybe it's too late, but if someone needs it, I've found a solution.
from openpyxl import Workbook
from openpyxl.chart import BarChart, Series, Reference
from openpyxl.chart.plotarea import DataTable
# Create a new Workbook.
wb = Workbook()
# Select the active worksheet.
ws = wb.active
# Data
rows = [
    ('val', 'Batch 1', 'Batch 2'),
    ('val_1', 10, 30),
    ('val_2', 40, 60),
    ('val_3', 50, 70),
    ('val_4', 20, 10),
    ('val_5', 10, 40),
    ('val_6', 50, 30),
]
# Adding data to worksheet.
for row in rows:
    ws.append(row)
# Create a new BarChart.
chart1 = BarChart()
# Adding some attributes.
chart1.type = "col"
chart1.style = 3
chart1.title = "Bar Chart"
chart1.y_axis.title = 'Test number'
chart1.x_axis.title = 'Sample length (mm)'
# Adding data and labels.
data = Reference(ws, min_col=2, min_row=1, max_row=7, max_col=3)
labels = Reference(ws, min_col=1, min_row=2, max_row=7)
chart1.add_data(data, titles_from_data=True)
chart1.set_categories(labels)
chart1.shape = 4
# --- SOLUTION --- Creating the legend table.
chart1.plot_area.dTable = DataTable()
chart1.plot_area.dTable.showHorzBorder = True
chart1.plot_area.dTable.showVertBorder = True
chart1.plot_area.dTable.showOutline = True
chart1.plot_area.dTable.showKeys = True
ws.add_chart(chart1, "A10")
wb.save(".\\result.xlsx")
openpyxl version == 3.0.10
I hope I've been able to help! :D

vb.net chart with week numbers as X Axis

I know a similar question was asked before, but it wasn't answered.
I have a chart where I use week numbers as its X-axis, that are retrieved as part of the SQL query.
It works fine up until a new year starts. In such a case, even though the weeks are ordered correctly when retrieved (for example, 49, 50, 51, 52, 1, 2, 3), they appear on the axis numerically ordered: 1, 2, 3, 49, 50, 51, 52.
Is there a way to fix that?
The relevant part of the SQL query is:
SELECT DATEPART(year, start_charge_time),
       DATEPART(week, start_charge_time) week_num,
       COUNT(*) num_sessions
FROM parking_log
GROUP BY DATEPART(year, start_charge_time),
         DATEPART(week, start_charge_time)
ORDER BY DATEPART(year, start_charge_time),
         DATEPART(week, start_charge_time)
The Chart is:
Dim SeriesParkingRev As Series
SeriesParkingRev = ChartParkingSummery.Series("SeriesParkingRev")
' Set series chart type
SeriesParkingRev.ChartType = SeriesChartType.Line
SeriesParkingRev.MarkerStyle = MarkerStyle.Square
SeriesParkingRev.MarkerSize = 10
SeriesParkingRev.BorderWidth = 3
SeriesParkingRev.Color = Color.Red
SeriesParkingRev.IsValueShownAsLabel = True
SeriesParkingRev.IsVisibleInLegend = True
'' Set series members names for the X and Y values
SeriesParkingRev.XValueMember = "week_num"
SeriesParkingRev.YValueMembers = "total_charge"
SeriesParkingRev.LegendText = "Parking Revenue"
'' --------------------------
' Create the destination series and add it to the chart
'Dim SeriesParkingTime As New Series("SeriesParkingTime")
'ChartParkingSummery.Series.Add(SeriesParkingTime)
Dim SeriesParkingTime As Series
SeriesParkingTime = ChartParkingSummery.Series("SeriesParkingTime")
' Ensure the destination series is a Line or Spline chart type
SeriesParkingTime.ChartType = SeriesChartType.Line
SeriesParkingTime.MarkerStyle = MarkerStyle.Diamond
SeriesParkingTime.MarkerSize = 12
SeriesParkingTime.BorderWidth = 3
SeriesParkingTime.IsValueShownAsLabel = True
SeriesParkingTime.Color = Color.Blue
' Assign the series to the same chart area as the column chart
SeriesParkingTime.ChartArea = ChartParkingSummery.Series("SeriesParkingRev").ChartArea
' Assign this series to use the secondary axis and set its maximum to be 100%
SeriesParkingTime.YAxisType = AxisType.Secondary
ChartParkingSummery.Series("SeriesParkingTime").XValueMember = "week_num"
ChartParkingSummery.Series("SeriesParkingTime").YValueMembers = "avg_time"
ChartParkingSummery.Series("SeriesParkingTime").LegendText = "Average Time"
'' ----------------------------------------
'Dim SeriesCars As New Series("SeriesCars")
'ChartParkingSummery.Series.Add(SeriesCars)
Dim SeriesCars As Series
SeriesCars = ChartParkingSummery.Series("SeriesCars")
SeriesCars.ChartType = SeriesChartType.Point
SeriesCars.MarkerStyle = MarkerStyle.Triangle
SeriesCars.MarkerSize = 8
SeriesCars.BorderWidth = 3
SeriesCars.IsValueShownAsLabel = True
SeriesCars.MarkerStyle = MarkerStyle.Triangle
SeriesCars.MarkerSize = 15
SeriesCars.MarkerColor = Color.Green
ChartParkingSummery.Series("SeriesCars").XValueMember = "week_num"
ChartParkingSummery.Series("SeriesCars").YValueMembers = "num_cars"
ChartParkingSummery.Series("SeriesCars").LegendText = "Cars"
'' ----------------------------------------
ChartParkingSummery.Legends.Add(New Legend("Default"))
' Set legend style
ChartParkingSummery.Legends("Default").LegendStyle = LegendStyle.Table
' Set table style if legend style is Table
' Set legend docking
ChartParkingSummery.Legends("Default").Docking = Docking.Right
' Set legend alignment
ChartParkingSummery.Legends("Default").Alignment = StringAlignment.Center
ChartParkingSummery.Titles.Add("Revenue, Avg Time & Cars")