Why do I get wrong results in my Gekko optimization when I add a switching variable for the fixed costs of components?

I have a problem with my optimization program. I am trying to optimize the energy supply of a single-family house. To get results faster, I have linearized the cost functions of the system components (heat pump, heat and electricity storage, electrical water heater). The linearization leads to cost functions of the form a*x+b, where x is the design value of a component (e.g. P_el_HP_max). This means that even if the design value of a component is zero, the fixed costs (b) still have to be paid. To tell the program that the fixed costs only have to be paid if the design value is greater than zero, I added a binary variable of the form m.if2(-x,1,0) for each component. The objective of the optimization is to minimize the cost of the energy system. The problem is that the solution I get with the m.if2 statements is not the optimal solution. When I run the program without the m.if2 statements for the same time period, I get a lower objective even though the fixed costs always have to be paid, so it should be a higher objective if anything. I have also tried m.if3 statements and the m.sign(x) function, which don't seem to work either. It would be much appreciated if someone has an idea how I can solve this problem. Thank you very much.
Here is my Code:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
timesteps= 100
m = GEKKO(remote=False)
t = np.linspace(0, timesteps-1, timesteps) # time
m.time = t
m.options.SOLVER = 1
m.options.IMODE = 6
m.options.NODES = 2
m.options.REDUCE=3
m.solver_options = ['minlp_maximum_iterations 1000',\
'minlp_max_iter_with_int_sol 500',\
'minlp_integer_tol 1.0e-1',\
'minlp_branch_method 1',\
'objective_convergence_tolerance 1.0e-4',\
'nlp_maximum_iterations 100',\
'minlp_gap_tol 0.1']
# Energy Demand
#1. electricity
EL_Demand_Arr1=np.array([1.9200000,
1.4890000,1.4920000,1.1300000,0.64099997,0.58600003,
0.58399999,0.61000001,0.54900002,0.59500003,0.92699999,
0.95599997,0.91000003,1.1450000,1.1090000,1.6360000,
1.4740000,1.4680001,2.6150000,2.1810000,1.2320000,
1.3700000,0.96899998,1.3220000,1.1880000,0.64399999,
0.53899997,0.55299997,0.52899998,0.56099999,0.54600000,
0.80000001,1.1350000,0.70700002,1.1680000,1.0440000,
2.3160000,1.6420000,2.2370000,2.8870001,1.8550000,
1.4030000,0.70599997,1.4980000,3.4879999,1.5130000,
1.4349999,1.3520000,1.0530000,0.51700002,0.55000001,
0.52800000,0.52999997,0.56199998,0.53700000,0.58999997,
0.53500003,0.92500001,1.3490000,0.66299999,4.3810000,
1.0200000,0.79799998,0.77899998,1.0840000,2.1530001,
3.7449999,5.3490000,1.8710001,2.3610001,0.78799999,
0.47099999,0.56800002,0.51700002,0.54799998,0.55699998,
0.51400000,0.56500000,3.2790000,2.2750001,1.2300000,
0.97899997,0.78200001,1.0140001,0.77800000,0.58099997,
0.52999997,0.55900002,1.1770000,1.5400000,1.4349999,
2.0400000,2.2790000,1.6520000,1.6450000,1.2830000,
0.55800003,0.52499998,0.51899999,0.53799999])
EL_Demand_Arr2=EL_Demand_Arr1.round(decimals=3)
EL_Demand_Arr=EL_Demand_Arr2[0:timesteps]
EL_Demand=m.Param(EL_Demand_Arr,name='EL_Demand')
#2. heat
H_Demand_Arr1=np.array([1.0960000,1.0790000,
1.1590000,1.1760000,1.6940000,2.2639999,2.1450000,
2.0769999,2.0720000,2.0300000,1.9069999,1.8810000,
1.7880000,1.8180000,1.8049999,2.0430000,2.1489999,
2.1700001,2.1830001,2.1910000,1.9920000,1.5290000,
1.1810000,1.0400000,1.4310000,1.4110000,1.4700000,
1.4900000,1.8880000,2.4530001,2.2809999,2.3199999,
2.2960000,2.3299999,2.1630001,2.1289999,2.0599999,
2.1090000,2.0940001,2.3450000,2.4380000,2.4679999,
2.4630001,2.4480000,2.2219999,1.8480000,1.5779999,
1.4310000,1.5000000,1.4790000,1.5410000,1.5620000,
1.9790000,2.5720000,2.3910000,2.4319999,2.4070001,
2.4430001,2.2679999,2.2309999,2.1589999,2.2110000,
2.1949999,2.4579999,2.5560000,2.5869999,2.5820000,
2.5660000,2.3290000,1.9380000,1.6540000,1.5000000,
1.7160000,1.6930000,1.7630000,1.7869999,2.2650001,
2.9430001,2.7360001,2.7839999,2.7539999,2.7950001,
2.5950000,2.5539999,2.4710000,2.5300000,2.5120001,
2.8130000,2.9250000,2.9600000,2.9549999,2.9370000,
2.6659999,2.2170000,1.8930000,1.7160000,1.7980000,
1.7670000,1.8789999,1.9160000])
H_Demand_Arr2=H_Demand_Arr1.round(decimals=3)
H_Demand_Arr=H_Demand_Arr2[0:timesteps]
H_Demand=m.Param(H_Demand_Arr,name='H_Demand')
#3. Domestic Hot Water
DHW_Demand_Arr1=np.array([1.7420000,0,0,2.0320001,
0,0,3.7739999,2.4960001,3.3670001,0,2.4380000,
1.1030000,0,0,0,3.1350000,2.2060001,0,4.4120002,
0,0,0,0.87099999,1.5089999,0,0,0,0,0,0.87099999,
0.81300002,1.1610000,2.5539999,1.6260000,0,0,
0.63900000,0,3.4830000,2.8450000,2.4960001,
7.1409998,5.7480001,2.3800001,3.1930001,0,1.1610000,
0,0,0,0,0,0,0,2.6129999,1.9160000,4.2379999,
0.34799999,5.4569998,0,0,2.8450000,0,0,0,0,0,
2.4960001,1.6260000,0,2.5539999,0,0,0,0,0,
1.6260000,0,3.0190001,0,2.8450000,1.1030000,
2.9030001,0,0,0,0.98699999,0,1.1610000,0.34799999,
1.3930000,1.2770000,4.4120002,0,0,0,0,1.8580000,
0,0.98699999])
DHW_Demand_Arr2=DHW_Demand_Arr1.round(decimals=3)
DHW_Demand_Arr=DHW_Demand_Arr2[0:timesteps]
DHW_Demand=m.Param(DHW_Demand_Arr,name='TWW_BED')
#4. electricity production from PV
PV_P_Arr1=np.array([0,0,0,0,0,0,0,0,0,0.057000000,
0.14399999,0.30500001,0.13600001,0.28900000,0.22000000,
0.0040000002,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0.061999999,0.78899997,0.56300002,0.13600001,
0.052999999,0.017000001,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0.037000000,0.098999999,0.15000001,
0.11200000,0,0.12600000,0.032000002,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0.0040000002,0.73600000,1.8250000,
2.4020000,3.1870000,0.66500002,0.045000002,0,0,0,0,0,
0,0,0,0,0,0,0,0])
PV_P_Arr2=PV_P_Arr1.round(decimals=3)
PV_P_Arr=PV_P_Arr2[0:timesteps]
PV_P=m.Param(PV_P_Arr,name='PV_P')
# Heat Pump "Bit" is '1' during the heating season and '0' outside it, to tell the program that the heat pump may only be used during the heating season
HP_Bit_Arr1=np.array([1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1])
HP_Bit_Arr=HP_Bit_Arr1[0:timesteps]
HP_Bit=m.Param(HP_Bit_Arr,name='HP_Bit')
# Battery Storage
B_S = m.SV(0,lb=0)
B_S.FSTATUS=1
m.fix_initial(B_S,0)
B_S_Load = m.SV(0,lb=0) #Loading Battery
B_S_Recover = m.SV(0,lb=0) #Recover Energy from Battery
eff_B_S = 0.95 #efficiency Battery
# Heat Storage
H_S = m.SV(0,lb=0)
H_S.FSTATUS=1
m.fix_initial(H_S,0)
H_S_Load = m.SV(lb=0) #Loading Heat Storage
H_S_Recover = m.SV(lb=0) #Recover Energy from Heat Storage
eff_H_S = 0.9 #efficiency Heat Storage
#Heat Pump
# Binary Variable for Heat Pump
# it can either be turned on '1' or off '0'
P_HP_Binary=HP_Bit*m.sos1([0,1])
P_HP_Binary.STATUS = 1
#Electrical sizing of the Heat Pump
P_el_HP1= m.SV(0,lb=0)
P_el_HP1.FSTATUS=1
#Electrical Water Heater
EH=m.SV(0,lb=0)
EH.FSTATUS=1
# The Power of the Heat Pump multiplied with
# the Binary Variable gives the actual
# Output of the Heat Pump
P_el_HP=m.Intermediate(P_HP_Binary*P_el_HP1)
P_el_HP_max=m.FV(lb=0) #Heat Pump
P_el_HP_max.STATUS=1
H_S_max=m.FV(lb=0) #Heat Storage
H_S_max.STATUS=1
B_S_max=m.FV(lb=0) #Battery
B_S_max.STATUS=1
EH_max=m.FV(lb=0) #Electrical Water Heater
EH_max.STATUS=1
COP_HP=3.5 #COP of the Heat Pump
#Modulating Heat Pump
Q_HP=m.Intermediate(1.4*COP_HP*0.3*P_el_HP_max* \
P_HP_Binary+(COP_HP*(1-0.3*1.4)/(1-0.3)) \
*(P_el_HP-0.3*P_el_HP_max*P_HP_Binary))
# thermal Energy Output of the Heat Pump
#Q_HP=m.Intermediate(COP_HP*P_el_HP)
# the objective of this Optimization is to minimize
# the Cost for the Energy-System, since you only
# Pay for the maximal Value of the Heat Pump,
# Energy Storage and the electrical Water Heater
# and not for the value at each timestep, I define
# a FV that describes the maximal Value of the Components
#Losses of the Heat Storage
# eps_HS = 0.9 # Heat Storage Efficiency
# Vol_HS=m.Intermediate((H_S_max*3600)/(1000*4.18*(45-20)))
# Vol_HS_timestep=m.Intermediate((H_S*3600)/(1000*4.18*(45-20)))
# alpha_a=8 #W/m^2K
# s=0.1 #m
# Lamda= 0.03 #W/mK
# r_i=0.3
# r_a=r_i+0.1
# A_Top_Bottom=2*pi*r_a**2
# U_Top_Bottom=(1/((1/alpha_a)+(s/Lamda))) # assumption: no convection on the inside, so alpha_i=0
# U_A_Surface=(((Vol_HS_timestep/(r_i**2))/(((1/Lamda)\
# *np.log(r_a/r_i))+(1/(alpha_a*r_a)))))
#Q_Losses_HS=m.Intermediate((U_Top_Bottom*A_Top_Bottom*2+U_A_Surface)*(45-20)/1000)
# We have energy production from PV; energy that is not needed can be fed into the public grid
I_Excess=m.Var(0,lb=0)
# In case the electrical demand exceeds the PV production, energy can be drawn from the public grid
I_feed_out=m.SV(0,lb=0)
I_feed_out.FSTATUS=1
# Volume of the Heat Storage in m^3
Vol_HS=m.Intermediate((H_S_max*3600)/(1000*4.18*(45-20)))
# boundary conditions
m.Equations([PV_P +I_feed_out + B_S_Recover - P_el_HP - B_S_Load - I_Excess - EH == EL_Demand, #Energy Balance needs to satisfy the Demand
B_S.dt() == B_S_Load - B_S_Recover/eff_B_S, #Loading and Recovery of the Battery
B_S_Load * B_S_Recover == 0, #It is not allowed to Load and Recover at the same Time, at least one of both needs to be equal to '0' at each Timestep
Q_HP + H_S_Recover - H_S_Load + EH == H_Demand + DHW_Demand, #The Demand of Heat and DHW needs to be satisfied at each timestep
H_S.dt() == H_S_Load - H_S_Recover/eff_H_S, #Loading and recovery of the Heat Storage
H_S_Load * H_S_Recover == 0, #It is not allowed to Load and Recover at the same Time, at least one of both needs to be equal to '0' at each Timestep
# The maximal Value of the Energy System Components is the Upper Bound for the Value at each time Step
P_el_HP1 <= P_el_HP_max,
P_el_HP1 >= 0.3*P_el_HP_max, # the Heat Pump is a variable speed heat Pump and has a minimal output of 40% of the nominal Power
H_S <= H_S_max,
B_S <= B_S_max,
EH <= EH_max,])
#Binary Variable to tell the Program that it only needs to Pay the "Fix Costs" for the Component if the Components have a Value greater than 0
BS_bin=m.if2(-B_S_max,1,0) #Battery Storage
HS_bin=m.if2(-H_S_max,1,0) #Heat Storage
HP_bin=m.if2(-P_el_HP_max,1,0) # Heat Pump
EH_bin=m.if2(-EH_max,1,0) #Electrical Heater
#Objective is to minimize the cost of the Energy System (the Cost of Components that only need to be bought once get divided by the number of timesteps)
Objective=((2599.3*HP_bin+1142.9*P_el_HP_max*COP_HP+(EH_max*50)+1234.8*BS_bin+792.8*B_S_max+(((H_S_max*3600)/(4.18*(45-20)))*1.6+672.5*HS_bin))/(20*timesteps)-0.05*I_Excess+0.35*(I_feed_out))
m.Minimize(Objective)
m.solve(disp=True)
#Print Results
print("Nominal Power of the Heat Pump=",max(P_el_HP),"kW")
print("maximum Capacity of the Heat Storage=",max(H_S),"kW")
print("Volume of the Heat Storage=", max(Vol_HS),"m^3")
print("maximum Capacity of the Battery", max(B_S),"kW")
print("Electricity from the Public Grid",sum(I_feed_out[0:timesteps-1]))
# Plot results
fig, axes = plt.subplots(6, 1, figsize=(5, 5.1), sharex=True)
axes = axes.ravel()
ax = axes[0]
ax.plot(t, EL_Demand, 'r-', label='Electrical Demand',lw=1)
ax.plot(t, PV_P,'b:', label='PV Production',lw=1) # e.g. a generator (which our energy system does not have)
ax = axes[1]
ax.plot(t, EL_Demand, 'r-', label='Electrical Demand',lw=1)
ax.plot(t,I_feed_out, 'k--', label='Electricity from the public Grid',lw=1)
ax = axes[2]
ax.plot(t,B_S.value, 'k-', label='Battery Storage',lw=1)
ax.plot(t,B_S_Load,'g--',label='Battery Storage Loading',lw=1)
ax.plot(t,B_S_Recover,'b:',label='Battery Storage Recovery',lw=1) #lw=2 --> linewidth
ax = axes[3]
ax.plot(t,H_Demand, 'r-', label='Heat Demand',lw=1)
ax.plot(t, Q_HP.value,'b:',\
label='Thermal Production Heat Pump',lw=1)
ax = axes[4]
ax.plot(t,H_S, 'k-', label='Heat Storage',lw=1)
ax.plot(t,H_S_Load,'g--',label='Heat Storage Loading',lw=1)
ax.plot(t,H_S_Recover.value,'b:',\
label='Heat Storage Recovered Energy',lw=1)
ax = axes[5]
ax.plot(t,DHW_Demand, 'r-', label='Domestic Hot Water Demand',lw=1)
ax.plot(t, EH,'b:',\
label='Electrical Water Heater',lw=1)
for ax in axes:
    ax.legend(loc='center left',\
              bbox_to_anchor=(1,0.5),frameon=False)
    ax.grid()
    ax.set_xlim(0,len(t)-1)
plt.savefig('Results.png', dpi=600,\
bbox_inches = 'tight')
plt.show()

Gekko solvers are numerical so values of 1e-8 and -1e-8 are equivalent according to solver tolerance. This means that < and <= are the same for a numerical solver. Try setting the switching point slightly away from zero to get the intended effect of setting BS_bin=0 when B_S_max<tol.
tol = 0.1
BS_bin=m.if3(-B_S_max+tol,1,0) #Battery Storage
HS_bin=m.if3(-H_S_max+tol,1,0) #Heat Storage
HP_bin=m.if3(-P_el_HP_max+tol,1,0) # Heat Pump
EH_bin=m.if3(-EH_max+tol,1,0) #Electrical Heater
The m.if3() functions add a new binary variable for each function at every time step so the problem can get very large.
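Independent of Gekko, the intended effect of the switch can be checked with plain Python. The sketch below evaluates one linearized component cost a*x+b with and without a fixed-cost switch. The function name component_cost and the sample design values are assumptions for illustration only; the coefficients are the heat pump values from the objective in the question, and tol plays the same role as in the m.if3() lines.

```python
def component_cost(x, a, b, tol=0.1, switched=True):
    """Linearized cost a*x + b. With switched=True the fixed part b
    is charged only when the design value x exceeds tol."""
    if switched:
        fixed = b if x > tol else 0.0
    else:
        fixed = b
    return a * x + fixed

COP_HP = 3.5
a_hp = 1142.9 * COP_HP   # variable heat pump cost per unit of P_el_HP_max
b_hp = 2599.3            # fixed heat pump cost

# Hypothetical design values, chosen only to show both sides of the switch
for x in [0.0, 0.05, 1.0, 5.0]:
    with_switch = component_cost(x, a_hp, b_hp)
    without_switch = component_cost(x, a_hp, b_hp, switched=False)
    print(x, round(with_switch, 2), round(without_switch, 2))
```

For every design value the switched cost is less than or equal to the unswitched cost, so a model with the switch that reports a higher objective than the model that always pays the fixed costs has most likely stopped at a non-optimal point.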
Additionally, the m.sos1() function is only needed when the discrete variable choices are not consecutive integers (e.g. [0,2,7,10]) or when they are not integers at all (e.g. [0,0.2,0.6,1.5]). Try using a binary variable instead:
# Binary Variable for Heat Pump
# it can either be turned on '1' or off '0'
b_HP = m.MV(lb=0,ub=1,integer=True)
b_HP.STATUS = 1
P_HP_Binary=HP_Bit*b_HP #m.sos1([0,1])
The prior version of the model had P_HP_Binary.STATUS = 1. This is only needed for m.MV() or m.CV() types to turn them on or off.
The feedback status on the m.SV() variables (e.g. EH.FSTATUS=1, P_el_HP1.FSTATUS=1, and I_feed_out.FSTATUS=1) is only needed when there is a measurement for these values and you want to update the initial condition. This is sometimes used in MPC applications, but it isn't needed here.
The successful solution is:
---------------------------------------------------
Solver : APOPT (v1.0)
Solution time : 3.01009999999951 sec
Objective : 174.293488846257
Successful solution
---------------------------------------------------
Nominal Power of the Heat Pump= 0.0 kW
maximum Capacity of the Heat Storage= 0.0 kW
Volume of the Heat Storage= 0.0 m^3
maximum Capacity of the Battery 0.0 kW
Electricity from the Public Grid 426.5910395607
However, a solution with zero capacity for everything that just buys from the grid doesn't seem correct at first glance. Here is the full script:
from gekko import GEKKO
import numpy as np
import matplotlib.pyplot as plt
timesteps= 100
m = GEKKO(remote=False)
t = np.linspace(0, timesteps-1, timesteps) # time
m.time = t
m.options.SOLVER = 1
m.options.IMODE = 6
m.options.NODES = 2
m.options.REDUCE=3
m.solver_options = ['minlp_maximum_iterations 1000',\
'minlp_max_iter_with_int_sol 500',\
'minlp_integer_tol 1.0e-1',\
'minlp_branch_method 1',\
'objective_convergence_tolerance 1.0e-4',\
'nlp_maximum_iterations 100',\
'minlp_gap_tol 0.1']
# Energy Demand
#1. electricity
EL_Demand_Arr1=np.array([1.9200000,
1.4890000,1.4920000,1.1300000,0.64099997,0.58600003,
0.58399999,0.61000001,0.54900002,0.59500003,0.92699999,
0.95599997,0.91000003,1.1450000,1.1090000,1.6360000,
1.4740000,1.4680001,2.6150000,2.1810000,1.2320000,
1.3700000,0.96899998,1.3220000,1.1880000,0.64399999,
0.53899997,0.55299997,0.52899998,0.56099999,0.54600000,
0.80000001,1.1350000,0.70700002,1.1680000,1.0440000,
2.3160000,1.6420000,2.2370000,2.8870001,1.8550000,
1.4030000,0.70599997,1.4980000,3.4879999,1.5130000,
1.4349999,1.3520000,1.0530000,0.51700002,0.55000001,
0.52800000,0.52999997,0.56199998,0.53700000,0.58999997,
0.53500003,0.92500001,1.3490000,0.66299999,4.3810000,
1.0200000,0.79799998,0.77899998,1.0840000,2.1530001,
3.7449999,5.3490000,1.8710001,2.3610001,0.78799999,
0.47099999,0.56800002,0.51700002,0.54799998,0.55699998,
0.51400000,0.56500000,3.2790000,2.2750001,1.2300000,
0.97899997,0.78200001,1.0140001,0.77800000,0.58099997,
0.52999997,0.55900002,1.1770000,1.5400000,1.4349999,
2.0400000,2.2790000,1.6520000,1.6450000,1.2830000,
0.55800003,0.52499998,0.51899999,0.53799999])
EL_Demand_Arr2=EL_Demand_Arr1.round(decimals=3)
EL_Demand_Arr=EL_Demand_Arr2[0:timesteps]
EL_Demand=m.Param(EL_Demand_Arr,name='EL_Demand')
#2. heat
H_Demand_Arr1=np.array([1.0960000,1.0790000,
1.1590000,1.1760000,1.6940000,2.2639999,2.1450000,
2.0769999,2.0720000,2.0300000,1.9069999,1.8810000,
1.7880000,1.8180000,1.8049999,2.0430000,2.1489999,
2.1700001,2.1830001,2.1910000,1.9920000,1.5290000,
1.1810000,1.0400000,1.4310000,1.4110000,1.4700000,
1.4900000,1.8880000,2.4530001,2.2809999,2.3199999,
2.2960000,2.3299999,2.1630001,2.1289999,2.0599999,
2.1090000,2.0940001,2.3450000,2.4380000,2.4679999,
2.4630001,2.4480000,2.2219999,1.8480000,1.5779999,
1.4310000,1.5000000,1.4790000,1.5410000,1.5620000,
1.9790000,2.5720000,2.3910000,2.4319999,2.4070001,
2.4430001,2.2679999,2.2309999,2.1589999,2.2110000,
2.1949999,2.4579999,2.5560000,2.5869999,2.5820000,
2.5660000,2.3290000,1.9380000,1.6540000,1.5000000,
1.7160000,1.6930000,1.7630000,1.7869999,2.2650001,
2.9430001,2.7360001,2.7839999,2.7539999,2.7950001,
2.5950000,2.5539999,2.4710000,2.5300000,2.5120001,
2.8130000,2.9250000,2.9600000,2.9549999,2.9370000,
2.6659999,2.2170000,1.8930000,1.7160000,1.7980000,
1.7670000,1.8789999,1.9160000])
H_Demand_Arr2=H_Demand_Arr1.round(decimals=3)
H_Demand_Arr=H_Demand_Arr2[0:timesteps]
H_Demand=m.Param(H_Demand_Arr,name='H_Demand')
#3. Domestic Hot Water
DHW_Demand_Arr1=np.array([1.7420000,0,0,2.0320001,
0,0,3.7739999,2.4960001,3.3670001,0,2.4380000,
1.1030000,0,0,0,3.1350000,2.2060001,0,4.4120002,
0,0,0,0.87099999,1.5089999,0,0,0,0,0,0.87099999,
0.81300002,1.1610000,2.5539999,1.6260000,0,0,
0.63900000,0,3.4830000,2.8450000,2.4960001,
7.1409998,5.7480001,2.3800001,3.1930001,0,1.1610000,
0,0,0,0,0,0,0,2.6129999,1.9160000,4.2379999,
0.34799999,5.4569998,0,0,2.8450000,0,0,0,0,0,
2.4960001,1.6260000,0,2.5539999,0,0,0,0,0,
1.6260000,0,3.0190001,0,2.8450000,1.1030000,
2.9030001,0,0,0,0.98699999,0,1.1610000,0.34799999,
1.3930000,1.2770000,4.4120002,0,0,0,0,1.8580000,
0,0.98699999])
DHW_Demand_Arr2=DHW_Demand_Arr1.round(decimals=3)
DHW_Demand_Arr=DHW_Demand_Arr2[0:timesteps]
DHW_Demand=m.Param(DHW_Demand_Arr,name='TWW_BED')
#4. electricity production from PV
PV_P_Arr1=np.array([0,0,0,0,0,0,0,0,0,0.057000000,
0.14399999,0.30500001,0.13600001,0.28900000,0.22000000,
0.0040000002,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
0.061999999,0.78899997,0.56300002,0.13600001,
0.052999999,0.017000001,0,0,0,0,0,0,0,0,0,0,0,0,
0,0,0,0,0,0,0.037000000,0.098999999,0.15000001,
0.11200000,0,0.12600000,0.032000002,0,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0.0040000002,0.73600000,1.8250000,
2.4020000,3.1870000,0.66500002,0.045000002,0,0,0,0,0,
0,0,0,0,0,0,0,0])
PV_P_Arr2=PV_P_Arr1.round(decimals=3)
PV_P_Arr=PV_P_Arr2[0:timesteps]
PV_P=m.Param(PV_P_Arr,name='PV_P')
# Heat Pump "Bit" is '1' during the heating season and '0' outside it, to tell the program that the heat pump may only be used during the heating season
HP_Bit_Arr1=np.array([1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1])
HP_Bit_Arr=HP_Bit_Arr1[0:timesteps]
HP_Bit=m.Param(HP_Bit_Arr,name='HP_Bit')
# Battery Storage
B_S = m.SV(0,lb=0)
B_S.FSTATUS=1
m.fix_initial(B_S,0)
B_S_Load = m.SV(0,lb=0) #Loading Battery
B_S_Recover = m.SV(0,lb=0) #Recover Energy from Battery
eff_B_S = 0.95 #efficiency Battery
# Heat Storage
H_S = m.SV(0,lb=0)
H_S.FSTATUS=1
m.fix_initial(H_S,0)
H_S_Load = m.SV(lb=0) #Loading Heat Storage
H_S_Recover = m.SV(lb=0) #Recover Energy from Heat Storage
eff_H_S = 0.9 #efficiency Heat Storage
#Heat Pump
# Binary Variable for Heat Pump
# it can either be turned on '1' or off '0'
b_HP = m.MV(lb=0,ub=1,integer=True)
b_HP.STATUS = 1
P_HP_Binary=HP_Bit*b_HP #m.sos1([0,1])
#P_HP_Binary.STATUS = 1 # only needed for m.MV() or m.CV() types
#Electrical sizing of the Heat Pump
P_el_HP1= m.SV(0,lb=0)
#P_el_HP1.FSTATUS=1
#Electrical Water Heater
EH=m.SV(0,lb=0)
#EH.FSTATUS=1
# The Power of the Heat Pump multiplied with
# the Binary Variable gives the actual
# Output of the Heat Pump
P_el_HP=m.Intermediate(P_HP_Binary*P_el_HP1)
P_el_HP_max=m.FV(lb=0) #Heat Pump
P_el_HP_max.STATUS=1
H_S_max=m.FV(lb=0) #Heat Storage
H_S_max.STATUS=1
B_S_max=m.FV(lb=0) #Battery
B_S_max.STATUS=1
EH_max=m.FV(lb=0) #Electrical Water Heater
EH_max.STATUS=1
COP_HP=3.5 #COP of the Heat Pump
#Modulating Heat Pump
Q_HP=m.Intermediate(1.4*COP_HP*0.3*P_el_HP_max* \
P_HP_Binary+(COP_HP*(1-0.3*1.4)/(1-0.3)) \
*(P_el_HP-0.3*P_el_HP_max*P_HP_Binary))
# thermal Energy Output of the Heat Pump
#Q_HP=m.Intermediate(COP_HP*P_el_HP)
# the objective of this Optimization is to minimize
# the Cost for the Energy-System, since you only
# Pay for the maximal Value of the Heat Pump,
# Energy Storage and the electrical Water Heater
# and not for the value at each timestep, I define
# a FV that describes the maximal Value of the Components
#Losses of the Heat Storage
# eps_HS = 0.9 # Heat Storage Efficiency
# Vol_HS=m.Intermediate((H_S_max*3600)/(1000*4.18*(45-20)))
# Vol_HS_timestep=m.Intermediate((H_S*3600)/(1000*4.18*(45-20)))
# alpha_a=8 #W/m^2K
# s=0.1 #m
# Lamda= 0.03 #W/mK
# r_i=0.3
# r_a=r_i+0.1
# A_Top_Bottom=2*pi*r_a**2
# U_Top_Bottom=(1/((1/alpha_a)+(s/Lamda))) # assumption: no convection on the inside, so alpha_i=0
# U_A_Surface=(((Vol_HS_timestep/(r_i**2))/(((1/Lamda)\
# *np.log(r_a/r_i))+(1/(alpha_a*r_a)))))
#Q_Losses_HS=m.Intermediate((U_Top_Bottom*A_Top_Bottom*2+U_A_Surface)*(45-20)/1000)
# We have energy production from PV; energy that is not needed can be fed into the public grid
I_Excess=m.Var(0,lb=0)
# In case the electrical demand exceeds the PV production, energy can be drawn from the public grid
I_feed_out=m.SV(0,lb=0)
#I_feed_out.FSTATUS=1
# Volume of the Heat Storage in m^3
Vol_HS=m.Intermediate((H_S_max*3600)/(1000*4.18*(45-20)))
# boundary conditions
m.Equations([PV_P +I_feed_out + B_S_Recover - P_el_HP - B_S_Load - I_Excess - EH == EL_Demand, #Energy Balance needs to satisfy the Demand
B_S.dt() == B_S_Load - B_S_Recover/eff_B_S, #Loading and Recovery of the Battery
B_S_Load * B_S_Recover == 0, #It is not allowed to Load and Recover at the same Time, at least one of both needs to be equal to '0' at each Timestep
Q_HP + H_S_Recover - H_S_Load + EH == H_Demand + DHW_Demand, #The Demand of Heat and DHW needs to be satisfied at each timestep
H_S.dt() == H_S_Load - H_S_Recover/eff_H_S, #Loading and recovery of the Heat Storage
H_S_Load * H_S_Recover == 0, #It is not allowed to Load and Recover at the same Time, at least one of both needs to be equal to '0' at each Timestep
# The maximal Value of the Energy System Components is the Upper Bound for the Value at each time Step
P_el_HP1 <= P_el_HP_max,
P_el_HP1 >= 0.3*P_el_HP_max, # the Heat Pump is a variable speed heat Pump and has a minimal output of 40% of the nominal Power
H_S <= H_S_max,
B_S <= B_S_max,
EH <= EH_max])
#Binary Variable to tell the Program that it only needs to Pay the "Fix Costs" for the Component if the Components have a Value greater than 0
tol = 0.01
BS_bin=m.if3(-B_S_max+tol,1,0) #Battery Storage
HS_bin=m.if3(-H_S_max+tol,1,0) #Heat Storage
HP_bin=m.if3(-P_el_HP_max+tol,1,0) # Heat Pump
EH_bin=m.if3(-EH_max+tol,1,0) #Electrical Heater
#Objective is to minimize the cost of the Energy System (the Cost of Components that only need to be bought once get divided by the number of timesteps)
Objective=((2599.3*HP_bin+1142.9*P_el_HP_max*COP_HP+(EH_max*50)+1234.8*BS_bin+792.8*B_S_max+(((H_S_max*3600)/(4.18*(45-20)))*1.6+672.5*HS_bin))/(20*timesteps)-0.05*I_Excess+0.35*(I_feed_out))
m.Minimize(Objective)
m.solve(disp=True)
#Print Results
print("Nominal Power of the Heat Pump=",max(P_el_HP),"kW")
print("maximum Capacity of the Heat Storage=",max(H_S),"kW")
print("Volume of the Heat Storage=", max(Vol_HS),"m^3")
print("maximum Capacity of the Battery", max(B_S),"kW")
print("Electricity from the Public Grid",sum(I_feed_out[0:timesteps-1]))
# Plot results
fig, axes = plt.subplots(6, 1, figsize=(5, 5.1), sharex=True)
axes = axes.ravel()
ax = axes[0]
ax.plot(t, EL_Demand, 'r-', label='Electrical Demand',lw=1)
ax.plot(t, PV_P,'b:', label='PV Production',lw=1) # e.g. a generator (which our energy system does not have)
ax = axes[1]
ax.plot(t, EL_Demand, 'r-', label='Electrical Demand',lw=1)
ax.plot(t,I_feed_out, 'k--', label='Electricity from the public Grid',lw=1)
ax = axes[2]
ax.plot(t,B_S.value, 'k-', label='Battery Storage',lw=1)
ax.plot(t,B_S_Load,'g--',label='Battery Storage Loading',lw=1)
ax.plot(t,B_S_Recover,'b:',label='Battery Storage Recovery',lw=1) #lw=2 --> linewidth
ax = axes[3]
ax.plot(t,H_Demand, 'r-', label='Heat Demand',lw=1)
ax.plot(t, Q_HP.value,'b:',\
label='Thermal Production Heat Pump',lw=1)
ax = axes[4]
ax.plot(t,H_S, 'k-', label='Heat Storage',lw=1)
ax.plot(t,H_S_Load,'g--',label='Heat Storage Loading',lw=1)
ax.plot(t,H_S_Recover.value,'b:',\
label='Heat Storage Recovered Energy',lw=1)
ax = axes[5]
ax.plot(t,DHW_Demand, 'r-', label='Domestic Hot Water Demand',lw=1)
ax.plot(t, EH,'b:',\
label='Electrical Water Heater',lw=1)
for ax in axes:
    ax.legend(loc='center left',\
              bbox_to_anchor=(1,0.5),frameon=False)
    ax.grid()
    ax.set_xlim(0,len(t)-1)
plt.savefig('Results.png', dpi=600,\
bbox_inches = 'tight')
plt.show()
If you have an idea for a better solution, try setting .STATUS=0 and insert the value that may be better as a test. If it is feasible and the objective function is lower, then there may be multiple local minima. This is unlikely if the problem is a mixed integer linear programming (MILP) problem. Below are example tests.
Increase Heat Pump Capacity
Objective increases (worse solution) to 456.39.
P_el_HP_max=m.FV(value=1.0,lb=0) #Heat Pump
P_el_HP_max.STATUS=0
Increase Heat Storage
Objective increases (worse solution) to 208.49.
H_S_max=m.FV(value=1.0,lb=0) #Heat Storage
H_S_max.STATUS=0
Increase Battery Storage
Objective increases (worse solution) to 308.09.
B_S_max=m.FV(value=1.0,lb=0) #Battery
B_S_max.STATUS=0
Set Electrical Water Heater Capacity
No feasible solution found. Optimal capacity is 9.609 kW. Anything lower appears to be infeasible.
EH_max=m.FV(value=1,lb=0) #Electrical Water Heater
EH_max.STATUS=0
I haven't tried manually setting the heat pump on or off, but all of the capacity design variables appear to worsen the objective function when they are increased. The exception is the electrical water heater, whose optimal maximum capacity is 9.609 kW. It appears from this study and model that the original strategy of buying all electricity from the grid and solely using an electrical water heater is optimal.

Related

Python audio signal doesn't filter well

I need to take a noisy .wav audio file and filter out the noise. I have to do it using the Fourier transform. After some days of researching and experimenting, I finally wrote a function that runs, but it doesn't work as I intend it to. Here is the function I made:
# Audio signal processing
from scipy.io.wavfile import read, write
import matplotlib.pyplot as plt
import numpy as np
from scipy.fft import fft, fftfreq, ifft
def AudioSignalProcessing(audio):
    # Import the .wav format audio into two variables:
    # sampling (int)
    # audio signal (numpy array)
    sampling, signal = read(audio)
    # time duration of the audio
    length = signal.shape[0] / sampling
    # x axis based on the time duration
    time = np.linspace(0., length, signal.shape[0])
    # show original signal
    plt.plot(time, signal)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Original signal")
    plt.show()
    # apply Fourier transform and normalize
    transform = abs(fft(signal))
    transform = transform/np.linalg.norm(transform)
    # obtain frequencies
    xf = fftfreq(transform.size, 1/sampling)
    # show transformed signal (frequencies domain)
    plt.plot(xf, transform)
    plt.xlabel("Frecuency (Hz)")
    plt.ylabel("Amplitude")
    plt.title("Frequency domain signal")
    plt.show()
    # filter the transformed signal to a 40% of its maximum amplitude
    threshold = np.amax(transform)*0.4
    filtered = transform[np.where(transform > threshold)]
    xf_filtered = xf[np.where(transform > threshold)]
    # show filtered transformed signal
    plt.plot(xf_filtered, filtered)
    plt.xlabel("Frecuency (Hz)")
    plt.ylabel("Amplitude")
    plt.title("FILTERED time domain signal")
    plt.show()
    # transform the signal back to the time domain
    filtrada = ifft(signal)
    # show original signal filtered
    plt.plot(time, filtrada)
    plt.xlabel("Time (s)")
    plt.ylabel("Amplitude")
    plt.title("Filtered signal")
    plt.show()
    # convert audio signal to .wav format audio
    # write(audio.replace(".wav", " filtrado.wav"), sampling, filtrada.astype(signal.dtype))
    return None
AudioSignalProcessing("audio.wav")
Here are the output plots (images not included): original signal, transformed signal, filtered transformed signal, and filtered audio signal.
The filtered frequencies don't look as I think they should, and after converting the filtered signal back to audio it doesn't sound good at all. I've also tried different audio files, but the same filter distortion happens.
I suggest asking at https://dsp.stackexchange.com/ for detailed signal processing questions.
It looks like you want to keep only those frequency components whose magnitude is at least 40% of the maximum component. If that is the case:
Keep the complex form of the DFT, or you won't be able to transform back; so remove the abs from the line transform = abs(fft(signal)).
Don't use np.where to "keep" the frequencies you want; instead, set the places where the transform magnitude is below your threshold to 0; something like
transform[abs(transform) < 0.4 * max(abs(transform))] = 0
Finally, apply the inverse DFT to this altered transform; you've applied it to signal (see the line filtrada = ifft(signal)). (You probably get a warning when plotting filtrada about discarding imaginary values.)
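The three steps above can be put together in a short sketch. It assumes a mono (1-D) signal and uses numpy's FFT routines instead of scipy.fft; the function name fft_threshold_filter and the synthetic 440 Hz test tone are assumptions that stand in for the .wav data.

```python
import numpy as np

def fft_threshold_filter(signal, keep_fraction=0.4):
    """Zero every frequency component whose magnitude is below
    keep_fraction times the largest magnitude, then invert."""
    transform = np.fft.fft(signal)            # keep the complex values
    magnitude = np.abs(transform)
    transform[magnitude < keep_fraction * magnitude.max()] = 0
    # For a real input the surviving bins come in conjugate pairs, so the
    # imaginary part of the inverse is expected to be rounding error only.
    return np.fft.ifft(transform).real

# Synthetic stand-in for the audio file: one second of a 440 Hz tone
# sampled at 8 kHz with white noise added.
rng = np.random.default_rng(0)
fs = 8000
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.3 * rng.standard_normal(fs)

filtered = fft_threshold_filter(noisy)
print("RMS error before:", np.sqrt(np.mean((noisy - clean) ** 2)))
print("RMS error after: ", np.sqrt(np.mean((filtered - clean) ** 2)))
```

A global magnitude threshold like this works well for a few strong tones in broadband noise; for speech or music, where wanted components can be weaker than the noise peaks, it will likely remove parts of the signal as well.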

Central Limit Theorem: Sample means do not follow a normal distribution

The Problem
Good evening.
I am learning about the Central Limit Theorem. As practice, I ran simulations in an attempt to find the mean of a fair die (I know, a toy problem).
I took 4000 samples, and in each sample I rolled a die 50 times (screenshot of the code at the bottom). For each of these 4000 samples I computed the mean. Then, I plotted these 4000 sample means in a histogram (with bin size 0.03) using matplotlib.
Here is the result:
Question
Why aren't the sample means normally distributed given that the conditions for CLT (sample size >= 30) were respected?
Specifically, why does the histogram look like two normal distributions superimposed on top of each other? More intriguingly, why does the "outer" distribution look "discrete" with empty spaces occurring at regular intervals?
It almost seems like the result is off in a systematic way.
All help is greatly appreciated. I am very lost.
Supplementary Code
The code I used to generate the 4000 sample means.
"""
Take multiple samples of dice rolls. For
each sample, compute the sample mean.
With the sample means, plot a histogram.
By the Central Limit Theorem, the sample
means should be normally distributed.
"""
import random

sample_means = []
num_samples = 4000
for i in range(num_samples):
    # Large enough for CLT to hold
    num_rolls = 50
    sample = []
    for j in range(num_rolls):
        observation = random.randint(1, 6)
        sample.append(observation)
    sample_mean = sum(sample) / len(sample)
    sample_means.append(sample_mean)
When num_rolls equals 50, each possible mean will be a fraction with denominator 50. So, in reality, you are looking at a discrete distribution.
To create a histogram of a discrete distribution, the bin boundaries are best placed in between the possible values. With a step size of 0.03, the bins do not line up with the 0.02 spacing of the values: some bins contain two possible values while their neighbors contain only one, so their counts are roughly doubled. In addition, some bin boundaries coincide with values, and subtle floating-point rounding then makes it unpredictable which bin such a value lands in.
Here is some code to illustrate what is going on:
from matplotlib import pyplot as plt
import numpy as np
import random
sample_means = []
num_samples = 4000
for i in range(num_samples):
    num_rolls = 50
    sample = []
    for j in range(num_rolls):
        observation = random.randint(1, 6)
        sample.append(observation)
    sample_mean = sum(sample) / len(sample)
    sample_means.append(sample_mean)
fig, axs = plt.subplots(2, 2, figsize=(14, 8))
random_y = np.random.rand(len(sample_means))
for (ax0, ax1), step in zip(axs, [0.03, 0.02]):
bins = np.arange(3.01, 4, step)
ax0.hist(sample_means, bins=bins)
ax0.set_title(f'step={step}')
ax0.vlines(bins, 0, ax0.get_ylim()[1], ls=':', color='r') # show the bin boundaries in red
ax1.scatter(sample_means, random_y, s=1) # show the sample means with a random y
ax1.vlines(bins, 0, 1, ls=':', color='r') # show the bin boundaries in red
ax1.set_xticks(np.arange(3, 4, 0.02))
ax1.set_xlim(3.0, 3.3) # zoom in to region to better see the ins
ax1.set_title('bin boundaries between values' if step == 0.02 else 'chaotic bin boundaries')
plt.show()
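The doubling effect can also be checked without any plotting. The following is only a small sketch: it counts in integer hundredths (an assumption made here purely to sidestep floating point rounding), and the helper name `values_per_bin` is made up for illustration. It counts how many possible means fall into each bin when the first edge is 3.01, as in the code above.

```python
# Possible sample means between 3.00 and 3.30, expressed in hundredths:
# with 50 rolls the means are multiples of 0.02, i.e. 300, 302, ..., 330.
possible_means = range(300, 331, 2)

def values_per_bin(first_edge, step, n_bins):
    """Count possible means in each half-open bin [lo, lo + step)."""
    counts = []
    for i in range(n_bins):
        lo = first_edge + i * step
        hi = lo + step
        counts.append(sum(lo <= v < hi for v in possible_means))
    return counts

counts_step_003 = values_per_bin(301, 3, 8)  # edges 3.01, 3.04, 3.07, ...
counts_step_002 = values_per_bin(301, 2, 8)  # edges 3.01, 3.03, 3.05, ...

print(counts_step_003)  # bins alternate between one and two possible means
print(counts_step_002)  # every bin holds exactly one possible mean
```

With a step of 0.03, every other bin collects two possible means, which matches the "two superimposed distributions" look of the histogram in the question.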
PS: Note that the code would run much faster if it used NumPy arrays throughout instead of Python lists.
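As a rough sketch of what such a NumPy version could look like (the seed and the variable name `rolls` are choices made here for illustration, not part of the original code), all dice can be rolled at once and averaged row by row:

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded only to make the sketch reproducible
num_samples, num_rolls = 4000, 50

# One row per sample, one column per roll; the upper bound 7 is exclusive.
rolls = rng.integers(1, 7, size=(num_samples, num_rolls))

# Mean of each row gives the 4000 sample means.
sample_means = rolls.mean(axis=1)
```

The resulting `sample_means` array can be passed to `ax.hist` in the same way as the list in the code above.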

Analysis frame centering for scipy.signal.stft (and matplotlib plotting discrepancy)

I'm doing some experiments with Scipy's STFT, and would like to confirm that I'm understanding things correctly.
The following code generates the image I would expect, but labeled with the wrong time values:
from math import ceil, log
from scipy.io.wavfile import read
from scipy.signal import stft
import numpy as np
import matplotlib.pyplot as plt
# read a 2s, 440 Hz test tone, padded with 0.5s of silence on either end
fs, x = read('a440_2s_padded.wav')
nperseg = 44100
# pick an FFT size that's the smallest power of 2 >= the window size
nfft = pow(2, ceil(log(nperseg, 2)))
# N.B. no overlap between windows
f, t, Zxx = stft(x, fs, 'blackman', nperseg=nperseg, noverlap=0, nfft=nfft, boundary='zeros')
# crop the display to relevant bins
minBin, maxBin = 600, 700
# plot it
plt.pcolormesh(t, f[minBin:maxBin], np.abs(Zxx[minBin:maxBin]), vmin=None, vmax=None)
plt.title('STFT Magnitude')
plt.ylabel('Frequency [Hz]')
plt.xlabel('Time [sec]')
plt.show()
matplotlib STFT output
As noted in the code, I'm analyzing a 2s, 440 Hz test tone, padded with 0.5s of silence on either end, but in the image, the signal starts at 1s and lasts until 3s. For small nperseg values, this discrepancy doesn't make much difference, but for large values and musical data, the difference can be substantial, as it determines whether the STFT is centering its frames within beats (the desired behavior), or on beats (undesired, because then it's smearing data from two consecutive beats).
Am I misunderstanding something about the STFT analysis settings? Thanks for any insight.
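One way to check where the frames are actually centered is to inspect the `t` array that `stft` returns. The snippet below is only a sketch under stated assumptions: it replaces the WAV file with a synthetic tone, uses a low sample rate of 1000 Hz and a 100 Hz tone to keep it small, and keeps `nperseg` equal to one second as in the code above.

```python
import numpy as np
from scipy.signal import stft

fs = 1000        # assumed sample rate, chosen small for the example
nperseg = fs     # one-second windows, no overlap (as in the question)

# 3 s signal: 0.5 s silence, 2 s tone, 0.5 s silence
t_in = np.arange(3 * fs) / fs
tone = np.sin(2 * np.pi * 100 * t_in)
x = np.where((t_in >= 0.5) & (t_in < 2.5), tone, 0.0)

f, t, Zxx = stft(x, fs, 'blackman', nperseg=nperseg, noverlap=0,
                 boundary='zeros')

# Total magnitude per frame shows which frames contain the tone.
frame_energy = np.abs(Zxx).sum(axis=0)
print(t)
print(frame_energy)
```

With `boundary='zeros'`, the returned times are the centers of the frames, so in this sketch the frames centered at 1 s and 2 s (each spanning half a second on either side) contain the tone. If the plot draws each column starting at `t[i]` rather than centered on it, the tone appears shifted by half a window; in recent matplotlib versions, passing `shading='nearest'` to `pcolormesh` centers the cells on the given coordinates.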

Draw a point at the mean peak of a distplot or kdeplot in Seaborn

I'm interested in automatically plotting a point just above the mean peak of a distribution, represented by a kdeplot or distplot with kde. Plotting points and lines manually is simple, but I'm having difficulty deriving this maximal coordinate point.
For example, the kdeplot generated below should have a point drawn at about (3.5, 1.0):
iris = sns.load_dataset("iris")
setosa = iris.loc[iris.species == "setosa"]
sns.kdeplot(setosa.sepal_width)
The ultimate goal is to draw a line across to the next peak (two distributions in one graph) with a t-statistic printed above it.
Here is one way to do it. The idea here is to first extract the x and y-data of the line object in the plot. Then, get the id of the peak and finally plot the single (x,y) point corresponding to the peak of the distribution.
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
iris = sns.load_dataset("iris")
setosa = iris.loc[iris.species == "setosa"]
ax = sns.kdeplot(setosa.sepal_width)
x = ax.lines[0].get_xdata() # Get the x data of the distribution
y = ax.lines[0].get_ydata() # Get the y data of the distribution
maxid = np.argmax(y) # The id of the peak (maximum of y data)
plt.plot(x[maxid],y[maxid], 'bo', ms=10)
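If reading the data back from the plotted line is not an option, the peak can also be estimated by evaluating a KDE directly. This is a different technique from the one above, shown only as a sketch: it uses `scipy.stats.gaussian_kde` on synthetic stand-in data (the `widths` array is hypothetical, not the iris data), and seaborn's own bandwidth choice may differ, so the coordinates can deviate slightly from the plotted curve.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
widths = rng.normal(loc=3.4, scale=0.4, size=500)  # hypothetical stand-in data

kde = gaussian_kde(widths)                       # default (Scott) bandwidth
grid = np.linspace(widths.min(), widths.max(), 512)
density = kde(grid)                              # KDE evaluated on the grid

peak_idx = np.argmax(density)                    # index of the highest point
peak_x, peak_y = grid[peak_idx], density[peak_idx]
print(peak_x, peak_y)
```

The resulting `(peak_x, peak_y)` pair can then be drawn on an existing axes with `ax.plot(peak_x, peak_y, 'bo')`.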

How to change pyplot.specgram x and y axis scaling?

I have never worked with audio signals before and I know little about signal processing. Nevertheless, I need to represent an audio signal using the pyplot.specgram function from the matplotlib library. Here is how I do it.
import matplotlib.pyplot as plt
import scipy.io.wavfile as wavfile
rate, frames = wavfile.read("song.wav")
plt.specgram(frames)
The result I am getting is this nice spectrogram below:
When I look at the x-axis and y-axis, which I suppose are the time and frequency domains, I can't get my head around the fact that frequency is scaled from 0 to 1.0 and time from 0 to 80k.
What is the intuition behind it and, what's more important, how to represent it in a human friendly format such that frequency is 0 to 100k and time is in sec?
As others have pointed out, you need to specify the sample rate, else you get a normalised frequency (between 0 and 1) and sample index (0 to 80k). Fortunately this is as simple as:
plt.specgram(frames, Fs=rate)
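The effect of `Fs` on the axes can be seen without drawing anything by calling `matplotlib.mlab.specgram`, which returns the frequency and time arrays directly. This is only an illustrative sketch: the 8000 Hz rate and the 440 Hz test tone are made-up values standing in for "song.wav".

```python
import numpy as np
from matplotlib import mlab

rate = 8000                              # hypothetical sample rate in Hz
samples = np.arange(rate) / rate         # one second of sample times
frames = np.sin(2 * np.pi * 440 * samples)

# Without Fs, the default sampling frequency of 2 is used.
_, freqs_default, times_default = mlab.specgram(frames)

# With Fs=rate, frequencies are in Hz and times in seconds.
_, freqs_hz, times_s = mlab.specgram(frames, Fs=rate)

print(freqs_default[-1])  # top of the normalised frequency axis
print(freqs_hz[-1])       # top of the frequency axis in Hz (rate / 2)
print(times_s[-1])        # last time bin, in seconds
```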
To expand on Nukolas' answer, and combining it with my "Changing plot scale by a factor in matplotlib" and "matplotlib intelligent axis labels for timedelta", we can get not only kHz on the frequency axis, but also minutes and seconds on the time axis.
import datetime

import matplotlib.pyplot as plt
import matplotlib.ticker
import scipy.io.wavfile as wavfile

cmap = plt.get_cmap('viridis')  # this may fail on older versions of matplotlib
vmin = -40  # hide anything below -40 dB
cmap.set_under(color='k', alpha=None)
rate, frames = wavfile.read("song.wav")
fig, ax = plt.subplots()
pxx, freq, t, cax = ax.specgram(frames[:, 0],  # first channel
                                Fs=rate,  # to get frequency axis in Hz
                                cmap=cmap, vmin=vmin)
cbar = fig.colorbar(cax)
cbar.set_label('Intensity dB')
ax.axis("tight")

# Prettify
ax.set_xlabel('time h:mm:ss')
ax.set_ylabel('frequency kHz')
scale = 1e3  # kHz
ticks = matplotlib.ticker.FuncFormatter(lambda x, pos: '{0:g}'.format(x/scale))
ax.yaxis.set_major_formatter(ticks)

def timeTicks(x, pos):
    d = datetime.timedelta(seconds=x)
    return str(d)

formatter = matplotlib.ticker.FuncFormatter(timeTicks)
ax.xaxis.set_major_formatter(formatter)
plt.show()
Result:
Firstly, a spectrogram is a representation of the spectral content of a signal as a function of time - this is a frequency-domain representation of the time-domain waveform (e.g. a sine wave, your file "song.wav" or some other arbitrary wave - that is, amplitude as a function of time).
The frequency values (y-axis, Hertz) are wholly dependent on the sampling frequency of your waveform ("song.wav") and will range from "0" to "sampling frequency / 2", with the upper limit being the "nyquist frequency" or "folding frequency" (https://en.wikipedia.org/wiki/Aliasing#Folding). The sampling frequency is defined as 1 / dt, with dt being the time interval between discrete samples of the waveform. The matplotlib specgram function cannot determine it from the samples alone: if it is not specified, specgram falls back to its default of Fs=2, which is why the frequency axis then runs from 0 to 1. You can pass the option Fs='sampling rate' to the specgram function to define it yourself. It will be easier for you to get your head around what is going on if you figure out and pass these variables to the specgram function yourself.
The time values (x-axis, seconds) are purely dependent on the length of your "song.wav". You may notice some whitespace or padding if you use a large window length to calculate each spectral slice (think of the individual spectra which are arranged vertically and tiled horizontally to create the spectrogram image).
To make the axes more intuitive in the plot, add x- and y-axis labels; you can also scale the axis values (i.e. change the units) using a method similar to this
Take home message - try to be a bit more verbose with your code: see below for my example.
import matplotlib.pyplot as plt
import numpy as np
# generate a 5Hz sine wave
fs = 50
t = np.arange(0, 5, 1.0/fs)
f0 = 5
phi = np.pi/2
A = 1
x = A * np.sin(2 * np.pi * f0 * t +phi)
nfft = 25
# plot x-t, time-domain, i.e. source waveform
plt.subplot(211)
plt.plot(t, x)
plt.xlabel('time')
plt.ylabel('amplitude')
# plot power(f)-t, frequency-domain, i.e. spectrogram
plt.subplot(212)
# call specgram function, setting Fs (sampling frequency)
# and nfft (number of waveform samples, defining a time window,
# for which to compute the spectra)
plt.specgram(x, Fs=fs, NFFT=nfft, noverlap=5, detrend='mean', mode='psd')
plt.xlabel('time')
plt.ylabel('frequency')
plt.show()
5Hz_spectrogram: