I am having trouble plotting a single continuous regression curve; at the moment the curve comes out split into segments.
from numpy import *
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from scipy.optimize import curve_fit
from sklearn.metrics import mean_squared_error
from matplotlib.pyplot import *
from scipy.interpolate import *
%matplotlib inline
x = array([28,47,43,40,42,37,64,48,47,45,58,38,45,38,50,42,59,40,49,38,52])
y = array([64,44,46,48,43,55,41,50,52,58,44,56,47,59,40,50,43,58,47,55,40])
p2 = polyfit(x,y,2) # Polynomial curve, second order
plot (x,y,'o')
plot (x,polyval(p2,x),'r-')
The printed plot is:
But all I need is just a normal curve, the red curve should be just one continuous curve, and not be split or divided as it is now.
How can this be done?
(The snippet above plots a second-order regression curve, but the same problem occurs with higher orders and with first order.)
I would appreciate any help.
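One likely cause, assuming the "splits" are the line doubling back on itself: plot connects the points in the order they appear in the array, and x here is not sorted. A minimal sketch of one way around it is to evaluate the fitted polynomial on a sorted, evenly spaced grid instead of on the raw x values. The names x_fit and y_fit and the grid size of 200 are choices made for this illustration, not part of the original code.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([28,47,43,40,42,37,64,48,47,45,58,38,45,38,50,42,59,40,49,38,52])
y = np.array([64,44,46,48,43,55,41,50,52,58,44,56,47,59,40,50,43,58,47,55,40])

# Fit a second-order polynomial to the raw (unsorted) data
p2 = np.polyfit(x, y, 2)

# Evaluate the fit on a sorted grid so the line is drawn left to right
x_fit = np.linspace(x.min(), x.max(), 200)  # 200 points is an arbitrary choice
y_fit = np.polyval(p2, x_fit)

fig, ax = plt.subplots()
ax.plot(x, y, 'o')           # original data points
ax.plot(x_fit, y_fit, 'r-')  # one continuous fitted curve
```

Call plt.show() afterwards to display the figure. Sorting the data first (for example with np.argsort(x)) should also work, but the grid gives a smoother curve when the data are sparse.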
I am practicing with Axes3D to make 3D graphs. When I run my code, the axes are drawn but no points appear in them.
from mpl_toolkits.mplot3d import Axes3D
df_test=pd.DataFrame(data=np.random.normal(0,1,(20,3)),columns=['a','b','c'])
fig=plt.figure()
ax=fig.add_subplot(111,projection='3d')
ax.scatter=(df_test['a'],df_test['b'],df_test['c'])
ax.set_xlabel('a')
ax.set_ylabel('b')
ax.set_zlabel('c')
plt.show()
The result is shown below: there are no points, only the axes. What did I do wrong in my code? Many thanks!
The line ax.scatter=(df_test['a'],df_test['b'],df_test['c']) should be replaced with
ax.scatter(df_test['a'],df_test['b'],df_test['c']). The = assigns a tuple to the name ax.scatter instead of calling the method, so nothing is ever drawn.
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from mpl_toolkits.mplot3d import Axes3D
df_test=pd.DataFrame(data=np.random.normal(0,1,(20,3)),columns=['a','b','c'])
fig=plt.figure()
ax=fig.add_subplot(111,projection='3d')
ax.scatter(df_test['a'],df_test['b'],df_test['c'])
ax.set_xlabel('a')
ax.set_ylabel('b')
ax.set_zlabel('c')
plt.show()
The result is:
I tried to make a categorical plot of my data; here are my code and the resulting graph.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
sns.set(style="whitegrid")
sns.set_style("ticks")
sns.set_context("paper", font_scale=1, rc={"lines.linewidth": 6})
sns.catplot(y = "Region",x = "Interest by subregion",data = sample)
Image:
How can I make the y-labels more spread out and have a bigger font?
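Try passing height and aspect to sns.catplot (seaborn has no sns.figure; catplot creates its own figure, whose width is height * aspect) and raising font_scale in sns.set_context, e.g. sns.set_context("paper", font_scale=1.5).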
Try different values for these parameters to get the best results.
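As a rough sketch of that suggestion: the DataFrame below is a made-up stand-in for the sample data in the question (only the column names come from the original code), and the values height=6, aspect=1.2 and font_scale=1.5 are arbitrary starting points.

```python
import pandas as pd
import seaborn as sns

# Hypothetical stand-in for the `sample` DataFrame from the question
sample = pd.DataFrame({
    "Region": ["North", "South", "East", "West", "Central"],
    "Interest by subregion": [100, 82, 67, 54, 31],
})

sns.set_style("ticks")
# font_scale > 1 enlarges all text, including the y tick labels
sns.set_context("paper", font_scale=1.5)

# catplot is figure-level: height (inches) and aspect (width / height)
# control the figure size, which spreads the categories out
g = sns.catplot(y="Region", x="Interest by subregion", data=sample,
                height=6, aspect=1.2)

# Optionally override just the y tick label size on the underlying Axes
g.ax.tick_params(axis="y", labelsize=14)
```

A taller figure gives each category more vertical room, while font_scale and labelsize only affect the text.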
When plotting using matplotlib, I ran into an interesting issue where the y axis is scaled by a very inconvenient quantity. Here's a MWE that demonstrates the problem:
import numpy as np
import matplotlib.pyplot as plt
l = np.linspace(0.5,2,2**10)
a = (0.696*l**2)/(l**2 - 9896.2e-9**2)
plt.plot(l,a)
plt.show()
When I run this, I get a figure that looks like this picture
The y-axis clearly is scaled by a silly quantity even though the y data are all between 1 and 2.
This is similar to the question:
Axis numerical offset in matplotlib
I'm not satisfied with the answer to that question, in that it makes no sense to me why I need to go through the convoluted process of changing axis settings when the data are between 1 and 2 (EDIT: between 0 and 1). Why does this happen? Why does matplotlib use such a bizarre scaling?
The data in the plot are all between 0.696000000017 and 0.696000000273. For such cases it makes sense to use some kind of offset.
If you don't want that, you can use your own formatter:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
l = np.linspace(0.5,2,2**10)
a = (0.696*l**2)/(l**2 - 9896.2e-9**2)
plt.plot(l,a)
fmt = matplotlib.ticker.StrMethodFormatter("{x:.12f}")
plt.gca().yaxis.set_major_formatter(fmt)
plt.show()
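If the goal is only to switch the offset off rather than to pick a fixed number of decimals, another option is to ask the default formatter not to use one. This is a sketch that assumes the y-axis still has its default ScalarFormatter; it also prints the data range so the numbers quoted above can be checked.

```python
import numpy as np
import matplotlib.pyplot as plt

l = np.linspace(0.5, 2, 2**10)
a = (0.696*l**2)/(l**2 - 9896.2e-9**2)

# Confirm how narrow the y range really is
print(f"min = {a.min():.12f}, max = {a.max():.12f}")

fig, ax = plt.subplots()
ax.plot(l, a)

# Disable the additive offset on the y-axis tick labels
# (only works while the axis uses the default ScalarFormatter)
ax.ticklabel_format(axis='y', useOffset=False)
```

With a range this narrow the full-precision tick labels get long, which is the reason matplotlib uses an offset by default.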
Assume we want to plot a time series, e.g.:
import pandas as pd
import numpy as np
a=pd.date_range(start='2010-01-01',end='2014-01-01' , freq='D')
b=pd.Series(np.random.randn(len(a)), index=a)
b.plot()
The result is a figure in which the x-axis has years as labels, but I would like to get month-year labels. Is there a fast way to do this (possibly avoiding the use of tens of lines of complex code calling matplotlib)?
Pandas does some really weird stuff to the Axes objects, making it hard to avoid matplotlib calls.
Here's how I would do it:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
a = pd.date_range(start='2010-01-01',end='2014-01-01' , freq='D')
b = pd.Series(np.random.randn(len(a)), index=a)
fig, ax = plt.subplots()
ax.plot(b.index, b)
ax.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
which gives me:
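The formatter only changes how the existing ticks are labeled. To also control where the ticks fall, a locator can be added. The sketch below is one possible variation: the six-month interval, the "%b %Y" label format, and the seeded random generator are choices made for this illustration.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

idx = pd.date_range(start="2010-01-01", end="2014-01-01", freq="D")
rng = np.random.default_rng(0)  # seeded so the example is reproducible
series = pd.Series(rng.standard_normal(len(idx)), index=idx)

fig, ax = plt.subplots()
# Plot through matplotlib directly so pandas does not install its own date formatting
ax.plot(series.index, series.values)

# Put a major tick every six months and label it as month plus year
ax.xaxis.set_major_locator(mdates.MonthLocator(interval=6))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))

# Rotate the date labels so they do not overlap
fig.autofmt_xdate()
```

A smaller interval gives more labels at the cost of crowding; call plt.show() to display the result.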
How can I fit my plots?
Up to now, I've got the following code, which plots a variety of graphs from an array (data from an experiment) as it is placed in a loop:
import matplotlib.pyplot as plt
plt.figure(6)
plt.semilogx(Tau_Array, Correlation_Array, '+-')
plt.ylabel('Correlation')
plt.xlabel('Tau')
plt.title("APD" + str(detector) + "_Correlations_log_graph")
plt.savefig(DataFolder + "/APD" + str(detector) + "_Correlations_log_graph.png")
This works so far with a logarithmic plot, but I am wondering how a fit could be added here. In the end I would like to have a formula and/or a graph that best describes the data I measured.
I would be pleased if someone could help me!
You can use curve_fit from scipy.optimize for this. Here is an example:
# -*- coding: utf-8 -*-
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit
def func(x,a):
    return np.exp(a*x)
x,y,z = np.loadtxt("fit3.dat",unpack=True)
popt,pcov = curve_fit(func,x,y)
y_fit = np.exp(popt[0]*x)
plt.plot(x,y,'o')
plt.errorbar(x,y,yerr=z)
plt.plot(x,y_fit)
plt.savefig("fit3_plot.png")
plt.show()
In your case, you can define func to match the model you expect. popt is an array containing the fitted values of your parameters, and pcov is their estimated covariance matrix.
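To illustrate what "define func accordingly" might look like with more than one parameter, here is a self-contained sketch on synthetic data. The exponential-decay model, the "true" parameter values, the noise level, and the starting guess p0 are all assumptions made up for this example; the right model for the correlation data in the question depends on the experiment.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import curve_fit

def model(tau, amplitude, tau_c, offset):
    """Hypothetical model: exponential decay plus a constant offset."""
    return amplitude * np.exp(-tau / tau_c) + offset

# Synthetic stand-in for Tau_Array / Correlation_Array
rng = np.random.default_rng(0)
tau = np.logspace(-3, 1, 60)
true_params = (1.0, 0.5, 0.1)
corr = model(tau, *true_params) + rng.normal(0.0, 0.01, tau.size)

# p0 is a rough starting guess; a sensible guess helps the optimizer converge
popt, pcov = curve_fit(model, tau, corr, p0=(1.0, 1.0, 0.0))
perr = np.sqrt(np.diag(pcov))  # one-sigma uncertainty of each parameter

fig, ax = plt.subplots()
ax.semilogx(tau, corr, '+', label='data')
ax.semilogx(tau, model(tau, *popt), '-', label='fit')
ax.set_xlabel('Tau')
ax.set_ylabel('Correlation')
ax.legend()
```

Evaluating model(tau, *popt) reuses the model function for the fitted curve, so the formula is written only once.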