Matrix values are not shown - Plotly create_annotated_heatmap - matplotlib

The matrix values (zz) are not shown in the output. How can I fix this problem?
import plotly.figure_factory as ff
from plotly.subplots import make_subplots
fig = make_subplots(rows=1, cols=2)
# matrix
zz = [[0.1, 0.2],
      [0.2, 0.3]]
# labels
names = ["No", "Yes"]
fig1 = ff.create_annotated_heatmap(zz, x=names, y=names)
fig2 = ff.create_annotated_heatmap(zz, x=names, y=names)
fig.add_trace(fig1.data[0], 1, 1)
fig.add_trace(fig2.data[0], 1, 2)
fig.show()

Related

How to label line plot in a combined bar & line plot in ggplot r?

I want to combine a bar plot and a line plot, and label the line plot.
This is what I got: (plot)
This is my code:
df %>%
  ggplot(aes(reorder(NAME, pval), y = pval)) +
  geom_col(aes(x = NAME, y = pval), size = 1, color = "royalblue", fill = "white") +
  geom_line(aes(x = NAME, y = 10*Ratio), size = 1.5, color = "#c4271b", group = 1) +
  geom_text(aes(label = Ratio)) +
  coord_flip()
I want to label the line plot, but the labels end up on the bars instead.
My second question:
How can I reorder the y-axis from the largest -log(pvalue) to the lowest?
Any help will be really appreciated!
Try setting the x and y aesthetics in geom_text() to the same values used in geom_line():
geom_text(aes(x = NAME, y = 10*Ratio, label = Ratio))

Plot two columns of data with different number of data points

Hi, I have two columns of data. They cover the same time period, but column one generates a value every 1000 ms and column two generates a value every 500 ms. How can I plot them on the same graph so that they span the same length? The x-axis doesn't have to be "Time". Thank you.
plt.rcParams['figure.figsize'] = [40,20]
x = df['Time']
y1 = df['Engine RPM']
y2 = df['FMS RPM']
plt.plot(x,y1,color='r', label='column1',linewidth=2)
plt.plot(x,y2,color='b', label='column2',linewidth=2)
I can make both lines span the same length using the following code, but only on separate graphs.
x = np.linspace(0, 100,100)
x2 = np.linspace(0,200,200)
f, ((ax1, ax2)) = plt.subplots(2)
y1 = df['Engine RPM']
y2 = df1['FMS RPM']
ax1.plot(x,y1, label = 'column1')
ax2.plot(x2,y2, label = 'column2')
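Try this:
import numpy as np
import matplotlib.pyplot as plt
x = np.linspace(0, 100, 100)    # one x value per sample of y1
x2 = np.linspace(0, 200, 200)   # one x value per sample of y2
f, ax = plt.subplots(1, 1)
ax2 = ax.twiny()                # second x-axis that shares the same y-axis
ax.plot(x, y1, color='r', label='column1', linewidth=2)
ax2.plot(x2, y2, color='b', label='column2', linewidth=2)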
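Another option, not part of the original answer, is to keep a single x-axis and give each column its own time vector built from its sampling period. The sketch below assumes 100 s of data; `time_axis` is a hypothetical helper and the sample counts are made up for illustration.

```python
def time_axis(n_samples, period_ms):
    """Return the time in seconds of each sample, starting at 0."""
    return [i * period_ms / 1000.0 for i in range(n_samples)]

# Assumed sizes: one column sampled every 1000 ms, the other every 500 ms.
t1 = time_axis(100, 1000)   # 100 samples
t2 = time_axis(200, 500)    # 200 samples

# Every second sample of the fast column falls on a sample of the slow one.
aligned = all(t2[2 * i] == t1[i] for i in range(len(t1)))
print(len(t1), len(t2), aligned)
```

Because both vectors cover the same period, passing them as the x values of two `plot` calls on the same axes (for example `ax.plot(t1, y1)` and `ax.plot(t2, y2)`) would draw the two columns over the same span.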

zscore v.s. minmax normalization, why their results look the same

I'm trying to normalize my time series with two different normalization methods, minmax and zscore, and compare the results. Here is my code:
import pandas as pd
from matplotlib import pyplot
from sklearn.preprocessing import MinMaxScaler, StandardScaler
def scale_raw_data_zscore(raw_data):
    scaled_zscore = pd.DataFrame()
    idx = 514844
    values = raw_data.loc[idx]['d_column'].values
    values = values.reshape((len(values), 1))
    scaler = StandardScaler()
    scaler = scaler.fit(values)
    normalized = scaler.transform(values)
    normalized = normalized.reshape(normalized.shape[0])
    normalized = pd.DataFrame(normalized, index=raw_data.loc[idx].index, columns=raw_data.columns)
    # pd.concat replaces DataFrame.append, which was removed in pandas 2.0
    scaled_zscore = pd.concat([scaled_zscore, normalized])
    return scaled_zscore
def scale_raw_data_minmax(raw_data):
    scaled_minmax = pd.DataFrame()
    idx = 514844
    values = raw_data.loc[idx]['d_column'].values
    values = values.reshape((len(values), 1))
    scaler = MinMaxScaler(feature_range=(0, 1))
    scaler = scaler.fit(values)
    normalized = scaler.transform(values)
    normalized = normalized.reshape(normalized.shape[0])
    normalized = pd.DataFrame(normalized, index=raw_data.loc[idx].index, columns=raw_data.columns)
    scaled_minmax = pd.concat([scaled_minmax, normalized])
    return scaled_minmax
def plot_data(raw_data, scaled_zscore, scaled_minmax):
    fig = pyplot.figure()
    idx = 514844
    ax1 = fig.add_subplot(311)
    ax2 = fig.add_subplot(312)
    ax3 = fig.add_subplot(313)
    raw_data.loc[idx].plot(kind='line', x='date', y='d_column', ax=ax1, title='ID: ' + str(idx), legend=False, figsize=(20, 5))
    scaled_zscore.reset_index(drop=True).plot(kind='line', y='d_column', ax=ax2, title='zscore', color='green', legend=False, figsize=(20, 5))
    scaled_minmax.reset_index(drop=True).plot(kind='line', y='d_column', ax=ax3, title='minmax', color='red', legend=False, figsize=(20, 5))
    pyplot.show()
scaled_zscore = scale_raw_data_zscore(raw_data)
scaled_minmax = scale_raw_data_minmax(raw_data)
plot_data(raw_data, scaled_zscore, scaled_minmax)
I'm adding the plot of the results. Why are the results of both scaling methods exactly the same? And why do they have a different pattern from the raw data?
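One likely reason the two scaled plots look identical (an assumption, since the data and plots are not available here): z-score and min-max scaling are both affine maps of the form a*x + b with a > 0, so on auto-scaled axes the curves have the same shape and only the tick values differ. The sketch below uses made-up numbers and plain Python rather than scikit-learn; `zscore` and `minmax` are illustrative helpers. It does not explain the difference from the raw plot, though it may be worth noting that the raw series is plotted against 'date' while the scaled ones are plotted against a reset index.

```python
from statistics import mean, pstdev

def zscore(values):
    """(v - mean) / std, using the population standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]

def minmax(values):
    """(v - min) / (max - min), mapping the data onto [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

raw = [3.0, 7.0, 4.0, 10.0, 6.0]  # made-up data
z = zscore(raw)
m = minmax(raw)

# Rescaling the z-scores onto [0, 1] reproduces the min-max result,
# so the two curves differ only in the numbers on the y-axis.
z_rescaled = minmax(z)
same_shape = all(abs(a - b) < 1e-9 for a, b in zip(z_rescaled, m))
print(same_shape)
```

Comparing the y-axis tick values of the two subplots, rather than the line shapes, is a quick way to check that the scalers really produced different numbers.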

How to plot 4-D data embedded in a dataframe in Julia using a subplots approach?

I have a Julia DataFrame where the first 4 columns are dimensions and the 5th one contains the actual data.
I would like to plot it using a subplots approach, where the two main plot axes correspond to the first two dimensions and each subplot is a contour plot over the remaining two dimensions.
I am almost there with the code below:
using DataFrames, Plots
# plotlyjs() # doesn't work with plotlyjs backend
pyplot()
X = [1,2,3,4]
Y = [0.1,0.15,0.2]
I = [2,4,6,8,10,12,14]
J = [10,20,30,40,50,60]
df = DataFrame(X=Int64[], Y=Float64[], I=Float64[], J=Float64[], V=Float64[])
[push!(df, [x,y,i,j,(5*x+20*y+2)*(0.2*i^2+0.5*j^2+3*i*j+2*i^2*j+1)]) for x in X, y in Y, i in I, j in J]
minvalue = minimum(df.V)
maxvalue = maximum(df.V)
# Build a lookup table from the dimension values to the data value
function toDict(df, dimCols, valueCol)
    toReturn = Dict()
    for r in eachrow(df)
        keyValues = []
        [push!(keyValues, r[d]) for d in dimCols]
        toReturn[(keyValues...,)] = r[valueCol]
    end
    return toReturn
end
dict = toDict(df, [:X,:Y,:I,:J], :V)
M = [dict[(x,y,i,j)] for j in J, i in I, y in Y, x in X]
yL = length(Y)
xL = length(X)
plot(contour(M[:,:,3,1], ylabel="y = $(string(Y[3]))", zlims=(minvalue,maxvalue)), contour(M[:,:,3,2]), contour(M[:,:,3,3]), contour(M[:,:,3,4]),
     contour(M[:,:,2,1], ylabel="y = $(string(Y[2]))", zlims=(minvalue,maxvalue)), contour(M[:,:,2,2]), contour(M[:,:,2,3]), contour(M[:,:,2,4]),
     contour(M[:,:,1,1], ylabel="y = $(string(Y[1]))", xlabel="x = $(string(X[1]))"), contour(M[:,:,1,2], xlabel="x = $(string(X[2]))"), contour(M[:,:,1,3], xlabel="x = $(string(X[3]))"), contour(M[:,:,1,4], xlabel="x = $(string(X[4]))"),
     layout=(yL,xL))
This produces:
However, I still have the following concerns:
How do I automate the creation of each subplot in the plot call? Do I need to write a macro?
I would like each subplot to have the same limits on the z axis, but zlims does not seem to work. Is zlims not yet supported?
How do I hide the legend of the z axis on each subplot and plot it separately instead (ideally on the right side of the overall plot)?
EDIT:
For the first point I don't need a macro: I can create the subplots in a for loop, add them to an array, and pass the array to the plot() call using the splat operator (...):
plots = []
for y in length(Y):-1:1
    for x in 1:length(X)
        xlabel = y == 1 ? "x = $(string(X[x]))" : ""
        ylabel = x == 1 ? "y = $(string(Y[y]))" : ""
        println("$y - $x")
        # named p so that it does not shadow the plot() function called below
        p = contour(I, J, M[:,:,y,x], xlabel=xlabel, ylabel=ylabel, zlims=(minvalue,maxvalue))
        push!(plots, p)
    end
end
plot(plots..., layout=(yL,xL))

Printing the equation of the best fit line

I have created the best fit lines for the dataset using the following code:
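from numpy import polyfit, poly1d
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
for dd, KK in DATASET.groupby('Z'):
    # fit a cubic to the points of this group
    fit = polyfit(KK['x'], KK['y'], 3)
    fit_fn = poly1d(fit)
    ax.plot(KK['x'], KK['y'], 'o', KK['x'], fit_fn(KK['x']), 'k', linewidth=4)
ax.set_xlabel('x')
ax.set_ylabel('y')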
The graph displays the best fit line for each group of Z. I want to print the equation of the best fit line on top of the line. Please suggest what I can do here.
You need to write a function that converts an array of polynomial coefficients to a LaTeX string. Here is an example:
import pylab as pl
import numpy as np
x = np.random.randn(100)
y = 1 + 2 * x + 3 * x * x + np.random.randn(100) * 2
poly = pl.polyfit(x, y, 2)
def poly2latex(poly, variable="x", width=2):
    t = ["{0:0.{width}f}"]
    t.append(t[-1] + " {variable}")
    t.append(t[-1] + "^{1}")
    def f():
        for i, v in enumerate(reversed(poly)):
            idx = i if i < 2 else 2
            yield t[idx].format(v, i, variable=variable, width=width)
    return "${}$".format("+".join(f()))
pl.plot(x, y, "o", alpha=0.4)
x2 = np.linspace(-2, 2, 100)
y2 = np.polyval(poly, x2)
pl.plot(x2, y2, lw=2, color="r")
pl.text(x2[5], y2[5], poly2latex(poly), fontsize=16)
Here is the output:
Here's a one-liner.
If fit is the poly1d object, pass a label argument like the one below when plotting the fitted line:
label='y=${}$'.format(''.join(['{}x^{}'.format(('{:.2f}'.format(j) if j<0 else '+{:.2f}'.format(j)),(len(fit.coef)-i-1)) for i,j in enumerate(fit.coef)]))
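For readability, the same idea can be unpacked into a small function. This is only a sketch, not the answerer's code: `poly_label` is a hypothetical helper that takes the coefficients highest degree first (the order used by `fit.coef`), and it writes the output as plain text rather than LaTeX.

```python
def poly_label(coefs, variable="x", precision=2):
    """Format coefficients (highest degree first) as 'y = ...'."""
    degree = len(coefs) - 1
    terms = []
    for i, c in enumerate(coefs):
        power = degree - i
        sign = "-" if c < 0 else "+"
        body = f"{abs(c):.{precision}f}"
        if power == 1:
            body += variable
        elif power > 1:
            body += f"{variable}^{power}"
        terms.append((sign, body))
    # The leading term only shows its sign when it is negative.
    first_sign, first_body = terms[0]
    text = ("-" if first_sign == "-" else "") + first_body
    for sign, body in terms[1:]:
        text += f" {sign} {body}"
    return f"y = {text}"

print(poly_label([3.0, -2.0, 1.0]))  # y = 3.00x^2 - 2.00x + 1.00
```

With a poly1d object `fit`, the result could be passed as `label=poly_label(fit.coef)` in the plot call, followed by `ax.legend()` to display it.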