PLOTLY tracegroupgap - plotly-python

Where do I insert tracegroupgap in this code? I have tried putting it in legend=dict(...) without success.
layout = go.Layout(
    title = 'IGS',
    xaxis = dict(title = "Point", tickmode='linear', tick0=0, dtick=10),
    yaxis = dict(title = "sp, mv"),
    hovermode = 'closest',
    legend = dict(font=dict(family="Courier",
                            size=10,
                            color="black"),
                  tracegroupgap = 5  # <-- this is where I tried it
                  )
)
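For what it's worth, tracegroupgap is a property of layout.legend, so the placement inside legend=dict(...) looks right. It sets the gap (in px) between legend groups, which means it likely has no visible effect unless the traces are assigned to groups through their legendgroup attribute. Below is a minimal sketch, written as plain dicts so it runs without plotly installed; the trace names, values, and group labels are made up for illustration.

```python
# Two legend groups, "A" and "B"; tracegroupgap controls the space between them.
# All trace data below is hypothetical.
traces = [
    dict(type="scatter", x=[0, 10, 20], y=[1, 3, 2], name="A1", legendgroup="A"),
    dict(type="scatter", x=[0, 10, 20], y=[2, 4, 3], name="A2", legendgroup="A"),
    dict(type="scatter", x=[0, 10, 20], y=[5, 4, 6], name="B1", legendgroup="B"),
]

layout = dict(
    title=dict(text="IGS"),
    xaxis=dict(title=dict(text="Point"), tickmode="linear", tick0=0, dtick=10),
    yaxis=dict(title=dict(text="sp, mv")),
    hovermode="closest",
    legend=dict(
        font=dict(family="Courier", size=10, color="black"),
        tracegroupgap=5,  # vertical gap in px between legend groups
    ),
)

# A figure specification in the shape plotly accepts.
fig_spec = dict(data=traces, layout=layout)
print(fig_spec["layout"]["legend"])
```

With plotly installed, passing this to go.Figure(fig_spec) and calling .show() should render the figure with the two groups separated by the requested gap.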

Colors don't stick when lollipop plot is run

I have created a lollipop chart that I love. However, when the code runs to create the plot, the colors of the lines, segments, and points all change from what they were set to. Everything else runs great, so this isn't the end of the world, but I am trying to stick with a color palette throughout a report.
The colors should be these ("#9a0138" and "#000775" specifically):
But come out like this:
Any ideas?
Here is the data:
TabPercentCompliant <- structure(list(Provider_ShortName = c("ProviderA", "ProviderA", "ProviderA", "ProviderB",
"ProviderB", "ProviderB", "ProviderC", "ProviderC", "ProviderC", "ProviderD"), SubMeasureID = c("AMM2", "FUH7", "HDO", "AMM2", "FUH7", "HDO", "AMM2", "FUH7", "HDO", "AMM2"), AdaptedCompliant = c(139, 2, 117, 85, 1, 33, 36, 2, 22, 43), TotalEligible = c(238, 27, 155, 148, 10, 34, 61, 3, 24, 76), PercentCompliant = c(0.584033613445378, 0.0740740740740741, 0.754838709677419, 0.574324324324324, 0.1, 0.970588235294118, 0.590163934426229, 0.666666666666667, 0.916666666666667, 0.565789473684211 ), PercentTotalEligible = c(0.00516358587173479, 0.00058578495183546, 0.00336283953831467, 0.00321096936561659, 0.000216957389568689, 0.000737655124533542, 0.001323440076369, 6.50872168706066e-05, 0.000520697734964853, 0.00164887616072203), ClaimsAdjudicatedThrough = structure(c(19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024 ), class = "Date"), AdaptedNCQAMean = c(0.57, 0.39, 0.93, 0.57, 0.39, 0.93, 0.57, 0.39, 0.93, 0.57), PerformanceLevel = c(0.0140336134453782, -0.315925925925926, -0.175161290322581, 0.00432432432432439, -0.29, 0.0405882352941176, 0.0201639344262295, 0.276666666666667, -0.0133333333333334, -0.00421052631578944)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"))
VBP_Report_Date = "2022-09-01"
And the code for the plot:
Tab_PercentCompliant %>%
  filter(ClaimsAdjudicatedThrough == VBP_Report_Date) %>%
  ggplot(aes(x = Provider_ShortName,
             y = PercentCompliant)
  ) +
  geom_line(aes(x = Provider_ShortName,
                y = AdaptedNCQAMean,
                group = SubMeasureID,
                color = "#9a0138",
                size = .001)
  ) +
  geom_point(aes(color = "#000775",
                 size = (PercentTotalEligible)
                 )
  ) +
  geom_segment(aes(x = Provider_ShortName,
                   xend = Provider_ShortName,
                   y = 0,
                   yend = PercentCompliant,
                   color = "#000775")
  ) +
  facet_grid(cols = vars(SubMeasureID),
             scales = "fixed",
             space = "fixed") +
  theme_classic() +
  theme(legend.position = "none") +
  theme(panel.spacing = unit(.5, "lines"),
        panel.border = element_rect(
          color = "black",
          fill = NA,
          linewidth = .5),
        panel.grid.major.y = element_line(
          color = "gray",
          linewidth = .5),
        axis.text.x = element_text(
          angle = 65,
          hjust = 1),
        axis.title.x = element_blank(),
        axis.line = element_blank(),
        strip.background = element_rect(
          color = NULL,
          fill = "#e1e7fa")) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Test",
       subtitle = "Test",
       caption = "Test")
When an aesthetic is a constant, set it outside the aes() call. Inside aes(), a string such as "#9a0138" is treated as a data value and mapped through the default color scale, which is why the plotted colors differ from the hex codes. If you want a legend for the color, keep it inside aes() and set the colors manually with + scale_color_manual() or + scale_fill_manual().
I had to cut your code down quite a lot to make it work, and I also removed the parts that are extraneous to the problem. I removed size = 0.001 because the line wasn't visible with it, and I removed the odd filter step because the plot wasn't possible with it.
Tip: aesthetics defined globally in ggplot(aes(x = ..., y = ...)) are inherited by every geom layer, so you don't need to repeat them in each layer. This makes the code more concise and readable.
library(ggplot2)
ggplot(TabPercentCompliant, aes(x = Provider_ShortName, y = PercentCompliant)) +
  geom_line(aes(y = AdaptedNCQAMean, group = SubMeasureID),
            color = "#9a0138") +
  geom_point(aes(size = PercentTotalEligible), color = "#000775") +
  geom_segment(aes(xend = Provider_ShortName, y = 0, yend = PercentCompliant),
               color = "#000775") +
  facet_grid(~SubMeasureID) +
  theme(strip.background = element_rect(color = NULL, fill = "#e1e7fa"))
Here is the final code. Thanks again tjebo!
# Lollipop Chart ----------------------------------------------------------
Tab_PercentCompliant %>%
  filter(ClaimsAdjudicatedThrough == VBP_Report_Date) %>%
  ggplot(aes(x = Provider_ShortName,
             y = PercentCompliant)
  ) +
  geom_line(aes(y = AdaptedNCQAMean,
                group = SubMeasureID),
            color = "#9a0138"
  ) +
  geom_point(aes(size = PercentTotalEligible),
             color = "#000775"
  ) +
  geom_segment(aes(xend = Provider_ShortName,
                   y = 0,
                   yend = PercentCompliant),
               color = "#000775"
  ) +
  facet_grid(cols = vars(SubMeasureID)
  ) +
  theme_bw() +
  theme(legend.position = "none",
        axis.text.x = element_text(
          angle = 65,
          hjust = 1),
        axis.title.x = element_blank(),
        axis.line = element_blank(),
        strip.background = element_rect(
          fill = "#e1e7fa")) +
  scale_y_continuous(labels = scales::percent) +
  labs(title = "Test",
       subtitle = "Test",
       caption = "Test")

Calculating means for Columns based on data in another data set

I have two data sets, let's call them A and B (dput of the first 5 rows of each below):
A <- structure(list(Location = c(3960.82823, 3923.691, 3919.40593,
3907.97909, 3886.55377), Height = c(0.163744751, 0.231555472,
0.232150996, 0.192475738, 0.162966924), Start = c(3963.68494,
3946.54468, 3920.83429, 3909.40745, 3895.1239), End = c(3953.68645,
3920.83429, 3909.40745, 3895.1239, 3883.69706)), row.names = c(NA,
5L), class = "data.frame")
B <- structure(list(Wavenumber..cm.1. = c(3997.96546, 3996.5371, 3995.10875,
3993.68039, 3992.25204), M100 = c(0.00106, 0.00105, 0.00095,
0.00075, 0.00053), M101 = c(0.00081, 0.00092, 0.00102, 0.001,
0.00082), M102 = c(0.00099, 0.00109, 0.00105, 9e-04, 0.00072),
M103 = c(0.00101, 0.00111, 0.0012, 0.00129, 0.00133), M104 = c(0.00081,
0.00083, 0.00084, 0.00086, 0.00089), M105 = c(0.00139, 0.00113,
0.00092, 0.00089, 0.00102), M106 = c(0.00095, 0.00103, 0.00095,
0.00074, 0.00058), M107 = c(0.00054, 0.00058, 0.00059, 0.00049,
0.00032), M108 = c(0.00042, 5e-04, 5e-04, 0.00034, 0.00011
), M109 = c(0.00069, 0.00051, 0.00043, 0.00051, 0.00065),
M110 = c(0.00113, 0.00121, 0.00124, 0.00116, 0.00099), M111 = c(0.00039,
0.00056, 0.00068, 0.00068, 0.00056), M112 = c(0.0011, 0.00112,
0.00112, 0.00108, 0.00099), M113 = c(3e-04, 3e-04, 3e-04,
0.00027, 0.00019), M114 = c(0.00029, 6e-05, -2e-05, 9e-05,
0.00028), M115 = c(0.00091, 0.00079, 0.00061, 0.00038, 2e-04
), M116 = c(0.00117, 0.00105, 0.00096, 0.00092, 0.00092),
M117 = c(0.00039, 2e-04, 6e-05, 6e-05, 0.00018), M118 = c(0.00096,
0.00073, 0.00055, 0.00047, 0.00049), M119 = c(0.00037, 0.00031,
0.00024, 0.00018, 0.00018), M120 = c(0.00116, 0.00098, 0.00084,
0.00076, 0.00067), M121 = c(0.00039, 0.00024, 0.00011, 7e-05,
0.00011), M122 = c(0.00032, 0.00038, 0.00045, 0.00044, 0.00035
), M123 = c(9e-04, 0.00097, 0.00108, 0.0012, 0.00128), M124 = c(-0.00082,
-0.00065, -0.00049, -0.00037, -0.00036), M125 = c(0.00053,
0.00054, 0.00055, 6e-04, 0.00071), M126 = c(7e-05, 0.00022,
0.00022, 0.00011, 2e-05), M127 = c(0.00086, 9e-04, 0.00086,
0.00073, 0.00058), M128 = c(0.00089, 0.00078, 0.00069, 0.00057,
0.00043), M129 = c(0.00094, 0.00097, 0.00106, 0.00114, 0.00105
), M130 = c(0.0013, 0.00118, 0.00115, 0.00116, 0.00111),
M131 = c(0.00029, 0.00033, 0.00033, 3e-04, 0.00022), M132 = c(0,
0.00026, 0.00048, 6e-04, 0.00063), M133 = c(3e-05, -6e-05,
-6e-05, 5e-05, 0.00019), M134 = c(0.00056, 0.00054, 0.00052,
0.00054, 0.00057), M135 = c(2e-05, -4e-05, 6e-05, 0.00031,
0.00057), M136 = c(0.00083, 0.00075, 0.00068, 0.00068, 0.00073
), M137 = c(0.00064, 0.00074, 0.00084, 0.00095, 0.00105),
M139 = c(0.00044, 0.00044, 0.00042, 0.00043, 0.00047), M140 = c(0.00138,
0.00113, 0.00102, 0.0011, 0.00121), M141 = c(0.00062, 0.00043,
2e-04, 2e-05, 0), M142 = c(-0.00022, -0.00017, -0.00014,
-1e-04, 0), M143 = c(0.00109, 0.00108, 0.00103, 0.00093,
0.00087), M144 = c(0.00104, 0.00116, 0.00117, 0.00105, 0.00085
), M145 = c(7e-04, 0.00096, 0.00109, 0.00098, 0.00069), M146 = c(0.0014,
0.00158, 0.00165, 0.00154, 0.0013), M147 = c(6e-04, 0.00071,
0.00075, 0.00072, 0.00065), M148 = c(0.00098, 0.00093, 0.00091,
9e-04, 0.00088), M149 = c(0.00055, 0.00058, 0.00054, 0.00037,
0.00017), M150 = c(7e-04, 0.00068, 8e-04, 0.00107, 0.00132
), M151 = c(0.00037, 0.00042, 0.00046, 0.00047, 0.00046),
M152 = c(0.00047, 0.00042, 0.00043, 0.00045, 0.00045), M153 = c(0.00095,
0.00088, 0.00083, 8e-04, 0.00072), M154 = c(6e-05, 0.00013,
0.00032, 0.00054, 0.00062), M155 = c(0.00061, 0.00057, 0.00043,
0.00022, 4e-05), M156 = c(0.00077, 0.00078, 0.00071, 0.00052,
0.00025), M157 = c(0.00088, 0.00078, 0.00069, 0.00063, 0.00058
), M158 = c(0.00091, 0.00085, 0.00082, 0.00081, 8e-04), M159 = c(0.00078,
0.00076, 0.00073, 0.00074, 0.00079), M160 = c(0.00068, 7e-04,
0.00075, 8e-04, 0.00079), M161 = c(0.00055, 0.00073, 0.00082,
0.00085, 9e-04), M162 = c(0.00104, 0.00111, 0.0011, 0.00104,
0.00102), M163 = c(0.00076, 0.00071, 0.00069, 0.00068, 0.00067
), M164 = c(0.0012, 0.00133, 0.00154, 0.00174, 0.00177),
M165 = c(0.00072, 0.00073, 0.00072, 0.00074, 0.00083), M166 = c(0.00067,
0.00055, 0.00035, 0.00012, -2e-05), M167 = c(0.00068, 0.00053,
0.00047, 0.00051, 0.00059), M168 = c(0.00067, 0.00092, 0.001,
0.00087, 0.00067), M169 = c(0.00124, 0.00107, 0.00101, 0.00108,
0.00118), M170 = c(0.00054, 0.00064, 0.00069, 0.00066, 0.00053
), M171 = c(0.00029, 3e-04, 3e-04, 0.00031, 3e-04), M172 = c(0.00085,
0.00091, 0.00082, 0.00063, 0.00052), M173 = c(0.00022, 0.00036,
0.00053, 0.00061, 0.00056), M174 = c(5e-04, 0.00031, 0.00021,
0.00023, 0.00031), M175 = c(0.00074, 0.00066, 0.00059, 0.00051,
0.00043), M176 = c(9e-04, 0.00062, 0.00044, 0.00039, 0.00039
), M177 = c(0.00045, 0.00038, 0.00033, 0.00035, 0.00043),
M178 = c(0.00075, 0.00092, 0.00097, 0.00086, 0.00067), M179 = c(0.00047,
0.00033, 0.00026, 3e-04, 0.00037), M180 = c(0.00083, 0.00077,
0.00074, 0.00074, 7e-04), M181 = c(0.0013, 0.00138, 0.00137,
0.00127, 0.00109), M182 = c(0.00062, 0.00049, 0.00043, 0.00042,
0.00038), M183 = c(0.00056, 4e-04, 0.00034, 0.00046, 0.00065
), M184 = c(0.00122, 0.00116, 0.00096, 0.00067, 0.00039),
M185 = c(0.00045, 0.00026, 0.00012, 1e-04, 0.00024), M187 = c(0.00078,
0.00038, 8e-05, 0, 0.00014)), row.names = c(NA, 5L), class = "data.frame")
I want to calculate the means of the M columns in data set B based on the Start and End columns in data set A (which correspond to the Wavenumber cm-1 column in data set B), so that each Start and End pair has a corresponding mean for each M column in data set B.
For example, for the Start and End values in the first row of data set A (Start: 3963.68494, End: 3953.68645), you would calculate the mean of each M column in data set B using the absorbance values in the Wavenumber cm-1 range 3963.68494 to 3953.68645. The results would be stored in a separate data frame (with all the M column names) called meanData or something similar.
I can't quite figure out how to write a function or loop that does this: take the Start and End values from a row of data set A, find the absorbance values in data set B that fall within that range, calculate their mean, write it into a new data frame under the corresponding M column name, and repeat for each row of Start and End values in data set A. I know you would likely do it with an index, but I'm not sure how to write it exactly. Any help would be very much appreciated!
I tried creating separate indexes for the Start and End columns and using them with [] to select the values I want in data set B, but I was unsuccessful:
test<-mean(B$M100[which(B$Wavenumber..cm.1.[index2[i] to B$Wavenumber..cm.1.index3[i]])
where index2 holds the Start values in data set A and index3 holds the End values in data set A. This did not work.
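The range-based averaging described in this question can be sketched as follows. This is only an illustration: it is written in Python rather than R, uses just the standard library, and runs on small made-up numbers instead of the real spectra. The function name range_means and all toy values are assumptions, not part of the original data.

```python
import math
from statistics import fmean

def range_means(ranges, wavenumbers, columns):
    """For each (start, end) pair, average every column over the rows
    whose wavenumber lies inside the range (bounds inclusive)."""
    results = []
    for start, end in ranges:
        # Start may be larger than End, as in data set A, so order the bounds.
        low, high = min(start, end), max(start, end)
        rows = [i for i, w in enumerate(wavenumbers) if low <= w <= high]
        row = {"Start": start, "End": end}
        for name, values in columns.items():
            # NaN marks a range that contains no measurements.
            row[name] = fmean([values[i] for i in rows]) if rows else math.nan
        results.append(row)
    return results

# Toy stand-in for data set B: one wavenumber column plus two M columns.
wavenumbers = [3964.0, 3960.0, 3956.0, 3950.0, 3930.0, 3921.0]
columns = {
    "M100": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "M101": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
}
# Toy stand-in for the Start/End columns of data set A.
ranges = [(3963.68494, 3953.68645), (3946.54468, 3920.83429), (3800.0, 3790.0)]

mean_data = range_means(ranges, wavenumbers, columns)
for row in mean_data:
    print(row)
```

Each row of mean_data holds one Start/End pair and the mean of every M column over that wavenumber range, which is the shape of the meanData table described in the question.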

How to change a map tilt

I'm struggling with the map tilt. I would like help to change the tilt of the following map. Thanks!
The first map is my result, the second map is how I would like the slope to be.
library(usmap)
library(ggplot2)
NY_data <- read.table("NY_data.txt", header = TRUE)
NY1 <- plot_usmap(regions = "county", include = c("NY"), data = NY_data, values = "YEAR_2010") +
  labs(title = "New York by county", subtitle = "2010") +
  theme(plot.title = element_text(face = "bold", size = 18, hjust = 0.5),
        plot.subtitle = element_text(face = "bold", size = 16)) +
  scale_fill_continuous(low = "white", high = "#CB454A", limits = c(0, 35),
                        name = "Cumulative cases",
                        guide = guide_colourbar(barwidth = 27, barheight = 0.5,
                                                title.position = "top"),
                        label = scales::comma) +
  theme(legend.position = "bottom",
        legend.title = element_text(size = 12, face = "bold"),
        legend.text = element_text(size = 10))
NY1
map1
map2

Is there any other way to find percentage and plot a group bar-chart without using matplotlib?

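import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# df is assumed to be the employee DataFrame with 'Attrition' and 'Department' columns.
emp_attrited = pd.DataFrame(df[df['Attrition'] == 'Yes'])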
emp_not_attrited = pd.DataFrame(df[df['Attrition'] == 'No'])
print(emp_attrited.shape)
print(emp_not_attrited.shape)
att_dep = emp_attrited['Department'].value_counts()
percentage_att_dep = (att_dep/237)*100
print("Attrited")
print(percentage_att_dep)
not_att_dep = emp_not_attrited['Department'].value_counts()
percentage_not_att_dep = (not_att_dep/1233)*100
print("\nNot Attrited")
print(percentage_not_att_dep)
fig = plt.figure(figsize=(20,10))
ax1 = fig.add_subplot(221)
index = np.arange(att_dep.count())
bar_width = 0.15
rect1 = ax1.bar(index, percentage_att_dep, bar_width, color = 'black', label = 'Attrited')
rect2 = ax1.bar(index + bar_width, percentage_not_att_dep, bar_width, color = 'green', label = 'Not Attrited')
ax1.set_ylabel('Percentage')
ax1.set_title('Comparison')
xTickMarks = att_dep.index.values.tolist()
ax1.set_xticks(index + bar_width)
xTickNames = ax1.set_xticklabels(xTickMarks)
plt.legend()
plt.tight_layout()
plt.show()
The first block splits the dataset in two based on Attrition.
The second block calculates the percentage of employees in each Department who attrited and who did not.
The third block plots these percentages as a grouped bar chart.
You can do:
(df.groupby(['Department'])
   ['Attrition'].value_counts(normalize=True)
   .unstack('Attrition')
   .plot.bar()
)
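To make explicit what the groupby/value_counts(normalize=True) step computes, here is a rough standard-library sketch on made-up records. The function name and the sample rows are hypothetical. Note that this normalizes within each department, whereas the question's code divides by the total number of attrited (237) and non-attrited (1233) employees, so the two approaches answer slightly different questions.

```python
from collections import Counter, defaultdict

def attrition_share_by_department(records):
    """Return, per department, the fraction of employees with each
    attrition label (the fractions within a department sum to 1)."""
    counts = defaultdict(Counter)
    for department, attrition in records:
        counts[department][attrition] += 1
    return {
        department: {label: n / sum(c.values()) for label, n in c.items()}
        for department, c in counts.items()
    }

# Hypothetical (Department, Attrition) pairs standing in for df.
records = [
    ("Sales", "Yes"), ("Sales", "No"), ("Sales", "No"), ("Sales", "No"),
    ("R&D", "Yes"), ("R&D", "No"),
]

shares = attrition_share_by_department(records)
for department, share in shares.items():
    print(department, share)
```

Multiplying each fraction by 100 gives percentages comparable to those printed in the question.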

How can the edge colors of individual matplotlib histograms be set?

I've got a rough and ready function that can be used to compare two sets of values using histograms:
I want to set the individual edge colors of each of the histograms in the top plot (much as how I set the individual sets of values used for each histogram). How could this be done?
import os
import datavision
import matplotlib.pyplot
import numpy
import shijian
def main():
    a = numpy.random.normal(2, 2, size = 120)
    b = numpy.random.normal(2, 2, size = 120)
    save_histogram_comparison_matplotlib(
        values_1 = a,
        values_2 = b,
        label_1 = "a",
        label_2 = "b",
        normalize = True,
        label_ratio_x = "measurement",
        label_y = "",
        title = "comparison of a and b",
        filename = "histogram_comparison_1.png"
    )

def save_histogram_comparison_matplotlib(
    values_1 = None,
    values_2 = None,
    filename = None,
    directory = ".",
    number_of_bins = None,
    normalize = True,
    label_x = "",
    label_y = None,
    label_ratio_x = None,
    label_ratio_y = "ratio",
    title = "comparison",
    label_1 = "1",
    label_2 = "2",
    overwrite = True,
    LaTeX = False,
    #aspect = None,
    font_size = 20,
    color_1 = "#3861AA",
    color_2 = "#00FF00",
    color_3 = "#7FDADC",
    color_edge_1 = "#3861AA", # |<---------- insert magic for these
    color_edge_2 = "#00FF00", # |
    alpha = 0.5,
    width_line = 1
    ):
    matplotlib.pyplot.ioff()
    if LaTeX is True:
        matplotlib.pyplot.rc("text", usetex = True)
        matplotlib.pyplot.rc("font", family = "serif")
    if number_of_bins is None:
        number_of_bins_1 = datavision.propose_number_of_bins(values_1)
        number_of_bins_2 = datavision.propose_number_of_bins(values_2)
        number_of_bins = int((number_of_bins_1 + number_of_bins_2) / 2)
    if filename is None:
        if title is None:
            filename = "histogram_comparison.png"
        else:
            filename = shijian.propose_filename(
                filename = title + ".png",
                overwrite = overwrite
            )
    else:
        filename = shijian.propose_filename(
            filename = filename,
            overwrite = overwrite
        )
    values = []
    values.append(values_1)
    values.append(values_2)
    bar_width = 0.8
    figure, (axis_1, axis_2) = matplotlib.pyplot.subplots(
        nrows = 2,
        gridspec_kw = {"height_ratios": (2, 1)}
    )
    ns, bins, patches = axis_1.hist(
        values,
        color = [
            color_1,
            color_2
        ],
        density = normalize, # `normed` was removed in Matplotlib 3.1; `density` replaces it
        histtype = "stepfilled",
        bins = number_of_bins,
        alpha = alpha,
        label = [label_1, label_2],
        rwidth = bar_width,
        linewidth = width_line,
        #edgecolor = [color_edge_1, color_edge_2] <---------- magic here? dunno
    )
    axis_1.legend(
        loc = "best"
    )
    bars = axis_2.bar(
        bins[:-1],
        ns[0] / ns[1],
        alpha = 1,
        linewidth = 0, #width_line
        width = bins[1] - bins[0]
    )
    for bar in bars:
        bar.set_color(color_3)
    axis_1.set_xlabel(label_x, fontsize = font_size)
    axis_1.set_ylabel(label_y, fontsize = font_size)
    axis_2.set_xlabel(label_ratio_x, fontsize = font_size)
    axis_2.set_ylabel(label_ratio_y, fontsize = font_size)
    #axis_1.xticks(fontsize = font_size)
    #axis_1.yticks(fontsize = font_size)
    #axis_2.xticks(fontsize = font_size)
    #axis_2.yticks(fontsize = font_size)
    matplotlib.pyplot.suptitle(title, fontsize = font_size)
    if not os.path.exists(directory):
        os.makedirs(directory)
    #if aspect is None:
    #    matplotlib.pyplot.axes().set_aspect(
    #        1 / matplotlib.pyplot.axes().get_data_ratio()
    #    )
    #else:
    #    matplotlib.pyplot.axes().set_aspect(aspect)
    figure.tight_layout()
    matplotlib.pyplot.subplots_adjust(top = 0.9)
    matplotlib.pyplot.savefig(
        directory + "/" + filename,
        dpi = 700
    )
    matplotlib.pyplot.close()

if __name__ == "__main__":
    main()
You may simply plot two different histograms but share the bins.
import numpy as np; np.random.seed(3)
import matplotlib.pyplot as plt
a = np.random.normal(size=(89,2))
kws = dict(histtype= "stepfilled",alpha= 0.5, linewidth = 2)
hist, edges,_ = plt.hist(a[:,0], bins = 6,color="lightseagreen", label = "A", edgecolor="k", **kws)
plt.hist(a[:,1], bins = edges,color="gold", label = "B", edgecolor="crimson", **kws)
plt.show()
Use the lists of Patch objects returned by the hist() function.
In your case you have two datasets, so your patches variable will be a list containing two lists, each holding the Patch objects used to draw one dataset on your plot.
You can easily set properties on all of these objects at once using the setp() function. For example:
import numpy as np
import matplotlib.pyplot as plt

a = np.random.normal(size=(100,))
b = np.random.normal(size=(100,))
c, d, e = plt.hist([a, b], color=['r', 'g'])
plt.setp(e[0], edgecolor='k', lw=2)
plt.setp(e[1], edgecolor='b', lw=3)