Plotly express wont show `color` lines - dataframe

I am trying to make a simple line plot using plotly express, of a data frame:
diccionary = {'Bet_type': ['dozen', 'column', 'even_odd', 'red_black'],
'Max_money': [216, 329, 377,193],
'times_played':[51,83,594,202]}
df = pd.DataFrame(diccionary)
df
when I try to plot all the Bet_type in a single graph, the graph appears with correct names but no lines
fig = px.line(df, x="times_played", y="Max_money", color ='Bet_type')
fig.show()
If I remove color it works but plots something that I am not interested in, In the same notebook I've made a similar plot with a different data frame and it works perfectly, so I assume doesn't have to do with the libraries.
If someone have any idea why is this happening would be really appreciated. thanks

Related

How to change the position of legend in a map?

I've been trying to plot a shapefile over a basemap. My issue here is the placement of the legend. I wanted it to be placed outside (next to) the map.
Specifically, I am plotting the "Ecoregion" column of the shapefile which basically labels each polygon with a colour (I figured this was better than actually putting the names on each polygon). I've tried the following code and receive an error:
pip install geopandas
pip install contextily
import geopandas as gpd
import contextily as ctx
data = gpd.read_file("icemap.shp")
plt.rcParams.update({'font.size': 14})
ax = data.plot(
figsize=(12, 10),
column="Ecoregion",
cmap="tab10",
)
map = Basemap(
llcrnrlon=-50,
llcrnrlat=30,
urcrnrlon=70.0,
urcrnrlat=85.0,
resolution="i",
lat_0=39.5,
lon_0=1,
)
ax.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
map.fillcontinents(color="lightgreen")
map.drawcoastlines()
map.drawparallels(np.arange(10,90,20),labels=[1,1,1,1])
map.drawmeridians(np.arange(-180,180,30),labels=[1,1,0,1])
plt.title("Map", fontsize=16)
The error is -
WARNING:matplotlib.legend:No handles with labels found to put in legend.
So I tried adding
legend-True
in the "ax" parentheses and removed the "ax.legend(...)", but then the legend appears on top of the map as the picture below.
Does "handle" refer to the column that is plotted? If so, I'm confused as to why I get this error. Or do I need to add another line of code?
I'd be grateful to receive some help in this.
(Attached file link: https://drive.google.com/file/d/1OfOAstBbbxiqSybpl_CQf-o47YgpbY7D/view?usp=sharing)
I would also consider Cartopy, since Basemap has been end-of-life for a long time. The resolution of your vector is also well beyond what's plotted on screen, so you could really increase performance by simplifying it a little.
But you can pass the legend keywords along when plotting the Geodataframe.
ax = data.plot(
figsize=(10, 8),
column="Ecoregion",
cmap="tab10",
legend=True,
legend_kwds=dict(bbox_to_anchor=(1.05, 1), loc='upper left'),
)

How to display all the lables present on x and y axis in matplotlib [duplicate]

I'm playing around with the abalone dataset from UCI's machine learning repository. I want to display a correlation heatmap using matplotlib and imshow.
The first time I tried it, it worked fine. All the numeric variables plotted and labeled, seen here:
fig = plt.figure(figsize=(15,8))
ax1 = fig.add_subplot(111)
plt.imshow(df.corr(), cmap='hot', interpolation='nearest')
plt.colorbar()
labels = df.columns.tolist()
ax1.set_xticklabels(labels,rotation=90, fontsize=10)
ax1.set_yticklabels(labels,fontsize=10)
plt.show()
successful heatmap
Later, I used get_dummies() on my categorical variable, like so:
df = pd.get_dummies(df, columns = ['sex'])
resulting correlation matrix
So, if I reuse the code from before to generate a nice heatmap, it should be fine, right? Wrong!
What dumpster fire is this?
So my question is, where did my labels go, and how do I get them back?!
Thanks!
To get your labels back, you can force matplotlib to use enough xticks so that all your labels can be shown. This can be done by adding
ax1.set_xticks(np.arange(len(labels)))
ax1.set_yticks(np.arange(len(labels)))
before your statements ax1.set_xticklabels(labels,rotation=90, fontsize=10) and ax1.set_yticklabels(labels,fontsize=10).
This results in the following plot:

Matplotlib modified histograms won't display after modification

I have plotted a histogram and would like to modify it, then re-plot it. It won't plot again without redefining the Figure and Axes object definitions. I'm using Jupyter Notebook, and I'm new to matplotlib, so I don't know if this is something that I'm not understanding about matplotlib, if it's an issue with the Jupyter Notebook or something else.
Here's my 1st block of code:
"""Here's some data."""
some_data = np.random.randn(150)
"""Here I define my `Figure` and `Axes` objects."""
fig, ax = plt.subplots()
"""Then I make a histogram from them, and it shows up just fine."""
ax.hist(some_data, range=(0, 5))
plt.show()
Here's the output from my 1st block of code:
Here's my 2nd block of code:
"""Here I modify the parameter `bins`."""
ax.hist(some_data, bins=20, range=(0, 5))
"""When I try to make a new histogram, it doesn't work."""
plt.show()
My 2nd block of code generates no visible output, which is the problem.
Here's my 3rd and final block of code:
"""But it does work if I define new `Figure` and `Axes` objects.
Why is this?
How can I display new, modified plots without defining new `Figure` and/or `Axes` objects? """
new_fig, new_ax = plt.subplots()
new_ax.hist(some_data, bins=20, range=(0, 5))
plt.show()
Here's the output from my 3rd and final block of code:
Thanks in advance.
When you generate a figure or an axis, it remains accessible for rendering or display until it's used for rendering or display. Once you execute plt.show() in your first block, the ax becomes unavailable. Your 3rd block of code is showing a plot because you're regenerating the figure and axes.

Matplotlib/Seaborn: Boxplot collapses on x axis

I am creating a series of boxplots in order to compare different cancer types with each other (based on 5 categories). For plotting I use seaborn/matplotlib. It works fine for most of the cancer types (see image right) however in some the x axis collapses slightly (see image left) or strongly (see image middle)
https://i.imgur.com/dxLR4B4.png
Looking into the code how seaborn plots a box/violin plot https://github.com/mwaskom/seaborn/blob/36964d7ffba3683de2117d25f224f8ebef015298/seaborn/categorical.py (line 961)
violin_data = remove_na(group_data[hue_mask])
I realized that this happens when there are too many nans
Is there any possibility to prevent this collapsing by code only
I do not want to modify my dataframe (replace the nans by zero)
Below you find my code:
boxp_df=pd.read_csv(pf_in,sep="\t",skip_blank_lines=False)
fig, ax = plt.subplots(figsize=(10, 10))
sns.violinplot(data=boxp_df, ax=ax)
plt.xticks(rotation=-45)
plt.ylabel("label")
plt.tight_layout()
plt.savefig(pf_out)
The output is a per cancer type differently sized plot
(depending on if there is any category completely nan)
I am expecting each plot to be in the same width.
Update
trying to use the order parameter as suggested leads to the following output:
https://i.imgur.com/uSm13Qw.png
Maybe this toy example helps ?
|Cat1|Cat2|Cat3|Cat4|Cat5
|3.93| |0.52| |6.01
|3.34| |0.89| |2.89
|3.39| |1.96| |4.63
|1.59| |3.66| |3.75
|2.73| |0.39| |2.87
|0.08| |1.25| |-0.27
Update
Apparently, the problem is not the data but the length of the title
https://github.com/matplotlib/matplotlib/issues/4413
Therefore I would close the question
#Diziet should I delete it or does my issue might help other ones?
Sorry for not including the line below in the code example:
ax.set_title("VERY LONG TITLE", fontsize=20)
It's hard to be sure without data to test it with, but I think you can pass the names of your categories/cancers to the order= parameter. This forces seaborn to use/display those, even if they are empty.
for instance:
tips = sns.load_dataset("tips")
ax = sns.violinplot(x="day", y="total_bill", data=tips, order=['Thur','Fri','Sat','Freedom Day','Sun','Durin\'s Day'])

how to shift x axis labesl on line plot?

I'm using pandas to work with a data set and am tring to use a simple line plot with error bars to show the end results. It's all working great except that the plot looks funny.
By default, it will put my 2 data groups at the far left and right of the plot, which obscures the error bar to the point that it's not useful (the error bars in this case are key to intpretation so I want them plainly visible).
Now, I fix that problem by setting xlim to open up some space on either end of the x axis so that the error bars are plainly visible, but then I have an offset from where the x labels are to where the actual x data is.
Here is a simplified example that shows the problem:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df6 = pd.DataFrame( [-0.07,0.08] , index = ['A','B'])
df6.plot(kind='line', linewidth=2, yerr = [ [0.1,0.1],[0.1,0.1 ] ], elinewidth=2,ecolor='green')
plt.xlim(-0.2,1.2) # Make some room at ends to see error bars
plt.show()
I tried to include a plot (image) showing the problem but I cannot post images yet, having just joined up and do not have anough points yet to post images.
What I want to know is: How do I shift these labels over one tick to the right?
Thanks in advance.
Well, it turns out I found a solution, which I will jsut post here in case anyone else has this same issue in the future.
Basically, it all seems to work better in the case of a line plot if you just specify both the labels and the ticks in the same place at the same time. At least that was helpful for me. It sort of forces you to keep the length of those two lists the same, which seems to make the assignment between ticks and labels more well behaved (simple 1:1 in this case).
So I coudl fix my problem by including something like this:
plt.xticks([0, 1], ['A','B'] )
right after the xlim statement in code from original question. Now the A and B align perfectly with the place where the data is plotted, not offset from it.
Using above solution it works, but is less good-looking since now the x grid is very coarse (this is purely and aesthetic consideration). I could fix that by using a different xtick statement like:
plt.xticks([-0.2, 0, 0.2, 0.4, 0.6, 0.8, 1.0], ['','A','','','','','B',''])
This gives me nice looking grid and the data where I need it, but of course is very contrived-looking here. In the actual program I'd find a way to make that less clunky.
Hope that is of some help to fellow seekers....