How to scatter plot without sort in Matplotlib? - matplotlib

I have this plot
I want this plot to be inverted, i.e., the categories with the highest avg buyer365 should appear at the top. Does anyone know how to deal with this? I really appreciate any help you can provide.
P.S. The code I used:
fig,axs=plt.subplots()
axs.scatter(trace1_summary["mean"],trace1_summary["cat3_items"])
axs.set(xlim=(-3,15),title= "89% Hdi Interval",ylabel="categories", xlabel="avg buyer365(standarized)")
axs.axvline(0.0,ls="--",color="k",alpha=0.5,label="average Buyer")
plt.legend()

You can plot the values sorted in ascending order:
axs.scatter("mean", "cat3_items", data=trace1_summary.sort_values("mean"))
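A minimal sketch of why this works, assuming trace1_summary is a DataFrame with a numeric "mean" column and a string "cat3_items" column. The sample values below are invented for illustration. Matplotlib places string categories along an axis in order of first appearance, so sorting ascending by "mean" puts the category with the highest mean at the top.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for trace1_summary; the values are invented.
trace1_summary = pd.DataFrame({
    "cat3_items": ["toys", "books", "shoes", "games"],
    "mean": [0.4, 2.1, -0.7, 1.3],
})

# Sort ascending so the category with the highest mean is plotted last (top).
ordered = trace1_summary.sort_values("mean")

fig, axs = plt.subplots()
axs.scatter("mean", "cat3_items", data=ordered)
axs.axvline(0.0, ls="--", color="k", alpha=0.5, label="average Buyer")
axs.legend()

# Draw once so the tick label texts are populated, then read them bottom to top.
fig.canvas.draw()
labels = [tick.get_text() for tick in axs.get_yticklabels()]
```

To get the opposite order, sort with ascending=False, or keep the ascending sort and call axs.invert_yaxis().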

Related

How to plot histogram where a country is a bin

I am attempting to plot a histogram where the y-axis should be the count and the x-axis (i.e. the bins) should be composed of all the countries, but I cannot seem to figure out the code.
I have tried following https://datasciencelab.wordpress.com/tag/pandas/ and How to create Histograms in Panda Python Using Specific Rows and Columns in Data Frame, but both of those are for bar graphs and not for histograms.
per_country = df.groupby(['COUNTRY'])[['TOTAL']].sum()
per_country = per_country.sort_values('TOTAL')
per_country.head()
per_country['TOTAL'].plot(kind='hist') # if hist is changed to bar, then it works, but I need a histogram and not a bar graph
The result should be a histogram where the x-axis includes a label for each country and the y-axis is the total for each country. Right now the bins are just based on the 'TOTAL', whereas the y-axis is the number of countries with said 'TOTAL'.
Can someone please point me in the right direction? Thank you!

Why is Pandas inverting my x-axis order? [duplicate]

When I was plotting two series of data against each other, the X axis was inverted unexpectedly. I know this question sounds pretty similar to this other one: x-axis inverted unexpectedly by pandas.plot(...), and it actually is, but I want to know if this can be disabled or something, not a workaround. Let me explain myself.
I have a very simple DF that consists of a datetime index and two columns; one has humidity measurements and the other daily weights. Both of them are in descending order because when my sample loses water, it also loses weight and humidity. So my DF looks something like this, where my data is in descending order.
But then, when I plot using X = "Peso" (weight) and Y = 'Humedad' (humidity), my X axis goes in ascending order instead of descending order.
My plotting code:
plt.figure(figsize=(12,9))
plt.scatter(data['Peso'],data['Humedad'])
plt.xlabel('Peso (kg)',fontsize=14)
plt.ylabel("Raw Counts",fontsize=14)
plt.xticks(rotation=90,fontsize=10)
plt.grid()
Resulting in this kind of plot, where X axis is inverted
So, I could do two simple types of workaround:
plt.scatter(sorted(data['Peso']),data['Humedad'])
or
plt.scatter(data['Peso'][::-1],data['Humedad'])
Both of them have the same result: they plot my data as I wanted, BUT my xticks are still inverted:
So what I did was create a list with my weight values in order to insert it as follows:
semin=data['Peso']
semin=semin.tolist()
And then I added it to my plt.xticks like this:
plt.xticks(semin,rotation=90,fontsize=10)
It "kind of" worked, overlapping some of the xticks as you can see in the image below:
I know I can solve this with [Locs] and general xticks information, but I really wanted to know if it's possible to just ask Pandas to follow the natural descending order of the data, or anything similar, and avoid all of this xticks stuff?
I've checked this too: https://github.com/pandas-dev/pandas/issues/10118
and I tried by doing the set_index suggestion:
plt.figure(figsize=(12,9))
data.set_index('Peso').Humedad.plot()
plt.xlabel('Peso (kg)',fontsize=14)
plt.ylabel("Raw Counts",fontsize=14)
plt.xticks(rotation=90,fontsize=10)
plt.grid()
And it went almost perfect, except that I needed it in scatter...
So I tried some stuff to "scatter it"
1. Putting the marker type:
data.set_index('Peso').Humedad.plot(marker='o')
Got a marker + line graph:
2. Changing .plot for .scatter to the plot:
data.set_index('Peso').Humedad.scatter()
Got this error:
AttributeError: 'Series' object has no attribute 'scatter'
3. Using both
data.set_index('Peso').Humedad.plot.scatter()
Got this one:
AttributeError: 'SeriesPlotMethods' object has no attribute 'scatter'
4. Making this giant question. Please help.
And that's all, sorry if I'm missing something or if my post is too long. I'm open to suggestions, corrections or anything you're willing to tell me.
Thanks!
Oh, I just saw that the linked question actually does exactly what you need. I will leave this here, but please refer to the linked question instead.
Changing the ticks is not a solution! The resulting plot may easily come out completely wrong.
Instead google for "invert x axis" or so and find that you can invert the axis via
ax = df.plot(...)
ax.invert_xaxis()
This is not a workaround. It is the solution. (How much easier can it get?)
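A minimal sketch of this for the scatter case, assuming a DataFrame named data with 'Peso' and 'Humedad' columns as in the question. The sample values are invented. It uses DataFrame.plot.scatter, which returns the Axes, and then inverts the x axis so the ticks stay consistent with the data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Invented sample: weight and humidity both fall over time.
data = pd.DataFrame({
    "Peso": [5.0, 4.6, 4.1, 3.9, 3.5],
    "Humedad": [820, 760, 690, 655, 600],
})

# Scatter is a DataFrame plot method (it needs both x and y columns).
ax = data.plot.scatter(x="Peso", y="Humedad", figsize=(12, 9))
ax.invert_xaxis()  # largest weight now on the left; tick values remain correct
ax.set_xlabel("Peso (kg)", fontsize=14)
ax.set_ylabel("Raw Counts", fontsize=14)
ax.grid()

left, right = ax.get_xlim()  # left > right once the axis is inverted
```

The same call works after plt.scatter(...) by fetching the current Axes with plt.gca().invert_xaxis().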

3d scatter plot with mplot3d with missing frequency as z axis

I am trying to plot some data but I only have my x and y axis given. I would like to get the frequency for my z axis like I would in a histogram. But I don't really want a histogram, rather a scatter plot or a surface plot.
This is my histogram right now, quite messy.
Is it possible to use the function of the histogram to get the frequency of the points but have a scatter/surface plot instead? Or to change the bars of the histogram to something like dots or something similar?
None of my data points have the same value, so I would like them to be sorted in intervals, like [0,0.05] or so, that's why I am trying to find another solution than calculating this.

How to change text of y-axes on a matplotlib generated picture

The page is
"http://matplotlib.sourceforge.net/examples/pylab_examples/histogram_demo_extended.html"
Let's look at the y-axis, the numbers there do not make any sense, could we change it to something else that is meaningful?
Except for the cumulative distribution plot and the last one, the y-axes show normalized histogram values with the normed=1 keyword set (i.e., the area underneath the histogram equals 1, as in the definition of a probability density function (PDF)).
You can use yticks(), see this example.
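A small sketch of both points, using invented sample data. It assumes a recent Matplotlib release, where the normalization keyword is spelled density=True (normed was the older spelling used on the linked page). The tick positions and labels chosen here are arbitrary examples.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=200, scale=25, size=1000)  # invented sample data

fig, ax = plt.subplots()
# density=True scales the bar heights so that sum(height * bin_width) equals 1.
heights, edges, _ = ax.hist(x, bins=20, density=True)
area = float(np.sum(heights * np.diff(edges)))

# Replace the default y ticks with hand-picked positions and labels.
ticks = [0.0, 0.005, 0.010, 0.015]
plt.yticks(ticks, ["0", "0.005", "0.010", "0.015"])
```

Leaving out density=True gives raw counts on the y-axis instead, which is often the more readable choice.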

Weird graphics in matplotlib when changing the scale

I get a histogram picture in matplotlib which looks great. Now I realize I need a log scale on the y-axis, so I just add to the code:
ax.set_yscale('log')
but then the histogram bars disappear and I only get some sparse points. Do you know what could be the reason?
Thanks
Use hist's log=True keyword argument instead. This is a FAQ in matplotlib-user list :)
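A minimal sketch of that suggestion, with invented sample data. Passing log=True to hist puts the count axis on a log scale as part of drawing the bars, so no separate set_yscale call is needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=5000)  # invented, heavy-tailed sample

fig, ax = plt.subplots()
# log=True sets a logarithmic y-axis while the bars are being created.
counts, edges, patches = ax.hist(x, bins=30, log=True)
yscale = ax.get_yscale()
```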