Change the format of axis tick values in matplotlib

I'm new to matplotlib, and I want to change the format of the values on the x axis, like this:
x axis: 250000, 500000, etc. should be shown as 0.250, 0.500, etc.
How can I do this?
Thanks

You would usually just scale the data prior to plotting. So instead of plt.plot(x,y), you'd use plt.plot(x,y/1e6).
To format the values with 3 decimal places, use a matplotlib.ticker.StrMethodFormatter and supply a format with 3 decimals, in this case "{x:.3f}". The example below scales and formats the y axis; for the x axis, scale x instead and set the formatter on plt.gca().xaxis.
import matplotlib.pyplot as plt
import matplotlib.ticker
import numpy as np; np.random.seed(42)
x = np.arange(5)
y = np.array([5e5,2e5,0,3e5,4e5])
plt.plot(x,y/1e6)
plt.gca().yaxis.set_major_formatter(matplotlib.ticker.StrMethodFormatter("{x:.3f}"))
plt.show()
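If you would rather leave the data untouched, a minimal sketch of an alternative is to rescale only the labels with a matplotlib.ticker.FuncFormatter, here on the x axis as in the question. The helper name `millions` and the sample data are made up for illustration, and the Agg backend is selected only so the sketch also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to see a window
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import numpy as np


def millions(value, pos=None):
    """Label a tick located at `value` as value / 1e6 with three decimals."""
    return f"{value / 1e6:.3f}"


# Hypothetical sample data: the x values stay in their original units.
x = np.array([0, 250_000, 500_000, 750_000, 1_000_000])
y = np.array([1.0, 3.0, 2.0, 5.0, 4.0])

fig, ax = plt.subplots()
ax.plot(x, y)
# Only the tick labels are rescaled; the plotted data is unchanged.
ax.xaxis.set_major_formatter(FuncFormatter(millions))
fig.canvas.draw()
```

With this approach the axis limits and any later calls such as ax.set_xlim still use the original units, which may or may not be what you want.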

Seaborn: How to add a "%" symbol after each value in the X axis of a plot, not converting the value to a percentage? [duplicate]

I am just wondering, how do I add a percentage (%) symbol after each value in my plot along the X axis, not converting the values to a percentage?
For example: Say I have the values 5, 10, 15, 20... How would I make them appear as 5%, 10%, 15%, 20% etc., along the X axis, but not converting them to 0.05%, 0.1%, 0.15%, 0.2% etc.
Code for my plot:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns
sns.set(
What needs to be added to my plot code to make the percentage (%) symbol appear after each value in my X axis?
sns.displot(data=df, x="column_name", kde=True)
PercentFormatter() should work for you. With its default xmax=100, a value of 5 is labeled 5% and is not rescaled. A working example:
import seaborn as sns
import numpy as np
import matplotlib.ticker as mtick
import matplotlib.pyplot as plt
np.random.seed(0)
x = np.random.randn(100)
fig, ax = plt.subplots(figsize=(16,8))
sns.histplot(x, kde=True, ax=ax)  # data on the x axis; distplot is deprecated
Then, at the end do:
ax.xaxis.set_major_formatter(mtick.PercentFormatter())
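To see why the values are not rescaled, here is a small sketch comparing two settings of the xmax parameter. It calls format_pct directly, so no plot is needed; the variable names are only for illustration.

```python
import matplotlib.ticker as mtick

# xmax is the data value that corresponds to 100%.
as_is = mtick.PercentFormatter(xmax=100, decimals=0)       # 5 is shown as 5%
from_fraction = mtick.PercentFormatter(xmax=1.0, decimals=0)  # 0.05 is shown as 5%

# format_pct(x, display_range) returns the label for a tick at x.
print(as_is.format_pct(5, 100))
print(from_fraction.format_pct(0.05, 1))
```

So for data that already holds values such as 5, 10, 15, the default xmax=100 only appends the symbol, while xmax=1.0 is the setting for data stored as fractions.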

How can I use matplotlib ticklabel_format to not use scientific notation on y axis labels

I'm creating a plot (in a colab worksheet) and want the y tick labels to not use scientific notation. The ticklabel_format doesn't make any difference to the final graph. The y axis labels are still shown as 10^3 instead of 1000. How do I format the y tick labels to not use scientific notation?
Here is my code
import matplotlib.pyplot as plt
plt.ticklabel_format(style='plain', axis='y')
plt.plot(Cd_rank,Cd_raw,linewidth=4)
plt.plot(Cd_rank,Cd_sed,linewidth=4)
plt.plot(Cd_rank,Cd_filter,linewidth=4)
plt.plot([0,1],[0.3,0.3],linewidth=4)
plt.plot([0,1],[5,5],linewidth=4)
plt.ylabel('Turbidez (UTN)')
plt.xlabel('Datos ordenados')
plt.yscale('log')
plt.legend(['Agua cruda','Decantada','Filtrada','Norma EPA','Norma ENACAL'])
The ScalarFormatter shows the tick labels in a default format. Note that, depending on your concrete situation, matplotlib might still use scientific notation:
When the numbers are very large or very small. The thresholds come from the axes.formatter.limits rcParam (by default exponents below -5 or above 6); set_powerlimits((n, m)) can be used to change the limits.
When the numbers are very close together, matplotlib describes the range using an offset. That offset is placed at the top of the axis. This can be suppressed with the useOffset=False parameter of the formatter.
In some cases with a logarithmic scale, there are very few major ticks. Then some (but not all) minor ticks also get a label. The formatter can be changed for these as well. One problem is that a simple ScalarFormatter may set too many labels. Either suppress all minor labels using a NullFormatter, or write a custom formatter that returns empty strings for the minor tick labels that need to be suppressed.
A simple example:
from matplotlib import pyplot as plt
from matplotlib import ticker
import numpy as np
N = 50
Cd_rank = np.linspace(0, 100, N)
Cd_raw = np.random.normal(1, 20, N).cumsum() + 100
plt.plot(Cd_rank, Cd_raw, linewidth=4)
plt.plot([0, 1], [0.3, 0.3], linewidth=4)
plt.plot([0, 1], [5, 5], linewidth=4)
plt.yscale('log')
plt.gca().yaxis.set_major_formatter(ticker.ScalarFormatter())
plt.gca().yaxis.set_minor_formatter(ticker.NullFormatter())
plt.show()
And here is a more complicated example, with both minor (green) and major (red) ticks.
from matplotlib import pyplot as plt
from matplotlib import ticker
import numpy as np
N = 50
Cd_rank = np.linspace(0, 100, N)
Cd_raw = np.random.normal(10, 5, N).cumsum() + 80
plt.plot(Cd_rank, Cd_raw, linewidth=4)
plt.yscale('log')
mticker = ticker.ScalarFormatter(useOffset=False)
mticker.set_powerlimits((-6, 6))
ax = plt.gca()
ax.yaxis.set_major_formatter(mticker)
ax.yaxis.set_minor_formatter(mticker)
ax.tick_params(axis='y', which='major', colors='crimson')
ax.tick_params(axis='y', which='minor', colors='seagreen')
plt.show()
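The custom formatter mentioned earlier, one that returns empty strings for the minor labels to be suppressed, could look roughly like the sketch below. The rule used here (keep only minor ticks whose leading digit is 2 or 5) and the name `sparse_minor_label` are assumptions chosen for illustration, not part of matplotlib.

```python
import math

import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to see a window
import matplotlib.pyplot as plt
from matplotlib import ticker
import numpy as np


def sparse_minor_label(value, pos=None):
    """Return a label for ticks at 2 or 5 times a power of ten, else an empty string."""
    if value <= 0:
        return ""
    exponent = math.floor(math.log10(value))
    leading = round(value / 10 ** exponent)
    return f"{value:g}" if leading in (2, 5) else ""


fig, ax = plt.subplots()
ax.plot(np.linspace(0, 100, 50), np.logspace(1, 3, 50), linewidth=4)
ax.set_yscale("log")
# Major ticks as plain numbers, minor ticks labeled only where the rule allows.
ax.yaxis.set_major_formatter(ticker.StrMethodFormatter("{x:g}"))
ax.yaxis.set_minor_formatter(ticker.FuncFormatter(sparse_minor_label))
fig.canvas.draw()
```

Any other rule can be swapped in, as long as the function returns a string for every tick value it is given.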
PS: When the ticks involve both powers of 10 larger than 1 and smaller than 1 (so, e.g. 100, 10, 1, 0.1, 0.01) the ScalarFormatter doesn't display the numbers smaller than 1 well (it displays 0.1 and 0.01 as 0). In that case, the StrMethodFormatter can be used instead:
plt.gca().yaxis.set_major_formatter(ticker.StrMethodFormatter("{x}"))
Here is code that turns off scientific notation and handles numbers that are smaller than 1 correctly. Thanks to @JohanC for this code.
from matplotlib import pyplot as plt
from matplotlib import ticker
import numpy as np
N = 50
x = np.linspace(0,1,N)
y = np.logspace(-3, 2, N)
plt.plot(x, y, linewidth=4)
plt.yscale('log')
plt.ylim(bottom=0.001,top=100)
plt.gca().yaxis.set_major_formatter(ticker.StrMethodFormatter("{x}"))
plt.show()

Plotting a graph with specific x-axis values

I have plotted a graph using data I have in Excel, so I have the values for the x and y axes. However, I want to change the x axis so that it shows only specific values, namely the key days. Is that possible?
Here is the code I have written:
import pandas as pd
from matplotlib import pyplot as plt #download matplot library
#create a graph of the cryptocurrencies in excel
btc = pd.read_excel('/Users/User/Desktop/bitcoin_prices.xlsx')
btc.set_index('Date', inplace=True) #Chart Fit
btc.plot()
plt.xlabel('Date', fontsize= 12)
plt.ylabel('Price ($)', fontsize= 12)
plt.title('Cryptocurrency Prices', fontsize=15)
plt.figure(figsize=(60,40))
plt.show() #plot then show the file
Thank you.
I guess you want the program to recognize the datetime format of the 'Date' column. Supply parse_dates=['Date'] to the loading call. Then you can select the rows for certain days. For example:
import datetime as dt
import numpy as np
import pandas as pd
# read_excel (not read_csv) is needed for an .xlsx file
btc = pd.read_excel('my_excel_data.xlsx', parse_dates=['Date'], index_col='Date')
# one timestamp every 7 days during 2015
selected_time = np.arange(dt.datetime(2015, 1, 1), dt.datetime(2016, 1, 1), dt.timedelta(7))
# isin keeps only dates present in the data; .loc with a missing date raises a KeyError
btc_2015 = btc.loc[btc.index.isin(selected_time)]
If you want specific labels for specific dates, you will have to read up on matplotlib's axes and date formatters (matplotlib.dates).
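As a rough sketch of that last point, the ticks can be placed on a handful of chosen dates and labeled with a matplotlib.dates.DateFormatter. The price series and the three key days below are synthetic stand-ins, not the asker's data, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to see a window
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the price table, indexed by date.
dates = pd.date_range("2015-01-01", "2015-12-31", freq="D")
rng = np.random.default_rng(0)
prices = pd.Series(200 + rng.normal(0, 5, len(dates)).cumsum(), index=dates)

# Hypothetical key days that should be the only ticks on the x axis.
key_days = pd.to_datetime(["2015-01-15", "2015-06-01", "2015-11-20"])

fig, ax = plt.subplots()
ax.plot(prices.index, prices.values)
# Convert the dates to matplotlib's date numbers and use them as tick positions.
ax.set_xticks(mdates.date2num(key_days.to_pydatetime()))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))
fig.autofmt_xdate()  # rotate the labels so they do not overlap
fig.canvas.draw()
```

The format string follows strftime conventions, so something like "%d %b" can be used for shorter labels.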

Matplotlib Bar Graph Yaxis not being set to 0 [duplicate]

My DataFrame's structure
trx.columns
Index(['dest', 'orig', 'timestamp', 'transcode', 'amount'], dtype='object')
I'm trying to plot transcode (transaction code) against amount to see how much money is spent per transaction. I made sure to convert transcode to a categorical type, as seen below.
trx['transcode']
...
Name: transcode, Length: 21893, dtype: category
Categories (3, int64): [1, 17, 99]
The result I get from doing plt.scatter(trx['transcode'], trx['amount']) is:
[Figure: scatter plot]
While the above plot is not entirely wrong, I would like the X axis to contain just the three possible values of transcode [1, 17, 99] instead of the entire [1, 100] range.
Thanks!
Since matplotlib 2.1 you can plot categorical variables by using strings. That is, if you provide the column for the x values as strings, matplotlib will treat them as categories.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({"x" : np.random.choice([1,17,99], size=100),
"y" : np.random.rand(100)*100})
plt.scatter(df["x"].astype(str), df["y"])
plt.margins(x=0.5)
plt.show()
To obtain the same result in matplotlib <=2.0, one would plot against an index instead.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df = pd.DataFrame({"x" : np.random.choice([1,17,99], size=100),
"y" : np.random.rand(100)*100})
u, inv = np.unique(df["x"], return_inverse=True)
plt.scatter(inv, df["y"])
plt.xticks(range(len(u)),u)
plt.margins(x=0.5)
plt.show()
A similar plot can be obtained using seaborn's stripplot:
import seaborn as sns
sns.stripplot(x="x", y="y", data=df)
And a potentially nicer representation can be done via seaborn's swarmplot:
sns.swarmplot(x="x", y="y", data=df)

Matplotlib float values on the axis instead of integers

I have the following code, which produces the plot below. I can't get the fiscal year to show correctly on the x axis; the values are displayed as floats. I tried astype(int) and it didn't work. Any ideas on what I am doing wrong?
p1 = plt.bar(list(asset['FISCAL_YEAR']),list(asset['TOTAL']),align='center')
plt.show()
This is the plot:
In order to make sure only integer locations obtain a ticklabel, you may use a matplotlib.ticker.MultipleLocator with an integer number as argument.
To then format the numbers on the axes, you may use a matplotlib.ticker.StrMethodFormatter.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
df = pd.DataFrame({"FISCAL_YEAR" : np.arange(2000,2017),
'TOTAL' : np.random.rand(17)})
plt.bar(df['FISCAL_YEAR'],df['TOTAL'],align='center')
locator = matplotlib.ticker.MultipleLocator(2)
plt.gca().xaxis.set_major_locator(locator)
formatter = matplotlib.ticker.StrMethodFormatter("{x:.0f}")
plt.gca().xaxis.set_major_formatter(formatter)
plt.show()
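If you do not want to fix the tick spacing yourself, one possible alternative is matplotlib.ticker.MaxNLocator with integer=True, which lets matplotlib choose the step but restricts ticks to integer positions. This is a sketch with random sample data, and the Agg backend is selected only so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line and call plt.show() to see a window
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator, StrMethodFormatter
import numpy as np

# Sample data standing in for the FISCAL_YEAR and TOTAL columns.
years = np.arange(2000, 2017)
totals = np.random.default_rng(1).random(len(years))

fig, ax = plt.subplots()
ax.bar(years, totals, align="center")
# Ticks are chosen automatically, but only at integer locations.
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
ax.xaxis.set_major_formatter(StrMethodFormatter("{x:.0f}"))
fig.canvas.draw()
```

Compared with MultipleLocator(2), this adapts the step to the axis range, which helps when the number of years is not known in advance.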