Why doesn't df.interpolate() work in my Python 3? Why does it return the exact same DataFrame without any interpolation? - pandas

My code for df.interpolate was:
import pandas as pd
import numpy as np
import xlrd
from IPython.display import display
from scipy import interpolate
pd.set_option('display.max_rows',54100)
df = pd.read_excel(r'C:\Users\User\Desktop\tanvir random practice\gazipur.xlsx', parse_dates=["DateTime"], index_col='DateTime')
df.interpolate(method="linear").bfill()
display(df)
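A likely reason, sketched here on made-up data because the Excel file is not available: interpolate() and bfill() each return a new DataFrame and leave df untouched, so the result has to be assigned before it is displayed. The column name "value" and the dates below are hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data: a datetime-indexed column with gaps.
idx = pd.date_range("2021-01-01", periods=5, freq="D", name="DateTime")
df = pd.DataFrame({"value": [np.nan, 1.0, np.nan, np.nan, 4.0]}, index=idx)

# Calling interpolate() without assigning the result leaves df unchanged.
df.interpolate(method="linear").bfill()
print(df["value"].isna().sum())  # the gaps are still there

# Assign the result to keep the interpolated values.
filled = df.interpolate(method="linear").bfill()
print(filled)
```

If the gaps remain even after assigning, it may be worth checking df.dtypes: columns read in as text (object dtype) would probably need converting with pd.to_numeric first.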

Related

How to convert normal to uniform in sns pair plot

The code is below:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from io import StringIO
text = '''id,revenue ,profit,Label
101,779183,281257,1
102,144829,838451,1
103,766465,757565,-1'''
df = pd.read_csv(StringIO(text))
df = df[df.columns[1::]]
sns_plot = sns.pairplot(df,hue='Label')
The resulting plot was attached as an image (not reproduced here).
How can I change the normal distribution to a uniform distribution in the sns pairplot?

In Pandas, how can a DataFrame be binned by two columns, with the other columns changed to the means within those bins?

I've got the standard iris dataset projected down to two dimensions using UMAP, with the UMAP dimensions for the x and y positions of the 2D plot added as columns to the dataframe:
import numpy as np
import math
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
import umap # pip install umap-learn
iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['species'] = pd.Series(iris.target).map(dict(zip(range(3), iris.target_names)))
_umap = umap.UMAP().fit_transform(iris.data)
iris_df['UMAP_x'] = _umap[:,0]
iris_df['UMAP_y'] = _umap[:,1]
iris_df.head()
I'd like to bin both the UMAP_x and UMAP_y columns into about 25 bins and then replace the other columns in the dataframe with their mean values within each bin. How might this be done? It feels like cut or resampling might lead to the answer, but I'm not sure how.
You can use cut to define the bins and then use groupby with transform to calculate the mean value for each bin.
import numpy as np
import math
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.datasets import load_iris
import umap
iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['species'] = pd.Series(iris.target).map(dict(zip(range(3), iris.target_names)))
_umap = umap.UMAP().fit_transform(iris.data)
iris_df['UMAP_x'] = _umap[:,0]
iris_df['UMAP_y'] = _umap[:,1]
# Define bins for UMAP_x and UMAP_y params
iris_df['UMAP_x_bin'] = pd.cut(iris_df['UMAP_x'], bins=25)
iris_df['UMAP_y_bin'] = pd.cut(iris_df['UMAP_y'], bins=25)
# Calculate mean value for each bin
iris_df['UMAP_x_mean'] = iris_df.groupby('UMAP_x_bin')['UMAP_x'].transform('mean')
iris_df['UMAP_y_mean'] = iris_df.groupby('UMAP_y_bin')['UMAP_y'].transform('mean')
iris_df.head()
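If the goal is one row per two-dimensional bin, with the other columns averaged inside it, grouping on both bin columns at once is one possible approach. This is only a sketch on synthetic data: the random values stand in for the UMAP output, and 5 bins per axis (25 cells in total) is an assumption; use bins=25 for 25 bins per axis.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the iris/UMAP frame (the values are made up).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "UMAP_x": rng.uniform(0, 10, 200),
    "UMAP_y": rng.uniform(0, 10, 200),
    "sepal_length": rng.normal(5.8, 0.8, 200),
    "petal_length": rng.normal(3.7, 1.7, 200),
})

# Cut each axis into equal-width intervals: 5 x 5 = 25 cells.
df["x_bin"] = pd.cut(df["UMAP_x"], bins=5)
df["y_bin"] = pd.cut(df["UMAP_y"], bins=5)

# One row per occupied (x_bin, y_bin) cell, holding the column means.
binned = (
    df.groupby(["x_bin", "y_bin"], observed=True)[["sepal_length", "petal_length"]]
      .mean()
      .reset_index()
)
print(binned.head())
```

observed=True keeps only the cells that actually contain points; dropping it would also list empty cells with NaN means.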

Why do histogram ticks show different results in the GUI compared with the non-GUI version?

I am using Jupyter Notebook and I want to draw histograms. Without a GUI everything is shown correctly, but in the Tkinter version of the code all bars in the histogram are shifted to the left, so the first bar is missing (e.g. it should be 4 on a, 3 on b, 9 on c, but it shows 3 on a, 9 on b, where a, b and c are ticks).
This is the first version, which does not use the GUI:
import Tkinter as tk
from Tkinter import*
import tkMessageBox
import tkFileDialog
import pandas as pd
import pyautogui
import os
from PIL import Image, ImageTk
from tkinter import ttk
import pylab as plt
from matplotlib.ticker import (MultipleLocator, FormatStrFormatter,
AutoMinorLocator)
from matplotlib.figure import Figure
from matplotlib.backends.backend_tkagg import FigureCanvasTkAgg
select1=pd.DataFrame()
select2=pd.DataFrame()
year1=1396
year2=1395
select1=df.loc[(df['yearj']==year1)]
select2=df.loc[(df['yearj']==year2)]
x=select1['monthj'].values.tolist()
y=select2['monthj'].values.tolist()
plt.xlabel('month')
plt.ylabel('number of orders')
bins=[1,2,3,4,5,6,7,8,9,10,11,12,13]
axx=plt.subplot()
axx.xaxis.set_major_locator(MultipleLocator(1))
axx.xaxis.set_major_formatter(FormatStrFormatter('%d'))
plt.hist(y,bins,rwidth=0.8)
plt.hist(x,bins,rwidth=0.8, alpha=0.6)
The output was attached as an image (not reproduced here).
Here is the second version, which uses Tkinter:
import pandas as pd
import numpy
import pylab as plt
from matplotlib.ticker import (MultipleLocator, FormatStrFormatter,
AutoMinorLocator)
def compare_months(df,hh):
    # open a new window on top of the parent window hh
    compmon = tk.Toplevel(hh)
    bins=[1,2,3,4,5,6,7,8,9,10,11,12,13]
    select1=pd.DataFrame()
    select2=pd.DataFrame()
    year1=1396
    year2=1395
    select1=df.loc[(df['yearj']==year1)]
    select2=df.loc[(df['yearj']==year2)]
    xr=select1['monthj'].values.tolist()
    yr=select2['monthj'].values.tolist()
    xr.sort(key=int)
    yr.sort(key=int)
    # embed a matplotlib figure in the Tkinter window
    f = Figure(figsize=(7,6), dpi=80)
    f.add_axes([0.15, 0.15,0.8,0.7])
    canvas = FigureCanvasTkAgg(f, master=compmon)
    canvas.get_tk_widget().grid(row=4, column=5, rowspan=8)
    p = f.gca()
    p.set_xlabel('month', fontsize = 10)
    p.set_ylabel('number of orders', fontsize = 10)
    p.hist(yr,bins,rwidth=0.8)
    p.hist(xr,bins,rwidth=0.8, alpha=0.6)
    p.xaxis.set_major_formatter(FormatStrFormatter('%d'))
    p.xaxis.set_major_locator(MultipleLocator(1))
but=Button(compmon, text="ok", command=lambda: compare_months(df,root))
but.grid(row=2,column=2)
The output was attached as an image (not reproduced here).
Why does this happen?
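Whatever the cause turns out to be, one thing that may make the two versions easier to compare is to put the bin edges on half-integers, so that each bar is centred on its month tick rather than sitting between two ticks. The sketch below uses made-up month numbers and the same object-oriented Figure API as the Tkinter version; it is an illustration, not a confirmed fix.

```python
import numpy as np
from matplotlib.figure import Figure

# Hypothetical month numbers standing in for select2['monthj'].
months = [1, 1, 1, 1, 2, 2, 2, 3, 3, 12]

# Edges at 0.5, 1.5, ..., 12.5: bin k covers month k and is centred on tick k.
edges = np.arange(1, 14) - 0.5

fig = Figure(figsize=(7, 6), dpi=80)
ax = fig.add_subplot(111)
counts, _, _ = ax.hist(months, bins=edges, rwidth=0.8)
ax.set_xticks(range(1, 13))
ax.set_xlabel("month")
ax.set_ylabel("number of orders")

print(counts)  # one count per month, January first
```

With edges like these, a bar that looks shifted relative to its tick would point to the data or the axis limits rather than to the bin placement.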

matplotlib code does not show any output

I am following this tutorial: https://matplotlib.org/users/image_tutorial.html
The code is this:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
img = mpimg.imread('1.jpg')
I am trying to output something, but when I execute the code, I get nothing... Shouldn't I get a matrix as output?
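For what it's worth, imread only returns the array; nothing is displayed unless the script prints it (an interactive prompt echoes a value only when it is the bare expression on a line, not when it is assigned). A minimal sketch, assuming a small generated PNG in place of '1.jpg' so that it runs anywhere:

```python
import os
import tempfile

import numpy as np
import matplotlib.image as mpimg

# Hypothetical stand-in for '1.jpg': write a small random RGB image to a temp PNG.
rng = np.random.default_rng(0)
path = os.path.join(tempfile.mkdtemp(), "example.png")
mpimg.imsave(path, rng.random((4, 6, 3)))

# imread returns a NumPy array but shows nothing by itself.
img = mpimg.imread(path)

# Printing makes the result visible when run as a script.
print(type(img))
print(img.shape)
print(img)
```

To see the picture itself rather than the numbers, the tutorial's next step is plt.imshow(img), followed by plt.show() when running outside a notebook.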

Add thousands comma separator to Seaborn Heatmap [duplicate]

I am trying to format my colorbar so that the numbers are shown with comma thousands separators. Any help would be greatly appreciated.
import numpy as np
import matplotlib.pyplot as plt
plt.matshow(np.array(([30000,8000],[12000,25000])))
plt.colorbar()
You can create and specify a custom formatter:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
comma_fmt = FuncFormatter(lambda x, p: format(int(x), ','))
plt.matshow(np.array(([30000,8000],[12000,25000])))
plt.colorbar(format=comma_fmt)
plt.show()