Can't figure out what syntax error is when trying to convert Int to Categorical - pandas

I have a variable (Bad Indicator) which is also my target. It is currently int64 and I am trying to convert it to categorical, but I am getting a syntax error and can't figure out why. Converting DTI from object to float worked.
import numpy as np
from scipy import stats
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
project = pd.read_csv('c:/users/Brandon Thomas/Project.csv')
project.dtypes
90+ Perf int64
Bankruptcy Code float64
Bad Indicator int64
RR Downgrade object
Beacon Range object
Product Grouping object
Final Product Grouping object
dtype: object
project["DTI"] = pd.to_numeric(project.DTI, errors='coerce')
project['Bad Indicator']=pd.categorical(project.Bad Indicator)
File "<ipython-input-166-b6f5f0432024>", line 2
project['Bad Indicator']=pd.categorical(project.Bad Indicator)
^
SyntaxError: invalid syntax
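A likely cause, sketched on a small made-up frame (the values below are assumptions, not the poster's file): dot access such as project.Bad Indicator cannot contain a space, so the column has to be selected with brackets, and the pandas constructor is spelled pd.Categorical with a capital C.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; only the column name comes from the question.
project = pd.DataFrame({"Bad Indicator": [0, 1, 0, 1, 1]})

# Bracket access handles column names with spaces; dot access does not.
project["Bad Indicator"] = pd.Categorical(project["Bad Indicator"])

print(project.dtypes)
```

project["Bad Indicator"].astype("category") should give the same result if a method call is preferred.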

Related

Finding FFT gives KeyError: 'Aligned ' in pandas

I have time series data.
I am trying to compute its FFT, but I get KeyError: 'Aligned ' when trying to get the values.
My data looks like below
This is the code:
import datetime
import numpy as np
import scipy as sp
import scipy.fftpack
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
temp_fft = sp.fftpack.fft(data3)
It looks like your data is a pandas Series. fft works with numpy arrays rather than Series.
An easy fix is to convert your Series into a numpy array, either via
data3.values
or
np.array(data3)
You can then pass that array into the fft function, so the end result is:
temp_fft = sp.fftpack.fft(data3.values)
This should work for you now.
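The conversion can be tried on a made-up signal. This is only a sketch: the sine wave and the shifted index are assumptions, and numpy.fft.fft is swapped in for scipy.fftpack.fft so the example needs nothing beyond numpy and pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical signal: a 5-cycle sine sampled 64 times, stored in a Series
# whose index does not start at 0 (as can happen after filtering a frame).
n = 64
t = np.arange(n) / n
data3 = pd.Series(np.sin(2 * np.pi * 5 * t), index=np.arange(100, 100 + n))

# Strip the index before transforming; .to_numpy() is equivalent to .values here.
spectrum = np.fft.fft(data3.to_numpy())

# The strongest bin in the first half of the spectrum is the sine's frequency.
peak_bin = int(np.argmax(np.abs(spectrum[: n // 2])))
print(peak_bin)
```

Passing a plain array means the transform only sees positions, never index labels.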

System of second order ode

I am trying to solve an IVP consisting of two second-order ODEs. I use sympy. I have no problem obtaining the general solution, but I don't understand how to plug in the ICs.
MWE:
from sympy import *
import numpy as np
import matplotlib.pyplot as plt
x1,x2=symbols('x1,x2',cls=Function)
t,k1,k2,m1,m2=symbols('t k1 k2 m1 m2',real=True)
k1=1.0
k2=1.0
m1=1.0
m2=1.0
eq1=Eq(m1*diff(x1(t),t,2)+k1*x1(t)-k2*(x2(t)-x1(t)),0)
eq2=Eq(m2*diff(x2(t),t,2)+k2*(x2(t)-x1(t)),0)
eqs=[eq1,eq2]
IC1={x1(0):0,x1(t).diff().subs(t,0):0}
IC2={x2(0):1,x2(t).diff().subs(t,0):0}
sol=dsolve(eqs,[x1(t),x2(t)],ics=[IC1,IC2])
print(sol)
Error message:
for funcarg, value in ics.items():
AttributeError: 'list' object has no attribute 'items'
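The traceback suggests that dsolve expects ics to be a single dict rather than a list of dicts. A minimal sketch of that format follows, using one simplified oscillator x'' + x = 0 instead of the coupled system above (that simplification is an assumption made to keep the example short).

```python
from sympy import Eq, Function, cos, dsolve, simplify, symbols

t = symbols("t", real=True)
x = Function("x")

# Simplified single oscillator x'' + x = 0 (not the system from the question).
ode = Eq(x(t).diff(t, 2) + x(t), 0)

# All initial conditions go into ONE dict: position and velocity at t = 0.
ics = {x(0): 1, x(t).diff(t).subs(t, 0): 0}

sol = dsolve(ode, x(t), ics=ics)

# With x(0) = 1 and x'(0) = 0 the solution reduces to cos(t).
residual = simplify(sol.rhs - cos(t))
print(sol, residual)
```

For the system in the question, the analogous step would be to merge the two dicts, for example ics={**IC1, **IC2}; whether dsolve then solves that particular system is not checked here.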

Seaborn hexbin plot with marginal distributions for datetime64[ns] and category variables

I'm loading a spreadsheet from Excel into a dataframe. In this table, I am only interested in two columns. The first column is the date and time in the format %Y-%m-%d %H-%M-%S. The second column is a categorical variable, namely the type of violation (for example, being late).
There are only a few types of violations, about 6-7 in total.
Using df.info() you can confirm that the dataframe has the datetime64[ns] type for the date and time column and the category type for the column with the types of violations.
I would like to use a hexbin plot with marginal distributions from the seaborn library (https://seaborn.pydata.org/examples/hexbin_marginals.html). However, the simple code available at the link above is not so simple for a variable with categories and time.
import seaborn as sns
sns.set_theme(style="ticks")
sns.jointplot(x=df['incident'], y=df['date-time'], kind="hex", color="#4CB391")
Python raises TypeError: The x variable is categorical, but one of ['numeric', 'datetime'] is required
I understand that either a numeric variable or a date-time variable is needed for the ordinate axis. Conversion does not solve the problem.
This error can be reproduced using
import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime
ndf = pd.DataFrame({'date-time': ['2021-11-15 00:10:00','2021-11-15 00:20:00'], 'incident': ['a','b']})
print(ndf)
sns.set_theme(style="ticks")
sns.jointplot(data=ndf, x='incident', y='date-time', color="#4CB391", hue=ndf['incident'] )
plt.show()
Question: how do I get a plot that looks like the seaborn example?
Based on the example cited in the question as the desired graph style, replace the x-axis data with date/time data, convert it to the numeric date format that matplotlib can handle, and use it for the x-axis. The tick labels are produced by converting those numbers back to date strings. Since the date and time labels overlap, the text is rotated. Also, please judge for yourself the points that @JohanC pointed out.
import numpy as np
import seaborn as sns
sns.set_theme(style="ticks")
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import pandas as pd
rs = np.random.RandomState(11)
x = rs.gamma(2, size=1000)
date_rng = pd.date_range('2021-11-15', '2021-11-16', freq='1min')
y = -.5 * x + rs.normal(size=1000)
g = sns.jointplot(x=mdates.date2num(date_rng[:1000]), y=y, kind="hex", color="#4CB391")
g.ax_joint.set_xticklabels([mdates.num2date(d).strftime('%Y-%m-%d %H:%M:%S') for d in mdates.date2num(date_rng[:1000])])
for tick in g.ax_joint.get_xticklabels():
    tick.set_rotation(45)
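For the original two-column case (a category and a timestamp), one possible preparation step, not taken from the answer above, is to turn both columns into numbers before plotting. The column names minutes and incident_code are made up for this sketch.

```python
import pandas as pd

# Same two-row frame as in the question's reproduction.
ndf = pd.DataFrame({
    "date-time": ["2021-11-15 00:10:00", "2021-11-15 00:20:00"],
    "incident": ["a", "b"],
})

# Parse the strings into datetime64[ns] values.
ndf["date-time"] = pd.to_datetime(ndf["date-time"])

# Minutes elapsed since the first record: a plain float axis.
elapsed = ndf["date-time"] - ndf["date-time"].min()
ndf["minutes"] = elapsed.dt.total_seconds() / 60

# One integer code per category, with the labels kept for tick text.
incident_cat = ndf["incident"].astype("category")
ndf["incident_code"] = incident_cat.cat.codes
labels = list(incident_cat.cat.categories)

print(ndf[["minutes", "incident_code"]])
```

The numeric columns could then be passed as x and y to sns.jointplot with kind="hex", using labels to relabel the category axis. Whether hex bins are meaningful for 6-7 discrete categories is a separate question.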

Pandas dataframe accessing and timeseries

import pandas as pd
import numpy as np
from datetime import datetime
import matplotlib.pyplot as plt
a = pd.read_excel(r'C:\Users\user\Desktop\INTERNSHIP\SOLAR_DADRI\data.xlsx',
                  index_col='Date/ Time')
pd.to_datetime(a.index)
a['Date/ Time']
I am getting a KeyError when I try to access a column that is there. The spacing is right. I am also getting an error that a string cannot be converted to float when I simply do plot(ts), where ts is the time (the x axis).
Also, when I check the type of the index, it says
pandas.indexes.base.Index
When you use the index_col parameter, that column is moved into the index. In a single-index dataframe you access what was in this column using a.index in your example. If you call a.reset_index() to move the index back into the columns, you should then be able to access that column using a['Date/ Time'].
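The behaviour described above can be reproduced without the Excel file. In this sketch the rows and the Power column are invented; only the column name 'Date/ Time' comes from the question, and set_index stands in for index_col.

```python
import pandas as pd

# Hypothetical rows; only the column name 'Date/ Time' comes from the question.
raw = pd.DataFrame({
    "Date/ Time": ["2016-06-01 10:00", "2016-06-01 11:00"],
    "Power": [1.5, 2.0],
})

# Same effect as read_excel(..., index_col='Date/ Time').
a = raw.set_index("Date/ Time")
in_columns_before = "Date/ Time" in a.columns  # the column now lives in the index

# pd.to_datetime returns a new index; assign it back for it to take effect.
a.index = pd.to_datetime(a.index)

# reset_index() moves the index back into an ordinary column.
b = a.reset_index()
in_columns_after = "Date/ Time" in b.columns

print(in_columns_before, in_columns_after)
```

Assigning the result of pd.to_datetime back to a.index may also help with the string-to-float error when plotting, since the index then holds real timestamps instead of strings.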

TypeError when setting y-axis range in pyplot

I am getting a TypeError like this for the following code.
TypeError: ufunc 'isfinite' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''
import numpy as np
import matplotlib.pyplot as plt
import math
plt.axis([1,5000,100,10**20])
plt.xscale('log')
plt.yscale('log')
plt.savefig('test.png')
plt.close()
When I set the maximum value for y-axis as 10^19, there is no error.
But from 10^20, I start getting the TypeError described above.
Could you help me understand this error and set the y-axis range from 1 to 10^20?
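A likely explanation, sketched with NumPy alone: the Python integer 10**20 does not fit in a 64-bit integer, so NumPy stores it as a generic object that isfinite cannot handle, while 10**19 still fits in an unsigned 64-bit integer. Whether a given matplotlib version reaches this code path is an assumption here, and the variable names are illustrative.

```python
import numpy as np

big_int = 10**20    # too large for int64 and uint64
big_float = 1e20    # an ordinary 64-bit float

int_dtype = np.asarray(big_int).dtype      # falls back to object
float_dtype = np.asarray(big_float).dtype  # float64

# isfinite has no implementation for object arrays, hence the TypeError.
try:
    np.isfinite(np.asarray(big_int))
    int_ok = True
except TypeError:
    int_ok = False

float_ok = bool(np.isfinite(big_float))
print(int_dtype, float_dtype, int_ok, float_ok)
```

Under that reading, writing the limit as a float, for example plt.axis([1, 5000, 100, 1e20]), should avoid the error.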