TypeError: must be str, not float - linear Regression

I am getting "TypeError: must be str, not float" from my linear regression after copying the code from a previous chart and just updating the variables. My dependencies and code are below. The error points to the (slope, intercept, ...) line. Any help is appreciated; I am fairly new to coding and cannot figure this one out.
import pandas as pd
import numpy as np
import requests
import time
import json
import random
import scipy.stats as st
from sklearn import datasets
from scipy.stats import linregress
from pprint import pprint
x_values = city_data.loc[city_data['Latitude']>=0]
y_values = city_data['Temperature']
(slope, intercept, rvalue, pvalue, stderr) = stats.linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope,2)) + "x + " + str(round(intercept,2))
plt.scatter(x_values, y_values, marker="o", facecolors="green", edgecolors="black",
s=30, alpha=0.75)
plt.plot(x_values,regress_values,"r-")
plt.annotate(line_eq,(20,36),fontsize=15,color="red")
plt.xlim(-50, 85)
plt.ylim(10,95 )
plt.title('City Northern Hemisphere Latitude vs Temperature (10/10/2020)')
plt.xlabel('Latitude')
plt.ylabel('Temperature (F)')
plt.show()

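x_values is assigned the whole filtered DataFrame (every column of city_data) rather than the Latitude column, so linregress does not receive a 1-D numeric array. Filter the DataFrame first and then select the column; y_values, taken from city_data afterwards, will then cover the same rows. Change this line to: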
city_data = city_data[city_data['Latitude']>=0]
x_values = city_data['Latitude']
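To see the fix end to end, here is a minimal sketch. The small city_data frame below is a made-up stand-in (its City column and all values are assumptions, not the asker's data), and linregress is called through the direct import from the question's import list:

```python
import pandas as pd
from scipy.stats import linregress

# Hypothetical stand-in for the asker's city_data (values invented for illustration)
city_data = pd.DataFrame({
    "City": ["a", "b", "c", "d", "e"],
    "Latitude": [-10.0, 0.0, 10.0, 20.0, 30.0],
    "Temperature": [80.0, 90.0, 85.0, 80.0, 75.0],
})

# Filter the rows once, then take both columns from the same filtered frame
northern = city_data[city_data["Latitude"] >= 0]
x_values = northern["Latitude"]
y_values = northern["Temperature"]

# Both inputs are now 1-D numeric Series of equal length
slope, intercept, rvalue, pvalue, stderr = linregress(x_values, y_values)
regress_values = x_values * slope + intercept
line_eq = "y = " + str(round(slope, 2)) + "x + " + str(round(intercept, 2))
print(line_eq)
```

Keeping the filtered frame under its own name (northern here) avoids overwriting city_data, which may matter if a southern hemisphere chart is built from the same data later.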

Related

how to display netcdf raster values over map?

I'm trying to plot netcdf raster values of snowfall data as text, overlaid on what I currently have (shown further below). For example, something like this:
Example
This is all the relevant code I have so far; I excluded the non-relevant code. I tried plt.text and it raised "ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()"
What I have plotted so far
import numpy
from datetime import datetime
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import cartopy.mpl.ticker as cticker
import matplotlib.pyplot as plt
from matplotlib import ticker, patheffects
from metpy.units import units
import numpy as np
import numpy.ma as ma
from scipy.ndimage import gaussian_filter, maximum_filter, minimum_filter
import xarray as xr
from metpy.plots import USCOUNTIES
from gradient import Gradient
import pandas as pd
import matplotlib.colors as col
#Open NOAA Snowfall dataset
ds = xr.open_dataset('sfav2_CONUS_2021093012_to_2022042512.nc')
ds
lat = ds.lat
lon = ds.lon
#converts snowfall data to inches
snowdata = ds['Data'] * 39
plt.text(lon, lat, snowdata, transform=datacrs)

As far as I know there isn't a vectorized way of plotting text (plt.text or plt.annotate), so you'll have to loop over the arrays and place each label individually.
import matplotlib.pyplot as plt
import matplotlib.patheffects as PathEffects
import cartopy.crs as ccrs
import numpy as np
data = np.random.rand(18, 9)
lons, lats = np.mgrid[-17:18:2, 8:-9:-2]
lons = lons * 10
lats = lats * 10
fig, ax = plt.subplots(figsize=(10, 5), dpi=86, facecolor="w", subplot_kw=dict(projection=ccrs.EqualEarth()))
ax.pcolormesh(lons, lats, data, cmap="coolwarm", alpha=.2, transform=ccrs.PlateCarree())
ax.coastlines()
for val, lat, lon in zip(data.flat, lats.flat, lons.flat):
    ax.text(
        lon, lat, f"{val:1.1f}", ha="center", va="center", transform=ccrs.PlateCarree(),
        path_effects=[PathEffects.withStroke(linewidth=3, foreground="w", alpha=.5)],
    )
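A real snowfall grid is likely far denser than this 18 x 9 example, so labelling every cell would probably be unreadable and slow. A hedged sketch of one way around that is to label only every Nth cell and skip missing values. The grid, the step of 5 and the variable names below are assumptions for illustration, and plain Matplotlib axes are used so the snippet runs without cartopy; on a cartopy axis you would pass transform=... to ax.text as in the loop above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical stand-in for the snowfall grid (1-D lon/lat, 2-D values in inches)
lon = np.linspace(-100.0, -90.0, 21)
lat = np.linspace(35.0, 45.0, 11)
snow = np.random.default_rng(0).random((lat.size, lon.size)) * 30.0
snow[0, 0] = np.nan  # pretend one cell has no data

step = 5  # label every 5th cell in each direction (an arbitrary choice)
fig, ax = plt.subplots()
ax.pcolormesh(lon, lat, snow, shading="auto")

labels = []
for i in range(0, lat.size, step):
    for j in range(0, lon.size, step):
        value = snow[i, j]
        if np.isnan(value):
            continue  # skip cells without data
        labels.append(
            ax.text(lon[j], lat[i], f"{value:.1f}", ha="center", va="center")
        )
```

Each ax.text call receives scalars, which avoids the "truth value of an array" error raised when whole arrays are passed.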

Difference in bins distribution between Matplotlib & Holoviews

ALL software version info
Python 3.7.4;
On iMac (21.5-inch, 2017);
Using IDLE.
Description of expected behavior and the observed behavior
Problem: Matplotlib and HoloViews produce different bin distributions for the same data.
Complete, minimal, self-contained example code that reproduces the issue
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_wine
wine = load_wine()
print("Feature Names : ", wine.feature_names)
print("\nTarget Names : ", wine.target_names)
wine_df = pd.DataFrame(wine.data, columns = wine.feature_names)
wine_df["Target"] = wine.target
wine_df["Target"] = ["Class_1" if typ==0 else "Class_2" if typ==1 else "Class_3" for typ in wine_df["Target"]]
print("\nDataset Size : ", wine_df.shape)
print(wine_df.head())
Target1=wine_df.query('Target == "Class_1"')
Target2=wine_df.query('Target == "Class_2"')
Target3=wine_df.query('Target == "Class_3"')
x = Target1['proline']
y = Target2['proline']
z = Target3['proline']
plt.hist(x, bins=20,histtype='bar',color='blue',alpha=0.7,label='Class_1')
plt.hist(y, bins=20,histtype='bar',color='red',alpha=0.7,label='Class_2')
plt.hist(z, bins=20,histtype='bar',color='orange',alpha=0.7,label='Class_3')
plt.xlabel('proline')
plt.ylabel('Frequency')
plt.title('Malic Acid Distribution')
plt.legend(frameon=False)
plt.tight_layout()
plt.savefig("Test", dpi=300)
plt.show()
import holoviews as hv
hv.extension('bokeh')
from bokeh.plotting import show
from holoviews import dim, opts
import hvplot.pandas
hist=wine_df.hvplot.hist(y="proline", by="Target", width=600, height=400, ylim=(0,16), alpha=0.7, bins=20, ylabel="Frequency", title="Malic Acid Distribution")
show(hv.render(hist))
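One possible explanation, offered as an assumption rather than a confirmed diagnosis: each plt.hist call above computes its 20 bins over that class's own min..max range, whereas a grouped histogram may use one set of bin edges shared by all classes. The sketch below uses invented sample data (not the wine dataset) to show how the edges differ and how to build shared edges with NumPy:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-ins for the three per-class proline columns
x = rng.normal(1100, 200, 59)
y = rng.normal(520, 150, 71)
z = rng.normal(630, 110, 48)

# Separate calls: each class gets 20 bins over its own range
_, edges_x = np.histogram(x, bins=20)
_, edges_y = np.histogram(y, bins=20)

# Shared edges: 20 bins over the combined range of all classes
shared_edges = np.histogram_bin_edges(np.concatenate([x, y, z]), bins=20)
counts_x, _ = np.histogram(x, bins=shared_edges)
```

Passing bins=shared_edges to each of the three plt.hist calls would make the Matplotlib bars line up on common bins, which can then be compared with the hvplot output.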

Simple logistic regression with Statsmodels: Adding an intercept and visualizing the logistic regression equation

Using Statsmodels, I am trying to generate a simple logistic regression model to predict whether a person smokes or not (Smoke) based on their height (Hgt).
I have a feeling that an intercept needs to be included into the logistic regression model but I am not sure how to implement one using the add_constant() function. Also, I am unsure why the error below is generated.
This is the dataset, Pulse.CSV: https://drive.google.com/file/d/1FdUK9p4Dub4NXsc-zHrYI-AGEEBkX98V/view?usp=sharing
The full code and output are in this PDF file: https://drive.google.com/file/d/1kHlrAjiU7QvFXF2a7tlTSFPgfpq9bOXJ/view?usp=sharing
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
raw_data = pd.read_csv('Pulse.csv')
raw_data
x1 = raw_data['Hgt']
y = raw_data['Smoke']
reg_log = sm.Logit(y,x1,missing='Drop')
results_log = reg_log.fit()
def f(x,b0,b1):
    return np.array(np.exp(b0+x*b1) / (1 + np.exp(b0+x*b1)))
f_sorted = np.sort(f(x1,results_log.params[0],results_log.params[1]))
x_sorted = np.sort(np.array(x1))
plt.scatter(x1,y,color='C0')
plt.xlabel('Hgt', fontsize = 20)
plt.ylabel('Smoked', fontsize = 20)
plt.plot(x_sorted,f_sorted,color='C8')
plt.show()
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
~/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
4729 try:
-> 4730 return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
4731 except KeyError as e1:
((( Truncated for brevity )))
IndexError: index out of bounds

sm.Logit does not add an intercept by default; if you need one, include it manually with sm.add_constant().
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
raw_data = pd.read_csv('Pulse.csv')
raw_data
x1 = raw_data['Hgt']
y = raw_data['Smoke']
x1 = sm.add_constant(x1)
reg_log = sm.Logit(y,x1,missing='drop')  # documented options are 'none', 'drop' and 'raise'
results_log = reg_log.fit()
results_log.summary()
def f(x,b0,b1):
    return np.array(np.exp(b0+x*b1) / (1 + np.exp(b0+x*b1)))
# x1 now has two columns ('const' and 'Hgt'), so use the Hgt column for the curve
x_sorted = np.sort(np.array(x1['Hgt']))
f_sorted = f(x_sorted,results_log.params['const'],results_log.params['Hgt'])
plt.scatter(x1['Hgt'],y,color='C0')
plt.xlabel('Hgt', fontsize = 20)
plt.ylabel('Smoked', fontsize = 20)
plt.plot(x_sorted,f_sorted,color='C8')
plt.show()
This also resolves the error: without an intercept the model has only one parameter, so results_log.params[1] does not exist.

How can I convert Arduino signal from Python to Fast Fourier transform?

I'm trying to apply a Fast Fourier transform to a signal read from an Arduino in Python and draw a graph of the result. I have a problem with len() here. How can I fix this? Does anyone have other ideas about computing the Fast Fourier transform?
Exception has occurred: TypeError
object of type 'method' has no len()
That is my problem.
from PyQt5.QtWidgets import*
from matplotlib.backends.backend_qt5agg import FigureCanvasQTAgg as FigureCanvas
from matplotlib.figure import Figure
import matplotlib.pyplot as plt
import random
from PyQt5 import QtCore, QtGui, QtWidgets
import datetime
import serial
import time
import random
import numpy as np
from matplotlib import animation
from collections import deque
import threading
x = 0
value = [0]
ser = serial.Serial('com5', 9600)
class scope :
    def data(self) :
        if ser.readable() :
            time.sleep(0.01)
            reciving = ser.readline(ser.inWaiting())
            str = reciving.decode()
            if len(str) > 0 :
                if str[:1] == 'X' :
                    value[0] = str[1:]
                    #print(float(value[5]))
                    time.sleep(0.5)
                    x = float(value[0])
                    return x
s = scope()
n = len(s.data)
Ts = 0.01
Fs = 1/Ts
# length of the signal
k = np.arange(n)
T = n/Fs
freq = k/T # two sides frequency range
freq = freq[range(int(n/2))] # one side frequency range
Y = np.fft.fft(x)/n # fft computing and normalization
Y = Y[range(int(n/2))]
fig, ax = plt.subplots(2, 1)
ax.plot(freq, abs(Y), 'r', linestyle=' ', marker='^')
ax.set_xlabel('Freq (Hz)')
ax.set_ylabel('|Y(freq)|')
#3ax.vlines(freq, [0], abs(Y))
ax.grid(True)
t = threading.Thread(target= s.data)
t.daemon = True
t.start()
plt.show()
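The TypeError comes from len(s.data): without parentheses, s.data is the method object itself, which has no length. Calling s.data() would not help either, because it returns a single float, and an FFT needs a whole sequence of samples. A hedged sketch of the FFT step, assuming n readings have already been collected into an array at a fixed sample period Ts (the 5 Hz sine below is a made-up stand-in for the serial readings):

```python
import numpy as np

Ts = 0.01          # sample period in seconds, as in the question
Fs = 1 / Ts        # sampling frequency in Hz
n = 200            # number of collected samples (an assumed value)

# Hypothetical stand-in for n readings from the serial port: a 5 Hz sine
t = np.arange(n) * Ts
samples = np.sin(2 * np.pi * 5.0 * t)

# One-sided spectrum: rfft returns only the non-negative frequencies
freq = np.fft.rfftfreq(n, d=Ts)
Y = np.abs(np.fft.rfft(samples)) / n

peak = freq[np.argmax(Y)]
print(peak)
```

In the real program, samples would be filled by calling s.data() n times and appending each returned value to a list before the transform is computed.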

Matplotlib double legend

With my code I get the same equation twice in the legend, and I don't know why. I want only one entry. How can I do that? The equation is the line-fit result for part of the data below.
Thanks in advance!
import matplotlib.pyplot as plt
import numpy as np
import plotly.plotly as py
import plotly.tools as tls
from sympy import S, symbols
import sympy
y = [2.7,2.3,1.9,1.5,1.3,1.0,0.8,0.6,0.5,0.4,0.2,0.1,0.0,0.0,-0.20,-0.2]
y = [i*10**(-16) for i in y]
x = [0,0.05,0.10,0.15,0.20,0.25,0.30,0.40,0.45,0.50,0.55,0.60,0.65,0.70,0.75,0.80]
e_y = [10**(-17)]* 16
e_x = [0.001] * 16
fig= plt.figure()
ax = fig.add_subplot(111)
ax.errorbar(x,y, yerr=e_y,xerr=0.001,fmt='-o')
ax.set_title('Current vs. Potential')
ax.set_xlabel('Retarding Potential')
ax.set_ylabel('Photocell Current')
x=x[:7]
y=y[:7]
e_y=e_y[:7]
e_x=e_x[:7]
#line fit:
fit=np.polyfit(x,y,1)
fit_fn = np.poly1d(fit)
a=symbols("x")
line = sum(S(format(v))*a**i for i, v in enumerate(fit[::-1]))
eq_latex = sympy.printing.latex(line)
plt.plot(x,y,x,fit_fn(x),label="${}$".format(eq_latex))
plt.legend(fontsize='small')
plt.show()

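I solved this by plotting without a label and adding a single legend entry as a patch:
import matplotlib.patches as mpatches
plt.plot(x,y,x,fit_fn(x))
eqn = mpatches.Patch(color='green',label="${}$".format(eq_latex))
plt.legend(handles=[eqn])
instead of
plt.plot(x,y,x,fit_fn(x),label="${}$".format(eq_latex))
plt.legend(fontsize='small')
The original call draws two lines (x vs. y and x vs. fit_fn(x)) in one plt.plot, and the label keyword is applied to both, which is why the equation appeared twice.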
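An alternative sketch that avoids the patch is to plot the two lines in separate calls and label only the fitted one. The label text "linear fit" is a placeholder chosen here; the question's LaTeX equation string could be passed instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# First seven points from the question's data
x = np.array([0, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30])
y = np.array([2.7, 2.3, 1.9, 1.5, 1.3, 1.0, 0.8]) * 1e-16
fit_fn = np.poly1d(np.polyfit(x, y, 1))

fig, ax = plt.subplots()
ax.plot(x, y, "o")                          # data points, no label
ax.plot(x, fit_fn(x), label="linear fit")   # only this line gets a legend entry
legend = ax.legend(fontsize="small")
```

Because the legend entry comes from the fitted line itself, its color and style in the legend match what is drawn.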
