I'm trying to export a pandas dataframe with df.to_csv(), which should be easy enough. Unfortunately, this code:
df.to_csv(r'C:/Users/my/path/to/file.csv', index=False, encoding='utf-8')
Gives me this error:
AttributeError: '_io.BufferedReader' object has no attribute 'to_csv'
What am I doing wrong? I'm working in a Jupyter notebook on a Mac, in case that's important. Sorry for such a noob question; I know this should be super easy.
I googled similar issues where attribute so-and-so is missing, but none of the ones I found solved my problem.
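The error message says that df is an _io.BufferedReader, which is the type that Python's built-in open(..., 'rb') returns, so the name df was probably bound to a file handle at some point instead of a DataFrame. Below is a minimal sketch of that situation and one way out of it, assuming the handle points at CSV data. The temporary file and its columns are made up for illustration.

```python
import os
import tempfile

import pandas as pd

# Hypothetical input file, created only so the sketch is self-contained.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "input.csv")
with open(src, "w", encoding="utf-8") as fh:
    fh.write("a,b\n1,2\n3,4\n")

# Binding a name to a binary file handle reproduces the reported type.
handle = open(src, "rb")
handle_type = type(handle).__name__      # the class named in the error
has_to_csv = hasattr(handle, "to_csv")   # file handles have no to_csv
handle.close()

# Parsing the file into a DataFrame first makes to_csv available.
df = pd.read_csv(src)
dst = os.path.join(tmp_dir, "output.csv")
df.to_csv(dst, index=False, encoding="utf-8")
```

Running type(df) in the notebook just before the failing line is a quick way to confirm whether this is what happened.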
I am trying to set up a template for plotting my test data. I am pretty new to this in Python, and although I have already googled my question quite a lot, nothing I found helped. I have an Excel table with data in two columns, which I want to plot against each other. My code looks as follows:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
file = 'C:/Documents/Test/test_file.xlsx'
df1 = pd.read_excel(file, sheet_name='sheet1', header=0, engine="openpyxl")
plt.figure()
sns.lineplot(data=df1, x="eps", y="sigma", sort=False, linewidth=0.8)
The Excel file has, as mentioned, a header with eps and sigma as the x and y values. The values that follow are floats; when I check the data types with df1.dtypes, the result is 'float64'. Does anyone have an idea what is not working? I get the error 'ufunc 'isfinite' not supported for the input types'.
Plotting data from Excel with pandas and seaborn against each other and saving the image.
This might be a library issue. I've been running into the same problem with example datasets and even with a call as simple as:
sns.lineplot(x=[1], y=[1])
I'll update if I find a solution.
Edit: There seems to be an issue with NumPy that is causing this problem in Seaborn. The solution is to downgrade NumPy to 1.23 until 1.24.1 is released.
https://github.com/mwaskom/seaborn/issues/3192
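A small guard like the following can flag the affected release before plotting. It is only a sketch: the helper name is made up, and it assumes, per the edit above, that 1.24.0 is the only problematic NumPy version.

```python
import re


def numpy_needs_pinning(version: str) -> bool:
    """Return True if `version` is the NumPy release blamed above (1.24.0)."""
    # Read the leading "major.minor[.patch]" part of the version string.
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        return False
    major, minor, patch = (int(g) if g else 0 for g in match.groups())
    return (major, minor, patch) == (1, 24, 0)


# Usage (assumes NumPy is installed):
#   import numpy as np
#   if numpy_needs_pinning(np.__version__):
#       print("Consider: pip install 'numpy<1.24'")
```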
I first swap in Modin for pandas to distribute the work over multiple cores:
import modin.pandas as pd
from modin.config import Engine
Engine.put("dask")
After initializing my dataframe, I attempt to use:
df['bins'] = pd.cut(df[column],300)
I get this error:
TypeError: ('Could not serialize object of type function.', '<function PandasDataframe._build_mapreduce_func.<locals>._map_reduce_func at 0x7fbe78580680>')
Would be glad to get help.
I can't seem to get Modin to perform the way I expected out of the box.
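For reference, this is what the same call does in plain pandas. It can help confirm that the binning itself is fine and that the failure comes from serializing the task for the Dask engine. The data and the column name are made up, and nothing here is confirmed as a Modin fix; whether falling back to plain pandas for this one step is acceptable depends on how large the data is.

```python
import numpy as np
import pandas  # plain pandas, deliberately not modin.pandas

# Hypothetical data standing in for the real dataframe.
rng = np.random.default_rng(0)
df = pandas.DataFrame({"value": rng.normal(size=10_000)})

# Split the column's range into 300 equal-width intervals.
df["bins"] = pandas.cut(df["value"], 300)

# Number of interval categories that were created.
n_intervals = len(df["bins"].cat.categories)
```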
The code only raises the error when I use SciPy's fftpack on my data (from Excel).
Plotting my data normally has worked just fine. I have heard suggestions to turn the data into an array, but I have tried this and it did not work.
I'm working through an exercise in https://www.oreilly.com/library/view/hands-on-machine-learning/9781492032632/ and am finding unexpected behavior on my computer when I fetch a dataset. The following code returns
numpy.ndarray
on the author's Google Colab page, but returns
pandas.core.frame.DataFrame
on my local Jupyter notebook. As far as I know, my environment uses the exact same library versions as the author's. I can easily convert the data to a NumPy array, but since I'm using this book as a guide for novices, I'd like to know what could be causing this discrepancy.
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784', version=1)
mnist.keys()
type(mnist['data'])
The author's Google Colab notebook is at the following link; scroll down to the "MNIST" heading. Thanks!
https://colab.research.google.com/github/ageron/handson-ml2/blob/master/03_classification.ipynb#scrollTo=LjZxzwOs2Q2P.
Just to close off this question, the comment by Ben Reiniger, namely to add as_frame=False, is correct. For example:
mnist = fetch_openml('mnist_784', version=1, as_frame=False)
The OP has already made this change to the Colab code in the link.
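The difference most likely comes from the scikit-learn version: starting with scikit-learn 0.24, fetch_openml defaults to as_frame='auto', which returns a DataFrame for a dataset like this one, while earlier releases defaulted to as_frame=False. If the data has already been loaded as a DataFrame, it can also be converted after the fact. The sketch below uses a small made-up frame in place of the real download.

```python
import numpy as np
import pandas as pd

# Small stand-in for mnist['data'] when it arrives as a DataFrame
# (the real one has 70,000 rows and 784 pixel columns).
data = pd.DataFrame(
    np.zeros((3, 784)),
    columns=[f"pixel{i}" for i in range(1, 785)],
)

# Same values as a plain ndarray, matching what the author's notebook returns.
X = data.to_numpy()
```

Passing as_frame=False avoids the conversion entirely, so it is the simpler choice when following the book.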
Can anyone tell me why this NumPy record is having trouble with Python's new-style string formatting? All floats in the record choke on "{:f}".format(record).
Thanks for your help!
In [334]: type(tmp)
Out[334]: numpy.core.records.record
In [335]: tmp
Out[335]: ('XYZZ', 2001123, -23.823917388916016)
In [336]: tmp.dtype
Out[336]: dtype([('sta', '|S6'), ('ondate', '<i8'), ('lat', '<f4')])
# Some formatting works fine
In [337]: '{0.sta:6.6s} {0.ondate:8d}'.format(tmp)
Out[337]: 'XYZZ 2001123'
# Any float has trouble
In [338]: '{0.sta:6.6s} {0.ondate:8d} {0.lat:11.6f}'.format(tmp)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
/Users/jkmacc/python/pisces/<ipython-input-338-e5f6bcc4f60f> in <module>()
----> 1 '{0.sta:6.6s} {0.ondate:8d} {0.lat:11.6f}'.format(tmp)
ValueError: Unknown format code 'f' for object of type 'str'
This question was answered on the NumPy user mailing list under "floats coerced to string with "{:f}".format() ?":
It seems that np.int64/32 and np.str inherit their respective native Python __format__(), but np.float32/64 doesn't get __builtin__.float.__format__(). That's not intuitive, but I see now why this works:
In [8]: '{:6.6s} {:8d} {:11.6f}'.format(tmp.sta, tmp.ondate, float(tmp.lat))
Out[8]: 'XYZZ 2001123 -23.820000'
Thanks!
-Jon
EDIT:
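np.float64 inherits from Python's native float (a C double) regardless of platform, whereas np.float32 does not, which is why the '<f4' field fails here. For integers (on Python 2), the NumPy type whose size matches the native int (int32 on a 32-bit system, int64 on most 64-bit systems) inherits from it. A mismatch will generate the same problem as the original post.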
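The inheritance point can be checked directly, and the float() cast from the answer extends to the other fields. This is a sketch with stand-in values for the record's fields; recent NumPy versions may format these scalars without the casts, so treat the casts as a defensive choice, not a requirement.

```python
import numpy as np

# Stand-ins for the fields of the record in the question.
sta = "XYZZ"
ondate = np.int64(2001123)
lat = np.float32(-23.823917388916016)

# np.float64 subclasses Python's float; np.float32 does not.
float64_is_float = issubclass(np.float64, float)
float32_is_float = issubclass(np.float32, float)

# Casting to native types sidesteps any difference in __format__ support.
line = "{:6.6s} {:8d} {:11.6f}".format(sta, int(ondate), float(lat))
```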