I am receiving a really unhelpful error message, 'TypeError: narray.fields require', when doing the following:
I have a pandas data frame which I have converted to a numpy array using
df.as_matrix()
This gives the numpy array "npArrayIN" with shape (3, 10).
I then need to create a feature class - here is the call to the arcpy function which has the list of 10 fields I want to create but which crashes returning the error. All numbers are floating point.
arcpy.da.NumPyArrayToFeatureClass(npArrayIN, outputShape, ("TID","X","Y","Z","H","D","WGS84Lat","WGS84Long","OFFSETA", "OFFSETB"), spRef)
Any suggestions gratefully received.
Thanks
Have you tried it with the "X","Y","Z" as the 1st three columns instead of leading it with "TID"?
Also, you may want to try it with only the xyz columns.
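One more thing worth checking, offered as an assumption about the cause rather than a confirmed diagnosis: arcpy.da.NumPyArrayToFeatureClass expects a structured NumPy array with named fields, whereas df.as_matrix() returns a plain 2-D array with no field names. A minimal sketch of building a structured array from the DataFrame with to_records follows. The sample values are made up, and the arcpy call is left as a comment because it needs an ArcGIS installation and has not been run here.

```python
import numpy as np
import pandas as pd

# Made-up sample values; only the column names come from the question.
columns = ["TID", "X", "Y", "Z", "H", "D",
           "WGS84Lat", "WGS84Long", "OFFSETA", "OFFSETB"]
df = pd.DataFrame(np.arange(30, dtype=float).reshape(3, 10), columns=columns)

# to_records(index=False) yields a record array with one named field per column.
records = df.to_records(index=False)
print(records.dtype.names)

# Assumed call (not run here, requires arcpy). The third argument is assumed
# to name only the fields that hold the geometry, not every field:
# arcpy.da.NumPyArrayToFeatureClass(records, outputShape, ("X", "Y", "Z"), spRef)
```

If that assumption holds, the remaining columns would be carried over as attribute fields from the array's own field names.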
I have a big data dataframe and I want to write it to disk for quick retrieval. I believe to_hdf(...) infers the data type of the columns and sometimes gets it wrong. I wonder what the correct way is to cope with this.
import pandas as pd
import numpy as np
length = 10
df = pd.DataFrame({"a": np.random.randint(1e7, 1e8, length),})
# df.loc[1, "a"] = "abc"
# df["a"] = df["a"].astype(str)
print(df.dtypes)
df.to_hdf("df.hdf5", key="data", format="table")
Uncommenting various lines leads me to the following.
Just filling the column with numbers leads to dtype int32, and the data is stored without problems.
Setting one element to abc changes the data to object, but it seems that to_hdf internally infers another data type and throws an error: TypeError: object of type 'int' has no len()
Explicitly converting the column to str leads to success, and to_hdf stores the data.
Now I am wondering what is happening in the second case, and whether there is a way to prevent it. The only way I found was to go through all columns, check whether they are dtype('O'), and explicitly convert them to str.
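The workaround described above (convert every object column to str before storing) could be sketched roughly like this. The helper name stringify_object_columns and the sample data are hypothetical, and the to_hdf call is left out so the snippet runs without PyTables.

```python
import pandas as pd

def stringify_object_columns(frame):
    """Return a copy in which every object-dtype column is converted to str."""
    result = frame.copy()
    for name in result.select_dtypes(include="object").columns:
        result[name] = result[name].astype(str)
    return result

# A mixed column like the one produced by setting one element to "abc".
df = pd.DataFrame({"a": [12345678, "abc", 87654321], "b": [1.0, 2.0, 3.0]})
clean = stringify_object_columns(df)
print(clean["a"].tolist())
```

Numeric columns are left untouched, so only the columns that would otherwise be mixed are turned into strings.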
Instead of using hdf5, I have found a generic pickling library which seems to be perfect for the job: joblib
Storing and loading data is straightforward:
import joblib
joblib.dump(df, "file.jl")
df2 = joblib.load("file.jl")
I am giving myself an intro to plotting data and have come across some trouble. I am working on a line chart that I plan on making animated as soon as I figure out this problem.
I want a graph that looks like this:
However this code I have now:
`x=df_pre_2003['year']
y=df_pre_2003['nAllNeonic']
trace=go.Scatter(
x=x,
y=y
)
data=[trace]
ply.plot(data, filename='test.html')`
is giving me this:
So I added y=df_pre_2003['nAllNeonic'].sum()
but, now it says ValueError:
Invalid value of type 'builtins.float' received for the 'y' property of scatter
Received value: 1133180.4000000006
The 'y' property is an array that may be specified as a tuple,
list, numpy array, or pandas Series
I tried that and it still did not work. The data type of year is int64 and that of nAllNeonic is float64.
It looks like you have to sort the values first based on the date. Now it's connecting a value in the year 1997 with a value in 1994.
df_pre_2003 = df_pre_2003.sort_values(by=['year'])
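A small sketch of the sorting step, using made-up rows in place of the real data. It also shows a per-year sum with groupby, on the assumption (not confirmed in the question) that there are several rows per year and one point per year is wanted. Unlike a bare .sum(), which returns a single float, the grouped result is still array-like and so is accepted as y.

```python
import pandas as pd

# Made-up rows, deliberately out of chronological order.
df_pre_2003 = pd.DataFrame({
    "year": [1997, 1994, 1995, 1994],
    "nAllNeonic": [3.0, 1.0, 2.0, 0.5],
})

# sort_values returns a new DataFrame, so keep the result.
df_sorted = df_pre_2003.sort_values(by="year")

# One value per year: a Series indexed by year (groupby sorts the keys).
per_year = df_pre_2003.groupby("year")["nAllNeonic"].sum()

x = per_year.index    # years, ascending
y = per_year.values   # summed values as a numpy array
print(per_year)
```

Here x and y can be passed to go.Scatter in place of the raw columns.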
This does not answer the question, but I am sharing my similar case for anyone researching this in the future:
In my case the error message appeared when I tried to pass Django model objects to a plotly scatter chart, and the error was as follows:
The 'x' property is an array that may be specified as a tuple, list, numpy array, or pandas Series
The solution in my case was to export the Django model data into a pandas data frame and then use the data frame columns instead of the model field names.
My question is related to this, but I can't get that solution to work and didn't want to add my own scenario to the old question.
I have a 2D float numpy array, am running python 3.5.1 with numpy 1.10.4, and am trying to write out the array with
numpy.savetxt(filename, arrayname, delimiter=',')
which works beautifully with a 1D array.
I've tried the solution from the referenced post
with open(filename, 'ab') as f:
    numpy.savetxt(f, arrayname, delimiter=',')
to no avail. Actually, I've tried this without the delimiter, with 'w', 'wb' and 'a', and with formatting arguments, and I always get the same error message:
TypeError: Mismatch between array dtype ('float64') and format specifier.
I need to write this 2D array to a file which will later be read into a pandas DataFrame (I have been using read_csv). I understand this may be an issue with numpy.savetxt, so I'm looking for an alternative.
Please try a minimal example and post the result, since the following works for me:
import numpy as np
array1=np.array([[1,2],[3,4]])
np.savetxt('file1.txt', array1 , delimiter = ',')
file content:
1.000000000000000000e+00,2.000000000000000000e+00
3.000000000000000000e+00,4.000000000000000000e+00
I had the same error message - until I finally realized that the type of my output actually was a list, not a numpy array!
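Following on from the point about the object's type, here is a minimal sketch that converts the data explicitly, checks that it really is two-dimensional, writes it with np.savetxt, and reads it back with pandas. The values and the temporary file location are assumptions for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# A list of lists standing in for the real data (assumed values).
rows = [[1.5, 2.5], [3.5, 4.5]]

# Convert explicitly and verify the shape before writing.
array = np.asarray(rows, dtype=float)
assert array.ndim == 2

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "file1.txt")
    np.savetxt(path, array, delimiter=",")
    # header=None because savetxt writes no header row by default.
    loaded = pd.read_csv(path, header=None)

print(loaded)
```

If np.asarray(..., dtype=float) itself raises, the data is probably ragged or contains non-numeric entries, which is worth ruling out before blaming savetxt.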
I got the following error while using the NumPy argmax method. Could someone help me understand what happened?
import numpy as np
b = np.zeros(1, dtype={'names':['a','b'], 'formats': ['i4']*2})
b.argmax()
The error is
TypeError: expected a readable buffer object
While the following runs without a problem:
a = np.zeros(3)
a.argmax()
It seems the error is due to the structured array, but could anyone help explain the reason?
Your b is:
array([(0, 0)], dtype=[('a', '<i4'), ('b', '<i4')])
I get a different error message with argmax:
TypeError: Cannot cast array data from dtype([('a', '<i4'), ('b', '<i4')]) to dtype('V8') according to the rule 'safe'
But this works:
In [88]: b['a'].argmax()
Out[88]: 0
Generally you can't do math operations across the fields of a structured array. You can operate within each field (if it is numeric). Since the fields could be a mix of numbers, strings and other objects, there's been no effort to handle special cases where such operations might make sense.
If you really must do operations across the fields, try a different view, e.g.:
In [94]: b.view('<i4').argmax()
Out[94]: 0
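To make the two approaches a little more concrete, here is a sketch with made-up non-zero values. The reshape after the view is an addition for illustration, and it relies on every field having the same dtype with no padding between them.

```python
import numpy as np

b = np.zeros(3, dtype={"names": ["a", "b"], "formats": ["i4"] * 2})
b["a"] = [1, 5, 2]
b["b"] = [7, 0, 3]

# Within a single field, argmax works as usual.
idx_a = b["a"].argmax()

# Reinterpret the records as a plain (rows, fields) int32 array.
plain = b.view(np.int32).reshape(len(b), -1)

# argmax over everything returns an index into the flattened array;
# unravel_index turns it back into (row, field position).
row, col = np.unravel_index(plain.argmax(), plain.shape)
print(idx_a, row, col)
```

With mixed field types such a view would not be meaningful, which is why the per-field form is usually the safer one.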
I was printing one list of values in Python, when I got this:
[ 0.00020885 0.00021386 0.0002141 ..., 0.0501399 0.12051606
0.12359095]
What is the problem here? The list should have at least size 20. What happened to the elements shown as ...?
The problem is that you are not printing a Python list, but a NumPy array. NumPy output can be configured using numpy.set_printoptions().
Data types matter. If you wonder about the behaviour of some object, first check its type.
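As a short illustration (the array contents are made up): NumPy abbreviates the printed form once an array has more elements than the print threshold, which is 1000 by default. The sketch below checks the type, raises the threshold temporarily with the np.printoptions context manager, and shows that converting to a real list is another option.

```python
import sys

import numpy as np

values = np.linspace(0.0, 1.0, 2000)
print(type(values))          # check the type first

# More elements than the default threshold, so the text is abbreviated.
summarized = str(values)

# Temporarily raise the threshold so every element is printed.
with np.printoptions(threshold=sys.maxsize):
    full = str(values)

# A plain Python list is never abbreviated when printed.
as_list = values.tolist()
```

np.set_printoptions(threshold=sys.maxsize) has the same effect but changes the setting for the rest of the session.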