Linear regression with TensorFlow on Google Colab - tensorflow

I am trying to code a linear regression, but I am stuck on this cell, as it returns an error and I don't understand how to correct it. I would appreciate some detailed feedback on how to change my code to avoid this.
Here is the cell that raises the error:
error = test_predictions - test_labels
plt.hist(error, bins = 25)
plt.xlabel("Prediction Error [MPG]")
_ = plt.ylabel("Count")
After that I get:
/usr/local/lib/python3.7/dist-packages/matplotlib/axes/_axes.py:6630: RuntimeWarning: All-NaN slice encountered
xmin = min(xmin, np.nanmin(xi))
/usr/local/lib/python3.7/dist-packages/matplotlib/axes/_axes.py:6631: RuntimeWarning: All-NaN slice encountered
xmax = max(xmax, np.nanmax(xi))
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-30-c4d487a4d6e3> in <module>()
1 error = test_predictions - test_labels
----> 2 plt.hist(error, bins = 25)
3 plt.xlabel("Prediction Error [MPG]")
4 _ = plt.ylabel("Count")
5 frames
<__array_function__ internals> in histogram(*args, **kwargs)
/usr/local/lib/python3.7/dist-packages/numpy/lib/histograms.py in _get_outer_edges(a, range)
322 if not (np.isfinite(first_edge) and np.isfinite(last_edge)):
323 raise ValueError(
--> 324 "autodetected range of [{}, {}] is not finite".format(first_edge, last_edge))
325
326 # expand empty range to avoid divide by zero
ValueError: autodetected range of [nan, nan] is not finite
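The [nan, nan] range means that every value in error is NaN, so the problem is upstream of the plot: test_predictions or test_labels already contains NaN (for example because the dataset still has NaN rows, or the model diverged during training). A quick diagnostic, assuming test_predictions came from model.predict(...) as in the rest of the notebook (a sketch, not part of the original cell):
import numpy as np

# If either operand contains NaN, the autodetected histogram range is [nan, nan].
print("NaNs in predictions:", np.isnan(np.asarray(test_predictions)).sum())
print("NaNs in labels:", np.isnan(np.asarray(test_labels)).sum())

# Plotting only the finite values is a stop-gap; the real fix is to remove the
# NaNs from the data (or from training) before predicting. flatten() guards
# against a (n, 1) prediction shape broadcasting against (n,) labels.
error = np.asarray(test_predictions).flatten() - np.asarray(test_labels)
plt.hist(error[np.isfinite(error)], bins=25)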

Related

Market Basket Analysis Association_rules - ValueError: cannot call `vectorize` on size 0 inputs unless `otypes` is set

I am currently running a market basket analysis on my dataset.
When I run association_rules I get an error:
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
rules.head()
ValueError Traceback (most recent call last)
<ipython-input-47-60252dc62442> in <module>
----> 1 rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
2 rules.head()
3 frames
/usr/local/lib/python3.8/dist-packages/numpy/lib/function_base.py in _get_ufunc_and_otypes(self, func, args)
2195 args = [asarray(arg) for arg in args]
2196 if builtins.any(arg.size == 0 for arg in args):
-> 2197 raise ValueError('cannot call `vectorize` on size 0 inputs '
2198 'unless `otypes` is set')
2199
ValueError: cannot call `vectorize` on size 0 inputs unless `otypes` is set
There are a lot of 0's in my dataset; I am currently looking into this to see if it is affecting my results.
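Independently of the 0's, this particular ValueError usually means that association_rules received an empty frequent_itemsets frame, typically because min_support in apriori was set too high, so there is nothing for it to vectorize over. A quick check, assuming frequent_itemsets was produced by mlxtend's apriori from a one-hot encoded DataFrame called basket (the variable name is a placeholder):
from mlxtend.frequent_patterns import apriori, association_rules

print(len(frequent_itemsets))  # 0 rows here reproduces the size-0 vectorize error

# Lowering min_support until some itemsets survive avoids handing an empty
# frame to association_rules.
frequent_itemsets = apriori(basket, min_support=0.01, use_colnames=True)
if not frequent_itemsets.empty:
    rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1)
    print(rules.head())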

ValueError: could not broadcast input array from shape (16,18,3) into shape (16)

I was trying to instance-segment my RGB images using the pixellib library. However, I ran into a problem with the segmentImage function. From the stack trace, I found the issue within __init__.py, and I have no idea why it needs to broadcast from 3D arrays to 1D. 20 images from another folder I tried earlier didn't run into any of these issues.
P.S. This is my first question on Stack Overflow. If I've missed any necessary details, please let me know.
for file in os.listdir(test_path):
    abs_test_path = os.path.join(test_path, file)
    if file.endswith('.jpg'):
        filename = os.path.splitext(file)[0]
        if os.path.isfile(abs_test_path):
            out_path = out_seg_path + filename
            segment_image.segmentImage(abs_test_path, show_bboxes=True,
                                       save_extracted_objects=True,
                                       extract_segmented_objects=True)
            im_0 = cv2.imread('segmented_object_1.jpg')
            cv2.imwrite(out_path + '_1.jpg', im_0)
            im_1 = cv2.imread('segmented_object_2.jpg')
            cv2.imwrite(out_path + '_2.jpg', im_1)
This is my error
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-835299843033> in <module>
15
16 segment_image.segmentImage(abs_test_path, show_bboxes=True,
---> 17 save_extracted_objects=True, extract_segmented_objects=True)
18
19 # segment_image.segmentImage('segmented_object_1.jpg', show_bboxes=True, output_image_name=out_path + '_1.jpg',
~\anaconda3\envs\mask_rcnn\lib\site-packages\pixellib\instance\__init__.py in segmentImage(self, image_path, show_bboxes, extract_segmented_objects, save_extracted_objects, mask_points_values, output_image_name, text_thickness, text_size, box_thickness, verbose)
762 cv2.imwrite(save_path, extracted_objects)
763
--> 764 extracted_objects = np.array(ex, dtype=object)
765
766 if mask_points_values == True:
ValueError: could not broadcast input array from shape (16,18,3) into shape (16)
There isn't enough information to help you.
I don't know what segment_image.segmentImage is, or what it expects. And I don't have your jpg file to test.
I have an idea of why the problem line raises this error, but since it occurs in an unknown function I can't suggest any fixes.
extracted_objects = np.array(ex, dtype=object)
ex probably is a list of arrays, arrays that match in some dimensions but not others. It's trying to make an object dtype array of those arrays, but due to the mix of shapes it raises an error.
A simple example that raises the same error:
In [151]: ex = [np.ones((3, 4, 3)), np.ones((3, 5, 3))]
In [152]: np.array(ex, object)
Traceback (most recent call last):
Input In [152] in <module>
np.array(ex, object)
ValueError: could not broadcast input array from shape (3,4,3) into shape (3,)
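For what it's worth, the way NumPy itself can wrap such a ragged list into an object array is to pre-allocate the array and fill it element by element; that change would have to happen inside pixellib, not in your loop (a sketch of the NumPy behaviour, not a pixellib fix):
import numpy as np

ex = [np.ones((3, 4, 3)), np.ones((3, 5, 3))]

# np.array(ex, dtype=object) tries to broadcast the sub-arrays and fails;
# assigning into a pre-allocated object array does not.
out = np.empty(len(ex), dtype=object)
for i, arr in enumerate(ex):
    out[i] = arr

print(out.shape)     # (2,)
print(out[0].shape)  # (3, 4, 3)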

tf.keras.layers.Concatenate() works with a list but fails on a tuple of tensors

This will work:
tf.keras.layers.Concatenate()([features['a'], features['b']])
While this:
tf.keras.layers.Concatenate()((features['a'], features['b']))
Results in:
TypeError: int() argument must be a string or a number, not 'TensorShapeV1'
Is that expected? If so, why does it matter what kind of sequence I pass?
Thanks,
Zach
EDIT (adding a code example):
import pandas as pd
import numpy as np
import tensorflow as tf

data = {
    'a': [1.0, 2.0, 3.0],
    'b': [0.1, 0.3, 0.2],
}

with tf.Session() as sess:
    ds = tf.data.Dataset.from_tensor_slices(data)
    ds = ds.batch(1)
    it = ds.make_one_shot_iterator()
    features = it.get_next()
    concat = tf.keras.layers.Concatenate()((features['a'], features['b']))
    try:
        while True:
            print(sess.run(concat))
    except tf.errors.OutOfRangeError:
        pass
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-135-0e1a45017941> in <module>()
6 features = it.get_next()
7
----> 8 concat = tf.keras.layers.Concatenate()((features['a'], features['b']))
9
10
google3/third_party/tensorflow/python/keras/engine/base_layer.py in __call__(self, inputs, *args, **kwargs)
751 # the user has manually overwritten the build method do we need to
752 # build it.
--> 753 self.build(input_shapes)
754 # We must set self.built since user defined build functions are not
755 # constrained to set self.built.
google3/third_party/tensorflow/python/keras/utils/tf_utils.py in wrapper(instance, input_shape)
148 tuple(tensor_shape.TensorShape(x).as_list()) for x in input_shape]
149 else:
--> 150 input_shape = tuple(tensor_shape.TensorShape(input_shape).as_list())
151 output_shape = fn(instance, input_shape)
152 if output_shape is not None:
google3/third_party/tensorflow/python/framework/tensor_shape.py in __init__(self, dims)
688 else:
689 # Got a list of dimensions
--> 690 self._dims = [as_dimension(d) for d in dims_iter]
691
692 @property
google3/third_party/tensorflow/python/framework/tensor_shape.py in as_dimension(value)
630 return value
631 else:
--> 632 return Dimension(value)
633
634
google3/third_party/tensorflow/python/framework/tensor_shape.py in __init__(self, value)
183 raise TypeError("Cannot convert %s to Dimension" % value)
184 else:
--> 185 self._value = int(value)
186 if (not isinstance(value, compat.bytes_or_text_types) and
187 self._value != value):
TypeError: int() argument must be a string or a number, not 'TensorShapeV1'
https://github.com/keras-team/keras/blob/master/keras/layers/merge.py#L329
The comment on the Concatenate class states that it requires a list.
This class calls the backend's concatenate function:
https://github.com/keras-team/keras/blob/master/keras/backend/tensorflow_backend.py#L2041
which also states that it requires a list.
In TensorFlow, https://github.com/tensorflow/tensorflow/blob/r1.12/tensorflow/python/ops/array_ops.py#L1034
also states that it requires a list of tensors. Why? I don't know. In this function the tensors (the variable called "values") actually get checked for being a list or a tuple, but somewhere along the way you still get an error.
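Given that every layer in that chain documents a list, the practical workaround is to convert the tuple before handing it to the layer; a minimal sketch using the question's features dict (an illustration, not a change to Keras itself):
# Passing a list instead of a tuple avoids the shape-inference branch that
# tries to turn the whole tuple into a single TensorShape.
concat = tf.keras.layers.Concatenate()(list((features['a'], features['b'])))
# or simply:
concat = tf.keras.layers.Concatenate()([features['a'], features['b']])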

Convert labels(int) into one-hot vectors for tensorflow

Kindly help me to resolve my error. Thanks.
This is my Python code (the shape of Y is (199584, 1) and its data type is int):
num_labels = len(np.unique(Y))
simulated_labels = np.eye(num_labels)[Y] # One liner trick!
print simulated_labels
Error:
IndexError Traceback (most recent call last)
in ()
1 num_labels = len(np.unique(Y)) # unique labels 681
2 print num_labels
----> 3 simulated_labels = np.eye(num_labels)[Y] # One liner trick!
4 print simulated_labels
5
IndexError: index 1001 is out of bounds for axis 0 with size 681
You can use tf.one_hot (there are examples in the docstring).
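The IndexError itself comes from the label values rather than from the one-hot trick: Y has 681 unique labels, but their values are not the consecutive integers 0..680 (at least one is 1001), so they cannot be used directly as row indices into np.eye(681). A sketch of both options, remapping to dense indices for the NumPy one-liner or using tf.one_hot as suggested above (variable names follow the question):
import numpy as np
import tensorflow as tf

# Map the raw labels onto dense indices 0..num_labels-1 before indexing np.eye.
unique_labels, dense_Y = np.unique(Y.ravel(), return_inverse=True)
num_labels = len(unique_labels)                 # 681
simulated_labels = np.eye(num_labels)[dense_Y]  # shape (199584, 681)

# Equivalent one-hot encoding with TensorFlow:
one_hot = tf.one_hot(dense_Y, depth=num_labels)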

TypeError using sns.distplot() on dataframe with one row

I'm plotting subsets of a dataframe, and one subset happens to have only one row. This is the only reason I can think of for why it's causing problems. This is what it looks like:
problem_dataframe = prob_df[prob_df['Date']==7]
problem_dataframe.head()
I try to do:
sns.distplot(problem_dataframe['floatTime'])
But I get the error:
TypeError: len() of unsized object
Would someone please tell me what's causing this and how to work around it?
The TypeError is resolved by setting bins=1.
But that uncovers a different error, ValueError: x must be 1D or 2D, which gets triggered by an internal function in Matplotlib's hist(), called _normalize_input():
import pandas as pd
import seaborn as sns
df = pd.DataFrame(['Tue','Feb',7,'15:37:58',2017,15.6196]).T
df.columns = ['Day','Month','Date','Time','Year','floatTime']
sns.distplot(df.floatTime, bins=1)
Output:
ValueError Traceback (most recent call last)
<ipython-input-25-858df405d200> in <module>()
6 df.columns = ['Day','Month','Date','Time','Year','floatTime']
7 df.floatTime.values.astype(float)
----> 8 sns.distplot(df.floatTime, bins=1)
/home/andrew/anaconda3/lib/python3.6/site-packages/seaborn/distributions.py in distplot(a, bins, hist, kde, rug, fit, hist_kws, kde_kws, rug_kws, fit_kws, color, vertical, norm_hist, axlabel, label, ax)
213 hist_color = hist_kws.pop("color", color)
214 ax.hist(a, bins, orientation=orientation,
--> 215 color=hist_color, **hist_kws)
216 if hist_color != color:
217 hist_kws["color"] = hist_color
/home/andrew/anaconda3/lib/python3.6/site-packages/matplotlib/__init__.py in inner(ax, *args, **kwargs)
1890 warnings.warn(msg % (label_namer, func.__name__),
1891 RuntimeWarning, stacklevel=2)
-> 1892 return func(ax, *args, **kwargs)
1893 pre_doc = inner.__doc__
1894 if pre_doc is None:
/home/andrew/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
6141 x = np.array([[]])
6142 else:
-> 6143 x = _normalize_input(x, 'x')
6144 nx = len(x) # number of datasets
6145
/home/andrew/anaconda3/lib/python3.6/site-packages/matplotlib/axes/_axes.py in _normalize_input(inp, ename)
6080 else:
6081 raise ValueError(
-> 6082 "{ename} must be 1D or 2D".format(ename=ename))
6083 if inp.shape[1] < inp.shape[0]:
6084 warnings.warn(
ValueError: x must be 1D or 2D
_normalize_input() was removed from Matplotlib (it looks like sometime last year), so I guess Seaborn is referring to an older version under the hood.
You can see _normalize_input() in this old commit:
def _normalize_input(inp, ename='input'):
    """Normalize 1 or 2d input into list of np.ndarray or
    a single 2D np.ndarray.
    Parameters
    ----------
    inp : iterable
    ename : str, optional
        Name to use in ValueError if `inp` can not be normalized
    """
    if (isinstance(x, np.ndarray) or
            not iterable(cbook.safe_first_element(inp))):
        # TODO: support masked arrays;
        inp = np.asarray(inp)
        if inp.ndim == 2:
            # 2-D input with columns as datasets; switch to rows
            inp = inp.T
        elif inp.ndim == 1:
            # new view, single row
            inp = inp.reshape(1, inp.shape[0])
        else:
            raise ValueError(
                "{ename} must be 1D or 2D".format(ename=ename))
    ...
I can't figure out why inp.ndim!=1, though. Performing the same np.asarray().ndim on the input returns 1 as expected:
np.asarray(df.floatTime).ndim # 1
So you're facing a few obstacles if you want to make a single-valued input work with sns.distplot().
Suggested Workaround
Check for a single-element df.floatTime, and if that's the case, just use plt.hist() instead (which is what distplot goes to anyway, along with KDE):
plt.hist(df.floatTime)
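A minimal sketch of that check applied to the original subset (assuming matplotlib.pyplot is imported as plt):
import matplotlib.pyplot as plt
import seaborn as sns

values = problem_dataframe['floatTime'].astype(float)

# distplot chokes on a single observation, so fall back to a plain histogram.
if len(values) > 1:
    sns.distplot(values)
else:
    plt.hist(values)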