I'm trying to create a pipeline and fine-tune hyperparameters, but when I try to use fit, I get the error:
ValueError: Invalid parameter n_estimators for estimator Pipeline(steps=[('rfc', RandomForestClassifier())]). Check the list of available parameters with `estimator.get_params().keys()`.
I'd love to get some help with this please.
This is the code I'm using:
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV
#Classifier Pipeline
pipeline = Pipeline([
('rfc', RandomForestClassifier())
])
# Params for classifier
params = {'n_estimators': [5,20,50,100,150],
"max_depth": [1, 3,5,10,20,30,50],
"max_features": [1, 3,5,10,20,30,45],
"min_samples_split": [1, 3,5,10],
"min_samples_leaf": [1, 3,5,10]}
# Grid Search Execute
rf_rnd = RandomizedSearchCV(estimator=pipeline, param_distributions = params, cv=5, verbose=2, random_state=0, n_jobs=-1)
rf_rnd.fit(training_data, target)
Made some changes:
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline
from sklearn.model_selection import RandomizedSearchCV
#Classifier Pipeline
pipeline = Pipeline([
('rfc', RandomForestClassifier())
])
# Params for classifier
params = {'estimator__rfc__n_estimators': [5,20,50,100,150],
"estimator__rfc__max_depth": [1, 3,5,10,20,30,50],
"estimator__rfc__max_features": [1, 3,5,10,20,30,45],
"estimator__rfc__min_samples_split": [1, 3,5,10],
"estimator__rfc__min_samples_leaf": [1, 3,5,10]}
# Grid Search Execute
rf_rnd = RandomizedSearchCV(estimator=pipeline, param_distributions = params, cv=5, random_state=0, n_jobs=-1, return_train_score=True)
Now the error is:
ValueError: Invalid parameter estimator for estimator Pipeline(steps=[('rfc', RandomForestClassifier())]). Check the list of available parameters with estimator.get_params().keys().
And this is what estimator.get_params().keys() outputs:
dict_keys(['cv', 'error_score', 'estimator__memory', 'estimator__steps', 'estimator__verbose', 'estimator__rfc', 'estimator__rfc__bootstrap', 'estimator__rfc__ccp_alpha', 'estimator__rfc__class_weight', 'estimator__rfc__criterion', 'estimator__rfc__max_depth', 'estimator__rfc__max_features', 'estimator__rfc__max_leaf_nodes', 'estimator__rfc__max_samples', 'estimator__rfc__min_impurity_decrease', 'estimator__rfc__min_impurity_split', 'estimator__rfc__min_samples_leaf', 'estimator__rfc__min_samples_split', 'estimator__rfc__min_weight_fraction_leaf', 'estimator__rfc__n_estimators', 'estimator__rfc__n_jobs', 'estimator__rfc__oob_score', 'estimator__rfc__random_state', 'estimator__rfc__verbose', 'estimator__rfc__warm_start', 'estimator', 'iid', 'n_iter', 'n_jobs', 'param_distributions', 'pre_dispatch', 'random_state', 'refit', 'return_train_score', 'scoring', 'verbose'])
When you use a pipeline with any search (randomized or grid), the param_grid keys have to follow the syntax (step name)__(param name). So you should use rfc__n_estimators, and likewise for the other parameters.
Summarizing: just remove the estimator__ prefix that you added in your last modification.
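For reference, a minimal sketch of the corrected code (the only other change is starting min_samples_split at 2, since scikit-learn rejects an integer value of 1 for that parameter):
# Corrected parameter grid: keys use only the pipeline step name ('rfc') as a prefix.
# The 'estimator__' prefix only appears when inspecting the search object itself.
params = {'rfc__n_estimators': [5, 20, 50, 100, 150],
          'rfc__max_depth': [1, 3, 5, 10, 20, 30, 50],
          'rfc__max_features': [1, 3, 5, 10, 20, 30, 45],
          'rfc__min_samples_split': [2, 3, 5, 10],   # must be >= 2 when an integer
          'rfc__min_samples_leaf': [1, 3, 5, 10]}
rf_rnd = RandomizedSearchCV(estimator=pipeline, param_distributions=params,
                            cv=5, verbose=2, random_state=0, n_jobs=-1)
rf_rnd.fit(training_data, target)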
When loading my file I get this error; I don't see the problem.
import scipy.ndimage
import matplotsoccer

hm = matplotsoccer.count(x, y, n=25, m=25)  # Construct a 25x25 heatmap from x,y-coordinates
hm = scipy.ndimage.gaussian_filter(hm, 1)   # Blur the heatmap
matplotsoccer.heatmap(hm, cmap='Blues', figsize=20, linecolor="black", cbar=True)  # Plot the heatmap
I resorted to using the cloud training workflow. Given the model it produced, I would have expected to drop directly into the code I already have that works with other tflite models, but the cloud-produced model doesn't work: I get "index out of range" when asking for interpreter.get_tensor parameters.
Here is my code, basically a modified example, where I can ingest a video and produce a video with results.
import argparse
import cv2
import numpy as np
import os
import sys
import importlib.util
# Define and parse input arguments
parser = argparse.ArgumentParser()
parser.add_argument('--modeldir', help='Folder the .tflite file is located in',
required=True)
parser.add_argument('--graph', help='Name of the .tflite file, if different than detect.tflite',
default='model.tflite')
# default='/tmp/detect.tflite')
parser.add_argument('--labels', help='Name of the labelmap file, if different than labelmap.txt',
default='dict.txt')
# default='/tmp/coco_labels.txt')
parser.add_argument('--threshold', help='Minimum confidence threshold for displaying detected objects',
default=0.5)
parser.add_argument('--video', help='Name of the video file',
default='test.mp4')
parser.add_argument('--edgetpu', help='Use Coral Edge TPU Accelerator to speed up detection',
action='store_true')
args = parser.parse_args()
MODEL_NAME = args.modeldir
GRAPH_NAME = args.graph
LABELMAP_NAME = args.labels
VIDEO_NAME = args.video
min_conf_threshold = float(args.threshold)
use_TPU = args.edgetpu
# Import TensorFlow libraries
# If tensorflow is not installed, import interpreter from tflite_runtime, else import from regular tensorflow
# If using Coral Edge TPU, import the load_delegate library
pkg = importlib.util.find_spec('tensorflow')
pkg = True  # NOTE: this overrides the find_spec check and forces the regular TensorFlow import path below
if pkg is None:
    from tflite_runtime.interpreter import Interpreter
    if use_TPU:
        from tflite_runtime.interpreter import load_delegate
else:
    from tensorflow.lite.python.interpreter import Interpreter
    if use_TPU:
        from tensorflow.lite.python.interpreter import load_delegate
# If using Edge TPU, assign filename for Edge TPU model
if use_TPU:
    # If user has specified the name of the .tflite file, use that name, otherwise use default 'edgetpu.tflite'
    if (GRAPH_NAME == 'detect.tflite'):
        GRAPH_NAME = 'edgetpu.tflite'
# Get path to current working directory
CWD_PATH = os.getcwd()
# Path to video file
VIDEO_PATH = os.path.join(CWD_PATH,VIDEO_NAME)
# Path to .tflite file, which contains the model that is used for object detection
PATH_TO_CKPT = os.path.join(CWD_PATH,MODEL_NAME,GRAPH_NAME)
# Path to label map file
PATH_TO_LABELS = os.path.join(CWD_PATH,MODEL_NAME,LABELMAP_NAME)
# Load the label map
with open(PATH_TO_LABELS, 'r') as f:
    labels = [line.strip() for line in f.readlines()]
# Have to do a weird fix for label map if using the COCO "starter model" from
# https://www.tensorflow.org/lite/models/object_detection/overview
# First label is '???', which has to be removed.
if labels[0] == '???':
    del(labels[0])
# Load the Tensorflow Lite model.
# If using Edge TPU, use special load_delegate argument
if use_TPU:
    interpreter = Interpreter(model_path=PATH_TO_CKPT,
                              experimental_delegates=[load_delegate('libedgetpu.so.1.0')])
    print(PATH_TO_CKPT)
else:
    interpreter = Interpreter(model_path=PATH_TO_CKPT)
interpreter.allocate_tensors()
# Get model details
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()
height = input_details[0]['shape'][1]
width = input_details[0]['shape'][2]
floating_model = (input_details[0]['dtype'] == np.float32)
input_mean = 127.5
input_std = 127.5
# Open video file
video = cv2.VideoCapture(VIDEO_PATH)
imW = video.get(cv2.CAP_PROP_FRAME_WIDTH)
imH = video.get(cv2.CAP_PROP_FRAME_HEIGHT)
out = cv2.VideoWriter('output.avi', cv2.VideoWriter_fourcc(
'M', 'J', 'P', 'G'), 10, (1920, 1080))
while(video.isOpened()):
    # Acquire frame and resize to expected shape [1xHxWx3]
    ret, frame = video.read()
    frame_rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    frame_resized = cv2.resize(frame_rgb, (width, height))
    input_data = np.expand_dims(frame_resized, axis=0)
    # Normalize pixel values if using a floating model (i.e. if model is non-quantized)
    if floating_model:
        input_data = (np.float32(input_data) - input_mean) / input_std
    # Perform the actual detection by running the model with the image as input
    interpreter.set_tensor(input_details[0]['index'], input_data)
    interpreter.invoke()
    # Retrieve detection results
    boxes = interpreter.get_tensor(output_details[0]['index'])[0]   # Bounding box coordinates of detected objects
    classes = interpreter.get_tensor(output_details[1]['index'])[0] # Class index of detected objects
    scores = interpreter.get_tensor(output_details[2]['index'])[0]  # Confidence of detected objects
    print(boxes)
    print(classes)
    print(scores)
    #num = interpreter.get_tensor(output_details[3]['index'])[0]    # Total number of detected objects (inaccurate and not needed)
    # Loop over all detections and draw detection box if confidence is above minimum threshold
    for i in range(len(scores)):
        if ((scores[i] > min_conf_threshold) and (scores[i] <= 1.0)):
            # Get bounding box coordinates and draw box
            # Interpreter can return coordinates that are outside of image dimensions, need to force them to be within image using max() and min()
            ymin = int(max(1, (boxes[i][0] * imH)))
            xmin = int(max(1, (boxes[i][1] * imW)))
            ymax = int(min(imH, (boxes[i][2] * imH)))
            xmax = int(min(imW, (boxes[i][3] * imW)))
            cv2.rectangle(frame, (xmin, ymin), (xmax, ymax), (10, 255, 0), 4)
            # Draw label
            object_name = labels[int(classes[i])]  # Look up object name from "labels" array using class index
            label = '%s: %d%%' % (object_name, int(scores[i]*100))  # Example: 'person: 72%'
            labelSize, baseLine = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, 0.7, 2)  # Get font size
            label_ymin = max(ymin, labelSize[1] + 10)  # Make sure not to draw label too close to top of window
            cv2.rectangle(frame, (xmin, label_ymin-labelSize[1]-10), (xmin+labelSize[0],
                          label_ymin+baseLine-10), (255, 255, 255), cv2.FILLED)  # Draw white box to put label text in
            cv2.putText(frame, label, (xmin, label_ymin-7), cv2.FONT_HERSHEY_SIMPLEX,
                        0.7, (0, 0, 0), 2)  # Draw label text
    # All the results have been drawn on the frame, so it's time to display it.
    cv2.imshow('Object detector', frame)
    #output_rgb = cv2.cvtColor(frame, cv2.COLOR_RGB2BGR)
    out.write(frame)
    # Press 'q' to quit
    if cv2.waitKey(1) == ord('q'):
        break
# Clean up
video.release()
out.release()
cv2.destroyAllWindows()
Here is what the print statements should look like when using the canned tflite model:
[32. 76. 56. 76. 0. 61. 74. 0. 0. 0.]
[0.609375 0.48828125 0.44921875 0.44921875 0.4140625 0.40234375
0.37890625 0.3125 0.3125 0.3125 ]
[[-0.01923192 0.17330796 0.747546 0.8384144 ]
[ 0.01866053 0.5023282 0.39603746 0.6143299 ]
[ 0.01673795 0.47382414 0.34407628 0.5580931 ]
[ 0.11588445 0.78543806 0.8778869 1.0039229 ]
[ 0.8106107 0.70675755 1.0080075 0.89248717]
[ 0.84941524 0.06391776 1.0006479 0.28792098]
[ 0.05543692 0.53557926 0.40413857 0.62823087]
[ 0.07051808 -0.00938512 0.8822515 0.28100258]
[ 0.68205094 0.33990026 0.9940187 0.6020821 ]
[ 0.08010477 0.01998334 0.6011186 0.26135433]]
Here is the error when presented with the cloud created model:
File "tflite_vid.py", line 124, in <module>
classes = interpreter.get_tensor(output_details[1]['index'])[0] # Class index of detected objects
IndexError: list index out of range
So I would kindly ask that someone explain either how to develop a TFLite model with TF2 in Python, or how to get the cloud to generate a usable TFLite model. Please, oh please, do not point me in a direction that entails wandering through Internet examples unless they are the actual gospel on how to do this.
In output_details[1], the [1] is what is out of range. Your model may have only one output, but the code tries to access a second one.
For more about running TFLite inference from Python, see https://www.tensorflow.org/lite/guide/inference#load_and_run_a_model_in_python for guidance.
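As a quick check, here is a minimal sketch (illustrative only) that prints how many output tensors the model actually exposes before indexing into them:
output_details = interpreter.get_output_details()
print(len(output_details))  # a cloud-exported model may bundle results into fewer tensors
for i, detail in enumerate(output_details):
    # Each entry describes one output tensor: its name, shape and dtype
    print(i, detail['name'], detail['shape'], detail['dtype'])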
Image training works fine with ssd_mobilenet_v1_coco in the TensorFlow Object Detection API.
I am getting this error while testing:
File "/home/hipstudents/anaconda3/envs/tensorflow_gpuenv/lib/python3.6/site-packages/object_detection-0.1-py3.6.egg/object_detection/utils/object_detection_evaluation.py", line 203, in add_single_ground_truth_image_info
raise ValueError('Image with id {} already added.'.format(image_id))
Please help.
System Info:
What is the top-level directory of the model you are using: ~/
Have I written custom code (as opposed to using a stock example script provided in TensorFlow): Yes, written scripts to convert .xml files to tf record
OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
TensorFlow installed from (source or binary): Compiled from source
TensorFlow version (use command below): 1.11.0
Bazel version (if compiling from source): 0.16.1
CUDA/cuDNN version: 9.0.176, cuDNN: 9.0
GPU model and memory: GeForce GTX1080Ti, 11GB
Exact command to reproduce: python eval.py --logtostderr --pipeline_config_path=training/ssd_mobilenet_v1_coco.config --checkpoint_dir=training/ --eval_dir=eval/
I created the dataset manually, then labeled it using labelImg. After labeling, I created a CSV file with the image annotations and file names, then created the TFRecords. I followed this tutorial: https://towardsdatascience.com/how-to-train-your-own-object-detector-with-tensorflows-object-detector-api-bec72ecfe1d9
My TFRecord generator for the training and testing images:
"""
Usage:
# From tensorflow/models/
# Create train data:
python generate_tfrecord.py --csv_input=data/train_labels.csv --output_path=train.record
# Create test data:
python generate_tfrecord.py --csv_input=data/test_labels.csv --output_path=test.record
"""
from __future__ import division
from __future__ import print_function
from __future__ import absolute_import
import os
import io
import pandas as pd
import tensorflow as tf
from PIL import Image
from object_detection.utils import dataset_util
from collections import namedtuple, OrderedDict
flags = tf.app.flags
flags.DEFINE_string('csv_input', '', 'Path to the CSV input')
flags.DEFINE_string('output_path', '', 'Path to output TFRecord')
FLAGS = flags.FLAGS
# TO-DO replace this with label map
def class_text_to_int(row_label):
    if row_label == 'Field':
        return 1
    else:
        return None
def split(df, group):
    data = namedtuple('data', ['filename', 'object'])
    gb = df.groupby(group)
    return [data(filename, gb.get_group(x)) for filename, x in zip(gb.groups.keys(), gb.groups)]
def create_tf_example(group, path):
    with tf.gfile.GFile(os.path.join(path, '{}'.format(group.filename)), 'rb') as fid:
        encoded_jpg = fid.read()
    encoded_jpg_io = io.BytesIO(encoded_jpg)
    image = Image.open(encoded_jpg_io)
    width, height = image.size
    filename = group.filename.encode('utf8')
    image_format = b'jpg'
    xmins = []
    xmaxs = []
    ymins = []
    ymaxs = []
    classes_text = []
    classes = []
    for index, row in group.object.iterrows():
        xmins.append(row['xmin'] / width)
        xmaxs.append(row['xmax'] / width)
        ymins.append(row['ymin'] / height)
        ymaxs.append(row['ymax'] / height)
        classes_text.append(row['class'].encode('utf8'))
        classes.append(class_text_to_int(row['class']))
    tf_example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': dataset_util.int64_feature(height),
        'image/width': dataset_util.int64_feature(width),
        'image/filename': dataset_util.bytes_feature(filename),
        'image/source_id': dataset_util.bytes_feature(filename),
        'image/encoded': dataset_util.bytes_feature(encoded_jpg),
        'image/format': dataset_util.bytes_feature(image_format),
        'image/object/bbox/xmin': dataset_util.float_list_feature(xmins),
        'image/object/bbox/xmax': dataset_util.float_list_feature(xmaxs),
        'image/object/bbox/ymin': dataset_util.float_list_feature(ymins),
        'image/object/bbox/ymax': dataset_util.float_list_feature(ymaxs),
        'image/object/class/text': dataset_util.bytes_list_feature(classes_text),
        'image/object/class/label': dataset_util.int64_list_feature(classes),
    }))
    return tf_example
def main(_):
    writer = tf.python_io.TFRecordWriter(FLAGS.output_path)
    path = os.path.join(os.getcwd(), 'Images')
    examples = pd.read_csv(FLAGS.csv_input)
    grouped = split(examples, 'filename')
    for group in grouped:
        tf_example = create_tf_example(group, path)
        writer.write(tf_example.SerializeToString())
    writer.close()
    output_path = os.path.join(os.getcwd(), FLAGS.output_path)
    print('Successfully created the TFRecords: {}'.format(output_path))
if __name__ == '__main__':
    tf.app.run()
In the ssd_mobilenet_v1_coco.config file, num_examples was 8000, but my test dataset only has 121 samples. I forgot to update that and got a new kind of error that I couldn't find on the Internet. As it is a silly mistake, I think very few people run into it, but this answer might help someone who makes the same mistake. I changed the following in the config file and the error is resolved:
eval_config: {
  # num of test images. In my case 121. Previously it was 8000
  num_examples: 121
  # Note: The below line limits the evaluation process to 10 evaluations.
  # Remove the below line to evaluate indefinitely.
  max_evals: 10
}
In my case, the problem was that I'd included images multiple times when constructing tfrecord files. Though now obvious, I hadn't noticed that many categories of the Open Images Dataset share the same images (which would have the same id in the evaluation, thus the error...). Once I'd corrected the algorithm creating the tfrecords, the error was gone.
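As an illustration (not the poster's exact fix), a minimal sketch of that kind of guard, reusing the main() loop of the generate_tfrecord.py script above: skip any image that has already been written, so each image id ends up in the TFRecord only once.
# Illustrative only: de-duplicate by filename so each image id is added once.
seen = set()
for group in grouped:
    if group.filename in seen:
        # This image was already written; writing it again is what triggers the
        # "Image with id ... already added" error during evaluation.
        continue
    seen.add(group.filename)
    tf_example = create_tf_example(group, path)
    writer.write(tf_example.SerializeToString())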
I solved the problem with this article: https://www.coder.work/article/3120495
By just adding 2 lines:
eval_config {
  num_examples: 50
  use_moving_averages: false
  metrics_set: "coco_detection_metrics"
}
Scrapy version: 0.14
File: settings.py
EXTENSIONS = { 'myproject.extensions.MySQLManager': 500 }
File: pipeline.py
# -*- coding: utf-8 -*-
# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: doc.scrapy.org/topics/item-pipeline.html
from scrapy.project import extensions
from urlparse import urlparse
import re
class MySQLStorePipeline(object):
    def process_item(self, item, spider):
        if self.is_allowed_domain(item['url'], spider) is True:
            MySQLManager.cursor...  # cannot load MySQLManager
ImportError: cannot import name extensions
I can't find an extensions class in /scrapy/project.py.
I don't know what you are trying to import, but as far as I understand, if you want to "import" (i.e. enable) an extension, you should add some lines to the settings.py file of your Scrapy project.
Example:
EXTENSIONS = {
    'scrapy.contrib.corestats.CoreStats': 500,
    'scrapy.webservice.WebService': 500,
    'scrapy.myextension.myextension': 500,
}
You can read more about Scrapy extensions here
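For completeness, a minimal sketch of what such an extension class might look like, so that an entry like 'myproject.extensions.MySQLManager' in EXTENSIONS resolves to a real class. This is illustrative only and uses the from_crawler hook of more recent Scrapy versions; the 0.14 extension API differed.
# File: myproject/extensions.py (hypothetical)
from scrapy import signals

class MySQLManager(object):
    """Extension that could hold a shared MySQL connection for the project."""

    @classmethod
    def from_crawler(cls, crawler):
        ext = cls()
        # Connect to spider lifecycle signals so the connection can be opened and closed.
        crawler.signals.connect(ext.spider_opened, signal=signals.spider_opened)
        crawler.signals.connect(ext.spider_closed, signal=signals.spider_closed)
        return ext

    def spider_opened(self, spider):
        self.connection = None  # e.g. open the MySQL connection here

    def spider_closed(self, spider):
        pass  # e.g. close the MySQL connection here
Pipelines can then import MySQLManager directly from myproject.extensions rather than from scrapy.project.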