I have trained a CNN model and I am trying to stack its output layer onto an XGBoost regressor to reduce MAPE. I get an OOM error in a SageMaker training job when I try to combine the input data (in npy format) with the CNN output layer and save the result as CSV so that it can be fed to XGBoost. When I run this in a SageMaker notebook instance, the kernel dies. The training input npy file is around 42 GB, and I have tried these instances: ml.m5d.24xlarge, ml.r5.24xlarge.
Here is the code I am running in the notebook:
'''
import numpy as np
import tensorflow as tf
import boto3
from io import BytesIO
from urllib.parse import urlparse
from keras.models import load_model
from keras import backend as K

client = boto3.client("s3")
bucket = <bucket_name>
key = '/path/cnn_model.h5'
client.download_file(bucket, key, 'cnn_model.h5')
cnn_model = load_model("cnn_model.h5")

def read_s3_npy(s3_uri, arg=False):
    # Reads the whole S3 object into memory, then parses it as a .npy array
    parsed_s3 = urlparse(s3_uri)
    obj = client.get_object(Bucket=parsed_s3.netloc, Key=parsed_s3.path[1:])
    return np.load(BytesIO(obj['Body'].read()), allow_pickle=arg)

x_train_path = <path in s3>+'x_train.npy'
y_train_path = <path in s3>+'y_train.npy'
x_train = read_s3_npy(x_train_path)
y_train = read_s3_npy(y_train_path)

# Function mapping the model input to the output of the second-to-last layer
last_layer_op = K.function([cnn_model.layers[0].input], [cnn_model.layers[-2].output])
train_layer = last_layer_op([x_train, 1])[0]
'''
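One way to avoid holding the full 42 GB array in memory at once is to read it from a memory-mapped file and extract features batch by batch. The sketch below assumes the npy file has first been copied to local disk (for example with `client.download_file`) and was saved without pickling. The names `extract_in_batches` and `feature_fn` and the batch size are hypothetical, not part of the original code; in the notebook, `feature_fn` would be a small wrapper around `last_layer_op`.

```python
import numpy as np

def extract_in_batches(x_path, feature_fn, out_path, batch_size=1024):
    """Apply feature_fn to slices of a .npy file and write results to out_path."""
    # mmap_mode="r" maps the file instead of loading it all into RAM
    x = np.load(x_path, mmap_mode="r")
    n = x.shape[0]
    out = None
    for start in range(0, n, batch_size):
        # Only this slice is materialised in memory
        batch = np.asarray(x[start:start + batch_size])
        feats = np.asarray(feature_fn(batch))
        if out is None:
            # Allocate the output .npy on disk once the feature shape is known
            out = np.lib.format.open_memmap(
                out_path, mode="w+", dtype=feats.dtype,
                shape=(n,) + feats.shape[1:])
        out[start:start + len(feats)] = feats
    if out is not None:
        out.flush()
        del out
    return out_path
```

The output is written as another npy file rather than CSV, since a binary format is usually much smaller; whether that suits the XGBoost step is a separate design choice.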
I'm trying to use Hugging Face and TensorFlow to train a BERT model on some data. Here's my code:
First, I initialized the tokenizer.
from transformers import BertTokenizer
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased', sep_token = "||")
Then applied my tokenizer to my data.
def preprocess_function(x):
    return tokenizer(x, truncation = True, return_tensors = 'tf')['input_ids']
from tqdm import tqdm
tqdm.pandas()
df["Text"] = df["Text"].progress_apply(preprocess_function)
Then some more preprocessing:
df["intvwStatus"] = [0 if x == "Completed" else 1 for x in df["intvwStatus"]]
import numpy as np
train, validate, test = \
    np.split(df.sample(frac=1, random_state=42),
             [int(.6*len(df)), int(.8*len(df))])
Created an optimizer
from transformers import create_optimizer
import tensorflow as tf
batch_size = 16
num_epochs = 5
batches_per_epoch = len(train) // batch_size
total_train_steps = int(batches_per_epoch * num_epochs)
optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)
And then finally instantiated and compiled my model
from transformers import TFBertForSequenceClassification
model = TFBertForSequenceClassification.from_pretrained("bert-base-uncased")
import tensorflow as tf
model.compile(optimizer=optimizer)
Then fit my model
x_train = train["Text"]
y_train = train["intvwStatus"]
x_val = validate["Text"]
y_val = validate["intvwStatus"]
model.fit(x=x_train,y=y_train, validation_data=(x_val, y_val), epochs=3)
Which gives error:
ValueError: Failed to convert a NumPy array to a Tensor (Unsupported object type tensorflow.python.framework.ops.EagerTensor).
I'm confused. Why is it treating a tensorflow.python.framework.ops.EagerTensor as a NumPy array?
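The error is likely raised because each cell of `df["Text"]` holds its own `EagerTensor` of a different length, so the column is an object array that Keras cannot convert into a single tensor. A minimal sketch of the usual remedy, padding every sequence to a common length so the batch becomes one rectangular integer array, is shown below. `pad_token_ids` is a hypothetical helper, the default `pad_id=0` is an assumption, and the token IDs are made up for illustration.

```python
import numpy as np

def pad_token_ids(sequences, pad_id=0, max_len=None):
    """Stack variable-length token-ID lists into one rectangular int array."""
    if max_len is None:
        max_len = max(len(s) for s in sequences)
    ids = np.full((len(sequences), max_len), pad_id, dtype=np.int32)
    mask = np.zeros((len(sequences), max_len), dtype=np.int32)
    for row, seq in enumerate(sequences):
        seq = list(seq)[:max_len]  # truncate sequences longer than max_len
        ids[row, :len(seq)] = seq
        mask[row, :len(seq)] = 1   # 1 marks real tokens, 0 marks padding
    return ids, mask

# Two "sentences" of different lengths (illustrative IDs only)
ids, mask = pad_token_ids([[101, 7592, 102], [101, 102]])
```

In practice, calling the tokenizer once on the whole list of texts with `padding=True` should give the same kind of rectangular output, which can then be passed to `model.fit` instead of a pandas column of tensors.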
I have trained a multi-class image classification model based on MobileNetV2 (only a Dense layer was added), applied full integer quantization (INT8), and exported a model.tflite file. I call this model with tf.classify().
Here is my code to quantize it:
import tensorflow as tf
import numpy as np
import pathlib
def representative_dataset():
    for _ in range(100):
        data = np.random.rand(1, 96, 96, 3)  # random tensor for test
        yield [data.astype(np.float32)]
converter = tf.lite.TFLiteConverter.from_saved_model('saved_model/my_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
tflite_quant_model = converter.convert()
tflite_models_dir = pathlib.Path("/tmp/mnist_tflite_models/")
tflite_models_dir.mkdir(exist_ok=True, parents=True)
tflite_model_quant_file = tflite_models_dir/"mnist_model_quant.tflite"
tflite_model_quant_file.write_bytes(tflite_quant_model)
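Calibrating with random noise may produce activation ranges that do not match real images, which could plausibly contribute to poor predictions after quantization. A sketch of a representative dataset built from real, preprocessed images follows. `make_representative_dataset` and `calibration_images` are hypothetical names, and the zero array is only a stand-in for real data.

```python
import numpy as np

def make_representative_dataset(images, num_samples=100, seed=0):
    """Return a generator function that yields real samples, one per step."""
    rng = np.random.default_rng(seed)
    count = min(num_samples, len(images))
    indices = rng.choice(len(images), size=count, replace=False)

    def representative_dataset():
        for i in indices:
            # Batch dimension of 1, float32, same preprocessing as in training
            yield [images[i:i + 1].astype(np.float32)]

    return representative_dataset

# Stand-in array; in practice this would hold preprocessed training images.
calibration_images = np.zeros((8, 96, 96, 3), dtype=np.float32)
representative_dataset = make_representative_dataset(calibration_images)
```

The returned function could then be assigned to `converter.representative_dataset` in place of the random-data version.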
The accuracy of this model is quite good on the test set during training. However, when tested on OpenMV, the same label is output for all objects (although the probabilities differ slightly).
I looked up some material, and one source mentioned that tf.classify() has offset and scale parameters, which relate to compressing RGB values to [-1,0] or [0,1] during training, but these parameters are not listed in the official API documentation.
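For context, TensorFlow Lite's int8 scheme maps real values to integers with a scale and a zero point, `real = scale * (q - zero_point)`. The offset and scale parameters mentioned above presumably play a similar role on the input side, though that is an assumption. The sketch below uses illustrative values for inputs normalised to [0, 1]; they are not read from the actual model.

```python
import numpy as np

def quantize(x, scale, zero_point):
    """Map float values to int8 with an affine scale/zero-point scheme."""
    q = np.round(np.asarray(x, dtype=np.float64) / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return scale * (q.astype(np.float64) - zero_point)

# Illustrative parameters: the range [0, 1] spread over all 256 int8 codes
scale, zero_point = 1.0 / 255.0, -128
pixels = np.array([0.0, 0.5, 1.0])
q = quantize(pixels, scale, zero_point)
restored = dequantize(q, scale, zero_point)
```

If the device feeds pixels in a different range than the one the model was trained and calibrated on (say 0 to 255 instead of 0 to 1), most inputs saturate at the clip limits, which is one way every image could end up with nearly the same output.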
for obj in tf.classify(self.net , img1, min_scale=1.0, scale_mul=0.5, x_overlap=0.0, y_overlap=0.0):
    print("**********\nTop 1 Detections at [x=%d,y=%d,w=%d,h=%d]" % obj.rect())
    sorted_list = sorted(zip(self.labels, obj.output()), key = lambda x: x[1], reverse = True)
    for i in range(1):
        print("%s = %f" % (sorted_list[i][0], sorted_list[i][1]))
    return sorted_list[i][0]
So are there any examples of the full workflow, from training a TensorFlow model to deploying it on OpenMV?
I can load the model with load_model("model.h5") in Colab and run model.predict, which works. But when I download the h5 file and run load_model locally, the call fails with "ValueError: Improperly formatted model config."
This is the model:
base_model=MobileNet(weights='imagenet',include_top=False) #imports the mobilenet model and discards the last 1000 neuron layer.
x=base_model.output
x=GlobalAveragePooling2D()(x)
x=Dense(1024,activation='relu')(x) #we add dense layers so that the model can learn more complex functions and classify for better results.
x=Dense(1024,activation='relu')(x) #dense layer 2
x=Dense(512,activation='relu')(x) #dense layer 3
preds=Dense(2,activation='softmax')(x) #final layer with softmax activation
using transfer learning
model=Model(inputs=base_model.input,outputs=preds)
for layer in model.layers[:20]:
    layer.trainable=False
for layer in model.layers[20:]:
    layer.trainable=True
then trained
train_generator=train_datagen.flow_from_directory('/content/chest_xray/train/',
                                                  target_size=(224,224),
                                                  color_mode='rgb',
                                                  batch_size=32,
                                                  class_mode='categorical', shuffle=True)
model.compile(optimizer='Adam',loss='categorical_crossentropy',metrics=['accuracy'])
# Adam optimizer
# loss function will be categorical cross entropy
# evaluation metric will be accuracy
step_size_train=train_generator.n//train_generator.batch_size
model.fit_generator(generator=train_generator,
                    steps_per_epoch=step_size_train,
                    epochs=5)
model saved
model.save('chest-xray-pneumonia.h5')
prediction works
ill_path = "/content/chest_xray/train/PNEUMONIA/"
good_path = "/content/chest_xray/train/NORMAL/"
ill_pic = ill_path + os.listdir(ill_path)[1]
good_pic = good_path + os.listdir(good_path)[1]
print(get_rez(ill_pic))
print(get_rez(good_pic))
But when run locally in a Flask app script, main.py, it doesn't:
from flask import render_template, jsonify, Flask, redirect, url_for, request
from app import app
import random
import os
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
from tensorflow.keras.applications.mobilenet import preprocess_input, decode_predictions
import numpy as np
import ipdb
weightsPath = app.config['UPLOAD_FOLDER']
I get the error on the next line: ValueError: Improperly formatted model config.
new_model = load_model(os.path.join(app.config['UPLOAD_FOLDER'],"chest-xray-pneumonia.h5"))
def upload_file():
    if request.method == 'POST':
        f = request.files['file']
        path = os.path.join(app.config['UPLOAD_FOLDER'], f.filename)
        #f.save(os.path.join(app.config['UPLOAD_FOLDER'], f.filename))
        ill_pic = os.path.join(app.config['UPLOAD_FOLDER'],
                               f.filename)
        print(get_rez(ill_pic))
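This error often points to a Keras or TensorFlow version mismatch between the environment that saved the file (Colab) and the one loading it, though that is only an assumption here. Keras HDF5 files normally record the saving version in a `keras_version` attribute, so the two can be compared. The sketch below uses hypothetical helper names, and reading the file requires `h5py`.

```python
def saved_keras_version(h5_path):
    """Read the Keras version recorded in an HDF5 model file (needs h5py)."""
    import h5py  # imported lazily so the comparison helper has no dependency
    with h5py.File(h5_path, "r") as f:
        version = f.attrs.get("keras_version")
    if isinstance(version, bytes):
        # Older h5py versions return attributes as bytes
        version = version.decode("utf-8")
    return version

def same_major_minor(version_a, version_b):
    """True when two version strings agree on their major.minor prefix."""
    return version_a.split(".")[:2] == version_b.split(".")[:2]
```

Comparing the result of `saved_keras_version("chest-xray-pneumonia.h5")` with the Keras version installed locally shows whether the two environments differ; if they do, aligning the local install with the Colab version is the first thing to try.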
I have trained an InceptionV3 from scratch on a custom dataset containing 100 classes, with the CNN model initialized in Keras. I am now trying to generate adversarial examples for this model using Foolbox, but I am getting the above error. Where am I going wrong? The library (Foolbox) seems to work fine for others, and my model classifies images correctly without any error, but the wrapper model raises it.
from keras.models import load_model
from keras.applications.vgg16 import VGG16
import foolbox
from foolbox.models import KerasModel
from foolbox.attacks import LBFGSAttack
from foolbox.criteria import TargetClass
import numpy as np
import keras
keras.backend.set_learning_phase(0)
model=load_model('standard_inceptionV3.h5')
fmodel = foolbox.models.KerasModel(model, bounds=(0, 255))
from PIL import Image
img = Image.open('/home/shikhar/Downloads/suit.jpeg')
img = img.resize((224,224))
img = np.asarray(img)
img = img[:, :, :3]
lab=model.predict(np.expand_dims(img, axis=0))
label=np.argmax(lab,axis=1)
from foolbox.criteria import Misclassification, TargetClass
attack = foolbox.attacks.FGSM(model=fmodel)
adversarial = attack(img, label,unpack=False)
I just started using TensorFlow, but I have failed to import my data properly for use with the DNNClassifier. I have two files in HDF5 format that I import with pandas. The feature vector has dimension 100, and there are 5 classes that the samples can belong to. If I use, for example, the following code:
import pandas as pd
import numpy as np
import tensorflow as tf
#Data
train = pd.read_hdf("train.h5", "train")
test = pd.read_hdf("test.h5", "test")
Y=train.iloc[0:,0]
X=train.iloc[0:,1:]
X_t=test.iloc[0:,0:]
Y=np.array(Y.values).astype('int')
X=np.array(X.values).astype('double')
X_t=np.array(X_t.values).astype('double')
#Train
feature_columns = [tf.contrib.layers.real_valued_column("", dimension=100)]
classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                            hidden_units=[10, 20],
                                            n_classes=5,
                                            model_dir="/tmp/model")
# Define the training inputs
def get_train_inputs():
    x = tf.constant(X)
    y = tf.constant(Y)
    return x, y
#fit
classifier.fit(input_fn=get_train_inputs, steps=1000)
predictions = list(classifier.predict(input_fn=get_train_inputs))
print(predictions)
I get the error: InvalidArgumentError (see above for traceback): Shape in shape_and_slice spec [100,10] does not match the shape stored in checkpoint: [1,10]
[[Node: save/RestoreV2_2 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/cpu:0"](_recv_save/Const_0, save/RestoreV2_2/tensor_names, save/RestoreV2_2/shape_and_slices)]]
I don't understand why this happens. How should I transform my data so that it works with this classifier?
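My solution:
Change model_dir="/tmp/model" to
model_dir="/tmp/model-1"
Note: it need not be model-1; any valid, unused directory name works, for example
model_dir="/tmp/model-a"
The old directory still holds a checkpoint from an earlier run whose first layer had shape [1,10], and the classifier tries to restore it. A new directory starts training from scratch.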
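To avoid reusing a stale checkpoint directory by accident, each run can be given its own empty directory. A minimal sketch using only the standard library follows; `fresh_model_dir` is a hypothetical helper, not part of TensorFlow.

```python
import os
import tempfile

def fresh_model_dir(base_dir=None, prefix="model-"):
    """Create and return a new, empty directory for checkpoints."""
    if base_dir is not None:
        os.makedirs(base_dir, exist_ok=True)
    # mkdtemp guarantees a unique directory name on every call
    return tempfile.mkdtemp(prefix=prefix, dir=base_dir)

model_dir = fresh_model_dir()
```

The resulting path could then be passed as `model_dir=model_dir` when constructing the classifier. The trade-off is that training no longer resumes from earlier checkpoints, so this suits experiments where the feature shape is still changing.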