I'm doing image classification with image labels given by mode numbers m and n. I'm also varying the convnet training to take in different maximum mode numbers to test it out before fully training on a massive data set.
Silly question, but given a label (m,n), how do I one-hot encode it into an array of length n*m?
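Edit: Yikes, this is actually really simple:
import numpy as np
modeNum = 10  # some integer (example value): the maximum mode number
def getLabel(n, m):
    # Mark position (n, m) in a modeNum x modeNum grid, then flatten it to a one-hot vector
    array = np.zeros((modeNum, modeNum), dtype='int8')
    array[n, m] = 1
    label = array.flatten()
    return label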
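An equivalent way to build the same label is to compute the flat index directly instead of flattening a 2-D array. This is only a minimal sketch, assuming the mode numbers are zero-based and both smaller than the maximum mode number; the names `one_hot_mode_label` and `max_mode` are hypothetical and not part of the original code.

```python
import numpy as np

def one_hot_mode_label(n, m, max_mode):
    """Return a one-hot vector of length max_mode**2 for the label (n, m)."""
    # Row-major flat index, i.e. n * max_mode + m
    index = np.ravel_multi_index((n, m), (max_mode, max_mode))
    label = np.zeros(max_mode * max_mode, dtype=np.int8)
    label[index] = 1
    return label

label = one_hot_mode_label(1, 2, max_mode=4)
print(label.argmax())  # 6, since 1 * 4 + 2 = 6
```

Passing the maximum mode number as an argument, rather than reading a global, makes it easier to vary it between training runs.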
I want to predict disease from images, and I want to add some noise or disruption to the image, either in a specific spot or at random locations. Is there a method for this?
Is there any way to add noise (random values) to an image with TensorFlow?
I read the image, convert it to an array, make a copy of it, and then add some number to it. Is that right?
I have also noticed that when I convert it, the array contains only zeros and ones, even though the image is in RGB form.
I expect some values in the array (that is, in the image) to change to other values, so that when I imshow the image I can see some noise (different from Gaussian noise) and the input to the model differs from the original image.
I have tried this, but the operands could not be broadcast together: the shapes were (224,224,3) and (224,224).
When I set color_mode to grayscale the addition works, but I didn't see much change in the image.
Replacing img.size with img.height didn't work either.
img = tf.keras.preprocessing.image.load_img("/content/person1_bacteria_2.jpeg",color_mode="rgb",target_size=(256, 256))
nois_factor = 0.3
n = nois_factor * np.random.randn(*img.size)
noise_image = img + n
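The shape mismatch most likely comes from `img.size`: for a PIL image it is (width, height) with no channel axis, so the noise array has two dimensions while the RGB image has three. A minimal sketch of one way around this, using plain NumPy on an array that stands in for the loaded image, is given here. It assumes the image has already been converted to floats in [0, 1] (for example with `tf.keras.preprocessing.image.img_to_array(img) / 255.0`); the function names `add_gaussian_noise` and `add_patch_noise` are hypothetical.

```python
import numpy as np

def add_gaussian_noise(image, noise_factor=0.3, rng=None):
    """Add zero-mean Gaussian noise to a float image scaled to [0, 1]."""
    rng = np.random.default_rng() if rng is None else rng
    # Draw noise with the full image shape, including the channel axis
    noise = noise_factor * rng.standard_normal(image.shape)
    return np.clip(image + noise, 0.0, 1.0)

def add_patch_noise(image, top, left, height, width, rng=None):
    """Replace one rectangular region with uniform random values in [0, 1)."""
    rng = np.random.default_rng() if rng is None else rng
    noisy = image.copy()
    region = noisy[top:top + height, left:left + width]
    noisy[top:top + height, left:left + width] = rng.random(region.shape)
    return noisy

# Stand-in for a loaded RGB image converted to floats in [0, 1]
rng = np.random.default_rng(0)
image = np.full((224, 224, 3), 0.5)
noisy_everywhere = add_gaussian_noise(image, 0.3, rng)
noisy_patch = add_patch_noise(image, top=50, left=60, height=32, width=32, rng=rng)
print(noisy_everywhere.shape, noisy_patch.shape)
```

The first function perturbs every pixel, while the second only disturbs a chosen spot and uses uniform rather than Gaussian values. Clipping keeps the result in the displayable range.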
I am trying to integrate the Dataset API into my input pipeline. Before this integration, the program used tf.train.batch_join(), which had dynamic padding enabled. Hence, this would batch elements and pad them according to the largest one in the mini-batch.
image, width, label, length, text, filename = tf.train.batch_join(
For the Dataset API, however, I was unable to find an exact alternative to this. I cannot use padded batch, since the dimensions of the images do not have a set threshold: the image width could be anything. My partner and I were able to come up with a workaround using tf.contrib.data.bucket_by_sequence_length(). Here is an excerpt:
dataset = dataset.apply(tf.contrib.data.bucket_by_sequence_length
bucket_batch_sizes=np.full(len([0]) + 1, batch_size),
What this basically does is dump all the elements into the overflow bucket, since the only boundary is set to 0. It then batches from that bucket, and bucketing pads the elements according to the largest one in each batch.
Is there a better way to achieve this functionality?
I ran into exactly the same problem, and now I know how to solve it. If your input data has only one dimension of variable length, pass tf.contrib.data.bucket_by_sequence_length to the dataset.apply() function and set bucket_batch_sizes = [batch_size] * (len(buckets) + 1). There is another way to do it, as @mrry said in the comments:
iterator = dataset.make_one_shot_iterator()
item = iterator.get_next()
padded_shapes = []
for i in item:
padded_shapes = tf.contrib.framework.nest.pack_sequence_as(item, padded_shapes)
dataset = dataset.padded_batch(batch_size, padded_shapes)
If a dimension in a tensor's shape is None or -1, then padded_batch will pad the tensor along that dimension to the maximum length in the batch.
My training data has two features of variable length, and this method works fine.
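To make the per-batch padding behaviour concrete, here is a small illustration in plain NumPy. It is not the TensorFlow implementation, only a sketch of the idea that each batch is padded to its own longest element; the helper names `pad_batch` and `padded_batches` are hypothetical, and only 1-D sequences are handled.

```python
import numpy as np

def pad_batch(arrays, pad_value=0):
    """Stack 1-D arrays of different lengths, padding each to the longest one."""
    max_len = max(len(a) for a in arrays)
    batch = np.full((len(arrays), max_len), pad_value, dtype=arrays[0].dtype)
    for row, a in enumerate(arrays):
        batch[row, :len(a)] = a
    return batch

def padded_batches(arrays, batch_size, pad_value=0):
    """Yield consecutive batches, each padded only to its own longest element."""
    for start in range(0, len(arrays), batch_size):
        yield pad_batch(arrays[start:start + batch_size], pad_value)

sequences = [np.arange(1, n + 1) for n in (3, 5, 2, 4)]
batches = list(padded_batches(sequences, batch_size=2))
print([b.shape for b in batches])  # [(2, 5), (2, 4)]
```

The two batches end up with different widths because no global maximum length is ever fixed.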
I'm working through building a sequence-to-sequence Shakespeare predictor, and looking at sample code, it seems to do batching in groups of 50 characters. I'm a little confused by this. If the text is continuous and you are processing it in 50-character chunks, then surely that means you're only ever calculating loss based on the next expected character after the 50th character, and the model is never being trained on the next expected characters for the other 49 characters. In other words, if you have 1000 characters split into 20 sets of 50 characters, it's only ever being taught to predict 20 different characters. Shouldn't these batches be shifted by a random offset each epoch so it learns how to predict the other characters?
This can't be right, surely? What am I missing here in my understanding?
Also, are the batches always processed sequentially? When the state is being carried forward to represent the previous sequences, surely this is important.
Update 7/24: Here is the original code...
self.num_batches = int(self.tensor.size / (self.batch_size *
# When the data (tensor) is too small,
# let's give them a better error message
if self.num_batches == 0:
    assert False, "Not enough data. Make seq_length and batch_size small."
self.tensor = self.tensor[:self.num_batches * self.batch_size * self.seq_length]
xdata = self.tensor
ydata = np.copy(self.tensor)
ydata[:-1] = xdata[1:]
ydata[-1] = xdata[0]
self.x_batches = np.split(xdata.reshape(self.batch_size, -1),
                          self.num_batches, 1)
self.y_batches = np.split(ydata.reshape(self.batch_size, -1),
                          self.num_batches, 1)
As far as I can see it doesn't seem to be overlapping, but I am new at Python so may be missing something.
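One way to check what the excerpt does is to run the same slicing on a tiny array. The sketch here wraps the logic in a standalone function (`make_batches` is a hypothetical name). The first line of the excerpt is cut off, so the way `num_batches` is computed here is an assumption, chosen to match the truncation step that follows it.

```python
import numpy as np

def make_batches(tensor, batch_size, seq_length):
    # Assumed completion of the truncated line: whole batches that fit in the data
    num_batches = tensor.size // (batch_size * seq_length)
    tensor = tensor[:num_batches * batch_size * seq_length]
    xdata = tensor
    ydata = np.copy(tensor)
    ydata[:-1] = xdata[1:]   # target at position i is the input at position i + 1
    ydata[-1] = xdata[0]     # the last target wraps around to the first input
    x_batches = np.split(xdata.reshape(batch_size, -1), num_batches, 1)
    y_batches = np.split(ydata.reshape(batch_size, -1), num_batches, 1)
    return x_batches, y_batches

x_batches, y_batches = make_batches(np.arange(12), batch_size=2, seq_length=3)
print(x_batches[0].tolist())  # [[0, 1, 2], [6, 7, 8]]
print(y_batches[0].tolist())  # [[1, 2, 3], [7, 8, 9]]
```

In this toy run every input position has a target one step ahead of it, and each row of the second batch continues the same row of the first batch.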
If you have 1000 characters and create 20 sets of 50 characters, you get non-overlapping windows, and as you said, that won't work. Instead, use overlapping windows: shift by one character at a time to create (1000-50) training examples. This is the right way to do it.
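A minimal sketch of the overlapping-window idea, in plain Python, could look like this. The function name `sliding_windows` and the stand-in text are hypothetical; each example pairs a window of `seq_length` characters with the single character that follows it.

```python
def sliding_windows(text, seq_length):
    """Return (input, target) pairs from windows shifted by one character."""
    pairs = []
    for start in range(len(text) - seq_length):
        window = text[start:start + seq_length]
        next_char = text[start + seq_length]
        pairs.append((window, next_char))
    return pairs

text = "abcdefghij" * 100          # 1000 characters of stand-in text
pairs = sliding_windows(text, seq_length=50)
print(len(pairs))                  # 950, i.e. 1000 - 50
```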
My goal is to detect digits from 0 to 9 on a random background. I wrote a dataset generator with the following features:
Grayscale data
Random digit rotation
Random digit blur
43 different fonts
Random noisy blurred background
Here are 1024 samples of my dataset:
[Image: 1024 testset samples]
I adapted the MNIST expert model to train on the dataset and get almost 100% on the training and validation sets.
On the test set I get approximately 80% correct.
Here is a sample. The green digit is the digit predicted:
[Image: 9 predicted as 5]
It seems that my model has some trouble distinguishing between:
1 and 7
8 and 3
9 and 6
5 and 9
I need to detect the digit on any background because the test images are not always binary images.
Now my questions:
For the testset generator:
How useful is applying digit rotation? When I rotate a 7 then I get a 1 for some fonts. When I rotate a 9 I get a 6 (rotation > 90°)
Does the convolution filter already handle image rotation?
Are 180'000 image samples enough to train the model?
For the model:
Should I increase the image size from 28x28 to 56x56 when I apply a blur filter onto the dataset?
What filter size should I use?
Do I have to increase the number of hidden layers?
Thanks a lot for any guidance.
If you are stuck with the different image backgrounds, I suggest you try image filtering, which will give all your images the same kind of background and foreground, assuming your images are of good quality.
Try this (scikit-image library):
import numpy as np
from skimage import filters as flt
filtered_image = np.array(original_image > flt.threshold_li(original_image))
Then you can use the filtered images for both training and prediction.
I ended up extracting the dataset patches out of existing images instead of using a random background with random digits. This gives us less variance and a much better accuracy on the test set.
Here is a working but not so performant implementation which allows us to define shape and stride size:
def patchify(self, arr, shape, stride):
    patches = []
    arr_shape = arr.shape
    (shape_h, shape_w) = shape
    (stride_h, stride_w) = stride
    # Number of patch origins along each axis (patches at the far edges may be smaller than shape)
    num_patches = np.floor(np.array(arr_shape)/np.array(stride))
    (num_patches_row, num_patches_col) = (int(num_patches[0]), int(num_patches[1]))
    for row in range(num_patches_row):
        row_from = row*stride_h
        row_to = row_from+shape_h
        for col in range(num_patches_col):
            col_from = col * stride_w
            col_to = col_from + shape_w
            # Remember where each patch came from in the original array
            origin_information = (row_from,row_to, col_from,col_to)
            roi = arr[row_from:row_to, col_from:col_to]
            patches.append((roi, origin_information))
    return patches
Alternatively, we can use scikit-learn, where img is a NumPy array:
from sklearn.feature_extraction import image
patches = image.extract_patches_2d(img, (patch_height, patch_width))
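For a faster variant that still supports a stride, one option is NumPy's `sliding_window_view` (available in NumPy 1.20 and later). This is only a sketch under the assumption that the input is a 2-D array; the name `patchify_strided` is hypothetical. Unlike the loop version, it returns only full-size patches and does not record the origin coordinates.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def patchify_strided(arr, shape, stride):
    """Return full-size patches of a 2-D array as a (rows, cols, h, w) view."""
    windows = sliding_window_view(arr, shape)      # every possible window position
    return windows[::stride[0], ::stride[1]]       # keep only positions on the stride grid

arr = np.arange(36).reshape(6, 6)
patches = patchify_strided(arr, shape=(2, 2), stride=(2, 2))
print(patches.shape)            # (3, 3, 2, 2)
print(patches[0, 1].tolist())   # [[2, 3], [8, 9]]
```

The result is a read-only view into the original array, so no patch data is copied until a patch is modified or explicitly copied.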
I get an error at this line: neigh.fit(X, y):
ValueError: setting an array element with a sequence.
I checked the fit function, and X should be: {array-like, sparse matrix, BallTree, cKDTree}
My X is a list of lists, where the first element is the solidity number and the second element is the Hu moments list (7 cells).
If I change it to take only the first Hu moment, so that I have a pure list of lists,
I get this error instead: query data dimension must match BallTree data dimension.
My code:
listafeaturevector = list()
path = 'imgknn/'
for infile in glob.glob( os.path.join(path, '*.jpg') ):
    print("current file is: " + infile )
    gray = cv2.imread(infile,0)
    element = cv2.getStructuringElement(cv2.MORPH_CROSS,(6,6))
    graydilate = cv2.erode(gray, element)
    ret,thresh = cv2.threshold(graydilate,127,255,cv2.THRESH_BINARY_INV)
    imgbnbin = thresh
    contours, hierarchy = cv2.findContours(imgbnbin, cv2.RETR_TREE ,cv2.CHAIN_APPROX_SIMPLE)
    for i in range (0, len(contours)):
        fv = list() #1 feature vector
        mom = cv2.moments(contours[i], 1)
        Humoments = cv2.HuMoments(mom)
        fv.append(Humoments) #query data dimension must match BallTree data dimension
        area = cv2.contourArea(contours[i])
        hull = cv2.convexHull(contours[i]) #has many values
        hull_area = cv2.contourArea(hull)
        solidity = float(area)/hull_area
print("i have done")
X = listafeaturevector
y = [0,1,2,3]* (lenmatrice/4)
from sklearn.neighbors import KNeighborsClassifier
neigh = KNeighborsClassifier(n_neighbors=3)
neigh.fit(X, y) #ValueError: setting an array element with a sequence.
If I try to convert it to a NumPy array:
listafv = np.dstack(listafeaturevector)
data = listafv.reshape((lenmatrice, -1))
X = data
I get: setting an array element with a sequence
A couple of suggestions/questions:
Humoments = cv2.HuMoments(mom)
What is the type of the return value Humoments: a float or a list? If it is a float, that is fine.
for each image file
for i in range (0, len(contours)):
fv = list() #1 feature vector
The above code does not seem correct. In your problem, I think you need to construct one feature vector for each image, so anything related to image i should go into the same feature vector x_i. You then combine all the feature vectors into a list of feature vectors X. However, your listafeaturevector (or X) appears in the innermost loop, which is clearly not correct.
Second, you loop over the elements of contours. Are you sure the number of elements stays the same for each image? Otherwise, the number of features (|x_i|) differs across images, which might cause the error:
setting an array element with a sequence.
Third, are you clear about how you want to classify the images? What are the target values/labels of the different images? I see you are just setting the labels with [0,1,2,3]* (lenmatrice/4). Can you elaborate on what you are trying to do with those images? Do they contain different types of objects? Do they show different patterns? Do they depict different topics/colors? If so, give each type a different label, either 0,1,2 or 'red','white','black' (assuming you have only 3 types). The values of the labels do not matter; what matters is how many distinct values there are. I am trying to understand how the labels differ in your case.
On the other hand, if you only want to retrieve similar images, you don't need to use a classifier or specify a label for each image. Instead, try to use NearestNeighbors.
Fourth, the above two lines of test code are not correct. You need to pass an X-like object in order to get a prediction from the classifier. That is to say, you need a feature vector x with the same structure as the ones you constructed for your training examples (with all h,e,s in the same order).
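The "setting an array element with a sequence" error typically appears when the rows of X mix scalars with nested arrays or have different lengths. A minimal sketch of building flat, equal-length rows is shown here, assuming the Hu moments arrive as a (7, 1) array, as cv2.HuMoments returns them; the helper name `build_feature_vector` and the stand-in values are hypothetical.

```python
import numpy as np

def build_feature_vector(hu_moments, solidity):
    """Flatten the Hu moments and append solidity to get one flat row of 8 floats."""
    hu = np.asarray(hu_moments, dtype=float).ravel()   # (7, 1) -> (7,)
    return np.concatenate([hu, [float(solidity)]])

# Stand-in values; in the question these would come from cv2.HuMoments and the contour areas
fake_hu = np.arange(7, dtype=float).reshape(7, 1)
rows = [build_feature_vector(fake_hu, 0.9), build_feature_vector(fake_hu * 2, 0.7)]
X = np.vstack(rows)
print(X.shape)  # (2, 8)
```

With every row the same length, X becomes a regular 2-D array, and there must then be exactly one label in y per row of X. A query vector passed for prediction needs the same 8-value layout.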