Regression analysis on color images using CNN and underlying image array to extract parameter - tensorflow

I am doing a regression analysis with a CNN to extract a parameter from data that comes as a 2D array. The array represents a kind of map of the underlying parameter I am trying to extract. I converted the arrays into JPG and PNG images and fed them into a 3-channel 2D CNN model. So far the CNN model is able to extract the underlying parameter from the images, but the images are generated by rendering the array with matplotlib's plt.imshow() function, which produces a color (3-channel) image and compresses the data.
The issue in this case is the loss of information when the array is resized and compressed during conversion to an RGB image. So I tried building a CNN model where I input the raw array directly into the network without converting it into an image, but the regression is very poor, whereas for the same datasets the regression is quite good if I feed the converted JPG or PNG images.
I suspect that the 3 channels in the image are the reason the CNN performs better on images. Logically speaking, converting an array to RGB maps it to levels from 0 to 255 in each channel; isn't that the same as feature scaling the data, just to the range 0 to 255 instead of 0 to 1?
[Figure: prediction for color images]
[Figure: prediction for raw array]
So I tried scaling the raw array to the range 0 to 1, stacking it three times to make a 3-channel raw array, and feeding that into the network, but the prediction was still quite poor.
If my logic is correct, then I want to make use of the CNN's 3 channels to extract the parameter from the raw array without loss of information. Is there any way to do it? What else can I implement to get predictions from the raw 2D array that are similar to those from the images?
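For reference, the scaling and channel-stacking described above can be written as a short NumPy sketch; the array shape here is a made-up placeholder:
import numpy as np

# Hypothetical raw 2D map of the underlying parameter, shape (height, width)
raw_array = np.random.rand(64, 64).astype(np.float32)

# Min-max scale to [0, 1], then stack the same plane three times
# to mimic a 3-channel image of shape (height, width, 3)
scaled = (raw_array - raw_array.min()) / (raw_array.max() - raw_array.min())
stacked = np.stack([scaled] * 3, axis=-1)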

Related

Convolutional network returning a matrix with an image as input

I have been trying to code a model that looks at an image with a grid and returns a matrix with the contents of that grid.
Here is an example of the input image:
[Input image: a grid of colored cells]
And this should be the output:
[30202133333,
12022320321,
23103100322,
13103110301,
22221301212,
33100210001,
11012010320,
21230233011,
00330223230,
02121221220,
23133103321,
23110110330]
With 0: Blue, 1: Pink, 2: Lavender, 3: Green
I have a hard time finding resources on how to do this. What would be the simplest way?
Thanks in advance!
There could be multiple design choices to generate this type of output. I suggest using Autoencoders.
Here is some information about autoencoders, taken from Wikipedia:
An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data (unsupervised learning). The encoding is validated and refined by attempting to regenerate the input from the encoding. The autoencoder learns a representation (encoding) for a set of data, typically for dimensionality reduction, by training the network to ignore insignificant data (“noise”).
While autoencoders are typically used to reconstruct the input, you have a slightly different problem of mapping the input to a specific matrix.
You'd want to set up the architecture by providing images as input and the corresponding matrices as your "labels." The architecture can be further optimized by using Convolutional layers instead of MLP layers.
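As a rough illustration, a minimal Keras sketch of such a network might look like the following; the input image size is a placeholder assumption, and the 12×11 grid with 4 color classes comes from the example above:
import tensorflow as tf
from tensorflow.keras import layers

# Assumed input: an RGB rendering of the grid.
# Output: one probability distribution over the 4 colors per cell.
model = tf.keras.Sequential([
    layers.Input(shape=(96, 88, 3)),  # image size is an assumption
    layers.Conv2D(32, 3, activation='relu', padding='same'),
    layers.MaxPooling2D(2),
    layers.Conv2D(64, 3, activation='relu', padding='same'),
    layers.MaxPooling2D(2),
    layers.Flatten(),
    layers.Dense(12 * 11 * 4),
    layers.Reshape((12, 11, 4)),
    layers.Softmax(axis=-1),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
Training labels would then be integer matrices of shape (12, 11) with values 0-3, matching the color encoding above.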

How to modify the TensorFlow loss function to suit multiple labels on the same image

TensorFlow is fairly new to me, and the way I had the loss calculated on the MNIST dataset was with the softmax_cross_entropy_with_logits function.
This function worked on that dataset because the label input is a single label per image.
What I'm trying to do is train a CNN on the MS COCO dataset, which has multiple labels on the same image, with 80 classes in total.
Is there a function that makes that possible?
My label input is currently a somewhat modified one-hot representation, meaning that for each image I have a list of 80 elements, with 0 for categories not in the image and 1 for categories present in the image.
E.g., an image with a human and a dog would have the list [0,1,0,0,1], assuming I have 5 classes with dogs and humans at indices 1 and 4.
For a multi-label classification problem, you can use the sigmoid-based loss available in TensorFlow (tf.nn.sigmoid_cross_entropy_with_logits). It takes the multi-hot encoded label input along with the final logits layer as its inputs.
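A minimal sketch of how it plugs in; the placeholder shapes below assume the 80-class multi-hot labels described in the question:
import tensorflow as tf

# Assumed tensors: multi-hot labels and raw (unactivated) network
# outputs (logits), both of shape (batch_size, 80).
labels = tf.placeholder(tf.float32, shape=[None, 80])
logits = tf.placeholder(tf.float32, shape=[None, 80])

# One independent sigmoid per class, so any number of classes
# can be present in the same image.
per_class_loss = tf.nn.sigmoid_cross_entropy_with_logits(labels=labels, logits=logits)
loss = tf.reduce_mean(per_class_loss)
Unlike softmax, which forces the class probabilities to compete, each sigmoid is evaluated independently, which is what multi-label data needs.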

Vector representation in multidimensional time-series prediction in Tensorflow

I have a large data set (~30 million data points with 5 features) that I have reduced using K-means down to 200,000 clusters. The data is a time series with ~150,000 time steps. The data on which I would like to train the model is the presence of particular clusters at each time step. The purpose of the predictive model is to generate a generalized sequence, similar to generating syntactically correct sentences from a model trained on word sequences. The easiest way to think about this data is that I'm trying to predict the pixels in the next video frame from the pixels in the current video frame in order to generate a new sequence of frames that approximates the original sequence.
The raw, sparse representation at each time step would be 200,000 binary values indicating which clusters are present at that time step. Note that no more than 200 clusters may be present in any one time step, so this representation is extremely sparse.
What is the best representation to convert this sparse vector to a dense vector that would be more suitable to time-series prediction using Tensorflow?
I initially had in mind an RNN/LSTM trained on the vectors at each time step, but due to the size of the training vector I'm now wondering if a convolutional approach would be more suitable.
Note, I have not actually used TensorFlow beyond some simple tutorials, but I have previously used OpenCV ML functions. Please consider me a novice in your responses.
Thank you.
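For concreteness, one common way to turn such a multi-hot vector into a dense one is to average learned embeddings of the active cluster IDs; this is only a sketch under assumed sizes, not a prescription:
import tensorflow as tf

NUM_CLUSTERS = 200000
EMBED_DIM = 128  # assumed dense vector size

# Learned embedding table: one dense row per cluster
embeddings = tf.Variable(tf.random.normal([NUM_CLUSTERS, EMBED_DIM], stddev=0.01))

# IDs of the (at most 200) clusters present at one time step (made-up values)
active_ids = tf.constant([17, 42, 199999])

# Average the active clusters' embeddings into one dense vector of shape (EMBED_DIM,)
dense_step = tf.reduce_mean(tf.nn.embedding_lookup(embeddings, active_ids), axis=0)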

TensorFlow: Convolutional Neural Network with non-image input

I am interested in using TensorFlow to train my data for binary classification based on a CNN.
Now I wonder how to set the filter values and the number of output nodes in the convolution process.
I have read many tutorials and examples. However, most of them use image data, and I cannot relate them to my data, which is customer data, not pixels.
Could you give me some suggestions on this issue?
If your data varies in time or space, then you can use a CNN. I am currently working with an EEG data set, which varies in time. You can also refer to this paper
http://www.nlpr.ia.ac.cn/english/irds/People/lwang/M-MCG_EN/Publications/2015/YD2015ACPR.pdf
where the input data (which is not an image) is presented as an image to the CNN.
You have to reshape the data to be 4-D. In this example, I have only 4 columns, so each row becomes a 2×2 single-channel "image":
import numpy as np

x_train = np.reshape(x_train, (x_train.shape[0], 2, 2, 1))
x_test = np.reshape(x_test, (x_test.shape[0], 2, 2, 1))
This is a good example of using non-image data:
https://github.com/fengjiqiang/LSTM-Wind-Speed-Forecasting
You just need to change the following: prediction_cols, feature_cols, features, and dataload.
There is also a tutorial for text: Here!
You might use one of the following classes:
class Dataset: Represents a potentially large set of elements.
class FixedLengthRecordDataset: A Dataset of fixed-length records from one or more binary files.
class Iterator: Represents the state of iterating through a Dataset.
class TFRecordDataset: A Dataset comprising records from one or more TFRecord files.
class TextLineDataset: A Dataset comprising lines from one or more text files.
See the tutorial and the official documentation.
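As a rough sketch of how these pieces fit together for non-image rows (the features and labels arrays below are hypothetical):
import numpy as np
import tensorflow as tf

# Hypothetical tabular data: 1000 rows of 4 numeric features, binary labels
features = np.random.rand(1000, 4).astype(np.float32)
labels = np.random.randint(0, 2, size=(1000,))

dataset = tf.data.Dataset.from_tensor_slices((features, labels))
# Reshape each 4-feature row into a 2x2 single-channel "image" for a CNN
dataset = dataset.map(lambda x, y: (tf.reshape(x, (2, 2, 1)), y))
dataset = dataset.shuffle(1000).batch(32)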

Feeding both JPEG and PNG images to the pre-trained Inception v3 model?

I gather from this question and its answer [feeding image data in tensorflow for transfer learning] that adding a new op to the imported graph will help, but it isn't clear to me whether the resulting graph will handle both PNG and JPEG inputs automatically, and at the same time.
The answer to the above question suggests the following:
png_data = tf.placeholder(tf.string, shape=[])
decoded_png = tf.image.decode_png(png_data, channels=3)
# ...
graph_def = ...
softmax_tensor = tf.import_graph_def(
    graph_def,
    input_map={'DecodeJpeg:0': decoded_png},
    return_elements=['softmax:0'])
sess.run(softmax_tensor, {png_data: ...})
Does this mean that a PNG input must be passed in as
sess.run(softmax_tensor, {png_data: image_array})
And a JPEG input must be given to the graph as
sess.run(softmax_tensor, {'DecodeJpeg:0': image_array})
Would the second statement work after the graph has been modified and an op added at the bottom?
The answers in the previous question center around switching the graph from taking JPEGs to PNGs. With the network as specified, there's no way for it to handle both.
You have a few options if you need to deal with both types.
Handle the decoding yourself, either with PIL or TensorFlow, and feed the decoded image bytes into the graph at the output of the existing decode node (see the sketch at the end of this answer).
If you're happy feeding the network, then do a two-step operation where you re-plumb the input to read from a variable, and create two new nodes that write decoded output to that variable.
sess.run(feed_jpeg, feed_dict={in_jpg: my_jpg})
sess.run(the_network)
or
sess.run(feed_png, feed_dict={in_png: my_png})
sess.run(the_network)
Create a more complex conditional input path where you can feed a flag value that tells it what data type it is, and use TF conditionals to pull only on the specified decode node.
Write a new op that dispatches to either decode_png or decode_jpeg as necessary, based upon the format string at the start of the data.
I'm hoping we'll expose some string comparison ops so that you could write (4) in pure TensorFlow, but I don't have a timeline for any of that.
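For option 1, here is a minimal sketch of decoding outside the graph with PIL and feeding the decoded array in at the output of the existing decode node; the tensor names follow the snippets above, the file name is hypothetical, and sess is the already-created session:
import numpy as np
from PIL import Image

# Decode either format outside the graph, then feed the raw RGB array
# in at the output of the graph's existing decode node.
img = np.array(Image.open('input.png').convert('RGB'))  # works for .jpg too
predictions = sess.run('softmax:0', feed_dict={'DecodeJpeg:0': img})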