Weka: classifying on a test set, but the reported number of instances is that of the training set

I have a dataset that I divided into a training set (70%) and a test set (30%). The total number of instances in my dataset is 1816: the training set has 1273 instances and the test set has 543 instances. I am using the Naive Bayes classifier in Weka. When I supply the test set on the Classify tab, the output after classification reports 1273 instances, even though the test set I provided has 543 instances. However, if I look at the confusion matrix:
a b <-- classified as
31 44 | a = Yes
36 432 | b = No
As you can see from the confusion matrix, 31 + 44 + 36 + 432 = 543, which matches the size of the test set.
I cannot understand why I see the number of instances of the training set (1273) when I am classifying the test set.
I opened the training set using the Open file button on the Preprocess tab, then went to the Classify tab and supplied the test set.
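The arithmetic in the question can be checked directly from the confusion matrix. The snippet below is only a sanity check on the numbers quoted above (it does not call Weka); the variable names are made up for illustration.

```python
# Confusion matrix copied from the Weka output in the question.
# Rows are actual classes, columns are predicted classes.
confusion = [
    [31, 44],   # actual Yes: 31 predicted Yes, 44 predicted No
    [36, 432],  # actual No:  36 predicted Yes, 432 predicted No
]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(confusion)))
accuracy = correct / total

print(total)               # 543, the size of the test set
print(round(accuracy, 4))  # 0.8527
```

Since the matrix sums to 543, the evaluation itself appears to have run on the test set; the 1273 figure most likely comes from a part of the output that describes the training data.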

Related

Multiple target (large) neural network regression using Python

My situation is this: I have an Excel file with 747 nodes as input, each with a value (imagine 747 columns of floats), and an output of 741 values/columns, again floats. These are the inputs and outputs of a geological simulation. So one row has 747 (input) + 741 (output) = 1488 floats, which is one dataset (from one simulation). I have 4 such datasets (rows) to train a neural network, so that when I test it on 3 test datasets (747 columns each) I get an output of 741 columns. This is just a simple run to get the skeleton of the neural network going before further modifications.
I have come across the Multi-Target Regression example of NYCTaxi (https://github.com/zeahmed/DeepLearningWithMLdotNet/tree/master/NYCTaxiMultiOutputRegression) but I can't seem to wrap my head around it.
This is the training set (Input till and including column 'ABS', rest is output):
https://docs.google.com/spreadsheets/d/12TKVbGExt9KcK5RQKTexrToVo8qA5YfeItSaa7E2QdU/edit?usp=sharing
This is the test set:
https://docs.google.com/spreadsheets/d/1-RjyZsdguucCSOr9QTdTp2ehJBqWCr5yz1-aRjQ_4zo/edit?usp=sharing
This is the test Output (To validate) : https://docs.google.com/spreadsheets/d/10O_6711CEpJ4DN1w-kCmW01NikjFVZTDmNRuqO3U_6A/edit?usp=sharing
Any guidance/tips would be well appreciated. TIA!
We can use an autoencoder-style (encoder-decoder) network for this task. Such a network takes in the data and compresses it into a latent representation. This representation vector is then used to construct the output variables.
So, you can feed the 747-dimensional input vector to the model and have it generate the output vector. Note that a plain autoencoder reproduces its 747-dimensional input, whereas here the final layer needs 741 units to match the targets. After proper training, the model will be able to generate the target variables for a given set of inputs.

What should the input shape of a Keras LSTM layer look like

I've been reading for a while about training LSTM models using tf.keras. I have already used the same framework for regression problems with simple feedforward NN architectures, and I understand well how to prepare the input data for such models. However, when it comes to training an LSTM, I am confused about the shape of the input.
There is a lot to take care of: time steps, number of samples, batch size, number of units, etc. In addition, many parameters of the Keras LSTM layer aren't clear to me yet, such as those in the example below:
model.add(keras.layers.LSTM(units=3, batch_input_shape=(8, 2, 10), return_sequences=True, stateful=True))
I have the following forex data structure, and I don't know how to reshape it properly for an LSTM model.
open | close | high | low | volume | i1 | i2 | i3 | ... | i30 | nextClose
These features represent the open/close/high/low prices of a certain currency pair, plus the volume and the values of 30 different indicators (i1 to i30); all of these features correspond to a single one-minute tick. The nextClose feature is the close price of the next one-minute tick, which is what I am trying to predict.
Q1: Could anyone please explain the general concept of how the data should be shaped for input, and what all of these required parameters (time step, batch size, ...) mean?
Q2: I think a simple example would help: how should my data structure above be reshaped to be a valid input for an LSTM?
After a lot of searching I found an excellent summary of the Keras LSTM in a diagram, which was very helpful. Check it out in the Keras_LSTM_Diagram git repository.
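For Q2, a Keras LSTM layer expects 3-D input of shape (samples, timesteps, features). Below is a minimal sketch of one common way to get there, by slicing the table into overlapping windows. The helper name make_windows, the window length of 10 ticks, and the toy values are assumptions for illustration; only the 35 feature columns (open, close, high, low, volume, i1..i30) and the nextClose target come from the question.

```python
def make_windows(rows, targets, timesteps):
    """Slice a 2-D table into overlapping windows for an LSTM.

    rows:    list of feature rows, one per one-minute tick
    targets: list of target values aligned with rows (e.g. nextClose)
    Returns (X, y), where X has shape (samples, timesteps, features).
    """
    X, y = [], []
    for end in range(timesteps, len(rows) + 1):
        X.append(rows[end - timesteps:end])  # the last `timesteps` ticks
        y.append(targets[end - 1])           # target of the most recent tick
    return X, y


# Toy data: 100 ticks, 35 features per tick.
n_ticks, n_features, timesteps = 100, 35, 10
rows = [[float(t)] * n_features for t in range(n_ticks)]
targets = [float(t + 1) for t in range(n_ticks)]  # stand-in for nextClose

X, y = make_windows(rows, targets, timesteps)
print(len(X), len(X[0]), len(X[0][0]))  # 91 10 35
```

With this layout, the layer's input shape would be (10, 35) per sample. The batch size is a separate choice and only needs to be fixed in advance (as in batch_input_shape) when stateful=True is used.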

Architecture of CNN to accept 2 sets of inputs

There are multiple examples of how to build a TensorFlow model to recognise cats and dogs from images. Now suppose I have audio associated with each picture and train a separate network to recognise cats and dogs by sound.
I want to feed the predictions of both networks into another layer to combine the results and increase the final prediction success rate.
What should my model look like?
Create two neural networks so that, given an image-audio pair, you feed each element to its corresponding network.
After the convolution steps (or whatever you want to use), proceed as you would with a normal CNN. In the last step before passing the data to an FNN, when you flatten the image features, do the same with the output of the audio network.
So, as an example, if the flattened output of the image network has size 2048 and that of the audio network 4096, just append the two and give the first layer of the FNN an input size equal to the sum of these sizes: 2048 + 4096 = 6144.
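The flatten-and-append step can be sketched without any framework. The feature-map shapes below (4 x 4 x 128 and 4 x 4 x 256) are hypothetical, chosen only so that the flattened sizes match the 2048 and 4096 in the example above.

```python
def flatten(feature_maps):
    """Flatten a nested list (height x width x channels) into one flat list."""
    return [v for plane in feature_maps for row in plane for v in row]


# Hypothetical outputs of the two convolutional branches for one sample.
image_features = [[[0.1] * 128 for _ in range(4)] for _ in range(4)]  # 4*4*128 = 2048
audio_features = [[[0.2] * 256 for _ in range(4)] for _ in range(4)]  # 4*4*256 = 4096

# Append the two flattened vectors; this is the input to the first FNN layer.
combined = flatten(image_features) + flatten(audio_features)
print(len(combined))  # 6144
```

In Keras, the same step is usually expressed with a Concatenate layer in the functional API, with the dense layers stacked on top of the concatenated tensor.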

How to customise AlexNet for 3 classes instead of 1000?

I am using an AlexNet from here
The data there has 1000 classes so it has weights according to that. How do I make it work for predicting values for my data that has 3 classes?
I know I have to change the weights but I don't know how.
You need to add one more layer on top of the pre-trained network. This will be your output layer.
The output of the 1000-class layer will be the input of this layer, and it will give your 3 classes as output.
After that, train this new network with your images.
You just have to set num_classes = 3 which will reduce the number of output classes for both the model output tensor and the separately defined placeholder y.
The number of weights, i.e. parameters, will be adapted accordingly when calling model = AlexNet(....
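To see how the number of parameters changes under each suggestion, here is a rough count for the layers involved. It assumes the standard AlexNet layout, in which the last fully connected layer receives 4096 inputs; the linked implementation may differ, and dense_params is a hypothetical helper.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_outputs + n_outputs


fc7_units = 4096  # assumption: width of the penultimate layer in standard AlexNet

print(dense_params(fc7_units, 1000))  # 4097000, original 1000-class output layer
print(dense_params(fc7_units, 3))     # 12291, output layer with num_classes = 3
print(dense_params(1000, 3))          # 3003, extra layer stacked on the 1000 outputs
```

Either way, the new output layer starts without pretrained weights for your 3 classes, so it has to be trained on your own images.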

Convolutional Neural Network Training

I have a question regarding convolutional neural network (CNN) training.
I have managed to train a network using TensorFlow that takes an input image (1600 pixels) and outputs one of three classes that matches it.
Testing the network with variations of the trained classes gives good results. However, when I give it a different, fourth image (one that does not contain any of the 3 trained images), it always returns a random match to one of the classes.
My question is: how can I train a network to recognize that an image does not belong to any of the three trained classes? As a similar example, suppose I trained a network on the MNIST database and then gave it the character "A" or "B". Is there a way to detect that the input does not belong to any of the classes?
Thank you
Your model will always make predictions from the set of labels it was trained on. For example, if you train your model on MNIST data, its predictions will always be 0-9, just like the MNIST labels.
What you can do is first train a different model with 2 classes, which predicts whether an image belongs to data set A or B. E.g., for MNIST you label all MNIST data as 1, add data from other sources that are different (not 0-9), and label them as 0. Then train a model to decide whether an image belongs to MNIST or not.
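The two-stage idea above can be sketched as follows. Everything here is a stand-in: build_gate_dataset and two_stage_predict are hypothetical helpers, and the toy gate and classifier are simple string rules that only exist so the sketch runs; real trained models would replace them.

```python
def build_gate_dataset(in_domain, out_of_domain):
    """Label in-domain samples 1 and everything else 0 for the binary model."""
    return [(x, 1) for x in in_domain] + [(x, 0) for x in out_of_domain]


def two_stage_predict(x, gate, classifier):
    """Run the class predictor only if the binary gate accepts the input."""
    if gate(x) == 0:
        return "not in training classes"
    return classifier(x)


# Toy stand-ins for data and models (assumptions, not real networks).
gate_data = build_gate_dataset(["digit_3", "digit_7"], ["letter_A"])
toy_gate = lambda x: 1 if x.startswith("digit") else 0
toy_classifier = lambda x: x.split("_")[1]

print(gate_data)  # [('digit_3', 1), ('digit_7', 1), ('letter_A', 0)]
print(two_stage_predict("digit_7", toy_gate, toy_classifier))   # 7
print(two_stage_predict("letter_B", toy_gate, toy_classifier))  # not in training classes
```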
A convolutional neural network (CNN) predicts the result from the defined classes after training. A CNN always returns one of the classes, regardless of how confident it is. I have faced a similar problem; what you can do is check the confidence value (the predicted probability of the winning class). If it is below some threshold, the input belongs to a "none" category. Hope this helps.
You probably have three output nodes and choose the maximum value (one-hot encoding). That's a bit unfortunate, as it's a low number of outputs. Non-recognized inputs tend to cause pretty random outputs.
Now, with 3 outputs, roughly speaking you can get 7 outcomes. You might get a single high value (3 possibilities), but non-recognized input can also cause 2 high outputs (also 3 possibilities) or approximately equal outputs (1 possibility). So there's a decent chance (~3/7) of random inputs producing a pattern on the output nodes which you'd only expect for a recognized input.
Now, if you had 15 classes and thus 15 output nodes, you'd be looking at roughly 32767 (2^15 - 1) possible outcomes for unrecognized inputs, only 15 of which correspond to expected one-hot outcomes.
Underlying this is a lack of training data. If your training set has examples outside the 3 classes, you can just dump these into a 4th "other" category and train with that. This by itself isn't a reliable indication, as the theoretical "other" set is usually huge, but you now have 2 complementary ways of detecting other inputs: either by the "other" output node or by one of the 11 ambiguous outputs (with 4 output nodes there are 15 outcomes, 4 of which are one-hot).
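The rough counting used in this answer treats each output node as either high or low and excludes the all-low pattern. A small sketch of that arithmetic (outcome_counts is a hypothetical helper name):

```python
def outcome_counts(n_outputs):
    """Return (total patterns, one-hot patterns, ambiguous patterns).

    Each output is treated as high or low; the all-low pattern is excluded.
    """
    total = 2 ** n_outputs - 1
    one_hot = n_outputs
    return total, one_hot, total - one_hot


for n in (3, 4, 15):
    print(n, outcome_counts(n))
# 3 (7, 3, 4)
# 4 (15, 4, 11)
# 15 (32767, 15, 32752)
```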
Another solution would be to check what output your CNN usually gives when given something else. I believe the last layer should be a softmax, so your CNN returns probabilities for the three classes. If none of these probabilities is close to 1, this might be a sign that the input is something else, assuming your CNN is well trained (it must be penalized for overconfidence when predicting wrong labels).
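A minimal sketch of this thresholding idea in plain Python is shown below. The 0.9 threshold, the label names, and the example logits are assumptions chosen for illustration; a suitable threshold would have to be tuned on held-out data.

```python
import math


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def predict_or_reject(logits, labels, threshold=0.9):
    """Return the winning label, or "other" if no class is confident enough."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "other"
    return labels[best]


labels = ["class_a", "class_b", "class_c"]
print(predict_or_reject([8.0, 0.5, 0.1], labels))  # class_a
print(predict_or_reject([1.0, 0.9, 1.1], labels))  # other
```

Softmax outputs can be overconfident on inputs far from the training data, so this check works best combined with the other suggestions in this thread, such as an explicit "other" class.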