LIBSVM easy.py and grid.py doing worse than svm-train - libsvm

I am facing a problem with libsvm and I am hoping you can help me.
When I use svm-train.exe with default parameters like this:
svm-train dikomou
svm-predict dikomou.t dikomou.model dikomou.t.predict
I get accuracy 84.72%
When I scale to [-1, 1], applying the same scaling to the training and testing files like this:
svm-scale -l -1 -u 1 -s range1 dikomou > dikomou.scale
svm-scale -r range1 dikomou.t > dikomou.t.scale
svm-train dikomou.scale
svm-predict dikomou.t.scale dikomou.scale.model dikomou.t.predict
I get a lower accuracy of 81.94%.
If I scale to [0, 1] instead, I get 87.5% accuracy.
So I keep the 0-to-1 scaling.
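What svm-scale does with the -s/-r pair is save the per-feature ranges from the training file and reuse them on the test file. A minimal Python sketch of the same idea (the arrays are made up for illustration; this is not the libsvm tool itself):

import numpy as np

# Hypothetical training and test feature matrices (rows = samples).
X_train = np.array([[1.0, 200.0], [3.0, 400.0], [2.0, 300.0]])
X_test = np.array([[2.5, 500.0]])

# Learn the per-feature min/max from the TRAINING data only
# (what `svm-scale -l 0 -u 1 -s range1 train` stores in range1).
lo, hi = X_train.min(axis=0), X_train.max(axis=0)

# Apply the SAME parameters to both sets
# (what `svm-scale -r range1 test` does).
def scale(X):
    return (X - lo) / (hi - lo)

print(scale(X_train))
print(scale(X_test))  # test values can fall outside [0, 1]; that is expected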
But when I use grid.py with the 0-to-1 scaled data like this:
grid.py dikomou.scale
..
8 0.0078125 84.25
$ ./svm-train -c 8 -g 0.0078125 dikomou.scale
$ ./svm-predict dikomou.t.scale dikomou.scale.model dikomou.t.predict
I get a cross-validation rate of 84.25%, a total test accuracy of 79.166%, and best parameters c = 8, gamma = 0.0078125.
So grid.py gives me lower accuracy than svm-train with the defaults. I have two questions.
How is this possible?
What are the default values of c and gamma that svm-train uses? (I can't find this stated clearly in the documentation. Is gamma 1/number of features and c = 1?) And why do they do better than the values grid.py finds?
easy.py also gives me worse results than the defaults. What can I do?
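(For reference: grid.py selects c and gamma by cross-validation on the training file alone, so its choice can still score lower on a separately drawn test set. One thing to try is a finer search with more folds; the flags below are from the usage text of the grid.py in libsvm's tools directory, and the ranges are just an illustration:)

python grid.py -log2c -5,15,1 -log2g 3,-15,-1 -v 10 dikomou.scale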

Related

Trying to custom train MobilenetV2 with 40x40px images - wrong results after training

I need to classify small images into 4 different categories, plus one "background" category for false detections.
While training, the loss quickly drops to 0.7 but stays there even after 800k steps. In the end, the frozen graph seems to classify most images with the background label.
I'm probably missing something; I'll detail the steps I used below, and any feedback is welcome.
I'm new to tf-slim, so it could be an obvious mistake, or maybe too few samples? I'm not looking for top accuracy, just something working for prototyping.
Source materials can be found here: https://www.dropbox.com/s/k55xoygdzb2efag/TilesDataset.zip?dl=0
I used tensorflow-gpu 1.15.3 on Windows 10.
I created the dataset using:
python ./createTfRecords.py --tfrecord_filename=tilesV2_40 --dataset_dir=.\tilesV2\Tiles_40
I added a dataset provider in models-master\research\slim\datasets based on the flowers provider.
I modified mobilenet_v2.py in models-master\research\slim\nets\mobilenet, changing num_classes=5 and mobilenet.default_image_size = 40.
I trained the net with : python ./models-master/research/slim/train_image_classifier.py --model_name "mobilenet_v2" --learning_rate 0.045 --preprocessing_name "inception_v2" --label_smoothing 0.1 --moving_average_decay 0.9999 --batch_size 96 --learning_rate_decay_factor 0.98 --num_epochs_per_decay 2.5 --train_dir ./weight --dataset_name Tiles_40 --dataset_dir .\tilesV2\Tiles_40
When I run python .\models-master\research\slim\eval_image_classifier.py --alsologtostderr --checkpoint_path ./weight/model.ckpt-XXX --dataset_dir ./tilesV2/Tiles_40 --dataset_name Tiles_40 --dataset_split_name validation --model_name mobilenet_v2, I get eval/Recall_5[1] eval/Accuracy[1]
I then export the graph with python .\models-master\research\slim\export_inference_graph.py --alsologtostderr --model_name mobilenet_v2 --image_size 40 --output_file .\export\output.pb --dataset_name Tiles_40
And freeze it with freeze_graph --input_graph .\export\output.pb --input_checkpoint .\weight\model.ckpt-XXX --input_binary true --output_graph .\export\frozen.pb --output_node_names MobilenetV2/Predictions/Reshape_1
I then try the net with images from the dataset with python .\label_image.py --graph .\export\frozen.pb --labels .\tilesV2\Tiles_40\labels.txt --image .\tilesV2\Tiles_40\photos\lac\1_1.png --input_layer input --output_layer MobilenetV2/Predictions/Reshape_1. This is where I get wrong classifications,
like 0:background 0.92839915 2:lac 0.020171663 1:house 0.019106707 3:road 0.01677236 4:start 0.0155500565 for a "lac" image of the dataset.
I tried changing the depth_multiplier, the learning rate, training on a CPU, and removing --preprocessing_name "inception_v2" from the training command. I don't have any ideas left.
Change your learning rate, maybe start from the usual choice of 3e-5.
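For instance, keeping the rest of the training command from the question and only lowering the learning rate (3e-5 is just a suggested starting point, not a tuned value; you may also want to drop the aggressive decay flags while debugging):

python ./models-master/research/slim/train_image_classifier.py --model_name "mobilenet_v2" --learning_rate 3e-5 --label_smoothing 0.1 --batch_size 96 --train_dir ./weight --dataset_name Tiles_40 --dataset_dir .\tilesV2\Tiles_40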

Weka Random Forest model file size is too big

I'm using Random Forest in the Weka 3.9 GUI. My dataset contains 211965 instances with 95 attributes, and I'm predicting a numerical target value.
When I save the model, its size is 1 951 834 KB, which is way too big to load in my Java application using the Weka API.
Am I doing something wrong that causes the file to be that big?
Here is the classifier output from Weka so you can see the parameters I used (I removed the attribute list to make it shorter); a rough size estimate follows the output.
=== Run information ===
Scheme: weka.classifiers.trees.RandomForest -P 100 -I 30 -num-slots 0 -K 0 -M 1.0 -V 0.001 -S 1 -depth 21
Relation: all_cars_wo_cena300k3
Instances: 211965
Attributes: 95
Test mode: evaluate on training data
=== Classifier model (full training set) ===
RandomForest
Bagging with 30 iterations and base learner
weka.classifiers.trees.RandomTree -K 0 -M 1.0 -V 0.001 -S 1 -depth 21 -do-not-check-capabilities
Time taken to build model: 15.02 seconds
=== Evaluation on training set ===
Time taken to test model on training data: 6.36 seconds
=== Summary ===
Correlation coefficient 0.9978
Mean absolute error 1532.7018
Root mean squared error 3087.1285
Relative absolute error 5.1246 %
Root relative squared error 6.9288 %
Total Number of Instances 211965
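For a sense of scale, here is a rough back-of-envelope (a sketch with an assumed per-node cost, not Weka's actual serialization format). With -M 1.0 and -depth 21, each unpruned RandomTree can grow close to one leaf per training instance, and bagging multiplies that by 30 trees:

# Rough upper-bound estimate of the serialized RandomForest size.
# The 200-byte figure per node is an assumption for illustration only.
instances = 211965
trees = 30
leaves_per_tree = instances              # -M 1.0 allows single-instance leaves
nodes_per_tree = 2 * leaves_per_tree - 1 # binary tree: internal nodes + leaves
bytes_per_node = 200                     # assumed: split attribute, threshold, child refs
total_bytes = trees * nodes_per_tree * bytes_per_node
print(f"{total_bytes / 1024**2:.0f} MB") # ~2400 MB, same ballpark as the 1.9 GB model

Raising -M (minimum instances per leaf) or lowering -depth shrinks the trees, and therefore the model file, roughly in proportion.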

Increasing number of predictions in Inception for Tensorflow

I am going through the training tutorial on retraining Inception's final layer, after having installed Tensorflow for Ubuntu with regular CPU support. I successfully made the flower example work; however, after switching to a new set of categories with ten sub-folders, I cannot make Inception produce ten scores for each input image rather than the default five. My current command line to run a test image looks like this, working with categories labelled 0-9.
bazel build tensorflow/examples/label_image:label_image && \
bazel-bin/tensorflow/examples/label_image/label_image \
--graph=/tmp/output_graph.pb --labels=/tmp/output_labels.txt \
--output_layer=final_result \
--input_layer=Mul \
--image=$HOME/Input/Example.jpg
This produces the following result:
5 (4): 0.642959
3 (2): 0.243444
9 (8): 0.0513504
4 (5): 0.0231318
6 (7): 0.0180509
However, I cannot find anything in the scripts that Inception runs to configure how many output scores are produced, so that all ten of my categories get scores rather than just five. How do I change this?
I tried with 8 categories and was able to get results for all of them.
If your code has the line below
top_k = predictions[0].argsort()[-5:][::-1]
change it to
top_k = predictions[0].argsort()[-len(predictions[0]):][::-1]
If your code contains predictions = np.squeeze(predictions), then use predictions instead of predictions[0].
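Putting the fix together, a minimal runnable sketch (the dummy scores and labels stand in for the softmax output of your retrained graph and the contents of output_labels.txt):

import numpy as np

# Dummy scores standing in for the softmax output of the retrained graph.
predictions = np.array([[0.64, 0.24, 0.05, 0.02, 0.02, 0.01, 0.01, 0.005, 0.004, 0.001]])
labels = [str(i) for i in range(10)]  # stand-in for output_labels.txt

predictions = np.squeeze(predictions)
# Rank ALL classes by score instead of only the top five.
top_k = predictions.argsort()[-len(predictions):][::-1]
for node_id in top_k:
    print(f"{labels[node_id]} (score = {predictions[node_id]:.5f})")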
I ran this using the following command instead of bazel, and I found it easier:
python /path_to_file/label_image.py /path_to_image/image.jpeg
First, make sure the graph was created after you ran retrain.py and that it is in the correct location (the default is inside /tmp/).

libsvm fails for very small numbers with error code 'Wrong input on line'

I've tried searching the internet for answers on this one, but without success.
I am using libSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/) and I encountered this while training an SVM with the RBF kernel.
If a feature contains very small numbers, like feature 15 in the following
0 1:4.25606e+07 2:4.2179e+07 3:5.1059e+07 4:7.72388e+06 5:7.72037e+06 6:8.87669e+06 7:4.40263e-06 8:0.0282494 9:819 10:2.34513e-05 11:21.5385 12:95.8974 13:179.117 14:9 15:6.91877e-310
libSVM fails to read the file with the error Wrong input at line <lineID>.
After some testing, I was able to confirm that changing such a small number to 0 appears to fix the error; i.e., this line is read correctly:
0 1:4.17077e+07 2:4.12838e+07 3:5.04597e+07 4:7.76011e+06 5:7.74881e+06 6:8.91813e+06 7:3.97472e-06 8:0.0284308 9:936 10:2.46506e-05 11:22.8714 12:100.969 13:186.641 14:17 15:0
Can anybody help me figure out why this is happening? My file contains a lot of numbers around that order of magnitude.
I am calling the SVM from the terminal on Ubuntu like this:
<path to>/svm-train -s 0 -t 2 -g 0.001 -c 100000 <path to features file> <path for output model file>
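Since replacing the tiny value with 0 makes the line parse, one workaround is to clamp subnormal values when writing the feature file; 6.91877e-310 is below the smallest normal double (~2.2e-308), and strtod-based parsers can report underflow for such values. A minimal Python sketch (the file names are placeholders):

import sys

# Smallest positive *normal* double (~2.225e-308); values below it, like
# 6.91877e-310, are subnormal and can trip strict strtod-based parsers.
THRESHOLD = sys.float_info.min

with open("features.txt") as fin, open("features_clean.txt", "w") as fout:
    for line in fin:
        parts = line.split()
        label, feats = parts[0], parts[1:]
        cleaned = []
        for feat in feats:
            idx, val = feat.split(":")
            x = float(val)
            if abs(x) < THRESHOLD:
                x = 0.0  # clamp subnormals, as the question found fixes the error
            cleaned.append(f"{idx}:{x:g}")
        fout.write(" ".join([label] + cleaned) + "\n")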

Understanding the result from Encog neural network example

I'm playing around with Encog 3.2 for Java. Starting from the example (http://www.heatonresearch.com/wiki/Hello_World), I made my own network with 4 neurons in the input layer and 2 neurons in the output layer.
1.0,1.0, actual=0.22018401281844316,ideal=1.0
-1.0,-1.0, actual=0.9903002141301814,ideal=0.0
Can someone explain how to interpret the result (actual vs. ideal, and the numbers before them)?
Thank you very much.
Note that at this stage, the network has been trained, and you are now in the testing stage.
The network has 2 input neurons and 1 output neuron.
The first two numbers in your result are given to the trained network as the inputs. Using the internal weights and biases (which are not changed during testing), it computes the result/output, listed as actual.
ideal is what the result should be, i.e. the number listed in the dataset for that sample/row.
Generally, when you want a 0 or 1 output (e.g. one-of-n), you will round the actual result.
So in this case the network computes:
1 XOR 1 = 0.22 (rounded = 0), which is wrong according to ideal.
-1 XOR -1 = 0.99 (rounded = 1), which is also wrong.
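A tiny sketch of that last step, rounding the network's actual outputs and comparing them to ideal (the numbers are the two rows from the question):

# (actual, ideal) pairs copied from the test output above
results = [(0.22018401281844316, 1.0), (0.9903002141301814, 0.0)]
for actual, ideal in results:
    predicted = round(actual)
    status = "correct" if predicted == ideal else "wrong"
    print(f"actual={actual:.4f} rounded={predicted} ideal={ideal} -> {status}")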