I'm trying to train a neural network for a path-finding problem on a 10x10 grid map, but it doesn't seem to work. Here are the details:
My input to the neural network is a 10x10x2 matrix, where the first 10x10 channel represents the obstacles on the map and the second 10x10 channel marks only two points, the initial and final points.
The desired output of the network is the shortest path found by the A* algorithm. I've written code that generates any desired number of cases, and the optimal route for each case is found by A* right after the case is generated. I want to teach the neural network to find these paths. As an example, the general structure for a 4x4 case is shown below.
obstacles matrix(input):
0 0 0 0
0 1 1 1
0 1 1 0
0 0 0 0
initial and final point matrix(input):
0 1 0 0
0 0 0 0
0 0 0 1
0 0 0 0
route(desired output):
1 1 0 0
1 0 0 0
1 0 0 1
1 1 1 1
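For reference, the 4x4 example above can be assembled into the two-channel input and the route mask with NumPy. This is only a sketch under stated assumptions: breadth-first search stands in for A* (on an unweighted grid both return a shortest path), moves are assumed to be 4-connected, and the helper name `shortest_path_mask` is hypothetical.

```python
from collections import deque

import numpy as np

obstacles = np.array([[0, 0, 0, 0],
                      [0, 1, 1, 1],
                      [0, 1, 1, 0],
                      [0, 0, 0, 0]])
start, goal = (0, 1), (2, 3)

# Second input channel: 1 at the start and goal cells, 0 elsewhere.
endpoints = np.zeros_like(obstacles)
endpoints[start] = 1
endpoints[goal] = 1

# Network input of shape (H, W, 2), the same layout as the 10x10x2 case.
x = np.stack([obstacles, endpoints], axis=-1)


def shortest_path_mask(obstacles, start, goal):
    """Return an H x W 0/1 mask of one shortest path, or None if unreachable.

    Assumption: BFS with 4-connected moves replaces A* here.
    """
    h, w = obstacles.shape
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nxt in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            nr, nc = nxt
            inside = 0 <= nr < h and 0 <= nc < w
            if inside and obstacles[nr, nc] == 0 and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    if goal not in parent:
        return None
    # Walk back from the goal to the start, marking every visited cell.
    mask = np.zeros_like(obstacles)
    cell = goal
    while cell is not None:
        mask[cell] = 1
        cell = parent[cell]
    return mask


route = shortest_path_mask(obstacles, start, goal)
print(x.shape)
print(route)
```

For this particular map there is only one shortest path, so the mask coincides with the desired route shown above.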
Also, I'm adding the pictures of a case and the output of neural network.
[Images: obstacles; start and target points; desired route; combined image]
Up to now, I've described the inputs and output of the neural network. I'm trying to train the network using three fully connected layers, but it does not seem to learn the pattern. Here is my network:
x = tf.placeholder(dtype=tf.float32, shape=[None,10,10,2])
y = tf.placeholder(dtype=tf.float32, shape=[None,10,10])
rate = tf.placeholder(dtype=tf.float32)
# flatten the input
x_flatten = tf.contrib.layers.flatten(x)
y_flatten = tf.contrib.layers.flatten(y)
# fully connected layer
fc = tf.layers.dense(inputs=x_flatten, units=1000, activation=tf.nn.tanh)
fc = tf.layers.dropout(fc, rate=rate, training=True) # rate = 0.3
fc = tf.layers.dense(inputs=fc, units=500, activation=tf.nn.tanh)
logits = tf.layers.dense(inputs=fc, units=100, activation=None)
cost = tf.reduce_mean(tf.abs(logits - y_flatten))
optimizer = tf.train.AdamOptimizer().minimize(cost)
Finally, I'm adding the outcome of the NN after training with 1000 cases and 20 epochs, and the ground truth together.
[Images: training outcome; test outcome]
I have also tried a CNN, but that did not work either. Any suggestions are welcome, thanks in advance.
I'm trying to train a binary classification model using DeepFM for the first time. The dataset consists of anonymized ids mapped to a list of segments with a boolean 1 or 0 if they have the segment.
The data is one-hot encoded, so it looks like this:
id   SEGMENT1  SEGMENT2  SEGMENT3  Label
id1  0         1         0         0
id2  1         1         1         1
id2  1         0         1         1
I am following the deepctr documentation for training, but it requires both dense (numeric) and sparse (categorical) features. I would assume my features are dense, since they are defined by 0 and 1 and I don't need to transform anything with a label encoder for categorical features. Do I still need to use both dnn_feature_columns and linear_feature_columns? I don't have both kinds in my data.
linear_feature_columns = fixlen_feature_columns
feature_names = get_feature_names(linear_feature_columns + dnn_feature_columns)
train_model_input = {name: train[name] for name in feature_names}
test_model_input = {name: test[name] for name in feature_names}
model = DeepFM(linear_feature_columns, dnn_feature_columns, task='binary')
model.compile("adam", "binary_crossentropy",
metrics=['binary_crossentropy'], )
Thank you in advance!
I am trying to create a sequential model which would classify random groups of vectors into classes. The model consistently assigns all groups to the same class.
creating data:
Each news item has 200 random vectors with a dimension of 300.
I want the model to be able to assign each news group to a class
allnews=[]
for j in range(50):
    news=[]
    for i in range(200):
        news.append(np.random.random(300))
    allnews.append(np.array(news))
#allnews= tf.convert_to_tensor(allnews)
allnews= np.array(allnews)
print(np.shape(allnews))
allnews = allnews.reshape((allnews.shape[0], allnews.shape[1], 300))
print(np.shape(allnews))
lables=[]
for j in range(20):
    lables.append(0)
for j in range(20):
    lables.append(1)
for d in range(10):
    lables.append(2)
lables= tf.convert_to_tensor(lables)
print(lables)
creating the model:
the model i am trying to create:
YourSequenceLenght=200
model = tf.keras.Sequential()
model.add(Input(shape=(YourSequenceLenght,300)))
model.add(Dense(300,use_bias=False,kernel_initializer='random_normal',kernel_regularizer=tf.keras.regularizers.l1(0.01),activation="linear"))
model.add(SimpleRNN(1, return_sequences=False,kernel_initializer='random_normal',kernel_regularizer=tf.keras.regularizers.l1(0.01),use_bias=False,recurrent_regularizer=tf.keras.regularizers.l1(0.01),activation="sigmoid"))
model.add(Dense(3,use_bias=False,kernel_initializer='random_normal',kernel_regularizer=tf.keras.regularizers.l1(0.01),activation="softmax"))
model.summary()
METRICS = [
keras.metrics.TruePositives(name='tp'),
keras.metrics.FalsePositives(name='fp'),
keras.metrics.TrueNegatives(name='tn'),
keras.metrics.FalseNegatives(name='fn'),
keras.metrics.BinaryAccuracy(name='accuracy'),
keras.metrics.Precision(name='precision'),
keras.metrics.Recall(name='recall'),
keras.metrics.AUC(name='auc'),
]
model.compile(optimizer='sgd',loss='categorical_crossentropy',metrics=METRICS)
training and predicting:
print(lables)
lables = keras.utils.to_categorical(y=lables,num_classes= 3)
# y_train = np_utils.to_categorical(y=y_train, num_classes=10)
print(lables)
history = model.fit(allnews,lables,epochs=10)
res= model.predict(allnews)
print(np.shape(res))
import operator
for r in res:
    index, value = max(enumerate(r), key=operator.itemgetter(1))
    print(index)
    print(value)
for r in res:
    print(r)
the outputs from the for prints:
2
0.34069243
2
0.34070647
2
0.33907583
2
0.34005642
2
0.34013948
2
0.34007362
2
0.34028214
2
0.33997294
2
0.34018084
2
0.33995336
2
0.33998552
2
0.33882195
2
0.3401062
2
0.3418465
2
0.33978543
2
0.3396516
2
0.34062216
2
0.3419327
2
0.34114555
2
0.34119973
2
0.3404259
2
0.33981207
2
0.34035686
2
0.34139898
2
0.3398025
2
0.3391234
2
0.34051093
2
0.34120804
2
0.34140897
2
0.34064025
2
0.34133258
2
0.34019342
2
0.3404882
2
0.33930022
2
0.3416659
2
0.3406455
2
0.34054703
2
0.34057957
2
0.3391579
2
0.3395657
2
0.34069654
2
0.3400011
2
0.338789
2
0.34008256
2
0.34080264
2
0.34000066
2
0.340322
2
0.341806
2
0.34178147
2
0.34078327
EDIT:
clarification
I am trying to use a model which works as follows:
a sigmoid hidden layer (with recurrence) and a softmax projection
You are trying to learn something from random data. Your model is (randomly) initialized in such a way that it always predicts class 2, and the gradient updates don't steer the weights in any particular direction because the input is random, so they stay there. Try making your input data structured instead of random (e.g. random.random()*tf.one_hot(1,depth=200) for class 1, random.random()*tf.one_hot(2, depth=200) for class 2 and random.random()*tf.one_hot(3, depth=200) for class 3). Now your values will still be random, but will adhere to a structure.
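A rough NumPy sketch of that idea, keeping the (50, 200, 300) shape from the question. How the length-200 one-hot vector is spread over the 300 features is my assumption (it is simply repeated along the feature axis), and `make_sample` is a hypothetical helper name:

```python
import numpy as np

rng = np.random.default_rng(0)
seq_len, dim = 200, 300


def make_sample(label):
    """Random magnitude, but only time step `label + 1` is non-zero."""
    pattern = np.eye(seq_len)[label + 1]             # one-hot over the 200 time steps
    sample = rng.random() * pattern                  # shape (200,), random scale
    return np.repeat(sample[:, None], dim, axis=1)   # assumed layout: (200, 300)


# Same class counts as in the question: 20 / 20 / 10.
labels = np.array([0] * 20 + [1] * 20 + [2] * 10)
allnews = np.stack([make_sample(c) for c in labels])
print(allnews.shape)
```

The values are still random, but the position of the non-zero time step now depends on the class, so there is something to learn.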
EDIT:
I took a look at your colab:
1) you can speed up the dataset construction by adding .numpy() after the tf.one_hot: tf.one_hot(1).numpy().
2) When I changed the model to:
model = tf.keras.Sequential()
model.add(Input(shape=(YourSequenceLenght,300)))
model.add(tf.keras.layers.Flatten())
model.add(Dense(300,use_bias=False,kernel_initializer='random_normal',kernel_regularizer=tf.keras.regularizers.l1(0.01),activation="linear"))
# model.add(SimpleRNN(1, return_sequences=False,kernel_initializer='random_normal',kernel_regularizer=tf.keras.regularizers.l1(0.01),use_bias=False,recurrent_regularizer=tf.keras.regularizers.l1(0.01),activation="sigmoid"))
model.add(Dense(3,use_bias=False,kernel_initializer='random_normal',kernel_regularizer=tf.keras.regularizers.l1(0.01),activation="softmax"))
model.summary()
the accuracy quickly reached 100% after 4 epochs. I think that because you have only 1 output neuron in the SimpleRNN, it can't encode enough information about which class the input belongs to, at least not with just 1 Dense layer afterwards.
3) You are using BinaryAccuracy in your metrics, which doesn't make a lot of sense here. You can just use the normal accuracy (as a string) for the accuracy metric (metrics = ["accuracy", tf.keras.metrics.TruePositives(...), ...])
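To see why the binary metric is misleading here, the two definitions can be reproduced by hand in NumPy. This is a sketch, assuming the default 0.5 threshold for binary accuracy and using made-up predictions that resemble the printed output (class 2 barely winning every time):

```python
import numpy as np

# One-hot labels with the same class counts as in the question: 20 / 20 / 10.
labels = np.array([0] * 20 + [1] * 20 + [2] * 10)
y_true = np.eye(3)[labels]

# Hypothetical predictions: roughly uniform, class 2 always slightly ahead.
y_pred = np.tile([0.33, 0.33, 0.34], (50, 1))

# Binary accuracy thresholds each of the 150 entries at 0.5 independently.
binary_acc = np.mean((y_pred > 0.5) == (y_true == 1))

# Categorical accuracy compares the arg-max class of each sample with its label.
categorical_acc = np.mean(y_pred.argmax(axis=1) == labels)

print(binary_acc, categorical_acc)
```

No entry exceeds the threshold, so two out of every three entries count as correct and the binary figure comes out around 0.67, even though only the 10 samples of class 2 (20%) are actually classified correctly.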
I am building an LSTM that handles several parallel sequences, and I'm struggling to find any BrainScript example that handles dynamic axes.
In my specific case, an example consists of a binary label and N sequences, where each sequence i has a fixed length (but the lengths of sequences i and j may differ for j<>i).
For example, sequence 1 is always length 1024, sequence 2 is length 4096, sequence 3 is length 1024.
I am expressing these sequences by packing them in parallel in the CNTK text format:
0 |Label 1 |S1 0 |S2 1 |S3 0
0 |S1 1 |S2 1 |S3 1
... another 1021 rows
0 |S2 0
0 |S2 1
... another 3070 rows with only S2 defined
1 |Label 0 |S1 0 |S2 1 |S3 0
1 |S1 1 |S2 1 |S3 0
... another 1021 rows
1 |S2 1
1 |S2 0
... another 3070 rows with only S2 defined
2 |Label ...
and so on. I feel as though I've constructed examples like this in the past but I've been unable to track down any sample configs, or even any BS examples that specify dynamic axes. Is this approach doable?
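To make the layout concrete, here is a small Python sketch that generates rows in this packed form. It only mirrors the rows shown above; the helper name `to_ctf_lines` and the shortened sequence lengths (2, 4 and 2 instead of 1024, 4096 and 1024) are made up for illustration.

```python
def to_ctf_lines(seq_id, label, streams):
    """Yield CTF text rows for one example.

    streams maps a stream name to its list of scalar values. The label is
    written on the first row only, and a stream is simply omitted from a row
    once it has run out of values.
    """
    n_rows = max(len(values) for values in streams.values())
    for t in range(n_rows):
        parts = [str(seq_id)]
        if t == 0:
            parts.append(f"|Label {label}")
        for name, values in streams.items():
            if t < len(values):
                parts.append(f"|{name} {values[t]}")
        yield " ".join(parts)


# Shortened, hypothetical example: S1 and S3 have 2 steps, S2 has 4.
example = {"S1": [0, 1], "S2": [1, 1, 0, 1], "S3": [0, 1]}
lines = list(to_ctf_lines(0, 1, example))
print("\n".join(lines))
```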
The G2P example (...\Examples\SequenceToSequence\CMUDict\BrainScript\G2P.cntk) uses multiple dynamic axes. This is a snippet from this file:
# inputs and axes must be defined on top-scope level in order to get a clean node name from BrainScript.
inputAxis = DynamicAxis()
rawInput = Input (inputVocabDim, dynamicAxis=inputAxis, tag='feature')
rawLabels = Input (labelVocabDim, tag='label')
However, since in your case the axes all have the same length for each input, you may also want to consider just putting them into fixed-size tensors. E.g., instead of 1024 values, you would have a single value of dimension 1024.
The choice depends on what you want to do with the sequences. Are you planning to run a recurrence over them? If so, you want to keep them as dynamic sequences. If they are just vectors that you plan to process with, say, big matrix products, you would rather want to keep them as static axes.
I have train / test input files in this format (filename label):
...\000881.JPG 2
...\000961.JPG 1
...\001700.JPG 1
...\001291.JPG 1
The input file above will be used with the ImageDeserializer. Since I have been unable to retrieve a row ID and the label from my code after the model has been trained, I created a second test file in this format:
|index 881 |piece_type 0 0 1 0 0 0
|index 961 |piece_type 0 1 0 0 0 0
|index 1700 |piece_type 0 1 0 0 0 0
|index 1291 |piece_type 0 1 0 0 0 0
The second file contains the same information as the first, just formatted differently. The index is the row number and |piece_type is the label in one-hot encoding. I need the file in the second format in order to get at the row number and the label. The second file is used with the CTFDeserializer to create a composite reader like this:
image_source = ImageDeserializer(map_file, StreamDefs(
features = StreamDef(field='image', transforms=transforms), # first column in map file is referred to as 'image'
labels = StreamDef(field='label', shape=num_classes) # and second as 'label'
))
text_source = CTFDeserializer("test_map2.txt")
text_source.map_input('index', dim=1, format="dense")
text_source.map_input('piece_type', dim=6, format="dense")
# define a composite reader
reader_config = ReaderConfig([image_source, text_source])
minibatch_source = reader_config.minibatch_source()
The reason I added the second file is to be able to create a confusion matrix, for which I need both the true labels and the predicted labels for a given minibatch that I test with. The row numbers are nice to have in order to get a pointer back to the input images.
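What I am after is roughly the following, sketched here in plain Python outside of CNTK. The `predicted` list is a made-up placeholder for the model output, and `parse_ctf_line` is a hypothetical helper:

```python
def parse_ctf_line(line):
    """Turn '|index 881 |piece_type 0 0 1 0 0 0' into (881, 2)."""
    fields = {}
    for chunk in line.split("|")[1:]:
        name, *values = chunk.split()
        fields[name] = [float(v) for v in values]
    index = int(fields["index"][0])
    one_hot = fields["piece_type"]
    # Position of the largest entry is the class id.
    return index, one_hot.index(max(one_hot))


lines = [
    "|index 881 |piece_type 0 0 1 0 0 0",
    "|index 961 |piece_type 0 1 0 0 0 0",
    "|index 1700 |piece_type 0 1 0 0 0 0",
    "|index 1291 |piece_type 0 1 0 0 0 0",
]
parsed = [parse_ctf_line(line) for line in lines]
true_labels = [label for _, label in parsed]
predicted = [2, 1, 2, 1]  # hypothetical predictions for the four rows

# confusion[t][p] counts rows with true class t that were predicted as class p.
num_classes = 6
confusion = [[0] * num_classes for _ in range(num_classes)]
for t, p in zip(true_labels, predicted):
    confusion[t][p] += 1

for row in confusion:
    print(row)
```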
Would it somehow be possible to do this with just one input file? It's a bit of a hassle to deal with multiple files and formats.
You could load the test images without using a reader, as described in this wiki page. Admittedly this puts the burden of all the transformations (cropping, mean subtraction, etc.) on the user, but at least the PIL package makes these easy. This CNTK tutorial uses PIL to crop and scale the input images before feeding them to CNTK.
I have a weighted directed graph where there are no cycles, and I wish to define the constraints so that I can solve a maximization of the weights of a path with linear programming. However, I can't wrap my head around how to do that.
For this I wish to use the LPSolve tool. I thought about making an adjacency matrix, but I don't know how I could make that work with LPSolve.
How can I define the possible paths from each node using constraints and make it generic enough that it would be simple to adapt to other graphs?
Since you have a weighted directed graph, it is sufficient to define a binary variable x_e for each edge e and to add constraints specifying that the source node has flow balance 1 (there is one more outgoing edge selected than incoming edge), the destination node has flow balance -1 (there is one more incoming edge than outgoing edge selected), and every other node has flow balance 0 (there are the same number of outgoing and incoming edges selected). Since your graph has no cycles, this will result in a path from the source to the destination (assuming one exists). You can maximize the weights of the selected edges.
I'll continue the exposition in R using the lpSolve package. Consider a graph with the following edges:
(edges <- data.frame(source=c(1, 1, 2, 3), dest=c(2, 3, 4, 4), weight=c(2, 7, 3, -4)))
#   source dest weight
# 1      1    2      2
# 2      1    3      7
# 3      2    4      3
# 4      3    4     -4
The maximum-weight path from 1 to 4 is 1 -> 2 -> 4, with weight 5 (1 -> 3 -> 4 has weight 3).
We need the flow balance constraints for each of our four nodes:
source <- 1
dest <- 4
(nodes <- unique(c(edges$source, edges$dest)))
# [1] 1 2 3 4
(constr <- t(sapply(nodes, function(n) (edges$source == n) - (edges$dest == n))))
#      [,1] [,2] [,3] [,4]
# [1,]    1    1    0    0
# [2,]   -1    0    1    0
# [3,]    0   -1    0    1
# [4,]    0    0   -1   -1
(rhs <- ifelse(nodes == source, 1, ifelse(nodes == dest, -1, 0)))
# [1] 1 0 0 -1
Now we can put everything together into our model and solve:
library(lpSolve)
mod <- lp(direction = "max",
          objective.in = edges$weight,
          const.mat = constr,
          const.dir = rep("=", length(nodes)),
          const.rhs = rhs,
          all.bin = TRUE)
edges[mod$solution > 0.999,]
#   source dest weight
# 1      1    2      2
# 3      2    4      3
mod$objval
# [1] 5
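If you want to sanity-check the constraint construction without a solver, the same flow-balance rows can be built in plain Python and checked by brute force. This is only a sketch for tiny graphs: the exhaustive enumeration of all 0/1 edge selections replaces the LP solver and would not scale, and `is_feasible` is a hypothetical helper name.

```python
from itertools import product

edges = [(1, 2, 2), (1, 3, 7), (2, 4, 3), (3, 4, -4)]  # (source, dest, weight)
source, dest = 1, 4
nodes = sorted({n for s, d, _ in edges for n in (s, d)})

# One row per node: +1 for each outgoing edge, -1 for each incoming edge.
constr = [[(s == n) - (d == n) for s, d, _ in edges] for n in nodes]
rhs = [1 if n == source else -1 if n == dest else 0 for n in nodes]


def is_feasible(x):
    """Check that the edge selection x satisfies every flow-balance equality."""
    return all(sum(a * xi for a, xi in zip(row, x)) == b
               for row, b in zip(constr, rhs))


def total_weight(x):
    return sum(w * xi for (_, _, w), xi in zip(edges, x))


# Brute force over all 0/1 edge selections (stand-in for the LP solver).
feasible = [x for x in product((0, 1), repeat=len(edges)) if is_feasible(x)]
best = max(feasible, key=total_weight)
print(best, total_weight(best))
```

Only the two source-to-destination paths pass the feasibility check, and the better one selects the edges 1 -> 2 and 2 -> 4 with total weight 5, in agreement with the lpSolve result.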