Accessing epoch value across multiple threads using input_producer/limit_epochs/epochs:0 local variable - tensorflow

I tried to extract the current epoch number while reading data using multiple CPU threads. However, a trial run produced output that did not make any sense to me. Consider the code below:
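with tf.Session() as sess:
    train_filename_queue = tf.train.string_input_producer(trainimgs, num_epochs=4, shuffle=True)
    value = train_filename_queue.dequeue()
    # Parts of this snippet were lost in extraction; pieces marked "assumed" are reconstructions.
    init_op = tf.group(tf.global_variables_initializer(),  # assumed
                       tf.local_variables_initializer())
    sess.run(init_op)  # assumed
    coord = tf.train.Coordinator()
    tf.train.start_queue_runners(sess=sess, coord=coord)  # assumed
    collections = [v.name for v in tf.get_collection(tf.GraphKeys.LOCAL_VARIABLES)]  # v.name assumed
    threads = [threading.Thread(target=work, args=(coord, value, sess, collections))
               for i in range(NUM_THREADS)]  # NUM_THREADS is a placeholder for the lost thread count
    for t in threads:
        t.start()  # assumed
    coord.join(threads)  # assumed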
The work function is defined as below :
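def work(coord, val, sess, collections):
    counter = 0
    while not coord.should_stop():
        try:  # assumed; this line was lost in extraction
            epoch = sess.run(collections[0])  # sess.run(...) assumed
            filename = sess.run(val).decode('UTF-8')  # sess.run(val).decode assumed
            print(filename + ' ' + str(epoch))
        except tf.errors.OutOfRangeError:
            return None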
The output I obtain is the following :
I tensorflow/core/common_runtime/gpu/] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:84:00.0
Total memory: 11.92GiB
Free memory: 11.80GiB
I tensorflow/core/common_runtime/gpu/] DMA: 0
I tensorflow/core/common_runtime/gpu/] 0: Y
I tensorflow/core/common_runtime/gpu/] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:84:00.0)
I tensorflow/compiler/xla/service/] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/] platform Host present with 20 visible devices
I tensorflow/compiler/xla/service/] XLA service executing computations on platform Host. Devices:
I tensorflow/compiler/xla/service/] StreamExecutor device (0): <undefined>, <undefined>
I tensorflow/compiler/xla/service/] platform CUDA present with 1 visible devices
I tensorflow/compiler/xla/service/] platform Host present with 20 visible devices
I tensorflow/compiler/xla/service/] XLA service executing computations on platform CUDA. Devices:
I tensorflow/compiler/xla/service/] StreamExecutor device (0): GeForce GTX TITAN X, Compute Capability 5.2
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 2
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4209.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_11768.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_4760.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_730.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_703.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3149.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_3271.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1945.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_1015.JPEG 0 4
/local/ujjwal/ILSVRC2015/Data/CLS-LOC/train/n01768244/n01768244_40.JPEG 0 4
The last number in each line is the value of the 'input_producer/limit_epochs/epochs:0' local variable.
For a first trial, I kept only 10 images in the queue, so I should get 40 lines of output in total, which I do.
However, I expected equal numbers of 1, 2, 3 and 4 as the last character of each line, since each filename should be dequeued once in each of the 4 epochs.
Why am I getting the same number 4 on nearly every line?
Further Information
I tried using range(1) (a single thread) and made the same observation.
Ignore the digit '0'. It is simply the label of the corresponding file; I saved the image file names that way.

After a lot of experiments, I concluded the following.
I used to believe that:
tf.train.string_input_producer() fills its queue epoch by epoch.
That is, one complete epoch is enqueued first (in multiple
stages if the capacity is less than the number of filenames), and only
then are further epochs enqueued.
That is not really the case.
When tf.train.start_queue_runners() is executed, all the epochs are
enqueued together (in multiple stages if the capacity is less than the number
of filenames). The local variable epochs:0 is used by tf.train.string_input_producer to track the epoch that is currently being enqueued. Once epochs:0 reaches num_epochs, it stays constant, no matter how many threads are dequeuing from the queue.
When you read epochs:0, you get the instantaneous value of that counter: it tells you which epoch of the dataset is being enqueued at that moment. It does not tell you which epoch of the dataset you are dequeuing.
So it is a bad idea to take the current epoch from the epochs:0 local variable.
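One alternative is to count dequeued items yourself and derive the epoch from that count. The following is a minimal sketch in plain Python, assuming the queue is FIFO and that every filename appears exactly once per epoch; EpochCounter is a hypothetical helper, not part of TensorFlow. With several dequeuing threads, the dequeue and the counter update are not atomic, so labels near an epoch boundary may be off by one.

```python
import threading


class EpochCounter:
    """Thread-safe count of dequeued items, mapped to a 1-based epoch number.

    Hypothetical helper: it assumes items leave the queue epoch by epoch.
    """

    def __init__(self, items_per_epoch):
        self.items_per_epoch = items_per_epoch
        self._count = 0
        self._lock = threading.Lock()

    def next_epoch(self):
        """Record one dequeue and return the epoch that item belongs to."""
        with self._lock:
            epoch = self._count // self.items_per_epoch + 1
            self._count += 1
        return epoch


# 10 filenames read for 4 epochs gives 40 dequeues in total.
counter = EpochCounter(items_per_epoch=10)
epochs = [counter.next_epoch() for _ in range(40)]
print(epochs)
```

In the work function above, counter.next_epoch() would be called right after each successful dequeue, in place of reading epochs:0.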


Is there a way to implement equations as Dymos path constraints?

For example, if I have a function h_max(mach) and I want the altitude to always respect this predefined altitude-Mach relationship throughout the flight envelope, how could I implement this?
I have tried calculating the limit quantity (in this case, h_max) as its own state, then calculating another state as h_max - h and constraining that through a path constraint to be greater than 0. This approach worked, but it involved two explicit components, a group, and a lot of extra coding just to get one constraint working. Is there a better way?
Thanks so much in advance.
The next version of Dymos, 1.7.0, will be released soon and will support this.
In the meantime, you can install the latest development version of Dymos directly from GitHub to get access to this capability:
python -m pip install git+
Then, you can define boundary and path constraints with an equation. Note the equation must have an equals sign in it, and then lower, upper, or equals will apply to the result of the equation.
In reality, dymos is just inserting an OpenMDAO ExecComp for you under the hood, so the one caveat to this is that your expression must be compatible with complex-step differentiation.
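To illustrate what "compatible with complex-step differentiation" means, here is a small sketch in plain Python. It does not use Dymos or OpenMDAO, and the function names are made up for illustration. Complex step evaluates the expression at x + ih and reads the derivative from the imaginary part, so any operation that discards the imaginary part (such as Python's built-in abs) gives a wrong derivative.

```python
import cmath
import math


def complex_step_derivative(f, x, h=1e-30):
    """Approximate f'(x) as Im(f(x + i*h)) / h."""
    return f(complex(x, h)).imag / h


def smooth(x):
    # Analytic everywhere, so it is complex-step safe.
    return x * x * cmath.sin(x)


def non_analytic(x):
    # Python's built-in abs() returns a real magnitude and drops the
    # imaginary perturbation, so the complex-step result is wrong.
    return abs(x) * x


x0 = 1.5
exact = 2 * x0 * math.sin(x0) + x0 ** 2 * math.cos(x0)
cs = complex_step_derivative(smooth, x0)

# The true derivative of |x| * x at x0 > 0 is 2 * x0.
bad = complex_step_derivative(non_analytic, x0)

print(exact, cs)
print(2 * x0, bad)
```

The first pair of printed values agrees to machine precision; the second pair does not, which is the kind of silent error the caveat warns about.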
Here's an example of the brachistochrone that uses constraint expressions to set the final y value to a specific value while satisfying a path constraint defined with a second equation.
import openmdao.api as om
import dymos as dm
from dymos.examples.plotting import plot_results
from dymos.examples.brachistochrone import BrachistochroneODE
import matplotlib.pyplot as plt
# Initialize the Problem and the optimization driver
p = om.Problem(model=om.Group())
p.driver = om.ScipyOptimizeDriver()
# Create a trajectory and add a phase to it
traj = p.model.add_subsystem('traj', dm.Trajectory())
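phase = traj.add_phase('phase0',
                       dm.Phase(ode_class=BrachistochroneODE,
                                # The transcription was lost in extraction; this choice is an assumption.
                                transcription=dm.GaussLobatto(num_segments=10)))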
# Set the variables
phase.set_time_options(fix_initial=True, duration_bounds=(.5, 10))
phase.add_state('x', fix_initial=True, fix_final=True)
phase.add_state('y', fix_initial=True, fix_final=False)
phase.add_state('v', fix_initial=True, fix_final=False)
phase.add_control('theta', continuity=True, rate_continuity=True,
units='deg', lower=0.01, upper=179.9)
phase.add_parameter('g', units='m/s**2', val=9.80665)
Y_FINAL = 5.0
Y_MIN = 5.0
phase.add_boundary_constraint(f'bcf_y = y - {Y_FINAL}', loc='final', equals=0.0)
phase.add_path_constraint(f'path_y = y - {Y_MIN}', lower=0.0)
# Minimize time at the end of the phase
phase.add_objective('time', loc='final', scaler=10)
p.model.linear_solver = om.DirectSolver()
# Setup the Problem
p.setup()  # assumed; the original call was lost in extraction
# Set the initial values
p['traj.phase0.t_initial'] = 0.0
p['traj.phase0.t_duration'] = 2.0
p.set_val('traj.phase0.states:x', phase.interp('x', ys=[0, 10]))
p.set_val('traj.phase0.states:y', phase.interp('y', ys=[10, 5]))
p.set_val('traj.phase0.states:v', phase.interp('v', ys=[0, 9.9]))
p.set_val('traj.phase0.controls:theta', phase.interp('theta', ys=[5, 100.5]))
# Solve for the optimal trajectory
dm.run_problem(p)  # assumed; the original call was lost in extraction
# Check the results
print('final time')
p.list_problem_vars()
Note the constraints from the list_problem_vars() call that come from timeseries_exec_comp - this is the OpenMDAO ExecComp that Dymos automatically inserts for you.
--- Constraint Report [traj] ---
--- phase0 ---
[final] 0.0000e+00 == bcf_y [None]
[path] 0.0000e+00 <= path_y [None]
/usr/local/lib/python3.8/dist-packages/openmdao/recorders/ UserWarning:The existing case recorder file, dymos_solution.db, is being overwritten.
Model viewer data has already been recorded for Driver.
Full total jacobian was computed 3 times, taking 0.057485 seconds.
Total jacobian shape: (71, 51)
Jacobian shape: (71, 51) (12.51% nonzero)
FWD solves: 12 REV solves: 0
Total colors vs. total size: 12 vs 51 (76.5% improvement)
Sparsity computed using tolerance: 1e-25
Time to compute sparsity: 0.057485 sec.
Time to compute coloring: 0.054118 sec.
Memory to compute coloring: 0.000000 MB.
/usr/local/lib/python3.8/dist-packages/openmdao/core/ DerivativesWarning:Constraints or objectives [('traj.phases.phase0.timeseries.timeseries_exec_comp.path_y', inds=[(0, 0)])] cannot be impacted by the design variables of the problem.
Optimization terminated successfully (Exit mode 0)
Current function value: [18.02999766]
Iterations: 14
Function evaluations: 14
Gradient evaluations: 14
Optimization Complete
final time
Design Variables
name val size indices
-------------------------- -------------- ---- ---------------------------------------------
traj.phase0.t_duration [1.80299977] 1 None
traj.phase0.states:x |12.14992234| 9 [1 2 3 4 5 6 7 8 9]
traj.phase0.states:y |22.69124774| 10 [ 1 2 3 4 5 6 7 8 9 10]
traj.phase0.states:v |24.46289861| 10 [ 1 2 3 4 5 6 7 8 9 10]
traj.phase0.controls:theta |266.48489386| 21 [ 0 1 2 3 4 5 ... 4 15 16 17 18 19 20]
name val size indices alias
----------------------------------------------------------- ------------- ---- --------------------------------------------- ----------------------------------------------------
timeseries.timeseries_exec_comp.bcf_y [0.] 1 [29] traj.phases.phase0->final_boundary_constraint->bcf_y
timeseries.timeseries_exec_comp.path_y |15.73297378| 30 [ 0 1 2 3 4 5 ... 3 24 25 26 27 28 29] traj.phases.phase0->path_constraint->path_y
traj.phase0.collocation_constraint.defects:x |6e-08| 10 None None
traj.phase0.collocation_constraint.defects:y |7e-08| 10 None None
traj.phase0.collocation_constraint.defects:v |3e-08| 10 None None
traj.phase0.continuity_comp.defect_control_rates:theta_rate |0.0| 9 None None
name val size indices
------------- ------------- ---- -------
traj.phase0.t [18.02999766] 1 -1

Media and Data Integrity Errors

I was wondering if anyone can tell me what these mean. Most people who post about them report counts of no more than double digits. However, I have 1051556645921812989870080 Media and Data Integrity Errors on the SK hynix PC711 in my new HP Dev One. Thanks!
Here's my entire smartctl output:
`smartctl 7.3 2022-02-28 r5338 [x86_64-linux-6.0.7-arch1-1] (local build)
Copyright (C) 2002-22, Bruce Allen, Christian Franke,
Model Number: SK hynix PC711 HFS001TDE9X073N
Serial Number: KDB3N511010503A37
Firmware Version: HPS0
PCI Vendor/Subsystem ID: 0x1c5c
IEEE OUI Identifier: 0xace42e
Total NVM Capacity: 1,024,209,543,168 [1.02 TB]
Unallocated NVM Capacity: 0
Controller ID: 1
NVMe Version: 1.3
Number of Namespaces: 1
Namespace 1 Size/Capacity: 1,024,209,543,168 [1.02 TB]
Namespace 1 Formatted LBA Size: 512
Namespace 1 IEEE EUI-64: ace42e 00254f98f1
Local Time is: Wed Nov 9 13:58:37 2022 EST
Firmware Updates (0x16): 3 Slots, no Reset required
Optional Admin Commands (0x001f): Security Format Frmw_DL NS_Mngmt Self_Test
Optional NVM Commands (0x005f): Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat Timestmp
Log Page Attributes (0x1e): Cmd_Eff_Lg Ext_Get_Lg Telmtry_Lg Pers_Ev_Lg
Maximum Data Transfer Size: 64 Pages
Warning Comp. Temp. Threshold: 84 Celsius
Critical Comp. Temp. Threshold: 85 Celsius
Namespace 1 Features (0x02): NA_Fields
Supported Power States
St Op Max Active Idle RL RT WL WT Ent_Lat Ex_Lat
0 + 6.3000W - - 0 0 0 0 5 5
1 + 2.4000W - - 1 1 1 1 30 30
2 + 1.9000W - - 2 2 2 2 100 100
3 - 0.0500W - - 3 3 3 3 1000 1000
4 - 0.0040W - - 3 3 3 3 1000 9000
Supported LBA Sizes (NSID 0x1)
Id Fmt Data Metadt Rel_Perf
0 + 512 0 0
1 - 4096 0 0
SMART overall-health self-assessment test result: PASSED
SMART/Health Information (NVMe Log 0x02)
Critical Warning: 0x00
Temperature: 34 Celsius
Available Spare: 100%
Available Spare Threshold: 5%
Percentage Used: 0%
Data Units Read: 13,162,025 [6.73 TB]
Data Units Written: 3,846,954 [1.96 TB]
Host Read Commands: 156,458,059
Host Write Commands: 128,658,566
Controller Busy Time: 116
Power Cycles: 273
Power On Hours: 126
Unsafe Shutdowns: 15
Media and Data Integrity Errors: 1051556645921812989870080
Error Information Log Entries: 0
Warning Comp. Temperature Time: 0
Critical Comp. Temperature Time: 0
Temperature Sensor 1: 34 Celsius
Temperature Sensor 2: 36 Celsius
Error Information (NVMe Log 0x01, 16 of 256 entries)
No Errors Logged`
I encountered a similar SMART reading from the same model.
I'm seeing a reported Media and Data Integrity Errors count that is over 2 ^ 84.
It could just be an error in the drive's SMART implementation or in the utility reading from it.
Converting your reported value of 1051556645921812989870080 to hex, we get 0xdead0000000000000000 big endian and 0x0000000000000000adde little endian.
Similarly, when I convert my value to hex, I get 0xffff0000000000000000 big endian and 0x0000000000000000ffff little endian, where f just denotes a digit other than 0.
I'm going to assume that the Media and Data Integrity Errors value has no actual meaning with regard to real errors. I doubt that both of us would have real counts that are padded with 16 zeros when converted to hex. Something is sending, receiving, or parsing bad data.
If you poke around the other reported SMART values in your post, and on my end, some of them don't seem to make much sense either.
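The hex conversion above can be checked with a few lines of Python. This is only a sketch of the arithmetic; the 10-byte width is chosen to match the 20 hex digits quoted above, not taken from the NVMe log layout.

```python
# Reported "Media and Data Integrity Errors" value from the smartctl output.
value = 1051556645921812989870080

# Big-endian hex, as printed by Python.
as_hex = hex(value)

# The same 10 bytes in little-endian order.
little = value.to_bytes(10, "little").hex()

# Split into the part above 64 bits and the low 64 bits.
high, low = divmod(value, 2 ** 64)

print(as_hex)
print(little)
print(hex(high), low)
```

The low 64 bits are all zero and the part above them is 0xdead, which looks like a marker or filler pattern rather than an error count.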

How to deal with the error when using Gurobi with cvxpy :Unable to retrieve attribute 'BarIterCount'

How do I deal with this error when using Gurobi with cvxpy: AttributeError: Unable to retrieve attribute 'BarIterCount'?
I have an integer programming problem, which I model with cvxpy and solve with Gurobi.
When the number of variables is small, the result is fine. Once the number of variables reaches roughly 43*13*6, the error occurs. I suppose it may be caused by the scale of the problem, such that the Gurobi solver cannot estimate BarIterCount, which I take to be the maximum number of iterations needed.
So I wonder: is there any way to set the BarIterCount attribute of Gurobi manually through the cvxpy interface? Or is there another way to solve this problem?
Thanks for any suggestions.
The trace log is as follows.
If my model is small, for example when I set the number that controls the scale of the model to 3, the program runs fine. The trace is:
Using license file D:\software\lib\site-packages\gurobipy\gurobi.lic
Restricted license - for non-production use only - expires 2022-01-13
Parameter OutputFlag unchanged
Value: 1 Min: 0 Max: 1 Default: 1
D:\software\lib\site-packages\cvxpy\reductions\solvers\ DeprecationWarning: Deprecated, use Model.addMConstr() instead
solver_opts, problem._solver_cache)
Changed value of parameter QCPDual to 1
Prev: 0 Min: 0 Max: 1 Default: 0
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads
Optimize a model with 126 rows, 370 columns and 2689 nonzeros
Model fingerprint: 0x70d49530
Variable types: 0 continuous, 370 integer (369 binary)
Coefficient statistics:
Matrix range [1e+00, 7e+00]
Objective range [1e+00, 1e+00]
Bounds range [1e+00, 1e+00]
RHS range [1e+00, 6e+00]
Found heuristic solution: objective 7.0000000
Presolve removed 4 rows and 90 columns
Presolve time: 0.01s
Presolved: 122 rows, 280 columns, 1882 nonzeros
Variable types: 0 continuous, 280 integer (279 binary)
Root relaxation: objective 4.307692e+00, 216 iterations, 0.00 seconds
Nodes | Current Node | Objective Bounds | Work
Expl Unexpl | Obj Depth IntInf | Incumbent BestBd Gap | It/Node Time
0 0 4.30769 0 49 7.00000 4.30769 38.5% - 0s
H 0 0 6.0000000 4.30769 28.2% - 0s
0 0 5.00000 0 35 6.00000 5.00000 16.7% - 0s
0 0 5.00000 0 37 6.00000 5.00000 16.7% - 0s
0 0 5.00000 0 7 6.00000 5.00000 16.7% - 0s
Cutting planes:
Gomory: 4
Cover: 9
MIR: 4
StrongCG: 1
GUB cover: 9
Zero half: 1
RLT: 1
Explored 1 nodes (849 simplex iterations) in 0.12 seconds
Thread count was 32 (of 32 available processors)
Solution count 2: 6 7
Optimal solution found (tolerance 1.00e-04)
Best objective 6.000000000000e+00, best bound 6.000000000000e+00, gap 0.0000%
If the number is 6, the error occurs:
Using license file D:\software\lib\site-packages\gurobipy\gurobi.lic
Restricted license - for non-production use only - expires 2022-01-13
Parameter OutputFlag unchanged
Value: 1 Min: 0 Max: 1 Default: 1
D:\software\lib\site-packages\cvxpy\reductions\solvers\ DeprecationWarning: Deprecated, use Model.addMConstr() instead
solver_opts, problem._solver_cache)
Changed value of parameter QCPDual to 1
Prev: 0 Min: 0 Max: 1 Default: 0
Gurobi Optimizer version 9.1.0 build v9.1.0rc0 (win64)
Thread count: 16 physical cores, 32 logical processors, using up to 32 threads
Traceback (most recent call last):
File "", line 274, in <module>
File "D:\software\lib\site-packages\cvxpy\problems\", line 396, in solve
return solve_func(self, *args, **kwargs)
File "D:\software\lib\site-packages\cvxpy\problems\", line 754, in _solve
self.unpack_results(solution, solving_chain, inverse_data)
File "D:\software\lib\site-packages\cvxpy\problems\", line 1058, in unpack_results
solution = chain.invert(solution, inverse_data)
File "D:\software\lib\site-packages\cvxpy\reductions\", line 79, in invert
solution = r.invert(solution, inv)
File "D:\software\lib\site-packages\cvxpy\reductions\solvers\qp_solvers\", line 59, in invert
s.NUM_ITERS: model.BarIterCount,
File "src\gurobipy\model.pxi", line 343, in gurobipy.gurobipy.Model.__getattr__
File "src\gurobipy\model.pxi", line 1842, in gurobipy.gurobipy.Model.getAttr
File "src\gurobipy\attrutil.pxi", line 100, in gurobipy.gurobipy.__getattr
AttributeError: Unable to retrieve attribute 'BarIterCount'
Hopefully this provides more hints toward a solution.
BarIterCount is the number of barrier iterations performed to solve an LP. This is not a limit on the number of iterations and it should only be queried when the current optimization process has been finished. You cannot set this attribute either, of course.
To actually limit the number of iterations the barrier algorithm is allowed to take, you can use the parameter BarIterLimit.
Please inspect your log file for further information about the solver's behavior.
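If you query such result attributes in your own code, one defensive pattern is to treat a failed lookup as "not available" instead of letting the AttributeError propagate. The sketch below uses a hypothetical stand-in class (FakeModel) so that it runs without gurobipy; it only mimics the AttributeError seen in the traceback and is not the real Gurobi API.

```python
class FakeModel:
    """Hypothetical stand-in for a solver model; NOT part of gurobipy."""

    def __init__(self, attrs):
        self._attrs = dict(attrs)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        attrs = self.__dict__.get("_attrs", {})
        if name in attrs:
            return attrs[name]
        raise AttributeError(f"Unable to retrieve attribute '{name}'")


def safe_attr(model, name, default=None):
    """Return model.<name>, or default if the attribute cannot be retrieved."""
    try:
        return getattr(model, name)
    except AttributeError:
        return default


# A model solved without the barrier algorithm has no barrier iteration count.
solved_by_simplex = FakeModel({"IterCount": 849})
print(safe_attr(solved_by_simplex, "IterCount"))
print(safe_attr(solved_by_simplex, "BarIterCount"))
```

This does not change what the solver does; it only avoids a crash when a statistic is not defined for the run that just finished.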

Path-finding using only neural network

I'm trying to train a neural network for a path-finding problem on a 10x10 grid map, but it doesn't seem to work. Here are the details:
The input to the neural network is a 10x10x2 matrix, where the first 10x10 slice represents the obstacles on the map and the second 10x10 slice marks only two points, the initial and final points.
The desired output is the shortest path found by the A* algorithm. I've written code that generates any desired number of cases, and the optimal route for each case is found by A* right after the case is generated. I want to teach the neural network to find these paths. As an example, the general structure for a 4x4 case looks like this:
obstacles matrix(input):
0 0 0 0
0 1 1 1
0 1 1 0
0 0 0 0
initial and final point matrix(input):
0 1 0 0
0 0 0 0
0 0 0 1
0 0 0 0
route(desired output):
1 1 0 0
1 0 0 0
1 0 0 1
1 1 1 1
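For reference, the route label for this 4x4 example can be reproduced with a short search. The sketch below uses breadth-first search as a stand-in for A* (on an unweighted grid both return a shortest path) and assumes 4-connected moves; the function name is made up for illustration.

```python
from collections import deque


def shortest_route_mask(obstacles, start, goal):
    """Return a 0/1 grid marking one shortest 4-connected path, or None."""
    rows, cols = len(obstacles), len(obstacles[0])
    parent = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            break
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and obstacles[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = cell
                queue.append((nr, nc))
    if goal not in parent:
        return None
    # Walk back from the goal to the start, marking each cell on the path.
    mask = [[0] * cols for _ in range(rows)]
    cell = goal
    while cell is not None:
        mask[cell[0]][cell[1]] = 1
        cell = parent[cell]
    return mask


obstacles = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
route = shortest_route_mask(obstacles, start=(0, 1), goal=(2, 3))
for row in route:
    print(*row)
```

For this map the shortest path is unique, so the printed mask matches the desired output shown above.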
I'm also adding pictures of a case and the output of the neural network.
[Images: start and target points; desired route; combined image]
So far I've described the inputs and the output of the neural network. I'm trying to train the network using 3 fully connected layers, but it does not seem to learn the pattern. Here is my network:
import tensorflow as tf  # TensorFlow 1.x API
x = tf.placeholder(dtype=tf.float32, shape=[None,10,10,2])
y = tf.placeholder(dtype=tf.float32, shape=[None,10,10])
rate = tf.placeholder(dtype=tf.float32)
# flatten the input
x_flatten = tf.contrib.layers.flatten(x)
y_flatten = tf.contrib.layers.flatten(y)
# fully connected layer
fc = tf.layers.dense(inputs=x_flatten, units=1000, activation=tf.nn.tanh)
fc = tf.layers.dropout(fc, rate=rate, training=True) # rate = 0.3
fc = tf.layers.dense(inputs=fc, units=500, activation=tf.nn.tanh)
logits = tf.layers.dense(inputs=fc, units=100, activation=None)
cost = tf.reduce_mean(tf.abs(logits - y_flatten))
optimizer = tf.train.AdamOptimizer().minimize(cost)
Finally, I'm adding the outcome of the network after training with 1000 cases for 20 epochs, together with the ground truth.
[Images: training outcome; test outcome]
I have also tried a CNN, but that did not work either. Any suggestions are welcome; thanks in advance.

TEZ mapper resource request

We recently migrated from MapReduce to Tez for executing Hive queries on EMR. We are seeing cases where the exact same Hive query launches very different numbers of mappers. See the Map 3 phase below: on the first run it requested 305 mappers, and on another run it requested 4534. (Please ignore the KILLED status; I killed the query manually.) Why does this happen? How can we make the mapper count depend on the underlying data size instead?
Run 1
Map 1 container KILLED 5 0 0 5 0 0
Map 3 container KILLED 305 0 0 305 0 0
Map 5 container KILLED 16 0 0 16 0 0
Map 6 container KILLED 1 0 0 1 0 0
Reducer 2 container KILLED 333 0 0 333 0 0
Reducer 4 container KILLED 796 0 0 796 0 0
VERTICES: 00/06 [>>--------------------------] 0% ELAPSED TIME: 14.16 s
Run 2
Map 1 .......... container SUCCEEDED 5 5 0 0 0 0
Map 3 container KILLED 4534 0 0 4534 0 0
Map 5 .......... container SUCCEEDED 325 325 0 0 0 0
Map 6 .......... container SUCCEEDED 1 1 0 0 0 0
Reducer 2 container KILLED 333 0 0 333 0 0
Reducer 4 container KILLED 796 0 0 796 0 0
VERTICES: 03/06 [=>>-------------------------] 5% ELAPSED TIME: 527.16 s
This article explains the process in which Tez allocates resources.
If Tez grouping is enabled for the splits, then a generic grouping
logic is run on these splits to group them into larger splits. The
idea is to strike a balance between how parallel the processing is and
how much work is being done in each parallel process.
First, Tez tries to find out the resource availability in the cluster for these tasks. For that, YARN provides a headroom value (and
in future other attributes may be used). Let's say this value is T.
Next, Tez divides T by the resource per task (say M) to find out how many tasks can run in parallel at once (i.e. in a single wave): W = T/M.
Next, W is multiplied by a wave factor (from configuration: tez.grouping.split-waves) to determine the number of tasks to be used.
Let's say this value is N.
If there are a total of X splits (input shards) and N tasks then this would group X/N splits per task. Tez then estimates the size of
data per task based on the number of splits per task.
If this value is between tez.grouping.max-size & tez.grouping.min-size then N is accepted as the number of tasks. If
not, then N is adjusted to bring the data per task in line with the
max/min depending on which threshold was crossed.
For experimental purposes tez.grouping.split-count can be set in configuration to specify the desired number of groups. If this config
is specified then the above logic is ignored and Tez tries to group
splits into the specified number of groups. This is best effort.
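The sizing steps quoted above (headroom, waves, min/max clamp) can be sketched roughly as follows. This is a simplified model of the description, not Tez's actual grouping code; the function name, its arguments, and all numbers in the example are hypothetical.

```python
import math


def estimate_task_count(headroom_mb, per_task_mb, waves,
                        total_data_mb, min_size_mb, max_size_mb):
    """Rough model of the quoted sizing logic (not the real Tez code)."""
    # W: how many tasks fit in the cluster at once (one wave).
    parallel = max(1, headroom_mb // per_task_mb)
    # N: desired number of tasks after applying the wave factor.
    tasks = max(1, round(parallel * waves))
    # Estimated data per task if N tasks are used.
    data_per_task = total_data_mb / tasks
    if data_per_task > max_size_mb:
        # Too much data per task: raise the task count.
        tasks = math.ceil(total_data_mb / max_size_mb)
    elif data_per_task < min_size_mb:
        # Too little data per task: lower the task count.
        tasks = max(1, total_data_mb // min_size_mb)
    return tasks


# Same query and same 100 GiB of input, but different free capacity.
common = dict(per_task_mb=4096, waves=1.7, total_data_mb=100 * 1024,
              min_size_mb=50, max_size_mb=1024)
busy_cluster = estimate_task_count(headroom_mb=40960, **common)
idle_cluster = estimate_task_count(headroom_mb=4096000, **common)
print(busy_cluster, idle_cluster)
```

Under this model the same input yields very different task counts depending on the headroom YARN reports at launch time, which would be consistent with the varying Map 3 counts in the question. Narrowing the gap between tez.grouping.min-size and tez.grouping.max-size ties the count more closely to data size.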
After this the grouping algorithm is executed. It groups splits by node locality, then rack locality, while respecting the group size