How to make quantized tflite support tf.pow function?

The original tf.pow function cannot be quantized, so I want to change how the pow function is implemented so that tflite can support it.
The original method of Pow function is
Po = Pi ^ gamma
changed to
ln (Po) = gamma * ln (Pi)
so,
Po = exp( gamma * ln (Pi) )
In this way, the power function can be replaced by the exp function and the ln function.
The Taylor expansions of the exp and ln functions need only multiplication and addition, and tflite provides both.
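As a rough illustration of the identity above, here is a minimal plain-Python sketch (not TFLite code) that evaluates Po = exp(gamma * ln(Pi)) with truncated Taylor series in Horner form, so each step is a multiply and an add with precomputed constants. The function names, term counts, and sample inputs are assumptions chosen for illustration. The ln series about x = 1 converges slowly as Pi approaches 0, so the term count would need tuning for a real model.

```python
import math

def ln_taylor(x, terms=60):
    """Approximate ln(x) with the Taylor series about x = 1 (valid for 0 < x <= 2)."""
    u = x - 1.0
    # Precomputed constants c_k = (-1)^(k+1) / k for k = 1..terms.
    coeffs = [((-1.0) ** (k + 1)) / k for k in range(1, terms + 1)]
    # Horner evaluation of u * (c_1 + c_2*u + c_3*u^2 + ...).
    acc = coeffs[-1]
    for c in reversed(coeffs[:-1]):
        acc = c + u * acc
    return u * acc

def exp_taylor(t, terms=30):
    """Approximate exp(t) with its Taylor series about t = 0."""
    # Horner form: 1 + t*(1 + t/2*(1 + t/3*(...))); 1/k is a constant.
    acc = 1.0
    for k in range(terms, 0, -1):
        acc = 1.0 + acc * t * (1.0 / k)
    return acc

def pow_approx(x, gamma):
    """x ** gamma for 0 < x <= 1, computed as exp(gamma * ln(x))."""
    return exp_taylor(gamma * ln_taylor(x))

if __name__ == "__main__":
    # Compare the approximation with the library result for a few inputs.
    for x in (0.25, 0.5, 0.9):
        print(x, pow_approx(x, 2.2), math.pow(x, 2.2))
```

Whether this runs faster than the built-in op after quantization depends on how many terms are needed; fewer terms mean fewer ops but a larger error near the ends of the input range.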
I only need the range 0~1 for the ln function. The range of the Taylor expansion of the ln function (at x=1) is as follows:
I need the whole range below 0 for the exp function, so I tried a piecewise Taylor expansion of the exponential function, as follows:
I also tried a linear approximation of the exponential function.
I still can't speed up the model with either method.
Can anyone help me?
quantized tflite support function list:
https://github.com/tensorflow/tensorflow/tree/f9bdcd6d9c714fc5dde232ba3b8b0a5128a07516/tensorflow/lite/delegates/hexagon

Related

Problem with variable recognition in Fortran

I am a complete beginner at Fortran and am working on a code that solves a kinetic mechanism by integrating differential equations over successive time steps with a differential equation solver.
Here is a link to download the zip file with the whole project:
http://www.filedropper.com/fortranmicropyrolyzersetup
The input variables for the differential solver are defined as follows in the code:
! Declaration of variables
implicit none
EXTERNAL :: FEXSB_AUTO, JEX_SB
integer :: neq,Mf,lrw,liw,iwork,itol,itask,istate,iopt !solver parameters
integer :: j,jjk,m !Counters
double precision :: ATOL,RTOL, RWORK !solver parameters
double precision :: T, TOUT !starting time (s), timestep time (s)
double precision :: Y, w !molar fraction of biomass (-), mass fraction of biomass (-)
double precision :: y_gas, w_gas, Conc !molar fraction in gas phase (-), mass fraction in gas phase (-), concentration in gas phase (mol/m³)
double precision :: n(speciescount) !number of moles (used temporarily to calculate initial molar fracions)
character*5 :: simulationNumber
!Setting the solver parameters
neq = SpeciesCount !number of equations
ITOL = 1 !RTOL and ATOL are scalars
RTOL = 1.0D-8 !Relative tolerance
ATOL = 1.0D-15 !Absolute tolerance
ITASK = 1
ISTATE = 1
LRW = 22 + 9*SpeciesCount + 2*SpeciesCount**2 !Array sizing (see VODE)
LIW = 30+SpeciesCount !Array sizing (see VODE)
MF = 22 !Use BDF with finite difference Jacobian
IOPT = 1 !Optional input specified
Iwork(6) = 7000 !Increase maximum iteration steps from 500 to 7000 otherwise the solver does not converge
The differential solver is then called in a do loop for each time step as follows:
! Solve reactor equations (see FEXSB) and advance time, until stop criterion is met
TimeStep = 0 ! dimensionless time step
!DO while(wm(1).lt.0.9999) -> previously used stop criterion
DO while((y_gas(1).lt.0.99999).OR.(TimeStep.lt.10)) !stop criterion: molar fraction Helium >= 0.99999 AND at least 10 timesteps done
  TimeStep = TimeStep + 1
  write(*,*)"Doing iteration for time step",TimeStep
  call DVODE(FEXSB_AUTO,NEQ,Y,T,TOUT,ITOL,RTOL,ATOL,ITASK,ISTATE,IOPT,RWORK,LRW,IWORK,LIW,JEX_SB,MF) !solve reactor equations to Y
  !CALL DVODE(FEXSB_AUTO,NEQ,Y,T,TOUT,ITOL,RTOL,ATOL,ITASK,ISTATE,IOPT,RWORK,LRW,IWORK,LIW,JEX_SB,MF,RPAR,IPAR)
  ! calculate w, y_gas, w_gas and Conc from Y
  do j = 1, SpeciesCount
    if (Y(j) .lt. 1.0D-10) then ! round to zero below 1e-10 to avoid negative numbers and numerical problems with small numbers
      Y(j) = 0.0D0
    endif
    if (MolarFlowRate(j) .lt. 1.0D-10) then
      MolarFlowRate(j) = 0.0D0
    endif
    w(j) = Y(j)*n0_tot*amms(j) / (mass_sample)
    y_gas(j) = MolarFlowRate(j) / TotalMolarFlowRate
    w_gas(j) = MolarFlowRate(j)*amms(j) / (1000*MassFlowRate) ! factor 1000 to put molar mass in kg/mol instead of g/mol
    Conc(j) = MolarFlowRate(j) / VolumetricFlowRate
  end do
The differential equation solver does not successfully complete the do loop that starts at the line:
DO while((y_gas(1).lt.0.99999).OR.(TimeStep.lt.10))
The variables seem to be recognized for the first time step, when the ODE solver is called in the line:
call DVODE(FEXSB_AUTO,NEQ,Y,T,TOUT,ITOL,RTOL,ATOL,ITASK,ISTATE,IOPT,RWORK,LRW,IWORK,LIW,JEX_SB,MF)
The solver successfully completes the first iteration. After the time step is incremented once more, the call is made again, but this time the variables are somehow not recognized. When I stopped the code in the debugger after the first time step, I found that the variables required by DVODE no longer have their values set; somehow they are cleared after the first successful iteration.
What might be causing this problem?
*DECK DVODE
SUBROUTINE DVODE (F, NEQ, Y, T, TOUT, ITOL, RTOL, ATOL, ITASK,
1 ISTATE, IOPT, RWORK, LRW, IWORK, LIW, JAC, MF,
2 RPAR, IPAR)
EXTERNAL F, JAC
DOUBLE PRECISION Y, T, TOUT, RTOL, ATOL, RWORK, RPAR
INTEGER NEQ, ITOL, ITASK, ISTATE, IOPT, LRW, IWORK, LIW,
1 MF, IPAR
DIMENSION Y(*), RTOL(*), ATOL(*), RWORK(LRW), IWORK(LIW),
1 RPAR(*), IPAR(*)
!-----------------------------------------------------------------------
! DVODE: Variable-coefficient Ordinary Differential Equation solver,
! with fixed-leading-coefficient implementation.
! This version is in double precision.
!
! DVODE solves the initial value problem for stiff or nonstiff
! systems of first order ODEs,
! dy/dt = f(t,y) , or, in component form,
! dy(i)/dt = f(i) = f(i,t,y(1),y(2),...,y(NEQ)) (i = 1,...,NEQ).
! DVODE is a package based on the EPISODE and EPISODEB packages, and
! on the ODEPACK user interface standard, with minor modifications.
!-----------------------------------------------------------------------
! Authors:
! Peter N. Brown and Alan C. Hindmarsh
! Center for Applied Scientific Computing, L-561
! Lawrence Livermore National Laboratory
! Livermore, CA 94551
! and
! George D. Byrne
! Illinois Institute of Technology
! Chicago, IL 60616
!-----------------------------------------------------------------------
Any help would be greatly appreciated. Please note that I am new to Fortran. If I should supply any additional information to help you answer my question, please don't hesitate to let me know. Thanks in advance!

The assumption that the arguments will be untouched after the first iteration is wrong.
In Fortran, arguments are passed by reference by default. A Fortran procedure therefore never guarantees that your original arguments are the same after the call. Changing the arguments may be part of the intended (or poor) design of the procedure. Only the documentation can help you, and what you provided in your question is not enough, which might well mean that there is not enough documentation for this case or function.
C/C++ has the "const" modifier. Modern Fortran has INTENT(IN) for the same purpose, but the DVODE declarations you show do not use it, so nothing guarantees what you are assuming right now.
To me it seems the only solution is to re-initialize all arguments right before every call to DVODE.

Can Theano / Pytorch / Tensorflow compute the following gradient automatically?

I am trying to run a recurrent neural network where the state update function for each neuron is the following
z = g*y
given that
g = (x<x_max & x>x_max-e) | (x>-x_max & x<-x_max+e)
Note that all the variables here are just scalars.
The variable x is defined in such a way that it always updates continuously, so g will always be a pulse as shown in this picture. That is, g won't be 1 for a single update only; it will be 1 for several consecutive updates.
Can any of these packages implement an automatic gradient computation given this transfer function?

The gradient can't be computed.
As you have shown, g is a binary variable, so its gradient can't be computed. Even the waveform you have plotted has gradient 0 everywhere except at two points (where the function is discontinuous and the gradient is infinite).
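To make the point concrete, here is a small plain-Python sketch that evaluates the gate g from the question and estimates its derivative with a central difference. The values x_max = 1.0 and e = 0.2 and the helper names are assumptions for illustration only. Away from the window edges the estimate is exactly zero, and at an edge it grows without bound as the step shrinks, so an automatic differentiation package has nothing useful to propagate.

```python
def g(x, x_max=1.0, e=0.2):
    """Pulse gate from the question: 1 inside either window, else 0."""
    in_upper = (x < x_max) and (x > x_max - e)
    in_lower = (x > -x_max) and (x < -x_max + e)
    return 1.0 if (in_upper or in_lower) else 0.0

def central_diff(f, x, h=1e-6):
    """Central finite-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

if __name__ == "__main__":
    # Points inside and outside the windows: the slope estimate is 0.
    for x in (-0.9, 0.0, 0.5, 0.9, 1.5):
        print(x, g(x), central_diff(g, x))
    # At a window edge the estimate blows up as h gets smaller.
    for h in (1e-2, 1e-4, 1e-6):
        print(h, central_diff(g, 0.8, h))
```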

Non-Convex Loss Function

I am trying to understand the gradient descent algorithm by plotting the error against the values of the parameters in the function. What would be an example of a simple function of the form y = f(x), with just one input variable x and two parameters w1 and w2, that has a non-convex loss function? Is y = w1*tanh(w2*x) an example? What I am trying to achieve is this:
How does one know whether a function has a non-convex loss function without plotting the graph?

In iterative optimization algorithms such as gradient descent or Gauss-Newton, what matters is whether the function is locally convex. A twice-differentiable function is convex (on a convex set) if and only if its Hessian matrix (the Jacobian of the gradient) is positive semi-definite everywhere on that set. As for a non-convex function of one variable (see my Edit below), the function you provide is a perfect example. Its second derivative, i.e. its Hessian (which is of size 1*1 here), can be computed as follows:
first_deriv = d(w1*tanh(w2*x))/dx = w1*w2*sech^2(w2*x)
second_deriv = d(first_deriv)/dx = some_const*sech^2(w2*x)*tanh(w2*x), with some_const = -2*w1*w2^2
The sech^2 part is always positive, so the sign of second_deriv depends on the sign of tanh, which varies with the values you supply as x and w2. Therefore, we can say that it is not convex everywhere.
Edit: It wasn't clear to me what you meant by one input variable and two parameters, so I assumed that w1 and w2 were fixed beforehand and computed the derivative w.r.t. x. But if you want to optimize w1 and w2 (which I suppose makes more sense if your function comes from a toy neural net), then you can compute the 2*2 Hessian in a similar way.
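One way to check non-convexity in (w1, w2) without plotting is to look for a violation of the midpoint inequality: a convex loss must satisfy L((p+q)/2) <= (L(p)+L(q))/2 for every pair of points. The sketch below uses a hypothetical toy data set and a sum-of-squares loss (both are assumptions, not from the question). Because tanh is odd, (w1, w2) and (-w1, -w2) give the same predictions, so both fit the data perfectly while their midpoint (0, 0) does not.

```python
import math

def model(x, w1, w2):
    """The model from the question: y = w1 * tanh(w2 * x)."""
    return w1 * math.tanh(w2 * x)

# Hypothetical data generated from the "true" parameters (2.0, 1.5).
XS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
YS = [model(x, 2.0, 1.5) for x in XS]

def loss(w1, w2):
    """Sum of squared errors over the toy data set."""
    return sum((model(x, w1, w2) - y) ** 2 for x, y in zip(XS, YS))

def violates_midpoint_convexity(p, q):
    """True if loss(midpoint) exceeds the average of the endpoint losses."""
    mid = ((p[0] + q[0]) / 2.0, (p[1] + q[1]) / 2.0)
    return loss(*mid) > 0.5 * (loss(*p) + loss(*q))

if __name__ == "__main__":
    p, q = (2.0, 1.5), (-2.0, -1.5)
    print(loss(*p), loss(*q), loss(0.0, 0.0))
    print(violates_midpoint_convexity(p, q))
```

A single violating pair is enough to prove non-convexity; failing to find one proves nothing, which is where the Hessian test comes in.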

The same way as in high-school calculus: the second derivative tells you the direction of curvature. If it is non-negative in all directions, then the function is convex.

how tensorflow handles complex gradient?

Let z be a complex variable and C(z) its conjugate.
In complex analysis, the derivative of C(z) w.r.t. z doesn't exist. But in TensorFlow, we can calculate dC(z)/dz and the result is just 1.
Here is an example:
import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (placeholder/Session)
x = tf.placeholder('complex64',(2,2))
y = tf.reduce_sum(tf.conj(x))
z = tf.gradients(y,x)
sess = tf.Session()
X = np.random.rand(2,2)+1.j*np.random.rand(2,2)
X = X.astype('complex64')
Z = sess.run(z,{x:X})[0]
The input X is
[[0.17014372+0.71475762j 0.57455420+0.00144318j]
[0.57871044+0.61303568j 0.48074263+0.7623235j ]]
and the result Z is
[[1.-0.j 1.-0.j]
[1.-0.j 1.-0.j]]
I don't understand why the gradient is set to 1.
I also want to know how TensorFlow handles complex gradients in general.

How?
The equation used by TensorFlow for the gradient is:
grad_z f = ( df/dz + d(f*)/dz )*
where the '*' means conjugate.
The partial derivatives with respect to z and z* are defined using Wirtinger calculus, which makes it possible to differentiate non-holomorphic functions with respect to a complex variable. With z = x + iy, the Wirtinger definition is:
df/dz = (1/2) ( df/dx - i df/dy )
df/dz* = (1/2) ( df/dx + i df/dy )
Why this definition?
In complex-valued neural networks (CVNNs), for example, the gradient is taken of a non-holomorphic, real-valued scalar function of one or several complex variables. Since f* = f for a real-valued f, TensorFlow's definition of the gradient can then be written as:
grad_z f = 2 df/dz* = df/dx + i df/dy
This definition agrees with the CVNN literature, for example chapter 4, section 4.3 of this book or Amin et al. (among countless examples).

Bit late, but I came across this issue recently too.
The key point is that TensorFlow defines the "gradient" of a complex-valued function f(z) of a complex variable as "the gradient of the real map F: (x,y) -> Re(f(x+iy)), expressed as a complex number" (the gradient of that real map is a vector in R^2, so we can express it as a complex number in the obvious way).
Presumably the reason for that definition is that in TF one is usually concerned with gradients for the purpose of running gradient descent on a loss function, and in particular for identifying the direction of maximum increase/decrease of that loss function. Using the above definition of gradient means that a complex-valued function of complex variables can be used as a loss function in a standard gradient descent algorithm, and the result will be that the real part of the function gets minimised (which seems to me a somewhat reasonable interpretation of "optimise this complex-valued function").
Now, to your question, an equivalent way to write that definition of gradient is
gradient(f) := dF/dx + idF/dy = conj(df/dz + dconj(f)/dz)
(you can easily verify that using the definition of d/dz). That's how TensorFlow handles complex gradients. As for the case of f(z):=conj(z), we have df/dz=0 (as you mention) and dconj(f)/dz=1, giving gradient(f)=1.
I wrote up a longer explanation here, if you're interested: https://github.com/tensorflow/tensorflow/issues/3348#issuecomment-512101921
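The definition above can be checked numerically without TensorFlow. The following plain-Python sketch estimates the Wirtinger derivative d/dz with central differences and then forms conj(df/dz + dconj(f)/dz). The helper names, the step size h, and the sample point are assumptions for illustration. For f(z) = conj(z) it gives a value close to 1, matching the result in the question.

```python
def wirtinger_dz(f, z, h=1e-6):
    """Finite-difference estimate of df/dz = (1/2)(df/dx - i*df/dy)."""
    dfdx = (f(z + h) - f(z - h)) / (2.0 * h)
    dfdy = (f(z + 1j * h) - f(z - 1j * h)) / (2.0 * h)
    return 0.5 * (dfdx - 1j * dfdy)

def tf_style_gradient(f, z):
    """gradient(f) = conj(df/dz + dconj(f)/dz), as written in the answer."""
    def conj_f(w):
        return f(w).conjugate()
    return (wirtinger_dz(f, z) + wirtinger_dz(conj_f, z)).conjugate()

if __name__ == "__main__":
    z0 = 0.17 + 0.71j
    # f(z) = conj(z): df/dz is about 0, dconj(f)/dz is about 1.
    print(tf_style_gradient(lambda z: z.conjugate(), z0))
    # f(z) = z*z is holomorphic: the result is about conj(2*z0).
    print(tf_style_gradient(lambda z: z * z, z0), (2 * z0).conjugate())
```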

Tensorflow Linear Regression: Getting values for Adjusted R Square, Coefficients, P-value

There are a few key parameters associated with linear regression, e.g. adjusted R square, coefficients, P-value, R square, multiple R, etc. When using the Google TensorFlow API to implement linear regression, how are these parameters mapped? Is there any way to get the values of these parameters during or after model execution?

From my experience, if you want to have these values while your model runs then you have to hand code them using tensorflow functions. If you want them after the model has run you can use scipy or other implementations. Below are some examples of how you might go about coding R^2, MAPE, RMSE...
total_error = tf.reduce_sum(tf.square(tf.subtract(y, tf.reduce_mean(y))))
unexplained_error = tf.reduce_sum(tf.square(tf.subtract(y, prediction)))
# R^2 = 1 - unexplained / total
R_squared = tf.subtract(1.0, tf.divide(unexplained_error, total_error))
# signed square root of R^2
R = tf.multiply(tf.sign(R_squared), tf.sqrt(tf.abs(R_squared)))
MAPE = tf.reduce_mean(tf.abs(tf.divide(tf.subtract(y, prediction), y)))
RMSE = tf.sqrt(tf.reduce_mean(tf.square(tf.subtract(y, prediction))))

I believe the formula for R2 should be the following. Note that it would go negative when the network is so bad that it does a worse job than the mere average as a predictor:
total_error = tf.reduce_sum(tf.square(tf.subtract(y, tf.reduce_mean(y))))
unexplained_error = tf.reduce_sum(tf.square(tf.subtract(y, pred)))
R_squared = tf.subtract(1.0, tf.divide(unexplained_error, total_error))

Adjusted_R_squared = 1 - [ (1-R_squared)*(n-1)/(n-k-1) ]
where n is the number of observations and k is the number of features.
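For checking values after the model has run, the two formulas above can be written in a few lines of plain Python. This is only a sketch: the function names and the sample numbers are made up for illustration, and it assumes y_true is not constant and n > k + 1 so that neither denominator is zero.

```python
def r_squared(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    n = len(y_true)
    mean_y = sum(y_true) / n
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n, k):
    """Adjusted R^2 for n observations and k features."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

if __name__ == "__main__":
    # Hypothetical observations and predictions from a one-feature model.
    y_true = [1.0, 2.0, 3.0, 4.0]
    y_pred = [1.1, 1.9, 3.2, 3.8]
    r2 = r_squared(y_true, y_pred)
    print(r2, adjusted_r_squared(r2, n=len(y_true), k=1))
```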

You should not use a hand-written formula for R squared. It exists in TensorFlow Addons; you will only need to extend it to adjusted R squared.
I would strongly recommend against using a recipe to calculate R squared itself! The examples I've found do not produce consistent results, especially with just one target variable. This gave me enormous headaches!
The correct thing to do is to use tensorflow_addons.metrics.RSquare(). TensorFlow Addons is on PyPI here and the documentation is a part of TensorFlow here. All you have to do is set y_shape to the shape of your output; often it is (1,) for a single output variable.
Then you can use what RSquare() returns in your own metric that handles the adjustment.
