gmock: Testing two float vectors - googletest

I am trying to write a test for a vector.
For STL containers, I tried:
EXPECT_THAT(float_vec1, ElementsAreArray(float_vec2));
However I need to insert a margin.
Is there an ElementsAreArray equivalent of FloatNear(a_float, max_abs_error)?

Yes, I've used the Pointwise container matcher, which you can give a matcher and an expected container (any STL container and is compatible with non-dynamically allocated c-style arrays).
EXPECT_THAT(float_vec1, Pointwise(matcher, float_vec2))
For the matcher You can use FloatEq() which uses ULP-based float comparisons.
EXPECT_THAT(float_vec1, Pointwise(FloatEq(), float_vec2))
However, I've found it is easier to use FloatNear(float max_abs_error) just to define my own floating point error like you want.
float ferr = 1e-5;
EXPECT_THAT(float_vec1,
Pointwise(FloatNear(ferr), float_vec2));

Related

Matrix multiplication function?

How do you write a matrix multiplication function? Takes two matrices outputs one.
The documentation on assemblyscript.org is pretty short, Float64Array though is a valid type among these but that's 1D so...
AssemblyScript's stdlib is modeled after JavaScript's stdlib, so there are no matrix operations. However, here is a library that might work for you: https://github.com/JustinParratt/big-mult

Explaining the different types in Metal and SIMD

When working with Metal, I find there's a bewildering number of types and it's not always clear to me which type I should be using in which context.
In Apple's Metal Shading Language Specification, there's a pretty clear table of which types are supported within a Metal shader file. However, there's plenty of sample code available that seems to use additional types that are part of SIMD. On the macOS (Objective-C) side of things, the Metal types are not available but the SIMD ones are and I'm not sure which ones I'm supposed to be used.
For example:
In the Metal Spec, there's float2 that is described as a "vector" data type representing two floating components.
On the app side, the following all seem to be used or represented in some capacity:
float2, which is typedef ::simd_float2 float2 in vector_types.h
Noted: "In C or Objective-C, this type is available as simd_float2."
vector_float2, which is typedef simd_float2 vector_float2
Noted: "This type is deprecated; you should use simd_float2 or simd::float2 instead"
simd_float2, which is typedef __attribute__((__ext_vector_type__(2))) float simd_float2
::simd_float2 and simd::float2 ?
A similar situation exists for matrix types:
matrix_float4x4, simd_float4x4, ::simd_float4x4 and float4x4,
Could someone please shed some light on why there are so many typedefs with seemingly overlapping functionality? If you were writing a new application today (2018) in Objective-C / Objective-C++, which type should you use to represent two floating values (x/y) and which type for matrix transforms that can be shared between app code and Metal?
The types with vector_ and matrix_ prefixes have been deprecated in favor of those with the simd_ prefix, so the general guidance (using float4 as an example) would be:
In C code, use the simd_float4 type. (You have to include the prefix unless you provide your own typedef, since C doesn't have namespaces.)
Same for Objective-C.
In C++ code, use the simd::float4 type, which you can shorten to float4 by using namespace simd;.
Same for Objective-C++.
In Metal code, use the float4 type, since float4 is a fundamental type in the Metal Shading Language [1].
In Swift code, use the float4 type, since the simd_ types are typealiased to shorter names.
Update: In Swift 5, float4 and related types have been deprecated in favor of SIMD4<Float> and related types.
These types are all fundamentally equivalent, and all have the same size and alignment characteristics so you can use them across languages. That is, in fact, one of the design goals of the simd framework.
I'll leave a discussion of packed types to another day, since you didn't ask.
[1] Metal is an unusual case since it defines float4 in the global namespace, then imports it into the metal namespace, which is also exported as the simd namespace. It additionally aliases float4 as vector_float4. So, you can use any of the above names for this vector type (except simd_float4). Prefer float4.
which type should you use to represent two floating values (x/y)
If you can avoid it, don't use a single SIMD vector to represent a single geometry x,y vector if you're using CPU SIMD.
CPU SIMD works best when you have many of the same thing in each SIMD vector, because they're actually stores in 16-byte or 32-byte vector registers where "vertical" operations between two vectors are cheap (packed add or multiply), but "horizontal" operations can mostly only be done with a shuffle + a vertical operation.
For example a vector of 4 x values and another vector of 4 y values lets you do 4 dot-products or 4 cross-products in parallel with no shuffling, so the overall throughput is significantly more dot-products per clock cycle than if you had a vector of [x1, y1, x2, y2].
See https://stackoverflow.com/tags/sse/info, and especially these slides: SIMD at Insomniac Games (GDC 2015) for more about planning your data layout and program design for doing many similar operations in parallel instead of trying to accelerate single operations.
The one exception to this rule is if you're only adding / subtracting to translate coordinates, because that's still purely a vertical operation even with an array-of-structs. And thus fine for CPU short-vector SIMD based on 16-byte vectors. (e.g. the 2nd element in one vector only interacts with the 2nd element in another vector, so no shuffling is needed.)
GPU SIMD is different, and I think has no problem with interleaved data. I'm not a GPU expert.
(I don't use Objective C or Metal, so I can't help you with the details of their type names, just what the underlying CPU hardware is good at. That's basically the same for x86 SSE/AVX, ARM NEON / AArch64 SIMD, or PowerPC Altivec. Horizontal operations are slower.)

Errors to fit parameters of scipy.optimize

I use the scipy.optimize.minimize ( https://docs.scipy.org/doc/scipy/reference/tutorial/optimize.html ) function with method='L-BFGS-B.
An example of what it returns is here above:
fun: 32.372210618549758
hess_inv: <6x6 LbfgsInvHessProduct with dtype=float64>
jac: array([ -2.14583906e-04, 4.09272616e-04, -2.55795385e-05,
3.76587650e-05, 1.49213975e-04, -8.38440428e-05])
message: 'CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH'
nfev: 420
nit: 51
status: 0
success: True
x: array([ 0.75739412, -0.0927572 , 0.11986434, 1.19911266, 0.27866406,
-0.03825225])
The x value correctly contains the fitted parameters. How do I compute the errors associated to those parameters?
TL;DR: You can actually place an upper bound on how precisely the minimization routine has found the optimal values of your parameters. See the snippet at the end of this answer that shows how to do it directly, without resorting to calling additional minimization routines.
The documentation for this method says
The iteration stops when (f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= ftol.
Roughly speaking, the minimization stops when the value of the function f that you're minimizing is minimized to within ftol of the optimum. (This is a relative error if f is greater than 1, and absolute otherwise; for simplicity I'll assume it's an absolute error.) In more standard language, you'll probably think of your function f as a chi-squared value. So this roughly suggests that you would expect
Of course, just the fact that you're applying a minimization routine like this assumes that your function is well behaved, in the sense that it's reasonably smooth and the optimum being found is well approximated near the optimum by a quadratic function of the parameters xi:
where Δxi is the difference between the found value of parameter xi and its optimal value, and Hij is the Hessian matrix. A little (surprisingly nontrivial) linear algebra gets you to a pretty standard result for an estimate of the uncertainty in any quantity X that's a function of your parameters xi:
which lets us write
That's the most useful formula in general, but for the specific question here, we just have X = xi, so this simplifies to
Finally, to be totally explicit, let's say you've stored the optimization result in a variable called res. The inverse Hessian is available as res.hess_inv, which is a function that takes a vector and returns the product of the inverse Hessian with that vector. So, for example, we can display the optimized parameters along with the uncertainty estimates with a snippet like this:
ftol = 2.220446049250313e-09
tmp_i = np.zeros(len(res.x))
for i in range(len(res.x)):
tmp_i[i] = 1.0
hess_inv_i = res.hess_inv(tmp_i)[i]
uncertainty_i = np.sqrt(max(1, abs(res.fun)) * ftol * hess_inv_i)
tmp_i[i] = 0.0
print('x^{0} = {1:12.4e} ± {2:.1e}'.format(i, res.x[i], uncertainty_i))
Note that I've incorporated the max behavior from the documentation, assuming that f^k and f^{k+1} are basically just the same as the final output value, res.fun, which really ought to be a good approximation. Also, for small problems, you can just use np.diag(res.hess_inv.todense()) to get the full inverse and extract the diagonal all at once. But for large numbers of variables, I've found that to be a much slower option. Finally, I've added the default value of ftol, but if you change it in an argument to minimize, you would obviously need to change it here.
One approach to this common problem is to use scipy.optimize.leastsq after using minimize with 'L-BFGS-B' starting from the solution found with 'L-BFGS-B'. That is, leastsq will (normally) include and estimate of the 1-sigma errors as well as the solution.
Of course, that approach makes several assumption, including that leastsq can be used and may be appropriate for solving the problem. From a practical view, this requires the objective function return an array of residual values with at least as many elements as variables, not a cost function.
You may find lmfit (https://lmfit.github.io/lmfit-py/) useful here: It supports both 'L-BFGS-B' and 'leastsq' and gives a uniform wrapper around these and other minimization methods, so that you can use the same objective function for both methods (and specify how to convert the residual array into the cost function). In addition, parameter bounds can be used for both methods. This makes it very easy to first do a fit with 'L-BFGS-B' and then with 'leastsq', using the values from 'L-BFGS-B' as starting values.
Lmfit also provides methods to more explicitly explore confidence limits on parameter values in more detail, in case you suspect the simple but fast approach used by leastsq might be insufficient.
It really depends what you mean by "errors". There is no general answer to your question, because it depends on what you're fitting and what assumptions you're making.
The easiest case is one of the most common: when the function you are minimizing is a negative log-likelihood. In that case the inverse of the hessian matrix returned by the fit (hess_inv) is the covariance matrix describing the Gaussian approximation to the maximum likelihood.The parameter errors are the square root of the diagonal elements of the covariance matrix.
Beware that if you are fitting a different kind of function or are making different assumptions, then that doesn't apply.

How do I print a half-precision float using printf in the AMD OpenCL SDK?

The Programming Guide has instructions for doubles (%ld) and vector types (e.g. %v4f), but not half-precision floats.
Normally in C, varargs arguments are automatically promoted to larger datatypes, such as float to double. The OpenCL documentation seems to imply that a similar promotion applies there.
Therefore a simple %f should work also for half-length floats.

Matrices in Objective-C

Is there any class for Matrix support in Objective-C? By Matrix I mean 2D-arrays.
What I do now is using 2 NSArrays, one within the other. This works perfectly but my code looks like a big mess.
I have also tried to use C-style arrays within each-other (matrix[][]) but this approach doesn't fit my application as I cannot automatically #synthesize or specify #properties for them.
I could of course create my own class for that, but what I'm wondering is if Objective-C already has something for this kind of situations. I did some Google-research but didn't find anything.
Nope, Foundation doesn't have any 2D array class. As far as mathematical computations are concerned, Matrices are typically implemented in C or C++ for portability and performance reasons. You'll have to write your own class for that if you really want it.
It seems obj-c has not its own struct for matrix. I refered to the iOS SDK's CATransform3D, found that it use:
struct CATransform3D
{
CGFloat m11, m12, m13, m14;
CGFloat m21, m22, m23, m24;
CGFloat m31, m32, m33, m34;
CGFloat m41, m42, m43, m44;
};
typedef struct CATransform3D CATransform3D;
as the 3D transform matrix.
Late to the party but I'd like to mention this project, which implements a flexible Matrix class based on a C array, with interfaces to many BLAS and LAPACK functions. Disclaimer: I am the developer.
Basic use is as follows:
YCMatrix *I = [YCMatrix identityOfRows:3 Columns:3];
double v = [I getValueAtRow:1 Column:1];
[I setValue:0 Row:0 Column:0];
I think you will need to subclass NSArray or use some non-sweet syntax : [MyArray objectAtIndex:i*d+j]. The latter case is really cumbersome as you will get only one clumsy kind of enumerator.