CATransform3D row/column order - core-animation

I have a bit of confusion regarding the matrix row/column order of the CATransform3D struct. The struct defines a matrix like this:
[m11 m12 m13 m14]
[m21 m22 m23 m24]
[m31 m32 m33 m34]
[m41 m42 m43 m44]
At first, it would seem that the values define rows (so that [m11 m12 m13 m14] forms the first row), but when you create a translation matrix for (tx, ty, tz), the matrix looks like this:
[ 1 0 0 0]
[ 0 1 0 0]
[ 0 0 1 0]
[tx ty tz 1]
My confusion comes from the fact that this is not a valid translation matrix; multiplying it with a 4-element column vector will not translate the point.
My guess is that the CATransform3D struct stores the values in column-order, so that the values [m11 m12 m13 m14] form the first column (and not the first row).
Can anyone confirm?

Yes, CATransform3D is in column-major order because that is how OpenGL (ES) wants it. Core Animation uses GL behind the scenes for its rendering. If you want proof, check out the man page for glMultMatrix:
PARAMETERS
m Points to 16 consecutive values that are used as the elements of a 4 x 4 column-major matrix.
This really should be clearer in the docs for CALayer.

Your initial interpretation was correct; CATransform3D does define the matrix below:
[m11 m12 m13 m14]
[m21 m22 m23 m24]
[m31 m32 m33 m34]
[m41 m42 m43 m44]
And yes, although it may be confusing (if you are used to pre-multiplying transform matrices), this yields the translation matrix:
[ 1 0 0 0]
[ 0 1 0 0]
[ 0 0 1 0]
[tx ty tz 1]
See Figure 1-8 in the "Core Animation Programming Guide".
This is a valid transformation matrix if you post-multiply your transform matrices, which is what Apple does in Core Animation (see Figure 1-7 in the same guide, although beware the equation is missing transpose operations).
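As a quick sanity check of the post-multiplication convention, here is a minimal numpy sketch (numpy is used purely for illustration here, it is not part of Core Animation): with tx, ty, tz in the last row, a row vector [x y z 1] post-multiplied by the matrix is indeed translated.

import numpy as np

tx, ty, tz = 10.0, 20.0, 30.0
T = np.array([[1,  0,  0,  0],
              [0,  1,  0,  0],
              [0,  0,  1,  0],
              [tx, ty, tz, 1]])
p = np.array([1.0, 2.0, 3.0, 1.0])  # point written as a row vector
print(p @ T)                        # -> [11. 22. 33.  1.], i.e. the translated point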

Related

How to find most similar numerical arrays to one array, using Numpy/Scipy?

Let's say I have a list of 5 words:
[this, is, a, short, list]
Furthermore, I can classify some text by counting the occurrences of the words from the list above and representing these counts as a vector:
N = [1,0,2,5,10] # 1x this, 0x is, 2x a, 5x short, 10x list found in the given text
In the same way, I classify many other texts (count the 5 words per text, and represent them as counts - each row represents a different text which we will be comparing to N):
M = [[1,0,2,0,5],
[0,0,0,0,0],
[2,0,0,0,20],
[4,0,8,20,40],
...]
Now, I want to find the top 1 (or 2, 3, etc.) rows from M that are most similar to N. In other words, the texts most similar to my initial text.
The challenge is that just checking the distances between N and each row of M is not enough, since, for example, row M4 [4,0,8,20,40] is very different from N by distance, yet still proportional (by a factor of 4) and therefore very similar. The text in row M4 may simply be 4x as long as the text represented by N, so naturally all counts will be 4x as high.
What is the best approach to this problem (finding the 1, 2, 3, etc. most similar texts from M to the text in N)?
Generally speaking, the standard similarity measure for bag-of-words representations (i.e. your arrays) is cosine similarity. It maps your bag of n (here 5) words to an n-dimensional space, in which each array is a point (equivalently, a position vector). The most similar vectors (points) are the ones with the smallest angle to your text N in that space (this automatically takes care of the proportional ones, since they are close in angle). Here is code for it (assuming M and N are numpy arrays shaped as in the question):
import numpy as np
norms = np.linalg.norm(M, axis=1) * np.linalg.norm(N)    # per-row norms, not one norm over all of M
cos_sim = np.dot(M, N) / np.where(norms == 0, 1, norms)  # guard against an all-zero row
result = M[np.argmax(cos_sim)]
which gives [ 4 0 8 20 40] for your inputs.
You can normalise your row counts to remove the length effect as you discussed. Row normalisation of M can be done as M / M.sum(axis=1)[:, np.newaxis]. The residual values can then be calculated as the sum of the squared differences between the normalised N and each normalised row of M. The row with the minimum difference (ignoring NaN or inf values obtained if a row sum is 0) is then the most similar.
Here is an example:
import numpy as np
N = np.array([1,0,2,5,10])
M = np.array([[1,0,2,0,5],
              [0,0,0,0,0],
              [2,0,0,0,20],
              [4,0,8,20,40]])
# sqrt of sum of normalised square differences
similarity = np.sqrt(np.sum((M / M.sum(axis=1)[:, np.newaxis] - N / np.sum(N))**2, axis=1))
# replace any NaN values (from dividing by 0) with something larger than at least one other element
similarity[np.isnan(similarity)] = similarity[0]+1
result = M[similarity.argmin()]
result
>>> array([ 4, 0, 8, 20, 40])
You could then use np.argsort(similarity)[:n] to get the n most similar rows.
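If you want the n most similar rows by the cosine measure instead, you can rank by similarity and take the top of the ordering. A minimal sketch combining the two answers (the helper name top_n_similar is only for illustration; M and N are the numpy arrays defined above):

import numpy as np

def top_n_similar(M, N, n=2):
    # cosine similarity of N against every row of M; all-zero rows get similarity 0
    norms = np.linalg.norm(M, axis=1) * np.linalg.norm(N)
    sims = np.divide(M @ N, norms, out=np.zeros(len(M)), where=norms != 0)
    order = np.argsort(sims)[::-1]   # highest similarity first
    return M[order[:n]]

top_n_similar(M, N, n=2)   # first row returned is [4, 0, 8, 20, 40]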

How to add magnitude or value to a vector in Python?

I am using this function to calculate the distance between two vectors a and b of size 300 (word2vec vectors). I get the distance between 'hot' and 'cold' to be equal to 1.
How can I add this value (1) to a vector? I thought simply new_vec = model['hot'] + 1 would do it, but when I then calculate dist(new_vec, model['hot']) I get 17.
import numpy
def dist(a, b):
    return numpy.linalg.norm(a - b)
a=model['hot']
c=a+1
dist(a,c)
17
I expected dist(a,c) would give me back 1!
You should review what the norm is. In the case of numpy, the default is the L2 norm (a.k.a. the Euclidean norm). When you add 1 to a vector, the operation adds 1 to every element of the vector.
>> vec1 = np.random.normal(0,1,size=300)
>> print(vec1[:5])
... [ 1.18469795 0.04074346 -1.77579852 0.23806222 0.81620881]
>> vec2 = vec1 + 1
>> print(vec2[:5])
... [ 2.18469795 1.04074346 -0.77579852 1.23806222 1.81620881]
Now, your call to norm computes sqrt( (a1-b1)**2 + (a2-b2)**2 + ... + (aN-bN)**2 ), where N is the length of the vectors, a is the first vector, b is the second vector, and ai is the ith element of a. Since (a1-b1)**2 == (a2-b2)**2 == ... == (aN-bN)**2 == 1, we expect the sum to produce N, which in your case is 300. So sqrt(300) ≈ 17.3 is the expected answer.
>> print(np.linalg.norm(vec1-vec2))
... 17.320508075688775
To answer the question, "How to add a value to a vector": you have done this correctly. If you'd like to add a value to a specific element, you can do vec2[ix] += value, where ix indexes the element you wish to change. If you want to add a value uniformly across all elements such that the resulting vector ends up at distance 1 from the original, add np.sqrt(1/300) to every element.
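A quick numerical check of that last claim (a sketch only, with a random 300-dimensional vector standing in for the word2vec embedding):

import numpy as np

vec1 = np.random.normal(0, 1, size=300)   # stand-in for a 300-dimensional word2vec vector
vec3 = vec1 + np.sqrt(1 / 300)            # shift every element by sqrt(1/300)
print(np.linalg.norm(vec1 - vec3))        # -> 1.0 (up to floating point error)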
Also possibly relevant: a more commonly used distance metric for word2vec vectors is the cosine distance, which measures the angle between two vectors.
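If you want to try the cosine distance, SciPy provides it directly. A minimal sketch (random vectors again standing in for the word2vec embeddings):

import numpy as np
from scipy.spatial.distance import cosine

a = np.random.normal(0, 1, size=300)   # stand-ins for model['hot'] and model['cold']
b = np.random.normal(0, 1, size=300)
print(cosine(a, b))                    # 1 - cos(angle); 0 = same direction, 2 = opposite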

Solving an underdetermined scipy.sparse matrix using svd

Problem
I have a set of equations with variables denoted with lowercase variables and constants with uppercase variables as such
A = a + b
B = c + d
C = a + b + c + d + e
I'm provided the information as to the structure of these equations in a pandas DataFrame with two columns: Constants and Variables
E.g.
df = pd.DataFrame([['A','a'],['A','b'],['B','c'],['B','d'],['C','a'],['C','b'],
                   ['C','c'],['C','d'],['C','e']], columns=['Constants','Variables'])
I then convert this to a sparse CSC matrix by using NetworkX
table = nx.bipartite.biadjacency_matrix(nx.from_pandas_dataframe(df, 'Constants', 'Variables'),
                                        df.Constants.unique(), df.Variables.unique(), format='csc')
When converted to a dense matrix, table looks like the following
matrix([[1, 1, 0, 0, 0],[0, 0, 1, 1, 0],[1, 1, 1, 1, 1]], dtype=int64)
What I want from here is to find which variables are solvable (in this example, only e is solvable) and, for each solvable variable, which constants its value depends on (in this case, since e = C-B-A, it depends on A, B, and C).
Attempts at Solution
I first tried to use rref to solve for the solvable variables. I used the symbolic library sympy and the function sympy.Matrix.rref, which gave me exactly what I wanted, since any solvable variable would have its own row of all zeros except a single 1, which I could check for row by row.
However, this solution was not stable. Primarily, it was exceedingly slow, and didn't make use of the fact that my datasets are likely to be very sparse. Moreover, rref doesn't do too well with floating points. So I decided to move on to another approach motivated by Removing unsolvable equations from an underdetermined system, which suggested using svd
Conveniently, there is a svd function in the scipy.sparse library, namely scipy.sparse.linalg.svds. However, given my lack of linear algebra background, I don't understand the results outputted by running this function on my table, or how to use those results to get what I want.
Further Details in the Problem
The coefficient of every variable in my problem is 1. This is how the data can be expressed in the two column pandas DataFrame shown earlier
The vast majority of variables in my actual examples will not be solvable. The goal is to find the few that are solvable
I'm more than willing to try an alternate approach if it fits the constraints of this problem.
This is my first time posting a question, so I apologize if this doesn't exactly follow guidelines. Please leave constructive criticism but be gentle!
The system you are solving has the form
[ 1 1 0 0 0 ] [a] [A]
[ 0 0 1 1 0 ] [b] = [B]
[ 1 1 1 1 1 ] [c] [C]
[d]
[e]
i.e., three equations for five variables a, b, c, d, e. As the answer linked in your question mentions, one can tackle such an underdetermined system with the pseudoinverse, which NumPy provides directly as the pinv function.
Since M has linearly independent rows, the pseudoinverse in this case has the property that M.pinv(M) = I, where I denotes the identity matrix (3x3 here). Thus, formally, we can write the solution as:
v = pinv(M) . b
where v is the 5-component solution vector and b denotes the right-hand-side 3-component vector [A, B, C]. However, this solution is not unique, since one can add a vector from the so-called kernel or null space of the matrix M (i.e., a vector w for which M.w = 0) and it will still be a solution:
M.(v + w) = M.v + M.w = b + 0 = b
Therefore, the only variables for which there is a unique solution are those for which the corresponding component of all possible vectors from the null space of M is zero. In other words, if you assemble the basis of the null space into a matrix (one basis vector per column), then the "solvable variables" will correspond to zero rows of this matrix (the corresponding component of any linear combination of the columns will be then also zero).
Let's apply this to your particular example:
import numpy as np
from numpy.linalg import pinv
M = [
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [1, 1, 1, 1, 1]
]
print(pinv(M))
[[ 5.00000000e-01 -2.01966890e-16 1.54302378e-16]
[ 5.00000000e-01 1.48779676e-16 -2.10806254e-16]
[-8.76351626e-17 5.00000000e-01 8.66819360e-17]
[-2.60659800e-17 5.00000000e-01 3.43000417e-17]
[-1.00000000e+00 -1.00000000e+00 1.00000000e+00]]
From this pseudoinverse, we see that the variable e (last row) is indeed expressible as -A - B + C. However, it also "predicts" that a = A/2 and b = A/2. To eliminate these non-unique solutions (for example, a = A and b = 0 would be equally valid), let's calculate the null space, borrowing the function from the SciPy Cookbook:
print(nullspace(M))
[[ 5.00000000e-01 -5.00000000e-01]
[-5.00000000e-01 5.00000000e-01]
[-5.00000000e-01 -5.00000000e-01]
[ 5.00000000e-01 5.00000000e-01]
[-1.77302319e-16 2.22044605e-16]]
This function returns the basis of the null space already assembled into a matrix (one vector per column), and we see that, within reasonable precision, the only zero row is indeed the last one, corresponding to the variable e.
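If you want to automate that last check, a minimal sketch (assuming ns holds the null-space basis returned by the cookbook function, one basis vector per column; the tolerance is an arbitrary choice):

import numpy as np

ns = nullspace(M)     # null-space basis, one vector per column
tol = 1e-10
solvable_rows = np.where(np.all(np.abs(ns) < tol, axis=1))[0]
print(solvable_rows)  # -> [4], i.e. only the last variable (e) has a unique solution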
EDIT:
For the set of equations
A = a + b, B = b + c, C = a + c
the corresponding matrix M is
[ 1 1 0 ]
[ 0 1 1 ]
[ 1 0 1 ]
Here we see that the matrix is in fact square and invertible (the determinant is 2). Thus the pseudoinverse coincides with the "normal" inverse:
[[ 0.5 -0.5 0.5]
[ 0.5 0.5 -0.5]
[-0.5 0.5 0.5]]
which corresponds to the solution a = (A - B + C)/2, etc. Since M is invertible, its kernel / null space is trivial (it contains only the zero vector), which is why the cookbook function returns []. To see this, use the definition of the kernel: it is formed by all non-zero vectors x such that M.x = 0. However, since M^{-1} exists, any such x would be given by x = M^{-1} . 0 = 0, a contradiction. Formally, this means that the found solution is unique (or that all variables are "solvable").
To build on ewcz's answer, both the nullspace and pseudo-inverse can be calculated using numpy.linalg.svd. See the links below:
pseudo-inverse
nullspace
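For completeness, here is a minimal sketch of how both can be obtained from a single SVD (essentially what the linked recipes do; the tolerance handling is a simplified assumption):

import numpy as np

def pinv_and_nullspace(M, rtol=1e-10):
    # one SVD gives both: M = U @ diag(s) @ Vh
    M = np.atleast_2d(M)
    U, s, Vh = np.linalg.svd(M)
    rank = int(np.sum(s > rtol * s.max()))
    # pseudoinverse: invert only the significant singular values
    s_inv = np.zeros((M.shape[1], M.shape[0]))
    s_inv[:rank, :rank] = np.diag(1.0 / s[:rank])
    pinv = Vh.T @ s_inv @ U.T
    # null space: right singular vectors belonging to (near-)zero singular values
    null_basis = Vh[rank:].T
    return pinv, null_basis

pinv, ns = pinv_and_nullspace([[1, 1, 0, 0, 0], [0, 0, 1, 1, 0], [1, 1, 1, 1, 1]])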

PDF Text positioning

Consider the following operator sequence:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 105.12 60.3506
TJ: line 1:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 105.12 95.9906
TJ: value 1
Tm: 0 1.00057 -1 0 116.16 60.3505
TJ: line 2:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 116.16 124.551
TJ: value 2
Tm: 0 1.00057 -1 0 127.2 60.3507
TJ: line 3:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 127.2 106.671
TJ: value 3
Tm: 0 1.00057 -1 0 138.24 60.3508
TJ: line 4:
Tf: R8 9.96
Tm: 0 1.00057 -1 0 138.24 112.791
TJ: value 4
PDF displays it as:
line 1: value 1
line 2: value 2
line 3: value 3
line 4: value 4
According to the PDF documentation, the matrix consists of [a b c d e f], where e = Tx and f = Ty.
From the first two command blocks (which produce the first line of text) I noticed that Tx and Ty have effectively switched places: 105.12 stays the same, which suggests it specifies the vertical position.
The PDF reference also says this about rotation:
Rotations are produced by [ cos θ sin θ −sin θ cos θ 0 0 ], which has
the effect of rotating the coordinate system axes by an angle θ
counterclockwise.
It seems that, because of this, Tx changes the vertical position and Ty the horizontal one, since sin(90°) = 1 and cos(90°) = 0, i.e. a 90° counterclockwise rotation.
Questions:
Why do the lines in the actual PDF come out in the correct order when e (Tx), which given the rotation controls the vertical position, keeps increasing? According to the translation, e (Tx) would have to decrease for the lines to move down the page.
Why are the letters and words not rotated? Only e (Tx) and f (Ty) are swapped, and that is all.
You only consider text matrix settings. You don't tell us about the current transformation matrix at the time of those text objects, and neither do you tell us about the page rotation value.
Considering your observations I would assume the page globally is rotated 90° clockwise.
This would explain why your 90° counterclockwise rotated text appears upright (your second question).
Furthermore, with that page rotation the x axis would be vertical, with coordinate values increasing downwards, which answers your first question.
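To make that concrete, here is a small numpy sketch (an illustration of this assumption, not something read from your file): composing the 90° counterclockwise rotation encoded in your Tm with an assumed 90° clockwise page rotation gives the identity orientation, which is why the glyphs come out upright.

import numpy as np

# PDF builds rotations as [cos t, sin t, -sin t, cos t, 0, 0] acting on row vectors
def rot(deg):
    t = np.radians(deg)
    return np.array([[np.cos(t), np.sin(t)],
                     [-np.sin(t), np.cos(t)]])

text_rotation = rot(90)    # the a b c d part of your Tm: 0 1 -1 0 (ignoring the 1.00057 scale)
page_rotation = rot(-90)   # assumed /Rotate 90, i.e. the viewer turns the page 90° clockwise
print(np.round(text_rotation @ page_rotation))   # -> identity, so glyphs end up upright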
Some references
Rotate - integer - (Optional; inheritable) The number of degrees by which the page
shall be rotated clockwise when displayed or printed. The value
shall be a multiple of 90. Default value: 0.
(Table 30 – Entries in a page object - ISO 32000-1)
CTM - array - The current transformation matrix, which maps positions from
user coordinates to device coordinates (see 8.3, "Coordinate
Systems"). This matrix is modified by each application of the
coordinate transformation operator, cm. Initial value: a matrix
that transforms default user coordinates to device
coordinates.
(Table 52 – Device-Independent Graphics State Parameters - ISO 32000-1)

Quake MAP brushes

Many level editors for Quake and Source games use an implicit plane representation for brush sides (three points on the plane) instead of the simple (n.x n.y n.z d) form.
{
...
( 256 0 0 ) ( 256 0 1 ) ( 256 1 0 ) GROUND1_6 0 0 0 1.0 1.0
( 0 128 0 ) ( 0 128 1 ) ( 1 128 0 ) GROUND1_6 0 0 0 1.0 1.0
...
}
Is there some reason for this? I know it can be easily converted to any form; I just wonder why they used this one. Is it some floating-point precision thing?
Quake uses the plane equation, and the plane sidedness test, extensively, so while it seems awkward or annoying if you're just looking to render map geometry, it was likely an easy decision to use this representation. As alluded to above, it allows the map format to use integer coordinates for plane sides. This means the format itself is "lossless" and immune to floating-point rounding issues. Vertex coordinates can then be calculated at either single or full double precision. In fact, newer BSP compilers (q3map2, quemap, etc.) do use double precision for some face-splitting calculations.
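For illustration, here is a minimal sketch of the conversion mentioned in the question (a hypothetical helper in plain numpy; the sign of the normal depends on the winding convention the editor uses):

import numpy as np

def plane_from_points(p1, p2, p3):
    # derive (n.x n.y n.z d) from three points on a brush side; plane: dot(n, x) = d
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    n = np.cross(p3 - p1, p2 - p1)
    n /= np.linalg.norm(n)
    return n, float(np.dot(n, p1))

# first plane of the example brush: normal (1, 0, 0), d = 256
print(plane_from_points((256, 0, 0), (256, 0, 1), (256, 1, 0)))

With integer input points the plane itself is defined exactly; any rounding only appears in derived quantities such as the unit normal.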