PDF image positioning - pdf

Consider the following operator sequence:
q
0.12 0 0 0.12 0 0 cm
1 g
472 471.922 4014 6073 re
f
0 G
0 g
q
8.33333 0 0 8.33333 0 0 cm
BT
/R7 12 Tf
0 1.00055 -1 0 71.52 336.711 Tm
[text 1] TJ
/R8 9.96 Tf
0 1.00057 -1 0 105.12 60.3506 Tm
[text 2] TJ
ET
Q
885 502.922 6 297 re
f
q
8.33333 0 0 8.33333 0 0 cm
BT
/R8 9.96 Tf
0 1.00057 -1 0 105.12 95.9906 Tm
[text 3] TJ
0 1.00057 -1 0 116.16 60.3505 Tm
[text 4] TJ
ET
Q
977 502.922 6 535 re
f
q
8.33333 0 0 8.33333 0 0 cm
BT
/R8 9.96 Tf
0 1.00057 -1 0 116.16 124.551 Tm
[text 5] TJ
0 1.00057 -1 0 127.2 60.3507 Tm
[text 6] TJ
ET
Q
1069 502.922 6 386 re
f
q
8.33333 0 0 8.33333 0 0 cm
BT
/R8 9.96 Tf
0 1.00057 -1 0 127.2 106.671 Tm
[text 7] TJ
0 1.00057 -1 0 138.24 60.3508 Tm
[text 8] TJ
ET
Q
1161 502.922 6 437 re
f
q
8.33333 0 0 8.33333 0 0 cm
-----------------------------------------------------------------------------
BT
/R8 9.96 Tf
0 1.00057 -1 0 138.24 112.791 Tm
[line 1] TJ
ET
Q
q
1268 2621.92 m
1268 2675.92 l
1380 2675.92 l
1380 2621.92 l
h
W
n
q
8.33333 0 0 8.33333 0 0 cm
BT
/R9 11.04 Tf
0 0.999402 -1 0 162.6 314.631 Tm
Tj
ET
Q
Q
q
1268 2621.92 m
1268 4396.92 l
2049 4396.92 l
2049 2621.92 l
h
W
n
1 g
1267 2620.92 780 1775 re
f*
Q
q
8.33333 0 0 8.33333 0 0 cm
BT
/R9 11.04 Tf
0 0.999402 -1 0 204.6 515.751 Tm
Tj
ET
Q
0 0 1 RG
0 0 1 rg
q
8.33333 0 0 8.33333 0 0 cm
BT
/R9 11.04 Tf
0 0.999402 -1 0 227.16 355.071 Tm
[line 2] TJ
ET
Q
1903 2958.92 6 1101 re
f
0 G
0 g
q
8.33333 0 0 8.33333 0 0 cm
BT
/R9 11.04 Tf
0 0.999402 -1 0 227.16 487.191 Tm
Tj
ET
Q
q
0 1565 -408 0 1705 2732.92 cm
/X0 Do
Q
q
0 1738 -506 0 2659 2639.92 cm
/X1 Do
Q
q
8.33333 0 0 8.33333 0 0 cm
BT
/R7 12 Tf
0 1.00055 -1 0 342 398.991 Tm
[line 3] TJ
ET
I simplified the TJ commands to show only the text. Please be aware that the bare Tj operators are displayed incorrectly: each one is actually <01> Tj, which you can see in the source when you try to edit the question.
The page is rotated 90° clockwise. Page properties:
[Type] => Page
[MediaBox] => Array
(
[0] => 0
[1] => 0
[2] => 595
[3] => 842
)
[Rotate] => 90
[Resources] => Array
(
[Font] => Array
(
[R7] => Array
(
[Name] => Helvetica-Bold
[Type] => Type1
[BaseFont] => Helvetica-Bold
[Subtype] => Type1
)
[R8] => Array
(
[Name] => Helvetica
[Type] => Type1
[BaseFont] => Helvetica
[Subtype] => Type1
)
[R9] => Array
(
[Name] => DUCRGK+Calibri
[Type] => TrueType
[BaseFont] => DUCRGK+Calibri
[FirstChar] => 1
[LastChar] => 18
[Subtype] => TrueType
)
)
[XObject] => Array
(
[X0] => Array
(
[Subtype] => Image
[ColorSpace] => DeviceRGB
[Width] => 250
[Height] => 65
[BitsPerComponent] => 8
[Filter] => DCTDecode
[Length] => 3927
)
[X1] => Array
(
[Subtype] => Image
[ColorSpace] => DeviceRGB
[Width] => 278
[Height] => 81
[BitsPerComponent] => 8
[Filter] => FlateDecode
[Length] => 2617
)
)
)
[Contents] => Array
(
[Filter] => FlateDecode
[Length] => 2525
)
[Parent] => Array
(
[Type] => Pages
[Count] => 20
)
In PDF viewer it looks like:
line 1
image 1
line 2
image 2
line 3
Because of the page rotation, e (Tx) and f (Ty) are switched. For example, in 0 1.00057 -1 0 138.24 112.791 Tm, 138.24 is the vertical shift and 112.791 the horizontal one.
For convenience I'll also add the matrix representation here:
[a b 0]
[c d 0]
[e f 1]
or
[a b c d e f]
The image Tx and Ty values seem to be so big because of the cm operator, which specifies scaling. Taking that into account, we get the following results:
Ty      Content  Calculation
138.24  line 1
204.6   image 1  1705/8.33333
227.16  line 2
319.08  image 2  2659/8.33333
342     line 3
Which seems to be correct.
Transformations according to the PDF reference v1.7 (page 205):
• Translations are specified as [ 1 0 0 1 tx ty ], where tx and ty are
the distances to translate the origin of the coordinate system in the
horizontal and vertical dimensions, respectively.
• Scaling is obtained by [ sx 0 0 sy 0 0 ]. This scales the coordinates
so that 1 unit in the horizontal and vertical dimensions of the new
coordinate system is the same size as sx and sy units, respectively,
in the previous coordinate system.
• Rotations are produced by [ cos θ sin θ −sin θ cos θ 0 0 ], which has
the effect of rotating the coordinate system axes by an angle θ
counterclockwise.
• Skew is specified by [ 1 tan α tan β 1 0 0 ], which skews the x axis
by an angle α and the y axis by an angle β.
Questions:
Do the cm and Tm operators change the same matrix? That is, after an image has been processed and the matrix for text is then modified with an operator like Td/TD, should that modify the matrix in its state after cm?
Referring to the quoted description of the transformations: how do I know which of those transformations can be applied together in one command? For example, for scaling, should only sx and sy be stated, with all other values 0 as in the example, which identifies it as scaling, while the 0 values should not be applied to the actual matrix? It is evident that skew and rotation cannot be applied in one command, as both use b and c. At the same time, translation and rotation are used together, as I see in the example above.
The PDF reference states that Tm replaces the current matrix (page 406) while cm concatenates (page 219). Considering this, my result is incorrect, as along with 1705/8.33333 we should also add the previous Ty position, which is 138.24; the result is then 342.84, which gives a wrong Ty position. What is wrong here?
According to the PDF reference v1.7 (page 206), transformations are applied in the following order: translate, rotate, scale or skew. And I thought that scaling is applied to the object itself, not to the Tx and Ty positioning. So is it right to compute 1705/8.33333 to identify the image position?

I answer your questions with reference to the PDF specification ISO 32000-1, because that specification is an ISO standard, while Adobe staff have described the PDF references as not normative in nature.
Do the cm and Tm operators change the same matrix? That is, after an image has been processed and the matrix for text is then modified with an operator like Td/TD, should that modify the matrix in its state after cm?
No, it is not the same matrix, and cm and Tm operate completely differently.
cm manipulates the current transformation matrix (CTM), an element of the PDF graphics state, which defines the transformation from user space to device space. It does so by multiplying the cm argument with the current value of the CTM, not by replacing the current value. (Cf. section 8.3.2.3 - User Space - and Table 57 - Graphics State Operators - in the PDF specification ISO 32000-1)
As an element of the PDF graphics state, the CTM is subject to the save and restore graphics state operators.
Tm, on the other hand, sets the text matrix, Tm, and the text line matrix, Tlm (it does not multiply its argument with the current value). Tm in combination with the text state parameters Tfs, Th, and Trise determines the transformation from text space to user space. (Cf. section 9.4.4 - Text Space Details - and Table 108 - Text-positioning operators - in the PDF specification ISO 32000-1)
Furthermore, at the beginning of a text object, Tm and Tlm are reset to the identity matrix, and Td and TD modify these matrices. Tm is advanced by text showing operators.
Conceptually, the entire transformation from text space to device space may be represented by a text rendering matrix, Trm = [Tfs×Th 0 0 Tfs 0 Trise] × Tm × CTM.
And considering your question details, Tm, TD, Td, and all the other text related operators have no effect on the CTM.
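As a rough illustration of how the text state, Tm and the CTM combine, the following sketch computes a text rendering matrix for the [line 1] text of the question. It assumes the default text state (horizontal scaling Th = 1, rise Trise = 0) and a CTM of approximately identity, since 0.12 × 8.33333 ≈ 1. The helper names concat and text_rendering_matrix are made up for this example and do not belong to any library.

```python
def concat(m, n):
    """Return the product m x n of two PDF matrices given as [a, b, c, d, e, f]."""
    a, b, c, d, e, f = m
    A, B, C, D, E, F = n
    return [
        a * A + b * C, a * B + b * D,
        c * A + d * C, c * B + d * D,
        e * A + f * C + E, e * B + f * D + F,
    ]


def text_rendering_matrix(tfs, tm, ctm, th=1.0, trise=0.0):
    """Trm = [Tfs*Th 0 0 Tfs 0 Trise] x Tm x CTM (defaults for Th and Trise are assumptions)."""
    text_state = [tfs * th, 0.0, 0.0, tfs, 0.0, trise]
    return concat(concat(text_state, tm), ctm)


# Values taken from "/R8 9.96 Tf" and "0 1.00057 -1 0 138.24 112.791 Tm".
tm = [0, 1.00057, -1, 0, 138.24, 112.791]
ctm = [1, 0, 0, 1, 0, 0]  # assumed: 0.12 * 8.33333 is treated as exactly 1
trm = text_rendering_matrix(9.96, tm, ctm)
print(trm)
```

The translation part of the result is still (138.24, 112.791): the font size only scales the glyphs, it does not move the text origin.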
Referring to the quoted description of the transformations: how do I know which of those transformations can be applied together in one command? For example, for scaling, should only sx and sy be stated, with all other values 0 as in the example, which identifies it as scaling, while the 0 values should not be applied to the actual matrix? It is evident that skew and rotation cannot be applied in one command, as both use b and c. At the same time, translation and rotation are used together, as I see in the example above.
Any of those example transformations can be combined by simple matrix multiplication. As the text preceding those samples in the PDF specification states, this sub-clause lists the arrays that specify the most common transformations; it does not list all possible ones.
The PDF reference states that Tm replaces the current matrix (page 406) while cm concatenates (page 219). Considering this, my result is incorrect, as along with 1705/8.33333 we should also add the previous Ty position, which is 138.24; the result is then 342.84, which gives a wrong Ty position. What is wrong here?
In your analysis you ignored the operators q and Q, which cannot be ignored if you want a correct result.
The q operator pushes a copy of the entire graphics state onto the Graphics State stack.
The Q operator restores the entire graphics state to its former value by popping it from the stack.
As the CTM is part of the graphics state, Q replaces the CTM with the value stored at the time of the corresponding q. (Cf. section 8.4.2 - Graphics State Stack - in the PDF specification ISO 32000-1)
Thus, you have to take this into account. If we do, we get the following progression of active CTM and CTMs on the graphics state stack. As we are interested in coordinates in the default user space coordinate system (i.e. relative to the MediaBox [0 0 595 842] which due to the Rotate value 90 is oriented to have rising x coordinates going down and rising y coordinates going right), we start with the identity matrix as CTM value:
****CTM: [1 0 0 1 0 0]
****Stack: -
q
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0]
0.12 0 0 0.12 0 0 cm
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
1 g
472 471.922 4014 6073 re
f
0 G
0 g
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R7 12 Tf
0 1.00055 -1 0 71.52 336.711 Tm
[text 1] TJ
/R8 9.96 Tf
0 1.00057 -1 0 105.12 60.3506 Tm
[text 2] TJ
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
885 502.922 6 297 re
f
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R8 9.96 Tf
0 1.00057 -1 0 105.12 95.9906 Tm
[text 3] TJ
0 1.00057 -1 0 116.16 60.3505 Tm
[text 4] TJ
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
977 502.922 6 535 re
f
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R8 9.96 Tf
0 1.00057 -1 0 116.16 124.551 Tm
[text 5] TJ
0 1.00057 -1 0 127.2 60.3507 Tm
[text 6] TJ
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
1069 502.922 6 386 re
f
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R8 9.96 Tf
0 1.00057 -1 0 127.2 106.671 Tm
[text 7] TJ
0 1.00057 -1 0 138.24 60.3508 Tm
[text 8] TJ
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
1161 502.922 6 437 re
f
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R8 9.96 Tf
0 1.00057 -1 0 138.24 112.791 Tm
[line 1] TJ
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
1268 2621.92 m
1268 2675.92 l
1380 2675.92 l
1380 2621.92 l
h
W
n
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0] [0.12 0 0 0.12 0 0]
BT
/R9 11.04 Tf
0 0.999402 -1 0 162.6 314.631 Tm
<01> Tj
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
1268 2621.92 m
1268 4396.92 l
2049 4396.92 l
2049 2621.92 l
h
W
n
1 g
1267 2620.92 780 1775 re
f*
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R9 11.04 Tf
0 0.999402 -1 0 204.6 515.751 Tm
<01> Tj
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
0 0 1 RG
0 0 1 rg
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R9 11.04 Tf
0 0.999402 -1 0 227.16 355.071 Tm
[line 2] TJ
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
1903 2958.92 6 1101 re
f
0 G
0 g
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R9 11.04 Tf
0 0.999402 -1 0 227.16 487.191 Tm
<01> Tj
ET
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
0 1565 -408 0 1705 2732.92 cm
****CTM: [0 187.8 -48.96 0 204.6 327.95]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
/X0 Do
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
0 1738 -506 0 2659 2639.92 cm
****CTM: [0 208.56 -60.72 0 319.08 316.79]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
/X1 Do
Q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0]
q
****CTM: [0.12 0 0 0.12 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
8.33333 0 0 8.33333 0 0 cm
****CTM: [1 0 0 1 0 0]
****Stack: [1 0 0 1 0 0] [0.12 0 0 0.12 0 0]
BT
/R7 12 Tf
0 1.00055 -1 0 342 398.991 Tm
[line 3] TJ
ET
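The bookkeeping in the trace above can be reproduced mechanically. The following is a minimal sketch, not a PDF parser: it assumes the content stream has already been tokenized into (operands, operator) pairs, and the names concat and trace_ctm are made up for this example.

```python
def concat(m, n):
    """Return the product m x n of two PDF matrices given as [a, b, c, d, e, f]."""
    a, b, c, d, e, f = m
    A, B, C, D, E, F = n
    return [
        a * A + b * C, a * B + b * D,
        c * A + d * C, c * B + d * D,
        e * A + f * C + E, e * B + f * D + F,
    ]


def trace_ctm(ops):
    """Yield (operator, CTM in effect after it) for each (operands, operator) pair."""
    ctm = [1, 0, 0, 1, 0, 0]
    stack = []
    for operands, op in ops:
        if op == "q":
            stack.append(ctm)            # save the current graphics state (only the CTM here)
        elif op == "Q":
            ctm = stack.pop()            # restore the CTM saved by the matching q
        elif op == "cm":
            ctm = concat(operands, ctm)  # new CTM = cm argument x old CTM
        yield op, ctm


# A shortened excerpt of the stream from the question.
ops = [
    ([], "q"),
    ([0.12, 0, 0, 0.12, 0, 0], "cm"),
    ([], "q"),
    ([8.33333, 0, 0, 8.33333, 0, 0], "cm"),
    ([], "Q"),
    ([], "q"),
    ([0, 1565, -408, 0, 1705, 2732.92], "cm"),
    ([], "Do"),
]
states = list(trace_ctm(ops))
image_ctm = states[-1][1]
print(image_ctm)
```

The CTM in effect at the Do operator is approximately [0 187.8 -48.96 0 204.6 327.95], the value shown for /X0 in the trace. The earlier 8.33333 scaling does not leak into it because the Q before the image restored the CTM.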
If you prefer coordinates in the rotated default user space coordinate system (i.e. in the rectangle [0 -595 842 0] which is the MediaBox [0 0 595 842] rotated clockwise by 90° around the origin), you have to multiply the matrices above from the right by [0 -1 1 0 0 0].
For the CTMs used for the images this in particular means:
****CTM: [187.8 0 0 48.96 327.95 -204.6]
****Stack: [0 -1 1 0 0 0] [0 -0.12 0.12 0 0 0]
/X0 Do
...
****CTM: [208.56 0 0 60.72 316.79 -319.08]
****Stack: [0 -1 1 0 0 0] [0 -0.12 0.12 0 0 0]
/X1 Do
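To check the rotated values, here is a small sketch that multiplies the /X0 matrix from the right by [0 -1 1 0 0 0] and then maps the unit square, into which PDF images are drawn, through the result. The helper names concat and apply are hypothetical, and the input matrix is the rounded value from the trace.

```python
def concat(m, n):
    """Return the product m x n of two PDF matrices given as [a, b, c, d, e, f]."""
    a, b, c, d, e, f = m
    A, B, C, D, E, F = n
    return [
        a * A + b * C, a * B + b * D,
        c * A + d * C, c * B + d * D,
        e * A + f * C + E, e * B + f * D + F,
    ]


def apply(m, x, y):
    """Transform the point (x, y) by the PDF matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


rotate_cw = [0, -1, 1, 0, 0, 0]                 # 90 degree clockwise rotation
ctm_x0 = [0, 187.8, -48.96, 0, 204.6, 327.95]   # CTM at "/X0 Do", rounded
rotated = concat(ctm_x0, rotate_cw)

# An image occupies the unit square of the coordinate system set up by the CTM.
corners = [apply(rotated, x, y) for x, y in [(0, 0), (1, 0), (0, 1), (1, 1)]]
xs = [p[0] for p in corners]
ys = [p[1] for p in corners]
print(rotated)
print(min(xs), min(ys), max(xs), max(ys))
```

Under these assumptions the first image covers roughly x from 327.95 to 515.75 and y from -204.6 to -155.64 in the rotated system, so it is 187.8 units wide and 48.96 units high.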
According to the PDF reference v1.7 (page 206), transformations are applied in the following order: translate, rotate, scale or skew. And I thought that scaling is applied to the object itself, not to the Tx and Ty positioning. So is it right to compute 1705/8.33333 to identify the image position?
The specification more exactly says: If several transformations are combined, the order in which they are applied is significant. For example, first scaling and then translating the x axis is not the same as first translating and then scaling it. In general, to obtain the expected results, transformations should be done in the following order: Translate, Rotate, Scale or skew.
This is a recommendation, not a requirement, and it is merely meant to make the life of PDF creators easier. With some linear algebra knowledge one knows how to multiply matrices and what to expect regardless of the order; keeping to the recommended order merely makes things easier to understand.
Concerning your thought that scaling is applied to the object itself, not to the Tx and Ty positioning: the current transformation matrix (including any scaling in it) is applied to anything you do in user space. Usually, though, objects are located or anchored at the origin (0,0) of the current user coordinate system, and the origin is the fixed point of all pure scaling transformations. Thus, scaling won't change that location or anchor point.
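Both points, that the order matters and that pure scaling leaves the origin where it is, can be checked with a few lines. This is a sketch with made-up numbers (a translation by (10, 20) and a scaling by 2); concat and apply are hypothetical helper names.

```python
def concat(m, n):
    """Return the product m x n of two PDF matrices given as [a, b, c, d, e, f]."""
    a, b, c, d, e, f = m
    A, B, C, D, E, F = n
    return [
        a * A + b * C, a * B + b * D,
        c * A + d * C, c * B + d * D,
        e * A + f * C + E, e * B + f * D + F,
    ]


def apply(m, x, y):
    """Transform the point (x, y) by the PDF matrix m."""
    a, b, c, d, e, f = m
    return (a * x + c * y + e, b * x + d * y + f)


identity = [1, 0, 0, 1, 0, 0]
translate = [1, 0, 0, 1, 10, 20]
scale = [2, 0, 0, 2, 0, 0]

# "translate cm" followed by "scale cm" (the recommended order).
translate_then_scale = concat(scale, concat(translate, identity))
# "scale cm" followed by "translate cm".
scale_then_translate = concat(translate, concat(scale, identity))

print(apply(translate_then_scale, 0, 0))  # origin lands at the translation offset
print(apply(scale_then_translate, 0, 0))  # the offset itself got scaled
print(apply(scale, 0, 0))                 # pure scaling keeps the origin fixed
```

In the recommended order an object anchored at the origin ends up at (10, 20) and is merely drawn twice as large; in the other order the translation is scaled too and the origin ends up at (20, 40).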

add missing values in pandas dataframe - datacleaning

I have measurements stored in a data frame that looks like the one below.
These are measurements of PMs. The sensors measure four of them, pm1, pm2.5, pm5 and pm10, stored in the column indicator, under the conditions x1..x56, and give the measurements in the columns area and count. The problem is that under some conditions (columns x1..x56) the sensors didn't catch all the PMs. For every combination of the condition columns (x1..x56) I want all 4 PM values in the column indicator. If the sensor didn't catch one (that is, there is no row with that PM value for some combination of Xs), I should add it, with area and count set to 0.
x1 x2 x3 x4 x5 x6 .. x56 indicator area count
0 0 0 0 0 0 .. 0 pm1 10 56
0 0 0 0 0 0 .. 0 pm10 9 1
0 0 0 0 0 0 .. 0 pm5 1 454
.............................................
1 0 0 0 0 0 .. 0 pm1 3 4
ssl ax w 45b g g .. gb pm1 3 4
1 wdf sw d78 b fd .. b pm1 3 4
In this example, pm2.5 is missing for the first combination (all zeros), so I should add it and set its area and count to 0. The same goes for the second combination (the one that starts with 1). So my dummy example should look like this when I am finished:
x1 x2 x3 x4 x5 x6 .. x56 indicator area count
0 0 0 0 0 0 .. 0 pm1 10 56
0 0 0 0 0 0 .. 0 pm10 9 1
0 0 0 0 0 0 .. 0 pm5 1 454
0 0 0 0 0 0 .. 0 pm2.5 0 0
.............................................
1 0 0 0 0 0 .. 0 pm1 3 4
1 0 0 0 0 0 .. 0 pm10 0 0
1 0 0 0 0 0 .. 0 pm5 0 0
1 0 0 0 0 0 .. 0 pm2.5 0 0
ssl ax w 45b g g .. gb pm1 3 4
ssl ax w 45b g g .. gb pm10 0 0
ssl ax w 45b g g .. gb pm5 0 0
ssl ax w 45b g g .. gb pm2.5 0 0
1 wdf sw d78 b fd .. b pm1 3 4
1 wdf sw d78 b fd .. b pm10 0 0
1 wdf sw d78 b fd .. b pm5 0 0
1 wdf sw d78 b fd .. b pm2.5 0 0
How can I do that? Thanks in advance!
The key here is to create a MultiIndex from all combinations of x and indicator then fill missing records.
Step 1.
Create a vector of x columns:
df['x'] = df.filter(regex=r'^x\d+').apply(tuple, axis=1)
print(df)
# Output:
x1 x2 x3 x4 x5 x6 x56 indicator area count x
0 0 0 0 0 0 0 0 pm1 10 56 (0, 0, 0, 0, 0, 0, 0)
1 0 0 0 0 0 0 0 pm10 9 1 (0, 0, 0, 0, 0, 0, 0)
2 0 0 0 0 0 0 0 pm5 1 454 (0, 0, 0, 0, 0, 0, 0)
3 1 0 0 0 0 0 0 pm1 3 4 (1, 0, 0, 0, 0, 0, 0)
Step 2.
Create the MultiIndex from vector x and the indicator list, then reindex your dataframe.
mi = pd.MultiIndex.from_product([df['x'].unique(),
                                 ['pm1', 'pm2.5', 'pm5', 'pm10']],
                                names=['x', 'indicator'])
out = df.set_index(['x', 'indicator']).reindex(mi, fill_value=0)
print(out)
# Output:
x1 x2 x3 x4 x5 x6 x56 area count
x indicator
(0, 0, 0, 0, 0, 0, 0) pm1 0 0 0 0 0 0 0 10 56
pm2.5 0 0 0 0 0 0 0 0 0
pm5 0 0 0 0 0 0 0 1 454
pm10 0 0 0 0 0 0 0 9 1
(1, 0, 0, 0, 0, 0, 0) pm1 1 0 0 0 0 0 0 3 4
pm2.5 *0* 0 0 0 0 0 0 0 0
pm5 *0* 0 0 0 0 0 0 0 0
pm10 *0* 0 0 0 0 0 0 0 0
# Need to be fixed ----^
Step 3.
Group by the x index level and rebuild the x columns from each group's key (the tuple stored in x), so that the filled-in rows get the correct condition values.
out = out.filter(regex=r'^x\d+').groupby(level='x') \
         .apply(lambda x: pd.Series(dict(zip(x.columns, x.name)))) \
         .join(out[['area', 'count']]).reset_index()[df.columns[:-1]]
print(out)
# Output:
x1 x2 x3 x4 x5 x6 x56 indicator area count
0 0 0 0 0 0 0 0 pm1 10 56
1 0 0 0 0 0 0 0 pm2.5 0 0
2 0 0 0 0 0 0 0 pm5 1 454
3 0 0 0 0 0 0 0 pm10 9 1
4 1 0 0 0 0 0 0 pm1 3 4
5 1 0 0 0 0 0 0 pm2.5 0 0
6 1 0 0 0 0 0 0 pm5 0 0
7 1 0 0 0 0 0 0 pm10 0 0
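For comparison, the same result can be reached without the tuple column. The sketch below is an alternative technique, not the method above: it cross-joins the distinct condition combinations with the indicator list and left-merges the measurements back in. It assumes pandas 1.2 or newer (for how="cross") and uses a toy frame with only two condition columns.

```python
import pandas as pd

# Toy data: two condition columns stand in for x1..x56.
df = pd.DataFrame({
    "x1": [0, 0, 0, 1],
    "x2": [0, 0, 0, 0],
    "indicator": ["pm1", "pm10", "pm5", "pm1"],
    "area": [10, 9, 1, 3],
    "count": [56, 1, 454, 4],
})
x_cols = ["x1", "x2"]
indicators = ["pm1", "pm2.5", "pm5", "pm10"]

# Every condition combination that occurs, crossed with every indicator.
combos = df[x_cols].drop_duplicates()
full = combos.merge(pd.DataFrame({"indicator": indicators}), how="cross")

# Bring the measurements back; combinations without a measurement get NaN, then 0.
out = full.merge(df, on=x_cols + ["indicator"], how="left")
out[["area", "count"]] = out[["area", "count"]].fillna(0).astype(int)
print(out)
```

Because the condition columns are never collapsed into a tuple, no repair step for the x columns is needed afterwards.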

Confusion matrix output missing some labels for multi-label classification

I have difficulty creating a confusion matrix for multi-label classification to evaluate the performance of an MLPClassifier model. The confusion matrix output should be 10x10, but instead I get 8x8, as it doesn't show label values for 9 and 10. The class labels of the true and predicted labels range from 1 to 10 (unordered). The code looks like this:
import matplotlib.pyplot as plt
import seaborn as sns
side_bar = [1,2,3,4,5,6,7,8,9,10]
f, ax = plt.subplots(figsize=(12,12))
sns.heatmap(cm, annot=True, linewidth=.5, linecolor="r", fmt=".0f", ax = ax)
ax.set_xticklabels(side_bar)
ax.set_yticklabels(side_bar)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.show()
[Image: confusion matrix heatmap]
Edit: The code & output of the constructed confusion matrix are as follows:
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
print(cm)
str(cm)
[[20 0 0 1 0 5 1 0]
[ 3 0 0 0 0 0 0 0]
[ 1 1 0 1 0 1 0 0]
[ 3 0 0 0 0 3 1 1]
[ 0 0 0 0 0 1 0 0]
[ 3 0 0 1 0 2 1 1]
[ 3 0 0 0 0 0 0 2]
[ 1 0 0 0 0 0 0 1]]
'[[20 0 0 1 0 5 1 0]\n [ 3 0 0 0 0 0 0 0]\n [ 1 1 0 1 0
1 0 0]\n [ 3 0 0 0 0 3 1 1]\n [ 0 0 0 0 0 1 0 0]\n [ 3 0
0 1 0 2 1 1]\n [ 3 0 0 0 0 0 0 2]\n [ 1 0 0 0 0 0 0
1]]'
Could anyone suggest how I can fix this issue?

RuntimeError: Given groups=1, weight of size [32, 1, 3, 3], expected input[1, 3, 6, 7] to have 1 channels, but got 3 channels instead

There is a 6x7 numpy array:
<class 'numpy.ndarray'>
[[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]]]
The model trains normally when this array is passed to the following network:
# Imports assumed from the Stable-Baselines3 custom feature extractor pattern
import gym
import torch as th
import torch.nn as nn
from stable_baselines3.common.torch_layers import BaseFeaturesExtractor


class Net(BaseFeaturesExtractor):
    def __init__(self, observation_space: gym.spaces.Box, features_dim: int = 256):
        super(Net, self).__init__(observation_space, features_dim)
        # We assume CxHxW images (channels first)
        # Re-ordering will be done by pre-preprocessing or wrapper
        # n_input_channels = observation_space.shape[0]
        n_input_channels = 1
        print("Input channels:", n_input_channels)
        self.cnn = nn.Sequential(
            nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
            nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=0),
            nn.ReLU(),
            nn.Flatten(),
        )
        # Compute shape by doing one forward pass
        with th.no_grad():
            n_flatten = self.cnn(
                th.as_tensor(observation_space.sample()[None]).float()
            ).shape[1]
        self.linear = nn.Sequential(nn.Linear(n_flatten, features_dim), nn.ReLU())

    def forward(self, observations: th.Tensor) -> th.Tensor:
        return self.linear(self.cnn(observations))
6x7 numpy array is modified to 3x6x7 numpy array:
<class 'numpy.ndarray'>
[[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]]
[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]]
[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[1 1 1 1 1 1 1]]]
After modifying the array, it is giving this error:
RuntimeError: Given groups=1, weight of size [32, 1, 3, 3], expected
input[1, 3, 6, 7] to have 1 channels, but got 3 channels instead
In order to solve this problem, I have tried to change the number of channels:
n_input_channels = 3
However, now it is showing this error:
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected
input[1, 1, 6, 7] to have 3 channels, but got 1 channels instead
How can I make the network accept a 3x6x7 array?
Update:
I provide more code to make my case clear:
6x7 input array case:
...
board = np.array(self.obs['board']).reshape(1, self.rows, self.columns)
# board = board_3layers(self.obs.mark, board)
print(type(board))
print(board)
return board
Output:
<class 'numpy.ndarray'>
[[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]]]
Number of channels is 1:
n_input_channels = 1
It is working.
I am trying to modify array to 3x6x7:
board = np.array(self.obs['board']).reshape(1, self.rows, self.columns)
board = board_3layers(self.obs.mark, board)
print(type(board))
print(board)
return board
Output:
<class 'numpy.ndarray'>
[[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]]
[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]]
[[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[1 1 1 1 1 1 1]]]
Number of channels is 3:
n_input_channels = 3
I do not understand why it is showing this error:
RuntimeError: Given groups=1, weight of size [32, 3, 3, 3], expected input[1, 1, 6, 7] to have 3 channels, but got 1 channels instead
Your model can work with either 1-channel input or 3-channel input, but not both.
If you set n_input_channels=1, you can work with 1x6x7 input arrays.
If you set n_input_channels=3, you can work with 3x6x7 input arrays.
You must pick one of the options; you cannot have both simultaneously.
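One way to keep the two in step is to derive the channel count from the shape of the observations, as the commented-out line n_input_channels = observation_space.shape[0] in the question suggests. The sketch below is a stand-alone illustration, assuming the observation shape is (3, 6, 7); the names build_cnn and obs_shape are made up, and the layers are copied from the question's network.

```python
import torch as th
import torch.nn as nn


def build_cnn(n_input_channels: int) -> nn.Sequential:
    """Same convolution stack as in the question, with a configurable channel count."""
    return nn.Sequential(
        nn.Conv2d(n_input_channels, 32, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1),
        nn.ReLU(),
        nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=0),
        nn.ReLU(),
        nn.Flatten(),
    )


obs_shape = (3, 6, 7)            # assumed shape: channels, rows, columns
cnn = build_cnn(obs_shape[0])    # channel count follows the observation shape

batch = th.zeros((1, *obs_shape))  # a batch with one observation
with th.no_grad():
    features = cnn(batch)
print(features.shape)
```

If the declared observation space and the arrays returned by the environment disagree (one says 1x6x7, the other 3x6x7), one of the two forward passes will raise the error shown in the question, whichever value n_input_channels has.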

what is good way to generate a "symmetric ladder" or "adjacent" matrix using tensorflow?

(Update: I forgot to say that the input is batched.) Given a bool array, e.g. [[false, false, false, true, false, false, true, false, false], [false, true, false, false, false, false, true, false, false]], in which each "true" marks the boundary between separate sequences, I want to generate an adjacent matrix denoting the different groups separated by the boundaries. What is a good way to generate the following "symmetric ladder" matrix using Tensorflow?
[[
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 1 1]
[0 0 0 0 0 0 0 1 1]
]
[
[1 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 1 1 1 1 1 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
]]
Update Jun 15 2018:
Actually, I have just made some progress on this problem: if I can convert the input sequence from [false, false, false, true, false, false, true, false, false] to [1, 1, 1, 0, 2, 2, 0, 3, 3], I can get some result using the following Tensorflow code. But I am not sure whether there is a vector operation that can convert [false, false, false, true, false, false, true, false, false] to [1, 1, 1, 0, 2, 2, 0, 3, 3]?
import tensorflow as tf
sess = tf.Session()
x = tf.constant([1, 1, 1, 0, 2, 2, 0, 3, 3], shape=(9, 1), dtype=tf.int32)
y = tf.squeeze(tf.cast(tf.equal(tf.expand_dims(x, 1), x), tf.int32))
print(sess.run(y))
[[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[0 0 0 1 0 0 1 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 1 0 0 1 0 0]
[0 0 0 0 0 0 0 1 1]
[0 0 0 0 0 0 0 1 1]]
Update finally:
I was inspired a lot by @Willem Van Onsem.
The batched version can be solved with a small modification of @Willem Van Onsem's solution.
import tensorflow as tf
b = tf.constant([[False, False, False, True, False, False, True, False, False], [False, True, False, False, False, False, False, False, False]], shape=(2, 9, 1), dtype=tf.int32)
x = (1 + tf.cumsum(tf.cast(b, tf.int32), axis=1)) * (1-b)
x = tf.cast(tf.equal(x, tf.transpose(x, perm=[0,2,1])),tf.int32) - tf.transpose(b, perm=[0,2,1])*b
with tf.Session() as sess:
    print(sess.run(x))
But I am not sure whether there is a vector operation that can convert [False, False, False, True, False, False, True, False, False] to [1, 1, 1, 0, 2, 2, 0, 3, 3]
There is. Consider the following example:
b = tf.constant([False, False, False, True, False, False, True, False, False], shape=(9,), dtype=tf.int32)
then we can use tf.cumsum(..) to generate:
>>> print(sess.run(1+tf.cumsum(b)))
[1 1 1 2 2 2 3 3 3]
If we then multiply the values with the opposite of b, we get:
>>> print(sess.run((1+tf.cumsum(b))*(1-b)))
[1 1 1 0 2 2 0 3 3]
So we can store this expression in a variable, for example x:
x = (1+tf.cumsum(b))*(1-b)
I want to generate an adjacent matrix denoting the different group separated by the boundary. What is a good way to generate following "symmetric ladder" matrix using Tensorflow?
If we follow your approach, we only have to remove the points where both lists are 0 at the same time. We can do this with:
tf.cast(tf.equal(x, tf.transpose(x)),tf.int32) - tf.transpose(b)*b
So here we use your approach: we basically broadcast x against the transpose of x and check for elementwise equality, and then we subtract the element-wise product of b and its transpose from it. This then yields:
>>> print(sess.run(tf.cast(tf.equal(x, tf.transpose(x)),tf.int32) - tf.transpose(b)*b))
[[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[1 1 1 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 1 1 0 0 0]
[0 0 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 1 1]
[0 0 0 0 0 0 0 1 1]]
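The same cumsum trick can be tried out without a TensorFlow session. The following is a NumPy analogue of the batched computation, not TensorFlow code; the function name ladder is made up for this sketch.

```python
import numpy as np


def ladder(boundary):
    """Build the batched "symmetric ladder" matrix from a (batch, n) bool array."""
    b = np.asarray(boundary, dtype=np.int32)             # 1 at boundaries, 0 elsewhere
    x = (1 + np.cumsum(b, axis=1)) * (1 - b)             # group id per position, 0 at boundaries
    same = (x[:, :, None] == x[:, None, :]).astype(np.int32)
    both_boundary = b[:, :, None] * b[:, None, :]        # pairs where both positions are boundaries
    return same - both_boundary


batch = [
    [False, False, False, True, False, False, True, False, False],
    [False, True, False, False, False, False, False, False, False],
]
result = ladder(batch)
print(result[0])
```

For the first row of the batch this prints the 9x9 matrix shown above; the subtraction removes the entries where two boundary positions would otherwise match each other through their shared group id 0.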

tensorflow 0.8 one hot encoding

The data that I want to encode looks as follows:
print (train['labels'])
[ 0 0 0 ..., 42 42 42]
There are 43 classes, numbered 0-42.
Now I read that TensorFlow 0.8 has a new feature for one-hot encoding, so I tried to use it as follows:
trainhot=tf.one_hot(train['labels'], 43, on_value=1, off_value=0)
The only problem is that I think the output is not what I need:
print (trainhot[1])
Tensor("strided_slice:0", shape=(43,), dtype=int32)
Can someone nudge me in the right direction please :)
The output is correct and expected. trainhot[1] is the label of the second (0-based index) training sample, which is of 1D shape (43,). You can play with the code below to better understand tf.one_hot:
import tensorflow as tf

onehot = tf.one_hot([0, 0, 41, 42], 43, on_value=1, off_value=0)
with tf.Session() as sess:
    onehot_v = sess.run(onehot)
    print("v: ", onehot_v)
    print("v shape: ", onehot_v.shape)
    print("v[1] shape: ", onehot[1])
output:
v: [[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 1]]
v shape: (4, 43)
v[1] shape: Tensor("strided_slice:0", shape=(43,), dtype=int32)