Xamarin Line Break in label for long words in german - xaml

I'm developing an app in Xamarin and I have the following scenario.
I have a label with maxLines = 3. For some long words in German or French, the line break doesn't separate the word's syllables correctly. For example, the correct separation of the word "Bezirksschornsteinfegermeister" is "Bezirkss–chorn–ste–in–fegermeis–ter" but in the app it is showing like this
It should be Bezirksschorn (first line) steinfegermeister (second line).
This is my code:
<Label
Margin="24,0,24,14"
FontSize="18"
HorizontalTextAlignment="Center"
LineBreakMode="TailTruncation"
MaxLines="3"
Text="Bezirksschornsteinfegermeister"
VerticalTextAlignment="End" />
The line break for long words works fine for English words but not for German or French words. Is there any library that performs correct line-break separation for German and French?

If you go through the options for LineBreakMode in the official docs as below:
HeadTruncation – truncates the head of the text, showing the end.
CharacterWrap – wraps text onto a new line at a character boundary.
MiddleTruncation – displays the beginning and end of the text, with the middle replaced by an ellipsis.
NoWrap – does not wrap text, displaying only as much text as can fit on one line.
TailTruncation – shows the beginning of the text, truncating the end.
WordWrap – wraps text at the word boundary.
Obviously, none of these modes can divide a long string into separate words. You may need to add spaces or other characters to separate the words.
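One way to "add other characters" is to place invisible break hints by hand. The sketch below is in Python purely for illustration: it joins manually chosen word parts with U+00AD SOFT HYPHEN. The helper name and the split point are made up for this example, and whether a given Label renderer honors soft hyphens is platform dependent and is an assumption here.

```python
SOFT_HYPHEN = "\u00ad"  # a hyphenation hint; normally shown only where a line breaks

def with_break_hints(parts):
    """Join hand-picked word parts with soft hyphens (hypothetical helper)."""
    return SOFT_HYPHEN.join(parts)

# Split point chosen by hand to match the desired two-line layout.
hinted = with_break_hints(["Bezirksschorn", "steinfegermeister"])

# Removing the hint gives back the original word unchanged.
print(hinted.replace(SOFT_HYPHEN, "") == "Bezirksschornsteinfegermeister")
```

The resulting string would then be assigned to the label's Text from code rather than typed into the XAML.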

This isn't a Xamarin issue. If you are testing on Android, then the German language is not supported, only English; check this issue. Not sure about iOS.

Related

Xamarin Editor line breaks using XAML

I am trying to add line breaks to an editor placeholder in Xamarin with XAML. Unfortunately I can't use \n or <br/> for new lines.
Does anyone have an idea how to work around this behavior?
I tried this and it won't work:
<Editor Placeholder="This is one line \n this is the next one."/>
Expected result:
This is one line
this is the next one.
My result:
This is one line \n this is the next one.
You should be able to use &#10;, i.e.:
<Editor Placeholder="This is one line &#10; this is the next one."/>
The &# notation is an XML encoding for special characters. See also this article on Wikipedia.
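The difference between a character reference and a backslash escape can be seen with any XML parser. The sketch below uses Python's standard ElementTree as a stand-in; it is not the Xamarin XAML loader, which is only assumed here to follow the same XML rules for attribute values.

```python
import xml.etree.ElementTree as ET

# A character reference is decoded by the XML parser into a real line feed.
with_ref = ET.fromstring(
    '<Editor Placeholder="This is one line&#10;this is the next one."/>'
)

# A backslash escape means nothing to XML, so it stays as two literal characters.
with_escape = ET.fromstring(
    r'<Editor Placeholder="This is one line \n this is the next one."/>'
)

print(repr(with_ref.get("Placeholder")))
print(repr(with_escape.get("Placeholder")))
```

The first value contains an actual newline character, while the second still contains a backslash followed by the letter n, which matches the result reported in the question.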

French character display on xaml page

I have some special characters in my French content. When I view the characters in the XAML code, I can see the proper text in Visual Studio. But at runtime, the text is not rendered properly.
For example: <TextBlock xml:lang="fr-CA" Foreground="Black" Text="Bay Nº doit contenir uniquement caractères alphabétiques ou numériques"/>
In the given text, the underscore that we can see after the N and below the o is missing when the page runs.
Has anyone faced this issue/does anyone have any idea on resolving this issue?
So I think what you're running into is a mix-up between the ordinal indicator and the numero sign, in which case a workaround would be to use the Numero Unicode hex directly as the character instead of relying on a single ordinal after the N.
Numero hex: &#x2116;
Used in place of Nº, it should render as desired both in the designer and at runtime.
Hope this helps, cheers!
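To make the ordinal/numero distinction concrete, the sketch below (Python, for illustration only) prints the code point and Unicode name of each character and formats a character as an XML hex reference. It assumes the question's text uses U+00BA after the N; the helper names are made up for this example.

```python
import unicodedata

def describe(text):
    """Return (code point, Unicode name) for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]

def xml_hex_ref(ch):
    """Format one character as an XML hexadecimal character reference."""
    return f"&#x{ord(ch):X};"

print(describe("N\u00ba"))    # the letter N followed by an ordinal indicator
print(describe("\u2116"))     # the single numero character
print(xml_hex_ref("\u2116"))  # the reference to paste into the XAML attribute
```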

Set custom word boundaries in UILabel

I'm displaying a multiline NSAttributedString on a UILabel, and I have a problem with the line breaking. When wrapping a word that ends with a plus sign ('+'), the UILabel breaks the line before the '+' sign.
I tried every lineBreakMode available but no matter what I do, if the last word of the line ends with '+', it'll break before it.
For example, using the text "My name is Fred and C++ is my language"
The UILabel will render in two lines like this:
"My name is Fred and C"
"++ is my language"
This article in Apple's documentation (link) says:
The text system determines word boundaries in a language-specific manner according to Unicode Standard Annex #29 with additional customization for locale as described in that document. On OS X, Cocoa presents APIs related to word boundaries, such as the NSAttributedString methods doubleClickAtIndex: and nextWordFromIndex:forward:, but you cannot modify the way the word-boundary algorithms themselves work.
Any ideas?
Put a Unicode U+2060 WORD JOINER between each of the visible characters in C++. You can use \u2060 in a string literal, or you can use the Unicode Hex Input keyboard to type it as ⌥2060.
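The string manipulation itself is small. The sketch below is in Python for illustration rather than Objective-C or Swift, and the helper names are hypothetical; it only shows where the U+2060 characters end up in the text.

```python
WORD_JOINER = "\u2060"  # zero-width character that prohibits a line break at its position

def glue(token):
    """Insert a word joiner between every pair of characters in token."""
    return WORD_JOINER.join(token)

def protect(text, token):
    """Replace each occurrence of token in text with its glued form."""
    return text.replace(token, glue(token))

original = "My name is Fred and C++ is my language"
line = protect(original, "C++")

# The visible characters are unchanged; only invisible joiners were added.
print(line.replace(WORD_JOINER, "") == original)
```

The protected string is what would be handed to the label, so that "C++" is treated as one unbreakable unit.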

Parser not recognizing a dash

My program makes calculations on physics vectors. It allows copy/pasting from websites and then tries to parse the pasted text into the x, y, and z components automatically. I've come across one website (http://mathinsight.org/cross_product_examples) that has (3,−3,1). While that looks normal, that minus is actually not recognized by VB. Visually, it is longer than the normal minus (− and -), but both return the same Unicode value of 45. This picture shows the Unicode for every character (I added a minus in front of the first 3 for comparison) in the Textbox. Also, from this website, I had to use Ctrl+C because right-clicking shows that this is not simple HTML.
One is valid (the first), but the second gives VB fits as shown below. Either it won't compile (shown by the blue line below) or a simple assignment (the second one) wreaks havoc on my form.
I have tried using
vectorString.Replace("–", "-")
and pasting in the longer dash for the target string and a normal keystroke dash as the replacement, but nothing happens. I'm guessing that is because they both have the same Unicode value.
Is there some way to convert the longer, invalid dash into the one recognized by VB? I tried using the dash symbol that Word likes to replace the minus sign with, and it comes up as Unicode 150. So, apparently there are at least three different kinds of dashes. Any thoughts?
The character from Math Insight is U+2212, minus sign. The character you tried using in your Replace call is U+2013, en dash. That's why your replace didn't work.
Beyond the standard ASCII hyphen-minus (-, U+002D, decimal 45), there are two common dashes: the en dash (–, U+2013) and the em dash (—, U+2014). There is also a figure dash (‒, U+2012), but it is not as common.
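A parser can sidestep the problem by mapping every dash-like character to the ASCII hyphen-minus before parsing. The sketch below is in Python for illustration; the function names and the exact set of characters covered are choices made for this example. In VB, the same replacement would presumably need to target the U+2212 code point (for instance via ChrW(&H2212)) rather than a pasted en dash.

```python
# Dash-like characters mapped to the ASCII hyphen-minus (U+002D).
DASH_TABLE = str.maketrans({
    "\u2212": "-",  # minus sign
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2012": "-",  # figure dash
})

def normalize_dashes(text):
    """Replace the dash variants listed above with '-'."""
    return text.translate(DASH_TABLE)

def parse_vector(text):
    """Parse text such as '(3,-3,1)' into a tuple of floats."""
    cleaned = normalize_dashes(text).strip().strip("()")
    return tuple(float(part) for part in cleaned.split(","))

# The pasted text uses U+2212, which float() would reject without normalization.
print(parse_vector("(3,\u22123,1)"))
```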

Preserve "long" spaces in PDFBox text extraction

I am using PDFBox to extract text from PDF.
The PDF has a tabular structure, which is quite simple, and the columns are also very widely spaced from each other.
This works really well, except that all kinds of horizontal space get converted into a single space character, so that I cannot tell columns apart anymore (space within words in a column looks just like space between columns).
I appreciate that a general solution is very hard, but in this case the columns are really far apart so that having a simple differentiation between "long spaces" and "space between words" would be enough.
Is there a way to tell PDFBox to turn horizontal whitespace of more than x inches into something other than a single space? A proportional approach (x inches become y spaces) would also work.
The pdftotext C library/tool has a '-layout' switch that tries to preserve the layout. Basically, if I can emulate that with PDFBox, that would be perfect.
There does not seem to be a setting for this, but I was able to modify the source for the PDFTextStripper tool to output a column separator (|) when a "long" space was encountered. In the code where it builds the output line, it is possible to look at the x positions of the current and previous letter and, if the gap between them is large enough, do something special. PDFTextStripper has lots of protected methods, but it turned out to be not really all that extensible. I ended up having to copy the whole class to change a private method.
Looking at the code in there, I call myself lucky that with the particular PDF, this simple approach was successful. A more general solution seems very tricky.
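The gap check described above can be sketched independently of PDFBox. The Python below is not the PDFTextStripper code; it assumes each glyph on a line is available as a hypothetical (x, width, char) tuple, sorted by x, and the threshold value is arbitrary.

```python
def join_with_separators(glyphs, gap_threshold, separator="|"):
    """Build one output line, adding a separator wherever the horizontal gap is large.

    glyphs: list of (x, width, char) tuples for a single line, sorted by x.
    """
    out = []
    prev_end = None  # right edge of the previous glyph
    for x, width, char in glyphs:
        if prev_end is not None and x - prev_end > gap_threshold:
            out.append(separator)
        out.append(char)
        prev_end = x + width
    return "".join(out)

# Two tightly spaced glyphs, a wide gap, then two more.
glyphs = [(100, 6, "A"), (106, 6, "B"), (250, 6, "C"), (256, 6, "D")]
print(join_with_separators(glyphs, gap_threshold=20))  # prints AB|CD
```

A real extractor would also have to handle ordinary word spaces and varying font sizes, which this sketch ignores.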
PDF text extraction is difficult.
If the text was output as one big string separated by spaces, such as:
PDFTextOut("Column 1        Column 2        Column 3");
and you are using a fixed-width font such as Courier, then you could theoretically calculate the number of spaces between items of text because each character is the same width. If the font is proportional, such as Arial, then the calculation is harder.
In reality, most PDFs are generated by individually placing each piece of text directly into its position. Therefore, there is technically no space character or any other character between columns. The text is just placed at an absolute position on the page.
PDFMoveTo(100,100);
PDFTextOut("Column 1");
PDFMoveTo(250,100);
PDFTextOut("Column 2");
In order to perform data extraction on PDF documents you have to do a little bit more work to find and match column data by using pixel locations as you have mentioned and by making some assumptions and having a little bit of luck.
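Matching column data by position might look like the Python sketch below. It assumes the extractor can supply each text fragment as a hypothetical (x, y, text) tuple; fragments whose y values are close are treated as one row, and each row is ordered by x. The tolerance is an arbitrary choice.

```python
def rows_from_fragments(fragments, y_tolerance=2.0):
    """Group (x, y, text) fragments into rows by y, then order each row by x."""
    rows = []  # each entry is [reference_y, [(x, text), ...]]
    for x, y, text in sorted(fragments, key=lambda f: f[1]):
        if rows and abs(y - rows[-1][0]) <= y_tolerance:
            rows[-1][1].append((x, text))  # close enough to the current row
        else:
            rows.append([y, [(x, text)]])  # start a new row
    return [[text for _, text in sorted(cells)] for _, cells in rows]

fragments = [
    (250, 100, "Column 2"),
    (100, 100, "Column 1"),
    (100, 120.5, "a"),
    (250, 120, "b"),
]
print(rows_from_fragments(fragments))
```

Rows come out in ascending y order, so a caller may need to reverse them depending on which way the page's coordinate system runs.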