Convert unicode chars in xml to ascii - objective-c

InDesign can export an XML file, and it will also "Remap break, whitespace, and special characters" if you check the box to do that. How can I do the same thing on text?
For example, if I have: •this is a bullet —long dash
InDesign exports as: &#8226;this is a bullet &#8212;long dash
I don't know what kind of encoding this is. Can a standard Objective-C class do this (working on OS X) or a third party library?

They are an HTML thing. People usually refer to this as HTML encoding, but technically they are numeric character references. There is nothing built into Foundation that decodes them, but if you look around you can find some code to handle it (for example, https://github.com/mwaterfall/MWFeedParser/blob/master/Classes/NSString+HTML.m).
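To make the format concrete, here is a minimal sketch in Python (not Objective-C, and not what InDesign runs internally) that produces and reverses decimal numeric character references using only the standard library. The function names to_numeric_refs and from_numeric_refs are made up for this illustration.

```python
import html

def to_numeric_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference such as &#8226;."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

def from_numeric_refs(text: str) -> str:
    """Turn numeric (and named) character references back into characters."""
    return html.unescape(text)

original = "\u2022this is a bullet \u2014long dash"
encoded = to_numeric_refs(original)
print(encoded)                     # &#8226;this is a bullet &#8212;long dash
print(from_numeric_refs(encoded) == original)  # True
```

The same idea carries over to Objective-C: walk the string, and for each character outside the ASCII range emit "&#", the decimal code point, and ";".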

Related

Content in meta tag

Like the first image, the meta tag is displayed correctly in inspect elements mode but incorrectly displayed in view page source mode as in the second image. Thank you for suggesting a solution to this problem.
I understood the answer:
By default, the HTML encoding engine safelists only the Basic Latin range, because browsers have bugs and the encoder tries to protect against unknown problems. The &XXX values you see in the page source still render correctly, as your screenshots show, so there's no real harm aside from the increased page size.
If the increased page size bothers you, you can customise the encoder to safelist your own character ranges (not languages; Unicode doesn't think in terms of language).
To widen the set of characters the encoder treats as safe, insert the following line into the ConfigureServices() method in Startup.cs:
services.AddSingleton<HtmlEncoder>(
HtmlEncoder.Create(allowedRanges: new[] { UnicodeRanges.BasicLatin,
UnicodeRanges.Arabic }));
Arabic has quite a few blocks in Unicode, so you may need to add more blocks to get the full range you need.
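The safelisting idea can be sketched in a few lines of Python. This is a simplified illustration of the concept, not the ASP.NET Core HtmlEncoder implementation: the function encode_html and the range constants are hypothetical, and the real encoder applies additional rules (for example for control characters).

```python
import html

# Inclusive code point ranges treated as safe. These mirror the Unicode
# blocks Basic Latin (U+0000-U+007F) and Arabic (U+0600-U+06FF).
BASIC_LATIN = (0x0000, 0x007F)
ARABIC = (0x0600, 0x06FF)

def encode_html(text, allowed_ranges):
    """Escape HTML-significant characters and anything outside allowed_ranges."""
    out = []
    for ch in text:
        cp = ord(ch)
        if ch in "<>&\"'":
            # Always escape characters that have meaning in HTML.
            out.append(html.escape(ch, quote=True))
        elif any(lo <= cp <= hi for lo, hi in allowed_ranges):
            out.append(ch)
        else:
            # Not safelisted: emit a hexadecimal character reference.
            out.append("&#x{:X};".format(cp))
    return "".join(out)

word = "\u0633\u0644\u0627\u0645"  # an Arabic word, four letters
print(encode_html(word, [BASIC_LATIN]))          # &#x633;&#x644;&#x627;&#x645;
print(encode_html(word, [BASIC_LATIN, ARABIC]) == word)  # True
```

With only Basic Latin allowed, every Arabic letter becomes a reference, which is what shows up in view-source; adding the Arabic range leaves the text untouched.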

Yii2 widget fileinput utf-8 charset problem

I would like to upload a file whose name contains UTF-8 characters such as Greek, German, etc. The upload succeeds and the file size and type are correct, but the filename is replaced by strange characters. When the filename uses only English characters, there is no problem at all.
Any idea what might be wrong with UTF-8 characters in filenames for this specific Yii2 widget plugin?
I provide you with the filename being generated for utf-8 characters
and additionally the function source code that produces filename via _slugDefault (added extra line for no special characters).
Regards
I found that it actually depends on the language settings of the server OS file system and not on the widget itself. So I used the following PHP function in my controller (the second argument to iconv is the target character set, followed by //TRANSLIT):
$file_name=iconv('UTF-8', 'language//TRANSLIT',$model->field);
$file->saveAs('files/'.$file_name);
Thanks a lot, and I am indeed very happy to have solved it myself!
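For comparison, a related but different technique is to reduce a filename to plain ASCII by Unicode decomposition. The Python sketch below is not equivalent to iconv with //TRANSLIT: it only strips accents from letters that decompose into an ASCII base letter, and it silently drops everything else (Greek letters, for instance). The function name ascii_filename is made up for this example.

```python
import unicodedata

def ascii_filename(name: str) -> str:
    """Strip accents via NFKD decomposition and drop any remaining non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", name)
    return decomposed.encode("ascii", "ignore").decode("ascii")

print(ascii_filename("M\u00fcnchen.pdf"))  # Munchen.pdf
print(ascii_filename("\u03b1.txt"))        # .txt  (the Greek alpha is dropped)
```

Because whole scripts can vanish this way, a real upload handler would need a fallback name when the result is empty or collides with an existing file.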

Parsing text from plist to NSAttributedString

I'm loading in text from a plist and displaying it within my app. Ideally I want to be able to specify more complex formatting such as italics, bold and superscript text, and display it in a custom label such as TTTAttributedLabel.
Are there any available libraries to parse a string in a given format (preferably a simple text format such as Markdown or Textile) into an NSAttributedString? I am aware that solutions are available for parsing RTF and HTML, but this is overkill for my needs - plus I'd like the text to be easily written by hand.
Edit: this is for iOS/UIKit
Caught your edit. For iOS/UIKit there is a project out there called NSAttributedString+HTML that attempts to simulate the functionality available on OS X. On OS X, you would just use some minor HTML to format the string and then parse it into NSAttributedString (or objects, or websites, or files, etc.).
The project I mentioned above attempts to offer the same extensions on iOS. I don't know why, after 6 major releases of iOS, it still lacks such rich tools and pushes all the weight on UIWebView (over WebKit) but that's how it is.
I've just added an NSString to NSAttributedString lightweight markup parser to MGBoxKit. It's not Markdown or Textile but it's very similar. So far it supports bold, italics, underline, monospacing, and coloured text.
The MGMushParser class could be used standalone, and is fairly easy to extend.
NSString *markup = @"**bold**, //italics//, __underlining__, and `monospacing`, and {#0000FF|coloured text}";
UIFont *baseFont = [UIFont fontWithName:@"HelveticaNeue" size:18];
UIColor *textColor = UIColor.whiteColor;
myLabel.attributedString = [MGMushParser attributedStringFromMush:markup
font:baseFont color:textColor];
OHAttributedLabel also has a similar markup parser.
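To show how little is needed for this kind of lightweight markup, here is a rough Python sketch that splits a string into styled runs, which is the intermediate step before building an attributed string. It is not how MGMushParser or OHAttributedLabel are implemented; the function parse_runs and the style names are invented for illustration, nesting is not supported, and the colour syntax is left out.

```python
import re

# One alternative per style; the named group that matches tells us the style.
TOKEN = re.compile(
    r"\*\*(?P<bold>.+?)\*\*"
    r"|//(?P<italic>.+?)//"
    r"|__(?P<underline>.+?)__"
    r"|`(?P<mono>.+?)`"
)

def parse_runs(markup):
    """Return a list of (text, style) tuples; unstyled text gets style 'plain'."""
    runs = []
    pos = 0
    for m in TOKEN.finditer(markup):
        if m.start() > pos:
            runs.append((markup[pos:m.start()], "plain"))
        style = m.lastgroup          # name of the group that matched
        runs.append((m.group(style), style))
        pos = m.end()
    if pos < len(markup):
        runs.append((markup[pos:], "plain"))
    return runs

for text, style in parse_runs("**bold**, //italics//, and `monospacing`"):
    print(style, repr(text))
```

Each run maps naturally onto a range plus attributes (font trait, underline style, and so on) when the attributed string is assembled.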

Special characters in iText

I need help in using these symbols ⎕, ∨, ๐, Ʌ, and so on. But when I create a PDF with iText these symbols do not appear.
What can I do so that these symbols appear?
You have to use a font and encoding that contains those characters. Your best bet is to use IDENTITY_H for your encoding, as this grants you access to every character within a given font... but you still have to use the right font.
There are several font-manipulation examples within "iText in Action's" chapter on fonts:
http://www.itextpdf.com/book/chapter.php?id=11
The examples are down the right side. Buying the book would probably help too.
I had the same problem, and I found that using IDENTITY_H for the encoding works fine.
For example:
java.awt.Font f =...;
Font font = FontFactory.getFont(f.getName(), BaseFont.IDENTITY_H);
I don't understand why it doesn't work with BaseFont.WINANSI. WinAnsi is the standard Windows Cp1252 character set, the one used by my JVM. So if the character is displayed correctly in Java, why is that not the case in the PDF?
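One likely explanation: Cp1252 is a single-byte character set with room for only 256 characters, while Java strings are UTF-16 and can hold any character, so a symbol can exist in a Java string and still have no Cp1252 representation. The quick Python check below (a look at the encoding table only, not iText code) tests the symbols from the question; the helper fits_cp1252 is made up for this example.

```python
symbols = ["⎕", "∨", "๐", "Ʌ"]  # the symbols from the question

def fits_cp1252(ch: str) -> bool:
    """Return True if the character has a representation in Windows Cp1252."""
    try:
        ch.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False

for ch in symbols:
    print("U+{:04X} in Cp1252: {}".format(ord(ch), fits_cp1252(ch)))

print(fits_cp1252("A"))  # True
```

Every symbol in the list reports False, which is consistent with them disappearing under WINANSI and appearing under IDENTITY_H with a font that contains the glyphs.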
You can escape them according to the unicode escape sequence defined in the java language specification. See http://java.sun.com/docs/books/jls/first_edition/html/3.doc.html
If you are using IntelliJ IDEA for your code you can download the StringManipulation plugin, that does the escapes for you. In the settings of IDEA you can also set the "Transparent native-to-ascii conversion" checkbox under File encodings, and this should help do the trick.
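If you want to produce such escapes yourself, the conversion is mechanical. The Python sketch below turns non-ASCII characters into Java-style \uXXXX escapes, using UTF-16 code units so that characters outside the Basic Multilingual Plane become surrogate pairs, as Java expects. The function name java_escape is hypothetical; this is an illustration, not the plugin's code.

```python
def java_escape(text: str) -> str:
    """Replace each non-ASCII character with \\uXXXX escapes (UTF-16 code units)."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            units = ch.encode("utf-16-be")
            # Two bytes per UTF-16 code unit; supplementary characters yield two units.
            for i in range(0, len(units), 2):
                out.append("\\u{:02X}{:02X}".format(units[i], units[i + 1]))
    return "".join(out)

print(java_escape("\u25a1"))  # \u25A1
```

Note that escaping only changes how the source file is written; the font and encoding used for the PDF still have to cover the character.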
To draw a square in a PDF file with iText:
BaseFont bf = BaseFont.createFont("c:/windows/fonts/arialbd.ttf", BaseFont.IDENTITY_H, BaseFont.EMBEDDED);
question.add(new Phrase("\u25A1", new Font(bf, 26)));
You can see an example PDF file here

Objective-c code formatter site to create html that can be embedded into a blog

I'm looking for a site similar to http://www.manoli.net/csharpformat/ that lets you paste in a C# code snippet and formats it as HTML, with a CSS file, ready to post on your blog.
I need one that actually does this for Objective-C.
You want the GeSHi (Generic Syntax Highlighter) library. It's excellent, supports dozens of languages (including Objective-C, with the ability to automatically linkify classes/protocols to the documentation), and has support for many popular CMSs (Django, WordPress, Drupal, Joomla, Mambo, etc).
If you'd like to see it in action, you can check out nearly any wiki page on our local CocoaHeads website. For example: http://cocoaheads.byu.edu/wiki/different-nslog
Assuming you're on a Mac, copying code from Xcode will keep the syntax coloring. Any WYSIWYG blog editor should support that.
In case your blog software isn't WYSIWYG, you can paste into TextEdit and save as HTML. It outputs pretty crappy HTML considering it's just highlighted source code, but it's nonetheless compliant HTML.
Other than that, I don't know of an online service for that.
I use Pygments (Python) to generate syntax highlighting for source code examples embedded in my blog.
If your entry text is just the source code, it will work the same way for what you are after; I tested it with Objective-C as well.
I actually write plain-text blog posts in Markdown syntax in a file and paste in plain-text code examples. Then I run the file through a Markdown processor, which uses Pygments for highlighting, and store the result in a file.
It's as simple as:
import markdown
# 'codehilite' is the Markdown extension that calls Pygments
html = markdown.markdown(text, extensions=['codehilite'])
See the simple script at the link, which just takes the file name of your plain-text file and creates an HTML file.
Then I can copy and paste the generated HTML.
You have to link to or copy the CSS as well to get the syntax highlighting, but that's easy.
I do this for Blogger; see the example of how to use Markdown with Pygments for syntax highlighting.
