WCF Unicode UrlEncoded Get not coming over nicely - wcf

I have a RESTful WCF service which accepts GET verbs with Unicode-encoded URLs. Strangely, the Unicode characters show up as little boxes when I receive the data on the server.
Is there something I have to tell the service contract to do so that Unicode URL-encoded GETs are translated into proper strings?
Here's my contract:
[OperationContract]
[WebGet(BodyStyle = WebMessageBodyStyle.Wrapped,
UriTemplate = "/Document/{Fragment}", RequestFormat = WebMessageFormat.Xml)]
Message GetDocumentFromSearchResult(string Fragment);
Here's a sample of the unicode I pass in:
%FF%FE%22%00O%FF%FE%20%00King%FF%FE%20%00of%FF
I get "King" and "of" OK, but the rest of the string comes through as little squares.
Gotta be a decoding issue?

What you are passing in looks strange: it appears to contain UTF-16 for the " character with Byte Order Marks. This is almost certainly a problem, so it looks more like an issue with your encoding of the input.
Usually, UTF-8 is used for URLs, as this fits much better with the protocol (no need to escape all of the NUL bytes in pure ASCII). This is likely to be what your service is expecting, so it doesn't decode correctly (as %FF%FE is not valid UTF-8).
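The mismatch described above can be reproduced outside WCF. The following is a minimal sketch in Python (not the service's own code), assuming the intended text was `"O King of`, which is a guess based on the sample. It decodes the first escapes from the question as UTF-8 and as UTF-16, then shows what a UTF-8 based client would send instead.

```python
from urllib.parse import quote, unquote_to_bytes

# First few escapes from the sample in the question.
sample = "%FF%FE%22%00O%FF%FE%20%00King"
raw = unquote_to_bytes(sample)

# Decoded as UTF-8, every 0xFF and 0xFE byte is invalid and is replaced
# by U+FFFD, which many UIs render as a box.
as_utf8 = raw.decode("utf-8", errors="replace")

# The first four bytes are a UTF-16 little-endian byte order mark
# followed by the code unit for the double-quote character.
first_char = raw[:4].decode("utf-16")

# Assumption: this is the text the client meant to send.
intended = '"O King of'
utf8_escaped = quote(intended, safe="")  # percent-encodes the UTF-8 bytes
```

The plain ASCII words survive either way, which matches the observation that "King" and "of" arrive intact while everything around them does not.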

Examine the characters using Fragment[i] to see what the actual characters are. That will remove the variable of what the Debugger or other output method may be showing you.

Related

System.Web.HttpUtility.UrlEncode method gives wrong result with different language value

I am using the System.Web.HttpUtility.UrlEncode method in my project. When I encode a name in English, I get the correct result. For example,
string temp = System.Web.HttpUtility.UrlEncode("Jewelry");
then I get the exact same string in the temp variable. But if I write the name in Russian, I get a different result.
string temp = System.Web.HttpUtility.UrlEncode("ювелирные изделия");
then I got value in temp variable like "%d1%8e%d0%b2%d0%b5%d0%bb%d0%b8%d1%80%d0%bd%d1%8b%d0%b5+%d0%b8%d0%b7%d0%b4%d0%b5%d0%bb%d0%b8%d1%8f"
Can anyone help me how to achieve exact name as per language?
Thank you!
Actually, the method has "done the right thing" for you!
It encodes non-ASCII characters so that the value is valid in a URL and can be transmitted over the Internet. If you put your temp variable into a URL as a parameter, the server side will decode it and get your original text back. That is what UrlEncode is for, so what you are seeing is not a problem at all.
For further reading on URL encoding, have a look at this link: http://www.w3schools.com/tags/ref_urlencode.asp
If you input that Russian word to the "URL Encoding Functions" part in the page I have given, it will return the same result as Web.HttpUtility.UrlEncode method does.
Can anyone help me how to achieve exact name as per language?
In short: not with that method, but it might depend on what your exact goal is.
In detail:
In general, URIs as defined by RFC 3986 (see Section 2: Characters) may contain any of the following characters: ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~:/?#[]@!$&'()*+,;=. Any other character needs to be encoded with percent-encoding (%hh).
This is why UrlEncode produces
UrlEncode("Jewelry") -> "Jewelry"
UrlEncode("ювелирные изделия") -> "%d1%8e%d0%b2%d0%b5%d0%bb%d0%b8%d1%80%d0%bd%d1%8b%d0%b5+%d0%b8%d0%b7%d0%b4%d0%b5%d0%bb%d0%b8%d1%8f"
The string "ювелирные изделия" contains characters that are not allowed in a URL as per RFC 3986.
Modern browsers can work with UTF-8 characters in a URL, so depending on your goal it might not be necessary to use UrlEncode(). See example: http://jsfiddle.net/ybgt96ms/
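To see that the encoded value really is the same text, here is a small round-trip sketch. It uses Python's `urllib.parse` as a stand-in for UrlEncode and its decoding counterpart, which is an assumption made for illustration; the one visible difference is that Python writes the hex digits in upper case.

```python
from urllib.parse import quote_plus, unquote_plus

name = "ювелирные изделия"

# Each character is turned into its UTF-8 bytes, each byte is written
# as %hh, and the space becomes '+'.
encoded = quote_plus(name)

# Decoding reverses both steps and yields the original text.
decoded = unquote_plus(encoded)

# The value reported in the question (lower-case hex digits).
from_question = (
    "%d1%8e%d0%b2%d0%b5%d0%bb%d0%b8%d1%80%d0%bd%d1%8b%d0%b5"
    "+%d0%b8%d0%b7%d0%b4%d0%b5%d0%bb%d0%b8%d1%8f"
)
```

Both `encoded` and `from_question` decode to the same Russian string, so nothing is lost; the percent-encoded form is just the transport representation.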

WCF Service With Restricted/Special Characters in Response

I have a WCF service which returns a string response containing &, <, >. For example:
<response>&amp;</response>
Actually, I'm sending the '&' char, but it is encoded for some reason.
Instead, I would like to send the decoded response. The response I want is: <response>&</response>
Could someone suggest how to achieve this?
Thanks.
The XML Serializer has to encode special characters in order to generate a well-formed XML document.
According to the XML specification, the ampersand character must not appear in its literal form in the document (except inside a comment, a processing instruction, or a CDATA section), hence the encoding.
The consuming application should be aware of this already, and will know to decode the &amp; back to an ampersand. However, if you are looking at your response in an XML-based tool such as SoapUI, you'll continue to see the symbol in its encoded form.
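The round trip described above can be checked with any XML library. This sketch uses Python's `xml.etree.ElementTree` rather than the WCF serializer, so it only illustrates the general XML behaviour, not WCF itself.

```python
import xml.etree.ElementTree as ET

# Build the element with a literal ampersand as its text.
root = ET.Element("response")
root.text = "&"

# Serializing escapes the ampersand so the document stays well-formed.
wire = ET.tostring(root, encoding="unicode")

# A consumer that parses the XML gets the literal ampersand back.
received = ET.fromstring(wire).text
```

The escaped form exists only on the wire; a client that parses the response, rather than reading the raw text, never sees `&amp;`.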
You can use
System.Web.HttpUtility.HtmlDecode()
on your particular data contract property or on the entire response string. Alternatively, you can use the method in the article below, which is not very developer friendly but may be useful if you have lots of special characters in your response.
http://seroter.wordpress.com/2007/11/09/xml-web-services-and-special-characters/

IOS JSON escaping special characters

I'm working in iOS and trying to pass some content to a web server via an NSURLRequest. On the server I have a PHP script set up to accept the request string and convert it into a JSON object using the Zend_JSON framework. The issue I am having is that whenever the character "ø" is in any part of the request parameters, the request string is cut short by one character.
Request string before going to server.
[{"description":"Blah blah","type":"Russebuss","name":"Roscoe Simulator","appVersion":"1.0.20","osVersion":"IOS 5.1","phone":"5555555","country":"Østfold","udid":"bed164974ea0d436a43f3cdee0e005a1"}]
Request string on server before any parsing
[{"description":"Blah blah","type":"Russebuss","name":"Roscoe Simulator","appVersion":"1.0.20","osVersion":"IOS 5.1","phone":"5555555","country":"Nord-Trøndelag","udid":"bed164974ea0d436a43f3cdee0e005a1"}
Everything looks exactly the same except the final closing ] is missing. I'm thinking it's having an issue when converting the string to UTF-8, but not sure the correct way to fix this issue.
Does anyone have any ideas why this is happening?
First of all, do not trust the Xcode console in such cases. You never know which encoding the console is actually using.
Second, escape the invalid characters before you build your JSON string. The easiest way is probably to make sure you are using the same Unicode encoding, such as UTF-8, everywhere.
Third, if there are still invalid characters, use a JSON library with a parser (it does the encoding for you). Validate the output by parsing it back to, e.g., an NSString, or validate it manually with a web form like http://jsonformatter.curiousconcept.com/
The worst way is to replace the single characters in the string, build your JSON, and convert back. One way to do this would be to replace, e.g., a German ä with its Unicode representation U+00E4 (http://www.utf8-chartable.de/).
That's the way I do it. I am glad that I never needed to go further than step three, and this is the step you should do anyway to keep your code simple.
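Steps two and three can be sketched as follows, in Python rather than Objective-C and with a shortened, hypothetical payload. It lets a JSON library do the escaping and keeps everything in UTF-8. It also shows that the UTF-8 byte count is one larger than the character count when a single "Ø" is present; that would fit the one-character truncation if a length was computed in characters somewhere, but that cause is an assumption the thread does not confirm.

```python
import json

# Hypothetical, shortened version of the request from the question.
payload = [{"name": "Roscoe Simulator", "country": "Østfold"}]

# Let the library build the JSON, then encode it as UTF-8 explicitly.
text = json.dumps(payload, ensure_ascii=False)
body = text.encode("utf-8")

# Alternative: have the library escape all non-ASCII characters, so the
# body is pure ASCII and byte count equals character count.
ascii_text = json.dumps(payload)

# Validate by parsing the bytes back, as the third step suggests.
round_trip = json.loads(body.decode("utf-8"))
```

Either form parses back to the same data on the server, so the choice mainly depends on whether every hop in between handles UTF-8 bodies correctly.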
Please try to use Zend's built-in JSON encoder:
Zend_Json::$useBuiltinEncoderDecoder = true;
should fix your issue.

Objective C - char with umlaute to NSString

I am using libical which is a library to parse the icalendar format (RFC 2445).
The problem is, that there may be some german umlaute for example in the location field.
Now libical returns a const char * for each value like:
"K\303\203\302\274nstlerhaus in M\303\203\302\274nchen"
I tried to convert it to NSString with:
[NSString stringWithCString:icalvalue_as_ical_string_r(value) encoding:NSUTF8StringEncoding];
But what I get is:
KÃ¼nstlerhaus in MÃ¼nchen
Any suggestions? I would appreciate any help!
It seems your string got doubly UTF-8-encoded, because "KÃ¼nstlerhaus in MÃ¼nchen" is what the UTF-8 bytes of the correct string look like when read one byte per character; if you UTF-8-decode that again you should get the correct string.
Bear in mind, though, that you shouldn't be satisfied with that result. There are combinations where a doubly UTF-8-encoded string can't simply be decoded by doing a double UTF-8 decode. Some encoding combinations are irreversible. So in your situation I'd suggest you find out why the string got doubly UTF-8-encoded in the first place: perhaps the ical is stored in the wrong encoding on the hard disk, or libical uses the wrong character set to access it, or, if you're getting the ical from a server, perhaps the charset there is wrong for text/calendar, and so on.
The C string does not seem to be plain UTF-8, as there are four bytes for each umlaut. For example, ü would be encoded as \xc3\xbc (\303\274 in octal, 195 188 in decimal) in UTF-8. So the input is either already garbled when you receive it or it uses some other encoding.
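The double encoding is easy to confirm on the bytes from the question. This sketch is in Python rather than Objective-C and assumes the intermediate single-byte encoding was Latin-1, which works for this sample but is not guaranteed in general.

```python
# The bytes libical returned, written with the same octal escapes.
raw = b"K\303\203\302\274nstlerhaus in M\303\203\302\274nchen"

# One UTF-8 decode gives the garbled text seen in the question.
once = raw.decode("utf-8")

# Mapping each character back to a single byte (Latin-1) and decoding
# as UTF-8 a second time recovers the umlauts.
repaired = once.encode("latin-1").decode("utf-8")
```

As the answer warns, this is a repair for data that is already damaged; if the intermediate encoding was something other than Latin-1, the `encode` step can fail or produce different bytes, so fixing the source is the more reliable route.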

Char.ConvertFromUtf32 not available in Silverlight

I'm converting a WinForms app to Silverlight (VB.NET). What should I use instead of Char.ConvertFromUtf32 as it's not available to use in Silverlight?
UTF-32 is currently not part of Silverlight, so you have to find a way around the limitation. I think you should stop a moment and think exactly why you need to read UTF32-encoded text.
If you are reading such text from a database or a file on the server, I would perform the conversion server-side (if possible I would convert everything to UTF-8 and get rid of the UTF-32 data in one shot).
If you are parsing a user-provided file on the client side, I would detect the UTF-32 encoding and gently tell the user that the file encoding is not supported. UTF32 is pretty rare nowadays, so I guess it should not be a very common case (but I could be wrong not knowing your exact situation).
In order to detect the file encoding, you have to look at the first few bytes (the byte order mark). If they are not present, the task becomes much harder and involves some kind of heuristics based on character frequency.
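The byte order mark check could look roughly like this. It is a sketch in Python, not Silverlight code, and `detect_bom` is a hypothetical helper name. The one subtle point is ordering: the UTF-32 little-endian mark starts with the same two bytes as the UTF-16 little-endian mark, so UTF-32 has to be tested first.

```python
import codecs

# Longest marks first: BOM_UTF32_LE begins with the bytes of BOM_UTF16_LE.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding named by a leading byte order mark, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None
```

With a result of `"utf-32-le"` or `"utf-32-be"` the client could show the "encoding not supported" message suggested above; a result of `None` means no mark was found and heuristics would be needed.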
From: https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/types/how-to-convert-between-hexadecimal-strings-and-numeric-types
You can use a direct cast, like:
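// From the linked sample: get the character corresponding to the integral value.
string stringValue = Char.ConvertFromUtf32(value); // the call that is missing in Silverlight
char charValue = (char)value;                      // the direct cast you can use instead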
Small warning: the cast will only work for values up to 0xFFFF. It will not work for high-range Unicode from 0x10000 to 0x10FFFF, which needs two UTF-16 code units (a surrogate pair).
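For values above 0xFFFF, the surrogate pair can be computed by hand. The sketch below is in Python, and `utf16_units` is a hypothetical helper, not a Silverlight API; it shows the arithmetic that would have to be ported, with each returned unit being the kind of value a direct cast to a char can hold.

```python
def utf16_units(code_point: int) -> list:
    """Return the UTF-16 code unit(s) for a Unicode code point."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not a valid Unicode scalar value")
    if code_point <= 0xFFFF:
        # Fits in one unit: this is the case the direct cast handles.
        return [code_point]
    # Split the 20-bit offset into a high and a low surrogate.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)
    low = 0xDC00 + (offset & 0x3FF)
    return [high, low]
```

For example, 0x1F600 yields the pair 0xD83D, 0xDE00, while anything in the basic range comes back as a single unit unchanged.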
Also, if you need to parse \uXXXX, try this other question: How do I convert Unicode escape sequences to Unicode characters in a .NET string?