OutgoingWebResponseContext does not display non-english characters - wcf

We have implemented a REST-style GET service using WCF in .NET 3.5. This service retrieves research documents. The string 'synopsis' in the code below contains non-English characters, which the browser renders as "????????".
private void ReturnSynopsisInfo(IApiWebOperationContext context, OutgoingWebResponseContext outgoingResp, string synopsis)
{
    SetResponseHeaders(outgoingResp, HttpStatusCode.OK);
    outgoingResp.ContentType = "text/html; charset=UTF-8";
    context.Result = new MemoryStream(Encoding.ASCII.GetBytes(synopsis));
}
Any advice is much appreciated.
Thank You.

You are declaring the encoding as UTF-8 in the Content-Type header, but the bytes written to the stream are produced with the ASCII encoder. The ASCII encoder silently replaces every non-ASCII character with a question mark.
You probably want to use UTF8Encoding rather than ASCIIEncoding, i.e. Encoding.UTF8.GetBytes(synopsis) instead of Encoding.ASCII.GetBytes(synopsis).
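The same mismatch can be reproduced outside .NET. Below is a minimal sketch in Java, not the original service code: the class and method names are made up for illustration, and the sample string is an assumption. It encodes a string with one charset and then decodes the bytes as UTF-8, which is roughly what a browser does when the header says charset=UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingMismatch {
    // Encodes the text with wireCharset, then decodes the resulting bytes as UTF-8,
    // mimicking a client that trusts a "charset=UTF-8" header.
    static String sendAndReadAsUtf8(String text, Charset wireCharset) {
        byte[] wire = text.getBytes(wireCharset);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source file itself pure ASCII.
        String synopsis = "R\u00e9sum\u00e9";
        // The ASCII encoder cannot represent the accented letters and substitutes '?'.
        System.out.println(sendAndReadAsUtf8(synopsis, StandardCharsets.US_ASCII));
        // Encoding with UTF-8 matches the declared charset, so the text survives.
        System.out.println(sendAndReadAsUtf8(synopsis, StandardCharsets.UTF_8));
    }
}
```

Once the bytes have been produced by an ASCII encoder, the original characters are gone, so the fix has to happen where the bytes are created, not in the header.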

Related

WCF Change message encoding from Utf-16 to Utf-8

I have a WCF connected service in a .NET Core application. I'm using the code that is autogenerated from the WSDL definition.
Currently the request XML includes this line at the top:
<?xml version="1.0" encoding="utf-16"?>
I can't find a simple way to change this encoding to UTF-8 when sending the request.
Since I couldn't find a configuration option on the request/client objects, I tried to change the message with the following code in IClientMessageInspector.BeforeSendRequest:
public object BeforeSendRequest(ref Message request, IClientChannel channel)
{
    // Load a new xml document from current request
    var xmlDocument = new XmlDocument();
    xmlDocument.LoadXml(request.ToString());
    ((XmlDeclaration)xmlDocument.FirstChild).Encoding = Encoding.UTF8.HeaderName;
    // Create streams to copy
    var memoryStream = new MemoryStream();
    var xmlWriter = XmlWriter.Create(memoryStream);
    xmlDocument.Save(xmlWriter);
    xmlWriter.Flush();
    xmlWriter.Close();
    memoryStream.Position = 0;
    var xmlReader = XmlReader.Create(memoryStream);
    // Create a new message
    var newMessage = Message.CreateMessage(request.Version, null, xmlReader);
    newMessage.Headers.CopyHeadersFrom(request);
    newMessage.Properties.CopyProperties(request.Properties);
    return null;
}
But the newMessage object still writes the XML declaration using utf-16. I can see it in the watch window while debugging.
Any idea on how to accomplish this (what should be a) simple change would be much appreciated.
Which binding do you use to create the communication channel? The textMessageEncoding element of a CustomBinding has a writeEncoding attribute (the WriteEncoding property in code).
https://learn.microsoft.com/en-us/dotnet/framework/configure-apps/file-schema/wcf/textmessageencoding
The writeEncoding attribute specifies the character set encoding used for emitting messages on the binding. Valid values are:
UnicodeFffeTextEncoding: Unicode BigEndian encoding
Utf16TextEncoding: Unicode encoding
Utf8TextEncoding: 8-bit encoding
The default is Utf8TextEncoding. This attribute is of type Encoding.
Specific bindings such as BasicHttpBinding expose a TextEncoding property as well.
https://learn.microsoft.com/en-us/dotnet/api/system.servicemodel.basichttpbinding.textencoding?view=netframework-4.0
Feel free to let me know if there is anything I can help with.

How to determine a file's Unicode character encoding in iOS?

In our application I have to open a text file which is sometimes in UTF-8 format and sometimes in UTF-16 format.
Is there any way to determine the encoding of a file? Or is it possible to check whether the 'NSString' that was read is valid?
You can use the following do-catch blocks as stated in the documentation if you are forced to guess the encoding of your text file, which works for Swift 4.0:
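import Foundation

// `url` is assumed to be the URL of the text file to read.
// Receives the detected encoding; the initial value is only a placeholder.
var encodingType = String.Encoding.utf8
do {
    let str = try String(contentsOf: url, usedEncoding: &encodingType)
    print("Used for encoding: \(encodingType)")
} catch {
    do {
        let str = try String(contentsOf: url, encoding: .utf8)
        print("Used for encoding: UTF-8")
    } catch {
        do {
            let str = try String(contentsOf: url, encoding: .isoLatin1)
            print("Used for encoding: ISO Latin 1")
        } catch {
            // Error handling
        }
    }
}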
Apple's documentation has some guidance on how to proceed: String Programming Guide: Reading data with an unknown encoding:
If you are forced to guess the encoding (and note that in the absence of explicit information, it is a guess):
1. Try stringWithContentsOfFile:usedEncoding:error: or initWithContentsOfFile:usedEncoding:error: (or the URL-based equivalents). These methods try to determine the encoding of the resource, and if successful return by reference the encoding used.
2. If (1) fails, try to read the resource by specifying UTF-8 as the encoding.
3. If (2) fails, try an appropriate legacy encoding. "Appropriate" here depends a bit on circumstances; it might be the default C string encoding, it might be ISO or Windows Latin 1, or something else, depending on where your data is coming from.
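The same "detect, then UTF-8, then legacy" order can be sketched on raw bytes. The following is an illustrative Java sketch, not Apple's implementation and not an iOS API: the class name is made up, and detection is limited to a byte-order mark followed by a strict UTF-8 decode, with ISO-8859-1 assumed as the legacy fallback.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingGuess {
    // Guesses a charset: byte-order mark first, then strict UTF-8, then a legacy fallback.
    static Charset guess(byte[] data) {
        // Step 1: an explicit byte-order mark is the only hard evidence available.
        if (data.length >= 3 && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB && (data[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFE && (data[1] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16BE;
        }
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xFE) {
            return StandardCharsets.UTF_16LE;
        }
        // Step 2: accept UTF-8 only if every byte sequence is well-formed.
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return StandardCharsets.UTF_8;
        } catch (CharacterCodingException e) {
            // Step 3: legacy fallback; ISO-8859-1 is an assumption, pick what fits your data.
            return StandardCharsets.ISO_8859_1;
        }
    }
}
```

As the guide says, this remains a guess: UTF-16 text without a byte-order mark is not recognized by this sketch, and any byte sequence decodes "successfully" as ISO-8859-1.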

How to send base64 encoded file to PlayFramework server?

I'd like to implement a file upload using the new FileReader API. On the client side everything works well, and I can send a PUT request to the server with the correct fields containing the Base64-encoded file.
On the server side, however, it's not going well. Here are my results:
Logger.info(String.valueOf(request().body().asRaw())); // null
Logger.info(String.valueOf(request().body().asText())); // null
And most importantly :
Logger.info(String.valueOf(request().body().isMaxSizeExceeded())); // true !
What am I missing? How can I make it work?
I found the answer to my question !
For those who are looking for it, here's the answer :
You need to add a BodyParser as annotation for your method, and specify a higher maxLength value.
@BodyParser.Of(value = BodyParser.Json.class, maxLength = 1024 * 1024)
public static Result method() {
    Logger.info(String.valueOf(request().body().asJson())); // Will not be empty!
    return ok();
}
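When choosing a maxLength, keep in mind that Base64 inflates the payload: every 3 raw bytes become 4 characters. The sketch below is only an illustration of that arithmetic (the class and method names are made up), and it ignores the extra bytes added by the surrounding JSON.

```java
import java.util.Base64;

public class Base64Size {
    // Length of the padded Base64 text for rawBytes bytes: 4 characters per started group of 3 bytes.
    static int encodedLength(int rawBytes) {
        return 4 * ((rawBytes + 2) / 3);
    }

    public static void main(String[] args) {
        int fileSize = 750_000; // hypothetical file size in bytes
        // Cross-check the formula against the JDK encoder.
        int actual = Base64.getEncoder().encodeToString(new byte[fileSize]).length();
        System.out.println(encodedLength(fileSize) + " == " + actual);
        System.out.println("Fits in 1024 * 1024: " + (encodedLength(fileSize) <= 1024 * 1024));
    }
}
```

With maxLength = 1024 * 1024, this leaves room for files of roughly 768 KiB before the JSON overhead is counted.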

Crawl Wikipedia using ASP.NET HttpWebRequest

I am new to Web Crawling, and I am using HttpWebRequest to crawl data from sites.
So far I have been able to crawl and get data from my WordPress site. This was simple user profile data (name, email, AIM id, etc.).
Now as an exercise I want to crawl wikipedia, where I will search using the value entered into textbox at my end and then crawl wikipedia with the search value and get the appropriate title(s) from the search.
Now I have the following doubts/difficulties.
Firstly, is this even possible? I have heard that Wikipedia has a robots.txt set up to block this. I have only heard this from a friend, though, so I am not sure.
I am using the same procedure I used earlier, but I am not getting the required results.
Thanks !
Update :
After some explanation and help from @svick, I tried the code below, but I still cannot get any value (see the last line of the code, where I expect the HTML markup of the search results page).
string searchUrl = "http://en.wikipedia.org/w/index.php?search=Wikipedia&title=Special%3ASearch";
var postData = new StringBuilder();
postData.Append("search=" + model.Query);
postData.Append("&");
postData.Append("title=" + "Special:Search");
byte[] data2 = Crawler.GetEncodedData(postData.ToString());
var webRequest = (HttpWebRequest)WebRequest.Create(searchUrl);
webRequest.Method = "POST";
webRequest.UserAgent = "Crawling HW (http://yassershaikh.com/contact-me/)";
webRequest.AllowAutoRedirect = false;
ServicePointManager.Expect100Continue = false;
Stream requestStream = webRequest.GetRequestStream();
requestStream.Write(data2, 0, data2.Length);
requestStream.Close();
var responseCsv = (HttpWebResponse)webRequest.GetResponse();
Stream response = responseCsv.GetResponseStream();
// Todo Parsing
var streamReader = new StreamReader(response);
string val = streamReader.ReadToEnd();
// val is empty !! <-- this is my problem !
And here is my GetEncodedData method definition:
public static byte[] GetEncodedData(string postData)
{
    var encoding = new ASCIIEncoding();
    byte[] data = encoding.GetBytes(postData);
    return data;
}
Please help me with this.
You probably don't need to use HttpWebRequest. Using WebClient (or HttpClient if you're on .Net 4.5) will be much easier for you.
robots.txt doesn't actually block anything. If something doesn't support it (and .Net doesn't support it), it can access anything.
Wikipedia does block requests that don't have their User-Agent header set. And you should use an informative User-Agent string with your contact information.
A better way to access Wikipedia is to use its API rather than scraping. This way, you get an answer that is specifically meant to be read by a custom application, formatted as XML or JSON. There are also dumps containing all information from Wikipedia available for download.
EDIT: The problem with your newly posted code is that your query returns a 302 Moved Temporarily response pointing to the searched article, if it exists. Either remove the line that sets AllowAutoRedirect to false, or add &fulltext=Search to your query, which means you won't get redirected.
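To illustrate the API route, here is a minimal Java sketch that only builds a search URL for the MediaWiki API (action=query with list=search). The class and method names are made up for illustration, and URLEncoder.encode(String, Charset) assumes Java 10 or later. The request you send to this URL should still carry an informative User-Agent, as noted above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class WikipediaSearchUrl {
    // Builds a MediaWiki API search URL; the response is JSON rather than an HTML page.
    static String buildSearchUrl(String query) {
        // Percent-encode the user's input so characters like '&' cannot break the query string.
        String encoded = URLEncoder.encode(query, StandardCharsets.UTF_8);
        return "https://en.wikipedia.org/w/api.php"
                + "?action=query&list=search&format=json&srsearch=" + encoded;
    }

    public static void main(String[] args) {
        System.out.println(buildSearchUrl("Albert Einstein"));
    }
}
```

Because this is a plain GET with the search term in the query string, there is no POST body to encode and no redirect to the article page to handle.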

Getting around base64 encoding with WCF

I'm using WCF, REST and "pretty URI's" as shown in this blog post with the Online Template for VS 2010 .NET 4.0:
http://christopherdeweese.com/blog2/post/drop-the-soap-wcf-rest-and-pretty-uris-in-net-4
I have one problem though.
I want to return a raw byte[] array, but it automatically gets base64 encoded.
Unfortunately for my program base64 encoding is not acceptable because it will be too computationally intensive.
Is there a way for me to tell WCF NOT to base64 encode?
[WebGet(UriTemplate = "{id}")]
public byte[] Get(string id)
{
    byte[] data = new byte[1024];
    return data;
}
Appears to my web browser as:
AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==
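Use Stream as your return type. WCF writes a returned Stream to the response body as raw bytes instead of serializing it, so nothing is base64 encoded; for example, wrap your array in a MemoryStream and return that.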