REST Stream's OutgoingResponse.ContentType is ignored, always shows "application/xml" on receiving browser - wcf

I have a self-hosted WCF REST/webHttpBinding-endpoint-bound service. I have a few streams of different content types that it serves. The content itself is delivered correctly, but any OutgoingResponse.ContentType setting seems to be ignored and instead delivered as "application/xml" every time.
Browsers seem to get over it for JavaScript and HTML (depending on how it's to be consumed), but not for CSS files, which are interpreted more strictly. CSS files are how I noticed the problem, but it affects all Streams. Chromebug and the IE developer tools both show "application/xml" regardless of the content type I set in the serving code. I've also tried setting the Content-Type header directly as a header on OutgoingResponse, but that makes no difference, and it is probably just a long way of doing what OutgoingResponse.ContentType does already.
[OperationBehavior]
System.IO.Stream IContentChannel.Code_js()
{
WebOperationContext.Current.OutgoingResponse.ContentType = "text/javascript;charset=utf-8";
var ms = new System.IO.MemoryStream();
using (var sw = new System.IO.StreamWriter(ms, Encoding.UTF8, 512, true))
{
sw.Write(Resources.code_js);
sw.Flush();
}
ms.Position = 0;
return ms;
}
This behavior is added:
var whb = new WebHttpBehavior
{
DefaultBodyStyle = System.ServiceModel.Web.WebMessageBodyStyle.WrappedRequest,
DefaultOutgoingRequestFormat = System.ServiceModel.Web.WebMessageFormat.Json,
DefaultOutgoingResponseFormat = System.ServiceModel.Web.WebMessageFormat.Json,
HelpEnabled = false
};
I've tried setting AutomaticFormatSelectionEnabled = true and false just in case because it came up in google searches on this issue, but that has no effect on this.
I'm finding enough articles that show Stream and ContentType working together to confuse the heck out of me as to why this isn't working. I believe that the Stream is only intended to be the body of the response, not the entire envelope.
My .svclog doesn't show anything interesting/relevant that I recognize.
============
I can confirm in Fiddler2 that the headers are being delivered as shown in the browser.
...
Content-Type: application/xml; charset=utf-8
Server: Microsoft-HTTPAPI/2.0
...

Solved!
I had something like the following in a MessageInspector:
HttpResponseMessageProperty responseProperty = new HttpResponseMessageProperty();
responseProperty.Headers.Add("Access-Control-Allow-Origin", "*");
reply.Properties["httpResponse"] = responseProperty;
and this was overwriting the HttpResponseMessageProperty already present in reply.Properties, including any ContentType setting. Instead, I now try to get the existing HttpResponseMessageProperty first and reuse it if found.
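A minimal sketch of that fix, assuming it is called from the inspector's BeforeSendReply method. The CorsReplyHelper class and AddCorsHeader method are hypothetical names, not taken from the original code:

```csharp
using System.ServiceModel.Channels;

public static class CorsReplyHelper
{
    // Hypothetical helper: call it from BeforeSendReply(ref Message reply, ...).
    public static void AddCorsHeader(Message reply)
    {
        HttpResponseMessageProperty responseProperty;
        object existing;

        // Reuse the property that is already attached to the reply (it carries
        // the Content-Type set through OutgoingResponse) instead of replacing it.
        if (reply.Properties.TryGetValue(HttpResponseMessageProperty.Name, out existing)
            && existing is HttpResponseMessageProperty)
        {
            responseProperty = (HttpResponseMessageProperty)existing;
        }
        else
        {
            // Nothing attached yet, so create the property and attach it.
            responseProperty = new HttpResponseMessageProperty();
            reply.Properties[HttpResponseMessageProperty.Name] = responseProperty;
        }

        responseProperty.Headers["Access-Control-Allow-Origin"] = "*";
    }
}
```

HttpResponseMessageProperty.Name is the same "httpResponse" key used in the snippet above, so the only behavioural change is that existing headers survive.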
I lucked out seeing that one.

Related

Return binary string as plain text in browser window (and not as a downloadable file)

I want to return the following protobuf serialised binary data to the browser (Chrome) and not as a downloadable file. I don't understand the mechanism that is prompting a download. It is not the mime type as I am using text/plain elsewhere.
Controller:
[HttpGet]
public async Task<ActionResult<string>> GenerateProtoFeed()
{
var feed = _gtfsrService.GenerateFeed();
using (var stream = new MemoryStream())
{
feed.WriteTo(stream);
stream.Position = 0;
using (var reader = new StreamReader(stream))
{
return Content(reader.ReadToEnd(), "text/plain");
}
}
}
What I really want is this (example) to be returned in the browser window:
2.0?????/?
-Mcycmmp9-o4C0qeoGdz*?
????/*0
rE6s0CN800STv61PAKtfHAL6wS0jjmZkSZwq1PAKtf8A08Z?
?
?#StationAlert Elevators at Commercial-Broadway and Brentwood Stations are temporarily out of service today. ^sdken
The browser handles responses from a server differently depending on how the user has configured it, and on the mime type of the response.
It looks like your browser's default behaviour for text/plain is to prompt a save action. If you set the mime type of your response to text/html, the browser should simply display it.
Note that this is of course technically incorrect in this case.

Prevent Caching .svg Images on Application Server

I have a JWS application that caches several different resource types. However, I do not want to cache .svg images. It seems that the framework does not honor the server side cache control HTTP headers that I have set.
I was wondering if there is some other way that I could load .svg images without caching. I am open to putting a solution in my loadSVGDocument() method, but my code is currently built around Apache Batik for loading .svg files. Is there a solution to pass an InputStream with a noCache flag within the Batik library similar to what DocumentBuilderFactory provides below?
URL url = new URL(fileLocation);
URLConnection connection = url.openConnection();
// Prevent JavaWebStart from returning cached copy.
connection.setUseCaches(false);
// Now fetch the content, e.g.
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
DocumentBuilder db = dbf.newDocumentBuilder();
Document doc = db.parse(connection.getInputStream());
Here is my current loadSVGDocument() method that uses several Apache Batik fields:
public void loadSVGDocument(final String url)
{
System.out.println("THE SVG URL: " + url);
String oldURI = null;
if (svgDocument != null)
{
oldURI = svgDocument.getURL();
}
final ParsedURL newURI = new ParsedURL(oldURI, url);
String theUrl = newURI.toString();
fragmentIdentifier = newURI.getRef();
loader = new DocumentLoader(userAgent);
nextDocumentLoader = new SVGDocumentLoader(theUrl, loader);
nextDocumentLoader.setPriority(Thread.NORM_PRIORITY);
Iterator it = svgDocumentLoaderListeners.iterator();
while (it.hasNext())
{
nextDocumentLoader
.addSVGDocumentLoaderListener((SVGDocumentLoaderListener) it.next());
}
documentLoader = nextDocumentLoader;
nextDocumentLoader = null;
documentLoader.run();
}
For anyone interested, I found that I can call Batik's
DocumentLoader.loadDocument(URL url, InputStream is)
with the setUseCaches flag as false. Not only does this load the image, but it also removes it from the cache accordingly. Though not the best solution in the sense that it would be nice for JWS to honor my HTTP headers, this work-around is good enough.

Crawl Wikipedia using ASP.NET HttpWebRequest

I am new to Web Crawling, and I am using HttpWebRequest to crawl data from sites.
As of now I was successfully able to crawl and get data from my wordpress site. This data was a simple user profile data. (like name, email, AIM id etc...)
Now as an exercise I want to crawl wikipedia, where I will search using the value entered into textbox at my end and then crawl wikipedia with the search value and get the appropriate title(s) from the search.
Now I have the following doubts/difficulties.
Firstly, is this even possible? I have heard that Wikipedia has a robots.txt set up to block this, though I have heard this only from a friend and hence am not sure.
I am using the same procedure I used earlier, but I am not getting the required results.
Thanks !
Update :
After some explanation and help from @svick, I tried the code below, but I am still not able to get any value (see the last line of code, where I am expecting the HTML markup of the search result page)
string searchUrl = "http://en.wikipedia.org/w/index.php?search=Wikipedia&title=Special%3ASearch";
var postData = new StringBuilder();
postData.Append("search=" + model.Query);
postData.Append("&");
postData.Append("title=" + "Special:Search");
byte[] data2 = Crawler.GetEncodedData(postData.ToString());
var webRequest = (HttpWebRequest)WebRequest.Create(searchUrl);
webRequest.Method = "POST";
webRequest.UserAgent = "Crawling HW (http://yassershaikh.com/contact-me/)";
webRequest.AllowAutoRedirect = false;
ServicePointManager.Expect100Continue = false;
Stream requestStream = webRequest.GetRequestStream();
requestStream.Write(data2, 0, data2.Length);
requestStream.Close();
var responseCsv = (HttpWebResponse)webRequest.GetResponse();
Stream response = responseCsv.GetResponseStream();
// Todo Parsing
var streamReader = new StreamReader(response);
string val = streamReader.ReadToEnd();
// val is empty !! <-- this is my problem !
and here is my GetEncodedData method definition.
public static byte[] GetEncodedData(string postData)
{
var encoding = new ASCIIEncoding();
byte[] data = encoding.GetBytes(postData);
return data;
}
Pls help me on this.
You probably don't need to use HttpWebRequest. Using WebClient (or HttpClient if you're on .Net 4.5) will be much easier for you.
robots.txt doesn't actually block anything. If something doesn't support it (and .Net doesn't support it), it can access anything.
Wikipedia does block requests that don't have their User-Agent header set. And you should use an informative User-Agent string with your contact information.
A better way to access Wikipedia is to use its API, rather than scraping. This way, you will get an answer that's specifically meant to be read by custom applications, formatted as XML or JSON. There are also dumps containing all information from Wikipedia available for download.
EDIT: The problem with your newly posted code is that your query returns a 302 Moved Temporarily response redirecting to the searched article, if it exists. Either remove the line that sets AllowAutoRedirect to false, or add &fulltext=Search to your query, which means you won't be redirected.
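As a rough sketch of the &fulltext=Search suggestion combined with WebClient: the WikipediaSearch class, its method names, and the User-Agent string are placeholders made up for illustration; the URL parts come from the question.

```csharp
using System;
using System.Net;

public static class WikipediaSearch
{
    // Builds the search URL from the question, plus fulltext=Search so the
    // results page is returned instead of a redirect to a matching article.
    public static string BuildSearchUrl(string query)
    {
        return "http://en.wikipedia.org/w/index.php"
            + "?search=" + Uri.EscapeDataString(query)
            + "&title=Special%3ASearch"
            + "&fulltext=Search";
    }

    // Downloads the HTML of the search results page with a plain GET.
    public static string FetchSearchPage(string query)
    {
        using (var client = new WebClient())
        {
            // Placeholder User-Agent: replace with your own name and contact details.
            client.Headers[HttpRequestHeader.UserAgent] = "ExampleCrawler/0.1 (you@example.com)";
            return client.DownloadString(BuildSearchUrl(query));
        }
    }
}
```

Calling WikipediaSearch.FetchSearchPage(model.Query) would then return the markup to parse. The search parameters travel in the query string, so no POST body is needed.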

Why am I getting System.FormatException: String was not recognized as a valid Boolean on a fraction of our customers' machines?

Our c#.net software connects to an online app to deal with accounts and a shop. It does this using HttpWebRequest and HttpWebResponse.
An example of this interaction, and one area where the exception in the title has come from is:
var request = HttpWebRequest.Create(onlineApp + string.Format("isvalid.ashx?username={0}&password={1}", HttpUtility.UrlEncode(username), HttpUtility.UrlEncode(password))) as HttpWebRequest;
request.Method = "GET";
using (var response = request.GetResponse() as HttpWebResponse)
using (var ms = new MemoryStream())
{
var responseStream = response.GetResponseStream();
byte[] buffer = new byte[4096];
int read;
do
{
read = responseStream.Read(buffer, 0, buffer.Length);
ms.Write(buffer, 0, read);
} while (read > 0);
ms.Position = 0;
return Convert.ToBoolean(Encoding.ASCII.GetString(ms.ToArray()));
}
The online app responds with either 'true' or 'false'. In all our testing we get one of these values, but for a couple of customers (out of hundreds) we get the exception System.FormatException: String was not recognized as a valid Boolean, which suggests the response is being garbled by something. If we ask them to open the online app in their web browser, they see the correct response. The clients are usually on school networks, which can be fairly restrictive and often sit behind proxy servers, but most cope fine once they've entered the proxy details or added a firewall exception. Is there something that could be messing up the response from the server, or is something wrong with our code?
Indeed, it's possible that the return result is somehow different.
Is there any particular reason you are using such an elaborate method of reading the response there? Why not:
string data;
using(HttpWebResponse response = request.GetResponse() as HttpWebResponse){
StreamReader str = new StreamReader(response.GetResponseStream());
data = str.ReadToEnd();
str.Close();
}
string cleanResult = data.Trim().ToLower();
// log this
return Convert.ToBoolean(cleanResult);
First thing to note is I would definitely use something like:
bool myBool = false;
Boolean.TryParse(Encoding.ASCII.GetString(ms.ToArray()), out myBool);
return myBool;
It's not some localisation issue is it? It's expecting the Swahili version of 'true', and getting confused. Are all the sites in one country, with the same language, etc?
I'd add logging, as suggested by others, and see what results you're seeing.
I'd also lean towards changing the code as silky suggested, though with a few further changes from me (code 'smell' issues, IMO): use using around the stream reader as well as the response.
Also, I don't think the use of as is appropriate in this instance. If the response can't be cast to HttpWebResponse (which, admittedly, is unlikely), you'll get a NullReferenceException at response.GetResponseStream(), which is a vague error that also hides the line where things actually went wrong. Using (HttpWebResponse)request.GetResponse() will give you a more accurate error and the correct line number.
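Pulling the suggestions in these answers together (using around both the response and the reader, a direct cast instead of as, trimming, and TryParse), a minimal sketch might look like this. The class and method names are hypothetical, and putting the raw reply into the exception message is just one assumed way to capture what the server (or a proxy in between) actually sent:

```csharp
using System;
using System.IO;
using System.Net;

public static class ValidityCheck
{
    // Trims the reply and parses it without throwing on unexpected content.
    public static bool TryParseReply(string raw, out bool value)
    {
        string cleaned = (raw ?? string.Empty).Trim();
        return bool.TryParse(cleaned, out value);
    }

    public static bool IsValid(HttpWebRequest request)
    {
        string raw;
        // A direct cast fails at this line if the response has an unexpected type.
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            raw = reader.ReadToEnd();
        }

        bool value;
        if (!TryParseReply(raw, out value))
        {
            // Include the raw body so logs show what was received.
            throw new FormatException("Unexpected reply from server: '" + raw + "'");
        }
        return value;
    }
}
```

If a customer's proxy returns something other than 'true' or 'false' (a login page, for example), the exception text then shows that content instead of only the generic Boolean parsing message.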

Why am I getting a "double response" from HttpWebResponse?

The following code (running in ASP.NET 2.0) displays the contents of the requested URL twice. I only want it to display the contents of the requested URL once. I can't figure out what I'm doing wrong. The requested URL returns XML, and if I visit the URL directly, it works fine.
HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
byte[] postDataBytes = Encoding.UTF8.GetBytes(postData);
request.Method = "POST";
request.ContentType = "application/xml";
request.ContentLength = postDataBytes.Length;
Stream requestStream = request.GetRequestStream();
requestStream.Write(postDataBytes, 0, postDataBytes.Length);
requestStream.Close();
// get response and write to console
response = (HttpWebResponse) request.GetResponse();
StreamReader responseReader = new StreamReader(response.GetResponseStream(), Encoding.UTF8);
try {
Response.Write(responseReader.ReadToEnd());
}
finally {
responseReader.Close();
}
response.Close();
Your code looks good, so I don't think the problem is there... but what I would suggest is the following:
1) Maybe the error is on the URL's other end... so try hitting Google and see if the returned content is good or not.
2) Put a breakpoint at the "responseReader.ReadToEnd()" spot, and see if what's coming out of there is good.
3) If the code above is in an ASPX page... are you making sure to call "Response.End();" after your last line of code? (Not "response.Close()", but "Response.End()".)
I found the problem. It's not with the above code at all, but with the page being called. The page I was calling inherited from a class whose Page_OnInit method contained the following line: "MyBase.OnLoad(e)", which caused the Page_OnLoad method to be executed twice. Obviously, it should have been MyBase.OnInit(e) instead. I didn't catch it because when I tested the page directly I had to temporarily remove the inheritance from the class, because of some other code that would have prevented me from testing the page directly.
I will now put on my "Dunce" hat and retreat to the corner for a time out. Thanks anyway for the help.