Override forcedownload behavior in Sitecore - pdf

We had a problem with some of our IE clients failing to download a PDF, even after clicking on the link. We found the answer here resolved our problems: set forcedownload=true for PDF mime types in web.config.
However, that created another problem: we are now unable to render a PDF in a browser when we want to. We used to do this with an iframe. However, as you can see, the PDF just downloads, and does not render in the browser.
I learned that the forcedownload=true setting is actually a default in a subsequent version of Sitecore (v7.2). So, I'm hesitant to revert that.
So, how do I render a PDF in a browser in this situation?

You can leave forceDownload=false on the PDF mime type and instead set the following setting to false:
<setting name="Media.EnableRangeRetrievalRequest" value="false"/>
I faced the same dilema a few months back with the same initial fix. Found out the actual issue last week, I wrote a blog post about it. (In fact, I wrote the answer you linked to, I've updated it with the same information now for future visitors)
The issue is basically a combination of Adobe Reader plugin for IE9, chunked transfer encoding and streaming the file directly from the database. I found if you close your browser and try again, or force refresh with Ctrl+F5 it worked fine. Once Sitecore had cached the file to disk it would continue to work for everyone.
The above setting disables chunked transfer encoding, instead sending the file down to the browser as a single piece. This setting was introduced in Sitecore 6.5+

This is one of the flaws in the MediaRequestHandler and in my opinion; the forceDownload option is pretty useless the way it is designed by default. (Why would ever want to configure this option on media extension only?)
You’ll have to basically turn off the forcedownload option again and replace the MediaRequestHandler with your own one. I usually end up with writing my own anyway because if other issues with the default handler, such as dealing properly with CDN’s etc.
In the ProcessRequest pipeline, you can determine if the item should be “downloaded” or not by setting the Content-Disposition header. You basically need to get rid of the default handling of forceDownload and set your headers based on your own logic.
Personally I prefer to set a query string parameter, such as ?dl=1, and base the Content-Disposition header on this. You could also extend the MediaItem template to contain a default behavior on each item or sub tree (leverage from Sitecore inheritance and standard values), and potentially you could thereby also define (override) a specific filename on each item for the attachment part in the Content-Disposition header.
When rendering the link, you can leverage from the properties collection (write a suitable extension method or similar), so that you can clearly mark your code that the link is meant for download, but still leverage from the built in field render methods. Thereby you eliminate the risk of messing up the page editor etc.
/ Mikael

You have to disable range retrieval request in web.config by setting its value to false.
<setting name="Media.EnableRangeRetrievalRequest" value="false" />
MediaRequestHandler enables Sitecore to download PDF content partially in range using HTTP 206 Status code. You can also overwrite MediaRequestHandler and write your own custom implementation to handle media request.

Related

Quickly-editable config file, not visible to public visitors?

I am in the middle of working with, and getting a handle on Vuejs. I would like to code my app in a way that it has some configurable behaviors, so that I could play with parameter values, like you do when you edit your Sublime preferences, but without having to compile my app again. Ideally, I would want a situation where I could have my colleagues be able to fiddle with settings all day long, by editing a file over FTP maybe, or over some interface....
The only way I know how to do it now, is to place those settings in a separate file, but as the app runs in the client, that file would have to be fetched via another HTTP request, meaning it's a publicly readable file. Even though there isn't any sensitive information in such a configuration file, I still feel a little wonky about having it public like that, if it can be avoided in any way...
Can it be avoided?
I dont think you can avoid this. One way or another your config file will be loaded into the vuejs application, therefore being visible to the end user (with some effort).
Even putting the file outside of the public folder wouldnt help you much, because then it is unavailable for HTTP to request the file. It would only be available to your compile process in this case.
So a possible solution could be to have some sort of HTTP request that requests GET example.com/settings and returns you a JSON object. Then you could have your app make a cookie like config_key = H47DXHJK12 (or better a UUID https://en.wikipedia.org/wiki/Universally_unique_identifier) which would be the access key for a specific config object.
Your application must then request GET example.com/settings (which should send the config_key cookie), and your config object for this secret key will be returned. If the user clears his cookies, a new request will return an empty config.

Changing page content with custom scheme in WebKitGTK1

I have an app using the WebKitGTK1 API with WebKit-GTK 2.4.9 on Linux. (This is the current version in Debian Jessie and versions 2.5+ don't support the v1 API.)
I've implemented a custom URI scheme for loading entire basic page content by using a resource-request-starting handler, which parses the incoming URI via webkit_web_resource_get_uri and if it matches the custom scheme, generates some HTML content and calls webkit_network_request_set_uri to replace the original URI with a base64'd data: URI containing the content to render. (This is similar to the accepted answer of this question.)
This mostly works well, and my handler is called on each request (including repeated requests with the same original URI) and generates the correct content -- but somewhere upstream the browser appears to render only the first returned data for any given original URI, even if the data URI I generate is different.
Possibly of note is that webkit_web_resource_get_uri returns the original non-data: URI even after calling webkit_network_request_set_uri, so I assume this URI is being cached, and in turn is then being used as a key in some higher-level component to cache the data instead of using the real URI from the request.
Unfortunately this appears to be a G_PARAM_CONSTRUCT_ONLY property and there doesn't appear to be any public API to set and/or clear it so that it uses the rewritten URI of the request instead. Is there some way to force GTK to set the property after construction anyway? As far as I can tell it does have a setter method internally, and the getter would do the Right Thing™ if the internal property were reset to NULL.
Or is there some better method to force WebKit to render the new data: URI despite anything it thinks to the contrary?
For the moment I've worked around it by including the values that make it generate different data in the original custom URI (passed to webkit_web_view_load_uri or in links in the generated page). This does work but it's a bit ugly, and could be problematic if I forget to add something in the future, or if something changes generation but is not known in advance. It seems a bit silly that it goes to all the trouble of raising the event that generates the correct data, only to throw it away later, (presumably) due to an URI comparison on the wrong URI.
I suppose using a known-unique value (eg. sequential incrementing id) would also work, and resolve some of the unknown-in-advance issues, but that's no less ugly.

Asynchronous Pluggable Protocol for CID: (email), how to handle duplicate URLs

This is somewhat a duplicate of this question, but that question has no (valid) answer and is 1.5 years old so asking my own with hopes people have more info now.
If you are using multiple instances of a WebBrowser control, MSHTML, IHTMLDocument, or whatever... from inside the APP instance, mostly IInternetProtocol::Start, is there a way to know which instance is loading the resource? Or is there a way to use a different APP for each instance of the control, maybe by providing one via IDocHostUIHandler or ICustomDoc or otherwise? I'm currently using IInternetSession::RegisterNameSpace to make it process wide.
Optional reading below, don't feel you need to read it unless above isn't clear.
I'm working on a legacy (Win32 C++) email client that uses the MS ActiveX WebBrowser control (MSHTML or other names it goes by) to display HTML emails. It was saving everything to temp files, updating the cid: URLs, and then having the control load that. Now I want to do it the correct way, using APP. I've got it all working with some test code that just uses static variables/globals and loads one email.
My problem now is, the app might have several instances of the control all loading different emails (and other stuff) at the same time... not really multiple threads so much, just the asynchronous nature of the control. I can give each instance of the control a unique URL to load the email, say, cid:email-GUID, and then in my APP code I can use that URL to know which email to load. However, when it comes to loading any content inside the email, like attached images using src="cid:", those will not always be unique so I will not always know which image it is, for which email. I'd like to avoid having to modify the URLs of the HTML before displaying it (I'm doing that now for the temp file thing, but want to do it a better way).
IInternetBindInfo::GetBindString can return the referrer, BINDSTRING_XDR_ORIGIN, or the root URL, BINDSTRING_ROOTDOC_URL, but those require newer versions of IE and my legacy app must support older XP installs that might even have IE6 or IE7, so I'd rather not use these.
Tagged as TWebBrowser because that is actually what I'm using (Borland Builder 6 C++), but don't need answers specific to that platform.
As the Asynchronous Pluggable Protocol Handler us very low level, you cannot attach handlers individually to different rendering controls.
Here is a way to get the referrer:
Obtain BINDSTRING_HEADERS
Extract the referrer by parsing the line Referer: http://....
See also How can I add an extra http header using IHTTPNegotiate?
Here is another crazy way:
Create another Asynchronous Pluggable Protocol Handler by calling RegisterMimeFilter.
Monitor text/plain and text/html
Scan the incoming email source (content comes incrementally) and parse store all image links in a dictionary
In NameSpaceHandler you can use this dictionary to find the reference of any image resources.

How to set http headers in dotCMS

I'm trying to create a XML data feed with dotCMS. I can easily output the correct XML document structure in a .dot "page", but the http headers sent to the client are still saying that my page contains "text/html". How can I change them to "text/xml" or "application/xml"?
Apparently there's no way to do it using the administration console. The only way I found is to add this line of (velocity) code
$response.setHeader("Content-Type", "application/xml")
to the top of the page template.
Your solution is the easiest. However there are other options that are a bit more work, but that would prevent you from having to use velocity to do the XML generation, which is more robust most of the time.
DotCMS uses xstream to generate XML files (and vise versa). You could write a generic plugin to use this as well.
An JSONContentServlet exists in dotCMS that takes a query and generates json or xml (depending on your parameters). It is not mapped on a servlet by default, but that is easy to add.

Default script resolutions in Sling

I don't know that I have the terminology completely correct, but it seems that there are some default behaviors for script resolution in Sling (which I'm using as part of Day CQ). For example, .infinity.json returns a representation of the node and its children. Also, if I have a piece of content that I normally would access with a .html extension, I've been able to use a .xml or .json extension to get a representation of that data. However, if I use a .txt extension, I get nothing back, even though as far as I can tell, I do have scripts that should match the request (eg GET.jsp). Are these behaviors documented somewhere?
GET.jsp will match a request ending with .html, as Sling considers html as the default representation. To activate a JSP script for the .txt rendering, you need to name it txt.jsp
See http://sling.apache.org/site/servlets.html for details.