Not able to embed PDF blob in HTML in IE - pdf

I have tried several approaches to embed a PDF blob in HTML and display it in IE.
1) Creating an object URL and passing it to the embed or iframe tag. This works fine in Chrome but not in IE.
</head>
<body>
<input type="file" onchange="previewFile()">
<iframe id="test_iframe" style="width:100%;height:500px;"></iframe>
<script>
function previewFile() {
var file = document.querySelector('input[type=file]').files[0];
var downloadUrl = URL.createObjectURL(file);
console.log(downloadUrl);
var element = document.getElementById('test_iframe');
element.setAttribute('src',downloadUrl);
}
</script>
</body>
2) I have also tried wrapping the blob URL in encodeURIComponent().
Any pointers on how I can solve this?

IE doesn't support an iframe with a data URL as its src attribute. You can check this on caniuse: in IE, support is limited to images and linked resources such as CSS or JS. Please also check this documentation:
Data URIs are supported only for the following elements and/or
attributes.
object (images only)
img
input type=image
link
CSS declarations that accept a URL, such as background, backgroundImage, and so on.
Besides, IE doesn't have an embedded PDF viewer, so you can't display PDFs directly in IE 11. You can only use msSaveOrOpenBlob to handle blobs in IE, and then choose to open or save the PDF file:
if(window.navigator.msSaveOrOpenBlob) {
//IE11
window.navigator.msSaveOrOpenBlob(blobData, fileName);
}
else{
//Other browsers
window.URL.createObjectURL(blobData);
...
}
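The branch above can be wrapped in one function, roughly as in the sketch below. The function name `showPdfBlob` and the `env` parameter are hypothetical; `env` only exists so the logic can be exercised outside a browser. In a real page you would pass `{ navigator: window.navigator, URL: window.URL }`.

```javascript
// Sketch only: showPdfBlob and env are illustrative names, not part of any API.
function showPdfBlob(blob, fileName, iframe, env) {
  if (env.navigator && typeof env.navigator.msSaveOrOpenBlob === 'function') {
    // IE: hand the blob to the browser's open/save prompt.
    env.navigator.msSaveOrOpenBlob(blob, fileName);
    return 'prompt';
  }
  // Other browsers: point the iframe at an object URL for the blob.
  var objectUrl = env.URL.createObjectURL(blob);
  iframe.setAttribute('src', objectUrl);
  return 'embedded';
}
```

With the question's markup, this could be called from previewFile() as showPdfBlob(file, file.name, document.getElementById('test_iframe'), { navigator: window.navigator, URL: window.URL }).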

Unable to get `src` attribute of `<video>` with HTMLUnit

I am creating a video scraper (for the Rumble website) and I am trying to get the src attribute of the video using HTMLUnit, because the element is added to the page dynamically (I am a beginner with these APIs):
val webClient = WebClient()
webClient.options.isThrowExceptionOnFailingStatusCode = false
webClient.options.isThrowExceptionOnScriptError = false
webClient.options.isJavaScriptEnabled = true
val myPage: HtmlPage? = webClient.getPage("https://rumble.com/v1m9oki-our-first-automatic-afk-farms-locals-minecraft-server-smp-ep3-live-stream.html")
Thread.sleep(10000)
val document: Document = Jsoup.parse(myPage!!.asXml())
println(document)
The issue is, the output for the <video> element is the following:
<video muted playsinline="" hidefocus="hidefocus" style="width:100% !important;height:100% !important;display:block" preload="metadata"></video>
Whereas -- if you navigate to the page itself and let the JS load -- it should be:
<video muted="" playsinline="" hidefocus="hidefocus" style="width:100% !important;height:100% !important;display:block" preload="metadata" poster="https://sp.rmbl.ws/s8/1/I/6/v/1/I6v1f.OvCc-small-Our-First-Automatic-AFK-Far.jpg" src="blob:https://rumble.com/91372f42-30cf-46b3-8850-805ee634e2e8"></video>
Some attributes are missing, which are crucial for my scraper to work. I need the src value so that ExoPlayer can play the video.
I am not totally sure, but I was wondering whether it had to do with the fact that the crossOrigin attribute is anonymous in the JavaScript:
<video muted playsinline hidefocus="hidefocus" style="width:100% !important;height:100% !important;display:block" preload="'+t+'"'+(a.vars.opts.cc?' crossorigin="anonymous"':"")+'>
I tried to play around with the different HTMLUnit options, as well as look online but I still haven't been able to extract the right attributes I need so that it can work.
How would I be able to bypass this and get the appropriate element values (src) that I need for the scraper using HTMLUnit? Is this even possible to do with HTMLUnit? I was also suspecting that maybe the site owners added this cross origin anonymous statement because it can bypass scrapers, though I am not sure.
How to reproduce my issue
Navigate to this link with a GUI browser.
Use 'Inspect Element' to find the <video> HTML tag and observe that, as you would expect, it contains a src attribute pointing to the mp4 file:
<video muted="" playsinline="" hidefocus="hidefocus" style="width:100% !important;height:100% !important;display:block" preload="metadata" src="https://sp.rmbl.ws/s8/2/I/6/v/1/I6v1f.caa.rec.mp4?u=3&b=0" poster="https://sp.rmbl.ws/s8/1/I/6/v/1/I6v1f.OvCc-small-Our-First-Automatic-AFK-Far.jpg"></video>
Now, let's simulate this with a headless browser, so add the following code to IntelliJ or any IDE (add a dependency to HTMLUnit and JSoup):
To gradle (Kotlin):
implementation(group = "net.sourceforge.htmlunit", name = "htmlunit", version = "2.64.0")
implementation("org.jsoup:jsoup:1.15.3")
To gradle (Groovy):
implementation group = 'net.sourceforge.htmlunit', name = 'htmlunit', version = '2.64.0'
implementation 'org.jsoup:jsoup:1.15.3'
Then in Main function:
val webClient = WebClient()
webClient.options.isThrowExceptionOnFailingStatusCode = false
webClient.options.isThrowExceptionOnScriptError = false
webClient.options.isJavaScriptEnabled = true
val myPage: HtmlPage? = webClient.getPage("https://rumble.com/v1m9oki-our-first-automatic-afk-farms-locals-minecraft-server-smp-ep3-live-stream.html")
Thread.sleep(10000)
val document: Document = Jsoup.parse(myPage!!.asXml())
println(".....................")
println(document.getElementsByTag("video").first())
If it throws an exception add this:
LogFactory.getFactory().setAttribute("org.apache.commons.logging.Log", "org.apache.commons.logging.impl.NoOpLog");
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("org.apache.commons.httpclient").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit.javascript.StrictErrorReporter").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit.javascript.host.ActiveXObject").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit.javascript.host.html.HTMLDocument").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit.html.HtmlScript").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("com.gargoylesoftware.htmlunit.javascript.host.WindowProxy").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("com.gargoylesoftware").setLevel(Level.OFF);
java.util.logging.Logger.getLogger("org.apache").setLevel(Level.OFF);
We are simply fetching the page with the headless browser and then using JSoup to parse the HTML output and finding the first video element.
Observe that the output does not contain any 'src' attribute as you saw in the GUI browser:
<video muted playsinline="" hidefocus="hidefocus" style="width:100% !important;height:100% !important;display:block" preload="metadata"></video>
This is the major issue I am having: the src attribute of the <video> element has seemingly disappeared in the headless browser, and I am unsure why, although I suspect it's related to some sort of mp4 codec issue.
Correct, the JS support for the video element was not sufficient for this case.
I have made a number of fixes and improvements, and the upcoming version 2.66.0 will be able to support this.
Btw: there is no need to parse the page a second time using jsoup - HtmlUnit has all the methods to look deeply inside the DOM tree of the current page.
String url = "https://rumble.com/v1m9oki-our-first-automatic-afk-farms-locals-minecraft-server-smp-ep3-live-stream.html";
try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX)) {
webClient.getOptions().setThrowExceptionOnScriptError(false);
HtmlPage page = webClient.getPage(url);
webClient.waitForBackgroundJavaScript(10_000);
HtmlVideo video = (HtmlVideo) page.getElementsByTagName("video").get(0);
System.out.println(video.getSrc());
}
This code prints https://sp.rmbl.ws/s8/2/I/6/v/1/I6v1f.caa.rec.mp4?u=3&b=0 - the same as the source attribute in the browser.
But two JS errors are still reported when running this code, because some other JS (I guess some tracking stuff) provokes them. You can fix this by ignoring the JS code from these two locations, which also makes the code a bit faster.
String url = "https://rumble.com/v1m9oki-our-first-automatic-afk-farms-locals-minecraft-server-smp-ep3-live-stream.html";
try (final WebClient webClient = new WebClient(BrowserVersion.FIREFOX)) {
webClient.getOptions().setThrowExceptionOnScriptError(false);
// ignore some js
new WebConnectionWrapper(webClient) {
public WebResponse getResponse(WebRequest request) throws IOException {
WebResponse response = super.getResponse(request);
if (request.getUrl().toExternalForm().contains("sovrn_standalone_beacon.js")
|| request.getUrl().toExternalForm().contains("r2.js")) {
WebResponseData data = new WebResponseData("".getBytes(response.getContentCharset()),
response.getStatusCode(), response.getStatusMessage(), response.getResponseHeaders());
response = new WebResponse(data, request, response.getLoadTime());
}
return response;
}
};
HtmlPage page = webClient.getPage(url);
webClient.waitForBackgroundJavaScript(10_000);
HtmlVideo video = (HtmlVideo) page.getElementsByTagName("video").get(0);
System.out.println(video.getSrc());
}
Thanks for this report - will inform on https://twitter.com/htmlunit about the new release.

ESRI JS API is stripping hrefs

ESRI's JS API seems to be stripping out the hrefs of URLs.
Here I set up a static link. Then I attempt to put it in the description. The link text and target="blank" are rendered but the link's href (test/) is blank!
{% for project in projects %}
var link = "<a target='blank' href='test'>Legal Description</a>";
console.log(link) // This prints as expected with href intact.
var attributes = {
Name: "{{project.description}}",
Description: link // strips out the href?!?!?!?!
}
It SHOULD be localhost:8000/projects/test but there is no test href.
The arcgis-js-api sanitizes HTML content in popups for security reasons. I'm not sure how you're defining your popups or using the attributes variable, but you'll want to create a PopupTemplate and set its content property to do what you want. You can do it as the linked article recommends, or you can use a CustomContent instance for the popupTemplate's content property.

Google script code formatted,colored and beautiful indent

I wrote a container-bound script and now want to make a report from it by inserting the code into a Google Docs file. The problem is that when I copy and paste from the Script Editor, the code is no longer colored or indented. I need your help because I don't know how to do this properly.
I have this code :
function createAndSendDocument() {
// Create a new Google Doc named 'Hello, world!'
var doc = DocumentApp.create('Hello, world!');
// Access the body of the document, then add a paragraph.
doc.getBody().appendParagraph('This document was created by Google Apps Script.');
// Get the URL of the document.
var url = doc.getUrl();
// Get the email address of the active user - that's you.
var email = Session.getActiveUser().getEmail();
}
As tehhowch said, you'll need to write your own JavaScript code to do the syntax formatting and then use its output.
You can use https://www.w3schools.com/howto/tryit.asp?filename=tryhow_syntax_highlight for this: the script is already in place, so you only need to encode your HTML, put it inside the div with id="myDiv", and run the JavaScript code.
<div id="myDiv">
Your encoded html goes here
</div>
Example
<div id="myDiv">
&lt;!DOCTYPE html&gt;<br>
&lt;html&gt;<br>
&lt;body&gt;<br>
<br>
&lt;h1&gt;Testing an HTML Syntax Highlighter&lt;/h1&gt;<br>
&lt;p&gt;Hello world!&lt;/p&gt;<br>
&lt;a href="https://www.w3schools.com"&gt;Back to School&lt;/a&gt;<br>
<br>
&lt;/body&gt;<br>
&lt;/html&gt;
</div>
Make sure you first encode your HTML (< becomes &lt;, > becomes &gt;, and so on).
Then you can use the output. Sample: https://docs.google.com/document/d/1h8oDOZ0ReTgwxnYt2JKflHWJdlianSWWuBgbWcSdJC0/edit?usp=sharing
Reference and further reads : https://www.w3schools.com/howto/tryit.asp?filename=tryhow_syntax_highlight
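The encoding step mentioned above could be done with a small helper like the sketch below. The name `encodeHtml` is hypothetical and not part of the w3schools script; it only replaces the characters that HTML treats specially, so the result can be pasted inside the div as text.

```javascript
// Sketch: encodeHtml is an illustrative helper name, not a built-in.
function encodeHtml(source) {
  return String(source)
    .replace(/&/g, '&amp;')   // must run first so later entities are not double-encoded
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}
```

For example, encodeHtml('<p>Hello world!</p>') returns '&lt;p&gt;Hello world!&lt;/p&gt;'.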

How does the data flow from webserver to webpage.

I am in process of creating a new website. I have done my basic HTML/CSS/JQuery code to generate the webpage. The website is going to display images. Now my questions are around where the images are supposed to be stored and how to retrieve them. I did research but I am all over the place with the architecture.
My understanding is that the HTML page will make a query to a web server (like Apache), get the data/images back, and display them. The function of the web server is to provide the data based on the query, is that right? Where would the data, such as JPEG images, their metadata, and the links between galleries and images, be stored? Is there another DB layer somewhere? Would the architecture be HTML<-->Apache<-->DB ?
Or do I just put my images in a database and host the data there, basically taking Apache out of the architecture? The queries are going to depend only on the current stage in the navigation tree (nothing user specific).
There is no need for a DB to serve images. It works more like this:
HTML <-> Apache <-> Image
because Apache can deliver files directly.
There are several different ways of working, though.
For example, the image can be loaded dynamically by a PHP file that sends image headers. In this case, the scheme will be:
HTML <-> Apache <-> PHP <-> Image
To serve the images directly, you simply put them in a folder that Apache's user can access.
For example you can have the following structure in /var/www/sitename:
index.html
img / my_image.jpg
And in index.html
<img src="img/my_image.jpg" ... />
Edit to answer your question:
Create a PHP script that will generate the JSON array, for example:
page.php
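<?php
// default to an empty list when the link is missing or unknown
$images_links = array();
$link = isset($_GET['link']) ? $_GET['link'] : '';
switch($link){
case 'link1':
$images_links = array(
'path/to/img1',
'path/to/img2',
...
);
break;
case 'link2':
$images_links = array(
'path/to/img3',
'path/to/img4',
...
);
break;
}
// tell the client that the response is JSON
header('Content-Type: application/json');
echo json_encode($images_links);
?>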
Let's assume your HTML is
<a>link1</a>
<a>link2</a>
<img class="imgToChange" src="..."/>
<img class="imgToChange" src="..."/>
...
Then you will add this javascript function to your html
function updateImages(clicked_link){
// get the text of the link
var link_text = clicked_link.innerHTML;
// send a request to page.php to get the images' urls
$.getJSON( "path/to/page.php?link="+encodeURIComponent(link_text), function( data ) {
// data will be your json array
var images_links = data;
// get a list of all image elements that can be changed
var images = document.getElementsByClassName("imgToChange");
// for each image in the json array (stop if there are fewer image elements)
for(var k=0; k<images_links.length && k<images.length; k++){
images[k].src = images_links[k];
}
});
}
And you just have to call this function when a link is clicked
<a onclick="updateImages(this)">link1</a>
<a onclick="updateImages(this)">link2</a>
<img class="imgToChange" src="..."/>
<img class="imgToChange" src="..."/>
...

Is there any way to automatically resize an iframe if the size of the content inside changes?

For example, I am trying to iframe the youtube subscription box on the homepage, and the problem is, if I make the iframe really long, then it wastes space, but if I make the size I want, then if the user clicks the "load more videos" button, then it gets cut off. So is there any way to make the iframe (or any alternatives) be a percentage of the size, or dynamically change when the page changes?
Create a file and call it iframe.html
<html>
<head>
<script type="text/javascript">
function autoIframe(frameId){
try{
var frame = document.getElementById(frameId);
var innerDoc = (frame.contentDocument) ? frame.contentDocument : frame.contentWindow.document;
if (innerDoc == null){
// Google Chrome
frame.height = document.all[frameId].clientHeight + document.all[frameId].offsetHeight + document.all[frameId].offsetTop;
}
else{
var newHeight = innerDoc.body.scrollHeight + 18;
if (frame.style){
// style.height needs a CSS unit
frame.style.height = newHeight + 'px';
}
else{
frame.height = newHeight;
}
}
}
catch(err){
alert('Err: ' + err.message);
window.status = err.message;
}
}
</script>
</head>
<body>
<iframe id="tree" name="tree" src="tree.html" onload="if (window.parent && window.parent.autoIframe) {window.parent.autoIframe('tree');}"></iframe>
</body>
</html>
Now create an HTML page called tree.html and put some dummy content in it. Make sure that iframe.html and tree.html are in the same directory. Open iframe.html in a browser to observe the output.
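The onload handler above only runs once, so it does not cover content that grows later (such as after a "load more" click). One possible way to handle that is to re-measure periodically, as in the sketch below. The names `fitFrameToContent` and `watchFrame` are hypothetical. This only works when the framed page is on the same origin as the parent. For a cross-origin page such as YouTube, the browser blocks access to the inner document, so the function simply returns null.

```javascript
// Sketch: resize a same-origin iframe to the current height of its content.
function fitFrameToContent(frame) {
  var doc = null;
  try {
    doc = frame.contentDocument || (frame.contentWindow && frame.contentWindow.document);
  } catch (err) {
    // Cross-origin frames throw here; there is nothing to measure.
    return null;
  }
  if (!doc || !doc.body) {
    return null;
  }
  var height = doc.body.scrollHeight;
  frame.style.height = height + 'px';
  return height;
}

// Hypothetical helper: re-fit on a timer so later content changes are picked up.
// Returns the timer id so the caller can stop it with clearInterval().
function watchFrame(frame, intervalMs) {
  return setInterval(function () {
    fitFrameToContent(frame);
  }, intervalMs);
}
```

With the page above, this could be started with watchFrame(document.getElementById('tree'), 500). Polling is the simplest choice for a sketch; the interval of 500 ms is an arbitrary assumption.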
Some more useful links :
How to detect iframe resize?
How to detect iframe iframe resize