save phantom js processed page into html file with absolute url - phantomjs

I want to save a web page, after the document has loaded, to a file of my choosing, with all URLs and links converted to absolute URLs, the way wget -k does.
//phantomjs
var page = require('webpage').create();
var url = 'http://google.com/';
page.open(url, function (status) {
var js = page.evaluate(function () {
return document;
});
console.log(js.all[0].outerHTML);
phantom.exit();
});
For example, my HTML content is something like this:
page
must be
page
That is my sample script. How can I convert all URLs and links to absolute ones, as wget -k does, using PhantomJS?

You can modify your final HTML so that it has a <base> tag; this makes all relative URLs work. In your case, try putting <base href="http://google.com/"> right after the opening <head> tag on the page.

This is not really supported, because PhantomJS is more than just an HTTP client. Imagine JavaScript code on the main landing page that pulls in random content with an image.
A workaround, which might or might not work for you, is to rewrite every referenced resource in the DOM. This is possible using CSS3 selectors (href for a, src for img, etc.) and resolving each path manually against the base URL. If you really need to track and list every single resource URL, use the network traffic monitoring feature.
Last but not least, to get the generated content you can use page.content instead of that complicated dance with evaluate and outerHTML.
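The manual path resolution mentioned above might look roughly like the sketch below. It is only an illustration and not PhantomJS API: absolutizeUrls is a hypothetical helper, the regular expression only handles quoted href and src attributes, and it assumes a runtime that provides the standard URL class (Node.js or a modern browser; PhantomJS's bundled WebKit may not). Inside page.evaluate you could instead read each element's href or src DOM property, which the browser already resolves to an absolute URL.

```javascript
// Hypothetical helper (not part of PhantomJS): rewrites quoted href/src
// attribute values in an HTML string to absolute URLs.
// Assumes the standard URL class is available (Node.js, modern browsers).
function absolutizeUrls(html, baseUrl) {
  return html.replace(/\b(href|src)=(["'])(.*?)\2/gi, function (match, attr, quote, value) {
    try {
      // new URL(relative, base) resolves the value against the base URL;
      // values that are already absolute keep their own scheme and host.
      return attr + '=' + quote + new URL(value, baseUrl).href + quote;
    } catch (e) {
      // Leave anything that cannot be parsed as a URL untouched.
      return match;
    }
  });
}

var sample = '<a href="page.html">page</a><img src="/logo.png">';
console.log(absolutizeUrls(sample, 'http://example.com/docs/'));
```

A regular expression is a blunt tool for HTML; it is used here only to keep the example short and runnable outside a browser.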

Related

Node.js Response From Image Upload Without Refreshing Client Page

Problem: the client POSTs an image from a form with action='/pages/contact/image/something' to Node.js, and I .putItem it to AWS S3. On a successful response I would like to send the image URL back to the client without refreshing the screen, and update the location where they wanted to add the image with the new src URL.
If the page refreshes, I lose the location where they wanted to upload the image. I am open to any suggestions. I have looked at res.send, res.end, res.json, res.jsonp, and res.send(callback), all of which overwrite (refresh) the client web page with the array, text, or whatever content I pass back to the client. Code below:
myrouter.route('/Pages/:Page/Image/:Purpose')
.post(function (req, res) {
controller.addImageToS3(req, res)
.then(function(imgurl){
//res.json({imgurl : imgurl});
//res.send(imgurl);
//res.end(imgurl);
//res.send(req.query.callback(imgUploadResponse(imgurl)))
<response mechanism here>
console.log('Image Upload Complete');
}, function (err){
res.render('Admin/EditPages', {
apiData : apiData,
PageId : PageId
});
});
});
Ideally there could be a passed parameter to a javascript function that I could then use: Example:
function imgUploadResponse(imgurl){
// Do something with the url
}
You, as a developer, have full control over the s3 url format. It follows a straightforward convention:
s3-region.amazonaws.com/your-bucket-name/your-object-name.
For example:
https://s3-us-west-2.amazonaws.com/some-random-bucket-name/image.jpg
While I would recommend keeping those details in the back end, if you really want to avoid using res.send, you can make the front end aware of the URL formatting convention and present the URL to the user even before the image has actually been uploaded (you just need to append the name of the image to s3-region.amazonaws.com/your-bucket-name).
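As a rough sketch of that convention, the URL could be assembled on either side like this. buildS3Url is a hypothetical helper name, and the region, bucket and object name are the example values from above.

```javascript
// Hypothetical helper: builds a path-style S3 URL following the
// s3-region.amazonaws.com/your-bucket-name/your-object-name convention.
function buildS3Url(region, bucket, key) {
  // Encode each path segment of the key but keep "/" separators intact.
  var encodedKey = key.split('/').map(function (segment) {
    return encodeURIComponent(segment);
  }).join('/');
  return 'https://s3-' + region + '.amazonaws.com/' + bucket + '/' + encodedKey;
}

console.log(buildS3Url('us-west-2', 'some-random-bucket-name', 'image.jpg'));
```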
Also, I'm not sure why your page would refresh. There are ways to update the content of your page without refreshing the whole page, the most basic being AJAX. Frameworks like Angular provide promise-based services that let you make back-end calls from the front end.
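One likely cause of the refresh is that the browser performs a regular form submission and then navigates to whatever the server returns. A minimal AJAX sketch that avoids this is shown below. It assumes the route answers with res.json({imgurl: imgurl}); uploadImage and the element id in the usage comment are hypothetical names, and fetch/FormData are the standard browser APIs.

```javascript
// Hypothetical client-side helper. fetchImpl defaults to the global fetch;
// passing it in makes the function easy to exercise without a server.
function uploadImage(endpoint, formData, fetchImpl) {
  var doFetch = fetchImpl || fetch;
  return doFetch(endpoint, { method: 'POST', body: formData })
    .then(function (response) {
      if (!response.ok) {
        throw new Error('Upload failed with status ' + response.status);
      }
      return response.json();
    })
    .then(function (data) {
      // Assumes the server answers with res.json({ imgurl: imgurl }).
      return data.imgurl;
    });
}

// Browser usage (the element id is a placeholder):
// var form = document.getElementById('upload-form');
// form.addEventListener('submit', function (event) {
//   event.preventDefault(); // stop the normal submit, which reloads the page
//   uploadImage(form.action, new FormData(form)).then(imgUploadResponse);
// });
```

Because the response is consumed by script instead of replacing the document, the page keeps its state and imgUploadResponse can update the src wherever the user wanted the image.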
Hope this helps!

how to dynamically rewrite the url?

Original URL:
http://www.mydomain.com/image.php?id=13&cat=4&type=3&date=2011-03-14
I want to rewrite this dynamic URL into something like this:
http://www.mydomain.com/imageid/imagetitle (no .php/.html extension at the end)
Could someone please help me with this? I have tried several online generators, but my modification is a little different.
I don't know how to write rewrite rules.
Well, to get that you would need a page that redirects to another location.
In the head section of your HTML page, specify a meta tag like this:
<meta http-equiv="refresh" content="0; url=http://www.mydomain.com/imageid/imagetitle" />
When you land on that page, it waits the number of seconds given as the first value of content (0 seconds in this case) and then opens the page given as the second value.
An easy way to keep extensions out of the browser's address bar is to create a folder on your web server (in your case "imagetitle") and put an index page inside it, for example "index.html", which is the page loaded by default when a user visits that link.

Redirection to original URL having hash tag (#) broken in MVC4

I am developing an SPA application using AngularJS working with REST Web API, on top of a very small layer of ASP.NET MVC4. For reasons not important here, I am not using the default Account Controller of MVC4.
Basically, I want to share "tasks" between users. My goal is to be able to send the URL of a specific "task" entity to any user via email. Clicking on that URL should launch authentication. Following a successful authentication, I want to display the real task page info.
AngularJS causes my URLs to have a # sign, so the URL of the page displaying the task "XYZ123" is:
http://hostname.com/#/tasks/XYZ123
ASP.NET redirects the unauthorized access to that URL to:
http://hostname.com/Home/Login?ReturnUrl=%2f#/tasks/XYZ123
This is OK, but the relevant controller method "cuts out" the path from #, so in:
public ActionResult Login(string returnUrl)
the value of 'returnUrl' will be just "/"
So, I am losing the path: I would like to build a "Connect with Facebook" link having the original URL, like:
http://hostname.com/Login/ExternalLogin?ReturnUrl=%2F#/tasks/XYZ123
but I cannot.
What is the right way to solve this issue?
I can think of creating my own redirection service URL without # tag, but this solution implies additional work, and covers only a case when the system is sending a message with task URL - humans will still try to copy/paste the location URL from the browser.
Thanks for any hint.
Max
Yes. The browser cuts off '#/tasks/XYZ123' and requests the page without that hash.
Although the hash itself appears on the login page, that is the browser's doing again.
The hash does not travel to the server.
So when the browser loads the login page with ?ReturnUrl=%2f#/tasks/XYZ123, we can rewrite the form action and encode the hash.
If the form looks like:
<form action="/Home/Login" method="post" >
...
</form>
The javascript code should look like:
<script src="~/js/jquery.js"></script>
<script type="text/javascript">
$(function() {
var search = $(location).attr('search') || '';
var hash = $(location).attr('hash') || '';
if (hash.length === 0) {
if (window.history.pushState) {
window.history.pushState('login', 'Login', '/Home/Login');
}
} else if (search === '?ReturnUrl=%2f') {
$('form').attr('action', '/Home/Login' + search + encodeURIComponent(hash) );
}
});
</script>
The part with window.history.pushState is needed for the following reason:
If there is no hash, then for an SPA the URL will most likely be:
http://hostname.com/Home/Login?ReturnUrl=%2f
so here we try to replace the URL (without a page reload) with the more accurate
http://hostname.com/Home/Login
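To see what the encoding step buys, the string logic from the jQuery snippet can be pulled out into a plain function and run on its own. This is only a sketch: buildLoginAction is a hypothetical name, and the URL class is used here just to show what a server would decode from the query string.

```javascript
// Same string logic as the jQuery snippet, as a plain function.
function buildLoginAction(search, hash) {
  // encodeURIComponent turns "#" into "%23" and "/" into "%2F", so the
  // fragment becomes part of the ReturnUrl query value instead of being
  // dropped by the browser.
  return '/Home/Login' + search + encodeURIComponent(hash);
}

var action = buildLoginAction('?ReturnUrl=%2f', '#/tasks/XYZ123');
console.log(action);

// What a server would decode from that query string:
var returnUrl = new URL(action, 'http://hostname.com').searchParams.get('ReturnUrl');
console.log(returnUrl);
```

With the hash folded into the query value, the Login action receives the full path including the fragment.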
You can use the properties of Request (like .Url or .QueryString) to get the original URL (and URL parameters), instead of relying on the automatic binding of the returnUrl parameter.
Replace # in the returnUrl with %23.

how to open specific page on Google's docs viewer

I'm using google's docs viewer to show a pdf document in a html page and I would like to open the document starting on page 20 instead of 1 for example.
There's hardly any documentation about Google's docs viewer service. Its web page, https://docs.google.com/viewer, says that the service only accepts two parameters (url and embedded), but searching the web I've seen other parameters, like "a", "pagenumber", "v" and "attid"; none of them did anything for me. I've tried adding #:0.page.19 to the end of my URL (that's the id of the div containing page number 20 inside the body Google creates), but it is either ignored or works only at random.
Do you guys know how to tell google docs viewer to show the document starting on a specific page?
I found a solution I'll post here just in case somebody is in the same situation.
Every page inside Google's docs viewer iframe has an id like :0.page.X, where X is the page number. Calling the service like this
<iframe id="iframe1" src="http://docs.google.com/gview?url=http://yourpdf&embedded=true#:0.page.20">
won't work (maybe because the pages ids are not yet created when the page is rendered?)
So you just have to add an onload attribute to the iframe:
<iframe id="iframe1" src="http://docs.google.com/gview?url=http://yourpdf&embedded=true" onload="javascript:this.contentWindow.location.hash=':0.page.20';">
and voilà, the iframe will automatically scroll down after loading.
Note that page indices are zero-based. If you want to view the 20th page of a document in the viewer, you would need to use the parameter :0.page.19
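The zero-based offset is easy to get wrong, so a small sketch of the arithmetic may help. pageHash and viewerUrl are hypothetical helper names, and the :0.page.X ids are an observed detail of the viewer's markup, not a documented API, so this may stop working at any time.

```javascript
// Hypothetical helpers. Page numbers passed in are 1-based (as a reader
// would count them); the viewer's element ids are zero-based.
function pageHash(pageNumber) {
  return ':0.page.' + (pageNumber - 1);
}

function viewerUrl(pdfUrl) {
  // encodeURIComponent keeps "&" or "?" inside the PDF URL from being
  // read as parameters of the viewer URL itself.
  return 'https://docs.google.com/viewer?url=' + encodeURIComponent(pdfUrl) + '&embedded=true';
}

console.log(viewerUrl('http://yourpdf'));
console.log(pageHash(20));

// In the browser, after the iframe has loaded (as in the onload attribute):
// document.getElementById('iframe1').contentWindow.location.hash = pageHash(20);
```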
I found these two:
1) just a screenshot (image) of a specific page (without navigation):
https://docs.google.com/viewer?url=http://infolab.stanford.edu/pub/papers/google.pdf&embedded=true&a=bi&pagenumber=12
2) a link to a specific page of the PDF in an IFRAME (with navigation):
<script>
var docURL='https://docs.google.com/viewer?url=http://infolab.stanford.edu/pub/papers/google.pdf&embedded=true';
var startPAGE=7;
document.write('<iframe id="iframe1" onload="javascript:go_to_page('+ startPAGE +')" src="https://docs.google.com/viewer?url=http://infolab.stanford.edu/pub/papers/google.pdf&embedded=true" width="600" height="400" ></iframe>');
function go_to_page(varr) { document.getElementById("iframe1").setAttribute("src", docURL + '#:0.page.'+ (varr-1) );}
</script>
P.S. You can then have a "go to page 3" link on your website.
For me this solution didn't work with the current version of Google viewer. "Link to specific page on Google Document Viewer on iPad" helped me out. Use &a=bi&pagenumber=xx as an additional URL parameter.
Got it working in the embedded viewer
by changing the URL with a timeout function; this is because the PDF is not shown immediately:
$(window).load(function() {
setTimeout(function() { $('.gde-frame').attr('src', 'https://docs.google.com/viewer?url=http://yourdomain.com/yourpdf.pdf&hl=nl&embedded=true#:0.page.15'); }, 1000);
});
Just copy the embed URL and add your hash for the page (#:0.page.15 will go to page 15).
You might want to change the language to your own (&hl=nl).
For people who have to support IE8:
if you use a boilerplate like this:
<!--[if IE 8 ]> <html class="no-js ie8" lang="en"> <![endif]-->
you can point the link directly at the PDF like this:
if( $("html").hasClass("ie8") ) {
$('#linkID').attr('href', 'http://yourdomain.com/yourpdf.pdf');
};
The PDF will then be shown in IE's PDF reader.
My PDF is 13 pages, but when I used the hash :0.page.13, it only jumped to page 10. I had to use setTimeout and call the same function again.
I only needed to do that once per page load; after that it seems to stay in sync. I am sure there is a more elegant solution:
var is_caught_up = false;
function pdf_go(page){
$('pdf_frame').contentWindow.location.hash=':0.page.'+page;
if (!is_caught_up){
setTimeout(function () { pdf_catchup(page); }, 500);
}
}
function pdf_catchup(page){
$('pdf_frame').contentWindow.location.hash=':0.page.'+page;
is_caught_up = true;
}

Chrome extensions: How can I pass form variables to a url in a newly created tab?

I'm working on my first Chrome extension. I use an image-based context menu item to capture the URL of a given image, and I then want to display that image at a specific URL in a new tab. So I need to pass the URL of the image that was clicked (using srcUrl) to a specific script that can then render it on that page. Is it possible to perform an XMLHttpRequest from within a chrome.tabs.create() call, or must this be done some other way?
Thanks for any help.
You would need to create an HTML page containing that script and put it into your extension folder. Then you can just pass image url to it as GET parameter:
chrome.tabs.create({url: "local.html?img_url=..."});
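Since an image URL usually contains characters such as & and ?, it should be encoded before it is appended as a GET parameter. A minimal sketch follows; buildTabUrl and readImgUrl are hypothetical helper names, and URLSearchParams is the standard API available in current browsers.

```javascript
// Hypothetical helper: builds the extension page URL with the image URL
// safely encoded as the img_url GET parameter.
function buildTabUrl(page, imgUrl) {
  return page + '?img_url=' + encodeURIComponent(imgUrl);
}

// Hypothetical helper for the extension page: reads img_url back out of
// a query string such as window.location.search.
function readImgUrl(search) {
  return new URLSearchParams(search).get('img_url');
}

var tabUrl = buildTabUrl('local.html', 'http://example.com/a b.png?x=1&y=2');
console.log(tabUrl);
console.log(readImgUrl(tabUrl.slice(tabUrl.indexOf('?'))));

// In the context menu click handler, where info.srcUrl is the clicked image:
// chrome.tabs.create({url: buildTabUrl('local.html', info.srcUrl)});
```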
If a URL parameter is not enough, you can also communicate with that page using chrome.tabs.sendRequest():
chrome.tabs.create({url: "local.html"}, function(tab){
chrome.tabs.sendRequest(tab.id, {img_url: "..."});
});
With the request listener in that page:
chrome.extension.onRequest.addListener(function(request) {
console.log(request.img_url);
});