Dumping browser document content using Zombie.js - zombie.js

Using browser.visit, I am fetching a page as shown in the documentation. According to the browser API, browser.document returns the main window's document. However, I am not sure how to dump (display) the contents of the document. Is there a method like browser.document.toString() or browser.document.text() that prints the contents of the document to the console?
Thanks,
Sony

What you want is probably:
browser.document.innerHTML

There is a browser.text(selector, context?).
Selector is a CSS selector evaluated against the document body.
Context is an optional second argument; when given, the CSS selector is evaluated against that element instead.
You can say something like browser.text('body') to get the text in the body.

I got here while looking for an answer to the same question.
I may be late to the party, but try using
var fs = require('fs');
var Browser = require('zombie');
Browser.visit(url, function (error, browser) {
    // check `error` here before using `browser`
    fs.appendFileSync('index.html', browser.html());
});
Remember to add error checking and better handling, but this should give you the basic HTML document.

If it's not necessarily HTML (like you find yourself pulling XML or JSON through Zombie due to complicated, valid reasons...), you can access it like this:
browser.document._childNodes[0]._nodeValue

How to make a redirect in velocity template?

How do I redirect to http://google.com from the code in a .vm file? (I mean within
#if <redirect to Google here> #else ... #end statement)
Doing
setRequestURI('http://google.com')
or similar doesn't work and I'm not sure if it is possible at all.
Thank you.
Can anybody explain please?
IMHO a template is the wrong place to implement logic like this. You should decide whether to redirect up front (in whatever layer handles the request; you don't mention your environment other than Velocity), before you decide that the Velocity template in question should render the output.
Once you're in the template, you can't assume that you're within a web browser (it might render an email message) or that you have access to the response headers (the template might be rendered as part of a page when the response has already been "committed", i.e. the headers have already been sent to the browser).
Do yourself a favor and move the logic further up the chain, out of your Velocity template. If you need a quick fix: render JavaScript that triggers the redirect. But pay back this technical debt sooner rather than later.
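For the quick fix mentioned above, here is a minimal sketch of the markup the template would have to emit inside the #if branch. The helper name redirectSnippet is made up for illustration and is not part of Velocity; in a real template you would normally write the script tag directly.

```javascript
// Hypothetical helper (not a Velocity API): builds the <script> snippet
// that a rendered page would need to contain to trigger the redirect.
function redirectSnippet(targetUrl) {
  // JSON.stringify produces a quoted JavaScript string literal; escaping "<"
  // keeps a stray "</script>" inside the URL from closing the tag early.
  const literal = JSON.stringify(targetUrl).replace(/</g, '\\u003c');
  return '<script>window.location.replace(' + literal + ');</script>';
}

console.log(redirectSnippet('http://google.com'));
```

Note that this only works when the output ends up in a browser with scripting enabled, which is exactly the assumption the answer warns about.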

How to pass data between pages through worklight client API

I want to invoke a procedure in one page and use its response in another page. The response is only used by the next page, so I think JSONStore is not suited for that. Should I define a global variable?
Is there any code sample to do such things? Thanks for your help.
I presume by pages you mean different HTML files. If so, that is not recommended; Worklight is intended for single-page applications. There are no code samples that show how to do that.
I would recommend having a single HTML page and using something like jQuery.load to inject new HTML / DOM elements. By dynamically injecting new HTML, your single/main HTML file shouldn't get too big, and you can destroy (i.e. remove from memory / the DOM) unused DOM elements. Searching on Google for page fragments and HTML templates could help you find examples. The idea is that you don't lose the JavaScript context.
Maybe you can get away with doing a new init to re-initialize JSONStore (it won't delete any of the data, just give you access) on every new HTML page, and then use get to access the JSONStore collections and perform operations such as find.

Find a link inside iframe in webbrowser control in VB.net

I want to find a URL inside an iframe in a WebBrowser control.
1) My WebBrowser control opens a URL
2) That URL has one iframe inside it
3) That Iframe has a link which I want to grab programmatically using vb.net
At any point of time use webBrowser1.Url.ToString() to get the URL of the current open link.
You can get the html code of the open url by using webBrowser1.DocumentText. Once you have the html code use string manipulation to find the "iframe src" value.
This can be a bit complicated, as you might not know how many iframes you need to handle.
There are also some limitations on FRAME elements, according to the HtmlWindow.WindowFrameElement Property documentation:
You cannot access a FRAME element or the FRAME's document if the
FRAME is in a different zone than the FRAMESET that contains it. For a
full explanation, see About Cross-Frame Scripting and Security.
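The string-manipulation approach described above can be sketched as follows. This is written in JavaScript only to show the idea; the function name iframeSources is hypothetical, and a regular expression is not a full HTML parser, so treat it as a rough illustration that handles quoted src attributes only.

```javascript
// Hypothetical helper: collects the src value of every <iframe> tag found
// in an HTML string (for example, the text returned by DocumentText).
function iframeSources(html) {
  // Matches <iframe ... src="..."> or src='...', case-insensitively.
  const pattern = /<iframe\b[^>]*?\ssrc\s*=\s*(["'])(.*?)\1/gi;
  const sources = [];
  let match;
  while ((match = pattern.exec(html)) !== null) {
    sources.push(match[2]); // group 2 is the attribute value without quotes
  }
  return sources;
}

const sample = '<p>x</p><iframe id="a" src="http://example.com/a.html"></iframe>';
console.log(iframeSources(sample));
```

Because the function returns every match, it also covers the case where you do not know in advance how many iframes the page contains.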
Actually, all you need to do is this...
MsgBox(WebBrowser1.Document.Window.Frames(0).Document.GetElementById("linkTagId").GetAttribute("href"))
This will show you the href of the link; there is no need to waste time on string manipulation.
Of course, you can also loop through the frames and links using their Count properties in a For loop.
Also, there are ways to bypass the cross-frame security issues since you are running the code in an exe, there are examples online, just search for "bypass cross-frame security webbrowser control" in google without the quotes.
If you need more help with these, let me know and I can tell you how. Remember that the cross-frame restrictions only need bypassing if the parent domain name and the iframe domain name are different (subdomains can differ without problems).
Let me know mate :)

Checking the contents of an embed tag using Selenium

We generate a pdf doc via a call to a web service that returns the path to the generated doc.
We use an embed html tag to display the pdf inline, i.e.
<div id="ctl00_ContentPlaceHolder2_ctl01_embedArea">
<embed wmode="transparent" src="http://www.company.com/vdir/folder/Pdfs/file.pdf" width="710" height="400"/>
</div>
I'd like to use selenium to check that the pdf is actually being displayed and if possible save the path, i.e. the src link into a variable.
Anyone know how to do this? Ideally we'd like to be able to then compare this pdf to a reference one but that's a question for another day.
As far as inspecting the pdf from selenium, you're more or less out of luck. The embed tag just drops a plugin into the page, and because a plugin isn't well represented in the DOM, Selenium can't get a very good handle on it.
However, if you're using Selenium-RC you may want to consider getting the src of the embed element, then requesting that URL directly and evaluating the resulting PDF in code. Assuming your embed element looks like this <embed id="embedded" src="http://example.com/static/pdf123.pdf" /> you can try something like this
String pdfSrc = selenium.getAttribute("embedded@src");
Then make a web request to the pdfSrc URL and (somehow) validate that it's the one you want. It may be enough to just check that it's not a 404.
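As a rough sketch of the validation step, the check below looks at the HTTP status and at the first bytes of the downloaded body. It is written in JavaScript to keep it short; the function name looksLikePdf and its parameters are assumptions for illustration, and fetching the URL itself is left to whatever HTTP client you already use.

```javascript
// Hypothetical helper: decides whether a fetched response looks like a PDF.
// statusCode is the HTTP status; body is a Buffer holding the response bytes.
function looksLikePdf(statusCode, body) {
  if (statusCode !== 200) {
    return false; // covers the 404 case
  }
  // PDF files begin with the ASCII signature "%PDF-".
  return body.subarray(0, 5).toString('latin1') === '%PDF-';
}

console.log(looksLikePdf(200, Buffer.from('%PDF-1.4\n')));
console.log(looksLikePdf(404, Buffer.from('<html>Not found</html>')));
```

The signature check catches the case where the server answers 200 but returns an HTML error page instead of the document.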

how to read/parse dynamically generated web content?

I need to find a way to write a program (in any language) that will connect to a website and read dynamically generated data from the website.
Note that it's dynamically generated--it's not enough to get the source html, because the data I'm interested in is generated via javascript that references back-end code. So when I view the webpage source, I can't see the data. (For example, go to google, and do a search. Check the source code on the search results page. Very little of the data your browser is displaying is reflected in the source--most of it is dynamically generated. I need some way to access this data.)
Pick a language and environment that includes an HTML renderer (e.g. .NET and the WebBrowser control). Use the HTML renderer to get the URL and produce an HTML DOM in memory (making sure that scripting is enabled). Read the contents of the HTML DOM after the renderer has done its work.
Example (you'll need to do this inside a System.Windows.Form derived class):
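WebBrowser browser = new WebBrowser();
// Navigate is asynchronous, so read the DOM only after the page has loaded
browser.DocumentCompleted += (sender, e) =>
{
    HtmlDocument document = browser.Document;
    // extract what you want from the document
};
browser.Navigate("http://www.google.com");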
I used to have a Perl program that accessed Mapguide.com to get driving directions from one location to another. I parsed the returned page and saved the results to a database. That works as long as the source never changes its format. The problem is that the source format often changes, and then your parser needs to change too.
A simple thought: if we're talking about AJAX, you can instead look up the URLs that serve the dynamic data. Then you can use the JavaScript on the page in question to reformat it.
If you have Firefox with Greasemonkey, writing a DOM dumper should be a simple matter.