Selenium 2: How to save a HTML page including all referenced resources (css, js, images...)?

In Selenium 2, the WebDriver object only offers a method getPageSource() which saves the raw HTML page without any CSS, JS, images etc.
Is there a way to also save all referenced resources in the HTML page (similar to HtmlUnit's HtmlPage.save())?

I know I'm very late with my answer, but I didn't find an answer to this question when I was searching myself, so I wrote something. I hope it still helps some people.
For C#, here's how I did it:
using System.Net;
string DataDirectory = "C:\\Temp\\AutoTest\\Data\\";
string PageSourceHTML = Driver.PageSource;
string[] StringSeparators = new string[] { "<" };
string[] Result = PageSourceHTML.Split(StringSeparators, StringSplitOptions.None);
string CSSFile;
string FileName = "filename.html";
// Save the raw HTML first.
System.IO.File.WriteAllText(DataDirectory + FileName, PageSourceHTML);
foreach (string S in Result)
{
    if (S.Contains("stylesheet"))
    {
        CSSFile = S.Substring(28); // strip off 'link rel="stylesheet" href="'
        CSSFile = CSSFile.Substring(0, CSSFile.Length - 10); // strip off trailing characters, like '" />', the newline, and spaces up to the next "<". Can and probably will be different in your case.
        System.IO.Directory.CreateDirectory(DataDirectory + "\\" + CSSFile.Substring(0, CSSFile.LastIndexOf("/"))); // create the CSS directory structure
        var Client = new WebClient();
        Client.DownloadFile(Browser.Browser.WebUrl + "/" + CSSFile, DataDirectory + "\\" + CSSFile); // download the file and save it with the same filename under the same relative path
    }
}
I'm sure it could be improved to handle unforeseen situations, but for my website under test it always works like this.
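The fixed offsets above (28 characters in front, 10 behind) only fit one particular markup. As a rough sketch of a less position-dependent approach, the JavaScript below pulls stylesheet, script, and image references out of a page-source string and resolves them against the page URL. The function name `extractResourceUrls`, the sample markup, and the `example.test` host are made up for illustration, and scanning HTML with regular expressions is a simplification that a real HTML parser would handle more reliably.

```javascript
// Hypothetical helper: collect stylesheet, script, and image URLs from page source.
// A regex scan is a simplification; a real HTML parser is more robust.
function extractResourceUrls(html, baseUrl) {
  const urls = [];
  const tagPattern = /<(link|script|img)\b[^>]*>/gi;
  let tag;
  while ((tag = tagPattern.exec(html)) !== null) {
    const name = tag[1].toLowerCase();
    // <link> carries its target in href; <script> and <img> use src.
    const attr = name === "link" ? "href" : "src";
    if (name === "link" && !/\brel\s*=\s*["']?stylesheet/i.test(tag[0])) {
      continue; // skip icons, canonical links, and other non-stylesheet links
    }
    const value = new RegExp("\\b" + attr + "\\s*=\\s*[\"']([^\"']+)[\"']", "i").exec(tag[0]);
    if (value) {
      // Resolve relative references against the page URL.
      urls.push(new URL(value[1], baseUrl).href);
    }
  }
  return urls;
}

// Made-up sample page for illustration.
const sample =
  '<html><head><link rel="stylesheet" href="css/site.css">' +
  '<link rel="icon" href="favicon.ico">' +
  '<script src="/js/app.js"></script></head>' +
  '<body><img src="img/logo.png"></body></html>';

console.log(extractResourceUrls(sample, "http://example.test/shop/index.html"));
```

Each resolved URL could then be downloaded and written under the same relative path, as the C# loop does for stylesheets.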

Nope. If you can, go for HtmlUnit for this particular task.
The best you could do, I think, is Robot: press Ctrl + S, then confirm with Enter. It's blind, it's imperfect, but it's the closest thing to what you need.

You can use the Selenium interactions to handle it.
using OpenQA.Selenium.Interactions;
There are a few ways to do it. One way I handle this is to find an element central to the page, or to whichever area you wish to save, and use an Actions builder.
var htmlElement = driver.FindElement(By.XPath("//your path"));
Actions action = new Actions(driver);
try
{
    action.MoveToElement(htmlElement).ContextClick(htmlElement).SendKeys("p").Build().Perform();
}
catch (WebDriverException) {}
This simply right-clicks on the area and then sends the key "p", which is the 'Save Page As' hotkey in the Firefox context menu. Another way is to have the builder send the keyboard shortcut.
var htmlElement = driver.FindElement(By.XPath("//your path"));
action.MoveToElement(htmlElement);
try
{
    action.KeyDown(Keys.Control).SendKeys("S").KeyUp(Keys.Control).Build().Perform();
}
catch (WebDriverException) {}
Note that in both cases, if you leave the scope of the driver, say into a Windows form, you will have to switch your code to handle the Windows form when it pops up. Selenium also has trouble when nothing is returned after the keys are sent, which is why the try/catch blocks are there. If anyone has a way to work around that, it would be awesome.

Related

Splitting pages within a PDF with AcroJS / Acrobat JS with a given array of names

I am very new to the JS interface within Acrobat, and I am trying to write something that splits a PDF into single pages named from an array of file names. I cannot find many snippets that show how to work with Acrobat JS. Can you provide some guidance on what such a script would look like and how I can execute it within Acrobat? Thanks!

First, you will need Acrobat Professional or Standard for JS tasks generally. You run code in what Acrobat calls the JavaScript Debugger, much as you would in a terminal or immediate window. To activate it, go to Preferences and enable the Debugger. Then restart Acrobat and find the JavaScript debugger tools (the location depends on your version; search for it if you can't find it).
Once the debugger is running, modify the code below for the file names and file paths you want to use. Then highlight the entire code block and press Ctrl+Enter, and it will split the pages for you. Enjoy.
Split();
function Split() {
    var totalPages = this.numPages;
    var i;
    var arrNames = [ "SOME ARRAY" ]; // one output name per page
    var targetPath = "/C/Users/...SOMEPATH/";
    try {
        for (i = 0; i < totalPages; i++) {
            // Extract page i on its own into a new file.
            this.extractPages({
                nStart: i,
                cPath: targetPath + arrNames[i] + ".pdf"
            });
            console.println("Completed: " + targetPath + arrNames[i] + ".pdf");
        }
    } catch (e) {
        console.println("Aborted: " + e);
    }
}
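The loop above indexes `arrNames` by page number, so a name list shorter than the document would produce files named "undefined.pdf". A small sketch of a pre-check in plain JavaScript follows; `buildOutputPaths`, the sample names, and the `/C/Temp/` folder are hypothetical and not part of the Acrobat API.

```javascript
// Hypothetical helper: pair each page index with an output path, failing early
// if the name list does not cover every page.
function buildOutputPaths(totalPages, names, targetPath) {
  if (names.length < totalPages) {
    throw new Error("Need " + totalPages + " names but got " + names.length);
  }
  var paths = [];
  for (var i = 0; i < totalPages; i++) {
    paths.push(targetPath + names[i] + ".pdf");
  }
  return paths;
}

// Example with made-up names and a made-up folder.
var examplePaths = buildOutputPaths(2, ["cover", "summary"], "/C/Temp/");
```

Inside Acrobat, the resulting array could supply the `cPath` value for each `extractPages` call, with `this.numPages` as the page count.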

Handle download dialog box in SlimerJS

I have written a script that clicks on a link that downloads an MP3 file. The problem is that when the script simulates the click on that link, a download dialog box pops up.
I want to save this file to a path of my choice and automate the whole process, but I have no idea how to handle this dialog box.

Here's a script adapted from this blog post to download a file.
In SlimerJS it is possible to use response.body inside the onResourceReceived handler. However, to avoid using too much memory, it does not capture anything by default. You first have to set page.captureContent to say what you want: assign it an array of regexes that select which files to receive. Each regex is applied to the MIME type. In the example code below I use /.*/ to mean "get everything". Using [/^image\/.+$/] should get just images, and so on.
var fs = require('fs');
var page = require('webpage').create();
fs.makeTree('contents');
page.captureContent = [ /.*/ ];
page.onResourceReceived = function(response) {
    // Only act once the whole body has arrived.
    if (response.stage != "end" || !response.bodySize)
    {
        return;
    }
    // Use the last path segment of the URL as the file name.
    var matches = response.url.match(/[/]([^/]+)$/);
    var fname = "contents/" + matches[1];
    console.log("Saving " + response.bodySize + " bytes to " + fname);
    fs.write(fname, response.body);
    phantom.exit();
};
page.onResourceRequested = function(requestData, networkRequest) {
    //console.log('Request (#' + requestData.id + '): ' + JSON.stringify(requestData));
};
page.open("http://....mp3", function(){
});
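To make the MIME-type matching concrete, here is a small plain-JavaScript sketch of the rule as described above: a response counts as captured when its MIME type matches at least one pattern in the list. `shouldCapture` is a hypothetical helper written for illustration, not SlimerJS's own implementation.

```javascript
// Hypothetical helper mirroring the description above: a response is captured
// when its MIME type matches at least one of the configured patterns.
function shouldCapture(mimeType, patterns) {
  return patterns.some(function (pattern) {
    return pattern.test(mimeType);
  });
}

var everything = [/.*/];
var imagesOnly = [/^image\/.+$/];

var audioWithImagesOnly = shouldCapture("audio/mpeg", imagesOnly); // no pattern matches
var imageWithImagesOnly = shouldCapture("image/png", imagesOnly);  // matches ^image/
var audioWithEverything = shouldCapture("audio/mpeg", everything); // /.*/ matches any type
```

For the MP3 case, a pattern list such as `[/^audio\//]` would narrow capture to audio responses, assuming the server reports an audio MIME type.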

You can't control a dialog box. SlimerJS doesn't have API for this action.

Firefox generates a temporary "downloadfile.extension.part" file that contains the content. Simply rename the file, e.g. myfile.csv.part > myfile.csv.
If you are working locally on a Mac, you should find the .part file in the Downloads directory; on Linux, look in the /tmp/ folder.
Not the most elegant solution, but it should do the trick.
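The rename step could be scripted along these lines. This is a sketch for Node.js (not SlimerJS), `finalizePartFile` is a hypothetical helper, and the demo works on a throwaway temp directory rather than a real download. It also assumes the download has already finished; renaming a .part file that is still being written would leave a truncated file.

```javascript
var fs = require("fs");
var os = require("os");
var path = require("path");

// Hypothetical helper: rename "name.ext.part" to "name.ext" and return the new path.
function finalizePartFile(partPath) {
  if (!/\.part$/.test(partPath)) {
    throw new Error("Not a .part file: " + partPath);
  }
  var finalPath = partPath.slice(0, -".part".length);
  fs.renameSync(partPath, finalPath);
  return finalPath;
}

// Demo against a throwaway directory so no real download is touched.
var demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "partdemo-"));
var demoPart = path.join(demoDir, "myfile.csv.part");
fs.writeFileSync(demoPart, "a,b\n1,2\n");
var demoFinal = finalizePartFile(demoPart);
```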

How to check multiple PDF files for annotations/comments?

Problem: I routinely receive PDF reports and annotate (highlight etc.) some of them. I had the bad habit of saving the annotated PDFs together with the non-annotated PDFs. I now have hundreds of PDF files in the same folder, some annotated and some not. Is there a way to check every PDF file for annotations and copy only the annotated ones to a new folder?
Thanks a lot!
I'm on Win 7 64bit, I have Adobe Acrobat XI installed and I'm able to do some beginner coding in Python and Javascript
Please ignore the following suggestion, since the answers already solved the problem.
EDIT: Following Mr. Wyss' suggestion, I created the following code for Acrobat's Javascript console to be run only once at the beginning:
counter = 1;
// Open a new report
var rep = new Report();
rep.size = 1.2;
rep.color = color.blue;
rep.writeText("Files WITH Annotations");
Then this code should be applied to all PDFs:
this.syncAnnotScan();
annots = this.getAnnots();
path = this.path;
if (annots) {
rep.color = color.black;
rep.writeText(" ");
rep.writeText(counter.toString()+"- "+path);
rep.writeText(" ");
if (counter% 20 == 0) {
rep.breakPage();
}
counter++;
}
And, at last, one code to be run only once at the end:
//Now open the report
var docRep = rep.open("files_with_annots.pdf");
There are two problems with this solution:
1. The "Action Wizard" seems to always apply the same code afresh to each PDF. That means the "counter" variable, for instance, is meaningless: it will always be 1. More importantly, the variable "rep" will be unassigned when the middle code is run on different PDFs.
2. How can I make the code that should run only once run only at the beginning or at the end, instead of running every time for every single PDF (as it does by default)?
Thank you very much again for your help!

This would be possible using the Action Wizard to put together an action.
The function to determine whether there are annotations in the document would be done in Acrobat JavaScript. Roughly, the core function would look like this:
this.syncAnnotScan() ; // updates all annots
var myAnnots = this.getAnnots() ;
if (myAnnots != null) {
// do something if there are annots
} else {
// do something if there are no annots
}
And that should get you there.
I am not completely positive, but I think there is also a Preflight check which tells you whether there are annotations in the document. If so, you would create a Preflight droplet, which would sort out the annotated and not annotated documents.

Mr. Wyss is right. Here's a step-by-step guide:
In Acrobat XI Pro, go to the 'Tools' panel on the right side.
Click on the 'Action Wizard' tab (you must first make it visible, though).
Click on 'Create New Action...', choose 'More tools' > 'Execute Javascript' and add it to the right-hand pane > click on 'Execute Javascript' > 'Specify Settings' (uncheck 'prompt user' if you want) > paste this code:
this.syncAnnotScan();
var annots = this.getAnnots();
var fname = this.documentFileName;
fname = fname.replace(/,/g, ";"); // replace every comma, not just the first one
var errormsg = "";
if (annots) {
    try {
        this.saveAs({
            cPath: "/c/folder/" + fname,
            bPromptToOverwrite: false // make this 'true' if you want to be prompted on overwrites
        });
    } catch (e) {
        for (var i in e) {
            errormsg += (i + ": " + e[i] + " / ");
        }
        app.alert({
            cMsg: "Error! Unable to save the file under this name ('" + fname + "' - possibly a Unicode string?) See this: " + errormsg,
            cTitle: "Damn you Acrobat"
        });
    }
}
annots = 0;
Save and run it! All your annotated PDFs will be saved to 'c:\folder' (but only if this folder already exists!)
Be sure to first enable Javascript in 'Edit' > 'Preferences...' > 'Javascript' > 'Enable Acrobat Javascript'.
VERY IMPORTANT: Acrobat's JS has a bug that doesn't allow Docs to be saved with commas (",") in their names (e.g., "Meeting with suppliers, May 11th.pdf" will produce an error). Therefore, the code above substitutes ";" for every ",".
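One detail worth spelling out for the comma substitution: in JavaScript, `replace` with a plain string pattern changes only the first occurrence, so a file name with several commas needs a regular expression with the `g` flag. A minimal sketch, using a made-up file name:

```javascript
// Made-up file name containing two commas.
var original = "Meeting, with suppliers, May 11th.pdf";

// A string pattern replaces only the first match.
var firstOnly = original.replace(",", ";");

// A global regex replaces every match.
var allCommas = original.replace(/,/g, ";");

console.log(firstOnly); // Meeting; with suppliers, May 11th.pdf
console.log(allCommas); // Meeting; with suppliers; May 11th.pdf
```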

'sendKeys' are not working in Selenium WebDriver

I am not able to put any value in my application using WebDriver. My application is using frames.
I am able to clear the value of my textbox with driver.findElement(By.name("name")).clear();, but I'm unable to put any value using driver.findElement(By.name("name")).sendKeys("manish");. The click command works for another button on the same page.

I also had that problem, but then I made it work by:
myInputElm.click();
myInputElm.clear();
myInputElm.sendKeys('myString');

Before sendkeys(), use the click() method (i.e., in your case: clear(), click(), and sendKeys()):
driver.findElement(By.name("name")).clear();
driver.findElement(By.name("name")).click(); // Keep this click statement even if you are using click before clear.
driver.findElement(By.name("name")).sendKeys("manish");

Try clicking on the textbox before you send keys.
It may be that you need to trigger an event on the field before input and hopefully the click will do it.

I experienced the same issue and was able to collect the following solution for this:
Make sure element is in focus → try to click it first and enter a string.
If the input box is animated, add a wait, but not a static one: wait for an element that appears after the animation. (This was my case.)
You can try it out using Actions class.

Clicking the element works for me too, however, another solution I found was to enter the value using JavaScript, which doesn't require the element to have focus:
var _element= driver.FindElement(By.Id("e123"));
IJavaScriptExecutor js = (IJavaScriptExecutor)driver;
js.ExecuteScript("arguments[0].setAttribute('value', 'textBoxValue')", _element);

Use JavaScript to click in the field and then use sendkeys() to enter values.
I had a similar problem in the past with frames. JavaScript is the best way.

First pass the driver control to the frame using:
driver.switchTo().frame("pass id/name/index/webelement");
After that, perform the operation which you want to do on the webelement present inside the frame:
driver.findElement(By.name("name")).sendKeys("manish");

I ran into the same problem, where copy-paste also did not work for that text box.
The code below works fine for me:
WebDriver driver = new FirefoxDriver();
String mobNo = "99xxxxxxxx";
WebElement mobileElementIrs =
driver.findElement(By.id("mobileNoPrimary"));
mobileElementIrs.click();
mobileElementIrs.clear();
mobileElementIrs.sendKeys(mobNo);

I had a similar problem too, when I used
getDriver().findElement(By.id(idValue)).clear();
getDriver().findElement(By.id(idValue)).sendKeys(text);
The value in "text" was not completely written into the input. For example, "Patrick" was sometimes written as "P" and other times as "Pat", so the test failed.
The fix is a workaround and uses JavaScript:
((JavascriptExecutor)getDriver()).executeScript("$('#" + idValue + "').val('" + text + "');");
Now it is fine.
Instead of
driver.findElement(By.id("idValue")).sendKeys("text");
use,
((JavascriptExecutor)getDriver()).executeScript("$('#" + "idValue" + "').val('" + "text" + "');");
This worked for me.
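One caveat with building the script by string concatenation: if the text contains a single quote, the generated jQuery call is no longer valid JavaScript. A hedged sketch of one way around that is to let `JSON.stringify` produce the string literal. `buildSetValueScript` is a hypothetical helper, the id is assumed to be a simple identifier, and jQuery is assumed to be loaded on the page, as in the answers above.

```javascript
// Hypothetical helper: build the jQuery snippet with the text safely quoted.
// The id is still concatenated directly, so it is assumed to need no escaping.
function buildSetValueScript(id, text) {
  return "$('#" + id + "').val(" + JSON.stringify(text) + ");";
}

var script = buildSetValueScript("name", "O'Brien");
console.log(script); // $('#name').val("O'Brien");
```

An alternative that avoids quoting altogether is to pass the text as a script argument and read it as `arguments[0]`, the same mechanism the C# answer above uses for the element.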

I had a similar problem recently and tried some of the suggestions above, but nothing worked. In the end I fell back on a brute-force retry, which tries again if the input box wasn't set to what was expected.
I wanted to avoid Thread.Sleep for obvious reasons, and I saw several examples of it failing that looked like some kind of race or timing condition.
public void TypeText(string id, string text)
{
    const int numberOfRetries = 5;
    // Use <= so the final attempt can hit the rethrow below.
    for (var i = 1; i <= numberOfRetries; i++)
    {
        try
        {
            if (TryTypeText())
                return;
        }
        catch (Exception)
        {
            // Swallow failures until the last attempt, then rethrow.
            if (i == numberOfRetries)
                throw;
        }
    }
    bool TryTypeText()
    {
        var element = _webDriver.FindElement(By.Id(id));
        element.Click();
        element.Clear();
        element.SendKeys(text);
        if (element.TagName.ToLower() == "input"
            && !DoesElementContainValue(element, text, TimeSpan.FromMilliseconds(1000)))
        {
            throw new ApplicationException($"Unable to type the text '{text}' into element with id {id}. Value is now '{element.GetAttribute("value")}'");
        }
        return true;
    }
}
private bool DoesElementContainValue(IWebElement webElement, string expected, TimeSpan timeout)
{
    var wait = new WebDriverWait(_webDriver, timeout);
    return wait.Until(driver =>
    {
        try
        {
            var attribute = webElement.GetAttribute("value");
            return attribute != null && attribute.Contains(expected);
        }
        catch (StaleElementReferenceException)
        {
            return false;
        }
    });
}
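Stripped of the WebDriver calls, the retry idea above comes down to a small loop: run the action, remember the last failure, and rethrow it only after the final attempt. A minimal JavaScript sketch follows; `retry` and the fake action that succeeds on its third call are made up for illustration.

```javascript
// Hypothetical helper: run action up to `attempts` times and rethrow the last
// error if every attempt fails.
function retry(action, attempts) {
  var lastError;
  for (var i = 1; i <= attempts; i++) {
    try {
      return action(i);
    } catch (e) {
      lastError = e; // remember the failure and try again
    }
  }
  throw lastError;
}

// Fake action standing in for "click, clear, type, verify": fails twice, then succeeds.
var calls = 0;
var result = retry(function () {
  calls++;
  if (calls < 3) {
    throw new Error("value not set yet");
  }
  return "typed";
}, 5);
```

The loop bound matters here: the attempt counter has to reach the maximum, otherwise the final failure is silently swallowed instead of being reported.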

In my case, I had some actions.keyDown(Keys.CONTROL).XXXX;
But I forgot to add the keyUp for that key, which prevented the keys from being sent and resulted in weird behavior.
Adding x.keyUp() after the x.keyDown() fixed the issue.

Try using JavaScript to sendkeys().
WebElement element = driver.findElement(By.name("name"));
JavascriptExecutor executor = (JavascriptExecutor)driver;
executor.executeScript("arguments[0].click();", element);
More information on JavaScript Executor can be found at
JavascriptExecutor - Selenium.

Generally I keep a temporary variable. This should work.
var name = element(by.id('name'));
name.clear();
name.sendKeys('anything');

Switching to a window with no name

Using the Codeception testing framework and Selenium 2 module to test a website, I end up following a hyperlink that opens a new window with no name. As a result the switchToWindow() function will not work because it is trying to switch to the parent window (which I'm currently on). Without being able to switch to the new window I cannot perform any testing on it.
<a class="external" target="_blank" href="http://mylocalurl/the/page/im/opening">
View Live
</a>
Using both Chrome and Firefox debugging tools I can confirm the new window doesn't have a name, and I cannot give it one because I cannot edit the HTML page I am working on. Ideally I would have changed the HTML to use javascript onclick="window.open('http://mylocalurl/the/page/im/opening', 'myPopupWindow') however this is not possible in my case.
I've looked around on the Selenium forums without any clear method to tackle this problem, and Codeception doesn't appear to have much functionality around this.

After searching around on the Selenium forum and some helpful prods from @Mark Rowlands, I got it to work using raw Selenium.
// before codeception v2.1.1, just typehint on \Webdriver
$I->executeInSelenium(function (\Facebook\WebDriver\Remote\RemoteWebDriver $webdriver) {
$handles=$webdriver->window_handles();
$last_window = end($handles);
$webdriver->focusWindow($last_window);
});
Returning back to the parent window was easy because I could just use Codeception's switchToWindow method:
$I->switchToWindow();

Building on the accepted answer, in Codeception 2.2.9 I was able to add this code to the Acceptance Helper and it seems to work.
/**
* @throws \Codeception\Exception\ModuleException
*/
public function switchToNewWindow()
{
$webdriver = $this->getModule('WebDriver')->webDriver;
$handles = $webdriver->getWindowHandles();
$lastWindow = end($handles);
$webdriver->switchTo()->window($lastWindow);
}
Then in the test class I can do this:
$I->click('#somelink');
$I->switchToNewWindow();
// Some assertions...
$I->switchToWindow(); // this switches back to the previous window
I had a heck of a time trying to figure out how to do this by just searching google, so I hope it helps someone else.

Try this:
String parentWindowHandle = browser.getWindowHandle(); // save the current window handle.
WebDriver popup = null;
Iterator<String> windowIterator = browser.getWindowHandles().iterator(); // getWindowHandles() returns a Set
while (windowIterator.hasNext()) {
    String windowHandle = windowIterator.next();
    popup = browser.switchTo().window(windowHandle);
}
Make sure to return to the parent window afterwards:
browser.close(); // close the popup.
browser.switchTo().window(parentWindowHandle); // switch back to the parent window.
I hope this helps.

Using Codeception 2.2+ it looks like this:
$I->executeInSelenium(function (\Facebook\WebDriver\Remote\RemoteWebDriver $webdriver) {
$handles = $webdriver->getWindowHandles();
$lastWindow = end($handles);
$webdriver->switchTo()->window($lastWindow);
});
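The answers above all pick the last handle in the list. As far as I know, the WebDriver specification does not promise any particular ordering of window handles, so a more defensive variant is to record the handles before the click and look for the one that is new afterwards. The sketch below shows only that comparison in plain JavaScript; `findNewHandle` and the handle strings are hypothetical, and in a real test the two arrays would come from the driver's window-handle call before and after clicking the link.

```javascript
// Hypothetical helper: return the single handle present after the click but not before.
function findNewHandle(handlesBefore, handlesAfter) {
  var fresh = handlesAfter.filter(function (handle) {
    return handlesBefore.indexOf(handle) === -1;
  });
  if (fresh.length !== 1) {
    throw new Error("Expected exactly one new window, found " + fresh.length);
  }
  return fresh[0];
}

// Made-up handle values; the new window is deliberately not last in the list.
var before = ["win-parent"];
var after = ["win-popup", "win-parent"];
var popupHandle = findNewHandle(before, after);
```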
