Is it possible to save pictures with a name taken from the page? - selenium

I want to save the images of all products on a site, naming each file after the product name shown on the same page. Is that possible on a site with the logic below?
The main product page links to every product, so I think I can reach each one from there. Each product page has sub-menus such as "General", "Gallery", etc. I want to get the product name from the General section, then go to the Gallery section and save the images under that name, like ProductName1.jpg, ProductName2.jpg ...
Can this be done with Selenium?
Product Page: http://www.laboory.com/products
Here is a sample product link:
http://www.laboory.com/product/laboory-water-soluble-m/3937

Yes, we can do this. Since you only used the selenium tag, I assume you are using Java.
Go to the product page.
Get the image source URL and the product name.
Use the BufferedImage and ImageIO classes to save the image to the desired location.
Code:
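// Needs org.openqa.selenium.*, java.awt.image.BufferedImage,
// javax.imageio.ImageIO, java.io.File and java.net.URL
driver = new ChromeDriver(options); // options: your ChromeOptions instance
driver.manage().deleteAllCookies();
driver.get("http://laboory.com/product/laboory-water-soluble-m/3937");
// First image with class 'imgin' that has a src attribute
WebElement logo = driver.findElement(By.xpath("(//span//img[@class='imgin' and @src])[1]"));
String logoSRC = logo.getAttribute("src");
// Product name taken from the page heading
String productName = driver.findElement(By.xpath("//div/h1")).getText();
URL imageURL = new URL(logoSRC);
BufferedImage saveImage = ImageIO.read(imageURL);
// Relative path, so the file lands in the working directory
ImageIO.write(saveImage, "png", new File(productName + ".png"));
Output: The image is saved as CAPSULE GC 510.png in the project directory.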
Note: You can change the location as well.
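To get the ProductName1.jpg, ProductName2.jpg ... naming from the question, the file names can be built separately from the download step. The following is only a sketch using the standard library: the class name ProductImageNames, the "images" directory and the replacement character are my own assumptions, not something the site or Selenium prescribes. It also replaces characters that are not allowed in file names, since a product name read from the page may contain them.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: builds numbered image file names from a product name.
final class ProductImageNames {

    // Replace characters that are invalid in common file systems with '_'.
    static String sanitize(String productName) {
        String cleaned = productName.trim().replaceAll("[\\\\/:*?\"<>|]", "_");
        return cleaned.isEmpty() ? "product" : cleaned;
    }

    // Build ProductName1.jpg, ProductName2.jpg, ... inside the given directory.
    static List<Path> targets(Path dir, String productName, int imageCount) {
        String base = sanitize(productName);
        List<Path> result = new ArrayList<>();
        for (int i = 1; i <= imageCount; i++) {
            result.add(dir.resolve(base + i + ".jpg"));
        }
        return result;
    }

    public static void main(String[] args) {
        // "images" is an assumed output directory.
        for (Path p : targets(Paths.get("images"), "ProductName", 3)) {
            System.out.println(p.getFileName());
        }
    }
}
```

Each returned path can then be passed to ImageIO.write via path.toFile(), once per gallery image.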

You can capture a screenshot of the image by using the location and dimensions of the image element, then save it under the desired name. See this question for reference:
How to capture the screenshot of a specific element rather than entire page using Selenium Webdriver?
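The cropping step of that approach can be sketched with the standard library alone. This is a rough illustration, not the referenced answer's exact code: the class name ElementCrop is made up, and the blank 800x600 image stands in for a real screenshot. With Selenium, the full-page image would typically come from ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE), and x, y, width and height from the element's getLocation() and getSize().

```java
import java.awt.image.BufferedImage;

// Hypothetical helper: cuts an element's rectangle out of a full-page screenshot.
final class ElementCrop {

    // x, y, width, height describe the element in screenshot pixels.
    static BufferedImage crop(BufferedImage fullPage, int x, int y, int width, int height) {
        // Clamp to the screenshot bounds so getSubimage does not throw
        // when the element extends past the captured area.
        int left = Math.max(0, Math.min(x, fullPage.getWidth() - 1));
        int top = Math.max(0, Math.min(y, fullPage.getHeight() - 1));
        int w = Math.min(width, fullPage.getWidth() - left);
        int h = Math.min(height, fullPage.getHeight() - top);
        return fullPage.getSubimage(left, top, w, h);
    }

    public static void main(String[] args) {
        // Stand-in for a real screenshot loaded with ImageIO.read(...).
        BufferedImage fullPage = new BufferedImage(800, 600, BufferedImage.TYPE_INT_RGB);
        BufferedImage element = crop(fullPage, 100, 50, 200, 150);
        System.out.println(element.getWidth() + "x" + element.getHeight()); // prints 200x150
    }
}
```

The cropped image can be saved with ImageIO.write under the product name. In recent Selenium versions you may also be able to call getScreenshotAs on the WebElement itself, which avoids the cropping.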

Can PDF notes/annotations include links to other pages in the PDF?

I want to add an annotation to a PDF page (i.e. something that would show as a pop-up note or appear in the list of notes for the current page).
And in that note, I want to say "See page 93", where clicking on that takes the user to page 93.
Is that possible? It seems like a useful feature, but I haven't been able to find any examples.
And if so, can it be done with Apache PDF Box?
Yes, it is possible, and yes, it can be done with PDFBox. That question has been asked and answered several times before. Read the following answer here and see the full code here.
try (InputStream resource = getClass().getResourceAsStream("some.pdf")) {
    PDDocument document = Loader.loadPDF(resource);
    // Page indices are zero-based: getPage(1) is the second page.
    PDPage page = document.getPage(1);
    PDAnnotationLink link = new PDAnnotationLink();
    PDPageDestination destination = new PDPageFitWidthDestination();
    PDActionGoTo action = new PDActionGoTo();
    //destination.setPageNumber(2);
    // Target of the link: the third page of the same document.
    destination.setPage(document.getPage(2));
    action.setDestination(destination);
    link.setAction(action);
    link.setPage(page);
    // Make the whole page the clickable area.
    link.setRectangle(page.getMediaBox());
    page.getAnnotations().add(link);
    document.save(new File("RESULT_FOLDER", "output-with-link.pdf"));
}
Other answers are here and here.

Selenium driver is not reflecting updates after click on link

There are some posts about this topic but I cannot find any solution for my case, this is the situation:
I click on a link (a next page):
ActionChains(driver).move_to_element(next_el).click().perform()
Then I get the content of the new page (I'm interested in some script sections inside the body):
html = driver.find_element_by_xpath("//*").get_attribute("outerHTML")
But that content is always the same, no matter how long I wait.
The only way to get the driver to see the new DOM is to call refresh(),
but in this case that is not a valid option.
Thanks and regards.
I am not sure exactly what you are looking for, but if I understand correctly, you want to capture the content of the script tags on the page.
If that is the case, capture the page source in a string variable:
source_code = driver.page_source
Once you have the string, you can extract the value with any of the available string methods. I hope it helps.
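As a rough illustration of pulling the script sections out of the page source string, here is a sketch in Java (in the Java bindings the string would come from driver.getPageSource(); the same pattern should carry over to Python's re module). The class name ScriptExtractor and the sample HTML are made up for the example, and a regular expression is only a quick approximation that can break on unusual markup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the bodies of script elements from page source.
final class ScriptExtractor {

    // Matches <script ...>body</script>, case-insensitively, across line breaks.
    private static final Pattern SCRIPT =
            Pattern.compile("<script\\b[^>]*>(.*?)</script>",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Return the trimmed text inside every script element found.
    static List<String> scriptBodies(String pageSource) {
        List<String> bodies = new ArrayList<>();
        Matcher m = SCRIPT.matcher(pageSource);
        while (m.find()) {
            bodies.add(m.group(1).trim());
        }
        return bodies;
    }

    public static void main(String[] args) {
        // Made-up page source standing in for driver.getPageSource().
        String html = "<html><body><script>var page = 2;</script><p>text</p>"
                + "<script type=\"text/javascript\">\nvar total = 10;\n</script></body></html>";
        System.out.println(scriptBodies(html)); // prints [var page = 2;, var total = 10;]
    }
}
```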

Switching to a new tab/window which is having xml-style-view instead of web-view in selenium

I have a scenario where clicking a button on a page opens a new page in a separate tab. The new page is not a regular web page, and when I use the normal switchTo().window() operations, it fails with "Web view not found, target window closed".
How should I handle this scenario in Selenium?
A screenshot of the result xml-viewer-style page
What is the resulting page's complete path? Does it end with .xml? And why do you need that page? I believe it is an XML file opening in a new tab. If you have data to retrieve from it, you first need to download it as an XML file, then use a parser to retrieve the data.
You can use DOM to parse an XML file like so:
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
InputSource is = new InputSource(new StringReader(response));
Document doc = builder.parse(is);
NodeList nList = doc.getElementsByTagName("item");
Node namedItem = nList.item(0).getAttributes().getNamedItem("uid");
System.out.println(namedItem.getNodeValue());
Before you can do that, you have to get the file onto your local system.
You can do that by adding a dummy argument to the href for your file, like so:
<a href="http://link/to/file.xml?dummy=dummy" download>Download Now</a>
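For completeness, the parsing step above can be wrapped into something that runs on its own. This is a minimal sketch: the class name XmlUidReader and the sample XML are invented for illustration, and the "item" element and "uid" attribute are simply the names used in the snippet above, so adjust them to whatever your file actually contains.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: reads the uid attribute of the first <item> element.
final class XmlUidReader {

    // Returns the uid value, or null if there is no <item> or no uid attribute.
    static String firstItemUid(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));
        NodeList items = doc.getElementsByTagName("item");
        if (items.getLength() == 0) {
            return null;
        }
        Element first = (Element) items.item(0);
        return first.hasAttribute("uid") ? first.getAttribute("uid") : null;
    }

    public static void main(String[] args) throws Exception {
        // Made-up content standing in for the downloaded XML file.
        String response = "<items><item uid=\"42\">first</item><item uid=\"43\">second</item></items>";
        System.out.println(firstItemUid(response)); // prints 42
    }
}
```

Unlike the snippet above, this version returns null instead of throwing when the element or attribute is missing.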

HtmlUnit - lazy loading of images

I am using HtmlUnit to download a URL, and the webpage uses lazy loading (I think) to load some of the images. Which settings should I use in HtmlUnit so that I can get those images?
For example, this is one of the URLs I am trying to download-
http://www.ebay.com.au/sch/i.html?_from=R40&_trksid=p2050601.m570.l1313.TR10.TRC0.A0.H0.Xiphone6s.TRS0&_nkw=iphone6s&_sacat=0
The product images (after first few) have dummy src value-
As you can see, the src attribute has a dummy value and the actual image URL is stored in the imgurl attribute. I think the webpage uses some JavaScript to replace the src attribute with the correct value once we scroll down.
This is my sample code-
webClient = new WebClient(BrowserVersion.FIREFOX_38);
webClient.getOptions().setActiveXNative(false);
webClient.getOptions().setAppletEnabled(false);
webClient.getOptions().setDoNotTrackEnabled(true);
webClient.getOptions().setPopupBlockerEnabled(true);
webClient.getOptions().setPrintContentOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnFailingStatusCode(false);
webClient.getOptions().setThrowExceptionOnScriptError(false);
webClient.setCssErrorHandler(new SilentCssErrorHandler());
Page page = webClient.getPage(url);
I have tried the following-
1) Increase window height-
webClient.getCurrentWindow().setInnerHeight(60000);
webClient.getCurrentWindow().setInnerWidth(60000);
2) Try to scroll down after page is downloaded
webClient.getOptions().setJavaScriptEnabled(true);
webClient.getOptions().setCssEnabled(true);
webClient.waitForBackgroundJavaScript(10 * 1000);
HtmlPage page = (HtmlPage) webClient.getPage(url);
page.getBody().type(KeyboardEvent.DOM_VK_PAGE_DOWN);
Thread.sleep(3000);
String html = page.asXml();
But so far, I have not been able to get the correct src URL.
If anyone has successfully fixed this lazy loading issue, please suggest some workarounds.
thank you!
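One possible workaround, offered only as a sketch: since the real URL already sits in the imgurl attribute, it may be enough to read that attribute directly instead of waiting for the script to copy it into src. The example below uses the JDK's DOM parser and assumes the string from page.asXml() is well-formed XML, which may not hold for every page. The class name LazyImageUrls and the sample markup are invented; imgurl is the attribute name described in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: resolves lazily loaded image URLs from serialized markup.
final class LazyImageUrls {

    // For each <img>, prefer the imgurl attribute and fall back to src.
    static List<String> imageUrls(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        NodeList images = doc.getElementsByTagName("img");
        List<String> urls = new ArrayList<>();
        for (int i = 0; i < images.getLength(); i++) {
            Element img = (Element) images.item(i);
            // getAttribute returns an empty string when the attribute is absent.
            String lazy = img.getAttribute("imgurl");
            urls.add(lazy.isEmpty() ? img.getAttribute("src") : lazy);
        }
        return urls;
    }

    public static void main(String[] args) throws Exception {
        // Made-up markup standing in for page.asXml().
        String xml = "<div><img src=\"placeholder.gif\" imgurl=\"http://example.com/a.jpg\"/>"
                + "<img src=\"http://example.com/b.jpg\"/></div>";
        System.out.println(imageUrls(xml)); // prints [http://example.com/a.jpg, http://example.com/b.jpg]
    }
}
```

Within HtmlUnit itself, the same fallback could probably be applied to the elements returned by page.getByXPath("//img"), without going through a string at all.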

Need a Hyperlink control to do several things at once

On my site I have a DataList full of image thumbnails. The thumbnails are HyperLink controls that, when clicked, offer an enlarged view of the source image (stored in my database).
My client wants a facebook Like button on each image and I was hoping to put that in the lightbox window that appears when you click on a thumbnail.
My challenge here is that to generate the info for the Like, I need to create meta tags, and each image should, preferably, create its own meta tags on the fly.
What I can't figure out is how to make the HyperLink click open the lightbox AND create the meta tags at the same time.
Any help will be greatly appreciated.
For a live view of the site, go to http://www.dossier.co.za
The way that we approach similar problems is to hook the onclick event of the href in javascript.
Depending on exactly what you need to do, you can even prevent the standard browser behavior for the hyperlink from executing by returning false from the javascript method.
And in some cases, we just use the hyperlink for "show" by setting the href to "#".
Here is an example that combines all of these concepts:
File Name
In this case, the specified javascript is executed, there is no real hyperlink, and the browser doesn't try to navigate to the specified URL because we return false in the javascript.
Add a class name to the opening table tag, like class="tbl_images", so we can use jQuery to access it. Capture the click on the td and pick up the id of the item. Pass that id to your code as required to generate your meta tags. In the following, when the user clicks on an anchor in a td, a function will run.
I use this all the time to access attributes in the td so I can run a function. You could use something like this to pick up values from your image/anchor and create something...
$(".tbl_images > tbody > tr").each(function () {
    //get the id of the tr (if required)
    var id = $(this).attr("id");
    var ImageTitle = $(this).find("img.Image_Class_Name").attr("title");
    //on click of anchor with classname of lighthouse run function,
    //passing in our id or other data in the row to our function
    $(this).find("td > a.lighthouse").click(function () {
        //update script for this record
        MyFunction(id, ImageTitle);
    });
});