Geb: how can I read text without any interpretation?

The .text() operator interprets or eliminates the following:
Runs of consecutive space characters
Character entities such as &nbsp;
Inline markup such as <a href="" target="_blank">
How can I read the text block between
<p> ... </p>
without any interpretation?

Unfortunately, this is not possible. Geb uses WebDriver's WebElement.getText() method under the hood, and WebDriver's philosophy for text is to return only what would be visible to a human, exactly as it is displayed.
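As a rough illustration of the kinds of interpretation the question lists, the Python sketch below contrasts a raw markup string with an approximation of its visible text. It is not WebDriver's actual getText() algorithm; the function name approximate_visible_text and the sample markup are assumptions made for this example.

```python
import html
import re


def approximate_visible_text(markup: str) -> str:
    """Very rough stand-in for 'text as a human would see it' (an assumption, not WebDriver's rules)."""
    without_tags = re.sub(r"<[^>]+>", "", markup)  # drop inline markup such as <a ...>
    decoded = html.unescape(without_tags)          # &nbsp; becomes U+00A0, &amp; becomes &
    return re.sub(r"\s+", " ", decoded).strip()    # collapse whitespace runs, including U+00A0


# Hypothetical paragraph containing all three inclusions from the question.
raw = '<p>one   two&nbsp;&nbsp;three <a href="" target="_blank">link</a></p>'
print(repr(raw))
print(repr(approximate_visible_text(raw)))
```

The first printed value is what "no interpretation" would look like; the second shows how much information is lost once spaces, entities, and inline tags are normalized away.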


Selenium XPath find element where second text child element contains certain text (use contains on array item)

The page contains a multi-select dropdown (similar to the one below)
The html code looks like the below:
<div class="button-and-dropdown-div">
<button class="Multi-Select-Button">multi-select button</button>
<div class="dropdown-containing-options">
<label class="dropdown-item">
<input class="checkbox">
<label class="dropdown-item">
<input class="checkbox">
After testing in firefox developer tools, I was finally able to figure out the xPath needed in order to get the text for a certain label ...
The below XPath statement will return the text "Phone"
Each label element contains multiple text items, although it looks like there is just one text object when looking at the UI. There are actually two text nodes within each label element. The first is always empty; the second contains the actual text (as shown in the below image when observing the element through the console window of the Firefox developer tools):
How do I modify the XPath shown above in order to use in Selenium's FindElement?
I know how to use contains(), but apparently not with more complex XPath statements. I was pretty sure one of the below would work, but they did not (the developer tools complain of a syntax error):
$x("(//label[@class='dropdown-item' and text()[2][contains(., 'Name')]]")
$x("(//label[@class='dropdown-item' and contains(text()[2], 'Name')]")
I am using contains() in order to avoid whitespace mismatches.
Additional notes for learning purposes (good for XPath debugging):
Just in case anyone who is new to XPath comes across this, I wanted to show what the data structure of these label objects looks like. You can explore the structure of objects within your web page by using the Firefox Console window within the developer tools (F12). As you can see, the label element contains three sub-items: a text node which is empty, then the input checkbox, then another text node which holds the actual text (not ideal). In the picture below, you can see the part of the web page that corresponds to the label data structure.
If you are looking to find the element that contains "Name" given the HTML above, you can use
I finally got it to work. The Firefox developer tools were right to report a syntax problem with the XPath strings: each one began with an opening parenthesis that was never closed.
The following XPath string finally returned the desired result:
$x("//label[@class='dropdown-item' and contains(text()[2], 'Name')]")
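To make the "second text node" idea concrete without a browser, here is a minimal Python sketch using the standard library's ElementTree. ElementTree's XPath subset cannot evaluate text()[2], so the text nodes are walked explicitly. The markup is an assumption: it is the question's snippet with closing tags added, the input self-closed so it parses as XML, and the label texts "Name" and "Phone" placed after the checkbox. The helper names direct_text_nodes and find_label are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed markup: the question's snippet, closed and made well-formed XML.
SNIPPET = """
<div class="dropdown-containing-options">
  <label class="dropdown-item">
    <input class="checkbox"/>Name
  </label>
  <label class="dropdown-item">
    <input class="checkbox"/>Phone
  </label>
</div>
"""


def direct_text_nodes(element):
    """Text-node children of element, in document order (text before and after each child)."""
    candidates = [element.text] + [child.tail for child in element]
    return [t for t in candidates if t]


def find_label(root, needle):
    """Rough equivalent of //label[@class='dropdown-item' and contains(text()[2], needle)]."""
    for label in root.iter("label"):
        texts = direct_text_nodes(label)
        if label.get("class") == "dropdown-item" and len(texts) >= 2 and needle in texts[1]:
            return label
    return None


root = ET.fromstring(SNIPPET)
label = find_label(root, "Name")
# First text node is whitespace only; the second carries the visible label text.
print([t.strip() for t in direct_text_nodes(label)])
```

The first text node is only the whitespace before the checkbox, which is why the XPath has to index the second one and why contains() is more forgiving than an exact comparison.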

Need to keep <br> in text block tags while using

I'm trying to do something relatively straightforward. I'm scraping text, which so far I have had no problem grabbing, but I need to keep the <br> tags because whitespace analysis is an important part of the dataset.
Is there a way to keep the <br> tags so I can turn them into \n\r later on?
<span>Some text.</br></span>
<a>Some more text.<br></a>
<span>Some more more text.<br></span>
I need : Some text.<br>Some more text.<br>Some more more text.<br>
Right now I get: Some text. Some more text. Some more more text.
The only way is to get the HTML of your selection: all you have to do is change the column type from Text to HTML. There is no way to get only the text plus the <br> tags.
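Once the HTML column is exported, the remaining tags can be stripped in a post-processing step. The following Python sketch is one possible way to do that with the standard library's html.parser; the class name KeepBreaks and the function text_with_breaks are hypothetical, and the sample input is the three lines from the question joined together.

```python
from html.parser import HTMLParser


class KeepBreaks(HTMLParser):
    """Collects text content and re-emits every br tag as a literal <br>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "br":                 # <br>
            self.parts.append("<br>")

    def handle_endtag(self, tag):
        if tag == "br":                 # the stray </br> form seen in the sample
            self.parts.append("<br>")

    def handle_startendtag(self, tag, attrs):
        if tag == "br":                 # <br/>; overridden so it is emitted only once
            self.parts.append("<br>")

    def handle_data(self, data):
        self.parts.append(data)


def text_with_breaks(markup: str) -> str:
    parser = KeepBreaks()
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)


SAMPLE = (
    "<span>Some text.</br></span>"
    "<a>Some more text.<br></a>"
    "<span>Some more more text.<br></span>"
)
print(text_with_breaks(SAMPLE))
```

From there, replacing "<br>" with whatever line ending the analysis needs is a single str.replace call.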

How to get text from a text node without getting the content of its siblings

I have following code
<p>some paragraph</p>
some nasty text that I need
<span>something else</span>
Now I need to get only "some nasty text that I need". How can I do it using only XPath 1.0? Is it possible?
Yes - and it's rather trivial:
I wonder why you did not try that? All other text nodes are either in a p or span element and should not cause you any trouble.
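The point that the wanted text is a direct child of the parent, while all other text sits inside p or span, can be sketched in Python with the standard library's ElementTree, where text following a child element is exposed as that child's tail. The wrapping div is a hypothetical parent (the question shows only the inner fragment), and own_text is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical wrapper element around the fragment from the question.
FRAGMENT = """<div><p>some paragraph</p>
some nasty text that I need
<span>something else</span></div>"""


def own_text(parent):
    """Text that belongs directly to parent, excluding text inside its child elements."""
    pieces = [parent.text or ""] + [child.tail or "" for child in parent]
    return " ".join("".join(pieces).split())  # join the pieces, then normalize whitespace


root = ET.fromstring(FRAGMENT)
print(own_text(root))
```

The text inside p and span never appears because it belongs to those elements, not to the parent.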

How to verify text across HTML elements in Selenium

Given the following code, how would I verify the text within using Selenium?
<div class='my-text-block'>
<p>My first paragraph of text</p>
<p>My second paragraph of text</p>
</div>
I want to capture all the text in one verifyText statement:
My first paragraph of text
My second paragraph of text
Is it possible?
Since you've tagged this with selenium-webdriver, I'm assuming you want a code example, but because you haven't stated which language you're using, I'll give you a Python example. It should be easy to translate into a different language if needed.
from selenium.webdriver.common.by import By
assert driver.find_element(By.CLASS_NAME, "my-text-block").text == "What I expect it to be"
The text attribute on a WebElement object simply contains all visible text within that element and all of its child elements.
And some lovely docs, of course.
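A slightly more forgiving comparison can be sketched as below. It assumes that the two paragraphs show up on separate lines in the element's text, which is the usual behaviour for block-level elements but is worth confirming in your browser. The helpers visible_lines and verify_text are hypothetical, and a SimpleNamespace stands in for a real WebElement so the sketch runs without a browser.

```python
from types import SimpleNamespace


def visible_lines(element):
    """Non-empty, stripped lines of element.text (element is anything with a .text attribute)."""
    return [line.strip() for line in element.text.splitlines() if line.strip()]


def verify_text(element, expected_lines):
    """Raise AssertionError unless the element's visible lines match expected_lines exactly."""
    actual = visible_lines(element)
    assert actual == expected_lines, f"expected {expected_lines!r}, got {actual!r}"


# Stand-in for driver.find_element(By.CLASS_NAME, "my-text-block"); the text value is assumed.
fake_div = SimpleNamespace(text="My first paragraph of text\nMy second paragraph of text")
verify_text(fake_div, ["My first paragraph of text", "My second paragraph of text"])
```

Comparing line by line avoids failures caused by incidental blank lines or leading and trailing whitespace.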

How to verify Local language in UI using Selenium

I am new to testing and have a question about how can we verify the local language using Selenium. Suppose that I have some link which lets me choose the language, so if I choose a language how can I verify that?
By anything on the page! Let's assume our page is only made of one element, either
It's time to kick ass and chew bubble gum!
Il est temps de botter le cul et mâcher de la gomme à bulles!
Then, with Selenium, you can get the element, get the inner text and verify! Example in Java:
// assuming driver is a good WebDriver instance
WebElement elem = driver.findElement(By.xpath("//div"));
if (elem.getText().contains("kick ass")) {
    System.out.println("It's English!");
} else if (elem.getText().contains("botter le cul")) {
    System.out.println("It's French!");
}
As of Java 7, you can even use a switch statement on a String, so you can test against many, many languages more concisely.
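When the list of languages grows, a lookup table can replace the chain of if/else branches. The Python sketch below shows the idea; the MARKERS table and the detect_language function are assumptions for illustration, and the page text would come from the element's text in a real test.

```python
from typing import Dict, Optional

# Hypothetical marker phrases: one distinctive substring per supported language.
MARKERS: Dict[str, str] = {
    "English": "kick ass",
    "French": "botter le cul",
}


def detect_language(page_text: str, markers: Dict[str, str] = MARKERS) -> Optional[str]:
    """Return the first language whose marker phrase occurs in page_text, else None."""
    for language, phrase in markers.items():
        if phrase in page_text:
            return language
    return None


print(detect_language("It's time to kick ass and chew bubble gum!"))
print(detect_language("Il est temps de botter le cul et mâcher de la gomme à bulles!"))
```

Adding a language then means adding one entry to the table rather than another branch.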
Well, you can:
Search for the flag of that language on the page
Check number formatting or time formatting (it does change in some locales)
Verify that some well-known text (like a greeting) has changed, say from Good evening to Dobrý večer (that was Czech ;) )
We use dictionary files for l10n. A dictionary is an XML file with strings stored like <string key="Cancel">Cancel</string>.
So for testing purposes we replace every </string> with something like ###</string> and replace all spaces between <string key="..."> and </string> with '_' (this can be done in Notepad++ with Ctrl+H; the spaces can be replaced with the help of this answer, for example). Then we select the language that corresponds to the modified dictionary on our website. The next step is done by a Selenium IDE script: it browses through the whole website and stores each page's text using the Selenium IDE command storeBodyText | text. Then we parse the text, echo all words that do not end with ###, and check whether each one is acceptable or is a hard-coded word that should have been translated. It's not pure automation, but it's better than nothing :) I think this approach can be applied not only to XML dictionary files but to whatever storage you use for your strings.
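The marking and checking steps described above can be sketched in Python as follows. This is only an illustration of the idea, not the authors' tooling: the root element name dictionary, the function names mark_dictionary and unmarked_tokens, and the sample strings are assumptions.

```python
import xml.etree.ElementTree as ET

MARK = "###"  # suffix that tags every string coming from the dictionary


def mark_dictionary(xml_text: str) -> str:
    """Replace spaces with '_' and append MARK inside every <string> element."""
    root = ET.fromstring(xml_text)
    for node in root.iter("string"):
        if node.text:
            node.text = node.text.replace(" ", "_") + MARK
    return ET.tostring(root, encoding="unicode")


def unmarked_tokens(page_text: str):
    """Whitespace-separated tokens that do not end with MARK, i.e. hard-coded candidates."""
    return [token for token in page_text.split() if not token.endswith(MARK)]


# Hypothetical dictionary file content.
DICTIONARY = (
    '<dictionary>'
    '<string key="Cancel">Cancel</string>'
    '<string key="SaveAs">Save as</string>'
    '</dictionary>'
)
print(mark_dictionary(DICTIONARY))

# Hypothetical body text as stored by storeBodyText after switching to the marked language.
print(unmarked_tokens("Cancel### Save_as### Welcome back"))
```

Tokens followed by punctuation (for example "Cancel###,") would be reported as unmarked by this simple check, so the output still needs a manual review, in line with the "not pure automation" caveat.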
PS. If you don't want to perform localization testing but just want to be sure that clicking a link applies some language (I suspect this is exactly what you are looking for), then after you click the link you can get the text from any element and compare it with the expected text (storeText(locator, variableName) for IDE).
<!-- Using Selenium IDE -->
<td>The label is ${language}</td>