Why does Scrapy return a different date from a <time> HTML tag than the developer tools? - scrapy

I queried the HTML node where the date of an article is stored. When scraping the site, I noticed that the date in the datetime attribute differs from the text inside the node. In Google Chrome's developer tools, the datetime attribute matches the displayed text.
My question is: why does Scrapy get a different datetime attribute than the developer tools? And can I somehow get the correct date from the datetime attribute?
This is the code and the return value:
response.xpath("//*[@class='a20-news-date']/time").getall()
['<time datetime="2021-11-15T08:17:20+01:00">Sonntag, 08.03.2020 // 17:20 Uhr</time>']
Chrome's developer tools display the node as:
<div class="a20-news-date">
<time datetime="2020-03-08T17:20:16+01:00">8. März 2020</time>
</div>

If you check the HTML source (Ctrl+U), you'll find that there are several <time> elements in the page. What you see in DevTools is the resulting DOM after JavaScript has run. In the source HTML, your target element is located inside the <article> tag:
response.xpath("//article//time/text()").get()
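Once you have the text of that element, you probably want a real date object rather than a string. Here is a minimal sketch using only the standard library. It assumes the text always has the shape shown in the question ("Sonntag, 08.03.2020 // 17:20 Uhr"); the function name parse_article_date is made up for illustration.

```python
from datetime import datetime


def parse_article_date(text: str) -> datetime:
    """Parse strings like 'Sonntag, 08.03.2020 // 17:20 Uhr' into a naive datetime."""
    # Drop the weekday name before the first comma so no German locale is needed.
    rest = text.partition(",")[2] or text
    # Remove the trailing 'Uhr' and any surrounding whitespace.
    cleaned = rest.replace("Uhr", "").strip()
    return datetime.strptime(cleaned, "%d.%m.%Y // %H:%M")


print(parse_article_date("Sonntag, 08.03.2020 // 17:20 Uhr"))
```

The result carries no timezone information, because the visible text does not include one.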

Related

htmx - format date for browser locale

I've tried the following to format a date in the locale of the browser:
<script>document.write((new Date(2021, 4, 14)).toLocaleString().split(",")[0])</script>
However, based on the question "Document.write clears page", it seems to be writing after the document stream is closed, thereby opening a new stream and replacing the content on my page.
Using htmx is there a recommended way of formatting dates to the browser locale?
Is there an htmx tag that allows me to execute this javascript safely?
This is the html I'm using to invoke htmx:
<div hx-get="/open_orders"
hx-trigger="load"
hx-target="this"
hx-swap="outerHTML">
<img class="htmx-indicator"
src="[[=URL('static', 'images/spinner.gif')]]"
height="20"/>
</div>
-Jim
As you mentioned, document.write() does not play well with htmx. This is true for most front-end libraries/toolkits/frameworks that want to control what is displayed in the browser window.
There are a number of other ways you could do this:
Try rendering the time on your server and simply displaying the value via htmx. This library works best when you put the server in charge whenever you can. I would recommend starting with this, if you can, instead of rendering a date via JavaScript.
If you really need to update this information in the browser (for instance, to update the display as the data changes), write to a specific DOM element instead:
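<span id="time"></span>
<script>
// Write the locale-formatted date into the span instead of calling document.write().
document.getElementById('time').textContent = new Date(2021, 4, 14).toLocaleDateString();
</script>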
You can also hook into the wide range of events that htmx triggers. This works well if you want to update information in the browser whenever htmx does something. For instance, you can update the displayed date/time whenever htmx loads a new HTML fragment into the DOM.
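The first suggestion, rendering the date on the server, might look like the sketch below. It assumes the server can read the browser's Accept-Language header. The pattern table and both function names are made up for illustration and are not part of htmx.

```python
from datetime import date

# Hypothetical mapping from a language tag to a strftime pattern; extend as needed.
DATE_PATTERNS = {
    "en-US": "%m/%d/%Y",
    "en-GB": "%d/%m/%Y",
    "de": "%d.%m.%Y",
}
DEFAULT_PATTERN = "%Y-%m-%d"


def pick_pattern(accept_language: str) -> str:
    """Return the pattern for the first listed language tag that we know about."""
    for part in accept_language.split(","):
        tag = part.split(";")[0].strip()  # drop any ';q=0.8' weight
        if tag in DATE_PATTERNS:
            return DATE_PATTERNS[tag]
        primary = tag.split("-")[0]  # fall back from 'de-DE' to 'de'
        if primary in DATE_PATTERNS:
            return DATE_PATTERNS[primary]
    return DEFAULT_PATTERN


def format_for_browser(day: date, accept_language: str) -> str:
    """Format a date using the pattern chosen from the Accept-Language header."""
    return day.strftime(pick_pattern(accept_language))


print(format_for_browser(date(2021, 5, 14), "de-DE,de;q=0.9,en;q=0.8"))
```

The formatted string can then be placed straight into the HTML fragment that the hx-get endpoint returns, so no JavaScript runs in the browser. Note that this simple version takes languages in listed order and ignores the q weights.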

UI Automation - Elements on my UI have Ember ids, which change frequently when new UI elements are added. How can I locate them reliably for automation?

Example of the HTML of a dropdown element:
<div aria-owns="ember-basic-dropdown-content-ember1234" tabindex="0" data-ebd-id="ember1234-trigger" role="button" id="ember1235" class="ember-power-select-trigger ember-basic-dropdown-trigger ember-view"> <!---->
<span class="ember-power-select-status-icon"></span>
</div>
The xpath and CSS selector also contain the same ember id.
xpath : //*[@id="ember1235"]
css selector : #ember1235
The ember id would change from id="ember1235" to say, id="ember1265" when there is a change in the UI.
I am using id to locate the element. But every time it changes I need to modify the code. Is there any other attribute I could use for Ember JS UI elements?
There is quite a lot to discuss in your question but hopefully we will have a good answer for you @PriyaK
The first thing to mention is that Ember IDs may not be the best way to select an element in the DOM. As you have already mentioned, they can change from time to time, and they don't give you a semantically meaningful thing to select in your Selenium test, so the selector might seem out of context when you look back at it.
One thing that you could try is to pass a class to the ember-power-select component (the one that provides the HTML that you used in your example) and use that to select the element, something like:
<PowerSelect
@class="my-fancy-class"
as |name|
>
{{name}}
</PowerSelect>
Then you should be able to select the selected value by using the CSS selector .my-fancy-class span (because the component outputs the selected value in a span)
We just tried this in an example app but it didn't actually work 🤔 Never fear, you can also do something like this and it should work with the same selector as before:
<div class="my-fancy-class">
<PowerSelect as |name|>
{{name}}
</PowerSelect>
</div>
This is fine, but there are also a few issues using classes for selectors in tests. One example of a problem that might crop up is that your tests might all suddenly stop working if you did a style refactor and changed or removed some of the classes on your elements. One technique that has become popular in the Ember community is to use data-test- attributes on your DOM nodes like this:
<div data-test-my-fancy-select>
<PowerSelect
@class="my-fancy-class"
as |name|
>
{{name}}
</PowerSelect>
</div>
which can then be accessed by the following selector: [data-test-my-fancy-select] span. This is great for a few reasons! Firstly, it separates the implementation of your application and tests from your styling, which avoids the issue I described above. Secondly, with the ember-test-selectors package that @Gokul suggested in the comments, you can use these data-test- selectors in your development and test environments while they are automatically removed from your production build. This keeps your DOM clean in production and, depending on the size of your application, could also noticeably reduce the aggregate size of your templates.
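If you end up with many of these attributes, a tiny helper keeps the selectors consistent across your tests. This is only a sketch: data_test_selector is a hypothetical name, and the Selenium call in the trailing comment assumes the Python bindings and an already running driver.

```python
def data_test_selector(name: str, descendant: str = "") -> str:
    """Build a CSS selector such as '[data-test-my-fancy-select] span'."""
    selector = f"[data-test-{name}]"
    # Append an optional descendant part, e.g. the span holding the selected value.
    return f"{selector} {descendant}" if descendant else selector


SELECTED_VALUE = data_test_selector("my-fancy-select", "span")
print(SELECTED_VALUE)

# Hypothetical usage with Selenium's Python bindings (needs a running driver):
#   from selenium.webdriver.common.by import By
#   element = driver.find_element(By.CSS_SELECTOR, SELECTED_VALUE)
```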
I know you say that you are using Selenium for your testing, but it's also worth mentioning that if you use the built-in Ember testing system you can make use of testing helpers that addons may provide. ember-power-select is one of the addons that provides specific testing helpers, and you can read more about them in its documentation: https://ember-power-select.com/docs/test-helpers
I hope this answers any questions you had!
This question was answered as part of "May I Ask a Question" Season 3 Episode 1. If you would like to see us discuss this answer in full you can check out the video here: https://www.youtube.com/watch?v=1DAJXUucnQU

How to get fully qualified url with selenium on a link without any href attribute?

I would like to retrieve the URL from a link on an HTML page.
Unfortunately, the HTML code does not contain any href attribute (I suppose the link is handled by some JavaScript code).
Here is the HTML code:
<p class="ng-scope">
<a class="documentLink ng-binding" data-document-id="21928499">Electronic document</a>
</p>
I tried to do it with the getAttribute() function:
By linkPodPopover = new ByXpath("//div[@class='popover-content']//a[contains(.,'Electronic document')]");
find(linkPodPopover).getAttribute("href");
but it returns an empty String...
I also tried this code, again without success:
driver.getCurrentUrl()
click(linkPodPopover)
Do you see another way?
I did not find any answer on the internet.
And I tried to explore every JavaScript attribute of the DOM element of my link without finding the URL.
Finally, I got around this problem by using BrowserStack's functionality: http://browserstack.com/automate/java#enhancements-uploads-downloads
It lets me click the download link so that the browser downloads the file. Then, using JavaScript, I can check that the file was downloaded and that its size and MD5 are correct.
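For anyone who downloads to a local folder instead of BrowserStack, the same size and MD5 check could be sketched like this. It is not the BrowserStack API, just the standard library; file_matches and its parameters are names made up for illustration.

```python
import hashlib
import os


def file_matches(path: str, expected_size: int, expected_md5: str) -> bool:
    """Return True if the file at path has the expected size and MD5 digest."""
    # Cheap check first: a size mismatch means there is no need to hash.
    if os.path.getsize(path) != expected_size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large downloads are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5.lower()
```

You would call it after the browser reports the download as finished, passing the size and digest that you expect for the document.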

How to get Inspect Element code using Selenium WebDriver

I'm working with Selenium and the Firefox browser.
The HTML shown in View Source (Ctrl+U) is different from the HTML I see when inspecting elements in Firefox.
When I run driver.getPageSource(), I only get the View Source (Ctrl+U) markup.
Is there any way to access the Inspect Element markup instead of the View Source markup?
I think your question is answered here.
The View Source html is what is sent by the server. I think of it as compile time html, or the initial state of the DOM.
The Inspect Element html could have been updated by ajax responses or javascript so will not necessarily be the same. I think of it as runtime html, or the current state of the DOM.
The GetAttribute() method queries the current DOM element state. You can return a particular html attribute value directly
webElement.GetAttribute("class")
or get the whole html string.
webElement.GetAttribute("innerHTML")
There are some fundamental differences between the markup shown through View Source (Ctrl+U) and the markup shown through the Inspector (Ctrl+Shift+I).
Both are browser features that let users look at the HTML of a webpage. The main difference is that View Source shows the HTML that was delivered from the web server (application server) to the browser, whereas Inspect Element is a developer tool (e.g. Chrome DevTools) that shows the state of the DOM tree after the browser has applied its error correction and after any JavaScript has manipulated the DOM. Some of those activities may include:
HTML error correction by the browser
HTML normalization by the browser
DOM manipulation by Javascript
In short, View Source shows the markup exactly as delivered, including the script source, but not the changes those scripts make to the DOM. HTML errors may also appear corrected in the Inspect Element tool. As an example:
In View Source you may observe:
<h1>The title</h2>
Whereas in Inspect Element it would have been corrected to:
<h1>The title</h1>
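getPageSource() returns the source of the last loaded page. Selenium does not guarantee that it reflects changes made by JavaScript after loading, so it may match View Source rather than Inspect Element.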
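To see that the delivered markup really is uncorrected, you can feed the example to a tokenizer that does not build a DOM. The sketch below uses Python's standard html.parser, which reports tags as written and repairs nothing; TagLogger and raw_tags are names made up for illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record tags exactly as they appear in the raw markup, without repairing them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))


def raw_tags(markup: str):
    """Return the list of (kind, tag) events found in the markup."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


print(raw_tags("<h1>The title</h2>"))
```

The mismatched h2 end tag is still there, which is what View Source shows; a browser's DOM would contain a properly closed h1 instead.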

DHTML - Change text on page using input field

I need to write code that changes an example text to a user-defined value as the user types in an input field (similar to the preview field when writing a question on Stack Overflow).
This needs to be achieved without HTML5 or Flash, as the users will be running IE8 and not all of them will have the Flash plug-in installed.
As such I have started by looking at DHTML to achieve the desired effect. Currently I can change the example text when a user types in the input field, but only to a pre-defined value ("Example" in the code below). How should I edit this code to display the user-defined value?
JS
function changetext(id)
{
id.innerHTML="Example";
}
HTML
<form>
Content:<input type="text" id="input" onkeyup="changetext(preview)" />
</form>
<p id="preview">No content found</p>
You need to have something like this in the function:
function changetext(id){
var info = document.getElementById('input').value;
id.innerHTML = info;
}
This JS is only a rough sketch. I would highly recommend you start using a JavaScript library like jQuery. It makes this a trivial task.
Edited:
jQuery will work in IE8 just fine. With jQuery you will not need to attach JS to your input in the HTML. The code would look like this:
// Update the preview on every keystroke in the input field.
$('#input').keyup(function(){
$('#preview').text($(this).val());
});
It is a lot cleaner and doesn't put JS in the HTML.