Process HTML page without triggering internet connections - vba

page.html is a saved web page that contains many scripts (even when I disable JavaScript in the Internet Explorer options, the page still manages to load trackers when it is rendered).
Every time I call findElement(), my local proxy server shows three requests to tracking sites, so the code below triggers nine unnecessary requests in total.
HTML := FileRead("page.html")
findElement(HTML, findStr) {
    HTMLFile := ComObjCreate("HTMLfile")
    HTMLFile.write(HTML)
    DOMElements := HTMLFile.getElementsByTagName(findStr)
    return DOMElements
}
findElement(HTML, "div")
findElement(HTML, "a")
findElement(HTML, "span")
My goal is to find elements on this page without triggering any internet connections. Is there a solution for this, or am I forced to use regex to process the HTML page?

Related

How to open a page with Phantomjs without running js or making subsequent requests?

Is there a way to just load the server generated HTML (without any js or images)?
The docs seem a little sparse
The strength of PhantomJS is precisely its ability to emulate a real browser, which opens a page and makes all the subsequent requests. If you just want the HTML, it may be better to use curl or wget.
Nevertheless, there is a way to avoid running JS or loading images: set the corresponding page settings (http://phantomjs.org/api/webpage/property/settings.html):
page.settings.javascriptEnabled = false;
page.settings.loadImages = false;
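For context, here is a minimal sketch of how those two settings might fit into a complete PhantomJS script. The helper name disableScriptsAndImages and the example.com URL are placeholders, not part of the answer. The settings are applied before page.open, since the PhantomJS documentation notes that they take effect for the initial open call. Other resource types, such as stylesheets, are not covered by these two flags.

```javascript
// Apply the two settings from the answer to any object shaped like a
// PhantomJS WebPage. Taking `page` as a parameter keeps this helper
// usable (and testable) outside PhantomJS. The helper name is hypothetical.
function disableScriptsAndImages(page) {
  page.settings.javascriptEnabled = false; // do not execute page scripts
  page.settings.loadImages = false;        // do not fetch inline images
  return page;
}

// This part only runs inside PhantomJS, where the global `phantom` exists.
if (typeof phantom !== 'undefined') {
  var url = 'http://example.com/'; // placeholder URL
  var page = disableScriptsAndImages(require('webpage').create());

  page.open(url, function (status) {
    if (status === 'success') {
      console.log(page.content); // HTML of the loaded page
    } else {
      console.log('Failed to load ' + url);
    }
    phantom.exit();
  });
}
```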

capturing a browser refreshed event using Selenium Web Driver

I am writing a program to automate link validation on a site. Our site has more than 400 links per page, and we need to open each link and check that it returns a valid page (i.e. HTTP 200); there are other requirements as well, such as checking whether the page is a 404 redirection page. This means that validating 400 links takes about 30 minutes.
My design is to integrate this with the front-end (Selenium) automation so that each time the browser loads a new page or refreshes, it triggers a new thread that receives the page source and validates all the hrefs in it.
We are not following a page object model; otherwise I could trigger this in each of my pages.
My question: is there any way to listen for a browser refresh or page load event using Selenium WebDriver?
Correct me if I have misunderstood your question, but a page refresh and a page load can be two very different events if you are dealing with AJAX. You can try this article about the AJAX part
and this one for selenium custom events synchronization.
This solution here is the most recent one I could find.
Selenium can also execute JavaScript in the page, so these answers may be helpful if you want to try that approach:
check-if-page-reloaded-or-refresh-in-js
is-page-reloaded-or-refreshed-using-jquery-or-javascript
post_detect_refresh_with_javascript

msapplication-starturl ignored in modern Windows UI

Internet Explorer 10 and 11 on the desktop (“classic”) respect the msapplication-starturl meta tag, which lets me specify what URL to use when a user pins my site to their taskbar in Windows.
In the modern Windows UI (“Metro”), however, the meta tag is ignored: the current page URL is used instead of the starturl.
I’ve used the msapplication-starturl URL to track how many users access my site through pinning (by appending a campaign token to the URL). Does anyone have a clever workaround for tracking incoming users from the modern Windows UI?
Use JS in one of these two ways to track users pinning your site to the Start Screen.
SiteMode
http://msdn.microsoft.com/en-us/library/ie/gg491733(v=vs.85).aspx
This function returns true if the user has navigated to your site from the Start Screen. You can increment your counter every time it returns true:
if (window.external.msIsSiteMode()) {
    // Add 1 to your counter
}
mssitepinned
This works with pinning in Immersive IE11 (but not in Immersive IE10).
You can use this event to count how often users perform the pinning action, which gives you an absolute count of how many times your site has been pinned:
document.addEventListener('mssitepinned', IncrementCounter, false);
function IncrementCounter() {
    // Add 1 to your counter
}
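One caveat with the SiteMode check: in browsers that do not provide window.external.msIsSiteMode, calling it directly throws. A guarded variant might look like the sketch below. The helper names isPinnedSiteMode and countPinnedVisit are made up for illustration, and the counter callback stands in for whatever tracking call you use.

```javascript
// Wraps the check from the answer so it degrades to `false` in browsers
// where window.external.msIsSiteMode is missing or throws.
// `win` is passed in (normally the global `window`) to keep this testable.
function isPinnedSiteMode(win) {
  try {
    return !!win.external.msIsSiteMode();
  } catch (e) {
    return false;
  }
}

// Hypothetical helper: calls `incrementCounter` once if the visit came
// from a pinned site, and reports whether it did.
function countPinnedVisit(win, incrementCounter) {
  var pinned = isPinnedSiteMode(win);
  if (pinned) {
    incrementCounter();
  }
  return pinned;
}
```

In a page this would be called as countPinnedVisit(window, IncrementCounter).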

How to call chrome function from browser content page in xulrunner

I am converting code that originally ran as remote signed jar files in Firefox to use XULRunner instead. There are several reports that are implemented as web pages with an output option. Options include an HTML page or a report viewer that is written in XUL and Javascript.
When the user submits the form, and the report viewer is selected, then I need to open a chrome window. Obviously this cannot be done directly for security reasons. I want to provide a function or use some sort of message passing method to signal to the containing chrome what needs to happen.
Can this be done and if so how? Things I am considering:
1) Adding a function to the content window's window or document object
2) Some sort of message passing function
3) Some sort of custom event send/receive
4) A special URL form with a handler such as repviewer://repname/parameters
There is a quite elaborate article on this topic on MDN. The best way to achieve this without jeopardizing security is to send a generic event from your web page. The top XUL document should call addEventListener() with the fourth parameter set to true, which allows it to receive such untrusted events. Data can be passed through an attribute of the event target; the XUL document can then inspect that attribute.
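A rough sketch of that pattern follows. The event name RepViewerRequest, the carrier element name ReportRequestData, the data-report attribute and both function names are invented for illustration. The document objects are passed in as parameters so that the same code can be pointed at the content document on one side and the chrome document on the other. Because the event is untrusted, the chrome side validates the attribute before acting on it.

```javascript
// All names below are hypothetical and only illustrate the pattern.
var EVENT_NAME = 'RepViewerRequest';

// Content side (unprivileged web page): attach the data to an element
// as an attribute, then fire a plain DOM event from that element.
function requestReportViewer(doc, reportName) {
  var carrier = doc.createElement('ReportRequestData');
  carrier.setAttribute('data-report', reportName);
  doc.documentElement.appendChild(carrier);

  var evt = doc.createEvent('Events');
  evt.initEvent(EVENT_NAME, true, false); // bubbles, not cancelable
  carrier.dispatchEvent(evt);
}

// Chrome side (top XUL document): the fourth addEventListener argument
// set to true is the Gecko-specific flag that lets the listener receive
// untrusted events dispatched by content.
function listenForReportRequests(chromeDoc, openViewer) {
  chromeDoc.addEventListener(EVENT_NAME, function (event) {
    var reportName = event.target.getAttribute('data-report');
    // Treat the value as untrusted input: accept simple names only.
    if (typeof reportName === 'string' && /^[\w-]+$/.test(reportName)) {
      openViewer(reportName);
    }
  }, false, true);
}
```

The openViewer callback is where the chrome code would open the report viewer window; keeping that decision on the privileged side means the page can only ask for a report by name.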

How can I block based on URL (from address bar) in a safari extension

I'm trying to write an extension that will block access to a (configurable) list of URLs if they are accessed more than N times per hour. From what I understand, I need to have a start script pass a "should I load this" message to a global HTML page (which can access the settings object to get the list of URLs), which will give a thumbs up/thumbs down message back to the start script to allow/deny loading.
That works out fine for me, but when I use the usual beforeLoad/canLoad handlers, I get messages for all the sub-items that need to be loaded (images/etc..), which screws up the #accesses/hour limit I'm trying to make.
Is there a way to synchronously pass messages back and forth between the two sandboxes so I can tell the global HTML page, "this is the URL in the window bar and the timestamp for when this request came in", so I can limit duplicate requests?
Thanks!
You could use a different message for the function that checks whether to allow the page to load, rather than using the same message as for your beforeLoad handler. For example, in the injected script (which must be a "start" script), put:
safari.self.tab.dispatchMessage('pageIsLoading');
And in the global script:
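function handleMessage(event) {
    if (event.name == 'pageIsLoading') {
        if (event.target.url.indexOf('forbidden.site.com') > -1) {
            console.log(event.timeStamp);
            event.target.url = 'about:blank';
        }
    }
}
// Register the handler so the global page receives messages from injected scripts
safari.application.addEventListener('message', handleMessage, false);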
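The handler above blocks a fixed site on every visit. The "more than N times per hour" rule from the question could be layered on top with a small counter kept in the global page. The sketch below is one possible shape, not part of the answer: createVisitLimiter and shouldBlock are hypothetical names, and it assumes the caller supplies a host string and a millisecond timestamp such as Date.now().

```javascript
var ONE_HOUR_MS = 60 * 60 * 1000;

// Hypothetical helper: returns a function that records one visit to
// `host` at time `now` (in ms) and reports whether the host has now been
// visited more than `maxPerHour` times within the last hour.
function createVisitLimiter(maxPerHour) {
  var visits = Object.create(null); // host -> array of visit timestamps

  return function shouldBlock(host, now) {
    // Keep only the visits that are still inside the one-hour window.
    var recent = (visits[host] || []).filter(function (t) {
      return now - t < ONE_HOUR_MS;
    });
    recent.push(now);
    visits[host] = recent;
    return recent.length > maxPerHour;
  };
}
```

Since the injected script sends pageIsLoading once per page rather than once per sub-resource, calling shouldBlock from inside handleMessage would count page visits only, which is the number the limit is meant to apply to.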