How to create a test suite for broken images using a machine learning based algorithm / AI based automation - Selenium

I am working on a sizeable eCommerce web portal where there are many images on thousands of CMS-generated dynamic pages.
The location of the images on each page is fixed.
How do I create a JavaScript / machine learning based test automation bot which will skim through all of these pages and report the pages where images do not load?
Time for this test run is not a constraint, as we won't be putting this run into the CI/CD pipeline; rather, it will be a standard overnight run.

1 Start by fetching a page (load it in a browser, or make an HTTP request for it, depending on whether you need to wait for JS or not)
2 Find all <a> tags, and store each one in a Set of toVisitPages
3 Find all <img> tags on the current page
4 For each <img> tag on the current page, make an HTTP request to its src: does it 404? If so, add it to your broken list
5 Repeat from step 2 with the next page from toVisitPages (a rough sketch follows below)
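A minimal sketch of that loop, assuming Puppeteer (any browser-automation tool such as Selenium would work the same way), Node 18+ for the global fetch, and a placeholder start URL; the HEAD-request check and the same-origin filter are illustrative choices, not requirements:

```javascript
// Minimal sketch of the crawl-and-check loop above. Puppeteer and the
// start URL are assumptions, not part of the original answer.
const puppeteer = require('puppeteer');

const START_URL = 'https://example.com/';      // placeholder for your site
const ORIGIN = new URL(START_URL).origin;

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();

  const toVisit = [START_URL];                 // pages still to crawl
  const visited = new Set();
  const broken = [];                           // { page, img, status }

  while (toVisit.length > 0) {
    const url = toVisit.shift();
    if (visited.has(url)) continue;
    visited.add(url);

    // Step 1: load the page in a real browser so JS-inserted images count too.
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Step 2: collect same-origin <a> links for later visits.
    const links = await page.$$eval('a[href]', els => els.map(a => a.href));
    for (const link of links) {
      if (link.startsWith(ORIGIN) && !visited.has(link)) toVisit.push(link);
    }

    // Steps 3-4: collect <img> sources and check each with an HTTP request.
    const srcs = await page.$$eval('img[src]', els => els.map(i => i.src));
    for (const src of srcs) {
      const res = await fetch(src, { method: 'HEAD' }).catch(() => null);
      if (!res || res.status >= 400) {
        broken.push({ page: url, img: src, status: res ? res.status : 'no response' });
      }
    }
  }

  console.table(broken);                       // pages where images do not load
  await browser.close();
})();
```

Checking img.naturalWidth === 0 inside the page is an alternative to the HTTP check if you also want to catch images that return 200 but fail to decode.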

Related

Too much / infinite loop issue - Shopify

In Shopify:
When my for loop code runs, this issue occurs:
"There was a problem loading this website
Try refreshing the page.
If the site still doesn't load, please try again in a few minutes."
The console error is:
"Failed to load resource: the server responded with a status of 502 ()"
How do I fix it?
It seems your Shopify site has too many collections and products within them, so it simply fails to load all of them due to exceeding memory limits.
I'm assuming that you're trying to replicate the page from the reference URL you provided in your comment. Consider one of the options below to implement the required functionality:
Create different automated collections for each price range using the "Product price is less than" condition. This approach is good as it uses Shopify's engine to generate collections, but it still might be quite tricky to implement grouping as on the reference site you provided in the comments.
Load collections and their products using AJAX requests, i.e. request the data only when a customer scrolls the page down (a rough sketch follows below). This will improve page load speed and slightly reduce the load on the Shopify site, but it is still not ideal, as the data will still be requested on every page load and scroll event. You can improve the situation somewhat by caching results on the client side, but again, it is still not ideal.
Create a custom Shopify application that syncs products with your database. Then you can create a URL on your server that will be used as a data provider for your page. It can be requested via AJAX and return JSON with all the products, grouped by collections and matching the request parameters, e.g. price less than X.
You can go further and add a proxy extension to your app. A Shopify proxy would allow you to load a custom page directly from your server with the data from your database and render it within the Shopify site as if it were a part of it.
In general, this approach gives you more flexibility over the data to output, which can also be cached on your side to increase page load speed drastically.
Personally, I would prefer the last option.
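For the AJAX option above, a rough sketch of the lazy-loading side; the storefront's /collections/<handle>/products.json endpoint, the data-collection attribute, and the rendering markup are assumptions for illustration:

```javascript
// Rough sketch of the AJAX option (option 2). Assumes each section of the
// page is a placeholder element like <div data-collection="under-50"></div>.
const loaded = new Set();

async function loadCollection(handle, container) {
  if (loaded.has(handle)) return;              // naive client-side cache
  loaded.add(handle);
  const res = await fetch(`/collections/${handle}/products.json?limit=50`);
  const { products } = await res.json();
  container.innerHTML = products
    .map(p => `<div class="product">${p.title}</div>`)
    .join('');
}

// Request a collection's products only when its placeholder scrolls into view.
document.querySelectorAll('[data-collection]').forEach(el => {
  const observer = new IntersectionObserver((entries, obs) => {
    if (entries[0].isIntersecting) {
      loadCollection(el.dataset.collection, el);
      obs.disconnect();
    }
  });
  observer.observe(el);
});
```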

Script to check an entire website to figure out if there are any pages which take more time to load

Can we have a script which will crawl through the entire website to figure out if there are any pages which take longer to load (some pages under a particular category were taking more time to load), in Selenium WebDriver or JMeter?
For JMeter you can use the HTML Link Parser configuration element for this purpose. From the documentation:
Spidering Example
Consider a simple example: let's say you wanted JMeter to "spider" through your site, hitting link after link parsed from the HTML returned from your server (this is not actually the most useful thing to do, but it serves as a good example). You would create a Simple Controller, and add the "HTML Link Parser" to it. Then, create an HTTP Request, and set the domain to ".*", and the path likewise. This will cause your test sample to match with any link found on the returned pages. If you wanted to restrict the spidering to a particular domain, then change the domain value to the one you want. Then, only links to that domain will be followed.
More information on above approach and a couple more options: How to Spider a Site with JMeter - A Tutorial
Remember that JMeter is not a browser, hence it doesn't execute JavaScript, so your results may not be precise enough: JMeter doesn't measure the time required to actually render the page.
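If in-browser render time does matter, something along these lines could complement the JMeter run; this is only a sketch, assuming Puppeteer, a hard-coded URL list, and an arbitrary threshold (none of which are part of the original answer):

```javascript
// Sketch: measure in-browser load time per page via the Navigation Timing API.
const puppeteer = require('puppeteer');

const urls = [
  'https://example.com/',
  'https://example.com/category/slow-page',   // placeholder URLs
];
const THRESHOLD_MS = 3000;                    // flag pages slower than this

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  for (const url of urls) {
    await page.goto(url, { waitUntil: 'load' });
    // Total duration of the navigation as the browser itself recorded it.
    const ms = await page.evaluate(
      () => performance.getEntriesByType('navigation')[0].duration
    );
    if (ms > THRESHOLD_MS) console.log(`SLOW  ${Math.round(ms)} ms  ${url}`);
  }
  await browser.close();
})();
```

The URL list could just as well come from a crawler like the one sketched for the broken-image question above.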

JMeter two(2) page login load test

I am very interested in using and mastering the JMeter tool. I am still at the beginning and I am stuck with this situation:
I wanted to load test an application, but the login process spans two pages. Basically you have to fill in forms on 2 pages. How do you do that in JMeter? So far I have tested only a 1-page process.
Thanks.
Actually the number of "pages" shouldn't matter, as:
it may in fact be one page whose 2nd part is simply not visible, e.g. hidden via the CSS display:none property
if these really are 2 pages, you need to maintain the session between them (and after the login as well). To do so:
make sure you have an HTTP Cookie Manager
if there are any mandatory dynamic parameters, you need to extract them from the 1st page, save them into JMeter Variables and use them as parameters for the 2nd page. The most commonly used test element for this is the Regular Expression Extractor

Script to download Google web history

How does one write a script to download one's Google web history?
I know about
https://www.google.com/history/
https://www.google.com/history/lookup?hl=en&authuser=0&max=1326122791634447
feed:https://www.google.com/history/lookup?month=1&day=9&yr=2011&output=rss
but they fail when called programmatically rather than through a browser.
I wrote up a blog post on how to download your entire Google Web History using a script I put together.
It all works directly within your web browser on the client side (i.e. no data is transmitted to a third party), and you can download it to a CSV file. You can view the source code here:
http://geeklad.com/tools/google-history/google-history.js
My blog post has a bookmarklet you can use to easily launch the script. It works by accessing the same feed, but it iterates through the entire history 1000 records at a time, converts it into a CSV string, and makes the data downloadable at the touch of a button.
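For illustration only, the general shape of such a loop might look like this; the feed URL, paging parameters, and RSS field names below are assumptions, and the real, working code is at the link above:

```javascript
// Sketch of the pattern: page through a feed 1000 records at a time,
// build a CSV string, and offer it as a client-side download.
// feedUrl and the "num"/"start" parameters are placeholders/assumptions.
async function downloadHistoryCsv(feedUrl) {
  const rows = ['date,query'];
  let start = 0;
  while (true) {
    const res = await fetch(`${feedUrl}?output=rss&num=1000&start=${start}`);
    const xml = new DOMParser().parseFromString(await res.text(), 'text/xml');
    const items = [...xml.querySelectorAll('item')];
    if (items.length === 0) break;            // no more history records
    for (const item of items) {
      const date = item.querySelector('pubDate')?.textContent ?? '';
      const title = item.querySelector('title')?.textContent ?? '';
      rows.push(`"${date}","${title.replace(/"/g, '""')}"`);
    }
    start += items.length;
  }
  // Make the CSV downloadable at the touch of a button.
  const blob = new Blob([rows.join('\n')], { type: 'text/csv' });
  const link = document.createElement('a');
  link.href = URL.createObjectURL(blob);
  link.download = 'google-web-history.csv';
  link.click();
}
```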
I ran it against my own history, and successfully downloaded over 130K records, which came out to around 30MB when exported to CSV.
EDIT: It seems that a number of folks who have used my script have run into problems, likely due to some oddities in their history data. Unfortunately, since the script does everything within the browser, I cannot debug it when it encounters histories that break it. If you're a JavaScript developer, you use my script, and it appears your history has caused it to break, please feel free to help me fix it and send me any updates to the code.
I tried GeekLad's system; unfortunately, two breaking changes have occurred: #1, the URL has changed (I modified and hosted my own copy), which led to #2: the type=rss argument no longer works.
I only needed the timestamps... so began the best/worst hack I've written in a while.
Step 1 - https://stackoverflow.com/a/3177718/9908 - Using Chrome, disable ALL security protocols.
Step 2 - https://gist.github.com/devdave/22b578d562a0dc1a8303
Using contentscript.js and manifest.json, make a Chrome extension, and host ransack.js locally with whatever service you want (PHP, Ruby, Python, etc.). Go to https://history.google.com/history/ after installing your content-script extension in developer mode (unpacked). It will automatically inject ransack.js + jQuery into the DOM, harvest the data, and then move on to the next "Later" link.
Every 60 seconds or so, Google will randomly force you to re-login, so this is not a start-and-walk-away process, BUT it does work. If they up the obfuscation ante, you can always resort to chaining Ajax calls and sending the page back to the backend for post-processing. At full tilt, my abomination of a script collected 1 page of data per second.
On moral grounds I will not help anyone modify this script to get search terms and results, as this process is not sanctioned by Google (though apparently not blocked), and I recommend it only to sufficiently motivated individuals who can make it work for them. By my estimates it took me 3-4 hours to get all 9 years of data (90K records) at 1 page every 900 ms or faster.
While this thing is going, DO NOT browse the rest of the web, because Chrome is running with no safeguards in place; most of them exist for a reason.
One can download their search logs directly from Google (in case downloading them using a script is not the primary purpose).
Steps:
1) Log in and go to https://history.google.com/history/
2) Just below your profile picture logo, towards the right side, you can find an icon for settings. See the second option, called "Download". Click on that.
3) Then click on "Create Archive", and Google will mail you the log within minutes.
Maybe before issuing a request to get the feed, the script should add a User-Agent HTTP header of a well-known browser, so that Google decides the request came from that browser.
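A sketch of that idea in Node 18+ (run as an ES module for top-level await; the feed URL and the User-Agent value are placeholders, and any current browser UA string would do):

```javascript
// Send the feed request with a browser-like User-Agent header.
const res = await fetch('https://www.google.com/history/lookup?output=rss', {
  headers: {
    'User-Agent':
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 ' +
      '(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
  },
});
console.log(res.status);
```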

Do not record particular page hit

I am trying to hit a particular web page and record it after it loads, for example:
http://serv1.project.com/page7
but on hitting the above page, only http://serv1.project.com gets recorded. When I play the same script, http://serv1.project.com is opened without the subsequent page hit.
Note: I am trying to run my scripts using Selenium RC with Java as the base.
Why does it matter what gets recorded? Can't you just do:
selenium.open("page7");
in your Java code?
Of course, I assume you have created the Selenium session with http://serv1.project.com/ as the base URL.