I'm trying to scrape a website. The first request with Selenium always works, but as soon as the second request runs, the page loads indefinitely. All I can do is stop the page load manually, after which the rest of the requests continue working correctly. What can I do about it? Thanks.
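For reference, this is roughly the pattern where it hangs; the only thing that unblocks it is stopping the load, shown here programmatically as a sketch (assuming the Python Selenium bindings and Chrome; the URLs and the 10-second limit are placeholders):

    from selenium import webdriver
    from selenium.common.exceptions import TimeoutException

    driver = webdriver.Chrome()
    driver.set_page_load_timeout(10)  # stop waiting for a page after 10 seconds

    def get_and_stop_if_stuck(url):
        try:
            driver.get(url)
        except TimeoutException:
            # Same effect as pressing the browser's stop button
            driver.execute_script("window.stop();")

    get_and_stop_if_stuck("https://example.com/page-1")  # works
    get_and_stop_if_stuck("https://example.com/page-2")  # would otherwise load forever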
I am trying to scrape a fashion website that uses JavaScript, using Scrapy.
This is the page: https://www.thekooples.com/us_en/women/ready-to-wear/dresses.html
I have Docker, and I followed the instructions in the Splash docs to set up Splash on localhost:8050.
I am able to render https://quotes.toscrape.com/js/ properly.
As I understand it, that is a JS-rendered page, and it does look different when I disable JS.
However, I am unsuccessful in rendering the fashion page. The output I get is how the page looks without JS, so I know the rendering failed. What could be happening?
You can try the following two things:
Increase the delay. You can also find an example script on the Splash server's homepage that waits for a particular element to appear.
Download and print the HAR and see whether any of the requests failed. If one did, you may need to set a user agent in your Splash request. A sketch of both adjustments is below.
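For example, something along these lines (a sketch assuming scrapy-splash is installed and configured; the spider name, wait value, and User-Agent string are arbitrary examples):

    import json

    import scrapy
    from scrapy_splash import SplashRequest

    class DressesSpider(scrapy.Spider):
        name = "dresses"

        def start_requests(self):
            yield SplashRequest(
                "https://www.thekooples.com/us_en/women/ready-to-wear/dresses.html",
                self.parse,
                endpoint="render.json",
                # longer delay, plus ask Splash to return the HAR alongside the HTML
                args={"html": 1, "har": 1, "wait": 5},
                # forwarded to Splash so the target site sees a browser-like user agent
                headers={"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36"},
            )

        def parse(self, response):
            data = json.loads(response.text)
            # Look through the HAR for sub-requests that failed
            for entry in data["har"]["log"]["entries"]:
                status = entry["response"]["status"]
                if status == 0 or status >= 400:
                    self.logger.warning("failed: %s (%s)", entry["request"]["url"], status)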
We are building a single-page application (SPA, which renders client-side) using Vue.js, and we have UI automation tests using Selenium. I don't think the fact that we are using Vue.js and Selenium is the issue; I think this is a general issue when UI testing an SPA.
The problem
The tests are running before the DOM has loaded...
In an SPA the browser does not reload when navigating, so the issue is that the tests think they can run straight away.
In a server-side rendered application (like MVC), the browser reloads when navigating to a different page, only when the browser decides the page has loaded will the tests run.
This is not the case in an SPA: as far as I'm aware, the browser in an SPA doesn't have a way of telling the tests when the page is ready to test.
Can anyone help? I am looking for a solution or an event for the browser to let the tests know when they are ready to run.
Tried solutions
We have hard-coded a timeout for the tests to wait (which works sometimes), but it doesn't guarantee every pipeline build will pass, since the load speed isn't the same each run and there is no guarantee the page will load before the timeout expires.
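For context, a sketch of the hard-coded wait described above next to the kind of condition-based wait we would rather rely on (assuming the Python Selenium bindings; the URL and the #app-loaded marker element are hypothetical placeholders):

    import time

    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    driver.get("https://example.test/dashboard")

    # What we do today: hope the SPA has rendered within 5 seconds
    time.sleep(5)

    # What we would prefer: poll until the app has actually rendered something,
    # failing only if it takes longer than 30 seconds
    WebDriverWait(driver, 30).until(
        EC.presence_of_element_located((By.CSS_SELECTOR, "#app-loaded"))
    )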
I'm trying to scrape a site that requires logging in, so I log in.
The pages directly after the login seem to have a lot of JavaScript.
How can I test out the responses from Splash in the scrapy shell?
Once I log in, how can I run Splash on the next URL from the command line and have it process the JavaScript and give me a response I can parse?
I don't understand what I actually need to do to run the Splash service and how it works with Scrapy... Please point me in the right direction.
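To make the question concrete, this is roughly what I imagine, though I'm not sure it's the right way (a sketch assuming Splash is running on localhost:8050; the target URL is a placeholder):

    # Ask scrapy shell for the Splash-rendered version of the page
    scrapy shell 'http://localhost:8050/render.html?url=https://example.com/account&wait=2'

    # Inside the shell, response.text should now be the JavaScript-rendered HTML,
    # so the usual selectors can be tried against it:
    response.css('title::text').get()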
Situation:
I am writing test automation for a website. At one point there is a link button on my website; clicking it redirects me to an external website. There I have to log in, and as soon as I do, I am redirected back to my original page, which contains some 'connections' that I need.
Problem:
As soon as Cypress clicks the redirection button, it goes to a blank page.
Ideal solution:
I would want to automate the entire scenario, or if that's not possible, at least find a workaround.
As suggested in the Cypress Docs, you should really be using cy.request() to log in. You don't control a 3rd-party site, and that makes your test very flaky.
For example, a lot of login pages are constantly changing and are A/B tested to prevent bots (including testing bots) from logging in. The data:, URL is probably the result of an HTTP redirect.
Thankfully, using cy.request() you can 'fake' logging in by making a request to the server through code (which doesn't change as much), and you will never have to leave your app to log in.
Here's a recipe for Single Sign-On for example.
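For example, a custom command along these lines (just a sketch — the endpoint path, field names, and credentials are placeholders for whatever your auth API actually expects):

    // Log in through the API instead of the third-party UI
    Cypress.Commands.add('loginByApi', (username, password) => {
      cy.request({
        method: 'POST',
        url: '/api/login',                 // your app's auth endpoint
        body: { username, password },
      }).its('status').should('eq', 200);  // session cookie is kept automatically
    });

    // In the test: log in by code, then go straight to the page with the connections
    cy.loginByApi('test-user', 's3cret');
    cy.visit('/connections');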
Hope that makes sense!
We're trying to validate whether a request to a URL fires upon loading a web page. Is there a way to do this programmatically using Selenium RC? The request does not appear within the page HTML or DOM.
Thanks.
Can you specify the URL as part of your test?
If so, then what about running an httpd server and checking whether it gets a hit when your page loads?
Calling selenium.start("captureNetworkTraffic=true"); enables capturing all of the HTTP requests/responses associated with loading the web page. Once formatted, these results can be reported easily.
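For example (a sketch using the Selenium RC Java client; the URLs and the substring being checked are placeholders):

    import static org.junit.Assert.assertTrue;

    import com.thoughtworks.selenium.DefaultSelenium;

    public class FiresTrackingUrlTest {
        public void pageLoadFiresTrackingUrl() {
            DefaultSelenium selenium =
                    new DefaultSelenium("localhost", 4444, "*firefox", "http://example.com/");
            selenium.start("captureNetworkTraffic=true");
            selenium.open("/page-under-test");

            // Every request/response seen while the page loaded, returned as JSON
            String traffic = selenium.captureNetworkTraffic("json");
            assertTrue("expected the tracking URL to be requested",
                    traffic.contains("http://example.com/tracking/fire"));

            selenium.stop();
        }
    }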