How to use Scrapy-Splash

I am trying to use Scrapy to crawl a website whose URL does not change as you page through it. I used Splash to simulate a click; however, that only takes me to the second page. How can I keep getting the next page, and how should I crawl a site like that?
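One way to keep paging is to drive the clicks from a Splash Lua script and re-run it with an increasing click count. A minimal sketch, assuming scrapy-splash is configured against a running Splash instance; the URL and the `a.next` / `div.item` selectors are placeholders:

```python
import scrapy
from scrapy_splash import SplashRequest

# Lua script: load the page, click the (hypothetical) "a.next" button
# page-1 times, then return the HTML plus whether another page exists.
LUA_CLICK_NEXT = """
function main(splash, args)
    splash:go(args.url)
    splash:wait(1)
    for i = 1, args.page - 1 do
        local btn = splash:select('a.next')
        if not btn then break end
        btn:mouse_click()
        splash:wait(1)
    end
    return {html = splash:html(),
            has_next = splash:select('a.next') ~= nil}
end
"""

class ClickPagerSpider(scrapy.Spider):
    name = "click_pager"
    start_url = "https://example.com/list"  # placeholder URL

    def start_requests(self):
        yield self.page_request(1)

    def page_request(self, page):
        return SplashRequest(
            self.start_url,
            callback=self.parse,
            endpoint="execute",
            args={"lua_source": LUA_CLICK_NEXT, "page": page},
            cb_kwargs={"page": page},
            dont_filter=True,  # same URL every time, only the state differs
        )

    def parse(self, response, page):
        # "div.item" is a stand-in for whatever holds the results
        for text in response.css("div.item::text").getall():
            yield {"page": page, "item": text}
        if response.data.get("has_next"):
            yield self.page_request(page + 1)
```

`dont_filter=True` matters here: every request targets the same URL and only the Splash-side state differs, so Scrapy's duplicate filter would otherwise drop everything after the first request.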

Related

Scrapy Splash - Not Rendering the full content

I'm trying to scrape this site: https://tucson.craigslist.org/search/acc?postedToday=1#search=1~list~0~0. When I try it in the Splash web console with a wait time of 30 seconds, sometimes it renders the full page and sometimes it doesn't. When I run it in Scrapy, it doesn't even execute the JavaScript; it just returns the initial content (the content received on the first call to the page).
Can someone help me with this?
Note:
This site uses localStorage to store the results and renders from there.
I deployed the Splash instances using Aquarium (three instances, 1 slot per instance, 3600 timeout, private mode disabled).
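Rather than a single fixed wait, one option is to poll from a Lua script until the results rendered from localStorage actually appear. A minimal sketch, assuming scrapy-splash is pointed at the Aquarium instances; the `ul.rows li` result selector is a guess at the markup:

```python
import scrapy
from scrapy_splash import SplashRequest

# Poll the DOM until the results (rendered from localStorage) appear,
# instead of hoping a single fixed wait is long enough.
LUA_WAIT_FOR_RESULTS = """
function main(splash, args)
    splash.private_mode_enabled = false  -- localStorage needs private mode off
    splash:go(args.url)
    for i = 1, 60 do                     -- up to ~30 s in 0.5 s steps
        if splash:select(args.wait_for) then break end
        splash:wait(0.5)
    end
    return splash:html()
end
"""

class CraigslistTodaySpider(scrapy.Spider):
    name = "craigslist_today"

    def start_requests(self):
        yield SplashRequest(
            "https://tucson.craigslist.org/search/acc?postedToday=1",
            callback=self.parse,
            endpoint="execute",
            args={"lua_source": LUA_WAIT_FOR_RESULTS,
                  "wait_for": "ul.rows li",  # guessed result-list selector
                  "timeout": 60},
        )

    def parse(self, response):
        for title in response.css("ul.rows li a::text").getall():
            yield {"title": title.strip()}
```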

On Nuxt-Link click, refresh the page if the URL is the same

For example, I have a blog website with a Discover page that doesn't take any params. When the page loads, its fetch hook calls an API that returns a random article.
My problem is that when I'm already on the Discover page and I click Discover in the left bar, I want the page to refresh.
:key="$route.fullPath" is not working for me because nothing in the path changes.

Is it possible to scrape an Angular Website using Selenium-python?

I have been trying to scrape an Angular website using Selenium. To my surprise, it doesn't let you scrape the rendered HTML contents, since they are rendered dynamically with JavaScript. I want to locate those tags in order to scrape them, but I am unable to do so. What is the right way to scrape them? Here is some more context:
They say you can't do it using Python.
Some also tried downloading all the HTML content and then reading it. But again, this isn't my use case.
My use case is a lot different: I want to log in to my Google account, which redirects me to an Angular page where I click a button called Reporting, and from there I am redirected to a page where I finally click a Download button to download the report.
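Selenium can read JavaScript-rendered DOM; the usual fix is to wait explicitly for the elements Angular renders instead of reading the page immediately. A minimal sketch, assuming Chrome, that you are already past the Google login, and that the buttons can be found by their visible text (the URL and texts are placeholders):

```python
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
wait = WebDriverWait(driver, 30)

driver.get("https://example.com/angular-app")  # placeholder URL

# Wait until Angular has actually rendered the button, then click it;
# reading the DOM immediately after get() would miss JS-rendered content.
reporting = wait.until(
    EC.element_to_be_clickable((By.XPATH, "//button[contains(., 'Reporting')]"))
)
reporting.click()

download = wait.until(
    EC.element_to_be_clickable((By.XPATH, "//button[contains(., 'Download')]"))
)
download.click()

driver.quit()
```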

How to follow lazy loading with scrapy?

I am trying to crawl a page that uses lazy loading to get the next set of items. My crawler follows normal links, but this one seems to be different:
The page
https://www.omegawatches.com/de/vintage-watches
is followed by https://www.omegawatches.com/de/vintage-watches?p=2,
but only if you load it in a browser; Scrapy will not follow the link.
Is there a way to make Scrapy follow pages 1, 2, 3, 4 automatically?
The page uses virtual scrolling, and the API from which it gets its data is
https://www.omegawatches.com/de/vintage-watches?p=1&ajax=1
It returns JSON data with various details, including the products as an HTML fragment, and indicates whether a next page exists via an <a> tag with the class "link next".
Increase the page number until there is no <a> tag with the "link next" class.
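A minimal sketch of that approach: request the ajax endpoint page by page and stop when the returned HTML fragment no longer contains a "link next" anchor. The JSON key holding the product HTML ("products" here) and the item selector are assumptions to verify against the actual response:

```python
import scrapy

class OmegaVintageSpider(scrapy.Spider):
    name = "omega_vintage"
    base = "https://www.omegawatches.com/de/vintage-watches?p={}&ajax=1"

    def start_requests(self):
        yield scrapy.Request(self.base.format(1), cb_kwargs={"page": 1})

    def parse(self, response, page):
        data = response.json()
        # The JSON carries the product list as an HTML fragment; the key
        # name ("products") is an assumption to check against the response.
        fragment = scrapy.Selector(text=data.get("products", ""))
        for name in fragment.css(".product-item-name::text").getall():
            yield {"page": page, "name": name.strip()}
        # Keep paging while the fragment still has an <a class="link next">
        if fragment.css("a.link.next"):
            yield scrapy.Request(self.base.format(page + 1),
                                 cb_kwargs={"page": page + 1})
```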

Selenium: catch HTML every time a change is made in the browser

Is it possible to use Selenium so that my code and the browser are integrated, i.e. I get the updated HTML page every time I make a change on the web page in the browser?
In other words, I would like to run my app, which automatically starts a browser, and every time I change something on the page, Selenium automatically gets the changed HTML into my Java/Python code. Selecting a dropdown item might be a good example.
Thanks!
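Selenium has no change event you can subscribe to, but you can poll driver.page_source and react whenever it differs from the last snapshot. A minimal polling sketch, assuming Chrome; the URL is a placeholder:

```python
import time
from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com")  # placeholder URL

last_html = driver.page_source
try:
    while True:
        time.sleep(1)  # poll once a second
        html = driver.page_source
        if html != last_html:
            print("page changed, new length:", len(html))
            # hand the fresh HTML to your own processing here
            last_html = html
except KeyboardInterrupt:
    driver.quit()
```

For finer-grained notifications you could inject a MutationObserver via execute_script and let the loop read the changes it records, but simple polling is usually enough for interactions like selecting a dropdown item.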