Get HAR Data from Remote AJAX Requests with Scrapy-Splash - scrapy-splash

I'm scraping a web page that makes a number of ajax calls to get data. render.har returns the data that's on the same domain but does not include data from remote domains. Is there a way to get HAR data from requests made to remote domains?
I've tried the following arguments with no success:
endpoint='render.har',
url=url,
har=1,
iframes=1,
allowed_content_types='*',
html=1,
expand=1,
response_body=1,
request_body=1,
console=1,
timeout=20,
wait=6

Related

RESTDataSource - How to know if response comes from get request or cache

I need to get some data from a REST API in my GraphQL API. For that I'm extending RESTDataSource from apollo-datasource-rest.
From what I understood, RESTDataSource caches automatically requests but I'd like to verify if it is indeed cached. Is there a way to know if my request is getting its data from the cache or if it's hitting the REST API?
I noticed that the first request takes some time, but the following ones are way faster and also, the didReceiveResponse method is not called everytime I make a query. Is it because the data is loaded from the cache?
I'm using apollo-server-express.
Thanks for your help!
You can time the requests like following:
console.time('restdatasource get req')
this.get(url)
console.timeEnd('restdatasource get req')
Now, if the time is under 100-150 milliseconds, that should be a request coming from the cache.
You can monitor the console, under the network tab. You will be able to see what endpoints the application is calling. If it uses cached data, there will be no new request to your endpoint logged
If you are trying to verify this locally, one good option is to setup a local proxy so that you can see all the network calls being made. (no network call meaning the call was read from cache) Then you can simply configure your app using this apollo documentation to forward all outgoing calls through a proxy like mitmproxy.

Same response data for all iterations even though cookie is cleared

My Test structure in Jmeter
Thread group (2 users)
Http request
Listener
For each iteration same form_key values are getting in response which should not be.
How to get unique form_key in response for each iteration
Jmeter Test result screenshot
I cannot reproduce your issue using one of the online Magento demo instances, in particular this one: http://demo-acm-2.bird.eu/customer/account/login/
As you can see, each time form_key is different for each user for each iteration.
If you're using HTTP Cookie Manager - make sure to tick "Clear cookies each iteration" box
Also make sure to properly setup the HTTP Request sampler, to wit put http or https into "Protocol", server name or IP to the relevant field, path, etc.

Use Scrapy to combine data from multiple AJAX requests into a single item

What is the best way to crawl pages with content coming from multiple AJAX requests? It looks like I have the following options (given that AJAX URLs are already known):
Crawl AJAX URLs sequentially passing the same item between requests
Crawl AJAX URLs concurrently and output each part as a separate item
with a shared key (e.g. source URL)
What is the most common practice? Is there a way to get a single item at the end, but allow some AJAX requests to fail w/o compromising the rest of the data?
scrapy is built for concurrency and statelessness, so if point 2 is possible, it is always preferred, from both speed and memory consumption aspects.
in case requests must be serialized, consider accumulate items in request meta field
Check scrapy-inline-requests. It allows to smoothly process multiple nested requests in a response handler.

Handling dynamic http requests instead of hardcoded http requests in Jmeter

I'm creating a 50 users load test on a JSF web application.
I record a scenario using JMeter proxy for one user who logs in, does some db operations and logs out. After recording the scenario, the recorded test contains http requests and data that particularly belongs to the user used while scenario recording.
At the time of running the test for 50 unique virtual users, the recorded test sends http requests and data which was in the recorded scenario. But in our application, the http requests and data vary depending upon the user. So how do I handle such situations in JMeter when it comes to methods being called depending upon the existence or non-existence of data for a user after logging in?
To be precise how would I make changes in my Test plan to manage dynamic urls and dynamic data for each virtual user?
Latest versions of JMeter allow you to write the whole parameters (raw data) from scratch, so you could use variables in this field.
To achieve dynamic URLs use a Regular Expression Extractor (Post-Processor) on a prior request that define what request will be sent and use the variable in HTTP Request's path field.
If you know what request each type of users will send you could use If Controllers and test a thread variable, created by a previous Regular Expression Extractor, and inside each controller add the specific request.
If the subsequent request for each user is defined by the server, using redirection, just check "Follow Redirection" field.
See JMeter Wiki for more examples on how to do this.

Jmeter : How to test a website to render a page regardless of the content

I have a requirement where the site only needs to respond to the user within certain seconds, regardless of the contents.
Now there is a option in Jmeter in HTTP Proxy Server -> URL Patterns to exclude and then to start recording.
Here I can specify gif, css or other content to ignore. However before starting the recording I have to be aware of what are the various contents that are going to be there.
Is there any specific parameter to pass to Jmeter or any other tool which takes care about loading the page only and I can assert the response code of that page and no the other contents of the page are recorded.
Thanks.
Use the standard HTTP Request sampler with DISABLED (not checked) option Retrieve All Embedded Resources from HTML Files (set via sampler's control panel):
"It also lets you control whether or not JMeter parses HTML files for
images and other embedded resources and sends HTTP requests to
retrieve them."
NOTE: You may also define the same setting via HTTP Request Defaults.
NOTE: See also "Response size calculation" in the same HTTP Request article.
Add assertions to your http samplers:
Duration Assertion: to tests if response was received within a defined amount of time;
Response Assertion: to ensure that request was successfull,
e.g.
Response Field to Test = Response Code
Pattern Matching Rules = Equals
Patterns to Test = 200
You want to run test that would ignore resources after certain number of seconds?
I don't understand, what are you trying to accomplish by doing that?
Users will still receive those resources when they request your url, so your tests wont be accurate.
I don't mean any disrespect, but is it possible that you misunderstood the requirements?
I assume that the requirement is to load all the resources in certain number of seconds, not to cut off the ones that fail to fit in that time?