difference between response.selector.xpath and response.xpath


From a performance standpoint, I would like to know the difference between
response.selector.xpath
and
response.xpath
Is there a case where one of them makes a new HTTP request and the other does not?
Thanks

They are the same.
If you look at the Scrapy source, response.xpath() simply calls selector.xpath():
    def xpath(self, query, **kwargs):
        return self.selector.xpath(query, **kwargs)
"Is there a case where a new http request is made and not the other one?"
Neither one generates a new HTTP request.
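The delegation described above can be modelled in plain Python. This is only a rough sketch: SketchSelector and SketchResponse are hypothetical stand-ins, not Scrapy's classes, and the lazy caching of the selector is an assumption of the sketch. It shows that both call paths go through the same selector object and work on a body that is already in memory, so neither has any reason to fetch anything.

```python
import xml.etree.ElementTree as ET


class SketchSelector:
    """Hypothetical stand-in for a selector; parses the body once."""

    def __init__(self, text):
        self.root = ET.fromstring(text)

    def xpath(self, query):
        # ElementTree supports only a limited XPath subset; enough for a demo.
        return [element.text for element in self.root.findall(query)]


class SketchResponse:
    """Hypothetical stand-in for a response whose body is already downloaded."""

    def __init__(self, body):
        self.body = body
        self._cached_selector = None

    @property
    def selector(self):
        # Assumption of this sketch: built on first access, then reused.
        if self._cached_selector is None:
            self._cached_selector = SketchSelector(self.body)
        return self._cached_selector

    def xpath(self, query):
        # The shortcut just forwards the call to the selector.
        return self.selector.xpath(query)


response = SketchResponse("<html><body><p>one</p><p>two</p></body></html>")
via_shortcut = response.xpath(".//p")
via_selector = response.selector.xpath(".//p")
```

Both variables end up holding the same list, and response.selector returns the same object on every access, so any performance difference is at most one extra method call.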

Related

Insert Record to BigQuery or some RDB during API Call

I am writing a REST API GET endpoint that needs to both return a response and store records to either BigQuery or GCP Cloud SQL (MySQL), but I don't want the return to depend on the records having been written. Basically, my code will look like:
    def predict():
        req = request.json.get("instances")
        resp = make_response(req)
        write_to_bq(req)
        write_to_bq(resp)
        return resp
Is there an easy way to do this with the Cloud SQL client library or something similar?

It turns out Flask has functionality that does what I need:
    from flask import Flask, jsonify

    app = Flask(__name__)

    @app.route("/predict", methods=["GET"])
    def predict():
        # do some stuff with the request.json object
        result = {"status": "ok"}  # placeholder payload
        return jsonify(result)

    @app.after_request
    def after_request_func(response):
        # do anything you want that relies on the context of predict()

        @response.call_on_close
        def persist():
            # this runs after the response is sent, so even if this
            # function fails, predict() still gets its response out
            write_to_db()  # your own persistence function

        return response
One important point: a function decorated with after_request must take one argument (the response) and return a flask.Response. Also, I think a function decorated with call_on_close cannot access the context of the main view function, so anything you need from that context has to be defined inside the after_request function but outside (above) the call_on_close function.
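The capture pattern from the last paragraph can be illustrated without Flask. In this rough sketch, SketchResponse is a hypothetical stand-in that mimics only the call_on_close idea, request_context stands in for the request data, and saved_rows stands in for the database; none of these names come from Flask.

```python
class SketchResponse:
    """Hypothetical stand-in that mimics only the call_on_close idea."""

    def __init__(self, body):
        self.body = body
        self._on_close = []

    def call_on_close(self, func):
        # Register func to run on close; return it so this works as a decorator.
        self._on_close.append(func)
        return func

    def close(self):
        for func in self._on_close:
            func()


saved_rows = []                              # stands in for the database
request_context = {"instances": [1, 2, 3]}   # stands in for the request data


def after_request_func(response):
    # Read what is needed while the context is still available ...
    instances = list(request_context["instances"])

    @response.call_on_close
    def persist():
        # ... and use only the captured copy inside the deferred function.
        saved_rows.append(instances)

    return response


response = after_request_func(SketchResponse("ok"))
request_context.clear()   # the context is gone once the response has been sent
response.close()          # the deferred write still has its captured data
```

The deferred function works because it closes over a local copy taken earlier, not because it can still see the request.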

use case of process_spider_input in spidermiddleware

Does anyone know the difference between process_spider_input(response, spider) in a spider middleware and process_response(request, response, spider) in a downloader middleware?
And how do I choose one over the other? As far as I can see, they do much the same work: both handle the response.

According to the source, they differ in their return value:
spider_mw.process_spider_input() returns None. You can inspect or modify the Response, but the method assumes the response has already been accepted, so you can't refuse it.
downloader_mw.process_response() returns a Response or a Request. You can refuse the response from the download handler and generate a new request (e.g. the RetryMiddleware).
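The two contracts can be sketched side by side. The method names and signatures are the ones quoted in the question; everything else (SketchRequest, SketchResponse, SketchSpider, the retry limit) is a hypothetical stand-in so the example runs without Scrapy, and it is not how RetryMiddleware itself is implemented.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SketchRequest:
    """Hypothetical stand-in for a request; not Scrapy's Request class."""
    url: str
    retries: int = 0


@dataclass(frozen=True)
class SketchResponse:
    """Hypothetical stand-in for a response; not Scrapy's Response class."""
    url: str
    status: int


class SketchSpider:
    def __init__(self):
        self.seen_statuses = []


class InspectOnlySpiderMiddleware:
    def process_spider_input(self, response, spider):
        # Inspect the response, then return None so processing continues.
        spider.seen_statuses.append(response.status)
        return None


class RetryOnServerErrorDownloaderMiddleware:
    max_retries = 2  # illustrative limit, not a Scrapy default

    def process_response(self, request, response, spider):
        if response.status >= 500 and request.retries < self.max_retries:
            # Refuse this response by handing back a new request instead.
            return replace(request, retries=request.retries + 1)
        return response


spider = SketchSpider()
request = SketchRequest("https://example.com/")
bad = SketchResponse(request.url, 503)
good = SketchResponse(request.url, 200)

downloader_mw = RetryOnServerErrorDownloaderMiddleware()
retried = downloader_mw.process_response(request, bad, spider)
passed = downloader_mw.process_response(request, good, spider)
spider_result = InspectOnlySpiderMiddleware().process_spider_input(good, spider)
```

So if the decision is "accept or re-request", it belongs in the downloader middleware; if the response is already accepted and only needs checking or adjusting before the spider sees it, the spider middleware is enough.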

In karate mocking (karate-netty), how can we override request header value?

Objective:
We want a few API calls to go to the mock server (https://192.x.x.x:8001) and the others to go to the actual downstream application server (https://dev.api.acme.com).
Setup:
Locally, the mock server is running from the standalone jar on port 8001, e.g. https://192.x.x.x:8001
In the application config file (config.property), the downstream system (which needs to be mocked) is defined with the mock server IP, i.e. https://192.x.x.x:8001
Testing scenario and problem:
1.
Scenario: pathMatches('/profile/v1/users/{id}/user')
* karate.proceed('https://dev.api.acme.com')
* def response = read('findScope.json')
* def responseStatus = 200
* print 'created response is: ' + response
Now, when we hit the API via Postman or a feature file, karate.proceed correctly forwards the request to https://dev.api.acme.com/profile/v1/users/123/user instead of 192.x.x.x. However, the Host header of this request still refers to https://192.x.x.x:8001 instead of https://dev.api.acme.com, which causes a problem for us.
How can we override the request header in this case? I tried karate.set and also header host=https://192.x.x.x:8001, but no luck.
Thanks!

Please see if the 1.0 version works: https://github.com/intuit/karate/wiki/1.0-upgrade-guide
Unfortunately, HTTPS proxying may not work, as mentioned. If you depend on this, we may need your help (a code contribution) to get it working.
If the Host header is still not mutable, that can also be considered a feature request, and here too I'd ask you to consider contributing code.

Which request to use to fetch data from database based on some data sent?

I am using django-rest-framework's generic API views.
I want to send some data from the frontend to the backend and, depending on the data sent, Django should query a model and return some data to the frontend. The data sent is protected and thus can't be put in the URL, so a GET request can't be used. I am not modifying the database, just querying it and returning a response (a typical GET use case).
In DRF's generic API views, I can't find a view that does this.
As can be seen from Tom Christie's GitHub page, only two views have a post handler:
CreateAPIView: return self.create()
ListCreateAPIView: return self.create()
Both of these post methods create entries in the database, which I don't want. Is there a built-in class that does the job, or should I use generics.GenericAPIView and write my own post handler?
Currently I am using generic.View, which has post(self, request, *args, **kwargs).

I think you have a few options to choose from. One way is to use a ModelViewSet, which is useful because it handles the communication between views, serializers and models nicely. See the django-rest-framework ModelViewSet docs.
These are the actions that it provides by default (since it inherits from GenericAPIView):
.list(), .retrieve(), .create(), .update(), .partial_update(), .destroy().
If you don't want all of them, you can restrict the allowed HTTP methods as follows:
    from rest_framework import viewsets

    class AppViewSet(viewsets.ModelViewSet):
        queryset = App.objects.all()        # App is your own model
        serializer_class = AppSerializer    # AppSerializer is your own serializer
        # only these HTTP methods are accepted by the viewset
        http_method_names = ['get', 'post', 'head']
Note: http_method_names seems to work with Django >= 1.8.
Source: Disable a method in a ViewSet, django-rest-framework

Where is a Response transformed into one of its subclasses?

I'm trying to write a downloader middleware that ignores responses that don't have a pre-defined element. However, I can't use the css method of the HtmlResponse class inside the middleware because, at that point, the response's type is just Response. By the time it reaches the spider it is an HtmlResponse, but then it's too late, because I can no longer perform certain actions on the middleware state.
Where is the response's final type set?

Without seeing your middleware code, it is hard to tell what the problem is.
However, my middleware below gets an HtmlResponse object:
    class FilterMiddleware(object):
        def process_response(self, request, response, spider):
            print(response.__class__)
            print(type(response))
            return response
Both print statements confirm this:
<class 'scrapy.http.response.html.HtmlResponse'>
<class 'scrapy.http.response.html.HtmlResponse'>
And I can use the css method on the response without any exception. For these values, the order of the middleware in settings.py does not matter: with 10, 100 or 500 I get the same result as above.
However, if I set the middleware's order to 590 or above, I get a plain old Response object. This is because the conversion happens in the HttpCompressionMiddleware class, on line 35 in the current version.
To solve your issue, place your middleware later in the response pipeline (with a lower order number) or convert the response yourself (I would not do this, however).
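As a sketch of the ordering fix, a project's settings.py could register the middleware with an order below 590, the value the answer gives for HttpCompressionMiddleware. The module path myproject.middlewares is hypothetical, and 543 is just an arbitrary value under that threshold.

```python
# settings.py (sketch)
# Assumption: FilterMiddleware lives in a hypothetical module
# myproject.middlewares; adjust the path to your own project.
DOWNLOADER_MIDDLEWARES = {
    # An order below 590 means process_response() runs after
    # HttpCompressionMiddleware has handled the response.
    "myproject.middlewares.FilterMiddleware": 543,
}
```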
