I want to develop several middlewares to make sure websites get parsed.
This is the workflow I have in mind:
First, try with Tor + Polipo.
After 2 HTTP errors, try without Tor (so the website sees my IP).
After 2 HTTP errors, try with a proxy (using one of my other servers to make the HTTP request).
After 2 HTTP errors, try with a random proxy (from a list of 100). This is repeated 5 times.
If none of these works, I save the information in an Elasticsearch database so I can see it in my control panel.
I'll create a custom middleware with a process_request function that contains all 5 of these steps. But I can't find how to save the type of connection (for example, if Tor does not work but a direct connection does, I want to use this setting for all my other scrapes of the same website). How can I save this setting?
Another thing: I have a pipeline which downloads the items' images. Is there a way to use this middleware (ideally with the saved settings) for it as well?
Thanks in advance for your help.
I think you could use the retry middleware as a starting point:
You could use request.meta["proxy_method"] to keep track of which one you are using
You could reuse request.meta["retry_times"] in order to track how many times you have retried a given method, and then set the value to zero when you change the proxy method.
You could use request.meta["proxy"] to use the proxy server you want via the existing HTTP proxy middleware. You may want to tweak the middlewares ordering so that the retry middleware runs before the proxy middleware.
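As a rough illustration of the bookkeeping described above, here is a minimal sketch in plain Python that operates on a dict shaped like request.meta. The method names, the MAX_ATTEMPTS table, the known_good store, and the three helper functions are all hypothetical and not part of Scrapy. Note also that Scrapy's own retry middleware applies its own limit to retry_times, so a real middleware would need to account for that.

```python
from urllib.parse import urlparse

# Fallback order and per-method attempt limits taken from the question.
# The method names are illustrative labels, not Scrapy settings.
PROXY_METHODS = ["tor", "direct", "own_proxy", "random_proxy"]
MAX_ATTEMPTS = {"tor": 2, "direct": 2, "own_proxy": 2, "random_proxy": 5}

# Hypothetical in-memory store: host -> method that last worked for it.
known_good = {}


def start_meta(url):
    """Initial meta for a new request, reusing a method that worked before."""
    host = urlparse(url).netloc
    return {
        "proxy_method": known_good.get(host, PROXY_METHODS[0]),
        "retry_times": 0,
    }


def on_failure(meta):
    """Advance meta after a failed attempt.

    Returns True if the request should be retried, or False once every
    method has used up its attempts.
    """
    method = meta.get("proxy_method", PROXY_METHODS[0])
    attempts = meta.get("retry_times", 0) + 1
    if attempts < MAX_ATTEMPTS[method]:
        # Same method, one more try.
        meta["proxy_method"], meta["retry_times"] = method, attempts
        return True
    next_index = PROXY_METHODS.index(method) + 1
    if next_index == len(PROXY_METHODS):
        return False
    # Switch to the next method and reset the counter.
    meta["proxy_method"], meta["retry_times"] = PROXY_METHODS[next_index], 0
    return True


def on_success(url, meta):
    """Remember which method worked for this host."""
    host = urlparse(url).netloc
    known_good[host] = meta.get("proxy_method", PROXY_METHODS[0])
```

In a real downloader middleware, on_failure would be called when a response or exception counts as an error, request.meta["proxy"] would be set according to the current proxy_method, and the False case is where the failure would be recorded for the control panel. Because the store is keyed by host, image requests issued by a pipeline for the same site could start from the same remembered method.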
When TestCafe runs against our local site, every request it makes during the test steps is prefixed with something like http://192.168.1.182:59304/http://localhost:3000 (the port number varies per run).
For the most part this works, but our web application makes calls to certain APIs during a user journey, and within TestCafe they might look like: http://192.168.1.182:59304/http://www.example.com/api/v2/customers/1 which comes back with a 401 and a response body of 'unauthorized'. Some API calls are fine, however.
I guess my question is:
Is there any way to get around this on my side, such as rewriting certain requests, or do I need to contact the API provider? If so, what would they potentially need to do to allow these requests to go through?
You have run into this issue: https://github.com/DevExpress/testcafe-hammerhead/issues/2344. It has been fixed. Try running your tests with the latest TestCafe version (1.8.8-alpha.3).
I'm extremely new to Lua as well as nginx. We're trying to set up authentication.
I'm trying to write a script that could be injected into my nginx and would listen on an endpoint.
My API would give me a token. I would receive this token and check whether it exists in my YAML (or possibly JSON) file.
Based on the privilege listed in the file, I would like to redirect the request to the respective URL with the necessary permissions.
Any help would be highly appreciated.
First of all, nginx on its own has no Lua integration whatsoever; so if you just have an nginx server, you can't script it in Lua at all.
What you probably mean is OpenResty, which bundles the lua-nginx-module; that module lets you run Lua code in nginx to handle requests programmatically.
Assuming that you have a working nginx + lua-nginx-module installed and running, what you're looking for is the rewrite_by_lua directive, which lets you redirect the client to a different address based on their request.
(Realistically, you'd likely want to use rewrite_by_lua_block or rewrite_by_lua_file instead)
Within the Lua block, you can make API calls, execute some logic, etc. and then redirect to some URI internally with ngx.exec or send an actual redirect to the client with ngx.redirect.
If you want to read in a JSON or YAML file, you should do so in init_by_lua so the file gets loaded only once and then stays in memory. The lua-cjson module is bundled with OpenResty, so you can just use that to parse your JSON data into a Lua table.
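The flow described above (load the file once, look up the token on each request, then pick a target) can be sketched as follows. This is written in Python purely to illustrate the logic; in OpenResty the same steps would be Lua code in the init and rewrite phases. The JSON layout, the privilege names, the URIs, and the function names are all assumptions made for the example.

```python
import json

# Hypothetical file contents: token -> privilege.
# In practice this text would be read from disk once at startup.
TOKENS_JSON = '{"abc123": "admin", "def456": "viewer"}'

# Hypothetical mapping from privilege to the URI to redirect to.
ROUTES = {"admin": "/admin/", "viewer": "/readonly/"}


def load_tokens(text):
    """Parse the token file into a dict (the 'load once' step)."""
    return json.loads(text)


# Loaded a single time and kept in memory for all later requests.
TOKENS = load_tokens(TOKENS_JSON)


def route_for(token, tokens=TOKENS, routes=ROUTES):
    """Per-request step: return (status, target) for a token.

    A known token with a known privilege gets a 302 and a target URI;
    anything else gets a 401 and no target.
    """
    privilege = tokens.get(token)
    if privilege is None or privilege not in routes:
        return 401, None
    return 302, routes[privilege]
```

The point of the split is that parsing happens once, while the per-request work is only two dictionary lookups.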
I have a pure CherryPy server which has been running for a few years already. I decided recently to add SSL support. In this case it was enough to provide the certificate and key files and to assign correct values to the variables cherrypy.server.ssl_certificate and cherrypy.server.ssl_private_key.
I would like to give a warning about this change whenever somebody tries to access a page using "http://..." instead of "https://...". Is there a simple way of achieving this without many changes in my system? Another option would be to redirect the HTTP access to HTTPS—can that be done easily?
I would create a custom handler to achieve what you're after. This automatically redirects to HTTPS.
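import cherrypy

class Functions():
    def check_ssl(self=None):
        # If the request arrived over plain HTTP, redirect to the same URL over HTTPS
        if cherrypy.request.scheme == "http":
            url = cherrypy.url(qs=cherrypy.request.query_string)
            raise cherrypy.HTTPRedirect(url.replace("http:", "https:", 1))

# Register the tool; enable it in the config with 'tools.Functions.on': True
cherrypy.tools.Functions = cherrypy.Tool('before_handler', Functions.check_ssl)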
When using Geb, is it possible to set custom request headers and user agent when using the Browser API (and not the Direct Download API)?
While this is possible with the FirefoxDriver (see here), I am looking for a way of doing this with the WebKitDriver.
A possible solution is via a proxy.
BrowserMob Proxy can run standalone with a REST API or be embedded in your tests programmatically: https://github.com/webmetrics/browsermob-proxy . This is useful when there are a lot of custom headers you want to test.
If you already have Apache, you can create another VirtualHost on a different port that sets that particular request header, and point your browser to that port before the test. This only works if your header doesn't change between tests.
This might not be a direct solution to your question (modifying request headers directly in the Browser API), but it achieves the same end result.
I need to redirect all requests on port 80 of an application server to a web server. I'm trying to avoid the need to install IIS and instead use WCF to do the job.
It looks like an operation such as the one below is suitable but one problem I've got is if a URL of the form http://mydomain.com/ is used then WCF will present a page about metadata.
[OperationContract, WebGet(UriTemplate = "*")]
RedirectToWebServer();
Does anybody know of a way to get WCF behaving the same as IIS in redirect mode?
This just seems like the wrong tool for the job. If you really don't want to use one of the many web servers that could do this with a couple minutes of setup time (IIS, Apache, Lighttpd), you could just make a simple HTTP socket server.
Listen on port 80. As soon as you get two newlines in a row, send back the response:
HTTP/1.1 301 Moved Permanently
Location: http://myothersite.com/whatever
(I'm almost certain that's the minimum you need, followed by a blank line to end the headers.) If you want to be really fancy and follow the HTTP spec, match HTTP/1.1 or HTTP/1.0 based on what the request uses, but for a quick and dirty redirect, that's all you need.
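For illustration, here is a minimal sketch of such a socket server in Python (the question is about WCF, so treat this only as a demonstration of the idea, not a drop-in replacement). The function names are made up for the example, and the Content-Length and Connection headers are additions beyond the two lines shown above, included so that clients know the response is complete.

```python
import socket


def redirect_response(location, version="HTTP/1.1"):
    """Build a minimal 301 response pointing at `location`."""
    return (
        f"{version} 301 Moved Permanently\r\n"
        f"Location: {location}\r\n"
        "Content-Length: 0\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def handle_client(conn, location):
    """Read until the blank line that ends the request headers, then redirect."""
    data = b""
    while b"\r\n\r\n" not in data and b"\n\n" not in data:
        chunk = conn.recv(1024)
        if not chunk:
            # Client went away before finishing the request.
            conn.close()
            return
        data += chunk
    conn.sendall(redirect_response(location))
    conn.close()


def serve(location, host="0.0.0.0", port=80):
    """Accept connections forever and redirect each one to `location`."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as server:
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind((host, port))
        server.listen()
        while True:
            conn, _addr = server.accept()
            handle_client(conn, location)
```

Calling serve("http://myothersite.com/whatever") would start the listener; binding to port 80 usually requires elevated privileges. The sketch handles one client at a time and always redirects to the same fixed URL, regardless of the requested path.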
That said, again, I'd say go grab another web server and set up a redirect using it. There are many lightweight HTTP servers that will work.