How does Python Eventlet make concurrent requests? - eventlet

The client uses eventlet with the following code.
import eventlet
import urllib.request

urls = [
    "http://localhost:5000/",
    "http://localhost:5000/",
    "http://localhost:5000/",
]

def fetch(url: str) -> str:
    return urllib.request.urlopen(url).read()

pool = eventlet.GreenPool(1000)
for body in pool.imap(fetch, urls):
    print("got body", len(body), body)
The server side is built with FastAPI. So that we can tell whether the client really sends requests concurrently, we add a delay:
from fastapi import FastAPI
import uvicorn
import time

app = FastAPI()

@app.get('/')
def root():
    time.sleep(3)
    return {"message": "Hello World"}

if __name__ == '__main__':
    uvicorn.run("api:app", host="0.0.0.0", port=5000)
Start the server first, then run the client code, using Linux's time command to see how long it takes.
got body 25 b'{"message": "Hello World"}'
got body 25 b'{"message": "Hello World"}'
got body 25 b'{"message": "Hello World"}'
python -u "013.py" 0.25s user 0.01s system 2% cpu 9.276 total
As you can see, it took 9 seconds instead of 3 seconds, so this `eventlet` code is not concurrent. Am I using it wrong? What is the correct way to use it?
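For context, eventlet only switches between green threads at cooperative I/O points, and the stock urllib.request uses the regular blocking socket module unless the standard library is monkey-patched first. A minimal sketch of that common pattern (an assumption about the intended usage, not part of the original question):

import eventlet
eventlet.monkey_patch()  # patch socket, ssl, time, ... so blocking calls yield to other green threads

import urllib.request

urls = ["http://localhost:5000/"] * 3

def fetch(url):
    # With the patched socket module, urlopen() yields to other green threads while waiting on the network.
    return urllib.request.urlopen(url).read()

pool = eventlet.GreenPool(1000)
for body in pool.imap(fetch, urls):
    print("got body", len(body))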

Related

Is flask_restful compatible with virtual environments?

I am making an API with Flask-RESTful, but when I make the POST
http://127.0.0.1:5000/bot?id_articulo=1&url_articulo=www.wiki.org
I get the message
"message": "The browser (or proxy) sent a request that this server could not understand."
My python code is
from flask import Flask
from flask_restful import Resource, Api, reqparse
import pandas as pd

app = Flask(__name__)
api = Api(app)

class Bot(Resource):
    def post(self):
        parser = reqparse.RequestParser()
        parser.add_argument('id_articulo' , required=True, type=int)
        parser.add_argument('url_articulo', required=True, type=str)
        args = parser.parse_args()
        print(args)
        data_articulo = pd.read_csv('articulos.csv')
        print(data_articulo)
        if args['url_articulo'] in list(data_articulo['url']):
            return {
                'mensage': f"El artículo '{args['url_articulo']}' ya existe."
            }, 409
        else:
            nueva_columna = pd.DataFrame({
                'id_articulo': [args['id_articulo']],
                'url': [args['url_articulo']],
            })
            data_articulo = data_articulo.append(nueva_columna, ignore_index=True)
            data_articulo.to_csv('articulos.csv', index=False)
            return {'data': data_articulo.to_dict()}, 200

api.add_resource(Bot, '/bot', methods=['POST'])

if __name__ == '__main__':
    app.run()
Now, I noticed that the error message is thrown only when I am in a virtual environment whose requirements.txt is
aniso8601==9.0.1
click==8.1.3
colorama==0.4.5
Flask==2.1.2
Flask-RESTful==0.3.9
importlib-metadata==4.12.0
itsdangerous==2.1.2
Jinja2==3.1.2
joblib==1.1.0
MarkupSafe==2.1.1
numpy==1.23.1
pandas==1.4.3
python-dateutil==2.8.2
pytz==2022.1
six==1.16.0
Werkzeug==2.1.2
zipp==3.8.0
So far I don't have a clue about what is going on, and it makes me think that the flask_restful library has issues with virtual environments; I would like to know how to make this work properly in one.
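For context (an assumption based on the pinned versions, not verified against this exact setup): Flask-RESTful 0.3.9's reqparse is known to clash with Flask/Werkzeug 2.1, where a failed attempt to parse a JSON body surfaces as the 400 "could not understand" response, so the virtual environment itself is likely incidental. One hedged workaround is to tell the parser to read the query string explicitly:

from flask_restful import reqparse

# Hedged adjustment: read both parameters from the query string only,
# so reqparse never tries to parse a (missing) JSON body.
parser = reqparse.RequestParser()
parser.add_argument('id_articulo',  required=True, type=int, location='args')
parser.add_argument('url_articulo', required=True, type=str, location='args')
args = parser.parse_args()  # inside the Resource's post() method, as in the code above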

Why is the botium-box HTTP(S)/JSON connector request empty?

This is my demo code:
from flask import Flask, jsonify, request, render_template
import json

app = Flask(__name__)

@app.route('/t', methods=['GET', 'POST'])
def t():
    print(request.data)
    print(request.headers)
    return jsonify({"test": "test"})

if __name__ == '__main__':
    app.run(debug=True, host="0.0.0.0", port=5000)
When I use the Generic HTTP(S)/JSON Connector and click the CHECK CONNECTIVITY button, it responds with "test" as text, but looking at the logs, the request body is empty and the headers contain nothing useful. The logs are below:
192.168.58.119 - - [17/Aug/2020 15:24:35] "GET /t HTTP/1.1" 200 -
b''
Host: 192.168.58.145:5000
Connection: close
The connectivity is successful.
How can I get the Botium Box query text from the HTTP headers? Thanks!
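As a side note (an assumption, since the connector configuration isn't shown): the Generic HTTP(S)/JSON Connector typically sends the user text in a JSON request body rather than in headers, so a route that reads the body may be closer to what is needed. A minimal Flask sketch; the field name "text" is a placeholder that would have to match the configured body template:

from flask import Flask, jsonify, request

app = Flask(__name__)

@app.route('/t', methods=['POST'])
def t():
    # Assumes the connector is configured to POST a JSON body containing the user text.
    payload = request.get_json(silent=True) or {}
    print("incoming payload:", payload)
    return jsonify({"text": "echo: " + str(payload.get("text", ""))})

if __name__ == '__main__':
    app.run(host="0.0.0.0", port=5000)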

How do you get a response from a Flask app with curl?

I'm building a Flask test predictor using AllenNLP.
I'm passing 'passage' and 'question' from a .json file to the predictor.
However, when I pass the json file using curl, it doesn't return a response. Is there a special return in Flask to get it?
Code looks like:
from allennlp.predictors.predictor import Predictor as AllenNLPPredictor
from flask import Flask
from flask import request

app = Flask(__name__)

@app.route("/", methods=['GET', 'POST'])
def hello():
    return "<h1>Test app!</h1>"

class PythonPredictor:
    def __init__(self, config):
        self.predictor = AllenNLPPredictor.from_path(
            "https://storage.googleapis.com/allennlp-public-models/bidaf-elmo-model-2018.11.30-charpad.tar.gz"
        )

    def predict(self, payload):
        if request.method == "POST":
            prediction = self.predictor.predict(
                passage=payload["passage"], question=payload["question"]
            )
            return prediction["best_span_str"]
Curl command looks like:
curl http://127.0.0.1:5000 -X POST -H "Content-Type: application/json" -d @sample.json
Unless I've misunderstood (I'm guessing you're asking how to obtain the JSON submission in your route, and return the result) it sounds like you need to do something like:
p = PythonPredictor(config=None)  # __init__ takes a config argument but does not use it

@app.route("/", methods=['POST'])
def hello():
    data = request.get_json()
    result = p.predict(data)
    return result
This effectively runs the data in your sample.json through your PythonPredictor.predict method, and returns that prediction to the client.
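For reference, the field names come from the predict call above, so sample.json would look something like this (the values are placeholders):

{
    "passage": "Flask is a lightweight WSGI web application framework for Python.",
    "question": "What is Flask?"
}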
Notice this code creates the instance p outside the route function, so that the NLP model is only loaded when your Flask app starts (not on every request). However, it looks like this may re-download that file unless AllenNLPPredictor.from_path does some caching, so it would probably be advisable to manually download that file to your own storage first and load it from there in the PythonPredictor.__init__ function.
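For example, a sketch assuming the archive has already been downloaded to a local path of your choosing (from_path also accepts a filesystem path, not only a URL):

import os
from allennlp.predictors.predictor import Predictor as AllenNLPPredictor

# Hypothetical local copy of the model archive, downloaded once ahead of time.
MODEL_PATH = os.environ.get(
    "BIDAF_MODEL_PATH",
    "/models/bidaf-elmo-model-2018.11.30-charpad.tar.gz",
)

class PythonPredictor:
    def __init__(self, config=None):
        # Loading from a local path avoids re-downloading the archive on every start.
        self.predictor = AllenNLPPredictor.from_path(MODEL_PATH)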
Let me know if any of this needs clarification, or if I've misunderstood.

Setting referer in Selenium

I'm working with the Selenium remote driver to automate actions on a site. I can open the page I need directly by constructing the URL, since the site's URL schema is very consistent. This speeds up the script, as it does not have to work through several pages before it gets to the one it needs.
To make the automation seem organic, is there a way to set a referral page in Selenium?
If you're checking the referrer on the server, then using a proxy (as mentioned in other answers) will be the way to go.
However, if you need access to the referrer in JavaScript, using a proxy will not work. To set the JavaScript referrer I did the following:
Go to the referral website
Inject this JavaScript onto the page via the Selenium API: document.write('<script>window.location.href = "<my website>";</script>')
I'm using a Python wrapper around selenium, so I cannot provide the function you need to inject the code in your language, but it should be easy to find.
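In Python, for instance, the same idea can be expressed with execute_script (a sketch; both URLs are placeholders):

from selenium import webdriver

driver = webdriver.Firefox()

# Step 1: load the page that should show up as the referrer.
driver.get("https://referring-site.example/")

# Step 2: navigate away via JavaScript, so document.referrer points at the page above.
driver.execute_script('window.location.href = arguments[0];', "https://target-site.example/")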
What you are looking for is referer spoofing.
Selenium does not have a built-in method to do this; however, it can be accomplished by using a proxy such as Fiddler.
Fiddler also provides an API-only version of the FiddlerCore component, and programmatic access to all of the proxy's settings and data, thus allowing you to modify the headers of the HTTP request.
Here is a solution in Python to do exactly that:
https://github.com/j-bennet/selenium-referer
I described the use case and the solution in the README. I think the GitHub repo won't go anywhere, but I'll quote the relevant pieces here just in case.
The solution uses libmproxy to implement a proxy server that does only one thing: add a Referer header. The header is specified as a command line parameter when running the proxy. Code:
# -*- coding: utf-8 -*-
"""
Proxy server to add a specified Referer: header to the request.
"""
from optparse import OptionParser

from libmproxy import controller, proxy
from libmproxy.proxy.server import ProxyServer


class RefererMaster(controller.Master):
    """
    Adds a specified referer header to the request.
    """

    def __init__(self, server, referer):
        """
        Init the proxy master.
        :param server: ProxyServer
        :param referer: string
        """
        controller.Master.__init__(self, server)
        self.referer = referer

    def run(self):
        """
        Basic run method.
        """
        try:
            print('Running...')
            return controller.Master.run(self)
        except KeyboardInterrupt:
            self.shutdown()

    def handle_request(self, flow):
        """
        Adds a Referer header.
        """
        flow.request.headers['referer'] = [self.referer]
        flow.reply()

    def handle_response(self, flow):
        """
        Does not do anything extra.
        """
        flow.reply()


def start_proxy_server(port, referer):
    """
    Start proxy server and return an instance.
    :param port: int
    :param referer: string
    :return: RefererMaster
    """
    config = proxy.ProxyConfig(port=port)
    server = ProxyServer(config)
    m = RefererMaster(server, referer)
    m.run()


if __name__ == '__main__':
    parser = OptionParser()
    parser.add_option("-r", "--referer", dest="referer",
                      help="Referer URL.")
    parser.add_option("-p", "--port", dest="port", type="int",
                      help="Port number (int) to run the server on.")
    popts, pargs = parser.parse_args()
    start_proxy_server(popts.port, popts.referer)
Then, in the setUp() method of the test, the proxy server is started as an external process using pexpect, and stopped in tearDown(). A method called proxy() returns the proxy settings to configure the Firefox driver with:
# -*- coding: utf-8 -*-
import os
import sys

import pexpect
import unittest
from selenium.webdriver.common.proxy import Proxy, ProxyType

import utils


class ProxyBase(unittest.TestCase):
    """
    We have to use our own proxy server to set a Referer header, because Selenium does not
    allow to interfere with request headers.
    This is the base class. Change `proxy_referer` to set different referers.
    """

    base_url = 'http://www.facebook.com'

    proxy_server = None
    proxy_address = '127.0.0.1'
    proxy_port = 8888
    proxy_referer = None
    proxy_command = '{0} {1} --referer {2} --port {3}'

    def setUp(self):
        """
        Create the environment.
        """
        print('\nSetting up.')
        self.start_proxy()
        self.driver = utils.create_driver(proxy=self.proxy())

    def tearDown(self):
        """
        Cleanup the environment.
        """
        print('\nTearing down.')
        utils.close_driver(self.driver)
        self.stop_proxy()

    def proxy(self):
        """
        Create proxy settings for our Firefox profile.
        :return: Proxy
        """
        proxy_url = '{0}:{1}'.format(self.proxy_address, self.proxy_port)
        p = Proxy({
            'proxyType': ProxyType.MANUAL,
            'httpProxy': proxy_url,
            'ftpProxy': proxy_url,
            'sslProxy': proxy_url,
            'noProxy': 'localhost, 127.0.0.1'
        })
        return p

    def start_proxy(self):
        """
        Start the proxy process.
        """
        if not self.proxy_referer:
            raise Exception('Set the proxy_referer in child class!')
        python_path = sys.executable
        current_dir = os.path.dirname(__file__)
        proxy_file = os.path.normpath(os.path.join(current_dir, 'referer_proxy.py'))
        command = self.proxy_command.format(
            python_path, proxy_file, self.proxy_referer, self.proxy_port)
        print('Running the proxy command:')
        print(command)
        self.proxy_server = pexpect.spawnu(command)
        self.proxy_server.expect_exact(u'Running...', 2)

    def stop_proxy(self):
        """
        Stop the proxy process.
        """
        print('Stopping proxy server...')
        self.proxy_server.close(True)
        print('Proxy server stopped.')
I wanted my unit tests to start and stop the proxy server without any user interaction, and could not find any Python samples doing that. Which is why I created the github repo (link above).
Hope this helps someone.
Not sure if I understand your question correctly, but if you want to modify your HTTP requests, there is no way to do it directly with WebDriver. You must run your requests through a proxy. I prefer using BrowserMob Proxy; you can get it through Maven or similar.
ProxyServer server = new ProxyServer(proxy_port); // net.lightbody.bmp.proxy.ProxyServer
server.start();
server.setCaptureHeaders(true);

Proxy proxy = server.seleniumProxy(); // org.openqa.selenium.Proxy
proxy.setHttpProxy("localhost").setSslProxy("localhost");

server.addRequestInterceptor(new RequestInterceptor() {
    @Override
    public void process(BrowserMobHttpRequest browserMobHttpRequest, Har har) {
        browserMobHttpRequest.addRequestHeader("Referer", "blabla");
    }
});

// configure it as a desired capability
DesiredCapabilities capabilities = new DesiredCapabilities();
capabilities.setCapability(CapabilityType.PROXY, proxy);

// start the driver
driver = new FirefoxDriver(capabilities);
Or black/whitelist anything:
server.blacklistRequests("https?://.*\\.google-analytics\\.com/.*", 410);
server.whitelistRequests("https?://*.*.yoursite.com/.*. https://*.*.someOtherYourSite.*".split(","), 200);

ScrapyDeprecationWarning: Command's default `crawler` is deprecated and will be removed. Use `create_crawler` method to instantiate crawlers

Scrapy version 0.19
I am using the code at this page (Run multiple scrapy spiders at once using scrapyd). When I run scrapy allcrawl, I get:
ScrapyDeprecationWarning: Command's default `crawler` is deprecated and will be removed. Use `create_crawler` method to instantiate crawlers
Here is the code:
from scrapy.command import ScrapyCommand
import urllib
import urllib2
from scrapy import log


class AllCrawlCommand(ScrapyCommand):

    requires_project = True
    default_settings = {'LOG_ENABLED': False}

    def short_desc(self):
        return "Schedule a run for all available spiders"

    def run(self, args, opts):
        url = 'http://localhost:6800/schedule.json'
        for s in self.crawler.spiders.list():  # this line raises the warning
            values = {'project': 'YOUR_PROJECT_NAME', 'spider': s}
            data = urllib.urlencode(values)
            req = urllib2.Request(url, data)
            response = urllib2.urlopen(req)
            log.msg(response)
How do I fix the DeprecationWarning?
Thanks
Use:
crawler = self.crawler_process.create_crawler()
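Applied to the command above, the loop would then iterate over that crawler's spiders instead of the deprecated default. A sketch based on this suggestion (not tested against Scrapy 0.19):

# Inside AllCrawlCommand, replacing the run() method shown in the question.
def run(self, args, opts):
    url = 'http://localhost:6800/schedule.json'
    # Instantiate a crawler explicitly instead of relying on the deprecated default.
    crawler = self.crawler_process.create_crawler()
    for s in crawler.spiders.list():
        values = {'project': 'YOUR_PROJECT_NAME', 'spider': s}
        data = urllib.urlencode(values)
        req = urllib2.Request(url, data)
        response = urllib2.urlopen(req)
        log.msg(response)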