Future/Promise like stuff for Trio in Python? - python-trio

Say I have a class Messenger which is responsible for sending and receiving messages. Now I have a service that sends out requests and waits for responses via it, matching each pair with an id field in the message. In asyncio I would do:
class Service:
    ...
    async def request(self, req):
        new_id = self._gen_id()
        req.id = new_id
        fu = asyncio.Future()
        self._requests[new_id] = fu
        await self._messenger.send(req)
        return await fu

    def handle_response(self, res):
        try:
            fu = self._requests.pop(res.req_id)
        except KeyError:
            return
        fu.set_result(res)
This way I can send multiple requests from different tasks and have each task wait for its own response. (Some messages have no corresponding response and are handled another way.)
But how do I do this in Trio? Should I create an Event / Condition / Queue for each request and put the response in a predefined place?
If yes, which is best for this scenario? Or is there another way to do this?

You could create a simple class that contains an Event and your result.
However, strictly speaking, an Event is overkill: multiple tasks can wait on one event, which you don't need here. Instead you should use trio.hazmat.wait_task_rescheduled (the hazmat module was renamed trio.lowlevel in later Trio releases). That also gives you a hook for doing something when the requesting task is cancelled before it receives its reply.
http://trio.readthedocs.io/en/latest/reference-hazmat.html#low-level-blocking
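A minimal sketch of the first suggestion, a small class that pairs an Event with a slot for the result. It is written against asyncio.Event only so that it runs on the standard library; trio.Event offers the same set() and wait() methods, so swapping the import should be most of the port for the class itself. ReplySlot, this simplified Service, and the echoing transport are hypothetical names for illustration, not part of either library.

```python
import asyncio
import itertools


class ReplySlot:
    """One pending request: an event plus a place to put the reply."""

    def __init__(self):
        self._event = asyncio.Event()  # assumption: trio.Event() in a Trio program
        self._value = None

    def set(self, value):
        self._value = value
        self._event.set()

    async def wait(self):
        await self._event.wait()
        return self._value


class Service:
    def __init__(self, send):
        self._send = send              # hypothetical async callable that transmits a request
        self._ids = itertools.count(1)
        self._pending = {}             # request id -> ReplySlot

    async def request(self, payload):
        req_id = next(self._ids)
        slot = self._pending[req_id] = ReplySlot()
        try:
            await self._send(req_id, payload)
            return await slot.wait()
        finally:
            # Also runs on cancellation, so abandoned requests do not leak.
            self._pending.pop(req_id, None)

    def handle_response(self, req_id, result):
        slot = self._pending.get(req_id)
        if slot is not None:           # replies without a waiter are ignored
            slot.set(result)


async def demo():
    async def echo_send(req_id, payload):
        # Stand-in transport: schedule the reply right after sending.
        asyncio.get_running_loop().call_soon(
            service.handle_response, req_id, payload.upper()
        )

    service = Service(echo_send)
    return await asyncio.gather(service.request("a"), service.request("b"))
```

Running asyncio.run(demo()) resolves each request with its own reply. The try/finally is the simple counterpart of the cancellation hook mentioned above: it removes the pending entry however the wait ends.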

Related

Insert Record to BigQuery or some RDB during API Call

I am writing a REST API GET endpoint that needs to both return a response and store records to either BigQuery or GCP Cloud SQL (MySQL), but I don't want the response to depend on the records having been written. Basically, my code will look like:
def predict():
    req = request.json.get("instances")
    resp = make_response(req)
    write_to_bq(req)
    write_to_bq(resp)
    return resp
Is there any easy way to do this with Cloud SQL Client Library or something?
It turns out Flask has functionality that does what I require:
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/predict", methods=["GET"])
def predict():
    # do some stuff with the request.json object
    return jsonify(response)

@app.after_request
def after_request_func(response):
    # do anything you want that relies on context of predict()
    @response.call_on_close
    def persist():
        # this will happen after response is sent,
        # so even if this function fails, the predict()
        # will still get its response out
        write_to_db()
    return response
One important thing is that a function decorated with after_request must take one argument and return a flask.Response. Also, I think a function decorated with call_on_close cannot access the context of the main function, so anything you want to use from the main function has to be defined inside the after_request function but outside (above) the call_on_close function.

Easier way to use pika asynchronous (twisted)?

This is my first project using RabbitMQ and I am completely lost, because I am not sure what the best way to solve the problem would be.
The program is fairly simple: it just listens for alarm events and then puts them in a RabbitMQ queue. But I am struggling with the architecture of the program.
If I open a connection, publish, and then close it for every single event, I will add a lot of latency, and unnecessary packets will be transmitted (even more than usual, because I am using TLS).
If I keep a connection open and create a function that publishes the messages (I only work with a single queue, pretty basic), I will eventually have problems, because multiple events can occur at the same time and my program will not know what to do if the connection to the RabbitMQ broker ends.
Reading the pika documentation, the solution seems to be one of its "Connection Adapters", which would fit me like a glove because I just rewrote all my connection code from basic sockets to Twisted (I really liked its high-level approach). But there is a problem: the "basic example" is fairly complex for someone who barely considers himself "intermediate".
In a perfect world, I would be able to run the service in the same reactor as the "alarm servers" and call a method to publish a message. But I am struggling to understand the code. Could anyone who has worked with pika point me in a better direction, or tell me if there is an easier way?
Well, I will post what worked for me. It is probably not the best alternative, but maybe it helps someone who gets here with the same problem.
First I decided to drop Twisted and use asyncio (nothing personal, I just wanted to use it because it's already in Python). Even though pika has a good asynchronous example, I tried aio_pika and found it easier to use.
I ended up with two main functions: one for a publisher and another for a subscriber.
Below is the code that works for me:
# -*- coding: utf-8 -*-
import asyncio
import aio_pika
from myapp import conf

QUEUE_SEND = []


def add_queue_send(msg):
    """Add MSG to QUEUE

    Args:
        msg (string): JSON
    """
    QUEUE_SEND.append(msg)


def build_url(amqp_user, amqp_pass, virtual_host):
    """Build Auth URL

    Args:
        amqp_user (str): User name
        amqp_pass (str): Password
        virtual_host (str): Virtual Host
    Returns:
        str: AMQP URL
    """
    return ''.join(['amqps://',
                    amqp_user, ':', amqp_pass,
                    '@', conf.get('amqp_host'), '/', virtual_host,
                    '?cafile=', conf.get('ca_cert'),
                    '&keyfile=', conf.get('client_key'),
                    '&certfile=', conf.get('client_cert'),
                    '&no_verify_ssl=0'])


async def process_message(message: aio_pika.IncomingMessage):
    """Read a new message

    Args:
        message (aio_pika.IncomingMessage): Message
    """
    async with message.process():
        # TODO: Do something with the new message
        await asyncio.sleep(1)


async def consumer(url):
    """Keep listening to an AMQP queue

    Args:
        url (str): URL
    Returns:
        aio_pika.Connection: Conn?
    """
    connection = await aio_pika.connect_robust(url=url)
    # Channel
    channel = await connection.channel()
    # Max concurrent messages?
    await channel.set_qos(prefetch_count=100)
    # Queue
    queue = await channel.declare_queue(conf.get('amqp_queue_client'))
    # Callback invoked when a new message is received
    await queue.consume(process_message)
    # Returns the connection?
    return connection


async def publisher(url):
    """Send messages from the queue.

    Args:
        url (str): Authentication URL
    """
    connection = await aio_pika.connect_robust(url=url)
    # Channel
    channel = await connection.channel()
    while True:
        if QUEUE_SEND:
            # If the list (my queue) is not empty;
            # pop(0) takes the oldest message first (FIFO)
            msg = aio_pika.Message(body=QUEUE_SEND.pop(0).encode())
            await channel.default_exchange.publish(msg, routing_key='queue')
        else:
            # Just wait
            await asyncio.sleep(1)
    # Not reached while the loop above runs forever
    await connection.close()
I started both using loop.create_task.
As I said, it kind of worked for me (even though I am still having an issue with another part of my code), but I did not want to leave this question open, since other people may have the same issue.
If you know a better or more elegant approach, please share.
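One possible refinement of the publisher above, sketched under assumptions: replace the module-level list and the one-second polling sleep with an asyncio.Queue, so the publisher wakes as soon as a message arrives and messages keep their order. The publish parameter is a hypothetical stand-in for the aio_pika publish call, so the sketch runs without a broker, and the None sentinel is just one way to let the loop end.

```python
import asyncio


async def publisher(queue, publish):
    """Forward messages from the queue until a None sentinel arrives."""
    while True:
        msg = await queue.get()        # suspends here instead of sleeping and polling
        if msg is None:                # sentinel: lets the loop end so cleanup can run
            break
        await publish(msg.encode())


async def demo():
    sent = []

    async def fake_publish(body):
        # Hypothetical stand-in for channel.default_exchange.publish(...)
        sent.append(body)

    queue = asyncio.Queue()
    task = asyncio.create_task(publisher(queue, fake_publish))
    for event in ('{"alarm": 1}', '{"alarm": 2}'):
        queue.put_nowait(event)        # what add_queue_send() would do
    queue.put_nowait(None)
    await task
    return sent
```

Note that asyncio.Queue is not thread-safe: if alarm events arrive on another thread, they would have to be handed over with loop.call_soon_threadsafe(queue.put_nowait, msg).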

Designing an API for the client to a 3rd-party service

I am fairly new to Scala and I'm working on an application (library) which is a client to a 3rd-party service (I'm not able to modify the server side, and it uses a custom binary protocol). I use Netty for networking.
I want to design an API which should allow users to:
Send requests to the server
Send requests to the server and get the response asynchronously
Subscribe to events triggered by the server (having multiple asynchronous event handlers which should be able to send requests as well)
I am not sure how I should design it. While exploring Scala, I stumbled upon a lot of information about the Actor model, but I am not sure whether it can be applied here and, if so, how.
I'd like to get some recommendations on the way I should take.
In general, the Scala-ish way to expose asynchronous functionality to user code is to return a scala.concurrent.Future[T].
If you're going the actor route, you might consider encapsulating the binary communication within the context of a single actor class. You can scale the instances of this proxy actor using Akka's router support, and you could produce response futures easily using the ask pattern. There are a few nice libraries (Spray, Play Framework) that make wrapping e.g. a RESTful or even WebSocket layer over Akka almost trivial.
A nice model for the pub-sub functionality might be to define a Publisher trait that you can mix in to some actor subclasses. This could define some state to keep track of subscribers, handle Subscribe and Unsubscribe messages, and provide some sort of convenient method for broadcasting messages:
/**
 * Sends a copy of the supplied event object to every subscriber of
 * the event object class and superclasses.
 */
protected[this] def publish[T](event: T): Unit = {
  for (subscriber <- subscribersFor(event)) subscriber ! event
}
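The same dispatch idea can be sketched outside Akka. The following is an illustration in Python rather than actor code: subscribers register for a class, and publish delivers an event to everyone subscribed to the event's class or any of its superclasses. Plain callables stand in for actor references, and every name here is hypothetical.

```python
from collections import defaultdict


class Publisher:
    def __init__(self):
        self._subscribers = defaultdict(list)   # event class -> list of callbacks

    def subscribe(self, event_type, callback):
        self._subscribers[event_type].append(callback)

    def unsubscribe(self, event_type, callback):
        self._subscribers[event_type].remove(callback)

    def publish(self, event):
        # type(event).__mro__ lists the event's class followed by its superclasses.
        for cls in type(event).__mro__:
            for callback in list(self._subscribers.get(cls, ())):
                callback(event)


class ServerEvent:
    pass


class Disconnected(ServerEvent):
    pass


def demo():
    seen = []
    bus = Publisher()
    bus.subscribe(ServerEvent, lambda e: seen.append(("any", type(e).__name__)))
    bus.subscribe(Disconnected, lambda e: seen.append(("disc", type(e).__name__)))
    bus.publish(Disconnected())   # reaches both subscribers
    bus.publish(ServerEvent())    # reaches only the ServerEvent subscriber
    return seen
```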
These are just some ideas based on doing something similar in some recent projects. Feel free to elaborate on your use case if you need more specific direction. Also, the Akka user list is a great resource for general questions like this, if indeed you're interested in exploring actors in Scala.
Observables
This looks like a good fit for the Observable pattern. This pattern comes from the Reactive Extensions for .NET, but is also available for Java and Scala. The library is provided by Netflix and is of really good quality.
This pattern has a good theoretical foundation: it is the dual of the iterator in the category-theoretical sense. More importantly, it contains a lot of practical ideas. In particular, it handles time very well; for example, you can limit the rate of events you receive.
With an observable you can process events on a very high level. In .NET it looks a lot like an SQL query: you can register for certain events ("FROM"), filter them ("WHERE") and finally process them ("SELECT"). In Scala you can use the standard monadic API (map, filter, flatMap) and of course "for expressions".
An example can look like
stackoverflowQuestions.filter(_.tag == "Scala").map(_.subject).throttleLast(1 second).subscribe(println _)
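To make the filter/map/subscribe chain less abstract, here is a deliberately tiny sketch of the pattern in Python. It is not RxJava's API: Observable, Subject, and emit are hypothetical names, and error handling and time-based operators such as throttleLast are left out.

```python
class Observable:
    def __init__(self, subscribe):
        self._subscribe = subscribe      # function: on_next -> unsubscribe callable

    def subscribe(self, on_next):
        return self._subscribe(on_next)

    def filter(self, predicate):
        def subscribe(on_next):
            def forward(value):
                if predicate(value):
                    on_next(value)
            return self.subscribe(forward)
        return Observable(subscribe)

    def map(self, func):
        def subscribe(on_next):
            return self.subscribe(lambda value: on_next(func(value)))
        return Observable(subscribe)


class Subject(Observable):
    """An observable that values can be pushed into."""

    def __init__(self):
        self._observers = []
        super().__init__(self._add)

    def _add(self, on_next):
        self._observers.append(on_next)
        return lambda: self._observers.remove(on_next)

    def emit(self, value):
        for observer in list(self._observers):
            observer(value)


def demo():
    questions = Subject()
    received = []
    unsubscribe = (
        questions
        .filter(lambda q: q["tag"] == "Scala")
        .map(lambda q: q["subject"])
        .subscribe(received.append)
    )
    questions.emit({"tag": "Scala", "subject": "Futures"})
    questions.emit({"tag": "Python", "subject": "Trio"})
    unsubscribe()                        # later events are no longer delivered
    questions.emit({"tag": "Scala", "subject": "Actors"})
    return received
```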
Observables take away a lot of the problems you will have with event-based systems:
Handling subscriptions
Handling errors
Filtering and pre-processing events
Buffering events
Structuring the API
Your API should provide an observable for each event source you have. For procedure calls, provide a function that maps the call to an observable: it calls the remote procedure and delivers the result through the observable.
Implementation details
Add the following dependency to your build.sbt:
libraryDependencies += "com.netflix.rxjava" % "rxjava-scala" % "0.15.0"
You can then use the following pattern to convert a callback to an observable (given your remote API has some way to register and unregister a callback):
private val callbackFunc: (rx.lang.scala.Observer[String]) => rx.lang.scala.Subscription = { o =>
  val listener = {
    case Value(s) => o.onNext(s)
    case Error(e) => o.onError(e)
  }
  remote.subscribe(listener)
  // Return an interface to cancel the subscription
  new Subscription {
    val unsubscribed = new AtomicBoolean(false)
    def isUnsubscribed: Boolean = unsubscribed.get()
    val asJavaSubscription: rx.Subscription = new rx.Subscription {
      def unsubscribe() {
        remote.unsubscribe(listener)
        unsubscribed.set(true)
      }
    }
  }
}
If you have some specific questions, just ask and I can refine the answer.
Additional resources
There is a very nice course from Martin Odersky et al. on Coursera, covering Observables and other reactive techniques.
Take a look at the spray-client library. This provides HTTP request functionality (I'm assuming the server you want to talk to is a web service?). It gives you a pretty nice DSL for building requests and is all about being asynchronous. It does use the Akka Actor model behind the scenes, but you do not have to build your own Actors to use it. Instead you can just use Scala's Future model for handling things asynchronously. A good introduction to the Future model is here.
The basic building block of spray-client is a "pipeline" which maps an HttpRequest to a Future containing an HttpResponse:
// this is from the spray-client docs
val pipeline: HttpRequest => Future[HttpResponse] = sendReceive
val response: Future[HttpResponse] = pipeline(Get("http://spray.io/"))
You can take this basic building block and build it up into a client API in a couple of steps. First, make a class that sets up a pipeline and defines some intermediate helpers demonstrating ResponseTransformation techniques:
import scala.concurrent._
import akka.actor.{ActorSystem, Props}
import spray.can.client.HttpClient
import spray.client.HttpConduit
import spray.client.HttpConduit._
import spray.http.{HttpRequest, HttpResponse, FormData}
import spray.httpx.unmarshalling.Unmarshaller
import spray.io.IOExtension

type Pipeline = (HttpRequest) => Future[HttpResponse]

// this is basically spray-client boilerplate
def createPipeline(system: ActorSystem, host: String, port: Int): Pipeline = {
  val httpClient = system.actorOf(Props(new HttpClient(IOExtension(system).ioBridge())))
  val conduit = system.actorOf(props = Props(new HttpConduit(httpClient, host, port)))
  sendReceive(conduit)
}

private var pipeline: Pipeline = _

// unmarshalls to a specific type, e.g. a case class representing a datamodel
private def unmarshallingPipeline[T](implicit ec: ExecutionContext, um: Unmarshaller[T]) = (pipeline ~> unmarshal[T])

// for requests that don't return any content. If you get a successful Future it worked;
// if there's an error you'll get a failed future from the errorFilter below.
private def unitPipeline(implicit ec: ExecutionContext) = (pipeline ~> { _: HttpResponse => () })

// similar to unitPipeline, but where you care about the specific response code.
private def statusPipeline(implicit ec: ExecutionContext) = (pipeline ~> { r: HttpResponse => r.status })

// if you want standard error handling create a filter like this
// RemoteServerError and RemoteClientError are custom exception classes
// not shown here.
val errorFilter = { response: HttpResponse =>
  if (response.status.isSuccess) response
  else if (response.status.value >= 500) throw RemoteServerError(response)
  else throw RemoteClientError(response)
}

pipeline = (createPipeline(system, "yourHost", 8080) ~> errorFilter)
Then you can wrap these up in methods tied to specific requests/responses, which become the public API. For example, suppose the service has a "ping" GET endpoint that returns a string ("pong") and a "form" POST endpoint where you post form data and receive a DataModel in return:
def ping()(implicit ec: ExecutionContext, um: Unmarshaller[String]): Future[String] =
  unmarshallingPipeline[String].apply(Get("/ping"))
def form(formData: Map[String, String])(implicit ec: ExecutionContext, um: Unmarshaller[DataModel]): Future[DataModel] =
  unmarshallingPipeline[DataModel].apply(Post("/form", FormData(formData)))
And then someone could use the API like this:
import scala.util.{Failure, Success}
API.ping() foreach(println) // will print out "pong" when response comes back
API.form(Map("a" -> "b")) onComplete {
  case Success(dataModel) => println("Form accepted. Server returned DataModel: " + dataModel)
  case Failure(e) => println("Oh noes, the form didn't go through! " + e)
}
I'm not sure if you will find direct support in spray-client for your third bullet point about subscribing to events. Are these events being generated by the server and somehow sent to your client outside the scope of a specific HTTP request? If so, then spray-client will probably not be able to help directly (though your event handlers could still use it to send requests). Are the events occurring on the client side, e.g. the completion of deferred processing initially triggered by a response from the server? If so, you could actually probably get pretty far just by using the functionality in Future, but depending on your use cases, using Actors might make sense.

While handling an NServiceBus message, is it possible to peek at the input queue?

I have a Windows service using NServiceBus to handle incoming messages.
While processing a message, I would like to check to see if there are any other remaining messages on the queue to process.
What is the best way to approach this?
For this specific scenario I'd say a saga could be appropriate: it is created by the first message received, opens a timeout (say, one minute), collects all messages that arrive during that period, and then uses Bus.SendLocal to send a message containing all rows, for which another handler creates the spreadsheet and uploads it.
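The timing logic of such a saga can be sketched independently of NServiceBus. The following Python illustration uses asyncio timers and is not NServiceBus API: BatchCollector, handle, and on_batch are hypothetical names, and the window is shortened from a minute to a fraction of a second so the example finishes quickly.

```python
import asyncio


class BatchCollector:
    """Collect messages for a fixed window that opens on the first message."""

    def __init__(self, window, on_batch):
        self._window = window        # seconds to keep collecting
        self._on_batch = on_batch    # called once with the list of collected rows
        self._rows = []
        self._timer = None

    def handle(self, row):
        self._rows.append(row)
        if self._timer is None:      # first message of a batch starts the timeout
            loop = asyncio.get_running_loop()
            self._timer = loop.call_later(self._window, self._flush)

    def _flush(self):
        # Hand over the collected rows and reset for the next batch.
        rows, self._rows, self._timer = self._rows, [], None
        self._on_batch(rows)


async def demo():
    batches = []
    collector = BatchCollector(0.05, batches.append)
    for row in ("r1", "r2", "r3"):
        collector.handle(row)
    await asyncio.sleep(0.2)         # let the window expire
    collector.handle("r4")           # starts a second batch
    await asyncio.sleep(0.2)
    return batches
```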
Since NServiceBus uses MSMQ, you can use the methods from System.Messaging.
Below is a modified version of a method I'm currently working on, which does a kind of batch processing.
using System;
using System.Linq;
using System.Messaging;

public int PeekAtQueue()
{
    const string QUEUE_NAME = "private$\\you_precious_queuname";
    if (!MessageQueue.Exists(".\\" + QUEUE_NAME))
        return 0;
    var messageQueues = MessageQueue.GetPrivateQueuesByMachine(Environment.MachineName);
    var queue = messageQueues.Single(x => x.QueueName == QUEUE_NAME);
    return queue.GetAllMessages().Count();
}
I modified it right here in the editor, so I hope it still compiles :)
Found a similar discussion here, by the way:
http://jopinblog.wordpress.com/2008/03/12/counting-messages-in-an-msmq-messagequeue-from-c/

Persist items using a POST request within a Pipeline

I want to persist items within a Pipeline posting them to a url.
I am using this code within the Pipeline
class XPipeline(object):
    def process_item(self, item, spider):
        log.msg('in SpotifylistPipeline', level=log.DEBUG)
        yield FormRequest(url="http://www.example.com/additem", formdata={'title': item['title'], 'link': item['link'], 'description': item['description']})
but it seems it's not making the http request.
Is it possible to make http request from pipelines? If not, do I have to do it in the Spider?
Do I need to specify a callback function? If so, which one?
If I can make the http call, can I check the response (JSON) and return the item if everything went ok, or discard the item if it didn't get saved?
As a final thing, is there a diagram that explains the flow that Scrapy follows from beginning to end? I am getting slightly lost about what gets called when. For instance, if Pipelines returned items to Spiders, what do Spiders do with those items? What comes after a Pipeline call?
Many thanks in advance
Migsy
You can inherit your pipeline from scrapy.contrib.pipeline.media.MediaPipeline and yield Requests in 'get_media_requests'. Responses are passed into 'media_downloaded' callback.
Quote:
This method is called for every item pipeline component and must
either return a Item (or any descendant class) object or raise a
DropItem exception. Dropped items are no longer processed by further
pipeline components.
So, only spider can yield a request with a callback.
Pipelines are used for processing items.
You'd better describe what you want to achieve.
"is there a diagram that explains the flow that Scrapy follows from beginning to end"
See the Architecture overview in the Scrapy documentation.
"For instance, if Pipelines returned items to Spiders"
Pipelines do not return items to spiders. The items returned are passed to the next pipeline.
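A small stand-alone sketch of that flow, assuming nothing from Scrapy itself: DropItem here is a locally defined stand-in for scrapy.exceptions.DropItem, the two pipeline classes are made up, and run_pipelines is a hypothetical driver that mimics how each returned item is handed to the next pipeline.

```python
class DropItem(Exception):
    """Local stand-in for scrapy.exceptions.DropItem."""


class RequireTitle:
    def process_item(self, item, spider):
        if not item.get("title"):
            raise DropItem("missing title")
        return item


class StripTitle:
    def process_item(self, item, spider):
        item["title"] = item["title"].strip()
        return item


def run_pipelines(item, pipelines, spider=None):
    """Pass the item through each pipeline; stop at the first DropItem."""
    for pipeline in pipelines:
        try:
            item = pipeline.process_item(item, spider)
        except DropItem:
            return None              # dropped items reach no further pipeline
    return item
```

An item with a title passes through both stages, while an item without one is dropped by the first stage and never reaches the second.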
This could be done easily by using the requests library. If you don't want to use another library then look into urllib2.
import requests
from scrapy.exceptions import DropItem

class XPipeline(object):
    def process_item(self, item, spider):
        r = requests.post("http://www.example.com/additem", data={'title': item['title'], 'link': item['link'], 'description': item['description']})
        if r.status_code == 200:
            return item
        else:
            raise DropItem("Failed to post item with title %s." % item['title'])