gRPC + Thread local issue - python-multithreading

I'm building a gRPC server in Python and trying to have some thread-local storage handled with werkzeug's Local and LocalProxy, similar to what Flask does.
The problem I'm facing is that when I store some data in the local from a server interceptor and then try to retrieve it from the servicer, the local is empty. The underlying problem is that, for some reason, the interceptor runs in a different greenlet than the servicer, so it's impossible to share data across a request: the werkzeug.local storage ends up with different keys for data that is supposed to belong to the same request.
The same happens with the Python threading library; it looks like the interceptors are run from the main thread, or at least a different thread from the servicers. Is there a workaround for this? I would have expected interceptors to run in the same thread, which would allow this sort of thing.
# Define a global somewhere
from werkzeug.local import Local

local = Local()

# From an interceptor, save something
local.message = "test msg"

# From the servicer, access it
local.service_var = "test"
print(local.message)  # this raises an AttributeError

# Print the contents of the local
print(local.__storage__)  # two entries in the storage, two different greenlets, but we are in the same request
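For context, werkzeug's Local partitions its storage by the identity of the current thread or greenlet, so a value written from one thread is simply invisible from another. A minimal illustration of that behavior, independent of gRPC:

import threading
from werkzeug.local import Local

local = Local()

def writer():
    # stored under the writer thread's identity
    local.message = "test msg"

t = threading.Thread(target=writer)
t.start()
t.join()

# read from the main thread, so the attribute is missing here
print(getattr(local, "message", None))  # -> None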

The interceptor is indeed run on the serving thread, which is different from the handling thread. The serving thread is in charge of serving servicers and intercepting servicer handlers. After the servicer method handler is returned by the interceptors, the serving thread submits it to the thread pool at _server.py#L525:
# Take the unary-unary call as an example.
# The method_handler is the object returned by the interceptor.
def _handle_unary_unary(rpc_event, state, method_handler, thread_pool):
    unary_request = _unary_request(rpc_event, state,
                                   method_handler.request_deserializer)
    return thread_pool.submit(_unary_response_in_pool, rpc_event, state,
                              method_handler.unary_unary, unary_request,
                              method_handler.request_deserializer,
                              method_handler.response_serializer)
As for a workaround, I can only imagine passing a storage instance to both the interceptor and the servicer during initialization. After that, the storage can be used as a member variable:
class StorageServerInterceptor(grpc.ServerInterceptor):

    def __init__(self, storage):
        self._storage = storage

    def intercept_service(self, continuation, handler_call_details):
        key = ...
        value = ...
        self._storage.set(key, value)
        ...
        return continuation(handler_call_details)


class Storage(...StorageServicer):

    def __init__(self, storage):
        self._storage = storage

    ...Servicer Handlers...
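For completeness, here is a minimal sketch of how that shared instance could be wired up at server start. SharedStorage and add_StorageServicer_to_server are placeholders: the former stands for whatever thread-safe store you use (with the set() method assumed above), the latter for the registration helper generated from your .proto:

from concurrent import futures

import grpc

storage = SharedStorage()  # placeholder: any thread-safe store

server = grpc.server(
    futures.ThreadPoolExecutor(max_workers=10),
    interceptors=(StorageServerInterceptor(storage),),
)
add_StorageServicer_to_server(Storage(storage), server)  # generated helper
server.add_insecure_port("[::]:50051")
server.start()
server.wait_for_termination()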

You can also wrap all the functions that will be called, set the thread-local there, and return a new handler with the wrapped functions:
import threading

import grpc

local = threading.local()  # module-level, so the servicer can read it back


class MyInterceptor(grpc.ServerInterceptor):

    def wrap_handler(self, original_handler: grpc.RpcMethodHandler):
        if original_handler.unary_unary is not None:
            unary_unary = original_handler.unary_unary

            def wrapped_unary_unary(*args, **kwargs):
                # this wrapper runs on the handling thread, so the
                # thread-local is visible to the servicer method
                local.my_var = "hello"
                return unary_unary(*args, **kwargs)

            new_unary_unary = wrapped_unary_unary
        else:
            new_unary_unary = None
        ...
        # do this for all the combinations to make new_unary_stream,
        # new_stream_unary and new_stream_stream
        new_handler = grpc.RpcMethodHandler()
        new_handler.request_streaming = original_handler.request_streaming
        new_handler.response_streaming = original_handler.response_streaming
        new_handler.request_deserializer = original_handler.request_deserializer
        new_handler.response_serializer = original_handler.response_serializer
        new_handler.unary_unary = new_unary_unary
        new_handler.unary_stream = new_unary_stream
        new_handler.stream_unary = new_stream_unary
        new_handler.stream_stream = new_stream_stream
        return new_handler

    def intercept_service(self, continuation, handler_call_details):
        return self.wrap_handler(continuation(handler_call_details))
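On Python 3.7+, a contextvars.ContextVar can be used the same way inside the wrapped handler, and it also carries over into asyncio tasks if you later move to grpc.aio. A minimal sketch, where the variable name is purely illustrative and unary_unary is the captured original handler as above:

import contextvars

request_message = contextvars.ContextVar("request_message", default=None)

def wrapped_unary_unary(*args, **kwargs):
    # set on the handling thread's context, read back from the servicer
    request_message.set("test msg")
    return unary_unary(*args, **kwargs)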

Related

APScheduler and Flask sharing the same object (singleton) issue

Is it possible to share application data between scheduled jobs? To be more specific: I have one singleton at the application level that is updated when someone POSTs data to a Flask endpoint. My plan is to create a job that checks that application singleton at intervals and performs some operations on it. The problem is that the singleton object is always empty (within the Flask context it is always a valid object); somehow it doesn't have the reference to the object defined in the Flask application (running with gunicorn and multiple workers).
"""SINGLETON OBJECT DEFINED ON APPLICATION LEVEL"""
class SingletonObject(metaclass=Singleton):
singleton_var: Dict[str, Dict[str, any]] = None
def __init__(self):
if not SingletonObject.singleton_var:
SingletonObject.singleton_var = dict()
"""FUNCTION CALLED WITHIN FLASK APP"""
def update_value_on_singleton(key: any, value: any):
SingletonObject.singleton_var[key] = value
"""FUNCTION THAT CHECKS SINGLETON AND IS CALLED WITHIN APSCHEDULER job"""
def check_singleton_object():
"""SingletonObject --always new object, not one from flask context"""
with app.app_context:
for key, value in SingletonObject.singleton_var.items():
print(key, value)
"""STANDARD BACKGROUND SCHEDULER"""
from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()
scheduler.add_job(func=check_singleton_object, trigger='interval', seconds=5)
scheduler.start()
I tried using flask g (globals), and also scheduling the job from within the Flask app context manager, but without success. I would like to avoid using a database, a caching mechanism, etc.
It's not clear to me how APScheduler works, but I expected the singleton object to be the same in the Flask context and in the APScheduler context.

JobQueue.run_repeating to run a function without command handler in Telegram

I need to start sending notifications to a TG group; before that, I want to continuously run a function that queries an API and stores data in a DB. While this function is running, I want to be able to send notifications if they are available in the DB.
Here's my code:
import telegram
from telegram.ext import Updater, CommandHandler, JobQueue

token = "tokenno:token"
bot = telegram.Bot(token=token)

def start(update, context):
    context.bot.send_message(chat_id=update.message.chat_id,
                             text="You will now receive msgs!")

def callback_minute(context):
    chat_id = context.job.context
    # Check in DB and send if new msgs exist
    send_msgs_tg(context, chat_id)

def callback_msgs():
    fetch_msgs()

def main():
    JobQueue.run_repeating(callback_msgs, interval=5, first=1, context=None)
    updater = Updater(token, use_context=True)
    dp = updater.dispatcher
    dp.add_handler(CommandHandler("start", start, pass_job_queue=True))
    updater.start_polling()
    updater.idle()

if __name__ == '__main__':
    main()
This code gives me the error:
TypeError: run_repeating() missing 1 required positional argument: 'callback'
Any help would be greatly appreciated.
There are a few issues with your code; let me try to point them out:
1. def callback_msgs(): fetch_msgs()
You use callback_msgs as the callback for your job. But job callbacks take exactly one argument of type telegram.ext.CallbackContext.
2. JobQueue.run_repeating(callback_msgs, interval=5, first=1, context=None)
JobQueue is a class. To use run_repeating, which is an instance method, you'll need an instance of that class. In fact, the Updater already builds one for you; it's available as updater.job_queue in your case. So the call should look like this:
updater.job_queue.run_repeating(callback_msgs, interval=5, first=1, context=None)
3. CommandHandler("start", start, pass_job_queue=True)
This is not strictly speaking an issue, but pass_job_queue=True has no effect at all, because you use use_context=True.
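Putting those fixes together, the relevant parts could look like this (a sketch against the v13-style API used in the question; fetch_msgs remains the question's placeholder):

def callback_msgs(context):  # job callbacks take exactly one CallbackContext argument
    fetch_msgs()

def main():
    updater = Updater(token, use_context=True)
    dp = updater.dispatcher
    dp.add_handler(CommandHandler("start", start))
    # schedule on the Updater's own JobQueue instance
    updater.job_queue.run_repeating(callback_msgs, interval=5, first=1)
    updater.start_polling()
    updater.idle()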
Please note that there is a nice tutorial on JobQueue over at the ptb wiki, and also an example of how to use it.
Disclaimer: I'm currently the maintainer of python-telegram-bot

Multiple mongoDB related to same django rest framework project

We have one Django REST framework (DRF) project which should have multiple databases (MongoDB), and each database should be independent. We are able to connect to one database, but when we go to another DB for writing, the connection happens, yet the data is stored in the DB that was connected first.
We changed the default DB and everything, but no change.
(Note: the solution should suit the usage of serializers, because we need to use DynamicDocumentSerializer in DRF-mongoengine.)
Thanks in advance.
When running connect(), just assign an alias for each of your databases, and then for each Document specify a db_alias parameter in meta that points to a specific database alias:
settings.py:
from mongoengine import connect

connect(
    alias='user-db',
    db='test',
    username='user',
    password='12345',
    host='mongodb://admin:qwerty@localhost/production'
)
connect(
    alias='book-db',
    db='test',
    username='user',
    password='12345',
    host='mongodb://admin:qwerty@localhost/production'
)
models.py:
from mongoengine import Document, StringField

class User(Document):
    name = StringField()
    meta = {'db_alias': 'user-db'}

class Book(Document):
    name = StringField()
    meta = {'db_alias': 'book-db'}
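With that in place, each document class reads and writes through its own alias; for example:

# saved to the database registered under 'user-db'
User(name="alice").save()
# saved to the database registered under 'book-db'
Book(name="dune").save()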
I guess I finally get what you need.
What you could do is write a really simple middleware that maps your URL scheme to the database:
from mongoengine import Document

class DBSwitchMiddleware:
    """
    This middleware is supposed to switch the database depending on the request URL.
    """
    def __init__(self, get_response):
        self.get_response = get_response
        # list all the mongoengine Documents in your project
        import models
        self.documents = [
            attr for attr in (getattr(models, name) for name in dir(models))
            if isinstance(attr, type) and issubclass(attr, Document)
        ]

    def __call__(self, request):
        # depending on the URL, switch documents to the appropriate database
        if request.path.startswith('/main/project1'):
            for document in self.documents:
                document._meta['db_alias'] = 'db1'
        elif request.path.startswith('/main/project2'):
            for document in self.documents:
                document._meta['db_alias'] = 'db2'
        # delegate handling the rest of the request to your views
        return self.get_response(request)
Note that this solution might be prone to race conditions. We're modifying the Document classes globally here, so if one request has started and then, in the middle of its execution, a second request is handled by the same Python interpreter, it will overwrite the document._meta['db_alias'] setting and the first request will start writing to the other request's database, which will break your data horribly.
The same Python interpreter is shared by two request handlers if you're using multithreading. So with this solution you can't start your server with multiple threads, only with multiple processes.
To address the threading issue you can use threading.local(); if you prefer a context-manager approach, there's also the contextvars module.
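mongoengine itself also ships a per-call context manager, switch_db, that pins a single operation to an alias without mutating class state globally. A rough sketch, where the view function and the alias choice are purely illustrative:

from mongoengine.context_managers import switch_db

def save_user(request, name):
    # choose the alias per request instead of rewriting User._meta globally
    alias = 'db1' if request.path.startswith('/main/project1') else 'db2'
    with switch_db(User, alias) as PinnedUser:
        PinnedUser(name=name).save()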

Persistent connection in twisted

I'm new to Twisted and have one question: how can I organize a persistent connection in Twisted? I have a queue and check it every second; if there is something in it, I send it to the client. I can't find anything better than calling dataReceived every second.
Here is the code of Protocol implementation:
class SyncProtocol(protocol.Protocol):
    # ... some code here

    def dataReceived(self, data):
        if self.orders_queue.has_new_orders():
            for order in self.orders_queue:
                self.transport.write(str(order))
        reactor.callLater(1, self.dataReceived, data)  # 1 second delay
It works the way I need, but I'm sure it is a very bad solution. How can I do this differently (flexibly and correctly)? Thanks.
P.S. The main idea and algorithm:
1. The client connects to the server and waits.
2. The server checks for updates and pushes data to the client if anything changes.
3. The client does some operations and then waits for more data.
Without knowing how the snippet you provided links into your internet.XXXServer or reactor.listenXXX (or XXXXEndpoint) calls, it's hard to make heads or tails of it, but...
First off, in normal use, a Twisted protocol.Protocol's dataReceived would only be called by the framework itself. It would be linked to a client or server connection, directly or via a factory, and it would be called automatically as data comes in on the given connection. (The vast majority of Twisted protocols and interfaces, if not all, are interrupt-based, not polling/callLater; that's part of what makes Twisted so CPU-efficient.)
So if your shown code is actually linked into Twisted via a Server or listen or Endpoint to your clients, then I think you will find very bad things will happen if your clients ever send data: Twisted will call dataReceived for that, which (among other problems) would add extra reactor.callLater callbacks, and all sorts of chaos would ensue.
If instead the code isn't linked into the Twisted connection framework, then you're attempting to reuse Twisted classes in a space they aren't designed for (I guess this seems unlikely, because I don't know how non-connection code would learn of a transport, unless you're setting it manually).
The way I've been building models like this is to make a completely separate class for the polling-based I/O; after I instantiate it, I push my client-list (server) factory into the polling instance (something like mypollingthing.servfact = myserverfactory), thereby giving my polling logic a way to call into the clients' .write (or, more likely, a def I built to abstract things to the right level for my polling logic).
I tend to take the examples in Krondo's Twisted Introduction as canonical examples of how to do Twisted (other than the twistedmatrix docs); the example in part 6, under "Client 3.0", has a PoetryClientFactory with an __init__ that sets a callback in the factory.
If I try to blend that with the twistedmatrix chat example and a few other things, I get the following.
(You'll want to change sendToAll to whatever your self.orders_queue.has_new_orders() is about.)
#!/usr/bin/python
from twisted.internet import task
from twisted.internet import reactor
from twisted.internet.protocol import Protocol, ServerFactory

class PollingIOThingy(object):
    def __init__(self):
        self.sendingcallback = None  # Note: I'm pushing sendToAll into here in main
        self.iotries = 0

    def pollingtry(self):
        self.iotries += 1
        print("Polling runs: " + str(self.iotries))
        if self.sendingcallback:
            self.sendingcallback("Polling runs: " + str(self.iotries) + "\n")

class MyClientConnections(Protocol):
    def connectionMade(self):
        print("Got new client!")
        self.factory.clients.append(self)

    def connectionLost(self, reason):
        print("Lost a client!")
        self.factory.clients.remove(self)

class MyServerFactory(ServerFactory):
    protocol = MyClientConnections

    def __init__(self):
        self.clients = []

    def sendToAll(self, message):
        for c in self.clients:
            c.transport.write(message.encode("utf-8"))  # transports expect bytes on Python 3

def main():
    client_connection_factory = MyServerFactory()
    polling_stuff = PollingIOThingy()

    # the following line is what this example is all about:
    polling_stuff.sendingcallback = client_connection_factory.sendToAll
    # push the client connections' send def into my polling class

    # if you want to run something every second (instead of 1 second after
    # the end of your last code run, which could vary) do:
    l = task.LoopingCall(polling_stuff.pollingtry)
    l.start(1.0)
    # from: https://twistedmatrix.com/documents/12.3.0/core/howto/time.html

    reactor.listenTCP(5000, client_connection_factory)
    reactor.run()

if __name__ == '__main__':
    main()
To be fair, it might be better to inform PollingIOThingy of the callback by passing it as an arg to its __init__ (that is what is shown in Krondo's docs); a sketch of that variant follows. For some reason I tend to miss connections like this when I read code and find class-cheating easier to see, but that may just be my personal brain-damage.
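For reference, a minimal sketch of that constructor-injection variant, reusing MyServerFactory from the example above (only the wiring changes):

from twisted.internet import task, reactor

class PollingIOThingy(object):
    def __init__(self, sendingcallback=None):
        # callback handed in at construction time instead of assigned later
        self.sendingcallback = sendingcallback
        self.iotries = 0

    def pollingtry(self):
        self.iotries += 1
        if self.sendingcallback:
            self.sendingcallback("Polling runs: " + str(self.iotries) + "\n")

def main():
    client_connection_factory = MyServerFactory()
    polling_stuff = PollingIOThingy(client_connection_factory.sendToAll)
    task.LoopingCall(polling_stuff.pollingtry).start(1.0)
    reactor.listenTCP(5000, client_connection_factory)
    reactor.run()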

Twisted server-client data sharing

I slightly modified a server-client Twisted program from this site, which provided a program that could act as both a server and a client (How to write a twisted server that is also a client?). I am able to connect the server-client to an external client on one side and an external server on the other. I want to transfer data from the external client to the external server via the server-client program. The problem I am having is getting the line received in the ServerProtocol class (in the server-client program) into the ClientProtocol class (in the server-client program). I have tried a number of ways of doing this, including trying to use the factory reference, as you can see from the __init__, but I cannot get it to work. (At the moment I am just sending literals back and forth to the external server and client.) Here is the server-client code:
from twisted.internet import protocol, reactor
from twisted.protocols import basic

class ServerProtocol(basic.LineReceiver):
    def lineReceived(self, line):
        print("line received on server-client", line)
        self.sendLine("Back at you from server-client")
        factory = protocol.ClientFactory()
        factory.protocol = ClientProtocol
        reactor.connectTCP('localhost', 1234, factory)

class ClientProtocol(basic.LineReceiver):
    def __init__(self, factory):
        self.factory = factory

    def connectionMade(self):
        self.sendLine("Hello from server-client!")
        #self.transport.loseConnection()

    def lineReceived(self, line):
        print("line received on server-client1.py", line)
        #self.transport.loseConnection()

def main():
    import sys
    from twisted.python import log
    log.startLogging(sys.stdout)
    factory = protocol.ServerFactory()
    factory.protocol = ServerProtocol
    reactor.listenTCP(4321, factory)
    reactor.run()

if __name__ == '__main__':
    main()
I should mention that I am able to connect to the server-client program with the external server and external client on ports 4321 and 1234 respectively and they just echo back. Also, I have not shown my many attempts to use the self.factory reference. Any advice or suggestions will be much appreciated.
This question is very similar to a popular one from the Twisted FAQ:
How do I make input on one connection result in output on another?
It doesn't make any significant difference that the FAQ item is talking about many client connections to one server, as opposed to your question about one incoming client connection and one outgoing client connection. The way you share data between different connections is the same.
The essential take-away from that FAQ item is that basically anything you want to do involves a method call of some sort, and method calls in Twisted are the same as method calls in any other Python program. All you need is to have a reference to the right object to call the method on. So, for example, adapting your code:
from twisted.internet import protocol, reactor
from twisted.protocols import basic

class ServerProtocol(basic.LineReceiver):
    def lineReceived(self, line):
        self._received = line
        factory = protocol.ClientFactory()
        factory.protocol = ClientProtocol
        factory.originator = self
        reactor.connectTCP('localhost', 1234, factory)

    def forwardLine(self, recipient):
        recipient.sendLine(self._received)

class ClientProtocol(basic.LineReceiver):
    def connectionMade(self):
        self.factory.originator.forwardLine(self)
        self.transport.loseConnection()

def main():
    import sys
    from twisted.python import log
    log.startLogging(sys.stdout)
    factory = protocol.ServerFactory()
    factory.protocol = ServerProtocol
    reactor.listenTCP(4321, factory)
    reactor.run()

if __name__ == '__main__':
    main()
Notice how:
I got rid of the __init__ method on ClientProtocol. ClientFactory calls its protocol with no arguments. An __init__ that requires an argument will result in a TypeError being raised. Additionally, ClientFactory already sets itself as the factory attribute of protocols it creates.
I gave ClientProtocol a reference to the ServerProtocol instance by setting the ServerProtocol instance as the originator attribute on the client factory. Since the ClientProtocol instance has a reference to the ClientFactory instance, that means it has a reference to the ServerProtocol instance.
I added the forwardLine method which ClientProtocol can use to direct ServerProtocol to do whatever your application logic is, once the ClientProtocol connection is established. Notice that because of the previous point, ClientProtocol has no problem calling this method.