Twisted IRC Bot connection lost repeatedly to localhost - twisted

I am trying to implement an IRC bot on a local server. The bot I am using is identical to the one found on Eric Florenzano's blog. This is the simplified code (which should run):
import sys
import re
from twisted.internet import reactor
from twisted.words.protocols import irc
from twisted.internet import protocol
class MomBot(irc.IRCClient):
    def _get_nickname(self):
        return self.factory.nickname
    nickname = property(_get_nickname)
    def signedOn(self):
        print "attempting to sign on"
        self.join(self.factory.channel)
        print "Signed on as %s." % (self.nickname,)
    def joined(self, channel):
        print "attempting to join"
        print "Joined %s." % (channel,)
    def privmsg(self, user, channel, msg):
        if not user:
            return
        if self.nickname in msg:
            msg = re.compile(self.nickname + "[:,]* ?", re.I).sub('', msg)
            prefix = "%s: " % (user.split('!', 1)[0], )
        else:
            prefix = ''
        self.msg(self.factory.channel, prefix + "hello there")
class MomBotFactory(protocol.ClientFactory):
    protocol = MomBot
    def __init__(self, channel, nickname='YourMomDotCom', chain_length=3,
                 chattiness=1.0, max_words=10000):
        self.channel = channel
        self.nickname = nickname
        self.chain_length = chain_length
        self.chattiness = chattiness
        self.max_words = max_words
    def startedConnecting(self, connector):
        print "started connecting on {0}:{1}".format(
            str(connector.host), str(connector.port))
    def clientConnectionLost(self, connector, reason):
        print "Lost connection (%s), reconnecting." % (reason,)
        connector.connect()
    def clientConnectionFailed(self, connector, reason):
        print "Could not connect: %s" % (reason,)
if __name__ == "__main__":
    chan = sys.argv[1]
    reactor.connectTCP("localhost", 6667, MomBotFactory('#' + chan,
        'YourMomDotCom', 2, chattiness=0.05))
    reactor.run()
I added the startedConnecting method to the client factory; the bot reaches it and prints the correct host:port. It then disconnects, enters clientConnectionLost, and prints this error:
Lost connection ([Failure instance: Traceback (failure with no frames):
<class 'twisted.internet.error.ConnectionDone'>: Connection was closed cleanly.
]), reconnecting.
If it were working properly, it would join the channel given as the first command-line argument (e.g. python module2.py botwar would join #botwar) and respond with "hello there" whenever anyone in the channel sends a message.
I have NGIRC running on the server, and it works if I connect from mIRC or any other IRC client.
I cannot work out why it keeps disconnecting. Any help would be greatly appreciated. Thank you!

One thing you may want to do is make sure you will see any error output produced by the server when your bot connects to it. My hunch is that the problem has something to do with authentication, or perhaps an unexpected difference in how ngirc handles one of the login/authentication commands used by IRCClient.
One approach that almost always applies is to capture a traffic log. Use a tool like tcpdump or wireshark.
Another approach you can try is to enable logging inside the Twisted application itself. Use twisted.protocols.policies.TrafficLoggingFactory for this:
from twisted.protocols.policies import TrafficLoggingFactory
appFactory = MomBotFactory(...)
logFactory = TrafficLoggingFactory(appFactory, "irc-")
reactor.connectTCP(..., logFactory)
This will log output to files starting with "irc-" (a different file for each connection).
You can also hook directly into your protocol implementation, at any one of several levels. For example, to display any bytes received at all:
class MomBot(irc.IRCClient):
    def dataReceived(self, bytes):
        print "Got", repr(bytes)
        # Make sure to up-call - otherwise all of the IRC logic is disabled!
        return irc.IRCClient.dataReceived(self, bytes)
With one of those approaches in place, hopefully you'll see something like:
:irc.example.net 451 * :Connection not registered
which I think means... you need to authenticate? Even if you see something else, hopefully this will help you narrow in more closely on the precise cause of the connection being closed.
Also, you can use tcpdump or wireshark to capture a traffic log between ngirc and one of the working IRC clients (e.g. mIRC) and then compare the two logs. Whatever different commands mIRC sends should make it clear what changes you need to make to your bot.
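When reading such a log, it may help to split each server line into its parts and flag the numeric error replies. Here is a minimal sketch using only the standard library (Python 3). parse_irc_line and is_error_reply are hypothetical helper names, not part of Twisted, and the parsing only covers the common ":prefix command params :trailing" shape:

```python
def parse_irc_line(line):
    """Split one raw IRC line into (prefix, command, params)."""
    line = line.rstrip("\r\n")
    prefix = None
    if line.startswith(":"):
        # A leading ':' marks the optional prefix (usually the server name).
        prefix, _, line = line[1:].partition(" ")
    # Everything after " :" is one trailing parameter that may contain spaces.
    head, sep, trailing = line.partition(" :")
    parts = head.split()
    if not parts:
        return prefix, "", []
    command, params = parts[0], parts[1:]
    if sep:
        params.append(trailing)
    return prefix, command, params


def is_error_reply(command):
    """IRC numeric replies in the 400-599 range report errors."""
    return command.isdigit() and 400 <= int(command) <= 599


prefix, command, params = parse_irc_line(
    ":irc.example.net 451 * :Connection not registered\r\n")
print(prefix, command, params, is_error_reply(command))
```

Running each logged line through something like this makes it easier to spot the first error reply the server sends before it closes the connection.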

Related

Is there a way to set up a timeout on the gRPC server side?

I am unable to time out a gRPC connection from the server side. A client can establish a connection and then hold it open or sleep, which causes the connection to hang on the gRPC server. Is there a way on the server side to drop the connection after a certain time, or to set a timeout?
We tried disconnecting from the client side, but we are unable to do so from the server side. In the linked question, "Problem with gRPC setup. Getting an intermittent RPC unavailable error", Angad says that it is possible, but we are unable to define those parameters in Python.
My code snippet:
def serve():
    server = grpc.server(
        thread_pool=futures.ThreadPoolExecutor(max_workers=2),
        maximum_concurrent_rpcs=None,
        options=(('grpc.so_reuseport', 1),
                 ('grpc.GRPC_ARG_KEEPALIVE_TIME_MS', 1000)))
    stt_pb2_grpc.add_ListenerServicer_to_server(Listener(), server)
    server.add_insecure_port("localhost:50051")
    print("Server starting in port "+str(50051))
    server.start()
    try:
        while True:
            time.sleep(60 * 60 * 24)
    except KeyboardInterrupt:
        server.stop(0)
if __name__ == '__main__':
    serve()
I expect the connection to time out on the gRPC server side too, in Python.
In short, you may find context.abort(...) useful; see the API reference. Timing out a server handler is not supported by the underlying C-Core API of gRPC Python, so you have to implement your own timeout mechanism in Python.
You can try out some of the solutions from other StackOverflow questions.
Or use simple, but high-overhead, extra threads to abort the connection after a certain length of time. It might look like this:
_DEFAULT_TIME_LIMIT_S = 5
class FooServer(FooServicer):
    def RPCWithTimeLimit(self, request, context):
        rpc_ended = threading.Condition()
        work_finished = threading.Event()
        def wrapper(...):
            YOUR_ACTUAL_WORK(...)
            work_finished.set()
            # notify_all() must be called with the condition's lock held
            with rpc_ended:
                rpc_ended.notify_all()
        def timer():
            time.sleep(_DEFAULT_TIME_LIMIT_S)
            with rpc_ended:
                rpc_ended.notify_all()
        # Take the lock before starting the threads so that no notification
        # can arrive before wait() is called
        with rpc_ended:
            work_thread = threading.Thread(target=wrapper, ...)
            work_thread.daemon = True
            work_thread.start()
            timer_thread = threading.Thread(target=timer)
            timer_thread.daemon = True
            timer_thread.start()
            rpc_ended.wait()
        if work_finished.is_set():
            return NORMAL_RESPONSE
        else:
            context.abort(grpc.StatusCode.DEADLINE_EXCEEDED, 'RPC Time Out!')
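A lighter variant of the same idea drops the timer thread and waits on the Event with a timeout instead. This is only a sketch under assumptions: run_with_time_limit is a hypothetical helper name, it uses nothing but the standard library, and it does not touch gRPC itself. Note that the worker thread is not killed when the limit expires; it is merely abandoned.

```python
import threading
import time


def run_with_time_limit(work, limit_s):
    """Run work() in a daemon thread; return (finished, result)."""
    finished = threading.Event()
    box = {}

    def wrapper():
        box["result"] = work()
        finished.set()

    thread = threading.Thread(target=wrapper)
    thread.daemon = True  # do not keep the process alive for abandoned work
    thread.start()
    # Event.wait returns True if the event was set before the timeout expired.
    if finished.wait(timeout=limit_s):
        return True, box["result"]
    return False, None


print(run_with_time_limit(lambda: "done", 1.0))
print(run_with_time_limit(lambda: time.sleep(0.5), 0.05))
```

Inside a servicer method you would call this with your actual work and, when the first element of the returned pair is False, call context.abort(grpc.StatusCode.DEADLINE_EXCEEDED, ...) as in the code above. If work() raises, the event is never set, so the call is reported as not finished.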

How to pool HTTP connections with Twisted?

I wrote a very simple spider program to fetch web pages from a single site.
Here is a minimized version.
from twisted.internet import epollreactor
epollreactor.install()
from twisted.internet import reactor
from twisted.web.client import Agent, HTTPConnectionPool, readBody
baseUrl = 'http://acm.zju.edu.cn/onlinejudge/showProblem.do?problemCode='
start = 1001
end = 3500
pool = HTTPConnectionPool(reactor)
pool.maxPersistentPerHost = 10
agent = Agent(reactor, pool=pool)
def onHeader(response, i):
    deferred = readBody(response)
    deferred.addCallback(onBody, i)
    deferred.addErrback(errorHandler)
    return response
def onBody(body, i):
    print('Received %s, Length %s' % (i, len(body)))
def errorHandler(err):
    print('%s : %s' % (reactor.seconds() - startTimeStamp, err))
def requestFactory():
    for i in range(start, end):
        deferred = agent.request('GET', baseUrl + str(i))
        deferred.addCallback(onHeader, i)
        deferred.addErrback(errorHandler)
        print('Generated %s' % i)
        reactor.iterate(1)
    print('All requests have been generated, elapsed %s' % (reactor.seconds() - startTimeStamp))
startTimeStamp = reactor.seconds()
reactor.callWhenRunning(requestFactory)
reactor.run()
For a small number of requests, like 100, it works fine. But for a massive number of requests, it fails.
I expected all of the requests (around 3000) to be automatically pooled, scheduled, and pipelined, since I use HTTPConnectionPool, set maxPersistentPerHost, create an Agent instance with it, and create the connections incrementally.
But they are not: the connections are neither kept alive nor pooled.
The program does establish the connections incrementally, but the connections are not pooled: each connection closes after its body is received, and later requests never wait in the pool for an available connection.
So it uses thousands of sockets and finally fails due to a timeout, because the remote server has a connection timeout of 30s. Thousands of requests cannot be completed within 30s.
Could you please give me some help with this?
I have tried my best on this; here are my findings.
The error occurs exactly 30s after the reactor starts running, and is not influenced by anything else.
When I point the spider at my own server, I find something interesting.
The HTTP protocol version is 1.1 (I checked the Twisted documentation; the default HTTPClient is 1.0 rather than 1.1).
If I don't add any explicit header (as in the minimized version), the request header doesn't contain Connection: Keep-Alive, and neither does the response header.
If I add an explicit header to ensure it is a keep-alive connection, the request header does contain Connection: Keep-Alive, but the response header still does not. (I am sure my server behaves correctly; other clients such as Chrome and wget do receive the Connection: Keep-Alive header.)
I checked /proc/net/sockstat while it was running: the socket count increases rapidly at first and decreases rapidly later. (I have raised the ulimit to support plenty of sockets.)
I wrote a similar program with treq (a Twisted-based request library). The code is almost the same, so I have not pasted it here.
Link: https://gist.github.com/Preffer/dad9b1228fcd75cebd75
Its behavior is almost the same: no pooling. It is expected to pool, as described in treq's feature list.
If I add an explicit header to it, Connection: Keep-Alive never appears in the response header.
Based on all of the above, I strongly suspect that the quirky Connection: Keep-Alive header ruins the program. But this header is part of the HTTP 1.1 standard, and the connection does report as HTTP 1.1. I am completely puzzled by this.
I solved the problem myself, with help from IRC and another question on Stack Overflow, "Queue remote calls to a Python Twisted perspective broker?"
In summary, the agent's behavior is very different from that in Node.js (I have some experience with Node.js). As described in the Node.js docs:
agent.requests
An object which contains queues of requests that have not yet been assigned to sockets.
agent.maxSockets
By default set to 5. Determines how many concurrent sockets the agent can have open per origin. Origin is either a 'host:port' or 'host:port:localAddress' combination.
So, here is the difference.
Twisted:
There is no doubt that an Agent could queue requests if it is constructed with an HTTPConnectionPool instance.
But if a new request is issued after the connections in the pool have run out, the agent still creates a new connection and performs the request, rather than putting it in a queue.
In fact, this causes a connection in the pool to be dropped and the newly created connection to be pushed into the pool, keeping the connection count equal to maxPersistentPerHost.
Node.js:
By default, the agent queues requests with an implicit connection pool, which has a size of 5 connections.
If a new request is issued after the connections in the pool have run out, the agent queues the request in the agent.requests variable, waiting for an available connection.
The result is that Twisted's agent looks as if it is able to queue requests, but actually it doesn't.
Intuitively, once a connection pool is assigned to an agent, the agent should use only the connections in the pool and wait for an available connection if the pool has run out. That is exactly how the agent in Node.js behaves.
Personally, I think this is buggy behavior in Twisted, or at least that it could be improved by providing an option to set the agent's behavior.
Given this, I have to use DeferredSemaphore to schedule the requests manually.
I raised an issue on the treq project on GitHub and got a similar solution. https://github.com/dreid/treq/issues/71
Here is my solution.
#!/usr/bin/env python
from twisted.internet import epollreactor
epollreactor.install()
from twisted.internet import reactor
from twisted.web.client import Agent, HTTPConnectionPool, readBody
from twisted.internet.defer import DeferredSemaphore
baseUrl = 'http://acm.zju.edu.cn/onlinejudge/showProblem.do?problemCode='
start = 1001
end = 3500
count = end - start
concurrency = 10
pool = HTTPConnectionPool(reactor)
pool.maxPersistentPerHost = concurrency
agent = Agent(reactor, pool=pool)
sem = DeferredSemaphore(concurrency)
done = 0
def onHeader(response, i):
    deferred = readBody(response)
    deferred.addCallback(onBody, i)
    deferred.addErrback(errorHandler, i)
    return deferred
def onBody(body, i):
    sem.release()
    global done, count
    done += 1
    print('Received %s, Length %s, Done %s' % (i, len(body), done))
    if done == count:
        print('All items fetched')
        reactor.stop()
def errorHandler(err, i):
    print('[%s] id %s: %s' % (reactor.seconds() - startTimeStamp, i, err))
def requestFactory(token, i):
    deferred = agent.request('GET', baseUrl + str(i))
    deferred.addCallback(onHeader, i)
    deferred.addErrback(errorHandler, i)
    print('Request sent %s' % i)
    # This function is itself a callback fired by the reactor, so there is
    # no need to iterate manually
    #reactor.iterate(1)
    return deferred
def assign():
    for i in range(start, end):
        sem.acquire().addCallback(requestFactory, i)
startTimeStamp = reactor.seconds()
reactor.callWhenRunning(assign)
reactor.run()
Is this right? Please point out any errors and possible improvements.
For a small number of requests, like 100, it works fine. But for a massive number of requests, it fails.
This is either a protection against web crawlers or a server protection against DoS/DDoS: you are sending too many requests from the same IP in a short time, so the firewall or the WSA will block your future requests. Just modify your script to make requests in batches spaced out over time. You can use callLater() to add a delay after every X requests.
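The batching idea could be sketched roughly as follows. schedule_in_batches is a hypothetical helper, and the batch size and pause are made-up values; the only thing assumed about Twisted is that reactor.callLater takes (delay, callable, *args). A stand-in is used here so the sketch runs without Twisted installed:

```python
def schedule_in_batches(call_later, func, items, batch_size, pause_s):
    """Schedule func(item) for every item, one batch every pause_s seconds.

    call_later must accept (delay, callable, *args), which is the
    signature of Twisted's reactor.callLater.
    """
    for index, item in enumerate(items):
        # Batch 0 runs immediately, batch 1 after pause_s seconds, and so on.
        delay = (index // batch_size) * pause_s
        call_later(delay, func, item)


# Stand-in for reactor.callLater: it only records what would be scheduled.
scheduled = []


def fake_call_later(delay, func, *args):
    scheduled.append((delay, args))


schedule_in_batches(fake_call_later, print, range(1001, 1006),
                    batch_size=2, pause_s=1.5)
print(scheduled)
```

In the spider you would pass reactor.callLater as call_later and a function that issues agent.request for one problem id as func.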

celery - Programmatically list queues

How can I programmatically, using Python code, list the current queues created on a RabbitMQ broker and the number of workers connected to them? It would be the equivalent of:
rabbitmqctl list_queues name consumers
I do it this way and display all the queues and their details (messages ready, unacknowledged etc.) on a web page -
import kombu
conn = kombu.Connection(broker_url)  # example: 'amqp://guest:guest@localhost:5672/'
conn.connect()
client = conn.get_manager()
queues = client.get_queues('/')  # assuming vhost is '/'
You will need kombu installed; queues will be a dictionary with keys representing the queue names.
I think I got this when digging through the code of Celery Flower (the tool used for monitoring Celery).
Update: As pointed out by @zaq178miami, you will also need the management plugin, which provides the HTTP API. I had forgotten that I had enabled that in RabbitMQ.
This way did it for me:
def get_queue_info(queue_name):
    with celery.broker_connection() as conn:
        with conn.channel() as channel:
            return channel.queue_declare(queue_name, passive=True)
This will return a namedtuple with the name, number of messages waiting and consumers of that queue.
ksrini's answer is correct too and can be used when you require more information about a queue.
Thanks to Ask Solem, who gave me the hint.
As a RabbitMQ client you can use pika. However, it doesn't have an option for list_queues. The easiest solution would be to call the rabbitmqctl command from Python using subprocess:
import subprocess
command = "/usr/local/sbin/rabbitmqctl list_queues name consumers"
process = subprocess.Popen(command.split(), stdout=subprocess.PIPE)
print process.communicate()
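To get from that raw output to something usable, the text can be parsed into a dict. This is a rough sketch (Python 3) that assumes rabbitmqctl prints one tab-separated "name, consumers" row per queue, surrounded by banner lines; parse_list_queues is a hypothetical helper and the sample text is made up, since the exact banner varies between versions:

```python
def parse_list_queues(output):
    """Turn `rabbitmqctl list_queues name consumers` output into a dict."""
    queues = {}
    for line in output.splitlines():
        fields = line.split("\t")
        # Keep only "name<TAB>count" rows; banner or header lines are skipped.
        if len(fields) == 2 and fields[1].strip().isdigit():
            queues[fields[0]] = int(fields[1])
    return queues


# Hypothetical sample output; the exact banner text varies between versions.
sample = "Listing queues ...\ncelery\t2\nexamplequeue\t0\n...done.\n"
print(parse_list_queues(sample))
```

On Python 3, process.communicate() returns bytes, so the first element would need to be decoded before being passed in.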
I would simply use the following.
Just replace the user (default: guest), passwd (default: guest), and port with your values.
import requests
import json
def call_rabbitmq_api(host, port, user, passwd):
    url = 'https://%s:%s/api/queues' % (host, port)
    r = requests.get(url, auth=(user,passwd),verify=False)
    return r
def get_queue_name(json_list):
    res = []
    for json in json_list:
        res.append(json["name"])
    return res
if __name__ == '__main__':
    host = 'rabbitmq_host'
    port = 55672
    user = 'guest'
    passwd = 'guest'
    res = call_rabbitmq_api(host, port, user, passwd)
    print ("--- dump json ---")
    print (json.dumps(res.json(), indent=4))
    print ("--- get queue name ---")
    q_name = get_queue_name(res.json())
    print (q_name)
Referred from here: https://gist.github.com/hiroakis/5088513#file-example_rabbitmq_api-py-L2
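Since the question also asks for the number of workers per queue, the same JSON can be reduced to a name-to-consumers mapping. This is a small sketch that assumes each queue object in the response carries a "consumers" field next to "name"; get_queue_consumers is a hypothetical helper and the sample data is made up:

```python
def get_queue_consumers(json_list):
    """Map each queue name to its consumer count."""
    # .get() guards against entries that lack a "consumers" field.
    return {q["name"]: q.get("consumers", 0) for q in json_list}


# Hypothetical, trimmed-down stand-in for the parsed /api/queues response.
sample = [
    {"name": "celery", "consumers": 2, "messages": 10},
    {"name": "examplequeue", "consumers": 0, "messages": 0},
]
print(get_queue_consumers(sample))
```

With the script above, it would be called as get_queue_consumers(res.json()).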

Sending large message to RabbitMQ with Pika 0.9.5: message is silently dropped by Rabbit

I've got a bunch of celery tasks that take their results and post them to a RabbitMQ message queue. The results that get posted can become quite large (up to a few meg). Opinion is mixed as to whether putting large amounts of data in a RabbitMQ message is a good idea, but I've seen this work in other situations and as long as memory is kept under control, it seems to work.
However, for my current set of tasks, rabbit appears to be just dropping messages that seem to be too big. I've reduced it down to a fairly simple test case:
#!/usr/bin/env python
import string
import random
import pika
import os
qname = 'examplequeue'
connection = pika.BlockingConnection(pika.ConnectionParameters(
    host='mq.example.com'))
channel = connection.channel()
channel.queue_declare(queue=qname, durable=True)
N = 100000
body = ''.join(random.choice(string.ascii_uppercase) for x in range(N))
promise = channel.basic_publish(
    exchange='', routing_key=qname, body=body, mandatory=0, immediate=0,
    properties=pika.BasicProperties(content_type="text/plain",
                                    delivery_mode=2))
print " [x] Sent 'Hello World!'"
connection.close()
I have a 3-node RabbitMQ cluster, and mq.example.com round-robins to each node. Client is using Pika 0.9.5 on Ubuntu 12.04 and the RabbitMQ cluster is running RabbitMQ 2.8.7 on Erlang R14B04.
Executing this script prints the print statement and exits without any exceptions being raised. The message never appears in RabbitMQ.
Changing N to 10000 makes it work as expected.
Why?
I suppose you have a problem with the TCP backpressure mechanism in RabbitMQ. You can read about it at http://www.rabbitmq.com/memory.html.
I see two ways to solve this problem:
Add a TCP backpressure callback and reconnect on every backpressure call from RabbitMQ.
Compress messages before sending them to RabbitMQ, which makes them easier to push:
import binascii
import zlib
def compress(s):
    return binascii.hexlify(zlib.compress(s))
def decompress(s):
    return zlib.decompress(binascii.unhexlify(s))
This is what I do to send and receive packets. It is somewhat more efficient than hexlify, because base64 encodes every three bytes as four characters, whereas hexlify needs two characters for every byte.
import zlib
import base64
def hexpress(send: str):
    print(f"send: {send}")
    bsend = send.encode()
    print(f"byte-encoded send: {bsend}")
    zbsend = zlib.compress(bsend)
    print(f"zipped-byte-encoded-send: {zbsend}")
    hzbsend = base64.b64encode(zbsend)
    print(f"hex-zip-byte-encoded-send: {hzbsend}")
    shzbsend = hzbsend.decode()
    print(f"string-hex-zip-byte-encoded-send: {shzbsend}")
    return shzbsend
def hextract(recv: str):
    print(f"string-hex-zip-byte-encoded-recv: {recv}")
    zbrecv = base64.b64decode(recv)
    print(f"zipped-byte-encoded-recv: {zbrecv}")
    brecv = zlib.decompress(zbrecv)
    print(f"byte-encoded-recv: {brecv}")
    recv = brecv.decode()
    print(f"recv: {recv}")
    return recv
print("sending ...\n")
send = "hello this is dog"
packet = hexpress(send)
print("\nover the wire -------->>>>>\n")
print("receiving...\n")
recv = hextract(packet)
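The size difference between the two encodings can be checked directly. This is just an illustrative sketch with a made-up 300-byte payload standing in for a compressed message body:

```python
import base64
import binascii
import zlib

# Hypothetical payload: 300 bytes standing in for a compressed message body.
payload = bytes(range(256)) + bytes(44)

hex_encoded = binascii.hexlify(payload)
b64_encoded = base64.b64encode(payload)

# hexlify emits 2 characters per byte; base64 emits 4 per 3-byte group.
print(len(payload), len(hex_encoded), len(b64_encoded))

# Compressing, base64-encoding, decoding and decompressing returns the input.
text = b"hello this is dog" * 100
packed = base64.b64encode(zlib.compress(text))
assert zlib.decompress(base64.b64decode(packed)) == text
```

So for the same compressed body, base64 produces roughly two thirds of the characters that hexlify does.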

Extend existing Twisted Service with another Socket/TCP/RPC Service to get Service information

I'm implementing a Twisted-based heartbeat client/server combo, based on this example. It is my first Twisted project.
Basically it consists of a UDP listener (Receiver), which calls a listener method (DetectorService.update) on receiving packets. The DetectorService always holds a list of currently active/inactive clients (I extended the example a lot, but the core is still the same), making it possible to react to clients that have been silent for a specified timeout.
This is the source taken from the site:
UDP_PORT = 43278; CHECK_PERIOD = 20; CHECK_TIMEOUT = 15
import time
from twisted.application import internet, service
from twisted.internet import protocol
from twisted.python import log
class Receiver(protocol.DatagramProtocol):
    """Receive UDP packets and log them in the clients dictionary"""
    def datagramReceived(self, data, (ip, port)):
        if data == 'PyHB':
            self.callback(ip)
class DetectorService(internet.TimerService):
    """Detect clients not sending heartbeats for too long"""
    def __init__(self):
        internet.TimerService.__init__(self, CHECK_PERIOD, self.detect)
        self.beats = {}
    def update(self, ip):
        self.beats[ip] = time.time()
    def detect(self):
        """Log a list of clients with heartbeat older than CHECK_TIMEOUT"""
        limit = time.time() - CHECK_TIMEOUT
        silent = [ip for (ip, ipTime) in self.beats.items() if ipTime < limit]
        log.msg('Silent clients: %s' % silent)
application = service.Application('Heartbeat')
# define and link the silent clients' detector service
detectorSvc = DetectorService()
detectorSvc.setServiceParent(application)
# create an instance of the Receiver protocol, and give it the callback
receiver = Receiver()
receiver.callback = detectorSvc.update
# define and link the UDP server service, passing the receiver in
udpServer = internet.UDPServer(UDP_PORT, receiver)
udpServer.setServiceParent(application)
# each service is started automatically by Twisted at launch time
log.msg('Asynchronous heartbeat server listening on port %d\n'
    'press Ctrl-C to stop\n' % UDP_PORT)
This heartbeat server runs as a daemon in the background.
Now my problem:
I need to be able to run a script "externally" to print on the console the number of offline/online clients, which the Receiver gathers during its lifetime (self.beats). Like this:
$ pyhb showactiveclients
3 clients online
$ pyhb showofflineclients
1 client offline
So I need to add some kind of additional server (socket, TCP, RPC; it doesn't matter, the main point is that I'm able to build a client script with the above behavior) to my DetectorService, which allows connections to it from outside. It should just give a response to a request.
This server needs access to the internal variables of the running DetectorService instance, so my guess is that I have to extend the DetectorService with some kind of additional service.
After some hours of trying to combine the DetectorService with several other services, I still have no idea of the best way to achieve that behavior. So I hope that somebody can give me at least the essential hint on how to start solving this problem.
Thanks in advance!
I think you already have the general idea of the solution here, since you already applied it to an interaction between Receiver and DetectorService. The idea is for your objects to have references to other objects which let them do what they need to do.
So, consider a web service that responds to requests with a result based on the beats data:
from twisted.web.resource import Resource
class BeatsResource(Resource):
    # It has no children, let it respond to the / URL for brevity.
    isLeaf = True
    def __init__(self, detector):
        Resource.__init__(self)
        # This is the idea - BeatsResource has a reference to the detector,
        # which has the data needed to compute responses.
        self._detector = detector
    def render_GET(self, request):
        limit = time.time() - CHECK_TIMEOUT
        # Here, use that data.
        beats = self._detector.beats
        silent = [ip for (ip, ipTime) in beats.items() if ipTime < limit]
        request.setHeader('content-type', 'text/plain')
        return "%d silent clients" % (len(silent),)
# Integrate this into the existing application
application = service.Application('Heartbeat')
detectorSvc = DetectorService()
detectorSvc.setServiceParent(application)
.
.
.
from twisted.web.server import Site
from twisted.application.internet import TCPServer
# The other half of the idea - make sure to give the resource that reference
# it needs.
root = BeatsResource(detectorSvc)
TCPServer(8080, Site(root)).setServiceParent(application)
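The resource above reports only the silent clients, while the question asks for both the online and the offline counts. Here is a minimal sketch of splitting the beats data into both groups, assuming the same ip-to-timestamp dict and CHECK_TIMEOUT value as in the question; split_clients is a hypothetical helper and the sample data is made up:

```python
import time

CHECK_TIMEOUT = 15  # seconds, same value as in the question's source


def split_clients(beats, now=None, timeout=CHECK_TIMEOUT):
    """Return (online, silent) lists of IPs from an ip -> timestamp dict."""
    if now is None:
        now = time.time()
    limit = now - timeout
    online = [ip for ip, seen in beats.items() if seen >= limit]
    silent = [ip for ip, seen in beats.items() if seen < limit]
    return online, silent


# Hypothetical data: three clients seen recently, one silent for 60 seconds.
now = 1000.0
beats = {"10.0.0.1": 999.0, "10.0.0.2": 990.0, "10.0.0.3": 1000.0,
         "10.0.0.4": 940.0}
online, silent = split_clients(beats, now=now)
print("%d clients online" % len(online))
print("%d client offline" % len(silent))
```

render_GET could call something like this with self._detector.beats and return whichever count the request asks for.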