Logging SMTP connections with Twisted - python

Python newbie here. I'm writing an SMTP server using Twisted and twisted.mail.smtp. I'd like to log incoming connections and possibly drop them when there are too many concurrent connections. Basically, I want the ConsoleMessageDelivery.connectionMade() method in the following to be called when a new connection is made:
class ConsoleMessageDelivery:
    implements(smtp.IMessageDelivery)

    def connectionMade(self):
        # This never gets called
        pass

    def receivedHeader(self, helo, origin, recipients):
        myHostname, clientIP = helo
        headerValue = "by %s from %s with ESMTP ; %s" % (myHostname, clientIP, smtp.rfc822date())
        # email.Header.Header used for automatic wrapping of long lines
        return "Received: %s" % Header(headerValue)

    def validateFrom(self, helo, origin):
        # All addresses are accepted
        return origin

    def validateTo(self, user):
        if user.dest.local == "console":
            return lambda: ConsoleMessage()
        raise smtp.SMTPBadRcpt(user)


class ConsoleMessage:
    implements(smtp.IMessage)

    def __init__(self):
        self.lines = []

    def lineReceived(self, line):
        self.lines.append(line)

    def eomReceived(self):
        return defer.succeed(None)

    def connectionLost(self):
        # There was an error, throw away the stored lines
        self.lines = None


class ConsoleSMTPFactory(smtp.SMTPFactory):
    protocol = smtp.ESMTP

    def __init__(self, *a, **kw):
        smtp.SMTPFactory.__init__(self, *a, **kw)
        self.delivery = ConsoleMessageDelivery()

    def buildProtocol(self, addr):
        p = smtp.SMTPFactory.buildProtocol(self, addr)
        p.delivery = self.delivery
        return p

connectionMade is part of twisted.internet.interfaces.IProtocol, not part of twisted.mail.smtp.IMessageDelivery. There's no code anywhere in the mail server implementation that cares about a connectionMade method on a message delivery implementation.
A better place to put per-connection logic is in the factory. Specifically, a good way to approach this is with a factory wrapper, to isolate the logic about connection limits and logging from the logic about servicing SMTP connections.
Twisted comes with a few factory wrappers. A couple in particular that might be interesting to you are twisted.protocols.policies.LimitConnectionsByPeer and twisted.protocols.policies.LimitTotalConnectionsFactory.
Unfortunately, I don't know of any documentation explaining twisted.protocols.policies. Fortunately, it's not too complicated. Most of the factories in the module wrap another arbitrary factory to add some piece of behavior. So, for example, to use LimitConnectionsByPeer, you do something like this:
from twisted.protocols.policies import LimitConnectionsByPeer

...
wrapper = LimitConnectionsByPeer(ConsoleSMTPFactory(...))
reactor.listenTCP(465, wrapper)
This is all that's needed to get LimitConnectionsByPeer to do its job.
There's only a little bit more complexity involved in writing your own wrapper. First, subclass WrappingFactory. Then implement whichever methods you're interested in customizing. In your case, if you want to reject connections from a certain IP, that would mean overriding buildProtocol. Then, unless you also want to customize the protocol that is constructed (which you don't in this case), call the base implementation and return its result. For example:
from twisted.protocols.policies import WrappingFactory

class DenyFactory(WrappingFactory):
    def buildProtocol(self, clientAddress):
        if clientAddress.host == '1.3.3.7':
            # Reject it
            return None
        # Accept everything else
        return WrappingFactory.buildProtocol(self, clientAddress)
These wrappers stack, so you can combine them as well:
from twisted.protocols.policies import LimitConnectionsByPeer

...
wrapper = LimitConnectionsByPeer(DenyFactory(ConsoleSMTPFactory(...)))
reactor.listenTCP(465, wrapper)
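The logging half of the question follows the same pattern. The wrapping-factory machinery is simple enough that its mechanics can be sketched without Twisted at all (the class names below are stand-ins I made up, not Twisted APIs): the wrapper records each connection in buildProtocol before delegating to the wrapped factory.

```python
class StubFactory:
    """Stand-in for a wrapped factory such as ConsoleSMTPFactory."""
    def buildProtocol(self, addr):
        return object()  # a real factory would return a protocol instance

class LoggingWrapper:
    """Mimics the WrappingFactory idea: delegate buildProtocol to the
    wrapped factory, adding per-connection logging on the way through."""
    def __init__(self, wrapped):
        self.wrapped = wrapped
        self.connections = []

    def buildProtocol(self, addr):
        self.connections.append(addr)  # log the incoming connection
        return self.wrapped.buildProtocol(addr)

wrapper = LoggingWrapper(StubFactory())
wrapper.buildProtocol(("203.0.113.7", 12345))
print(len(wrapper.connections))  # 1
```

In real code you would subclass twisted.protocols.policies.WrappingFactory instead of writing the delegation by hand, and log via twisted.python.log rather than a list.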

Related

Can I tell python multiprocessing.Process not to serialize something?

I essentially have a class like this:
class ConnectionClass:
    def __init__(self, addr: str) -> None:
        self.addr = addr
        self._connection = None

    def connection(self):
        if self._connection is None:
            self._connection = magic_create_socket(self.addr)
        return self._connection

    def do_something(self):
        self.connection().send_message("something")
I will be passing it via something like:
def do_processing(connection):
    # this will run many times:
    connection.do_something()

# There will be more processes or maybe a process pool. I want to avoid repeating myself
my_connection = ConnectionClass("some address")
my_proc = multiprocessing.Process(target=do_processing, args=(my_connection,))
Now clearly, each process should have its own connection sockets, file descriptors, and so on. While I want to pass the properties that describe the connection, like addr in this simplified example, I want ConnectionClass._connection to be None when the object is copied to the other process, so that it gets lazily initialized again.
I COULD make the connection description and the actual wrapper for the socket/fd separate, but it means extra classes, extra code to pass the description from one to another and so on.
Is it possible to use some annotation to tell Python's multiprocessing library to ignore certain values when serializing the data for the other process?
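multiprocessing serializes Process arguments with pickle, and pickle already has a hook for exactly this: __getstate__. A sketch against the simplified class above (the fake "socket" string stands in for the real connection object):

```python
import pickle

class ConnectionClass:
    def __init__(self, addr):
        self.addr = addr
        self._connection = None

    def __getstate__(self):
        # Copy the instance state but drop the live connection, so the
        # pickle sent to the child process never contains the socket.
        state = self.__dict__.copy()
        state['_connection'] = None
        return state

conn = ConnectionClass("some address")
conn._connection = "pretend this is an open socket"
clone = pickle.loads(pickle.dumps(conn))
print(clone.addr, clone._connection)  # some address None
```

Since the child process receives _connection as None, the lazy-initialization branch runs again there. No multiprocessing-specific annotation is needed; anything pickle honors, multiprocessing honors.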

asyncio create_connection protocol factory

The create_connection function from Python 3's asyncio module takes as its first parameter a protocol factory. The documentation has the following note:
Note protocol_factory can be any kind of callable, not necessarily a class. For example, if you want to use a pre-created protocol instance, you can pass lambda: my_protocol.
So you can pass in an instance using a lambda like so:
create_connection(lambda: Protocol(a, b, c))
An alternative would be to define __call__ to return self such that you could just pass the instance without defining a lambda.
protocol = Protocol(a, b, c)
create_connection(protocol)
Is there any reason to use a lambda as the documentation suggests over defining __call__ on the class?
Notice the difference between these two lines:
loop.create_connection(MyProtocol, '127.0.0.1', 8888) # Option #1
loop.create_connection(MyProtocol(), '127.0.0.1', 8888) # Option #2
Here is the echo client example from asyncio docs, modified to work with the Option #1:
import asyncio

class MyEchoClientProtocol(asyncio.Protocol):
    def connection_made(self, transport):
        message = "hello"
        transport.write(message.encode())
        print('Data sent: {!r}'.format(message))

    def data_received(self, data):
        print('Data received: {!r}'.format(data.decode()))

    def connection_lost(self, exc):
        print('The server closed the connection')
        print('Stop the event loop')
        loop.stop()

loop = asyncio.get_event_loop()
coro = loop.create_connection(MyEchoClientProtocol, '127.0.0.1', 8765)
loop.run_until_complete(coro)
loop.run_forever()
loop.close()
If you choose to use Option #2, you will need to implement MyProtocol.__call__(self) which works on instances of MyProtocol.
Although this might work OK for create_connection, since your __call__ will be called only once, this does not work well for the protocol_factory parameter of create_server:
...
# Each client connection will create a new protocol instance
coro = loop.create_server(EchoServerClientProtocol, '127.0.0.1', 8888)
...
Here protocol_factory is called multiple times to create new Protocol instances. Using EchoServerClientProtocol() and defining def __call__(self): return self will reuse only one instance of Protocol!
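The reuse problem is easy to demonstrate without an event loop; this sketch simply calls the factory three times, the way create_server does for three client connections:

```python
class Proto:
    def __call__(self):
        return self

# __call__ returning self: every "connection" shares one instance
instance = Proto()
reused = [instance() for _ in range(3)]
print(reused[0] is reused[1] is reused[2])  # True

# lambda factory: a fresh instance per connection
factory = lambda: Proto()
fresh = [factory() for _ in range(3)]
print(fresh[0] is fresh[1])  # False
```

With the shared instance, per-connection state (buffers, transports) would be clobbered by each new connection; the lambda avoids that by construction.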
Short answer:
The lambda should be used in preference because it is more readable - it can be understood easily without having to scrutinise the Protocol class code.
Explanation:
BaseEventLoop.create_connection yields from BaseEventLoop._create_connection_transport ref, which instantiates a protocol object from the Protocol class as follows:
protocol = protocol_factory()
We can present the problem in a simplified manner without the event loop code to demonstrate how the Protocol is being instantiated:
class Protocol:
    pass

def create_connection(Protocol):
    protocol = Protocol()

create_connection(Protocol)
So, "protocol = Protocol()" needs to work with the constructor's parameters. This can be done by using a lambda:
class Protocol:
    def __init__(self, a):
        self.a = a

def create_connection(Protocol):
    protocol = Protocol()

create_connection(lambda: Protocol(1))
Or, as the OP suggested, the object can be made callable:
class Protocol:
    def __init__(self, a):
        self.a = a

    def __call__(self):
        return self

def create_connection(Protocol):
    protocol = Protocol()

create_connection(Protocol(1))
Functionally both will work, and thus it is a question of which is better practice. I would argue that the lambda approach is better: looking at the final line, create_connection(lambda: Protocol(1)), it is clear that we are passing create_connection something that returns an object when called, whereas passing a callable object makes the code less readable, because one needs to scrutinise the Protocol class to ascertain that the instantiated object is also a callable entity.
Udi's answer to this question says that using def __call__(self): return self will not work with create_server (which, as an aside, is not what the question asked), as it will reuse one instance of an instantiated object. This observation is correct, but what is omitted from that answer is that the callable can easily be adjusted to work with create_server. For example:
class Protocol:
    def __init__(self, a):
        self.a = a

    def __call__(self):
        return Protocol(self.a)
The bottom line is that using __call__ will work, as will the lambda approach. The reason lambda should be preferred is readability.
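A quick check that the adjusted callable really does hand out fresh instances per call, the way create_server requires:

```python
class Protocol:
    def __init__(self, a):
        self.a = a

    def __call__(self):
        # Build a new instance carrying the same configuration,
        # instead of returning self.
        return Protocol(self.a)

factory = Protocol(1)
p1, p2 = factory(), factory()
print(p1 is p2, p1.a == p2.a)  # False True
```

Each "connection" gets its own object, while the configuration (a) is shared, which is exactly what a protocol_factory is for.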

How to make the spdylay module work like httplib/http.client?

I have to test a server based on Jetty. This server can work with its own protocol, HTTP, HTTPS, and lately it has started to support SPDY. I have some stress tests which are based on httplib/http.client: each thread starts with a similar URL (some data in the query string are variable), adds execution time to a global variable, and every few seconds shows some statistics. The code looks like:
t_start = time.time()
connection.request("GET", path)
resp = connection.getresponse()
t_stop = time.time()
check_response(resp)
QRY_TIMES.append(t_stop - t_start)
A client working with the native protocol shares the httplib API, so the connection may be native, HTTPConnection, or HTTPSConnection.
Now I want to add an SPDY test using the spdylay module. But its interface is opaque and I don't know how to shape it into something similar to the httplib interface. I have made a test client based on the example, but since the 2nd argument to spdylay.urlfetch() is a class name and not an object, I do not know how to use it with my tests. I have already added tests to the on_close() method of my class which extends spdylay.BaseSPDYStreamHandler, but that is not compatible with the other tests. If it were an instance, I would use it outside of the spdylay.urlfetch() call.
How can I use spdylay in code that works with httplib-style interfaces?
My only idea is to use a global dictionary where the url is the key and the handler object is the value. It is not ideal because:
new queries with the same url will overwrite previous response
it is easy to forget to free handler from global dictionary
But it works!
import sys
import spdylay

CLIENT_RESULTS = {}

class MyStreamHandler(spdylay.BaseSPDYStreamHandler):
    def __init__(self, url, fetcher):
        super().__init__(url, fetcher)
        self.headers = []
        self.whole_data = []

    def on_header(self, nv):
        self.headers.append(nv)

    def on_data(self, data):
        self.whole_data.append(data)

    def get_response(self, charset='UTF8'):
        return (b''.join(self.whole_data)).decode(charset)

    def on_close(self, status_code):
        CLIENT_RESULTS[self.url] = self

def spdy_simply_get(url):
    spdylay.urlfetch(url, MyStreamHandler)
    data_handler = CLIENT_RESULTS[url]
    result = data_handler.get_response()
    del CLIENT_RESULTS[url]
    return result

if __name__ == '__main__':
    if '--test' in sys.argv:
        spdy_response = spdy_simply_get('https://localhost:8443/test_spdy/get_ver_xml.hdb')
I hope somebody can do spdy_simply_get(url) better.
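One way to drop the global dictionary: build the handler class per call, so it closes over a local result holder instead of CLIENT_RESULTS. The sketch below runs standalone, so FakeBaseHandler stands in for spdylay.BaseSPDYStreamHandler and fake_urlfetch for spdylay.urlfetch; in real code you would subclass and call the spdylay names instead.

```python
class FakeBaseHandler:
    """Stand-in for spdylay.BaseSPDYStreamHandler."""
    def __init__(self, url, fetcher):
        self.url = url
        self.fetcher = fetcher

def spdy_simply_get(url, urlfetch):
    result = {}

    class Handler(FakeBaseHandler):
        def __init__(self, url, fetcher):
            super().__init__(url, fetcher)
            self.whole_data = []

        def on_data(self, data):
            self.whole_data.append(data)

        def on_close(self, status_code):
            # Store into the enclosing call's dict, not a module global
            result['body'] = b''.join(self.whole_data)

    urlfetch(url, Handler)  # spdylay.urlfetch(url, Handler) in real code
    return result['body'].decode('UTF8')

def fake_urlfetch(url, handler_cls):
    """Drives the handler the way spdylay would, for demonstration."""
    h = handler_cls(url, fetcher=None)
    h.on_data(b'<ver>1.0</ver>')
    h.on_close(0)

print(spdy_simply_get('https://localhost:8443/x', fake_urlfetch))
```

Because each call gets its own Handler class and result dict, concurrent queries to the same URL no longer overwrite each other, and there is nothing to remember to free.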

How can I write tests for code using twisted.web.client.Agent and its subclasses?

I read the official tutorial on test-driven development, but it hasn't been very helpful in my case. I've written a small library that makes extensive use of twisted.web.client.Agent and its subclasses (BrowserLikeRedirectAgent, for instance), but I've been struggling in adapting the tutorial's code to my own test cases.
I had a look at twisted.web.test.test_web, but I don't understand how to make all the pieces fit together. For instance, I still have no idea how to get a Protocol object from an Agent, as per the official tutorial.
Can anybody show me how to write a simple test for some code that relies on Agent to GET and POST data? Any additional details or advice are most welcome...
Many thanks!
How about making life simpler (i.e. the code more readable) by using @inlineCallbacks?
In fact, I'd even go as far as to suggest staying away from using Deferreds directly, unless absolutely necessary for performance or in a specific use case, and instead always sticking to @inlineCallbacks; this way you'll keep your code looking like normal code, while benefiting from non-blocking behavior:
from twisted.internet import reactor
from twisted.web.client import Agent
from twisted.internet.defer import inlineCallbacks
from twisted.trial import unittest
from twisted.web.http_headers import Headers
from twisted.internet.error import DNSLookupError

class SomeTestCase(unittest.TestCase):
    @inlineCallbacks
    def test_smth(self):
        ag = Agent(reactor)
        response = yield ag.request('GET', 'http://example.com/', Headers({'User-Agent': ['Twisted Web Client Example']}), None)
        self.assertEquals(response.code, 200)

    @inlineCallbacks
    def test_exception(self):
        ag = Agent(reactor)
        try:
            yield ag.request('GET', 'http://exampleeee.com/', Headers({'User-Agent': ['Twisted Web Client Example']}), None)
        except DNSLookupError:
            pass
        else:
            self.fail()
Trial should take care of the rest, i.e. waiting on the Deferreds returned from the test functions (@inlineCallbacks-wrapped callables also "magically" return a Deferred). I strongly suggest reading more on @inlineCallbacks if you're not familiar with it yet.
P.S. there's also a Twisted "plugin" for nosetests that enables you to return Deferreds from your test functions and have nose wait until they are fired before exiting: http://nose.readthedocs.org/en/latest/api/twistedtools.html
This is similar to what mike said, but attempts to test response handling. There are other ways of doing this, but I like this way. Also, I agree that testing things that wrap Agent isn't too helpful, and that testing your protocol / keeping logic in your protocol is probably better anyway, but sometimes you just want to add some green ticks.
# Imports assumed from the earlier example; response_body, MyWrapper and
# expected_object are placeholders for your own code.
from twisted.internet import reactor
from twisted.internet.defer import Deferred, inlineCallbacks
from twisted.trial import unittest
from twisted.web.client import Agent

class MockResponse(object):
    def __init__(self, response_string):
        self.response_string = response_string

    def deliverBody(self, protocol):
        protocol.dataReceived(self.response_string)
        protocol.connectionLost(None)

class MockAgentDeliverStuff(Agent):
    def request(self, method, uri, headers=None, bodyProducer=None):
        d = Deferred()
        reactor.callLater(0, d.callback, MockResponse(response_body))
        return d

class MyWrapperTestCase(unittest.TestCase):
    def setUp(self):
        agent = MockAgentDeliverStuff(reactor)
        self.wrapper_object = MyWrapper(agent)

    @inlineCallbacks
    def test_something(self):
        response_object = yield self.wrapper_object("example.com")
        self.assertEqual(response_object, expected_object)
How about this? Run trial on the following. Basically you're just mocking away Agent and pretending it does as advertised, and using FakeAgent to (in this case) fail all requests. If you actually want to inject data into the transport, that would take "more doing" I guess. But are you really testing your code, then? Or Agent's?
from twisted.trial import unittest
from twisted.web import client
from twisted.internet import reactor, defer

class BidnessLogik(object):
    def __init__(self, agent):
        self.agent = agent
        self.money = None

    def make_moneee_quik(self):
        d = self.agent.request('GET', 'http://no.traffic.plz')
        d.addCallback(self.made_the_money).addErrback(self.no_dice)
        return d

    def made_the_money(self, *args):
        ##print "Moneeyyyy!"
        self.money = True
        return 'money'

    def no_dice(self, fail):
        ##print "Better luck next time!!"
        self.money = False
        return 'no dice'

class FailingAgent(client.Agent):
    expected_uri = 'http://no.traffic.plz'
    expected_method = 'GET'
    reasons = ['No Reason']
    test = None

    def request(self, method, uri, **kw):
        if self.test:
            self.test.assertEqual(self.expected_uri, uri)
            self.test.assertEqual(self.expected_method, method)
            self.test.assertEqual([], kw.keys())
        return defer.fail(client.ResponseFailed(reasons=self.reasons,
                                                response=None))

class TestRequest(unittest.TestCase):
    def setUp(self):
        self.agent = FailingAgent(reactor)
        self.agent.test = self

    @defer.inlineCallbacks
    def test_foo(self):
        bid = BidnessLogik(self.agent)
        resp = yield bid.make_moneee_quik()
        self.assertEqual(resp, 'no dice')
        self.assertEqual(False, bid.money)

How do I write tests for Cyclone in the style of Tornado?

I have been googling and asking on IRC to no avail. Cyclone is supposed to be a Tornado-like framework for Twisted. But there are no tests in the Cyclone repository, and no one has written up how to convert tornado.testing.AsyncHTTPTestCase tests to exercise code written against Cyclone.
How do I start a server to test the web interface?
Where is the self.fetch()?
Where is the documentation in Cyclone to describe how to convert an existing Tornado app?
Unfortunately, there's nothing like tornado.testing.AsyncHTTPTestCase in cyclone at the moment. Your best bet would be to use Twisted Trial to write unit tests. One (slightly kludgy) approach would be to explicitly call self.listener = reactor.listenTCP(<someport>, YourCycloneApplication()) in the setUp method inside your test case, and to call self.listener.stopListening() in the tearDown method.
Then, inside your test methods, you could use cyclone.httpclient.fetch to fetch the pages.
This is far from ideal. But as of now, this is the only way to go.
Here is what we are currently using to test our cyclone handlers the way we did with tornado:
from twisted.trial.unittest import TestCase
from twisted.internet import defer, reactor
from cyclone import httpclient

# copied from tornado
_next_port = 10000
def get_unused_port():
    """Returns a (hopefully) unused port number."""
    global _next_port
    port = _next_port
    _next_port = _next_port + 1
    return port

class TxTestCase(TestCase):
    def get_http_port(self):
        """Returns the port used by the HTTPServer.

        A new port is chosen for each test.
        """
        if self.__port is None:
            self.__port = get_unused_port()
        return self.__port

    def setUp(self, *args, **kwargs):
        self.__port = None
        self._app = self.get_app()
        self._listener = None
        if self._app:
            self._listener = reactor.listenTCP(self.get_http_port(), self._app)
        return TestCase.setUp(self, *args, **kwargs)

    def get_app(self):
        return None

    def tearDown(self):
        if self._listener:
            self._listener.stopListening()

    @defer.inlineCallbacks
    def fetch(self, url, *args, **kwargs):
        response = yield httpclient.fetch('http://localhost:%s%s' % (self.get_http_port(), url), *args, **kwargs)
        defer.returnValue(response)
This way, you get the fetch method back ;)
And there is no more need to use trial directly.
Here is a usage example:
from twisted.internet import defer

class Test(TxTestCase):
    def get_app(self):
        return MyApplication()

    @defer.inlineCallbacks
    def test_some_method(self):
        res = yield self.fetch('/path/to/resource')
        self.assertEquals(200, res.code)
Hope that will help you.
