Mocking method in class called through Multiprocessing.Process - python

I am writing a python tcp proxy: whenever a connection gets established from the client, the proxy establishes the connection to the server, and transparently forwards both streams. Additionally, when the packet being forwarded has some conditions, I want to have it parsed and have that sent to another server.
This is the contents of my unittest:
class TestParsing(TestCase):
def setUp(self) -> None:
self.patcher = patch('EnergyDataClient.EnergyDataClient', autospec=True)
self.EC_mock = self.patcher.start()
EnergyAgent.EC = self.EC_mock()
EnergyAgent.GP = MyParser.MyParser()
self.server = multiprocessing.Process(target=tcp_server, args=(1235,))
self.gp = multiprocessing.Process(target=EnergyAgentRunner, args=(1234, 1235))
self.server.start()
self.gp.start()
def tearDown(self) -> None:
self.patcher.stop()
self.server.terminate()
self.gp.terminate()
while self.server.is_alive() or self.gp.is_alive():
sleep(0.1)
def test_parsemessage(self):
# start the client process, and wait until done
result = tcp_client(1234, correct_packets['DATA04']['request'])
self.assertEqual(correct_packets['DATA04']['request'], result)
EnergyAgent.EC.post.assert_called_once()
I want to validate that the 'post' method on the object EC is called with the contents I expect to have intercepted... but, as that object is on another process, mocking seems not to be helping. What am I doing wrong?

I figured out what is happening here. When calling multiprocessing.Process, python is spawning a new process using fork(), that produces a copy of the memory pages on the child process. That is the reason because I can patch the EnergyAgent.GP and works (as that object is only read from that point on, and we do not require it back at the main process) and EnergyAgent.EC does not work: The mocked object gets successfully updated on the child process, but the parent never realizes that.

Related

Can self be overwritten by a queue object within class?

I have a python pexpect script that logs in to servers frequently, and each time it tries to log in, I get a DUO push, which is very unreliable (not sure if it's the app on Android or the DUO system itself, but that is not relevant to my question). I'm trying to avoid DUO pushes by re-using sessions in the queue.
I have a class called Session to open/close sessions. I also have a global Queue defined. Whenever I'm done using a session, instead of closing the pexpect handle, I do q.put(self). self contains the active pexpect session. Next time I need to login, I first check to see if there is an item in the Queue. If there is, I would like to do self = q.get(), hence overwriting my "self" with the object in the Queue. Here is example code of what I'm trying to accomplish:
from globals import q
class Session:
def __init__(self, ip):
self.user = flask_login.current_user.saneid
self.pass = flask_login.current_user.sanepw
self.ip = ip
self.handle = None
def __enter__(self):
if not q.empty():
self = q.get()
else:
# login to node
self.handle = pexpect.spawn('ssh user#node')
...
return(self.handle)
def __exit__(self, *args):
q.put(self)
Is this good practice? Is there a better way?

Call to async endpoint gets blocked by another thread

I have a tornado webservice which is going to serve something around 500 requests per minute. All these requests are going to hit 1 specific endpoint. There is a C++ program that I have compiled using Cython and use it inside the tornado service as my processor engine. Each request that goes to /check/ will trigger a function call in the C++ program (I will call it handler) and the return value will get sent to user as response.
This is how I wrap the handler class. One important point is that I do not instantiate the handler in __init__. There is another route in my tornado code that I want to start loading the DataStructure after an authroized request hits that route. (e.g. /reload/)
executors = ThreadPoolExecutor(max_workers=4)
class CheckerInstance(object):
def __init__(self, *args, **kwargs):
self.handler = None
self.is_loading = False
self.is_live = False
def init(self):
if not self.handler:
self.handler = pDataStructureHandler()
self.handler.add_words_from_file(self.data_file_name)
self.end_loading()
self.go_live()
def renew(self):
self.handler = None
self.init()
class CheckHandler(tornado.web.RequestHandler):
async def get(self):
query = self.get_argument("q", None).encode('utf-8')
answer = query
if not checker_instance.is_live:
self.write(dict(answer=self.get_argument("q", None), confidence=100))
return
checker_response = await checker_instance.get_response(query)
answer = checker_response[0]
confidence = checker_response[1]
if self.request.connection.stream.closed():
return
self.write(dict(correct=answer, confidence=confidence, is_cache=is_cache))
def on_connection_close(self):
self.wait_future.cancel()
class InstanceReloadHandler(BasicAuthMixin, tornado.web.RequestHandler):
def prepare(self):
self.get_authenticated_user(check_credentials_func=credentials.get, realm='Protected')
def new_file_exists(self):
return True
def can_reload(self):
return not checker_instance.is_loading
def get(self):
error = False
message = None
if not self.can_reload():
error = True
message = 'another job is being processed!'
else:
if not self.new_file_exists():
error = True
message = 'no new file found!'
else:
checker_instance.go_fake()
checker_instance.start_loading()
tornado.ioloop.IOLoop.current().run_in_executor(executors, checker_instance.renew)
message = 'job started!'
if self.request.connection.stream.closed():
return
self.write(dict(
success=not error, message=message
))
def on_connection_close(self):
self.wait_future.cancel()
def main():
app = tornado.web.Application(
[
(r"/", MainHandler),
(r"/check", CheckHandler),
(r"/reload", InstanceReloadHandler),
(r"/health", HealthHandler),
(r"/log-event", SubmitLogHandler),
],
debug=options.debug,
)
checker_instance = CheckerInstance()
I want this service to keep responding after checker_instance.renew starts running in another thread. But this is not what happens. When I hit the /reload/ endpoint and renew function starts working, any request to /check/ halts and waits for the reloading process to finish and then it starts working again. When the DataStructure is being loaded, the service should be in fake mode and respond to people with the same query that they send as input.
I have tested this code in my development environment with an i5 CPU (4 CPU cores) and it works just fine! But in the production environment (3 double-thread CPU cores) the /check/ endpoint halts requests.
It is difficult to fully trace the events being handled because you have clipped out some of the code for brevity. For instance, I don't see a get_response implementation here so I don't know if it is awaiting something itself that could be dependent on the state of checker_instance.
One area I would explore is in the thread-safety (or seeming absence of) in passing the checker_instance.renew to run_in_executor. This feels questionable to me because you are mutating the state of a single instance of CheckerInstance from a separate thread. While it might not break things explicitly, it does seem like this could be introducing odd race conditions or unanticipated copies of memory that might explain the unexpected behavior you are experiencing
If possible, I would make whatever load behavior you have that you want to offload to a thread be completely self-contained and when the data is loaded, return it as the function result which can then be fed back into you checker_instance. If you were to do this with the code as-is, you would want to await the run_in_executor call for its result and then update the checker_instance. This would mean the reload GET request would wait until the data was loaded. Alternatively, in your reload GET request, you could ioloop.spawn_callback to a function that triggers the run_in_executor in this manner, allowing the reload request to complete instead of waiting.

How to call a Twisted reactor from a file different from his?

I have a question that could well belong to Twisted or could be directly related to Python.
My problem, as the other is related to the disconnection process in Twisted.
As I read on this site, if I want to I have to perform the following steps:
The server must stop listening.
The client connection must disconnect.
The server connection must disconnect.
According to what I read on the previous page to make the first step would have to run the stopListening method.
In the example mentioned in the web all actions are performed in the same script. Making it easy to access the different variables and methods.
For me I have a server and a client are in different files and different locations.
I have a function that creates a server, and assigns a protocol and want, from the client protocol in another file, make an AMP call to a method for stop the connector.
The call AMP calls the SendMsg command.
class TESTServer(protocol.Protocol):
factory = None
sUsername = ""
credProto = None
bGSuser = None
slot = None
"""
Here was uninteresting code.
"""
# upwards=self.bGSuser, forwarded=True, tx_timestamp=iTimestamp,\
# message=sMsg)
log.msg("self.connector")
log.msg(self.connector)
return {'bResult': True}
SendMsg.responder(vSendMsg)
def _testfunction(self):
logger = logging.getLogger('server')
log.startLogging(sys.stdout)
pf = CredAMPServerFactory()
sslContext = ssl.DefaultOpenSSLContextFactory('key/server.pem',\
'key/public.pem',)
self.connector = reactor.listenSSL(1234, pf, contextFactory = sslContext,)
log.msg('Server running...')
reactor.run()
if __name__ == '__main__':
TESTServer()._testfunction()
The class CredAMPServerFactory assign the corresponding protocol.
class CredAMPServerFactory(ServerFactory):
"""
Server factory useful for creating L{CredReceiver} and L{SATNETServer} instances.
This factory takes care of associating a L{Portal} with the L{CredReceiver}
instances it creates. If the login is succesfully achieved, a L{SATNETServer}
instance is also created.
"""
protocol = CredReceiver
In the "CredReceiver" class I have a call that assigns the protocol to the TestServer class. I do this to make calls using the AMP method "Responder".
self.protocol = SATNETServer
My problem is that when I make the call the program responds with an error indicating that the connector doesn't belong to CredReceiver attribute object.
File "/home/sgongar/Dev/protocol/server_amp.py", line 248, in vSendMsg
log.msg(self.connector)
exceptions.AttributeError: 'CredReceiver' object has no attribute 'connector'
How could I do this? Does anyone know of a similar example of that may take note?
Thank you.
EDIT.
Server side:
server_amp.py
Starts a reactor: reactor.listenSSL(1234, pf, contextFactory =
sslContext,) from within the SATNETServer class.
Assigns protocol, pf, to CredAMPServerFactory class who belongs to module server.py also from within the SATNETServer class.
server.py
Within the class CredAMPServerFactory assigns CredReceiver class to protocol.
Once the connection is established the class SATNETServer is assigned to the protocol.
Client side:
client_amp
Makes a call to the SendMsg method belonging to theSATNETServer class.

sqlalchemy mysql connections not closing on flask api

I have an API I have written in flask. It uses sqlalchemy to deal with a MySQL database. I don't use flask-sqlalchemy, because I don't like how the module forces you into a certain pattern for declaring the model.
I'm having a problem in which my database connections are not closing. The object representing the connection is going out of scope, so I assume it is being garbage collected. I also explicitly call close() on the session. Despite this, the connections stay open long after the API call has returned its response.
sqlsession.py: Here is the wrapper I am using for the session.
class SqlSession:
def __init__(self, conn=Constants.Sql):
self.db = SqlSession.createEngine(conn)
Session = sessionmaker(bind=self.db)
self.session = Session()
#staticmethod
def createEngine(conn):
return create_engine(conn.URI.format(user=conn.USER, password=conn.PASS, host=conn.HOST, port=conn.PORT, database=conn.DATABASE, poolclass=NullPool))
def close(self):
self.session.close()
flaskroutes.py: Here is an example of the flask app instantiating and using the wrapper object. Note that it instantiates it in the beginning within the scope of the api call, then closes the session at the end, and presumably is garbage collected after the response is returned.
def commands(self, deviceId):
sqlSession = SqlSession(self.sessionType) <---
commandsQueued = getCommands()
jsonCommands = []
for command in commandsQueued:
jsonCommand = command.returnJsonObject()
jsonCommands.append(jsonCommand)
sqlSession.session.delete(command)
sqlSession.session.commit()
resp = jsonify({'commands': jsonCommands})
sqlSession.close() <---
resp.status_code = 200
return resp
I would expect the connections to be cleared as soon as the HTTP response is made, but instead, the connections end up with the "SLEEP" state (when viewed in the MySQL command line interface 'show processlist').
I ended up using the advice from this SO post:
How to close sqlalchemy connection in MySQL
I strongly recommend reading that post to anyone having this problem. Basically, I added a dispose() call to the close method. Doing so causes the entire connection to be destroyed, while closing simply returns connections to an available pool (but leave them open).
def close(self):
self.session.close()
self.db.dispose()
This whole this was a bit confusing to me, but at least now I understand more about the connection pool.

Accessing python class variable defined inside the main module of a script

I have a django project that uses celery for async task processing. I am using python 2.7.
I have a class in a module client.py in my django project:
# client.py
class Client:
def __init__(self):
# code for opening a persistent connection and saving the connection client in a class variable
...
self.client = <connection client>
def get_connection_client(self):
return self.client
def send_message(self, message):
# --- Not the exact code but this is the function I need to access to for which I need access to the client variable---
self.client.send(message)
# Other functions that use the above method to send messages
...
This class needs to be instantiated only once to create one persistent connection to a remote server.
I run a script connection.py that runs indefinitely:
# connection.py
from client import Client
if __name__ == '__main__':
clientobj = Client()
client = clientobj.get_connection_client()
# Blocking process
while True:
# waits for a message from the remote server
...
I need to access the variable client from another module tasks.py (needed for celery).
# tasks.py
...
from client import Client
#app.task
def function():
# Need access to the client variable
# <??? How do I get an access to the client variable for the
# already established connection???>
message = "Message to send to the server using the established connection"
client.send_message(message)
All the three python modules are on the same machine. The connection.py is executed as a standalone script and is executed first. The method function() in tasks.py is called multiple times across other modules of the project whenever required, thus, I can't instantiate the Client class inside this method. Global variables don't work.
In java, we can create global static variable and access it throughout the project. How do we do this in python?
Approaches I can think of but not sure if they can be done in python:
Save this variable in a common file such that it is accessible in other modules in my project?
Save this client as a setting in either django or celery and access this setting in the required module?
Based on suggestions by sebastian, another way is to share variables between running processes. I essentially want to do that. How do I do this in python?
For those interested to know why this is required, please see this question. It explains the complete system design and the various components involved.
I am open to suggestions that needs a change in the code structure as well.
multiprocessing provides all the tools you need to do this.
connection.py
from multiprocessing.managers import BaseManager
from client import Client()
client = Client()
class ClientManager(BaseManager): pass
ClientManager.register('get_client', callable=lambda: client)
manager = ClientManager(address=('', 50000), authkey='abracadabra')
server = manager.get_server()
server.serve_forever()
tasks.py
from multiprocessing.managers import BaseManager
class ClientManager(BaseManager): pass
ClientManager.register('get_client')
manager = ClientManager(address=('localhost', 50000), authkey='abracadabra')
manager.connect()
client = manager.get_client()
#app.task
def function():
message = "Message to send to the server using the established connection"
client.send_message(message)
I dont have experience working with django, but if they are executed from the same script you could make the Client a singleton, or maybe declaring the Client in the init.py and then import it wherever you need it.
If you go for the singleton, you can make a decorator for that:
def singleton(cls):
instances = {}
def get_instance(*args, **kwargs):
if cls not in instances:
instances[cls] = cls(*args, **kwargs)
return instances[cls]
return get_instance
Then you would define:
# client.py
#singleton
class Client:
def __init__(self):
# code for opening a persistent connection and saving the connection client in a class variable
...
self.client = <connection client>
def get_connection_client(self):
return self.client
Thats all I can suggest with the little description you have given. Maybe try to explain a little better how everything is run or the parts that are involved.
Python has class attributes (attributes that are shared amongst instances) and class methods (methods that act on the class itself). Both are readable on either the class and an instance.
# client.py
class Client(object):
_client = None
#classmethod
def connect(cls):
# dont do anything if already connected
if cls._client is None:
return
# code for opening a persistent connection and saving the connection client in a class variable
...
cls._client = <connection client>
#classmethod
def get_connection_client(cls):
return cls._client
def __init__(self):
# make sure we try to have a connection on initialisation
self.connect()
Now I'm not sure this is the best solution to your problem.
If connection.py is importing tasks.py, you can do it in your tasks.py:
import __main__ # connection.py
main_globals = __main__.__dict__ # this "is" what you getting in connection.py when you write globals()
client = main_globals["client"] # this client has the same id with client in connection.py
BaseManager is also an answer but it uses socket networking on localhost and it is not a good way of accessing a variable if you dont already using multiprocessing. I mean if you need to use multiprocessing, you should use BaseManager. But if you dont need multiprocessing, it is not a good option to use multiprocessing. My code is just taking pointer of "client" variable in connection.py from
interpreter.
Also if you want to use multiprocessing, my code won't work because the interpreters in different processes are different.
Use pickle when reading it from file.

Categories