Can self be overwritten by a queue object within class? - python

I have a python pexpect script that logs in to servers frequently, and each time it tries to log in, I get a DUO push, which is very unreliable (not sure if it's the app on Android or the DUO system itself, but that is not relevant to my question). I'm trying to avoid DUO pushes by re-using sessions in the queue.
I have a class called Session to open/close sessions. I also have a global Queue defined. Whenever I'm done using a session, instead of closing the pexpect handle, I do q.put(self). self contains the active pexpect session. Next time I need to login, I first check to see if there is an item in the Queue. If there is, I would like to do self = q.get(), hence overwriting my "self" with the object in the Queue. Here is example code of what I'm trying to accomplish:
from globals import q
class Session:
def __init__(self, ip):
self.user = flask_login.current_user.saneid
self.pass = flask_login.current_user.sanepw
self.ip = ip
self.handle = None
def __enter__(self):
if not q.empty():
self = q.get()
else:
# login to node
self.handle = pexpect.spawn('ssh user#node')
...
return(self.handle)
def __exit__(self, *args):
q.put(self)
Is this good practice? Is there a better way?

Related

Mocking method in class called through Multiprocessing.Process

I am writing a python tcp proxy: whenever a connection gets established from the client, the proxy establishes the connection to the server, and transparently forwards both streams. Additionally, when the packet being forwarded has some conditions, I want to have it parsed and have that sent to another server.
This is the contents of my unittest:
class TestParsing(TestCase):
def setUp(self) -> None:
self.patcher = patch('EnergyDataClient.EnergyDataClient', autospec=True)
self.EC_mock = self.patcher.start()
EnergyAgent.EC = self.EC_mock()
EnergyAgent.GP = MyParser.MyParser()
self.server = multiprocessing.Process(target=tcp_server, args=(1235,))
self.gp = multiprocessing.Process(target=EnergyAgentRunner, args=(1234, 1235))
self.server.start()
self.gp.start()
def tearDown(self) -> None:
self.patcher.stop()
self.server.terminate()
self.gp.terminate()
while self.server.is_alive() or self.gp.is_alive():
sleep(0.1)
def test_parsemessage(self):
# start the client process, and wait until done
result = tcp_client(1234, correct_packets['DATA04']['request'])
self.assertEqual(correct_packets['DATA04']['request'], result)
EnergyAgent.EC.post.assert_called_once()
I want to validate that the 'post' method on the object EC is called with the contents I expect to have intercepted... but, as that object is on another process, mocking seems not to be helping. What am I doing wrong?
I figured out what is happening here. When calling multiprocessing.Process, python is spawning a new process using fork(), that produces a copy of the memory pages on the child process. That is the reason because I can patch the EnergyAgent.GP and works (as that object is only read from that point on, and we do not require it back at the main process) and EnergyAgent.EC does not work: The mocked object gets successfully updated on the child process, but the parent never realizes that.

Call to async endpoint gets blocked by another thread

I have a tornado webservice which is going to serve something around 500 requests per minute. All these requests are going to hit 1 specific endpoint. There is a C++ program that I have compiled using Cython and use it inside the tornado service as my processor engine. Each request that goes to /check/ will trigger a function call in the C++ program (I will call it handler) and the return value will get sent to user as response.
This is how I wrap the handler class. One important point is that I do not instantiate the handler in __init__. There is another route in my tornado code that I want to start loading the DataStructure after an authroized request hits that route. (e.g. /reload/)
executors = ThreadPoolExecutor(max_workers=4)
class CheckerInstance(object):
def __init__(self, *args, **kwargs):
self.handler = None
self.is_loading = False
self.is_live = False
def init(self):
if not self.handler:
self.handler = pDataStructureHandler()
self.handler.add_words_from_file(self.data_file_name)
self.end_loading()
self.go_live()
def renew(self):
self.handler = None
self.init()
class CheckHandler(tornado.web.RequestHandler):
async def get(self):
query = self.get_argument("q", None).encode('utf-8')
answer = query
if not checker_instance.is_live:
self.write(dict(answer=self.get_argument("q", None), confidence=100))
return
checker_response = await checker_instance.get_response(query)
answer = checker_response[0]
confidence = checker_response[1]
if self.request.connection.stream.closed():
return
self.write(dict(correct=answer, confidence=confidence, is_cache=is_cache))
def on_connection_close(self):
self.wait_future.cancel()
class InstanceReloadHandler(BasicAuthMixin, tornado.web.RequestHandler):
def prepare(self):
self.get_authenticated_user(check_credentials_func=credentials.get, realm='Protected')
def new_file_exists(self):
return True
def can_reload(self):
return not checker_instance.is_loading
def get(self):
error = False
message = None
if not self.can_reload():
error = True
message = 'another job is being processed!'
else:
if not self.new_file_exists():
error = True
message = 'no new file found!'
else:
checker_instance.go_fake()
checker_instance.start_loading()
tornado.ioloop.IOLoop.current().run_in_executor(executors, checker_instance.renew)
message = 'job started!'
if self.request.connection.stream.closed():
return
self.write(dict(
success=not error, message=message
))
def on_connection_close(self):
self.wait_future.cancel()
def main():
app = tornado.web.Application(
[
(r"/", MainHandler),
(r"/check", CheckHandler),
(r"/reload", InstanceReloadHandler),
(r"/health", HealthHandler),
(r"/log-event", SubmitLogHandler),
],
debug=options.debug,
)
checker_instance = CheckerInstance()
I want this service to keep responding after checker_instance.renew starts running in another thread. But this is not what happens. When I hit the /reload/ endpoint and renew function starts working, any request to /check/ halts and waits for the reloading process to finish and then it starts working again. When the DataStructure is being loaded, the service should be in fake mode and respond to people with the same query that they send as input.
I have tested this code in my development environment with an i5 CPU (4 CPU cores) and it works just fine! But in the production environment (3 double-thread CPU cores) the /check/ endpoint halts requests.
It is difficult to fully trace the events being handled because you have clipped out some of the code for brevity. For instance, I don't see a get_response implementation here so I don't know if it is awaiting something itself that could be dependent on the state of checker_instance.
One area I would explore is in the thread-safety (or seeming absence of) in passing the checker_instance.renew to run_in_executor. This feels questionable to me because you are mutating the state of a single instance of CheckerInstance from a separate thread. While it might not break things explicitly, it does seem like this could be introducing odd race conditions or unanticipated copies of memory that might explain the unexpected behavior you are experiencing
If possible, I would make whatever load behavior you have that you want to offload to a thread be completely self-contained and when the data is loaded, return it as the function result which can then be fed back into you checker_instance. If you were to do this with the code as-is, you would want to await the run_in_executor call for its result and then update the checker_instance. This would mean the reload GET request would wait until the data was loaded. Alternatively, in your reload GET request, you could ioloop.spawn_callback to a function that triggers the run_in_executor in this manner, allowing the reload request to complete instead of waiting.

Cassandra creates multiple connections and not release

I am using Python Cassandra-Driver 3.15.1
I have a script that runs some multiproc.
The problem is that for some reason, the connection is not properly released after calling close_connection (get_connection -> run CQL -> close_connection -> then to the end call close_cluster. This results to over hundreds of connections/sessions stay open
Any hints on where to look for the issue is very much appreciated.
def get_connection(self, timeout = 600):
self.session = Cluster([self.host]).connect()
self.session.default_timeout = timeout
return self.session
def close_connection(self, conn):
return conn.shutdown()
def close_cluster_connection(self):
return self.cluster.shutdown()
Each Cluster object should be explicitly shutdown when finished, but that's not possible here because you aren't holding on to the instance created by Cluster([self.host]) in get_connection.
close_cluster_connection references a self.cluster. If that's already instantiated and the cluster instance you want to use, get_connection should look like this.
def get_connection(self, timeout = 600):
self.session = self.cluster.connect()
self.session.default_timeout = timeout
return self.session
If you can't use self.cluster there, you have to find a way to keep track of your Cluster instances and shut them down when you're done.

How to mock a pika connection for a different module?

I have a class that imports the following module:
import pika
import pickle
from apscheduler.schedulers.background import BackgroundScheduler
import time
import logging
class RabbitMQ():
def __init__(self):
self.connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
self.channel = self.connection.channel()
self.sched = BackgroundScheduler()
self.sched.add_job(self.keep_connection_alive, id='clean_old_data', trigger='cron', hour = '*', minute='*', second='*/50')
self.sched.start()
def publish_message(self, message , path="path"):
message["path"] = path
logging.info(message)
message = pickle.dumps(message)
self.channel.basic_publish(exchange="", routing_key="server", body=message)
def keep_connection_alive(self):
self.connection.process_data_events()
rabbitMQ = RabbitMQ()
def publish_message(message , path="path"):
rabbitMQ.publish_message(message, path=path)
My class.py:
import RabbitMQ as rq
class MyClass():
...
When generating unit tests for MyClass I can't mock the connection for this part of the code. And keeping throwing exceptions. And it will not work at all
pika.exceptions.ConnectionClosed: Connection to 127.0.0.1:5672 failed: [Errno 111] Connection refused
I tried a couple of approaches to mock this connection but none of those seem to work. I was wondering what can I do to support this sort of test? Mock the entire RabbitMQ module? Or maybe mock only the connection
Like the commenter above mentions, the issue is your global creation of your RabbitMQ.
My knee-jerk reaction is to say "just get rid of that, and your module-level publish_message". If you can do that, go for that solution. You have a publish_message on your RabbitMQ class that accepts the same args; any caller would then be expected to create an instance of your RabbitMQ class.
If you don't want to or can't do that for whatever reason, you should just move the instantiation of move that object instantiation in your module-level publish_message like this:
def publish_message(message , path="path"):
rabbitMQ = RabbitMQ()
rabbitMQ.publish_message(message, path=path)
This will create a new connection every time you call it though. Maybe that's ok...but maybe it's not. So to avoid creating duplicate connections, you'd want to introduce something like a singleton pattern:
class RabbitMQ():
__instance = None
...
#classmethod
def get_instance(cls):
if cls.__instance is None:
cls.__instance = RabbitMQ()
return cls.__instance
def publish_message(message , path="path"):
RabbitMQ.get_instance().publish_message(message, path=path)
Ideally though, you'd want to avoid the singleton pattern entirely. Whatever caller should store a single instance of your RabbitMQ object and call publish_message on it directly.
So the TLDR/ideal solution IMO: Just get rid of those last 3 lines. The caller should create a RabbitMQ object.
EDIT: Oh, and the why it's happening -- When you import that module, this is being evaluated: rabbitMQ = RabbitMQ(). Your attempt to mock it is happening after that is evaluated, and fails to connect.

How to implement a two way jsonrpc + twisted server/client

Hello I am working on develop a rpc server based on twisted to serve several microcontrollers which make rpc call to twisted jsonrpc server. But the application also required that server send information to each micro at any time, so the question is how could be a good practice to prevent that the response from a remote jsonrpc call from a micro be confused with a server jsonrpc request which is made for a user.
The consequence that I am having now is that micros are receiving bad information, because they dont know if netstring/json string that is comming from socket is their response from a previous requirement or is a new request from server.
Here is my code:
from twisted.internet import reactor
from txjsonrpc.netstring import jsonrpc
import weakref
creds = {'user1':'pass1','user2':'pass2','user3':'pass3'}
class arduinoRPC(jsonrpc.JSONRPC):
def connectionMade(self):
pass
def jsonrpc_identify(self,username,password,mac):
""" Each client must be authenticated just after to be connected calling this rpc """
if creds.has_key(username):
if creds[username] == password:
authenticated = True
else:
authenticated = False
else:
authenticated = False
if authenticated:
self.factory.clients.append(self)
self.factory.references[mac] = weakref.ref(self)
return {'results':'Authenticated as %s'%username,'error':None}
else:
self.transport.loseConnection()
def jsonrpc_sync_acq(self,data,f):
"""Save into django table data acquired from sensors and send ack to gateway"""
if not (self in self.factory.clients):
self.transport.loseConnection()
print f
return {'results':'synced %s records'%len(data),'error':'null'}
def connectionLost(self, reason):
""" mac address is searched and all reference to self.factory.clientes are erased """
for mac in self.factory.references.keys():
if self.factory.references[mac]() == self:
print 'Connection closed - Mac address: %s'%mac
del self.factory.references[mac]
self.factory.clients.remove(self)
class rpcfactory(jsonrpc.RPCFactory):
protocol = arduinoRPC
def __init__(self, maxLength=1024):
self.maxLength = maxLength
self.subHandlers = {}
self.clients = []
self.references = {}
""" Asynchronous remote calling to micros, simulating random calling from server """
import threading,time,random,netstring,json
class asyncGatewayCalls(threading.Thread):
def __init__(self,rpcfactory):
threading.Thread.__init__(self)
self.rpcfactory = rpcfactory
"""identifiers of each micro/client connected"""
self.remoteMacList = ['12:23:23:23:23:23:23','167:67:67:67:67:67:67','90:90:90:90:90:90:90']
def run(self):
while True:
time.sleep(10)
while True:
""" call to any of three potential micros connected """
mac = self.remoteMacList[random.randrange(0,len(self.remoteMacList))]
if self.rpcfactory.references.has_key(mac):
print 'Calling %s'%mac
proto = self.rpcfactory.references[mac]()
""" requesting echo from selected micro"""
dataToSend = netstring.encode(json.dumps({'method':'echo_from_micro','params':['plop']}))
proto.transport.write(dataToSend)
break
factory = rpcfactory(arduinoRPC)
"""start thread caller"""
r=asyncGatewayCalls(factory)
r.start()
reactor.listenTCP(7080, factory)
print "Micros remote RPC server started"
reactor.run()
You need to add a enough information to each message so that the recipient can determine how to interpret it. Your requirements sounds very similar to those of AMP, so you could either use AMP instead or use the same structure as AMP to identify your messages. Specifically:
In requests, put a particular key - for example, AMP uses "_ask" to identify requests. It also gives these a unique value, which further identifies that request for the lifetime of the connection.
In responses, put a different key - for example, AMP uses "_answer" for this. The value matches up with the value from the "_ask" key in the request the response is for.
Using an approach like this, you just have to look to see whether there is an "_ask" key or an "_answer" key to determine if you've received a new request or a response to a previous request.
On a separate topic, your asyncGatewayCalls class shouldn't be thread-based. There's no apparent reason for it to use threads, and by doing so it is also misusing Twisted APIs in a way which will lead to undefined behavior. Most Twisted APIs can only be used in the thread in which you called reactor.run. The only exception is reactor.callFromThread, which you can use to send a message to the reactor thread from any other thread. asyncGatewayCalls tries to write to a transport, though, which will lead to buffer corruption or arbitrary delays in the data being sent, or perhaps worse things. Instead, you can write asyncGatewayCalls like this:
from twisted.internet.task import LoopingCall
class asyncGatewayCalls(object):
def __init__(self, rpcfactory):
self.rpcfactory = rpcfactory
self.remoteMacList = [...]
def run():
self._call = LoopingCall(self._pokeMicro)
return self._call.start(10)
def _pokeMicro(self):
while True:
mac = self.remoteMacList[...]
if mac in self.rpcfactory.references:
proto = ...
dataToSend = ...
proto.transport.write(dataToSend)
break
factory = ...
r = asyncGatewayCalls(factory)
r.run()
reactor.listenTCP(7080, factory)
reactor.run()
This gives you a single-threaded solution which should have the same behavior as you intended for the original asyncGatewayCalls class. Instead of sleeping in a loop in a thread in order to schedule the calls, though, it uses the reactor's scheduling APIs (via the higher-level LoopingCall class, which schedules things to be called repeatedly) to make sure _pokeMicro gets called every ten seconds.

Categories