I am trying to use socketIO_client in Python and have been pretty successful with it. However, when I let the program below run for a while (about an hour), it crashes, and if I look at the system information with the 'top' command I can see the CPU spinning at something like 80 or 90%.
PS: this happens only on my Raspberry Pi, so it might be due to the implementation of the Python socketio module on ARM?
Am I doing anything wrong? Is there any socket I should close? I am not very familiar with sockets...
Here is my code:
from socketIO_client import SocketIO, BaseNamespace

class MainNamespace(BaseNamespace):
    def on_message(self, message):
        try:
            typestr = message["depth"]["type_str"]
            price_int = int(message["depth"]["price_int"])
            total_volume_int = long(message["depth"]["total_volume_int"])
            print "price_int:%s total_volume_int:%s" % (price_int, total_volume_int)
        except:
            pass

if __name__ == "__main__":
    try:
        mainSocket = SocketIO('socketio.mtgox.com', 80)
        chatSocket = mainSocket.connect('/mtgox', MainNamespace)
        mainSocket.wait()
    except Exception, e:
        print e
I rewrote socketIO-client in v0.5 so that it uses coroutines instead of threads to save memory. The external API remains the same.
pip install -U socketIO-client
Does v0.5 fix your issue?
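Since you also asked whether there is a socket you should close: the client exposes a disconnect() method, so you can close the connection explicitly when you are done. A minimal sketch following your connect usage (untested against v0.5):
from socketIO_client import SocketIO

# Assumes the question's MainNamespace class is defined above.
mainSocket = SocketIO('socketio.mtgox.com', 80)
try:
    mainSocket.connect('/mtgox', MainNamespace)
    mainSocket.wait()
finally:
    mainSocket.disconnect()  # explicitly close the underlying socket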
I've been pulling my hair out trying to figure this one out, hoping someone else has already encountered this and knows how to solve it :)
I'm trying to build a very simple Flask endpoint that just needs to call a long-running, blocking PHP script (think while true {...}). I've tried a few different ways to launch the script asynchronously, but the problem is that my browser never actually receives the response back, even though the code that generates the response after launching the script does execute.
I've tried using both multiprocessing and threading, but neither seems to work:
import json
import multiprocessing
import os
import subprocess
import threading

# multiprocessing attempt
@app.route('/endpoint')
def endpoint():
    def worker():
        subprocess.Popen('nohup php script.php &', shell=True, preexec_fn=os.setpgrp)

    p = multiprocessing.Process(target=worker)
    print '111111'
    p.start()
    print '222222'
    return json.dumps({
        'success': True
    })

# threading attempt
@app.route('/endpoint')
def endpoint():
    def thread_func():
        subprocess.Popen('nohup php script.php &', shell=True, preexec_fn=os.setpgrp)

    t = threading.Thread(target=thread_func)
    print '111111'
    t.start()
    print '222222'
    return json.dumps({
        'success': True
    })
In both scenarios I see the 111111 and 222222, yet my browser still hangs on the response from the endpoint. I've tried p.daemon = True for the process, as well as p.terminate(), but no luck. I had hoped launching a script with nohup in a different shell and a separate process/thread would just work, but somehow Flask or uWSGI is impacted by it.
Update
Since this does work locally on my Mac when I start my Flask app directly with python app.py and hit it without going through my Nginx proxy and uWSGI, I'm starting to believe it may not be the code itself that has issues. And because my Nginx just forwards the request to uWSGI, I believe it may be something there that's causing it.
Here is my ini configuration for the domain for uWSGI, which I'm running in emperor mode:
[uwsgi]
protocol = uwsgi
max-requests = 5000
chmod-socket = 660
master = True
vacuum = True
enable-threads = True
auto-procname = True
procname-prefix = michael-
chdir = /srv/www/mysite.com
module = app
callable = app
socket = /tmp/mysite.com.sock
This kind of thing is the actual, and probably main, use case for Python Celery (https://docs.celeryproject.org/). As a general rule, do not run long-running CPU-bound jobs in the wsgi process. It's tricky, it's inefficient, and most importantly, it's more complicated than setting up an async task in a Celery worker. If you just want to prototype, you can set the broker to memory and skip an external server, or run a single-threaded Redis on the very same machine.
This way you can launch the task and call task.result(), which is blocking, but blocks in an IO-bound fashion; or, even better, you can return immediately with the task_id and build a second endpoint /result?task_id=<task_id> that checks whether the result is available:
result = AsyncResult(task_id, app=app)
if result.state == "SUCCESS":
    return result.get()
else:
    return result.state  # or do something else depending on the state
This way you have a non-blocking wsgi app that does what it is best suited for: short, CPU-light calls that do at most some IO, with OS-level scheduling. You can then rely directly on the wsgi server's workers/processes/threads (in uwsgi, gunicorn, or whatever you use) to scale the API for 99% of workloads, while Celery scales horizontally by increasing the number of worker processes.
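To make that concrete, here is a minimal sketch of the whole flow. The broker URL, module layout, and the run_php_script task name are assumptions for illustration, not something from the question:
import subprocess
from celery import Celery
from celery.result import AsyncResult
from flask import Flask, jsonify, request

# Hypothetical setup: a Redis broker/backend running locally.
celery_app = Celery('tasks', broker='redis://localhost:6379/0',
                    backend='redis://localhost:6379/0')
app = Flask(__name__)

@celery_app.task
def run_php_script():
    # The blocking script runs inside the Celery worker, not the wsgi process.
    return subprocess.call(['php', 'script.php'])

@app.route('/endpoint')
def endpoint():
    task = run_php_script.delay()  # returns immediately
    return jsonify({'success': True, 'task_id': task.id})

@app.route('/result')
def result():
    res = AsyncResult(request.args['task_id'], app=celery_app)
    if res.state == 'SUCCESS':
        return jsonify({'result': res.get()})
    return jsonify({'state': res.state})
Run the worker in a separate process (e.g. celery -A tasks worker) so it can pick up the queued tasks.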
This approach works for me: it calls the timeout command (a 10-second sleep) on the command line and lets it run in the background. The response is returned immediately.
import subprocess

@app.route('/endpoint1')
def endpoint1():
    subprocess.Popen('timeout 10', shell=True)
    return 'success1'
However, I only tested this locally, not on a WSGI server.
Would it be enough to use a background task? Then you only need to import threading, e.g.:
import threading

class BackgroundTasks(threading.Thread):
    def run(self, *args, **kwargs):
        # ...do long running stuff
        pass

def endpoint():
    """My endpoint."""
    try:
        t = BackgroundTasks()
        t.start()
    except RuntimeError as exception:
        return f"An error occurred during endpoint: {exception}", 400
    return "successfully started.", 200
I'm trying to make a simple GET request using requests:
import requests

def main():
    content = requests.get("https://google.com")
    print(content.status_code)

if __name__ == "__main__":
    main()
I'm running this on Linux (Ubuntu 17.10).
Python version: either 2.7 or 3.6 (tried both).
The code gets stuck while running; it doesn't time out or anything.
After I stop it, the call stack shows it is stuck at:
File "/usr/lib/python2.7/socket.py", line 228, in meth
    return getattr(self._sock, name)(*args)
I just ran your code in a Python console and it returned 200. I am running Python 3.6.7 on Ubuntu 18.04.
It may be that your computer cannot reach google.com for a very long time; you should pass a timeout parameter along with a try/except.
Use the following code:
import requests

def main():
    success = False
    while not success:
        try:
            content = requests.get("https://google.com", timeout=5)
            success = True
        except requests.exceptions.RequestException:
            pass
    print(content.status_code)

if __name__ == "__main__":
    main()
If there is a temporary problem with your network connection, this code snippet guarantees that you eventually get a proper response.
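If you would rather not retry forever, a bounded alternative is to let requests retry with backoff via urllib3's Retry helper. A sketch, assuming a reasonably recent requests/urllib3 (the retry counts are arbitrary):
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 5 times with exponential backoff instead of looping forever.
session = requests.Session()
session.mount('https://', HTTPAdapter(max_retries=Retry(total=5, backoff_factor=0.5)))

content = session.get("https://google.com", timeout=5)
print(content.status_code)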
I need to create a function that checks that the Mongo servers are running, using the ping function. I set up the clients like this (the config file has a dictionary with the port numbers):
clientList = []
for value in configuration["mongodbServer"]:
    client = motor.motor_tornado.MotorClient('mongodb://localhost:{}'.format(value))
    clientList.append(client)
and then I run this function:
class MongoChecker(Checker):
    formatter = 'stashboard.formatters.MongoFormatter'

    def check(self):
        for x in clientList:
            if x.ping:
                return x.ping
and the error I get:
yielded unknown object MotorDatabase(Database(MongoClient([]), 'ping'))
I think my issue is that I'm using the ping function wrong, but I can't find any other documentation on it, or any other feature that would check whether the servers are still running. If anyone knows a better way to monitor server status using Motor, I'm open. Thanks!
First, there's no "ping" function, so MotorClient thinks you're trying to access the database named "ping"; that's the database shown in the "unknown object" exception. For MongoDB commands like "ping", just use MotorDatabase's command method.
Second, Motor is asynchronous. You must use Motor methods in a Tornado coroutine with the "yield" statement. For example:
@gen.coroutine
def check():
    try:
        result = yield client.admin.command({'ping': 1})
        print(result)
    except ConnectionFailure as exc:
        print(exc)
If you want to test this out synchronously, you can run the IOLoop just long enough for the coroutine to complete:
from pymongo.errors import ConnectionFailure
from tornado import gen
from tornado.ioloop import IOLoop
import motor.motor_tornado
client = motor.motor_tornado.MotorClient()
IOLoop.current().run_sync(check)
For an introduction to Tornado coroutines, see Refactoring Tornado Coroutines and the Tornado documentation.
I have a python web app that uses the pylibmc module to connect to a memcached server. If I test my app with requests once per second or slower, everything works fine. If I send more than one request per second, however, my app crashes and I see the following in my logs:
Assertion "ptr->query_id == query_id +1" failed for function "memcached_get_by_key" likely for "Programmer error, the query_id was not incremented.", at libmemcached/get.cc:107
Assertion "ptr->query_id == query_id +1" failed for function "memcached_get_by_key" likely for "Programmer error, the query_id was not incremented.", at libmemcached/get.cc:89
Any idea what's going wrong or how to fix it?
My code looks like this:
self.mc = pylibmc.Client(
    servers=[os.environ.get(MEMCACHE_SERVER_VAR)],
    username=os.environ.get(MEMCACHE_USER_VAR),
    password=os.environ.get(MEMCACHE_PASS_VAR),
    binary=True
)
# ...
if self.mc is not None:
    self.mc.set(key, stored_data)
# ...
page = self.mc.get(key)
This is a threading issue. pylibmc clients are not thread-safe. You should convert your code to use a ThreadMappedPool object to ensure you keep a separate connection for each thread. Something like this:
mc = pylibmc.Client(
    servers=[os.environ.get(MEMCACHE_SERVER_VAR)],
    username=os.environ.get(MEMCACHE_USER_VAR),
    password=os.environ.get(MEMCACHE_PASS_VAR),
    binary=True
)
self.pool = pylibmc.ThreadMappedPool(mc)
# ...
if self.pool is not None:
    with self.pool.reserve() as mc:
        mc.set(key, stored_data)
# ...
if self.pool is not None:
    with self.pool.reserve() as mc:
        page = mc.get(key)
Make sure to call self.pool.relinquish() when the thread is finished, possibly in the destructor!
(In my case this happened because I was using cherrypy as my web server, and cherrypy spawns 10 separate threads to serve requests by default.)
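For example, the release could look like this (a sketch; worker and the key handling are hypothetical names, and pool is the ThreadMappedPool created above):
import threading

def worker(pool, key):
    try:
        with pool.reserve() as mc:
            mc.get(key)
    finally:
        pool.relinquish()  # release the connection owned by this thread

t = threading.Thread(target=worker, args=(pool, 'some-key'))
t.start()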
I ran into the same issue running Django on Apache. Switching from pylibmc to python-memcached eliminated the problem for me.
I am working my way through learning Twisted, and have stumbled across something I'm not sure I'm terribly fond of - the "Twisted Command Prompt". I am fiddling around with Twisted on my Windows machine, and tried running the "Chat" example:
from twisted.protocols import basic

class MyChat(basic.LineReceiver):
    def connectionMade(self):
        print "Got new client!"
        self.factory.clients.append(self)

    def connectionLost(self, reason):
        print "Lost a client!"
        self.factory.clients.remove(self)

    def lineReceived(self, line):
        print "received", repr(line)
        for c in self.factory.clients:
            c.message(line)

    def message(self, message):
        self.transport.write(message + '\n')

from twisted.internet import protocol
from twisted.application import service, internet

factory = protocol.ServerFactory()
factory.protocol = MyChat
factory.clients = []

application = service.Application("chatserver")
internet.TCPServer(1025, factory).setServiceParent(application)
However, to run this application as a Twisted server, I have to run it via the "Twisted Command Prompt", with the command:
twistd -y chatserver.py
Is there any way to change the code (set Twisted configuration settings, etc) so that I can simply run it via:
python chatserver.py
I've Googled, but the search terms seem to be too vague to return any meaningful responses.
Thanks.
I don't know if it's the best way to do this, but what I do is, instead of:
application = service.Application("chatserver")
internet.TCPServer(1025, factory).setServiceParent(application)
you can do:
from twisted.internet import reactor
reactor.listenTCP(1025, factory)
reactor.run()
To summarize, if you want to have both options (twistd and python):
if __name__ == '__main__':
    from twisted.internet import reactor
    reactor.listenTCP(1025, factory)
    reactor.run()
else:
    application = service.Application("chatserver")
    internet.TCPServer(1025, factory).setServiceParent(application)
Hope it helps!
Don't confuse "Twisted" with "twistd". When you use "twistd", you are running the program with Python. "twistd" is a Python program that, among other things, can load an application from a .tac file (as you're doing here).
The "Twisted Command Prompt" is a Twisted installer-provided convenience to help out people on Windows. All it is doing is setting %PATH% to include the directory containing the "twistd" program. You could run twistd from a normal command prompt if you set your %PATH% properly or invoke it with the full path.
If you're not satisfied with this, perhaps you can expand your question to include a description of the problems you're having when using "twistd".
On Windows you can create a .bat file with your command in it (use full paths), then just click on it to start up.
For example I use:
runfileserver.bat:
C:\program_files\python26\Scripts\twistd.py -y C:\source\python\twisted\fileserver.tac
Maybe one of run or runApp in the twisted.scripts.twistd module will work for you. Please let me know if it does; it would be nice to know!
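An untested sketch of that idea: the twistd console script is essentially a thin wrapper around twisted.scripts.twistd.run(), which parses sys.argv, so faking the argument list might do it (the argument handling here is an assumption):
import sys
from twisted.scripts.twistd import run

# Pretend we were invoked as: twistd -ny chatserver.py
# (-n keeps the process in the foreground instead of daemonizing)
sys.argv = ['twistd', '-ny', 'chatserver.py']
run()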
I haven't used Twisted myself. However, you could check whether twistd is itself a Python file; my guess is that it simply takes care of loading the appropriate Twisted libraries from the correct path.
I am successfully using the simple Twisted Web server on Windows for Flask web sites.
Are others also successfully using Twisted on Windows, to validate that configuration?
new_app.py
if __name__ == "__main__":
    reactor_args = {}

    def run_twisted_wsgi():
        from twisted.internet import reactor
        from twisted.web.server import Site
        from twisted.web.wsgi import WSGIResource

        resource = WSGIResource(reactor, reactor.getThreadPool(), app)
        site = Site(resource)
        reactor.listenTCP(5000, site)
        reactor.run(**reactor_args)

    if app.debug:
        # Disable twisted signal handlers in development only.
        reactor_args['installSignalHandlers'] = 0
        # Turn on auto reload.
        import werkzeug.serving
        run_twisted_wsgi = werkzeug.serving.run_with_reloader(run_twisted_wsgi)

    run_twisted_wsgi()
old_app.py
if __name__ == "__main__":
    app.run()