I am trying to access video data (e.g., frames, video length) from within python.
Spawning something like mplayer is not an option because of a weird bug which apparently exists between mod_wsgi and python.
pyffmpeg and ffvideo no longer compile, and are not in sync with the newest ffmpeg versions.
I am looking for a simple library, if anyone knows of one.
The bug being referred to would have to be the bug in Python 2.7.2. In short they broke the ability to do a fork from within a sub interpreter. See:
http://bugs.python.org/issue13156
The workaround in mod_wsgi is to force your WSGI application to run in the main Python interpreter. This is done using:
WSGIApplicationGroup %{GLOBAL}
If you are hosting multiple WSGI applications in embedded mode and need to do this for more than one of them, you would need to switch to daemon mode and delegate each WSGI application to a separate daemon process group, with each one forced to run in the main interpreter of its own daemon process group.
So, is there any reason you aren't simply using this workaround for the bug in Python 2.7.2?
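Once the application runs in the main interpreter, spawning an external tool from a request handler is ordinary subprocess use. The following is only a sketch for the "video length" part of the question: it assumes a current Python 3 (3.7+ for `capture_output`), assumes the `ffprobe` command-line tool from ffmpeg is installed and on the PATH, and the function names `ffprobe_duration_cmd` and `video_duration` are made up for illustration.

```python
import subprocess


def ffprobe_duration_cmd(path):
    """Build an ffprobe command line that prints only the container duration."""
    return [
        "ffprobe", "-v", "error",
        "-show_entries", "format=duration",
        "-of", "default=noprint_wrappers=1:nokey=1",
        path,
    ]


def video_duration(path, run=subprocess.run):
    """Return the duration of the video in seconds as a float.

    `run` is injectable so the parsing can be exercised without ffprobe;
    by default it is subprocess.run, which forks and execs the tool.
    """
    result = run(
        ffprobe_duration_cmd(path),
        capture_output=True,
        text=True,
        check=True,   # raise CalledProcessError if ffprobe fails
        timeout=30,   # do not let a stuck tool hang the request forever
    )
    return float(result.stdout.strip())
```

Calling `video_duration("/path/to/clip.mp4")` from a view would then fork a child process, which is exactly the operation that the Python 2.7.2 bug broke inside sub-interpreters.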
Related
Problem: a .so (shared object) used as a library works well when called from the plain Python interpreter, but fails in a Python (Django) application running under uWSGI.
More info: I've built a Go module with go build -buildmode=c-shared -o output.so input.go so that I can call it from Python with
from ctypes import cdll
lib = cdll.LoadLibrary('path_to_library/output.so')
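For readers unfamiliar with ctypes, here is a small sketch of how an exported function is usually declared once the library is loaded. It does not address the freeze itself. To stay runnable without a Go toolchain it uses the C library's `strlen` as a stand-in for a function exported from output.so, and it assumes a Linux or macOS system where `ctypes.CDLL(None)` exposes the symbols already loaded into the process; `c_strlen` is a hypothetical wrapper name.

```python
import ctypes

# Assumption: on Linux/macOS, CDLL(None) gives access to symbols already
# loaded into the process (including libc). In the real project this would be
# cdll.LoadLibrary('path_to_library/output.so') instead.
lib = ctypes.CDLL(None)

# Declare the C signature explicitly so ctypes converts arguments and the
# return value correctly instead of assuming everything is a C int.
lib.strlen.argtypes = [ctypes.c_char_p]
lib.strlen.restype = ctypes.c_size_t


def c_strlen(text):
    """Return the byte length of text as computed by the C strlen function."""
    return lib.strlen(text.encode("utf-8"))
```

Declaring `argtypes` and `restype` matters most for functions that take or return pointers or 64-bit values, which is common for cgo exports that pass strings.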
When the Django project is served via uWSGI, the request handler that calls the Go library freezes, which later causes a 504 in Nginx. Once it gets into this "freeze", uWSGI stays locked there and only restarting it brings the app back. No logs AT ALL! It just freezes.
Everything works correctly when I run the same code in the Python interpreter on the same machine.
My thoughts: I've tried to debug this and put a lot of log messages in the library, but that doesn't give much information because the library itself is fine (it works in the interpreter). The library loads correctly under uWSGI as well, because some of the log messages I put in it do appear. I think it is some sort of uWSGI limitation. I don't think posting the uwsgi.ini file would be helpful.
Additional info:
Go dependencies:
fmt
github.com/360EntSecGroup-Skylar/excelize
log
encoding/json
OS: CentOS 6.9
Python: Python 3.6.2
uWSGI: 2.0.15
What limitations does uWSGI have for this kind of shared-object use, and is there a way to overcome them?
Firstly, are you absolutely positive you need to call Go as a library from a uWSGI process?
uWSGI is usually used for interpreted languages such as PHP, Python, Ruby and others. It bootstraps the interpreter and manages the master/worker processes that handle requests. It seems strange to be using it with a Go library.
You mentioned having nginx as your web server, so why not just use your Go program as the HTTP server (which it does great) and call it from nginx directly using its URL:
location /name/ {
proxy_pass http://127.0.0.1/go_url/;
}
See nginx docs.
If you really want to use Go as a Python-imported library via a .so module, you have to be aware that Go has its own runtime and thread management, and may not work well with uWSGI, which handles threads/processes in a different way. In this case I'm unable to help you, since I have never actually tried this.
If you could clarify your question with what you are actually trying to do, we might be able to answer more helpfully.
Attempts and thoughts
I tried to avoid separating the shared library from my Python code, since that requires supporting at least one more process, and I would have to rewrite some of the library to create a new API.
As @Lu.nemec kindly noted:
Go has its own runtime, thread management and may not work well with uWSGI which handles threads/processes in a different way
Since uWSGI is the problem, I started searching for a solution there. One hope was that installing the GCCGO uWSGI plugin would somehow solve the problem. It is hard to install on old OSes, because there are no pre-built plugins and the manual build didn't go very well, and even then it didn't help: nothing changed, it still froze.
Then I thought that I wanted to disable goroutines and that kind of machinery that differs from how uWSGI works, and one of the changes I am able to make is to set GOMAXPROCS:
GOMAXPROCS sets the maximum number of CPUs that can be executing simultaneously and returns the previous setting. If n < 1, it does not change the current setting. The number of logical CPUs on the local machine can be queried with NumCPU. This call will go away when the scheduler improves.
And it worked like a charm!!!
The solution
import (
    ...
    "runtime"
)
...
//export yourFunc
func yourFunc(yourArgs argType) {
    runtime.GOMAXPROCS(1)
    ...
}
My previous answer works in some cases. HOWEVER, when I tried to run the same project on another server with the same OS, same Python, same uWSGI (version, plugins, config files), same requirements and the same .so file, it froze the same way as I described in the question.
I personally didn't want to run this as a separate process, bind it to a socket/port and create an API for communicating with the shared library.
The solution:
In the end a separate process was the only thing required, and I run it with Celery.
Some caveats:
1. You cannot run the task with task.apply(), since it would then execute locally in the main application, not in a Celery worker. Use apply_async() and poll for the result instead:
result = task.apply_async()
while not result.ready():
    time.sleep(5)
2. You need to run Celery with the solo execution pool:
celery -A app worker -P solo
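The polling in caveat 1 can be wrapped in a small helper. This is a sketch rather than the author's code: `wait_for_result` is a hypothetical name, and it only relies on the object having `ready()` and `get()` methods, which Celery's AsyncResult provides. The timeout is an addition so that a stuck worker cannot block the web request indefinitely.

```python
import time


def wait_for_result(result, poll_seconds=5.0, timeout=None):
    """Poll result.ready() until the task finishes, then return result.get().

    result: any object with ready() and get(), e.g. what task.apply_async() returns.
    timeout: optional upper bound in seconds; TimeoutError is raised when exceeded.
    """
    deadline = None if timeout is None else time.monotonic() + timeout
    while not result.ready():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("task did not finish in time")
        time.sleep(poll_seconds)
    return result.get()
```

Usage would look like `value = wait_for_result(task.apply_async(), timeout=120)`. Note that AsyncResult.get() can also block with its own timeout argument, so explicit polling is mainly useful when you want to do something else between checks.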
I have a python program that I would like to constantly be running updates and gathering new data. Essentially, I am gathering data from a bunch of domains. My processors take about a day and a half to run. Once they finish, I'd like them to automatically start over again.
I don't want to use a while loop to just restart the processes without killing everything related first because some of the packages that I am using to support these processors (mainly pyV8) have a problem of memory slowly accumulating and I'm not a good enough programmer to dive into debugging a memory leak in a big package like that. So, I need all of the related processes to successfully die and then come back to life.
I have heard that supervisord can do this type of work, but don't like messing around with .conf files and would prefer to keep everything inside of python.
Summary: Is there a package that kills a script together with all of its related processes, which I could call from a while loop, or some other way to get this behavior from inside a Python script?
I don't see why you couldn't use supervisord. The configuration is really simple and very flexible, and it's not limited to Python programs.
For example, you can create the file /etc/supervisor/conf.d/myprog.conf:
[program:myprog]
command=/opt/myprog/bin/myprog --opt1 --opt2
directory=/opt/myprog
user=myuser
Then reload supervisor's config:
$ sudo supervisorctl reload
and it's on. Isn't it simple enough?
More about supervisord configuration: http://supervisord.org/subprocess.html
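If staying inside Python is a hard requirement, the restart loop the question asks for can be sketched with the standard library alone. This is an illustration under assumptions, not a replacement for supervisord: it is POSIX-only, the names `run_once` and `run_forever` are made up, and it assumes the worker and everything it spawns stay in the process group created for them.

```python
import os
import signal
import subprocess


def run_once(cmd):
    """Run cmd in its own session, wait for it, then kill any leftover children.

    Returns the exit code of the top-level process.
    """
    # start_new_session=True makes the child a session and process-group
    # leader, so its pid doubles as the id of the whole group.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait()
    finally:
        try:
            # Kill whatever is still alive in the group (leaked helpers etc.).
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # nothing left in the group


def run_forever(cmd, max_runs=None):
    """Restart cmd every time it exits; max_runs=None means loop indefinitely."""
    runs = 0
    while max_runs is None or runs < max_runs:
        run_once(cmd)
        runs += 1
    return runs
```

Because each run is a fresh operating-system process, memory leaked by a native extension is returned to the system when the run ends. A call might look like `run_forever(["python", "processors.py"])`, where the script name is a placeholder.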
I'm starting a web project in Python and I'm looking for a process manager that offers reloading in the same manner as PHP-FPM.
I've built stuff with Python before and Paste seems similar to what I want, but not quite.
The need for the ability to reload the process rather than restart is to allow long-running tasks to complete uninterrupted where necessary.
How about supervisor with uwsgi?
So I've been working on my first Django / Python project and I got my production server up and running. I was wondering if it's possible to make Python/FastCGI (I'm not really sure which is responsible for the task) recompile my code. As of right now, when I upload updated code, I need to restart the server for the changes to take effect. I read that you can add some kind of mysite.fcgi file to lighttpd so it sees that you've updated the code. Can you do the same for Nginx / FastCGI?
For anyone else who was interested in my question: this is only a partial solution, but I ended up finding my answer here: How to gracefully restart django running fcgi behind nginx?
You can just run the script (I'm going to modify it a bit) every time you edit your code, and it will gracefully restart everything without dropping connections.
This is a general guide from the mod_wsgi project that outlines how you can monitor code changes from your app_wsgi.py and restart the current process if any of the modules have changed. You need to restart the Python process, because threads contending over modules could mean that a freshly reloaded module has outdated references from other modules that are still waiting to get discovered for reload.
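The change-detection half of such a monitor can be sketched as follows. This is not the code from the mod_wsgi guide, just a minimal illustration of the idea using file modification times; the function names are hypothetical, and how the restart is actually triggered depends on the server and is not shown.

```python
import os
import sys


def loaded_module_files():
    """Return the source/bytecode paths of all currently imported modules."""
    files = set()
    for module in list(sys.modules.values()):
        path = getattr(module, "__file__", None)
        if path:  # built-in and namespace modules have no file
            files.add(path)
    return files


def snapshot_mtimes(paths):
    """Map each existing path to its last-modified time in nanoseconds."""
    return {p: os.stat(p).st_mtime_ns for p in paths if os.path.exists(p)}


def changed_files(before):
    """Return the paths whose mtime differs from the snapshot or that vanished."""
    changed = []
    for path, old_mtime in before.items():
        try:
            new_mtime = os.stat(path).st_mtime_ns
        except OSError:
            changed.append(path)  # deleted or unreadable counts as changed
            continue
        if new_mtime != old_mtime:
            changed.append(path)
    return changed
```

A background thread could take `snapshot_mtimes(loaded_module_files())` at startup, call `changed_files()` every second or so, and ask the server to restart the process as soon as the list is non-empty.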
If you want something that works nicely with nginx, Django and WSGI apps in general, take a peek at Spawning as your WSGI server. Its approach to code reloading is about as graceful as it gets.
It has great documentation and a well-documented request handling model, and it just works, which makes it a no-brainer to configure. You'd need less than five minutes to have your Django instance running on Spawning. Here's another topical blog to get your juices running.
Does anyone know of a working and well documented implementation of a daemon using python? Please post a link here if you know of a project that fits these two requirements.
Three options I can think of:
1. Make a cron job that calls your script. Cron is a common name for a GNU/Linux daemon that periodically launches scripts according to a schedule you set. You add your script into a crontab or place a symlink to it into a special directory and the daemon handles the job of launching it in the background. You can read more at Wikipedia. There is a variety of different cron daemons, but your GNU/Linux system should have it already installed.
2. Pythonic approach (a library, for example) for your script to be able to daemonize itself. Yes, it will require a simple event loop (where your events are timer triggering, possibly, provided by sleep function). Here is the one I recommend & use - A simple unix/linux daemon in Python
3. Use python multiprocessing module. The nitty-gritty of trying to fork a process etc. are hidden in this implementation. It's pretty neat.
I wouldn't recommend 2 or 3 because you would in fact be repeating cron functionality. The Linux system paradigm is to let multiple simple tools interact and solve your problems. Unless there are additional reasons why you should write a daemon (beyond triggering periodically), choose the first approach.
Also, if you daemonize with a loop and a crash happens, make sure that you have logs which will help you debug, and devise a way for the script to start again. By contrast, a script added as a cron job is simply triggered again at the next scheduled interval.
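The "simple event loop" from option 2, together with the advice about logging crashes, can be sketched like this. It covers only the loop, not the daemonizing (forking, detaching from the terminal). The names `run_periodically` and the logger name "mydaemon" are made up for the example, and `max_iterations` and `sleep` exist only to make the loop easy to try out.

```python
import logging
import time

log = logging.getLogger("mydaemon")  # hypothetical logger name


def run_periodically(job, interval_seconds, max_iterations=None, sleep=time.sleep):
    """Call job() every interval_seconds, logging failures instead of dying.

    Returns the number of runs that completed without an exception.
    max_iterations=None means loop until the process is stopped.
    """
    completed = 0
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        iterations += 1
        try:
            job()
            completed += 1
        except Exception:
            # Keep the traceback in the log so a crash can be debugged later,
            # then carry on with the next tick instead of exiting.
            log.exception("job failed; retrying on the next tick")
        sleep(interval_seconds)
    return completed
```

With cron, both the timer and the "start again after a crash" behavior come for free, which is the point of the recommendation above.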
If you just want to run a daemon, consider Supervisor, a daemon that itself controls and manages daemons.
If you want to look at the nitty-gritty, you can check out Supervisor's launch script or some of the responses to this lazyweb request.
Check this link for a double-fork daemon: http://code.activestate.com/recipes/278731-creating-a-daemon-the-python-way/
The code is readable and well-documented. For detailed information on Unix daemons, take a look at chapter 13 of W. Richard Stevens's book 'Advanced Programming in the UNIX Environment'.