Random python module loading failures with threading - python

I'm trying to debug an error where Python import statements randomly fail; at other times they run cleanly.
This is an example of the exceptions I see. Sometimes I'll see this one, sometimes I'll see another one in a different module, though it always seems to hit one of the same four modules.
ERROR:root:/home/user/projecteat/django/contrib/auth/management/__init__.py:25: RuntimeWarning: Parent module 'django.contrib.auth.management' not found while handling absolute import
from django.contrib.contenttypes.models import ContentType
Because of the random nature, I'm almost certain it's a threading issue, but I don't understand why I would get import errors, so I'm not sure what to look for in debugging. Can this be caused by filesystem contention if different threads are trying to load the same modules?
I'm trying to get Django 1.4's LiveServerTestCase working on Google App Engine's development server. The main thread runs django's test framework. When it loads up a LiveServerTestCase based test class, it spawns a child thread which launches the App Engine dev_appserver, which is a local webserver. The main thread continues to run the test, using the Selenium driver to make HTTP requests, which are handled by dev_appserver on the child thread.
The test framework may run a few tests in the LiveServerTestCase based class before tearing down the testcase class. At teardown, the child thread is ended.
It looks like the exceptions are happening in the child (HTTP server) thread, mostly between tests within a single testcase class.
The code for the App Engine LiveServerTestCase class is here: https://github.com/dragonx/djangoappengine/blob/django-1.4/test.py
It's pretty hard to provide all the debugging info required for this question. I'm mostly looking for suggestions as to why python import statements would give RuntimeWarning errors.

I have a partial answer to my own question. What's going on is that I have two threads running.
Thread 1 is running the main internal function inside dev_appserver (dev_appserver_main) which is handling HTTP requests.
Thread 2 is running the Selenium based testcases. This thread will send commands to the browser to do something (which then indirectly generates an HTTP request and re-enters in thread 1). It then either issues more requests to Selenium to check status, or makes a datastore query to check for a result.
I think the problem is that while handling each HTTP request, Thread 1 (dev_appserver) changes the environment so that certain folders are not accessible (folders excluded in app.yaml, as well as anything outside the App Engine sandbox). If Thread 2 happens to run some code during that window, imports may fail if the modules live in those folders.
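To make the suspected mechanism concrete, here is a minimal, self-contained sketch of how one thread mutating sys.path can make imports in another thread fail intermittently. Everything in it is invented for the demo; it is not dev_appserver code.
# import_race_demo.py
import importlib
import os
import sys
import tempfile
import threading
import time

# Create a throwaway directory containing a tiny module we can import.
pkg_dir = tempfile.mkdtemp()
with open(os.path.join(pkg_dir, "demo_mod.py"), "w") as f:
    f.write("VALUE = 42\n")
sys.path.insert(0, pkg_dir)

stop = threading.Event()

def path_toggler():
    # Stand-in for dev_appserver hiding folders around each "request".
    while not stop.is_set():
        sys.path.remove(pkg_dir)
        time.sleep(0.001)
        sys.path.insert(0, pkg_dir)
        time.sleep(0.001)

def importer():
    # Stand-in for the test thread: import the module repeatedly.
    failures = 0
    for _ in range(500):
        sys.modules.pop("demo_mod", None)  # force a fresh import each time
        try:
            importlib.import_module("demo_mod")
        except ImportError:
            failures += 1
        time.sleep(0.001)
    print("intermittent import failures:", failures)

threading.Thread(target=path_toggler, daemon=True).start()
importer()
stop.set()
Run it a few times and the failure count varies, which matches the random pattern described above.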


Python Multiprocessing returning results with Logging and running frozen on Windows

I need some help with implementing logging while multiprocessing and running the application frozen under Windows. There are dozens of topics on this subject and I have spent a lot of time reviewing and testing those. I have also extensively reviewed the documentation, but I cannot figure out how to implement this in my code.
I have created a minimum example which runs fine on Linux, but crashes on Windows (even when not frozen). The example I created is just one of many iterations I have put my code through.
You can find the minimum example on github. Any assistance to get this example working would be greatly appreciated.
Thank you.
Marc.
The basics
On Linux, a child process is created with the fork method by default. That means the child process inherits almost everything from the parent process.
On Windows, the child process is created with the spawn method.
That means a child process starts almost from scratch: it re-imports and re-executes any code that is outside of the if __name__ == '__main__' guard clause.
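For example, here is a minimal sketch (not from the question's code) of why that guard matters under spawn:
# spawn_guard_demo.py
import multiprocessing

def work():
    print("child did some work")

if __name__ == '__main__':
    # Under spawn, the child re-imports this file. Without the guard, the
    # lines below would run again in the child and raise a RuntimeError about
    # starting a new process before the bootstrapping phase has finished.
    multiprocessing.set_start_method("spawn")  # explicit; the default on Windows
    p = multiprocessing.Process(target=work)
    p.start()
    p.join()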
Why it worked or failed
On Linux, because the logger object is inherited, your program will log as expected.
But it is far from perfect, since you log directly to the file.
Sooner or later, log lines will overlap or file I/O errors will occur due to the race condition between processes.
On Windows, since you didn't pass the logger object to the child process, and the child re-imports your pymp_global module, logger is None. So when you try to log with a None object, it crashes for sure.
The solution
Logging with multiprocessing is not an easy task.
For it to work on Windows, you must either pass a logger object to the child processes or log with a QueueHandler. Another, similar solution for inter-process communication is to use a SocketHandler.
The idea is that only one thread or process does the logging. The other processes just send the log records. This prevents the race condition and ensures the log is written out after the critical process has had time to do its job.
So how to implement it?
I have encountered this logging problem before and have already written the code.
You can just use the logger-tt package.
#pymp.py
from logging import getLogger
from logger_tt import setup_logging
setup_logging(use_multiprocessing=True)
logger = getLogger(__name__)
# other code below
For other modules
#pymp_common.py
from logging import getLogger
logger = getLogger(__name__)
# other code below
This saves you from writing all the logging config code everywhere manually.
You may want to change the log_config file to suit your needs.
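If you prefer to wire the pattern up yourself with only the standard library, it looks roughly like the following sketch (file and handler names are made up; this is just the same idea, not logger-tt's internals):
# pymp_queue_sketch.py
import logging
import logging.handlers
import multiprocessing

def worker(log_queue):
    # Children only attach a QueueHandler; they never touch the log file.
    handler = logging.handlers.QueueHandler(log_queue)
    root = logging.getLogger()
    root.setLevel(logging.INFO)
    root.addHandler(handler)
    logging.info("hello from %s", multiprocessing.current_process().name)

if __name__ == '__main__':
    log_queue = multiprocessing.Queue()

    # Only the QueueListener in the main process writes to the file.
    file_handler = logging.FileHandler("pymp.log")
    file_handler.setFormatter(logging.Formatter("%(processName)s %(levelname)s %(message)s"))
    listener = logging.handlers.QueueListener(log_queue, file_handler)
    listener.start()

    procs = [multiprocessing.Process(target=worker, args=(log_queue,)) for _ in range(3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()

    listener.stop()
This works with both fork and spawn, because the queue, not the logger configuration, is what gets passed to the children.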

GAE update app, how to avoid violent stop of long process?

I have a GAE app that spawns a long process via another module (managed by basic_scaling).
This long process handles the DeadlineExceededError correctly by spawning a deferred method that saves the current state of the long process so it can be resumed later.
Today I discovered that when I do an appcfg.py -A <YOUR_PROJECT_ID> update myapp/, it abruptly stops the long process. It just stops: no DeadlineExceededError (there goes my hope), nothing.
Are there any events triggered by GAE before stopping the app that would let me save the current state of my long process, write data to files (via S3, so a bit slow), and re-queue the process to be re-run later (or something like that)?
Thank you for your help.
From Scaling types and instance classes, both manual and basic scaling appear to behave identically from the instance shutdown perspective:
As with manual scaling, an instance that is stopped (with appcfg stop
or from the Cloud Platform Console) has 30 seconds to finish handling
requests before it is forcibly terminated.
I assume the same shutdown method is used when the app is updated.
And from Shutdown:
There are two ways for an app to determine if a manual scaling
instance is about to be shut down. First, the is_shutting_down()
method from google.appengine.api.runtime starts returning true. Second
(and preferred), you can register a shutdown hook, as described below.
When App Engine begins to shut down an instance, existing requests are
given 30 seconds to complete, and new requests immediately return 404.
If an instance is handling a request, App Engine pauses the request
and runs the shutdown hook. If there is no active request, App Engine
sends an /_ah/stop request, which runs the shutdown hook. The
/_ah/stop request bypasses normal handling logic and cannot be handled
by user code; its sole purpose is to invoke the shutdown hook. If you
raise an exception in your shutdown hook while handling another
request, it will bubble up into the request, where you can catch it.
If you have enabled concurrent requests by specifying threadsafe: true
in app.yaml (which is the default), raising an exception from a
shutdown hook copies that exception to all threads. The following code
sample demonstrates a basic shutdown hook:
from google.appengine.api import apiproxy_stub_map
from google.appengine.api import runtime

def my_shutdown_hook():
    apiproxy_stub_map.apiproxy.CancelApiCalls()
    save_state()
    # May want to raise an exception

runtime.set_shutdown_hook(my_shutdown_hook)
Alternatively, the following sample demonstrates how to use the
is_shutting_down() method:
while more_work_to_do and not runtime.is_shutting_down():
    do_some_work()
    save_state()
Note: It's important to recognize that the shutdown hook is not always
able to run before an instance terminates. In rare cases, an outage
can occur that prevents App Engine from providing 30 seconds of
shutdown time. Thus, we recommend periodically checkpointing the state
of your instance and using it primarily as an in-memory cache rather
than a reliable data store.
Based on my assumption above, I expect these methods should work for your case as well; give them a try.
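Tying that back to the question, a shutdown hook could checkpoint the work and re-queue it with the deferred library, roughly like the sketch below (save_state and resume_long_process are placeholders for your own code, not a documented API):
from google.appengine.api import runtime
from google.appengine.ext import deferred

def resume_long_process(checkpoint):
    # Continue the long process from the saved checkpoint.
    pass

def save_state():
    # Persist the current progress somewhere durable and return a checkpoint token.
    return {}

def my_shutdown_hook():
    checkpoint = save_state()
    # Re-queue the remaining work so it runs again on another instance later.
    deferred.defer(resume_long_process, checkpoint)

runtime.set_shutdown_hook(my_shutdown_hook)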
It looks like you're replacing an existing version of your app (the default version). When you do this, existing processing is not handled gracefully.
Whenever I update the production version of my app, I do it in a new version. I use the current date for my version name (e.g., 2016-05-13). I then go to the Google cloud console and make that new version the default. This way, the old version continues to run in parallel.
I asked a similar question a couple years ago that you can see here.

Mock module in code run by selenium's webdriver

My code is running some instances of threading.Thread for some long asynchronous tasks.
This does not allow me to run my Django unit tests using the sqlite backend, because sqlite cannot handle multiple connections in threads. Thus, I am successfully mocking Thread with a FakeThread class that I wrote (it simply runs the target synchronously).
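For context, such a synchronous stand-in might look like the following sketch (this is not the asker's actual tests.stubs.FakeThread):
class FakeThread(object):
    """Drop-in for threading.Thread that runs the target synchronously."""

    def __init__(self, target=None, args=(), kwargs=None, **ignored):
        self._target = target
        self._args = args
        self._kwargs = kwargs or {}

    def start(self):
        # Run the target immediately in the calling thread.
        if self._target is not None:
            self._target(*self._args, **self._kwargs)

    def join(self, timeout=None):
        # Nothing to wait for; the work already ran inside start().
        pass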
However, the mock does not seem to work for selenium tests. I do:
from tests.stubs import FakeThread
# ...
class FunctionalTest(LiveServerTestCase):
    @mock.patch('accounts.models.user_profile.Thread', new=FakeThread)
    def test_register_agency(self):
        self.browser.get("%s%s" % (self.live_server_url, "/register"))
        # .. fill in form, submit, eventually calls something in user_profile
        # using an instance of Thread. Thread seems to still be threading.Thread
Any idea how to mock Thread in the code that selenium runs when serving my browser calls? Thank you!

Flask/Werkzeug debugger, process model, and initialization code

I'm writing a Python web application using Flask. My application establishes a connection to another server at startup, and communicates with that server periodically in the background.
If I don't use Flask's builtin debugger (invoking app.run with debug=False), no problem.
If I do use the builtin debugger (invoking app.run with debug=True), Flask starts a second Python process with the same code. It's the child process that ends up listening for HTTP connections and generally behaving as my application is supposed to, and I presume the parent is just there to watch over it when the debugger kicks in.
However, this wreaks havoc with my startup code, which runs in both processes; I end up with 2 connections to the external server, 2 processes logging to the same logfile, and in general, they trip over each other.
I presume that I should not be doing real work before the call to app.run(), but where should I put this initialization code (which I only want to run once per Flask process group, regardless of the debugger mode, but which needs to run at startup and independent of client requests)?
I found this question about "Flask auto-reload and long-running thread", which is somewhat related but somewhat different, and the answer didn't help me. (I too have a separate long-running thread marked as a daemon thread that is killed when the reloader kicks in, but the problem I'm trying to solve arises before any reload needs to happen. I'm not concerned with the reload; I'm concerned with the extra process, and the right way to avoid executing unnecessary code in the parent process.)
I confirmed this behavior is due to Werkzeug, not Flask proper, and it is related to the reloader. You can see this in Werkzeug's serving.py -- in run_simple(), if use_reloader is true, it invokes make_server via a helper function run_with_reloader() / restart_with_reloader() which does a subprocess.call(sys.executable), after setting an environment variable WERKZEUG_RUN_MAIN in the environment which will be inherited by the subprocess.
I worked around it with a fairly ugly hack: in my main function, before creating the wsgi application object and calling app.run(), I look for WERKZEUG_RUN_MAIN:
if use_reloader and not os.environ.get('WERKZEUG_RUN_MAIN'):
    logger.warning('startup: pid %d is the werkzeug reloader' % os.getpid())
else:
    logger.warning('startup: pid %d is the active werkzeug' % os.getpid())
    # my real init code is invoked from here
I have a feeling this would be better done from inside the application object, if there's a method that's called before Werkzeug starts serving it. I don't know of such a method, though.
This all boils down to: in Werkzeug's serving.py, there's only going to be one eventual call to make_server().serve_forever(), but there may be two calls to run_simple() (and the entire call stack up to that point) before we make it to make_server().
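Putting the workaround together, a minimal sketch of gating one-time startup work might look like this (the function and file names are mine, not from the question; the check relies on the WERKZEUG_RUN_MAIN behaviour described above):
# app_sketch.py
import os
from flask import Flask

app = Flask(__name__)

def one_time_init():
    # Stand-in for the real work: open the long-lived connection to the other
    # server, start the background thread, configure logging, etc.
    print("initializing in pid %d" % os.getpid())

if __name__ == '__main__':
    use_reloader = True
    # With the reloader on, only the child process (which Werkzeug marks with
    # WERKZEUG_RUN_MAIN) actually serves requests, so only it should run the
    # one-time initialization.
    if not use_reloader or os.environ.get('WERKZEUG_RUN_MAIN'):
        one_time_init()
    app.run(debug=True, use_reloader=use_reloader)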

Keep code from running during syncdb

I have some code that causes syncdb to throw an error (because it tries to access the model before the tables are created).
Is there a way to keep the code from running on syncdb? something like:
if not syncdb:
    run_some_code()
Thanks :)
edit: PS - I thought about using the post_init signal... for the code that accesses the db, is that a good idea?
More info
Here is some more info as requested :)
I've run into this a couple of times; for instance, I was hacking on django-cron and determined it was necessary to make sure there are no existing jobs when you load Django (because it searches all the installed apps for jobs and adds them on load anyway).
So I added the following code to the top of the __init__.py file:
import sqlite3

try:
    # Delete all the old jobs from the database so they don't
    # interfere with this instance of django
    oldJobs = models.Job.objects.all()
    for oldJob in oldJobs:
        oldJob.delete()
except sqlite3.OperationalError:
    # When you do syncdb for the first time, the table isn't
    # there yet and throws a nasty error... until now
    pass
For obvious reasons this is crap. It's tied to sqlite, and I'm sure there are better places to put this code (this is just how I happened upon the issue), but it works.
As you can see, the error you get is an OperationalError (in sqlite) and the stack trace says something along the lines of "table django_cron_job not found".
Solution
In the end, the goal was to run some code before any pages were loaded.
This can be accomplished by executing it in the urls.py file, since it has to be imported before a page can be served (obviously).
And I was able to remove that ugly try/except block :) Thank god (and S. Lott)
"edit: PS - I thought about using the post_init signal... for the code that accesses the db, is that a good idea?"
Never.
If you have code that's accessing the model before the tables are created, you have big, big problems. You're probably doing something seriously wrong.
Normally, you run syncdb approximately once. The database is created. And your web application uses the database.
Sometimes, you make a design change, and drop and recreate the database. And then your web application uses that database for a long time.
You (generally) don't need code in an __init__.py module. You should (almost) never have executable code that does real work in an __init__.py module. It's very, very rare, and inappropriate for Django.
I'm not sure why you're messing with __init__.py when Django Cron says that you make your scheduling arrangements in urls.py.
Edit
Clearing records is one thing.
Messing around with __init__.py and Django-cron's base.py are clearly completely wrong ways to do this. If it's that complicated, you're doing it wrong.
It's impossible to tell what you're trying to do, but it should be trivial.
Your urls.py can only run after syncdb and after all of the ORM material has been configured and bound correctly.
Your urls.py could, for example, delete some rows and then add some rows to a table. At this point, all syncdb issues are out of the way.
Why don't you have your logic in urls.py?
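A rough illustration of that suggestion (the import path below is a placeholder for wherever the Job model actually lives):
# urls.py
from django_cron import models

# This module is only imported when Django serves a page, which happens after
# syncdb has already created the tables, so this query is safe here.
models.Job.objects.all().delete()

# ... the usual urlpatterns definition follows ...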
Code that tries to access the models before they're created can pretty much exist only at the module level; it would have to be executable code run when the module is imported, as your example indicates. This is, as you've guessed, the reason why syncdb fails. It tries to import the module, but the act of importing the module causes application-level code to execute; a "side effect", if you will.
The desire to avoid module imports that cause side-effects is so strong in Python that the if __name__ == '__main__': convention for executable python scripts has become commonplace. When just loading a code library causes an application to start executing, headaches ensue :-)
For Django apps, this becomes more than a headache. Consider the effect of having oldJob.delete() executed every time the module is imported. It may seem like it's executing only once when you run with the Django development server, but in a production environment it will get executed quite often. If you use Apache, for example, Apache will frequently fire up several child processes waiting around to handle requests. As a long-running server progresses, your Django app will get bootstrapped every time a handler is forked for your web server, meaning that the module will be imported and delete() will be called several times, often unpredictably. A signal won't help, unfortunately, as the signal could be fired every time an Apache process is initialized as well.
It isn't, by the way, just a web server that could cause your code to execute inadvertently. If you use tools like epydoc, for example, they will import your code to generate API documentation. This in turn would cause your application logic to start executing, which is obviously an undesired side effect of just running a documentation parser.
For this reason, cleanup code like this is best handled by a cron job, which looks for stale jobs on a periodic basis and cleans up the DB. Such a custom script can also be run manually, or by any process (for example during a deployment, or as part of your unit test setUp() function to ensure a clean test run). No matter how you do it, the important point is that code like this should always be executed explicitly, rather than implicitly as a result of opening the source file.
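As a sketch of that explicit approach, the cleanup could live in a custom management command (app and model names are hypothetical) that cron, a deployment script, or a test's setUp() can invoke:
# myapp/management/commands/cleanup_jobs.py
from django.core.management.base import BaseCommand
from django_cron.models import Job

class Command(BaseCommand):
    help = "Delete stale Job rows left over from previous runs."

    def handle(self, *args, **options):
        count = Job.objects.count()
        Job.objects.all().delete()
        self.stdout.write("deleted %d stale job(s)\n" % count)
Run it with manage.py cleanup_jobs whenever a clean slate is needed.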
I hope that helps. I know it doesn't provide a way to determine if syncdb is running, but the syncdb issue will magically vanish if you design your Django app with production deployment in mind.
