Write custom translation allocation in Pyramid - python

Pyramid uses gettext *.po files for translations, a very good and stable way to internationalize an application. Its one disadvantage is that the translations cannot be changed from the app itself. I need some way to give a normal user the ability to change the translations on his own. Django allows changing the file directly and restarts the whole app after the change. I do not have that freedom, because the changes will be quite frequent.
Since I could not find any package that would help me with the task, I decided to override the Localizer. My idea is based on using a Translation Domain the way Zope projects do: make the Localizer search for a registered domain and, if none is found, fall back to the default translation strategy.
The problem is that I could not find a good way to plug a custom translation solution into the Localizer itself. All I could think of was to reimplement the get_localizer method and rewrite the whole Localizer. But there are several things that would need to be copy-pasted, such as the interpolation of mappings and other tweaks related to translation strings.

I don't know how much you have in there, but I did something similar a while ago and will have to do it again at some point. The implementation is pretty simple.
If you can be sure that all calls will be handled through _() or something similar, you can provide your own function. It would look something like this:
def _(msgid):
    # look up a user-provided translation first (MongoDB-style lookup)
    doc = db.translations.find_one({'key': msgid, 'locale': request.locale_name})
    if doc:
        return doc['value']
    # fall back to the regular gettext translation
    return real_gettext(msgid)
This is pretty simple... Then you need something that will dump the database back into the file.
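A minimal sketch of that dump step, assuming the MongoDB-style db.translations collection from the snippet above and the third-party polib library for writing gettext catalogs:

import polib

def dump_catalog(locale, path):
    # write every stored override for this locale back into a *.po file
    po = polib.POFile()
    for doc in db.translations.find({'locale': locale}):
        po.append(polib.POEntry(msgid=doc['key'], msgstr=doc['value']))
    po.save(path)  # still needs compiling to *.mo, e.g. po.save_as_mofile()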
But I guess overriding the Localizer makes more sense. I did it a long time ago, and overriding the function was easier than searching through the code.
The plus side of the Localizer approach is that it will work everywhere. A monkey patch is pretty cool, but it's also pretty ugly. If I had to do it again, I'd provide my own Localizer that loads from a database first and otherwise falls back to its own value. The reason for the database is that if the server shuts down before the file has been regenerated, you won't lose the changes.
If a database is more than you need, then the Localizer alone is good enough and you can update the file on every change. If the server gets restarted it will load the new files... You'll have to compile the catalog first.
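For the custom-Localizer route, here is a rough sketch of what wrapping Pyramid's localizer could look like. The lookup_translation(tstring, locale) helper is hypothetical and stands in for whatever datastore query you use:

from pyramid.i18n import get_localizer

def get_db_localizer(request):
    # wrap the regular localizer so database entries win over the *.po catalogs
    localizer = get_localizer(request)
    default_translate = localizer.translate

    def translate(tstring, domain=None, mapping=None):
        value = lookup_translation(tstring, request.locale_name)  # hypothetical helper
        if value is not None:
            return value  # note: this skips mapping interpolation for DB hits
        # fall back to the default gettext-based translation
        return default_translate(tstring, domain=domain, mapping=mapping)

    localizer.translate = translate
    return localizer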

Related

Pythonic Django object reuse

I've been racking my brain on this for the last few weeks and I just can't seem to understand it. I'm hoping you folks here can give me some clarity.
A LITTLE BACKGROUND
I've built an API to help serve a large website and, like all of us, I'm trying to keep the API as efficient as possible. Part of this efficiency is to NOT create an object that contains custom business logic over and over again (example: a service class) as requests are made. To give some personal background, I come from the Java world, so I'm used to using IoC or DI to handle object creation and injection into my classes, to ensure classes are NOT created over and over on a per-request basis.
WHAT I'VE READ
While looking at many Python IoC and DI posts I've become rather confused about how to best approach creating a given class without having to worry about the server getting overloaded with too many objects based on the number of requests it may be handling.
Some people say an IoC or DI really isn't needed. But as I run my Django app I find that unless I construct the object I want globally (top of file) for views.py to use later, rather than within each view class or def within views.py, I run the risk of creating multiple instances of the same class, which from what I understand would cause memory bloat on the server.
So what's the right, Pythonic way to keep objects from being built over and over? Should I invest in using an IoC / DI or not? Can I safely rely on setting up my service.py files to just contain defs instead of classes that contain defs? Is the garbage collector just THAT efficient, so I don't even have to worry about it?
I've purposely not placed any code in this post since this seems like a general question, but I can provide a few code examples if that helps.
Thanks
From a confused engineer that wants to be as pythonic as possible
You come from a background where everything needs to be a class. I've programmed web apps in Java too, and sometimes it's harder to unlearn old things than to learn new things; I understand.
In Python / Django you wouldn't make anything a class unless you need many instances and need to keep state.
For a service that's hardly the case, and sometimes you'll notice that in Java-like web apps some services are made singletons, which is just a workaround and a rather big anti-pattern in Python.
Pythonic
Python is flexible enough that a "service class" isn't required; you'd just have a Python module (e.g. services.py) with a number of functions, the emphasis being on each function taking something in and returning something, in a completely stateless fashion.
# services.py
# This is a module: it keeps no state of its own.
# It may read and write to the DB, do some processing, etc., but doesn't remember things.
from myapp.models import Score  # wherever your models live

def get_scores(student_id):
    return Score.objects.filter(student=student_id)

# views.py
# receives HTTP requests
import services

def view_scores(request, student_id):
    scores = services.get_scores(student_id)
    # e.g. use the scores queryset in a template, return an HTML page
Notice how, if you need to swap out the service, you'll just be swapping out a single Python module (just a file, really), which is why Pythonistas hardly bother with explicit interfaces and other abstractions.
Memory
Now, per each "Django worker process", you'd have that one services module, which is used over and over for all requests that come in; and when the Score queryset is used and no longer pointed at in memory, it'll be cleaned up.
I saw your other post, and, well, instantiating a ScoreService object for each request, or keeping an instance of it in the global scope, is just unnecessary; the above example does the job with one module in memory and doesn't need us to be smart about it.
And if you did need to keep state between several requests, keeping it in live instances of ScoreService would be a bad idea anyway, because every user might need their own instance; that's not viable (too many live objects keeping context). Not to mention that such an instance is only accessible from the same process unless you have some sharing mechanism in place.
Keep state in a datastore
In case you want to keep state between requests, you'd keep the state in a datastore. When a request comes in, you hit the services module again to get the context back from the datastore, pick up where you left off, do your business, and return your HTTP response; then unused things get garbage collected.
The emphasis is on keeping things stateless, where any given HTTP request can be processed by any given Django process, and all state objects are garbage collected after the response is returned and the objects go out of scope.
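As a sketch of that idea, here's what the services module could look like with Django's cache framework as the datastore (the helper names are made up):

# services.py -- keep per-user context in a shared datastore, not in live objects
from django.core.cache import cache

def load_context(user_id):
    # pick up where the last request left off; an empty dict means a fresh start
    return cache.get('ctx:%s' % user_id, {})

def save_context(user_id, ctx):
    # any worker process can read this back on the next request
    cache.set('ctx:%s' % user_id, ctx, timeout=3600)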
This may not be the fastest request/response cycle we can pull off, but it's scalable as hell.
Look at some major web apps written in Django
I suggest you look at some open source Django projects and at how they're organized; you'll see that a lot of the things you're busting your brain over, Djangonauts just don't bother with.

Programmatically modifying someone's AppDelegate - categories, subclass?

I am working on a framework installer script. The script needs to modify the user's AppDelegate file and inject a few lines of code at the beginning or end of the applicationDidFinishLaunching and applicationWillTerminate methods.
Some options I've thought about:
Parse the source code and insert lines at the correct positions. (Can be difficult to get right and make work for everyone's code; just about equivalent to writing a compiler...)
Subclass the AppDelegate file (is this possible?)
Categories??
Which of these is the best option? Any other suggestions?
If you really need to make this something that modifies the AppDelegate with no intervention at all from the developer, and you can modify the xcodeproj and the nib but not the source, there is a way to do it.
First, make sure your classes get compiled in, and an instance of your class gets created in the nib.
Now, here's what you do:
Define a -[AHHackClass applicationDidFinishLaunching:] method that does your extra stuff and then calls [self originalApplicationDidFinishLaunching:].
In -[AHHackClass awakeFromNib], use Objective-C runtime calls to copy the -[AHHackClass applicationDidFinishLaunching:] method onto the application delegate as originalApplicationDidFinishLaunching:, then use method swizzling to swap the two methods' implementations.
Do the same to swizzle applicationWillTerminate:.
See JRSwizzle for some code that makes the method swizzling much easier, and MethodSwizzling at CocoaDev for some background.
However, there may be a much easier way to do this: does your extra stuff really need to be called from the app delegate's applicationDidFinishLaunching: and applicationWillTerminate: methods? Can't you just set up to listen for notifications in your awakeFromNib and handle things there?
And if, for some reason, you can't do that, can you just put a line in the instructions to the developer to call your method from their applicationDidFinishLaunching method?
One solution I am currently considering:
Add a NewAppDelegate.m/h file that subclasses AppDelegate.
This subclass does what I want and then calls the super methods.
Find/replace AppDelegate with NewAppDelegate in main.m.
This seems pretty simple and robust. Thoughts on this? Will this work for all/most projects?

Recommended approach for loading CouchDB design documents in Python?

I'm very new to Couch, but I'm trying to use it on a new Python project, and I'd like to write the design documents (views) in Python as well. I've already configured Couch to use the couchpy view server, and I can confirm this works by entering some simple map/reduce functions into Futon.
Are there any official recommendations on how to load/synchronize design documents when using Python's couchdb module?
I understand that I can post design documents to "install" them into Couch, but my question is really around best practices. I need some kind of strategy for deploying, both in development environments and in production environments. My intuition is to create a directory and store all of my design documents there, then write some kind of sync script that will upload each one into couch (probably just blindly overwriting what's already there). Is this a good idea?
The documentation for "Writing views in Python" is five sentences long and really just explains how to install couchpy. On the project's Google Code site, there is mention of a couchdb.design module that sounds like it might help, but there's no documentation (that I can find). The source code for that module indicates that it does most of what I'm interested in, but it stops short of actually loading files. I think I should do some kind of module discovery, but I've heard that's non-Pythonic. Advice?
Edit:
In particular, the idea of storing my map/reduce functions inside string literals seems completely hacky. I'd like to write real python code, in a real module, in a real package, with real unit tests. Periodically, I'd like to synchronize my "couch views" package with a couchdb instance.
Here's an approach that seems reasonable. First, I subclass couchdb.design.ViewDefinition. (Comments and pydocs removed for brevity.)
import couchdb.design
import inflection

DESIGN_NAME = "version"

class CurrentVersion(couchdb.design.ViewDefinition):
    def __init__(self):
        map_fun = self.__class__.map
        if hasattr(self.__class__, "reduce"):
            reduce_fun = self.__class__.reduce
        else:
            reduce_fun = None
        super_args = (DESIGN_NAME,
                      inflection.underscore(self.__class__.__name__),
                      map_fun,
                      reduce_fun,
                      'python')
        super(CurrentVersion, self).__init__(*super_args)

    @staticmethod
    def map(doc):
        if 'version_key' in doc and 'created_ts' in doc:
            yield (doc['version_key'], [doc['_id'], doc['created_ts']])

    @staticmethod
    def reduce(keys, values, rereduce):
        max_index = 0
        for index, value in enumerate(values):
            if value[1] > values[max_index][1]:
                max_index = index
        return values[max_index]
Now, if I want to synchronize:
import couchdb.design
from couchview.version import CurrentVersion
db = get_couch_db() # omitted for brevity
couchdb.design.ViewDefinition.sync_many(db, [CurrentVersion()], remove_missing=True)
The benefits of this approach are:
Organization. All designs/views exist as modules/classes (respectively) located in a single package.
Real code. My text editor will highlight syntax. I can write unit tests against my map/reduce functions.
The ViewDefinition subclass can also be used for querying.
current_version_view = couchview.version.CurrentVersion()
result = current_version_view(self.db, key=version_key)
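For example, because map and reduce are plain static methods, a unit test can call them directly without a CouchDB instance. A sketch against the CurrentVersion class above:

import unittest
from couchview.version import CurrentVersion

class CurrentVersionMapTests(unittest.TestCase):
    def test_map_emits_version_key(self):
        doc = {'_id': 'abc', 'version_key': 'v1', 'created_ts': 42}
        self.assertEqual(list(CurrentVersion.map(doc)), [('v1', ['abc', 42])])

    def test_map_skips_incomplete_docs(self):
        self.assertEqual(list(CurrentVersion.map({'_id': 'abc'})), [])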
It's still not ready for production, but I think this is a big step closer compared to storing map/reduce functions inside string literals.
Edit: I eventually wrote a couple blog posts on this topic, since I couldn't find any other sources of advice:
http://markhaase.com/2012/06/23/couchdb-views-in-python/
http://markhaase.com/2012/07/01/unit-tests-for-python-couchdb-views/

How to make django test framework read from live database?

I realize there's a similar question here, but this one takes a different approach: I have a Django app that does queries over data indexed with djapian; I'd like to write unit tests for this app's search component, and obviously I'd need the Django settings module and all connections with the database active, so the test runner that Django provides seems ideal. However, the Django testing framework creates a dummy database, and I'd hate to dump all my data to a fixture and then index it (the tests would take forever!).
My data isn't at risk because the tests would only read from the database, so how could this be achieved? I'm new to this whole unit testing thing, so the solution of writing a new test runner that I read about in that similar question doesn't enlighten me a bit, at least not without some details.
Reading the test cases for djapian I found something really interesting: what those guys do is use the setUp method of the TestCase class: they create an object and then call the indexer's update method, so they effectively have a document to search for and a way to write controlled query tests!
For the curious, the method looks something like this:
def setUp(self):
    p = Person.objects.create(name="Alex")
    for i in range(self.num_entries):
        Entry.objects.create(author=p, title="Entry with number %s" % i, text="foobar " * i)
    Entry.indexer.update()
I think this would do, but we have to remember I'm testing a little search engine here, so this solution might be the easy way out; I can't come up with an objection, though, so if you have an answer that will help define a strategy for testing this kind of web app in Python in general, it's more than welcome!
(I also wanted to test the latency of the queries against a fully populated database, but I think I can do that later with bench tests in Funkload.)
EDIT: OK, to be faithful to a solution for anyone interested, I ran into another issue: the Xapian index (as stated in the comment). To solve it, I created a custom test runner that swaps the production Xapian index for a test index (a smaller one, created with a management script). The runner is fairly simple:
from django.conf import settings
from django.test.simple import run_tests  # the default Django test runner being wrapped

def custom_run_tests(test_labels, verbosity=1, interactive=True, extra_tests=[]):
    """Set the test indices."""
    settings.CATEGORY_CLASSIFIER_DATA = settings.TEST_CLASSIFIER_DATA
    return run_tests(test_labels, verbosity, interactive, extra_tests)
And, to use it, I simply added a setting:
TEST_RUNNER = 'search.tests.custom_run_tests'
I dropped the aforementioned approach (creating the documents in setUp) for performance and readability reasons: to test the database I needed a decent number of documents with some text (a paragraph or two), so I ended up creating a fixture for that (I used a management command that created the documents in the real database, serialized them by writing them to a file, and then deleted them).
So, in the end, I didn't read from the live db at all and instead used test fixtures created with a somewhat hacky script and a custom runner, and it wasn't that hard :)

Is it ever polite to put code in a python configuration file?

One of my favorite features about python is that you can write configuration files in python that are very simple to read and understand. If you put a few boundaries on yourself, you can be pretty confident that non-pythonistas will know exactly what you mean and will be perfectly capable of reconfiguring your program.
My question is, what exactly are those boundaries? My own personal heuristics were:
Avoid flow control. No functions, loops, or conditionals. Those wouldn't be in a text config file, and people aren't expecting to have to understand them. In general, it probably shouldn't matter in which order your statements execute.
Stick to literal assignments. Methods and functions called on objects are harder to think through. Anything implicit is going to be a mess. If there's something complicated that has to happen with your parameters, change how they're interpreted.
Language keywords and error handling are right out.
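A config file that stays inside those boundaries might look like this (the setting names are made up for illustration):

# config.py -- literal assignments only: no flow control, no function calls
DEBUG = False
DATABASE_NAME = "myapp"
ALLOWED_HOSTS = ["localhost", "example.com"]
RETRY_LIMIT = 3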
I guess I ask this because I came across a situation with my Django config file where it seems to be useful to break these rules. I happen to like it, but I feel a little guilty. Basically, my project is deployed through svn checkouts to a couple different servers that won't all be configured the same (some will share a database, some won't, for example). So, I throw a hook at the end:
try:
    from settings_overrides import *
    LOCALIZED = True
except ImportError:
    LOCALIZED = False
where settings_overrides is on the Python path but outside the working copy. What do you think, either about this example or about Python config boundaries in general?
There is a Django wiki page which addresses exactly what you're asking:
http://code.djangoproject.com/wiki/SplitSettings
Do not reinvent the wheel. Use configparser and INI files. Python files are too easy to break by someone who doesn't know Python.
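A minimal sketch of that approach, assuming a hypothetical settings.ini with a [general] section containing "localized = true":

import configparser

parser = configparser.ConfigParser()
parser.read('settings.ini')
# getboolean() rejects nonsensical values by raising ValueError
LOCALIZED = parser.getboolean('general', 'localized', fallback=False)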
Your heuristics are good. Rules are made so that boundaries are set, and they should only be broken when the alternative is obviously a vastly better solution.
Still, I can't help but wonder whether the site-checking code shouldn't live in the parser, with an additional configuration item that selects which option to take.
I don't think that in this case the alternative is so bad that breaking the rules makes sense...
-Adam
I think it's a pain vs pleasure argument.
It's not wrong to put code in a Python config file because it's all valid Python, but it does mean you could confuse a user who comes in to reconfigure the app. If you're that worried about it, rope it off with comments explaining roughly what it does and saying that the user shouldn't edit it, but rather edit the settings_overrides.py file.
As for your example, that's nigh on essential for developers to test and then deploy their apps. Definitely more pleasure than pain. But you should really do this instead:
LOCALIZED = False
try:
    from settings_overrides import *
except ImportError:
    pass
And in your settings_overrides.py file:
LOCALIZED = True
...if for nothing else than to make it clear what that file does. The way you had it splits the overrides across two places.
As a general practice, see the other answers on the page; it all depends. Specifically for Django, however, I see nothing fundamentally wrong with writing code in the settings.py file... after all, the settings file IS code :-)
The Django docs on settings themselves say:
A settings file is just a Python module with module-level variables.
And they give an example of assigning settings dynamically using normal Python syntax:
MY_SETTING = [str(i) for i in range(30)]
Settings as code is also a security risk. You import your "config", but in reality you are executing whatever code is in that file. Put config in files that you parse, and you can reject nonsensical or malicious values, even if it means more work for you. I blogged about this in December 2008.
