Should I create a factory class or a getobj method? - python

I've been reading Stack Overflow for years, and this is my first post. I've tried searching, but I can't really find anything that I both understand and that matches my scenario. Yes, I'm a total OOP newbie, so please explain things as plainly as possible.
OK, I'm trying to write a Python script that calls Rsync to make a backup.
I'm essentially calling the script from root's crontab. Because this opens up security issues, I'm going to be reading in the directories that need to be backed up (and a few options for the rsync command for each directory) from a configuration file in a form that the ConfigParser module will understand.
So, I'm at a point where I want to create objects to represent each backup directory. My question is this:
Do I make a separate object factory class, and send it all of the relevant information that was gleaned while parsing the config file? Alternatively, do I put all the object creation stuff in a method in my existing configuration parsing class?
I hope at least some of that makes sense.
Please note this is both for production use and a learning project for me to learn Object Oriented programming and design, so yes, it is probably overkill to go OOP on this, but I want to learn this stuff!
Thanks for your advice and help!
Here's some of what I've got so far (pseudocode):
import ConfigParser
import logging
import os

class ParseConf(object):
    # set up logging facilities

    def __init__(self, conffile="/etc/pybackup/pyback.conf"):
        self.conffile = conffile
        self.confvals = {}

    def getdirs(self):
        # create a ConfigParser instance
        # read the config file
        # add the values to the dictionary self.confvals
        pass

    def getdirobj(self):
        # either create a list of objects here, or send self.confvals
        # to a factory object that returns a list of objects
        pass
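For context, a minimal sketch of what the getdirs() side might look like with ConfigParser (the section layout and the rsync_options name here are invented for illustration, not taken from an actual config file):

import ConfigParser

# Rough sketch only: assumes one config section per backup directory,
# with a hypothetical rsync_options entry in each, e.g.
#
#   [/home/alice]
#   rsync_options = -a --delete

def read_backup_dirs(conffile="/etc/pybackup/pyback.conf"):
    """Return a dict mapping each directory to its options."""
    parser = ConfigParser.ConfigParser()
    parser.read(conffile)
    confvals = {}
    for section in parser.sections():
        # each section name is a directory; its options configure rsync
        confvals[section] = dict(parser.items(section))
    return confvals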

Related

Common logging module in Python

I am new to Python and just trying to learn and find better ways to write code. I want to create a custom class for logging that uses the logging package inside it. I want the functions in this class to be reusable so I can do my logging from other scripts rather than writing custom code in each and every script. Is there a good link you can share? Is this the right way to handle logging? I just want to avoid writing the same code in every script if I can reuse it from one module.
I would highly appreciate any reply.
You can build a custom class that utilizes the built-in Python logging library. There isn't really any one right way to handle logging, as the library allows you to use 5 standard levels indicating the severity of events (DEBUG, INFO, WARNING, ERROR, and CRITICAL). The way you use these levels is application-specific. Here's another good explanation of the package.
It's indeed a good idea to keep all your logging configuration (formatters, level, handlers) in one place.
create a class wrapping a custom logger with your configuration
expose methods for logging with different levels
import this class wherever you want
create an instance of this class to log where you want
To make sure all your custom logging objects have the same config, you should make the logging class own the configuration.
I don't think there are any links I can share for the whole thing, but you can easily find links for the individual details I mentioned.
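For what it's worth, a minimal sketch of that pattern (the class name, format string, and default level are placeholder choices, not the only reasonable ones):

import logging

class AppLogger(object):
    """Wraps one configured logger so every script shares the same setup."""

    def __init__(self, name, level=logging.DEBUG):
        self._logger = logging.getLogger(name)
        self._logger.setLevel(level)
        if not self._logger.handlers:  # avoid adding duplicate handlers on reuse
            handler = logging.StreamHandler()
            handler.setFormatter(logging.Formatter(
                "%(asctime)s %(name)s %(levelname)s: %(message)s"))
            self._logger.addHandler(handler)

    # one thin wrapper per standard level
    def debug(self, msg, *args):
        self._logger.debug(msg, *args)

    def info(self, msg, *args):
        self._logger.info(msg, *args)

    def warning(self, msg, *args):
        self._logger.warning(msg, *args)

    def error(self, msg, *args):
        self._logger.error(msg, *args)

    def critical(self, msg, *args):
        self._logger.critical(msg, *args)

Other scripts would then do something like log = AppLogger(__name__) and call log.info(...) instead of configuring logging themselves.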

How to show documentation of a module and sub class to a user in python

I am trying to code up a module which has two classes. The first class is called TextProcessing:
class TextProcessing(object):
    """ To carry out text processing
    """
    def __init__(self):
        pass
It has various methods in there for pre-processing text.
Similarly, the other class is for further data wrangling on the pre-processed data.
I am saving these two classes in a Python file to make it a module.
Now let's say a user downloads this module and wants to run the various methods of each class.
I want to provide some sort of documentation about the module and the methods of each class to the user when she imports the module, so that she is aware of which function to call and what parameters to pass.
Think of how a scikit learn documentation is on their documentation page.
http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.TfidfTransformer.html
Even the documentation we get to see when we do
help(some_python_module)
is fine too.
The issue is that I don't have a documentation page like sklearn's to show the documentation. I want a user to see the documentation for the various methods she can use once she imports the module in the Python console.
Is there a way I can print that documentation info to the console when a user imports the module?
It could show the docstring of each class and method.
This is a very weird thing to do, but it's definitely possible.
The easiest thing to do is just to call help. While it's intended to be called from the interactive prompt, there's nothing stopping you from calling it from your own code.
Of course you could instead extract the docstrings (they're stored as __doc__ on every module, class, and function), textwrap them yourself, and print them out, but if you're trying to reproduce the same thing help does, that's a lot of work for no real benefit.
The only tricky bit is that the thing you want to invoke the help system on is "this current module". How do you refer to that? It's a bit clunky, but you have this current module's name as __name__, so you can look it up in sys.modules.
So:
"""Helpful module"""
import sys
class Spam:
"""Classy class"""
def eggs(self):
"Functional function"
return 2
help(sys.modules[__name__])
Now, when you import helpful for the first time in a session, it will print out the help.
Of course that will be pretty odd if someone's trying to run a script that does an import helpful, rather than doing it from an interactive session. So you may want to only do this in interactive sessions, by checking sys.flags:
if sys.flags.interactive:
    help(sys.modules[__name__])
What if someone does an import otherthing, and that otherthing does an import helpful? You'll get the same help, which may be confusing.
If that's a problem, the only real option I can think of is to check whether the calling frame comes from the top-level script (and that the flags are interactive). That's pretty hacky, and something you shouldn't even consider unless you really need to, so I'll just direct you to the inspect module and hope you don't need it.

Make global variables available in multiple modules

I am creating an application consisting of several modules. There is one main.py file which will be the file used to run the application. The main.py file will load the configuration file(s) and put them in a 'config' variable. It will also import the application module file (the file which holds the source code of the application itself, i.e. the application class) and start an instance.
I am not very experienced in coding Python, and my biggest question is whether I am doing this the right way, by using a main file to handle all the needed setup (loading configuration files, for example). The problem I am having right now is that I cannot access the 'config' variable that was defined in main.py from any other module or Python file.
Is it possible to make a global variable for configuration values etc.? In PHP I used to create a singleton object which holds all the specific global arguments. I could also create a global 'ROOT' variable to hold the full path to the root of the application, which is needed to load/import new files; as far as I know this is also not possible in Python.
I hope someone can help me out of this or send me in the right direction so I can continue working on this project.
The answer seems to be by Matthias:
Use from AppName.modules import settings and then access the data in the module with settings.value. According to PEP-8, the style guide for Python code, wildcard imports should be avoided and would in fact lead to undesirable behaviour in this case.
Thank you all for the help!
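For illustration, a minimal sketch of that layout (the AppName.modules package path comes from the question; the values themselves are invented):

# AppName/modules/settings.py -- an ordinary module used as shared state
ROOT = None       # filled in by main.py before anything else runs
config = {}

# main.py
import os
from AppName.modules import settings

settings.ROOT = os.path.dirname(os.path.abspath(__file__))
settings.config = {"debug": True}   # normally the parsed configuration file(s)

# any other module can now do `from AppName.modules import settings`
# and read settings.config / settings.ROOT directly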

How to store application settings across modules [duplicate]

I received a project from a developer who left our company. It's not too complex, but it doesn't look very nice.
So here is the question:
The application has some modules, and one of them is "settings", which stores some application options (not all possible options; let's say just two: foo and bar).
When the application is started, it reads options from the command line (using argparse):
parser.add_argument('--foo', action='store_true')
parser.add_argument('--bar', action='store_true')
parser.add_argument('--baz', action='store_true')
And then it performs this nasty thing:
for name, val in parser.parse_args(sys.argv[1:])._get_kwargs():
    setattr(sys.modules['settings'], name, val)
First, I think this is a dirty, non-Pythonic hack. And second, it is simply inconvenient to use such code, because when I need to use settings.baz, the IDE complains that it doesn't exist.
The intention of this hack is to make the options parsed from the command line available in all of the modules the application uses.
I'm thinking about something like the singleton pattern, but I have only used it once, in PHP, and I don't know whether it is the correct solution in Python. And even if it is, can someone show an example?
I'm a noob in Python and on SO, so please be kind to me :)
Thanks.
p.s. I'm sorry for possible mistakes in my English
Modules in Python are singleton objects, and using one to store the settings used by the other modules would be a very Pythonic way to do it.
The second line of the "nasty thing" is just setting the attributes of a module named settings and so isn't that bad. What's worse is the _get_kwargs() part of the first line which is accessing a private attribute of the argparse.Namespace object returned by parser.parse_args() to get the names and values of the settings parsed from the command-line. A slightly better way to do it might be something like this:
import settings  # possibly empty .py file

for name, val in vars(parser.parse_args(sys.argv[1:])).iteritems():
    setattr(settings, name, val)
However this won't fix your IDE problems because the IDE doesn't know the name of settings added dynamically. A simple way to fix that would be to define all the possible attributes with some kind of default values in a settings.py module instead of having an empty one.
The first time a module is imported an entry for it is added to the sys.modules dictionary with its name as the key and an instance of types.ModuleType as a value. Subsequent imports will first check to see if an entry for it already exists and will skip reloading the file if it does -- which is why I claim they're basically singleton objects. Modifications made to its attributes will immediately be visible to other modules that have imported it or do so afterwards, so it's generally a good data sharing mechanism within an application.
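Putting the two suggestions together, a minimal sketch (the option names are the ones from the question; everything else is just illustrative):

# settings.py -- declare every option with a default, so IDEs and readers
# can see which attributes exist
foo = False
bar = False
baz = False

# main.py
import argparse
import settings

parser = argparse.ArgumentParser()
parser.add_argument('--foo', action='store_true')
parser.add_argument('--bar', action='store_true')
parser.add_argument('--baz', action='store_true')

# overwrite the defaults with whatever was passed on the command line
for name, val in vars(parser.parse_args()).items():
    setattr(settings, name, val)

# every other module just does `import settings` and reads settings.foo, etc.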
Have a look at Config (a hierarchical, easy-to-use, powerful configuration module for Python). It has detailed docs and examples.

Recommended approach for loading CouchDB design documents in Python?

I'm very new to CouchDB, but I'm trying to use it on a new Python project, and I'd also like to use Python to write the design documents (views). I've already configured Couch to use the couchpy view server, and I can confirm this works by entering some simple map/reduce functions into Futon.
Are there any official recommendations on how to load/synchronize design documents when using Python's couchdb module?
I understand that I can post design documents to "install" them into Couch, but my question is really around best practices. I need some kind of strategy for deploying, both in development environments and in production environments. My intuition is to create a directory and store all of my design documents there, then write some kind of sync script that will upload each one into couch (probably just blindly overwriting what's already there). Is this a good idea?
The documentation for "Writing views in Python" is 5 sentences, and really just explains how to install couchpy. On the project's google code site, there is mention of a couchdb.design module that sounds like it might help, but there's no documentation (that I can find). The source code for that module indicates that it does most of what I'm interested in, but it stops short of actually loading files. I think I should do some kind of module discovery, but I've heard that's non-Pythonic. Advice?
Edit:
In particular, the idea of storing my map/reduce functions inside string literals seems completely hacky. I'd like to write real python code, in a real module, in a real package, with real unit tests. Periodically, I'd like to synchronize my "couch views" package with a couchdb instance.
Here's an approach that seems reasonable. First, I subclass couchdb.design.ViewDefinition. (Comments and pydocs removed for brevity.)
import couchdb.design
import inflection

DESIGN_NAME = "version"

class CurrentVersion(couchdb.design.ViewDefinition):
    def __init__(self):
        map_fun = self.__class__.map
        if hasattr(self.__class__, "reduce"):
            reduce_fun = self.__class__.reduce
        else:
            reduce_fun = None
        super_args = (DESIGN_NAME,
                      inflection.underscore(self.__class__.__name__),
                      map_fun,
                      reduce_fun,
                      'python')
        super(CurrentVersion, self).__init__(*super_args)

    @staticmethod
    def map(doc):
        if 'version_key' in doc and 'created_ts' in doc:
            yield (doc['version_key'], [doc['_id'], doc['created_ts']])

    @staticmethod
    def reduce(keys, values, rereduce):
        max_index = 0
        for index, value in enumerate(values):
            if value[1] > values[max_index][1]:
                max_index = index
        return values[max_index]
Now, if I want to synchronize:
import couchdb.design
from couchview.version import CurrentVersion
db = get_couch_db() # omitted for brevity
couchdb.design.ViewDefinition.sync_many(db, [CurrentVersion()], remove_missing=True)
The benefits of this approach are:
Organization. All designs/views exist as modules/classes (respectively) located in a single package.
Real code. My text editor will highlight syntax. I can write unit tests against my map/reduce functions.
The ViewDefinition subclass can also be used for querying.
current_version_view = couchview.version.CurrentVersion()
result = current_version_view(self.db, key=version_key)
It's still not ready for production, but I think this is a big step closer compared to storing map/reduce functions inside string literals.
Edit: I eventually wrote a couple blog posts on this topic, since I couldn't find any other sources of advice:
http://markhaase.com/2012/06/23/couchdb-views-in-python/
http://markhaase.com/2012/07/01/unit-tests-for-python-couchdb-views/
