One of my logger.debug() statements takes an argument that is fairly labour-intensive to compute.
I know that I should do logger.debug("The spam is %s", spam_temperature) rather than logger.debug("The spam is {}".format(spam_temperature)). The problem is that the operations I need to perform to actually find out the spam_temperature are quite CPU-intensive, and I have no use for them if the logging level is, say, INFO.
What is the best practice in a case like this?
Found the answer myself - I added
if not logging.getLogger().isEnabledFor(logging.DEBUG):
    return
before the section I wanted to avoid.
(I got the inspiration from https://stackoverflow.com/a/27849836/3061818)
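For illustration, a minimal sketch of the full pattern (compute_spam_temperature() is a hypothetical stand-in for the expensive work):

import logging

logger = logging.getLogger(__name__)

def report_spam():
    # Skip the expensive computation entirely unless DEBUG is actually enabled.
    if logger.isEnabledFor(logging.DEBUG):
        spam_temperature = compute_spam_temperature()  # hypothetical, CPU-intensive
        logger.debug("The spam is %s", spam_temperature)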
I've seen this multiple times in multiple places, but never have found a satisfying explanation as to why this should be the case.
So, hopefully, one will be presented here. Why should we (at least, generally) not use exec() and eval()?
EDIT: I see that people are assuming that this question pertains to web servers – it doesn't. I can see why an unsanitized string being passed to exec could be bad. Is it bad in non-web applications?
There are often clearer, more direct ways to get the same effect. If you build a complex string and pass it to exec, the code is difficult to follow, and difficult to test.
Example: I wrote code that read in string keys and values and set corresponding fields in an object. It looked like this:
for key, val in values:
    fieldName = valueToFieldName[key]
    fieldType = fieldNameToType[fieldName]
    if fieldType is int:
        s = 'object.%s = int(%s)' % (fieldName, val)
    # ...many clauses like this, one per field type...
    exec(s)
That code isn't too terrible for simple cases, but as new types cropped up it got more and more complex. When there were bugs they always triggered on the call to exec, so stack traces didn't help me find them. Eventually I switched to a slightly longer, less clever version that set each field explicitly.
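Something in that spirit (a sketch, not the exact code I ended up with) uses setattr and the type mapping directly, with no string building and no exec:

for key, val in values:
    fieldName = valueToFieldName[key]
    fieldType = fieldNameToType[fieldName]
    # 'object' is the same instance being populated as in the snippet above
    setattr(object, fieldName, fieldType(val))  # e.g. int('3') -> 3

Failures now raise on the line that actually converts or assigns the value, so the stack trace points at the real problem.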
The first rule of code clarity is that each line of your code should be easy to understand by looking only at the lines near it. This is why goto and global variables are discouraged. exec and eval make it easy to break this rule badly.
When you need exec and eval, yeah, you really do need them.
But, the majority of the in-the-wild usage of these functions (and the similar constructs in other scripting languages) is totally inappropriate and could be replaced with other simpler constructs that are faster, more secure and have fewer bugs.
You can, with proper escaping and filtering, use exec and eval safely. But the kind of coder who goes straight for exec/eval to solve a problem (because they don't understand the other facilities the language makes available) isn't the kind of coder that's going to be able to get that processing right; it's going to be someone who doesn't understand string processing and just blindly concatenates substrings, resulting in fragile insecure code.
It's the Lure Of Strings. Throwing string segments around looks easy and fools naïve coders into thinking they understand what they're doing. But experience shows the results are almost always wrong in some corner (or not-so-corner) case, often with potential security implications. This is why we say eval is evil. This is why we say regex-for-HTML is evil. This is why we push SQL parameterisation. Yes, you can get all these things right with manual string processing... but unless you already understand why we say those things, chances are you won't.
eval() and exec() can promote lazy programming. More importantly, they indicate that the code being executed may not have been written at design time, and was therefore never tested. In other words, how do you test dynamically generated code? (With JavaScript's eval the problem is even worse: you would have to test it across browsers.)
Security aside, eval and exec are often marked as undesirable because of the complexity they induce. When you see an eval call you often don't know what's really going on behind it, because it acts on data that's usually in a variable. This makes code harder to read.
Invoking the full power of the interpreter is a heavy weapon that should be reserved for very tricky cases. In most cases, however, it's best avoided and simpler tools should be employed.
That said, like all generalizations, be wary of this one. In some cases, exec and eval can be valuable. But you must have a very good reason to use them. See this post for one acceptable use.
In contrast to what most answers are saying here, exec is actually part of the recipe for building super-complete decorators in Python, as you can duplicate everything about the decorated function exactly, producing the same signature for the purposes of documentation and such. It's key to the functionality of the widely used decorator module (http://pypi.python.org/pypi/decorator/). Other cases where exec/eval are essential are when constructing any kind of "interpreted Python" type of application, such as a Python-parsed template language (like Mako or Jinja).
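As a rough illustration of the idea (my own simplified sketch in Python 3 syntax, not the decorator module's actual implementation), exec can compile a wrapper whose signature matches the wrapped function:

import inspect

def preserve_signature(caller):
    # Simplified sketch: only handles plain positional/keyword parameters whose
    # default reprs are valid Python; the real decorator module does much more.
    def decorate(func):
        sig = inspect.signature(func)
        args = ", ".join(sig.parameters)
        src = "def wrapper{}:\n    return _caller(_func, {})\n".format(sig, args)
        ns = {"_caller": caller, "_func": func}
        exec(src, ns)  # compile the wrapper with the real signature
        ns["wrapper"].__name__ = func.__name__
        ns["wrapper"].__doc__ = func.__doc__
        return ns["wrapper"]
    return decorate

With a caller like def traced(func, *args): return func(*args), inspect.signature() on the decorated function then reports the original parameters rather than a generic *args, **kwargs.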
So it's not like the presence of these functions is an immediate sign of an "insecure" application or library. Using them in the naive JavaScript-y way to evaluate incoming JSON or something, yes, that's very insecure. But as always, it's all in the way you use them; these are essential functions.
I have used eval() in the past (and still do from time-to-time) for massaging data during quick and dirty operations. It is part of the toolkit that can be used for getting a job done, but should NEVER be used for anything you plan to use in production such as any command-line tools or scripts, because of all the reasons mentioned in the other answers.
You cannot trust your users--ever--to do the right thing. In most cases they will, but you have to expect them to do all of the things you never thought of and find all of the bugs you never expected. This is precisely where eval() goes from being a tool to a liability.
A perfect example of this is constructing a QuerySet in Django. filter() accepts keyword arguments that look something like this:
results = Foo.objects.filter(whatever__contains='pizza')
If you're programmatically assigning arguments, you might think to do something like this:
results = eval("Foo.objects.filter(%s__%s=%s)" % (field, matcher, value))
But there is always a better way that doesn't use eval(): build the keyword arguments in a dictionary and unpack it with **:
results = Foo.objects.filter( **{'%s__%s' % (field, matcher): value} )
Doing it this way is not only faster, but also safer and more Pythonic.
Moral of the story?
Use of eval() is OK for small tasks, tests, and truly temporary things, but bad for permanent usage, because there is almost always a better way to do it!
Allowing these functions in a context where they might run user input is a security issue, and sanitizers that actually work are hard to write.
Same reason you shouldn't login as root: it's too easy to shoot yourself in the foot.
Don't try to do the following on your computer:
s = "import shutil; shutil.rmtree('/nonexisting')"
exec(s)  # exec rather than eval, since the payload is a statement; the danger is the same
Now assume somebody can control s from a web application, for example.
Reason #1: one security flaw (i.e. a programming error... and we can't claim those can be avoided) and you've just given the user access to the shell of the server.
Try this in the interactive interpreter and see what happens:
>>> import sys
>>> eval('{"name" : %s}' % ("sys.exit(1)"))
Of course, this is a corner case, but it can be tricky to prevent things like this.
I like using Lettuce to define test cases. In many cases, it's easy to write Lettuce scenarios in such a way that they can be run either atomically or as part of other scenarios in a feature. However, I find that Lettuce is also a useful tool for reasoning about and implementing more complex integration tests. In these cases, it makes sense to break up the tests into scenarios, but to define a dependency on a previous scenario. That way I can run a scenario without having to explicitly specify which other scenarios need to be run first, and the dependency is clear in the scenario definition. It might look something like this:
Scenario: Really long scenario
  Given some condition
  Given another condition
  Then something
  ...

Scenario: A dependent scenario
  Given the scenario "Really long scenario" has been run
  Given new condition
  Then some stuff
  ...
Then I could do something like:
@step('Given the scenario "([^"]*)" has been run')
def check_scenario(step, sentence):
    scenario = get_scenario(sentence)  # this is the part I don't know how to do
    if not scenario.ran:
        scenario.run()
How do you handle this situation? Are there any gotchas I'm missing with this approach? Taking a quick look through the API docs and the source code, it didn't seem like there was an easy way to retrieve a scenario by its string.
The only thing I know of is defining new steps that call previously defined steps: check out the tutorial on that topic. Maybe this can be a good workaround for your problem.
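If I remember the tutorial correctly, the mechanism is the step object's behave_as() method; treat the exact API below as an assumption and check the Lettuce docs:

from lettuce import step

@step('the scenario "Really long scenario" has been run')
def run_long_scenario(step):
    # behave_as() re-runs other steps by their sentences (verify against the docs);
    # Lettuce matches the text after the Given/When/Then keyword, if I recall correctly.
    step.behave_as("""
        Given some condition
        Given another condition
    """)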
You can use world in order to store data between your scenarios.
from lettuce import before, world

@before.each_feature
def feature_setup(feature):
    ...
    world.feature_data = dict()
    ...
You can access this data from anywhere you have access to world.
You can combine this with your terrain.py file in order to share data between steps, scenarios, and features.
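Building on that, a rough sketch of how world could track which scenarios have already run (run_shared_setup() is a hypothetical helper standing in for the shared setup steps):

from lettuce import before, step, world

@before.all
def init_scenario_tracking():
    world.scenarios_run = set()

@step('the scenario "([^"]*)" has been run')
def check_scenario(step, name):
    if name not in world.scenarios_run:
        run_shared_setup(name)         # hypothetical: perform that scenario's setup
        world.scenarios_run.add(name)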
I would like to know if there is a way of writing the below module code without having to add another indentation level to the whole module code.
# module code
if not condition:
    # rest of the module code (big)
I am looking for something like this:
# module code
if condition:
    # here I need something like a `return`
# rest of the module code (big)
Note: I do not want to raise an exception; the import should complete normally.
I don't know of any solution to that, but I guess you could put all your code in an internal module and import that if the condition is not met.
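A rough sketch of that idea (the module and condition names are made up):

# mymodule.py -- a thin public module that decides whether to load the real code
import os

condition = os.environ.get("SKIP_HEAVY_SETUP") == "1"   # hypothetical condition

if not condition:
    # the big body of code lives in _mymodule_impl.py
    from _mymodule_impl import *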
I know of no way to do this. The only thing I could imagine that would work would be return but that needs to be inside a function.
It's super hard to say without knowing what your higher-level goal is. (For instance, what is the condition? Why does it matter? Are you DEAD SURE you're not having an X-Y problem here? Can't you just tell us what your overall goal is?) It's also really hard to say without knowing how the module is going to be called. (As a script from the command line? By being imported by another module?) And it would help a lot to know (a) why you're trying to avoid indentation (WWII is over, and we don't need to ration spaces any more; or, to put it more kindly, Python is a language that uses indentation as a SYNTACTIC FEATURE, so saying "I can't use this syntactic feature" strikes many people as a weird constraint. It's like giving up if-then tests: you might theoretically be able to work around that constraint, possibly, sometimes, but why are you going into the boxing ring with your hands tied behind your back?), and (b) why you can't throw an exception (no, really: are you TOTALLY SURE you ABSOLUTELY CANNOT THROW ANY EXCEPTIONS AT ALL?).
As it is, all you've really done is ask a "how do I do X, given conditions A, B, and C?" question, without indicating why you want to do X, or why conditions A, B, and C exist, or even whether you're 100% sure they exist and cannot be worked around.
If what you're really saying is "I don't want to hit {TAB} 40 times while writing a function," then the real problem is that you need a better text editor. If what you're really saying is "I happen to find indentation to be aesthetically unpleasant," then you should think about (a) what the other side of the argument is; that is, why people find Python's use of indentation as syntax to be useful; (b) whether your own aesthetic preferences in this regard are more important than the reasons you've come up with in (a); and (c) whether, given these things, Python is the right tool for you personally to be using to accomplish whatever your own larger-scale goal is. (It's OK to not like indentation as a syntactic feature; but this is so basic to Python that being philosophically opposed to it to an extent that rules it out is a strong indication that maybe Python is not the ideal language for you to accomplish your programming goals in.) If what you're really saying is that you would benefit from factoring code that needs to be run under two different sets of circumstances into two modules, then it would benefit you to refactor. If what you're saying is that you've got spaghetti code that winds up being totally impossible to refactor, then that's really the first problem to be addressed, before you try to abort module imports.
Python is so dynamic that it's not always clear what's going on in a large program, and looking at a tiny bit of source code does not always help. To make matters worse, editors tend to have poor support for navigating to the definitions of tokens or import statements in a Python file.
One way to compensate might be to write a special profiler that, instead of timing the program, would record the runtime types and paths of objects of the program and expose this data to the editor.
This might be implemented with sys.settrace(), which sets a callback for each line of code and is how pdb is implemented, or by using the ast module and an import hook to instrument the code. Or is there a better strategy? How would you write something like this without making it impossibly slow, and without running afoul of extreme dynamism, e.g. side effects on property access?
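For concreteness, here is a minimal sketch of the sys.settrace() idea (my own rough attempt; it ignores overhead entirely and only records local variable types on each line event):

import sys
from collections import defaultdict

# (filename, line number, variable name) -> set of type names seen there at runtime
observed_types = defaultdict(set)

def tracer(frame, event, arg):
    if event == "line":
        filename = frame.f_code.co_filename
        for name, value in frame.f_locals.items():
            observed_types[(filename, frame.f_lineno, name)].add(type(value).__name__)
    return tracer

def run_traced(func, *args, **kwargs):
    sys.settrace(tracer)
    try:
        return func(*args, **kwargs)
    finally:
        sys.settrace(None)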
I don't think you can help making it slow, but it should be possible to detect the address of each variable when you encounter a STORE_FAST, STORE_NAME, or other STORE_* opcode.
Whether or not this has been done before, I do not know.
If you need debugging, look at pdb; it will allow you to step through your code and inspect any variables.
import pdb

def test():
    print(1)
    pdb.set_trace()  # you will enter an interpreter here
    print(2)
What if you monkey-patched object's class or another prototypical object?
This might not be the easiest if you're not using new-style classes.
You might want to check out PyChecker's code - it does (I think) what you are looking to do.
Pythoscope does something very similar to what you describe: it uses a combination of static information in the form of an AST and dynamic information gathered through sys.settrace.
BTW, if you have problems refactoring your project, give Pythoscope a try.
One of my favorite features about Python is that you can write configuration files in Python that are very simple to read and understand. If you put a few boundaries on yourself, you can be pretty confident that non-Pythonistas will know exactly what you mean and will be perfectly capable of reconfiguring your program.
My question is, what exactly are those boundaries? My own personal heuristic was
Avoid flow control. No functions, loops, or conditionals. Those wouldn't be in a text config file, and people aren't expecting to have to understand them. In general, the order in which your statements execute probably shouldn't matter.
Stick to literal assignments. Methods and functions called on objects are harder to think through. Anything implicit is going to be a mess. If there's something complicated that has to happen with your parameters, change how they're interpreted.
Language keywords and error handling are right out.
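For example, a config that stays inside those boundaries might look like this (all names and values are made up):

# hypothetical settings file, restricted to literal assignments
DEBUG = False
DATABASE_HOST = "localhost"
DATABASE_PORT = 5432
ALLOWED_USERS = ["alice", "bob"]
RETRY_LIMIT = 3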
I guess I ask this because I came across a situation with my Django config file where it seems to be useful to break these rules. I happen to like it, but I feel a little guilty. Basically, my project is deployed through svn checkouts to a couple of different servers that won't all be configured the same (some will share a database, some won't, for example). So, I throw a hook at the end:
try:
    from settings_overrides import *
    LOCALIZED = True
except ImportError:
    LOCALIZED = False
where settings_overrides is on the Python path but outside the working copy. What do you think, either about this example, or about Python config boundaries in general?
There is a Django wiki page, which addresses exactly the thing you're asking.
http://code.djangoproject.com/wiki/SplitSettings
Do not reinvent the wheel. Use configparser and INI files. Python files are too easy to break by someone who doesn't know Python.
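A minimal sketch of that approach (the file name and keys are made up):

# settings.ini:
#   [database]
#   host = localhost
#   port = 5432
try:
    from configparser import ConfigParser                       # Python 3
except ImportError:
    from ConfigParser import SafeConfigParser as ConfigParser   # Python 2

config = ConfigParser()
config.read("settings.ini")
db_host = config.get("database", "host")
db_port = config.getint("database", "port")   # non-integer values raise an error here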
Your heuristics are good. Rules are made so that boundaries are set and only broken when doing so is obviously a vastly better solution than the alternative.
Still, I can't help but wonder whether the site-checking code should be in the parser, with an additional configuration item that selects which option should be taken.
I don't think that in this case the alternative is so bad that breaking the rules makes sense...
-Adam
I think it's a pain vs pleasure argument.
It's not wrong to put code in a Python config file because it's all valid Python, but it does mean you could confuse a user who comes in to reconfigure an app. If you're that worried about it, rope it off with comments explaining roughly what it does and that the user should edit the settings_overrides.py file rather than this one.
As for your example, that's nigh on essential for developers to test then deploy their apps. Definitely more pleasure than pain. But you should really do this instead:
LOCALIZED = False
try:
    from settings_overrides import *
except ImportError:
    pass
And in your settings_overrides.py file:
LOCALIZED = True
If nothing else, it makes it clear what that file does. The way you're doing it splits the override across two places.
As a general practice, see the other answers on the page; it all depends. Specifically for Django, however, I see nothing fundamentally wrong with writing code in the settings.py file... after all, the settings file IS code :-)
The Django docs on settings themselves say:
A settings file is just a Python module with module-level variables.
And they give this example of assigning settings dynamically using normal Python syntax:
MY_SETTING = [str(i) for i in range(30)]
Settings as code is also a security risk. You import your "config", but in reality you are executing whatever code is in that file. Put config in files that you parse first and you can reject nonsensical or malicious values, even if it is more work for you. I blogged about this in December 2008.