When I use numpy together with setproctitle, the title is truncated to 11 characters. Any idea why this happens?
from setproctitle import setproctitle
import numpy
setproctitle("ETL-1234567890123456789")
# It's truncated to "ETL-1234567"
If I remove the numpy import, it works.
It works fine on OS X but not on Ubuntu 14.04.
My numpy version is 1.9.0.
As the docs say, setproctitle wraps source code from PostgreSQL that does different things on each platform.
On OS X, like most *BSD systems, just reassigning argv[0] to point to another string is sufficient.* But on Linux, it's not; you have to leave argv[0] pointing to the same place, and clobber that buffer (possibly rearranging the other arguments, and even the environment, to make space).**
* Well, not quite; you also have to change _NSGetArgv() and _NSGetEnviron().
** How does this not screw up the rest of your code that might want to access argv or env? It makes a deep copy, then reassigns your globals so the rest of your code will see that copy; only the OS sees the original buffer.
According to the comments, this has to be done "early in startup".* Touching sys.argv or os.environ from Python shouldn't actually matter, because those operate on copies, but NumPy is written in C, and does all kinds of stuff when it's imported that could conceivably be a problem.**
* Presumably this is because other code might either keep references to arguments or env variables that are about to be clobbered, or might call functions like setenv that might themselves copy the data to a new buffer so we don't end up operating on the one the OS sees.
** It's even possible that NumPy merely imports sys or another stdlib module, and that's what causes the problem.
So, I think the answer is to make sure you call setproctitle before importing numpy—or, to be safe, any C extension modules (maybe even some of the ones in the stdlib). In other words:
from setproctitle import setproctitle
setproctitle("ETL-1234567890123456789")
import numpy
Alternatively, it may be sufficient to delay the import of setproctitle until right before you call it:*
import numpy
from setproctitle import setproctitle
setproctitle("ETL-1234567890123456789")
* The module init calls spt_setup, which does all the horrible hackery needed to find the real argv buffer. So, most likely, it's too late to do this after importing NumPy. But possibly it's OK, and the only problem is the results changing before you use them.
But either way, don't do anything between the import and the call.
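On Linux you can check whether the full title survived by reading the kernel's view of the process's command line. A minimal sketch (Linux-only, reusing the title from the question):
from setproctitle import setproctitle
setproctitle("ETL-1234567890123456789")
import numpy  # imported only after the title is set
# /proc/self/cmdline reflects the argv buffer the OS sees.
with open("/proc/self/cmdline", "rb") as f:
    print(f.read().split(b"\x00")[0].decode())  # expect the full title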
(If this doesn't work, let me know and I'll research further and/or delete the answer.)
Related
We recently had a talk about removing any circular imports we might have and refactoring our code to stop using imports inside functions.
One of the ways recommended for this in many places (including the PEP 8 style guide, under Imports) is to use:
import a
a.foo()
instead of:
from a import foo
foo()
I saw some examples and became convinced that this is probably the better way to do imports (even though I'm used to the from way of importing).
But what I don't understand is why 99% of Python examples don't use this way of importing, and why PyCharm doesn't allow auto-import in this manner (or does it?) if this is the right way to import.
Looking into PyCharm's configuration, I didn't find a way to auto-complete with import y.x instead of from y import x.
You can't use import x instead of from y import x. You must use import y.x if x is a submodule, or import y; a = y.x if x is a variable (class, function, constant, ...).
To make PyCharm auto-complete with the root module name, you have to start typing the root module name instead of the sub-element you want to use.
For example, if you want to use the split function of os.path: if you start typing split and then hit Ctrl+Space twice, PyCharm will auto-import with from os.path import split. But if you start typing os, PyCharm will auto-complete with import os, and then you can finish your statement with .path.split.
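To illustrate, here's what the two completion paths end up producing (the file path is just an example):
# Typing "split" and accepting the suggestion yields:
from os.path import split
split("/tmp/file.txt")
# Typing "os" first and completing the attribute chain yields:
import os
os.path.split("/tmp/file.txt")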
With very common Python modules, I find that importing using the from .. import statement greatly increases the readability of my code, since I can reference methods by name without the dot notation. However, in some modules, the methods I require are nested differently, e.g. in os:
from os.path import join
from os import listdir, getcwd
Why doesn't from os import path.join, listdir, getcwd work? What would be a "pythonic" way to import all the methods I need in a more succinct manner?
Opinion on whether from <module> import <identifier> is itself Pythonic is quite split - it hides away the origin of a name, so it's not easy to figure out where a certain variable/function comes from just by perusing the code. On the other hand, it reduces verbosity, which some people consider Pythonic even though it's not specifically mandated. Either way, Pythonic is as elusive a term as you're going to get, and more often than not it means "the way I think Python code should look" backed up by several PEPs and obscure mailing list posts while conveniently omitting the ones that go against one's notion of Pythonic.
from os import path.join doesn't work because path.join is not an identifier in the os module itself - os.path is a separate module that os registers (by writing directly to sys.modules, of all things). path, however, is an identifier in the os module pointing to the os.path module, so you can do from os import path or from os.path import join.
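A quick sketch of which forms work and why:
import os
import sys
from os import path          # works: path is a name inside os
from os.path import join     # works: os.path is a real module in sys.modules
# from os import path.join   # SyntaxError: dotted names aren't allowed here
print(sys.modules["os.path"] is os.path)  # True - os registered it directly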
Finally, succinct and Pythonic are not synonyms; in fact, PEP 8 prescribes using multiple lines for multiple module imports even though you could succinctly write import <module1>, <module2>, <module3> .... It does say it's OK to import multiple identifiers from one module on a single line, though. But keep in mind that os and os.path are two different modules, so based on PEP 8 their identifiers shouldn't share a line, and the imports should be written as:
from os import <identifier_1>, <identifier_2>
from os.path import <identifier_3>, <identifier_4>
Now, I wouldn't go so far as to claim that this is Pythonic, but it makes the most sense based on PEP 8, at least to me.
I was recently tasked with maintaining a bunch of code that uses from module import * fairly heavily.
This codebase has gotten big enough that import conflicts/naming ambiguity/"where the heck did this function come from, there are like eight imported modules that have one with the same name?!"ism have become more and more common.
Moving forward, I've been using explicit members (i.e. import module ... module.object.function()) to make the maintenance work I do more readable.
But I was wondering: is there an IDE or utility which robustly parses Python code and refactors * import statements into module import statements, and then prepends the full module path onto all references to members of that module?
We're not using metaprogramming/reflection/inspect/monkeypatching heavily, so if the aforementioned IDE/utility behaves poorly with such things, that is OK.
Not a perfect solution, but what I usually do is this:
Open PyDev
Remove all * imports
Use the optimize imports command (Ctrl+Shift+O) to re-add all the imports
Roughly solves the problem :)
If you want to build a solution yourself, try http://docs.python.org/library/modulefinder.html
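For instance, here's a small sketch of what modulefinder reports out of the box (the script path is hypothetical):
from modulefinder import ModuleFinder

finder = ModuleFinder()
finder.run_script("myscript.py")  # hypothetical script to analyze
for name, mod in sorted(finder.modules.items()):
    print(name, "->", getattr(mod, "__file__", None))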
Here are the other related tools mentioned:
working with AST directly, which is very low-level for your use.
working with modulefinder which may have a lot of the boilerplate code you are looking for,
rope, a refactoring library (#Lucas Graf),
Bicycle Repair Man, a refactoring library
the logilab-astng library used in pylint
More about pylint
pylint is a very good tool built on top of ast that can already tell you where in your code there are from somemodule import * statements, as well as which imports are unnecessary.
example:
# next is what's on line 32
from re import *
this will complain:
W: 32,0: Wildcard import re
W: 32,0: Unused import finditer from wildcard import
W: 32,0: Unused import LOCALE from wildcard import
... # this is a long list ...
Towards a solution?
Note that in the above output pylint gives you the line numbers. It might take some effort, but a refactoring tool can look at those particular warnings, get the line number, import the module, and look at the __all__ list (or use a sandboxed execfile() to see the module's global names - would modulefinder help with that? maybe...). With the list of global names from __all__ and the names that pylint complains about, you can build two set()s and take their difference. Then replace the line featuring the wildcard import with specific imports.
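A hypothetical sketch of that idea, using importlib rather than a sandboxed execfile() (the function name is made up; pass it the names your code actually uses):
import importlib

def explicit_import_line(modname, used_names):
    # Collect the module's exported names: __all__ if present, else dir().
    mod = importlib.import_module(modname)
    exported = set(getattr(mod, "__all__", None)
                   or (n for n in dir(mod) if not n.startswith("_")))
    needed = sorted(exported & set(used_names))
    return "from {} import {}".format(modname, ", ".join(needed))

print(explicit_import_line("re", {"match", "finditer", "sub"}))
# -> from re import finditer, match, sub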
I wrote some refactoring tools to do just that. Star Namer will go through all of your wildcard * imports for a script and replace them with the actual functions to be imported.
Usage: ./star_namer.py module_filename script_filename
Once you've converted all of your star imports to actual names you can use from_to_import.py to fix them. This is how it works:
Running your script through pylint and counting up all of the currently undefined words.
Removing all of the from modname import lines from the script.
Running the script through pylint again and comparing the difference in undefined words.
Going through the json output of pylint (in reverse order), it determines the exact position of replacements to be made and inserts the modname. in the correct place.
I thought this approach would be a little more robust: offloading the syntax processing to an advanced utility designed for it, instead of trying to grep through all the text myself with regular expressions.
Usage: from_to_import.py script_name modname
It will show you what changes are to be made before making them; press y to save. The main issues I've found so far are alignment problems caused by inserting the modname. text, which can misalign trailing comments, and that it doesn't deal well with aliased function names (from ... import quickrun as qrun).
Full documentation here: https://github.com/SurpriseDog/Star-Wrangler
I am working on a large Python program which, depending on command-line options, makes use of a multitude of modules - in particular, numpy. We have recently found a need to run this on a small embedded system, which precludes the use of numpy. From our perspective, this is easy enough (just don't use the problematic command-line options).
However, following PEP 8, our import numpy is at the beginning of each module that might need it, and the program will crash due to numpy not being installed. The straightforward solution is to move import numpy from the top of the file to the functions that need it. The question is, "How bad is this"?
(An alternative solution is to wrap import numpy in a try .. except. Is this better?)
Here is a best-practice pattern to check whether a module is installed, and to branch the code accordingly.
# GOOD
import pkg_resources

try:
    pkg_resources.get_distribution('numpy')
except pkg_resources.DistributionNotFound:
    HAS_NUMPY = False
else:
    HAS_NUMPY = True

# You can also import numpy here unless you want to import it inside the function
Do this in every module that has a soft dependency on numpy. More information in the Plone CMS coding conventions.
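A sketch of how the flag might then be used later in the module (the function is hypothetical):
def mean(values):
    if HAS_NUMPY:
        import numpy as np  # safe: the distribution is known to exist
        return float(np.mean(values))
    return sum(values) / len(values)  # vanilla-Python fallback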
Another idiom I've seen is to bind the module's name to None if it's unavailable:
try:
    import numpy as np
except ImportError:
    np = None
Or, as in the other answer, you can use the pkg_resources.get_distribution above, rather than try/except (see the blog post linked to from the plone docs).
That way, anywhere you use numpy, you can guard its use in an if block:
if np:
    # do something with numpy
else:
    # do something in vanilla python
The key is to ensure your CI tests run in both environments - with and without numpy (and if you are measuring coverage, this should count both branches as covered).
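One way to exercise the no-numpy branch in tests, assuming pytest and a hypothetical mymodule that uses the try/except idiom above - a None entry in sys.modules makes the corresponding import raise ImportError:
import importlib
import sys

def test_fallback_without_numpy(monkeypatch):
    # Hide numpy: a None entry in sys.modules makes "import numpy" fail.
    monkeypatch.setitem(sys.modules, "numpy", None)
    import mymodule  # hypothetical module using the np-or-None idiom
    importlib.reload(mymodule)  # re-runs its try/except with numpy hidden
    assert mymodule.np is None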