PyDev: #DynamicAttrs for modules - python

In the program I am writing, I created a module called settings that declares a few constants but loads other from a configuration file, placing them in the module namespace (for example: the value of π might be in the code of the module, but the weight of the user in a configuration file).
This is such that in other modules, I can do:
from settings import *
Everything works fine for me but - using Aptana Studio / PyDev, the code analysis tool throws a lot of undefined variable errors like this:
I found here that there is a flag usable to prevent this behaviour in class docstrings, but it has no effect if I try to use it at module level. So I wonder two things:
Is there a way to selectively get rid of these errors (meaning that I wouldn't want to completely turn off the option "mark as errors the undefined variables": in other modules it could in fact be an error)?
If not, is there an alternative pattern to achieve what I want in terms of wild imports, but without confusing the code analysis tool?
Pre-emptive note: I am perfectly aware of the fact wild imports are discouraged.

Actually you'd probably have the same error even if it wasn't a wild import (i.e.: import settings / settings.MY_VARIABLE would still show an error because the code-analysis can't find it).
Aside from the #UndefinedVariable in each place that references it (CTRL+1 will show that option), I think that a better pattern for your module would be:
MY_VARIABLE = 'default value'
...
update_default_values() # Go on and override the defaults.
That way, the code-analysis (and anyone reading your module), would know which variables are expected.
Otherwise, if you don't know them before, I think a better approach would be having a method (i.e.: get_settings('MY_VARIABLE')).
Unrelated to the actual problem. I'd really advise against using a wild import here (nor even importing the constant... i.e.: from settings import MY_VARIABLE).
A better approach for a settings module is always using:
import settings
settings.MY_VARIABLE
(because otherwise, if any place decides it wants to change the MY_VARIABLE, any place that has put the reference in its own namespace will probably never get the changed variable).
An even safer approach would be having a method get_setting('var'), as it would allow you to a better lazy-loading of your preferences (i.e.: don't load on import, but when it's called the 1st time).

You can use Ctrl-1 on an error and choose #UndefinedVariable or type ##UndefinedVariable on a line that has an error you want to ignore.
You can try to add your module to be scanned by the PyDev interpreter by going to Window > Preferences, then PyDev > Interpreter - Python. Under the Libraries tab, click New Folder and browse to the directory that contains settings, then click Apply. Hopefully, Pydev will find your package and recognize the wildly-imported variables.

Related

Importing variable from another file changes it [duplicate]

While developing a largeish project (split in several files and folders) in Python with IPython, I run into the trouble of cached imported modules.
The problem is that instructions import module only reads the module once, even if that module has changed! So each time I change something in my package, I have to quit and restart IPython. Painful.
Is there any way to properly force reloading some modules? Or, better, to somehow prevent Python from caching them?
I tried several approaches, but none works. In particular I run into really, really weird bugs, like some modules or variables mysteriously becoming equal to None...
The only sensible resource I found is Reloading Python modules, from pyunit, but I have not checked it. I would like something like that.
A good alternative would be for IPython to restart, or restart the Python interpreter somehow.
So, if you develop in Python, what solution have you found to this problem?
Edit
To make things clear: obviously, I understand that some old variables depending on the previous state of the module may stick around. That's fine by me. By why is that so difficult in Python to force reload a module without having all sort of strange errors happening?
More specifically, if I have my whole module in one file module.py then the following works fine:
import sys
try:
del sys.modules['module']
except AttributeError:
pass
import module
obj = module.my_class()
This piece of code works beautifully and I can develop without quitting IPython for months.
However, whenever my module is made of several submodules, hell breaks loose:
import os
for mod in ['module.submod1', 'module.submod2']:
try:
del sys.module[mod]
except AttributeError:
pass
# sometimes this works, sometimes not. WHY?
Why is that so different for Python whether I have my module in one big file or in several submodules? Why would that approach not work??
import checks to see if the module is in sys.modules, and if it is, it returns it. If you want import to load the module fresh from disk, you can delete the appropriate key in sys.modules first.
There is the reload builtin function which will, given a module object, reload it from disk and that will get placed in sys.modules. Edit -- actually, it will recompile the code from the file on the disk, and then re-evalute it in the existing module's __dict__. Something potentially very different than making a new module object.
Mike Graham is right though; getting reloading right if you have even a few live objects that reference the contents of the module you don't want anymore is hard. Existing objects will still reference the classes they were instantiated from is an obvious issue, but also all references created by means of from module import symbol will still point to whatever object from the old version of the module. Many subtly wrong things are possible.
Edit: I agree with the consensus that restarting the interpreter is by far the most reliable thing. But for debugging purposes, I guess you could try something like the following. I'm certain that there are corner cases for which this wouldn't work, but if you aren't doing anything too crazy (otherwise) with module loading in your package, it might be useful.
def reload_package(root_module):
package_name = root_module.__name__
# get a reference to each loaded module
loaded_package_modules = dict([
(key, value) for key, value in sys.modules.items()
if key.startswith(package_name) and isinstance(value, types.ModuleType)])
# delete references to these loaded modules from sys.modules
for key in loaded_package_modules:
del sys.modules[key]
# load each of the modules again;
# make old modules share state with new modules
for key in loaded_package_modules:
print 'loading %s' % key
newmodule = __import__(key)
oldmodule = loaded_package_modules[key]
oldmodule.__dict__.clear()
oldmodule.__dict__.update(newmodule.__dict__)
Which I very briefly tested like so:
import email, email.mime, email.mime.application
reload_package(email)
printing:
reloading email.iterators
reloading email.mime
reloading email.quoprimime
reloading email.encoders
reloading email.errors
reloading email
reloading email.charset
reloading email.mime.application
reloading email._parseaddr
reloading email.utils
reloading email.mime.base
reloading email.message
reloading email.mime.nonmultipart
reloading email.base64mime
Quitting and restarting the interpreter is the best solution. Any sort of live reloading or no-caching strategy will not work seamlessly because objects from no-longer-existing modules can exist and because modules sometimes store state and because even if your use case really does allow hot reloading it's too complicated to think about to be worth it.
With IPython comes the autoreload extension that automatically repeats an import before each function call. It works at least in simple cases, but don't rely too much on it: in my experience, an interpreter restart is still required from time to time, especially when code changes occur only on indirectly imported code.
Usage example from the linked page:
In [1]: %load_ext autoreload
In [2]: %autoreload 2
In [3]: from foo import some_function
In [4]: some_function()
Out[4]: 42
In [5]: # open foo.py in an editor and change some_function to return 43
In [6]: some_function()
Out[6]: 43
For Python version 3.4 and above
import importlib
importlib.reload(<package_name>)
from <package_name> import <method_name>
Refer below documentation for details.
There are some really good answers here already, but it is worth knowing about dreload, which is a function available in IPython which does as "deep reload". From the documentation:
The IPython.lib.deepreload module allows you to recursively reload a
module: changes made to any of its dependencies will be reloaded
without having to exit. To start using it, do:
http://ipython.org/ipython-doc/dev/interactive/reference.html#dreload
It is available as a "global" in IPython notebook (at least my version, which is running v2.0).
HTH
You can use import hook machinery described in PEP 302 to load not modules themself but some kind of proxy object that will allow you to do anything you want with underlying module object — reload it, drop reference to it etc.
Additional benefit is that your currently existing code will not require change and this additional module functionality can be torn off from a single point in code — where you actually add finder into sys.meta_path.
Some thoughts on implementing: create finder that will agree to find any module, except of builtin (you have nothing to do with builtin modules), then create loader that will return proxy object subclassed from types.ModuleType instead of real module object. Note that loader object are not forced to create explicit references to loaded modules into sys.modules, but it's strongly encouraged, because, as you have already seen, it may fail unexpectably. Proxy object should catch and forward all __getattr__, __setattr__ and __delattr__ to underlying real module it's keeping reference to. You will probably don't need to define __getattribute__ because of you would not hide real module contents with your proxy methods. So, now you should communicate with proxy in some way — you can create some special method to drop underlying reference, then import module, extract reference from returned proxy, drop proxy and hold reference to reloaded module. Phew, looks scary, but should fix your problem without reloading Python each time.
I am using PythonNet in my project. Fortunately, I found there is a command which can perfectly solve this problem.
using (Py.GIL())
{
dynamic mod = Py.Import(this.moduleName);
if (mod == null)
throw new Exception( string.Format("Cannot find module {0}. Python script may not be complied successfully or module name is illegal.", this.moduleName));
// This command works perfect for me!
PythonEngine.ReloadModule(mod);
dynamic instance = mod.ClassName();
Think twice for quitting and restarting in production
The easy solution without quitting & restarting is by using the reload from imp
import moduleA, moduleB
from imp import reload
reload (moduleB)

Navigating a big Python codebase faster

As programmers we read more than we write. I've started working at a company that uses a couple of "big" Python packages; packages or package-families that have a high KLOC. Case in point: Zope.
My problem is that I have trouble navigating this codebase fast/easily. My current strategy is
I start reading a module I need to change/understand
I hit an import which I need to know more of
I find out where the source code for that import is by placing a Python debug (pdb) statement after the imports and echoing the module, which tells me it's source file
I navigate to it, in shell or the Vim file explorer.
most of the time the module itself imports more modules and before I know it I've got 10KLOC "on my plate"
Alternatively:
I see a method/class I need to know more of
I do a search (ack-grep) for the definition of that method/class across the whole codebase (which can be a pain because the codebase is partly in ~/.buildout-eggs)
I find one or more pieces of code that define that method/class
I have to deduce which one of them is the one I need to read
This costs a lot of time, which is understandable for a big codebase. But I get the feeling that navigating a large and unknown Python codebase is a common enough problem.
So I'm looking for technical tools or strategic solutions for this problem.
...
I just can't imagine hardcore Python programmers using the strategies outlined above.
on Vim, I like NERDTree (a file browser) and taglist.vim (source code browser --> http://www.vim.org/scripts/script.php?script_id=273)
also in Vim, you can use CTRL-] to jump to a definition (:h CTRL-]):
download exuberant ctags http://ctags.sourceforge.net/
follow the install directions and put it somewhere on your PATH
from the 'root' directory of your source code, make a tags file from the shell: "ctags -R"
(make sure you have :set noautochdir, and make sure :pwd is the root directory from step 3)
go into Vim, cursor over some function or class name, hit CTRL-]
by default, if there's multiple matches for the tag, it shows you everywhere it was imported, and where it was declared
if the tag only has one match, it immediately jumps to it
...then use Ctrl+O and Ctrl+I to move back and forth from where you were
(repeat above steps for the source code of particular libraries you use, i usually keep a separate Vim window open to study stuff)
I use ipython's ?? command
You just need to figure out how to import the things you want to look for, then add ?? to the end of the module or class or function or method name to view their source code. And the command completion helps on figuring out long names as well.
Try red pill: https://github.com/klen/python-mode

how do you statically find dynamically loaded modules

How does one get (finds the location of) the dynamically imported modules from a python script ?
so, python from my understanding can dynamically (at run time) load modules.
Be it using _import_(module_name), or using the exec "from x import y", either using imp.find_module("module_name") and then imp.load_module(param1, param2, param3, param4) .
Knowing that I want to get all the dependencies for a python file. This would include getting (or at least I tried to) the dynamically loaded modules, those loaded either by using hard coded string objects or those returned by a function/method.
For normal import module_name and from x import y you can do either a manual scanning of the code or use module_finder.
So if I want to copy one python script and all its dependencies (including the custom dynamically loaded modules) how should I do that ?
You can't; the very nature of programming (in any language) means that you cannot predict what code will be executed without actually executing it. So you have no way of telling which modules could be included.
This is further confused by user-input, consider: __import__(sys.argv[1]).
There's a lot of theoretical information about the first problem, which is normally described as the Halting problem, the second just obviously can't be done.
From a theoretical perspective, you can never know exactly what/where modules are being imported. From a practical perspective, if you simply want to know where the modules are, check the module.__file__ attribute or run the script under python -v to find files when modules are loaded. This won't give you every module that could possibly be loaded, but will get most modules with mostly sane code.
See also: How do I find the location of Python module sources?
This is not possible to do 100% accurately. I answered a similar question here: Dependency Testing with Python
Just an idea and I'm not sure that it will work:
You could write a module that contains a wrapper for __builtin__.__import__. This wrapper would save a reference to the old __import__and then assign a function to __builtin__.__import__ that does the following:
whenever called, get the current stacktrace and work out the calling function. Maybe the information in the globals parameter to __import__ is enough.
get the module of that calling functions and store the name of this module and what will get imported
redirect the call the real __import__
After you have done this you can call your application with python -m magic_module yourapp.py. The magic module must store the information somewhere where you can retrieve it later.
That's quite of a question.
Static analysis is about predicting all possible run-time execution paths and making sure the program halts for specific input at all.
Which is equivalent to Halting Problem and unfortunately there is no generic solution.
The only way to resolve dynamic dependencies is to run the code.

Why does mass importing not work but importing definition individually works?

So I just met a strange so-called bug. Because this work on my other .py files, but just on this file it suddenly stopped working.
from tuttobelo.management.models import *
The above used to work, but it stopped working all of a sudden, and I had to replace it with the bottom.
from tuttobelo.management.models import Preferences, ProductVariant, UserSeller, ProductOwner, ProductModel, ProductVariant
from tuttobelo.management.models import ProductMeta, ShippingMethods
I know the following is the better way of coding, however ALL of the models mentioned in models are used, so my question is, what possible reasons can wildcard stop working?
The error I got was that the model I was trying to import does not exist, only if I remove the wildcard and import the name of the model could I get it imported properly.
Thanks!
Maybe the models module has an __all__ which does not include what you're looking for. Anyway, from ... import * is never a good idea in production code -- we always meant the import * feature for interactive exploratory use, not production use. Specifically import the module you need -- use that name to qualify names that belong there -- and you'll be vastly happier in the long run!-)
There are some cases in Python where importing with * will not yield anything. In your example, if tuttobelo.management.models is a package (i.e. a directory with an __init__.py) with the files Preferences.py, ProductVariant.py, etc in it, importing with star will not work, unless you already have imported it explicitly somewhere else.
This can be solved by putting in the __init__.py:
__all__ = ['Preferences', 'ProductVariant', 'UserSeller', <etc...> ]
This will make it possible to do import * again, but as noted, that's a horrible coding style for several reasons. One, tools like pyflakes and pylint, and code introspection in your editor, stops working. Secondly, you end up putting a lot of names in the local namespace, which in your code you don't know where they come from, and secondly you can get clashes in names like this.
A better way is to do
from tuttobelo.management import models
And then refer to the other things by models.Preferences, models.ProductVariant etc. This however will not work with the __all__ variable. Instead you need to import the modules from the __init__.py:
import Preferences, ProductVariant, UserSeller, ProductOwner, <etc...>
The drawback of this is that all modules get imported even if you don't use them, which means it will take more memory.

IronPython - How to prevent CLR (and other modules) from being imported

I'm setting up a web application to use IronPython for scripting various user actions and I'll be exposing various business objects ready for accessing by the script. I want to make it impossible for the user to import the CLR or other assemblies in order to keep the script's capabilities simple and restricted to the functionality I expose in my business objects.
How do I prevent the CLR and other assemblies/modules from being imported?
This would prevent imports of both python modules and .Net objects so may not be what you want. (I'm relatively new to Python so I might be missing some things as well):
Setup the environment.
Import anything you need the user to have access to.
Either prepend to their script or execute:
__builtins__.__import__ = None #Stops imports working
reload = None #Stops reloading working (specifically stops them reloading builtins
#giving back an unbroken __import___!
then execute their script.
You'll have to search the script for the imports you don't want them to use, and reject the script in toto if it contains any of them.
Basically, just reject the script if it contains Assembly.Load, import or AddReference.
You might want to implement the protection using Microsoft's Code Access Security. I myself am not fully aware of its workings (or how to make it work with IPy), but its something which I feel you should consider.
There's a discussion thread on the IPy mailing list which you might want to look at. The question asked is similar to yours.
If you'd like to disable certain built-in modules I'd suggest filing a feature request over at ironpython.codeplex.com. This should be an easy enough thing to implement.
Otherwise you could simply look at either Importer.cs and disallow the import there or you could simply delete ClrModule.cs from IronPython and re-build (and potentially remove any references to it).
In case anyone comes across this thread from google still (like i did)
I managed to disable 'import clr' in python scripts by commenting out the line
//[assembly: PythonModule("clr", typeof(IronPython.Runtime.ClrModule))]
in ClrModule.cs, but i'm not convinced this is a full solution to preventing unwanted access, since you will still need to override things like the file builtin.

Categories