Custom Python exception with different include paths - python

Update: As I was told, this is not a general Python problem but something more specific to my setup. See below for more details on my problem.
I have a custom exception (let's call it CustomException), that lives in a file named exceptions.py. Now imagine, that I can import this file via two paths:
import application.exceptions
or
import some.application.exceptions
with the same result. Furthermore I have no control over which way the module is imported in other modules.
Now to show my problem: Assume that the function do_something comes from another module that imports exceptions.py in a way I don't know. If I do this:
import application.exceptions
try:
    do_something()
except application.exceptions.CustomException:
    catch_me()
it might work or not, depending on how the sub-module imported exceptions.py (which I do not know).
Question: Is there a way to circumvent this problem, i.e., a name for the exception that will always be understood regardless of inclusion path? If not, what would be best practices to avoid these name clashes?
Cheers,
Update
It is a Django app. some would be the name of the Django 'project', application the name of one Django app. My code with the try..except clause sits in another app, frontend, and lives there as a view in a file some/frontend/views.py.
The PYTHONPATH is clean, that is, from my project only /path/to/project is in the path. In frontend/views.py I import exceptions.py via import application.exceptions, which seems to work. (Now, in retrospect, I don't know exactly why it works...)
The exception is raised in the exceptions.py file itself.
Update 2
It might interest some readers that I finally found the place where the imports went wrong.
sys.path didn't show any suspicious irregularities. My Django project was in /var/www/django/project. I had installed the apps app1 and app2, but listed them in settings.py as
INSTALLED_APPS = [
    'project.app1',
    'project.app2',
]
The additional project. was the culprit for messing up sys.modules. Rewriting the settings to
INSTALLED_APPS = [
    'app1',
    'app2',
]
solved the problem.
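A minimal sketch of the effect, assuming that both the project directory and its parent end up on sys.path (the temporary directory layout is made up for the demonstration; only the names project, app1 and CustomException come from my setup). The same exceptions.py is then importable under two module names, and each name gets its own class object:

```python
import importlib
import os
import sys
import tempfile

# Throwaway layout: <root>/project/app1/exceptions.py
root = tempfile.mkdtemp()
project_dir = os.path.join(root, "project")
app_dir = os.path.join(project_dir, "app1")
os.makedirs(app_dir)
for pkg_dir in (project_dir, app_dir):
    open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(app_dir, "exceptions.py"), "w") as f:
    f.write("class CustomException(Exception):\n    pass\n")

# Assumption: both the parent directory and the project directory are
# on sys.path, so the file is reachable as app1.exceptions and as
# project.app1.exceptions.
sys.path[:0] = [root, project_dir]
importlib.invalidate_caches()

short_mod = importlib.import_module("app1.exceptions")
long_mod = importlib.import_module("project.app1.exceptions")

# One file on disk, but two entries in sys.modules and two classes.
same_file = os.path.samefile(short_mod.__file__, long_mod.__file__)
same_class = short_mod.CustomException is long_mod.CustomException

try:
    raise long_mod.CustomException("raised via the long path")
except short_mod.CustomException:
    caught = True
except Exception:
    caught = False

print(same_file, same_class, caught)
```

The except clause for the short name does not match the exception raised through the long name, which is the behaviour I was seeing.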

Why would that be a problem? An exception is matched based on its class, and the class is the same however it is imported, e.g.
import exceptions
l=[]
try:
    l[1]
except exceptions.IndexError,e:
    print e
try:
    l[1]
except IndexError,e:
    print e
both catch the same exception
you can even assign it to a new name, though not recommended usually
import os
os.myerror = exceptions.IndexError
try:
    l[1]
except os.myerror,e:
    print e

"If not, what would be best practices to avoid these name clashes?"
That depends entirely on why they happen. In a normal installation, you can not import from both application.exceptions and somepath.application.exceptions, unless the first case is a relative path from within the module somepath. And in that case Python will understand that the modules are the same, and you won't have a problem.
You are unclear on whether you really have a problem or whether it is theoretical. If you do have a problem, I'd guess that there is something fishy with your PYTHONPATH. Maybe both a directory and its subdirectory are on the path?

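Even if the same module is imported several times and in different ways, the CustomException class is still the same object, so it doesn't matter how you refer to it. This holds as long as every import resolves to the same entry in sys.modules; if the file is reachable under two different module names, each name gets its own module object and its own CustomException class.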

I don't know if there is a way to handle this inclusion path issue.
My suggestion would be to use the 'as' keyword in your import
Something like:
import some.application.exceptions as my_exceptions
or
import application.exceptions as my_exceptions

Related

Python can't find module to import despite __init__.py files being present

I've looked all over for a day or so and am about ready to pull my hair out. Help me, SO, you're my only hope.
I have found countless solutions to this issue that boil down to "make sure you have __init__.py in appropriate places and it should just work."
I am sure that I am missing something obvious, this is a noob question.
Anyways -
Folder structure here
I have added what __init__ files I believe should exist.
Here is the code for my unit test:
import unittest
from pyrum.src import Card
class TestCard(unittest.TestCase):
    def setUp(self):
        pass
    def test_Card_Initalization(self):
        # Tests that numeric position is set properly for various options.
        self.assertTrue(Card("S","A").position==1)
        self.assertTrue(Card("S","2").position==2)
        self.assertTrue(Card("S","J").position==11)
        self.assertTrue(Card("S","Q").position==12)
        self.assertTrue(Card("S","K").position==13)
        # Tests that an exception is raised when an invalid suit is passed.
        with self.assertRaises(Exception) as context:
            Card("Z","Z")
        self.assertTrue('Suit Z provided does not match recongized suit code. Options are: [S: Spades, H: Hearts, C: Clubs, D: Diamonds]' in str(context.exception))
    def test_attack(self):
        pass
    def tearDown(self):
        pass
It is under construction, but this is it.
Whatever way I call test_Card.py (have tried a few different ways/contexts), I still get
Traceback (most recent call last):
  File "/workspaces/VSCodeDockerTest/pyrum/test/test_Card.py", line 2, in <module>
    from pyrum.src import Card
ModuleNotFoundError: No module named 'pyrum'
But to my understanding, it should be able to find pyrum no problem. When I place the card and test modules in the same folder (and change the import accordingly) it works fine.
Can someone please explain to me what I am missing? I am at my wits end here, and don't know what else I could search or look at to find a solution.
Thanks!
try: from src import Card
that could work
Here is my own solution. If someone has a better answer as to what is going on, I am all ears.
From this related Q&A:
My module has __init__.py and still Python can't import it
It seems like the issue was that python doesn't handle paths how I think it does? I was able to fix it by running
cd ..
export PYTHONPATH=.
and then running
python ./pyrum/tests/runner.py
(runner.py runs all my tests in turn)
I also removed the __init__.py files from src and test, but I am not sure that that, specifically, was necessary or helpful.
I am curious how to avoid having to do that each time I want to run my tests without permanently modifying my path. I am developing on Docker so I am less concerned if that's the best solution (I can just write a bash script to do those 3 things if I really have to...) but I'd like a clean and portable solution, if anyone has something better. I am assuming using a virtual environment would alleviate some of this issue, so I will probably try that in the future as my next random thing to try.
Now it's working everywhere except the debugger.
As you are importing pyrum.src, you are considering it as a module. Therefore, you have to use -m option.
Example: if your pyrum folder is at a/b/c/pyrum, stand on a/b/c and call
python -m pyrum.test.test_Card # without .py extension
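If setting PYTHONPATH or using -m every time is inconvenient, another option is to let the runner script add the project root to sys.path itself. When a file is run by path, Python puts that file's own directory at the front of sys.path, so the directory that contains pyrum is never searched. The following is only a sketch: add_project_root is a hypothetical helper, and it assumes the script sits two directory levels below the directory that contains the pyrum package.

```python
import sys
from pathlib import Path


def add_project_root(script_file, levels_up=2):
    """Put the directory that contains the top-level package on sys.path.

    levels_up=2 assumes a layout like <root>/pyrum/test/runner.py.
    """
    root = str(Path(script_file).resolve().parents[levels_up])
    if root not in sys.path:
        sys.path.insert(0, root)
    return root


# Inside runner.py this would be add_project_root(__file__); the literal
# path below is only used so that the sketch runs on its own.
root = add_project_root("/workspaces/VSCodeDockerTest/pyrum/test/runner.py")
print(root)
```

After that call, from pyrum.src import Card should resolve no matter which directory the script is started from. The -m approach above needs no extra code, so it is probably the cleaner choice when it is available.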

How should one write the import procedures in a module that uses imported modules in a limited way?

I have a module that features numerous functions. Some of these functions are dependent on other modules. The module is to be used in some environments that are not going to have these other modules. In these environments, I want the functionality of the module that is not dependent on these other unavailable modules to be usable. What is a reasonable way of coding this?
I am happy for an exception to be raised when a function's dependency module is not met, but I want the module to be usable for other functions that do not have dependency problems.
At present, I'm thinking of a module-level function something like the following:
def tryImport(moduleName = None):
    try:
        module = __import__(moduleName)
        return(module)
    except:
        raise(Exception("module {moduleName} import error".format(
            moduleName = moduleName)))
        sys.exit()
This would then be used within functions in a way such as the following:
def function1():
    pyfiglet = tryImport(moduleName = "pyfiglet")
For your use case, it sounds like there's nothing wrong with putting the imports you need inside functions, rather than at the top of your module (don't worry, it costs virtually nothing to reimport):
def some_function():
    import foo  # Raises ImportError if there is no foo
    return foo.bar()
Full stop, please.
The first thing is that the problem you described indicates that you should redesign your module. The most obvious solution is to break it into a couple of modules in one package. Each of them would contain group of functions with common external dependencies. You can easily define how many groups you need if you only know which dependencies might be missing on the target machine. Then you can import them separately. Basically in a given environment you import only those you need and the problem doesn't exist anymore.
If you still believe it's not the case and insist on keeping all the functions in one module then you can always do something like:
if env_type == 'env1':
    import pyfiglet
if env_type in ('env1', 'env2'):
    import pynose
    import gzio
How you determine the env_type is up to you. It might come from a configuration file or an environment variable.
Then you have your functions in this module. No problem occurs as long as none of the module's consumers calls a function that uses a module which is unavailable in the given environment.
I don't see any point in raising your custom exception. A NameError will be raised anyway on any attempt to access a name that was not imported. By the way, your sys.exit() would never be executed, because the raise before it already leaves the function.
If you don't want to define environment types, you can still achieve the goal with the following code:
try: import pyfiglet
except ImportError: pass
try: import pynose
except ImportError: pass
try: import gzio
except ImportError: pass
Both code snippets are supposed to be used on module level and not inside functions.
TL;DR I would just break this module into several parts. But if you really have to keep it monolithic, just use the basic language features and don't over-engineer it by using __import__ and a dedicated function.
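A small variation on the try/except snippet above, offered only as a sketch: binding the name to None when the import fails lets a dependent function raise a clearer error than the bare NameError. The function names banner and shout are made up for the illustration; pyfiglet.figlet_format is the only third-party call used.

```python
try:
    import pyfiglet  # optional dependency
except ImportError:
    pyfiglet = None  # remember that it is missing


def banner(text):
    """Needs pyfiglet; fails with a clear message if it is absent."""
    if pyfiglet is None:
        raise RuntimeError("banner() requires the optional module 'pyfiglet'")
    return pyfiglet.figlet_format(text)


def shout(text):
    """Has no optional dependencies, so it works in every environment."""
    return text.upper()


print(shout("still usable"))
```

The module imports cleanly in every environment, and only the functions that really need the missing dependency fail.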

import * from module named as a variable

I've got a situation where I need to import * from a list of modules which may or may not exist. So far, I've got a working solution that goes a little something like this:
for module in modules:
    try:
        imported = __import__(module, globals(), locals(), [], -1)
        for sub in module.split(".")[1:]:
            imported = getattr(imported, sub)
        for name in [n for n in dir(imported) if not n.startswith("_")]:
            globals()[name] = getattr(imported, name)
    except (ImportError, AttributeError):
        pass
This works, but is a total mess and confuses linters no end when they're looking for the variables that get imported in this way. I'm certain I'm missing something in the way __import__ works, since surely from foo import * doesn't generate such a horrible call as I've got above. Can anyone enlighten me?
Update:
The actual use case here is refactoring a django project settings file, to move app-specific settings into a settings file within that app. Currently, all settings are in the main settings file, which has become unmaintainably large. I need to retain the ability to override any of the settings defined in that way from another settings file (client_settings.py), which gets imported as from foo_project.client_settings import * at the bottom of the settings file.
Doing from app.settings import * for each app would work, but I want to drive this from the installed apps setting to maintain DRY.
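For comparison, here is a sketch of how the loop body might be simplified with importlib.import_module, which returns the leaf module directly so the getattr walk over the dotted name is not needed. star_import is a hypothetical helper name, and the stdlib module string only stands in for an app settings module so that the example runs anywhere. Like from foo import *, it honours __all__ when the module defines it.

```python
import importlib


def star_import(module_name, namespace):
    """Copy a module's public names into namespace.

    Returns False if the module cannot be imported.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    names = getattr(module, "__all__", None)
    if names is None:
        # No __all__: fall back to every name without a leading underscore.
        names = [n for n in vars(module) if not n.startswith("_")]
    for name in names:
        namespace[name] = getattr(module, name)
    return True


settings = {}
found = star_import("string", settings)
missing = star_import("no_such_module_for_demo", settings)
print(found, missing, "ascii_letters" in settings)
```

In a settings file the namespace argument would be globals(). Linters will still not see the injected names, and catching ImportError this broadly also hides import errors raised inside an app's own settings module.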

Load modules conditionally Python

I wrote a main Python module that needs to load a file parser to work. Initially I had only one text parser module, but I need to add more parsers for different cases.
parser_class1.py
parser_class2.py
parser_class3.py
Only one is required for each running instance, so I'm thinking of selecting it on the command line:
mmain.py -p parser_class1
For this purpose I wrote the following code, which selects the parser to load when the main module is called:
#!/usr/bin/env python
import argparse
aparser = argparse.ArgumentParser()
aparser.add_argument('-p',
                     action='store',
                     dest='module',
                     help='-p module to import')
results = aparser.parse_args()
if not results.module:
    aparser.error('Error! no module')
try:
    exec("import %s" %(results.module))
    print '%s imported done!'%(results.module)
except ImportError, e:
    print e
But I have read that this approach is dangerous, and maybe not standard.
So, is this approach OK, or must I find another way to do it?
Why?
Thanks, any comments are welcome.
You could actually just execute the import statement inside a conditional block:
if x:
    import module1a as module1
else:
    import module1b as module1
You can account for various whitelisted module imports in different ways using this, but effectively the idea is to pre-program the imports, and then essentially use a GOTO to make the proper imports... If you do want to just let the user import any arbitrary argument, then the __import__ function would be the way to go, rather than eval.
Update:
As @thedox mentioned in the comment, the as module1 section is the idiomatic way for loading similar APIs with different underlying code.
In the case where you intend to do completely different things with entirely different APIs, that's not the pattern to follow.
A more reasonable pattern in this case would be to include the code related to a particular import with that import statement:
if ...:
    import module1
    # do some stuff with module1 ...
else:
    import module2
    # do some stuff with module2 ...
As for security, if you allow the user to cause an import of some arbitrary code-set (e.g. their own module, perhaps?), it's not much different than using eval on user-input. It's essentially the same vulnerability: the user can get your program to execute their own code.
I don't think there's a truly safe manner to let the user import arbitrary modules, at all. The exception here is if they have no access to the file-system, and therefore cannot create new code to be imported, in which case you're basically back to the whitelist case, and may as well implement an explicit whitelist to prevent future-vulnerabilities if/when at some point in the future the user does gain file-system access.
here is how to use __import__()
# module names, without the .py extension
allowed_modules = ['os', 're', 'your_module', 'parser_class1', 'parser_class2']
if not results.module:
    aparser.error('Error! no module')
try:
    if results.module in allowed_modules:
        module = __import__(results.module)
        print '%s imported as "module"'%(results.module)
    else:
        print 'hey what are you trying to do?'
except ImportError, e:
    print e
module.your_function(your_data)
EVAL vs __IMPORT__()
Using eval allows the user to run any code on your computer. Don't do that. __import__() only lets the user load modules and does not obviously let them run arbitrary code, but it is only apparently safer.
The proposed function, without allowed_modules, is still risky because it can load an arbitrary module that runs malicious code when it is imported. An attacker could place a file somewhere (a shared folder, an FTP folder, an upload folder managed by your web server ...) and load it through your argument.
WHITELISTS
Using allowed_modules mitigates the problem but does not solve it completely: to harden it further you still have to check whether the attacker has placed an "os.py", "re.py", "your_module.py" or "parser_class1.py" in your script folder, since Python searches for modules there first (docs).
You could also compare the parser_class*.py code against a list of hashes, like sha1sum does.
FINAL REMARKS: In the end, if the user has write access to your script folder, you cannot guarantee that the code is absolutely safe.
You should think of all of the possible modules you may import for that parsing function and then use a case statement or dictionary to load the correct one. For example:
import parser_class1, parser_class2, parser_class3
parser_map = {
    'class1': parser_class1,
    'class2': parser_class2,
    'class3': parser_class3,
}
if not args.module:
    #report error
    parser = None
else:
    parser = parser_map[args.module]
#perform work with parser
If loading any of the parser_classN modules in this example is expensive, you can define lambdas or functions that return that module (i.e. def get_class1(): import parser_class1; return parser_class1) and alter the line to be parser = parser_map[args.module]()
The exec option could be very dangerous because you're executing unvalidated user input. Imagine if your user did something like -
mmain.py -p "parser_class1; some_function_or_code_that_is_malicious()"
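Putting the whitelist and dictionary ideas together, a sketch in Python 3 might look like the following. The stdlib modules json and csv are stand-ins for parser_class1 and friends so that the example runs anywhere, and load_parser is a hypothetical helper name. argparse rejects anything outside choices before any import is attempted, so a string like the malicious one above never reaches the import machinery.

```python
import argparse
import importlib

# Whitelist: command-line name -> module name. The stdlib modules are
# stand-ins for parser_class1, parser_class2, ... in this sketch.
PARSERS = {"json": "json", "csv": "csv"}


def load_parser(argv):
    """Parse argv and return the whitelisted parser module it selects."""
    aparser = argparse.ArgumentParser()
    aparser.add_argument("-p", dest="module", required=True,
                         choices=sorted(PARSERS),
                         help="parser module to load")
    args = aparser.parse_args(argv)
    # Only names from PARSERS can get here; import the mapped module.
    return importlib.import_module(PARSERS[args.module])


parser = load_parser(["-p", "json"])
print(parser.__name__)
```

In a real script the call would be load_parser(sys.argv[1:]). An invalid -p value makes argparse print a usage message and exit.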

Python - import error

I've done what I shouldn't have done and written 4 modules (6 hours or so) without running any tests along the way.
I have a method inside of /mydir/__init__.py called get_hash(), and a class inside of /mydir/utils.py called SpamClass.
/mydir/utils.py imports get_hash() from /mydir/__init__.
/mydir/__init__.py imports SpamClass from /mydir/utils.py.
Both the class and the method work fine on their own but for some reason if I try to import /mydir/, I get an import error saying "Cannot import name get_hash" from /mydir/__init__.py.
The only stack trace is the line saying that __init__.py imported SpamClass. The next line is where the error occurs in SpamClass when trying to import get_hash. Why is this?
This is a pretty easy problem to run into. What's happening is that the interpreter executes your __init__.py file line by line. When you have the following code:
import mydir.utils
def get_hash(): return 1
The interpreter will suspend processing __init__.py at the point of import mydir.utils until it has fully executed mydir/utils.py. So when utils.py attempts to import get_hash(), it isn't defined yet, because the interpreter hasn't reached its definition.
To add to what the others have said, another good approach to avoiding circular import problems is to avoid from module import stuff.
If you just do standard import module at the top of each script, and write module.stuff in your functions, then by the time those functions run, the import will have finished and the module members will all be available.
You then also don't have to worry about situations where some modules can update/change one of their members (or have it monkey-patched by a naughty third party). If you'd imported from the module, you'd still have your old, out-of-date copy of the member.
Personally, I only use from-import for simple, dependency-free members that I'm likely to refer to a lot: in particular, symbolic constants.
In absence of more information, I would say you have a circular import that you aren't working around. The simplest, most obvious fix is to not put anything in mydir/__init__.py that you want to use from any module inside mydir. So, move your get_hash function to another module inside the mydir package, and import that module where you need it.
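The failure and the suggested fix can be reproduced with two throwaway packages. This is only a sketch: the package names mydir_broken and mydir_fixed, the module name hashing.py and the helper make_package are made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile


def make_package(root, name, files):
    """Write a package directory containing the given source files."""
    pkg = os.path.join(root, name)
    os.makedirs(pkg)
    for filename, source in files.items():
        with open(os.path.join(pkg, filename), "w") as f:
            f.write(source)


root = tempfile.mkdtemp()
sys.path.insert(0, root)

# Broken: __init__.py imports utils before get_hash has been defined,
# and utils imports get_hash back from the half-initialised package.
make_package(root, "mydir_broken", {
    "__init__.py": "from mydir_broken.utils import SpamClass\n"
                   "def get_hash():\n    return 1\n",
    "utils.py": "from mydir_broken import get_hash\n"
                "class SpamClass:\n    pass\n",
})

# Fixed: get_hash lives in its own module, so utils no longer needs
# anything from __init__.py.
make_package(root, "mydir_fixed", {
    "__init__.py": "from mydir_fixed.utils import SpamClass\n",
    "hashing.py": "def get_hash():\n    return 1\n",
    "utils.py": "from mydir_fixed.hashing import get_hash\n"
                "class SpamClass:\n    hash_value = get_hash()\n",
})

importlib.invalidate_caches()

try:
    importlib.import_module("mydir_broken")
    broken_error = None
except ImportError as exc:
    broken_error = exc

fixed = importlib.import_module("mydir_fixed")
print(type(broken_error).__name__, fixed.SpamClass.hash_value)
```

The broken package fails with an ImportError about get_hash, while the fixed one imports cleanly because nothing inside the package depends on __init__.py any more.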
