Python 3 modules and package relative import doesn't work? - python

I have some difficulties constructing my project structure.
This is my project directory structure :
MusicDownloader/
    __init__.py
    main.py
    util.py
    chart/
        __init__.py
        chart_crawler.py
    test/
        __init__.py
        test_chart_crawler.py
Here is the code:
1. main.py
from chart.chart_crawler import MelonChartCrawler
crawler = MelonChartCrawler()
2. test_chart_crawler.py
from ..chart.chart_crawler import MelonChartCrawler
def test_melon_chart_crawler():
    crawler = MelonChartCrawler()
3. chart_crawler.py
import sys
sys.path.append("/Users/Chois/Desktop/Programming/Project/WebScrape/MusicDownloader")
from .. import util
class MelonChartCrawler:
    def __init__(self):
        pass
4. util.py
def hi():
    print("hi")
In MusicDownloader, when I execute main.py by python main.py, it shows errors:
File "main.py", line 1, in <module>
from chart.chart_crawler import MelonChartCrawler
File "/Users/Chois/Desktop/Programming/Project/WebScrape/MusicDownloader/chart/chart_crawler.py", line 4, in <module>
from .. import util
ValueError: attempted relative import beyond top-level package
But when I execute my test code in test directory by py.test test_chart_crawler.py, it works
When I first encountered absolute and relative imports, they seemed very easy and intuitive. But they are driving me crazy now. I need your help. Thanks.

The first problem is that MusicDownloader is not a package. Add __init__.py to MusicDownloader, next to main.py, and your relative import ..chart should work. Relative imports work only inside packages, so .. cannot reach a non-package folder.
I'm editing my post to give you a more accurate answer to your question edit.
It's all about the __name__. Relative imports use the __name__ of the module they appear in (more precisely its __package__, which is normally derived from __name__) together with the from .(.) part to form the full package/module name to import. In simple terms, the importer's __name__ is combined with the from part, with the dots showing how many trailing components of the name to drop, i.e.:
__name__='packageA.packageB.moduleA' in the file containing the line from .moduleB import something gives the combined value packageA.packageB.moduleB, so roughly from packageA.packageB.moduleB import something (though it is still not an absolute import, as it would be if typed like that directly).
__name__='packageA.packageB.moduleA' in the file containing the line from ..moduleC import something gives the combined value packageA.moduleC, so roughly from packageA.moduleC import something (though it is still not an absolute import, as it would be if typed like that directly).
Here it doesn't really matter whether it's a moduleB(C) or a packageB(C). What matters is that we still have the packageA part, which works as an 'anchor' for the relative import in both cases. Without a packageA part, the relative import can't be resolved, and we get an error like "Attempted relative import beyond toplevel package".
One more note: when a module is run directly as a script it gets the special __name__ value __main__, which prevents it from resolving any relative imports.
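The name arithmetic described above can be tried out with importlib.util.resolve_name from the standard library. This is only a sketch: the helper resolve_from is a made-up name, and it assumes the importing module is a plain module (not a package's __init__.py), so that its parent package is everything before the last dot of its __name__.

```python
from importlib.util import resolve_name


def resolve_from(module_name, relative):
    """Resolve `relative` as if it appeared in the module named `module_name`.

    Hypothetical helper: assumes `module_name` is a plain module, so its
    package is whatever precedes the last dot of the name.
    """
    package = module_name.rpartition(".")[0]
    return resolve_name(relative, package)


print(resolve_from("packageA.packageB.moduleA", ".moduleB"))
print(resolve_from("packageA.packageB.moduleA", "..moduleC"))

# No enclosing package is left to anchor the two dots, so resolution fails
# (ImportError on recent Python versions, ValueError on older ones).
for name in ("chart.chart_crawler", "__main__"):
    try:
        resolve_from(name, "..util")
    except (ImportError, ValueError) as exc:
        print(name, "->", type(exc).__name__)
```

The first two calls reproduce the packageA examples; the loop covers the two failing situations from this answer, a module whose name has no enclosing package and a module run as __main__.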
Now, regarding your case: try adding print(__name__) as the very first line of every file, run your files in different scenarios, and see how the output changes.
Namely if you run your main.py directly, you'll get:
__main__
chart.chart_crawler
Traceback (most recent call last):
File "D:\MusicDownloader\main.py", line 2, in <module>
from chart.chart_crawler import MelonChartCrawler
File "D:\MusicDownloader\chart\chart_crawler.py", line 2, in <module>
from .. import util
ValueError: Attempted relative import beyond toplevel package
What happened here is that main.py has no idea that MusicDownloader is a package (even after the previous edit adding __init__.py). In your chart_crawler.py, __name__='chart.chart_crawler', and for the relative import from .. two parts have to be removed from it (one for every dot), as explained above, so the result becomes '' because there are just two parts and no enclosing package. This leads to the exception.
When you import a module, the code inside it is run, so it's almost the same as executing it, except that __name__ does not become __main__ and the enclosing package, if there is one, gets 'noticed'.
So the solution is to import main.py as part of the MusicDownloader package. To do that, create a module, say launcher.py, at the same level of the hierarchy as the MusicDownloader folder (next to it, not inside it next to main.py) with the following code:
print(__name__)
from MusicDownloader import main
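Now run launcher.py and see the changes (on Python 3, first change the import in main.py to from .chart.chart_crawler import MelonChartCrawler, because chart is no longer a top-level name once main.py is imported as part of the package). The output: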
__main__
MusicDownloader.main
MusicDownloader.chart.chart_crawler
MusicDownloader.util
Here __main__ is the __name__ inside launcher.py. Inside chart_crawler.py: __name__='MusicDownloader.chart.chart_crawler' and when running relative import with from .. the combined value for package will need to remove two parts (one for every dot) as explained above, so the result will become 'MusicDownloader' with import becoming from MusicDownloader import util. And as we see on the next line when util.py is imported successfully it prints its __name__='MusicDownloader.util'.
So that's pretty much it - "it's all about that __name__".
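To reproduce both scenarios without touching a real project, the layout can be rebuilt in a temporary directory and run in a subprocess. This is a rough sketch under a few assumptions: run_case is a made-up helper, every file prints its __name__ first as suggested above, and for the launcher.py case main.py uses the package-relative import that Python 3 requires.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# File contents for the experiment; main.py is added per case below.
FILES = {
    "MusicDownloader/__init__.py": "",
    "MusicDownloader/util.py": "print(__name__)\n",
    "MusicDownloader/chart/__init__.py": "",
    "MusicDownloader/chart/chart_crawler.py": (
        "print(__name__)\n"
        "from .. import util\n"
        "class MelonChartCrawler:\n"
        "    pass\n"
    ),
    "launcher.py": "print(__name__)\nfrom MusicDownloader import main\n",
}


def run_case(main_import, script, cwd):
    """Write the layout with the given main.py import, then run `script` in `cwd`."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        files = dict(FILES)
        files["MusicDownloader/main.py"] = "print(__name__)\n" + main_import + "\n"
        for rel, text in files.items():
            path = root / rel
            path.parent.mkdir(parents=True, exist_ok=True)
            path.write_text(text)
        return subprocess.run(
            [sys.executable, script],
            cwd=str(root / cwd),
            capture_output=True,
            text=True,
        )


# Case 1: run main.py directly from inside MusicDownloader (the failing setup).
direct = run_case(
    "from chart.chart_crawler import MelonChartCrawler", "main.py", "MusicDownloader"
)
# Case 2: import main as part of the package through launcher.py.
via_launcher = run_case(
    "from .chart.chart_crawler import MelonChartCrawler", "launcher.py", "."
)

print(direct.stdout + direct.stderr.splitlines()[-1])
print(via_launcher.stdout)
```

In the first case chart_crawler sees itself as chart.chart_crawler and the relative import fails; in the second every module name starts with MusicDownloader, so the two dots have a package to land on.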
P.S. One thing not mentioned is why the part with the test package worked. It wasn't launched in the usual way: you used an additional program (py.test) to launch it, and it probably imported your test module in some way, so it worked. To understand this, it's best to see how that program works.
There's a note in official docs:
Note that relative imports are based on the name of the current module. Since the name of the main module is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.

Related

Does a package 'see' itself in __init__.py?

I have a flask app with the root folder called project_folder.
A code snippet from the __init__.py file of this project_folder package:
@jwt.token_in_blacklist_loader
def check_if_token_in_blacklist(decrypted_token):
    jti = decrypted_token['jti']
    return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti)
from project_folder.Controller.root import root
from project_folder.Controller import auth_controller
from project_folder.Controller import item_controller
Now the interesting thing is, that the project_folder package naturally has other smaller packages itself, which I'm importing to use them (for REST resources in this example). These are the last 3 lines, nothing throws an error so far.
But, if you take a look at the annotated function (in this example it always runs before some kind of JWT Token is being used), I am returning some inner package's function. Now when the logic truly runs this part the code breaks:
File "PROJECT_ROUTE\project_folder\__init__.py", line 38, in check_if_token_in_blacklist
return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti)
NameError: name 'project_folder' is not defined
After thinking about it, it seems understandable. Importing from project_folder imports from the package's __init__.py file, which is the very file the interpreter is currently executing. So removing the package name prefix, i.e. changing
return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti)
to
return Model.RevokedTokenModel.is_jti_blacklisted(jti)
does not throw an error anymore.
The question is: Why is it only a problem inside the callback function and not with the last 3 imports?
This has to do with circular imports in python. Circular import is a form of circular dependency, created at the module import level.
How it works:
When you launch your application, python keeps a register (a kind of table) in which it records all the imported modules. When you import a module somewhere in your code, python checks whether it is already in this registry and, if so, loads it from there. You can access this registry via sys.modules, which is a dictionary containing all the modules that have been imported since Python was started.
Example of use:
>>> import sys
>>> print('\n'.join(sys.modules.keys()))
So, since Python is an interpreted language, reading and execution of code is done line by line from top to bottom.
In your code, you put your imports at the bottom of your __init__.py file.
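A def statement only defines the function; its body is not executed until the callback is actually called. When that happens and python reaches the line return project_folder.Model.RevokedTokenModel.is_jti_blacklisted(jti), it looks up the name project_folder in the namespace of __init__.py. The from project_folder.Controller ... import lines bind only root, auth_controller and item_controller there, never the name project_folder itself, even though the package is registered in sys.modules. That's why it raises a NameError: name 'project_folder' is not defined exception.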
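A small experiment with a standard-library package shows the same effect. It is only an illustration: xml.dom stands in for project_folder.Controller, and the function name is made up. A from package.sub import name statement loads the package and registers it in sys.modules, but binds only name in the importing module.

```python
import sys
from xml.dom import minidom  # binds only the name "minidom" in this namespace


def parse_with_prefix():
    # The name "xml" is looked up when the function is called,
    # not when the function is defined.
    return xml.dom.minidom.parseString("<a/>")


print("xml" in sys.modules)  # the package itself has been loaded
print("xml" in globals())    # but its name is not bound in this module

try:
    parse_with_prefix()
    error = None
except NameError as exc:
    error = exc
print(repr(error))

import xml.dom.minidom  # a plain import binds the top-level name "xml"

print(parse_with_prefix().documentElement.tagName)
```

This may also be why the un-prefixed Model.RevokedTokenModel form works: importing a submodule sets it as an attribute on its parent package, and the attributes of a package are the globals of its __init__.py.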

Custom Python module structure

I am currently doing a personal coding project and I am trying to build a module, but I don't know why my structure doesn't work the way it's supposed to:
\mainModule
    __init__.py
    main.py
    \subModule_1
        __init__.py
        someCode_1.py
        someCode_2.py
    \subModule_2
        __init__.py
        otherCode.py
I want to be able to run the following code from main.py:
>>> from subModule_1 import someCode_1
>>> someCode_1.function()
"Hey, this works!"
>>> var = someCode_2.someClass("blahblahblah")
>>> var.classMethod1()
>>> "blah blah blah"
>>> from subModule2 import otherCode
>>> otherCode("runCode","#ff281ba0")
However, when I try to import someCode_1, for example, it returns an AttributeError, and I'm not really sure why. Is it to do with the __init__.py file?
REVISIONS
Minimal, Complete and verifiable (I hope...)
\mainDir
    __init__.py # blank file
    main.py
    \subDir
        __init__.py # blank file
        codeFile.py
Using this...
#main.py file
import subDir
subDir.codeFile.function()
And this...
#codeFile.py file
def function():
    return "something"
...it returns the same problem mentioned above**.
** The exact error is:
Traceback (most recent call last):
File "C:\...\mainDir\main.py", line 2, in <module>
subDir.codeFile.function()
AttributeError: module 'subDir' has no attribute 'codeFile'
Credits to @jonrsharpe: Thanks for showing me how to use Stack Overflow correctly.
You have two options to make this work.
Either this:
from subDir import codeFile
codeFile.function()
Or:
import subDir.codeFile
subDir.codeFile.function()
When you import subDir, it does three things:
1. executes the code in mainDir/subDir/__init__.py (i.e. in this case does nothing, because this file is empty);
2. imports the resulting module under the name subDir locally, which will in turn make it an attribute of the mainDir module;
3. registers the new import globally in the sys.modules dictionary (because the import is being performed from a parent module mainDir, the name is completed to 'mainDir.subDir' for the purposes of this registration).
What it does not do, because it hasn't been told to, is import subDir.codeFile. Therefore, the code in codeFile.py has not been run and the name codeFile has not yet been imported into the namespace of mainDir.subDir. Hence the AttributeError when trying to access it. If you were to add the following line to mainDir/subDir/__init__.py then it would work:
from . import codeFile  # on Python 3; a bare "import codeFile" only works here on Python 2
Specifically, this will:
1. run the code in codeFile.py;
2. add the resulting module as an attribute of the mainDir.subDir module;
3. store a reference to it as yet another entry in sys.modules, this time under the name mainDir.subDir.codeFile.
You could also achieve the same effect from higher up the module hierarchy, by saying import subDir, subDir.codeFile instead of just import subDir in your mainDir.main source file.
NB: When you test this from the command line or IDE, make sure that your current working directory (queried by os.getcwd(), changed using os.chdir(wherever) ) is neither mainDir nor subDir. Work from somewhere else—e.g. the parent directory of mainDir. Working from inside a module will lead to unexpected results.
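The difference between importing a package and importing one of its submodules can be checked with a throwaway copy of the layout. The sketch below builds subDir in a temporary directory (an assumption made so that it runs anywhere) and looks at the package before and after the submodule import.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Throwaway copy of the layout: subDir/__init__.py (empty) and subDir/codeFile.py
root = Path(tempfile.mkdtemp())
pkg = root / "subDir"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "codeFile.py").write_text("def function():\n    return 'something'\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

import subDir  # runs only subDir/__init__.py

before = hasattr(subDir, "codeFile")

import subDir.codeFile  # runs codeFile.py and attaches it to the package

after = hasattr(subDir, "codeFile")

print(before, after, subDir.codeFile.function())
```

Only the second import statement executes codeFile.py, which is why the attribute appears on the package afterwards.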

Circular function import

I'm trying to do the following in python 2.6.
my_module.py:-
from another_module import another_factory
def my_factory(name):
    pass
another_module.py:-
from my_module import my_factory
def another_factory(name):
    pass
Both modules in the same folder.
It gives me the error:
Error: cannot import name my_factory
As seen from the comments, you are trying to do a circular import, which cannot work here.
If module A tries to import something from module B, and loading module B (to satisfy this dependency) tries to import something from module A, you are back where you started and you have a circular import: A needs B and B needs A! It is somewhat like saying that A needs A, which is quite illogical.
For instance:
# moduleA
from moduleB import functionB
...
So the interpreter tries to load the moduleB, which looks like the following:
# moduleB
from moduleA import functionA
...
And it goes back to moduleA, which has not finished loading yet, so functionA does not exist at that point. Python therefore raises the error and stops the insanity for the greater good.
Dependencies don't work like this. Define what module needs the other one, and just do a simple import. In your example, it seems that another_module needs my_module, so change my_module and eliminate the dependency on another_module.
If both modules actually need each other, it is a clear sign that they belong to the same logical concept, and should be merged.
P.S.: in some cases, to avoid huge files, you can split a logical unit in two and, to avoid the circular dependency, write your imports inside the functions (which are not executed at load time), so that there is no circle. This is, however, generally something to avoid.
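Both behaviours can be reproduced with throwaway modules. In this sketch the module and function names (broken_a, broken_b, lazy_a, lazy_b, factory_a, factory_b) are invented for the demonstration; the files are written to a temporary directory so the example runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())

# Pair 1: both modules import from each other at load time.
(root / "broken_a.py").write_text(
    "from broken_b import factory_b\n"
    "def factory_a(name):\n"
    "    return 'a:' + name\n"
)
(root / "broken_b.py").write_text(
    "from broken_a import factory_a\n"
    "def factory_b(name):\n"
    "    return 'b:' + name\n"
)

# Pair 2: one side postpones its import until the function is called.
(root / "lazy_a.py").write_text(
    "from lazy_b import factory_b\n"
    "def factory_a(name):\n"
    "    return 'a:' + name\n"
)
(root / "lazy_b.py").write_text(
    "def factory_b(name):\n"
    "    from lazy_a import factory_a  # resolved at call time\n"
    "    return 'b:' + factory_a(name)\n"
)

sys.path.insert(0, str(root))
importlib.invalidate_caches()

try:
    importlib.import_module("broken_a")
    broken_error = None
except ImportError as exc:
    broken_error = exc
print(type(broken_error).__name__)

lazy_b = importlib.import_module("lazy_b")
print(lazy_b.factory_b("x"))
```

The first pair fails because broken_b asks for factory_a while broken_a is still half loaded; in the second pair lazy_b has finished loading by the time lazy_a is imported, so the cycle never bites.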
The real question is... do you consider each file as a module or are they part of a package ?
Trying to import modules outside a package is sometimes painful. You should rather build a package by simply creating an empty __init__.py module in the directory. So, if you have
__init__.py
my_module.py
another_module.py
and the following function in my_module.py,
def my_factory(x):
return x * x
You should be able to access the my_factory() function from another_module.py by writing this :
from my_module import my_factory
But if you don't have the __init__.py file/module, the import machinery will rely only on sys.path when searching for other modules. You may then add the following lines (before the import) to the another_module.py file:
import os, sys
sys.path.append(os.path.dirname(os.path.abspath(__file__)))  # directory containing this file
You may also use the various packages available to help importing modules, like imp or import_file (see the documentation). Or you can decide to use load_source (also see the doc : https://docs.python.org/2/library/imp.html)

Change main package name in python

In my project I want to change the main package name.
I've a dir structure like this:
hallo/sub
hallo/foo
hallo/bar
And I want to change the main name for example to 'goodbye':
goodbye/sub
goodbye/foo
goodbye/bar
But the new name is always rejected! For example, if I import
import goodbye.sub.utils as utils
it returns the error
ImportError: No module named sub.utils
And obviously the old name no longer works.
There is an __init__.py file in every subdirectory!
I've tried removing all *.pyc files and the cache directory, and I've tried re-cloning the project into another directory, but nothing helps: the new name is always rejected!
I'm using python2 under *nix and I have never moved to Windows.
Any ideas?
Edit:
The old name works perfectly:
import hallo.sub.utils as utils
Has always worked without any errors, the problem is the name change.
For imports from the outside, check the contents of __init__.py for variables that can throw things off -- like __all__.
Also, the typical idiom is:
from hallo.sub import utils  # typically this
import hallo.sub.utils as utils  # instead of this
I can't imagine why that would make a difference, but python occasionally has silly bugs.
For imports from within the package, you can instead use relative imports. Within your hallo package you can change this:
import hallo.sub as sub
import hallo.sub.utils as utils
to this
from . import sub
from .sub import utils
Then it doesn't matter what the outer package is called.
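A quick way to check this is to copy a package that uses only relative imports under a second name and import both copies. The sketch below does that in a temporary directory; the where() function is invented for the demonstration, and the package names follow the question.

```python
import importlib
import shutil
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "hallo"
(pkg / "sub").mkdir(parents=True)
(pkg / "__init__.py").write_text("from .sub import utils\n")
(pkg / "sub" / "__init__.py").write_text("")
(pkg / "sub" / "utils.py").write_text("def where():\n    return __name__\n")

# Same files under a new top-level name; nothing inside the package is edited.
shutil.copytree(pkg, root / "goodbye")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

hallo = importlib.import_module("hallo")
goodbye = importlib.import_module("goodbye")

print(hallo.utils.where())
print(goodbye.utils.where())
```

Because the package never spells out its own top-level name, the copy resolves its imports against goodbye without any change to the source files.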

Python Packages?

Ok, I think whatever I'm doing wrong, it's probably blindingly obvious, but I can't figure it out. I've read and re-read the tutorial section on packages and the only thing I can figure is that this won't work because I'm executing it directly. Here's the directory setup:
eulerproject/
    __init__.py
    euler1.py
    euler2.py
    ...
    eulern.py
    tests/
        __init__.py
        testeulern.py
Here are the contents of testeuler12.py (the first test module I've written):
import unittest
from .. import euler12
class Euler12UnitTests(unittest.TestCase):
    def testtriangle(self):
        """
        Ensure that the triangle number generator returns the first 10
        triangle numbers.
        """
        self.seq = [1,3,6,10,15,21,28,36,45,55]
        self.generator = euler12.trianglegenerator()
        self.results = []
        while len(self.results) != 10:
            self.results.append(self.generator.next())
        self.assertEqual(self.seq, self.results)
    def testdivisors(self):
        """
        Ensure that the divisors function can properly factor the number 28.
        """
        self.number = 28
        self.answer = [1,2,4,7,14,28]
        self.assertEqual(self.answer, euler12.divisors(self.number))
if __name__ == '__main__':
    unittest.main()
Now, when I execute this from IDLE and from the command line while in the directory, I get the following error:
Traceback (most recent call last):
File "C:\Documents and Settings\jbennet\My Documents\Python\eulerproject\tests\testeuler12.py", line 2, in <module>
from .. import euler12
ValueError: Attempted relative import in non-package
I think the problem is that since I'm running it directly, I can't do relative imports (because __name__ changes, and my vague understanding of the packages description is that __name__ is part of how it tells what package it's in), but in that case what do you guys suggest for how to import the 'production' code stored 1 level up from the test code?
I had the same problem. I now use nose to run my tests, and relative imports are correctly handled.
Yeah, this whole relative import thing is confusing.
Generally you would have a directory, the name of which is your package name, somewhere on your PYTHONPATH. For example:
eulerproject/
    euler/
        __init__.py
        euler1.py
        ...
    tests/
        ...
    setup.py
Then, you can either install this systemwide, or make sure to set PYTHONPATH=/path/to/eulerproject/:$PYTHONPATH when invoking your script.
An absolute import like this will then work:
from euler import euler1
Edit:
According to the Python docs, "modules intended for use as the main module of a Python application should always use absolute imports." (Cite)
So a test harness like nose, mentioned by the other answer, works because it imports packages rather than running them from the command line.
If you want to do things by hand, your runnable script needs to be outside the package hierarchy, like this:
eulerproject/
    runtests.py
    euler/
        __init__.py
        euler1.py
        ...
        tests/
            __init__.py
            testeulern.py
Now, runtests.py can do from euler.tests.testeulern import TestCase and testeulern.py can do from .. import euler1
