#Background and Problem#
I'm trying to build a web-scraper to back up my social media accounts (summer project, I know it's useless).
I'm trying to create a nice class structure, so I've come up with the following structure (I accept critique, I'm pretty new to programming):
\social-media-backup
    \Files
        __init__.py
        File.py
        Image.py
        Video.py
    \SocialMedia
        SocialMediaFeed.py
        SocialMediaPost.py
        \Instagram
            __init__.py
            \MediaTypes
                __init__.py
                GraphImage.py
                GraphVideo.py
            \SearchTypes
                __init__.py
                InstagramUser.py
        \Twitter
        \VSCO
(Twitter and VSCO are, for now, empty. Anything without an extension and starting with \ is a folder. Every file contains a class with the same name as the file.)
#Questions#
Where can I learn Python's packaging system definitively? Any book or web-site recommendation?
How would I import Image into GraphImage? How would I import File into SocialMediaPost?
What do I need to write in __init__.py so as to import SOME_PACKAGE and have it import every module? (e.g.: import Files and have Files.Image and Files.Video accessible).
(I know there are a lot of questions. They're written in order of importance)
#To accomplish importing File into SocialMediaPost I've tried:#
from Files.File import File
from ...Files.File import File
import File
from File import File
And almost any combination you can imagine.
I always get an Unable to import, No module named '__main__.Files' or Attempted relative import beyond top-level package.
#Expected behavior#
I'm used to Java's way of doing this, and I cannot figure out how to do this in Python. It seems so messed up. I really miss just adding a package and a folder tree from where the compiler would run.
knocks desk THERE MUST BE A BETTER WAY
##THANKS!!##
Lots of stuff is written about this... however most guides focus on how you do it, not what to do.
How I do it (for small to medium-sized projects):
Do not mess with sys.path.
Have one "project root" directory with your modules / packages underneath (as you already do).
Use absolute imports always, except for "sister" modules.
Always run as module, i.e. using python -m foo.bar
Concrete example using your structure. Assuming that your entry point might be in \SocialMedia\SocialMediaFeed.py, use import statements:
from . import SocialMediaPost (sister module)
from SocialMedia import Instagram (child package; a bare import Instagram would not resolve from the project root)
from Files import Image (other module)
and run using: python -m SocialMedia.SocialMediaFeed
By running as a module from the project root (social-media-backup), you always have that directory added as a "search path". This way absolute imports referring to its subfolders always work. By the way, you can print out the module search path using import sys; print(sys.path).
Some of this might seem overcomplicated, but I found that following the above points pays off greatly as soon as you try to package up stuff for installation (keyword setup.py).
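Here is a minimal sketch of the points above, assuming Python 3. It builds a throwaway copy of part of the tree in a temporary directory (the file contents and the `kind` attribute are made up for illustration) and launches SocialMediaFeed once with -m and once by filename, so the two outcomes can be compared.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def build_demo_project(root: Path) -> None:
    """Write a tiny stand-in for the project tree (contents are hypothetical)."""
    (root / "Files").mkdir()
    (root / "SocialMedia").mkdir()
    (root / "Files" / "__init__.py").write_text("")
    (root / "Files" / "Image.py").write_text("class Image:\n    kind = 'image'\n")
    (root / "SocialMedia" / "__init__.py").write_text("")
    (root / "SocialMedia" / "SocialMediaPost.py").write_text(
        "from Files.Image import Image  # absolute import, resolved from the root\n"
        "class SocialMediaPost:\n"
        "    media = Image\n"
    )
    (root / "SocialMedia" / "SocialMediaFeed.py").write_text(
        "from .SocialMediaPost import SocialMediaPost  # sister module\n"
        "print(SocialMediaPost.media.kind)\n"
    )


def run(root: Path, *args: str) -> subprocess.CompletedProcess:
    """Run the interpreter with the project root as working directory."""
    return subprocess.run(
        [sys.executable, *args], cwd=root, capture_output=True, text=True
    )


def demo():
    """Return the results of launching the feed as a module and as a script."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_demo_project(root)
        as_module = run(root, "-m", "SocialMedia.SocialMediaFeed")
        as_script = run(root, str(root / "SocialMedia" / "SocialMediaFeed.py"))
    return as_module, as_script


if __name__ == "__main__":
    as_module, as_script = demo()
    print("python -m exit code:", as_module.returncode)
    print("by filename exit code:", as_script.returncode)
    print(as_script.stderr.strip().splitlines()[-1])
```

The run by filename fails on the relative import because the file is then executed as a top-level script with no parent package, which is the situation behind the error messages quoted in the question.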
Edit: to answer 3rd question: Have __init__.py contain:
from . import File
from . import Image
from . import Video
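If listing every module by hand gets tedious, one possible alternative (a sketch, not the only way) is to let __init__.py discover its own submodules with pkgutil. The demo below writes such an __init__.py into a temporary Files package and checks the result in a fresh interpreter; the class bodies are placeholders.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical Files/__init__.py: import every submodule found next to it.
INIT_SOURCE = """\
import importlib
import pkgutil

for _info in pkgutil.iter_modules(__path__):
    importlib.import_module(f"{__name__}.{_info.name}")
"""


def imported_submodules() -> list:
    """Build a throwaway Files package and report what `import Files` exposes."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "Files"
        pkg.mkdir()
        (pkg / "__init__.py").write_text(INIT_SOURCE)
        for name in ("File", "Image", "Video"):
            (pkg / f"{name}.py").write_text(f"class {name}:\n    pass\n")
        check = (
            "import Files\n"
            "print(Files.File.File.__name__, Files.Image.Image.__name__, "
            "Files.Video.Video.__name__)\n"
        )
        # Run in a child interpreter started in tmp, so Files is importable.
        result = subprocess.run(
            [sys.executable, "-c", check],
            cwd=tmp, capture_output=True, text=True, check=True,
        )
    return result.stdout.split()


if __name__ == "__main__":
    print(imported_submodules())
```

The explicit from . import lines are easier to read and to grep for, so the automatic version is mostly worth it when the list of modules changes often.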
I would second the comments by Damian and user2357112 - try to avoid name collisions between folder, file and class/function when creating modules.
You probably won't be able to import anything outside of the current working directory without adding it to your PYTHONPATH. Adding a folder to your PYTHONPATH environment variable means that python will always check that folder when importing modules, so you'll be able to import it from anywhere.
There is a good thread on this already that will put you in the right direction:
Permanently add a directory to PYTHONPATH?
(It's a lot to cover in one post)
I've tried to create a new module in Python. Its Git link: https://github.com/Sanmitha-Sadhishkumar/strman
After uploading and installing that using pip, I found that I could access that module as
import strman.strman as s
s.func_name
What are the changes to be made to access that as
import strman
strman.func_name
In your __init__.py file, you want to use a relative import.
from .strman import *
You have a strman package (the outer directory) and within it a strman module (the strman.py file). That's a perfectly common pattern. But without the relative import your __init__.py wasn't importing from deep enough in the hierarchy.
More generally, whenever you import from a sibling module within a project, you almost always should use a relative import, because it's explicit and avoids various complications, such as the example in your case.
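A rough sketch of how the star import behaves, using a throwaway package called strman_demo (a made-up name, so it cannot clash with the installed strman). The function shout and the __all__ list are assumptions for illustration and are not taken from the linked repository.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_demo_package():
    """Create strman_demo/ with an inner strman.py and import the package."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "strman_demo"
        pkg.mkdir()
        (pkg / "strman.py").write_text(
            "__all__ = ['shout']  # names exported by a star import\n"
            "def shout(text):\n"
            "    return text.upper()\n"
            "def _helper():\n"
            "    return None\n"
        )
        # The relative import lifts the module's public names to package level.
        (pkg / "__init__.py").write_text("from .strman import *\n")
        sys.path.insert(0, tmp)  # demo only; an installed package needs no path edit
        importlib.invalidate_caches()
        try:
            return importlib.import_module("strman_demo")
        finally:
            sys.path.remove(tmp)


if __name__ == "__main__":
    demo_pkg = load_demo_package()
    print(demo_pkg.shout("hello"))
    print(hasattr(demo_pkg, "_helper"))
```

Defining __all__ in the inner module keeps the package namespace limited to the names you actually want to publish.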
You have to move the files into the main directory. Your directory will look like this
\myProject
- __init__.py
- strman.py
- main.py # <-- this would be the file your program runs from
Read more about modules in the docs
I have a python project structured like this:
repo_dir/
----project_package/
--------__init__.py
--------process.py
--------config.py
----tests/
--------test_process.py
__init__.py is empty
config.py looks like this:
name = 'brian'
USAGE
I use the library by running python process.py from the project/project/ directory, or by specifying the python file path absolutely. I'm running Python 2.7 on Amazon EC2 Linux.
When process.py looks like below, everything works fine and process.py prints brian.
import config
print config.name
When process.py looks like below, I get the error ImportError: No module named project.config.
import project.config
print config.name
When process.py looks like below, I get the error ImportError: No module named project. This makes sense as the same behavior from the previous example should be expected.
from project import config
print config.name
If I add these lines to process.py to include the library root in sys.path, all configurations above work fine.
import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.path.dirname(__file__), '..')))
MY CONFUSION
Many resources suggest setting up python libraries to import modules using project.module_name, but it doesn't seem like sys.path appending is standard, and seems weird that I need it. I can see that the sys.path append added my library root as a path in sys, but I thought that's what the __init__.py in my library root was supposed to do. What gives? What am I missing? I know Python importing creates lots of headaches so I've tried to simplify this as much as possible to wrap my head around it. I'm going crazy and it's Friday before a holiday. I'm bummed. Please help!!
QUESTIONS
How should I set up my libraries? How should I import packages? Where should I have __init__.py files? Do I need to append my library root to sys.path in every project? Why is this so confusing?
Your project setup is alright. I renamed the directories just for clarity
in this example, but the structure is the same as yours:
repo_dir/
    project_package/
        __init__.py
        process.py
        config.py
    # Declare your project in a setup.py file, so that
    # it will be installable, both by users and by you.
    setup.py
When you have a module that wants to import from another module in
the same project, the best approach is to use relative imports. For example:
# In process.py
from .config import name
...
While working on the code on your dev box, do your work in a Python virtualenv,
and pip install your project in "editable" mode.
# From the root of your repo:
pip install -e .
With that approach, you'll never need to muck around with sys.path -- which
is almost always the wrong approach.
I think the problem is how you're running your script. If you want the script to be living in a package (the inner project folder), you should run it with python -m project.process, rather than by filename. Then you can make absolute or explicit relative imports to get config from process.
An absolute import would be from project import config or import project.config.
An explicit relative import would be from . import config.
Python 2 also allows implicit relative imports, but they're a really bad misfeature that you should never use. With implicit relative imports, internal package modules can shadow top-level modules. For instance, a project/json.py file would hide the standard library's json module from all the other modules in the package. You can tell Python you want to forbid implicit relative imports by putting from __future__ import absolute_import at the top of the file. It's the standard behavior in Python 3.
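To see why the way of launching matters, here is a small sketch (assuming Python 3) that runs the same file twice and reports __package__ and sys.path[0] each time. The package is a throwaway copy named project, built in a temporary directory; the helper name launch_both_ways is made up.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Stand-in for process.py: report how the interpreter sees this file.
PROCESS_SOURCE = "import sys\nprint(__package__ or '<none>')\nprint(sys.path[0])\n"


def launch_both_ways() -> dict:
    """Return {'module': (package, path0_name), 'script': (package, path0_name)}."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "project"
        pkg.mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "process.py").write_text(PROCESS_SOURCE)
        commands = {
            "module": ["-m", "project.process"],
            "script": [str(pkg / "process.py")],
        }
        report = {}
        for label, args in commands.items():
            out = subprocess.run(
                [sys.executable, *args],
                cwd=tmp, capture_output=True, text=True, check=True,
            ).stdout.splitlines()
            # Keep only the last path component so the results are easy to compare.
            report[label] = (out[0], Path(out[1]).name)
    return report


if __name__ == "__main__":
    for label, (package, path0) in launch_both_ways().items():
        print(f"{label}: __package__={package}, sys.path[0] ends in {path0}")
```

When launched by filename, the first search path entry is the package folder itself, so the name project is not visible from there; with -m it is the directory above, which is what import project.config needs.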
Sorry for asking my own question a second time, but I am totally stuck on importing a file in Python.
I have a directory structure below:
|--test/foo.py
|--library #This is my PYTHONPATH
|--|--script1.py
|--|--library_1
|--|--|--script2.py
|--|--library_2
|--|--library_3
I am accessing library/library_1/script2.py from test/foo.py.
Here I am confused about which approach is better. Generally, all library folders or utility functions should be added to PYTHONPATH.
I maintain this folder structure to separate utility functions from test scripts.
I tried putting __init__.py in library and library_1 and then importing with from library_1 import script2, but I get the error No module named script.
I have tried appending that path to the system path as well.
What works: adding another PYTHONPATH entry such as path/to/library/library_1/. Should I do this for every folder inside library to make it work?
Here's what you need to do:
|--test/foo.py
|--library #This is my PYTHONPATH
|--|--__init__.py
|--|--script1.py
|--|--library_1
|--|--|--__init__.py
|--|--|--script2.py
|--|--library_2
|--|--|--__init__.py
|--|--library_3
|--|--|--__init__.py
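And inside the first __init__.py (the one directly under library) you need:
from . import library_1
from . import library_2
from . import script1
Then, if the directory that contains library is on your Python path, you can do this within test/foo.py with no errors (bar and foo stand for whatever library_1/__init__.py and script1.py define):
import library
library.library_1.bar()
library.script1.foo()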
I have some doubts about package structure in a Python project when I write the imports.
These are some conventions:
python-irodsclient_API = Project Name
I've defined a Python package for each file; in this case they are the following:
python-irodsclient_API/config/
python-irodsclient_API/connection/
Are these properly defined as packages, and not merely as directories?
I have the file python-irodsclient_API/config/config.py, in which I've defined some configuration constants for connecting to my server:
And I have the python-irodsclient_API/connection/connection.py file:
In the previous image (highlighted in red), is this the right way to import the files?
I have the feeling that this is not the best way.
I know that imports should be relative and not absolute (for the path), and that it is necessary to use "." instead of "*".
In my case I don't know whether this applies to what I'm doing in the images.
I appreciate your help and guidance.
Best Regards
There is a good tutorial about this in the Python module docs, which explains how to refer to packages under structured folders.
Basically, from x import y, where y is a submodule name, allows you to use y.z instead of x.y.z.
You have 2 options here:
1) Make your project a package. Since your connection and config packages seem to be interdependent, they should be modules within the same package. To make this happen, add an __init__.py file in the python-irodsclient_API folder. Now you can use relative imports to import config into connection, as they are part of the same package:
from ..config import config
The .. part means import from one level above within the package structure (similar to how .. means parent directory in Unix).
2) If you don't want to make python-irodsclient_API a package for some reason, then the second option is to add that folder to the PYTHONPATH. You can do this dynamically per Tony Yang's answer, or do this from the bash command line as follows:
export PYTHONPATH=$PYTHONPATH:/path/to/python-irodsclient_API
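A sketch of option 2, assuming a throwaway folder stands in for python-irodsclient_API and that config.py defines a constant called HOST (both made up here). The child interpreter is started from a different directory, so the import can only succeed through PYTHONPATH.

```python
import os
import subprocess
import sys
import tempfile
from pathlib import Path


def import_via_pythonpath() -> str:
    """Import config.config in a child process that only knows PYTHONPATH."""
    with tempfile.TemporaryDirectory() as project, \
            tempfile.TemporaryDirectory() as elsewhere:
        config_pkg = Path(project) / "config"
        config_pkg.mkdir()
        (config_pkg / "__init__.py").write_text("")
        (config_pkg / "config.py").write_text("HOST = 'irods.example.org'\n")

        env = dict(os.environ)
        # Prepend the project folder, keeping any existing PYTHONPATH entries.
        existing = env.get("PYTHONPATH")
        env["PYTHONPATH"] = project + (os.pathsep + existing if existing else "")

        result = subprocess.run(
            [sys.executable, "-c", "from config import config; print(config.HOST)"],
            cwd=elsewhere, env=env, capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(import_via_pythonpath())
```

Setting the variable only for the child process, as done here, avoids changing the environment of your whole shell session.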
You can use the sys module to append the python-irodsclient_API path:
import sys
sys.path.append(r'C:\..\python-irodsclient_API')
Then, when you run connection.py and import config, the import succeeds.
I did some web searching, but all I found was frustration.
I have a project in a directory (let's call it) "projectdir", in which I have "main.py".
In projectdir I have a subdirectory called "otherstuff", In which I have "foo.py".
How do I import foo.py, so I can use its contents in main.py, without doing much of the work that python designers/implementors should have, and without relying on boilerplate files?
Or is that impossible in python?
You need to put an __init__.py file in the otherstuff subdirectory to mark it as a package. After that, you can import your module using:
import otherstuff.foo
or
from otherstuff import foo
The __init__.py file can be empty. There is no other "clean" way to achieve that in python.
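As a sketch of that layout, the snippet below recreates projectdir in a temporary directory and runs main.py by filename. The greet function is a placeholder for whatever foo.py really contains.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_main() -> str:
    """Build projectdir/main.py plus otherstuff/foo.py and run main.py."""
    with tempfile.TemporaryDirectory() as tmp:
        projectdir = Path(tmp) / "projectdir"
        otherstuff = projectdir / "otherstuff"
        otherstuff.mkdir(parents=True)
        (otherstuff / "__init__.py").write_text("")  # empty marker file
        (otherstuff / "foo.py").write_text(
            "def greet():\n    return 'hello from foo'\n"
        )
        (projectdir / "main.py").write_text(
            "from otherstuff import foo\nprint(foo.greet())\n"
        )
        result = subprocess.run(
            [sys.executable, str(projectdir / "main.py")],
            capture_output=True, text=True, check=True,
        )
    return result.stdout.strip()


if __name__ == "__main__":
    print(run_main())
```

This works without any path setup because the directory containing main.py is the first entry on the module search path when main.py is run directly.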
You need to include __init__.py in your otherstuff directory. This tells Python to search there for imports.
The Python documentation explains how module/package imports work, and it is definitely worth the time to read despite its length.