I can't seem to import my own custom NYT module. My project structure is as follows and I'm on a mac:
articulation/
    articulation/
        __init__.py  # empty
        lib/
            nyt.py
            __init__.py  # empty
        tests/
            test_nyt.py
            __init__.py  # empty
When I try running python articulation/tests/test_nyt.py from that first parent directory, I get
File "articulation/tests/test_nyt.py", line 5, in <module>
from articulation.lib.nyt import NYT
ImportError: No module named articulation.lib.nyt
I also tried
(venv) Ericas-MacBook-Pro:articulation edohring$ Python -m articulation/tests/test_nyt.py
/Users/edohring/Desktop/articulation/venv/bin/Python: Import by filename is not supported.
test_nyt.py
import sys
sys.path.insert(0, '../../')
import unittest
#from mock import patch
# TODO: store example as fixture and complete test
from articulation.lib.nyt import NYT
class TestNYT(unittest.TestCase):
    #patch('articulation.lib.nyt.NYT.fetch')
    def test_nyt(self):
        print "hi"
        #assert issubclass(NYT, Article)
        # self.assertTrue(sour_surprise.title == '')"""
nyt.py
from __future__ import division
import regex as re
import string
import urllib2
from collections import Counter
from bs4 import BeautifulSoup
from cookielib import CookieJar
PARSER_TYPE = 'html.parser'
class NYT:
    def __init__(self, title, url):
        self.url = url
        self.title = title
        self.words = get_words(url)

def get_words(url):
    cj = CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(cj))
    p = opener.open(url)
    soup = BeautifulSoup(p.read(), PARSER_TYPE)
    # title = soup.html.head.title.string
    letters = soup.find_all('p', class_='story-body-text story-content')
    if len(letters)==0:
        letters = soup.find_all('p', class_='paragraph--story')
    if len(letters)==0:
        letters = soup.find_all('p', class_='story-body-text', )
    words = Counter()
    for element in letters:
        a = element.get_text().split()
        for c in a:
            # keep only the alphabetic characters of each word
            c = ''.join(ch for ch in c if ch.isalpha())
            c = c.lower()
            if len(c) > 0:
                words[c] += 1
    return words
def test_nyt():
    china_apple_stores = NYT('title_test', 'http://www.nytimes.com/2016/12/29/technology/iphone-china-apple-stores.html?_r=0')
    assert(len(china_apple_stores.words) > 0)
    # print china_apple_stores.words
    fri_brief = NYT('Russia, Syria, 2017: Your Friday Briefing', 'http://www.nytimes.com/2016/12/30/briefing/us-briefing-russia-syria-2017.html')
    assert(fri_brief.title == 'Russia, Syria, 2017: Your Friday Briefing')
    assert(fri_brief.url == 'http://www.nytimes.com/2016/12/30/briefing/us-briefing-russia-syria-2017.html')
    assert(len(fri_brief.words) > 0)
    vet = NYT('title_test', 'http://lens.blogs.nytimes.com/2017/01/03/a-love-story-and-twins-for-a-combat-veteran-amputee/')
    assert(len(vet.words)>0)
    print "All NYT Tests Passed"
#test_nyt()
I've tried the following and none seem to work - does anyone know how to fix this?
- Adding an __init__.py file to the top directory -> Doesn't help
- Entering Memory Python couldn't find this - maybe because I'm using Python 2. If this is the issue I can post more what I tried.
- Adding sys.path at the top from suggestion below
Doing this:
import sys
sys.path.insert(0, '../../')
is usually a bad idea. A relative entry such as '../../' is resolved against the current working directory, not against the location of test_nyt.py, so it only helps when the script happens to be launched from the right place. It can be useful when you're testing something, or for a single-use program that you're going to throw away, but in general it's a bad habit to get into: it tends to stop working once you move directories around or give the code to someone else.
The most likely reason to get the kind of error you're seeing is that the directory /Users/edohring/Desktop/articulation does not appear in sys.path. The first thing to do is see what actually is in sys.path, and one good way to do that is to temporarily put these lines at the top of test_nyt.py:
import os.path, sys
for p in sys.path:
    print(p)
    if not os.path.isabs(p):
        print('  (absolute: {})'.format(os.path.abspath(p)))
sys.exit()
Then run
python articulation/tests/test_nyt.py
and look at the output. You will get a line for each directory path that Python looks in to find its modules, and if any of those paths are relative, it will also print out the corresponding absolute path so that there is no confusion. I suspect you will find that /Users/edohring/Desktop/articulation does not appear anywhere in this list.
If that turns out to be the case, the most straightforward (but least future-proof) way to fix it is to run
export PYTHONPATH=".:$PYTHONPATH"
in the shell (not in Python!) before you use Python itself to do anything using your module. Directories named in the PYTHONPATH environment variable will be added to sys.path when Python starts up. This is only a temporary fix, unless you put it in a file like $HOME/.bashrc which will get read by the shell every time you open up a Terminal window. You can read about this and better ways to add the proper directory to sys.path in this question.
Perhaps a better way to run your script is to use the shell command
python -m articulation.tests.test_nyt
This needs to be run in the directory /Users/edohring/Desktop/articulation, or at least that directory needs to appear in sys.path in order for the command to work. But using the -m switch in this way causes Python to handle how it sets up sys.path a little differently, and it may work for you. You can read more about how sys.path is populated in this answer.
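To make the difference concrete, here is a rough sketch of which directory ends up first in sys.path in each case. It is a simplified model, not CPython's actual startup code: first_sys_path_entry is a hypothetical helper, the paths are the ones from the question, and POSIX-style paths are assumed.

```python
import posixpath  # POSIX flavour of os.path, so the result is the same on any OS


def first_sys_path_entry(cwd, script=None):
    """Model sys.path[0] for `python <script>` versus `python -m <module>`.

    Simplified: with a script, Python puts the script's own directory first;
    with -m, it puts the current working directory first.
    """
    if script is not None:
        script_path = posixpath.normpath(posixpath.join(cwd, script))
        return posixpath.dirname(script_path)
    return cwd


project = "/Users/edohring/Desktop/articulation"
as_script = first_sys_path_entry(project, "articulation/tests/test_nyt.py")
as_module = first_sys_path_entry(project)
print(as_script)  # the tests directory: no 'articulation' package inside it
print(as_module)  # the project directory: contains the 'articulation' package
```

Under this model, only the -m form puts the directory that contains the articulation package at the front of the search path.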
Related
I'm using Python 3.9.5.
Based on this post, I'm trying to reuse some functions from the parent directory. Here's my code hierarchy:
github_repository
    src
        base
            string_utilities.py
        validation
            email_validator.py
I also have __init__.py in all folders. In ALL of them.
Here's the string_utilities.py content:
def isNullOrEmpty(text: str):
    # True when text is None or an empty string
    return text is None or len(text) == 0
And here's the email_validator.py content:
from src.base import string_utilities
def is_email(text: str):
    if string_utilities.isNullOrEmpty(text):
        return False
    # logic to check email
    return True
Now when I run python email_validator.py, I get this error:
ModuleNotFoundError: No module named 'src'
I have changed that frustrating import statement to all of these different forms, and I still get no results:
from ...src.base import string_utilities
which results in:
ImportError: attempted relative import with no known parent package
import src.base.string_utilities
Which causes compiler to not know the isNullOrEmpty function.
import ...src.base.string_utilities
Which results in:
Relative imports cannot be used with "import .a" form; use "from . import a" instead
I'm stuck at this point on how to reuse that function in this file. Can someone please help?
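One approach that may help here is to keep the absolute import and run the file as a module from github_repository (python -m src.validation.email_validator), so that the directory containing src is on sys.path. The sketch below rebuilds a throwaway copy of the layout and compares the two ways of launching it. The file contents are minimal stand-ins, and build_demo_tree and run_python are hypothetical helpers, not part of the original project.

```python
import os
import subprocess
import sys
import tempfile

# Minimal stand-ins for the files in the question (contents are illustrative).
DEMO_FILES = {
    "src/__init__.py": "",
    "src/base/__init__.py": "",
    "src/base/string_utilities.py": (
        "def isNullOrEmpty(text):\n"
        "    return text is None or len(text) == 0\n"
    ),
    "src/validation/__init__.py": "",
    "src/validation/email_validator.py": (
        "from src.base import string_utilities\n"
        "print(string_utilities.isNullOrEmpty(''))\n"
    ),
}


def build_demo_tree(root):
    """Write DEMO_FILES below root, creating directories as needed."""
    for rel_path, content in DEMO_FILES.items():
        full_path = os.path.join(root, *rel_path.split("/"))
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w") as handle:
            handle.write(content)


def run_python(args, cwd):
    """Run this interpreter with args from cwd; return (exit code, stdout, stderr)."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONPATH"}
    done = subprocess.run([sys.executable] + args, cwd=cwd, env=env,
                          stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return done.returncode, done.stdout, done.stderr


with tempfile.TemporaryDirectory() as root:
    build_demo_tree(root)
    # As a script from inside src/validation: sys.path[0] is that folder.
    as_script = run_python(["email_validator.py"],
                           os.path.join(root, "src", "validation"))
    # As a module from the repository root: sys.path[0] is the root.
    as_module = run_python(["-m", "src.validation.email_validator"], root)

print("script exit code:", as_script[0])
print("module exit code:", as_module[0])
```

Running the file directly puts src/validation at the front of sys.path, where no src package can be found; running it with -m from the repository root puts the root there instead.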
File setup:
...\Project_Folder
...\Project_Folder\Project.py
...\Project_folder\Script\TestScript.py
I'm attempting to have Project.py import modules from the folder Script based on user input.
Python Version: 3.4.2
Ideally, the script would look something like
q = str(input("Input: "))
from Script import q
However, python does not recognize q as a variable when using import.
I've tried using importlib, however I cannot figure out how to import from the Script folder mentioned above.
import importlib
q = str(input("Input: "))
module = importlib.import_module(q, package=None)
I'm not certain where I would implement the file path.
Repeat of my answer originally posted at How to import a module given the full path?
as this is a Python 3.4 specific question:
This area of Python 3.4 seems to be extremely tortuous to understand, mainly because the documentation doesn't give good examples! This was my attempt using non-deprecated modules. It will import a module given the path to the .py file. I'm using it to load "plugins" at runtime.
import importlib.util
import os

def import_module_from_file(full_path_to_module):
    """
    Import a module given the full path/filename of the .py file
    Python 3.4
    """
    module = None
    try:
        # Get module name and path from full path
        module_dir, module_file = os.path.split(full_path_to_module)
        module_name, module_ext = os.path.splitext(module_file)
        # Get module "spec" from filename
        spec = importlib.util.spec_from_file_location(module_name, full_path_to_module)
        module = spec.loader.load_module()
    except Exception as ec:
        # Simple error printing
        # Insert "sophisticated" stuff here
        print(ec)
    # Returns None if the import failed
    return module

# load module dynamically
path = "<enter your path here>"
module = import_module_from_file(path)
# Now use the module
# e.g. module.myFunction()
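On Python 3.5 and later, importlib.util.module_from_spec together with the loader's exec_module is the documented replacement for load_module(). A minimal sketch of the same idea in that style follows; load_module_from_path and the hello_plugin file are made-up names used only for the demonstration.

```python
import importlib.util
import os
import tempfile


def load_module_from_path(path):
    """Load and return the module stored in the .py file at path (Python 3.5+)."""
    module_name = os.path.splitext(os.path.basename(path))[0]
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load a module from {!r}".format(path))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module


# Demo with a throwaway plugin file (name and contents are made up).
with tempfile.TemporaryDirectory() as plugin_dir:
    plugin_path = os.path.join(plugin_dir, "hello_plugin.py")
    with open(plugin_path, "w") as handle:
        handle.write("def greet():\n    return 'hello from plugin'\n")
    plugin = load_module_from_path(plugin_path)
    greeting = plugin.greet()

print(greeting)
```

Unlike the version above, this sketch lets import errors propagate to the caller instead of printing them, which is usually easier to debug when a plugin fails to load.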
I did this by defining the entire import line as a string, formatting the string with q, and then passing it to exec (a function, not a statement, in Python 3):
imp = 'from Script import %s' % q
exec(imp)
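An alternative that avoids exec is to build the dotted module name and hand it to importlib.import_module. The sketch below assumes the Script folder sits next to Project.py and is therefore reachable through sys.path; import_from_script_folder is a hypothetical helper, and the temporary directory only stands in for Project_Folder.

```python
import importlib
import os
import sys
import tempfile


def import_from_script_folder(module_name, package="Script"):
    """Import package.module_name by name, for example Script.TestScript."""
    if not module_name.isidentifier():
        # Reject input that is not a plain module name.
        raise ValueError("not a valid module name: {!r}".format(module_name))
    return importlib.import_module("{}.{}".format(package, module_name))


# Demo: a temporary directory plays the role of Project_Folder.
project_dir = tempfile.mkdtemp()
os.makedirs(os.path.join(project_dir, "Script"))
with open(os.path.join(project_dir, "Script", "TestScript.py"), "w") as handle:
    handle.write("VALUE = 42\n")
sys.path.insert(0, project_dir)  # Project.py's own directory is normally there already
importlib.invalidate_caches()

loaded = import_from_script_folder("TestScript")
print(loaded.__name__, loaded.VALUE)
```

In Project.py itself, the call would be something like import_from_script_folder(q), where q is the string read from input().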
I am new to Python and I'm trying to write a program that creates a directory named with today's date, creates a sandbox in that directory, and runs the Makefile in the sandbox. I am having trouble getting the variables to be picked up in the os.path lines. The code is posted below:
#!/usr/bin/python
import mks_function
from mks_function import mks_create_sandbox
import sys, os, time, datetime
import os.path
today = datetime.date.today() # get today's date as a datetime type
todaystr = today.isoformat() # get string representation: YYYY-MM-DD
# from a datetime type.
if not os.path.exists('/home/build/test/sandboxes/'+todaystr):
    os.mkdir(todaystr)
else:
    pass
if not os.path.exists('/home/build/test/sandboxes/'+todaystr+'/new_sandbox/project.pj'):
    mks_create_sandbox()
else:
    pass
if os.path.exists('/home/build/test/sandboxes/'+todaystr+'/new_sandbox/Makefile'):
    os.system("make >make_results.txt 2>&1")
Any help would be appreciated,
Thanks
A couple of notes:
#!/usr/bin/env python
# import mks_function .. you won't need this ...
from mks_function import mks_create_sandbox
import os, datetime
# import time, sys .. these aren't used in this snippet
# import os.path .. just refer to os.path, since os is already imported
# get today's date as an ISO string: YYYY-MM-DD
todaystr = datetime.date.today().isoformat()
# .. use os.path.join()
if not os.path.exists(os.path.join('/home/build/test/sandboxes/', todaystr)):
    os.mkdir(os.path.join('/home/build/test/sandboxes/', todaystr))
# .. 'else: pass' is unnecessary
# .. no leading slash on later components, or os.path.join() discards the earlier ones
if not os.path.exists(os.path.join(
        '/home/build/test/sandboxes/', todaystr, 'new_sandbox/project.pj')):
    # I don't see how the sandbox ends up in the right directory here;
    # maybe you should change the working directory first via ..
    # os.chdir(os.path.join('/home/build/test/sandboxes/', todaystr))
    mks_create_sandbox()
if os.path.exists(os.path.join(
        '/home/build/test/sandboxes/', todaystr, 'new_sandbox/Makefile')):
    # .. change to the right directory
    os.chdir(os.path.join(
        '/home/build/test/sandboxes/', todaystr, 'new_sandbox/'))
    os.system("make > make_results.txt 2>&1")
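One thing to watch for with os.path.join(): a component that starts with a slash is treated as an absolute path, and everything before it is dropped. A small illustration follows, using posixpath (the POSIX implementation behind os.path) so the result does not depend on the operating system; the date is just an example value.

```python
import posixpath  # POSIX flavour of os.path, used so the result is the same on any OS

base = "/home/build/test/sandboxes/"
todaystr = "2009-12-21"  # example date, standing in for today's ISO date

# A component with a leading slash is absolute, so join() drops what came before it.
wrong = posixpath.join(base, todaystr, "/new_sandbox/Makefile")
# Relative components are appended as expected.
right = posixpath.join(base, todaystr, "new_sandbox", "Makefile")

print(wrong)
print(right)
```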
Please try adding chdir code before you call make
if os.path.exists('/home/build/test/sandboxes/'+todaystr+'/new_sandbox/Makefile'):
    os.chdir('/home/build/test/sandboxes/'+todaystr+'/new_sandbox/')
    os.system("make >make_results.txt 2>&1")
I think you want to change a few things:
def makeSandbox():
    sbdir = os.path.join('/home/build/test/sandboxes/',todaystr)
    if not os.path.exists(sbdir):
        os.mkdir(sbdir) # <- fully qualified path
    else:
        pass
And I don't really see what variables need to be picked up, seems fine to me.
Not sure what the module mks_function does. But I see one issue with your code.
For example,
if not os.path.exists('/home/build/test/sandboxes/'+todaystr):
    os.mkdir(todaystr)
In the chunk above you check whether the directory '/home/build/test/sandboxes/' + todaystr
exists, but then create a directory named by the value of todaystr alone (say 2009-12-21). This creates a directory named '2009-12-21' in the current working directory rather than under /home/build/test/sandboxes,
which is what you intended, I guess. So change to that directory before the call to mkdir, or pass the full path to mkdir. It is also good to handle the OSError that os.mkdir raises when the directory cannot be created.
path module might help in this case:
#!/usr/bin/env python
from mks_function import mks_create_sandbox
import os, datetime
from path import path
sandboxes = path('/home/build/test/sandboxes/')
today = sandboxes / datetime.date.today().isoformat()
today.mkdir() # create directory if it doesn't exist
project = today / "new_sandbox/project.pj"
project.parent.mkdir() # create sandbox directory if it doesn't exist
if not project.isfile():
    mks_create_sandbox()
makefile = project.parent / "Makefile"
if makefile.isfile():
    os.chdir(makefile.parent)
    os.system("make >make_results.txt 2>&1")
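On Python 3 the standard library's pathlib offers a similar style without a third-party module. The following is a sketch only: prepare_sandbox_dir and run_make are hypothetical helpers, mks_create_sandbox is left out because its behaviour is not shown in the question, and the demo uses a temporary directory in place of /home/build/test/sandboxes.

```python
import datetime
import subprocess
import tempfile
from pathlib import Path


def prepare_sandbox_dir(sandboxes_root, day=None):
    """Create <root>/<YYYY-MM-DD>/new_sandbox if needed and return its path."""
    day = day or datetime.date.today()
    sandbox = Path(sandboxes_root) / day.isoformat() / "new_sandbox"
    sandbox.mkdir(parents=True, exist_ok=True)  # no error if it already exists
    return sandbox


def run_make(sandbox):
    """Run make inside sandbox, writing stdout and stderr to make_results.txt."""
    with open(str(sandbox / "make_results.txt"), "w") as log:
        return subprocess.call(["make"], cwd=str(sandbox),
                               stdout=log, stderr=subprocess.STDOUT)


# Demo against a temporary root; run_make is not called because it needs
# make and a Makefile to be present.
with tempfile.TemporaryDirectory() as root:
    sandbox = prepare_sandbox_dir(root, datetime.date(2009, 12, 21))
    created = sandbox.is_dir()
    relative = sandbox.relative_to(root).as_posix()

print(created, relative)
```

Passing cwd to subprocess runs make in the sandbox without changing the working directory of the Python process itself.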
Is there a straightforward way to list the names of all modules in a package, without using __all__?
For example, given this package:
/testpkg
/testpkg/__init__.py
/testpkg/modulea.py
/testpkg/moduleb.py
I'm wondering if there is a standard or built-in way to do something like this:
>>> package_contents("testpkg")
['modulea', 'moduleb']
The manual approach would be to iterate through the module search paths in order to find the package's directory. One could then list all the files in that directory, filter out the uniquely-named py/pyc/pyo files, strip the extensions, and return that list. But this seems like a fair amount of work for something the module import mechanism is already doing internally. Is that functionality exposed anywhere?
Using python2.3 and above, you could also use the pkgutil module:
>>> import pkgutil
>>> [name for _, name, _ in pkgutil.iter_modules(['testpkg'])]
['modulea', 'moduleb']
EDIT: Note that the parameter for pkgutil.iter_modules is not a list of modules, but a list of paths, so you might want to do something like this:
>>> import os.path, pkgutil
>>> import testpkg
>>> pkgpath = os.path.dirname(testpkg.__file__)
>>> print([name for _, name, _ in pkgutil.iter_modules([pkgpath])])
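Putting the pieces together, a package_contents function along the lines the question asks for might look like the sketch below. It uses the package's __path__ attribute rather than os.path.dirname(__file__), and it does import the package, so the package's __init__.py runs. The json package from the standard library is used only as a convenient example.

```python
import importlib
import pkgutil


def package_contents(package_name):
    """Return the sorted names of the modules directly inside a package."""
    package = importlib.import_module(package_name)  # runs the package's __init__
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        # Plain modules have no __path__; only packages do.
        raise ImportError("not a package: {!r}".format(package_name))
    return sorted(name for _, name, _ in pkgutil.iter_modules(search_path))


print(package_contents("json"))
```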
import module
help(module)
Maybe this will do what you're looking for?
import imp  # note: imp is deprecated and was removed in Python 3.12
import os
MODULE_EXTENSIONS = ('.py', '.pyc', '.pyo')
def package_contents(package_name):
    file, pathname, description = imp.find_module(package_name)
    if file:
        # find_module returns an open file for plain modules, None for packages
        raise ImportError('Not a package: %r' % package_name)
    # Use a set because some may be both source and compiled.
    return set([os.path.splitext(module)[0]
                for module in os.listdir(pathname)
                if module.endswith(MODULE_EXTENSIONS)])
I don't know if I'm overlooking something, or if the other answers are just outdated, but listing the modules in a package seems really easy using inspect:
>>> import inspect, testpkg
>>> [name for name, _ in inspect.getmembers(testpkg, inspect.ismodule)]
['modulea', 'moduleb']
Edit: as user815423426 pointed out, this only works on live objects, so it lists only the submodules that have already been imported.
This is a recursive version that works with python 3.6 and above:
import importlib.util
from pathlib import Path
import os
MODULE_EXTENSIONS = '.py'
def package_contents(package_name):
    spec = importlib.util.find_spec(package_name)
    if spec is None:
        return set()
    pathname = Path(spec.origin).parent
    ret = set()
    with os.scandir(pathname) as entries:
        for entry in entries:
            if entry.name.startswith('__'):
                continue
            current = '.'.join((package_name, entry.name.partition('.')[0]))
            if entry.is_file():
                if entry.name.endswith(MODULE_EXTENSIONS):
                    ret.add(current)
            elif entry.is_dir():
                ret.add(current)
                ret |= package_contents(current)
    return ret
There is a __loader__ variable inside each package instance. So, if you import the package, you can find the "module resources" inside the package:
import testpkg  # replace with your package name
for mod in testpkg.__loader__.get_resource_reader().contents():
    print(mod)
You can of course improve the loop to find the "module" name:
import testpkg
from pathlib import Path
for mod in testpkg.__loader__.get_resource_reader().contents():
    # You can filter the names with a test like
    # Path(mod).suffix not in (".py", ".pyc")
    print(Path(mod).stem)
Inside the package, you can find your modules by directly using __loader__ of course.
This should list the modules:
help("modules")
If you would like to view information about your package outside of Python code (from a command prompt), you can use pydoc.
# get a full list of the modules available on your machine
$ python -m pydoc modules
# get information about a specific package
$ python -m pydoc <your package>
You get the same result inside the interpreter by using help:
>>> import <my package>
>>> help(<my package>)
Based on cdleary's example, here's a recursive version listing path for all submodules:
import imp, os
def iter_submodules(package):
    file, pathname, description = imp.find_module(package)
    for dirpath, _, filenames in os.walk(pathname):
        for filename in filenames:
            if os.path.splitext(filename)[1] == ".py":
                yield os.path.join(dirpath, filename)
The other answers here will run the code in the package as they inspect it. If you don't want that, you can grep the files like this answer
import re
from typing import List

def _get_class_names(file_name: str) -> List[str]:
    """Get the Python class names defined in a file without running code

    file_name: the name of the file to search for class definitions in
    return: all the classes defined in that Python file, empty list if no matches"""
    defined_class_names = []
    # search the file for class definitions
    with open(file_name, "r") as file:
        for line in file:
            # regular expression for a class defined in the file:
            # matches a line that starts with "class", then captures the name
            # up to the first ( or :, whichever comes first
            match = re.search(r"^class\s+(.+?)\s*[(:]", line)
            if match:
                # add the cleaned match to the list if there is one
                defined_class_name = match.group(1).strip()
                defined_class_names.append(defined_class_name)
    return defined_class_names
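If a regular expression feels too fragile, the standard library's ast module can parse the source without executing it. A minimal sketch follows; get_class_names and the sample source are illustrative, and only top-level classes are reported.

```python
import ast


def get_class_names(source):
    """Return the names of top-level classes in Python source, without running it."""
    tree = ast.parse(source)
    return [node.name for node in tree.body if isinstance(node, ast.ClassDef)]


sample = (
    "class Article:\n"
    "    pass\n"
    "\n"
    "class NYT(Article):\n"
    "    pass\n"
    "\n"
    "classify = lambda text: text\n"
)
names = get_class_names(sample)
print(names)
```

To use it on a file, read the file's text and pass it in; nested classes could be found by iterating over ast.walk(tree) instead of tree.body.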
To complete @Metal3d's answer: yes, you can use testpkg.__loader__.get_resource_reader().contents() to list the "module resources", but it will work only if you imported your package in the "normal" way and your loader is a _frozen_importlib_external.SourceFileLoader object.
But if you imported your library with zipimport (ex: to load your package in memory), your loader will be a zipimporter object, and its get_resource_reader function is different from importlib; it will require a "fullname" argument.
To make it work in these two loaders, just specify your package name in argument to get_resource_reader :
# An example with CrackMapExec tool
import importlib
import cme.protocols as cme_protocols
class ProtocolLoader:
    def get_protocols(self):
        protocols = {}
        protocols_names = [x for x in cme_protocols.__loader__.get_resource_reader("cme.protocols").contents()]
        for prot_name in protocols_names:
            prot = importlib.import_module(f"cme.protocols.{prot_name}")
            protocols[prot_name] = prot
        return protocols
def package_contents(package_name):
    package = __import__(package_name)
    return [module_name for module_name in dir(package) if not module_name.startswith("__")]
I have a few Munin plugins which report stats from an Autonomy database. They all use a small library which scrapes the XML status output for the relevant numbers.
I'm trying to bundle the library and plugins into a Puppet-installable RPM. The actual RPM-building should be straightforward; once I have a distutils-produced distfile I can make it into an RPM based on a .spec file pinched from the Dag or EPEL repos [1]. It's the distutils bit I'm unsure of -- in fact I'm not even sure my library is correctly written for packaging. Here's how it works:
idol7stats.py:
import datetime
import os
import stat
import sys
import time
import urllib
import xml.sax
class IDOL7Stats:
    cache_dir = '/tmp'
    def __init__(self, host, port):
        self.host = host
        self.port = port
    # ...
    def collect(self):
        self.data = self.__parseXML(self.__getXML())
    def total_slots(self):
        return self.data['Service:Documents:TotalSlots']
Plugin code:
from idol7stats import IDOL7Stats
a = IDOL7Stats('db.example.com', 23113)
a.collect()
print a.total_slots()
I guess I want idol7stats.py to wind up in /usr/lib/python2.4/site-packages/idol7stats, or something else in Python's search path. What distutils magic do I need? This:
from distutils.core import setup
setup(name = 'idol7stats',
      author = 'Me',
      author_email = 'me@example.com',
      version = '0.1',
      py_modules = ['idol7stats'])
almost works, except the code goes in /usr/lib/python2.4/site-packages/idol7stats.py, not a subdirectory. I expect this is down to my not understanding the difference between modules/packages/other containers in Python.
So, what's the rub?
[1] Yeah, I could just plonk the library in /usr/lib/python2.4/site-packages using RPM but I want to know how to package Python code.
You need to create a package to do what you want. You'd need a directory named idol7stats containing a file called __init__.py and any other library modules to package. Also, this will affect your scripts' imports; if you put idol7stats.py in a package called idol7stats, then your scripts need to "import idol7stats.idol7stats".
To avoid that, you could just rename idol7stats.py to idol7stats/__init__.py, or you could put this line into idol7stats/__init__.py to "massage" the imports into the way you expect them:
from idol7stats.idol7stats import *
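To see the effect of that line, the sketch below builds the package layout in a temporary directory and checks that the plugin-style import still works. The class body is a stub standing in for the real library. In setup.py, such a layout would be declared with packages = ['idol7stats'] in place of py_modules.

```python
import importlib
import os
import sys
import tempfile

# Build idol7stats/ with the module and a re-exporting __init__.py.
root = tempfile.mkdtemp()
pkg_dir = os.path.join(root, "idol7stats")
os.makedirs(pkg_dir)
with open(os.path.join(pkg_dir, "idol7stats.py"), "w") as handle:
    # Stub standing in for the real library code.
    handle.write("class IDOL7Stats(object):\n    cache_dir = '/tmp'\n")
with open(os.path.join(pkg_dir, "__init__.py"), "w") as handle:
    handle.write("from idol7stats.idol7stats import *\n")

# Make the temporary directory importable, as site-packages would be.
sys.path.insert(0, root)
importlib.invalidate_caches()

# The plugins' original import keeps working against the package.
from idol7stats import IDOL7Stats

print(IDOL7Stats.cache_dir)
```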