How to access a docstring from a separate script? - python

I'm building a GUI that lets users select the Python scripts they want to run. Each script has its own docstring explaining its inputs and outputs. I want to display that information in the UI once a script is highlighted but not yet selected to run, and I can't seem to get access to the docstrings from the base program.
ex.
test.py
"""this is a docstring"""
print('hello world')
program.py
index is test.py for this example, but is normally not known because it's whatever the user has selected in the GUI.
# index is test.py
def on_selected(self, index):
    script_path = self.tree_view_model.filePath(index)
    fparse = ast.parse(''.join(open(script_path)))
    self.textBrowser_description.setPlainText(ast.get_docstring(fparse))

Let's say the docstring you want to access belongs to the file file.py.
You can get the docstring by doing the following:
import file
print(file.__doc__)
If you want to get the docstring before you import it, you could read the file and extract the docstring. Here is an example:
import re
def get_docstring(file):
    with open(file, "r") as f:
        content = f.read()  # read file
    quote = content[0]  # get type of quote
    pattern = re.compile(rf"^{quote}{quote}{quote}[^{quote}]*{quote}{quote}{quote}")  # create docstring pattern
    return re.findall(pattern, content)[0][3:-3]  # return docstring without quotes
print(get_docstring("file.py"))
Note: For this regex to work, the docstring needs to be at the very top of the file and must not contain the quote character it is delimited with.

Here's how to get it via importlib. Most of the logic has been put in a function. Note that using importlib does import the script (which causes all its top-level statements to be executed), but the module itself is discarded when the function returns.
If this was the script docstring_test.py in the current directory that I wanted to get the docstring from:
""" this is a multiline
docstring.
"""
print('hello world')
Here's how to do it:
import importlib.util
def get_docstring(script_name, script_path):
    spec = importlib.util.spec_from_file_location(script_name, script_path)
    foo = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(foo)
    return foo.__doc__
if __name__ == '__main__':
    print(get_docstring('docstring_test', "./docstring_test.py"))
Output:
hello world
this is a multiline
docstring.
Update:
Here's how to do it by letting the ast module in the standard library do the parsing which avoids both importing/executing the script as well as trying to parse it yourself with a regex.
This looks more-or-less equivalent to what's in your question, so it's unclear why what you have isn't working for you.
import ast
def get_docstring(script_path):
    with open(script_path, 'r') as file:
        tree = ast.parse(file.read())
        return ast.get_docstring(tree, clean=False)
if __name__ == '__main__':
    print(repr(get_docstring('./docstring_test.py')))
Output:
' this is a multiline\n docstring.\n'
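For the GUI case in the question, a script the user highlights may have no docstring at all, or may not even parse. The following is a minimal sketch of a defensive wrapper around the ast approach above. The helper name read_docstring and the fallback text are assumptions for illustration, and the temporary files only stand in for the user's scripts.

```python
import ast
import os
import tempfile


def read_docstring(script_path, fallback='(no description available)'):
    """Return the module docstring of script_path without executing it."""
    try:
        with open(script_path, 'r', encoding='utf-8') as handle:
            tree = ast.parse(handle.read(), filename=script_path)
    except (OSError, SyntaxError, ValueError):
        # unreadable file, invalid Python, or undecodable/invalid source text
        return fallback
    # ast.get_docstring returns None when the module has no docstring
    return ast.get_docstring(tree) or fallback


# Demo with two throwaway scripts: one valid, one with a syntax error.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, 'good.py')
    with open(good, 'w', encoding='utf-8') as handle:
        handle.write('"""this is a docstring"""\nprint("hello world")\n')
    bad = os.path.join(tmp, 'bad.py')
    with open(bad, 'w', encoding='utf-8') as handle:
        handle.write('def broken(:\n')
    good_doc = read_docstring(good)
    bad_doc = read_docstring(bad)

print(good_doc)
print(bad_doc)
```

In a handler like on_selected, the returned string could be passed straight to setPlainText, so a broken script shows the fallback text instead of raising inside the UI callback.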

Related

Creating a class object to be used globally in Python

I wrote the module below to standardize how my logfiles are written and to make it easy to change whether events get printed/written to the logfile or not.
FILE: Logging.py
================
import os
import datetime
import io
class Logfile():
    def __init__(self,name):
        self.logFile = os.getcwd() + r'\.Log\\' + name + '_' + str(datetime.date.today().year) + ('00' + str(datetime.date.today().month))[-2:] + '.log'
        self.printLog = False
        self.debug = False
        # Setup logFile and consolidated Folder
        if not os.path.exists(os.path.dirname(self.logFile)):
            os.mkdir(os.path.dirname(self.logFile))
        #Check if logfile exists.
        if not os.path.exists(self.logFile):
            with open(self.logFile, 'w') as l:
                pass
    # Write LogFile Entry
    def logEvent(self, eventText, debugOnly): # Function to add an event to the logfile
        # If this is marked as debugging only AND debugging is off
        if debugOnly == True and self.debug == False:
            return
        if self.printLog == True:
            print(datetime.datetime.strftime(datetime.datetime.now(), '%m/%d/%Y, %I:%M:%S %p, ') + str(eventText))
        with open(self.logFile, 'a') as l:
            l.seek(0)
            l.write(datetime.datetime.strftime(datetime.datetime.now(), '%m/%d/%Y, %I:%M:%S %p, ') + str(eventText) + '\n')
        return
This is very handy, but I am having trouble understanding how to make this available to all of my classes. For example, if I import the following module, I am not sure how to use the logfile I created within my main script.
FILE: HelloWorld.py
===================
class HelloWorld():
    def __init__(self):
        log.logEvent('You have created a HelloWorld Object!', False)
Main Script Here:
import Logging
from HelloWorld import HelloWorld
log = logging.Logfile
hw = HelloWorld()
^^ Will fail because it does not know log is a thing. What is the proper way to handle these sort of situations?
I believe you're trying to do something like this. (As a side note, you may want to look into using Python's standard logging module.)
FILE: HelloWorld.py
===================
# import Logfile (the class name as defined in Logging.py)
from .Logging import Logfile
# create new Logfile instance
log = Logfile(name='log name')
class HelloWorld():
    def __init__(self):
        # call logEvent method on your Logfile instance
        log.logEvent('You have created a HelloWorld Object!', False)
FILE: Main.py
===================
# import HelloWorld
from .HelloWorld import HelloWorld
# create new HelloWorld instance
hw = HelloWorld()
Also, to create a package (which the relative imports above require), you will need to add an __init__.py file in that directory.
This problem is easily solved by using the standard library's logging module. As for the broader question of how to use a thing (log) within all of your modules, I assume the answer can be found by reading through the code in the logging module and mimicking it.
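As a rough illustration of the logging suggestion: logging.getLogger(name) returns the same logger object for the same name wherever it is called, so no global variable has to be passed between modules. The sketch below is one possible arrangement, not the only one. The logger name 'app' and the helper build_logger are made up for the example, and an in-memory stream stands in for the log file (a logging.FileHandler would be the file-based equivalent).

```python
import io
import logging


def build_logger(name, stream, debug=False):
    """Configure a named logger once, typically in the main script."""
    logger = logging.getLogger(name)
    # DEBUG records are dropped unless debug is switched on
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    handler = logging.StreamHandler(stream)
    # same timestamp layout as the Logfile class in the question
    handler.setFormatter(logging.Formatter('%(asctime)s, %(message)s',
                                           datefmt='%m/%d/%Y, %I:%M:%S %p'))
    logger.addHandler(handler)
    return logger


class HelloWorld(object):
    """Stands in for a class living in a separate module."""
    def __init__(self):
        # look the logger up by name; nothing needs to be imported from main
        logging.getLogger('app').info('You have created a HelloWorld Object!')


stream = io.StringIO()
build_logger('app', stream)
hw = HelloWorld()
logging.getLogger('app').debug('only written when debug=True')
output = stream.getvalue()
print(output)
```

The debugOnly flag from the question maps onto log levels here: debug() calls are filtered out unless the logger level is set to DEBUG.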

how to import function in pytest

I am trying to create a pytest for a Python 2.x script executable with dashes included in its name. I tried to import it the usual way but I can't figure out how to make it work with the dashes.
My project structure is as follows:
package
-- tests
-- bin
-- subpackage
-- ...py
Specifically, I need to test a function called master_disaster() which exists inside bin/let-me-out (yes with -). let-me-out is an executable .py file and my folder has no setup.py file or anything similar.
How can I import this function inside my test? My test is going to be a simple fixture that checks the time with:
@pytest.fixture
def now():
    return timezone.now()
It then uses the now() function to create a new file which let-me-out will delete after a specific amount of time.
First of all, the dashes make let-me-out an invalid identifier in Python. To work around this, you have to use the imp (Python 2.7) or importlib (Python 3.5+) machinery.
Python 3.5+
Here is an example of importing a new module having a qualified name let_me_out, but using bin/let-me-out as source file:
# import the submodules explicitly; a bare "import importlib" does not guarantee they are loaded
import importlib.machinery
import importlib.util
def test_master_disaster():
    loader = importlib.machinery.SourceFileLoader('let_me_out', 'bin/let-me-out')
    spec = importlib.util.spec_from_loader(loader.name, loader)
    let_me_out = importlib.util.module_from_spec(spec)
    loader.exec_module(let_me_out)
    # this is only a stub, to show an example of calling the master_disaster function
    assert let_me_out.master_disaster() == 'spam'
You can extract this code into a fixture to make it reusable:
# import the submodules explicitly; a bare "import importlib" does not guarantee they are loaded
import importlib.machinery
import importlib.util
import pytest
@pytest.fixture(scope='session')
def let_me_out():
    loader = importlib.machinery.SourceFileLoader('let_me_out', 'bin/let-me-out')
    spec = importlib.util.spec_from_loader(loader.name, loader)
    let_me_out = importlib.util.module_from_spec(spec)
    loader.exec_module(let_me_out)
    return let_me_out
def test_master_disaster(let_me_out):
    assert let_me_out.master_disaster() == 'spam'
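To try the loading technique outside the project, here is a self-contained sketch that writes a throwaway let-me-out file into a temporary directory and loads it. The helper name load_script and the body of master_disaster are placeholders; the real bin/let-me-out is not reproduced here.

```python
import importlib.machinery
import importlib.util
import os
import tempfile


def load_script(module_name, script_path):
    """Load a Python source file as a module, whatever the file is called."""
    loader = importlib.machinery.SourceFileLoader(module_name, script_path)
    spec = importlib.util.spec_from_loader(loader.name, loader)
    module = importlib.util.module_from_spec(spec)
    loader.exec_module(module)  # runs the file's top-level code
    return module


with tempfile.TemporaryDirectory() as tmp:
    # a stand-in script with dashes in its name and no .py extension
    script_path = os.path.join(tmp, 'let-me-out')
    with open(script_path, 'w') as handle:
        handle.write("def master_disaster():\n    return 'spam'\n")
    let_me_out = load_script('let_me_out', script_path)
    result = let_me_out.master_disaster()

print(result)
```

Because exec_module runs the file's top-level code, the executable should keep its side effects behind an if __name__ == '__main__': guard, otherwise they fire during the test import.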
Python 2.7
Things are even easier with Python 2.7:
import imp
import pytest
@pytest.fixture(scope='session')
def let_me_out():
    return imp.load_source('let_me_out', 'bin/let-me-out')
def test_master_disaster(let_me_out):
    assert let_me_out.master_disaster() == 'spam'

how to preserve module path of a module executed as a script

I have a function called get_full_class_name(instance), which returns the full module-qualified class name of instance.
Example my_utils.py:
def get_full_class_name(instance):
    return '.'.join([instance.__class__.__module__,
                     instance.__class__.__name__])
Unfortunately, this function fails when given a class that's defined in a currently running script.
Example my_module.py:
#! /usr/bin/env python
from my_utils import get_full_class_name
class MyClass(object):
    pass
def main():
    print(get_full_class_name(MyClass()))
if __name__ == '__main__':
    main()
When I run the above script, instead of printing my_module.MyClass, it prints __main__.MyClass:
$ ./my_module.py
__main__.MyClass
I do get the desired behavior if I run the above main() from another script.
Example run_my_module.py:
#! /usr/bin/env python
from my_module import main
if __name__ == '__main__':
    main()
Running the above script gets:
$ ./run_my_module.py
my_module.MyClass
Is there a way I could write the get_full_class_name() function such that it always returns my_module.MyClass regardless of whether my_module is being run as a script?
I propose handling the case __name__ == '__main__' using the techniques discussed in Find Path to File Being Run. This results in this new my_utils:
import sys
import os.path
def get_full_class_name(instance):
    if instance.__class__.__module__ == '__main__':
        # use the script's file name without its .py extension as the module name
        script_name = os.path.splitext(os.path.basename(sys.argv[0]))[0]
        return '.'.join([script_name,
                         instance.__class__.__name__])
    else:
        return '.'.join([instance.__class__.__module__,
                         instance.__class__.__name__])
This does not handle interactive sessions and other special cases (like reading from stdin). For those you may have to include techniques like the ones discussed in detect python running interactively.
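One way to soften the interactive-session problem is to read the script path from the __main__ module's __file__ attribute, which is normally absent in an interactive session, and to fall back to '__main__' when it is missing. This is only a sketch under that assumption. The main_path parameter is an addition for illustration so the function can be exercised without actually running as a script.

```python
import os
import sys


def get_full_class_name(instance, main_path=None):
    """Return 'module.ClassName', replacing '__main__' with the script name."""
    cls = type(instance)
    module_name = cls.__module__
    if module_name == '__main__':
        if main_path is None:
            # interactive sessions usually have no __file__ on __main__
            main_path = getattr(sys.modules.get('__main__'), '__file__', None)
        if main_path:
            # strip directory and extension: './my_module.py' -> 'my_module'
            module_name = os.path.splitext(os.path.basename(main_path))[0]
    return '.'.join([module_name, cls.__name__])


class MyClass(object):
    pass


# simulate a class defined in the currently running script
MyClass.__module__ = '__main__'
print(get_full_class_name(MyClass(), main_path='./my_module.py'))  # my_module.MyClass
```

Like the version above, this only yields the bare file name, so a script that lives inside a package still loses its package prefix.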
Following mkiever's answer, I ended up changing get_full_class_name() to what you see below.
If instance.__class__.__module__ is __main__, it doesn't use that as the module path. Instead, it uses the relative path from sys.argv[0] to the closest directory in sys.path.
One problem is that sys.path always includes the directory of sys.argv[0] itself, so this relative path ends up being just the filename part of sys.argv[0]. As a quick hack-around, the code below assumes that the sys.argv[0] directory is always the first element of sys.path, and disregards it. This seems unsafe, but safer options are too tedious for my personal code for now.
Any better solutions/suggestions would be greatly appreciated.
import os
import sys
from nose.tools import assert_equal, assert_not_equal
def get_full_class_name(instance):
    '''
    Returns the fully-qualified class name.
    Handles the case where a class is declared in the currently-running script
    (where instance.__class__.__module__ would be set to '__main__').
    '''
    def get_module_name(instance):
        def get_path_relative_to_python_path(path):
            path = os.path.abspath(path)
            python_paths = [os.path.abspath(p) for p in sys.path]
            assert_equal(python_paths[0],
                         os.path.split(os.path.abspath(sys.argv[0]))[0])
            python_paths = python_paths[1:]
            min_relpath_length = len(path)
            result = None
            for python_path in python_paths:
                relpath = os.path.relpath(path, python_path)
                if len(relpath) < min_relpath_length:
                    min_relpath_length = len(relpath)
                    result = os.path.join(os.path.split(python_path)[-1],
                                          relpath)
            if result is None:
                raise ValueError("Path {} doesn't seem to be in the "
                                 "PYTHONPATH.".format(path))
            else:
                return result
        if instance.__class__.__module__ == '__main__':
            script_path = os.path.abspath(sys.argv[0])
            relative_path = get_path_relative_to_python_path(script_path)
            relative_path = relative_path.split(os.sep)
            assert_not_equal(relative_path[0], '')
            assert_equal(os.path.splitext(relative_path[-1])[1], '.py')
            return '.'.join(relative_path[1:-1])
        else:
            return instance.__class__.__module__
    module_name = get_module_name(instance)
    return '.'.join([module_name, instance.__class__.__name__])

Python merge .py part-files into one .py file

I am doing browser automation using Python + splinter.
My structure is like this:
[root]
+--start.py
+--end.py
+--[module1]
| +--mod11area1.py
| +--mod12area2.py
| +--[module1_2]
| | +--mod121area1.py
| +--[module1_3]
| +--mod131area1.py
+--[module2]
+--mod21area1.py
start.py handles the initialization and opens the browser, and the inner module .py files perform the actions for each module.
This structure is then merged into one script at execution time by appending the contents in this fashion:
start.py
mod11area1.py
mod12area2.py
mod121area1.py
mod131area1.py
mod21area1.py
end.py
My question is: is there a better way of doing this? I'm quite new to this and usually just create a single script. Since my project keeps expanding, I had to bring in several other people to script with me, hence this approach.
No, Python has no simple way to merge scripts into one .py file.
But you can fake it, albeit in a fairly limited way.
Here's an example of how you can define multiple modules (each with its own namespace) in a single file.
It has the following limitations:
No package support (although this could be made to work).
No support for modules depending on each other (a module can't be imported unless it's already defined).
Example - 2 modules, each containing a function:
# Fake multiple modules in a single file.
import sys
_globals_init = None # include ourself in namespace
_globals_init = set(globals().keys())
# ------------------------
# ---- begin
__name__ = "test_module_1"
__doc__ = "hello world"
def test2():
    print(123)
sys.modules[__name__] = type(sys)(__name__, __doc__)
sys.modules[__name__].__dict__.update(globals())
[globals().__delitem__(k) for k in list(globals().keys()) if k not in _globals_init]
# ---- end ------------
# ---------------------
# ---- begin
__name__ = "some_other"
__doc__ = "testing 123"
def test1():
    print(321)
sys.modules[__name__] = type(sys)(__name__, __doc__)
sys.modules[__name__].__dict__.update(globals())
[globals().__delitem__(k) for k in list(globals().keys()) if k not in _globals_init]
# ---- end ------------
# ----------------
# ---- example use
import test_module_1
test_module_1.test2()
import some_other
some_other.test1()
# this will fail (as it should)
test1()
Note, this isn't good practice; if you have this problem, you're probably better off with an alternative solution (such as using https://docs.python.org/3/library/zipimport.html).
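The mechanism the example relies on is that the import statement checks sys.modules before searching the filesystem, so any module object registered there becomes importable. A stripped-down sketch of just that step follows. The names register_module and fake_greetings are invented for the illustration.

```python
import sys
import types


def register_module(name, namespace, doc=None):
    """Build a module object from a dict and make `import name` find it."""
    module = types.ModuleType(name, doc)
    module.__dict__.update(namespace)
    sys.modules[name] = module
    return module


def _build_greetings():
    # everything defined here becomes an attribute of the fake module
    def hello(who):
        return 'hello ' + who
    return {'hello': hello}


register_module('fake_greetings', _build_greetings(), doc='demo module')

# resolved from sys.modules; no fake_greetings.py exists on disk
import fake_greetings
print(fake_greetings.hello('world'))  # hello world
```

Building the namespace inside a function keeps the fake module's names out of the enclosing file's globals, which avoids the delete-from-globals step used above.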
See my GitHub project.
There is likely a better way for your needs. I developed this project/hack for programming contests which only allow the contestant to submit a single .py file. This allows one to develop a project with multiple .py files and then combine them into one .py file at the end.
My hack is a decorator @modulize which converts a function into a module. This module can then be imported as usual. Here is an example.
@modulize('my_module')
def my_dummy_function(__name__): # the function takes one parameter __name__
    # put module code here
    def my_function(s):
        print(s, 'bar')
    # the function must return locals()
    return locals()
# import the module as usual
from my_module import my_function
my_function('foo') # foo bar
I also have a script which can combine a project of many .py files which import each other into one '.py' file.
For example, assume I had the following directory structure and files:
my_dir/
    __main__.py
        import foo.bar
        fb = foo.bar.bar_func(foo.foo_var)
        print(fb) # foo bar
    foo/
        __init__.py
            foo_var = 'foo'
        bar.py
            def bar_func(x):
                return x + ' bar'
The combined file will look as follows. The code on the top defines the @modulize decorator.
import sys
from types import ModuleType
class MockModule(ModuleType):
    def __init__(self, module_name, module_doc=None):
        ModuleType.__init__(self, module_name, module_doc)
        if '.' in module_name:
            package, module = module_name.rsplit('.', 1)
            get_mock_module(package).__path__ = []
            setattr(get_mock_module(package), module, self)
    def _initialize_(self, module_code):
        self.__dict__.update(module_code(self.__name__))
        self.__doc__ = module_code.__doc__
def get_mock_module(module_name):
    if module_name not in sys.modules:
        sys.modules[module_name] = MockModule(module_name)
    return sys.modules[module_name]
def modulize(module_name, dependencies=[]):
    for d in dependencies: get_mock_module(d)
    return get_mock_module(module_name)._initialize_
##===========================================================================##
@modulize('foo')
def _foo(__name__):
##----- Begin foo/__init__.py ------------------------------------------------##
    foo_var = 'foo'
##----- End foo/__init__.py --------------------------------------------------##
    return locals()
@modulize('foo.bar')
def _bar(__name__):
##----- Begin foo/bar.py -----------------------------------------------------##
    def bar_func(x):
        return x + ' bar'
##----- End foo/bar.py -------------------------------------------------------##
    return locals()
def __main__():
##----- Begin __main__.py ----------------------------------------------------##
    import foo.bar
    fb = foo.bar.bar_func(foo.foo_var)
    print(fb) # foo bar
##----- End __main__.py ------------------------------------------------------##
__main__()
Instead of appending the contents into a single *.py file, why not just import what you need from the code that the other people in your team write?

Passing command line argument to another file imported in Python

I have a Python file (html2text.py) which gives the desired result when I pass a command line argument to it, i.e., in the following way:
python html2text.py file.txt
where file.txt contains the source code of a web-site and the result is displayed on the console...
I want to use it in another file (let's say a.py) and store the result (which was getting printed on the console) in a string.
For this I first need to import the file (html2text.py) into my file (a.py). Can anyone tell me how to proceed from there?
A good way is to expose an API in your html2text.py. For example:
# html2text.py
def parse(filename):
    f = open(filename)
    # do the stuff
    return output_string
def main():
    import sys
    print(parse(sys.argv[1]))
if __name__ == '__main__':
    main()
Then you will be able to use it in your a.py:
import html2text # main() will not run
import sys
output = html2text.parse(sys.argv[1])
I think the best way is to reorganize your html2text.py file a little. Append lines like these to your file:
import sys
def main():
    message = sys.stdin.readlines()
    a = your_def(message)
if __name__ == '__main__':
    main()
Now you can be sure that when the file is invoked from the command line, everything will work as before. Moreover, if you keep everything in functions and classes, you can now write the following in your a.py
import html2text
and work with its functions directly from a.py.
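If the imported code can only print its result and cannot easily be changed to return it, the printed text can still be collected into a string. The sketch below assumes Python 3.4 or later, where contextlib.redirect_stdout is available. The function noisy_main is a hypothetical stand-in for an entry point such as html2text's main, not its real API.

```python
import contextlib
import io


def noisy_main(argv):
    """Hypothetical entry point that only prints its result."""
    print('converted:', argv[0])


def capture_output(func, *args):
    """Call func(*args) and return everything it printed as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args)
    return buffer.getvalue()


text = capture_output(noisy_main, ['file.txt'])
print(repr(text))  # 'converted: file.txt\n'
```

Returning the string from a function, as in the answers above, remains the cleaner design; capturing stdout is a fallback for code you cannot modify.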
