Passing command line argument to another file imported in Python - python

I have a Python file (html2text.py) that gives the desired result when I pass a command-line argument to it, like this:
python html2text.py file.txt
where file.txt contains the source code of a website, and the result is displayed on the console.
I want to use it in another file (say, a.py) and store the result (which was being printed on the console) in a string.
For this I first need to import html2text.py in my file (a.py). Can anyone tell me how to proceed from there?

A good way is to create a small API in your html2text.py. For example:
# html2text.py
import sys
def parse(filename):
    with open(filename) as f:
        source = f.read()
    # do the stuff: build output_string from source
    return output_string
def main():
    print(parse(sys.argv[1]))
if __name__ == '__main__':
    main()
Then you will be able to use it in your a.py:
import html2text # main() will not run
import sys
output = html2text.parse(sys.argv[1])

I think the best way is to reorganize your html2text.py file a little. Append lines like these to your file:
import sys  # if not already imported at the top
def main():
    message = sys.stdin.readlines()
    a = your_def(message)
if __name__ == '__main__':
    main()
Now you can be sure that everything still works when the file is invoked from the command line. Moreover, if everything is kept in functions and classes, you can now write in your a.py
import html2text
and work with those functions directly in a.py.
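If refactoring html2text.py so that it returns a string is not practical, one alternative is to capture whatever a function prints. The sketch below assumes Python 3.4 or later (for contextlib.redirect_stdout); capture_stdout and legacy_print_result are hypothetical names used only for illustration, with legacy_print_result standing in for a function that prints its result instead of returning it.

```python
import contextlib
import io


def capture_stdout(func, *args, **kwargs):
    """Call func and return everything it printed to stdout as one string."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return buffer.getvalue()


def legacy_print_result(text):
    # Hypothetical stand-in for a function that only prints its result.
    print(text.upper())


captured = capture_stdout(legacy_print_result, "hello")
print(repr(captured))
```

Returning the string from a function, as in the answers above, is usually simpler; redirecting stdout is mainly useful when the printing code cannot be changed.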

Related

How do I open another Python file within a file with a parameter/argument/variable attached?

I want to run another file from my main.py, set the value of a variable in main.py, and "export" it so that I can use it in the other file.
But I have no idea how to do that.
For example, I'd like to set 'var = x[3]' in the main file and "export" the variable into the file I am opening, something like 'import otherfile.py(var = x[3])'.
Thanks in advance.
To use a variable from the main file in otherfile, you can write the following in otherfile.py:
from main import *
Y = var
That is how you can set var in main.py and import or use it in otherfile.py.
You can use sys.argv to get the arguments if you run the file from the command line.
So if you have the file:
import sys
print(sys.argv[1])  # sys.argv[0] is the script name; arguments start at index 1
then open a command line and type python file.py hello, it will print 'hello'.
You can also use os.system() to run command-line commands from Python,
so your first file could be:
import os
var = 'test'
os.system(f"python otherfile.py {var}")
and otherfile.py could be:
import sys
print(sys.argv[1])
Then if you run the first file, otherfile.py runs as a separate Python process and prints 'test'.
Hope this is what you were looking for.
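As a variation on the os.system() approach, the subprocess module can pass the value as a separate list element (so no shell quoting is involved) and hand the child's output back as a string. This is only a sketch, assuming Python 3.7 or later for capture_output; run_with_argument is a hypothetical helper, and the inline -c script stands in for otherfile.py so that the example is self-contained.

```python
import subprocess
import sys


def run_with_argument(script_args, value):
    """Run a Python child process, pass value as an argument, return its stdout."""
    completed = subprocess.run(
        [sys.executable, *script_args, value],  # each list item is one argument
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child exits with an error
    )
    return completed.stdout


# Stand-in for otherfile.py: an inline script that prints its first argument.
inline_script = ["-c", "import sys; print(sys.argv[1])"]
result = run_with_argument(inline_script, "test")
print(result.strip())
```

With a real file, script_args would be something like ["otherfile.py"] instead of the inline script.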

How to access a docstring from a separate script?

I'm building a GUI that lets users select the Python scripts they want to run. Each script has its own docstring explaining its inputs and outputs. I want to display that information in the UI once a script is highlighted but not yet selected to run, and I can't seem to get access to the docstrings from the base program.
Example:
test.py
"""this is a docstring"""
print('hello world')
program.py
Here index refers to test.py, but it is normally not known in advance because it is whatever the user has selected in the GUI.
# index is test.py
def on_selected(self, index):
    script_path = self.tree_view_model.filePath(index)
    fparse = ast.parse(''.join(open(script_path)))
    self.textBrowser_description.setPlainText(ast.get_docstring(fparse))
Let's say the docstring you want to access belongs to the file file.py.
You can get the docstring by doing the following:
import file
print(file.__doc__)
If you want to get the docstring without importing the file, you could read the file and extract the docstring. Here is an example:
import re
def get_docstring(file):
    with open(file, "r") as f:
        content = f.read()  # read file
    quote = content[0]  # get type of quote
    pattern = re.compile(rf"^{quote}{quote}{quote}[^{quote}]*{quote}{quote}{quote}")  # create docstring pattern
    return re.findall(pattern, content)[0][3:-3]  # return docstring without quotes
print(get_docstring("file.py"))
Note: For this regex to work, the docstring needs to be at the very top of the file.
Here's how to get it via importlib. Most of the logic has been put in a function. Note that using importlib does import the script (which causes all its top-level statements to be executed), but the module itself is discarded when the function returns.
If this was the script docstring_test.py in the current directory that I wanted to get the docstring from:
""" this is a multiline
docstring.
"""
print('hello world')
Here's how to do it:
import importlib.util
def get_docstring(script_name, script_path):
    spec = importlib.util.spec_from_file_location(script_name, script_path)
    foo = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(foo)
    return foo.__doc__
if __name__ == '__main__':
    print(get_docstring('docstring_test', "./docstring_test.py"))
Output:
hello world
this is a multiline
docstring.
Update:
Here's how to do it by letting the ast module in the standard library do the parsing, which avoids both importing/executing the script and trying to parse it yourself with a regex.
This looks more-or-less equivalent to what's in your question, so it's unclear why what you have isn't working for you.
import ast
def get_docstring(script_path):
    with open(script_path, 'r') as file:
        tree = ast.parse(file.read())
    return ast.get_docstring(tree, clean=False)
if __name__ == '__main__':
    print(repr(get_docstring('./docstring_test.py')))
Output:
' this is a multiline\n docstring.\n'
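One detail that may matter for the GUI case: ast.get_docstring() returns None when a script has no docstring, so the caller may need to handle that before handing the value to a text widget. The sketch below parses source text directly rather than a file; docstring_from_source is a hypothetical helper name, and the two sample sources are made up for illustration.

```python
import ast


def docstring_from_source(source):
    """Return the module docstring found in source text, or None if absent."""
    tree = ast.parse(source)
    # clean=True is the default, so indentation in the docstring is normalized.
    return ast.get_docstring(tree)


with_doc = '"""this is a docstring"""\nprint("hello world")\n'
without_doc = 'print("hello world")\n'

print(docstring_from_source(with_doc))
print(docstring_from_source(without_doc))
```

Neither script is executed here, which is the main advantage over the import-based approaches.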

ConfigParser in python 3 vs python 2

I've been slowly making the transition from py2 -> py3 and I've run into an issue that I can't quite resolve (as trivial as I'm sure the problem is). When I execute the code below, the config file appears to have no sections :(
Where have I gone astray?
As a note, I did reuse this code from a python 2 script (replacing the old ConfigParser.SafeConfigParser with the new configparser.ConfigParser). I don't think this fact is relevant, but maybe it is? Clearly, I do not know :)
Here's the project/main.py:
import inspect
import os
import utilities.utilities
def main():
    config_ini_path = os.path.abspath(inspect.getfile(inspect.currentframe()).split('.py')[0]) + '_config.ini'
    print(config_ini_path)
    config = utilities.utilities.get_config(config_ini_path)
    print(config.sections())
if __name__ == "__main__":
    main()
Here's the project/utilities/utilities.py:
import os
import configparser
import inspect
import sys
def get_config(config_file_path=os.path.abspath(inspect.getfile(inspect.currentframe()).split('.py')[0]) + '_config.ini'):
    parser = configparser.ConfigParser()
    if os.path.exists(config_file_path):
        with open(config_file_path, 'r') as config_file:
            parser.read(config_file)
        return parser
    else:
        print('FAILED TO GET CONFIG')
        sys.exit()
def set_config(parser, config_file_path):
    if os.path.exists(config_file_path):
        with open(config_file_path, 'w') as config_file:
            parser.write(config_file)
    else:
        print('FAILED TO SET CONFIG')
        sys.exit()
And finally, here is the project/project_config.ini:
[logging]
json_config_path = /project/logging.json
Interestingly, if I add
config['logging'] = {'json_config_path':'project/other.json'}
utilities.utilities.set_config(config, config_ini_path)
print(config.sections())
The change will be written to the file, however, upon re-execution, it will not be recalled (as witnessed by .sections()).
I'm sure I am missing something simple! What gives?
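It turns out .read() accepts filenames, while .read_file() accepts file objects. Passing an open file object to .read() raises no error, because .read() silently skips any name it cannot open, so the parser simply ends up empty. Originally (in Python 2) I was using .readfp(), but .read_file() has replaced it in Python 3. Silly, silly me.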
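To make the difference concrete, here is a minimal sketch that loads the same file both ways. It writes a throwaway copy of the config shown in the question to a temporary directory; load_sections is a hypothetical helper name.

```python
import configparser
import os
import tempfile


def load_sections(path):
    """Return the section names found via read(path) and via read_file(file)."""
    by_name = configparser.ConfigParser()
    by_name.read(path)  # read() takes a filename (or a list of filenames)

    by_file = configparser.ConfigParser()
    with open(path, 'r') as config_file:
        by_file.read_file(config_file)  # read_file() takes an open file object
    return by_name.sections(), by_file.sections()


# Demo with a throwaway config file.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'project_config.ini')
    with open(demo_path, 'w') as f:
        f.write('[logging]\njson_config_path = /project/logging.json\n')
    sections_by_name, sections_by_file = load_sections(demo_path)

print(sections_by_name, sections_by_file)
```

In get_config() from the question, either form would work: parser.read(config_file_path) without opening the file, or parser.read_file(config_file) inside the with block.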

how to preserve module path of a module executed as a script

I have a function called get_full_class_name(instance), which returns the full module-qualified class name of instance.
Example my_utils.py:
def get_full_class_name(instance):
    return '.'.join([instance.__class__.__module__,
                     instance.__class__.__name__])
Unfortunately, this function fails when given a class that's defined in a currently running script.
Example my_module.py:
#! /usr/bin/env python
from my_utils import get_full_class_name
class MyClass(object):
    pass
def main():
    print(get_full_class_name(MyClass()))
if __name__ == '__main__':
    main()
When I run the above script, instead of printing my_module.MyClass, it prints __main__.MyClass:
$ ./my_module.py
__main__.MyClass
I do get the desired behavior if I run the above main() from another script.
Example run_my_module.py:
#! /usr/bin/env python
from my_module import main
if __name__ == '__main__':
    main()
Running the above script gets:
$ ./run_my_module.py
my_module.MyClass
Is there a way I could write the get_full_class_name() function such that it always returns my_module.MyClass regardless of whether my_module is being run as a script?
I propose handling the case __name__ == '__main__' using the techniques discussed in Find Path to File Being Run. This results in this new my_utils:
import sys
import os.path
def get_full_class_name(instance):
    module_name = instance.__class__.__module__
    if module_name == '__main__':
        # Running as a script: use the script's file name without its
        # '.py' extension as the module name.
        script_name = os.path.basename(sys.argv[0])
        module_name = os.path.splitext(script_name)[0]
    return '.'.join([module_name, instance.__class__.__name__])
This does not handle interactive sessions and other special cases (like reading from stdin). For those you may have to include techniques like the ones discussed in detect python running interactively.
Following mkiever's answer, I ended up changing get_full_class_name() to what you see below.
If instance.__class__.__module__ is __main__, it doesn't use that as the module path. Instead, it uses the relative path from sys.argv[0] to the closest directory in sys.path.
One problem is that sys.path always includes the directory of sys.argv[0] itself, so this relative path ends up being just the filename part of sys.argv[0]. As a quick hack-around, the code below assumes that the sys.argv[0] directory is always the first element of sys.path, and disregards it. This seems unsafe, but safer options are too tedious for my personal code for now.
Any better solutions/suggestions would be greatly appreciated.
import os
import sys
from nose.tools import assert_equal, assert_not_equal
def get_full_class_name(instance):
    '''
    Returns the fully-qualified class name.
    Handles the case where a class is declared in the currently-running script
    (where instance.__class__.__module__ would be set to '__main__').
    '''
    def get_module_name(instance):
        def get_path_relative_to_python_path(path):
            path = os.path.abspath(path)
            python_paths = [os.path.abspath(p) for p in sys.path]
            assert_equal(python_paths[0],
                         os.path.split(os.path.abspath(sys.argv[0]))[0])
            python_paths = python_paths[1:]
            min_relpath_length = len(path)
            result = None
            for python_path in python_paths:
                relpath = os.path.relpath(path, python_path)
                if len(relpath) < min_relpath_length:
                    min_relpath_length = len(relpath)
                    result = os.path.join(os.path.split(python_path)[-1],
                                          relpath)
            if result is None:
                raise ValueError("Path {} doesn't seem to be in the "
                                 "PYTHONPATH.".format(path))
            else:
                return result
        if instance.__class__.__module__ == '__main__':
            script_path = os.path.abspath(sys.argv[0])
            relative_path = get_path_relative_to_python_path(script_path)
            relative_path = relative_path.split(os.sep)
            assert_not_equal(relative_path[0], '')
            assert_equal(os.path.splitext(relative_path[-1])[1], '.py')
            return '.'.join(relative_path[1:-1])
        else:
            return instance.__class__.__module__
    module_name = get_module_name(instance)
    return '.'.join([module_name, instance.__class__.__name__])

How to know which file is calling which file, filesystem

How can I find out which file is calling which file in the filesystem? For example, file1.exe is calling file2.exe,
so file2.exe is modified,
and file1.exe is entered in a log file.
The OS is Windows.
I have searched the Internet but was not able to find any samples.
To find out which file is calling which file, you can use the trace module.
Example: suppose you have 2 files
***file1.py***
import file2
def call1():
    file2.call2()
***file2.py***
def call2():
    print("---------")
You can use it from the console:
$ python -m trace --trackcalls path/to/file1.py
or within a program using a Trace object:
***tracefile.py***
import sys
import trace
from file1 import call1
# specify what to trace here
tracer = trace.Trace(ignoredirs=[sys.prefix, sys.exec_prefix], trace=0, count=1)
tracer.runfunc(call1)  # call the function call1 in file1
results = tracer.results()
results.write_results(summary=True, coverdir='.')
