Loading a python script source by filename for testing - python

I would like to create a test for a python 3.7+ script called foo-bar (that's the file name, and it has no .py extension):
#!/usr/bin/env python
def foo(bar):
    return bar + 42
if __name__ == '__main__':
    print(foo(1))
How can I load this file by path alone, so that I can test the foo() method? The test should NOT trigger the if main condition.
UPDATE note that this is not about executing the file from the test (i.e. exec('foo-bar')), but rather loading/importing it as a module/resource, allowing the test code to execute foo() on it.

You can use the functions in importlib to load this module directly from the script file, without a .py extension.
To make this work, you need to use a loader explicitly, in this case SourceFileLoader will work.
from importlib.machinery import SourceFileLoader
foo_bar = SourceFileLoader('foo_bar', './foo-bar').load_module()
At this point, you can use the functions from inside the module:
result = foo_bar.foo(1)
assert result == 43
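Note that load_module() has long been deprecated in favour of exec_module(). A minimal sketch of the same idea with the newer API follows; the helper name load_script and the throwaway copy of the script written to a temporary directory are assumptions made only so the example runs on its own:

```python
import importlib.util
import os
import tempfile
from importlib.machinery import SourceFileLoader


def load_script(name, path):
    """Load a Python source file (with or without .py) as a module called `name`."""
    loader = SourceFileLoader(name, path)
    spec = importlib.util.spec_from_loader(name, loader)
    module = importlib.util.module_from_spec(spec)
    # The file runs with __name__ == name, so the `if __name__ == '__main__'`
    # block is skipped.
    loader.exec_module(module)
    return module


# Throwaway copy of the script from the question (illustration only).
SCRIPT = (
    "def foo(bar):\n"
    "    return bar + 42\n"
    "if __name__ == '__main__':\n"
    "    print(foo(1))\n"
)
with tempfile.TemporaryDirectory() as tmp:
    script_path = os.path.join(tmp, "foo-bar")
    with open(script_path, "w") as fh:
        fh.write(SCRIPT)
    foo_bar = load_script("foo_bar", script_path)

assert foo_bar.foo(1) == 43
```

In a real test you would pass the actual path of foo-bar instead of writing a temporary copy.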

I think you could temporarily create a copy of the file with a .py extension, import it, and delete the copy afterwards.

Related

How to use a variable from one imported package in another without running the file again

I have a flask application, and I have split the code into multiple python files, each dedicated to a specific portion of the overall program (db, login, admin, etc). Most of the files have some sort of setup code that creates their respective module's object, which must have access to the Flask object defined in my main file. Some of the modules also need access to variables in other modules, but when importing them, they are run again, even though they were imported in main already.
The following code is an example of the problem.
If I have a main.py like this
import foo
import bar
if __name__ == "__main__":
    foo.foofunc()
foo.py
import bar
#bar.barable
def foo(string):
    print(string)
and bar.py
import foo
foo.foo("hello")
def barable(func):
    def r(*args, **kwargs):
        print("this function is completely unbarable")
        func(*args, **kwargs)
    return r
This code doesn't work because foo imports bar, which imports foo, which runs bar.barable, which hasn't been defined yet.
In this situation (assuming that calling foo.foo is necessary), is my only option to extract bar.barable out of bar and into a separate module, or is there some other way to fix this?
I know that importing a module in python runs the file, but is there some way to put some of the code into the same sort of check as __name__ == "__main__" but to check if it is being imported by main and not by another module?

Import a function from a module without module's dependencies

I would like to import a function foo() from module abc.py
However, abc.py contains other functions which rely on modules which are not available for Python (i.e. I cannot import them into python interpreter, because I use ImageJ to run abc.py as Jython)
One solution I found is to put the problematic imports inside the __name__ == "__main__" check, such as:
# abc.py
def foo():
    print("Hello, World!")
def run_main_function():
    foo()
    # ...other stuff using IJ...
if __name__ == "__main__":
    from ij import IJ
    run_main_function()
So when I import foo into another script, def.py, e.g.:
# def.py
from abc import foo
def other_func():
    foo()
if __name__ == "__main__":
    other_func()
This works. But when I put imports in normal fashion, at the top of the script, I get an error: No module named 'ij'. I would like to know if there is a solution to this problem? Specifically, that I put the imports at the top of the script and then within def.py I say to import just the function, without dependencies of abc.py?
As far as I know, that's just how Python works: a module's top-level imports run whenever the module is imported. If a dependency won't always be available, put that import inside the function that uses it.
def run_main_function():
    from ij import IJ
    foo()
Also, don't use abc as a module name; it's a standard library module (Abstract Base Classes, in both Python 2.7 and 3.6).
Edit: don't use trailing .py when importing as Kind Stranger stated.
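To make the deferred-import idea concrete, here is a small sketch. It assumes, as in the question, that ij can only be imported inside ImageJ/Jython; the helper try_run_main_function is hypothetical and only shows that the import error surfaces at call time, not when the module is imported:

```python
def foo():
    return "Hello, World!"


def run_main_function():
    # Deferred import: evaluated only when this function is called,
    # so merely importing this module never touches ij.
    from ij import IJ  # importable only inside ImageJ/Jython (per the question)
    return IJ


def try_run_main_function():
    """Hypothetical helper: report whether the ij dependency is usable here."""
    try:
        run_main_function()
    except ImportError:
        return False
    return True


# foo() works regardless of whether ij is installed.
greeting = foo()
```

Another script can now import foo from this module without ij being present, because nothing at module level depends on it.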

Load source file without a module

I want to pass my program a file and get a function out of it.
For example, I have a file, foo.py, whose location is not known until run time (it will be passed to the code by the command line or something like that). It can be anywhere on my system and looks like this:
def bar():
    return "foobar"
How can I get my code to run the function bar?
If the location was known before run time I could do this:
import sys
sys.path.append("path_to_foo")
import foo
foo.bar()
I could create an __init__.py file in the folder where foo.py is and use importlib or imp, but it seems messy. I can't use __import__ as I get ImportError: Import by filename is not supported.
You could open the file and execute it using exec.
with open('foo.py') as f:
    source = f.read()
exec(source)
print(bar())
You could even look for the specific function using re
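If you go the exec route, it may be worth executing the source into a dedicated dictionary so that the loaded names don't leak into your own globals. A rough sketch, where the helper load_namespace, the placeholder module name, and the temporary foo.py are all assumptions made for illustration:

```python
import os
import tempfile


def load_namespace(path):
    """Execute the file at `path` and return the names it defines as a dict."""
    namespace = {"__name__": "loaded_by_exec"}  # hypothetical module name
    with open(path) as fh:
        # compile() with the real path keeps the filename in tracebacks.
        code = compile(fh.read(), path, "exec")
    exec(code, namespace)
    return namespace


# Stand-in for a foo.py whose location is only known at run time.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.py")
    with open(path, "w") as fh:
        fh.write('def bar():\n    return "foobar"\n')
    bar = load_namespace(path)["bar"]

result = bar()
```

Because __name__ in that dictionary is not '__main__', any main guard in the loaded file stays inactive.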
You can write a dummy function and call it everywhere you need it, to bypass checking:
def bar():
    pass
At run time, override the function with the one you actually intend to use, and every call site will automatically pick it up.
Note: This is more messy than using the importlib.
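For comparison, the importlib route might look roughly like this. The helper name load_module_from_path and the temporary foo.py are assumptions for the sake of a runnable example; in practice the path would come from the command line:

```python
import importlib.util
import os
import tempfile


def load_module_from_path(path, name):
    """Import the source file at `path` as a module called `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


# Stand-in for a foo.py whose location is only known at run time.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "foo.py")
    with open(path, "w") as fh:
        fh.write('def bar():\n    return "foobar"\n')
    foo = load_module_from_path(path, "foo")

result = foo.bar()
```

No __init__.py and no sys.path changes are needed with this approach.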
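If foo.py is in the same directory as your main file and both are part of a package, you can use
from . import foo
(Python 3). This works because . refers to the current package, so Python imports the sibling module foo.py for you. In a script that is run directly, outside a package, a relative import fails, but a plain import foo does the same job because the script's directory is on sys.path. You can then use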
foo.bar()
If it is not, you need to find it, and then execute it:
import os
from os.path import join
lookfor = "foo.py"
for root, dirs, files in os.walk('C:\\'): # or '/' for Linux / OSX
    if lookfor in files:
        # Python 2 alternative: execfile(join(root, lookfor))
        with open(join(root, lookfor)) as file:
            exec(file.read())
        break
bar() # exec() binds bar in the current namespace; no foo module object is created
Credits: Martin Stone. I did not just copy the code, I understand the code, and could've made it myself.

How can I import a .pyc compiled python file and use it

I'm trying to figure out how to include a .pyc file in a python script.
For example my script is called:
myscript.py
and the script I would like to include is called:
included_script.pyc
So, do I just use:
import included_script
And will that automatically execute the included_script.pyc ? Or is there something further I need to do, to get my included_script.pyc to run inside the myscript.py?
Do I need to pass the variables used in included_script.pyc also? If so, how might this be achieved?
Unfortunately, no, this cannot be done automatically. You can, of course, do it manually in a gritty ugly way.
Setup:
For demonstration purposes, I'll first generate a .pyc file. In order to do that, we first need a .py file for it. Our sample test.py file will look like:
def foo():
    print("In foo")
if __name__ == "__main__":
    print("Hello World")
Super simple. Generating the .pyc file can be done with the py_compile module found in the standard library. We simply pass in the name of the .py file and the name for our .pyc file in the following way:
import py_compile
py_compile.compile('test.py', 'mypyc.pyc')
This will place mypyc.pyc in our current working directory.
Getting the code from .pyc files:
Now, .pyc files contain bytes that are structured in the following way (this is the Python 3.7+ layout from PEP 552; Python 3.3 to 3.6 used a 12-byte header and Python 2 an 8-byte one):
First 4 bytes signalling a 'magic number'
Next 4 bytes holding bit flags
Next 8 bytes holding, for the default timestamp-based files, a modification timestamp and the source size (4 bytes each)
Rest of the contents are a marshalled code object.
What we're after is that marshalled code object, so we need to import marshal to un-marshal it and execute it. Additionally, we really don't care/need the 16 first bytes, and un-marshalling the .pyc file with them is disallowed, so we'll ignore them (seek past them):
import marshal
with open('mypyc.pyc', 'rb') as s:
    s.seek(16) # go past the 16-byte header (use 12 on Python 3.3-3.6, 8 on Python 2)
    code_obj = marshal.load(s)
So, now we have our fancy code object for test.py which is valid and ready to be executed as we wish. We have two options here:
Execute it in the current global namespace. This will bind all definitions inside our .pyc file in the current namespace and will act as a sort of: from file import * statement.
Create a new module object and execute the code inside the module. This will be like the import file statement.
Emulating from file import * like behaviour:
Performing this is pretty simple, just do:
exec(code_obj)
This will execute the code contained inside code_obj in the current namespace and bind everything there. After the call we can call foo like any other function:
foo()
# prints: In foo
Note: exec() is a built-in.
Emulating import file like behaviour:
This includes another requirement, the types module. This contains the type for ModuleType which we can use to create a new module object. It takes two arguments, the name for the module (mandatory) and the documentation for it (optional):
import types
m = types.ModuleType("Fancy Name", "Fancy Documentation")
print(m)
<module 'Fancy Name'>
Now that we have our module object, we can again use exec to execute the code contained in code_obj inside the module namespace (namely, m.__dict__):
exec(code_obj, m.__dict__)
Now, our module m has everything defined in code_obj, you can verify this by running:
m.foo()
# prints: In foo
These are the ways you can 'include' a .pyc file in your module. At least, the ways I can think of. I don't really see the practicality in this but hey, I'm not here to judge.
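Putting the steps together, an end-to-end sketch could look like the following. It assumes CPython 3.7 or newer (16-byte .pyc header); the helper name load_pyc is hypothetical, and the sample foo returns its string instead of printing it so the result is easy to check:

```python
import marshal
import os
import py_compile
import tempfile
import types

HEADER_SIZE = 16  # assumption: CPython 3.7+ header layout (PEP 552)


def load_pyc(path, name):
    """Create a module object called `name` from the .pyc file at `path`."""
    with open(path, "rb") as fh:
        fh.seek(HEADER_SIZE)          # skip magic number, flags, mtime/size
        code_obj = marshal.load(fh)   # the module's top-level code object
    module = types.ModuleType(name)
    exec(code_obj, module.__dict__)   # run the code inside the module namespace
    return module


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "test.py")
    pyc = os.path.join(tmp, "mypyc.pyc")
    with open(src, "w") as fh:
        fh.write('def foo():\n    return "In foo"\n')
    py_compile.compile(src, cfile=pyc, doraise=True)
    m = load_pyc(pyc, "mypyc")

result = m.foo()
```

The .pyc must have been produced by the same Python version that loads it, since the marshal format is not stable across versions.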

Unexpected behavior of Python classes (importlib x import subsystem) [duplicate]

To preface, I think I may have figured out how to get this code working (based on Changing module variables after import), but my question is really about why the following behavior occurs so I can understand what to not do in the future.
I have three files. The first is mod1.py:
# mod1.py
import mod2
var1 = None
def func1A():
    global var1
    var1 = 'A'
    mod2.func2()
def func1B():
    global var1
    print(var1)
if __name__ == '__main__':
    func1A()
Next I have mod2.py:
# mod2.py
import mod1
def func2():
    mod1.func1B()
Finally I have driver.py:
# driver.py
import mod1
if __name__ == '__main__':
    mod1.func1A()
If I execute the command python mod1.py then the output is None. Based on the link I referenced above, it seems that there is some distinction between mod1.py being imported as __main__ and mod1.py being imported from mod2.py. Therefore, I created driver.py. If I execute the command python driver.py then I get the expected output: A. I sort of see the difference, but I don't really see the mechanism or the reason for it. How and why does this happen? It seems counterintuitive that the same module would exist twice. If I execute python mod1.py, would it be possible to access the variables in the __main__ version of mod1.py instead of the variables in the version imported by mod2.py?
The __name__ variable always contains the name of the module, except when the file has been loaded into the interpreter as a script instead. Then that variable is set to the string '__main__' instead.
After all, the script is then run as the main file of the whole program, everything else are modules imported directly or indirectly by that main file. By testing the __name__ variable, you can thus detect if a file has been imported as a module, or was run directly.
Internally, modules are given a namespace dictionary, which is stored as part of the metadata for each module, in sys.modules. The main file, the executed script, is stored in that same structure as '__main__'.
But when you import a file as a module, python first looks in sys.modules to see if that module has already been imported before. So, import mod1 means that we first look in sys.modules for the mod1 module. It'll create a new module structure with a namespace if mod1 isn't there yet.
So, if you both run mod1.py as the main file, and later import it as a python module, it'll get two namespace entries in sys.modules. One as '__main__', then later as 'mod1'. These two namespaces are completely separate. Your global var1 is stored in sys.modules['__main__'], but func1B is looking in sys.modules['mod1'] for var1, where it is None.
But when you use python driver.py, driver.py becomes the '__main__' main file of the program, and mod1 will be imported just once into the sys.modules['mod1'] structure. This time round, func1A stores var1 in the sys.modules['mod1'] structure, and that's what func1B will find.
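The difference between the two invocations can be reproduced with a small experiment. This sketch writes the three files from the question into a temporary directory (with var1 initialised to None and print() as a function) and runs each entry point in a fresh interpreter; the helper run_script is hypothetical:

```python
import os
import subprocess
import sys
import tempfile
import textwrap

FILES = {
    "mod1.py": """
        import mod2
        var1 = None
        def func1A():
            global var1
            var1 = 'A'
            mod2.func2()
        def func1B():
            print(var1)
        if __name__ == '__main__':
            func1A()
    """,
    "mod2.py": """
        import mod1
        def func2():
            mod1.func1B()
    """,
    "driver.py": """
        import mod1
        if __name__ == '__main__':
            mod1.func1A()
    """,
}


def run_script(directory, script):
    """Run `python <script>` in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, os.path.join(directory, script)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    for filename, source in FILES.items():
        with open(os.path.join(tmp, filename), "w") as fh:
            fh.write(textwrap.dedent(source))
    as_script = run_script(tmp, "mod1.py")    # two copies: '__main__' and 'mod1'
    via_driver = run_script(tmp, "driver.py")  # a single copy: 'mod1'
```

Here as_script ends up as "None" and via_driver as "A", matching the behaviour described in the question.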
Regarding a practical solution for using a module optionally as main script - supporting consistent cross-imports:
Solution 1:
See e.g. in Python's pdb module, how it is run as a script by importing itself when executing as __main__ (at the end) :
#! /usr/bin/env python
"""A Python debugger."""
# (See pdb.doc for documentation.)
import sys
import linecache
...
# When invoked as main program, invoke the debugger on a script
if __name__ == '__main__':
    import pdb
    pdb.main()
I would only recommend moving the __main__ startup to the beginning of the script, like this:
#! /usr/bin/env python
"""A Python debugger."""
# When invoked as main program, invoke the debugger on a script
import sys
if __name__ == '__main__':
    ##assert os.path.splitext(os.path.basename(__file__))[0] == 'pdb'
    import pdb
    pdb.main()
    sys.exit(0)
import linecache
...
This way the module body is not executed twice - which is "costly", undesirable and sometimes critical.
Solution 2:
In rarer cases it is desirable to expose the actual script module __main__ even directly as the actual module alias (mod1):
# mod1.py
import sys
if __name__ == '__main__':
    # use main script directly as cross-importable module; register the alias
    # before importing mod2, whose own "import mod1" would otherwise load a second copy
    _mod = sys.modules['mod1'] = sys.modules[__name__]
    ##_modname = os.path.splitext(os.path.basename(os.path.realpath(__file__)))[0]
    ##_mod = sys.modules[_modname] = sys.modules[__name__]
import mod2
...
if __name__ == '__main__':
    func1A()
Known drawbacks:
reload(_mod) fails
pickle'ed classes would need extra mappings for unpickling (find_global ..)
