PyCharm: Embed code of imported functions automatically - python

I'm more or less the only one in my office who codes little scripts for the analysis of data in Python. Hence I have a small library of helper functions which I reuse every now and then in different scripts; this library is placed centrally in the Anaconda libraries on my machine.
Sometimes, though, I want to "deploy" the scripts as standalone versions, so they can be run on a plain Python distribution, i.e. without first having to "install" my library files.
Is there any way in PyCharm to automatically replace certain import statements with the code of the imported functions? I know it can be a complex task to determine whether the function code to be embedded itself relies on further functions which would also need to be embedded (but somehow PyInstaller manages to resolve such dependencies). Mainly, though, I'm talking about functions which have no further dependencies, or only dependencies within the same module...
To give an example:
Maybe my library looks like this:
def readfile(filepath):
    return ....
def readfileAsList(filepath):
    return readfile(filepath).splitlines()
def writefile(contents, filepath, writemode='w'):
    with open(filepath, writemode):
        ...
And my script for some analysis looks like this:
import re
from myLib import readfileAsList
# ANALYSIS CODE GOES HERE ...
Now I want PyCharm to automatically transform the script code and embed readfileAsList as well as readfile (because of the dependency).
So the code would look like:
import re
def readfile(filepath):
    return ....
def readfileAsList(filepath):
    return readfile(filepath).splitlines()
# ANALYSIS CODE GOES HERE ...
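(Side note: I realize the embedding step itself could be scripted outside PyCharm with the standard inspect module; a minimal sketch of what I mean, where inline_functions is a helper name I made up and the dependency tracking is left out:)
import inspect
import myLib  # the library from the example above

def inline_functions(*funcs):
    # concatenate the source code of the given functions
    return "\n".join(inspect.getsource(f) for f in funcs)

# readfile is listed manually here because readfileAsList depends on it
embedded = inline_functions(myLib.readfile, myLib.readfileAsList)
print(embedded)  # paste this in place of the 'from myLib import ...' line
But I'd really like the IDE to handle the dependency resolution automatically.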
Thanks in advance!

Related

Calling MATLAB functions inside a Python module

I have written a small Python module with a single submodule. So far, life is good, and everything works. Here is a skeleton of the layout of said module:
my_module/
    __init__.py
    setup.py
    submod1/
        __init__.py
        my_matlab_fn.m
        my_python.py
my_matlab_fn.m is a useful MATLAB function which, for the purposes of reproducing this example, may look like this:
function Out = my_matlab_fn(x)
dummy = 2.0*x;
Out.val = dummy;
So far so good. Now, let's look inside my_python.py
import matlab
import matlab.engine

def double_my_t(t):
    t_eng = matlab.engine.start_matlab('nodesktop -nosplash -nodisplay')
    output = t_eng.my_matlab_fn(t)['val']
    return output

if __name__ == '__main__':
    print(double_my_t(4.0))
(This is a simplified version of my actual code, which I cannot show; in reality I am passing a dict to the MATLAB function and doing many more operations with it.)
OK. So, I am able to successfully do the following:
python3 my_python.py
when inside the submod1 directory. All is well and good and I get the correct result from this function call.
For context, here is what __init__.py looks like in the submod1 dir:
from .my_python import double_my_t
and here is what __init__.py looks like in the my_module dir:
from submod1.my_python import double_my_t
from submod1.my_python import double_my_t as double_my_t
I then go ahead and build my Python module. Again, no issues here (I have tested this and seen my script successfully get inside the double_my_t function).
The issue is that if I run something like this
from my_module.submod1 import double_my_t
print(double_my_t(4))
I get
matlab.engine.MatlabExecutionError: Undefined function 'my_matlab_fn' for input arguments ........
I think I understand what is going wrong, but I am not sure. I think the reason running my_python.py directly works is that MATLAB can 'see' the my_matlab_fn function in my_matlab_fn.m, since that file is in the same directory as my_python.py when it is run; when double_my_t is called from somewhere outside this directory, the MATLAB engine no longer 'knows' where my_matlab_fn.m is.
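If that diagnosis is correct, the fix I can imagine is to add the directory containing the .m file to the engine's path before calling the function; a rough sketch of what I mean (the os.path lines are my guess, not tested):
import os
import matlab
import matlab.engine

def double_my_t(t):
    t_eng = matlab.engine.start_matlab('nodesktop -nosplash -nodisplay')
    # point the engine at the directory containing my_matlab_fn.m,
    # independent of the caller's working directory
    t_eng.addpath(os.path.dirname(os.path.abspath(__file__)), nargout=0)
    return t_eng.my_matlab_fn(t)['val']
But I don't know whether this is the intended way to ship .m files inside a Python package.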
I was wondering if anyone knew how I should resolve this issue. I want to be able to keep using these user-defined MATLAB functions in my Python module, but the documentation on this is not entirely clear.
Cheers

How to load a dylib-file as CPython-extension?

This has been asked before (e.g. here), but the given solution (i.e. renaming the file to *.so) is not acceptable. I have a CPython extension called name.dylib which cannot be imported. If the file is renamed to name.so, it is imported correctly. Changing the filename is not an option**, and should not be necessary.
Python has a lot of hooks for searching for modules, so there must be a way to make it recognise a .dylib file. Can someone show how to do this? Using a low-level import which spells out the whole filename is not nice, but would be an acceptable solution.
** because the build code forces .dylib, and other contexts I have assume it. The extension module is dual-purpose: it can be used both as an ordinary shared library and as a Python extension. Using a symlink does work, but is a last resort because it requires manual intervention in an automated process.
You could manipulate sys.path_hooks and replace the FileFinder hook with one which accepts the .dylib extension. But see also the simpler yet less convenient alternative below, which imports the extension given its full file name.
More information on how .so, .py and .pyc files are imported can be found, for example, in this answer of mine.
This manipulation could look like the following:
import sys
import importlib
from importlib.machinery import FileFinder, ExtensionFileLoader

# pick the right loader for .dylib files:
dylib_extension = ExtensionFileLoader, ['.dylib']

# add dylib support to the file extensions supported by default:
all_supported_loaders = [dylib_extension] + importlib._bootstrap_external._get_supported_file_loaders()

# replace the last hook (i.e. FileFinder) with one recognizing .dylib as well:
sys.path_hooks.pop()
sys.path_hooks.append(FileFinder.path_hook(*all_supported_loaders))

# and now import name.dylib via:
import name
This must be the first code executed when the Python script starts to run. Other modules might not expect sys.path_hooks to be manipulated during the run of the program, so there might be some problems with other modules (like pdb, traceback and so on). For example:
import pdb
#above code
import name
will fail, while
#above code
import pdb
import name
will work, as pdb seems to manipulate the import machinery itself.
Normally, the FileFinder hook is the last one in sys.path_hooks, because it is the last resort: once path_hook_for_FileFinder is called for a path, a finder is returned (an ImportError would have to be raised for PathFinder to look at further hooks):
def path_hook_for_FileFinder(path):
    """Path hook for importlib.machinery.FileFinder."""
    if not _path_isdir(path):
        raise ImportError('only directories are supported', path=path)
    return cls(path, *loader_details)  # HERE the finder is returned!
However, one might want to make sure and check that it really is the right hook being replaced.
A simpler alternative would be to use imp.load_dynamic (neglecting for the moment that imp is deprecated):
import imp
imp.load_dynamic('name', 'name.dylib')  # or whatever path is used
That might be more robust than the first solution (no problems with pdb, for example) but less convenient for bigger projects.
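For newer Python versions, where imp is gone entirely, a rough equivalent of load_dynamic can be assembled from importlib; the module name and path below are placeholders:
import sys
import importlib.util
from importlib.machinery import ExtensionFileLoader

path = '/full/path/to/name.dylib'  # placeholder path
loader = ExtensionFileLoader('name', path)
spec = importlib.util.spec_from_file_location('name', path, loader=loader)
module = importlib.util.module_from_spec(spec)  # performs the dynamic load
loader.exec_module(module)                      # runs the module's init
sys.modules['name'] = module                    # make 'import name' find it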

How to structure Python library and path in the same way as MATLAB

Over the years I have written a few hundred MATLAB functions for space engineering purposes, which I use on a daily basis. They are all neatly ordered in one folder with subfolders, and in MATLAB I just have an addpath() command for the root folder in the startup.m file; then I can use any of the functions right away after starting MATLAB.
I am now trying to do the same in Python.
As far as I understand, in Python I shouldn't have one file for every function like in MATLAB, but rather bundle all the functions together in one .py file. Is this correct? I am trying to avoid this, since I have a strong preference for short scripts over one huge one, that being way more intuitive for me.
And then, once I have all my Python scripts, can I place them anywhere and still use them? I read that Python works differently from MATLAB in this respect, and that scripts need to be in the working directory to be importable. However, I want to load the scripts and be able to use them regardless of my active working directory. So I am guessing I have to do something with paths. I have found that I can extend the path using sys.path.insert or sys.path.append, but this feels like a workaround to me. Or is it the way to go?
So suppose I have put all my rewritten MATLAB functions in a single Python file (let's call it agfunctions.py) saved in a directory (let's call it PythonFunctions). The core of my startup.py would then be something like this (I have added the startup file to the PYTHONSTARTUP path):
# startup.py
import os, sys
import numpy as np
import spiceypy as spice
sys.path.append(r'C:\Users\AG5\Documents\PythonFunctions')  # raw string: '\U' would otherwise be an escape error
import agfunctions as ag
Does any of this make sense? Is this the way to go, or is there a better way in Python?
Well, a Python package is probably the best way to solve your problem. You can read more here. Python packages do not need to be built, and they are not always created for sharing, so don't worry.
Assume you had this file structure:
Documents/
    startup.py
    PythonFunctions/
        FirstFunc.py
        SecondFunc.py
Then you can add a file __init__.py to your PythonFunctions directory with the following content:
__all__ = ['FirstFunc', 'SecondFunc']
The init file must be updated whenever you change filenames, so maybe this isn't the best solution for you. Now the directory looks like:
Documents/
    startup.py
    PythonFunctions/
        __init__.py
        FirstFunc.py
        SecondFunc.py
And that's all: PythonFunctions is now a package. You can import all its files with one import statement and use them:
from PythonFunctions import *
result_one = FirstFunc.function_name(some_data)
result_two = SecondFunc.function_name(some_other_data)
And if your startup.py is somewhere else, you can update the path before importing, like this:
sys.path.append(r'C:\Users\AG5\Documents')
More detailed explanations can be found here.
There is no need to write all your MATLAB functions within one file. You can also (mostly) keep your desired folder structure for your library. However, you should combine all the functions of each deepest subfolder into one *.py file.
Suppose your MATLAB library is in the folder space_engineering and is set up like this:
space_engineering\
    subfolder1\
        functiongroup1\
            printfoo.m
            printbar.m
    subfolder2\
        ...
    subfolder3\
        ...
    ...
After doing addpath(genpath('\your_path\space_engineering')), all functions in the subfolder* and functiongroup* folders are available in the global namespace, like this:
>> printfoo % function printfoo does nothing but fprintf('foo')
foo
I understand this is behaviour you want to preserve in your migrated Python library, and there is a way to do so. Your new Python library would be structured like this:
space_engineering\
    __init__.py
    subfolder1\
        __init__.py
        functiongroup1.py
        functiongroup2.py
    subfolder2\
        __init__.py
        ...
    subfolder3\
        __init__.py
        ...
    ...
As you can see, the deepest subfolder layer functiongroup* of the MATLAB structure is now replaced by functiongroup*.py files, so-called modules. The only compromise you have to accept is that the functions printfoo() and printbar() are now defined in these .py modules instead of having individual .py files.
# content of functiongroup1.py
def printfoo():
    print('foo')

def printbar():
    print('bar')
To allow for the same function calls as in MATLAB, you have to make the function names printfoo and printbar available in the global namespace by adjusting the __init__.py files of each subfolder:
# content of space_engineering\subfolder1\__init__.py
from .functiongroup1 import *
from .functiongroup2 import *
as well as the __init__.py of the main folder
# content of space_engineering\__init__.py
from .subfolder1 import *
from .subfolder2 import *
from .subfolder3 import *
The from .functiongroup1 import * statement loads all names from the module functiongroup1 into the namespace of subfolder1. Successively, from .subfolder1 import * forwards them to the global namespace.
Like this you can, e.g. in a Python console (or in any script), do:
>>> import sys
>>> sys.path.append(r'\your_path\space_engineering')
>>> from space_engineering import *
>>> printfoo()
foo
This way you can use your new Python library in the same way as you formerly used the MATLAB library.
HOWEVER: the use of from xyz import * statements is not recommended in Python (see here for why; it is similar to the reasons for not using eval in MATLAB), and some Pythonistas may complain about recommending it. But for your special case, where you insist on a Python library with MATLAB-like comfort, it is a proper solution.
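If you ever want to move away from the star imports, the explicit form would be, for example:
from space_engineering.subfolder1.functiongroup1 import printfoo
printfoo()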

Always import same list at beginning of python script

I have several Python scripts that all start with the same set of lines, and I would like to be able to change those lines in only one place.
For example something like this:
import pickle
import sys
sys.path.append("/Users/user/folder/")
import someOtherModules
x=someOtherModules.function()
I know I could save this as a string, then load the string and run it with exec(), but I was wondering if there is a better way to do this.
In other words, I want to import a list of modules and run some functions at the beginning of every script.
You can define your own module.
# my_init.py
def init():
    print("Initializing...")
And simply add this at the beginning of all your scripts:
import my_init
my_init.init()
Depending on where your scripts and my_init.py are located, you may need to add it to your Python user site directory, see: Where should I put my own python module so that it can be imported
You can move all of this into a separate script and, in your other scripts, import everything from it: from my_fancy_script import *. In this case not only will the code inside my_fancy_script be executed, but its imports will also be pushed into the other files.
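For example, such a shared module, bundling the lines from the question, could look like this (my_fancy_script is a made-up name):
# my_fancy_script.py -- shared header for all scripts
import pickle
import sys

sys.path.append("/Users/user/folder/")

import someOtherModules

x = someOtherModules.function()
Every script then starts with from my_fancy_script import *, which makes pickle, someOtherModules and x available.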
Anyway, it's better to use Delgan's answer.

How do I override a Python import?

I'm working on pypreprocessor, which is a preprocessor that takes C-style directives. I've been able to make it work like a traditional preprocessor (it's self-consuming and executes postprocessed code on-the-fly), except that it breaks library imports.
The problem is: the preprocessor runs through the file, processes it, outputs to a temporary file, and exec()s the temporary file. Libraries that are imported need to be handled a little differently, because they aren't executed; rather, they are loaded and made accessible to the caller module.
What I need to be able to do is: interrupt the import (since the preprocessor is being run in the middle of the import), load the postprocessed code as tempModule, and replace the original import with tempModule, to trick the calling script into believing that tempModule is the original module.
I have searched everywhere and so far have no solution.
This Stack Overflow question is the closest I've seen so far to providing an answer:
Override namespace in Python
Here's what I have.
# Remove the bytecode file created by the first import
os.remove(moduleName + '.pyc')
# Remove the first import
del sys.modules[moduleName]
# Import the postprocessed module
tmpModule = __import__(tmpModuleName)
# Set first module's reference to point to the preprocessed module
sys.modules[moduleName] = tmpModule
moduleName is the name of the original module, and tmpModuleName is the name of the postprocessed code file.
The strange part is that this solution runs completely normally, as if the first module had loaded correctly; but if you remove the last line, you get a module-not-found error.
Hopefully someone on Stack Overflow knows a lot more about imports than I do, because this one has me stumped.
Note: I will only award a solution, or, if this is not possible in Python, the best, most detailed explanation of why it is not possible.
Update: For anybody who is interested, here is the working code.
if imp.lock_held() is True:
    del sys.modules[moduleName]
    sys.modules[tmpModuleName] = __import__(tmpModuleName)
    sys.modules[moduleName] = __import__(tmpModuleName)
The imp.lock_held() check detects whether the module is being loaded as a library. The following lines do the rest.
Does this answer your question? The second import does the trick.
Mod_1.py
def test_function():
    print "Test Function -- Mod 1"
Mod_2.py
def test_function():
    print "Test Function -- Mod 2"
Test.py
#!/usr/bin/python
import sys
import Mod_1
Mod_1.test_function()
del sys.modules['Mod_1']
sys.modules['Mod_1'] = __import__('Mod_2')
import Mod_1
Mod_1.test_function()
To define a different import behavior, or to totally subvert the import process, you will need to write import hooks. See PEP 302.
For example,
import sys

class MyImporter(object):

    def find_module(self, module_name, package_path):
        # Return a loader
        return self

    def load_module(self, module_name):
        # Return a module
        return self

sys.meta_path.append(MyImporter())
import now_you_can_import_any_name
print now_you_can_import_any_name
It outputs:
<__main__.MyImporter object at 0x009F85F0>
So basically it returns a new module (which can be any object), in this case itself. You can use this to alter the import behavior by returning processed_xxx on an import of xxx.
IMO, Python doesn't need a preprocessor. Whatever you are accomplishing can be accomplished in Python itself due to its very dynamic nature. For example, taking the debug case, what is wrong with having at the top of the file
debug = 1
and later
if debug:
    print "wow"
?
In Python 2 there is the imputil module, which seems to provide the functionality you are looking for, but it has been removed in Python 3. It's not very well documented, but it contains an example section that shows how you can replace the standard import functions.
For Python 3 there is the importlib module (introduced in Python 3.1), which contains functions and classes to modify the import functionality in all kinds of ways. It should be suitable for hooking your preprocessor into the import system.
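A minimal sketch of such an importlib hook (Python 3.4+) might look like the following; the module name mymodule and the file mymodule.out.py are made-up stand-ins for pypreprocessor's postprocessed output:
import sys
import importlib.abc
import importlib.util

class PreprocessFinder(importlib.abc.MetaPathFinder):
    # intercept the import of one (hypothetical) module name
    def find_spec(self, name, path, target=None):
        if name != 'mymodule':
            return None
        return importlib.util.spec_from_loader(name, PreprocessLoader())

class PreprocessLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        # stand-in for the preprocessor: execute the postprocessed source
        with open('mymodule.out.py') as f:
            source = f.read()
        exec(compile(source, 'mymodule.out.py', 'exec'), module.__dict__)

sys.meta_path.insert(0, PreprocessFinder())
import mymodule  # now served from the postprocessed file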
