I want to use importlib.machinery.EXTENSION_SUFFIXES from Python 3 but am unfortunately using Python 2.7.
EXTENSION_SUFFIXES evaluates to ['.cpython-34m-x86_64-linux-gnu.so', '.cpython-34m.so', '.abi3.so', '.so'], but this is specific to my machine and possibly my Python version, so I cannot simply hardcode the list.
Here's where EXTENSION_SUFFIXES is built in Python 3's source: https://github.com/python/cpython/blob/3.6/Lib/importlib/_bootstrap_external.py#L1431. However, it seems to go down into the C implementation (link), so it's unclear to me how I can get this information.
How can I obtain this list in Python 2.7?
Use imp.get_suffixes() instead:
Return a list of 3-element tuples, each describing a particular type of module. Each triple has the form (suffix, mode, type), where suffix is a string to be appended to the module name to form the filename to search for, mode is the mode string to pass to the built-in open() function to open the file (this can be 'r' for text files or 'rb' for binary files), and type is the file type, which has one of the values PY_SOURCE, PY_COMPILED, or C_EXTENSION, described below.
Thus, to filter this into a list of suffixes for C extension modules:
import imp

extension_suffixes = [suffix for (suffix, mode, type) in imp.get_suffixes()
                      if type == imp.C_EXTENSION]
This also works in Python 3, although imp is deprecated in Python 3.
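If the same code must run on both versions, a small compatibility shim is one option (a sketch, assuming only the standard library):

try:
    # Python 3.3+: the list is exposed directly
    from importlib.machinery import EXTENSION_SUFFIXES as extension_suffixes
except ImportError:
    # Python 2.7: fall back to imp
    import imp
    extension_suffixes = [suffix for (suffix, mode, type) in imp.get_suffixes()
                          if type == imp.C_EXTENSION]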
Related
I'm really inexperienced with Python, so forgive me if this question is stupid.
I'm trying a simple script that operates on all the files in a folder.
However, I apparently can only access the folder recursively!
Let me explain: I have a folder, DATA, with subfolders for each day (named in the form YYYY-MM-DD).
If I try
for filename in glob.glob('C:\Users\My username\Documents\DATA\2021-01-20\*'):
    print filename
I get no output.
However, if I try instead
for filename in glob.glob('C:\Users\My username\Documents\DATA\*\*'):
    print filename
the output is as expected:
C:\Users\My username\Documents\DATA\2021-01-20\210120_HOPG_sputteredTip0001.sxm
C:\Users\My username\Documents\DATA\2021-01-20\210120_HOPG_sputteredTip0002.sxm
...
I even tried different folder names (removing the dashes, using letters in the beginning, using only letters, using a shorter folder name) but the result is still the same.
What am I missing?
(BTW: I am on python 2.7, and it's because the program I need for the data is only compatible with python 2)
Beware when using backslashes in strings: in Python, a backslash starts an escape sequence. In your failing path, the \202 in \2021-01-20 is parsed as an octal escape for a single character, so the pattern never matches a real directory. Try prepending your string with r (a raw string literal) like so:
for filename in glob.glob(r'C:\Users\My username\Documents\DATA\*'):
    # Do your business
Edit:
As @poomerang has pointed out, a shorter answer to what the r prefix does in Python has previously been given here.
Official docs for Python string-literals: Python 2.7 and for Python 3.8.
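To see the escaping in action, compare the plain and raw literals in a Python 2.7 session:

>>> 'DATA\2021-01-20'    # \202 is parsed as an octal escape
'DATA\x821-01-20'
>>> r'DATA\2021-01-20'   # the raw literal keeps the backslash
'DATA\\2021-01-20'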
Recursive file search is not possible with glob in Python 2.7, i.e. searching for files in a folder, its subfolders, its sub-subfolders and so on.
You have two options:
Use os.walk (you might need to change your code's structure, however); see the sketch after this list.
Use the backported pathlib2 module from PyPI https://pypi.org/project/pathlib2/ - which should include a glob function supporting the recursive search using ** wildcard.
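A minimal os.walk sketch for Python 2.7 (the root path below is the asker's; adjust as needed):

import os

root = r'C:\Users\My username\Documents\DATA'
for dirpath, dirnames, filenames in os.walk(root):
    # os.walk descends into every subfolder of root automatically
    for name in filenames:
        print os.path.join(dirpath, name)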
I have a text file /etc/default/foo which contains one line:
FOO="/path/to/foo"
In my python script, I need to reference the variable FOO.
What is the simplest way to "source" the file /etc/default/foo into my python script, same as I would do in bash?
. /etc/default/foo
Same answer as @jil's; however, that answer is specific to some historical version of Python.
In modern Python (3.x):
exec(open('filename').read())
replaces execfile('filename') from 2.x
You could use execfile:
execfile("/etc/default/foo")
But please be aware that this will evaluate the contents of the file as-is into your program source. It is a potential security hazard unless you can fully trust the source.
It also means that the file needs to be valid python syntax (your given example file is).
Keep in mind that if you have a "text" file with this content that has a .py as the file extension, you can always do:
import mytextfile
print(mytextfile.FOO)
Of course, this assumes that the text file is syntactically correct as far as Python is concerned. On a project I worked on, we did something similar: we turned some text files into Python files. Wacky, but maybe worth considering.
Just to give a different approach, note that if your original file is setup as
export FOO=/path/to/foo
You can do source /etc/default/foo; python myprogram.py (or . /etc/default/foo; python myprogram.py), and within myprogram.py all the values that were exported in the sourced file are visible in os.environ, e.g.:
import os
os.environ["FOO"]
If you know for certain that it only contains VAR="QUOTED STRING" style variables, like this:
FOO="some value"
Then you can just do this:
>>> with open('foo.sysconfig') as fd:
...     exec(fd.read())
Which gets you:
>>> FOO
'some value'
(This is effectively the same thing as the execfile() solution
suggested in the other answer.)
This method has substantial security implications; if instead of FOO="some value" your file contained:
os.system("rm -rf /")
Then you would be In Trouble.
Alternatively, you can do this:
>>> import shlex
>>> with open('foo.sysconfig') as fd:
...     settings = {var: shlex.split(value) for var, value in [line.split('=', 1) for line in fd]}
Which gets you a dictionary settings that has:
>>> settings
{'FOO': ['some value']}
That settings = {...} line is using a dictionary comprehension. You could accomplish the same thing in a few more lines with a for loop and so forth.
And of course if the file contains shell-style variable expansion like ${somevar:-value_if_not_set} then this isn't going to work (unless you write your very own shell style variable parser).
There are a couple ways to do this sort of thing.
You can indeed import the file as a module, as long as its contents correspond to Python's syntax. But then either the file in question must be a .py file in the same directory as your script, or you have to use imp (or importlib, depending on your version) like here.
Another solution (which I prefer) is to use a data format that any Python library can parse; JSON comes to mind as an example.
/etc/default/foo :
{"FOO":"path/to/foo"}
And in your python code :
import json

with open('/etc/default/foo') as f:
    data = json.load(f)  # the with statement closes the file for us

FOO = data["FOO"]
## ...
This way, you don't risk executing untrusted code...
You have a choice, depending on what you prefer. If your data file is auto-generated by some script, it might be easier to keep a simple syntax like FOO="path/to/foo" and use imp.
Hope that helps!
The Solution
Here is my approach: parse the bash file myself and process only variable assignment lines such as:
FOO="/path/to/foo"
Here is the code:
import shlex


def parse_shell_var(line):
    """
    Parse such lines as:
        FOO="My variable foo"

    :return: a tuple of var name and var value, such as
        ('FOO', 'My variable foo')
    """
    return shlex.split(line, posix=True)[0].split('=', 1)


if __name__ == '__main__':
    with open('shell_vars.sh') as f:
        shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
    print(shell_vars)
How It Works
Take a look at this snippet:
shell_vars = dict(parse_shell_var(line) for line in f if '=' in line)
This line iterates through the lines in the shell script, processing only those lines that have an equal sign (not a fool-proof way to detect a variable assignment, but the simplest). Next, it runs those lines through the function parse_shell_var, which uses shlex.split to correctly handle the quotes (or the lack thereof). Finally, the pieces are assembled into a dictionary. The output of this script is:
{'MOO': '/dont/have/a/cow', 'FOO': 'my variable foo', 'BAR': 'My variable bar'}
Here is the contents of shell_vars.sh:
FOO='my variable foo'
BAR="My variable bar"
MOO=/dont/have/a/cow
echo $FOO
Discussion
This approach has a couple of advantages:
It does not execute the shell (either in bash or in Python), which avoids any side effects
Consequently, it is safe to use, even if the origin of the shell script is unknown
It correctly handles values with or without quotes
This approach is not perfect; it has a few limitations:
The method of detecting variable assignment (by looking for the presence of the equal sign) is primitive and not accurate. There are ways to better detect these lines but that is the topic for another day
It does not correctly parse values that are built from other variables or commands. That means it will fail for lines such as:
FOO=$BAR
FOO=$(pwd)
Based on the answer that uses exec(open(...).read()): if you use value = eval(open('filename').read()) instead, it will return only the evaluated value. E.g., for a file containing a single expression:
1 + 1 gives 2
"Hello World" gives "Hello World"
float(2) + 1 gives 3.0
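A minimal sketch (value.txt is a hypothetical file containing the single expression 1 + 1):

with open('value.txt') as fd:
    value = eval(fd.read())  # -> 2

Note that eval only accepts a single expression, so this does not work for assignment lines like FOO="/path/to/foo".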
Is the path separator employed inside a Python tarfile.TarFile object a '/' regardless of platform, or is it a backslash on Windows?
I basically never touch Windows, but I would kind of like the code I'm writing to be compatible with it, if it can be. Unfortunately I have no Windows host on which to test.
A quick test tells me that a (forward) slash is always used.
In fact, the tar format stores the full path of each file as a single string, using slashes (try looking at a hex dump), and Python just reads that full path without any modification. Likewise, at extraction time Python hard-replaces slashes with the local separator (see TarFile._extract_member).
... which makes me think that there are surely some nonconformant implementations of tar for Windows that create tarfiles with backslashes as separators!?
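A quick way to check the separator yourself (a minimal sketch; assumes a file named data.txt exists in the current directory):

import io
import tarfile

# Write a member into an in-memory archive, then read its name back.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode='w') as tar:
    tar.add('data.txt', arcname='some/dir/data.txt')

buf.seek(0)
with tarfile.open(fileobj=buf, mode='r') as tar:
    print(tar.getnames())  # ['some/dir/data.txt'] -- '/' on any platform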
Python supports zipping files with ZIP_DEFLATED when zlib is available.
see:
https://docs.python.org/3.4/library/zipfile.html
The zip command-line program on Linux supports -1 fastest, -9 best.
Is there a way to set the compression level of a zip file created in Python's zipfile module?
Starting from Python 3.7, the zipfile module accepts a compresslevel parameter.
https://docs.python.org/3/library/zipfile.html
I know this question is dated, but for people like me who land on this question, this may be a better option than the accepted one.
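For example (a sketch; out.zip and somefile.txt are placeholder names):

import zipfile

# compresslevel 0-9 for ZIP_DEFLATED, mirroring zip's -1 ... -9 (Python 3.7+)
with zipfile.ZipFile('out.zip', 'w',
                     compression=zipfile.ZIP_DEFLATED,
                     compresslevel=9) as zf:
    zf.write('somefile.txt')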
The zipfile module does not provide this. During compression it uses the zlib constant Z_DEFAULT_COMPRESSION, which equals -1 by default. So you can try changing that constant manually, as a possible workaround.
Python 3.7+ answer: If you look at the zipfile.ZipFile constructor you'll see this:
def __init__(self, file, mode="r", compression=ZIP_STORED, allowZip64=True,
             compresslevel=None):
    """Open the ZIP file with mode read 'r', write 'w', exclusive create 'x',
    or append 'a'.
    ...
    compression: ZIP_STORED (no compression), ZIP_DEFLATED (requires zlib),
                 ZIP_BZIP2 (requires bz2) or ZIP_LZMA (requires lzma).
    compresslevel: None (default for the given compression type) or an integer
                   specifying the level to pass to the compressor.
                   When using ZIP_STORED or ZIP_LZMA this keyword has no effect.
                   When using ZIP_DEFLATED integers 0 through 9 are accepted.
                   When using ZIP_BZIP2 integers 1 through 9 are accepted.
    """
which means you can pass the desired compression level in the constructor:
myzip = zipfile.ZipFile(file_handle, "w", compression=zipfile.ZIP_DEFLATED, compresslevel=9)
See also https://docs.python.org/3/library/zipfile.html
On my Centos server Python's mimetypes.guess_type("mobile.3gp") returns (None, None), instead of ('video/3gpp', None).
Where does Python get the list of mimetypes from, and is it possible to add a missing type to the list?
On my system (Debian lenny) it's in /usr/lib/python2.5/mimetypes.py. In the list knownfiles you can supply your own files for the init() function.
The mimetypes module uses mime.types files as they are common on Linux/Unix systems. If you look in mimetypes.knownfiles you will find a list of files that Python tries to access to load the data. You can also specify your own file to add new types by adding it to that list.
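You can also register a missing type at runtime with mimetypes.add_type, without touching any files on disk; for example:

import mimetypes

mimetypes.add_type('video/3gpp', '.3gp')   # register the missing mapping
print(mimetypes.guess_type('mobile.3gp'))  # -> ('video/3gpp', None)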
In Python 3.6, find that file: C:\Users\Me\AppData\Local\Programs\Python\Python36\Lib\mimetypes.py.
Search for mp3 (to find the list of extensions). Add your type (it's intuitive).