Mimic 7zip with python

I am using Python 3.6, and currently I subprocess out to my 7zip program to get the compression I need.
subprocess.call('7z a -t7z -ms=off {0} *'.format(filename))
I know the zipfile class has ‘ZIP_LZMA’ compression, but the application I am passing this to says the output file isn’t correct. So what else do I have to add to the ZipFile class to make it mimic the above command?

If you do not care much for Windows, then perhaps libarchive could help. In Ubuntu, for example:
$ sudo apt install python3-libarchive-c
Then:
import libarchive
with libarchive.file_writer('test.7z', '7zip') as archive:
    archive.add_files('first.file', 'second.file', 'third.file')
Then there is the pylib7zip library, which wraps the existing 7z.dll and seems to offer a Windows-only alternative.
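As for why the ZipFile output is likely being rejected: ZIP_LZMA only changes the compression method used inside a .zip container, whereas 7z a -t7z writes a different container format altogether. Here is a minimal sketch, using only the standard library, that looks at the leading signature bytes to show the difference. container_kind is a hypothetical helper name for illustration, not part of zipfile.

```python
import io
import zipfile

SEVEN_ZIP_MAGIC = b"7z\xbc\xaf\x27\x1c"  # signature at the start of a .7z file
ZIP_MAGIC = b"PK\x03\x04"                 # signature of a zip local file header


def container_kind(data):
    """Guess the container format from the leading signature bytes."""
    if data.startswith(SEVEN_ZIP_MAGIC):
        return "7z"
    if data.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


# Write a small archive in memory using LZMA compression inside a zip container.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_LZMA) as zf:
    zf.writestr("hello.txt", "hello world\n")

kind = container_kind(buf.getvalue())
print(kind)
```

So no ZipFile option will produce a .7z file; you need something that writes the 7z container itself, such as the libraries mentioned above or the 7z binary.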

Related

Not managing to extract RAR archive using rarfile module

I have been trying to make a script that extracts *.rar files but am receiving errors. I've been struggling to understand the documentation of the module, to no avail (I'm new to programming so I sometimes get a bit lost in all the docs).
Here is the relevant part of my code, and the error received.
Snippet from my code:
import rarfile
rarpath='/home/maze/Desktop/test.rar'
def unrar(file):
    rf=rarfile.RarFile(file)
    rf.rarfile.extract_all()
unrar(rarpath)
Error received:
File "unrarer.py", line 26, in unrar
rf.rarfile.extract_all()
AttributeError: 'str' object has no attribute 'extract_all'
I have installed rarfile2.8 and unrar0.3 using pip (not sure if the latter was necessary).
Thanks in advance for any assistance correcting my function or helping understand the package's documentation.

Support for RAR files in general is quite poor, so this experience is par for the course.
In order to get the rarfile Python module to work, you also have to install a supported tool for extracting RAR files. Your only two choices are bsdtar or unrar. Do not install these with pip; install them with your Linux package manager (or install them yourself, if you think that the computer's time is more valuable than your time). For example, on Debian-based systems (this includes Ubuntu), run:
sudo apt install bsdtar
Or,
sudo apt install unrar
Note that bsdtar does not have the same level of support for RAR files that Unrar does. Some newer RAR files will not extract with bsdtar.
Then your code should look something like this:
import rarfile
def unrar(file):
    rf = rarfile.RarFile(file)
    rf.extractall()
unrar('/home/maze/Desktop/test.rar')
Note the use of rf.extractall(), not rf.rarfile.extract_all(): rf.rarfile is just the archive's file name (a string), and the method is spelled extractall, without an underscore.
If you are just calling extractall then there is no need to use the rarfile module, though. You can just use the subprocess module:
import subprocess
path = '/path/to/archive.rar'
subprocess.check_call(['unrar', 'x', path])
The rarfile module is basically nothing more than a wrapper around subprocess anyway.
Of course, if you have a choice in the matter, I recommend migrating your archives to a more portable and better supported archive format.
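The subprocess approach above can be fleshed out a little. This is only a sketch under a couple of assumptions: the unrar binary is on PATH, and unrar treats its last argument as a destination directory when it ends with a path separator. build_unrar_command and extract_rar are hypothetical helper names.

```python
import os
import shutil
import subprocess


def build_unrar_command(archive, dest):
    """Return the argument list for 'unrar x <archive> <dest>/'."""
    # Joining with '' appends a trailing path separator, which (by assumption)
    # tells unrar that the last argument is a destination directory.
    return ['unrar', 'x', archive, os.path.join(dest, '')]


def extract_rar(archive, dest):
    """Extract archive into dest, failing early if unrar is not installed."""
    if shutil.which('unrar') is None:
        raise RuntimeError('unrar not found on PATH; install it with your package manager')
    os.makedirs(dest, exist_ok=True)
    # check_call raises CalledProcessError if unrar exits with a non-zero status.
    subprocess.check_call(build_unrar_command(archive, dest))
```

You would call it as extract_rar('/home/maze/Desktop/test.rar', '/home/maze/Desktop/out'). Passing the arguments as a list avoids any shell quoting problems with spaces in file names.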

If you are on Windows, this worked for me. Go to https://www.rarlab.com/rar_add.htm, download "UnRAR for Windows" (the command-line freeware Windows UnRAR), run it, extract it to a folder, and set the executable path in your code after importing rarfile:
rarfile.UNRAR_TOOL = r"C:\FilePath\UnRAR.exe"

rf.rarfile is the name of the file which you can see by printing its value. Remove that and check out help(rarfile.RarFile) for the method you want.
import rarfile
rarpath='/home/maze/Desktop/test.rar'
def unrar(file):
    rf=rarfile.RarFile(file)
    rf.extractall()
unrar(rarpath)

Try this:
import fnmatch
import os
from rarfile import RarFile
path = r'C:\Users\byqpz\Desktop\movies\rars'
destinationPath = r'C:\Users\byqpz\Desktop\movies\destination'
for root, dirs, files in os.walk(path):
    for filename in fnmatch.filter(files, '*.rar'):
        fullPath = os.path.join(root, filename)
        # extractall() unpacks every member of the archive into destinationPath
        RarFile(fullPath).extractall(destinationPath)

Distributing a Python script to unpack .tar.xz

Is there a way to distribute a Python script that can unpack a .tar.xz file?
Specifically:
This needs to run on other people's machines, not mine, so I can't require any extra modules to have been installed.
I can get away with assuming the presence of Python 2.7, but not 3.x.
So that seems to amount to asking whether out-of-the-box Python 2.7 has such a feature, and as far as I can tell the answer is no, but is there anything I'm missing?

First decompress the xz file into tar data and then extract the tar data:
import lzma
import tarfile
with lzma.open("file.tar.xz") as fd:
    with tarfile.open(fileobj=fd) as tar:
        tar.extractall('/path/to/extract/to')
For Python 2.7 you need to install pip27.pylzma, because the lzma module was only added to the standard library in Python 3.3.
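For what it's worth, if Python 3.3 or later is available, tarfile can read xz-compressed archives directly, so the separate lzma.open step is not needed. This does not help with the stock Python 2.7 constraint in the question. It is just a sketch of the simpler form, with a made-up demo archive built in a temporary directory so the example is self-contained. unpack_tar_xz is a hypothetical helper name.

```python
import os
import tarfile
import tempfile


def unpack_tar_xz(archive_path, dest_dir):
    """Extract a .tar.xz archive (tarfile supports xz from Python 3.3)."""
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, 'r:xz') as tar:
        # Only extract archives you trust: members can carry unexpected paths.
        tar.extractall(dest_dir)


# Round trip in a temporary directory to show the call in use.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, 'hello.txt')
    with open(src, 'w') as f:
        f.write('hello\n')
    archive = os.path.join(tmp, 'demo.tar.xz')
    with tarfile.open(archive, 'w:xz') as tar:
        tar.add(src, arcname='hello.txt')
    out_dir = os.path.join(tmp, 'out')
    unpack_tar_xz(archive, out_dir)
    extracted = sorted(os.listdir(out_dir))

print(extracted)
```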

Where is _functools.py located? [duplicate]

How do I learn where the source file for a given Python module is installed? Is the method different on Windows than on Linux?
I'm trying to look for the source of the datetime module in particular, but I'm interested in a more general answer as well.

For a pure python module you can find the source by looking at themodule.__file__.
The datetime module, however, is written in C, so datetime.__file__ points to a .so file (there is no datetime.__file__ on Windows) and you can't see the source that way.
If you download a python source tarball and extract it, the modules' code can be found in the Modules subdirectory.
For example, if you want to find the datetime code for python 2.6, you can look at
Python-2.6/Modules/datetimemodule.c
You can also find the latest version of this file on github on the web at
https://github.com/python/cpython/blob/main/Modules/_datetimemodule.c

Running python -v from the command line should tell you what is being imported and from where. This works for me on Windows and Mac OS X.
C:\>python -v
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
# C:\Python24\lib\site.pyc has bad mtime
import site # from C:\Python24\lib\site.py
# wrote C:\Python24\lib\site.pyc
# C:\Python24\lib\os.pyc has bad mtime
import os # from C:\Python24\lib\os.py
# wrote C:\Python24\lib\os.pyc
import nt # builtin
# C:\Python24\lib\ntpath.pyc has bad mtime
...
I'm not sure what those bad mtime's are on my install!

I realize this answer is 4 years late, but the existing answers are misleading people.
The right way to do this is never __file__, or trying to walk through sys.path and search for yourself, etc. (unless you need to be backward compatible beyond 2.1).
It's the inspect module—in particular, getfile or getsourcefile.
Unless you want to learn and implement the rules (which are documented, but painful, for CPython 2.x, and not documented at all for other implementations, or 3.x) for mapping .pyc to .py files; dealing with .zip archives, eggs, and module packages; trying different ways to get the path to .so/.pyd files that don't support __file__; figuring out what Jython/IronPython/PyPy do; etc. In which case, go for it.
Meanwhile, every Python version's source from 2.0+ is available online at http://hg.python.org/cpython/file/X.Y/ (e.g., 2.7 or 3.3). So, once you discover that inspect.getfile(datetime) is a .so or .pyd file like /usr/local/lib/python2.7/lib-dynload/datetime.so, you can look it up inside the Modules directory. Strictly speaking, there's no way to be sure of which file defines which module, but nearly all of them are either foo.c or foomodule.c, so it shouldn't be hard to guess that datetimemodule.c is what you want.
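A minimal sketch of the inspect approach described above. describe_module is a hypothetical helper name; json and sys are used only as examples of a pure-Python module and a built-in one.

```python
import inspect
import json
import sys


def describe_module(module):
    """Return (file, source_file) for a module, or (None, None) for built-ins."""
    try:
        path = inspect.getfile(module)
    except TypeError:
        # inspect.getfile raises TypeError for built-in modules such as sys.
        return (None, None)
    # getsourcefile returns None when there is no Python source (e.g. a .so/.pyd).
    return (path, inspect.getsourcefile(module))


print(describe_module(json))
print(describe_module(sys))
```

For an extension module that lives in a .so or .pyd file, the first element is that file's path and the second is None, which is the cue to go look in the Modules directory of the CPython source.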

If you're using pip to install your modules, just run pip show $module and the location is returned.

The sys.path list contains the list of directories which will be searched for modules at runtime:
python -v
>>> import sys
>>> sys.path
['', '/usr/local/lib/python25.zip', '/usr/local/lib/python2.5', ... ]

From the standard library, try imp.find_module:
>>> import imp
>>> imp.find_module('fontTools')
(None, 'C:\\Python27\\lib\\site-packages\\FontTools\\fontTools', ('', '', 5))
>>> imp.find_module('datetime')
(None, 'datetime', ('', '', 6))
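Note that the imp module has been deprecated since Python 3.4 and was removed in Python 3.12. On Python 3 the closest equivalent is importlib.util.find_spec. A small sketch follows; module_origin is a hypothetical helper name.

```python
import importlib.util


def module_origin(name):
    """Return where a top-level module would be loaded from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # origin is a file path for file-based modules and 'built-in' for built-ins.
    return spec.origin


print(module_origin('json'))
print(module_origin('sys'))
```

Like imp.find_module, this locates the module without importing it (for dotted names, the parent packages do get imported).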

datetime is a builtin module, so there is no (Python) source file.
For modules coming from .py (or .pyc) files, you can use mymodule.__file__, e.g.
> import random
> random.__file__
'C:\\Python25\\lib\\random.pyc'

Here's a one-liner to get the filename for a module, suitable for shell aliasing:
echo 'import sys; t=__import__(sys.argv[1],fromlist=["."]); print(t.__file__)' | python -
Set up as an alias:
alias getpmpath="echo 'import sys; t=__import__(sys.argv[1],fromlist=[\".\"]); print(t.__file__)' | python - "
To use:
$ getpmpath twisted
/usr/lib64/python2.6/site-packages/twisted/__init__.pyc
$ getpmpath twisted.web
/usr/lib64/python2.6/site-packages/twisted/web/__init__.pyc

In the python interpreter you could import the particular module and then type help(module). This gives details such as Name, File, Module Docs, Description et al.
Ex:
import os
help(os)
Help on module os:
NAME
os - OS routines for Mac, NT, or Posix depending on what system we're on.
FILE
/usr/lib/python2.6/os.py
MODULE DOCS
http://docs.python.org/library/os
DESCRIPTION
This exports:
- all functions from posix, nt, os2, or ce, e.g. unlink, stat, etc.
- os.path is one of the modules posixpath, or ntpath
- os.name is 'posix', 'nt', 'os2', 'ce' or 'riscos'
et al

On Windows you can find the location of a Python module as shown below, e.g. to find the rest_framework module:

New in Python 3.2, you can now use e.g. code_info() from the dis module:
http://docs.python.org/dev/whatsnew/3.2.html#dis
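A short sketch of what that looks like in practice. dis.code_info works on functions, methods and other code objects written in Python (not on modules implemented in C), and its report includes a Filename line. json.dumps is just an example target.

```python
import dis
import json

# code_info returns a multi-line string describing the code object.
info = dis.code_info(json.dumps)

# Pick out the line that reports which file the function was compiled from.
filename_line = next(
    line for line in info.splitlines() if line.startswith('Filename:')
)
print(filename_line)
```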

Check out this nifty "cdp" command to cd to the directory containing the source for the indicated Python module:
cdp () {
cd "$(python -c "import os.path as _, ${1}; \
print _.dirname(_.realpath(${1}.__file__[:-1]))"
)"
}

Just updating the answer in case anyone needs it now, I'm at Python 3.9 and using Pip to manage packages. Just use pip show, e.g.:
pip show numpy
It will give you all the details with the location of where pip is storing all your other packages.

On Ubuntu 12.04, for example numpy package for python2, can be found at:
/usr/lib/python2.7/dist-packages/numpy
Of course, this is not a generic answer.

Another way to check if you have multiple python versions installed, from the terminal.
$ python3 -m pip show pyperclip
Location: /Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-
$ python -m pip show pyperclip
Location: /Users/umeshvuyyuru/Library/Python/2.7/lib/python/site-packages

Not all python modules are written in python. Datetime happens to be one of them that is not, and (on linux) is datetime.so.
You would have to download the source code to the python standard library to get at it.

For those who prefer a GUI solution: if you're using a gui such as Spyder (part of the Anaconda installation) you can just right-click the module name (such as "csv" in "import csv") and select "go to definition" - this will open the file, but also on the top you can see the exact file location ("C:....csv.py")

If you are not using the interactive interpreter, you can run the code below:
import site
print (site.getsitepackages())
Output:
['C:\\Users\\<your username>\\AppData\\Local\\Programs\\Python\\Python37', 'C:\\Users\\<your username>\\AppData\\Local\\Programs\\Python\\Python37\\lib\\site-packages']
The second element in the list will be your package location. In this case:
C:\Users\<your username>\AppData\Local\Programs\Python\Python37\lib\site-packages
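Which position the site-packages directory takes in that list depends on the platform, so relying on "the second element" is fragile. As an alternative sketch, sysconfig from the standard library can be asked for the directory by name ('purelib' is the key for pure-Python packages):

```python
import sysconfig

# get_paths() returns a dict of installation directories for this interpreter.
paths = sysconfig.get_paths()
site_packages = paths['purelib']
print(site_packages)
```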

In an IDE like Spyder, import the module and then run the module individually.

As written above, in Python just use help(module), i.e.:
import fractions
help(fractions)
If the module (fractions in this example) is installed, this shows its location and information about it; if it is not installed, the import fails with an error saying there is no such module.
In that case the module does not come with Python by default, and you can check where you found it for download information.

How to bundle Python dependencies in IronWorker?

I'm writing a simple IronWorker in Python to do some work with the AWS API.
To do so I want to use the boto library which is distributed via PyPi repository. The boto library is not installed by default in the IronWorker runtime environment.
How can I bundle the boto library dependency with my IronWorker code?
Ideally I'm hoping I can use something like the gem dependency bundling available for Ruby IronWorkers - i.e. in myRuby.worker specify
gemfile '../Gemfile', 'common', 'worker' # merges gems from common and worker groups
In the Python Loggly sample, I see that the hoover library is used:
#here we have to include hoover library with worker.
hoover_dir = os.path.dirname(hoover.__file__)
shutil.copytree(hoover_dir, worker_dir + '/loggly') #copy it to worker directory
However, I can't see where/how you specify which hoover library version you want, or where to download it from.
What is the official/correct way to use 3rd party libraries in Python IronWorkers?

Newer versions of iron_worker have native support for the pip command.
So, in your .worker file, you need:
runtime "python"
exec "something.py"
pip "boto"
pip "someotherpip"
full_remote_build true

[edit]We've worked on our toolset a bit since this answer was written and accepted. The answer from my colleague below is the recommended course moving forward.[/edit]
I wrote the Python client library for IronWorker. I'm also employed by Iron.io.
If you're using the Python client library, the easiest (and recommended) way to do this is to just copy over the library's installed folder, and include it when uploading the package. That's what the Python Loggly sample is doing above. As you said, that doesn't specify a version or where to download the library from, because it doesn't care. It just takes the one installed on your system and uses it. Whatever you get when you enter "import boto" on your local machine is what would be uploaded.
The other option is using our CLI to upload your worker, with a .worker file.
To do this, here's what you'd need to do:
Create a botoworker.worker file:
runtime "binary"
build 'pip install --install-option="--prefix=`pwd`/pips" boto'
file 'botoworker.py'
exec "botoworker.sh"
That second line is the pip command that will be run to install the dependency. You can modify it like you would any pip command run from the command line. It's going to execute that command on the worker during the "build" phase, so it's only executed once instead of every time you run a task.
The third line should be changed to the Python file you want to run--it's your Python worker file. Here's the one we used to test this:
import boto
If you save that as botoworker.py, the above should work without any modification. :)
The fourth line is a shell script that's going to actually run your worker. I've included the one we used below. Just save it as botoworker.sh, and you won't have to worry about modifying the .worker file above.
PYTHONPATH="$HOME/pips/lib/python2.7/site-packages:$PYTHONPATH" python botoworker.py "$@"
You'll notice it refers to your Python file--if you don't name your Python file botoworker.py, remember to change it here, too. All this does is set your PYTHONPATH to include the installed library, and then runs your Python file.
To upload this, just make sure you have the CLI installed (gem install iron_worker_ng, making sure your Ruby version is 1.9.3 or higher) and then run "iron_worker upload botoworker" in your shell, from the same directory your botoworker.worker file is in.
Hope this helps!

Is there a faster method to load a yaml file than the standard .load method? Django/Python

I am loading a big yaml file and it is taking forever. I am wondering if there is a faster method than the yaml.load() method.
I have read that there is a CLoader method but haven't been able to run it.
The website that suggested this CLoader method asks me to do this:
Download the source package PyYAML-3.08.tar.gz and unpack it.
Go to the directory PyYAML-3.08 and run:
$ python setup.py install
If you want to use LibYAML bindings, which are much faster than the pure Python version, you need to download and install LibYAML.
Then you may build and install the bindings by executing
$ python setup.py --with-libyaml install
In order to use LibYAML based parser and emitter, use the classes CParser and CEmitter:
from yaml import load, dump
try:
    from yaml import CLoader as Loader, CDumper as Dumper
except ImportError:
    from yaml import Loader, Dumper
This looks like it will work, but I don't have a setup.py anywhere in my Django project and therefore can't install/import any of these things.
Can anyone help me figure out how to do this or let me know about another faster loading method?
Thanks for the help!!

I have no idea what's faster - bspymaster's ideas might be the most useful.
When you download PyYAML-3.08.tar.gz, inside the archive there will be a setup.py that you can run.
Note to use LibYAML, download this: http://pyyaml.org/download/libyaml/yaml-0.1.4.tar.gz
And run using the instructions from http://pyyaml.org/wiki/LibYAML
You will need a set of build tools, which should already be installed on Linux/Unix; for OS X, make sure Xcode is installed, and I'm not sure about Windows.
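Once PyYAML is installed with the LibYAML bindings, using the faster loader is just a matter of passing it to yaml.load. A minimal sketch, assuming PyYAML is importable. It uses the "safe" loader variants here, which is a substitution for the CLoader/Loader pair quoted in the question, and load_yaml is a hypothetical helper name.

```python
import yaml

try:
    # LibYAML-backed loader; only present if PyYAML was built with the bindings.
    from yaml import CSafeLoader as SafeLoader
except ImportError:
    # Pure-Python fallback: slower, same results.
    from yaml import SafeLoader


def load_yaml(text):
    """Parse a YAML document with the fastest safe loader available."""
    return yaml.load(text, Loader=SafeLoader)


data = load_yaml("a: 1\nb: [x, y]\n")
print(data)
```

Whether you actually got the C loader can be checked by printing SafeLoader: the C variant shows up as yaml.cyaml.CSafeLoader.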
