Not managing to extract RAR archive using rarfile module - python

I have been trying to make a script that extracts *.rar files for but am receiving errors. I've been struggling to understand the documentation of the module to no avail (I'm new to programming so sometimes get a bit lost in all the docs).
Here is the relevant part of my code, and the error received.
Snippet from my code:
import rarfile
rarpath='/home/maze/Desktop/test.rar'
def unrar(file):
rf=rarfile.RarFile(file)
rf.rarfile.extract_all()
unrar(rarpath)
Error received:
File "unrarer.py", line 26, in unrar
rf.rarfile.extract_all()
AttributeError: 'str' object has no attribute 'extract_all'
I have installed rarfile2.8 and unrar0.3 using pip (note sure if the later was necessary).
Thanks in advance for any assistance correcting my function or helping understand the package's documentation.

Support for RAR files in general is quite poor, this experience is par for the course.
In order to get the rarfile Python module to work, you have to also install a supported tool for extracting RAR files. Your only two choices are bsdtar or unrar. Do not install these with Pip, you have to install these with your Linux package manager (or install them yourself, if you think that the computer's time is more valuable than your time). For example on Debian-based systems (this includes Ubuntu) run,
sudo apt install bsdtar
Or,
sudo apt install unrar
Note that bsdtar does not have the same level of support for RAR files that Unrar does. Some newer RAR files will not extract with bsdtar.
Then your code should look something like this:
import rarfile
def unrar(file):
rf = rarfile.RarFile(file)
rf.extract_all()
unrar('/home/maze/Desktop/test.rar')
Note the use of rf.extract_all(), not rf.rarfile.extract_all().
If you are just doing extract_all then there is no need to use a rarfile module, though. You can just use the subprocess module:
import subprocess
path = '/path/to/archive.rar'
subprocess.check_call(['unrar', 'x', path])
The rarfile module is basically nothing more than a wrapper around subprocess anyway.
Of course, if you have a choice in the matter, I recommend migrating your archives to a more portable and better supported archive format.

if you are in windows, it worked for me. You need to go to https://www.rarlab.com/rar_add.htm download UnRAR for Windows - Command line freeware Windows UnRAR, execute it, extract it to a folder and add the executable path in your code after importing rarfile:
rarfile.UNRAR_TOOL = r"C:\FilePath\UnRAR.exe"

rf.rarfile is the name of the file which you can see by printing its value. Remove that and check out help(rarfile.RarFile) for the method you want.
import rarfile
rarpath='/home/maze/Desktop/test.rar'
def unrar(file):
rf=rarfile.RarFile(file)
rf.extractall()
unrar(rarpath)

try this
import fnmatch
from rarfile import RarFile
path = r'C:\Users\byqpz\Desktop\movies\rars'
destinationPath = r'C:\Users\byqpz\Desktop\movies\destination'
for root, dirs, files in os.walk(path):
for filename in fnmatch.filter(files, '*.rar'):
fullPath = os.path.join(root, filename)
RarFile(fullPath).extract(destinationPath)

Related

How to unrar files

I am trying to unrar files uploaded by users in django web application. I have gone through different approaches but none of them are working for me.
The below methods work fine in my local OS (Ubuntu 18.04), but n't working in the server.
I am using Python 3.6.
import rarfile - here 'rarfile' is imported from python3.6/site-packages/rarfile. It is giving this error -
bsdtar: Error opening archive: Failed to open '--'
Following the steps suggested here, I installed 'unrar'.
from unrar import rarfile - here it's being imported from python3.6/site-packages/unrar/
But this gave the error - LookupError: Couldn't find path to unrar library.
So to solve the issue, I followed the steps mentioned here, and created UnRaR
executable, and also set the path in my environment. This works fine in local, but unable to understand how to do the same in 'production environment'.
import patoolib - importing from patool
I also tried implementing patool, which worked fine in local but not in production and I am getting this error - patoolib.util.PatoolError: could not find an executable program to extract format rar; candidates are (rar,unrar,7z),
This has become a big blocker in my application, can anyone please explain how to solve the problem.
Code snippet for rarfile -
import rarfile
.....
rf = rarfile.RarFile(file_path)
rf.extractall(extract_to_path)
Code snippet for patool -
import patoolib
...
patoolib.extract_archive(file_path, outdir=extract_to_path)
Patool is the good python library it is simple and easy. A simple example of it is here:
pip install patool
Use pip install patool to install the library
import patoolib
patoolib.extract_archive("foo_bar.rar", outdir="path here")
Now, in your case the problem is probably that you don't have your unraring tool in the path, indicating it can not be called from the command line (which is exactly what patool does). Just put your rar file folder to the path and you're fine or give the absolute path to the rar file.

Trying to read JSON file within a Python package

I am in the process of packaging up a python package that I'll refer to as MyPackage.
The package structure is:
MyPackage/
script.py
data.json
The data.json file comprises cached data that is read in script.py.
I have figured out how to include data files (use of setuptools include_package_data=True and to also include path to data file in the MANIFEST.in file) but now when I pip install this package and import the installed MyPackage (currently testing install by pip from the GitHub repository) I get a FileNotFound exception (data.json) in the script that is to utilize MyPackage. However, I see that the data.json file is indeed installed in Lib/site-packages/MyPackage.
Am I doing something wrong here by trying to read in a json file in a package?
Note that in script.py I am attempting to read data.json as open('data.json', 'r')
Am I screwing up something regarding the path to the data file?
You're not screwing something up, accessing package resources is just a little tricky - largely because they can be packaged in formats where your .json might strictly speaking not exist as an actual file on the system where your package is installed (e.g. as zip-app). The right way to access your data file is not by specifying a path to it (like "MyPackage/data.json"), but by accessing it as a resource of your installed package (like "MyPackage.data.json"). The distinction might seem pedantic, but it can matter a lot.
Anyway, the access should be done using the builtin importlib.resources module:
import importlib.resources
import json
with importlib.resources.open_text("MyPackage", "data.json") as file:
data = json.load(file)
# you should be able to access 'data' like a dictionary here
If you happen to work on a python version lower than 3.7, you will have to install it as importlib_resources from pyPI.
I resolved the issue by getting the 'relative path' to where the package is.
self.data = self.load_data(path=os.path.join(
os.path.dirname(os.path.abspath(__file__)),
'data.json'))
load_data just reads the data file
Any constructive criticism is still very much welcome. Not trying to write stupid code if I can't help it :)

ZXing in python FileNotFoundError Problem

I have written a simple Python code to detect qrCode. Code:
import zxing
reader = zxing.BarCodeReader()
barcode = reader.decode('../images/QR_CODE-easy.png')
print(barcode)
Now, when I run it, I get the following error:
FileNotFoundError: [WinError 2] The system cannot find the file specified
I have check this file location is valid by using cv.imread command. Please let me know if someone has a solution to this problem.
You need to install Java Development Kit.
The readme of ZXing Python wrapper says the following:
Dependencies and installation
Use the Python 3 version of pip (usually invoked via pip3) to install: pip3 install zxing
You'll neeed to have a recent java binary somewhere in your path. (Tested with OpenJDK.)
pip will automatically download the relevant JAR files for the Java ZXing libraries (currently v3.4.1)
You appear to be on Windows (as the error code suggests), which uses the backslash for file paths.
It's not good practice as it won't be widely compatible, but if you're in a hurry and know you won't want to use the code on Mac or Linux, you can use double backslashes:
reader.decode('..\\images\\QR_CODE-easy.png')
Otherwise you should use os.path.join or pathlib (assuming your using Python 3)
import os.path
qr_file = os.path.join("..", "images", "QR_CODE-easy.png")
Or
from pathlib import Path
qr_file = Path("../images/QR_CODE-easy.png")
There are further details of a few options here:
https://medium.com/#ageitgey/python-3-quick-tip-the-easy-way-to-deal-with-file-paths-on-windows-mac-and-linux-11a072b58d5f
Edit: it is also worth confirming that your relative path is indeed correct when starting in your current working directory. You can check the current working directory with: cwd = os.getcwd() . You may want to try an absolute path to your file too, just to confirm whether it works with that first.
More details on cwd here: https://stackoverflow.com/a/5137509/142780

Mimic 7zip with python

I am using Python 3.6, and currently I subprocess out to my 7zip program to get the compression I need.
subprocess.call('7z a -t7z -ms=off {0} *'.format(filename))
I know the zipfile class has ‘ZIP_LZMA’ compression, but the application I am passing this too says the output file isn’t correct. So what else do I have to add to the ZipFile class to make it mimic the above command?
If you do not care much for Windows, then perhaps libarchive could help. In Ubuntu, for example:
$ sudo apt install python3-libarchive-c
Then:
import libarchive
with libarchive.file_writer('test.7z', '7zip') as archive:
archive.add_files('first.file', 'second.file', 'third.file')
Then there is the pylib7zip library, which wraps the existing 7z.dll and seems to offer a Windows-only alternative.

How to make an "always relative to current module" file path?

Let's say you have a module which contains
myfile = open('test.txt', 'r')
And the 'test.txt' file is in the same folder. If you'll run the module, the file will be opened successfully.
Now, let's say you import that module from another one which is in another folder. The file won't be searched in the same folder as the module where that code is.
So how to make the module search files with relative paths in the same folder first?
There are various solutions by using "__file__" or "os.getcwd()", but I'm hoping there's a cleaner way, like same special character in the string you pass to open() or file().
The solution is to use __file__ and it's pretty clean:
import os
TEST_FILENAME = os.path.join(os.path.dirname(__file__), 'test.txt')
For normal modules loaded from .py files, the __file__ should be present and usable. To join the information from __file__ onto your relative path, there's a newer option than os.path interfaces available since 2014:
from pathlib import Path
here = Path(__file__).parent
fname = here / "test.txt"
with fname.open() as f:
...
pathlib was added to Python in 3.4 - see PEP428. For users still on Python 2.7 wanting to use the same APIs, a backport is available.
Note that when you're working with a Python package, there are better approaches available for reading resources - you could consider moving to importlib-resources. This requires Python 3.7+, for older versions you can use pkgutil. One advantage of packaging the resources correctly, rather than joining data files relative to the source tree, is that the code will still work in cases where it's not extracted on a filesystem (e.g. a package in a zipfile). See How to read a (static) file from inside a Python package? for more details about reading/writing data files in a package.

Categories