I need to compress multiple files into one bz2 file in Python.
I've been trying to find a way, but I can't find an answer.
Is it possible?
This is what tarballs are for. The tar format packs the files together, then you compress the result. Python makes it easy to do both at once with the tarfile module, where passing a "mode" of 'w:bz2' opens a new tar file for writing with seamless bz2 compression. Super-simple example:
import tarfile
with tarfile.open('mytar.tar.bz2', 'w:bz2') as tar:
    for file in mylistoffiles:
        tar.add(file)
If you don't need much control over the operation, shutil.make_archive is a possible alternative, which would simplify the code for compressing a whole directory tree to:
import shutil
shutil.make_archive('mytar', 'bztar', directory_to_compress)
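To check that the tarfile approach does what you expect, here is a minimal round-trip sketch. The helper names pack_bz2 and list_members and the sample file names are made up for illustration; it writes throwaway files to a temporary directory, packs them, and reads the member names back.

```python
import os
import tarfile
import tempfile


def pack_bz2(archive_path, paths):
    """Pack the given files into one bz2-compressed tar archive."""
    with tarfile.open(archive_path, "w:bz2") as tar:
        for path in paths:
            # arcname stores only the file name, not the full source path
            tar.add(path, arcname=os.path.basename(path))


def list_members(archive_path):
    """Return the member names stored in a bz2-compressed tar archive."""
    with tarfile.open(archive_path, "r:bz2") as tar:
        return tar.getnames()


# Demo on throwaway files in a temporary directory
with tempfile.TemporaryDirectory() as tmp:
    sources = []
    for name in ("a.txt", "b.txt"):
        path = os.path.join(tmp, name)
        with open(path, "w") as fh:
            fh.write("contents of " + name)
        sources.append(path)

    archive = os.path.join(tmp, "mytar.tar.bz2")
    pack_bz2(archive, sources)
    packed_names = list_members(archive)
    print(packed_names)
```

Passing arcname is optional; without it, tar.add stores the path exactly as you gave it.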
Take a look at Python's bz2 library. Make sure to google and read the docs first!
https://docs.python.org/2/library/bz2.html#bz2.BZ2Compressor
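For reference, a rough sketch of what the bz2 module gives you on its own. BZ2Compressor produces a single compressed stream, so feeding it several files' bytes just concatenates them, which is why an archive format like tar is still needed to keep the files apart. The function name compress_chunks and the sample data are assumptions for illustration.

```python
import bz2


def compress_chunks(chunks):
    """Compress an iterable of bytes chunks into one bz2 stream."""
    compressor = bz2.BZ2Compressor()
    pieces = []
    for chunk in chunks:
        # compress() may return b"" while it buffers input internally
        pieces.append(compressor.compress(chunk))
    # flush() emits whatever is still buffered and ends the stream
    pieces.append(compressor.flush())
    return b"".join(pieces)


data = [b"first file's bytes\n", b"second file's bytes\n"]
blob = compress_chunks(data)

# Decompressing yields one undivided byte string: the file boundaries are gone
restored = bz2.decompress(blob)
```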
You need to import these modules:
import os
import tarfile
Then compress multiple files into one bz2-compressed tar archive:
tar = tarfile.open("save the directory.tar.bz", "w:bz2")
for f in ["gti.png","gti.txt","file.taz"]:
    tar.add(os.path.basename(f))
tar.close()
To refer to a file by name only, without its directory, use
os.path.basename(src_file)
which returns just the final component of the path.
Python's standard library zipfile module handles multiple files and has supported bz2 compression (zipfile.ZIP_BZIP2) since Python 3.3; the ZIP format itself has allowed bzip2 since 2001.
import zipfile
sourcefiles = ['a.txt', 'b.txt']
with zipfile.ZipFile('out.zip', 'w') as outputfile:
    for sourcefile in sourcefiles:
        outputfile.write(sourcefile, compress_type=zipfile.ZIP_BZIP2)
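If you want to confirm that the entries really were stored with bzip2, something along these lines should work. It is only a sketch: it builds the archive in memory with made-up entry names and contents, setting the compression once on the ZipFile instead of per file, then inspects each entry's compress_type.

```python
import io
import zipfile

buffer = io.BytesIO()

# Setting compression on the ZipFile makes it the default for every entry
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_BZIP2) as archive:
    archive.writestr("a.txt", "alpha " * 100)
    archive.writestr("b.txt", "beta " * 100)

# Reopen the same buffer for reading and inspect what was stored
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    all_bzip2 = all(
        info.compress_type == zipfile.ZIP_BZIP2 for info in archive.infolist()
    )
    first = archive.read("a.txt")
```

Keep in mind that not every unzip tool understands bzip2-compressed entries, so the default deflate method may be the safer choice if others need to open the file.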
Related
Is there a way to unzip a .usdz file in Python? I was looking at shutil.unpack_archive, but it looks like I can't use that without an existing function to unpack it. They use zip compression, just with a different file extension. Would just renaming them to have .zip extensions work? Is there a way to "tell" shutil that these are basically .zip files, or is there something else I can use?
Running the Linux unzip command can unpack them, but due to my relative unfamiliarity with shell scripting and the file manipulation I'll need to do, I'd prefer to use Python.
You can do this a couple of ways.
Use shutil.unpack_archive with the format="zip" argument, e.g.
import shutil
archive_path = "/path/to/archive.usdz"
shutil.unpack_archive(archive_path, format="zip")
# note you can also pass extract_dir keyword argument to
# set where the files are extracted to
You can also directly use the zipfile module:
import zipfile
archive_path = "/path/to/archive.usdz"
zf = zipfile.ZipFile(archive_path)
zf.extractall()
# note that this extracts to the working directory unless you specify the path argument
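Here is a self-contained sketch of the first option that you can run without a real .usdz file. It fakes one by writing an ordinary zip under a .usdz name (the file names model.usdz and scene.usdc are placeholders), then unpacks it into a chosen directory.

```python
import os
import shutil
import tempfile
import zipfile

with tempfile.TemporaryDirectory() as tmp:
    # Build a stand-in archive: an ordinary zip saved with a .usdz extension
    archive_path = os.path.join(tmp, "model.usdz")
    with zipfile.ZipFile(archive_path, "w") as zf:
        zf.writestr("scene.usdc", b"placeholder bytes")

    # shutil does not know the .usdz extension, so name the format explicitly
    extract_dir = os.path.join(tmp, "unpacked")
    shutil.unpack_archive(archive_path, extract_dir, format="zip")

    extracted = sorted(os.listdir(extract_dir))
    print(extracted)
```

No renaming is needed; the format argument overrides the extension-based guess.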
Let's say we have a tar file which in turn contains multiple gzip compressed files. I want to be able to read the contents of those gzip files without extracting either the tar file or the individual gzip files to disk. I'm trying to use the tarfile module in Python.
This might work, I haven't tested it, but this has the main ideas, and related tools. It iterates over the files in the tar, and if they are gzipped, then will read them into the file_contents variable:
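import tarfile as t
import gzip as g
tar = t.open("your.gz.tar")
for member in tar.getmembers():
    fo = tar.extractfile(member) # extractfile is a method of the open tar; returns None for non-file members
    if fo is not None:
        file_contents = g.GzipFile(fileobj=fo).read()
tar.close()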
note: if the file is too large for memory, then consider looking into a streamed reader (chunk by chunk) as linked.
If you have additional logic based on what the member (TarInfo) object looks like you can use these:
https://docs.python.org/2/library/tarfile.html#tarinfo-objects
see:
How can I decompress a gzip stream with zlib?
Python decompressing gzip chunk-by-chunk
reading tar file contents without untarring it, in python script
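For the chunk-by-chunk case mentioned in the note, a sketch along these lines may help. The generator name iter_gzip_members and the 64 KiB chunk size are my own choices, and it assumes every regular file in the tar is gzip-compressed (anything else would raise an error from the gzip module). The demo builds a tiny tar in memory so it can run as is.

```python
import gzip
import io
import tarfile


def iter_gzip_members(tar_fileobj, chunk_size=64 * 1024):
    """Yield (member name, decompressed chunk) pairs without writing to disk."""
    with tarfile.open(fileobj=tar_fileobj, mode="r") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, links, and other non-files
            raw = tar.extractfile(member)
            with gzip.GzipFile(fileobj=raw) as gz:
                while True:
                    chunk = gz.read(chunk_size)
                    if not chunk:
                        break
                    yield member.name, chunk


# Build a small tar holding one gzip member, entirely in memory
payload = gzip.compress(b"hello from inside the tar\n")
tar_bytes = io.BytesIO()
with tarfile.open(fileobj=tar_bytes, mode="w") as tar:
    info = tarfile.TarInfo(name="inner.txt.gz")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
tar_bytes.seek(0)

chunks = list(iter_gzip_members(tar_bytes))
```

With a real file you would pass open("your.gz.tar", "rb") instead of the in-memory buffer and process each chunk as it arrives rather than collecting them in a list.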
I am aware of this question Why "is_zipfile" function of module "zipfile" always returns "false"?. I want to seek some more clarification and confirmation.
I have created a zip file in python using the gzip module.
If I check the zip file using the file command in OSX I get this
> file data.txt
data.txt: gzip compressed data, was "Slide1.html", last modified: Tue Oct 13 10:10:13 2015, max compression
I want to write a generic function to tell if the file is gzip'ed or not.
import gzip
import os
f = '/path/to/data.txt'
print(os.path.exists(f)) # True
with gzip.GzipFile(f) as zf:
    print(zf.read()) # Prints out content as expected
import zipfile
print(zipfile.is_zipfile(f)) # Gives me False. Not expected
I want to use zipfile module but it always reports false.
I just want to have a confirmation that zipfile module is not compatible with gzip. If so, why it is the case? Are zip and gzip considered different format?
I have created a zip file in python using the gzip module.
No you haven't. gzip doesn't create zip files.
I just want to have a confirmation that zipfile module is not compatible with gzip.
Confirmed.
If so, why it is the case?
A gzip file is a single file compressed with DEFLATE (the algorithm zlib implements), preceded by a very small header. A zip file holds multiple files, each optionally compressed (usually also with DEFLATE), in a single archive with per-file headers and a central directory.
Are zip and gzip considered different format?
Yes.
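As for a generic "is this gzip?" check, one possible sketch is to look at the first two bytes, since gzip streams start with the magic number 1f 8b. The function name is_gzip is made up here, and it takes an open binary file object (for a path, pass open(path, "rb")). The demo compares it with zipfile.is_zipfile on in-memory data.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"


def is_gzip(fileobj):
    """Return True if the binary stream starts with the gzip magic number."""
    start = fileobj.tell()
    magic = fileobj.read(2)
    fileobj.seek(start)  # leave the stream where we found it
    return magic == GZIP_MAGIC


# One gzip stream and one zip archive, both built in memory
gz_stream = io.BytesIO(gzip.compress(b"some text"))
zip_stream = io.BytesIO()
with zipfile.ZipFile(zip_stream, "w") as zf:
    zf.writestr("data.txt", "some text")
zip_stream.seek(0)

# Each tuple is (is_gzip result, zipfile.is_zipfile result)
gzip_results = (is_gzip(gz_stream), zipfile.is_zipfile(gz_stream))
zip_results = (is_gzip(zip_stream), zipfile.is_zipfile(zip_stream))
```

A magic-number check only tells you how the file starts; a truncated or corrupt gzip file would still pass it.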
I am trying to unzip an Alpha.zip file which contains a Beta directory which contains a Gamma folder which contains a.Z, b.Z, c.Z, d.Z files. Using zip and 7-zip I was able to extract all a.D, b.D, c.D, d.D files stored within the .Z files.
I tried this in Python using import gzip and import zlib.
import sys
import os
import getopt
import gzip
f = open('a.d.Z','r')
file_content = f.read()
f.close()
I keep getting all sorts of errors including: this is not a zip file, return codecs.charmap_encode(input self.errors encoding_map) 0. Any suggestions as to how to code this?
You need to actually make use of a zip library of some kind. Right now you're importing gzip, but you're not doing anything with it. Try taking a look at the gzip documentation and opening the file using that library.
gzip_file = gzip.open('a.d.Z') # use gzip.open instead of builtin open function
file_content = gzip_file.read()
Edit based on your comment: you can't just open all kinds of compressed files with any compression library. Extensions are just conventions, so only you know for sure what compression format your file is in. Note that .Z conventionally marks output of the Unix compress utility (LZW), which neither Python's gzip module nor zlib can read; for such files an external tool like uncompress or gzip -d is needed. If your file is actually a zlib stream, do something like this instead:
# Note: untested code ahead!
import zlib
with open('a.d.Z', 'rb') as f: # Notice that I open this in binary mode
    file_content = f.read() # Read the compressed binary data
decompressed_content = zlib.decompress(file_content) # Decompress
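Since the format is uncertain, it may be worth guarding the zlib call so that a wrong guess fails cleanly. A small sketch, with a made-up helper name try_zlib; the compress_like bytes only imitate the start of a Unix compress file (magic 1f 9d plus a flags byte) and are not real compressed data.

```python
import zlib


def try_zlib(data):
    """Return the decompressed bytes, or None if data is not a zlib stream."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return None


zlib_bytes = zlib.compress(b"payload")

# Start of a Unix compress (.Z) file: magic 1f 9d and a flags byte,
# followed by made-up filler purely for illustration
compress_like = b"\x1f\x9d\x90" + b"filler"

ok = try_zlib(zlib_bytes)
not_ok = try_zlib(compress_like)
```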
I am trying to extract a bz2 compressed folder in a specific location.
I can see the data inside with:
handler = bz2.BZ2File(path, 'r')
print(handler.read())
But I wish to extract all the files in this compressed folder into a location (specified by the user), maintaining the internal directory structure of the folder.
I am fairly new to this language. Please help.
Like gzip, bz2 is only a compressor for single files; it cannot archive a directory structure. What I suspect you have is an archive that was first created by software like tar and then compressed with bz2. To recover the full directory structure, first decompress the bz2 file, then un-tar (or equivalent) the result.
Fortunately, the Python tarfile module supports bz2 compression, so you can do this process in one shot.
bzip2 is a data compression system which compresses one entire file. It does not bundle files and compress them like PKZip does. Therefore handler in your example has one and only one file in it and there is no "internal directory structure".
If, on the other hand, your file is actually a compressed tar-file, you should look at the tarfile module of Python which will handle decompression for you.
You need to use the tarfile module to uncompress a .tar.bz2 file ... from the docs here is how you can do it:
import tarfile
tar = tarfile.open(path, "r:bz2")
for tarinfo in tar:
    print(tarinfo.name, "is", tarinfo.size, "bytes in size and is", end=" ")
    if tarinfo.isreg():
        print("a regular file.")
        # read the file
        f = tar.extractfile(tarinfo)
        print(f.read())
    elif tarinfo.isdir():
        print("a directory.")
    else:
        print("something else.")
tar.close()
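To actually write the files out to a chosen location with the directory structure intact, extractall is probably what you want. Below is a minimal sketch: the function name extract_tar_bz2 and the sample tree project/docs/readme.txt are invented for the demo, which first creates a small .tar.bz2 in a temporary directory and then extracts it elsewhere.

```python
import os
import tarfile
import tempfile


def extract_tar_bz2(archive_path, destination):
    """Extract a .tar.bz2 archive into destination, keeping its layout."""
    os.makedirs(destination, exist_ok=True)
    with tarfile.open(archive_path, "r:bz2") as tar:
        # Only do this with archives you trust: member names can point
        # outside the destination directory.
        tar.extractall(path=destination)


with tempfile.TemporaryDirectory() as tmp:
    # Create a sample tree: project/docs/readme.txt
    nested = os.path.join(tmp, "project", "docs")
    os.makedirs(nested)
    with open(os.path.join(nested, "readme.txt"), "w") as fh:
        fh.write("hello")

    # Pack the tree into a bz2-compressed tar archive
    archive = os.path.join(tmp, "project.tar.bz2")
    with tarfile.open(archive, "w:bz2") as tar:
        tar.add(os.path.join(tmp, "project"), arcname="project")

    # Extract into a directory standing in for the user's chosen location
    target = os.path.join(tmp, "user_chosen_dir")
    extract_tar_bz2(archive, target)

    restored_path = os.path.join(target, "project", "docs", "readme.txt")
    restored_ok = os.path.isfile(restored_path)
    with open(restored_path) as fh:
        restored_text = fh.read()
```

In your case, destination would be whatever path the user supplies.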