Edit files inside zip file - python

Is there a way to work with (i.e. edit, create, delete...) files such as .docx and .xlsx inside a zip file without unzipping the file in Python?

There is no way to work (intelligently) with a file without reading it first. Unzipping == reading.
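To illustrate the point with a rough sketch: the standard zipfile module can read a member straight into memory, so nothing has to be extracted to disk, but the bytes are still decompressed (read) first. zipfile offers no in-place edit or delete, so the sketch rebuilds the archive with one member replaced. The helper name replace_member and the sample file names are made up for this example.

```python
import io
import zipfile


def replace_member(zip_bytes, name, new_data):
    """Return a new zip archive (as bytes) in which `name` holds `new_data`.

    Hypothetical helper: zipfile cannot edit or delete an entry in place,
    so every other entry is copied into a fresh archive.
    """
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename != name:
                dst.writestr(info, src.read(info.filename))
        dst.writestr(name, new_data)
    return out.getvalue()


# Build a small sample archive in memory (sample names are assumptions).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("notes.txt", "old text")
    zf.writestr("keep.txt", "unchanged")

updated = replace_member(buf.getvalue(), "notes.txt", "new text")
with zipfile.ZipFile(io.BytesIO(updated)) as zf:
    print(zf.read("notes.txt"))  # reading decompresses the entry in memory
```

The same pattern should apply to a .docx or .xlsx member, since those are themselves zip containers that can be opened from a BytesIO buffer.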

Related

How to open file with gz extension in Spyder?

I have a file with a .gz extension. Originally it was just a Python file with a .py extension. After some manipulation of that file, it became a file with a .gz extension. How do I get it back?
Thanks in advance.
You did one of two things:
You compressed the file using a tool such as zip, WinZip, or gzip.
You renamed the file with .gz as the extension.
A file extension is just an indicator of the expected file contents. I can have a text file with Python code and a .gz extension.
To fix your problem, open the file in a text editor. If the contents look like garbage, you've opened a compressed file and will need to uncompress it using appropriate software.
Most likely, you will find that the file contains readable Python code. In that case, simply save the file with the correct extension.
And pay closer attention to how you are manipulating your files in the future.
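The "open it and look" check can also be done programmatically. The following is a minimal sketch, assuming the file is either plain UTF-8 text or a real gzip stream; it relies on the fact that gzip data starts with the two bytes 0x1f 0x8b. The function names are invented for this example.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def looks_like_gzip(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def recover_source(path):
    """Return the text of `path`, decompressing it only if it is really gzip."""
    if looks_like_gzip(path):
        with gzip.open(path, "rt", encoding="utf-8") as fh:
            return fh.read()
    # Not compressed: the file was merely renamed, so read it as plain text.
    with open(path, "r", encoding="utf-8") as fh:
        return fh.read()
```

The returned text can then be written back out under a .py name.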

How to create a zip archive in Python without creating files on file system?

One easy way is to create a directory and populate it with files. Then archive and compress that directory into a zip file called, say, file.zip. But this approach is needless since my files are in memory already, and needing to save them to disk is excessive.
Is it possible to create the directory structure right in memory, without saving the unzipped files/directories, so that I end up saving only the final file.zip (without the intermediate stage of saving files/directories on the file system)?
You can use zipfile:
import json
from zipfile import ZipFile

# `data` is whatever object you already hold in memory
with ZipFile("file.zip", "w") as zip_file:
    zip_file.writestr("root/file.json", json.dumps(data))
    zip_file.writestr("README.txt", "hello world")
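If even the final archive should stay in memory (for example, to send it over a network rather than save it), ZipFile also accepts a file-like object in place of a filename. A minimal sketch, where the `data` dictionary is only a stand-in for the asker's in-memory content:

```python
import io
import json
import zipfile

data = {"answer": 42}  # stand-in for the asker's in-memory data

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zip_file:
    # Forward slashes in the entry name create the directory structure.
    zip_file.writestr("root/file.json", json.dumps(data))
    zip_file.writestr("README.txt", "hello world")

zip_bytes = buffer.getvalue()  # the complete archive, still only in memory
```

Writing `zip_bytes` to a file opened in binary mode would produce the same file.zip in a single step.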

How to exclude ".DS_Store" path when compressing files in Python

I have a Python script that compresses specific files into a zip file. However, I have noticed that a file ".DS_Store" is produced within this zip file. Is there a way I can remove this from the zip file or avoid it being created in the first place in my Python script? From what I have found online, I think on a Windows machine this hidden file appears as a "macosx" file.
I've tested the zip file with and without the ".DS_Store" hidden file (I manually deleted it). When I remove it, the zip file is able to be processed correctly and when I leave it in, errors are thrown.
This is how I create the zip file in my python script:
#Create zip file of all necessary files
zipf = zipfile.ZipFile(new_path+zip_file_name, 'w', zipfile.ZIP_DEFLATED)
create_zip(new_path,zipf)
zipf.close()
Any advice on how to approach removing this hidden file would be appreciated.
Your code uses a function, create_zip, but you haven't shared the code of that function. Presumably, it loops through the contents of a directory and calls the .write method of the ZipFile instance in order to write each file into the archive. If this is the case, just add some logic to that function to exclude any files called .DS_Store.
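import os

def create_zip(path, zipf):
    # Add every file in `path` to the open ZipFile, skipping .DS_Store
    for file in os.listdir(path):
        if file != '.DS_Store':
            # Join with `path` so the file is found; store it under its bare name
            zipf.write(os.path.join(path, file), arcname=file)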
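If the directory contains subfolders, the same filter can be applied while walking the tree. This is a sketch under assumptions: the function name zip_directory is invented, and the "__MACOSX" folder is a guess at what the asker means by the "macosx" file (macOS archivers add a folder of that name for resource forks).

```python
import os
import zipfile

EXCLUDED_FILES = {".DS_Store"}   # macOS Finder metadata
EXCLUDED_DIRS = {"__MACOSX"}     # assumed meaning of the "macosx" entry


def zip_directory(src_dir, zip_path):
    """Zip `src_dir` recursively, skipping the excluded names above."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, dirs, files in os.walk(src_dir):
            # Prune excluded directories in place so os.walk skips them.
            dirs[:] = [d for d in dirs if d not in EXCLUDED_DIRS]
            for name in files:
                if name in EXCLUDED_FILES:
                    continue
                full_path = os.path.join(root, name)
                # Store paths relative to src_dir inside the archive.
                zf.write(full_path, os.path.relpath(full_path, src_dir))
```

Note that `zip_path` should live outside `src_dir`; otherwise the walk may pick up the partially written archive itself.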

Reading gzipped data in Python

I have a *.tar.gz compressed file that I would like to read in with Python 2.7. The file contains multiple h5 formatted files as well as a few text files. I'm a novice with Python. Here is the code I'm trying to adapt:
subset_path='c:\data\grant\files'
f=gzip.open(filename,'subset_full.tar.gz')
subset_data_path=os.path.join(subset_path,'f')
The first statement identifies the path to the folder with the data. The second statement tells Python to open a specific compressed file and the third statement (hopefully) executes a join of the prior two statements.
Several lines below this code I get an error when Python tries to use the 'subset_data_path' assignment.
What's going on?
The gzip module will only open a single file that has been compressed, i.e. my_file.gz. You have a tar archive of multiple files that are also compressed. This needs to be both untarred and uncompressed.
Try using the tarfile module instead, see https://docs.python.org/2/library/tarfile.html#examples
edit: To add a bit more information on what has happened, you have successfully opened the zipped tarball into a gzip file object, which will work almost the same as a standard file object. For instance you could call f.readlines() as if f was a normal file object and it would return the uncompressed lines.
However, this did not actually unpack the archive into new files in the filesystem. You did not create a subdirectory 'c:\data\grant\files\f', and so when you try to use the path subset_data_path you are looking for a directory that does not exist.
The following ought to work:
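import os
import tarfile

# Raw string so the backslashes (e.g. \f) are not treated as escape sequences
subset_path = r'c:\data\grant\files'
tar = tarfile.open("subset_full.tar.gz")
tar.extractall(subset_path)
tar.close()
subset_data_path = os.path.join(subset_path, 'subset_full')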
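For the text files, it may not be necessary to unpack anything to disk: tarfile can hand back a file-like object for each member. The sketch below builds a tiny .tar.gz in memory so that it runs on its own; the member name and contents are invented, and with a real archive you would pass the path to tarfile.open instead of `fileobj`.

```python
import io
import tarfile

# Build a tiny .tar.gz in memory so the example is self-contained.
payload = b"first line\nsecond line\n"
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:gz") as tar:
    info = tarfile.TarInfo(name="subset_full/notes.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

buf.seek(0)
texts = {}
with tarfile.open(fileobj=buf, mode="r:gz") as tar:
    for member in tar.getmembers():
        if member.isfile() and member.name.endswith(".txt"):
            # extractfile() returns a file-like object; nothing touches disk.
            texts[member.name] = tar.extractfile(member).read().decode("utf-8")
```

Whether the h5 files can be read the same way depends on the HDF5 library in use, so extracting them as shown in the answer is the safer route.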

Create file inside of zip archive python

I have an external file system and a way to download data from it. I want to download all the data into a .zip archive.
What I can do is:
Create file to write into
Download data from device to this file
Write file
Add file to zip archive with zipfile.write(file)
What I want to do is:
Create zip archive
Download data from device to created file in this archive without creating it on my local drive
Here is some non-working code to give an idea:
def get_all_files(self):
    self.savedir()
    zipf = zipfile.ZipFile(self.dir_to_save+"/SD_contents.zip", 'w');
    for file in self.nsh.get_all_files("/fs/microsd"):
        # get_all_files() returns list of full file paths on the SD
        print file
        data = self.nsh.download_file("/fs/microsd"+file)
        zipf.write(data);
If your goal is simply to avoid creating a temp file, StringIO (io.BytesIO in Python 3) is your savior, along with ZipFile.writestr() from Ignacio's answer.
ZipFile.writestr() will allow you to write the contents of an in-memory buffer to a zip entry given by filename or ZipInfo instance. But there is no way to do it in a streaming manner due to the nature of zip files.
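A minimal sketch of that approach, assuming the download call returns the whole file content as bytes. Here `download_file` is a hypothetical stand-in for the asker's `self.nsh.download_file()`, and the sample paths are invented:

```python
import io
import zipfile


def download_file(path):
    """Hypothetical stand-in for self.nsh.download_file(); returns bytes."""
    return ("contents of " + path).encode("utf-8")


def archive_remote_files(remote_paths, target):
    """Write each downloaded file into the zip `target` without temp files."""
    with zipfile.ZipFile(target, "w", zipfile.ZIP_DEFLATED) as zipf:
        for remote_path in remote_paths:
            data = download_file(remote_path)  # whole file held in memory
            # Strip the leading slash so the entry name is relative.
            zipf.writestr(remote_path.lstrip("/"), data)


archive = io.BytesIO()  # could equally be a path such as "SD_contents.zip"
archive_remote_files(["/fs/microsd/log1.txt", "/fs/microsd/log2.txt"], archive)
```

Each file is held in memory in full before it is written, so this suits files that fit comfortably in RAM.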
