Saving a .tar.gz file located on server to a FILE object - python

I'm currently working on a Python Flask API.
For demo purposes, I have a folder on the server containing .tar.gz files.
Basically, I'm wondering how to load one of these files into a file object, given its relative path (say, file.tar.gz). I need the tar file in a form that lets me run the following code on it, where f is the tar file:
tar = tarfile.open(mode="r:gz", fileobj=f)
for member in tar.getnames():
    tf = tar.extractfile(member)
Thanks in advance!

I'm not very familiar with this, but wouldn't just saving it normally with a .tar.gz extension work? If so, and if the file is already compressed, some very simple code could do it:
compressed_data = b'your file'  # the already-compressed bytes
with open('file.tar.gz', 'wb') as fo:  # binary mode, since the data is compressed
    fo.write(compressed_data)
# leaving the with block flushes and closes the file
Will this do the job, or am I getting something wrong here?
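For the question as asked (getting a file object for an archive that already exists on disk), a minimal sketch is to open the path in binary mode and hand the result to tarfile via fileobj. The helper name read_tar_members is made up for illustration; only open() and the tarfile calls come from the standard library.

```python
import tarfile


def read_tar_members(path):
    """Return {member name: bytes} for every regular file in a .tar.gz archive."""
    contents = {}
    # "rb" gives a binary file object, which is what fileobj= expects.
    with open(path, "rb") as f:
        with tarfile.open(mode="r:gz", fileobj=f) as tar:
            for member in tar.getmembers():
                tf = tar.extractfile(member)  # None for directories and other non-files
                if tf is not None:
                    contents[member.name] = tf.read()
    return contents
```

Calling read_tar_members("file.tar.gz") resolves the relative path against the current working directory of the server process, so an absolute path may be safer inside a Flask app.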

Related

How to use json file that placed in the python project?

I have a project in PyCharm, and I want to use a JSON file that is located in this project.
How can I load it?
This is what I tried:
import json
file = open('~/PycharmProjects/Test/JSON_new.json')
x = json.load(file)
And received the following error:
FileNotFoundError: [Errno 2] No such file or directory: '~/PycharmProjects/Test/JSON_new.json'
But the path is correct.
EDIT: I figured out the problem. A .txt file was created instead of a JSON file (even though I selected JSON). Maybe someone knows how to solve this? I can only create .py files directly, not other file types.
Is it correct to create a scratch JSON file and place it in Scratches?
open() does not expand ~, so you may want to spell out the home directory in the path (on Linux):
file = open('/home/<user>/PycharmProjects/Test/JSON_new.json')
Replace <user> with your username. You need to know the correct path to the file, which you can find with the pwd command in a terminal.
You can use the json module for this: open the file as a separate object and pass it to json.load. If you have a JSON string instead, use json.loads.
import json
with open('/path/to/json/file.json') as file:  # closed automatically on exit
    file_opened = json.load(file)
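If you would rather keep the ~ shorthand than hard-code the username, os.path.expanduser can turn it into the real home directory before the file is opened. This is a small sketch; load_json is a hypothetical helper name, not part of the json module.

```python
import json
import os


def load_json(path):
    """Load JSON from a path that may start with ~ (the home directory)."""
    # e.g. "~/x.json" becomes "/home/<user>/x.json" on Linux
    full_path = os.path.expanduser(path)
    with open(full_path) as fh:
        return json.load(fh)
```

With this, load_json('~/PycharmProjects/Test/JSON_new.json') would look in the same place the original open() call was meant to, provided the file really has a .json name on disk.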

Is it possible to download just part of a ZIP file using python zipfile library

I was wondering whether there is any way to download only part of a .rar or .zip file without downloading the whole file. Say a zip file contains files A, B, C and D, and I only need A. Can I somehow use the zipfile module to download just that one file?
I am trying the code below:
r = c.get(file)
z = ZipFile.ZipFile(BytesIO(r.content))
for file1 in z.namelist():
    if 'time' not in file1:
        print("hi")
        z.extractall(file1,download_path + filename)
This code downloads the whole zip file and only then extracts the specific one. Can I somehow download only the file I need?
There is a similar question here, but it only shows a command-line approach on Linux. It doesn't address how this can be done using Python libraries.
The question @Juggernaut mentioned in a comment is actually very helpful, as it points you in the direction of the solution.
You need to create a replacement for BytesIO that returns the necessary information to ZipFile. You will need to get the length of the file, and then fetch whatever sections ZipFile asks for.
How large are those files? Is it really worth the trouble?
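The BytesIO replacement described above could be sketched roughly as follows. RangeReader and its fetch_range callable are hypothetical names invented here: in real use, fetch_range(start, end) would issue an HTTP request with a Range header and size would come from the Content-Length of the remote file, neither of which is implemented in this sketch.

```python
import io
import zipfile


class RangeReader(io.RawIOBase):
    """Read-only, seekable file-like object backed by a range-fetching callable."""

    def __init__(self, size, fetch_range):
        self._size = size          # total length of the remote file in bytes
        self._fetch = fetch_range  # assumed signature: fetch_range(start, end) -> bytes of [start, end)
        self._pos = 0
        self.bytes_fetched = 0     # bookkeeping, to see how little was transferred

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = self._size + offset
        else:
            raise ValueError("invalid whence: %r" % (whence,))
        if new_pos < 0:
            raise OSError("negative seek position")
        self._pos = new_pos
        return self._pos

    def read(self, n=-1):
        # Work out the byte range ZipFile is asking for, clipped to the file size.
        if n is None or n < 0:
            end = self._size
        else:
            end = min(self._pos + n, self._size)
        if end <= self._pos:
            return b""
        data = self._fetch(self._pos, end)
        self.bytes_fetched += len(data)
        self._pos += len(data)
        return data
```

Passing an instance to zipfile.ZipFile(reader) then lets ZipFile read the central directory at the end of the archive and only the byte ranges of the members you actually open.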
Use remotezip: https://github.com/gtsystem/python-remotezip. You can install it using pip:
pip install remotezip
Usage example:
from remotezip import RemoteZip
with RemoteZip("https://path/to/zip/file.zip") as zip_file:
    for file in zip_file.namelist():
        if 'time' not in file:
            print("hi")
            zip_file.extract(file, path="/path/to/extract")
Note that to use this approach, the web server from which you receive the file needs to support the Range header.

How to get information of .jar file in python-magic

I have a folder full of jar, html, css, and exe files. How can I check each file's type?
I have already run the "file" command on *NIX and tried python-magic, but every result looks like this:
test : Zip archive data, at least v1.0 to extract
How can I get a specific result such as "test : jar", using only the magic number?
How do I do this?
While not required, most JAR files have a META-INF/MANIFEST.MF file contained within them. You could check for the existence of this file, after checking if it's a zip file:
import zipfile
def zipFileContains(zipFileName, pathName):
    # True if any archive member's name starts with pathName
    with zipfile.ZipFile(zipFileName, "r") as f:
        return any(x.startswith(pathName.rstrip("/")) for x in f.namelist())
print(zipFileContains("test.jar", "META-INF/MANIFEST.MF"))
However, it might be better to just check if it's a zip file that ends in .jar.
Magic alone won't do it for you, since a JAR is literally just a zip file. Read more about the format here.
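Putting the two suggestions together, one possible heuristic is: confirm the file is a zip archive, then accept it as a JAR if it either has a .jar name or carries the manifest. The function name looks_like_jar is made up for this sketch, and the result is only a guess, since a manifest is not required in a JAR.

```python
import zipfile


def looks_like_jar(path):
    """Heuristic: a zip archive that is named *.jar or carries a JAR manifest."""
    if not zipfile.is_zipfile(path):
        return False
    if path.lower().endswith(".jar"):
        return True
    with zipfile.ZipFile(path) as zf:
        return "META-INF/MANIFEST.MF" in zf.namelist()
```

This would report True for the extensionless file "test" from the question only if it contains a manifest, and False for ordinary zip files, html, css and exe files.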

Reading gzipped data in Python

I have a *.tar.gz compressed file that I would like to read in with Python 2.7. The file contains multiple h5 formatted files as well as a few text files. I'm a novice with Python. Here is the code I'm trying to adapt:
subset_path='c:\data\grant\files'
f=gzip.open(filename,'subset_full.tar.gz')
subset_data_path=os.path.join(subset_path,'f')
The first statement identifies the path to the folder with the data. The second statement tells Python to open a specific compressed file and the third statement (hopefully) executes a join of the prior two statements.
Several lines below this code I get an error when Python tries to use the 'subset_data_path' assignment.
What's going on?
The gzip module will only open a single file that has been compressed, i.e. my_file.gz. You have a tar archive of multiple files that are also compressed. This needs to be both untarred and uncompressed.
Try using the tarfile module instead, see https://docs.python.org/2/library/tarfile.html#examples
edit: To add a bit more information on what has happened, you have successfully opened the zipped tarball into a gzip file object, which will work almost the same as a standard file object. For instance you could call f.readlines() as if f was a normal file object and it would return the uncompressed lines.
However, this did not actually unpack the archive into new files in the filesystem. You did not create a subdirectory 'c:\data\grant\files\f', and so when you try to use the path subset_data_path you are looking for a directory that does not exist.
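The difference between the two modules can be seen on a small throwaway archive. The helper names below (make_tar_gz, peek_with_gzip, names_with_tarfile) are invented for this sketch; the point is only that gzip hands back the raw tar stream, while tarfile also parses it into members.

```python
import gzip
import io
import tarfile


def make_tar_gz(path, files):
    """Write a small .tar.gz at `path` from a {name: bytes} mapping (demo helper)."""
    with tarfile.open(path, "w:gz") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))


def peek_with_gzip(path):
    """gzip only undoes the compression: the result is the raw tar stream."""
    with gzip.open(path, "rb") as f:
        return f.read()


def names_with_tarfile(path):
    """tarfile undoes the compression and parses the archive structure."""
    with tarfile.open(path, "r:gz") as tar:
        return tar.getnames()
```

The bytes returned by peek_with_gzip still contain tar headers mixed in with the file contents, which is why reading them as if they were one plain file gives confusing results.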
The following ought to work:
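import os
import tarfile
subset_path = r'c:\data\grant\files'  # raw string, so \f is not read as a form feed
with tarfile.open("subset_full.tar.gz") as tar:
    tar.extractall(subset_path)
# assumes the archive unpacks into a top-level 'subset_full' folder
subset_data_path = os.path.join(subset_path, 'subset_full')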

How to open a video file in python 2.7?

I am new to python, and I am trying to open a video file "This is the file.mp4" and then read the bytes from that file. I know I should be using open(filename, "rb"), however I am not clear about the following things:
In what directory is Python looking for my file when I use open()? (My file is located in the Downloads folder. Should I move it? Where?)
Is using "rb" the correct way to read the bytes from a file?
So far I tried to open the file and I get this error:
IOError: [Errno 2] No such file or directory: 'This is the file.mp4'
I know it is probably an obvious thing to do, however I have looked all over the internet and I still haven't found an answer.
Thank you in advance!
By default, Python resolves a relative file name against the current working directory. That is the directory the program was launched from, which is often, but not always, the folder where the .py script is located.
If you move the video file into that directory, it should work.
You can also view the current working directory like this:
import os
print(os.getcwd())
Also, instead of moving the file, you can just change "This is the file.mp4" to "C:/Users/<username>/Downloads/This is the file.mp4" if you are using Windows 7 (and probably 8). Replace <username> with your computer username.
A path starting with ~ also works, but only if you expand it first, because open() does not do that itself: os.path.expanduser("~/Downloads/This is the file.mp4")
Finally, what are you planning to do with the video file bytes? If you want to copy the file to somewhere else, there are modules to do that.
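As a rough sketch of reading the bytes once the path is sorted out: the helper below (read_video_bytes is a made-up name) resolves the file name to an absolute path, which makes it easy to see where Python is actually looking, and then opens it in binary mode.

```python
import os


def read_video_bytes(filename, limit=None):
    """Return the raw bytes of a file (all of them, or only the first `limit`)."""
    # abspath shows which file a relative name points to (it is based on the cwd).
    path = os.path.abspath(filename)
    with open(path, "rb") as f:  # "rb" = read, binary: no text decoding
        return f.read() if limit is None else f.read(limit)
```

For a large video, passing a limit (or reading in chunks) avoids loading the whole file into memory at once.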
"rb" is a correct way to read bytes of a file.
