glob.iglob Remove path from filename - python

I am trying to get the most recent file added to a directory using python 2.7 os and glob modules.
import os
import glob
path = "files/"
newestFile = max(glob.iglob(path + '*.txt'), key=os.path.getctime)
print newestFile
When I print the newestFile variable I get the path included i.e.
files\file.txt
I just want the filename, but my .txt file and .py script are not in the same directory. The text file is one directory down, under the files directory. How do I refer to the directory and get the newest .txt file added to that directory?

You can use os.path.basename to just get the filename:
newestFile = os.path.basename(max(glob.iglob(path + '*.txt'), key=os.path.getctime))
os.path.getctime needs the full path to locate the file, so one way or another you have to build the full path first and strip the directory afterwards.
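For Python 3, the same idea can be sketched as a small helper. The function name newest_file_name and its default pattern are made up for illustration and are not part of the original answer. Note that os.path.getctime reports creation time on Windows but the last metadata change on Unix.

```python
import glob
import os


def newest_file_name(directory, pattern="*.txt"):
    """Return the bare name of the most recently changed match, or None.

    Hypothetical helper: the name and default pattern are assumptions.
    """
    # getctime needs the directory-qualified path, so the directory is
    # only stripped at the very end with basename.
    candidates = glob.glob(os.path.join(directory, pattern))
    if not candidates:
        return None  # max() would raise ValueError on an empty list
    return os.path.basename(max(candidates, key=os.path.getctime))
```

Calling newest_file_name("files") would then return just the file name, without the files directory in front of it.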

Related

Extract and rename zip file from specific folder using Python3

I am using python3.10. I have a zip file in the folder 'wowo' that I want to unzip. If I select the folder using path and pass only the file name, the code doesn't work, but when the full path plus file name is given it works. I don't want to give the full path and file name together. I want to define the path separately.
zipdata = zipfile.ZipFile('/Volumes/MacHD/MYPY/wowo/NST_cm.zip')
zipinfos = zipdata.infolist()
for zipinfo in zipinfos:
    zipinfo.filename = 'Nst.csv'
    zipdata.extract(path=path, member=zipinfo)
You could join the two strings in order to form the full filepath.
filepath = os.path.join(path, filename)
zipfile.ZipFile(filepath)
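Note that ZipFile does not take the folder and the file name as separate arguments. Its second positional parameter is the mode ('r', 'w', 'x' or 'a'), so the two have to be joined before the call:
zipfile.ZipFile(os.path.join(path, 'filename'))
Replace filename with the name of the file you wish to work with.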
You can use pathlib and join the path with the filename when calling zipfile.ZipFile:
import pathlib
import zipfile
path = pathlib.Path('PATH/TO/FOLDER')
zipfile.ZipFile(path / 'filename')
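Putting the pieces together, here is a rough sketch that keeps the folder and the archive name separate and extracts the first file under a new name, as in the question. The helper name extract_as and its parameters are hypothetical, and it assumes the archive contains at least one file.

```python
import zipfile
from pathlib import Path


def extract_as(folder, zip_name, new_name, dest):
    """Extract the first file in folder/zip_name into dest as new_name.

    Hypothetical helper; assumes the archive holds at least one file.
    """
    archive_path = Path(folder) / zip_name  # folder and name stay separate
    with zipfile.ZipFile(archive_path) as archive:
        # Ignore directory entries, keep only real files.
        members = [m for m in archive.infolist() if not m.is_dir()]
        member = members[0]
        # Changing ZipInfo.filename changes the name used on disk.
        member.filename = new_name
        return archive.extract(member, path=dest)
```

The return value is the path of the file that extract created.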

Find a file 1 folder level down

I am trying to get full_path of places.sqlite file present in '%APPDATA%\Mozilla\Firefox\Profiles\<random_folder>\places.sqlite' using Python OS module. The issue as you can see that <random_folder> has a random name and there could be multiple folders inside the Profiles folder.
How do I navigate/find the path to the places.sqlite file?
You would ideally want to go through each folder to search for this file. In a terminal, the 'locate file_name' command would do this for you. In a Python file you can use the following code:
import os
db_path = os.path.join(os.getenv('APPDATA'), r'Mozilla\Firefox\Profiles')
def find_file(file_name, path):
    for root_folder, directory, file_names in os.walk(path):
        if file_name in file_names:
            return os.path.join(root_folder, file_name)
print(find_file('places.sqlite', db_path))
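os.walk walks a directory tree recursively and yields the files found in each folder. Use it to search for 'places.sqlite' as follows. Python does not expand %APPDATA% inside a string, so expand it with os.path.expandvars first.
import os
path = ""
profiles = os.path.expandvars("%APPDATA%\\Mozilla\\Firefox\\Profiles\\")
for root, dirs, files in os.walk(profiles):
    if "places.sqlite" in files:
        path = os.path.join(root, 'places.sqlite')
        break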
Use the os module to list all directories in %APPDATA%\Mozilla\Firefox\Profiles\, then loop over those directories until you find the places.sqlite file (also using the os module).
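The directory-listing approach just described could look roughly like this. The function name find_in_subfolders is invented for illustration, and the sketch only checks the direct subfolders of the given parent.

```python
import os


def find_in_subfolders(parent, file_name):
    """Return the first parent/<subfolder>/file_name that exists, else None.

    Hypothetical helper: only direct subfolders of parent are checked.
    """
    with os.scandir(parent) as entries:
        for entry in entries:
            candidate = os.path.join(entry.path, file_name)
            if entry.is_dir() and os.path.isfile(candidate):
                return candidate
    return None
```

On Windows, parent would be something like os.path.join(os.environ["APPDATA"], "Mozilla", "Firefox", "Profiles").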
A glob might be simpler, as in this case one expects the file to be either one level below the Profiles folder or not there at all.
import os
import pathlib
profiles = pathlib.Path(os.environ["APPDATA"]) / "Mozilla" / "Firefox" / "Profiles"
# rglob will recursively search as well
if places := list(profiles.rglob("places.sqlite")):
    print(places[0]) # will print the sqlite file path
    with places[0].open() as f:
        ...  # work with the file here
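If the file can only sit exactly one folder below Profiles, a non-recursive pattern is a possible alternative to rglob. This is a sketch with a made-up function name, find_one_level_down, not something from the answer above.

```python
from pathlib import Path


def find_one_level_down(parent, file_name):
    """Return sorted paths matching parent/<any folder>/file_name.

    Hypothetical helper: "*/name" matches direct subfolders only,
    whereas rglob would also descend into deeper levels.
    """
    return sorted(Path(parent).glob(f"*/{file_name}"))
```

An empty list means no profile folder contains the file.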

Python - I zip some folders with subfolders but it zips twice.

I have written a script. It finds the current path, changes the path and zips the folder. Then I want it to find just the zip file, copy it to another directory and finally remove the contents of the folder. But it zips once and then zips the whole folder again, including the zip file. The initial situation is as in Figure 1.
The script is like this:
import os
import zipfile
import shutil
import glob
Pfad = os.getcwd()
newPfad = 'D'+ Pfad[1:]
Zip_name=os.path.basename(os.path.normpath(Pfad))
shutil.make_archive(Zip_name, 'zip', Pfad)
if not os.path.exists(newPfad):
    os.makedirs(newPfad)
dest_dir=newPfad
files = glob.iglob(os.path.join(Pfad, "*.zip"))
for file in files:
    if os.path.isfile(file):
        shutil.copy2(file, dest_dir)
shutil.rmtree(Pfad)
And finally the result is illustrated in the following figure.
The batch file is just for running the python script.
How can I get the following desired situation?
The issue is that the zip file is created inside the directory being archived, before the directory contents are listed, so the still-empty zip file is itself added to the archive. Create the archive in the parent directory and then move it. Moving a file or directory within the same filesystem is cheap and atomic.
import os
import shutil
cwd = os.path.abspath(os.path.curdir)
zip_target = os.path.join(cwd, os.path.basename(cwd)) + '.zip'
zip_source = shutil.make_archive(cwd, 'zip')
os.rename(zip_source, zip_target)
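A variation on the same idea, for the case where the archive should end up in a separate destination folder as in the question: build the zip in a temporary staging directory, then move it. The function name archive_directory and the staging approach are assumptions for this sketch, not part of the answer above.

```python
import os
import shutil
import tempfile


def archive_directory(source_dir, dest_dir):
    """Zip the contents of source_dir and place <name>.zip in dest_dir.

    Hypothetical helper: the archive is built in a staging directory
    outside source_dir, so it can never be included in itself.
    """
    source_dir = os.path.abspath(source_dir)
    name = os.path.basename(source_dir)
    with tempfile.TemporaryDirectory() as staging:
        archive = shutil.make_archive(
            os.path.join(staging, name), "zip", root_dir=source_dir
        )
        os.makedirs(dest_dir, exist_ok=True)
        # shutil.move also works across drives, unlike os.rename.
        return shutil.move(archive, os.path.join(dest_dir, name + ".zip"))
```

Only after this returns would it be safe to call shutil.rmtree on the source folder.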

ZipFile archives the folders, too

I want to insert all .ini files into an archive; it does it well but when I open my .zip, there are the path folders to those files included, too.
Here's my code:
from path import Path
import zipfile
def main():
    folderul_cu_demouri = Path('/my/path/bla/bla')
    nume_arhiva = 'demoz.zip'
    arhiva = zipfile.ZipFile(nume_arhiva, 'w')
    for demo in folderul_cu_demouri.files(pattern='*.ini'):
        arhiva.write(demo)
    arhiva.close()
if __name__ == '__main__':
    main()
So when I open my zip file, I gotta browse through /my/path/to/files, and only then I can see my .ini files. How can I make it so only the .ini are inserted in the zip file, without the directories?
Thanks.
PS: I'm using path.py to get their extensions.
If your files are located directly in the archive folder, you can take the basename of each file and pass it as the arcname parameter, so the name in the archive is the file name without the full path (this needs import os):
arhiva.write(demo, arcname=os.path.basename(demo))
Otherwise, you can strip the leading folder from the full file path so that relative paths are preserved:
len_to_strip = len('/my/path/bla/bla') + 1
arhiva.write(demo, arcname=demo[len_to_strip:])
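Both variants can be sketched with the standard library alone. The function names zip_flat and zip_relative are hypothetical, and os.path.relpath is used here in place of slicing the string by hand.

```python
import os
import zipfile


def zip_flat(paths, archive_name):
    """Store each file at the archive root, dropping its directories.

    Hypothetical helper; files sharing a basename would collide.
    """
    with zipfile.ZipFile(archive_name, "w") as archive:
        for path in paths:
            archive.write(path, arcname=os.path.basename(path))


def zip_relative(paths, root, archive_name):
    """Store each file under its path relative to root (hypothetical helper)."""
    with zipfile.ZipFile(archive_name, "w") as archive:
        for path in paths:
            archive.write(path, arcname=os.path.relpath(path, root))
```

The with block closes the archive even if a write fails, which the explicit close() call in the question would not do.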

How to extract a file within a folder within a zip?

I need to extract a file called Preview.pdf from a folder called QuickLooks inside of a zip file.
Right now my code looks a little like this:
with ZipFile(newName, 'r') as newName:
    newName.extract(\QuickLooks\Preview.pdf)
    newName.close()
(In this case, newName has been set equal to the full path to the zip).
It's important to note that the backslash is correct in this case because I'm on Windows.
The code doesn't work; here's the error it gives:
Traceback (most recent call last):
  File "C:\Users\Asit\Documents\Evam\Python_Scripts\pageszip.py", line 18, in <module>
    ZF.extract("""QuickLooks\Preview.pdf""")
  File "C:\Python33\lib\zipfile.py", line 1019, in extract
    member = self.getinfo(member)
  File "C:\Python33\lib\zipfile.py", line 905, in getinfo
    'There is no item named %r in the archive' % name)
KeyError: "There is no item named 'QuickLook/Preview.pdf' in the archive"
I'm running the Python script from inside Notepad++, and taking the output from its console.
How can I accomplish this?
Alternatively, how could I extract the whole QuickLooks folder, move out Preview.pdf, and then delete the folder and the rest of its contents?
Just for context, here's the rest of the script. It's a script to get a PDF of a .pages file. I know there are bona fide converters out there; I'm just doing this as an exercise with some sort of real-world application.
import os.path
import zipfile
from zipfile import *
import sys
file = raw_input('Enter the full path to the .pages file in question. Please note that file and directory names cannot contain any spaces.')
dir = os.path.abspath(os.path.join(file, os.pardir))
fileName, fileExtension = os.path.splitext(file)
if fileExtension == ".pages":
    os.chdir(dir)
    print (dir)
    fileExtension = ".zip"
    os.rename (file, fileName + ".zip")
    newName = fileName + ".zip" #for debugging purposes
    print (newName) #for debugging purposes
    with ZipFile(newName, 'w') as ZF:
        print("I'm about to list names!")
        print(ZF.namelist()) #for debugging purposes
        ZF.extract("QuickLook/Preview.pdf")
    os.rename('Preview.pdf', fileName + '.pdf')
    finalPDF = fileName + ".pdf"
    print ("Check out the PDF! It's located at" + dir + finalPDF + ".")
else:
    print ("Sorry, this is not a valid .pages file.")
    sys.exit
I'm not sure if the import of Zipfile is redundant; I read on another SO post that it was better to use from zipfile import * than import zipfile. I wasn't sure, so I used both. =)
EDIT: I've changed the code to reflect the changes suggested by Blckknght.
Here's something that seems to work. There were several issues with your code. As I mentioned in a comment, the zipfile must be opened with mode 'r' in order to read it. Another is that zip archive member names always use forward slash / characters in their path names as separators (see section 4.4.17.1 of the PKZIP Application Note). It's important to be aware that there's no way to make the extract() method of Python's current zipfile module place a nested archive member in a different subdirectory. You can control the root directory, but nothing below it (i.e. any subfolders within the zip).
Lastly, since it's not necessary to rename the .pages file to .zip (the filename you pass to ZipFile() can have any extension), I removed all that from the code. However, to overcome the limitation on extracting members to a different subdirectory, I had to add code to first extract the target member to a temporary directory, and then copy that to the final destination. Afterwards, of course, this temporary folder needs to be deleted. So I'm not sure the net result is much simpler...
import os.path
import shutil
import sys
import tempfile
from zipfile import ZipFile
PREVIEW_PATH = 'QuickLooks/Preview.pdf' # archive member path
pages_file = input('Enter the path to the .pages file in question: ')
#pages_file = r'C:\Stack Overflow\extract_test.pages' # hardcode for testing
pages_file = os.path.abspath(pages_file)
filename, file_extension = os.path.splitext(pages_file)
if file_extension == ".pages":
    tempdir = tempfile.gettempdir()
    temp_filename = os.path.join(tempdir, PREVIEW_PATH)
    with ZipFile(pages_file, 'r') as zipfile:
        zipfile.extract(PREVIEW_PATH, tempdir)
    if not os.path.isfile(temp_filename): # extract failure?
        sys.exit('unable to extract {} from {}'.format(PREVIEW_PATH, pages_file))
    final_PDF = filename + '.pdf'
    shutil.copy2(temp_filename, final_PDF) # copy and rename extracted file
    # delete the temporary subdirectory created (along with pdf file in it)
    shutil.rmtree(os.path.join(tempdir, os.path.split(PREVIEW_PATH)[0]))
    print('Check out the PDF! It\'s located at "{}".'.format(final_PDF))
    #view_file(final_PDF) # see Bonus below
else:
    sys.exit('Sorry, that isn\'t a .pages file.')
Bonus: If you'd like to actually view the final pdf file from the script, you can add the following function and use it on the final pdf created (assuming you have a PDF viewer application installed on your system):
import subprocess
def view_file(filepath):
    subprocess.Popen(filepath, shell=True).wait()
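As a possible alternative to the temporary-directory step, the member's bytes can be read with ZipFile.open() and copied straight to the final location, which sidesteps the folder structure that extract() recreates. This is a different technique from the answer's code, and the helper name extract_member_to is made up for this sketch.

```python
import shutil
from zipfile import ZipFile


def extract_member_to(archive_path, member, destination):
    """Copy one archive member's bytes directly to destination.

    Hypothetical helper. Member names use forward slashes on every
    platform. A missing member raises KeyError before the destination
    file is opened, so no empty file is left behind.
    """
    with ZipFile(archive_path) as archive:
        with archive.open(member) as source, open(destination, "wb") as target:
            shutil.copyfileobj(source, target)
    return destination
```

With the names from the script above, the call would be extract_member_to(pages_file, PREVIEW_PATH, filename + '.pdf').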
