I am trying to extract the files from a zip archive and append "EI" to the name of each file inside it. I want the files to be extracted to a specific location. I'm new to Python, so I haven't been able to figure out how.
for i in zip_list:
    if ("Rally-EI" in i):
        zipdata = zipfile.ZipFile(i)
        zipinfos = zipdata.infolist()
        for zipinfo in zipinfos:
            zipinfo.filename = zipinfo.filename[:-4] + "_EI.txt"
            zipdata.extract(zipinfo)
This is the code I'm using to rename the files, and it works well. I now need to extract these files to a specific location.
Thanks
Try using os.chdir() to change the current directory temporarily for this extraction. It's not the most efficient way, but it will do the job.
Save your current working directory with os.getcwd() first so that you can switch back to it after the extraction is done.
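An alternative that avoids changing the working directory: ZipFile.extract() accepts a second argument, path, naming the directory to extract into. Below is a minimal sketch under that approach. The function name extract_with_suffix and its parameters are made up for illustration, and it uses os.path.splitext instead of the [:-4] slice from the question so that extensions of any length are kept.

```python
import os
import zipfile


def extract_with_suffix(zip_path, dest_dir, suffix="_EI"):
    """Extract every file in zip_path into dest_dir, adding suffix before the extension."""
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # skip directory entries, only files are renamed
            base, ext = os.path.splitext(info.filename)
            info.filename = base + suffix + ext
            # The second argument is the directory to extract into.
            extracted.append(archive.extract(info, dest_dir))
    return extracted
```

Calling extract_with_suffix(i, "/some/target/folder") inside the loop from the question would then replace the zipdata.extract(zipinfo) call; the target folder shown here is only a placeholder.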
My goal: To build a program that:
Opens a folder (provided by the user) from the user's computer
Iterates through that folder, opening each document in each subdirectory (named according to language codes; "AR," "EN," "ES," etc.)
Substitutes a string in for another string in each document. Crucially, the new string will change with each document (though the old string will not), according to the language code in the folder name.
My level of experience: Minimal; been learning python for a few months but this is the first program I'm building that's not paint-by-numbers. I'm building it to make a process at work faster. I'm sure I'm not building this as efficiently as possible; I've been throwing it together from my own knowledge and from reading stackexchange religiously while building it.
Research I've done on my own: I've been living in stackexchange the past few days, but I haven't found anyone doing quite what I'm doing (which was very surprising to me). I'm not sure if this is just because I lack the vocabulary to search (tried out a lot of search terms, but none of them totally match what I'm doing) or if this is just the wrong way of going about things.
The issue I'm running into:
I'm getting this error:
Traceback (most recent call last):
  File "test5.py", line 52, in <module>
    for f in os.listdir(src_dir):
OSError: [Errno 20] Not a directory: 'ExploringEduTubingEN(1).txt'
I'm not sure how to iterate through every file in the subdirectories and update a string within each file (not the file names) with a new and unique string. I thought I had it, but this error has totally thrown me off. Prior to this, I was getting an error for the same line that said "Not a file or directory: 'ExploringEduTubingEN(1).txt'" and it's surprising to me that the first error could request a file or a directory, and once I fixed that, it asked for just a directory; seems like it should've just asked for a directory at the beginning.
With no further ado, the code (placing at bottom because it's long to include context):
import os
ex=raw_input("Please provide an example PDF that we'll append a language code to. ")
#Asking for a PDF to which we'll iteratively append the language codes from below.
lst = ['_ar.pdf', '_cs.pdf', '_de.pdf', '_el.pdf', '_en_gb.pdf', '_es.pdf', '_es_419.pdf',
'_fr.pdf', '_id.pdf', '_it.pdf', '_ja.pdf', '_ko.pdf', '_nl.pdf', '_pl.pdf', '_pt_br.pdf', '_pt_pt.pdf', '_ro.pdf', '_ru.pdf',
'_sv.pdf', '_th.pdf', '_tr.pdf', '_vi.pdf', '_zh_tw.pdf', '_vn.pdf', '_zh_cn.pdf']
#list of language code PDF appending strings.
pdf_list=open('pdflist.txt','w+')
#creating a document to put this group of PDF filepaths in.
pdf2='pdflist.txt'
#making this an actual variable.
for word in lst:
    pdf_list.write(ex + word + "\n")
#creating a version of the PDF example for every item in the language list, and then appending the language codes.
pdf_list.seek(0)
langlist=pdf_list.readlines()
#creating a list of the PDF paths so that I can use it below.
for i in langlist:
    i=i.rstrip("\n")
#removing the line breaks.
pdf_list.close()
#closing the file after removing the line breaks.
file1=raw_input("Please provide the full filepath of the folder you'd like to convert. ")
#the folder provided by the user to iterate through.
folder1=os.listdir(file1)
#creating a list of the files within the folder
pdfpath1="example.pdf"
langfile="example2.pdf"
#setting variables for below
#my thought here is that i'd need to make the variable the initial folder, then make it a list, then iterate through the list.
for ogfile in folder1:
    #want to iterate through all the files in the directory, including in subdirectories
    src_dir=ogfile.split("/",6)
    src_dir="/".join(src_dir[:6])
    #goal here is to cut off the language code folder name and then join it again, w/o language code.
    for f in os.listdir(src_dir):
        f = os.path.join(src_dir, f)
        #i admit this got a little convoluted–i'm trying to make sure the files put the right code in, I.E. that the document from the folder ending in "AR" gets the PDF that will now end in "AR"
        #the perils of pulling from lots of different questions in stackexchange
        with open(ogfile, 'r+') as f:
            content = f.read()
            f.seek(0)
            f.truncate()
            for langfile in langlist:
                f.write(content.replace(pdfpath1, langfile))
                #replacing the placeholder PDF link with the created PDF links from the beginning of the code
If you read this far, thanks. I've tried to provide as much information as possible, especially about my thought process. I'll keep trying things and reading, but I'd love to have more eyes on it.
You have to specify the full path to your directories and files. os.listdir returns bare names, not full paths, so a name such as 'ExploringEduTubingEN(1).txt' is resolved against the current working directory. Use os.path.join to build a valid, platform-independent path to a file or directory.
To replace your string, derive the new string from your example string and the subfolder name. Assuming that ex has the format filename.pdf, you could use: newstring = ex[:-4] + '_' + str.lower(subfolder) + '.pdf'. That way, you neither have to specify the list of replacement strings nor loop through it.
Solution
To iterate over your directory and replace the content of your files as you'd like, you can do the following:
import os

# Get the name of the file: "example.pdf" (note the .pdf is assumed here)
ex=raw_input("Please provide an example PDF that we'll append a language code to. ")
# Get the folder to go through
folderpath=raw_input("Please provide the full filepath of the folder you'd like to convert. ")
# Get all subfolders and go through them (named: 'AR', 'DE', etc.)
subfolders=os.listdir(folderpath)
for subfolder in subfolders:
    # Get the full path to the subfolder
    fullsubfolder = os.path.join(folderpath,subfolder)
    # If it is a directory, go through it
    if os.path.isdir(fullsubfolder):
        # Find all files in subdirectory and go through each of them
        files = os.listdir(fullsubfolder)
        for filename in files:
            # Get full path to the file
            fullfile = os.path.join(fullsubfolder, filename)
            # If it is a file, process it (note: we do not check if it is a text file here)
            if os.path.isfile(fullfile):
                with open(fullfile, 'r+') as f:
                    content = f.read()
                    f.seek(0)
                    f.truncate()
                    # Create the replacing string based on the subdirectory name. Ex: 'example_ar.pdf'
                    newstring = ex[:-4] + '_' + str.lower(subfolder) + '.pdf'
                    f.write(content.replace(ex, newstring))
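For anyone on Python 3 (where raw_input no longer exists), the same idea can be wrapped in a function so it is easier to try out on a test folder first. This is only a sketch: the name localize_links is made up for illustration, and it assumes every file in the subfolders is UTF-8 text.

```python
import os


def localize_links(folderpath, ex):
    """Replace ex with a language-specific name in every file of each subfolder."""
    base, ext = os.path.splitext(ex)  # 'example.pdf' -> ('example', '.pdf')
    changed = []
    for subfolder in sorted(os.listdir(folderpath)):
        fullsubfolder = os.path.join(folderpath, subfolder)
        if not os.path.isdir(fullsubfolder):
            continue
        # Build the replacement from the subfolder name, e.g. 'example_ar.pdf'
        newstring = base + "_" + subfolder.lower() + ext
        for filename in sorted(os.listdir(fullsubfolder)):
            fullfile = os.path.join(fullsubfolder, filename)
            if not os.path.isfile(fullfile):
                continue
            # Assumption: the files are UTF-8 text
            with open(fullfile, "r", encoding="utf-8") as f:
                content = f.read()
            with open(fullfile, "w", encoding="utf-8") as f:
                f.write(content.replace(ex, newstring))
            changed.append(fullfile)
    return changed
```

It returns the list of files it rewrote, which makes it simple to check the result before running it on the real documents.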
Note
Instead of asking the user to type the folder path, you could let them choose the directory with a dialog box. See this question for more info: Use GUI to open directory in Python 3
Does anyone know how I can copy/duplicate a file from one directory into another without specifying the source path? I got it to work with "shutil.copy2", but it's not exactly what I am looking for, since the src argument asks for the path.
My goal is to be able to copy/duplicate a file from one directory into another by filename. Has anyone done this before? If so, can you guide me in the right direction? - Thanks
#----------------------------------------------------------------------------------------------------------------#
# These params will be used for specifying which template you want to copy and where to output
#----------------------------------------------------------------------------------------------------------------#
'''Load file from x directory into current working directory '''
#PullTemplate: Specify which template you want to copy, by directory path
TemplateRepo = ("/home/hadoop/BackupFolders/Case_Project/scripts")
#OutputTemplate: Lets you specify where you want to output the copied template.
#Originally set to your current working directory (u".")
OutputTemplate = (u".")
shutil.copy2(TemplateRepo, OutputTemplate)
If you are trying to load a file that lives in the same project, you need at least the name of the folder inside that project.
If the file contains JSON, you can use the json module.
Something like this:
import json
#someFiles is just a folder name inside the project's main folder.
with open("someFiles\\file_name", "r") as whatever_u_want:
    var_of_choice = json.load(whatever_u_want)
print (var_of_choice)
Once the file is loaded, you can save the variable var_of_choice under any file name and in any location you wish using the json.dump method.
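For files that are not JSON, one possible way to copy "by filename" is to search a known root folder for the name and then hand the path that was found to shutil.copy2. A rough sketch, assuming you at least know which folder tree to search; copy_by_name and its parameters are hypothetical names, not part of shutil:

```python
import os
import shutil


def copy_by_name(filename, search_root, dest_dir="."):
    """Copy the first file called filename found under search_root into dest_dir."""
    for dirpath, _dirnames, filenames in os.walk(search_root):
        if filename in filenames:
            src = os.path.join(dirpath, filename)
            # copy2 also preserves metadata such as the modification time.
            return shutil.copy2(src, dest_dir)
    raise FileNotFoundError(filename + " not found under " + search_root)
```

The destination folder has to exist already. If several files share the same name, only the first one found is copied, so this is best suited to a template folder where names are unique.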
Click the file you want to copy, then create a duplicate of it by choosing Duplicate under File (below the Jupyter logo at the top).
Select the copied file (file_copy) and choose Move under File.
Choose the file path where you want to move the copied file.
Rename the copied file as you wish.
For more info, you can refer to here: https://subscription.packtpub.com/book/big_data_and_business_intelligence/9781785884870/1/ch01lvl1sec12/basic-notebook-operations
I am pretty new to Python and have been given a problem to solve in my lab: how can I remove specific files ending in .gff (or another extension) if the file is empty? The files were all just created and are all in the same directory.
To get files of a particular kind out of a directory, glob works very well: glob.glob("./file/path/*.gff") will return a list of the files ending in .gff.
To find the size of a file, use os.stat("./file/path/blah.gff").st_size; an empty file has a size of 0.
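Putting the two pieces together, a small sketch could look like this. The function name remove_empty_files and its parameters are made up for illustration; os.remove does the actual deletion.

```python
import glob
import os


def remove_empty_files(directory, pattern="*.gff"):
    """Delete every file in directory that matches pattern and has a size of 0 bytes."""
    removed = []
    for path in glob.glob(os.path.join(directory, pattern)):
        # Only touch regular files whose size is exactly zero.
        if os.path.isfile(path) and os.stat(path).st_size == 0:
            os.remove(path)
            removed.append(path)
    return removed
```

For another ending, pass a different pattern, for example remove_empty_files("./file/path", "*.txt"). Since deletion cannot be undone, it may be worth printing the matching paths first before calling os.remove.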
I'm trying to replace a file in a zipped archive with a script using zipfile. The file is in one directory, the archive is in another. To do this, I copy everything from the original archive into another, excluding the file I want to replace. Then I write the new version of the file into the new archive, close it, delete the old archive, and rename the new one. Should be easy, right? Wrong.
For whatever reason, the zipfile.write() method has this silly thing it does where it assumes that the second (optional) argument, arcname is the same as your file name, unless you specify it. So, if I have the following:
fileName = "C:\\Documents\\file"
archive.write(fileName)
I will get an archive with a folder called "Documents", and within that will be the file. I want the file to be in the root directory of the archive (sidenote: is 'root directory' the right term for what I'm referring to?)
Things I've Tried:
archive.write(fileName,'') This produced a weird file in the archive, which could not be opened.
archive.write(fileName, archive) I really thought this would work, but the system really didn't like it.
archive.write(fileNameWithoutPath) This one returned an error, since Python could no longer find the file.
So how do I specify that I want to put the file in the root directory of the archive and still specify its path so Python can find it?
Minor, and semi-related question: Is there a way to create the new archive such that it is hidden in windows explorer?
I am assuming you want an entry in the zip file called file, containing the contents of C:\Documents\file.
From the Python docs:
ZipFile.write(filename[, arcname[, compress_type]])
Write the file named filename to the archive, giving it the archive name arcname
So you want:
archive.write(fileName, fileNameWithoutPath)
The first argument is the file that goes into the zip, and the second is the name to be used inside the archive. Because it contains no path separators, no directories will be created.
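If fileNameWithoutPath is not already at hand, os.path.basename can derive it from the full path. A minimal sketch, with add_at_archive_root as a made-up helper name; it opens the archive in append mode, which is an assumption about how you want to use it:

```python
import os
import zipfile


def add_at_archive_root(archive_path, file_path):
    """Add file_path to the zip at archive_path, stored under its bare file name."""
    # basename drops the directories, so the entry lands at the top level.
    arcname = os.path.basename(file_path)
    # Mode "a" appends to an existing archive or creates a new one.
    with zipfile.ZipFile(archive_path, "a") as archive:
        archive.write(file_path, arcname)
    return arcname
```

Note that os.path.basename splits on the separators of the system the script runs on, so a path like C:\\Documents\\file is handled as expected when the script runs on Windows.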