I have a folder with 1500 txt files and another folder with 20000 jpg files. 1500 of the jpg files have the same names as the txt files. I need the jpg files whose names match the txt files to be moved to another folder.
To put it simply, I have txt files named from 1(1) to 5(500), and I need to move only the files named from 1(1) to 5(500) out of the folder with jpg files named from 0(1) to 9(500).
I tried to write a program in Python using shutil, but I don't understand how to enter all 1500 values so that only these files are moved.
Could you please tell me how to do this? Thanks in advance!
I found all the names of the txt files and tried to copy the pictures with the same names from another folder.
Example in pictures:
I have these names
I need to copy only these images, because their names are the same as the names of the txt files:
import os
import glob
import shutil
source_txt = r'C:\obj\TXT'
dir = r'C:\obj\Fixed_cars'
vendors =[r'C:\obj\car']
files_txt = [os.path.splitext(filename)[0] for filename in os.listdir(source_txt)]
for file in vendors:
    for f in (glob.glob(file)):
        if files_txt in f: # if apple in name, move to new apple dir
            shutil.move(f, dir)
Well, using this answer on looping through the files of a folder, you can loop through the files of your folder.
import os
directory = 'your_path_in_string'
txt_files = []
for filename in os.listdir(directory):
    if filename.endswith(".txt"):
        # splitext removes only the extension; rstrip('.txt') would also
        # strip any trailing '.', 't' or 'x' characters from the name itself
        txt_files.append(os.path.splitext(filename)[0])
for filename in os.listdir(directory):
    if filename.endswith(".jpg") and os.path.splitext(filename)[0] in txt_files:
        os.rename(f"path/to/current/{filename}", f"path/to/new/destination/for/{filename}")
For the file-moving part, you can read this answer.
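To tie this back to the question, where the txt and jpg files live in two separate folders, here is a minimal sketch of one possible approach. The function name move_matching_images and its three parameters are made up for illustration, and the sketch assumes that all files sit directly in their folders (no subfolders) and use lowercase .txt and .jpg extensions.

```python
import shutil
from pathlib import Path


def move_matching_images(txt_dir, jpg_dir, dest_dir):
    """Move every .jpg in jpg_dir whose base name matches a .txt in txt_dir."""
    txt_dir, jpg_dir, dest_dir = Path(txt_dir), Path(jpg_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Collect the base names once; a set keeps each lookup cheap.
    wanted = {p.stem for p in txt_dir.glob("*.txt")}

    moved = []
    # sorted() builds the full list first, so moving files is safe.
    for jpg in sorted(jpg_dir.glob("*.jpg")):
        if jpg.stem in wanted:
            shutil.move(str(jpg), str(dest_dir / jpg.name))
            moved.append(jpg.name)
    return moved
```

With the folders from the question this might be called as move_matching_images(r'C:\obj\TXT', r'C:\obj\car', r'C:\obj\Fixed_cars'). There is no need to type in the 1500 names by hand, because they are read from the txt folder.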
Related
I have a script that loops through folders and merges the PDFs inside each one into a single file. I have a question that is above my pay grade and I don't know how to solve it. I am very new to programming and am betting on the benevolence of the internet for help.
Is there any way I can make the saved PDF take the name of the folder where the files are found, or even the name of one of the original files?
import PyPDF2
import os
Path = 'C:/Users/Folder location/'
folders = os.listdir(Path)
pdfMerger = PyPDF2.PdfFileMerger()
name = os.listdir(Path)
def pdf_merge(filelist):
    pdfMerger = PyPDF2.PdfFileMerger()
    for file in os.listdir(filelist):
        if file.endswith('.pdf'):
            pdfMerger.append(filelist+'/'+file)
    pdfOutput = open(Path+folder+'/.pdf', 'wb')
    pdfMerger.write(pdfOutput)
    pdfOutput.close()
for folder in folders:
    pdf_merge(Path+folder)
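One way to get the naming asked about is to build the output path from the folder path itself. The sketch below covers only the naming part and leaves the PyPDF2 merging as it is in the question; the helper names merged_pdf_path and first_pdf_name are hypothetical.

```python
import os


def merged_pdf_path(folder_path):
    """Return '<folder>/<folder name>.pdf' for the given folder."""
    # normpath drops a trailing slash so basename returns the folder name.
    folder_name = os.path.basename(os.path.normpath(folder_path))
    return os.path.join(folder_path, folder_name + ".pdf")


def first_pdf_name(folder_path):
    """Alternative: return the name of the first PDF in the folder, or None."""
    pdfs = sorted(f for f in os.listdir(folder_path) if f.lower().endswith(".pdf"))
    return pdfs[0] if pdfs else None
```

Inside pdf_merge, open(merged_pdf_path(filelist), 'wb') could then take the place of open(Path+folder+'/.pdf', 'wb'). Note that on a second run the merged file would itself be found in the folder and merged again, so writing the output to a different folder may be safer.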
I'm wanting to move .csv files after reading them.
The code I've come up with is meant to find any .csv files in a folder and move them to an archive folder.
src1 = r"\\xxx\xxx\Source Folder"
dst1 = r"\\xxx\xxx\Destination Folder"
for root, dirs, files in os.walk(src1):
    for f in files:
        if f.endswith('.csv'):
            shutil.move(os.path.join(root,f), dst1)
Note: I imported shutil at the beginning of my code.
Note 2: The destination archive folder is within the source folder - will this have implications for the above code?
When I run this, nothing happens. I get no error messages and the file remains in the source folder.
Any insight is appreciated.
Edit (some context on my goal):
My overall code will be used to read .csv files that are moved manually into a source folder by users - I then want to archive these .csv files using Python once the data has been used. Every .csv file placed into the source folder by the users will have a different name - no .csv file name will be the same, which is why I want to search the source folder for .csv files and move them all.
You can use the pathlib module. I'm assuming you have the same folder structure in the destination directory.
from pathlib import Path
src1 = "<Path to source folder>"
dst1 = "<Path to destination folder>"
for csv_file in Path(src1).glob('**/*.csv'):
    relative_file_path = csv_file.relative_to(src1)
    destination_path = dst1 / relative_file_path
    csv_file.rename(destination_path)
Explanation:
for csv_file in Path(src1).glob('**/*.csv'):
glob (which returns a generator object) captures all the CSV files in the directory as well as in its subdirectories. We can then iterate over the files one by one.
relative_file_path = csv_file.relative_to(src1)
Each csv_file is a pathlib Path object, so we can use the methods the library provides. One such method is relative_to. Here it returns the path of the file relative to the src folder. Let's say you have a CSV file like
src_folder/A/B/c.csv - it returns A/B/c.csv
destination_path = dst1 / relative_file_path
As the folder structure is the same, the destination path now becomes
dst_folder/A/B/c.csv
csv_file.rename(destination_path)
Finally, rename moves the file from the source to the destination.
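If the destination does not already mirror the source layout, or if the archive folder sits inside the source folder (as Note 2 in the question mentions), a slightly more defensive variant may help. This is only a sketch under those assumptions: the function name archive_csv_files is made up, and both paths are assumed to be given in the same form (for example both absolute).

```python
from pathlib import Path


def archive_csv_files(src, dst):
    """Move every .csv under src into dst, keeping the relative layout."""
    src, dst = Path(src), Path(dst)
    moved = []
    # list() takes a snapshot so moving files does not disturb the iteration.
    for csv_file in list(src.glob("**/*.csv")):
        # Skip files that are already inside dst (the archive may live in src).
        if dst in csv_file.parents:
            continue
        target = dst / csv_file.relative_to(src)
        # Create missing subfolders instead of assuming they exist.
        target.parent.mkdir(parents=True, exist_ok=True)
        csv_file.rename(target)
        moved.append(target)
    return moved
```

rename can fail when source and destination are on different drives or shares; shutil.move is a possible substitute in that case.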
After a bunch of research I have found a solution:
import os
import shutil
source = r"\\xx\Source"
destination = r"\\xx\Destination"
files = os.listdir(source)
for file in files:
    new_path = shutil.move(f"{source}/{file}", destination)
    print(new_path)
I was making it more complicated than it needed to be: because all files in the folder would be .csv anyway, I just needed to move all files. Thanks, Stack Overflow.
import os
import shutil
folder = 'c:\\Users\\myname\\documents'
for folderNames, subfolders, filenames in os.walk(folder):
    for file in filenames:
        if file.endswith('.txt'):
            shutil.copy(?????, 'c:\\Users\\myname\\documents\\putfileshere\\' + file)
This was simple to do for all .txt files in a single folder by using os.listdir, but I'm having trouble here because with os.walk I don't know how to get the full file path of a file that ends in .txt, since it could be any number of subfolders deep.
I'm not sure if I'm using the correct directory terminology, but to be clearer: I want to move all .txt files to the new folder even if they are 1, 2, or 3 subfolders deep inside the documents folder.
To get the full path, you have to combine the root and filename parts. The root part is the path of the directory that contains the enumerated file names.
for root, _, filenames in os.walk(folder):
    for filename in filenames:
        if filename.endswith('.txt'):
            file_path = os.path.join(root, filename)
            shutil.copy(file_path, ...)
You could also use glob.glob(pathname)
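As a rough illustration of the glob route, the sketch below uses a recursive '**' pattern. The function name copy_txt_files is hypothetical, and the sketch assumes the destination may sit inside the searched folder, as it does in the question.

```python
import glob
import os
import shutil


def copy_txt_files(folder, dest):
    """Copy every .txt file under folder (at any depth) into dest."""
    os.makedirs(dest, exist_ok=True)
    dest_abs = os.path.abspath(dest)
    copied = []
    # '**' together with recursive=True matches any number of subfolders.
    pattern = os.path.join(glob.escape(folder), "**", "*.txt")
    for file_path in glob.glob(pattern, recursive=True):
        # Skip files that already sit in dest, in case dest is inside folder.
        if os.path.dirname(os.path.abspath(file_path)) == dest_abs:
            continue
        copied.append(shutil.copy(file_path, dest))
    return copied
```

Files with the same name in different subfolders would overwrite each other in the destination, so some renaming scheme may be needed if that can happen.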
I am new to Python. I have a .zip file that contains multiple sub-folders, and each sub-folder has multiple .txt files. I am trying to read all the .txt files, but I want to store the files of each folder in its own variable, and I am not able to do so.
For example:
"test.zip" has three folders "a", "b", "c", and each has multiple (>10,000) .txt files
I want to read all files inside folder "a" and store them in a variable a_file, and the same for folders "b" and "c"
I tried the following code:
for file in os.listdir():
    if file.endswith('test.zip'):
        zfile=zipfile.ZipFile(file)
        fnames= [f.filename for f in zfile.infolist()]
        for subfile in fnames:
            if fnames == "a": #name of a folder
                if subfile.endswith('.txt'):
                    lines=zfile.open(subfile).read()
                    print(lines)
But the code goes through the files from all the folders and displays no output, maybe because of the if condition, instead of reading one specific folder and storing its files.
Thank you in advance for helping.
That happens because the zip file lists its files as follows:
a/a1.txt a/a2.txt b/b1.txt b/b2.txt
So you need to separate the directory from the file name using split('/')
You could try this:
import os
from zipfile import ZipFile
for file in os.listdir():
    if file.endswith('test.zip'):
        zfile = ZipFile(file)
        fnames = [f.filename for f in zfile.filelist]
        for subfile in fnames:
            dir_name = subfile.split('/')[0]
            if dir_name == 'a':
                if subfile.endswith('.txt'):
                    lines = zfile.open(subfile).read()
                    print(lines)
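To keep the contents of "a", "b" and "c" apart without one variable per folder, the same split('/') idea can feed a dictionary keyed by folder name. This is a sketch only: the function name read_txt_by_folder is made up, and it assumes the folders sit at the top level of the archive.

```python
from collections import defaultdict
from zipfile import ZipFile


def read_txt_by_folder(zip_source):
    """Return {top-level folder: {member name: bytes}} for all .txt members."""
    contents = defaultdict(dict)
    with ZipFile(zip_source) as zfile:
        for name in zfile.namelist():
            # Skip directory entries, non-.txt files and files at the top level.
            if name.endswith("/") or not name.endswith(".txt") or "/" not in name:
                continue
            folder = name.split("/")[0]
            contents[folder][name] = zfile.read(name)
    return dict(contents)
```

After files = read_txt_by_folder('test.zip'), the entry files['a'] plays the role of a_file from the question. With more than 10,000 files per folder this holds everything in memory at once, which may or may not be acceptable.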
I have the following directory tree: in the parent dir there are several folders, let's say A, B, C and D, and within each folder there are many zips named as shown, with the letter of the parent folder included in the name along with other info:
-parent
    --A
        -xxxAxxxx_timestamp.zip
        -xxxAxxxx_timestamp.zip
        -xxxAxxxx_timestamp.zip
    --B
        -xxxBxxxx_timestamp.zip
        -xxxBxxxx_timestamp.zip
        -xxxBxxxx_timestamp.zip
    --C
        -xxxCxxxx_timestamp.zip
        -xxxCxxxx_timestamp.zip
        -xxxCxxxx_timestamp.zip
    --D
        -xxxDxxxx_timestamp.zip
        -xxxDxxxx_timestamp.zip
        -xxxDxxxx_timestamp.zip
I need to unzip only selected zips in this tree and place them in the same directory with the same name without the .zip extension.
Output:
-parent
    --A
        -xxxAxxxx_timestamp
        -xxxAxxxx_timestamp
        -xxxAxxxx_timestamp
    --B
        -xxxBxxxx_timestamp
        -xxxBxxxx_timestamp
        -xxxBxxxx_timestamp
    --C
        -xxxCxxxx_timestamp
        -xxxCxxxx_timestamp
        -xxxCxxxx_timestamp
    --D
        -xxxDxxxx_timestamp
        -xxxDxxxx_timestamp
        -xxxDxxxx_timestamp
My effort:
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
    zipfile=os.path.basename(path) #save the zipfile path
    zip_ref=zipfile.ZipFile(path, 'r')
    zip_ref=extractall(zipfile.replace(r'.zip', '')) #unzip to a folder without the .zip extension
The problem is that I don't know how to capture A, B, C, D, etc. so that I can include them in the path where the files will be unzipped. As a result, the unzipped folders are created in the parent directory. Any ideas?
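The code that you have is almost working; you just need to make sure that you are not overwriting names such as the zipfile module and that you use the correct variables. Passing the full path, rather than only the base name, to extractall keeps the A, B, C, D folder in the output location. The following code works perfectly for me: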
import os
import zipfile
import glob
for path in glob.glob('./*/xxx*xxxx*'): ##walk the dir tree and find the files of interest
    zf = os.path.basename(path) #save the zipfile path
    zip_ref = zipfile.ZipFile(path, 'r')
    zip_ref.extractall(path.replace(r'.zip', '')) #unzip to a folder without the .zip extension
Instead of trying to do it in a single statement, it would be much easier and more readable to first get the list of all folders and then get the list of files inside each folder. Example:
import glob
import os.path
for folder in glob.glob("./*"):
    #Using *.zip to only get zip files
    for path in glob.glob(os.path.join(folder,"*.zip")):
        filename = os.path.split(path)[1]
        #Compare against the folder's own name (e.g. 'A'), not './A'
        if os.path.basename(folder) in filename:
            pass #Do your logic
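Putting the pieces together, the extraction step could look roughly like the sketch below, which derives each target folder from the zip's full path so the A, B, C, D level is preserved. The function name unzip_next_to_archive and the default pattern are assumptions for illustration; the pattern can be narrowed to select only the zips of interest.

```python
import zipfile
from pathlib import Path


def unzip_next_to_archive(parent, pattern="*/*.zip"):
    """Extract each matching zip into a sibling folder named after the zip."""
    extracted = []
    for zip_path in sorted(Path(parent).glob(pattern)):
        # with_suffix('') turns A/name.zip into A/name, so the parent
        # folder (A, B, C, D) stays part of the target path.
        target = zip_path.with_suffix("")
        with zipfile.ZipFile(zip_path) as zip_ref:
            zip_ref.extractall(target)
        extracted.append(target)
    return extracted
```

The with block closes each archive after extraction, which the earlier snippets leave to the garbage collector.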