Selecting files from multiple folders with a certain extension - python

Consider the folder structure below:
Images
    -1.jpg
    Yellow
        yellow1.jpg
        yellow2.jpg
        yellow1.csv
    Blue
        blue1.jpg
    Orange
    Purple
        purple1.jpg
        purple2.jpg
        puprple.csv
My goal is to collect all the JPEGs under the Images master directory, which are stored in separate subfolders.
I thought I could use glob as follows:
input_dir=r'../../../../Images'
files=glob.glob(input_dir+"/**/*.jpg")
but this only yields the last file, like
files=['../../../../Images/purple2.jpg']
but I want the files as
['../../../../Images/yellow1.jpg','../../../../Images/yellow2.jpg','../../../../Images/blue1.jpg','../../../../Images/purple1.jpg','../../../../Images/purple2.jpg']
I need all the files. Can someone help me fix this?

Just use pathlib.
from pathlib import Path
p = Path("your_source_folder")
files = [f for f in p.rglob('*.jpg') if f.is_file()]
This will recursively go through your folder structure, select all jpeg files and return a list of Path objects for all found files.
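For comparison, here is a minimal sketch that runs both pathlib's rglob and glob with recursive=True against a throwaway copy of the folder layout from the question. The temporary directory and the variable names (with_pathlib, with_glob) are assumptions made for illustration only. Note that glob only treats "**" as "any depth of subfolders" when recursive=True is passed; without it, "**" behaves like a single "*".

```python
import glob
import tempfile
from pathlib import Path

# Build a throwaway copy of the folder layout from the question (illustrative only).
root = Path(tempfile.mkdtemp()) / "Images"
for name in ["Yellow/yellow1.jpg", "Yellow/yellow2.jpg", "Yellow/yellow1.csv",
             "Blue/blue1.jpg", "Purple/purple1.jpg", "Purple/purple2.jpg"]:
    target = root / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.touch()
(root / "Orange").mkdir()  # an empty subfolder, as in the question

# pathlib: rglob descends into every subfolder.
with_pathlib = sorted(p.name for p in root.rglob("*.jpg") if p.is_file())

# glob: "**" only matches nested folders when recursive=True is passed.
pattern = str(root / "**" / "*.jpg")
with_glob = sorted(Path(p).name for p in glob.glob(pattern, recursive=True))

print(with_pathlib)
print(with_glob)
```

Both approaches should list the same five .jpg files; the .csv file and the empty Orange folder are skipped.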

Related

Using glob to find all zip files recursively in three sub folder

I am trying to look only in three specific subfolders and then recursively create a list of all zip files within those folders. I can easily do this with just one folder and recursively look through all subfolders within the input path, but other folders get created that we cannot use, and we do not know what their names will be. This is where I am at, and I am not sure how to pass three subfolders to glob correctly.
# using glob, create a list of all the zip files in specified sub directories COMM, NMR, and NMH inside of input_path
zip_file = glob.glob(os.path.join(inputpath, "/comm/*.zip,/nmr/*.zip,/nmh/*.zip"), recursive=True)
#print(zip_file)
print(f"Found {len(zip_file)} zip files")
The string with commas in it is ... just a string; glob does not split it into three patterns. If you want to perform three globs, you need something like
zip_file = []
for subdir in ("comm", "nmr", "nmh"):
    zip_file.extend(glob.glob(os.path.join(inputpath, subdir, "*.zip"), recursive=True))
As noted by @Barmar in the comments, if you want to look for zip files anywhere within these folders, the pattern needs to be os.path.join(inputpath, subdir, "**", "*.zip"). If not, perhaps edit your question to provide an example of the structure you want to traverse.
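A minimal runnable sketch of the three-glob approach, assuming a temporary directory stands in for inputpath and that the file names (a.zip, b.zip and so on) are made up for the demonstration:

```python
import glob
import os
import tempfile

# Hypothetical layout: three wanted subfolders plus one that must be ignored.
inputpath = tempfile.mkdtemp()
for rel in ["comm/a.zip", "comm/deep/b.zip", "nmr/c.zip", "nmh/d.txt", "other/e.zip"]:
    full = os.path.join(inputpath, *rel.split("/"))
    os.makedirs(os.path.dirname(full), exist_ok=True)
    open(full, "w").close()

zip_files = []
for subdir in ("comm", "nmr", "nmh"):
    # "**" with recursive=True matches zero or more nested directories,
    # so files directly inside subdir are found as well.
    pattern = os.path.join(inputpath, subdir, "**", "*.zip")
    zip_files.extend(glob.glob(pattern, recursive=True))

print(f"Found {len(zip_files)} zip files")
```

The zip file under "other" is never touched because that folder is not in the tuple being looped over.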

How would I choose a random image from a directory? Python

My program's goal is to take a random png image and place it on top of another random image. So far it gets an image, pastes it onto another, and saves the result; I would like the choice of images to be random.
from PIL import Image
from PIL import ImageFilter
France = Image.open(r"C:\Users\Epicd\Desktop\Fortnite\France.png")
FranceRGB = France.convert('RGB')
Crimson_Scout = Image.open(r"C:\Users\Epicd\Desktop\Fortnite\Crimson_Scout.png")
FranceRGB.paste(Crimson_Scout, box=(1,1), mask=Crimson_Scout)
FranceRGB.save(r"C:\Users\Epicd\Desktop\Fortnite\Pain1.png")
The easiest way to do this would be to list the files in a directory, and choose randomly from the given paths. Something like this:
import os
import random
random.choice(os.listdir("/path/to/dir"))
It would probably be smart to add some logic that filters out directories and only accepts files with specific extensions (png, jpg, etc.).
You can use os.listdir to get a list of the names of all the items in a directory. Then use the random module to select items from that list.
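The filtering mentioned above could look roughly like this. It is a sketch under assumptions: the temporary directory, the sample file names, and the set of allowed extensions are all invented for the example, and os.scandir is used instead of os.listdir only because it exposes is_file() directly.

```python
import os
import random
import tempfile

# Hypothetical directory with a mix of images, other files, and a subfolder.
directory = tempfile.mkdtemp()
for name in ["a.png", "b.PNG", "c.jpg", "notes.txt"]:
    open(os.path.join(directory, name), "w").close()
os.mkdir(os.path.join(directory, "sub.png"))  # a folder that merely looks like an image

allowed = {".png", ".jpg"}  # assumed set of accepted extensions

# Keep regular files only, comparing extensions case-insensitively.
candidates = [
    entry.path
    for entry in os.scandir(directory)
    if entry.is_file() and os.path.splitext(entry.name)[1].lower() in allowed
]

chosen = random.choice(candidates)
print(chosen)
```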
You can pick 2 random *.png files from the working directory like this:
import glob
import random
all_pngs = glob.glob("./*.png")
randPng1 = random.choice(all_pngs)
randPng2 = random.choice(all_pngs)
print(randPng1)
print(randPng2)
Then you can use these two variables (randPng1 and randPng2) instead of the hardcoded paths to your images.
If randomly picking the same png twice is not what you want, remove the randPng1 element from the all_pngs list before picking the second random element.
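As an alternative to removing the first pick by hand, random.sample draws without replacement, so the two results are always different entries. A minimal sketch, where the hardcoded list is only a stand-in for the result of glob.glob("./*.png"):

```python
import random

# Stand-in for glob.glob("./*.png"); the names are illustrative.
all_pngs = ["France.png", "Crimson_Scout.png", "Pain1.png"]

# sample() picks 2 distinct elements; it raises ValueError if fewer than 2 exist.
rand_png1, rand_png2 = random.sample(all_pngs, 2)
print(rand_png1, rand_png2)
```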
You can use random.choice and os.walk for that task.
The code for picking an image would be something like this:
import os
import random
path = "path/to/your/images"  # the directory to search, not a single file
images = []
# os.walk visits the given path recursively (like the "find" command on Linux) and yields three values per directory:
# root is the path of the directory currently being inspected
# dirs is a list of the subdirectories found in that directory
# files is a list of the files found in that directory
for root, dirs, files in os.walk(path):
    # Keep only the .png files in case there is something else in the directory.
    # If your directories only contain images, you can skip the extension check.
    for f in files:
        if f.lower().endswith('.png'):
            # Store the full path so the result can be opened directly
            images.append(os.path.join(root, f))
print(random.choice(images))

-Python- Move All PDF Files in Folder to NewDirectory Based on Matching Names, Using Glob or Shutil

I'm trying to write code that will move hundreds of PDF files from a :/Scans folder into another directory based on matching each client's name. I'm not sure if I should be using glob, shutil, or a combination of both. My working theory is that glob should find the files, as the glob module "finds all the pathnames matching a specified pattern," and shutil should then physically move them?
Here is a breakdown of my file folders to give you a better idea of what I'm trying to do.
Within :/Scans folder I have thousands of PDF files, manually renamed based on client and content, such that the folder looks like this:
lastName, firstName - [contentVariable]
(repeat the above 100,000x)
Within the :/J drive of my computer I have a folder named 'Clients' with sub-folders for each and every client, similar to the pattern above, named as 'lastName, firstName'
I'm looking to have the program go through the :/Scans folder and move each PDF to the matching client folder based on 'lastName, firstName'
I've been able to write a simple program to move files between folders, but not one that will do the aforesaid name matching.
shutil.copy('C:/Users/Kenny/Documents/Scan_Drive','C:/Users/Kenny/Documents/Clients')
^ Moving a file from one folder to another.. quite easily done.
Is there a way to modify the above code to apply to a regex (below)?
shutil.copy('C:/Users/Kenny/Documents/Scan_Drive/\w*', 'C:/Users/Kenny/Documents/Clients/\w*')
EDIT: @Byberi - Something like this?
import glob
import os
import shutil
path = "C:/Users/Kenny/Documents/Scans"
dirs = os.listdir(path)
# This would print all the files and directories
for file in dirs:
    print(file)
dest_dir = "C:/Users/Kenny/Documents/Clients"
for file in glob.glob(r'C:/*'):
    print(file)
    shutil.copy(file, dest_dir)
I've consulted the following threads already, but I cannot seem to find how to match and move the files.
Select files in directory and move them based on text list of filenames
https://docs.python.org/3/library/glob.html
Python move files from directories that match given criteria to new directory
https://www.guru99.com/python-copy-file.html
https://docs.python.org/3/howto/regex.html
https://code.tutsplus.com/tutorials/file-and-directory-operations-using-python--cms-25817
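One possible way to approach the matching described in this question, as a minimal sketch rather than a definitive solution. It assumes every scan is named "lastName, firstName - content.pdf" and that the client folder is named exactly "lastName, firstName". The function name move_scans, the temporary folders, and the sample names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def move_scans(scans_dir, clients_dir):
    """Move each 'lastName, firstName - content.pdf' into the matching client folder."""
    moved, unmatched = [], []
    for pdf in sorted(Path(scans_dir).glob("*.pdf")):
        # Everything before the first " - " is taken to be the client name.
        client_name = pdf.stem.split(" - ", 1)[0].strip()
        client_folder = Path(clients_dir) / client_name
        if client_folder.is_dir():
            shutil.move(str(pdf), str(client_folder / pdf.name))
            moved.append(pdf.name)
        else:
            # Leave the file in place when no client folder matches.
            unmatched.append(pdf.name)
    return moved, unmatched


# Small demonstration on temporary folders (names are made up).
base = Path(tempfile.mkdtemp())
scans, clients = base / "Scans", base / "Clients"
scans.mkdir()
(clients / "Doe, Jane").mkdir(parents=True)
for name in ["Doe, Jane - invoice.pdf", "Roe, Sam - letter.pdf"]:
    (scans / name).touch()

moved, unmatched = move_scans(scans, clients)
print("moved:", moved)
print("unmatched:", unmatched)
```

Here glob only finds the candidate files and shutil does the actual move; plain string splitting is enough for the name match, so no regex is needed under these naming assumptions.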

how to load images from different folders and subfolders in python

I am developing a CNN on an animal classification dataset. The images are separated into 2 folders, and those 2 folders contain further subfolders; there are four layers to this structure. I want to load the images and convert them to n-dimensional arrays to feed to TensorFlow. The names of the folders are the labels.
I hope that someone can help me with some concrete code or some useful materials.
Thank you very much in advance!
Here I will give some examples:
Anisopleura Libellulidae Leach, 1815 Trithemis aurora
Zygoptera Calopterygidae Selys, 1850 Calopteryx splendens
Here aurora and splendens are the labels for this problem. They are also the names of the fifth-level subfolders, and the images are stored in these folders.
C:\Users\Seth\Desktop\dragonfly\Anisopleura\Libellulidae Leach, 1815\Pseudothemis\zonata
This is an example path.
I am using the openface library for face recognition. In this library, iterImgs is a function that yields all the images under a directory, including its subdirectories.
For details, see iterImgs.
from openface.data import iterImgs
imgs = list(iterImgs("Directory path"))
print(imgs)  # prints all images under the directory path, including subfolders
Another way is to define the valid extensions and walk the tree yourself:
import os
valid_ext = [".jpg", ".png"]
f_list = []
def Test2(rootDir):
    for entry in os.listdir(rootDir):
        path = os.path.join(rootDir, entry)
        filename, file_extension = os.path.splitext(path)
        if file_extension in valid_ext:
            print(path)
            f_list.append(path)
        if os.path.isdir(path):
            Test2(path)
Test2("/home/")
print(f_list)
os.walk() is what you are looking for.
import os
# traverse root directory, and list directories as dirs and files as files
for root, dirs, files in os.walk("."):
    path = root.split(os.sep)
    print((len(path) - 1) * '---', os.path.basename(root))
    for file in files:
        print(len(path) * '---', file)
This code walks recursively through all folders and subfolders. On each iteration, os.path.basename(root) is the name of the current subfolder (the label in your case) and files lists the files it contains.
The next step is then to build a dictionary (or a multi-dimensional NumPy array) that stores the image features for each label (or subfolder).
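A minimal sketch of the labelling step, assuming the label of each image is simply the name of the folder that directly contains it. The temporary dataset, the file names, and the valid_ext set are invented for the example, and actually decoding the images into arrays is left out.

```python
import tempfile
from pathlib import Path

# Hypothetical miniature of the dataset layout described in the question.
root = Path(tempfile.mkdtemp()) / "dragonfly"
for rel in [
    "Anisopleura/Libellulidae Leach, 1815/Trithemis/aurora/img1.jpg",
    "Zygoptera/Calopterygidae Selys, 1850/Calopteryx/splendens/img2.png",
    "Zygoptera/Calopterygidae Selys, 1850/Calopteryx/splendens/notes.txt",
]:
    f = root / rel
    f.parent.mkdir(parents=True, exist_ok=True)
    f.touch()

valid_ext = {".jpg", ".png"}  # assumed image extensions

# Pair every image path with the name of its containing folder (the label).
samples = sorted(
    (p, p.parent.name)
    for p in root.rglob("*")
    if p.is_file() and p.suffix.lower() in valid_ext
)
labels = sorted({label for _, label in samples})

for path, label in samples:
    print(label, path.name)
```

From here, each path could be opened with an image library of your choice and stacked into an array alongside an integer index into labels.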

Following a nested directory structure until the end

I have some directories that contain other directories which, at the lowest level, contain a bunch of csv files, such as (folder) a -> b -> c -> (csv files). There is usually only one folder at each level. When I process a directory, how can I follow this structure to the end to get the csv files? I was thinking of a recursive solution, but there may be better ways to do this. I am using Python. I hope that is clear.
The os module has a walk function that will do exactly what you need:
import os
for current_path, directories, files in os.walk("/some/path"):
    # current_path is the path of the directory we are currently in
    # directories is a list of the subdirectory names in this directory
    # files is a list of file names in this directory
    for name in files:
        print(os.path.join(current_path, name))
You can use os.path.join(current_path, name) to derive the full path to each file (if you need it).
Alternatively, you might find the glob module to be of more use to you:
from glob import glob
# "**" with recursive=True matches any nesting depth
for csv_file in glob("/some/path/**/*.csv", recursive=True):
    # csv_file is the full path to the csv file.
    print(csv_file)
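To try the os.walk approach without touching real data, here is a small self-contained sketch. The temporary a/b/c tree and the file names are assumptions for the demonstration.

```python
import os
import tempfile

# Hypothetical nested structure: top -> a -> b -> c -> (csv files).
top = tempfile.mkdtemp()
deepest = os.path.join(top, "a", "b", "c")
os.makedirs(deepest)
for name in ["one.csv", "two.csv", "readme.txt"]:
    open(os.path.join(deepest, name), "w").close()

csv_files = []
for current_path, subdirs, files in os.walk(top):
    for name in files:
        # Collect full paths of csv files, whatever level they are found at.
        if name.lower().endswith(".csv"):
            csv_files.append(os.path.join(current_path, name))
csv_files.sort()

for path in csv_files:
    print(path)
```

os.walk handles the recursion itself, so no hand-written recursive function is needed even when the depth varies between directories.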
