I want to apply a function to every image in a folder and save each result to a separate file. It works for a single image, but I cannot get it to work for all of them.
import glob
import numpy as np
from PIL import Image
images = glob.glob('/Desktop/Dataset/Images/*')
for img in images:
    img = np.array(Image.open(img))
    output = 'Desktop/Dataset/Output'
    MyFn(img = img,saveFile = output)
You did not define the sv value in your 2nd code snippet.
Because every image is saved to the same output path, each one overwrites the previous one. Try this code:
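import glob
import os
import numpy as np
from PIL import Image
images = glob.glob('/Desktop/Dataset/Images/*')
output_dir = 'Desktop/Dataset/Output'
i = 0
for path in images:
    i += 1  # counter to avoid overwriting
    img = np.array(Image.open(path))
    # give each result its own file name
    # (assumes MyFn treats saveFile as the path of the file to write)
    output = os.path.join(output_dir, 'image_' + str(i) + '.png')
    MyFn(img = img, saveFile = output)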
Try listing the folder directly with the os library:
import os
entries = os.listdir('image/')
This returns a list with the names of all entries (files and subfolders) in the folder.
This is because you are not setting the sv value in your loop. You should set it to a different value at each iteration in order for it to write to different files.
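To make that concrete, here is a minimal sketch in which each output name is derived from the input file name, so no two results collide as long as the input stems differ. The names process_folder and fn are hypothetical: fn(src, dst) stands in for the asker's MyFn, whose real signature is not shown in the question.

```python
from pathlib import Path


def process_folder(input_dir, output_dir, fn, suffix=".png"):
    """Call fn(src, dst) once per file, giving every file its own dst path.

    `fn` is a hypothetical stand-in for the asker's MyFn.
    """
    in_dir = Path(input_dir)
    out_dir = Path(output_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    written = []
    # sorted() makes the processing order reproducible
    for src in sorted(in_dir.iterdir()):
        if not src.is_file():
            continue
        # Reuse the input stem so outputs differ whenever input stems differ
        dst = out_dir / (src.stem + "_out" + suffix)
        fn(src, dst)
        written.append(dst)
    return written
```

Inside fn you would open src, run the real processing, and save to dst.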
Thank you massively to Paul M who posted the following code in response to my first query on how to compile a "stack" of randomly selected images into a unique image:
from pathlib import Path
from random import choice
from PIL import Image
layers = [list(Path(directory).glob("*.png")) for directory in ("bigcircle/", "mediumcircle/")]
selected_paths = [choice(paths) for paths in layers]
img = Image.new("RGB", (4961, 4961), color=(0, 220, 15))
for path in selected_paths:
    layer = Image.open(str(path), "r")
    img.paste(layer, (0, 0), layer)
I have this code inside a for _ in itertools.repeat(None, num): loop, where num sets the number of different images to generate. I end the loop with the following to save each image with a unique (incremental) file name:
i = 0
while os.path.exists("Finished Pieces/image %s.png" % i):
    i += 1
img.save("Finished Pieces/image %s.png" % i)
So far, so good. The challenge I'm struggling with now is how to append to a data.csv file with the details of each image created.
For example, in loop 1 bigcircle1.png is selected from bigcircle/ folder and mediumcircle6.png from mediumcircle/, loop 2 uses bigcircle3.png and mediumcircle2.png, and so on. At the end of this loop the data.csv file would read:
Filename,bigcircle,mediumcircle
image 0,bigcircle1.png,mediumcircle6.png
image 1,bigcircle3.png,mediumcircle2.png
I've tried the following, which I know wouldn't give the desired result but which I thought would be a good starting point to run and tweak. However, it doesn't generate any output (I am importing numpy as np):
np.savetxt('data.csv', [p for p in zip(img, layer)], delimiter=',', fmt='%s')
If it's not too much of an ask, ideally the first iteration of the loop would create data.csv and store the first record, with the second iteration onwards appending to this file.
Me again ;)
I think it would make sense to split up the functionality of the program into separate functions. I would start maybe with a function called something like discover_image_paths, which discovers (via glob) all image paths. It might make sense to store the paths according to what kind of circle they represent. I'm envisioning a dictionary with "bigcircle" and "mediumcircle" keys, and lists of paths as associated values:
def discover_image_paths():
    from pathlib import Path
    keys = ("bigcircle", "mediumcircle")
    return dict(zip(keys, (list(Path(directory).glob("*.png")) for directory in (key+"/" for key in keys))))

def main():
    global paths
    paths = discover_image_paths()
    return 0

if __name__ == "__main__":
    import sys
    sys.exit(main())
In the terminal:
>>> paths["bigcircle"]
[WindowsPath('bigcircle/big1.png'), WindowsPath('bigcircle/big2.png'), WindowsPath('bigcircle/big3.png')]
>>> paths["mediumcircle"]
[WindowsPath('mediumcircle/med1.png'), WindowsPath('mediumcircle/med2.png'), WindowsPath('mediumcircle/med3.png')]
>>>
As you can see, to test the script, I created some dummy image files - three for each category.
Extending this by adding a function to generate an output image (given an iterable of paths to combine and an output file name) and a main loop to generate num_images number of images (sorry, I'm not familiar with numpy):
def generate_image(paths, color, output_filename):
    from PIL import Image
    dimensions = (4961, 4961)
    image = Image.new("RGB", dimensions, color=color)
    for path in paths:
        layer = Image.open(path, "r")
        image.paste(layer, (0, 0), layer)
    image.save(output_filename)

def discover_image_paths(keys):
    from pathlib import Path
    return dict(zip(keys, (list(Path(directory).glob("*.png")) for directory in (key+"/" for key in keys))))

def main():
    from random import choice, choices
    from csv import DictWriter
    field_names = ["filename", "color"]
    keys = ["bigcircle", "mediumcircle"]
    paths = discover_image_paths(keys)
    num_images = 5
    with open("data.csv", "w", newline="") as file:
        writer = DictWriter(file, fieldnames=field_names+keys)
        writer.writeheader()
        for image_no in range(1, num_images + 1):
            selected_paths = {key: choice(category_paths) for key, category_paths in paths.items()}
            file_name = "output_{}.png".format(image_no)
            color = tuple(choices(range(0, 256), k=3))
            generate_image(map(str, selected_paths.values()), color, file_name)
            row = {**dict(zip(field_names, [file_name, color])), **{key: path.name for key, path in selected_paths.items()}}
            writer.writerow(row)
    return 0

if __name__ == "__main__":
    import sys
    sys.exit(main())
Example output in the CSV:
filename,color,bigcircle,mediumcircle
output_1.png,"(49, 100, 190)",big3.png,med1.png
output_2.png,"(228, 37, 227)",big2.png,med3.png
output_3.png,"(251, 14, 193)",big1.png,med1.png
output_4.png,"(35, 12, 196)",big1.png,med3.png
output_5.png,"(62, 192, 170)",big2.png,med2.png
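The script above rewrites data.csv on every run because it opens the file in "w" mode. If you would rather have the first record create the file and later records append to it, as the question asks, a minimal sketch could look like the following. The helper name append_record is hypothetical; only csv.DictWriter and the os.path functions come from the standard library.

```python
import csv
import os


def append_record(csv_path, record, field_names):
    """Append one row; write the header only if the file is new or empty."""
    is_new = not os.path.exists(csv_path) or os.path.getsize(csv_path) == 0
    # "a" creates the file when it is missing and appends otherwise
    with open(csv_path, "a", newline="") as file:
        writer = csv.DictWriter(file, fieldnames=field_names)
        if is_new:
            writer.writeheader()
        writer.writerow(record)
```

Calling append_record("data.csv", row, field_names+keys) once per generated image would then replace the with block and the writer.writerow(row) call.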
I am working on a sign language gesture classifier using PyTorch. I have pictures of each letter stored in a folder named after that letter, e.g. folder "A" contains "1_A_1.jpg", "1_A_2.jpg", "21_A_3.jpg", etc.
I am trying to build a function that:
Iterates through the different folders
Splits the data into training, validation, and test sets
Labels those pictures with their respective folder name (i.e. letter label)
Returns 3 created folders that are train, test and validation
All the online code shows examples of splitting data coming from torchvision data sets (built in data sets), nothing from scratch.
I found the following on Stack Overflow:
import os
import shutil
import numpy as np
import argparse

def get_files_from_folder(path):
    files = os.listdir(path)
    return np.asarray(files)

def main(path_to_data, path_to_test_data, train_ratio):
    # get dirs
    _, dirs, _ = next(os.walk(path_to_data))
    # calculates how many train data per class
    data_counter_per_class = np.zeros((len(dirs)))
    for i in range(len(dirs)):
        path = os.path.join(path_to_data, dirs[i])
        files = get_files_from_folder(path)
        data_counter_per_class[i] = len(files)
    test_counter = np.round(data_counter_per_class * (1 - train_ratio))
    # transfers files
    for i in range(len(dirs)):
        path_to_original = os.path.join(path_to_data, dirs[i])
        path_to_save = os.path.join(path_to_test_data, dirs[i])
        # creates dir
        if not os.path.exists(path_to_save):
            os.makedirs(path_to_save)
        files = get_files_from_folder(path_to_original)
        # moves data
        for j in range(int(test_counter[i])):
            dst = os.path.join(path_to_save, files[j])
            src = os.path.join(path_to_original, files[j])
            shutil.move(src, dst)
and when I tried doing the following:
path_to_data= r'path\A'
path_to_test_data=r"path\test"
train_ratio=0.8
main(path_to_data,path_to_test_data,train_ratio)
Nothing really happened..
If I can get this working for train and test, I can easily extend it for validation.
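The function only looks for subfolders of path_to_data, and a single letter folder such as path\A presumably has none, so both loops run zero times. Pass the parent folder that contains the letter folders instead. Give this a go: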
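from pathlib import Path

def main(data_path, out_path, train_ratio):
    # collect one subdirectory per class (letter)
    dir_paths = [child for child in Path(data_path).iterdir() if child.is_dir()]
    for dir_path in dir_paths:
        # number of files to move out of this class folder into the test set;
        # round() avoids losing a file to float error, e.g. 10 * (1 - 0.8) < 2
        files = list(dir_path.iterdir())
        test_len = round(len(files) * (1 - train_ratio))
        # create the matching class folder under out_path
        out_dir = Path(out_path).joinpath(dir_path.name)
        if not out_dir.exists():
            out_dir.mkdir(parents=True)
        # move the first test_len files; the rest stay behind as training data
        for file_ in files[:test_len]:
            file_.replace(out_dir.joinpath(file_.name))

if __name__ == '__main__':
    main('data', 'test', 0.8)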
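Extending this to the train, validation and test folders the question asks for could look roughly like the sketch below. It is not part of the answer above: split_dataset, the default ratios and the seed parameter are all assumptions, and it copies files instead of moving them so the original folders stay intact.

```python
import random
import shutil
from pathlib import Path


def split_dataset(data_path, out_path, ratios=(0.7, 0.15, 0.15), seed=None):
    """Copy class folders into out_path/train, out_path/val and out_path/test.

    Hypothetical helper: the names and default ratios are assumptions.
    """
    names = ("train", "val", "test")
    rng = random.Random(seed)
    counts = {name: 0 for name in names}

    for class_dir in sorted(p for p in Path(data_path).iterdir() if p.is_dir()):
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(files)

        # Cut points for train and val; test receives whatever is left
        n_train = round(len(files) * ratios[0])
        n_val = round(len(files) * ratios[1])
        parts = (files[:n_train],
                 files[n_train:n_train + n_val],
                 files[n_train + n_val:])

        for name, part in zip(names, parts):
            # The class (letter) folder name is kept, so it still acts as the label
            target = Path(out_path) / name / class_dir.name
            target.mkdir(parents=True, exist_ok=True)
            for src in part:
                shutil.copy2(src, target / src.name)
            counts[name] += len(part)
    return counts
```

Keep out_path outside data_path so the output folders are not mistaken for classes on a second run. The resulting layout has one subfolder per class inside each split, which is the structure torchvision.datasets.ImageFolder expects.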
My directory has hundreds of images and text files (.png and .txt). What's special about them is that each image has its own matching text file: for example, img1.png has img1.txt, news_im2.png has news_im2.txt, etc. I want a way to give it a percentage, say 40, so that it randomly copies 40% of the images, along with their corresponding text files, to a new folder. The most important word here is randomly: if I run it again, I shouldn't get the same result.
Ideally it should take two kinds of parameters. The first is the percentage for each sample, as above; the second is the number of samples. For example, I may want my data split randomly into 3 different samples, not only 2. In that case it should take as many destination directory paths as there are samples and distribute the files accordingly, with no overlap: for example, I shouldn't find img_1 in 2 different samples.
What I have done so far is simply set up my method to copy them:
import glob, os, shutil
source_dir ='all_the_content/'
dest_dir = 'percentage_only/'
files = glob.iglob(os.path.join(source_dir, "*.png"))
for file in files:
    if os.path.isfile(file):
        shutil.copy2(file, dest_dir)
and the start of my code to set the random switching:
import os, shutil,random
my_pic_dict = {}
source_dir ='/home/michel/ubuntu/EAST/data_0.8/'
for element in os.listdir(source_dir):
    if element.endswith('.png'):
        my_pic_dict[element] = element.replace('.png', '.txt')
print(my_pic_dict)
print (len(my_pic_dict))
imgs_list = my_pic_dict.keys()
print(imgs_list)
How can I finalize it? I couldn't make random.sample work.
Try this:
import random
import numpy as np
n_elem = 1000
n_samples = 4
percentages = [50,10,10,30]
indices = list(range(n_elem))
random.shuffle(indices)
elem_per_samples = [int(p*n_elem/100) for p in percentages]
limits = np.cumsum([0]+elem_per_samples)
samples = [indices[limits[i]:limits[i+1]] for i in range(n_samples)]
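Applied to the .png/.txt pairs from the question, the same shuffle-then-slice idea might look like the sketch below. The function name split_pairs and its parameters are hypothetical, and it assumes each annotation file has the same stem as its image.

```python
import random
import shutil
from pathlib import Path


def split_pairs(source_dir, dest_dirs, percentages, seed=None):
    """Copy each .png and its same-named .txt into at most one destination.

    Hypothetical helper; dest_dirs and percentages must have equal length.
    """
    if len(dest_dirs) != len(percentages):
        raise ValueError("need one percentage per destination directory")
    if sum(percentages) > 100:
        raise ValueError("percentages must not add up to more than 100")

    images = sorted(Path(source_dir).glob("*.png"))
    # seed=None gives a different shuffle on every run
    random.Random(seed).shuffle(images)

    result = {}
    start = 0
    for dest, pct in zip(dest_dirs, percentages):
        count = int(len(images) * pct / 100)
        chunk = images[start:start + count]
        start += count  # consecutive slices never overlap

        dest = Path(dest)
        dest.mkdir(parents=True, exist_ok=True)
        for img in chunk:
            shutil.copy2(img, dest / img.name)
            txt = img.with_suffix(".txt")
            if txt.exists():  # tolerate a missing annotation file
                shutil.copy2(txt, dest / txt.name)
        result[str(dest)] = [img.name for img in chunk]
    return result
```

Because int() truncates, a few files can be left uncopied when the percentages do not divide the file count evenly.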
I have a folder of unzipped files from Landsat 5, Landsat 7, and Landsat 8. I want to import the red and NIR bands to compute NDVI. This means I need bands 4 and 5 for Landsat 8, and bands 3 and 4 for Landsat 5 and 7. I am having a hard time writing code that imports only these bands. I am completely new to Python, so this might be way off, but here is what I have:
import os
import arcpy
import re
from arcpy import env
from arcpy.sa import *
mydir = r"E:\Thesis\005005-006004\005005\US_Landsat_4-8_ARD"
# directory to store indices
rasters = r"E:\Thesis\Processing\005005"
# lists for images & corresponding xml files
P_2Band = []
for root,dirs,files in os.walk(mydir):
    for name in files:
        if name.startswith("LC08"):
            # Landsat 8: red is band 4, NIR is band 5
            if name.endswith(("4.tif", "5.tif")):
                mypath = root+"\\"+name
                P_2Band.append(mypath)
                print(name)
        elif name.startswith(("LE07", "LT05")):
            # Landsat 5 and 7: red is band 3, NIR is band 4
            if name.endswith(("3.tif", "4.tif")):
                mypath = root+"\\"+name
                # was P_Metadata, which is not defined in this snippet
                P_2Band.append(mypath)
                print(name)
Thanks for any help!
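One way to keep the band rules in one place is a lookup table keyed by the sensor prefix. The sketch below is only an illustration: BAND_SUFFIXES, is_ndvi_band and find_ndvi_bands are hypothetical names, and it assumes, as the code in the question does, that the band number sits directly before .tif in the file name. It needs only the standard library, so it can be tried without arcpy.

```python
import os

# Red and NIR suffixes per sensor prefix, as described in the question
BAND_SUFFIXES = {
    "LC08": ("4.tif", "5.tif"),  # Landsat 8
    "LE07": ("3.tif", "4.tif"),  # Landsat 7
    "LT05": ("3.tif", "4.tif"),  # Landsat 5
}


def is_ndvi_band(name):
    """Return True if the file name is a red or NIR band for its sensor."""
    suffixes = BAND_SUFFIXES.get(name[:4])
    return suffixes is not None and name.endswith(suffixes)


def find_ndvi_bands(top_dir):
    """Walk top_dir and return the full paths of all red and NIR band files."""
    found = []
    for root, _dirs, files in os.walk(top_dir):
        for name in files:
            if is_ndvi_band(name):
                found.append(os.path.join(root, name))
    return sorted(found)
```

The returned paths could then be handed to arcpy in pairs, one red and one NIR band per scene.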