How do I convert a folder of images to an .npy file? - python

I have a folder containing images of gestures. To make it work with my code, I need to convert the folder into X.npy and Y.npy files. I have looked at many questions about this kind of problem but am still in the dark. How do I create an .npy dataset of my own from a folder of images? Is there any code or converter for this?

I found a piece of code for this purpose on GitHub.
from PIL import Image
import os
import numpy as np

'''
Converts all images in a directory to '.npy' format.
Use np.save and np.load to save and load the images.
Use it for training your neural networks in ML/DL projects.
'''

# Path to the image directory
path = "/path/to/image/directory/"
dirs = os.listdir(path)
dirs.sort()
x_train = []

def load_dataset():
    # Append images to a list
    for item in dirs:
        if os.path.isfile(path + item):
            im = Image.open(path + item).convert("RGB")
            im = np.array(im)
            x_train.append(im)

if __name__ == "__main__":
    load_dataset()
    # Convert and save the list of images in '.npy' format
    imgset = np.array(x_train)
    np.save("imgds.npy", imgset)

You can refer to the code snippet in the following GitHub gist, which converts a folder of images to an .npy file:
https://gist.github.com/anilsathyan7/ffb35601483ac46bd72790fde55f5c04
In this case, every image in the folder is converted to a NumPy array and appended to a list named x_train. To convert and save this list of images as a single '.npy' file, use the same snippet as above:
imgset = np.array(x_train)
np.save("imgds.npy", imgset)
Note that np.array(x_train) only stacks cleanly into one array when all images have the same shape. To convert and save the list of images as multiple '.npy' files (one per image, which also works for mixed shapes), use the snippet below:
imgset = np.array(x_train, dtype=object)
for i in range(len(imgset)):
    np.save("imgds" + str(i) + ".npy", imgset[i])

Related

Python script to convert an RGB image dataset into grayscale images using Pillow

I want to convert an RGB image dataset to a grayscale dataset using Pillow. I want to write a script that takes in the dataset path and converts all images one by one into grayscale. Then I want to save this script and run it on the server, to avoid copying the huge dataset to the server.
This code should work for you:
Code:
import os
from PIL import Image

# Directory where the images are stored
src_dir = '/content/pizza_steak/test/steak'

for file_name in os.listdir(src_dir):
    # Create the full path to the source image
    final_path = src_dir + '/' + file_name
    # Convert the image to grayscale and save it
    Image.open(final_path).convert('L').save(f"/content/meow/gray{file_name}")
    # Uncomment this if you want to delete the original file
    # os.remove(final_path)
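Since the question asks for a script that takes the dataset path as an argument and handles every image, here is a minimal sketch using os.walk; passing the path via sys.argv and saving grayscale copies next to the originals with a gray_ prefix are my assumptions:
import os
import sys
from PIL import Image

# Usage: python to_grayscale.py /path/to/dataset
dataset_path = sys.argv[1]

# Walk every subfolder of the dataset and convert each image
for root, _dirs, files in os.walk(dataset_path):
    for file_name in files:
        if file_name.lower().endswith(('.jpg', '.jpeg', '.png')):
            src = os.path.join(root, file_name)
            # Save the grayscale copy next to the original (naming is an assumption)
            Image.open(src).convert('L').save(os.path.join(root, 'gray_' + file_name))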

Convert a folder of images into a dataset of numpy arrays in Google Colab

I have uploaded the fairface dataset (https://github.com/joojs/fairface) to my Google Drive and I'm trying to convert the images to a dataset of arrays that I can use in a CNN.
First, I created a list of the files for the validation set. Now I am trying to convert the images to arrays. This is what I am trying, but it says my directory does not exist.
val is the folder of validation images.
import os
from PIL import Image
from numpy import asarray

val_items = os.listdir('/content/val')
train_items = os.listdir('/content/train')

val_img_array = []
# load the image and convert into numpy array
for i in range(len(val_items)):
    img = Image.open('/content/val/*.jpg')
    numpydata = asarray(img)
    val_img_array.append(numpydata)
print(val_img_array)
Please give me any guidance you have. Thanks!
You are not referencing your Drive correctly. Once Google Drive is mounted, your uploaded files live under /content/drive/MyDrive/, so your path should look like this:
Image.open("/content/drive/MyDrive/val/" + file_name)
Also note that Image.open does not expand wildcards such as *.jpg; you have to open each file by its actual name.

How to load many images efficiently from a folder using OpenCV

I am trying to create my own image dataset for machine learning.
The workflow I have in mind is the following:
① Load all image files in the folder as an array.
② Label the loaded images.
③ Split the loaded image files into image_data and label_data.
④ Finally, split image_data into image_train_data and image_test_data, and split label_data into label_train_data and label_test_data.
However, it doesn't go well at the first step (①).
How can I load all the image data efficiently?
And if you were to implement an image dataset for machine learning with this workflow, how would you handle it?
I wrote the following code:
cat_im = cv2.imread("C:\\Users\\path\\cat1.jpg")
But am I forced to write \cat1.jpg, \cat2.jpg, \cat3.jpg... one by one?
## You can find all images by extension
import os
import glob
import cv2

## glob returns the paths of all matching images as a list
all_images_path = glob.glob('some_folder/images/*.png')

## Then you can loop over all the files
loaded_images = []
for image_path in all_images_path:
    image = cv2.imread(image_path)
    loaded_images.append(image)

## Let's assume your labels are just the file names, like cat1.png, cat2.png, etc.
labels = []
for image_path in all_images_path:
    labels.append(os.path.basename(image_path))
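For steps ③ and ④ of the workflow, a minimal sketch using scikit-learn's train_test_split; the 80/20 ratio and the assumption that all images share one shape are mine:
import numpy as np
from sklearn.model_selection import train_test_split

# image_data / label_data come from the loading step above;
# stacking assumes all images have the same shape
image_data = np.array(loaded_images)
label_data = np.array(labels)

# ④ Split into train and test sets (80/20 is an arbitrary choice)
image_train_data, image_test_data, label_train_data, label_test_data = train_test_split(
    image_data, label_data, test_size=0.2, random_state=42)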

Reading image files

I have a directory containing 4 folders (1, 2, 3, 4). Each folder contains jpg images. I used the code below to read the images. The problem is that the images all have different shapes, so now I have a list of images, each with a different shape.
1) Is there a better way to read image files from a directory? (maybe assign them directly to a numpy array)
2) How can I resize the images so that they all have the same shape?
Thanks!
import imageio
import os.path

images = []
for folder in os.listdir('images'):
    for filename in os.listdir('images/' + folder):
        if filename.endswith(".jpg"):
            img = imageio.imread('images/' + folder + '/' + filename)
            images.append(img)
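For question 2), a common fix is to resize every image to one fixed size while loading, so the list stacks into a single numpy array. A minimal sketch using Pillow; the (128, 128) target size is an arbitrary choice:
import os
import numpy as np
from PIL import Image

images = []
for folder in os.listdir('images'):
    for filename in os.listdir(os.path.join('images', folder)):
        if filename.endswith(".jpg"):
            img = Image.open(os.path.join('images', folder, filename))
            img = img.resize((128, 128))  # force a common shape
            images.append(np.asarray(img))

# With identical shapes, the list stacks cleanly into one array
images = np.array(images)
print(images.shape)  # (num_images, 128, 128, 3) for RGB images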

How do I get the face_recognition encoding from many images in a directory and store them in a CSV File?

This is the code I have and it works for single images:
Loading images and applying the encoding:
from face_recognition.face_recognition_cli import image_files_in_folder
Image1 = face_recognition.load_image_file("Folder/Image1.jpg")
Image_encoding1 = face_recognition.face_encodings(Image1)
Image2 = face_recognition.load_image_file("Folder/Image2.jpg")
Image_encoding2 = face_recognition.face_encodings(Image2)
Face encodings are stored in the first array; after column_stack we have to resize:
Encodings_For_File = np.column_stack(([Image_encoding1[0]],
                                      [Image_encoding2[0]]))
Encodings_For_File.resize((2, 128))
Convert the array to a pandas dataframe and write it to csv:
Encodings_For_File_Panda = pd.DataFrame(Encodings_For_File)
Encodings_For_File_Panda.to_csv("Celebrity_Face_Encoding.csv")
How do I loop over the images in 'Folder' and extract the encodings into a csv file? I have to do this for many images and cannot do it manually. I tried several approaches, but none are working for me. Can cv2 be used instead of load_image_file?
Try this.
Note: I am assuming you don't need to specify the folder path before the file name in your command. This code shows how to iterate over the directory to list files and process them:
import os
import numpy as np
import face_recognition

my_dir = 'folder/path/'  # Folder where all your image files reside. Ensure it ends with '/'
encoding_for_file = []   # Create an empty list for collecting the encodings

for i in os.listdir(my_dir):  # Loop over the folder to list individual files
    image = face_recognition.load_image_file(my_dir + i)  # Run your load command
    image_encoding = face_recognition.face_encodings(image)  # Run your encoding command
    if image_encoding:  # Guard: skip images where no face was detected
        encoding_for_file.append(image_encoding[0])  # Append the results to the list

# Stack the list into a (num_images, 128) array instead of resizing by hand
encodings = np.array(encoding_for_file)
You can then convert to pandas and export to csv. Let me know how it goes.
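A minimal sketch of that final export step, reusing the CSV file name from the question:
import pandas as pd

# Write one row per image, with 128 columns per encoding
pd.DataFrame(encodings).to_csv("Celebrity_Face_Encoding.csv")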
