How to create variables for Facial_Recognition from database - python

I'm trying to pull records from a database, each with a name and an image file name, and feed them into a face_recognition Python program. However, the code I'm using learns each face through a separately named variable.
How can I create variables based on the number of records I have in the database?
What could be a better approach to solve this problem?
first_image = face_recognition.load_image_file("first.jpg")
first_face_encoding = face_recognition.face_encodings(first_image)[0]
second_image = face_recognition.load_image_file("second.jpg")
second_face_encoding = face_recognition.face_encodings(second_image)[0]

You can use lists instead of storing each image/encoding in its own variable, and fill the lists in a for loop.
Assuming you can rename the files from first.jpg, second.jpg... to 1.jpg, 2.jpg..., you can do this:
numberofimages = 10 # change this to the total number of images
images = [None] * (numberofimages+1) # create a list to store all the images (index 0 is unused)
encodings = [None] * (numberofimages+1) # create a list to store all the encodings
for i in range(1, numberofimages+1):
    filename = str(i) + ".jpg" # generate image file name (eg. 1.jpg, 2.jpg...)
    # load the image and store it in the list
    images[i] = face_recognition.load_image_file(filename)
    # store the encoding
    encodings[i] = face_recognition.face_encodings(images[i])[0]
You can then access, e.g., the 3rd image and 3rd encoding like this:
images[3]
encodings[3]
If changing image file names is not an option, you can store them in a dictionary and do this:
numberofimages = 3 # change this to the total number of images
images = [None] * (numberofimages+1) # create a list to store all the images (index 0 is unused)
encodings = [None] * (numberofimages+1) # create a list to store all the encodings
filenames = {
    1: "first",
    2: "second",
    3: "third"
}
for i in range(1, numberofimages+1):
    filename = filenames[i] + ".jpg" # generate file name (eg. first.jpg, second.jpg...)
    print(filename)
    # load the image and store it in the list
    images[i] = face_recognition.load_image_file(filename)
    # store the encoding
    encodings[i] = face_recognition.face_encodings(images[i])[0]
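Since the question is about rows coming from a database, here is a minimal sketch of the same idea driven by a query instead of a counter. It is only an illustration under assumptions: the table name `people` and the columns `name` and `image_file` are made up, SQLite stands in for whatever database is actually used, and `fake_encode` is a placeholder so the example runs without image files.

```python
import sqlite3

def load_known_faces(conn, encode):
    """Return parallel lists of names and encodings, one entry per database row."""
    names, encodings = [], []
    # `people`, `name` and `image_file` are hypothetical table/column names.
    query = "SELECT name, image_file FROM people ORDER BY name"
    for name, image_file in conn.execute(query):
        names.append(name)
        encodings.append(encode(image_file))
    return names, encodings

def fake_encode(image_file):
    # Placeholder so the sketch runs without images. In the real program this
    # would be something like:
    #   face_recognition.face_encodings(
    #       face_recognition.load_image_file(image_file))[0]
    return [float(len(image_file))]

# In-memory database with two example rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, image_file TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("Alice", "first.jpg"), ("Bob", "second.jpg")],
)

known_names, known_encodings = load_known_faces(conn, fake_encode)
print(known_names)
```

Keeping names and encodings in two lists with matching indices means no per-person variable is needed, however many rows the table holds.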

Related

Saving images with different name in folder

I tried saving images to a folder like this. It saves different images, but each file name contains the names of all the previous images.
db = h5py.File('results/Results.h5', 'r')
dsets = sorted(db['data'].keys())
for k in dsets:
    db = get_data()
    imnames = sorted(db['data'].keys())
    slika = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    cv2.imwrite(f'spremljene_slike/ime_{imnames}.png', slika)
So I tried it like this. It saves different names, but only the last generated picture is written to the folder: different names, same picture.
NUM_IMG = -1
N = len(imnames)
global NUM_IMG
if NUM_IMG < 0:
    NUM_IMG = N
start_idx,end_idx = 0,N #min(NUM_IMG, N)
# In a different function:
for u in range(start_idx,end_idx):
    imname = imnames[u]
    cv2.imwrite(f'spremljene_slike/ime_{imname}.png', imname)
Can someone help? I can't figure it out.
I have a script that generates images with rendered text and saves them in an .h5 file. From there I want to save these pictures, with their corresponding names, in a different folder.
I don't see how this works at all. On line 1 you define db = h5py.File(), then on line 4 you redefine it as db = get_data(). What is get_data()?
It's hard to write code without the schema. The answer below is my best guess, assuming your images are datasets in db['data'] and you want to use the dataset names (aka keys) as the image names.
with h5py.File('results/Results.h5', 'r') as db:
    dsets = sorted(db['data'].keys())
    for imname in dsets:
        img_arr = db['data'][imname][()]
        slika = cv2.cvtColor(img_arr, cv2.COLOR_BGR2RGB)
        cv2.imwrite(f'spremljene_slike/ime_{imname}.png', slika)
That should be all you need to do. You will get 1 .png for each dataset named ime_{imname}.png (where imname is the matching dataset name).
Also, you can eliminate all of the intermediate variables (dsets, img_arr and slika). Compress the code above into a few lines:
with h5py.File('results/Results.h5', 'r') as db:
    for imname in sorted(db['data'].keys()):
        cv2.imwrite(f'spremljene_slike/ime_{imname}.png',
                    cv2.cvtColor(db['data'][imname][()], cv2.COLOR_BGR2RGB))
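To make the naming problem from the question concrete, here is a small sketch that only builds the file names, with no HDF5 or OpenCV involved. The dataset names are invented and `output_paths` is a hypothetical helper; the point is just that each path must be formatted from a single name, not from the whole list.

```python
from pathlib import Path

def output_paths(dataset_names, out_dir="spremljene_slike"):
    """Map each dataset name to its own PNG path."""
    return {
        name: Path(out_dir) / f"ime_{name}.png"
        for name in sorted(dataset_names)
    }

names = ["b.jpg_0", "a.jpg_0"]  # made-up dataset names
paths = output_paths(names)
for name, path in paths.items():
    print(name, "->", path)

# Formatting the whole list into the f-string, as in the first attempt,
# puts every name into a single file name:
wrong = f"ime_{sorted(names)}.png"
print(wrong)
```

Inside the real loop, the array passed to cv2.imwrite has to be the dataset read for that same name. The second attempt in the question passed the name string itself as the image.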

Loading high number of images to memory and save pickle

I have a problem...
I have a dataset with 1200 cases and 30 classes per case and 160 images per class. These images are grayscale ndarrays, float64 dtype.
I would like to slice each case to get only 30 images from each class and put them in a dictionary where the first key is the case name and the second is the class name. After all of this I would like to save the whole dictionary to a pickle,
but I keep running out of memory.
brain_all = []
for dir in path.iterdir():
    brain_sample = {}
    path_dir = path_save / dir.name
    try:
        path_dir.mkdir(parents=True, exist_ok=False)
    except FileExistsError:
        print('Folder is already there')
    for file in dir.iterdir():
        sample = nib.load(file).get_fdata()[:, :, 75:105]
        if 'flair' in file.name:
            brain_sample['flair'] = sample
        elif 't1ce' in file.name:
            brain_sample['t1ce'] = sample
    brain_all.append([file, brain_sample])
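One possible way to keep memory bounded, sketched here under assumptions, is to write one pickle per case instead of collecting everything in a single list first. `save_cases` and `fake_load_case` are hypothetical names, and the fake loader returns plain lists so the example runs without nibabel. Two further points may be worth checking in the real code: a basic NumPy slice such as `[:, :, 75:105]` is a view that keeps the full array alive, so copying the slice could release the rest of the volume, and storing float32 instead of float64 would halve the size if that precision is acceptable.

```python
import pickle
import tempfile
from pathlib import Path

def save_cases(case_names, load_case, out_dir):
    """Pickle each case on its own so only one case is in memory at a time."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for case_name in case_names:
        case_data = load_case(case_name)  # {class_name: slices} for one case
        target = out_dir / f"{case_name}.pkl"
        with open(target, "wb") as fh:
            pickle.dump(case_data, fh, protocol=pickle.HIGHEST_PROTOCOL)
        written.append(target)
        del case_data  # drop the reference before loading the next case
    return written

def fake_load_case(case_name):
    # Placeholder for the nibabel loading shown in the question.
    return {"flair": [0.0] * 30, "t1ce": [1.0] * 30}

tmp_dir = tempfile.mkdtemp()
files = save_cases(["case_001", "case_002"], fake_load_case, tmp_dir)

# Each case can later be loaded back individually.
with open(files[0], "rb") as fh:
    restored = pickle.load(fh)
print(sorted(restored))
```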

Cannot iterate over a file?

I want to know how to apply a function to a folder of images and save each result in a separate file. For one image it works successfully, but I cannot apply it to all images.
import glob
images = glob.glob('/Desktop/Dataset/Images/*')
for img in images:
    img = np.array(Image.open(img))
    output = 'Desktop/Dataset/Output'
    MyFn(img = img,saveFile = output)
You did not define the sv value in your 2nd code snippet.
Because each image would otherwise overwrite the previous one, try this code:
import glob
images = glob.glob('/Desktop/Dataset/Images/*')
i = 0
for img in images:
    i += 1  # counter to avoid overwriting
    img = np.array(Image.open(img))
    output = 'Desktop/Dataset/Output' + str(i)  # a different output name per image
    MyFn(img=img, saveFile=output)
Try using the os module directly:
import os
entries = os.listdir('image/')
This returns a list of all the files in your folder.
This is because you are not setting the sv value in your loop. You should set it to a different value at each iteration in order for it to write to different files.
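As a rough sketch of the "different value at each iteration" idea, the output name can be derived from the input file name instead of a counter. `output_path_for` is a hypothetical helper, and the demo uses empty placeholder files in temporary folders so it runs without any images or the asker's `MyFn`.

```python
import tempfile
from pathlib import Path

def output_path_for(input_path, out_dir, suffix=".png"):
    """Build an output path that reuses the input file's base name."""
    return Path(out_dir) / (Path(input_path).stem + suffix)

# Demo with empty placeholder files instead of real images.
in_dir = Path(tempfile.mkdtemp())
out_dir = Path(tempfile.mkdtemp())
for name in ("cat.jpg", "dog.jpg"):
    (in_dir / name).touch()

planned = []
for img_path in sorted(in_dir.glob("*.jpg")):
    target = output_path_for(img_path, out_dir)
    planned.append(target.name)
    # In the real loop this is where the function from the question would run:
    #   MyFn(img=np.array(Image.open(img_path)), saveFile=str(target))

print(planned)
```

Using the input's base name keeps each result traceable to its source image, which a bare counter does not.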

Apply the same code to multiple files in the same directory

I have code that already works, but I need to use it to analyse many files in the same folder. How can I rewrite it to do this? All the files have similar names (e.g. "pos001", "pos002", "pos003").
This is the code at the moment:
pos001 = mpimg.imread('pos001.tif')
coord_pos001 = np.genfromtxt('treat_pos001_fluo__spots.csv', delimiter=",")
Here I label the tif file "pos001" to differentiate separate objects in the same image:
label_im = label(pos001)
regions = regionprops(label_im)
Here I select only the object of interest by setting its pixel values == 1 and all the others == 0 (I'm interested in many objects, I show only one here):
cell1 = np.where(label_im != 1, 0, label_im)
Here I convert the x,y coordinates of the spots in the csv file to a 512x512 image where each spot has value 1:
x = coord_pos001[:,2]
y = coord_pos001[:,1]
coords = np.column_stack((x, y))
img = Image.new("RGB", (512,512), "white")
draw = ImageDraw.Draw(img)
dotSize = 1
for (x,y) in coords:
    draw.rectangle([x,y,x+dotSize-1,y+dotSize-1], fill="black")
im_invert = ImageOps.invert(img)
bin_img = im_invert.convert('1')
Here I set the values of the spots of the csv file equal to 1:
bin_img = np.where(bin_img == 255, 1, bin_img)
I convert the arrays from 2d to 1d:
bin_img = bin_img.astype(np.int64)
cell1 = cell1.flatten()
bin_img = bin_img.flatten()
I multiply the arrays to get an array where only the spots overlapping the labelled object have value = 1:
spots_cell1 = []
for num1, num2 in zip(cell1, bin_img):
    spots_cell1.append(num1 * num2)
I count the spots belonging to that object:
spots_cell1 = sum(float(num) == 1 for num in spots_cell1)
print(spots_cell1)
I hope it's clear. Thank you in advance!
You can define a function that takes the .tif file path and the .csv file path and processes the two:
def process(tif_file, csv_file):
    pos = mpimg.imread(tif_file)
    coord = np.genfromtxt(csv_file, delimiter=",")
    # Do other processing with pos and coord
To process a single pair of files, you'd do:
process('pos001.tif', 'treat_pos001_fluo__spots.csv')
To list all the files in your tif file directory, you'd use the example in this answer:
import os
tif_file_directory = "/home/username/path/to/tif/files"
csv_file_directory = "/home/username/path/to/csv/files"
all_tif_files = os.listdir(tif_file_directory)
for file in all_tif_files:
    if file.endswith(".tif"): # Make sure this is a tif file
        # Get just the file name without the .tif extension.
        # (rstrip(".tif") strips trailing characters, not the suffix, so it can eat part of the name.)
        fname = os.path.splitext(file)[0]
        tif_file = f"{tif_file_directory}/{fname}.tif" # Full path to tif file
        csv_file = f"{csv_file_directory}/treat_{fname}_fluo__spots.csv" # Full path to csv file
        # Just to keep track of what is processed, print them
        print(f"Processing {tif_file} and {csv_file}")
        process(tif_file, csv_file)
The f"...{variable}..." construct is called an f-string. More information here: https://realpython.com/python-f-strings/
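For the pairing step, a pathlib-based variant might look like the sketch below. `paired_csv` is a hypothetical helper and the directory names are placeholders; it only builds paths and does not touch the file system. The last lines illustrate why taking the stem is safer than `str.rstrip(".tif")`, which removes any trailing `.`, `t`, `i` or `f` characters, not the extension as a unit.

```python
from pathlib import Path

def paired_csv(tif_path, csv_dir):
    """Return the CSV path that belongs to a .tif file, following the
    treat_<name>_fluo__spots.csv pattern from the question."""
    stem = Path(tif_path).stem  # file name without its extension
    return Path(csv_dir) / f"treat_{stem}_fluo__spots.csv"

print(paired_csv("tifs/pos001.tif", "csvs").name)

# rstrip works on a set of characters, so it can remove part of the name:
stripped = "split.tif".rstrip(".tif")
stem = Path("split.tif").stem
print(stripped, stem)
```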

Accessing a specific item

I have the following images:
im1 = cv2.imread(root + '/' + '1.jpg')
im1_file = '1.jpg'
img1 = (im1,im1_file)
im2 = cv2.imread(root + '/' + '2.jpg')
im2_file = '2.jpg'
img2 = (im2,im2_file)
I then add the images to the pairs list, as follows:
pair = (img1,img2)
pairs.append(pair)
How can I access the file name (i.e. im_file) in each pair, img1 and img2?
Whenever you start using numbered variables, use a list.
You can store the file names all at once, and you can then read the files from that list.
import os
root = '/'
files = ['1.jpg', '2.jpg']
images = [cv2.imread(os.path.join(root, f)) for f in files]
You can access a specific item with images[0], for example.
If you need to join the lists for any reason, you can zip(files, images)
You can use simple tuple indexing to get the filename out (which indeed is the second, or 1-index, element in the tuple)
>>> img1[1]
'1.jpg'
Or as a collection (each pair is itself a tuple of two images, so loop over both levels):
>>> for pair in pairs:
...     for img in pair:
...         print(img[1])
...
1.jpg
2.jpg
However you may want to consider using namedtuples here
from collections import namedtuple
ImageInfo = namedtuple("ImageInfo", "data name")
img1 = ImageInfo(im1, im1_file)
img2 = ImageInfo(im2, im2_file)
This makes your interface a little nicer to use
>>> img1.name
'1.jpg'
>>> for pair in pairs:
...     for img in pair:
...         print(img.name)
...
1.jpg
2.jpg
(in full disclosure, I tend to overengineer my data structures and a namedtuple may be a bit too heavy here depending on the context. Without knowing that, I tend to prefer to overcomplicate. YMMV.)
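Putting the namedtuple suggestion together as a self-contained sketch: placeholder strings stand in for the arrays that cv2.imread would return, so this runs without OpenCV or image files, and `pairs` is built here from the namedtuples instead of plain tuples.

```python
from collections import namedtuple

ImageInfo = namedtuple("ImageInfo", "data name")

# Placeholder strings stand in for the arrays returned by cv2.imread.
img1 = ImageInfo(data="<pixels of 1.jpg>", name="1.jpg")
img2 = ImageInfo(data="<pixels of 2.jpg>", name="2.jpg")

pairs = [(img1, img2)]

# Flatten the pairs and collect every file name.
names = [img.name for pair in pairs for img in pair]
print(names)

# Tuple unpacking gives both members of a pair at once.
for left, right in pairs:
    print(left.name, right.name)
```

A namedtuple still supports positional indexing, so `img1[1]` and `img1.name` return the same value.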
