TROUBLE WITH CODE -> Python recognizing faces in a newspaper

TROUBLE WITH CODE -> Python recognizing faces in a newspaper - python

My assignment is to iterate through a zipfile containing some images, each image is a page of the newspaper, and the goal is to search for a word in a page and display all faces recognized in that page.
I'm storing the data in a dictionary, which keys are the names of the image files and the values are, first the text generated by pytesseract and second is the ZipInfo object for that image.
The function to generate the dictionary is working just fine, as it generates what I want, but the problem is in the other two functions, wordCheck() and detectFaces(), as an empty list is being returned
Here is the code:
def getDict(zippedFile):
'''
This function gets a .zip file containing images and returns a dictionary whose keys are the image names
and the values are the images text content.
'''
j = 0
i = 0
dic = {}
contents = []
imageObject = []
with zipfile.ZipFile(zippedFile) as file:
for single_info in file.infolist():
imageObject.append(single_info)
with file.open(single_info) as imageInfo:
img = Image.open(imageInfo)
text = pytesseract.image_to_string(img)
contents.append(text)
for name in file.namelist():
dic[name] = [contents[j], imageObject[j]]
j += 1
return dic
def detectFaces(imageName, dic):
'''
This function gets and image name, that is in a .zip file and returns a list containing bounding boxes for
faces detected on the given image.
'''
boundingBoxes = []
with zipfile.ZipFile('readonly/small_img.zip') as file:
imageInfo = dic[imageName][1]
PILImage = Image.open(imageInfo)
display(PILImage)
img = cv.imread(PILImage)
gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
print(gray)
faceBoxes = face_cascade.detectMultiScale(gray)
for item in faceBoxes:
print(item)
boundingBoxes.append(item[0])
return boundingBoxes
def checkWord(word, dic):
'''
'''
bBoxes = []
for key in dic:
if word in dic[key]:
print('Results found in {}'.format(key))
bBoxes.append(detectFaces(key, dic))
return bBoxes
dictera = getDict('readonly/small_img.zip')
result = checkWord('Senator', dictera)
print(result)
I'm very newbie in programming, so if I made a silly mistake, pardon me!
I have no idea on what to try next, do you guys have a clue for what's going on?

Had some other problems and already solved them, but figured out I was returning a list which is never mentioned. Duur!

Related

Saving images with different name in folder

I tried save images in folder like this, it saves different images but every next image have all names of previously images.
db = h5py.File('results/Results.h5', 'r')
dsets = sorted(db['data'].keys())
for k in dsets:
db = get_data()
imnames = sorted(db['data'].keys())
slika = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.imwrite(f'spremljene_slike/ime_{imnames}.png', slika)
So i tried like this and it saves different names but only last generated picture is imwrited in folder, so different names - the same picture
NUM_IMG = -1
N = len(imnames)
global NUM_IMG
if NUM_IMG < 0:
NUM_IMG = N
start_idx,end_idx = 0,N #min(NUM_IMG, N)
**In different function:**
for u in range(start_idx,end_idx):
imname = imnames[u]
cv2.imwrite(f'spremljene_slike/ime_{imname}.png', imname)
Can someone help, I can't figure out.
I have script which generate images with rendered text and save it in .h5 file, and then from there I want to save this pictures with corresponding names in different folder.

Don't see how this works at all. On line 1 you define db=h5py.File(), then on line 4, you redefine it as db=get_data(). What is get_data()?
It's hard to write code without the schema. Answer below is my best-guess assuming your images are datasets in db['data'] and you want to use the dataset names (aka keys) as the image names.
with h5py.File('results/Results.h5', 'r') as db:
dsets = sorted(db['data'].keys())
for imname in dsets:
img_arr = db['data'][imname][()]
slika = cv2.cvtColor(img_arr, cv2.COLOR_BGR2RGB)
cv2.imwrite(f'spremljene_slike/ime_{imname}.png', slika)
That should be all you need to do. You will get 1 .png for each dataset named ime_{imname}.png (where imname is the matching dataset name).
Also, you can eliminate all of the intermediate variables (dsets, img_arr and slika). Compress the code above into a few lines:
with h5py.File('results/Results.h5', 'r') as db:
for imname in sorted(db['data'].keys()):
cv2.imwrite(f'spremljene_slike/ime_{imname}.png', \
cv2.cvtColor(db['data'][imname][()], cv2.COLOR_BGR2RGB))

Apply the same code to multiple files in the same directory

I have a code that already works but I need to use it to analyse many files in the same folder. How can I re-write it to do this? All the files have similar names (e.g. "pos001", "pos002", "pos003").
This is the code at the moment:
pos001 = mpimg.imread('pos001.tif')
coord_pos001 = np.genfromtxt('treat_pos001_fluo__spots.csv', delimiter=",")
Here I label the tif file "pos001" to differentiate separate objects in the same image:
label_im = label(pos001)
regions = regionprops(label_im)
Here I select only the object of interest by setting its pixel values == 1 and all the others == 0 (I'm interested in many objects, I show only one here):
cell1 = np.where(label_im != 1, 0, label_im)
Here I convert the x,y coordinates of the spots in the csv file to a 515x512 image where each spot has value 1:
x = coord_pos001[:,2]
y = coord_pos001[:,1]
coords = np.column_stack((x, y))
img = Image.new("RGB", (512,512), "white")
draw = ImageDraw.Draw(img)
dotSize = 1
for (x,y) in coords:
draw.rectangle([x,y,x+dotSize-1,y+dotSize-1], fill="black")
im_invert = ImageOps.invert(img)
bin_img = im_invert.convert('1')
Here I set the values of the spots of the csv file equal to 1:
bin_img = np.where(bin_img == 255, 1, bin_img)
I convert the arrays from 2d to 1d:
bin_img = bin_img.astype(np.int64)
cell1 = cell1.flatten()
bin_img = bin_img.flatten()
I multiply the arrays to get an array where only the spots overlapping the labelled object have value = 1:
spots_cell1 = []
for num1, num2 in zip(cell1, bin_img):
spots_cell1.append(num1 * num2)
I count the spots belonging to that object:
spots_cell1 = sum(float(num) == 1 for num in spots_cell1)
print(spots_cell1)
I hope it's clear. Thank you in advance!

You can define a function that takes the .tif file path and the .csv file path and processes the two
def process(tif_file, csv_file):
pos = mpimg.imread(tif_file)
coord = np.genfromtxt(csv_file, delimiter=",")
# Do other processing with pos and coord
To process a single pair of files, you'd do:
process('pos001.tif', 'treat_pos001_fluo__spots.csv')
To list all the files in your tif file directory, you'd use the example in this answer:
import os
tif_file_directory = "/home/username/path/to/tif/files"
csv_file_directory = "/home/username/path/to/csv/files"
all_tif_files = os.listdir(tif_file_directory)
for file in all_tif_files:
if file.endswith(".tif"): # Make sure this is a tif file
fname = file.rstrip(".tif") # Get just the file name without the .tif extension
tif_file = f"{tif_file_directory}/{fname}.tif" # Full path to tif file
csv_file = f"{csv_file_directory}/treat_{fname}_fluo__spots.csv" # Full path to csv file
# Just to keep track of what is processed, print them
print(f"Processing {tif_file} and {csv_file}")
process(tif_file, csv_file)
The f"...{variable}..." construct is called an f-string. More information here: https://realpython.com/python-f-strings/

Tensorflow - Extract string from Tensor

I'm trying to follow the "Load using tf.data" part of this tutorial. In the tutorial, they can get away with only working with string Tensors, however, I need to extract the string representation of the filename, as I need to look up extra data from a dictionary. I can't seem to extract the string part of a Tensor. I'm pretty sure the .name attribute of a Tensor should return the string, but I keep getting an error message saying KeyError: 'strided_slice_1:0' so somehow, the slicing is doing something weird?
I'm loading the dataset using:
dataset_list = tf.data.Dataset.list_files(str(DATASET_DIR / "data/*"))
and then process it using:
def process(t):
return dataset.process_image_path(t, param_data, param_min_max)
dataset_labeled = dataset_list.map(
process,
num_parallel_calls=AUTOTUNE)
where param_data and param_min_max are two dictionaries I've loaded that contains extra data that is needed to construct the label.
These are the three functions that I use to process the data Tensors (from my dataset.py):
def process_image_path(image_path, param_data_file, param_max_min_file):
label = path_to_label(image_path, param_data_file, param_max_min_file)
img = tf.io.read_file(image_path)
img = decode_img(img)
return (img, label)
def decode_img(img):
"""Converts an image to a 3D uint8 tensor"""
img = tf.image.decode_jpeg(img, channels=3)
img = tf.image.convert_image_dtype(img, tf.float32)
return img
def path_to_label(image_path, param_data_file, param_max_min_file):
"""Returns the NORMALIZED label (set of parameter values) of an image."""
parts = tf.strings.split(image_path, os.path.sep)
filename = parts[-1] # Extract filename with extension
filename = tf.strings.split(filename, ".")[0].name # Extract filename
param_data = param_data_file[filename] # ERROR! .name above doesn't seem to return just the filename
P = len(param_max_min_file)
label = np.zeros(P)
i = 0
while i < P:
param = param_max_min_file[i]
umin = param["user_min"]
umax = param["user_max"]
sub_index = param["sub_index"]
identifier = param["identifier"]
node = param["node_name"]
value = param_data[node][identifier]
label[i] = _normalize(value[sub_index])
i += 1
return label
I have verified that filename = tf.strings.split(filename, ".")[0] in path_to_label() does return the correct Tensor, but I need it as a string. The whole thing is proving difficult to debug as well, as I can't access attributes when debugging (I get errors saying AttributeError: Tensor.name is meaningless when eager execution is enabled.).

The name field is a name for the tensor itself, not the content of the tensor.
To do a regular python dictionary lookup, wrap your parsing function in tf.py_func.
import tensorflow as tf
tf.enable_eager_execution()
d = {"a": 1, "b": 3, "c": 10}
dataset = tf.data.Dataset.from_tensor_slices(["a", "b", "c"])
def parse(s):
return s, d[s]
dataset = dataset.map(lambda s: tf.py_func(parse, [s], (tf.string, tf.int64)))
for element in dataset:
print(element[1].numpy()) # prints 1, 3, 10

How to create variables for Facial_Recognition from database

I'm trying to be able to pull data from a database with a name and an image file name then put it into a face_recognition Python program. However, for the code that I'm using, the program learns the faces by calling variables with different names.
How can I create variables based on the amount of data that I have in the database?
What could be a better approach to solve this problem?
first_image = face_recognition.load_image_file("first.jpg")
first_face_encoding = face_recognition.face_encodings(first_image)[0]
second_image = face_recognition.load_image_file("second.jpg")
biden_face_encoding = face_recognition.face_encodings(second_image)[0]

You can use arrays instead of storing each image/encoding in an individual variable, and fill the arrays from a for loop.
Assuming you can change the filenames from first.jpg, second.jpg... to 1.jpg, 2.jpg... you can do this:
numberofimages = 10 # change this to the total number of images
images = [None] * (numberofimages+1) # create an array to store all the images
encodings = [None] * (numberofimages+1) # create an array to store all the encodings
for i in range(1, numberofimages+1):
filename = str(i) + ".jpg" # generate image file name (eg. 1.jpg, 2.jpg...)
# load the image and store it in the array
images[i] = face_recognition.load_image_file(filename)
# store the encoding
encodings[i] = face_recognition.face_encodings(images[i])[0]
You can then access eg. the 3rd image and 3rd encoding like this:
image[3]
encoding[3]
If changing image file names is not an option, you can store them in a dictionary and do this:
numberofimages = 3 # change this to the total number of images
images = [None] * (numberofimages+1) # create an array to store all the images
encodings = [None] * (numberofimages+1) # create an array to store all the encodings
filenames = {
1: "first",
2: "second",
3: "third"
}
for i in range(1, numberofimages+1):
filename = filenames[i] + ".jpg" # generate file name (eg. first.jpg, second.jpg...)
print(filename)
# load the image and store it in the array
images[i] = face_recognition.load_image_file(filename)
# store the encoding
encodings[i] = face_recognition.face_encodings(images[i])[0]

Extracting bounding boxes and category labels in MS-COCO dataset

I am working with MS-COCO dataset and I want to extract bounding boxes as well as labels for the images corresponding to backpack (category ID: 27) and laptop (category ID: 73) categories, and store them into different text files to train a neural network based model later.
I have already extracted the images corresponding to the aforementioned two categories and have created empty annotation files in a separate folder wherein I am looking to store the annotations along with labels (format of the annotation file is something like: label x y w h where w and h indicate width and height of the detected category). I built upon COCO-API (coco.py to be precise) to extract the images and create the empty text annotation files.
Following is the main function I wrote on top of coco.py to do so:
if __name__ == "__main__":
littleCo = COCO('/home/r.bohare/coco_data/annotations/instances_train2014.json')
#id_laptop = littleCo.getCatIds('laptop')
"""Extracting image ids corresponding to backpack and laptop images."""
bag_img_ids = littleCo.getImgIds(catIds=[27])
laptop_img_ids = littleCo.getImgIds(catIds=[73])
#print "IDs of bag images:", bag_img_ids
#print "IDs of laptop imgs:", laptop_img_ids
"""Extracting annotation ids corresponding to backpack and laptop images."""
bag_ann_ids = littleCo.getAnnIds(catIds=[27])
laptop_ann_ids = littleCo.getAnnIds(catIds=[73])
#print "Annotation IDs of bags:", bag_ann_ids
#print "Annotation IDs of laptops:", laptop_ann_ids
"""Extracting image names corresponding to bag and laptop categories."""
bag_imgs = littleCo.loadImgs(ids=bag_img_ids)
laptop_imgs = littleCo.loadImgs(ids=laptop_img_ids)
#print "Bag images:", bag_imgs
#print "Laptop images:", laptop_imgs
bag_img_names = [image['file_name'] for image in bag_imgs]
laptop_img_names = [image['file_name'] for image in laptop_imgs]
print "Bag Images:", len(bag_img_names), bag_img_names[:5]
print "Laptop Images:", len(laptop_img_names), laptop_img_names[:5]
"""Extracting annotations corresponding to bag and laptop images."""
bag_ann = littleCo.loadAnns(ids=bag_ann_ids)
laptop_ann = littleCo.loadAnns(ids=laptop_ann_ids)
bag_bbox = [ann['bbox'] for ann in bag_ann]
laptop_bbox = [ann['bbox'] for ann in laptop_ann]
print "Bags' bounding boxes:", len(bag_ann), bag_bbox[:5]
print "Laptops' bounding boxes:", len(laptop_bbox), laptop_bbox[:5]
"""Saving files corresponding to bags and laptop category in a directory."""
import shutil
#path_to_imgs = "/export/work/Data Pool/coco_data/train2014/"
#path_to_subset_imgs = "/export/work/Data Pool/coco_subset_data/"
path_to_ann = "/export/work/Data Pool/coco_subset_data/annotations/"
dirs_list = [("/export/work/Data Pool/coco_data/train2014/", "/export/work/Data Pool/coco_subset_data/")]
for source_folder, destination_folder in dirs_list:
for img in bag_img_names:
shutil.copy(source_folder + img, destination_folder + img)
print "Bag images copied!"
for img in laptop_img_names:
shutil.copy(source_folder + img, destination_folder + img)
print "Laptop images copied!"
"""Creating empty files for annotation."""
for f in os.listdir("/export/work/Data Pool/coco_subset_data/images/"):
if f.endswith('.jpg'):
open(os.path.join(path_to_ann, f.replace('.jpg', '.txt')), 'w+').close()
print "Done creating empty annotation files."
I provided only the main function here as the rest of the code is coco.py file in COCO-API.
I debugged the code to find that there are different data structures:
cats, a dictionary which maps category IDs to their supercategories and category names (labels).
imgToAnns, also a dictionary which maps every image ID to its segmentation ground truth, bounding box ground truth, category ID etc. From what I have managed to know so far, I think I need to use this dictionary to somehow map the image names I have in bag_img_names and laptop_img_names lists to their labels and bounding boxes, but I am not able to think in the right direction, as to how to access this dictionary (No method in coco.py returns it directly).
imgs, another dictionary which gives meta information about all images, such as, image name, image url, captured date etc.
Finally, I know this is an extremely specific question. Feel free to let me know if this belongs to a community other than stackoverflow (stats.stackexchange.com for example), and I will remove it. Also, it might be possible that I missed some vital information. I will provide it if I can think of it, or if someone asks.
I am only a beginner in Python, so please forgive me if I might have missed something obvious.
Any help whatsoever is highly appreciated. Thank You.

2 years have passed. Now coco.py can already do what you were doing, by adding at the end some functions to map the annotations, converted into RLE format, to the images. take a look at this cocoapi.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

TROUBLE WITH CODE -> Python recognizing faces in a newspaper - python

Had some other problems and already solved them, but figured out I was returning a list which is never mentioned. Duur!

Related

Saving images with different name in folder

Apply the same code to multiple files in the same directory

Tensorflow - Extract string from Tensor

How to create variables for Facial_Recognition from database

Extracting bounding boxes and category labels in MS-COCO dataset

Categories

Resources