Distance calculation between objects in MS COCO dataset - python

I have a dataset that includes images and a JSON annotation file (MS COCO format) with only a single class. I need to calculate the center of each object in the dataset, compute the distances between objects, and then group the images based on the number of objects and the distances in each image.
The annotations in this dataset segment each tooth separately. After calculating the distances, we can group the images (some images have 32 teeth, while others have somewhere between 20 and 28).
Here's dataset https://drive.google.com/file/d/13xpt7ppnQ56OUORq6TOjJOWGWskkoj5R/view?usp=sharing
Here are images https://drive.google.com/drive/folders/1G7Yc5ttUMqDFpPsZLb2s-4i5MVh5jkIl?usp=sharing
Here's my code:
from pycocotools.coco import COCO
import numpy as np
import skimage.io as io
import random
import os
import cv2
from tensorflow.keras.preprocessing.image import ImageDataGenerator
### For visualizing the outputs ###
import matplotlib.pyplot as plt
import matplotlib.gridspec as gridspec

dataDir = './COCOdataset2017'
dataType = 'val'
annFile = '{}/annotations/instances_{}.json'.format(dataDir, dataType)

# Initialize the COCO api for instance annotations
coco = COCO(annFile)

# Load the categories in a variable
catIDs = coco.getCatIds()
cats = coco.loadCats(catIDs)
print(cats)

def getClassName(classID, cats):
    for i in range(len(cats)):
        if cats[i]['id'] == classID:
            return cats[i]['name']
    return "None"

print('The class name is', getClassName(77, cats))

#####################################################################
filterClasses = ['tooth']
# Fetch class IDs only corresponding to the filterClasses
catIds = coco.getCatIds(catNms=filterClasses)
# Get all images containing the above Category IDs
imgIds = coco.getImgIds(catIds=catIds)
print("Number of images containing all the classes:", len(imgIds))

# `img` was undefined in the original snippet; load one image record here
img = coco.loadImgs(imgIds[0])[0]
annIds = coco.getAnnIds(imgIds=img['id'], catIds=catIds, iscrowd=None)
anns = coco.loadAnns(annIds)
coco.showAnns(anns)

####################
#### GENERATE A SEGMENTATION MASK ####
filterClasses = ['tooth']
mask = np.zeros((img['height'], img['width']))
for i in range(len(anns)):
    className = getClassName(anns[i]['category_id'], cats)
    pixel_value = filterClasses.index(className) + 1
    mask = np.maximum(coco.annToMask(anns[i]) * pixel_value, mask)
plt.imshow(mask)
plt.show()
I don't know how to calculate the distance between objects or how to find the center of each object.
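As a starting point, here is a minimal sketch (untested on your data) of one way to do it: take each tooth's center from its binary annotation mask, compute pairwise Euclidean distances with scipy, and bucket the images by tooth count. It assumes coco, catIds and imgIds from the code above; the mean pairwise distance is just one possible summary statistic.

from scipy.spatial.distance import pdist

def object_centers(coco, anns):
    """Centroid (x, y) of each annotation, computed from its binary mask."""
    centers = []
    for ann in anns:
        m = coco.annToMask(ann)
        ys, xs = np.nonzero(m)
        # (alternatively: x + w/2, y + h/2 from ann['bbox'], which is [x, y, w, h])
        centers.append((xs.mean(), ys.mean()))
    return np.array(centers)

groups = {}  # maps number of teeth -> list of (image id, mean pairwise distance)
for img_id in imgIds:
    anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id, catIds=catIds, iscrowd=None))
    if len(anns) < 2:
        continue
    centers = object_centers(coco, anns)
    pair = pdist(centers)  # condensed vector of pairwise Euclidean distances
    groups.setdefault(len(anns), []).append((img_id, pair.mean()))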

Related

Map function of tf.data.Dataset API giving unexpected error | TypeError

I'm working on an audio classification project: detecting whether a whale call is present in a given .wav file. The data preprocessing steps are as follows. The first cell creates a dataset of paths to positive and negative samples. The second cell shows the data type of the dataset object. As you can see in the third cell, when we iterate through some samples in the dataset, each sample is a tensor containing a path to a .wav file and a label. The fourth cell is the preprocessing method I need to apply to each dataset sample. The problem is that when I run dataset.map(preprocess) it throws an error. You can see the end for more detail.
import librosa
import matplotlib.pyplot as plt
import librosa.display
import math
import tensorflow as tf
import numpy as np

FRAME_SIZE = 2048
HOP_LENGTH = 512
SR = 200

POS = '/kaggle/input/datafestintegration2023/train/train/1'
NEG = '/kaggle/input/datafestintegration2023/train/train/0'
pos = tf.data.Dataset.list_files(POS+'/*.wav')
neg = tf.data.Dataset.list_files(NEG+'/*.wav')

positives = tf.data.Dataset.zip((pos, tf.data.Dataset.from_tensor_slices(tf.ones(len(pos)))))
negatives = tf.data.Dataset.zip((neg, tf.data.Dataset.from_tensor_slices(tf.zeros(len(neg)))))
dataset = positives.concatenate(negatives)
dataset

for d in dataset.take(5):
    print(d)

def preprocess(file_path, label):
    wav, _ = librosa.load(file_path, sr=SR)
    wav = wav[:12000]
    zero_padding = tf.zeros([12000] - tf.shape(wav), dtype=tf.float32)
    wav = tf.concat([zero_padding, wav], 0)
    wav = np.array(wav)
    mel_spectrogram = librosa.feature.melspectrogram(wav, sr=SR, n_fft=FRAME_SIZE, hop_length=HOP_LENGTH, n_mels=10)
    log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)
    spectrogram = tf.expand_dims(log_mel_spectrogram, axis=2)
    return spectrogram, label

dataset = dataset.map(preprocess)
When I run the last line it throws the error below. As far as I understand, the preprocess method is not able to fetch the paths from the dataset. What should I do?
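A likely cause, based on the code above: dataset.map traces preprocess into a TensorFlow graph, so file_path arrives as a symbolic tensor rather than a Python string, and librosa.load cannot read from a symbolic tensor. A common workaround is to run the librosa/NumPy work eagerly through tf.py_function; the sketch below reuses the constants from the code above and is untested on this data:

def preprocess_numpy(file_path, label):
    # Runs eagerly inside tf.py_function, so .numpy() is available here
    wav, _ = librosa.load(file_path.numpy().decode('utf-8'), sr=SR)
    wav = wav[:12000]
    wav = np.pad(wav, (12000 - len(wav), 0))  # zero-pad on the left to 12000 samples
    mel = librosa.feature.melspectrogram(y=wav, sr=SR, n_fft=FRAME_SIZE,
                                         hop_length=HOP_LENGTH, n_mels=10)
    log_mel = librosa.power_to_db(mel)
    return log_mel[..., np.newaxis].astype(np.float32), label

def preprocess_tf(file_path, label):
    spec, label = tf.py_function(preprocess_numpy, [file_path, label],
                                 [tf.float32, tf.float32])
    spec.set_shape([10, None, 1])  # restore shape info lost by py_function
    return spec, label

dataset = dataset.map(preprocess_tf)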

How to add colour to graph to differentiate positive and negative-numbered data?

[Figure: scatter plot of the banknote authentication dataset]
I don't know how to add colors to the different dots to differentiate between the positive and negative datasets. I tried following other examples, but I did not make any progress.
For the record, the Python coding I used is as follows:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

data = pd.read_csv('Banknote_authentication_dataset.csv')

# V1 is the Variance of Wavelet Transformed image
# V2 is the Skewness of Wavelet Transformed image
V1 = data['V1']
V2 = data['V2']
V1_V2 = np.column_stack((V1, V2))

km_res = KMeans(n_clusters=2).fit(V1_V2)
clusters = km_res.cluster_centers_

plt.xlabel('Variance')
plt.ylabel('Skewness')
plt.scatter(V1, V2)
plt.scatter(clusters[:, 0], clusters[:, 1], s=1000, alpha=0.50)
The link to the dataset is: https://d3c33hcgiwev3.cloudfront.net/1fXr31hcEemkYxLyQ1aU1g_50fc36ee697c4b158fe26ade3ec3bc24_Banknote-authentication-dataset-.csv?Expires=1613433600&Signature=PhnPBuxjL9TwNwXV2dmS7HN3YOtLJsJo3A26UID0CBBC13cxsBmRmpsyUVN7MXIcrte6oUCBeybrhveDMCb-6-nMsQ8JzSH8qxZgYR7mwfO32WZYDQ7S6qm2Z6hFnkw76NIeEdto5L9CDDFpKkF8OhLd81bjxnTictbS1UTOPXw_&Key-Pair-Id=APKAJLTNE6QMUY6HBC5A.
You can get the predictions by using km_res.predict(V1_V2) and then just pass that into your first call to plt.scatter. So your code would change to look like:
# ... code above
preds = km_res.predict(V1_V2)
plt.scatter(V1, V2, c=preds)
# ... code below
If you want control over which colors are used, map the numeric predictions to color names yourself (for example, turn every point whose prediction is 1 into the string 'red').
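For example, to make cluster 1 red and cluster 0 blue explicitly (a small illustrative sketch):

preds = km_res.predict(V1_V2)
colors = np.where(preds == 1, 'red', 'blue')  # map numeric labels to color names
plt.scatter(V1, V2, c=colors)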

Numpy append sometimes works, sometimes doesn't

So I've been working on this facial identification project. It's for my science fair, and I'm in the phase where I'm trying to get data graphs, plots, and visualizations. I've got it to work to some extent, but it's not consistent: sometimes the code works, sometimes it gives me an error.
For some context, the error involves NumPy's append(). I have a variable I want to append data to, and when it fails the error is AttributeError: 'numpy.ndarray' object has no attribute 'append'.
#Although the results aren't as expected, this can make for a good demo in ISEF
#The whole refresh after a face is detected is cool and can be used to show how different faces cluster

# Numerical computation requirements
import numpy as np
from numpy import linalg, load, expand_dims, asarray, savez_compressed, append
from numpy.linalg import norm
import pandas as pd

# Plotting requirements
import matplotlib
from matplotlib import pyplot as plt
import matplotlib.patheffects as PathEffects
from matplotlib.animation import FuncAnimation as ani
import seaborn as sb

# Clustering requirements
import sklearn
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.preprocessing import scale

# Miscellaneous requirements
import os
import cv2
from PIL import Image
from mtcnn.mtcnn import MTCNN
from keras.models import load_model
from scipy.spatial.distance import squareform, pdist

# Initialize RNG seed and required size for Facenet
seed = 12345678
size = (160,160)

# Required networks
facenet = load_model('facenet_keras.h5')
fd = MTCNN()

# Initialize Seaborn plots
sb.set_style('darkgrid')
sb.set_palette('muted')
sb.set_context('notebook', font_scale=1.5, rc={'lines.linewidth': 2.5})

# Matplotlib animation requirements?
plt.style.use('fivethirtyeight')
fig = plt.figure()

# Load embeddings
data = load('jerome only npz/jerome embeddings.npz')
Data_1 = data['arr_0']
Dataset = []
for array in Data_1:
    Dataset.append(np.expand_dims(array, axis=0))

# Create cluster
cluster = KMeans(n_clusters=2, random_state=0).fit(Data_1)
y = cluster.labels_
z = pd.DataFrame(y.tolist())

faces = list()

def scatter(x, colors):
    palette = np.array(sb.color_palette('hls', 26))
    plot = plt.figure()
    ax = plt.subplot(aspect='equal')
    # sc = ax.scatter(x[:,0],x[:,1], lw =0, s=120, c=palette[colors.astype(np.int)])
    sc = ax.scatter(x[:,0], x[:,1], lw=0, s=120)
    labels = []
    return plot, ax, sc, labels

def detembed():
    cam = cv2.VideoCapture(0)
    _, frame = cam.read()
    info = fd.detect_faces(frame)
    if info != []:
        for i in info:
            print("***************** FACE DETECTED *************************************************")
            x, yc, w, h = i['box']
            x, y = abs(x), abs(yc)
            w, h = abs(w), abs(h)
            xx, yy = x+w, yc+h
            #cv2.rectangle(frame, (x,y), (xx,yy), (0,0,255),2)
            face = frame[yc:yy, x:xx]
            image = Image.fromarray(face)
            image = image.resize(size)
            arr = asarray(image)
            arr = arr.astype('float32')
            mean, std = arr.mean(), arr.std()
            arr = (arr - mean) / std
            samples = expand_dims(arr, axis=0)
            faces.append(samples)
    #cv2.imshow('Camera Feed', frame)

while True:
    detembed()
    embeddings = Dataset
    if not faces:
        continue
    else:
        for face in faces:
            embeds = facenet.predict(face)
            #switch these if conflicts arise
            embeddings.append(embeds)
    embeddings = asarray(embeddings)
    embeddings = embeddings[:,0,:]
    cluster = KMeans(n_clusters=2, random_state=0).fit(Data_1)
    y = cluster.labels_
    points = TSNE(random_state=seed).fit_transform(embeddings)
    # here "y" dictates the color of the plots depending on the kmeans algorithm
    scatter(points, y)
    graph = ani(fig, scatter, interval=20)
    fcount = len(embeddings)
    plt.text(0, 0, '{} points'.format(fcount))
    plt.show()
    # reset embeddings var to initial dataset
    Dataset = np.delete(Dataset, fcount - 1, 0)
    embeddings = Dataset
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cv2.release()
cv2.destroyAllWindows
Note that I am not a talented programmer; this code was botched from some example I found online. I had to pick up Python as I went along with this project. I do have a background in C, so I would say I get the basics of code logic.
Please help. I'm getting really desperate; the science fair is getting closer and I am a high schooler with no ML mentor. I live on an island (Guam) with no machine learning practitioners (not even in the university), so I turn to Stackoverflow.
There's no issue with NumPy's append(). In the third statement below you're calling .append() on something that is sometimes a NumPy array, and NumPy arrays have no .append() method:
Dataset.append(np.expand_dims(array, axis=0))
embeddings = Dataset
embeddings.append(embeds)
On the first pass, embeddings refers to the Python list Dataset, so the list's .append() works. But at the end of the loop you run Dataset = np.delete(Dataset, fcount - 1, 0), and np.delete returns a NumPy array, so on every later pass embeddings is an ndarray and the call fails. That's why it sometimes works and sometimes doesn't.
A simple fix would be to use this (note that np.append returns a new array rather than modifying in place):
embeddings = np.append(embeddings, embeds)
Or convert back to a list first:
embeddings = list(Dataset)
Hope that helps.
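To see the difference in isolation, here is a minimal standalone sketch: a Python list grows in place with .append(), while a NumPy array has no such method and np.append() returns a new array that must be reassigned.

import numpy as np

rows = [np.zeros(3)]
rows.append(np.ones(3))                  # fine: Python lists have .append()

arr = np.asarray(rows)                   # shape (2, 3); now an ndarray...
# arr.append(...)                        # ...so this would raise AttributeError
arr = np.append(arr, np.full((1, 3), 2.0), axis=0)  # returns a NEW (3, 3) array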

How to initialize and train an SVM with rootSIFT features in python

I have a CBIR system set up in python utilizing OpenCV. I have successfully extracted the keypoints and descriptors, clustered them using k-means to create a codebook, and have generated histograms describing images based on this codebook. I would like to know how I can use these histograms generated on the last line of this code to train an SVM, or if I am going about this in the wrong way.
import argparse
import glob
import cv2
import numpy
import pickle
import base64
from scipy.cluster.vq import *
from cassandra.cluster import Cluster

def compute(imagePath, eps=1e-7):
    sift = cv2.xfeatures2d.SIFT_create()
    image = cv2.imread(imagePath, 0)
    kp, des = sift.detectAndCompute(image, None)
    if des is not None:
        kp, des = sift.compute(image, kp)
        if len(kp) == 0:
            return ([], None)
        des /= (des.sum(axis=1, keepdims=True) + eps)
        des = numpy.sqrt(des)
        des = whiten(des)
    return kp, des

for imagePath in dataset:
    kp, des = compute(imagePath)
    codes, distortion = vq(des, codebook)
    hist, bins = numpy.histogram(codes, K)
Take a look at sklearn.svm and read up on how SVM classification works.
You can follow the common Bag of Words procedure: for each image feature, select the nearest codeword of the dictionary (according to some measure of feature distance/similarity), and use the resulting histogram as the image's feature vector for the classifier.
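As a minimal sketch of the scikit-learn side (untested; histograms and labels are placeholder names for the per-image histograms collected in your loop above and whatever class labels your dataset provides):

import numpy as np
from sklearn.svm import SVC

X = np.vstack(histograms)   # one K-bin codeword histogram per image, shape (n_images, K)
y = np.array(labels)        # one class label per image

clf = SVC(kernel='linear')  # linear is a reasonable default for BoW histograms
clf.fit(X, y)
print(clf.predict(X[:5]))   # sanity check on a few training images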

Kullback-Leibler divergence of the same image - scipy.stats.entropy

I was calculating the KL divergence between the histograms of 3 images:
import numpy as np
import scipy.misc
from skimage.io import ImageCollection, imread
from skimage import color
import skimage
from sklearn.datasets import load_sample_image
# all images in grayscale
lena = scipy.misc.lena().astype('uint8')
china = skimage.img_as_ubyte(color.rgb2grey( load_sample_image("china.jpg")) )
flower = skimage.img_as_ubyte(color.rgb2grey( load_sample_image("flower.jpg")) )
# histograms for all images
hist_lena, bin_edges_lena = np.histogram(lena, bins = range(256))
hist_china, bin_edges_china = np.histogram(china, bins = range(256))
hist_flower, bin_edges_flower = np.histogram(flower, bins = range(256))
When I use scipy.stats.entropy to compare an image's histogram with itself, I get different results:
# http://docs.scipy.org/doc/scipy-dev/reference/generated/scipy.stats.entropy.html
from scipy.stats import entropy
print(entropy(pk=hist_lena, qk=hist_lena))     # nan
print(entropy(pk=hist_china, qk=hist_china))   # -0.0
print(entropy(pk=hist_flower, qk=hist_flower)) # nan
I was expecting zero as the result in all three cases.
Am I applying the entropy function correctly?
Does it seem correct to apply this function on images histograms?
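Applying KL divergence to (normalized) image histograms is a standard comparison; the nan most likely comes from empty histogram bins. In older SciPy versions, entropy(pk, qk) evaluates 0 * log(0/0) for bins where both counts are zero, which yields nan. Dropping the empty bins (or adding a small epsilon to both arguments) gives the expected 0 for identical histograms; a minimal sketch using the histograms above:

from scipy.stats import entropy

nz = hist_lena > 0                                   # keep only non-empty bins
print(entropy(pk=hist_lena[nz], qk=hist_lena[nz]))   # 0.0 for identical histograms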
