I have a CNN trained on classes = [dog, cat, frog], and in the testing phase only, I want to include several pictures of horses to see which known classes those images get classified as. Any idea how to implement this in a Keras model?
One thing I've tried, but don't like, is to distribute the horse pictures equally and randomly across the training images for the known classes (dog, cat, and frog) and then see what happens with the testing images. I'm worried the number of horse images (though relatively small) would negatively impact the model's knowledge of the known classes. Here is the corresponding code:
# x_train, x_test, y_train, and y_test have already been created prior to this step
import random

random.seed(1)
clsLst = [dog, cat, frog]      # column indices of the known classes in the one-hot labels
clsRemove = horse              # column index of the class to remove

newClsLst = [0, 0, 0]          # counts how many horse images each known class receives
for i in range(len(y_train)):  # was "for I", which never matched the "i" used below
    if y_train[i][clsRemove] == 1.0:   # "==" for comparison, not "="
        y_train[i][clsRemove] = 0.0
        randIndex = random.randint(0, len(clsLst) - 1)  # was randint(0, 8): out of range for 3 classes
        newCls = clsLst[randIndex]
        newClsLst[newCls] += 1
        y_train[i][newCls] = 1.0
This is only my second time using Keras, and I don't have a programming background, so any tips and over-explaining are appreciated.
As you have correctly noted yourself, adding the horse images to your training data is a bad idea - unless, of course, you want to expand your model's classification capabilities so that it learns to identify horses.
That said, you could simply add the horse images to x_test or set up a separate testing dataset (say, x_test_horses) for this specific testing purpose, i.e. what horses are (mis)classified as.
As Saankhya Mondal has pointed out in a comment below your original post, with both options you can simply use model.predict() to make predictions (y_pred = model.predict(x_test) and y_pred = model.predict(x_test_horses), respectively).
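A minimal sketch of the second option (assuming x_test_horses is an array of preprocessed horse images and the model's output columns follow the order dog, cat, frog):

import numpy as np

# Predict class probabilities for the horse-only test set
probs = model.predict(x_test_horses)
pred_classes = np.argmax(probs, axis=1)  # index of the winning known class per image

# Count how often the horses land in each known class
counts = np.bincount(pred_classes, minlength=3)
print(dict(zip(['dog', 'cat', 'frog'], counts)))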
My question is close to this thread, but the difference is that I want my training and test datasets to be spatially disjoint, so that no two samples come from the same geographical region -- you can define the region by county, state, or a random geographical grid you create for your own dataset, among others. An example of my dataset is like THIS, which is an instance segmentation task for satellite imagery.
I know pytorch has this capability for random splitting:
train_size = int(0.75 * len(full_dataset))
test_size = len(full_dataset) - train_size
train_dataset, test_dataset = torch.utils.data.random_split(full_dataset, [train_size, test_size])
However, perhaps what I want is a spatially_random_splitting functionality.
The picture below also illustrates the question; in my case, each point is an image with associated labels.
I am not completely sure what your dataset and labels look like, but from what I see, why not cut the images into predefined chunk sizes as described here - https://stackoverflow.com/a/63815878/4471672 - save each chunk in a different folder according to its location, and then sample randomly from whichever set you need (or know to be "spatially disjoint"). A sketch of this idea follows below.
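A minimal sketch of that grid-based split, assuming each sample carries (x, y) coordinates (the sample format, cell size, and test fraction are all assumptions to adapt):

import random

def spatial_split(samples, cell_size=1.0, test_fraction=0.25, seed=1):
    # Group samples by the grid cell their (x, y) coordinate falls into,
    # then assign whole cells to train or test so the two sets never share a region.
    cells = {}
    for s in samples:
        key = (int(s['x'] // cell_size), int(s['y'] // cell_size))
        cells.setdefault(key, []).append(s)
    keys = sorted(cells)
    random.Random(seed).shuffle(keys)
    n_test = int(len(keys) * test_fraction)
    test = [s for k in keys[:n_test] for s in cells[k]]
    train = [s for k in keys[n_test:] for s in cells[k]]
    return train, test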
I found the answer via the TorchGEO library. Thank you all.
from torch.utils.data import DataLoader
from torchgeo.datasets import stack_samples
from torchgeo.samplers import RandomGeoSampler

sampler = RandomGeoSampler(dataset, size=256, length=10000)
dataloader = DataLoader(dataset, batch_size=128, sampler=sampler,
                        collate_fn=stack_samples)
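A quick sanity check of one batch (the 'image' key is an assumption; the available keys depend on the specific torchgeo dataset):

# Pull one batch and inspect its shape
batch = next(iter(dataloader))
print(batch['image'].shape)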
Premise: I have been working on this ML dataset and I found my AdaBoost and SVM models to be extremely good at detecting true positives. The confusion matrix for both models is identical, shown in the image below:
[confusion matrix image]
Out of the 10 models I have trained, 2 of them are AdaBoost and SVM. Of the other 8, some have lower accuracy and others higher, by roughly ±2%.
MAIN QUESTION:
How do I chain/pipeline so that all my test cases are handled in the following manner?
Pass all the cases through SVM and ADA. If either SVM or ADA has 80%+ confidence, return the result.
Else, if neither SVM nor ADA has high confidence, have only those test cases evaluated by the other 8 models for a final decision.
Potential Solution:
My potential attempt involved the use of 2 voting classifiers: one classifier with just ADA and SVM, and a second classifier with the other 8 models. But I don't know how to make this work.
Here's the code for my approach:
from sklearn.ensemble import VotingClassifier

ensemble1 = VotingClassifier(estimators=[
    ('SVM', model[5]),
    ('ADA', model[7]),
], voting='hard').fit(X_train, Y_train)

print('The accuracy for ensembled model is:', ensemble1.score(X_test, Y_test))

# I was trying to make ensemble1 the "first pass": if it was more than 80%
# confident in its decision, return the result.
# ELSE, ensemble2 jumps in to make a decision.
ensemble2 = VotingClassifier(estimators=[
    ('LR', model[0]),
    ('DT', model[1]),
    ('RFC', model[2]),
    ('KNN', model[3]),
    ('GBB', model[4]),
    ('MLP', model[6]),
    ('EXT', model[8]),
    ('XG', model[9])
], voting='hard').fit(X_train, Y_train)
# I don't know how to make these two models work together though.
Extra Questions:
These questions cover some extra concerns I had and are NOT the main question:
Is what I am trying to do worth it?
Is it normal to have a confusion matrix with just true positives and false positives, as seen above in the picture for Model 5? Or is this indicative of incorrect training?
Are the accuracies of my models at an individual level considered good? The models predict the likelihood of developing heart disease. Accuracies below:
[model accuracies image]
Sorry for the long post and thanks for all your input and suggestions. I'm new to ML so I'd appreciate any pointers.
This is a simple implementation, that hopefully solves your main problem of chaining multiple estimators:
import numpy as np
from sklearn.base import BaseEstimator, ClassifierMixin

class ChainEstimator(BaseEstimator, ClassifierMixin):
    def __init__(self, est1, est2):
        self.est1 = est1
        self.est2 = est2

    def fit(self, X, y):
        self.est1.fit(X, y)
        self.est2.fit(X, y)
        return self

    def predict(self, X):
        ans = np.zeros((len(X),)) - 1                      # -1 marks "not yet predicted"
        probs = self.est1.predict_proba(X)                 # averaged confidence of Ada & SVC (soft voting)
        conf_samples = np.any(probs >= .8, axis=1)         # samples with >=80% confidence
        ans[conf_samples] = np.argmax(probs[conf_samples, :], axis=1)  # predicted classes of confident samples
        if conf_samples.sum() < len(X):                    # use est2 for the non-confident samples
            ans[~conf_samples] = self.est2.predict(X[~conf_samples])
        return ans
Which you can call like this:
est1 = VotingClassifier(estimators=[('ada',AdaBoostClassifier()),('svm',SVC(probability=True))],voting='soft')
est2 = VotingClassifier(estimators=[('dt',DecisionTreeClassifier()),('knn',KNeighborsClassifier())])
clf = ChainEstimator(est1,est2).fit(X_train,Y_train)
ans = clf.predict(X_test)
Now, if you want to base your chaining on the performance of est1, you can do something like the following to record its performance during training, and then add a few more ifs in the predict function, as sketched below:
from sklearn.model_selection import cross_val_score

def fit(self, X, y):
    self.est1.fit(X, y)
    self.est1_perf = cross_val_score(self.est1, X, y, cv=4, scoring='f1_macro')
    self.est2.fit(X, y)
    self.est2_perf = cross_val_score(self.est2, X, y, cv=4, scoring='f1_macro')
    return self
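For example (a sketch; the 0.8 cutoff on the cross-validated score is an assumed threshold), predict could skip est1 entirely whenever it scored poorly during training:

def predict(self, X):
    # Fall back to est2 for everything if est1's mean CV F1 was too low
    if self.est1_perf.mean() < 0.8:
        return self.est2.predict(X)
    ans = np.zeros((len(X),)) - 1
    probs = self.est1.predict_proba(X)
    conf_samples = np.any(probs >= .8, axis=1)
    ans[conf_samples] = np.argmax(probs[conf_samples, :], axis=1)
    if conf_samples.sum() < len(X):
        ans[~conf_samples] = self.est2.predict(X[~conf_samples])
    return ans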
Note that you shouldn't be using simple accuracy for a problem like this.
I've trained my model in keras to classify images of food in 16 categories.
I've got 1000 samples per category. My model is doing pretty well in this case, with about 95% accuracy.
Now I want to reuse this model on the same set of images, but reduced to two categories (healthy and unhealthy).
I have tried to do this like this:
from keras.models import load_model, Model
from keras.layers import Dense

model = load_model('model_saved.h5')
ll = model.layers[-2].output  # output of the second-to-last layer
ll = Dense(1, activation="sigmoid", name="densehh_out")(ll)
model = Model(inputs=model.input, outputs=ll)
but accuracy starts at about 50% and won't pass 72%.
Also, the loss is pretty jumpy on my val_data.
Is there any better way to achieve better accuracy and convergent loss function?
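For example, I wonder whether freezing the pretrained layers and recompiling would help, along these lines (an untested sketch on my part; the optimizer and learning rate are guesses):

from keras.optimizers import Adam

# Freeze everything except the new sigmoid head so the pretrained features are kept
for layer in model.layers[:-1]:
    layer.trainable = False

model.compile(optimizer=Adam(learning_rate=1e-4),
              loss='binary_crossentropy',
              metrics=['accuracy'])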
Context to what I'm trying to achieve:
I have a problem regarding image classification using scikit. I have CIFAR-10 data: 10000 training images and 1000 testing images. Each test/train image is stored in a test/train npy file as a 4-D matrix (height, width, rgb, sample). I also have test/train labels. I have a 'computeFeatures' method that uses the Histogram of Oriented Gradients (HOG) method to represent image-domain features as a vector. I am trying to iterate this method over both the training and testing data so that I can create an array of features that can be used later to classify the images. I have tried creating a for loop using i and storing the results in a numpy array. I must then continue to apply PCA/LDA and do image classification with SVC, CNN, etc. (any method of image classification).
import numpy as np
import skimage.feature
from sklearn.decomposition import PCA
from sklearn.svm import SVC

trnImages = np.load('trnImage.npy')
tstImages = np.load('tstImage.npy')
trnLabels = np.load('trnLabel.npy')
tstLabels = np.load('tstLabel.npy')

def computeFeatures(image):
    # HOG feature vector for a single image
    hog_feature, hog_as_image = skimage.feature.hog(image, visualize=True, block_norm='L2-Hys')
    return hog_feature

trnArray = np.zeros([10000, 324])
tstArray = np.zeros([1000, 324])

for i in range(0, 10000):
    trnFeatures = computeFeatures(trnImages[:, :, :, i])
    trnArray[i, :] = trnFeatures

for i in range(0, 1000):
    tstFeatures = computeFeatures(tstImages[:, :, :, i])
    tstArray[i, :] = tstFeatures

pca = PCA(n_components=2)
trnModel = pca.fit_transform(trnArray)

pca = PCA(n_components=2)
tstModel = pca.fit_transform(tstArray)

# Divide the dataset into the two sets.
test_data = tstModel
test_labels = tstLabels
train_data = trnModel
train_labels = trnLabels

C = 1
model = SVC(kernel='linear', C=C)
model.fit(train_data, train_labels.ravel())

y_pred = model.predict(test_data)
accuracy = np.sum(np.equal(test_labels, y_pred)) / test_labels.shape[0]
print('Percentage accuracy on testing set is: {0:.2f}%'.format(accuracy))
Accuracy prints out as 100%; I'm pretty sure this is wrong, but I'm not sure why.
First of all,
pca = PCA(n_components = 2)
tstModel = pca.fit_transform(tstArray)
this is wrong. You have to use:
tstModel = pca.transform(tstArray)
Secondly, how did you select the dimensionality of the PCA? Why 2? Why not 25 or 100? 2 principal components may be too few for images. Also, as I understand it, the datasets are not scaled prior to PCA.
Just for interest, check the balance of classes.
Regarding 'shall we use PCA before SVM or not': it highly depends on the data. Try both cases and then decide. SVC may be pretty slow to compute, so PCA (or another dimensionality reduction technique) may speed it up a little. But you need to check both cases.
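A minimal sketch of the corrected pipeline (scaling and the number of components here are assumptions to tune):

from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Fit the scaler and the PCA on the training features only,
# then apply the same fitted transforms to the test features.
scaler = StandardScaler().fit(trnArray)
trnScaled = scaler.transform(trnArray)
tstScaled = scaler.transform(tstArray)

pca = PCA(n_components=100).fit(trnScaled)
trnModel = pca.transform(trnScaled)
tstModel = pca.transform(tstScaled)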
The immediate concern in this sort of situation is that the model is over-fitted. Any professional reviewer would immediately return this to the investigator. In this case, I suspect it is a result of the statistical approach used.
I don't work with images, but I would question why PCA was being stacked onto SVM. In common speak, you are using two successive methods that reduce/collapse hyper-dimensional space. This would very likely lead to a definite outcome. If you collapse high-level dimensionality once, why repeat it?
The PCA is standard for images, but should be followed by something very simple such as K-means.
The other approach instead of PCA is, of course, NMF and I would recommend it if you feel PCA is not providing the resolution sought.
Otherwise the calculation looks fine.
accuracy = np.sum(np.equal(test_labels, y_pred)) / test_labels.shape[0]
On second thoughts, the accuracy index might not be concerned with over-fitting, IF (that's a grammatical-emphasis-type 'IF') test_labels contained a prediction of the image (of which ~50% are incorrect).
I'm just guessing that this is what the "test_labels" data is, however, and we have no idea how that prediction was derived. So I'm not sure there's enough information to answer the question.
BTW, could someone explain "shape[0]", please? Is it needed?
One obvious problem with your approach is that you apply PCA in a rather peculiar way. You should typically only estimate one transform -- on the training data -- and then use it to transform any evaluation set as well.
This way, you kind of... implement SVM with whitening batch-norm, which sounds cool, but is at least rather unusual. So it would need much care. E.g. this way, you cannot classify a single sample. Still, it may work as an unsupervised adaptation technique.
Apart from that, it's hard to tell without access to your data. Are you sure that the test and train sets are disjoint?
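If you want to check that quickly, something like this might do (a sketch assuming the (height, width, rgb, sample) layout from the question):

import numpy as np

# Count test images that also appear pixel-for-pixel in the training set
overlap = 0
for i in range(tstImages.shape[3]):
    same = np.all(trnImages == tstImages[:, :, :, i:i+1], axis=(0, 1, 2))
    overlap += int(same.any())
print('test images also present in train:', overlap)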
I am currently teaching myself the basics of machine learning by creating a simple image classifier using Keras (with a Tensorflow backend). The model classifies a (greyscaled) image as either a cat or not a cat.
My model is relatively good at this task, so I now want to see if it can generate images that it would classify as a cat.
I have attempted to start this in a simple way, by creating a random array of the same shape as the images, with random numbers in each index:
import numpy as np
from random import randint
from keras.models import model_from_json

# Load the trained model architecture and weights
json_file = open('model.json', 'r')
loaded_model_json = json_file.read()
json_file.close()
model = model_from_json(loaded_model_json)
model.load_weights("model_weights.h5")

confidence = 0.0
thresholdConfidence = 0.6
while confidence < thresholdConfidence:
    # Build a random 64x64x1 image and add a batch dimension
    img_array = np.array([[[randint(0, 255) for z in range(1)] for y in range(64)] for x in range(64)])
    img_array = img_array.reshape((1,) + img_array.shape)
    confidence = model.predict(img_array)
This method is obviously not good at all, since it just creates random things and could potentially run eternally. Could the model somehow run in reverse by telling it that an array is 100% cat, and having it predict what the array representation of the image is?
Thank you for reading.
[This is my first post on StackOverflow, so please let me know if I've done something wrong!]
If you wish to generate a particular type of image, you can use Generative Adversarial Networks (GANs). These are made of two parts, which are trained in alternation:
Generator: creates images from random noise.
Discriminator: gives the generator feedback on the realism of its images.
You can refer here.
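As a lighter-weight alternative to a full GAN, the "run the model in reverse" idea from the question can be approximated by gradient ascent on the input pixels (a sketch, assuming a tf.keras model that outputs P(cat) for 64x64x1 inputs scaled to [0, 1]):

import tensorflow as tf

# Start from random noise and nudge the pixels toward a higher cat probability
img = tf.Variable(tf.random.uniform((1, 64, 64, 1)))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.05)

for step in range(200):
    with tf.GradientTape() as tape:
        loss = -model(img)[0, 0]  # maximize P(cat) by minimizing its negative
    grads = tape.gradient(loss, img)
    optimizer.apply_gradients([(grads, img)])
    img.assign(tf.clip_by_value(img, 0.0, 1.0))  # keep pixels in a valid range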