I'm having some difficulty getting a series of images into the correct format to feed into sklearn.svm.SVC.
This is my first image recognition project, so I'm struggling a bit.
I've got a loop which reads a bunch of base64-encoded RGB images (of different sizes) into a dataframe:
imageData = mpimg.imread(io.BytesIO(base64.b64decode(value)),format='JPG')
Then I convert the RGB image to grayscale and flatten it:
data_images = rgb2gray(imageData).ravel()
where rgb2gray is:
def rgb2gray(rgb):
    r, g, b = rgb[:,:,0], rgb[:,:,1], rgb[:,:,2]
    gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
    return gray
If I look at the size differences
df_raw.sample(10)
We can see that the flattened pixel lengths differ between my samples. I'm a little confused about how to proceed. For lack of a better idea, I decided to pad every picture to the length of the largest one,
df_raw.picLen.max()
then append zeros to the end of each 1D picture array:
def padPic(x,numb,maxN):
    N = maxN-len(x)
    out = np.pad(x,(numb,N),'constant')
    return out
calling
df_raw['picNew'] = df_raw.apply(lambda row: padPic(row['pic'],0,df_raw.picLen.max()), axis=1)
df_raw['picNewLen'] = df_raw.apply(lambda row: len(row['picNew']), axis=1)
I now have arrays all of the same size
From here I attempt to fit a support vector classifier, using the picture data as X and a set of labels as y.
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(df_raw.picNew, df_raw.name, test_size = 0.2, random_state=42)
check the size:
print('Training data and target sizes: \n{}, {}'.format(X_train.shape,y_train.shape))
print('Test data and target sizes: \n{}, {}'.format(X_test.shape,y_test.shape))
Training data and target sizes:
(198,), (198,)
Test data and target sizes:
(50,), (50,)
Having convinced myself everything is ready, I try to fit the model:
svm = SVC()
svm.fit(X_train, y_train)
This throws an error, and I can't figure out why:
/opt/wakari/anaconda/envs/ulabenv_2018-11-13_10.15.00/lib/python3.5/site-packages/numpy/core/numeric.py in asarray(a, dtype, order)
499
500 """
--> 501 return array(a, dtype, copy=False, order=order)
502
503
ValueError: setting an array element with a sequence.
I think this must have to do with the array size, but I can't figure it out. :-/
Beyond the error, I have a more general question about my approach. In particular, I think my "padding" is probably incorrect and maybe some kind of resize would be better.
I appreciate any feedback on my methodology. Thanks
I am pretty sure this is due to using lists in a feature column and strings as target values. For the latter, you need to use the LabelEncoder class to turn them into normalized class labels, as required by fit().
See description here:
https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.LabelEncoder.html
This needs to be done before train/test split to make sure you have all names 'seen' by the LabelEncoder.
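As a minimal sketch of the encoding step described above (the names list is hypothetical, standing in for df_raw.name), LabelEncoder maps each distinct string to an integer and can map predictions back again. Note that scikit-learn classifiers such as SVC also accept string labels directly, so treat this as one option rather than a strict requirement.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical label column, standing in for df_raw.name
names = ["alice", "bob", "alice", "carol"]

encoder = LabelEncoder()
# Fit on ALL labels before any train/test split, so every name is known
y = encoder.fit_transform(names)

# Integer codes follow the sorted order of the distinct labels
print(encoder.classes_)
print(y)

# Map integer predictions back to the original names
decoded = encoder.inverse_transform(y)
print(decoded)
```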
For the former, you might want to search for MNIST tutorials, which provide a plethora of algorithms applied to image classification problems.
Also, resizing before flattening should work better than padding.
I've figured out the problem.
Thank you to Artem for catching my obvious problem of not encoding the classes, but this was not my issue in the end.
It turns out that the way my picture array was represented was incorrect.
The original array's shape, df_raw['picNew'].shape, evaluates to
(248,)
What I needed was a 2D representation
np.stack(df_raw['picNew'] , axis=1).shape
(830435, 248)
All good now.
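To illustrate why the 1D representation fails, here is a small sketch with made-up sizes (4 pictures of 6 pixels each, not the real data). A column of arrays behaves like a 1D object array whose elements are themselves arrays, which cannot be converted to a numeric matrix element by element. Stacking turns it into a real 2D array. scikit-learn documents X as (n_samples, n_features), so this sketch stacks with axis=0 to get one picture per row; that choice is an assumption about the intended layout.

```python
import numpy as np

# Stand-in for df_raw['picNew'].values: 4 equal-length 1D pictures
pics = np.empty(4, dtype=object)
for i in range(4):
    pics[i] = np.full(6, float(i))

print(pics.shape)   # 1D: one object per picture, not a numeric matrix

# Stack into a numeric 2D array with one row per picture
X = np.stack(list(pics), axis=0)
print(X.shape, X.dtype)
```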
I'm still unsure about the most "correct" way to make all the images the same length. Appending zeros to the arrays seems a bit unsophisticated... So if anyone has an idea :)
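One common alternative to padding is to resize every grayscale image to the same height and width before flattening, so each pixel position means roughly the same thing in every sample. The sketch below is only an illustration: resize_nearest is a hypothetical helper that does plain nearest-neighbour sampling in NumPy, standing in for a library resize (for example the skimage or Pillow resize functions, which also interpolate), and the image sizes are invented.

```python
import numpy as np

def resize_nearest(gray, out_h, out_w):
    """Nearest-neighbour resize of a 2D array (hypothetical helper)."""
    in_h, in_w = gray.shape
    # For each output row/column, pick the source index it maps to
    rows = np.arange(out_h) * in_h // out_h
    cols = np.arange(out_w) * in_w // out_w
    return gray[rows[:, None], cols]

rng = np.random.default_rng(0)
# Invented grayscale images of different sizes
images = [rng.random((h, w)) for h, w in [(120, 80), (64, 64), (200, 150)]]

# Resize to a common 32x32 shape, flatten, and stack one image per row
X = np.stack([resize_nearest(img, 32, 32).ravel() for img in images], axis=0)
print(X.shape)
```

The target size (32x32 here) is a trade-off: smaller images mean fewer features and faster training, at the cost of detail.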
I am trying to solve a classification problem with a neural network. After I get the predictions, I want to create a pandas DataFrame with a column from the test dataset as the first column and my predictions as the second. But I keep getting an error. Here is my code: [image: code screenshot]
and here is my error:
[image: error screenshot]
Important sidenote: Please, take some time to look into How to make good reproducible pandas examples, there are great suggestions there on how you could ask your question better.
Now for your error:
Data must be 1-dimensional
That means pandas wants a 1-dimensional array, i.e. of the form [0,0,1,1,...,1]. But your preds array is 2-dimensional, i.e. of the form [[0],[0],[1],[1],...,[1]].
So you need to flatten the preds array.
Instead of for-loops, consider using a list comprehension to change your code to something like this:
predictions = [1 if p>0.5 else 0 for p in preds]
df = pd.DataFrame({'PassengerId': test['PassengerId'].values,
                   'Survived': predictions})
Also, in the meantime, look into the ndarray.round method; maybe it will fit your use case better (the result still has to be flattened to 1D):
predictions = preds.round().ravel()
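The shape difference can be checked with a small sketch. The preds values here are invented and stand in for the (n_samples, 1) output that many neural network libraries return for a single output unit.

```python
import numpy as np

# Invented predictions with shape (n_samples, 1), i.e. 2-dimensional
preds = np.array([[0.1], [0.8], [0.4], [0.9]])
print(preds.shape)

# ravel() gives a 1-dimensional view of the same values
flat = preds.ravel()
print(flat.shape)

# Threshold at 0.5 to get 0/1 class predictions as a 1D integer array
predictions = (flat > 0.5).astype(int)
print(predictions)
```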
I am using sklearn.svm.SVR for a regression task in which I want to use my own customized kernel. Here is a sample of the dataset and the code:
index  density      speed      label
0           14  58.844020  77.179139
1           29  67.624946  78.367394
2           44  77.679100  79.143744
3           59  79.361877  70.048869
4           74  72.529289  74.499239
.... and so on
from sklearn import svm
import pandas as pd
import numpy as np
density = np.random.randint(0,100, size=(3000, 1))
speed = np.random.randint(20,80, size=(3000, 1)) + np.random.random(size=(3000, 1))
label = np.random.randint(20,80, size=(3000, 1)) + np.random.random(size=(3000, 1))
d = np.hstack((density, speed, label))
data = pd.DataFrame(d, columns=['density', 'speed', 'label'])
data.density = data.density.astype(dtype=np.int32)
def my_kernel(X,Y):
    return np.dot(X,X.T)
svr = svm.SVR(kernel=my_kernel)
x = data[['density', 'speed']].iloc[:2000]
y = data['label'].iloc[:2000]
x_t = data[['density', 'speed']].iloc[2000:3000]
y_t = data['label'].iloc[2000:3000]
svr.fit(x,y)
y_preds = svr.predict(x_t)
The problem happens in the last line, svr.predict, which says:
X.shape[1] = 1000 should be equal to 2000, the number of samples at training time
I searched the web for a way to deal with the problem, but many similar questions (like {1}, {2}, {3}) were left unanswered.
I had used SVM methods with rbf, sigmoid, ... kernels before and the code worked just fine, but this was my first time using a customized kernel, so I suspected that must be the reason for this error.
After a little research and reading the documentation, I found out that when using precomputed kernels, the matrix passed to SVR.predict() must have shape [n_samples_test, n_samples_train].
How can I modify x_test so that I get predictions and everything works just as it does when we don't use customized kernels?
If possible, please explain why the inputs to svm.predict for a precomputed kernel differ from those for the other kernels.
I really hope the related unanswered questions can be answered as well.
The problem is in your kernel function: it doesn't do the job.
As the documentation https://scikit-learn.org/stable/modules/svm.html#using-python-functions-as-kernels says, "Your kernel must take as arguments two matrices of shape (n_samples_1, n_features), (n_samples_2, n_features) and return a kernel matrix of shape (n_samples_1, n_samples_2)." The sample kernel on the same page satisfies this criterion:
def my_kernel(X, Y):
    return np.dot(X, Y.T)
In your function the second argument of dot is X.T, and thus the output will have shape (n_samples_1, n_samples_1), which is not what is expected.
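To make the shape requirement concrete, here is a minimal sketch on invented random data (200 training and 50 test samples; the variable names are assumptions, not the asker's). It fits SVR with a callable linear kernel, then repeats the fit with kernel="precomputed". In the precomputed case predict needs the kernel values between every test sample and every training sample, which is why its input has shape (n_samples_test, n_samples_train) rather than (n_samples_test, n_features).

```python
import numpy as np
from sklearn import svm

def linear_kernel(X, Y):
    # (n_samples_1, n_features) @ (n_features, n_samples_2)
    # -> (n_samples_1, n_samples_2)
    return np.dot(X, Y.T)

rng = np.random.default_rng(0)
X_train = rng.random((200, 2))   # invented features
y_train = rng.random(200)        # invented targets
X_test = rng.random((50, 2))

# Callable kernel: pass raw features to fit and predict
svr = svm.SVR(kernel=linear_kernel)
svr.fit(X_train, y_train)
y_pred = svr.predict(X_test)

# Precomputed kernel: pass kernel matrices instead of features
gram_train = linear_kernel(X_train, X_train)   # (200, 200)
gram_test = linear_kernel(X_test, X_train)     # (50, 200)
svr_pre = svm.SVR(kernel="precomputed")
svr_pre.fit(gram_train, y_train)
y_pred_pre = svr_pre.predict(gram_test)

print(gram_test.shape, y_pred.shape, y_pred_pre.shape)
```

With a callable kernel, scikit-learn calls the function itself with the test and training data, so no manual reshaping of x_test is needed once the kernel uses both arguments.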
A shape mismatch here means the test data and the training data do not have compatible shapes; always think in terms of matrices or arrays in numpy. Any arithmetic operation needs compatible shapes, which is why we check array.shape.
You can modify the input to get the [n_samples_test, n_samples_train] shape, but it's not the best idea. array.shape, reshape, and resize are used for that.
I want to evaluate if an event is happening in my screen, every time it happens a particular box/image shows up in a screen region with very similar structure.
I have collected a bunch of 84x94 .png RGB images from that screen region and I'd like to build a classifier to tell me if the event is happening or not.
Therefore my idea was to create a pd.DataFrame (df) containing 2 columns, df['np_array'] contains every picture as a np.array and df['is_category'] contains boolean values telling if that image is indicating that the event is happening or not.
The structure looks like this (with != size):
I have resized the images to 10x10 for training and converted to greyscale
df = pd.DataFrame(
    {'np_array': [np.random.random((10, 10,2)) for x in range(0,10)],
     'is_category': [bool(random.getrandbits(1)) for x in range(0,10)]
    })
My problem is that I can't fit a scikit-learn classifier by doing clf.fit(df['np_array'],df['is_category'])
I've never tried image recognition before. Thanks in advance for any help!
If it's a 10x10 grayscale image, you can flatten it:
import numpy as np
from sklearn import ensemble
# generate 100 random 10x10 2d arrays (sample axis first)
image_data = np.random.rand(100, 10, 10)
# generate random labels
labels = np.random.randint(0,2, 100)
# flatten each image into a row of 100 features
X = image_data.reshape(100, -1)
# then use any scikit-learn classification model
clf = ensemble.RandomForestClassifier()
clf.fit(X, labels)
By the way, for images the best performing algorithms are convolutional neural networks.
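For the DataFrame layout described in the question, a rough sketch of the conversion could look like the following. The data is random and the column names are taken from the question; the 10x10 single-channel shape and the 20 rows are assumptions. The key step is stacking the column of arrays into one numeric array and flattening each image into a row before calling fit.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Invented data in the question's layout: one 10x10 array per row
df = pd.DataFrame({
    "np_array": [rng.random((10, 10)) for _ in range(20)],
    "is_category": rng.integers(0, 2, size=20).astype(bool),
})

# Stack the column into (n_samples, 10, 10), then flatten each image
X = np.stack(df["np_array"].to_list()).reshape(len(df), -1)
y = df["is_category"].to_numpy()

clf = RandomForestClassifier(n_estimators=10, random_state=0)
clf.fit(X, y)
print(X.shape, clf.predict(X[:3]))
```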
I have a list of variable-size images and wish to standardise them to 256x256. I used the following code:
import tensorflow as tf
import matplotlib.pyplot as plt
file_contents = tf.read_file('image.jpg')
im = tf.image.decode_jpeg(file_contents)
im = tf.image.resize_images(im, 256, 256)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
img = sess.run(im)
plt.imshow(img)
plt.show()
However, tf.image.resize_images() tends to mess up the image. Using tf.reshape(), on the other hand, seems to allow resize_images() to function correctly.
Tensorflow version : 0.8.0
Original Image:
Resized Image:
I know the skimage package can handle what I need, but I wish to keep using tf.train.shuffle_batch(). I am trying to avoid maintaining 2 identical datasets (one with a fixed image size), since Caffe seems to have no problem handling them.
This happens because resize_images() interpolates between adjacent pixels and returns floats instead of integers in the range 0-255. That's why NEAREST_NEIGHBOR does work: it takes the value of one of the nearby pixels without doing further math.
Suppose you have some adjacent pixels with values 240, 241. NEAREST_NEIGHBOR will return either 240 or 241. With any other method, the value could be something like 240.5, and it is returned without rounding, I assume intentionally, so you can decide what is better for you (floor, round up, etc).
plt.imshow(), on the other hand, when facing float values, interprets only the decimal part, as if they were pixel values on a full scale between 0.0 and 1.0.
To make the above code work, one of the possible solutions would be:
import numpy as np
plt.imshow(img.astype(np.uint8))
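The dtype issue can be seen without TensorFlow. The sketch below uses invented pixel values and plain NumPy: averaging two neighbouring uint8 pixels (a stand-in for what an interpolating resize does) produces a float, and there are two usual ways to get back to something imshow displays as intended for RGB data, namely integers in 0..255 or floats in 0..1.

```python
import numpy as np

# Two adjacent pixel values, stored as 8-bit integers
pixels = np.array([240, 241], dtype=np.uint8)

# Linear interpolation halfway between them yields a float
mid = (pixels[0].astype(np.float32) + pixels[1].astype(np.float32)) / 2.0
print(mid, mid.dtype)

# Option 1: cast back to uint8 (astype truncates toward zero)
as_uint8 = np.array([mid]).astype(np.uint8)

# Option 2: keep floats but rescale to the 0..1 range
as_unit_float = np.array([mid]) / 255.0

print(as_uint8, as_unit_float)
```

Whether to truncate, round, or rescale is a design choice; rounding before the cast (np.round) avoids a systematic downward bias.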
I am trying to train an image classifier in scikit-learn. I have a bunch of input images and I am using Pillow to process them. My question is about what shape to give the Pillow data to scikit-learn.
This is my code now:
import glob
import numpy as np
from PIL import Image, ImageFilter
from sklearn import svm

training = glob.glob('./img/training/*/*.bmp')
data = []
classes = []
for imagefile in training:
    edges = Image.open(imagefile).filter(ImageFilter.FIND_EDGES).convert("L")
    in_data = np.asarray(edges, dtype=np.uint8)
    data.append(in_data[0])
    if 'class1' in imagefile:
        classes.append('class1')
    else:
        classes.append('class2')
clf = svm.SVC(gamma=0.001, C=100.)
clf.fit(data, classes)
This runs without errors, but I have put the code together fairly crudely and I am not sure it is correct.
In particular, I'm not sure whether I should be using in_data[0]. I just did this because using in_data gives me an error: ValueError: Found array with dim 3. Estimator expected <= 2.
Unless you want only the first row of each image matrix (in_data[0] returns the first row), you probably want to use flattening.
Flattening takes each row of the image matrix and puts the rows one after another in a 1-dimensional vector.
So it becomes data.append(in_data.flatten())
You could resize your images to a smaller size first, to reduce the number of columns in your data matrix.
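The difference between in_data[0] and in_data.flatten() is easy to see on a tiny array. The 3x4 array below is an invented stand-in for a grayscale image loaded with np.asarray.

```python
import numpy as np

# Invented 3x4 "image": 3 rows of 4 pixels
in_data = np.arange(12, dtype=np.uint8).reshape(3, 4)

first_row = in_data[0]        # only the top row of pixels
flat = in_data.flatten()      # every pixel, row after row

print(first_row.shape, flat.shape)

# One flattened image per row gives the 2D matrix an estimator expects
data = np.stack([flat, flat[::-1]])
print(data.shape)
```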