I work with python and about 4000 images of watches (examples: watch_1, watch_2). The images are rgb and their resolution is 450x450. My aim is to find the most similar watches among them. For this reason I am using IncrementalPCA and partial_fit of scikit_learn to handle these big data with my 26GB RAM (see also: SO_Link_1, SO_Link_2). My source code is the following:
import cv2
import numpy as np
import os
from glob import glob
from sklearn.decomposition import IncrementalPCA
from sklearn import neighbors
from sklearn import preprocessing
data = []
# Read images from file #
for filename in glob('Watches/*.jpg'):
    img = cv2.imread(filename)
    height, width = img.shape[:2]
    img = np.array(img)
    # Check that all my images are of the same resolution
    if height == 450 and width == 450:
        # Reshape each image so that it is stored in one line
        img = np.concatenate(img, axis=0)
        img = np.concatenate(img, axis=0)
        data.append(img)
# Normalise data #
data = np.array(data)
Norm = preprocessing.Normalizer()
Norm.fit(data)
data = Norm.transform(data)
# IncrementalPCA model #
ipca = IncrementalPCA(n_components=6)
length = len(data)
chunk_size = 4
pca_data = np.zeros(shape=(length, ipca.n_components))
for i in range(0, length // chunk_size):
    ipca.partial_fit(data[i*chunk_size : (i+1)*chunk_size])
    pca_data[i * chunk_size: (i + 1) * chunk_size] = ipca.transform(data[i*chunk_size : (i+1)*chunk_size])
# K-Nearest neighbours #
knn = neighbors.NearestNeighbors(n_neighbors=4, algorithm='ball_tree', metric='minkowski').fit(data)
distances, indices = knn.kneighbors(data)
print(indices)
However, when I run this program on just 40 images of watches to start with, I get the following error when i = 1:
ValueError: Number of input features has changed from 4 to 6 between calls to partial_fit! Try setting n_components to a fixed value.
However, it is obvious that I set n_components to 6 when coding ipca = IncrementalPCA(n_components=6) but for some reason ipca considers chunk_size = 4 as the number of components when i = 0 and then when i = 1 changes to 6.
Why is this happening?
How can I fix it?
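This seems to follow from the math behind PCA: a batch of n_samples rows can yield at most n_samples principal components, so the problem is ill-conditioned for n_components > n_samples. With chunk_size = 4, the first call to partial_fit can only produce 4 components, which appears to be why the second call reports a change from 4 to 6.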
You might be interested in reading this (introduction of error-message) and some discussion behind it.
Try to increase the batch-size / chunk-size (or lowering n_components).
(In general I'm also somewhat sceptical about this approach. I hope you tested it on a small example dataset using batch PCA. Your watches do not seem to be preprocessed with regard to geometry (cropping), and histogram or colour normalization might help as well.)
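To make the "increase the chunk size" advice concrete, here is a minimal sketch of a chunking helper that guarantees every batch has at least n_components rows, including the last one. The name chunk_bounds and the sample numbers are hypothetical and not part of scikit-learn; it only computes the slice boundaries.

```python
def chunk_bounds(n_samples, chunk_size, n_components):
    """Yield (start, stop) pairs; every chunk holds at least n_components rows."""
    if chunk_size < n_components:
        raise ValueError("chunk_size must be >= n_components")
    if n_samples < n_components:
        raise ValueError("need at least n_components samples")
    start = 0
    while start < n_samples:
        stop = min(start + chunk_size, n_samples)
        # Fold a too-small trailing remainder into the current chunk.
        if n_samples - stop < n_components:
            stop = n_samples
        yield start, stop
        start = stop


# Hypothetical numbers: 40 images, batches of 16, 6 components.
bounds = list(chunk_bounds(n_samples=40, chunk_size=16, n_components=6))
print(bounds)
```

Each (start, stop) pair would then be used as ipca.partial_fit(data[start:stop]). It is probably also worth calling ipca.transform in a second loop, after all chunks have been fitted, so that every row is projected with the same final model.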
I'm working on an audio classification project: detecting whether a whale call is present in a given .wav file. The data preprocessing steps follow. The first cell creates a dataset of paths to positive and negative samples. The second cell shows the data type of the dataset object. As the third cell shows, when we iterate over a few samples in the dataset, each sample is a pair of tensors holding the path to a .wav file and its label. The fourth cell is the preprocessing method that I need to apply to each dataset sample. The problem is that dataset.map(preprocess) throws an error; see the end for more detail.
import librosa
import matplotlib.pyplot as plt
import librosa.display
import math
import tensorflow as tf
import numpy as np
FRAME_SIZE = 2048
HOP_LENGTH = 512
SR = 200
POS = '/kaggle/input/datafestintegration2023/train/train/1'
NEG = '/kaggle/input/datafestintegration2023/train/train/0'
pos = tf.data.Dataset.list_files(POS+'/*.wav')
neg = tf.data.Dataset.list_files(NEG+'/*.wav')
positives = tf.data.Dataset.zip((pos, tf.data.Dataset.from_tensor_slices(tf.ones(len(pos)))))
negatives = tf.data.Dataset.zip((neg,tf.data.Dataset.from_tensor_slices(tf.zeros(len(neg)))))
dataset = positives.concatenate(negatives)
dataset
for d in dataset.take(5):
    print(d)
def preprocess(file_path, label):
    wav, _ = librosa.load(file_path, sr=SR)
    wav = wav[:12000]
    zero_padding = tf.zeros([12000] - tf.shape(wav), dtype=tf.float32)
    wav = tf.concat([zero_padding, wav],0)
    wav = np.array(wav)
    mel_spectrogram = librosa.feature.melspectrogram(wav, sr=SR, n_fft=FRAME_SIZE, hop_length=HOP_LENGTH, n_mels=10)
    log_mel_spectrogram = librosa.power_to_db(mel_spectrogram)
    spectrogram = tf.expand_dims(log_mel_spectrogram, axis=2)
    return spectrogram, label
dataset = dataset.map(preprocess)
When I run the above code cell it throws the below error. As per my understanding the preprocess method is not able to fetch the paths from dataset. What should I do?
I am using the MICCAI BRATS 2015 database containing 3D MRI images of the dimensions 155x240x240.
I wanted to perform intensity standardization on these images, and am trying to use the IntensityRangeStandardization class from medpy.filter.
The code is simple:
Load 20 flair images from the database into an array:
from glob import glob
import SimpleITK as sitk
from medpy.filter import IntensityRangeStandardization
pth = 'C:/BRats2015/HGG' #path to the directory
flair = glob(pth + '*/*Flair*/*.mha') #contain paths to all images
flair = flair[:20] #choose 20 images
#load the 20 images in sitk format
im = []
for i in flair:
    im.append(sitk.ReadImage(i))
#convert them into numpy array
for i in range(len(im)):
    im[i] = sitk.GetArrayFromImage(im[i])
#initialize the filter
normalizer = IntensityRangeStandardization()
#train and transform the images
#the second returned variable contains the new images, hence [1]
im_n = normalizer.train_transform(im)[1]
I get the following error message:
File "intensity_range_standardization.py", line 268, in train
self.__stdrange = self.__compute_stdrange(images)
File "intensity_range_standardization.py", line 451, in __compute_stdrange
raise SingleIntensityAccumulationError('Image no.{} shows an unusual single-intensity accumulation that leads to a situation where two percentile values are equal. This situation is usually caused, when the background has not been removed from the image. Another possibility would be to reduce the number of landmark percentiles landmarkp or to change their distribution.'.format(idx))
SingleIntensityAccumulationError: Image no.0 shows an unusual single-intensity accumulation that leads to a situation where two percentile values are equal. This situation is usually caused, when the background has not been removed from the image. Another possibility would be to reduce the number of landmark percentiles landmarkp or to change their distribution.
Okay, I figured out how to call train_transform when we are given images and their masks. Here's the code from the medpy GitHub repo.
Reshaping the images should be easy, but I'll still post the link to the code in case of any confusion: Reshape the new images
The full code that worked for me:
images = [img1, img2, img3]
# each image is numpy array of shape (150,150)
masks = [i > 0 for i in images]
norm0 = IntensityRangeStandardization()
trained_model, transformed_images = norm0.train_transform([i[m] for i, m in zip(images, masks)])
norm_images = []
for ti, i, m in zip(transformed_images, images, masks):
    i[m] = ti
    norm_images.append(i)
To train and transform one after the other:
norm_images = []
trained_model = norm0.train([i[m] for i, m in zip(images, masks)])
transformed_images = [trained_model.transform(i[m], surpress_mapping_check = False) for i, m in zip(images, masks)]
for ti, i, m in zip(transformed_images, images, masks):
    i[m] = ti
    norm_images.append(i)
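The masking pattern above (extract the foreground with i[m], transform the 1-D values, write them back with i[m] = ti) can be sketched with plain NumPy. This is only an illustration: rescale is a hypothetical stand-in for the trained medpy transform, and the background is assumed to be exactly 0.

```python
import numpy as np


def apply_to_foreground(image, transform):
    """Apply `transform` to the non-zero pixels only and write them back."""
    mask = image > 0                    # assumption: background is exactly 0
    out = image.astype(np.float64)      # astype copies, so the input is untouched
    out[mask] = transform(image[mask])  # image[mask] is a 1-D array of foreground values
    return out


def rescale(values):
    """Hypothetical stand-in for a trained intensity transform: map to [0, 1]."""
    return (values - values.min()) / (values.max() - values.min())


img = np.array([[0, 10, 20],
                [0, 30, 0]])
result = apply_to_foreground(img, rescale)
print(result)
```

Because boolean indexing flattens the selected values, no explicit reshape is needed: assigning back through the same mask restores the original layout.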
I am trying to follow the tutorial http://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_ml/py_knn/py_knn_opencv/py_knn_opencv.html and replaced KNearest with cv2.ml.KNearest_create(), but I am getting TypeError: only length-1 arrays can be converted to Python scalars
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('digits.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Now we split the image to 5000 cells, each 20x20 size
cells = [np.hsplit(row,100) for row in np.vsplit(gray,50)]
# Make it into a Numpy array. It size will be (50,100,20,20)
x = np.array(cells)
# Now we prepare train_data and test_data.
train = x[:,:50].reshape(-1,400).astype(np.float32) # Size = (2500,400)
test = x[:,50:100].reshape(-1,400).astype(np.float32) # Size = (2500,400)
# Create labels for train and test data
k = np.arange(10)
train_labels = np.repeat(k,250)[:,np.newaxis]
test_labels = train_labels.copy()
# Initiate kNN, train the data, then test it with test data for k=1
knn = cv2.ml.KNearest_create()
knn.train(train,train_labels)
ret,result,neighbours,dist = knn.find_nearest(test,k=5)
# Now we check the accuracy of classification
# For that, compare the result with test_labels and check which are wrong
matches = result==test_labels
correct = np.count_nonzero(matches)
accuracy = correct*100.0/result.size
print accuracy
(I am using a Raspberry Pi and followed this tutorial to install OpenCV: http://www.pyimagesearch.com/2015/10/26/how-to-install-opencv-3-on-raspbian-jessie/ and I subsequently pip-installed matplotlib.)
The parameter cv2.ml.ROW_SAMPLE is missing from the call to knn.train, and knn.find_nearest(test,k=5) has to be changed to knn.findNearest as in the code below. This is new in OpenCV 3; please refer to the official OpenCV documentation: http://docs.opencv.org/3.0.0/dd/de1/classcv_1_1ml_1_1KNearest.html
knn.train(train, cv2.ml.ROW_SAMPLE, train_labels)
ret, result, neighbours, dist = knn.findNearest(test, k=5)
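For readers unsure what the row-sample layout means, here is a minimal brute-force k-NN sketch in plain NumPy. It is not OpenCV's implementation, and knn_predict and the toy data are hypothetical. It only shows the expected shapes: one sample per row, labels as an (n, 1) column, and the same accuracy check the tutorial uses.

```python
import numpy as np


def knn_predict(train, train_labels, test, k):
    """Brute-force k-NN: one sample per row, integer labels as an (n, 1) column."""
    # Squared Euclidean distance between every test row and every train row.
    d2 = ((test[:, None, :] - train[None, :, :]) ** 2).sum(axis=2)
    nearest = np.argsort(d2, axis=1)[:, :k]   # indices of the k closest train rows
    votes = train_labels[nearest, 0]          # shape (n_test, k)
    # Majority vote per test row (labels must be non-negative integers).
    winners = [np.bincount(row).argmax() for row in votes]
    return np.array(winners)[:, np.newaxis]


train = np.array([[0, 0], [0, 1], [1, 0],
                  [10, 10], [10, 11], [11, 10]], dtype=np.float32)
train_labels = np.array([0, 0, 0, 1, 1, 1])[:, np.newaxis]
test = np.array([[0.5, 0.5], [10.5, 10.5]], dtype=np.float32)
test_labels = np.array([0, 1])[:, np.newaxis]

predicted = knn_predict(train, train_labels, test, k=3)
matches = predicted == test_labels
accuracy = np.count_nonzero(matches) * 100.0 / predicted.size
print(accuracy)
```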
You're just missing one parameter, but I notice that a lot of people have questions about this section of the tutorial, so here's the whole final section adjusted to work with python3 and the modern openCV library.
knn = cv2.ml.KNearest_create()
knn.train(trainData, cv2.ml.ROW_SAMPLE, responses)
ret, results, neighbours, dist = knn.findNearest(newcomer, k=5)
print("result: ", results,"\n")
print("neighbours: ", neighbours,"\n")
print("distance: ", dist)
plt.show()
import numpy as np
import cv2
from matplotlib import pyplot as plt
img = cv2.imread('digits.png')
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)
# Now we split the image to 5000 cells, each 20x20 size
cells = [np.hsplit(row,100) for row in np.vsplit(gray,50)]
# Make it into a Numpy array. It size will be (50,100,20,20)
x = np.array(cells)
# Now we prepare train_data and test_data.
train = x[:,:50].reshape(-1,400).astype(np.float32) # Size = (2500,400)
test = x[:,50:100].reshape(-1,400).astype(np.float32) # Size = (2500,400)
# Create labels for train and test data
k = np.arange(10)
train_labels = np.repeat(k,250)[:,np.newaxis]
test_labels = train_labels.copy()
# Initiate kNN, train the data, then test it with test data for k=1
knn = cv2.ml.KNearest_create()
knn.train(train, cv2.ml.ROW_SAMPLE, train_labels)
ret, results, neighbours, dist = knn.findNearest(test, k=5)
#print("result: ", results,"\n")
#print("neighbours: ", neighbours,"\n")
#print("distance: ", dist)
matches = results == test_labels
correct = np.count_nonzero(matches)
accuracy = correct*100.0/results.size
print(accuracy)
The OpenCV documentation says:
findNearest(...) | findNearest(samples, k[, results[,
neighborResponses[, dist]]]) -> retval, results, neighborResponses,
...
and not knn.find_nearest(test,k=5).
You can run
help(cv2.ml.KNearest_create())
and you will see it.
By the way, there are lots of errors on the OpenCV website.
I recently built a system that records, plots, and plays back an audio WAV file entirely in Python. Now I'm trying to add some filtering and audio mixing between the recording step and the point where I plot the file and output it to the speakers. However, I have no idea where to start. Right now I want to read in the initial WAV file, apply a low-pass filter, and then re-pack the newly filtered data into a new WAV file. Here is the code I used to plot the initial data once I had recorded it.
import matplotlib.pyplot as plt
import numpy as np
import wave
import sys
spf = wave.open('wavfile.wav','r')
#Extract Raw Audio from Wav File
signal = spf.readframes(-1)
signal = np.fromstring(signal, 'Int16')
plt.figure(1)
plt.title('Signal Wave...')
plt.plot(signal)
And here is some code i used to generate a test audio file of a single tone:
import numpy as np
import wave
import struct
freq = 440.0
data_size = 40000
fname = "High_A.wav"
frate = 11025.0
amp = 64000.0
sine_list_x = []
for x in range(data_size):
    sine_list_x.append(np.sin(2*np.pi*freq*(x/frate)))
wav_file = wave.open(fname, "w")
nchannels = 1
sampwidth = 2
framerate = int(frate)
nframes = data_size
comptype = "NONE"
compname = "not compressed"
wav_file.setparams((nchannels, sampwidth, framerate, nframes,
    comptype, compname))
for s in sine_list_x:
    wav_file.writeframes(struct.pack('h', int(s*amp/2)))
wav_file.close()
I'm not really sure how to apply said audio filter and repack it, though. Any help and/or advice you could offer would be greatly appreciated.
First step : What kind of audio filter do you need ?
Choose the filtered band
Low-pass Filter : removes the highest frequencies from your audio signal
High-pass Filter : removes the lowest frequencies from your audio signal
Band-pass Filter : removes both the highest and the lowest frequencies from your audio signal
For the following steps, I assume you need a Low-pass Filter.
Choose your cutoff frequency
The Cutoff frequency is the frequency where your signal will be attenuated by -3dB.
Your example signal is 440Hz, so let's choose a Cutoff frequency of 400Hz. Then your 440Hz-signal is attenuated (more than -3dB), by the Low-pass 400Hz filter.
Choose your filter type
According to this other stackoverflow answer
Filter design is beyond the scope of Stack Overflow - that's a DSP
problem, not a programming problem. Filter design is covered by any
DSP textbook - go to your library. I like Proakis and Manolakis'
Digital Signal Processing. (Ifeachor and Jervis' Digital Signal
Processing isn't bad either.)
To work through a simple example, I suggest using a moving average filter (a simple low-pass filter).
See Moving average
Mathematically, a moving average is a type of convolution and so it can be viewed as an example of a low-pass filter used in signal processing
This Moving average Low-pass Filter is a basic filter, and it is quite easy to use and to understand.
The parameter of the moving average is the window length.
The relationship between the moving average window length and the cutoff frequency needs a little bit of mathematics and is explained here
The code will be
import math
sampleRate = 11025.0
cutOffFrequency = 400.0
freqRatio = cutOffFrequency / sampleRate
N = int(math.sqrt(0.196201 + freqRatio**2) / freqRatio)
So, in the example, the window length will be 12
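As a sanity check on that window length, the gain of an N-point moving average at frequency f can be evaluated from the standard closed form |sin(pi*f*N/fs) / (N*sin(pi*f/fs))|. The sketch below uses only the math module; the helper names window_length and moving_average_gain_db are hypothetical, and the formula assumes 0 < f < fs/2 away from the filter's nulls.

```python
import math


def window_length(cutoff_hz, sample_rate_hz):
    """Window length from the approximation used above."""
    ratio = cutoff_hz / sample_rate_hz
    return int(math.sqrt(0.196201 + ratio ** 2) / ratio)


def moving_average_gain_db(freq_hz, n, sample_rate_hz):
    """Magnitude response of an n-point moving average, in dB."""
    x = math.pi * freq_hz / sample_rate_hz
    gain = abs(math.sin(n * x) / (n * math.sin(x)))
    return 20.0 * math.log10(gain)


N = window_length(400.0, 11025.0)
print(N)
print(moving_average_gain_db(400.0, N, 11025.0))  # close to -3 dB at the cutoff
print(moving_average_gain_db(440.0, N, 11025.0))  # the 440 Hz tone is attenuated more
```

Because N is truncated to an integer, the gain at 400 Hz lands near, not exactly at, -3 dB.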
Second step : coding the filter
Hand-made moving average
see specific discussion on how to create a moving average in python
Solution from Alleo is
def running_mean(x, windowSize):
    cumsum = numpy.cumsum(numpy.insert(x, 0, 0))
    return (cumsum[windowSize:] - cumsum[:-windowSize]) / windowSize
filtered = running_mean(signal, N)
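To see that the cumulative-sum trick really is a moving average, it can be compared against a direct convolution with a box window on a tiny made-up signal. This is just a check under that assumption, not part of the original solution.

```python
import numpy as np


def running_mean(x, window_size):
    cumsum = np.cumsum(np.insert(x, 0, 0))
    return (cumsum[window_size:] - cumsum[:-window_size]) / window_size


# Made-up test signal; int16 to match 16-bit WAV samples.
x = np.array([1, 2, 3, 4, 5, 6], dtype=np.int16)
fast = running_mean(x, 3)
reference = np.convolve(x, np.ones(3) / 3, mode="valid")
print(fast)
print(reference)
```

Note that the output is shorter than the input by windowSize - 1 samples, which matters when you write the filtered data back to a WAV file.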
Using lfilter
Alternatively, as suggested by dpwilson, we can also use lfilter
win = numpy.ones(N)
win *= 1.0/N
filtered = scipy.signal.lfilter(win, [1], signal).astype(channels.dtype)
Third step : Let's Put It All Together
import matplotlib.pyplot as plt
import numpy as np
import wave
import sys
import math
import contextlib
fname = 'test.wav'
outname = 'filtered.wav'
cutOffFrequency = 400.0
# from http://stackoverflow.com/questions/13728392/moving-average-or-running-mean
def running_mean(x, windowSize):
    cumsum = np.cumsum(np.insert(x, 0, 0))
    return (cumsum[windowSize:] - cumsum[:-windowSize]) / windowSize
# from http://stackoverflow.com/questions/2226853/interpreting-wav-data/2227174#2227174
def interpret_wav(raw_bytes, n_frames, n_channels, sample_width, interleaved = True):
    if sample_width == 1:
        dtype = np.uint8 # unsigned char
    elif sample_width == 2:
        dtype = np.int16 # signed 2-byte short
    else:
        raise ValueError("Only supports 8 and 16 bit audio formats.")
    # np.frombuffer replaces the deprecated binary mode of np.fromstring
    channels = np.frombuffer(raw_bytes, dtype=dtype)
    if interleaved:
        # channels are interleaved, i.e. sample N of channel M follows sample N of channel M-1 in raw data
        channels = channels.reshape(n_frames, n_channels).T
    else:
        # channels are not interleaved. All samples from channel M occur before all samples from channel M+1
        channels = channels.reshape(n_channels, n_frames)
    return channels
with contextlib.closing(wave.open(fname,'rb')) as spf:
    sampleRate = spf.getframerate()
    ampWidth = spf.getsampwidth()
    nChannels = spf.getnchannels()
    nFrames = spf.getnframes()
    compType = spf.getcomptype()
    compName = spf.getcompname()
    # Extract Raw Audio from multi-channel Wav File (one frame holds one sample per channel)
    signal = spf.readframes(nFrames)
channels = interpret_wav(signal, nFrames, nChannels, ampWidth, True)
# get window size
# from http://dsp.stackexchange.com/questions/9966/what-is-the-cut-off-frequency-of-a-moving-average-filter
freqRatio = (cutOffFrequency/sampleRate)
N = int(math.sqrt(0.196201 + freqRatio**2)/freqRatio)
# Use moving average (only on first channel)
filtered = running_mean(channels[0], N).astype(channels.dtype)
wav_file = wave.open(outname, "w")
wav_file.setparams((1, ampWidth, sampleRate, nFrames, compType, compName))
wav_file.writeframes(filtered.tobytes('C'))
wav_file.close()
sox library can be used for static noise removal.
I found this gist which has some useful commands as examples
I am a beginner to python and I am implementing Principal component analysis (PCA) using python, but I am having a problem computing the mean.
Here is my code:
import Image
import os
from PIL import Image
from numpy import *
import numpy as np
#import images
dirname = "C:\\Users\\Karim\\Downloads\\att_faces\\New folder"
X = [np.asarray(Image.open(os.path.join(dirname, fn))) for fn in os.listdir(dirname)]
#get number of images and dimentions
path, dirs, files = os.walk(dirname).next()
num_images = len(files)
image_file = "C:\\Users\\Karim\\Downloads\\att_faces\\New folder\\2.pgm"
img = Image.open(image_file)
width, height = img.size
print width
print height
print num_images
M = (X-mean(X.T,axis=1)).T # subtract the mean (along columns)
I get the error:
AttributeError: 'list' object has no attribute 'T'
The problem is X.T in your last line because X is a python list, not a numpy.ndarray. It isn't clear what you're trying to do here but if you wanted to combine all the image arrays into a single numpy array, you could convert X = np.array(X) before the last line.
Also, unless you specifically want to roll your own PCA implementation, you can do this much more easily with numpy by using np.cov (for covariance calculation) and np.linalg.eig (to compute the eigenvalues and eigenvectors of the covariance matrix).
images -= np.mean(images, axis=0)
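As a rough sketch of the np.cov route mentioned above: flatten each image into a row, subtract the mean image, and take the eigenvectors of the pixel covariance matrix. The random arrays below are a hypothetical stand-in for the loaded face images, and I use np.linalg.eigh instead of np.linalg.eig because the covariance matrix is symmetric.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the loaded images: 10 images of 8x6 pixels.
X = [rng.random((8, 6)) for _ in range(10)]

images = np.array(X).reshape(len(X), -1)  # one flattened image per row -> (10, 48)
images = images - images.mean(axis=0)     # subtract the mean image

cov = np.cov(images, rowvar=False)        # (48, 48) covariance between pixels
eigvals, eigvecs = np.linalg.eigh(cov)    # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]         # largest variance first
components = eigvecs[:, order[:3]]        # top 3 principal axes, shape (48, 3)
projected = images @ components           # each image reduced to 3 numbers
print(projected.shape)
```

For real face images the covariance matrix gets large (one row and column per pixel), so a ready-made PCA implementation is likely the more practical choice.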