I am using the function below to augment audio data generated from WAV audio files.
def generate_augmented_data(file_path):
    augmented_data = []
    samples = load_wav(file_path, get_duration=False)
    for time_value in [0.7, 1, 1.3]:
        for pitch_value in [-1, 0, 1]:
            time_stretch_data = librosa.effects.time_stretch(samples, rate=time_value)
            final_data = librosa.effects.pitch_shift(time_stretch_data, sr=sample_rate, n_steps=pitch_value)
            augmented_data.append(final_data)
    return augmented_data
I also need to augment the class labels and am facing difficulties with it.
I tried the code below, but it is not giving me the expected result.
## generating augmented data.
def generate_augmented_data_label(file_path, label):
    augmented_data = []
    augmented_label = []
    samples = load_wav(file_path, get_duration=False)
    for time_value in [0.7, 1, 1.3]:
        for pitch_value in [-1, 0, 1]:
            time_stretch_data = librosa.effects.time_stretch(samples, rate=time_value)
            final_data = librosa.effects.pitch_shift(time_stretch_data, sr=sample_rate, n_steps=pitch_value)
            augmented_data.append(final_data)
            augmented_label.append(label)
    return augmented_data, augmented_label
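For reference, each call returns 3 × 3 = 9 augmented clips and 9 copies of the label. A quick check, as a sketch (assuming load_wav and sample_rate are defined as in the question; the file path is just a placeholder):

aug_x, aug_y = generate_augmented_data_label('some_file.wav', 1)
print(len(aug_x), len(aug_y))  # 9 9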
Before augmentation, the shapes of the data and labels are as below.
X_train.reset_index(inplace=True, drop=True)
y_train.reset_index(inplace=True, drop=True)
X_train_augmented_data = []
y_train_augmented_data = []
for i in range(len(X_train)):
    #print(i)
    t1 = X_train.iloc[i]
    t2 = y_train[i]
    tmp1, tmp2 = generate_augmented_data_label(t1, t2)
    #print(tmp1, tmp2)
    X_train_augmented_data.append(tmp1)
    y_train_augmented_data.append(tmp2)
len(X_train)
1600
len(y_train)
1600
print(len(X_train_augmented_data))
print(len(y_train_augmented_data))
After data augmentation and an additional masking step, the shapes come out as:
augmented_train_data_mask = []
for i in range(0, len(augmented_train_data_pad)):
    augmented_train_data_mask.append(list(map(bool, augmented_train_data_pad[i])))
augmented_train_data_mask = np.array(augmented_train_data_mask)
print(augmented_train_data_pad.shape)
print(augmented_train_data_mask.shape)
(14400, 17640)
(14400, 17640)
However, the label length is still 1600: each sample produces 3 × 3 = 9 augmented versions, so the data grows from 1600 to 14400, but the labels are never expanded to match. When I later pass these into an LSTM model, I get a shape mismatch error.
ValueError: Data cardinality is ambiguous:
x sizes: 14400, 14400
y sizes: 1600
Make sure all arrays contain the same number of samples.
Looking for some help to resolve this issue.
You can use NumPy's repeat function to replicate your NumPy array.
ex:
In : arr = np.arange(3)
Out: array([0, 1, 2])
In : arr.repeat(3)
Out: array([0, 0, 0, 1, 1, 1, 2, 2, 2])
Hope this meets your requirement.
You may refer to this link for reference:
https://www.geeksforgeeks.org/python-add-similar-value-multiple-times-in-list/
type(y_train) is a pandas Series.
from itertools import repeat
new_label = []
for index, value in y_train.items():
    new_label.extend(repeat(value, 2))
len(new_label)
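Since each original clip yields 3 × 3 = 9 augmented versions (14400 / 1600 = 9), the labels need to be repeated 9 times, not 2. A minimal sketch of the NumPy repeat approach from the answer above, assuming y_train is the pandas Series of 1600 labels:

import numpy as np

y_train_augmented = np.asarray(y_train).repeat(9)  # one label per augmented clip
print(len(y_train_augmented))  # 14400, matching the augmented data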
I've been working on a simple neural network.
It takes in a data set with 3 columns; if the first column's value is a 1, then the output should be a 1.
I've provided comments so it is easier to follow.
The code is as follows:
import numpy as np
import random

def sigmoid_derivative(x):
    return x * (1 - x)

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def think(weights, inputs):
    sum = (weights[0] * inputs[0]) + (weights[1] * inputs[1]) + (weights[2] * inputs[2])
    return sigmoid(sum)

if __name__ == "__main__":
    # Assign random weights
    weights = [-0.165, 0.440, -0.867]

    # Training data for the network.
    training_data = [
        [0, 0, 1],
        [1, 1, 1],
        [1, 0, 1],
        [0, 1, 1]
    ]

    # The answers correspond to the training_data by position,
    # so the first element of training_answers is the answer to the first element of training_data.
    # NOTE: The pattern is: if there's a 1 in the first place, the result should be a 1.
    training_answers = [0, 1, 1, 0]

    # Train the neural network
    for iteration in range(50000):
        # Pick a random piece of training_data
        selected = random.randint(0, 3)
        training_output = think(weights, training_data[selected])
        # Calculate the error
        error = training_output - training_answers[selected]
        # Calculate the adjustments that need to be applied to the weights
        adjustments = np.dot(training_data[selected], error * sigmoid_derivative(training_output))
        # Apply adjustments, maybe something is going wrong here?
        weights += adjustments

    print("The Neural Network has been trained!")
    # Result of the print below should be close to 1
    print(think(weights, [1, 0, 0]))
The result of the last print should be close to 1, but it is not.
I have a feeling that I'm not adjusting the weights correctly.
I'm trying to make a 3D matrix here. It's the MovieLens data (https://grouplens.org/datasets/movielens/100k/), from which I'm taking a u1.base and u1.test pair as the training and test sets (respectively). Below is an image of the format of the data in the training_set variable you'll find in the code.
The 3D matrix I'm trying to create has the format (User, Movie, Timestamp), and the data in each of those cells is the rating given by, for example, user 1 to movie 1 at time 1.
In case it helps, below is the code where a 2D matrix is created with users as the rows and all the movies as the columns.
import numpy as np
import pandas as pd

training_set = pd.read_csv('ml-100k/u1.base', delimiter = '\t')
training_set = np.array(training_set, dtype = 'int')
test_set = pd.read_csv('ml-100k/u1.test', delimiter = '\t')
test_set = np.array(test_set, dtype = 'int64')

nb_users = int(max(max(training_set[:, 0]), max(test_set[:, 0])))
nb_movies = int(max(max(training_set[:, 1]), max(test_set[:, 1])))

def convert(data):
    new_data = []  # final list that we will return
    for id_users in range(1, nb_users + 1):
        id_movies = data[:, 1][data[:, 0] == id_users]   # IDs of the movies rated by id_users
        id_ratings = data[:, 2][data[:, 0] == id_users]  # all movie ratings given by that user
        ratings = np.zeros(nb_movies)
        ratings[id_movies - 1] = id_ratings  # movies not rated by the user keep null (0) values
        new_data.append(list(ratings))
    return new_data

training_set = convert(training_set)
test_set = convert(test_set)
Below is the code that I tried; it gave so many errors that I couldn't scroll up to the first one it threw.
import numpy as np
import pandas as pd

training_set = pd.read_csv('ml-100k/u1.base', delimiter = '\t')
training_set = np.array(training_set, dtype = 'int')
test_set = pd.read_csv('ml-100k/u1.test', delimiter = '\t')
test_set = np.array(test_set, dtype = 'int64')

nb_users = int(max(max(training_set[:, 0]), max(test_set[:, 0])))
nb_movies = int(max(max(training_set[:, 1]), max(test_set[:, 1])))

# The changes I made start here --
nb_timestamps = int(max(len(training_set[:, 3]), len(test_set[:, 3])))
ts_min = int(min(min(training_set[:, 3]), min(test_set[:, 3])))
ts_max = int(max(max(training_set[:, 3]), max(test_set[:, 3])))

def convert(data):
    new_data = []  # final list that we will return
    for timestamp in range(ts_min, ts_max + 1):
        for id_users in range(1, nb_users + 1):
            # IDs of the movies rated by id_users at this timestamp
            id_movies = data[:, 1][data[:, 0] == id_users][data[:, 3] == timestamp]
            id_ratings = data[:, 2][data[:, 0] == id_users][data[:, 3] == timestamp]
            ratings = np.zeros(nb_movies)
            ratings[id_movies - 1] = id_ratings
            new_data.append(list(ratings))
    return new_data

training_set = convert(training_set)
test_set = convert(test_set)
Remark: Please don't take this as an answer (yet).
There are a few things to improve in your code:
When you read the CSV you are taking the first row as the header, which means you are not considering all the data.
If, as should be the case here, a user can rate a movie only once, you can use pd.pivot_table to get your 2D matrix.
import pandas as pd
import numpy as np

training_set = pd.read_csv('ml-100k/u1.base',
                           delimiter='\t',
                           header=None,  # first row is not a header
                           names=["user", "movie",
                                  "rating", "timestamp"])  # name the columns

# With pd.pivot_table you get a DataFrame where users are in the rows
# and movies in the columns. The value at (i, j) is user i's rating for movie j.
ratings = pd.pivot_table(training_set,
                         index=["user"],
                         columns=["movie"],
                         values="rating")
In case you want 0s instead of NaN you can use ratings.fillna(0), but I wouldn't do so. Be careful, because this will mess up any statistics you later want to extract.
In case you need the 2D matrix you can just use ratings.values.
UPDATE
In order to get your 3D matrix, we can do the same pivoting with the timestamps:
timestamps = pd.pivot_table(training_set,
                            index=["user"],
                            columns=["movie"],
                            values="timestamp")

# get the underlying matrices
mat_ratings = ratings.values
mat_timestamps = timestamps.values

# stack them along a third axis
mat3d = np.dstack((mat_ratings, mat_timestamps))
You can now check that from two matrices of shape (943, 1650) we get one of shape (943, 1650, 2). Note that to get the shape of a matrix mat you can just run mat.shape.
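A quick sanity check (a sketch; the exact movie count depends on how many distinct movies appear in u1.base):

print(mat_ratings.shape)     # e.g. (943, 1650)
print(mat_timestamps.shape)  # e.g. (943, 1650)
print(mat3d.shape)           # e.g. (943, 1650, 2)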
I am trying to obtain a variance for a value I obtained by processing a 2x150 array into a discrete correlation function. In order to do this I need to randomly sample 80% of the original data N times, which will allow me to calculate a variance over these values.
I have so far been able to create one randomly sampled set of data using this:
rand_indices = []
running_var = (len(find_length) * 0.8)
x = 0
while x < running_var:
    rand_inx = randint(0, (len(find_length) - 1))
    rand_indices.append(rand_inx)
    x = x + 1
which creates an array, 80% of the length of my original, of randomly selected indices to be picked out and processed.
My problem is that I am not sure how to iterate this in order to get N sets of these random numbers, ideally in an N x 120 sized array. My whole code so far is:
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from random import randint

useless, just_to, find_length = np.loadtxt("w2_mjy_final.dat").T

w2_dat = np.loadtxt("w2_mjy_final.dat")
w2_rel = np.delete(w2_dat, 2, axis = 1)
w2_array = np.asarray(w2_rel)

w1_dat = np.loadtxt("w1_mjy_final.dat")
w1_rel = np.delete(w1_dat, 2, axis = 1)
w1_array = np.asarray(w1_rel)

peaks = []
y = 1
N = 0
x = 0
z = 0
rand_indices = []
rand_indices2d = []
running_var = (len(find_length) * 0.8)

while z < N:
    while x < running_var:
        rand_inx = randint(0, (len(find_length) - 1))
        rand_indices.append(rand_inx)
        x = x + 1
    rand_indices2d.append(rand_indices)
    z = z + 1

while y < N:
    w1_sampled = w1_array[rand_indices, :]
    w2_sampled = w2_array[rand_indices, :]
    w1s_t, w1s_dat = zip(*w1_sampled)
    w2s_t, w2s_dat = zip(*w2_sampled)
    w2s_mean = np.mean(w2s_dat)
    w2s_stdev = np.std(w2s_dat)
    w1s_mean = np.mean(w1s_dat)
    w1s_stdev = np.std(w1s_dat)
    taus = []
    dcfs = []
    bins = 40
    for i in w2s_t:
        for j in w1s_t:
            tau_datpoint = i - j
            taus.append(tau_datpoint)
    for k in w2s_dat:
        for l in w1s_dat:
            dcf_datpoint = ((k - w2s_mean) * (l - w1s_mean)) / ((w2s_stdev * w1s_stdev))
            dcfs.append(dcf_datpoint)
    plotdat = np.vstack((taus, dcfs)).T
    sort_plotdat = sorted(plotdat, key=lambda x: x[0])
    np.savetxt("w1sw2sarray.txt", sort_plotdat)
    taus_sort, dcfs_sort = np.loadtxt("w1w2array.txt").T
    dcfs_means, taubins_edges, taubins_number = stats.binned_statistic(taus_sort, dcfs_sort, statistic='mean', bins=bins)
    taubin_edge = np.delete(taubins_edges, 0)
    import operator
    indexs, values = max(enumerate(dcfs_means), key=operator.itemgetter(1))
    percents = values * 0.8
    dcf_lists = dcfs_means.tolist()
    centarr_negs, centarr_poss = np.split(dcfs_means, [indexs])
    centind_negs = np.argmin(np.abs(centarr_negs - percents))
    centind_poss = np.argmin(np.abs(centarr_poss - percents))
    lagcent_negs = taubins_edges[centind_negs]
    lagcent_poss = taubins_edges[int((bins / 2) + centind_poss)]
    sampled_peak = (np.abs(lagcent_poss - lagcent_negs) / 2) + lagcent_negs
    peaks.append(sampled_peak)
    y = y + 1

print peaks
Seeing as you're using NumPy already, why not use np.random.randint?
In your case:
np.random.randint(len(find_length)-1, size=(N, running_var))
would give you an N x running_var sized matrix, with random integer entries from 0 to len(find_length)-2 inclusive.
Example Usage:
>>> N=4
>>> running_var=6
>>> find_length = [1,2,3]
>>> np.random.randint(len(find_length)-1, size=(N, running_var))
array([[1, 0, 1, 0, 0, 1],
[1, 0, 1, 1, 0, 0],
[1, 1, 0, 0, 1, 0],
[1, 1, 0, 1, 0, 1]])
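Applied to the code above, drawing one resample per row might look like this (a sketch, assuming w1_array, w2_array and find_length are defined as in the question; running_var is cast to int because 0.8 * len may not be a whole number, and the upper bound is len(find_length) because np.random.randint's high value is exclusive, which matches the original randint(0, len(find_length)-1)):

N = 100  # number of resamples, an assumed value
indices_2d = np.random.randint(len(find_length), size=(N, int(running_var)))

for row in indices_2d:
    w1_sampled = w1_array[row, :]
    w2_sampled = w2_array[row, :]
    # ... compute the DCF peak for this resample and append it to peaks ...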
I have three 4-dimensional arrays that need to be binned to create a multi-dimensional histogram. In the example below I have used NumPy, but the actual arrays are being read in from a NetCDF file using xarray.
I know that xarray uses dask in the backend. I have tried creating a small dask cluster on the machine I am using, which has 20 cores; I do get a speedup in the digitize step, but none in the for loops.
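For reference, the cluster setup I tried looks roughly like this (a sketch, assuming dask.distributed is installed; the worker count is just the machine's core count):

from dask.distributed import Client, LocalCluster

cluster = LocalCluster(n_workers=20)  # one worker per core
client = Client(cluster)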
I am hoping someone can help me parallelize the for loop based on dask.
import numpy as np

# Initial datasets
s = np.random.rand(5, 2, 3, 4)
ws = np.random.rand(5, 2, 3, 4)
wd = np.random.rand(5, 2, 3, 4)

# Digitize to different bins
s_map = np.digitize(s, [0, .5, 1])
ws_map = np.digitize(ws, [0, .25, .5, .75, 1])
wd_map = np.digitize(wd, [.25, .5, 1])

# Get indexes that have values
s_ids = np.unique(s_map)
ws_ids = np.unique(ws_map)
wd_ids = np.unique(wd_map)

# Create output array
count = np.zeros((s_ids.size, ws_ids.size, wd_ids.size) + s.shape[1:])

# Loop over each of the maps to count how many values fall into each bin
for i, s_id in enumerate(s_ids):
    s_mask = s_map == s_id
    for j, ws_id in enumerate(ws_ids):
        ws_mask = s_mask & (ws_map == ws_id)
        for k, wd_id in enumerate(wd_ids):
            mask = ws_mask & (wd_map == wd_id)
            count[i, j, k, ...] += np.count_nonzero(mask, axis=0)
I want my input images (a tensor) to shift up/down or left/right randomly in every batch.
For example, I have a batch of grayscale images with shape [10, 48, 64, 1].
If there were only one image, I know I could use tf.pad and tf.slice (or other built-in functions),
but I want to apply a random shift to the 10 different images with one operation.
Is that possible, or should I use a loop such as tf.scan?
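For a single image, the pad-and-slice approach I mean looks roughly like this (just a sketch, assuming TF 1.x, an image tensor of shape [48, 64, 1], and integer shifts dx to the right and dy down):

padded = tf.pad(image, [[dy, 0], [dx, 0], [0, 0]])   # pad on top and left
shifted = tf.slice(padded, [0, 0, 0], [48, 64, 1])   # crop back to the original size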
As an alternative, you could also use tf.contrib.image.transform() and use the parameters a2 and b2 to translate the image:
import numpy as np
import tensorflow as tf

image1 = np.array([[[.1], [.1], [.1], [.1]],
                   [[.2], [.2], [.2], [.2]],
                   [[.3], [.3], [.3], [.3]],
                   [[.4], [.4], [.4], [.4]]])
image2 = np.array([[[.1], [.2], [.3], [.4]],
                   [[.1], [.2], [.3], [.4]],
                   [[.1], [.2], [.3], [.4]],
                   [[.1], [.2], [.3], [.4]]])
images = np.stack([image1, image2])
images_ = tf.convert_to_tensor(images, dtype=tf.float32)

shift1_x = 1
shift1_y = 2
shift2_x = -1
shift2_y = 0
transforms_ = tf.convert_to_tensor([[1, 0, -shift1_x, 0, 1, -shift1_y, 0, 0],
                                    [1, 0, -shift2_x, 0, 1, -shift2_y, 0, 0]],
                                   tf.float32)

shifted_ = tf.contrib.image.transform(images=images_,
                                      transforms=transforms_)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    shifted = sess.run([shifted_])
    print(shifted)
The transforms projection matrix can also be a tensor of size N x 8, so it is possible to shift every image of a batch differently. This can easily be extended with tf.random_uniform() to add some randomness to the x/y shift of each image.
Edit:
To use random shifts for every image of the batch:
...
images_ = tf.convert_to_tensor(images, dtype=tf.float32)
num_imgs = images.shape[0]

base_ = tf.convert_to_tensor(np.tile([1, 0, 0, 0, 1, 0, 0, 0], [num_imgs, 1]), dtype=tf.float32)
mask_ = tf.convert_to_tensor(np.tile([0, 0, 1, 0, 0, 1, 0, 0], [num_imgs, 1]), dtype=tf.float32)
random_shift_ = tf.random_uniform([num_imgs, 8], minval=-2.49, maxval=2.49, dtype=tf.float32)
transforms_ = base_ + random_shift_ * mask_

shifted_ = tf.contrib.image.transform(images=images_,
                                      transforms=transforms_)
...
Edit 2:
For the sake of completeness, here is another helper function which applies a random rotation and shift to each single image of a batch:
import math

def augment_data(input_data, angle, shift):
    num_images_ = tf.shape(input_data)[0]
    # random rotation
    processed_data = tf.contrib.image.rotate(input_data,
                                             tf.random_uniform([num_images_],
                                                               maxval=math.pi / 180 * angle,
                                                               minval=math.pi / 180 * -angle))
    # random shift
    base_row = tf.constant([1, 0, 0, 0, 1, 0, 0, 0], shape=[1, 8], dtype=tf.float32)
    base_ = tf.tile(base_row, [num_images_, 1])
    mask_row = tf.constant([0, 0, 1, 0, 0, 1, 0, 0], shape=[1, 8], dtype=tf.float32)
    mask_ = tf.tile(mask_row, [num_images_, 1])
    random_shift_ = tf.random_uniform([num_images_, 8], minval=-shift, maxval=shift, dtype=tf.float32)
    transforms_ = base_ + random_shift_ * mask_
    processed_data = tf.contrib.image.transform(images=processed_data,
                                                transforms=transforms_)
    return processed_data
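A possible call, as a sketch (the angle and shift values are only examples; images_ is assumed to be the [10, 48, 64, 1] batch from the question):

augmented_ = augment_data(images_, angle=5, shift=2.5)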
Are you looking for tf.random_crop and tf.pad?
Well, when using tf.random_crop, a random shift is applied to all images in the batch: the shift is the same for every image within a batch, but can differ between batches.
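A minimal sketch of the pad-then-crop idea (assuming TF 1.x, that images_ is the [10, 48, 64, 1] batch tensor from the question, and a maximum shift of 2 pixels, which is an arbitrary choice):

padded_ = tf.pad(images_, [[0, 0], [2, 2], [2, 2], [0, 0]])  # pad height and width by 2 px on each side
cropped_ = tf.random_crop(padded_, size=[10, 48, 64, 1])     # one random offset shared by the whole batch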
If you want to use different shift within a batch, I think it's better to use a queue/input pipeline. See https://www.tensorflow.org/programmers_guide/reading_data for more.
Here's an example from part of my own project. self.image_names is a Python list which contains the paths to all training images. In an input pipeline, the data flows like a stream: you only need to deal with a single image, and the queue automatically takes care of scheduling (some threads read the data, some process it, some group single images into batches, others feed the data to the GPU, and so on, to keep the whole pipeline busy). In the code below, images and labels are queues. That is to say, when you process these variables (as I do in self.data_augmentation), you can think of them as containing only one image, but the queue actually processes every item in it (like an implicit loop); tf.train.shuffle_batch then shuffles the training data in the queue and groups it into batches.
def data_augmentation(images):
    if FLAGS.random_flip_up_down:
        images = tf.image.random_flip_up_down(images)
    if FLAGS.random_brightness:
        images = tf.image.random_brightness(images, max_delta=0.3)
    if FLAGS.random_contrast:
        images = tf.image.random_contrast(images, 0.8, 1.2)
    return images

def input_pipeline(self, batch_size, num_epochs=None, aug=False):
    images_tensor = tf.convert_to_tensor(self.image_names, dtype=tf.string)
    labels_tensor = tf.convert_to_tensor(self.labels, dtype=tf.int64)
    input_queue = tf.train.slice_input_producer([images_tensor, labels_tensor], num_epochs=num_epochs)

    labels = input_queue[1]
    images_content = tf.read_file(input_queue[0])
    images = tf.image.convert_image_dtype(tf.image.decode_png(images_content, channels=1), tf.float32)
    if aug:
        images = self.data_augmentation(images)
    new_size = tf.constant([FLAGS.image_size, FLAGS.image_size], dtype=tf.int32)
    images = tf.image.resize_images(images, new_size)
    image_batch, label_batch = tf.train.shuffle_batch([images, labels], batch_size=batch_size, capacity=50000,
                                                      min_after_dequeue=10000)
    # print 'image_batch', image_batch.get_shape()
    return image_batch, label_batch