Select Multiple slices from Numpy array at once - python

I want to implement a vectorized SGD algorithm and would like to generate multiple mini batches at once.
Suppose data = np.arange(0, 100), miniBatchSize=10, n_miniBatches=10 and indices = np.random.randint(0, n_miniBatches, 5) (5 mini batches). What I would like to achieve is
miniBatches = np.zeros((5, miniBatchSize))
for i in range(5):
    miniBatches[i] = data[indices[i]: indices[i] + miniBatchSize]
Is there any way to avoid the for loop?
Thanks!

It can be done using stride tricks:
from numpy.lib.stride_tricks import as_strided
a = as_strided(data[:n_miniBatches], shape=(miniBatchSize, n_miniBatches), strides=2*data.strides, writeable=False)
miniBatches = a[:, indices].T
# E.g. indices = array([0, 7, 1, 0, 0])
Output:
array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 7, 8, 9, 10, 11, 12, 13, 14, 15, 16],
[ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
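An alternative that avoids stride tricks is to build the window indices with broadcasting and use integer-array indexing. This is only a sketch: the example start positions are copied from the answer above, and the name `window` is made up for illustration. It assumes every start satisfies start + miniBatchSize <= len(data); otherwise NumPy raises an IndexError instead of truncating the way a slice would.

```python
import numpy as np

data = np.arange(0, 100)
miniBatchSize = 10
indices = np.array([0, 7, 1, 0, 0])  # example start positions from the answer above

# Each row is start + [0, 1, ..., miniBatchSize - 1];
# broadcasting (5, 1) against (10,) builds a (5, 10) index array.
window = indices[:, None] + np.arange(miniBatchSize)

# Integer-array indexing returns a new (5, 10) array (a copy, not a view).
miniBatches = data[window]
print(miniBatches)
```

Unlike as_strided, this never reads outside the array's memory, at the cost of materialising the index array.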


What is this interpolation called?

I'm working on images. Let's say I have a row of the image matrix with the values:
[1, 2, 3, 4, 5, 6, 7]
I want to resize this image using interpolation so that the row becomes something like:
[1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 6, 6, 7, 7, 7, 7]
Could someone tell me what this interpolation technique is called, and how I can use it? I tried the PIL.Image.resize resampling filters, but they don't give me the results I'm looking for.
Thank you in advance!
This doesn't look like an interpolation but rather a repetition.
You can use a custom repeater and numpy.repeat:
a = np.array([1, 2, 3, 4, 5, 6, 7])
MAX, r = divmod(a.shape[0], 2)
rep = np.arange(1, MAX+r+1).astype(int)
rep = np.r_[rep[r:][::-1], rep]
out = np.repeat(a, rep)
output: array([1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 5, 6, 6, 6, 7, 7, 7, 7])
N-dimensional
a = np.arange(20).reshape(4, 5)
def custom_repeat(arr, axis=0):
    MAX, r = divmod(arr.shape[axis], 2)
    rep = np.arange(1, MAX+r+1).astype(int)
    rep = np.r_[rep[r:][::-1], rep]
    return np.repeat(arr, rep, axis=axis)
custom_repeat(custom_repeat(a, axis=0), axis=1)
output:
array([[ 0, 0, 0, 1, 1, 2, 3, 3, 4, 4, 4],
[ 0, 0, 0, 1, 1, 2, 3, 3, 4, 4, 4],
[ 5, 5, 5, 6, 6, 7, 8, 8, 9, 9, 9],
[10, 10, 10, 11, 11, 12, 13, 13, 14, 14, 14],
[15, 15, 15, 16, 16, 17, 18, 18, 19, 19, 19],
[15, 15, 15, 16, 16, 17, 18, 18, 19, 19, 19]])
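To see how the repeat counts in the answer above are built, here is a small sketch that isolates that step. The helper name `repeat_counts` is hypothetical; the arithmetic is the same divmod / np.r_ construction used in custom_repeat.

```python
import numpy as np

def repeat_counts(n):
    """Repeat counts for an axis of length n: highest at the ends, 1 in the middle."""
    half, r = divmod(n, 2)
    rep = np.arange(1, half + r + 1)
    # Mirror the counts; for odd n the leading 1 is dropped from the
    # mirrored half so the centre element is counted only once.
    return np.r_[rep[r:][::-1], rep]

print(repeat_counts(7))  # [4 3 2 1 2 3 4]
print(repeat_counts(4))  # [2 1 1 2]
```

For odd lengths the single centre element is repeated once; for even lengths the two centre elements are each repeated once.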
Maybe this is what you're looking for
def my_procedure(current_list: list, index: int) -> list:
    index_copy = index
    table = []
    status_add = True
    for i in current_list:
        for _ in range(index):
            table.append(i)
        if status_add:
            index -= 1
        else:
            index += 1
        if index == 1 or index == index_copy:
            status_add = not status_add
    return table
x = [1, 2, 3, 4, 5, 6, 7]
print(my_procedure(x, 4))
This is possibly pincushion distortion.

Numpy: convert last axis to list [duplicate]

Let numpy array be shape (x, y, z).
I want it to be (x, y) shape with every element being a list of z-length: [a, b, c, ..., z]
Is there any way to do it with numpy methods?
You can use tolist and assign to a preallocated object array:
import numpy as np
a = np.random.randint(0,10,(100,100,100))
def f():
    A = np.empty(a.shape[:-1],object)
    A[...] = a.tolist()
    return A
f()[99,99]
# [4, 5, 9, 2, 8, 9, 9, 6, 8, 5, 7, 9, 8, 7, 6, 1, 9, 6, 2, 9, 0, 7, 0, 1, 2, 8, 4, 4, 7, 0, 1, 2, 3, 8, 9, 6, 0, 1, 4, 7, 0, 7, 9, 3, 9, 1, 8, 7, 1, 2, 3, 6, 6, 2, 7, 0, 2, 8, 7, 0, 0, 1, 8, 2, 6, 3, 5, 4, 9, 6, 9, 0, 2, 5, 9, 5, 3, 7, 0, 1, 9, 0, 8, 2, 0, 7, 3, 6, 9, 9, 4, 4, 3, 8, 4, 7, 4, 2, 1, 8]
type(f()[99,99])
# <class 'list'>
from timeit import timeit
timeit(f,number=100)*10
# 28.67872992530465
I can't imagine why numpy would need such a method. Here is, more or less, a pythonic solution.
import numpy as np
# an example array with shape [2,3,4]
a = np.random.random([2,3,4])
# create the target array shaped [2,3] with 'object' type (accepting other types than numbers).
b = np.array([[None for row in mat] for mat in a])
for i in range(b.shape[0]):
    for j in range(b.shape[1]):
        b[i,j] = list(a[i,j])
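The explicit loop above can be generalised to any number of leading axes with np.ndindex. This is a minimal sketch on a small made-up array; the names `out` and `ij` are only for illustration.

```python
import numpy as np

a = np.arange(24).reshape(2, 3, 4)  # example input with shape (x, y, z)

# Object array with the leading shape (x, y); each cell will hold a Python list.
out = np.empty(a.shape[:-1], dtype=object)

# np.ndindex yields every index tuple (i, j) of the leading axes.
for ij in np.ndindex(*out.shape):
    out[ij] = a[ij].tolist()

print(out.shape)   # (2, 3)
print(out[1, 2])   # [20, 21, 22, 23]
```

It is still a Python-level loop, so expect it to be slow on large arrays.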

How to generate a list of numbers with duplicates based on a certain seed in python

I don't know how to generate a list of numbers with duplicates based on a certain seed.
I have tried using the code below, but it cannot generate numbers that have duplicates
random.seed(3340)
test = random.sample(range(100), 100000)
I think this could work, but I got an error saying "ValueError: Sample larger than population or is negative"
I could implement some functions that can do this, but I think it would be a great idea if I can use some libraries.
random.sample samples without replacement. random.choices samples with replacement, which is what you want:
In [1]: import random
In [2]: random.choices([1, 2], k=10)
Out[2]: [2, 1, 1, 2, 1, 1, 1, 2, 2, 1]
You can also do this with numpy:
In [3]: import numpy
In [4]: numpy.random.randint(0, 10, 100)
Out[4]:
array([7, 6, 3, 3, 8, 5, 9, 5, 4, 5, 1, 5, 8, 2, 4, 3, 9, 3, 5, 7, 9, 6,
2, 3, 5, 8, 4, 9, 3, 3, 0, 8, 4, 4, 7, 2, 8, 4, 4, 9, 1, 1, 7, 1,
3, 1, 1, 5, 1, 7, 5, 1, 9, 6, 0, 4, 8, 9, 9, 4, 7, 6, 0, 5, 1, 8,
4, 8, 9, 8, 5, 4, 3, 0, 2, 6, 4, 4, 2, 3, 0, 6, 7, 3, 5, 9, 3, 7,
4, 1, 7, 6, 7, 8, 7, 6, 0, 5, 1, 0])
I don't know if you're looking for a simpler solution, but you could index into the population inside a list comprehension:
population = list(range(100))
sample = [population[random.randint(0, 99)] for _ in range(100000)]
You could use this comprehension as well:
random.seed(3340)
test = [random.randrange(100) for _ in range(100000)]
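Putting the seed from the question together with random.choices (available since Python 3.6) might look like the sketch below. The second draw, stored in the made-up name `again`, is only there to show that re-seeding reproduces the same list.

```python
import random

# Seed, then sample 100000 values from range(100) with replacement.
random.seed(3340)
test = random.choices(range(100), k=100000)

# Re-seeding with the same value reproduces exactly the same sequence.
random.seed(3340)
again = random.choices(range(100), k=100000)

print(test == again)   # True
print(len(set(test)))  # at most 100 distinct values, so duplicates are guaranteed
```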

3D array to 2d array from pandas Python and Numpy

I have created the array from a csv using pandas and numpy.
This is my code that converts the 2D CSV into a 3D array:
>>> import pandas as pd
>>> import numpy as np
>>> df = pd.read_csv("test.csv")
>>> df_mat = df.values
>>> seq_len = 3
>>> data=[]
>>> for index in range(len(df_mat) - seq_len):
...     data.append(df_mat[index: index + seq_len])
...
>>> data = np.array(data)
>>> data.shape
(4, 3, 9)
The CSV used is:
input1,input2,input3,input4,input5,input6,input7,input8,output
1,2,3,4,5,6,7,8,1
2,3,4,5,6,7,8,9,0
3,4,5,6,7,8,9,10,-1
4,5,6,7,8,9,10,11,-1
5,6,7,8,9,10,11,12,1
6,7,8,9,10,11,12,13,0
7,8,9,10,11,12,13,14,1
Now I want to convert the 3D array back to the 2D format.
Please let me know how I can do that; I have no idea where to start.
Slice the 0th row of each block up to the last block, then stack with the last block -
np.vstack((data[np.arange(data.shape[0]-1),0],data[-1]))
Output with given sample data -
In [24]: np.vstack((data[np.arange(data.shape[0]-1),0],data[-1]))
Out[24]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 1],
[ 2, 3, 4, 5, 6, 7, 8, 9, 0],
[ 3, 4, 5, 6, 7, 8, 9, 10, -1],
[ 4, 5, 6, 7, 8, 9, 10, 11, -1],
[ 5, 6, 7, 8, 9, 10, 11, 12, 1],
[ 6, 7, 8, 9, 10, 11, 12, 13, 0],
[ 7, 8, 9, 10, 11, 12, 13, 14, 1]], dtype=int64)
Or slice 0th rows across all blocks and stack with the last block skipping the first row -
In [28]: np.vstack((data[np.arange(data.shape[0]),0],data[-1,1:]))
Out[28]:
array([[ 1, 2, 3, 4, 5, 6, 7, 8, 1],
[ 2, 3, 4, 5, 6, 7, 8, 9, 0],
[ 3, 4, 5, 6, 7, 8, 9, 10, -1],
[ 4, 5, 6, 7, 8, 9, 10, 11, -1],
[ 5, 6, 7, 8, 9, 10, 11, 12, 1],
[ 6, 7, 8, 9, 10, 11, 12, 13, 0],
[ 7, 8, 9, 10, 11, 12, 13, 14, 1]], dtype=int64)
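The round trip can be checked end to end with the sketch below. It uses a synthetic 7x9 array as a stand-in for df.values rather than the CSV from the question. One assumption to flag: the loop bound here is len(df_mat) - seq_len + 1. As far as I can tell, the question's range(len(df_mat) - seq_len) stops one window short (4 windows for 7 rows, matching the reported shape (4, 3, 9)), so the last row never enters data and cannot be recovered from it.

```python
import numpy as np

# Stand-in for df.values: 7 rows, 9 columns (synthetic data, not the CSV above).
df_mat = np.arange(63).reshape(7, 9)
seq_len = 3

# Overlapping windows of seq_len rows; "+ 1" keeps the final window.
data = np.array([df_mat[i:i + seq_len]
                 for i in range(len(df_mat) - seq_len + 1)])
print(data.shape)  # (5, 3, 9)

# First row of every window, then the remaining rows of the last window.
restored = np.vstack((data[:, 0], data[-1, 1:]))
print(np.array_equal(restored, df_mat))  # True
```

Here data[:, 0] is equivalent to data[np.arange(data.shape[0]), 0] in the answer above.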

Indexing a 2d array with a 3d array in numpy

I have two arrays.
"a", a 2d numpy array.
import numpy as np
import numpy.random as npr
a = np.array([[5,6,7,8,9],[10,11,12,14,15]])
array([[ 5, 6, 7, 8, 9],
[10, 11, 12, 14, 15]])
"idx", a 3d numpy array constituting three index variants I want to use to index "a".
nsamp = 3
idx = npr.randint(5, size=(nsamp, a.shape[0], a.shape[1]))
array([[[1, 2, 1, 3, 4],
[2, 0, 2, 0, 1]],
[[0, 0, 3, 2, 0],
[1, 3, 2, 0, 3]],
[[2, 1, 0, 1, 4],
[1, 1, 0, 1, 0]]])
Now I want to index "a" three times with the indices in "idx" to obtain an object as follows:
array([[[6, 7, 6, 8, 9],
[12, 10, 12, 10, 11]],
[[5, 5, 8, 7, 5],
[11, 14, 12, 10, 14]],
[[7, 6, 5, 6, 9],
[11, 11, 10, 11, 10]]])
The naive "a[idx]" does not work. Any ideas as to how to do this? (I use Python 3.4 and numpy 1.9)
You can use choose to make the selection from a:
>>> np.choose(idx, a.T[:,:,np.newaxis])
array([[[ 6, 7, 6, 8, 9],
[12, 10, 12, 10, 11]],
[[ 5, 5, 8, 7, 5],
[11, 14, 12, 10, 14]],
[[ 7, 6, 5, 6, 9],
[11, 11, 10, 11, 10]]])
As you can see, a has to be reshaped from an array with shape (2, 5) to an array with shape (5, 2, 1) first. This is essentially so that it is broadcastable with idx, which has shape (3, 2, 5).
(I learned this method from @immerrr's answer here: https://stackoverflow.com/a/26225395/3923281)
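You can use the take array method. Note that take indexes the flattened array unless an axis is given, so each row's indices have to be shifted by that row's offset:
import numpy
a = numpy.array([[5,6,7,8,9],[10,11,12,14,15]])
idx = numpy.random.randint(5, size=(3, a.shape[0], a.shape[1]))
# row i of a starts at flat position i * a.shape[1]
offsets = numpy.arange(a.shape[0])[:, None] * a.shape[1]
print(a.take(idx + offsets))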
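Another option is plain integer-array indexing with a broadcast row index. The sketch below uses the idx values and expected result from the question; the names `rows` and `result` are made up for illustration.

```python
import numpy as np

a = np.array([[5, 6, 7, 8, 9], [10, 11, 12, 14, 15]])
idx = np.array([[[1, 2, 1, 3, 4], [2, 0, 2, 0, 1]],
                [[0, 0, 3, 2, 0], [1, 3, 2, 0, 3]],
                [[2, 1, 0, 1, 4], [1, 1, 0, 1, 0]]])

# Row numbers with shape (2, 1) broadcast against idx with shape (3, 2, 5),
# so result[k, i, j] == a[i, idx[k, i, j]].
rows = np.arange(a.shape[0])[:, None]
result = a[rows, idx]
print(result)
```

The two index arrays broadcast to shape (3, 2, 5), which is also the shape of the result.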
