Torch running out of memory issues - Python

Using Torch, I am trying to load a large set of images into the program, but as I approach 50,000 images the kernel starts to crash, which I assume is due to a memory limitation. A minimal example of my code (results shown for 20,000 images):
print(f"Before starting to loop: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 3} GB")
X_data = []
y_data = []
for path in paths:
    img = cv2.imread(path)
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    X_data.append(np.array(img/255, dtype=np.uint8))
print(f"Before convert to numpy: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 3} GB")
X_data = np.array(X_data)
print(f"Before shuffle: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 3} GB")
shuffle_index = np.random.permutation(X_data.shape[0])
X_data = X_data[shuffle_index]
print(f"Before Convert to tensor: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 3} GB")
X_data = torch.Tensor(X_data).view(-1, 3, 128, 128)
print(f"Before save: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 3} GB")
torch.save(X_data, f"X_data.pt")
print(f"After save: {psutil.Process(os.getpid()).memory_info().rss / 1024 ** 3} GB")
Gives the following memory information:
Before starting to loop: 0.26 GB
Before convert to numpy: 1.29 GB
Before shuffle: 2.28 GB
Before Convert to tensor: 2.28 GB
Before save: 5.22 GB
After save: 4.14 GB
Is there something I am doing inefficiently? I have tried skipping the intermediate steps, but both torch.cat and numpy.append are just way too slow.
Is it instead recommended to store the data in batch-sized files and then load each batch only when it is about to be fed through the network? I cannot find any beginner guides on how to do that, and 50,000 images of size 128×128×3 seem like a rather small amount of data to be causing issues...

Two points:
Use an in-place shuffle instead of creating an index array with np.random.permutation. The latter allocates new arrays while the former does not:
np.random.default_rng().shuffle(X_data, axis=0)
Use torch.from_numpy to create the tensor instead of torch.Tensor. That way the tensor shares memory with the NumPy array instead of copying it:
X_data = torch.from_numpy(X_data).view(-1, 3, 128, 128)
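A quick check of the sharing (a minimal sketch, not part of the original answer):
import numpy as np
import torch
a = np.zeros(3)
t = torch.from_numpy(a)  # no copy: the tensor and the array view the same buffer
t[0] = 1.0
print(a)  # [1. 0. 0.] -- the change made through the tensor is visible in the array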
If you want to shuffle multiple arrays of the same length in the same order, you can seed a generator with the same value for each shuffle (if you don't need the results to be reproducible, you can first use the default generator to produce the seed):
>>> a1 = np.arange(10).repeat(2).reshape(-1, 2)
>>> a2 = np.arange(10)
>>> np.random.default_rng(12345).shuffle(a1, 0)
>>> np.random.default_rng(12345).shuffle(a2, 0)
>>> a1
array([[4, 4],
       [8, 8],
       [1, 1],
       [3, 3],
       [7, 7],
       [9, 9],
       [6, 6],
       [0, 0],
       [2, 2],
       [5, 5]])
>>> a2
array([4, 8, 1, 3, 7, 9, 6, 0, 2, 5])
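On the second part of the question (storing the data in batch-sized files and loading on demand): the usual PyTorch pattern is a torch.utils.data.Dataset that reads each image lazily, combined with a DataLoader for batching and shuffling. A minimal sketch under the question's assumptions (128x128 RGB images listed in paths; the class name ImagePathDataset is made up for illustration):
import cv2
import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader

class ImagePathDataset(Dataset):  # hypothetical name, for illustration
    def __init__(self, paths):
        self.paths = paths
    def __len__(self):
        return len(self.paths)
    def __getitem__(self, idx):
        # decode one image only when it is requested
        img = cv2.imread(self.paths[idx])
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        # scale to [0, 1] as float32 and move channels first: (3, 128, 128)
        return torch.from_numpy(img.astype(np.float32) / 255).permute(2, 0, 1)

loader = DataLoader(ImagePathDataset(paths), batch_size=64, shuffle=True)
# each batch has shape (64, 3, 128, 128); only one batch of decoded images is held in memory at a time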

Tensor repeat for image patches

I have a batch of 20 flattened tensors representing 256x256 images.
>>> imgs.shape
(20, 65536)
Each image was split into 32x32 patches (a total of 64 patches per image). I have calculated a score for each patch, giving a tensor of shape (20, 64).
I would like to multiply each pixel by the score of the patch it belongs to.
imgs * score raises an error, and score.repeat(1, 1, 64) didn't repeat the scores in a way that preserves the score of each pixel.
How can this be achieved?
EDIT:
A simple example:
import torch
img_size = 4
patch_size = 2
img = torch.rand((2,img_size,img_size)) # (2,4,4)
score = torch.tensor([[1,2,3,4],[5,6,7,8]]) # (2,4)
And trying to achieve
score = [[1,1,3,3],[2,2,4,4],[5,5,6,6],[7,7,8,8]]
I would suggest reshaping your scores array to preserve information about how it relates to the original image, then using repeat_interleave() twice.
Example:
import torch
img_size = 4
patch_size = 2
patches_per_axis = int(img_size / patch_size)
num_images = 2
img = torch.rand((2,img_size,img_size)) # (2,4,4)
score = torch.tensor([[1,2,3,4],[5,6,7,8]]) # (2,4)
def expand_scores(scores):
    # unflatten scores into a (num_images, patches_per_axis, patches_per_axis) grid
    scores = scores.reshape((num_images, patches_per_axis, patches_per_axis))
    # repeat scores to match the image dimensions, in the vertical direction
    scores = scores.repeat_interleave(repeats=patch_size, dim=1)
    # repeat scores to match the image dimensions, in the horizontal direction
    scores = scores.repeat_interleave(repeats=patch_size, dim=2)
    # optional: use reshape() here to re-flatten the scores; if you do, reshape the image tensor too
    return scores
(I added two constants at the top of your example: num_images and patches_per_axis. In your original example, these would be set to 20 and 8, respectively.)
When you call expand_scores(score), you'll get the following output:
tensor([[[1, 1, 2, 2],
         [1, 1, 2, 2],
         [3, 3, 4, 4],
         [3, 3, 4, 4]],

        [[5, 5, 6, 6],
         [5, 5, 6, 6],
         [7, 7, 8, 8],
         [7, 7, 8, 8]]])
You can multiply that by the pixel values:
expand_scores(score) * img
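Applied to the shapes in the original question (20 flattened 256x256 images split into 32x32 patches, i.e. an 8x8 patch grid), the same idea looks like this (a sketch, with random data standing in for the real tensors):
import torch
imgs = torch.rand(20, 65536)
score = torch.rand(20, 64)
s = score.reshape(20, 8, 8)  # 8x8 grid of patch scores
s = s.repeat_interleave(32, dim=1).repeat_interleave(32, dim=2)  # (20, 256, 256)
weighted = (imgs.reshape(20, 256, 256) * s).reshape(20, 65536)  # scale each pixel, then re-flatten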

Transform broadcasting to something calculable: matrix np.multiply

I'm trying to perform this type of calculation:
arr = np.arange(4)
# array([0, 1, 2, 3])
arr_t =arr.reshape((-1,1))
# array([[0],
# [1],
# [2],
# [3]])
mult_arr = np.multiply(arr,arr_t) # <<< the multiplication
# array([[0, 0, 0, 0],
# [0, 1, 2, 3],
# [0, 2, 4, 6],
# [0, 3, 6, 9]])
to eventually perform this for every row of a bigger matrix, and to sum all of the matrices produced by the calculation:
arr = np.random.random((600,150))
arr_t =arr.reshape((-1,arr.shape[1],1))
mult = np.multiply(arr[:,None],arr_t)
summed = np.sum(mult,axis=0)
summed
So far it's all pure awesomeness; the problem starts when I try it on a bigger dataset, for example this array instead:
arr = np.random.random((6000,1500))
I get the following error - MemoryError: Unable to allocate 101. GiB for an array with shape (6000, 1500, 1500) and data type float64
which makes sense, but my question is:
can I get around this somehow without being forced to use loops that slow the process down entirely?
My question is mainly about performance; a solution that requires a long-running task of more than 30 seconds is not an option.
Looks like you are simply trying to perform a dot product: summing the outer product of each row with itself over all rows is exactly arr.T @ arr, since (arr.T @ arr)[j, k] = sum over i of arr[i, j] * arr[i, k]. So:
arr.T @ arr
or
arr.T.dot(arr)
Checking that this is what you want:
arr = np.random.random((600,150))
arr_t =arr.reshape((-1,arr.shape[1],1))
mult = np.multiply(arr[:,None],arr_t)
summed = np.sum(mult,axis=0)
np.allclose(arr.T @ arr, summed)
# True
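The same contraction can also be spelled out with np.einsum, which likewise avoids the (6000, 1500, 1500) intermediate (a sketch; it should match arr.T @ arr):
import numpy as np
arr = np.random.random((6000, 1500))
out = np.einsum('ij,ik->jk', arr, arr)  # out[j, k] = sum over i of arr[i, j] * arr[i, k]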

Use information of two arrays to create a third one

I have two NumPy arrays and want to create a third one from the information in these two.
Here is a simple example:
have = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
use = np.array([[2], [3]])
solution = np.array([[1, 1, 3, 4], [5, 5, 5, 8]])
What I want is to use the "use" array, which tells me how many times the first element of each row in my "have" array should be repeated.
So the 2 in "use" means that I want two "1"s at the start of the first row of my new array "solution". Similarly, for the 3 in "use", I want my new array to start its second row with three "5"s. The rest of "have" should stay the same.
It is important to use the "use" array for this (or a NumPy array in general).
Do you have any ideas?
If the data structures are small and performance is not an issue, you can do it as simply as:
np.array([[a[0]] * b[0] + list(a[b[0]:]) for a, b in zip(have, use)])
Simply iterate through have and replace the values based on use:
for i in range(use.shape[0]):
    have[i, :use[i, 0]] = np.repeat(have[i, 0], use[i, 0])
Using only NumPy operations:
First create a boolean mask of the same shape as have, where mask[i, j] is True if j < use[i, 0] and False otherwise. So mask is True exactly at the indices that are to be replaced by the first-column value. Then use np.where to replace:
n, m = have.shape
mask = np.repeat(np.arange(m)[None, :], n, axis=0) < use
have = np.where(mask, have[:, 0:1], have)
Output:
>>> have
array([[1, 1, 3, 4],
       [5, 5, 5, 8]])
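For the example above, the intermediate mask is (a quick check, from use = [[2], [3]] and m = 4):
>>> mask
array([[ True,  True, False, False],
       [ True,  True,  True, False]])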
If performance matters, you can use np.apply_along_axis().
import numpy as np
have = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
use = np.array([[2], [3]])
def rep1st(arr):
    rep = arr[0]
    res = np.repeat(arr[1], rep)
    res = np.concatenate([res, arr[rep + 1:]])
    return res

solution = np.apply_along_axis(rep1st, 1, np.concatenate([use, have], axis=1))
Update:
As @hpaulj pointed out, the apply_along_axis method above is not as efficient as I expected; I had misunderstood it. Reference: numpy np.apply_along_axis function speed up?.
However, I ran some tests on the current methods:
import numpy as np
from timeit import timeit

def rep1st(arr):
    rep = arr[0]
    res = np.repeat(arr[1], rep)
    res = np.concatenate([res, arr[rep + 1:]])
    return res

def test(row, col, run):
    have = np.random.randint(0, 100, size=(row, col))
    use = np.random.randint(0, col, size=(row, 1))
    d = locals()
    d.update(globals())
    # method by me
    t1 = timeit("np.apply_along_axis(rep1st, 1, np.concatenate([use, have], axis=1))", number=run, globals=d)
    # method by @quantummind
    t2 = timeit("np.array([[a[0]] * b[0] + list(a[b[0]:]) for a, b in zip(have, use)])", number=run, globals=d)
    # method by @Amit Vikram Singh
    t3 = timeit(
        "np.where(np.repeat(np.arange(have.shape[1])[None, :], have.shape[0], axis=0) < use, have[:, 0:1], have)",
        number=run, globals=d
    )
    print(f"{t1:8.6f}, {t2:8.6f}, {t3:8.6f}")

test(1000, 10, 10)
test(100, 100, 10)
test(10, 1000, 10)
test(1000000, 10, 1)
test(100000, 100, 1)
test(10000, 1000, 1)
test(1000, 10000, 1)
test(100, 100000, 1)
test(10, 1000000, 1)
results (t1 = apply_along_axis, t2 = list comprehension, t3 = np.where):
0.062488, 0.028484, 0.000408
0.010787, 0.013811, 0.000270
0.001057, 0.009146, 0.000216
6.146863, 3.210017, 0.044232
0.585289, 1.186013, 0.034110
0.091086, 0.961570, 0.026294
0.039448, 0.917052, 0.022553
0.028719, 0.919377, 0.022751
0.035121, 1.027036, 0.025216
It shows that the second method proposed by @Amit Vikram Singh (t3, the np.where approach) performs well even when the arrays are huge.

Adding matrix rows to columns in numpy

Say I have two 3D matrices/tensors with dimensions:
[10, 3, 1000]
[10, 4, 1000]
How do I add every pairwise combination of the second-dimension slices of the two tensors, so as to get a result of dimension:
[10, 3, 4, 1000]
So each slice along the second dimension of one tensor is added to each slice of the other, in every combination. Sorry if this is not clear; I'm having a hard time articulating it...
Is there some kind of clever way to do this with NumPy or PyTorch (I'm perfectly happy with a NumPy solution, though since I'm using this in a PyTorch context, a torch tensor manipulation would be even better) that doesn't involve writing a bunch of nested for loops?
Nested loops example:
x = np.random.randint(50, size=(32, 16, 512))
y = np.random.randint(50, size=(32, 21, 512))
scores = np.zeros(shape=(x.shape[0], x.shape[1], y.shape[1], 512))
for b in range(x.shape[0]):
    for i in range(x.shape[1]):
        for j in range(y.shape[1]):
            scores[b, i, j, :] = y[b, j, :] + x[b, i, :]
Does this work for you?
import torch
x1 = torch.rand(5, 3, 6)
y1 = torch.rand(5, 4, 6)
dim1, dim2 = x1.size()[0:2], y1.size()[-2:]
x2 = x1.unsqueeze(2).expand(*dim1, *dim2)
y2 = y1.unsqueeze(1).expand(*dim1, *dim2)
result = x2 + y2
print(x1[0, 1, :])
print(y1[0, 2, :])
print(result[0, 1, 2, :])
Output:
0.2884
0.5253
0.1463
0.4632
0.8944
0.6218
[torch.FloatTensor of size 6]
0.5654
0.0536
0.9355
0.1405
0.9233
0.1738
[torch.FloatTensor of size 6]
0.8538
0.5789
1.0818
0.6037
1.8177
0.7955
[torch.FloatTensor of size 6]
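Note: since unsqueeze inserts a size-1 dimension that broadcasts automatically during addition, the explicit expand is not strictly needed; this shorter sketch should give the same result:
import torch
x1 = torch.rand(5, 3, 6)
y1 = torch.rand(5, 4, 6)
result = x1.unsqueeze(2) + y1.unsqueeze(1)  # (5, 3, 1, 6) + (5, 1, 4, 6) -> (5, 3, 4, 6)
The same pattern works in NumPy for the nested-loop example: scores = x[:, :, None, :] + y[:, None, :, :].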

NumPy save some arrays at once

I am working on arrays of different shapes and I want to save them all with numpy.save, so, consider I have:
mat1 = numpy.arange(8).reshape(4, 2)
mat2 = numpy.arange(6).reshape(2, 3)
numpy.save('mat.npy', numpy.array([mat1, mat2]))
It works. But when I have two matrices with one dimension of the same size, it does not work:
mat1 = numpy.arange(8).reshape(2, 4)
mat2 = numpy.arange(10).reshape(2, 5)
numpy.save('mat.npy', numpy.array([mat1, mat2]))
It causes:
Traceback (most recent call last):
File "<input>", line 1, in <module>
ValueError: could not broadcast input array from shape (2,4) into shape (2)
And note that the problem is caused by numpy.array([mat1, mat2]) and not by numpy.save.
I know that such array is possible:
>>> numpy.array([[[1, 2]], [[1, 2], [3, 4]]])
array([[[1, 2]], [[1, 2], [3, 4]]], dtype=object)
So, all I want is to save the two arrays, mat1 and mat2, at once.
If you'd like to save multiple arrays in the same format as np.save, use np.savez.
For example:
import numpy as np
arr1 = np.arange(8).reshape(2, 4)
arr2 = np.arange(10).reshape(2, 5)
np.savez('mat.npz', name1=arr1, name2=arr2)
data = np.load('mat.npz')
print(data['name1'])
print(data['name2'])
If you have several arrays, you can expand the arguments:
import numpy as np
data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]
np.savez('mat.npz', *data)
container = np.load('mat.npz')
data = [container[key] for key in container]
Note that the order is not preserved. If you do need to preserve order, you might consider using pickle instead.
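That said, np.savez names positional arrays arr_0, arr_1, ..., so the original order can also be recovered by indexing those names explicitly (a small sketch, not from the original answer):
data = [container['arr_{}'.format(i)] for i in range(len(container.files))]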
If you use pickle, be sure to specify the binary protocol; otherwise you'll write things using ASCII pickle, which is particularly inefficient for NumPy arrays. With a binary protocol, ndarrays more or less pickle to the same format as np.save/np.savez. For example:
# Note: This is Python2.x specific. It's identical except for the import on 3.x
import cPickle as pickle
import numpy as np
data = [np.arange(8).reshape(2, 4), np.arange(10).reshape(2, 5)]
with open('mat.pkl', 'wb') as outfile:
pickle.dump(data, outfile, pickle.HIGHEST_PROTOCOL)
with open('mat.pkl', 'rb') as infile:
result = pickle.load(infile)
In this case, result and data will have identical contents and the order of the input list of arrays will be preserved.
Small addition: if you'd like to use numpy.savez() and preserve the names associated with the saved arrays (instead of arr_0, arr_1, ...), you can pass a dictionary as **kwargs using the double-star operator.
d = {}
d['a'] = np.random.randint(10, size=5)
d['b'] = np.random.randint(10, size=5)
print(d)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}
np.savez("test", **d)
container = np.load("test.npz")
e = {name: container[name] for name in container}
print(e)
# {'a': array([8, 9, 5, 0, 0]), 'b': array([1, 7, 6, 9, 2])}
