So, I'm trying to do this image processing exercise. I have to split a 512px image into 16 equal-sized blocks and mix them up in a prearranged configuration, creating a **mosaic**. For this I'm using the **numpy** library and running some **tests** to see how it works. That said, I have this code:
import numpy as np
from scipy import misc
from scipy import ndimage
import matplotlib.pyplot as plt
img = misc.imread("test.png")
list = [(img[0:128, 0:128]), (img[0:128, 128:256]), (img[0:128, 256:384]), (img[0:128, 384:512]),
(img[128:256, 0:128]), (img[128:256, 128:256]), (img[128:256, 256:384]), (img[128:256, 384:512]),
(img[256:384, 0:128]), (img[256:384, 128:256]), (img[256:384, 256:384]), (img[256:384, 384:512]),
(img[384:512, 0:128]), (img[384:512, 128:256]), (img[384:512, 256:384]), (img[384:512, 384:512])]
list[0] = list[1]
print list[0], '\n\n\n', list[1]
plt.imshow(img)
plt.show()
As you can see, I split the image by index and put all the blocks in a list. Then I try to replace one "block" with another like this:
list[0] = list[1]
The problem is that when I print them
print list[0], '\n\n\n', list[1]
they are equal, but if I do
print img[0:128, 0:128],'\n\n', img[0:128, 128:256]
I can see that the real array wasn't changed at all. **How can I make the changes apply to the image array and not just to the list?** Or is there a simpler way to do this? I took a look at numpy's split functions and none of them seemed to help here.
PS: I know this works, but I need to do it with the list.
img[0:128, 0:128] = list[1]
So you have a list of views of img (the () are not needed)
alist = [img[0:128, 0:128], img[0:128, 128:256], img[0:128, 256:384], ...]
Doing
alist[0] = alist[1]
amounts to changing the list to:
alist = [img[0:128, 128:256], img[0:128, 128:256], img[0:128, 256:384], ...]
because the assignment just changes the pointer in alist[0] to the pointer in alist[1]. Now the first 2 elements are the same object.
alist[0][...] = alist[1]
would change the contents of the alist[0] array without changing the object that it points to. In other words img[0:128, 0:128] is a mutable object, and img[0:128, 0:128][...]=... changes the contents (if done correctly).
This kind of action is a little tricky because the shapes have to match. I first wrote alist[0][:] = alist[1], which also works since [:] selects every row, but alist[0] is a 2d array, so [:,:] or [...] states the whole-block assignment more explicitly.
Illustrating on a smaller array
In [69]: img=np.arange(16).reshape(4,4)
In [70]: alist=[img[0:2,0:2],img[0:2,2:4],img[2:4,0:2],img[2:4,2:4]]
In [71]: alist[0][...]=alist[1]
In [72]: alist
Out[72]:
[array([[2, 3],
        [6, 7]]),
 array([[2, 3],
        [6, 7]]),
 array([[ 8,  9],
        [12, 13]]),
 array([[10, 11],
        [14, 15]])]
In [73]: img
Out[73]:
array([[ 2,  3,  2,  3],
       [ 6,  7,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
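For the full mosaic, the same in-place assignment can be applied to every block. Below is a minimal sketch, not the only way to do it: make_mosaic and its order argument are names made up for illustration, and it assumes the image height and width are divisible by n. The blocks are copied first because the list elements are views, so writing into one block could otherwise overwrite a block that is still needed as a source.

```python
import numpy as np

def make_mosaic(img, order, n=4):
    """Return a copy of img whose n*n blocks are rearranged.

    order[k] is the index of the source block placed in slot k
    (hypothetical convention, blocks numbered row by row).
    """
    h, w = img.shape[0] // n, img.shape[1] // n
    # Snapshot every block so later writes cannot clobber a source block.
    blocks = [img[r*h:(r+1)*h, c*w:(c+1)*w].copy()
              for r in range(n) for c in range(n)]
    out = img.copy()
    # These are views into out, so assigning through them changes out itself.
    views = [out[r*h:(r+1)*h, c*w:(c+1)*w]
             for r in range(n) for c in range(n)]
    for slot, src in enumerate(order):
        views[slot][...] = blocks[src]
    return out

# Small 4x4 stand-in for the image, split into 2x2 blocks and reversed.
img = np.arange(16).reshape(4, 4)
mosaic = make_mosaic(img, order=[3, 2, 1, 0], n=2)
print(mosaic)
```

With the 512px image from the question this would be called with n=4 and a 16-element order.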
I'm not super clear on exactly what you're asking - can you just set
img[0:128, 0:128] = img[128:256, 128:256]
or are you looking for something else?
Also, be careful calling your list list in Python - list is a built-in type in Python and changing it might make things confusing later on down the track.
I have a list of lists, where each inner list represents a row in a spreadsheet. With my current data structure, how can I perform an operation on the elements of the inner lists that share the same index (which basically amounts to performing operations down a column in a spreadsheet)?
Here is an example of what I am looking for (in terms of addition)
>>> lisolis = [[1,2,3], [4,5,6], [7,8,9]]
>>> sumindex = [1+4+7, 2+5+8, 3+6+9]
>>> sumindex = [12, 15, 18]
This problem can probably be solved with slicing, but I'm unable to see how to do that cleanly. Is there a nifty tool/library out there that can accomplish this for me?
Just use zip:
sumindex = [sum(elts) for elts in zip(*lisolis)]
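To see why this works, it may help to look at what zip(*lisolis) produces on its own. The sketch below uses the lisolis data from the question; the names columns, col_sums and col_max are just illustrative. Any function that reduces a sequence (sum, max, min, ...) can be applied to each column the same way.

```python
lisolis = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# The * unpacks the rows as separate arguments; zip then regroups
# the values by position, which yields one tuple per column.
columns = list(zip(*lisolis))
print(columns)

# Apply any reducing function column by column.
col_sums = [sum(col) for col in columns]
col_max = [max(col) for col in columns]
print(col_sums, col_max)
```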
@tzaman has a good solution for lists, but since you have also put numpy in the tags, there's an even simpler solution if you have a numpy 2D array:
>>> import numpy
>>> a = numpy.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
>>> a.sum(axis=0)
array([12, 15, 18])
This should be faster if you have large arrays.
>>> sumindex = numpy.array(lisolis).sum(axis=0).tolist()
>>> import pandas as pd
>>> df = pd.DataFrame([[1,2,3], [4,5,6], [7,8,9]], columns=['A','B','C'])
>>> df.sum()
A    12
B    15
C    18
The list(), map(), zip(), and sum() functions make short work of this problem:
>>> list(map(sum, zip(*lisolis)))
[12, 15, 18]
First I'll fill out the lists of what we want to add up:
>>> lisolis = [[1,2,3], [4,5,6], [7,8,9]]
Then create an empty list for my sums:
>>> sumindex = []
Now I'll start a loop inside a loop. I'm going to add the numbers that are in same positions of each little list (that's the "y" loop) and I'm going to do that for each position (that's the "x" loop).
>>> for x in range(len(lisolis[0])):
...     z = 0
...     for y in range(len(lisolis)):
...         z += lisolis[y][x]
...     sumindex.append(z)
range(len(lisolis[0])) gives me the length of the little lists so I know how many positions there are to add up, and range(len(lisolis)) gives me the number of little lists so I know how many numbers need to be added up for any particular position in the list.
"z" is a placeholder. Each number in a particular list position is going to be added to "z" until they're summed up. After that, I'll put the value of "z" into the next slot in sumindex.
To see the results, I would then type:
>>> sumindex
which would give me:
[12, 15, 18]
I was surprised that numpy.split yields a list and not an array. I would have thought it would be better to return an array, since numpy has put a lot of work into making arrays more useful than lists. Can anyone justify numpy returning a list instead of an array? Why would that be a better programming decision for the numpy developers to have made?
A comment pointed out that if the split is uneven, the result can't be an array, at least not one that has the same dtype. At best it would be an object dtype array.
But let's consider the case of equal length subarrays:
In [124]: x = np.arange(10)
In [125]: np.split(x,2)
Out[125]: [array([0, 1, 2, 3, 4]), array([5, 6, 7, 8, 9])]
In [126]: np.array(_) # make an array from that
Out[126]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
But we can get the same array without split - just reshape:
In [127]: x.reshape(2,-1)
Out[127]:
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
Now look at the code for split. It just passes the task to array_split. Ignoring the details about alternative axes, it just does
sub_arys = []
for i in range(Nsections):
    # st and end from `div_points`
    sub_arys.append(sary[st:end])
return sub_arys
In other words, it just steps through the array and returns successive slices. Those (often) are views of the original.
So split is not that sophisticated a function. You could generate such a list of subarrays yourself without a lot of numpy expertise.
Another point: the documentation notes that split can be reversed with an appropriate stack. concatenate (and family) takes a list of arrays. If given an array of arrays, or a higher-dimensional array, it effectively iterates on the first dimension, e.g. concatenate(arr) => concatenate(list(arr)).
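A quick check of these points, as a sketch on a small 1-D array (the variable names are only illustrative): the pieces returned by split share memory with the original, concatenate puts them back together, and array_split shows why an uneven split could not be a regular array.

```python
import numpy as np

x = np.arange(10)
parts = np.split(x, 2)

# split returns a plain list whose items are views of x.
print(type(parts), np.shares_memory(parts[0], x))

# Writing through a piece therefore changes x as well.
parts[0][0] = 99
print(x[0])

# concatenate accepts the list and reverses the split.
restored = np.concatenate(parts)
print(np.array_equal(restored, x))

# An uneven split gives pieces of different lengths,
# which cannot be stacked into a regular 2-D array.
uneven = np.array_split(np.arange(10), 3)
print([len(p) for p in uneven])
```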
Actually you are right, it returns a list:
import numpy as np
a=np.random.randint(1,30,(2,2))
b=np.hsplit(a,2)
type(b)
This reports type(b) as list, so there is nothing wrong with the documentation. I also first thought the documentation was wrong, since it doesn't return an array, but when I checked
type(b[0])
type(b[1])
both returned ndarray, which means it returns a list of ndarrays.
I know that list aliasing is an issue in Python, but I can't figure out a way around it.
def zeros(A):
    new_mat = A
    for i in range(len(A)):
        for j in range(len(A[i])):
            if A[i][j]==0:
                for b in range(len(A)):
                    new_mat[b][j] = 0
            else:
                new_mat[i][j] = A[i][j]
    return A
Even though I don't change the values of A at all, when I return A, it is still modified:
>>> Matrix = [[1,2,3],[5,0,78],[7,3,45]]
>>> zeros(Matrix)
[[1, 0, 3], [5, 0, 78], [7, 0, 45]]
Is this list aliasing? If so, how do you modify elements of a 2D array without aliasing occurring? Thanks so much <3.
new_mat = A does not create a new matrix. You have merely given a new name to the object you also knew as A. If it's a list of lists of numbers, you might want to use copy.deepcopy to create a full copy, and if it's a numpy array you can use the copy method.
new_mat = A[:]
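This creates a copy of the outer list instead of just referencing it. Note that the copy is shallow: the inner row lists are still shared with A, so for a list of lists you also need to copy each row (or use copy.deepcopy).
[:] just specifies a slice from beginning to end. You could have [1:] and it would be from element 1 to the end, or [1:4] and it would be elements 1 up to (but not including) 4, for instance. Check out list slicing regarding that.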
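One thing worth checking with nested lists is how deep the copy goes. The sketch below (variable names are illustrative) compares a top-level slice copy with copy.deepcopy and a row-by-row copy on a small list of lists.

```python
import copy

A = [[1, 2, 3], [5, 0, 78]]

shallow = A[:]                    # new outer list, same inner row objects
deep = copy.deepcopy(A)           # new outer list and new rows
rowwise = [row[:] for row in A]   # same effect for one level of nesting

shallow[0][0] = -1   # the row is shared, so A changes too
deep[1][1] = 99      # A is not affected
rowwise[1][2] = 0    # A is not affected

print(A)
print(shallow[0] is A[0], deep[0] is A[0], rowwise[0] is A[0])
```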
This might help someone else. Just do this:
import copy
def zeros(A):
    new_mat = copy.deepcopy(A)
QUICK EXPLANATION
If you want A to stay the original matrix, your function should look like this. Just use the np.copy() function:
def zeros(A):
    new_mat = np.copy(A) #JUST COPY IT WITH np.copy() function
    for i in range(len(A)):
        for j in range(len(A[i])):
            if A[i][j]==0:
                for b in range(len(A)):
                    new_mat[b][j] = 0
            else:
                new_mat[i][j] = A[i][j]
    return A
THOROUGH EXPLANATION
Let's say we don't want the numpy ndarray a to be aliased. For example, we want to prevent a from changing any of its values when we take a slice of it, assign this slice to b, and then modify one element of b. We want to AVOID this:
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = a[0]
b[2] = 50
a
Out[]:
array([[ 1,  2, 50],
       [ 4,  5,  6],
       [ 7,  8,  9]])
Now you might think that treating a numpy-ndarray object as if it was a list object might solve our problem. But it DOES NOT WORK either:
%reset #delete all previous variables
import numpy as np #needed again after %reset
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = a[:][0] #this is how you would have copied a slice of "a" if it was a list
b[2] = 50
a
Out[]:
array([[ 1,  2, 50], #problem persists
       [ 4,  5,  6],
       [ 7,  8,  9]])
The problem gets solved, in this case, if you think of numpy arrays as different objects than python lists. In order to copy numpy arrays you can't do copy = name_array_to_be_copied[:], but copy = np.copy(name_array_to_be_copied). Therefore, this would solve our aliasing problem:
%reset #delete all previous variables
import numpy as np
a = np.array([[1,2,3],[4,5,6],[7,8,9]])
b = np.copy(a)[0] #this is how you copy numpy ndarrays. Calling np.copy() function
b[2] = 50
a
Out[]:
array([[1, 2, 3], #problem solved
       [4, 5, 6],
       [7, 8, 9]])
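If you are unsure whether two arrays alias each other, you can ask numpy directly instead of modifying one and inspecting the other. This is a small sketch using np.shares_memory; the names view, sliced and independent are just for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

view = a[0]                 # basic indexing: a view into a
sliced = a[:][0]            # a[:] is itself a view, so this is one too
independent = a[0].copy()   # explicit copy: separate memory

print(np.shares_memory(view, a))         # aliasing
print(np.shares_memory(sliced, a))       # aliasing
print(np.shares_memory(independent, a))  # no aliasing

independent[2] = 50         # does not touch a
print(a[0, 2])
```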
P.S. Watch out with the zeros() function. Even after fixing the aliasing issue, it does not set to 0 every column of new_mat that has at least one zero in the same column of A. I think that is what you wanted to accomplish, judging by the output you reported, [[1, 0, 3], [5, 0, 78], [7, 0, 45]]; with the copy in place, new_mat actually ends up as [[1,0,3],[5,0,78],[7,3,45]], because rows visited after the zero write their own values back. If you want that behaviour you can try this:
def zeros_2(A):
    new_mat = np.copy(A)
    for i in range(len(A[0])): #I assume each row has same length.
        if 0 in new_mat[:,i]:
            new_mat[:,i] = 0
    print(new_mat)
    return A
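The column loop can also be written with a boolean mask. The following is only a sketch of that idea: zero_columns is a made-up name, and unlike zeros_2 it returns the new matrix instead of printing it, which is a design choice of this example rather than something the question asked for.

```python
import numpy as np

def zero_columns(A):
    """Return a new array where every column of A containing a 0 is all zeros."""
    new_mat = np.array(A)                    # np.array copies by default, so A stays untouched
    has_zero = (new_mat == 0).any(axis=0)    # one boolean per column
    new_mat[:, has_zero] = 0                 # zero out the flagged columns
    return new_mat

matrix = [[1, 2, 3], [5, 0, 78], [7, 3, 45]]
result = zero_columns(matrix)
print(result)
print(matrix)
```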
I'm really confused by the index logic of numpy arrays with several dimensions. Here is an example:
import numpy as np
A = np.arange(18).reshape(3,2,3)
array([[[ 0,  1,  2],
        [ 3,  4,  5]],
       [[ 6,  7,  8],
        [ 9, 10, 11]],
       [[12, 13, 14],
        [15, 16, 17]]])
This gives me an array of shape (3,2,3); call the axes (x,y,z) for the sake of argument. Now I want an array B with the elements from A corresponding to x = 0,2, y = 0,1 and z = 1,2. Like
array([[[ 1,  2],
        [ 4,  5]],
       [[13, 14],
        [16, 17]]])
Naively I thought that
B=A[[0,2],[0,1],[1,2]]
would do the job. But it gives
array([ 1, 17])
and does not work.
A[[0,2],:,:][:,:,[1,2]]
does the job. But I still wonder what's wrong with my first try. And what is the best way to do what I want to do?
There are two types of indexing in NumPy: basic and advanced. Basic indexing uses tuples of slices for indexing, and does not copy the array, but rather creates a view with adjusted strides. Advanced indexing, in contrast, also uses lists or arrays of indices and copies the array.
Your first attempt
B = A[[0, 2], [0, 1], [1, 2]]
uses advanced indexing. In advanced indexing, all index lists are first broadcast to the same shape, and this shape is used for the output array. In this case, they already have the same shape, so the broadcasting does not do anything. The output array will also have this shape of two entries. The first entry of the output array is obtained by using all first indices of the three lists, and the second by using all second indices:
B = np.array([A[0, 0, 1], A[2, 1, 2]])
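This pairing of indices is easy to verify on the array from the question. A short sketch (paired and manual are illustrative names):

```python
import numpy as np

A = np.arange(18).reshape(3, 2, 3)

# Three index lists of equal length are paired element by element.
paired = A[[0, 2], [0, 1], [1, 2]]

# The same two elements picked out one at a time.
manual = np.array([A[0, 0, 1], A[2, 1, 2]])

print(paired, paired.shape)
print(np.array_equal(paired, manual))
```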
Your second approach
B = A[[0,2],:,:][:,:,[1,2]]
does work, but it is inefficient. It uses advanced indexing twice, so your data will be copied twice.
To get what you actually want with advanced indexing, you can use
A[np.ix_([0,2],[0,1],[1,2])]
as pointed out by nikow. This will copy the data only once.
In your example, you can get away without copying the data at all, using only basic indexing:
B = A[::2, :, 1:]
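The view-versus-copy difference can be checked with np.shares_memory. This is a sketch on the array from the question, where the slice 1: selects z = 1,2; basic and twice_indexed are illustrative names.

```python
import numpy as np

A = np.arange(18).reshape(3, 2, 3)

basic = A[::2, :, 1:]                          # slices only: a view
twice_indexed = A[[0, 2], :, :][:, :, [1, 2]]  # index lists: copies

print(np.array_equal(basic, twice_indexed))    # same values
print(np.shares_memory(basic, A))              # view shares A's memory
print(np.shares_memory(twice_indexed, A))      # copy does not
```

Because basic is a view, writing into it would also change A, so take a .copy() if you need an independent result.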
I recommend the following advanced tutorial, which explains the various indexing methods: NumPy MedKit
Once you understand the powerful ways to index arrays (and how they can be combined) it will make sense. If your first try was valid then this would collide with some of the other indexing techniques (reducing your options in other use cases).
In your example you can exploit that the third index covers a contiguous range:
A[[0,2],:,1:]
You could also use
A[np.ix_([0,2],[0,1],[1,2])]
which is handy in more general cases, when the latter indices are not contiguous. np.ix_ simply constructs three index arrays.
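To make the last sentence concrete, here is a sketch that prints the shapes of the arrays np.ix_ builds and reproduces the same result with hand-shaped index arrays (the names ix, shapes and by_hand are illustrative). Each index array varies along a different axis, so broadcasting them yields every combination rather than element-wise pairs.

```python
import numpy as np

A = np.arange(18).reshape(3, 2, 3)

ix = np.ix_([0, 2], [0, 1], [1, 2])
shapes = [i.shape for i in ix]
print(shapes)

# The same index arrays built by hand with added axes.
by_hand = A[np.array([0, 2])[:, None, None],
            np.array([0, 1])[None, :, None],
            np.array([1, 2])[None, None, :]]

print(np.array_equal(A[ix], by_hand))
print(A[ix])
```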
As Sven pointed out in his answer, there is a more efficient way in this specific case (using a view instead of a copied version).
Edit: As pointed out by Sven my answer contained some errors, which I have removed. I still think that his answer is better, but unfortunately I can't delete mine now.
A[(0,2),:,1:]
gives the array you wanted:
array([[[ 1,  2],
        [ 4,  5]],
       [[13, 14],
        [16, 17]]])
The general pattern is A[indices you want, rows you want, columns you want].