Using every first element in a multidimensional array - python

I had thought that if you ran perhaps print mdarray[::][1], you would print the first sub-element of every element in the array. Where did I go wrong with this?
I especially need this for a p.plot(x,y[::][1]) where I definitely do not want to use a for loop, as it is horribly slow, unless I'm getting things confused.
What am I getting wrong? Thanks!
EDIT
I still don't know where I got the [::] thing but I solved my problem with either
p.plot(x,c[:,1],color='g',label="Closing value")
or
p.plot(x,[i[1] for i in c],color='g',label="Closing value")
There doesn't seem to be any appreciable difference in time, so I guess I'll use the second because it looks more pythonic/readable to me. Or am I missing something?
Thanks for all of the help!

If mdarray is a NumPy array, you can access its first column with mdarray[:,0]:
In [8]: mdarray = np.array([[1, 2, 4], [4, 5, 6], [7, 8, 9]])
In [9]: mdarray
Out[9]:
array([[1, 2, 4],
       [4, 5, 6],
       [7, 8, 9]])
In [10]: mdarray[:,0]
Out[10]: array([1, 4, 7])
UPD
Quick and dirty test
In [28]: mdarray = np.zeros((10000,10000))
In [29]: %timeit -n1000 [x[0] for x in mdarray]
1000 loops, best of 3: 2.7 ms per loop
In [30]: %timeit -n1000 mdarray[:,0]
1000 loops, best of 3: 567 ns per loop
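Outside IPython, the same comparison can be sketched with the standard timeit module. The array size, the repeat count, and the helper names first_by_loop and first_by_slice are arbitrary choices for illustration, and timings depend on the machine, so no numbers are shown here.

```python
import timeit
import numpy as np

mdarray = np.zeros((1000, 1000))  # smaller than the test above, to keep the run short

def first_by_loop(arr):
    # Python-level iteration over the rows
    return [row[0] for row in arr]

def first_by_slice(arr):
    # a single slicing operation; returns a view, no data is copied
    return arr[:, 0]

loop_time = timeit.timeit(lambda: first_by_loop(mdarray), number=100)
slice_time = timeit.timeit(lambda: first_by_slice(mdarray), number=100)
print("comprehension:", loop_time, "slice:", slice_time)
```

The slice only builds a new view object, which is likely why its cost stays roughly constant as the array grows, while the comprehension touches every row.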

What you did:
You used mdarray[::]. On a list, that makes a (shallow) copy of mdarray. Then you accessed the second element of that copy with [1]; [0] would be the first.
What you can do is a list comprehension:
[item[0] for item in mdarray]
This will return a list of the first elements of the lists in mdarray.
Talking about loops: a single pass over the data is a perfectly efficient way to access something. Internally, all the "magic" constructs (like the comprehension above) iterate over the data too.
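A minimal sketch of the difference, assuming a small nested list and its NumPy equivalent (the sample values and variable names are made up for illustration):

```python
import numpy as np

rows = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

copy_of_rows = rows[::]                  # shallow copy of the outer list
second_row = rows[::][1]                 # same as rows[1]: the second row, not a column
first_items = [row[0] for row in rows]   # first element of every row

arr = np.array(rows)
first_column = arr[:, 0]                 # NumPy: every row, column 0

print(second_row)    # [4, 5, 6]
print(first_items)   # [1, 4, 7]
print(first_column)  # [1 4 7]
```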

How about:
>>> Matrix = [[x for x in range(5)] for x in range(5)]
>>> Matrix
[[0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4], [0, 1, 2, 3, 4]]
>>> [item[0] for item in Matrix]
[0, 0, 0, 0, 0]
As for [::], it returns a shallow copy of the whole list, so indexing the result behaves exactly like indexing the original list.

Not sure whether you use array or list, but for Python's lists:
Python 2:
>>> mdarray = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> zip(*mdarray)[0]
(1, 4, 7)
Python 3:
>>> mdarray = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> list(zip(*mdarray))[0]
(1, 4, 7)
Or for the special case of index 0:
>>> mdarray = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
>>> next(zip(*mdarray))
(1, 4, 7)
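The next(zip(...)) trick can be extended to any column without building the full transposed list. This is only a sketch for Python 3; the helper name column is hypothetical and the rows are assumed to have equal length.

```python
from itertools import islice

mdarray = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

def column(rows, n):
    """Return column n of a list of equal-length rows as a tuple."""
    # zip(*rows) yields the columns one by one; islice skips the first n of them
    return next(islice(zip(*rows), n, None))

print(column(mdarray, 0))  # (1, 4, 7)
print(column(mdarray, 2))  # (3, 6, 9)
```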


Numpy Array: Slice several values at every step

I am trying to extract several values at once from an array but I can't seem to find a way to do it in a one-liner in Numpy.
Simply put, considering an array:
a = numpy.arange(10)
> array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
I would like to be able to extract, say, 2 values, skip the next 2, extract the 2 following values etc. This would result in:
array([0, 1, 4, 5, 8, 9])
This is an example but I am ideally looking for a way to extract x values and skip y others.
I thought this could be done with slicing, doing something like:
a[:2:2]
but it only returns 0, which is the expected behavior.
I know I could obtain the expected result by combining several slicing operations (similarly to Numpy Array Slicing) but I was wondering if I was not missing some numpy feature.
If you want to avoid creating copies and allocating new memory, you could use a sliding_window_view of two elements:
win = np.lib.stride_tricks.sliding_window_view(a, 2)
array([[0, 1],
       [1, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6],
       [6, 7],
       [7, 8],
       [8, 9]])
And then only take every 4th window view:
win[::4].ravel()
array([0, 1, 4, 5, 8, 9])
Or go directly with the more dangerous as_strided, but heed the warnings in the documentation (the strides below assume 8-byte integers):
np.lib.stride_tricks.as_strided(a, shape=(3,2), strides=(32,8))
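The hard-coded strides only fit an array of 8-byte elements. A slightly more portable sketch derives them from the array itself; the variable names are illustrative, and the shape (3, 2) is still specific to a 10-element array with keep 2, skip 2.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10)
step = a.strides[0]   # bytes between neighbouring elements

# 3 rows of 2 elements; each row starts 4 elements after the previous one
pairs = as_strided(a, shape=(3, 2), strides=(4 * step, step), writeable=False)
result = pairs.ravel()   # ravel copies here, because the view is not contiguous

print(result)  # [0 1 4 5 8 9]
```

Note that the strided view itself shares memory with a, but flattening it into a 1-D array still allocates a new array. A wrong shape or stride makes as_strided read outside the buffer, so it is worth checking that the last row ends at the last element.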
You can use a modulo operator:
x = 2 # keep
y = 2 # skip
out = a[np.arange(a.shape[0])%(x+y)<x]
Output: array([0, 1, 4, 5, 8, 9])
Output with x = 2 ; y = 3:
array([0, 1, 5, 6])
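The modulo approach can be wrapped in a small function. This is a sketch for 1-D arrays; the name keep_skip and its parameters are hypothetical.

```python
import numpy as np

def keep_skip(a, keep, skip):
    """Keep `keep` elements, skip the next `skip`, and repeat along a 1-D array."""
    position = np.arange(a.shape[0]) % (keep + skip)  # position inside each block
    return a[position < keep]                          # boolean mask picks the kept ones

a = np.arange(10)
print(keep_skip(a, 2, 2))  # [0 1 4 5 8 9]
print(keep_skip(a, 2, 3))  # [0 1 5 6]
```

Boolean-mask indexing returns a copy, so this trades the memory savings of the strided view for code that works with any x and y.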

Create a 2-D numpy array with list comprehension

I need to create a 2-D numpy array using only list comprehension, but it has to follow the following format:
[[1, 2, 3],
 [2, 3, 4],
 [3, 4, 5],
 [4, 5, 6],
 [5, 6, 7]]
So far, all I've managed to figure out is:
two_d_array = np.array([[x+1 for x in range(3)] for y in range(5)])
Giving:
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
Just not very sure how to change the incrementation. Any help would be appreciated, thanks!
EDIT: Accidentally left out [3, 4, 5] in example. Included it now.
Here's a quick one-liner that will do the job:
np.array([np.arange(i, i+3) for i in range(1, 6)])
Where 3 is the number of columns, or elements in each array, and range(1, 6) supplies the starting values 1 through 5 (the stop value 6 is excluded), which is why there are 5 arrays in the output.
Output:
array([[1, 2, 3],
       [2, 3, 4],
       [3, 4, 5],
       [4, 5, 6],
       [5, 6, 7]])
Change the code, something like this can work:
two_d_array = np.array([[(y*3)+x+1 for x in range(3)] for y in range(5)])
>>> [[1,2,3],[4,5,6],...]
two_d_array = np.array([[y+x+1 for x in range(3)] for y in range(5)])
>>> [[1,2,3],[2,3,4],...]
You've got a couple of good comprehension answers, so here are a couple of numpy solutions.
Simple addition:
np.arange(1, 6)[:, None] + np.arange(3)
Crazy stride tricks:
base = np.arange(1, 8)
np.lib.stride_tricks.as_strided(base, shape=(5, 3), strides=base.strides * 2).copy()
Reshaped cumulative sum:
base = np.ones(15)
base[3::3] = -1
np.cumsum(base).reshape(5, 3)
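For the "simple addition" variant, it may help to see the shapes involved. The sketch below (variable names are illustrative) spells out the broadcasting step and checks it against the comprehension from the earlier answer.

```python
import numpy as np

col = np.arange(1, 6)[:, None]   # shape (5, 1): the row offsets 1..5
row = np.arange(3)               # shape (3,):   the column offsets 0..2
by_broadcast = col + row         # broadcast to shape (5, 3)

by_comprehension = np.array([[y + x + 1 for x in range(3)] for y in range(5)])

print(by_broadcast.shape)                              # (5, 3)
print(np.array_equal(by_broadcast, by_comprehension))  # True
```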

Put numpy arrays split with np.split() back together

I have split a numpy array like so:
x = np.random.randn(10,3)
x_split = np.split(x,5)
which splits x equally into five numpy arrays each with shape (2,3) and puts them in a list. What is the best way to combine a subset of these back together (e.g. x_split[:k] and x_split[k+1:]) so that the resulting shape is similar to the original x i.e. (something,3)?
I found that for k > 0 this is possible with:
np.vstack((np.vstack(x_split[:k]),np.vstack(x_split[k+1:])))
but this does not work when k = 0 as x_split[:0] = [] so there must be a better and cleaner way. The error message I get when k = 0 is:
ValueError: need at least one array to concatenate
The comment by Paul Panzer is right on target, but since NumPy now gently discourages vstack, here is the concatenate version:
x = np.random.randn(10, 3)
x_split = np.split(x, 5, axis=0)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=0)
Note the explicit axis argument passed both times (it has to be the same); this makes it easy to adapt the code to work for other axes if needed. E.g.,
x_split = np.split(x, 3, axis=1)
k = 0
np.concatenate(x_split[:k] + x_split[k+1:], axis=1)
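Wrapped in a function, the same idea might look like the sketch below. The name drop_chunk is hypothetical, and a deterministic array stands in for the random data so the result can be checked.

```python
import numpy as np

def drop_chunk(chunks, k, axis=0):
    """Concatenate every chunk except chunks[k] along `axis`."""
    remaining = chunks[:k] + chunks[k + 1:]   # list concatenation; fine when k == 0
    return np.concatenate(remaining, axis=axis)

x = np.arange(30).reshape(10, 3)
x_split = np.split(x, 5, axis=0)

print(drop_chunk(x_split, 0).shape)  # (8, 3)
print(drop_chunk(x_split, 4).shape)  # (8, 3)
```

Joining the two list slices first means np.concatenate only ever sees one non-empty list, which avoids the ValueError from the nested vstack version.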
np.r_ can turn several slices into a list of indices.
In [20]: np.r_[0:3, 4:5]
Out[20]: array([0, 1, 2, 4])
In [21]: np.vstack([xsp[i] for i in _])
Out[21]:
array([[9, 7, 5],
       [6, 4, 3],
       [9, 8, 0],
       [1, 2, 2],
       [3, 3, 0],
       [8, 1, 4],
       [2, 2, 5],
       [4, 4, 5]])
In [22]: np.r_[0:0, 1:5]
Out[22]: array([1, 2, 3, 4])
In [23]: np.vstack([xsp[i] for i in _])
Out[23]:
array([[9, 8, 0],
       [1, 2, 2],
       [3, 3, 0],
       [8, 1, 4],
       [3, 2, 0],
       [0, 3, 8],
       [2, 2, 5],
       [4, 4, 5]])
Internally np.r_ has a lot of ifs and loops to handle the slices and their boundaries, but it hides it all from us.
If xsp (your x_split) were an array, we could do xsp[np.r_[...]], but since it is a list we have to iterate. We could also hide that iteration with an operator.itemgetter object.
In [26]: operator.itemgetter(*Out[22])
Out[26]: operator.itemgetter(1, 2, 3, 4)
In [27]: np.vstack(operator.itemgetter(*Out[22])(xsp))
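Put together as a standalone script, the np.r_ plus itemgetter combination could look like this. It is a sketch with a deterministic array in place of the random integers used in the session above.

```python
import operator
import numpy as np

x = np.arange(30).reshape(10, 3)   # deterministic stand-in for the random data
xsp = np.split(x, 5)               # list of five (2, 3) arrays

idx = np.r_[0:0, 1:5]              # empty slice plus 1..4
picked = operator.itemgetter(*idx)(xsp)   # tuple of the selected chunks
result = np.vstack(picked)

print(idx)            # [1 2 3 4]
print(result.shape)   # (8, 3)
```

One caveat: itemgetter called with a single index returns that item itself rather than a one-element tuple, so this pattern needs at least two selected chunks to behave as shown.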

subtracting a certain row in a matrix

So I have a 4 by 4 matrix: [[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7]]
I need to subtract [1,2,3,4] from the second row.
No numpy if possible; I'm a beginner and don't know how to use it.
Thanks.
With regular Python loops:
a = [[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7]]
b = [1,2,3,4]
for i in range(4):
    a[1][i] -= b[i]
Simply loop over the entries in the b list and subtract each from the corresponding entry in a[1], the second list (i.e. row) of the a matrix.
However, NumPy can do this for you faster and easier and isn't too hard to learn:
In [47]: import numpy as np
In [48]: a = np.array([[1,2,3,4],[2,3,4,5],[3,4,5,6],[4,5,6,7]])
In [49]: a
Out[49]:
array([[1, 2, 3, 4],
       [2, 3, 4, 5],
       [3, 4, 5, 6],
       [4, 5, 6, 7]])
In [50]: a[1] -= [1,2,3,4]
In [51]: a
Out[51]:
array([[1, 2, 3, 4],
       [1, 1, 1, 1],
       [3, 4, 5, 6],
       [4, 5, 6, 7]])
Note that NumPy vectorizes many of its operations (such as subtraction), so the loops involved are handled for you (in fast, pre-compiled C-code).
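For the no-NumPy case the question asks for, the index loop can also be written with zip. This is just an alternative sketch of the same subtraction; the loop variable names are illustrative.

```python
a = [[1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 7]]
b = [1, 2, 3, 4]

# pair each entry of the second row with the matching entry of b
a[1] = [entry - delta for entry, delta in zip(a[1], b)]

print(a[1])  # [1, 1, 1, 1]
```

Unlike the range(4) loop, this does not hard-code the row length, though zip silently stops at the shorter of the two lists if their lengths differ.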

Removing items from a nested list Python

I am trying to remove items from a nested list in Python. I have a nested list as follows:
families = [[0, 1, 2],[0, 1, 2, 3],[0, 1, 2, 3, 4],[1, 2, 3, 4, 5],[2, 3, 4, 5, 6]]
I want to remove the entries in each sublist that correspond to the indexed position of the sublist in the master list. So, for example, I need to remove 0 from the first sublist, 1 from the second sublist, etc. I am trying to use a list comprehension to do this. This is what I have tried:
familiesNew = [ [ families[i][j] for j in families[i] if i !=j ] for i in range(len(families)) ]
This works for range(len(families)) up to 3, however beyond that I get IndexError: list index out of range. I am not sure why. Can somebody give me an idea of how to do this. Preferably a one-liner (list comprehension).
Thanks.
You almost got it right. Just replace families[i][j] with j and it works:
>>> [ [ j for j in families[i] if i !=j ] for i in range(len(families)) ]
[[1, 2], [0, 2, 3], [0, 1, 3, 4], [1, 2, 4, 5], [2, 3, 5, 6]]
It can be written a bit cleaner using the enumerate function:
>>> [[f for f in family if f != i] for i, family in enumerate(families)]
[[1, 2], [0, 2, 3], [0, 1, 3, 4], [1, 2, 4, 5], [2, 3, 5, 6]]
Or even using remove if you don't mind changing the original list:
>>> for i, family in enumerate(families): family.remove(i)
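Two properties of list.remove matter for the in-place version: it deletes only the first occurrence of the value, and it raises ValueError when the value is absent. A defensive sketch, with a hypothetical sixth row added that does not contain its own index:

```python
families = [[0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3, 4],
            [1, 2, 3, 4, 5], [2, 3, 4, 5, 6],
            [7, 8, 9]]           # hypothetical extra row: 5 is not in it

for i, family in enumerate(families):
    if i in family:          # guard: remove() raises ValueError if i is absent
        family.remove(i)     # deletes only the first occurrence of i

print(families[0])   # [1, 2]
print(families[5])   # [7, 8, 9]
```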
Edited: removed my original answer, which was solving the wrong problem. Also added the additional answer by @Ashwini:
For comparison's sake:
root# python -m timeit 'families = [[0, 1, 2],[0, 1, 2, 3],[0, 1, 2, 3, 4],[1, 2, 3, 4, 5],[2, 3, 4, 5, 6]]' '[x.remove(ind) for ind,x in enumerate(families) ]'
100000 loops, best of 3: 3.42 usec per loop
root# python -m timeit -s 'families = [[0, 1, 2],[0, 1, 2, 3],[0, 1, 2, 3, 4],[1, 2, 3, 4, 5],[2, 3, 4, 5, 6]]' '[[f for f in family if f != i] for i, family in enumerate(families)]'
100000 loops, best of 3: 4.87 usec per loop
root# python -m timeit -s 'families = [[0, 1, 2],[0, 1, 2, 3],[0, 1, 2, 3, 4],[1, 2, 3, 4, 5],[2, 3, 4, 5, 6]]' '[ filter(lambda x:x!=i,j) for i,j in enumerate(families) ]'
100000 loops, best of 3: 7.99 usec per loop
These are microseconds, so I think whichever you choose is fine unless you are going to be doing this many times.
Does this do what you want? (On Python 3, wrap the filter call in list(...), since filter returns an iterator there.)
familiesNew=[ filter(lambda x:x!=i,j) for i,j in enumerate(families) ]
EDIT
Also note, the reason yours failed is that j iterates over the values in families[i], not over its indices. At the fourth element of the outer list ([1, 2, 3, 4, 5], i.e. i = 3), the loop for j in families[i] is for j in [1,2,3,4,5], so you eventually ask for families[i][5]; but families[i] has a length of 5, meaning the largest index is 4. Sorry if that explanation is a little unclear...perhaps the following will help clear it up a little:
families = [[0, 1, 2],[0, 1, 2, 3],[0, 1, 2, 3, 4],[1, 2, 3, 4, 5],[2, 3, 4, 5, 6]]
def f(i,j):
    # show which lookup is about to be attempted
    print(i, j, families[i])
    return families[i][j]
#THIS DOES NOT WORK -- but it will tell you where it failed.
familiesNew = [ [ f(i,j) for j in families[i] if i !=j ] for i in range(len(families)) ]
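The failing lookup can also be isolated without the helper function. The sketch below walks only the fourth sublist (i = 3) and records where the value used as an index runs off the end; the names row and failed_at are illustrative.

```python
families = [[0, 1, 2], [0, 1, 2, 3], [0, 1, 2, 3, 4], [1, 2, 3, 4, 5], [2, 3, 4, 5, 6]]

i = 3
row = families[i]             # [1, 2, 3, 4, 5]
failed_at = None
for j in row:                 # j takes the values 1..5, not the positions 0..4
    if i != j:
        try:
            row[j]            # uses the value as an index
        except IndexError:
            failed_at = j
            print("failed at i=%d, j=%d, len(row)=%d" % (i, j, len(row)))

# prints: failed at i=3, j=5, len(row)=5
```

For the first three sublists the values happen to coincide with valid positions (0 up to at most 4), which is why the original comprehension appeared to work for them.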
If you want to modify the original list then try this:
>>> [x.remove(ind) for ind,x in enumerate(families) ]
>>> families
[[1, 2], [0, 2, 3], [0, 1, 3, 4], [1, 2, 4, 5], [2, 3, 5, 6]]
