Python and Numeric/numpy Array Slicing Behavior - python

On Python2.4, the single colon slice operator : works as expected on Numeric matrices, in that it returns all values for the dimension it was used on. For example all X and/or Y values for a 2-D matrix.
On Python2.6, the single colon slice operator seems to have a different effect in some cases: for example, on a regular 2-D MxN matrix, m[:] can result in zeros(<some shape tuple>, 'l') being returned as the resulting slice. The full matrix is what one would expect - which is what one gets using Python2.4.
Using either a double colon :: or 3 dots ... in Python2.6, instead of a single colon, seems to fix this issue and return the proper matrix slice.
After some guessing, I discovered you can get the same zeros output when inputting 0 as the stop index. e.g. m[<any index>:0] returns the same "zeros" output as m[:]. Is there any way to debug what indexes are being picked when trying to do m[:]? Or did something change between the two Python versions (2.4 to 2.6) that would affect the behavior of slicing operators?
The version of Numeric being used (24.2) is the same between both versions of Python. Why does the single colon slicing NOT work on Python 2.6 the same way it works with version 2.4?
Python2.6:
>>> a = array([[1,2,3],[4,5,6]])
**>>> a[:]
zeros((0, 3), 'l')**
>>> a[::]
array([[1,2,3],[4,5,6]])
>>> a[...]
array([[1,2,3],[4,5,6]])
Python2.4:
>>> a = array([[1,2,3],[4,5,6]])
**>>> a[:]
array([[1,2,3],[4,5,6]])**
>>> a[::]
array([[1,2,3],[4,5,6]])
>>> a[...]
array([[1,2,3],[4,5,6]])
(I typed the "code" up from scratch, so it may not be fully accurate syntax or printout-wise, but shows what's happening)

It seems the problem is an integer overflow issue. In the Numeric source code, the matrix data structure being used is in a file called MA.py. The specific class is called MaskedArray. There is a line at the end of the class that sets the "array()" function to this class. I had much trouble finding this information but it turned out to be very critical.
There is also a getslice(self, i, j) method in the MaskedArray class that takes in the start/stop indices and returns the proper slice. After finding this and adding debug for those indices, I discovered that under the good case with Python2.4, when doing a slice for an entire array the start/stop indices automatically input are 0 and 2^31-1, respectively. But under Python2.6, the stop index automatically input changed to be 2^63-1.
Somewhere, probably in the Numeric source/library code, there is only 32 bits to store the stop index when slicing arrays. Hence, the 2^63-1 value was overflowing (but any value greater than 2^31 would overflow). The output slice in these bad cases ends up being equivalent to slicing from start 0 to stop 0, e.g. an empty matrix. When you slice from [0:-1] you do get a valid slice. I think (2^63 - 1) interpreted as a 32 bit number would come out to -1. I'm not quite sure why the output of slicing from 0 to 2^63-1 is the same as slicing from 0 to 0 (where you get an empty matrix), and not from 0 to -1 (where you get at least some output).
Although, if I input ending slice indexes that would overflow (i.e. greater than 2^31), but the lower 32 bits were a valid positive non-zero number, I would get a valid slice back. E.g. a stop index of 2^33+1 would return the same slice as a stop index of 1, because the lower 32 bits are 1 in both cases.
Python 2.4 Example code:
>>> a = array([[1,2,3],[4,5,6]])
>>> a[:] # (which actually becomes a[0:2^31-1])
[[1,2,3],[4,5,6]] # correct, expect the entire array
Python 2.6 Example code:
>>> a = array([[1,2,3],[4,5,6]])
>>> a[:] # (which actually becomes a[0:2^63-1])
zeros((0, 3), 'l') # incorrect b/c of overflow, should be full array
>>> a[0:0]
zeros((0, 3), 'l') # correct, b/c slicing range is null
>>> a[0:2**33+1]
[ [1,2,3]] # incorrect b/c of overflow, should be full array
# although it returned some data b/c the
# lower 32 bits of (2^33+1) = 1
>>> a[0:-1]
[ [1,2,3]] # correct, although I'm not sure why "a[:]" doesn't
# give this output as well, given that the lower 32
# bits of 2^63-1 equal -1

I think I was using 2.4 10 years ago. I used numpy back then, but may have added Numeric for its NETCDF capabilities. But the details are fuzzy. And I don't have any of those versions now for testing.
Python documentation back then should be easy to explore. numpy/Numeric documentation was skimpier.
I think Python has always had the basic : slicing for lists. alist[:] to make a copy, alist[1:-1] to slice of the first and last elements, etc.
I don't know when the step was added, e.g. alist[::-1] to reverse a list.
Python started to recognize indexing tuples at the request of numeric developers, e.g. arr[2,4], arr[(2,4)], arr[:, [1,2]], arr[::-1, :]. But I don't know when that appeared
Ellipsis is also mainly of value for multidimensional indexing. The Python interpreter recognizes ..., but lists don't handle it. About the same time the : notation was formally implemented as slice, e.g.
In 3.5, we can reverse a list with a slice
In [6]: list(range(10)).__getitem__(slice(None,None,-1))
Out[6]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
I would suggest a couple of things:
make sure you understand numpy (and list) indexing/slicing in a current system
try the same things in the older versions; ask SO questions with concrete examples of the differences. Don't count on any of us to have memories of the old code.
study the documentation to find when suspected features where changed or added.

Related

What is this expression and how is this defined in python as i can't use it with lists

I have the following code in Python:
import numpy as np
arr = numpy.array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
print(arr[::-1]) # reverses the array
print(arr[:,::-1] # reverses the array in the second dimension
print(arr[:,:,::-1] # reverses the array in the third dimension,i.e. all elements
print(arr[...,::-1]# gives the same output as above line
Output:
array([[[7,8,9],[10,11,12]],[[1,2,3],[4,5,6]]])
array([[[4,5,6],[1,2,3]],[[10,11,12],[7,8,9]]])
array([[[3,2,1],[6,5,4]],[[9,8,7],[12,11,10]]])
array([[[1,2,3],[4,5,6]],[[7,8,9],[10,11,12]]])
now, i wanna know what is arr[...] as it prints the whole list as it is,but is working in a different way for thefinal print statement.
And, why doesn't the same syntax work on python lists.
Also, even though this shall not be the part of the question am somewhat curious to learn that if i were to implement the same fucnitonality for one of my class objects how would i do..?
In arr[..., ::-1] the ... stands for a variable number of :, hence the last 2 cases just reverse the last dimension.
arr[...] is the same as arr[:], a view without change.
Lists only have one level of indexing, so alist[:] and alist[::-1] work, but nothing which a comma, or the ellipsis. Sublists in a nested list have to have their own indexing.
alist[:] # a copy
alist[::-1] # a reverse copy
these also work for strings
In [187]: 'astring'[:]
Out[187]: 'astring'
In [188]: 'astring'[::-1]
Out[188]: 'gnirtsa'
Basic reference for numpy indexing:
https://numpy.org/doc/stable/reference/arrays.indexing.html
a start for python sequence indexing (for lists and strings)
https://docs.python.org/3/library/stdtypes.html#common-sequence-operations
I'm more familiar with numpy documentation, since I learned python basics too long ago. List indexing should be well covered in any Python intro book.

Filtered Numpy Array Changes Number of Dimensions

I'm having trouble getting used to Numpy arrays (I'm a Matlab user). When I try to select just a range of values from an array, I see the resulting array has an extra dimension:
ioi = np.nonzero((self.data_array[0,:] >= range_start) & (self.data_array[0,:] <= range_end))
print("self.data_array.shape = {0}".format(self.data_array.shape))
print("self.data_array.shape[:,ioi] = {0}".format(self.data_array[:,ioi].shape))
The result is:
self.data_array.shape = (5, 50000)
self.data_array.shape[:,ioi] = (5, 1, 408)
I also see that ioi is a tuple. I don't know if that has anything to do with it.
What is happening here to create that extra dimension and what should I do, in the most direct way, to get an array shape of (5,408) in this case?
The simplest and most efficient thing would be to get rid of the np.nonzero call, and use logical indexing just as one would in Matlab. Here's an example. (I'm using random data of the same shape, FYI.)
>>> data = np.random.randn(5, 5000)
>>> start, end = -0.5, 0.5
>>> ioi = (data[0] > start) & (data[0] < end)
>>> print(ioi.shape)
(5000,)
>>> print(ioi.sum())
1900
>>> print(data[:, ioi].shape)
(5, 1900)
The np.nonzero call is not usually needed. Just like Matlab's find function, it's slow compared with logical indexing, and usually one's goal can be more efficiently accomplished with logical indexing. np.nonzero, just like find, should mostly be used only when you need the actual index values themselves.
As you suspected, the reason for the extra dimensions is that tuples are handled differently from other types of indexing arrays in NumPy. This is to allow more flexible indexing, such as with slices, ellipses, etc. See this useful page for in-depth explanation, especially the last section.
There are at least two other options to solve the problem. One is to use the ioi array, as returned from np.nonzero, directly as your only index to the data array. As in: self.data_array[ioi]. Part of why you have an extra dimension is that you actually have two set of indices in your call: the slice (:) and the tuple ioi. np.nonzero is guaranteed to return a tuple exactly for this reason, so that its output can always be used to directly index the source array.
The last option is to call np.squeeze on the returned array, but I'd opt for one of the above first.

Is there a built-in method for checking individual list indices in Python

Python has a built in functionality for checking the validity of entire slices: slice.indices. Is there something similar that is built-in for individual indices?
Specifically, I have an index, say a = -2 that I wish to normalize with respect to a 4-element list. Is there a method that is equivalent to the following already built in?
def check_index(index, length):
if index < 0:
index += length
if index < 0 or index >= length:
raise IndexError(...)
My end result is to be able to construct a tuple with a single non-None element. I am currently using list.__getitem__ to do the check for me, but it seems a little awkward/overkill:
items = [None] * 4
items[a] = 'item'
items = tuple(items)
I would like to be able to do
a = check_index(a, 4)
items = tuple('item' if i == a else None for i in range(4))
Everything in this example is pretty negotiable. The only things that are fixed is that I am getting a in a way that can have all of the problems that an arbitrary index can have and that the final result has to be a tuple.
I would be more than happy if the solution used numpy and only really applied to numpy arrays instead of Python sequences. Either one would be perfect for the application I have in mind.
If I understand correctly, you can use range(length)[index], in your example range(4)[-2]. This properly handles negative and out-of-bounds indices. At least in recent versions of Python, range() doesn't literally create a full list so this will have decent performance even for large arguments.
If you have a large number of indices to do this with in parallel, you might get better performance doing the calculation with Numpy vectorized arithmetic, but I don't think the technique with range will work in that case. You'd have to manually do the calculation using the implementation in your question.
There is a function called numpy.core.multiarray.normalize_axis_index which does exactly what I need. It is particularly useful to be because the implementation I had in mind was for numpy array indexing:
from numpy.core.multiarray import normalize_axis_index
>>> normalize_axis_index(3, 4)
3
>>> normalize_axis_index(-3, 4)
1
>>> normalize_axis_index(-5, 4)
...
numpy.core._internal.AxisError: axis -5 is out of bounds for array of dimension 4
The function was added in version 1.13.0. The source for this function is available here, and the documentation source is here.

How do I declare an array in Python?

How do I declare an array in Python?
variable = []
Now variable refers to an empty list*.
Of course this is an assignment, not a declaration. There's no way to say in Python "this variable should never refer to anything other than a list", since Python is dynamically typed.
*The default built-in Python type is called a list, not an array. It is an ordered container of arbitrary length that can hold a heterogenous collection of objects (their types do not matter and can be freely mixed). This should not be confused with the array module, which offers a type closer to the C array type; the contents must be homogenous (all of the same type), but the length is still dynamic.
This is surprisingly complex topic in Python.
Practical answer
Arrays are represented by class list (see reference and do not mix them with generators).
Check out usage examples:
# empty array
arr = []
# init with values (can contain mixed types)
arr = [1, "eels"]
# get item by index (can be negative to access end of array)
arr = [1, 2, 3, 4, 5, 6]
arr[0] # 1
arr[-1] # 6
# get length
length = len(arr)
# supports append and insert
arr.append(8)
arr.insert(6, 7)
Theoretical answer
Under the hood Python's list is a wrapper for a real array which contains references to items. Also, underlying array is created with some extra space.
Consequences of this are:
random access is really cheap (arr[6653] is same to arr[0])
append operation is 'for free' while some extra space
insert operation is expensive
Check this awesome table of operations complexity.
Also, please see this picture, where I've tried to show most important differences between array, array of references and linked list:
You don't actually declare things, but this is how you create an array in Python:
from array import array
intarray = array('i')
For more info see the array module: http://docs.python.org/library/array.html
Now possible you don't want an array, but a list, but others have answered that already. :)
I think you (meant)want an list with the first 30 cells already filled.
So
f = []
for i in range(30):
f.append(0)
An example to where this could be used is in Fibonacci sequence.
See problem 2 in Project Euler
This is how:
my_array = [1, 'rebecca', 'allard', 15]
For calculations, use numpy arrays like this:
import numpy as np
a = np.ones((3,2)) # a 2D array with 3 rows, 2 columns, filled with ones
b = np.array([1,2,3]) # a 1D array initialised using a list [1,2,3]
c = np.linspace(2,3,100) # an array with 100 points beteen (and including) 2 and 3
print(a*1.5) # all elements of a times 1.5
print(a.T+b) # b added to the transpose of a
these numpy arrays can be saved and loaded from disk (even compressed) and complex calculations with large amounts of elements are C-like fast.
Much used in scientific environments. See here for more.
JohnMachin's comment should be the real answer.
All the other answers are just workarounds in my opinion!
So:
array=[0]*element_count
A couple of contributions suggested that arrays in python are represented by lists. This is incorrect. Python has an independent implementation of array() in the standard library module array "array.array()" hence it is incorrect to confuse the two. Lists are lists in python so be careful with the nomenclature used.
list_01 = [4, 6.2, 7-2j, 'flo', 'cro']
list_01
Out[85]: [4, 6.2, (7-2j), 'flo', 'cro']
There is one very important difference between list and array.array(). While both of these objects are ordered sequences, array.array() is an ordered homogeneous sequences whereas a list is a non-homogeneous sequence.
You don't declare anything in Python. You just use it. I recommend you start out with something like http://diveintopython.net.
I would normally just do a = [1,2,3] which is actually a list but for arrays look at this formal definition
To add to Lennart's answer, an array may be created like this:
from array import array
float_array = array("f",values)
where values can take the form of a tuple, list, or np.array, but not array:
values = [1,2,3]
values = (1,2,3)
values = np.array([1,2,3],'f')
# 'i' will work here too, but if array is 'i' then values have to be int
wrong_values = array('f',[1,2,3])
# TypeError: 'array.array' object is not callable
and the output will still be the same:
print(float_array)
print(float_array[1])
print(isinstance(float_array[1],float))
# array('f', [1.0, 2.0, 3.0])
# 2.0
# True
Most methods for list work with array as well, common
ones being pop(), extend(), and append().
Judging from the answers and comments, it appears that the array
data structure isn't that popular. I like it though, the same
way as one might prefer a tuple over a list.
The array structure has stricter rules than a list or np.array, and this can
reduce errors and make debugging easier, especially when working with numerical
data.
Attempts to insert/append a float to an int array will throw a TypeError:
values = [1,2,3]
int_array = array("i",values)
int_array.append(float(1))
# or int_array.extend([float(1)])
# TypeError: integer argument expected, got float
Keeping values which are meant to be integers (e.g. list of indices) in the array
form may therefore prevent a "TypeError: list indices must be integers, not float", since arrays can be iterated over, similar to np.array and lists:
int_array = array('i',[1,2,3])
data = [11,22,33,44,55]
sample = []
for i in int_array:
sample.append(data[i])
Annoyingly, appending an int to a float array will cause the int to become a float, without throwing an exception.
np.array retain the same data type for its entries too, but instead of giving an error it will change its data type to fit new entries (usually to double or str):
import numpy as np
numpy_int_array = np.array([1,2,3],'i')
for i in numpy_int_array:
print(type(i))
# <class 'numpy.int32'>
numpy_int_array_2 = np.append(numpy_int_array,int(1))
# still <class 'numpy.int32'>
numpy_float_array = np.append(numpy_int_array,float(1))
# <class 'numpy.float64'> for all values
numpy_str_array = np.append(numpy_int_array,"1")
# <class 'numpy.str_'> for all values
data = [11,22,33,44,55]
sample = []
for i in numpy_int_array_2:
sample.append(data[i])
# no problem here, but TypeError for the other two
This is true during assignment as well. If the data type is specified, np.array will, wherever possible, transform the entries to that data type:
int_numpy_array = np.array([1,2,float(3)],'i')
# 3 becomes an int
int_numpy_array_2 = np.array([1,2,3.9],'i')
# 3.9 gets truncated to 3 (same as int(3.9))
invalid_array = np.array([1,2,"string"],'i')
# ValueError: invalid literal for int() with base 10: 'string'
# Same error as int('string')
str_numpy_array = np.array([1,2,3],'str')
print(str_numpy_array)
print([type(i) for i in str_numpy_array])
# ['1' '2' '3']
# <class 'numpy.str_'>
or, in essence:
data = [1.2,3.4,5.6]
list_1 = np.array(data,'i').tolist()
list_2 = [int(i) for i in data]
print(list_1 == list_2)
# True
while array will simply give:
invalid_array = array([1,2,3.9],'i')
# TypeError: integer argument expected, got float
Because of this, it is not a good idea to use np.array for type-specific commands. The array structure is useful here. list preserves the data type of the values.
And for something I find rather pesky: the data type is specified as the first argument in array(), but (usually) the second in np.array(). :|
The relation to C is referred to here:
Python List vs. Array - when to use?
Have fun exploring!
Note: The typed and rather strict nature of array leans more towards C rather than Python, and by design Python does not have many type-specific constraints in its functions. Its unpopularity also creates a positive feedback in collaborative work, and replacing it mostly involves an additional [int(x) for x in file]. It is therefore entirely viable and reasonable to ignore the existence of array. It shouldn't hinder most of us in any way. :D
How about this...
>>> a = range(12)
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
>>> a[7]
6
Following on from Lennart, there's also numpy which implements homogeneous multi-dimensional arrays.
Python calls them lists. You can write a list literal with square brackets and commas:
>>> [6,28,496,8128]
[6, 28, 496, 8128]
I had an array of strings and needed an array of the same length of booleans initiated to True. This is what I did
strs = ["Hi","Bye"]
bools = [ True for s in strs ]
You can create lists and convert them into arrays or you can create array using numpy module. Below are few examples to illustrate the same. Numpy also makes it easier to work with multi-dimensional arrays.
import numpy as np
a = np.array([1, 2, 3, 4])
#For custom inputs
a = np.array([int(x) for x in input().split()])
You can also reshape this array into a 2X2 matrix using reshape function which takes in input as the dimensions of the matrix.
mat = a.reshape(2, 2)
# This creates a list of 5000 zeros
a = [0] * 5000
You can read and write to any element in this list with a[n] notation in the same as you would with an array.
It does seem to have the same random access performance as an array. I cannot say how it allocates memory because it also supports a mix of different types including strings and objects if you need it to.

Python/numpy tricky slicing problem

I have a problem with some numpy stuff. I need a numpy array to behave in an unusual manner by returning a slice as a view of the data I have sliced, not a copy. So heres an example of what I want to do:
Say we have a simple array like this:
a = array([1, 0, 0, 0])
I would like to update consecutive entries in the array (moving left to right) with the previous entry from the array, using syntax like this:
a[1:] = a[0:3]
This would get the following result:
a = array([1, 1, 1, 1])
Or something like this:
a[1:] = 2*a[:3]
# a = [1,2,4,8]
To illustrate further I want the following kind of behaviour:
for i in range(len(a)):
if i == 0 or i+1 == len(a): continue
a[i+1] = a[i]
Except I want the speed of numpy.
The default behavior of numpy is to take a copy of the slice, so what I actually get is this:
a = array([1, 1, 0, 0])
I already have this array as a subclass of the ndarray, so I can make further changes to it if need be, I just need the slice on the right hand side to be continually updated as it updates the slice on the left hand side.
Am I dreaming or is this magic possible?
Update: This is all because I am trying to use Gauss-Seidel iteration to solve a linear algebra problem, more or less. It is a special case involving harmonic functions, I was trying to avoid going into this because its really not necessary and likely to confuse things further, but here goes.
The algorithm is this:
while not converged:
for i in range(len(u[:,0])):
for j in range(len(u[0,:])):
# skip over boundary entries, i,j == 0 or len(u)
u[i,j] = 0.25*(u[i-1,j] + u[i+1,j] + u[i, j-1] + u[i,j+1])
Right? But you can do this two ways, Jacobi involves updating each element with its neighbours without considering updates you have already made until the while loop cycles, to do it in loops you would copy the array then update one array from the copied array. However Gauss-Seidel uses information you have already updated for each of the i-1 and j-1 entries, thus no need for a copy, the loop should essentially 'know' since the array has been re-evaluated after each single element update. That is to say, every time we call up an entry like u[i-1,j] or u[i,j-1] the information calculated in the previous loop will be there.
I want to replace this slow and ugly nested loop situation with one nice clean line of code using numpy slicing:
u[1:-1,1:-1] = 0.25(u[:-2,1:-1] + u[2:,1:-1] + u[1:-1,:-2] + u[1:-1,2:])
But the result is Jacobi iteration because when you take a slice: u[:,-2,1:-1] you copy the data, thus the slice is not aware of any updates made. Now numpy still loops right? Its not parallel its just a faster way to loop that looks like a parallel operation in python. I want to exploit this behaviour by sort of hacking numpy to return a pointer instead of a copy when I take a slice. Right? Then every time numpy loops, that slice will 'update' or really just replicate whatever happened in the update. To do this I need slices on both sides of the array to be pointers.
Anyway if there is some really really clever person out there that awesome, but I've pretty much resigned myself to believing the only answer is to loop in C.
Late answer, but this turned up on Google so I probably point to the doc the OP wanted. Your problem is clear: when using NumPy slices, temporaries are created. Wrap your code in a quick call to weave.blitz to get rid of the temporaries and have the behaviour your want.
Read the weave.blitz section of PerformancePython tutorial for full details.
accumulate is designed to do what you seem to want; that is, to proprigate an operation along an array. Here's an example:
from numpy import *
a = array([1,0,0,0])
a[1:] = add.accumulate(a[0:3])
# a = [1, 1, 1, 1]
b = array([1,1,1,1])
b[1:] = multiply.accumulate(2*b[0:3])
# b = [1 2 4 8]
Another way to do this is to explicitly specify the result array as the input array. Here's an example:
c = array([2,0,0,0])
multiply(c[:3], c[:3], c[1:])
# c = [ 2 4 16 256]
Just use a loop. I can't immediately think of any way to make the slice operator behave the way you're saying you want it to, except maybe by subclassing numpy's array and overriding the appropriate method with some sort of Python voodoo... but more importantly, the idea that a[1:] = a[0:3] should copy the first value of a into the next three slots seems completely nonsensical to me. I imagine that it could easily confuse anyone else who looks at your code (at least the first few times).
It is not the correct logic.
I'll try to use letters to explain it.
Image array = abcd with a,b,c,d as elements.
Now, array[1:] means from the element in position 1 (starting from 0) on.
In this case:bcd and array[0:3] means from the character in position 0 up to the third character (the one in position 3-1) in this case: 'abc'.
Writing something like:
array[1:] = array[0:3]
means: replace bcd with abc
To obtain the output you want, now in python, you should use something like:
a[1:] = a[0]
It must have something to do with assigning a slice. Operators, however, as you may already know, do follow your expected behavior:
>>> a = numpy.array([1,0,0,0])
>>> a[1:]+=a[:3]
>>> a
array([1, 1, 1, 1])
If you already have zeros in your real-world problem where your example does, then this solves it. Otherwise, at added cost, set them to zero either by multiplying by zero or assigning to zero, (whichever is faster)
edit:
I had another thought. You may prefer this:
numpy.put(a,[1,2,3],a[:3])
Numpy must be checking if the target array is the same as the input array when doing the setkey call. Luckily, there are ways around it. First, I tried using numpy.put instead
In [46]: a = numpy.array([1,0,0,0])
In [47]: numpy.put(a,[1,2,3],a[0:3])
In [48]: a
Out[48]: array([1, 1, 1, 1])
And then from the documentation of that, I gave using flatiters a try (a.flat)
In [49]: a = numpy.array([1,0,0,0])
In [50]: a.flat[1:] = a[0:3]
In [51]: a
Out[51]: array([1, 1, 1, 1])
But this doesn't solve the problem you had in mind
In [55]: a = np.array([1,0,0,0])
In [56]: a.flat[1:] = 2*a[0:3]
In [57]: a
Out[57]: array([1, 2, 0, 0])
This fails because the multiplication is done before the assignment, not in parallel as you would like.
Numpy is designed for repeated application of the exact same operation in parallel across an array. To do something more complicated, unless you can find decompose it in terms of functions like numpy.cumsum and numpy.cumprod, you'll have to resort to something like scipy.weave or writing the function in C. (See the PerfomancePython page for more details.) (Also, I've never used weave, so I can't guarantee it will do what you want.)
You could have a look at np.lib.stride_tricks.
There is some information in these excellent slides:
http://mentat.za.net/numpy/numpy_advanced_slides/
with stride_tricks starting at slide 29.
I'm not completely clear on the question though so can't suggest anything more concrete - although I would probably do it in cython or fortran with f2py or with weave. I'm liking fortran more at the moment because by the time you add all the required type annotations in cython I think it ends up looking less clear than the fortran.
There is a comparison of these approaches here:
www. scipy. org/ PerformancePython
(can't post more links as I'm a new user)
with an example that looks similar to your case.
In the end I came up with the same problem as you. I had to resort to use Jacobi iteration and weaver:
while (iter_n < max_time_steps):
expr = "field[1:-1, 1:-1] = (field[2:, 1:-1] "\
"+ field[:-2, 1:-1]+"\
"field[1:-1, 2:] +"\
"field[1:-1, :-2] )/4."
weave.blitz(expr, check_size=0)
#Toroidal conditions
field[:,0] = field[:,self.flow.n_x - 2]
field[:,self.flow.n_x -1] = field[:,1]
iter_n = iter_n + 1
It works and is fast, but is not Gauss-Seidel, so convergence can be a bit tricky. The only option of doing Gauss-Seidel as a traditional loop with indexes.
i would suggest cython instead of looping in c. there might be some fancy numpy way of getting your example to work using a lot of intermediate steps... but since you know how to write it in c already, just write that quick little bit as a cython function and let cython's magic make the rest of the work easy for you.

Categories