Suppose a = array([1,2,3]) and b = array([4,5,6]). I want to slice a and b using a list, perform some operation on each part, and return the result to an array. For example, I propose a dummy function that demonstrates the usage:
def dummy_function(i):
    A = sum(a[:i])
    B = sum(cumsum(b[i:]))
    return A*B
For example, this function would return dummy_function(2) = 18 and dummy_function(1) = 16, but I would like to evaluate it using a list as its argument:
>>> dummy_function([2,1])
array([18,16])
Instead I get IndexError: invalid slice. I don't want to use a loop to iterate over the elements of [2,1] because I believe it can be done more effectively. How can I do what I want?
I don't know if I understood what you want correctly, but this worked for me:
import numpy as np
def func(i):
    a = np.array([1,2,3])
    b = np.array([4,5,6])
    A = np.sum(a[:i])
    B = np.cumsum(b[i:])
    C = A*B
    return C[0]
print(func(2))
The result is 18
If you want to give your func a list as its argument, then you probably should loop over the list elements.
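If the goal is just to avoid an explicit loop at the call site, np.vectorize gives exactly the call signature asked for in the question. Note that it still loops internally in Python, so it is a convenience rather than a speed-up. A minimal sketch, assuming the global a and b from the question:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

def dummy_function(i):
    # Same logic as the question: sum of a prefix of a times
    # the summed cumulative sum of a suffix of b.
    return np.sum(a[:i]) * np.sum(np.cumsum(b[i:]))

vectorized = np.vectorize(dummy_function)
print(vectorized([2, 1]))  # [18 16]
```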
I have a question about how Python(3) internally loops when computing multiple maps. Here's a nonsense example:
from random import randint
A = [randint(0,20) for _ in range(100)]
map1 = map(lambda a: a+1, A)
map2 = map(lambda a: a-1, map1)
B = list(map2)
Because map() produces a lazy expression, nothing is actually computed until list(map2) is called, correct?
When it does finally do this computation, which of these methods is it more akin to?
Loop method 1:
A = [randint(0,20) for _ in range(100)]
temp1 = []
for a in A:
    temp1.append(a+1)
B = []
for t in temp1:
    B.append(t-1)
Loop method 2:
A = [randint(0,20) for _ in range(100)]
B = []
for a in A:
    temp = a+1
    B.append(temp-1)
Or does it compute in some entirely different manner?
In general, the map() function produces a lazy iterator (a map object), which in turn doesn't produce any output or calculate anything until it's explicitly asked to. Converting it to a list is essentially akin to asking it for the next element until there is no next element.
We can do some experiments on the command line in order to find out more:
>>> B = [i for i in range(5)]
>>> map2 = map(lambda b:2*b, B)
>>> B[2] = 50
>>> list(map2)
[0, 2, 100, 6, 8]
We can see that, even though we modify B after creating the map object, our change is still reflected in its output. Thus, it seems that map holds onto a reference to the original iterable from which it was created, and calculates one value at a time only when it's asked to.
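A related consequence of this laziness is that a map object is a single-use iterator: once consumed, it is exhausted and does not restart from the original iterable:

```python
m = map(lambda b: 2 * b, [0, 1, 2])
print(list(m))  # [0, 2, 4]
print(list(m))  # [] -- already exhausted; map does not restart
```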
In your example, that means the process goes something like this:
A = [2, 4, 6, 8, 10]
b = list(map2)
b[0] --> next(map2) = (lambda a: a-1)(next(map1))
--> next(map1) = (lambda a: a+1)(next(A))
--> next(A) = A[0] = 2
--> next(map1) = 2+1 = 3
--> next(map2) = 3-1 = 2
...
In human terms, the next value of map2 is calculated by asking for the next value of map1. That, in turn, is calculated from A that you originally set.
This can be investigated by using map on functions with side-effects. Generally speaking, you shouldn't do this for real code, but it's fine for investigating the behaviour.
def f1(x):
    print('f1 called on', x)
    return x

def f2(x):
    print('f2 called on', x)
    return x
nums = [1, 2, 3]
map1 = map(f1, nums)
map2 = map(f2, map1)
for x in map2:
    print('printing', x)
Output:
f1 called on 1
f2 called on 1
printing 1
f1 called on 2
f2 called on 2
printing 2
f1 called on 3
f2 called on 3
printing 3
So, each function is called at the latest time it could possibly be called; f1(2) isn't called until the loop is finished with the number 1. Nothing needs to be done with the number 2 until the loop needs the second value from the map.
Say I have ndarray a and b of compatible type and shape. I now wish for the data of b to be referring to the data of a. That is, without changing the array b object itself or creating a new one. (Imagine that b is actually an object of a class derived from ndarray and I wish to set its data reference after construction.) In the following example, how do I perform the b.set_data_reference?
import numpy as np
a = np.array([1,2,3])
b = np.empty_like(a)
b.set_data_reference(a)
This would result in b[0] == 1, and setting operations in one array would affect the other array. E.g. if we set a[1] = 22 then we can inspect that b[1] == 22.
N.B.: In case I controlled the creation of array b, I am aware that I could have created it like
b = np.array(a, copy=False)
This is, however, not the case.
NumPy does not support this operation. If you controlled the creation of b, you might be able to create it in such a way that it uses a's data buffer, but after b is created, you can't swap its buffer out for a's.
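What NumPy does support is deciding at creation time that b should be a view of a's buffer. This sketch shows that route; note it creates a new array object rather than rebinding an existing one, which is exactly the limitation described above:

```python
import numpy as np

a = np.array([1, 2, 3])
b = a.view()      # new array object, but it shares a's data buffer
a[1] = 22
print(b[1])       # 22 -- writes through a are visible via b
print(b is a)     # False -- still a distinct array object
```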
Every variable in Python is a reference, so you can use = directly, as follows:
import numpy as np
a = np.array([1,2,3])
b = a
You can check that b refers to a as follows:
assert a[1] == b[1]
a[1] = 4
assert a[1] == b[1]
Usually, when functions are not always supposed to create their own buffer, they implement an interface like

def func(a, b, c, out=None):
    if out is None:
        out = np.empty_like(a)
    # ... compute the result into out ...
    return out
that way the caller can control if an existing buffer is used or not.
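As a concrete sketch of that convention (the function name add_into is made up for illustration), the caller can either let the function allocate the result or hand it a preexisting buffer:

```python
import numpy as np

def add_into(a, b, out=None):
    # Follows the out= convention: allocate only when no buffer is given.
    if out is None:
        out = np.empty_like(a)
    np.add(a, b, out=out)
    return out

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
buf = np.empty_like(a)
result = add_into(a, b, out=buf)
print(result)         # [5 7 9]
print(result is buf)  # True -- the caller's buffer was reused
```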
trying out something simple and it's frustratingly not working:
def myfunc(a, b):
    return a + b[0]
v = np.vectorize(myfunc, excluded=['b'])
a = np.array([1,2,3])
b = [0]
v(a,b)
This gives me "IndexError: invalid index to scalar variable."
Upon printing b, it appears that the b taken in by the function is always 0, instead of [0]. Can I specify which arguments should be vectorized and which should remain constant?
When you use excluded=['b'] the keyword parameter b is excluded.
Therefore, you must call v with keyword arguments, e.g. v(a=a, b=b) instead of v(a, b).
If you wish to call v with positional arguments with the second positional argument excluded, then use
v = np.vectorize(myfunc)
v.excluded.add(1)
For example,
import numpy as np
def myfunc(a, b):
    return a + b[0]
a = np.array([1,2,3])
b = [0, 1]
v = np.vectorize(myfunc, excluded=['b'])
print(v(a=a, b=b))
# [1 2 3]
v = np.vectorize(myfunc)
v.excluded.add(1)
print(v(a, b))
# [1 2 3]
Well, here is the answer:
v.excluded.add(1) works, though passing excluded=['b'] does not, because excluding an argument by name means it must then be passed as a keyword argument.
Just add print to see what happens:
def myfunc(a, b):
    print(a, b)
    return a + b
v = np.vectorize(myfunc)
a = np.array([1,2,3])
b = np.array([0])
v(a, b)
Output:
1 0
1 0
2 0
3 0
The function is applied elementwise, so it receives only scalar values, and you cannot index a scalar. (The duplicated first line of output is np.vectorize making a trial call to determine the output type.)
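If you want vectorize to hand the function a whole array for b rather than excluding it, one alternative is the signature parameter, which declares per-argument core dimensions. Here '(),(n)->()' means the first argument is looped over as scalars while the second is passed intact as a 1-D array:

```python
import numpy as np

def myfunc(a, b):
    return a + b[0]

# '(),(n)->()' : loop over scalars of a, pass b whole as a 1-D array
v = np.vectorize(myfunc, signature='(),(n)->()')
a = np.array([1, 2, 3])
b = np.array([0])
print(v(a, b))  # [1 2 3]
```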
In Python, is it possible to make multiple assignments in the following manner (or, rather, is there a shorthand):
import random
def random_int():
    return random.randint(1, 100)
a, b = # for each variable assign the return values from random_int
Do you want to assign different return values from two different calls to your random function, or a single value to two variables generated by a single call to the function?
For the former, use tuple unpacking
t = (2, 5)
a, b = t  # valid!

def random_int():
    return random.randint(1, 100)

# valid: unpack a 2-tuple to a 2-tuple of variables
a, b = random_int(), random_int()

# invalid: tries to unpack an int as a 2-tuple
a, b = random_int()

# valid: you can also use comprehensions
a, b = (random_int() for i in range(2))
For the latter, you can chain assignments to assign the same value to multiple variables.
# valid, "normal" way
a = random_int()
b = a

# the same, shorthand
b = a = random_int()
In my test, it works only if you have exactly as many variables as there are items in the return value.
def algo():
    return range(5)

a5 = algo()                      # works: a5 is bound to the range object itself
b1, b2, b3, b4, b5 = algo()      # works
c1, c2, c3 = algo()              # ValueError: too many values to unpack
d1, d2, d3, d4, d5, d6 = algo()  # ValueError: not enough values to unpack
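Python 3's extended unpacking (PEP 3132) relaxes that exact-count requirement: a single starred name soaks up whatever is left over:

```python
def algo():
    return range(5)

a, *rest = algo()
print(a, rest)              # 0 [1, 2, 3, 4]

first, *middle, last = algo()
print(first, middle, last)  # 0 [1, 2, 3] 4
```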
Dictionaries are a good way; or you can use a namedtuple from the collections module in Python. Try something like this, and you will receive multiple call results in a single variable, say a:

from collections import namedtuple

num = namedtuple("num", "b c d")
a = num(random_int(), random_int(), random_int())
print(a.b, a.c, a.d)
You can return any object, including a list or tuple, which may be unpacked on assignment:
import random
def random_ints():
    return random.randint(1, 100), random.randint(1, 100)
a, b = random_ints()
print(b, a)
In fact, this a, b is shorthand for a tuple as well: in multiple assignment, the comma-separated list of variables on the left is a tuple too, and could be written as:
(a, b) = range(2)
My personal favourite would be to convert your function to a generator (provided that it fits well into your program).
Example:
>>> import random
>>>
>>> def rnd_numbers(how_many=1): # assuming how_many is positive
... for _ in range(how_many): # use xrange() in Python2.x
... yield random.randint(1, 100)
...
>>> x, y, z = rnd_numbers(3)
>>> x
98
>>> y
69
>>> z
16
>>> a,b = rnd_numbers(2)
>>> a
52
>>> b
33
I am trying to write a function that returns one element when the input is an element, and an array of outputs when the input is an array, such that each element of the output array is associated with the same place in the input array. Here is a dummy example:
import numpy as np
def f(a):
    if a < 5:
        print(a)
f(np.arange(11))
This code returns the error:
    if a < 5:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I expect the output to be:
0
1
2
3
4
How can I make it work the way I explained, as I believe many Python functions work this way?
Thanks.
When I have had to deal with this kind of thing, I usually start by calling np.asarray on the input, set a flag if the result is 0-dimensional (i.e. a scalar), promote it to 1-D, run the function on the array, and convert the result back to a scalar before returning if the flag was set. With your example:
def f(a):
    a = np.asarray(a)
    is_scalar = False if a.ndim > 0 else True
    a.shape = (1,)*(1-a.ndim) + a.shape
    less_than_5 = a[a < 5]
    return (less_than_5 if not is_scalar else
            (less_than_5[0] if less_than_5.size else None))

Note the use of less_than_5.size rather than the bare array in the condition: testing the array itself would wrongly return None when the single surviving element happens to be 0.
>>> f(4)
4
>>> f(5)
>>> f([3,4,5,6])
array([3, 4])
>>> f([5,6,7])
array([], dtype=int32)
If you do this very often, you could add all that handling in a function decorator.
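That decorator might look something like this (the name scalar_aware is made up; this is a sketch of the pattern, not a standard utility):

```python
import numpy as np
from functools import wraps

def scalar_aware(func):
    """Promote scalar input to 1-D, run func, demote the result back."""
    @wraps(func)
    def wrapper(a):
        arr = np.asarray(a)
        is_scalar = arr.ndim == 0
        result = func(np.atleast_1d(arr))
        if is_scalar:
            # .size avoids the ambiguous/falsy truth value of arrays
            return result[0] if result.size else None
        return result
    return wrapper

@scalar_aware
def f(a):
    return a[a < 5]

print(f(4))          # 4
print(f([3, 4, 5]))  # [3 4]
print(f(7))          # None
```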
If you want the function to react depending on whether the given input is a list or just an int, use:
def f(a):
    if type(a) == list:
        pass  # do stuff with the list
    elif type(a) == int:
        pass  # do stuff with the int
    else:
        print("Enter an int or list")
In the above, the function checks whether the given input is a list; if so, the first block is used. The next elif checks whether the input is an int. Otherwise, the else block is executed.
import numpy as np
def f(a):
    result = a[a < 5]
    return result

def report(arr):
    for elt in arr:
        print(elt)
report(f(np.arange(11)))
Generally speaking I dislike putting print statements in functions (unless the function does nothing but print.) If you keep the I/O separate from the computation, then your functions will be more re-usable.
It is also generally a bad idea to write a function that returns different types of output, such as a scalar for some input and an array for other input. If you do that, then all subsequent code that uses this function will have to check whether the output is a scalar or an array, or the code will have to be written very carefully to control what kind of input is sent to the function. The code can become very complicated or very buggy if you do this.
Write simple functions -- ones that either always return an array, or always return a scalar.
You can use isinstance to check the type of an argument, and then have your function take the correct action;
In [15]: a = np.arange(11)
In [16]: isinstance(a, np.ndarray)
Out[16]: True
In [17]: b = 12.7
In [18]: isinstance(b, float)
Out[18]: True
In [19]: c = 3
In [20]: isinstance(c, int)
Out[20]: True
In [21]: d = '43.1'
In [23]: isinstance(d, str)
Out[23]: True
In [24]: float(d)
Out[24]: 43.1
In [25]: float('a3')
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-25-caad719e0e75> in <module>()
----> 1 float('a3')
ValueError: could not convert string to float: a3
This way you can create a function that does the right thing whether it is given a str, a float, an int, a list or a numpy.ndarray.
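For example, a hypothetical normalizing helper (to_array is a made-up name) could dispatch on type like this:

```python
import numpy as np

def to_array(x):
    # Hypothetical helper: accept several input types, return an ndarray.
    if isinstance(x, np.ndarray):
        return x
    if isinstance(x, (list, tuple)):
        return np.array(x)
    if isinstance(x, str):
        return np.array([float(x)])   # may raise ValueError, as shown above
    if isinstance(x, (int, float)):
        return np.array([x])
    raise TypeError("unsupported type: %s" % type(x).__name__)

print(to_array('43.1'))   # [43.1]
print(to_array([1, 2]))   # [1 2]
```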