negative values in my probability vector

Hi, I want to create a probability vector for my 2-dimensional array. I wrote a function that iterates over the rows and divides each value by the sum of its row. When I only enter positive values everything works, but as soon as there is a negative number I get a negative probability, which shouldn't be possible since every value must satisfy 0 <= x <= 1.
def createProbabilityVector(inputArray):
    vector = inputArray
    probabilityVector = np.zeros(vector.shape)
    for x in range(vector.shape[0]):
        vectorSum = sum(vector[x])
        probabilityVector[[x]] = vector[[x]] / vectorSum
    return probabilityVector
Is the mistake in the code, or do I simply fail to understand what I want to do?
Edit: some examples
input
[[ 1.62242568 1.27356428 -1.88008155 1.37183247]
[-1.10638392 0.18420085 -1.68558966 -1.59951709]
[ 1.79166467 -0.21911691 -1.29066019 0.4565108 ]
[-0.20459109 1.59912774 0.47735207 1.6398782 ]]
output:
[[ 0.67948147 0.53337625 -0.78738927 0.57453155]
[ 0.26296832 -0.04378136 0.4006355 0.38017754]
[ 2.42642012 -0.2967462 -1.74791851 0.61824459]
[-0.05825873 0.45536272 0.13592931 0.4669667 ]]
-----
input
[[ 1.50162225 -0.31502279 -1.40281248 -1.09221922]
[ 1.93663826 1.31671237 -1.14334774 1.54792572]
[ 1.21376416 -1.44547074 0.0045907 1.4099986 ]
[ 0.51903455 -0.80046238 -1.69780354 -1.29893969]]
output:
[[-1.14764998 0.24076355 1.0721323 0.83475413]
[ 0.52943577 0.3599612 -0.31256699 0.42317002]
[ 1.02610693 -1.2219899 0.00388094 1.19200202]
[-0.15833053 0.24417956 0.51791182 0.39623914]]
-----
input
[[-1.6333837 -0.50469549 -1.62305585 -1.43558978]
[ 0.29636416 -0.22401163 -1.82816273 0.10676174]
[-1.6599302 -0.2516563 -1.64843802 -0.86857615]
[ 1.31762542 0.8690911 1.5888384 -1.83204102]]
output:
[[ 0.31431022 0.09711799 0.31232284 0.27624895]
[-0.17971828 0.13584296 1.10861674 -0.06474142]
[ 0.37482047 0.05682524 0.37222548 0.1961288 ]
[ 0.67796038 0.44717514 0.81750812 -0.94264364]]
-----
input
[[ 0.15369025 1.05426071 -0.61295255 0.95033555]
[ 0.04138761 -1.41072628 1.90319561 -1.2563338 ]
[ 1.85131197 -1.24551221 -1.62731374 0.43129381]
[ 0.21235188 1.21581691 -0.57470021 -0.58482563]]
output:
[[ 0.09945439 0.68222193 -0.3966473 0.61497099]
[-0.05728572 1.95262488 -2.63426518 1.73892602]
[-3.1366464 2.11025017 2.75713 -0.73073377]
[ 0.79046139 4.52577253 -2.13927148 -2.17696245]]

You need to transform all the values of the input array into non-negative values before normalizing. A few alternatives are:
Convert all the negatives to 0 (function zeroed).
Shift all the values by the absolute value of the minimum element (function shifted).
Apply the exponential function to the values (function exponential).
After you have converted the values of the input array, you can use your function as usual. The transformation functions are defined as follows:
def zeroed(arr):
    return arr.clip(min=0)
def shifted(arr):
    return arr + abs(np.min(arr))
def exponential(arr):
    return np.exp(arr)
In your function you can use the transformation as follows:
def createProbabilityVector(inputArray):
    vector = inputArray
    probabilityVector = np.zeros(vector.shape)
    for x in range(vector.shape[0]):
        new_vector = zeroed(vector[x])
        vectorSum = sum(new_vector)
        probabilityVector[[x]] = new_vector / vectorSum
    return probabilityVector
The function zeroed can be replaced by shifted or exponential. For the input:
array = np.array([[1.62242568, 1.27356428, -1.88008155, 1.37183247],
[-1.10638392, 0.18420085, -1.68558966, -1.59951709],
[1.79166467, -0.21911691, -1.29066019, 0.4565108],
[-0.20459109, 1.59912774, 0.47735207, 1.6398782]])
These are the results for the function zeroed:
[[0.38015304 0.29841079 0. 0.32143616]
[0. 1. 0. 0. ]
[0.79694165 0. 0. 0.20305835]
[0. 0.43029432 0.1284462 0.44125948]]
for shifted:
[[0.35350056 0.31829072 0. 0.32820872]
[0.22847732 0.73756992 0. 0.03395275]
[0.52233595 0.18158552 0. 0.29607853]
[0. 0.41655061 0.15748787 0.42596152]]
and exponential:
[[0.39778013 0.28063027 0.01198184 0.30960776]
[0.17223667 0.62606504 0.09651165 0.10518664]
[0.69307072 0.09279107 0.03177905 0.18235916]
[0.06504215 0.39494808 0.12863496 0.41137482]]
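The three transformations can also be applied to the whole array at once instead of row by row in a loop. The following is only a sketch under some assumptions: the names row_probabilities, zeroed_rows, shifted_rows and exponential_rows are hypothetical helpers, not part of the answer above, and the shift is taken per row so that it mirrors calling shifted on one row at a time. The exponential variant is the familiar softmax; subtracting the row maximum first does not change the result and only guards against overflow.

```python
import numpy as np

def zeroed_rows(arr):
    # Negative entries become 0.
    return arr.clip(min=0)

def shifted_rows(arr):
    # Shift each row by the absolute value of its own minimum.
    return arr + np.abs(arr.min(axis=1, keepdims=True))

def exponential_rows(arr):
    # exp(x) / sum(exp(x)) is unchanged by subtracting the row maximum first.
    return np.exp(arr - arr.max(axis=1, keepdims=True))

def row_probabilities(arr, transform):
    """Transform the values, then divide each row by its sum."""
    arr = np.asarray(arr, dtype=float)
    weights = transform(arr)
    row_sums = weights.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        # Happens, for example, with zeroed_rows when a row has no positive entry.
        raise ValueError("a row sums to zero after the transformation")
    return weights / row_sums

if __name__ == "__main__":
    array = np.array([[1.62242568, 1.27356428, -1.88008155, 1.37183247],
                      [-1.10638392, 0.18420085, -1.68558966, -1.59951709]])
    print(row_probabilities(array, exponential_rows))
```

Raising an error for an all-zero row is a design choice of this sketch; returning a uniform row would be another reasonable option.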

Related

adding values in a 2d array provided that the first value is greater than 5

I have a question that seems elementary, but for some reason I can't get it to work. I have a 2D list, and I need to add rows together so that the first value of the sum is not less than 5 (only adjacent rows may be summed). For example:
array([[ 0. , 3.817549],
[ 3. , 21.275711],
[ 11. , 59.286198],
[ 47. , 110.136649],
[132. , 153.451585],
[263. , 171.041259],
[301. , 158.872652],
[198. , 126.488376],
[ 50. , 200.63002 ]])
and I need output like this:
array([[ 14. , 84.3794...],
[ 47. , 110.136649],
[132. , 153.451585],
[263. , 171.041259],
[301. , 158.872652],
[198. , 126.488376],
[ 50. , 200.63002 ]])

Try:
arr = np.array([[ 0. , 3.817549],
[ 3. , 21.275711],
[ 11. , 59.286198],
[ 47. , 110.136649],
[132. , 153.451585],
[263. , 171.041259],
[301. , 158.872652],
[198. , 126.488376],
[ 50. , 200.63002 ]])
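for i in range(len(arr)):
    if arr[i, 0] >= 5.0:
        # The running sum has reached 5: keep this row and everything after it.
        arr = arr[i:, :]
        break
    else:
        # Fold the current row into the next one and keep going.
        arr[i + 1, :] += arr[i, :]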
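The same idea can be written without modifying the input in place. The sketch below is an assumption-laden variant, not the answer's own code: merge_leading_rows and its threshold parameter are hypothetical names, and it uses a cumulative sum of the first column to find the first row at which the running total reaches the threshold. It also handles the case where the threshold is never reached, which the loop version does not.

```python
import numpy as np

def merge_leading_rows(arr, threshold=5.0):
    """Sum leading rows until the first column reaches `threshold`."""
    arr = np.asarray(arr, dtype=float)
    running = np.cumsum(arr[:, 0])
    reached = running >= threshold
    if not reached.any():
        # Threshold never reached: collapse everything into a single row.
        return arr.sum(axis=0, keepdims=True)
    k = int(np.argmax(reached))  # first index where the running total is large enough
    merged = arr[: k + 1].sum(axis=0, keepdims=True)
    return np.vstack([merged, arr[k + 1:]])

if __name__ == "__main__":
    arr = np.array([[0.0, 3.817549],
                    [3.0, 21.275711],
                    [11.0, 59.286198],
                    [47.0, 110.136649]])
    print(merge_leading_rows(arr))
```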

I'm not entirely sure if I understand the question, but I will try to help.
I would approach this problem with the following steps:
1. Create a separate 2D list to store your final output and a two-value accumulator list to temporarily store values. Initialize the accumulator to the values at index [0] of your input array.
2. Iterate over the remaining rows of the original 2D list.
3. For each row:
a. if accumulator[0] >= 5, append the accumulated values to your output and then reset the accumulator to the current row
b. otherwise, add the current row to your accumulator
4. After the loop, append whatever is left in the accumulator so the last row is not lost.
The following code takes your input and reproduces the output you wanted:
# Assuming current_vals is the input list (the function name is only illustrative)...
def merge_rows(current_vals):
    final_vals = []
    accumulator = [current_vals[0][0], current_vals[0][1]]
    # Start at 1 because row 0 is already in the accumulator.
    for sublist_index in range(1, len(current_vals)):
        if accumulator[0] >= 5:
            final_vals.append([accumulator[0], accumulator[1]])
            accumulator[0] = current_vals[sublist_index][0]
            accumulator[1] = current_vals[sublist_index][1]
        else:
            accumulator[0] += current_vals[sublist_index][0]
            accumulator[1] += current_vals[sublist_index][1]
    # Flush the accumulator so the final row ends up in the output.
    final_vals.append([accumulator[0], accumulator[1]])
    return final_vals

Numpy method to return the index of the occurrence of an array within an array of arrays

I have an array of arrays that represents a set of unique colour values:
[[0. 0. 0. ]
[0. 0. 1. ]
[0. 1. 1. ]
[0.5019608 0.5019608 0.5019608 ]
[0.64705884 0.16470589 0.16470589]
[0.9607843 0.9607843 0.8627451 ]
[1. 0. 0. ]
[1. 0.84313726 0. ]
[1. 1. 0. ]
[1. 1. 1. ]]
And another numpy array that represents one of the colours:
[0.9607843 0.9607843 0.8627451 ]
I need a function to find the index where the colour array occurs in the set of colours, i.e. the function should return 5 for the arrays above.

numpy.where() returns the positions in the array where a given condition holds. So here it would be as follows (denoting the big array as arr1 and the sought vector as arr2):
np.where(np.all(arr1 == arr2, axis=1))
This returns a tuple containing an array of the indices of the matching rows. The first element of that array is the index you want (5 for the arrays above, provided the values match exactly).
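Because the colour values are floats, an exact == comparison can miss a row that differs only by rounding. A tolerance-based lookup is one way around that. This is just a sketch: find_colour_index and the tol parameter are hypothetical names, and the tolerance of 1e-6 is an assumption that should be adapted to how the colours were produced.

```python
import numpy as np

def find_colour_index(colours, target, tol=1e-6):
    """Return the index of the first row matching `target` within `tol`, or -1."""
    colours = np.asarray(colours, dtype=float)
    target = np.asarray(target, dtype=float)
    # A row matches when every channel is within the absolute tolerance.
    matches = np.all(np.isclose(colours, target, rtol=0.0, atol=tol), axis=1)
    hits = np.flatnonzero(matches)
    return int(hits[0]) if hits.size else -1

if __name__ == "__main__":
    colours = np.array([[0.0, 0.0, 0.0],
                        [0.0, 0.0, 1.0],
                        [0.9607843, 0.9607843, 0.8627451]])
    print(find_colour_index(colours, [0.9607843, 0.9607843, 0.8627451]))
```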

Assuming that this is a relatively short list of colors (<1000), the simplest thing to do is probably just iterate over the list and compare each element of the sub-array.
color_list = ...
color_index = -1
target_color = [0.9607843, 0.9607843, 0.8627451]
for i in range(0, len(color_list)):
    cur_color = color_list[i]
    # Use == for comparison; a single = is an assignment and a syntax error here
    if (cur_color[0] == target_color[0] and cur_color[1] == target_color[1] and cur_color[2] == target_color[2]):
        color_index = i
        break

Compare items of a list sequentially with another one then use one by one in VPython application

I have grid of objects in Vpython created by this code:
iX = [(x - pointW // 2) * sclFact for x in range(pointW)]
iY = [(x - pointH // 2) * sclFact for x in range(pointH)]
iYr = iY[::-1]
xy = list(itertools.product(iX,iYr,))
ixyz = np.array(list(itertools.product(iX,iYr,[-0.0])))
for element in ixyz:
    cube = box(pos = element,
               size=( .1, .1, .1 ),)
Printing the ixyz list looks like this:
[[-0.5 0. -0. ]
[-0.5 -0.5 -0. ]
[ 0. 0. -0. ]
[ 0. -0.5 -0. ]
[ 0.5 0. -0. ]
[ 0.5 -0.5 -0. ]]
I have another list in which the z value sometimes changes depending on certain input, and it is constantly updated. It will look like this:
[[-0.5 0. -0. ]
[-0.5 -0.5 -0. ]
[ 0. 0. -0. ]
[ 0. -0.5 -0. ]
[ 0.5 0. -2.3570226]
[ 0.5 -0.5 -0. ]]
I want to move the objects based on the new list. I tried different variations but none of them worked; it always ends up using the last item in the second list:
while True:
    # .... some code here (the one getting the new list)
    # ...
    # ...
    # then I added this:
    for obj in scene.objects:
        if isinstance(obj, box):
            for i in xyz: # xyz is the new list
                if obj.pos != i:
                    obj.pos = i
This variation makes all the boxes collapse into one box that moves based on the last position in the list.
What am I doing wrong, or is there another way to do that?
Or should I change the whole process of creating the objects and moving them?
I am really new to VPython and to Python itself.
Edit
I fixed both lists to be better presented, like this:
[(-0.5,0.0,-0.0),(-0.5,-0.5,-0.0),...(0.5,-0.5,-0.0)]

You are repeatedly setting the position to each element in the updated positions list:
box.pos = 1
box.pos = 2
box.pos = 3
You need to set the position one time; so compute an index:
i = 0
for obj in scene.objects:
    if isinstance(obj, box):
        obj.pos = xyz[i]
        i += 1
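The index-based pairing above can also be expressed with zip, which walks the boxes and the new positions in lockstep. The sketch below does not use VPython at all: StandInBox is a hypothetical stand-in class that only stores a position, and move_boxes is a hypothetical helper. It also assumes that the boxes are visited in the same order in which the positions are listed, which would need checking for scene.objects in a real program.

```python
class StandInBox:
    """Hypothetical stand-in for a VPython box; it only stores a position."""
    def __init__(self, pos):
        self.pos = pos

def move_boxes(boxes, new_positions):
    # zip pairs the first box with the first position, the second with the second, ...
    for obj, new_pos in zip(boxes, new_positions):
        if obj.pos != new_pos:
            obj.pos = new_pos

if __name__ == "__main__":
    boxes = [StandInBox(p) for p in [(-0.5, 0.0, -0.0), (0.5, 0.0, -0.0)]]
    move_boxes(boxes, [(-0.5, 0.0, -0.0), (0.5, 0.0, -2.3570226)])
    print([b.pos for b in boxes])
```

Each box receives exactly one assignment, so no box ends up at the last position of the list by accident.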

+= with numpy.array object modifying original object

In the following code, I am attempting to calculate both the frequency and sum of a set of vectors (numpy vectors)
def calculate_means_on(the_labels, the_data):
    freq = dict();
    sums = dict();
    means = dict();
    total = 0;
    for index, a_label in enumerate(the_labels):
        this_data = the_data[index];
        if a_label not in freq:
            freq[a_label] = 1;
            sums[a_label] = this_data;
        else:
            freq[a_label] += 1;
            sums[a_label] += this_data;
Suppose the_data (a numpy 'matrix') is originally :
[[ 1. 2. 4.]
[ 1. 2. 4.]
[ 2. 1. 1.]
[ 2. 1. 1.]
[ 1. 1. 1.]]
After running the above code, the_data becomes:
[[ 3. 6. 12.]
[ 1. 2. 4.]
[ 7. 4. 4.]
[ 2. 1. 1.]
[ 1. 1. 1.]]
Why is this? I've narrowed it down to the line sums[a_label] += this_data; when I change it to sums[a_label] = sums[a_label] + this_data; it behaves as expected, i.e., the_data is not modified.

This line:
this_data = the_data[index]
takes a view, not a copy, of a row of the_data. The view is backed by the original array, and mutating the view will write through to the original array.
This line:
sums[a_label] = this_data
inserts that view into the sums dict, and this line:
sums[a_label] += this_data
mutates the original array through the view, since += requests that the operation be performed by mutation instead of by creating a new object, when the object is mutable.
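The difference between a view and a copy can be seen in a few lines. This is a small illustration with made-up variable names (view_sum, copy_sum, fresh), not the code from the question: the first half reproduces the write-through, the second half shows that taking an explicit .copy() of the row keeps the source array untouched.

```python
import numpy as np

the_data = np.array([[1., 2., 4.],
                     [1., 2., 4.]])

# A row obtained by basic indexing is a view, so in-place += writes into the_data.
view_sum = the_data[0]
view_sum += the_data[1]
print(np.shares_memory(view_sum, the_data))  # the view is backed by the_data

# An explicit copy keeps the running sum separate from the source array.
fresh = np.array([[1., 2., 4.],
                  [1., 2., 4.]])
copy_sum = fresh[0].copy()
copy_sum += fresh[1]
print(fresh)  # rows are unchanged
```

In the function from the question, storing this_data.copy() in the sums dict would be one way to apply the same idea.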

Applying several functions to each row of an array

I have a numpy array which has only a few non-zero entries which can be either positive or negative. E.g. something like this:
myArray = np.array([[ 0. , 0. , 0. ],
[ 0.32, -6.79, 0. ],
[ 0. , 0. , 0. ],
[ 0. , 1.5 , 0. ],
[ 0. , 0. , -1.71]])
In the end, I would like to receive a list where each entry of this list corresponds to a row of myArray and is a cumulative product of function outputs which depend on the entries of the respective row of myArray and another list (in the example below it is called l).
The individual terms depend on the sign of the myArray entry: When it is positive, I apply "funPos", when it is negative, I apply "funNeg" and if the entry is 0, the term will be 1. So in the example array from above it would be:
output = [1*1*1 ,
funPos(0.32, l[0])*funNeg(-6.79,l[1])*1,
1*1*1,
1*funPos(1.5, l[1])*1,
1*1*funNeg(-1.71, l[2])]
I implemented this as shown below and it gives me the desired output (note: that is just a highly simplified toy example; the actual matrices are far bigger and the functions more complicated). I go through each row of the array, if the sum of the row is 0, I don't have to do any calculations and the output is just 1. If it is not equal 0, I go through this row, check the sign of each value and apply the appropriate function.
import numpy as np
def doCalcOnArray(Array1, myList):
    output = np.ones(Array1.shape[0]) #initialize output
    for indRow,row in enumerate(Array1):
        if sum(row) != 0: #only then calculations are needed
            tempProd = 1. #initialize the product that corresponds to the row
            for indCol, valCol in enumerate(row):
                if valCol > 0:
                    tempVal = funPos(valCol, myList[indCol])
                elif valCol < 0:
                    tempVal = funNeg(valCol, myList[indCol])
                elif valCol == 0:
                    tempVal = 1
                tempProd = tempProd*tempVal
            output[indRow] = tempProd
    return output
def funPos(val1,val2):
    return val1*val2
def funNeg(val1,val2):
    return val1*(val2+1)
myArray = np.array([[ 0.  ,  0.  ,  0.  ],
                    [ 0.32, -6.79,  0.  ],
                    [ 0.  ,  0.  ,  0.  ],
                    [ 0.  ,  1.5 ,  0.  ],
                    [ 0.  ,  0.  , -1.71]])
l = [1.1, 2., 3.4]
op = doCalcOnArray(myArray,l)
print(op)
The output is
[ 1. -7.17024 1. 3. -7.524 ]
which is the desired one.
My question is whether there is a more efficient way for doing that since that is quite "expensive" for large arrays.
EDIT:
I accepted gabhijit's answer because the pure numpy solution he came up with seems to be the fastest one for the arrays I am dealing with. Please note that there is also a nice working solution from RaJa that requires pandas, and the solution from dave also works fine and can serve as a nice example of how to use generators and numpy's "apply_along_axis".

Here's what I have tried - using reduce, map. I am not sure how fast this is - but is this what you are trying to do?
Edit 4: Simplest and most readable: make l a numpy array, which greatly simplifies the np.where calls.
import numpy as np
import time
from functools import reduce  # builtin on Python 2, must be imported on Python 3
l = np.array([1.0, 2.0, 3.0])
def posFunc(x,y):
    return x*y
def negFunc(x,y):
    return x*(y+1)
def myFunc(x, y):
    if x > 0:
        return posFunc(x, y)
    if x < 0:
        return negFunc(x, y)
    else:
        return 1.0
myArray = np.array([
    [ 0.,0.,0.],
    [ 0.32, -6.79, 0.],
    [ 0.,0.,0.],
    [ 0.,1.5,0.],
    [ 0.,0., -1.71]])
t1 = time.time()
# For each row, fold over (column index, value) pairs, starting the product at 1
a = np.array([reduce(lambda acc, iv: acc*myFunc(iv[1], l[iv[0]]), enumerate(row), 1) for row in myArray])
t2 = time.time()
print((t2-t1)*1000000)
print(a)
Basically, just look at the line that builds a: it cumulatively multiplies the terms produced from enumerate(row), starting with 1 (the last argument to reduce). myFunc simply takes an element of the current row of myArray and the element at the same column index in l and combines them as needed.
My output is not the same as yours, presumably because l here is [1.0, 2.0, 3.0] rather than your [1.1, 2., 3.4], so I am not sure whether this is exactly what you want, but maybe you can follow the logic.
Also, I am not sure how fast this will be for huge arrays.
edit: Following is a 'pure numpy way' to do this.
my = myArray # just for brevity
t1 = time.time()
# First set the positive and negative values
# complicated - [my.itemset((x,y), posFunc(my.item(x,y), l[y])) for (x,y) in zip(*np.where(my > 0))]
# changed to
my = np.where(my > 0, my*l, my)
# complicated - [my.itemset((x,y), negFunc(my.item(x,y), l[y])) for (x,y) in zip(*np.where(my < 0))]
# changed to
my = np.where(my < 0, my*(l+1), my)
# print my - commented out to time it.
# Now set the zeroes to 1.0s
my = np.where(my == 0.0, 1.0, my)
# print my - commented out to time it
a = np.prod(my, axis=1)
t2 = time.time()
print((t2-t1)*1000000)
print(a)
Let me try to explain the zip(*np.where(my != 0)) part as best I can. np.where returns two numpy arrays: the first holds the row indices and the second the column indices of the entries that match the condition (my != 0 in this case). We zip those indices into (row, column) tuples and then use array.itemset and array.item. Thankfully, the column index comes for free, so we can just take the element at that index in the list l. This should be faster than the previous version (and far more readable!). It needs a timeit run to find out whether it indeed is.
Edit 2: There is no need to call it separately for positive and negative values; it can be done with one call, np.where(my != 0).
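One detail of the chained np.where calls is that the second call tests the sign of values that the first call has already rewritten, which is only safe while the positive branch keeps its results positive. A nested np.where that always tests the original values avoids that coupling. The following is a sketch under assumptions, not the answer's own code: row_products is a hypothetical name, and the positive and negative rules are hard-coded to the toy funPos and funNeg from the question.

```python
import numpy as np

def row_products(values, weights):
    """Per-row product of sign-dependent terms, with zeros contributing 1."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)  # one weight per column, broadcast over rows
    terms = np.where(values > 0, values * weights,                # funPos rule
                     np.where(values < 0, values * (weights + 1.0),  # funNeg rule
                              1.0))                               # zero entries
    return terms.prod(axis=1)

if __name__ == "__main__":
    my_array = np.array([[0.0, 0.0, 0.0],
                         [0.32, -6.79, 0.0],
                         [0.0, 0.0, 0.0],
                         [0.0, 1.5, 0.0],
                         [0.0, 0.0, -1.71]])
    print(row_products(my_array, [1.1, 2.0, 3.4]))
```

Both branches are evaluated for every element before np.where selects between them, so this form suits cheap element-wise rules; expensive functions may be better applied only to the masked entries.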

So, let's see if I understand your question.
1. You want to map elements of your matrix to a new matrix such that:
0 maps to 1
x>0 maps to funPos(x)
x<0 maps to funNeg(x)
2. You want to calculate the product of all elements in each row of this new matrix.
So, here's how I would go about doing it:
For 1:
def myFun(a):
    if a==0:
        return 1
    if a>0:
        return funPos(a)
    if a<0:
        return funNeg(a)
newFun = np.vectorize(myFun)
newArray = newFun(myArray)
And for 2:
np.prod(newArray, axis = 1)
Edit: To pass the index to funPos, funNeg, you can probably do something like this:
# Python 2.7
r,c = myArray.shape
ctr = -1 # starts at -1 because np.vectorize makes one extra call on the first element to determine the output type
def myFun(a):
    global ctr
    global c
    ind = ctr % c
    ctr += 1
    if a==0:
        return 1
    if a>0:
        return funPos(a,l[ind])
    if a<0:
        return funNeg(a,l[ind])
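The global counter can be avoided by letting np.vectorize broadcast the list against the array, so that each element arrives together with the weight of its column. This is a sketch rather than the answer's own approach: term, fun_pos, fun_neg and vectorized_term are hypothetical names modelled on the question's toy functions, and otypes=[float] is passed so that the output type is fixed up front instead of being inferred from a probing call.

```python
import numpy as np

def fun_pos(val, weight):
    return val * weight

def fun_neg(val, weight):
    return val * (weight + 1)

def term(val, weight):
    # Scalar rule applied to one element and the weight of its column.
    if val > 0:
        return fun_pos(val, weight)
    if val < 0:
        return fun_neg(val, weight)
    return 1.0

vectorized_term = np.vectorize(term, otypes=[float])

my_array = np.array([[0.0, 0.0, 0.0],
                     [0.32, -6.79, 0.0],
                     [0.0, 0.0, 0.0],
                     [0.0, 1.5, 0.0],
                     [0.0, 0.0, -1.71]])
l = [1.1, 2.0, 3.4]

# l has one entry per column, so it broadcasts across the rows of my_array.
terms = vectorized_term(my_array, l)
result = np.prod(terms, axis=1)
print(result)
```

np.vectorize is still a Python-level loop internally, so this is mainly a readability gain over the counter, not a speed-up.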

I think this numpy function would be helpful to you
numpy.apply_along_axis
Here is one implementation. I would also warn against checking whether the sum of the row is 0. Comparing floats to 0 can give unexpected behavior due to limited machine precision. Also, if a row contains -5 and 5 the sum is zero, and I'm not sure that's what you want. I used numpy's any() method to see if anything was nonzero. For simplicity I also pulled your list (my_list) into global scope.
import numpy as np
my_list = 1.1, 2., 3.4
def func_pos(val1, val2):
    return val1 * val2
def func_neg(val1, val2):
    return val1 *(val2 + 1)
def my_generator(row):
    for i, a in enumerate(row):
        if a > 0:
            yield func_pos(a, my_list[i])
        elif a < 0:
            yield func_neg(a, my_list[i])
        else:
            yield 1
def reduce_row(row):
    if not row.any():
        return 1.0
    else:
        return np.prod(np.fromiter(my_generator(row), dtype=float))
def main():
    myArray = np.array([
        [ 0.  ,  0.  ,  0.  ],
        [ 0.32, -6.79,  0.  ],
        [ 0.  ,  0.  ,  0.  ],
        [ 0.  ,  1.5 ,  0.  ],
        [ 0.  ,  0.  , -1.71]])
    return np.apply_along_axis(reduce_row, axis=1, arr=myArray)
There are probably faster implementations; I think apply_along_axis is really just a loop under the covers.
I didn't test it, but I bet this is faster than what you started with, and it should be more memory efficient.

I've tried your example with the masking function of numpy arrays. However, I couldn't find a solution to replace the values in your array by funPos or funNeg.
So my suggestion would be to try this using pandas instead as it conserves indices while masking.
See my example:
import numpy as np
import pandas as pd
def funPos(a, b):
    return a * b
def funNeg(a, b):
    return a * (b + 1)
myPosFunc = np.vectorize(funPos) #vectorized form of funPos
myNegFunc = np.vectorize(funNeg) #vectorized form of funNeg
#Input
I = [1.0, 2.0, 3.0]
x = pd.DataFrame([
    [ 0.,0.,0.],
    [ 0.32, -6.79, 0.],
    [ 0.,0.,0.],
    [ 0.,1.5,0.],
    [ 0.,0., -1.71]])
b = pd.DataFrame(myPosFunc(x[x>0], I)) #calculate all positive values
c = pd.DataFrame(myNegFunc(x[x<0], I)) #calculate all negative values
b = b.mul(c, fill_value=1.0) #put values of c in b (combineMult in older pandas); a value missing on one side counts as 1
b = b.fillna(1) #replace all missing values that were '0' in the raw array
y = b.product(axis=1) #multiply all elements in one row
#Output
print ('final result')
print (y)
print (y.tolist())

We Keep Coding
