How can I shorten this 2D array code? - python

I have the following snippet which works fine.
P=im1.copy()
for i in range(P.shape[0]):
    for j in range(P.shape[1]):
        if (n_dens.data[i][j]==-5000 or T_k.data[i][j]==-5000):
            P.data[i][j]=-5000
        else :
            P.data[i][j]=n_dens.data[i][j]*T_k.data[i][j]
where P is a 2D array.
I was wondering how to trim this down to something along the following lines:
P.data=n_dens.data*T_k.data
P.data=[foo-2.5005*10**7 if n_dens.data==-5000 or T_k.data==-5000 else foo for foo in P.data]
For my trial above I get the following error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
How can I correct the error? Or is there another method to trim it down?

n_dens.data==-5000 produces an array of True/False values, not a single value, so the if cannot handle it. You are close to the idea, though: you can use logical (boolean) indexing in NumPy.
Also, the logical operators and, or and not cannot be overloaded in Python, so NumPy cannot make them work element-wise. You have to do something like
index = np.logical_or(n_dens.data == -5000, T_k.data == -5000)
P.data[index] = -5000
Similarly, for the else branch, index both sides with the inverted mask so the shapes match: P.data[~index] = n_dens.data[~index] * T_k.data[~index]
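To make the point about `or` concrete, here is a minimal sketch on two small made-up arrays (a and b are illustrative stand-ins, not the asker's n_dens and T_k). It shows that `or` raises the ValueError from the question, while `|` and np.logical_or both build the element-wise mask.

```python
import numpy as np

# Illustrative stand-ins for n_dens.data and T_k.data (assumed values).
a = np.array([-5000, 1, 2])
b = np.array([3, -5000, 4])

# `or` asks Python for the truth value of a whole array, which NumPy refuses.
try:
    (a == -5000) or (b == -5000)
    raised = False
except ValueError:
    raised = True

# Element-wise alternatives: the | operator and np.logical_or agree.
mask_op = (a == -5000) | (b == -5000)
mask_fn = np.logical_or(a == -5000, b == -5000)

print(raised)            # True
print(mask_op.tolist())  # [True, True, False]
```

The parentheses around each comparison matter with `|`, because `|` binds more tightly than `==`.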

You can try this:
P.data[(n_dens.data == -5000) | (T_k.data == -5000)] = -5000
cond = ~(n_dens.data == -5000) & ~(T_k.data == -5000) # 2D array of booleans
P.data[cond] = n_dens.data[cond] * T_k.data[cond]
A complete example:
import numpy as np
from copy import deepcopy
class IMAGE:
    def __init__(self, data):
        self.data = data
        self.shape = self.data.shape
np.random.seed(0)
P, n_dens, T_k = IMAGE(np.zeros((5,5))), IMAGE(np.reshape(np.random.choice([-5000,1,2],25), (5,5))), IMAGE(3*np.ones((5,5)))
P1 = deepcopy(P)
# with loop
for i in range(P.shape[0]):
    for j in range(P.shape[1]):
        if (n_dens.data[i][j]==-5000 or T_k.data[i][j]==-5000):
            P.data[i][j]=-5000
        else :
            P.data[i][j]=n_dens.data[i][j]*T_k.data[i][j]
# vectorized
P1.data[(n_dens.data == -5000) | (T_k.data == -5000)] = -5000
cond = ~(n_dens.data == -5000) & ~(T_k.data == -5000) # 2D array of booleans
P1.data[cond] = n_dens.data[cond] * T_k.data[cond]
cond
# array([[False, True, False, True, True],
# [ True, False, True, False, False],
# [False, True, True, True, True],
# [False, True, True, True, True],
# [False, True, False, False, True]], dtype=bool)
# with same output for both
P.data == P1.data
# array([[ True, True, True, True, True],
# [ True, True, True, True, True],
# [ True, True, True, True, True],
# [ True, True, True, True, True],
# [ True, True, True, True, True]], dtype=bool)
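Another option, not taken from the answers above, is np.where, which picks between two values element by element and so expresses the whole if/else in one call. The sketch below uses small made-up arrays and a SENTINEL name introduced only for illustration.

```python
import numpy as np

SENTINEL = -5000  # missing-value marker used in the question

# Illustrative 2x2 inputs standing in for n_dens.data and T_k.data (assumed values).
n_dens = np.array([[1.0, SENTINEL], [2.0, 3.0]])
T_k = np.array([[10.0, 20.0], [SENTINEL, 30.0]])

# True wherever either input carries the sentinel.
bad = (n_dens == SENTINEL) | (T_k == SENTINEL)

# Take the sentinel where bad, the product everywhere else.
P = np.where(bad, SENTINEL, n_dens * T_k)
print(P.tolist())  # [[10.0, -5000.0], [-5000.0, 90.0]]
```

Note that the product is still computed for the sentinel cells and then discarded, which is harmless here but worth knowing if the discarded operation could overflow or warn.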

Related

Is there a fast way to create a bool matrix from another matrix in python?

I would like to know if there is a faster way, not O(n^2), to create a bool matrix out of an integer n x n matrix.
Example:
given is the matrix:
matrix_int = [[-5,-8,6],[4,6,-9],[7,8,9]]
after transformation i want this:
matrix_bool = [[False,False,True],[True,True,False],[True,True,True]]
so all negative values should be False and all positive values should be True.
The brute-force way is O(n^2) and this is too slow for me. Do you have any ideas how to make this faster?
import numpy as np
matrix_int = [[-5,-8,6],[4,6,-9],[7,8,9]]
matrix_int = np.array(matrix_int)
bool_mat = matrix_int > 0  # element-wise comparison, evaluated in compiled code
result:
array([[False, False, True],
[ True, True, False],
[ True, True, True]])
matrix_int = [[-5,-8,6],[4,6,-9],[7,8,9]]
matrix_bool = [[num > 0 for num in row] for row in matrix_int]
# [[False, False, True], [True, True, False], [True, True, True]]

How can you find the index of a list within a list of lists

I know for 1d arrays there is a function called np.in1d that allows you to find the indices of an array that are present in another array, for example:
a = [0,0,0,24210,0,0,0,0,0,21220,0,0,0,0,0,24410]
b = [24210,24610,24410]
np.in1d(a,b)
yields [False, False, False, True, False, False, False, False, False,
False, False, False, False, False, False, True]
I was wondering if there was a command like this for finding lists in a list of lists?
c = [[1,0,1],[0,0,1],[0,0,0],[0,0,1],[1,1,1]]
d = [[0,0,1],[1,0,1]]
something like np.in2d(c,d)
would yield [True, True, False, True, False]
Edit: I should add, I tried this with in1d and it flattens the 2d lists so it does not give the correct output.
I did np.in1d(c,d) and the result was [ True, True, True,
True, True, True, True, True, True, True, True, True, True, True,
True]
What about this?
[x in d for x in c]
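If the lists are rectangular and you would rather stay in NumPy, the same membership test can be sketched with broadcasting. This is one possible approach, not a built-in np.in2d (no such function is assumed to exist); it compares every row of c with every row of d.

```python
import numpy as np

c = np.array([[1, 0, 1], [0, 0, 1], [0, 0, 0], [0, 0, 1], [1, 1, 1]])
d = np.array([[0, 0, 1], [1, 0, 1]])

# c[:, None, :] has shape (5, 1, 3); comparing with d (2, 3) broadcasts to (5, 2, 3).
# all(axis=2): the whole row matches; any(axis=1): it matches at least one row of d.
matches = (c[:, None, :] == d).all(axis=2).any(axis=1)
print(matches.tolist())  # [True, True, False, True, False]
```

The intermediate array holds len(c) * len(d) * row_length booleans, so for very large inputs the plain list comprehension may be the more memory-friendly choice.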

Best practice to expand a list (efficiency) in python

I'm working with large data sets. I'm trying to use the NumPy library where I can, or Python features such as list comprehensions (LC), to process the data sets efficiently.
First I find the relevant indexes:
dt_temp_idx = np.where(dt_diff > dt_temp_th)
Then I want to create a mask containing for each index a sequence starting from the index to a stop value, I tried:
mask_dt_temp = [np.arange(idx, idx+dt_temp_step) for idx in dt_temp_idx]
and:
mask_dt_temp = [idxs for idx in dt_temp_idx for idxs in np.arange(idx, idx+dt_temp_step)]
but it gives me the exception:
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
Example input:
indexes = [0, 100, 1000]
Example output with stop values after 10 integers from each indexes:
list = [0, 1, ..., 10, 100, 101, ..., 110, 1000, 1001, ..., 1010]
1) How can I solve it?
2) Is it the best practice to do it?
Masks (boolean arrays) are both memory-efficient and fast. We will use SciPy's binary dilation to extend the thresholded mask.
Here's a step-by-step setup and solution run:
In [42]: # Random data setup
...: np.random.seed(0)
...: dt_diff = np.random.rand(20)
...: dt_temp_th = 0.9
In [43]: # Get mask of threshold crossings
...: mask = dt_diff > dt_temp_th
In [44]: mask
Out[44]:
array([False, False, False, False, False, False, False, False, True,
False, False, False, False, True, False, False, False, False,
False, False])
In [45]: W = 3 # window size for extension (edit it according to your use-case)
In [46]: from scipy.ndimage import binary_dilation
In [47]: extm = binary_dilation(mask, np.ones(W, dtype=bool), origin=-(W//2))
In [48]: mask
Out[48]:
array([False, False, False, False, False, False, False, False, True,
False, False, False, False, True, False, False, False, False,
False, False])
In [49]: extm
Out[49]:
array([False, False, False, False, False, False, False, False, True,
True, True, False, False, True, True, True, False, False,
False, False])
Compare mask against extm to see how the extension takes place.
As we can see, the thresholded mask is extended to the right by the window size W, which gives the expected output mask extm. This can be used to mask out those elements of the input array with boolean indexing: dt_diff[~extm] simulates deleting/dropping them, and conversely dt_diff[extm] selects them.
Alternatives with NumPy based functions
Alternative #1
extm = np.convolve(mask, np.ones(W, dtype=int))[:len(dt_diff)]>0
Alternative #2
idx = np.flatnonzero(mask)
ext_idx = (idx[:,None]+ np.arange(W)).ravel()
ext_mask = np.ones(len(dt_diff), dtype=bool)
ext_mask[ext_idx[ext_idx<len(dt_diff)]] = False
# Get filtered o/p
out = dt_diff[ext_mask]
dt_temp_idx as returned by np.where is a tuple holding one index array, so iterate over dt_temp_idx[0]; its elements work fine in a good old Python list comprehension:
lst = [i for j in dt_temp_idx[0] for i in range(j, j+11)]
If you want to cope with sequence overlaps and turn the result back into an np.array, build a set first and sort it:
result = np.array(sorted({i for j in dt_temp_idx[0] for i in range(j, j+11)}))
But beware: a set is robust and guarantees no repetition, but it can be more expensive than a simple list.
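For the example input in the question, the expansion can also be sketched without a Python-level loop. The snippet below assumes the indexes are already a 1-D integer array (for np.where output that would be its first element) and that each run covers 11 values, as in the 0 to 10 example. The names indexes, step and expanded are illustrative.

```python
import numpy as np

indexes = np.array([0, 100, 1000])  # example input from the question
step = 11                           # 0..10 inclusive, matching the example output

# Broadcasting: a (3, 1) column plus a (11,) row gives a (3, 11) table of indices.
expanded = (indexes[:, None] + np.arange(step)).ravel()

# np.unique sorts and drops duplicates, which handles overlapping runs.
expanded = np.unique(expanded)

print(expanded[:3], expanded[-3:])  # [0 1 2] [1008 1009 1010]
```

If the expanded indices are meant to index another array, they may still need clipping to that array's length, as the ext_idx<len(dt_diff) filter does in Alternative #2.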

boolean sat check my code

I am getting a wrong answer for my code
n is the number of variables
and formula is a list containing clauses
Given a SAT instance with 'n' variables and clauses encoded in list 'formula',
returns 'satisfiable' if the instance is satisfiable, and 'unsatisfiable'
otherwise. Each element of 'formula' represents a clause and is a list of
integers where an integer i indicates that the literal Xi is present in the
clause and an integer -i indicates that the literal ~Xi is present in the
clause. For example, a clause "X1 v ~X11 v X7" is represented with the list
[1, -11, 7].
import itertools
n = 4
formula = [[1, -2, 3], [-1, 3], [-3], [2, 3]]
booleanValues = [True,False] * n
allorderings = set(itertools.permutations(booleanValues, n)) #create possible combinations of variables that can check if formula is satisfiable or not
print(allorderings)
for potential in allorderings:
    l = [] #boolean value for each variable / different combination for each iteration
    for i in potential:
        l.append(i)
    #possible = [False]*n
    aclause = []
    for clause in formula:
        something = []
        #clause is [1,2,3]
        for item in clause:
            if item > 0:
                something.append(l[item-1])
            else:
                item = item * -1
                x = l[item-1]
                if x == True:
                    x = False
                else:
                    x = True
                something.append(x)
        counter = 0
        cal = False
        for thingsinclause in something:
            if counter == 0:
                cal = thingsinclause
                counter = counter + 1
            else:
                cal = cal and thingsinclause
                counter = counter + 1
        aclause.append(cal)
    counter2 = 0
    formcheck = False
    for checkformula in aclause:
        if counter2 == 0:
            formcheck = checkformula
            counter2 = counter2 + 1
        else:
            formcheck = formcheck or checkformula
    print("this combination works", checkformula)
Here is a corrected version:
import itertools
n = 4
formula = [[1, -2, 3], [-1, 3], [-3], [2, 3]]
allorderings = itertools.product([False, True], repeat=n)
for potential in allorderings:
    print("Initial values:", potential)
    allclauses = []
    for clause in formula:
        curclause = []
        for item in clause:
            x = potential[abs(item) - 1]
            curclause.append(x if item > 0 else not x)
        cal = any(curclause)
        allclauses.append(cal)
    print("Clauses:", allclauses)
    formcheck = all(allclauses)
    print("This combination works:", formcheck)
Points to consider:
Instead of hand-rolling the conjunction and disjunction logic, you can use any and all. That's cleaner and less prone to bugs. The original logic was also wrong: it combined the literals inside a clause with and, and the clauses with or, which is the reverse of what the clause format requires, and it printed checkformula instead of formcheck.
The natural object to loop over is itertools.product([False, True], repeat = n), that is, the Cartesian product of n copies of [False, True]. It yields each of the 2**n assignments exactly once, whereas permutations of a repeated list produce duplicates that then have to be removed with set.
I introduced a bit more output to see how things are going. Here is the output I get with Python3 (Python2 adds parentheses but prints essentially the same):
Initial values: (False, False, False, False)
Clauses: [True, True, True, False]
This combination works: False
Initial values: (False, False, False, True)
Clauses: [True, True, True, False]
This combination works: False
Initial values: (False, False, True, False)
Clauses: [True, True, False, True]
This combination works: False
Initial values: (False, False, True, True)
Clauses: [True, True, False, True]
This combination works: False
Initial values: (False, True, False, False)
Clauses: [False, True, True, True]
This combination works: False
Initial values: (False, True, False, True)
Clauses: [False, True, True, True]
This combination works: False
Initial values: (False, True, True, False)
Clauses: [True, True, False, True]
This combination works: False
Initial values: (False, True, True, True)
Clauses: [True, True, False, True]
This combination works: False
Initial values: (True, False, False, False)
Clauses: [True, False, True, False]
This combination works: False
Initial values: (True, False, False, True)
Clauses: [True, False, True, False]
This combination works: False
Initial values: (True, False, True, False)
Clauses: [True, True, False, True]
This combination works: False
Initial values: (True, False, True, True)
Clauses: [True, True, False, True]
This combination works: False
Initial values: (True, True, False, False)
Clauses: [True, False, True, True]
This combination works: False
Initial values: (True, True, False, True)
Clauses: [True, False, True, True]
This combination works: False
Initial values: (True, True, True, False)
Clauses: [True, True, False, True]
This combination works: False
Initial values: (True, True, True, True)
Clauses: [True, True, False, True]
This combination works: False
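Every combination above prints False, so this particular formula is unsatisfiable. To get the 'satisfiable' / 'unsatisfiable' result the question asks for, the same brute-force idea can be wrapped in a function. This is a sketch only: the name is_satisfiable is made up here, and it tries all 2**n assignments, so it is only practical for small n.

```python
import itertools

def is_satisfiable(n, formula):
    """Brute-force SAT check: try every assignment of n boolean variables."""
    for assignment in itertools.product([False, True], repeat=n):
        # A literal i is true when variable |i| has the value (i > 0).
        # A clause needs any true literal; the formula needs all clauses true.
        if all(
            any(assignment[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in formula
        ):
            return "satisfiable"
    return "unsatisfiable"

print(is_satisfiable(4, [[1, -2, 3], [-1, 3], [-3], [2, 3]]))  # unsatisfiable
print(is_satisfiable(2, [[1], [-2]]))                          # satisfiable
```

Returning as soon as one satisfying assignment is found avoids scanning the remaining combinations.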

How to construct logical expression for advanced slicing

I am trying to figure out a cleaner way of doing the following:
import numpy as np
a = np.array([1,2,4,5,1,4,2,1])
cut = a == (1 or 2)
print(cut)
[ True False False False True False False True]
The above is of course a simplified example. The expression (1 or 2) can be large or complicated. As a start, I would like to generalize this thusly:
cutexp = (1 or 2)
cut = a == cutexp
Maybe, cutexp can be turned into a function or something but I'm not sure where to start looking.
You could also try numpy.in1d. Say
>>> a = np.array([1,2,4,5,1,4,2,1])
>>> b = np.array([1,2]) # Your test array
>>> np.in1d(a,b)
array([ True, True, False, False, True, False, True, True], dtype=bool)
>>> (a == 2) | (a == 1)
array([ True, True, False, False, True, False, True, True], dtype=bool)
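It may help to see why the original expression only matched the 1s: Python evaluates (1 or 2) to 1 before NumPy ever sees it. The sketch below keeps the set of accepted values in a list, as the question's cutexp suggests, and tests membership with np.isin, which is the newer spelling of np.in1d (this assumes a NumPy version that provides np.isin, i.e. 1.13 or later).

```python
import numpy as np

a = np.array([1, 2, 4, 5, 1, 4, 2, 1])

# `or` returns its first truthy operand, so this is just the integer 1.
assert (1 or 2) == 1

# Keep the accepted values in a sequence and test membership element-wise.
cutexp = [1, 2]
cut = np.isin(a, cutexp)

print(cut.tolist())     # [True, True, False, False, True, False, True, True]
print(a[cut].tolist())  # [1, 2, 1, 2, 1]
```

Because cutexp is an ordinary list, it can be built up or passed around like any other value, which covers the "large or complicated" case when the condition is membership in a set of values.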
