Setting up multiple iterative statements - Python

Let me start by saying I'm rather new to Python and Stack Overflow, so please point out any mistakes I make while posting here.
I have a set of data where I am building intervals in a loop.
The data consists of three columns (0's and 1's). I would like to start a new interval any time a new 1 appears (where all three columns were 0 in the row before) and close the interval right before a row where all elements are 0 again. For example:
data = [[0. 0. 0.]
[1. 0. 0.]
[1. 1. 0.]
[1. 1. 1.]
[0. 1. 1.]
[0. 0. 1.]
[0. 0. 0.]]
should come out as one interval with:
intervals = [[[1. 0. 0.]
[1. 1. 0.]
[1. 1. 1.]
[0. 1. 1.]
[0. 0. 1.]]]
and if the pattern in the data were to repeat, or a new sequence appeared (following the same rules), it would start a new interval. As an example, if the data set had the same information repeated, intervals would become:
intervals = [[[1. 0. 0.]
[1. 1. 0.]
[1. 1. 1.]
[0. 1. 1.]
[0. 0. 1.]],
[[1. 0. 0.]
[1. 1. 0.]
[1. 1. 1.]
[0. 1. 1.]
[0. 0. 1.]] ]
I was able to achieve this behaviour for a 1-D array of values with the following code, but am now trying to extend it to an n x 3 format.
import numpy as np

A = np.array([0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1])
b = [[A[0]]]
last_char = A[0]
num_seq = 0
for i in range(1, len(A)):
    if A[i] != last_char:        # value changed: start a new run
        num_seq += 1
        if len(b) <= num_seq:
            b.append([])
    b[num_seq].append(A[i])
    last_char = A[i]

This is what ended up working. It is an extension of the last code block in my question. If anyone has anything cleaner, I'd love to see it.
# data is assumed to be a NumPy array, so each row supports .any()
if data[0].any():
    last_char = 1
    access_matrix = [[1]]
else:
    last_char = 0
    access_matrix = [[0]]
num_seq = 0
for i in range(1, len(data)):
    # collapse the row to a single 0/1 flag
    temp_var = 1 if data[i].any() else 0
    if temp_var != last_char:
        num_seq += 1
        if len(access_matrix) <= num_seq:
            access_matrix.append([])
    access_matrix[num_seq].append(temp_var)
    last_char = temp_var
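For anyone after something more compact, here is a NumPy sketch of my own (not part of the original post): collapse each row to a flag with any(axis=1), pad the flags with zeros, and use np.diff to find where each run of nonzero rows starts and ends.
import numpy as np

data = np.array([[0, 0, 0],
                 [1, 0, 0],
                 [1, 1, 0],
                 [1, 1, 1],
                 [0, 1, 1],
                 [0, 0, 1],
                 [0, 0, 0]])

# 1 where the row contains at least one nonzero entry
active = data.any(axis=1).astype(int)

# Pad with zeros so runs touching either edge are still closed.
padded = np.concatenate(([0], active, [0]))
starts = np.flatnonzero(np.diff(padded) == 1)    # first row of each run
ends = np.flatnonzero(np.diff(padded) == -1)     # one past the last row

intervals = [data[s:e] for s, e in zip(starts, ends)]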

I don't know the context of the problem behind the code, but here is code that gives the result you are looking for. It treats the data as a flat list and scans it row by row:
data = [0, 0, 0,
        1, 0, 0,
        1, 1, 0,
        1, 1, 1,
        0, 1, 1,
        0, 0, 1,
        0, 0, 0]

idx = 0
result = []
temp_list = []
while idx <= len(data) - 1:
    ele = data[idx]
    # a row starts at every third index; an all-zero row ends the interval
    if ele == 0 and idx % 3 == 0 and data[idx + 1] == 0 and data[idx + 2] == 0:
        if temp_list:
            result.append(temp_list)
            temp_list = []
        idx += 3                    # skip the whole zero row
    else:
        temp_list.append(ele)       # add ele to the current interval
        idx += 1
if temp_list:                       # flush a trailing interval, if any
    result.append(temp_list)
print(result)
The result is a list of the intervals found:
[
[
1, 0, 0,
1, 1, 0,
1, 1, 1,
0, 1, 1,
0, 0, 1
]
]
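Because the data is handled as a flat list, each interval also comes back flat; if you want rows of three again, a small regrouping step (my own addition, assuming interval lengths are multiples of 3) restores the shape:
interval_rows = [result[0][i:i + 3] for i in range(0, len(result[0]), 3)]
# [[1, 0, 0], [1, 1, 0], [1, 1, 1], [0, 1, 1], [0, 0, 1]]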

Related

cupy/numpy ignores duplicate indexes

When we use arrays as indexes, cupy/numpy ignores duplicates.
Example:
import cupy as cp
matrix = cp.zeros((3, 3))
xi = cp.asarray([0, 1, 1, 2])
yi = cp.asarray([0, 1, 1, 2])
matrix[xi, yi] += 1
print(matrix.get())
Output:
[[1. 0. 0.]
[0. 1. 0.]
[0. 0. 1.]]
Desired output:
[[1. 0. 0.]
[0. 2. 0.]
[0. 0. 1.]]
The second (1, 1) index is ignored. How can the operation be applied to duplicate indexes as well?
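No answer is attached here, but the standard fix (a sketch, hedged) is an unbuffered scatter-add: NumPy's np.add.at accumulates over duplicate indices, and CuPy provides cupyx.scatter_add for the same purpose.
import numpy as np

matrix = np.zeros((3, 3))
xi = np.asarray([0, 1, 1, 2])
yi = np.asarray([0, 1, 1, 2])

# Unbuffered equivalent of matrix[xi, yi] += 1: duplicates accumulate.
np.add.at(matrix, (xi, yi), 1)
print(matrix)
# [[1. 0. 0.]
#  [0. 2. 0.]
#  [0. 0. 1.]]

# With CuPy, the analogous call should be:
# import cupy as cp, cupyx
# cupyx.scatter_add(matrix, (xi, yi), 1)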

Looking to replace min and max of ndarray in Python

I have the following ndarray :
c_dist = [[0. 5.83095189]
[2.23606798 3.60555128]
[5.83095189 0. ]
[5.83095189 2.82842712]
[4.12310563 2.23606798]]
and I would like for each sub-array to replace the min with 1 and the max with 0, in order to obtain the following :
[[1. 0.]
[1. 0.]
[0. 1.]
[0. 1.]
[0. 1.]]
I used the following :
for i in range(len(c_dist)):
    max_of_row = c_dist[i].max()
    for elements_of_row in range(len(c_dist[i])):
        if c_dist[i][elements_of_row] == max_of_row:
            c_dist[i][elements_of_row] = 0    # the max becomes 0
        else:
            c_dist[i][elements_of_row] = 1    # the other (min) entry becomes 1
but it is obviously not very elegant.
Is there a pythonic way of doing the comparison array by array, please?
Try this in one line:
c_dist = [[0., 5.83095189],
          [2.23606798, 3.60555128],
          [5.83095189, 0.],
          [5.83095189, 2.82842712],
          [4.12310563, 2.23606798]]
new_list = [[int(i<=j), int(i>j)] for i,j in c_dist]
The result will be:
In [6]: new_list
Out[6]: [[1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]
If you have more than 2 columns (c_dist must be a NumPy array here, and put_along_axis needs per-row indices and an explicit axis):
import numpy as np

c_dist = np.asarray(c_dist)
out = c_dist.copy()
np.put_along_axis(out, c_dist.argmax(1)[:, None], 0, axis=1)   # max of each row -> 0
np.put_along_axis(out, c_dist.argmin(1)[:, None], 1, axis=1)   # min of each row -> 1
Or if there are multiple max and min values per row:
out = np.where(c_dist == c_dist.max(1, keepdims=True), 0, c_dist)   # max -> 0
out = np.where(c_dist == c_dist.min(1, keepdims=True), 1, out)      # min -> 1
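On the sample array, both variants reproduce the desired output:
print(out)
# [[1. 0.]
#  [1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [0. 1.]]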

Truncating a 2D array for a given tolerance [Python]

An old question on Singular Value Decomposition led me to ask this question:
How could I truncate a 2-Dimensional array, to a number of columns dictated by a certain tolerance?
Specifically, please consider the following code snippet, which defines an accepted tolerance of 1e-4 and applies Singular Value Decomposition to a matrix 'A'.
#Python
tol=1e-4
U,Sa,V=np.linalg.svd(A)
S=np.diag(Sa)
The resulting singular value diagonal matrix 'S' holds non-negative singular values in decreasing order of magnitude.
What I want to obtain is a truncated 'S' matrix, in a way that the columns of the matrix holding singular values lower than 1e-4 would be removed. Then, apply this truncation to the matrix 'U'.
Is there a simple way of doing this? I have been looking around, and found some solutions to the problem for Matlab, but didn't find anything similar for Python.
For Matlab, the code would look something like:
%Matlab
tol=1e-4
mask=any(Sigma>=tol,2);
sigRB=Sigma(:,mask);
mask2=any(U>=tol,2);
B=U(:,mask);
Thanks in advance. I hope my post was not too messy to understand.
I am not sure if I understand you correctly. If my solution is not what you ask for, please consider adding an example to your question.
The following code drops all columns from array s that consist only of values smaller than tol.
import numpy as np

s = np.array([
    [1, 0, 0, 0, 0, 0],
    [0, .9, 0, 0, 0, 0],
    [0, 0, .5, 0, 0, 0],
    [0, 0, 0, .4, 0, 0],
    [0, 0, 0, 0, .3, 0],
    [0, 0, 0, 0, 0, .2]
])
print(s)

tol = .4
ind = np.argwhere(s.max(axis=0) < tol)   # columns whose maximum is below tol
s = np.delete(s, ind, 1)                 # drop those columns
print(s)
Output:
[[1. 0. 0. 0. 0. 0. ]
[0. 0.9 0. 0. 0. 0. ]
[0. 0. 0.5 0. 0. 0. ]
[0. 0. 0. 0.4 0. 0. ]
[0. 0. 0. 0. 0.3 0. ]
[0. 0. 0. 0. 0. 0.2]]
[[1. 0. 0. 0. ]
[0. 0.9 0. 0. ]
[0. 0. 0.5 0. ]
[0. 0. 0. 0.4]
[0. 0. 0. 0. ]
[0. 0. 0. 0. ]]
I am applying max along axis 0 (the maximum of each column) and then using np.argwhere to get the indices of the columns where that maximum is smaller than tol.
Edit: In order to truncate the columns of matrix 'U' so that it coincides in size with the reduced matrix 's', the following code works:
k = len(s[0])               # number of columns kept in the reduced matrix
Ured = U[:, 0:k]
Uredsize = np.shape(Ured)   # to check it has worked
print(Uredsize)
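For the SVD use case specifically, a slightly more direct route (my own sketch, not part of the original answer) is to mask the 1-D singular-value array before building the diagonal matrix; since np.linalg.svd returns the singular values sorted in decreasing order, the mask keeps exactly the leading columns:
import numpy as np

A = np.random.rand(6, 4)     # any example matrix
tol = 1e-4

U, Sa, V = np.linalg.svd(A)
mask = Sa >= tol             # singular values worth keeping

S_trunc = np.diag(Sa[mask])  # reduced diagonal matrix
U_trunc = U[:, :mask.sum()]  # matching leading columns of U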

how to generate a modified version of identity matrix in python

I want to generate a modified version of the identity matrix, call it C, such that Cii is zero up to some index i, and the rest of the diagonal is still 1.
I can use brute force to set each Cii to 0, but I think that is not good.
Are there any efficient functions I can use? This is hard to search for.
Example below:
the original identity matrix for 3 * 3 is
1 0 0
0 1 0
0 0 1
I want to change this into:
0 0 0
0 1 0
0 0 1
so i is 0 in this case; I want to set Ckk to 0 for k going over [0, i].
np.diag makes a 2d array from a 1d diagonal:
In [97]: np.diag((np.arange(6)>2).astype(int))
Out[97]:
array([[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0],
[0, 0, 0, 0, 1, 0],
[0, 0, 0, 0, 0, 1]])
basically the same as PPanzer's, but generating the diagonal a different way. Similar speed.
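Parametrized over the question's i (a small wrapper of my own around the same expression):
import numpy as np

def truncated_identity(n, i):
    # identity matrix with the diagonal zeroed for indices 0..i
    return np.diag((np.arange(n) > i).astype(int))

print(truncated_identity(3, 0))
# [[0 0 0]
#  [0 1 0]
#  [0 0 1]]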
Here is one possibility:
N = 5
k = 2
np.diag(np.bincount([k],None,N).cumsum())
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 1, 0, 0],
[0, 0, 0, 1, 0],
[0, 0, 0, 0, 1]])
Update: fast solution:
out = np.zeros((N,N))
out.reshape(-1)[(N+1)*k::N+1] = 1
You can build an NxN identity matrix and assign zero to the top left KxK corner:
N,K = 10,3
im = np.identity(N)
im[:K,:K] = 0
print(im)
output:
[[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 1. 0. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 1. 0.]
[0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]]
40% faster than hpaulj's, but not as fast as Paul Panzer's fast solution (which is 3x faster than this)

Add 2-d array to 3-d array with constantly changing index fast

I'm trying to add a 2-d array to a 3-d array with a constantly changing index. I came up with the following code:
import numpy as np
a = np.zeros([8, 3, 5])
k = 0
for i in range(2):
    for j in range(4):
        a[k, i: i + 2, j: j + 2] += np.ones([2, 2], dtype=int)
        k += 1
print(a)
which will give exactly what i want:
[[[1. 1. 0. 0. 0.]
[1. 1. 0. 0. 0.]
[0. 0. 0. 0. 0.]]
[[0. 1. 1. 0. 0.]
[0. 1. 1. 0. 0.]
[0. 0. 0. 0. 0.]]
[[0. 0. 1. 1. 0.]
[0. 0. 1. 1. 0.]
[0. 0. 0. 0. 0.]]
[[0. 0. 0. 1. 1.]
[0. 0. 0. 1. 1.]
[0. 0. 0. 0. 0.]]
[[0. 0. 0. 0. 0.]
[1. 1. 0. 0. 0.]
[1. 1. 0. 0. 0.]]
[[0. 0. 0. 0. 0.]
[0. 1. 1. 0. 0.]
[0. 1. 1. 0. 0.]]
[[0. 0. 0. 0. 0.]
[0. 0. 1. 1. 0.]
[0. 0. 1. 1. 0.]]
[[0. 0. 0. 0. 0.]
[0. 0. 0. 1. 1.]
[0. 0. 0. 1. 1.]]]
I wish it could be faster, so I created an array for the indices and tried np.vectorize. But as the manual describes, vectorize is not for performance. My goal is to run through an array with a shape of (10^6, 15, 15), which ends up being 10^6 iterations. I hope there is some cleaner solution that can get rid of all the for-loops.
This is the first time I am using Stack Overflow; any suggestions are appreciated.
Thank you.
An efficient solution uses numpy.lib.stride_tricks, which can "view" all the possibilities.
import numpy as np
from numpy.lib.stride_tricks import as_strided

N = 4  # tray size (square)
P = 3  # chunk size
R = N - P

tray = np.zeros((N, N), np.int32)
chunk = np.ones((P, P), np.int32)
tray[R:, R:] = chunk              # chunk in the bottom-right corner
tray = np.vstack((tray, tray))    # double the tray so shifted windows stay in bounds

# Strides are in bytes: one row is itemsize*N bytes, one column is itemsize bytes.
sz = tray.itemsize
view = as_strided(tray, shape=(R + 1, R + 1, N, N),
                  strides=(sz * N, sz, sz * N, sz))
a_view = view.reshape(-1, N, N)
a_hard = a_view.copy()
Here is the result :
In [3]: a_view
Out[3]:
array([[[0, 0, 0, 0],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 1, 1, 1]],
[[0, 0, 0, 0],
[1, 1, 1, 0],
[1, 1, 1, 0],
[1, 1, 1, 0]],
[[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 1, 1, 1],
[0, 0, 0, 0]],
[[1, 1, 1, 0],
[1, 1, 1, 0],
[1, 1, 1, 0],
[0, 0, 0, 0]]])
a_view is just a view on the possible positions of a chunk on the tray. Building the strided view costs no computation and uses only twice the tray space (the final reshape of the overlapping view, however, may materialize a copy).
a_hard is a hard copy, necessary if you need to modify it.
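For the original (8, 3, 5) problem, a loop-free construction with plain broadcasting (my own sketch, not from the original answer) also works: build row and column coverage masks for every offset and outer-combine them.
import numpy as np

H, W = 3, 5          # tray shape from the question
h, w = 2, 2          # chunk shape

rows = np.arange(H)
cols = np.arange(W)
i = np.arange(H - h + 1)[:, None]   # row offsets, shape (2, 1)
j = np.arange(W - w + 1)[:, None]   # column offsets, shape (4, 1)

# Mark, for each offset, which rows/columns the chunk covers.
row_mask = (rows >= i) & (rows < i + h)      # (2, 3)
col_mask = (cols >= j) & (cols < j + w)      # (4, 5)

# Outer-combine the masks and flatten the offset axes: (2, 4, 3, 5) -> (8, 3, 5),
# matching the k = i*4 + j ordering of the question's loop.
a = (row_mask[:, None, :, None] & col_mask[None, :, None, :]).astype(float)
a = a.reshape(-1, H, W)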
