Creating subarrays with the number of subarrays passed as an argument in Python - python

I have a large 100x15 array like this:
[a b c d e f g h i j k l m n o]
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15]
.
.
.(Up to 100 rows)
I want to split this data into subsets using a function with an argument 'k' that determines how the subsets are made. For example, k=5 means the 15 data attributes are divided into 3 subsets of 5 columns each, like below:
[a b c d e] [f g h i j] [k l m n o]
[1 2 3 4 5] [6 7 8 9 10] [11 12 13 14 15]
[1 2 3 4 5] [6 7 8 9 10] [11 12 13 14 15]
[1 2 3 4 5] [6 7 8 9 10] [11 12 13 14 15]
[1 2 3 4 5] [6 7 8 9 10] [11 12 13 14 15]
.
.
.(Up to 100 rows)
and they are stored in a different array. I want to implement this in Python, and I have implemented it only partially. Can anyone implement this and provide the code in an answer?
Partial logic for the inner loop
given k
set start_index = 0
end_index = length of array/k = increment
for j from start_index to end_index
start_index=end_index + 1
end_index = end_index + increment
//newarray[][] (I'm not sure about this part)
Thank You.

This returns an array of matrices with columnsize = 2 , which works for k=2:
import numpy as np
def portion(mtx, k):
    array = []
    array.append( mtx[:, :k])
    for i in range(1, mtx.shape[1]-1):
        array.append( mtx[:, k*i:k*(i+1)])
    return array[:k+1]
mtx = np.matrix([[1,2,3,10,13,14], [4,5,6,11,15,16], [7,8,9,12,17,18]])
k = 2
print(portion(mtx, k))
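For the general case, NumPy's np.array_split can cut the columns into a chosen number of groups. The following is only a minimal sketch, assuming k is meant as the number of subsets; the helper name split_columns is hypothetical and not taken from the question:

```python
import numpy as np

def split_columns(mtx, n_subsets):
    """Split the columns of a 2-D array into n_subsets groups (hypothetical helper)."""
    # array_split tolerates widths that do not divide evenly:
    # the leading groups simply receive one extra column.
    return np.array_split(mtx, n_subsets, axis=1)

data = np.arange(1, 31).reshape(2, 15)   # 2 rows x 15 columns of sample data
parts = split_columns(data, 3)
for part in parts:
    print(part.shape)
```

If k is instead the width of each subset, as in the 15-column example of the question, the number of subsets would be mtx.shape[1] // k.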

Unfortunately I had to do it myself; this is the Python code for the logic. Anyway, thanks to #astaning for the attempt.
import numpy as np
def build_rotationtree_model(k):
    mtx = np.array([[2.95,6,63,23],[2,53,7,79],[3.57,5,65,32],[3.16,5,47,34],[21,2.58,4,46],[3.1,2.16,6,22],[3.5,3.27,3,52],[12,2.56,4,42]])
    # Number of attributes (width of matrix)
    a = mtx.shape[1]
    # Height of matrix (total rows)
    b = mtx.shape[0]
    # Separation limit: columns per sub-matrix (assumes 1 <= k <= a and that k divides a evenly)
    limit = a // k
    # Start of sub-matrix
    start = 0
    # End of sub-matrix (exclusive)
    end = limit
    print(end)
    print(a)
    # Loop over the k sub-matrices
    while end <= a:
        # One row per matrix row, one column per attribute in this block
        newArray = [[0 for x in range(limit)] for y in range(b)]
        for i in range(b):
            for j in range(start, end):
                newArray[i][j - start] = mtx[i][j]
            print(newArray[i])
        # Call LDA function and add the result to Sparse Matrix
        # sparseMat = LDA(newArray)  # should be inside this loop
        start = end
        end = end + limit

a=list(input())
for i in range(0,len(a)):
    for j in range(i,len(a)):
        for k in range(i,j+1):
            print(a[k],end=" ")
        print("\n",end="")

Related

Find local maxima or peaks (index) in a numeric series using numpy and pandas. A peak is a value surrounded by smaller values on both sides

Write a Python program to find all the local maxima or peaks (indices) in a numeric series using numpy and pandas. A peak is a value surrounded by smaller values on both sides.
Note
Create a pandas Series from the given input.
Input format:
The first line of the input consists of a list of integers separated by spaces, used to form the pandas Series.
Output format:
The output displays the array of indices where peak values are present.
Sample testcase
input1
12 1 2 1 9 10 2 5 7 8 9 -9 10 5 15
output1
[2 5 10 12]
How to solve this problem?
import pandas as pd
a = "12 1 2 1 9 10 2 5 7 8 9 -9 10 5 15"
a = [int(x) for x in a.split(" ")]
angles = []
for i in range(len(a)):
    if i!=0:
        if a[i]>a[i-1]:
            angles.append('rise')
        else:
            angles.append('fall')
    else:
        angles.append('ignore')
prev=0
prev_val = "none"
counts = []
for s in angles:
    if s=="fall" and prev_val=="rise":
        prev_val = s
        counts.append(1)
    else:
        prev_val = s
        counts.append(0)
peaks_pd = pd.Series(counts).shift(-1).fillna(0).astype(int)
df = pd.DataFrame({
    'a':a,
    'peaks':peaks_pd
})
peak_vals = list(df[df['peaks']==1]['a'].index)
This could be improved further. The steps I followed:
First, determine for each value whether the series is rising or falling relative to the previous value.
Then find each index at which the series starts falling after rising; the value just before that index is a peak.
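The rise/fall steps above can also be expressed without explicit loops. This is only a sketch, assuming strict comparisons (so plateaus and the two end points are never reported); the function name find_peaks_by_diff is hypothetical:

```python
import numpy as np
import pandas as pd

def find_peaks_by_diff(values):
    """Indices where the series rises and then immediately falls (hypothetical helper)."""
    s = pd.Series(values)
    d = np.diff(s.to_numpy())      # d[i] = s[i+1] - s[i]
    rises = d[:-1] > 0             # step into position i+1 goes up
    falls = d[1:] < 0              # step out of position i+1 goes down
    return np.flatnonzero(rises & falls) + 1

sample = [12, 1, 2, 1, 9, 10, 2, 5, 7, 8, 9, -9, 10, 5, 15]
print(find_peaks_by_diff(sample))
```

For the sample input from the question this gives the indices 2, 5, 10 and 12.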
Use:
import numpy as np
import pandas as pd
from scipy.signal import argrelextrema
data = [12, 1, 2, 1.1, 9, 10, 2.1, 5, 7, 8, 9.1, -9, 10.1, 5.1, 15]
s = pd.Series(data)
n = 3 # number of points to be checked before and after
local_max_index = argrelextrema(s.to_frame().to_numpy(), np.greater_equal, order=n)[0].tolist()
print (local_max_index)
[0, 5, 14]
local_max_index = s.index[(s.shift() <= s) & (s.shift(-1) <= s)].tolist()
print (local_max_index)
[2, 5, 10, 12]
local_max_index = s.index[s == s.rolling(n, center=True).max()].tolist()
print (local_max_index)
[2, 5, 10, 12]
EDIT: Solution for processing value in DataFrame:
df = pd.DataFrame({'Input': ["12 1 2 1 9 10 2 5 7 8 9 -9 10 5 15"]})
print (df)
Input
0 12 1 2 1 9 10 2 5 7 8 9 -9 10 5 15
s = df['Input'].iloc[[0]].str.split().explode().astype(int).reset_index(drop=True)
print (s)
0 12
1 1
2 2
3 1
4 9
5 10
6 2
7 5
8 7
9 8
10 9
11 -9
12 10
13 5
14 15
Name: Input, dtype: int32
local_max_index = s.index[(s.shift() <= s) & (s.shift(-1) <= s)].tolist()
print (local_max_index)
[2, 5, 10, 12]
df['output'] = [local_max_index]
print (df)
Input output
0 12 1 2 1 9 10 2 5 7 8 9 -9 10 5 15 [2, 5, 10, 12]

Create dynamic nested for loops

I have some arrays with m rows and 2 columns (like a series of coordinates), and I want to automate my code so that I do not need a nested loop for every coordinate array. Here is my code; it runs well and gives the right coordinates, but I want to make the loop dynamic:
import numpy as np
A = np.array([[1,5,7,4,6,2,2,6,7,2],[2,8,2,9,3,9,8,5,6,2],[3,4,0,2,4,3,0,2,6,7],\
[1,5,7,3,4,5,2,7,9,7],[6,2,8,8,6,7,9,6,9,7],[0,2,0,3,3,5,2,3,5,5],[5,5,5,0,6,6,8,5,9,0]\
,[0,5,7,6,0,6,9,9,6,7],[5,5,8,5,0,8,5,3,5,5],[0,0,6,3,3,3,9,5,9,9]])
number = 8292
number = np.asarray([int(i) for i in str(number)]) #split number into array
#the coordinates of every single value contained in required number
coord1=np.asarray(np.where(A == number[0])).T
coord2=np.asarray(np.where(A == number[1])).T
coord3=np.asarray(np.where(A == number[2])).T
coord4=np.asarray(np.where(A == number[3])).T
coordinates = np.array([[0,0]]) #initialize the array that will return all the desired coordinates
solutions = 0 #initialize the counter that will give the number of solutions
for j in coord1:
    j = j.reshape(1, -1)
    for i in coord2 :
        i=i.reshape(1, -1)
        if (i[0,0]==j[0,0]+1 and i[0,1]==j[0,1]) or (i[0,0]==j[0,0]-1 and i[0,1]==j[0,1]) or (i[0,0]==j[0,0] and i[0,1]==j[0,1]+1) or (i[0,0]==j[0,0] and i[0,1]==j[0,1]-1) :
            for ii in coord3 :
                ii=ii.reshape(1, -1)
                if (np.array_equal(ii,j)==0 and ii[0,0]==i[0,0]+1 and ii[0,1]==i[0,1]) or (np.array_equal(ii,j)==0 and ii[0,0]==i[0,0]-1 and ii[0,1]==i[0,1]) or (np.array_equal(ii,j)==0 and ii[0,0]==i[0,0] and ii[0,1]==i[0,1]+1) or (np.array_equal(ii,j)==0 and ii[0,0]==i[0,0] and ii[0,1]==i[0,1]-1) :
                    for iii in coord4 :
                        iii=iii.reshape(1, -1)
                        if (np.array_equal(iii,i)==0 and iii[0,0]==ii[0,0]+1 and iii[0,1]==ii[0,1]) or (np.array_equal(iii,i)==0 and iii[0,0]==ii[0,0]-1 and iii[0,1]==ii[0,1]) or (np.array_equal(iii,i)==0 and iii[0,0]==ii[0,0] and iii[0,1]==ii[0,1]+1) or (np.array_equal(iii,i)==0 and iii[0,0]==ii[0,0] and iii[0,1]==ii[0,1]-1) :
                            point = np.concatenate((j,i,ii,iii))
                            coordinates = np.append(coordinates,point,axis=0)
                            solutions +=1
coordinates = np.delete(coordinates, (0), axis=0)
import itertools
A = [1, 2, 3]
B = [4, 5, 6]
C = [7, 8, 9]
for (a, b, c) in itertools.product(A, B, C):
    print(a, b, c)
outputs:
1 4 7
1 4 8
1 4 9
1 5 7
1 5 8
1 5 9
1 6 7
1 6 8
1 6 9
2 4 7
2 4 8
2 4 9
2 5 7
2 5 8
2 5 9
2 6 7
2 6 8
2 6 9
3 4 7
3 4 8
3 4 9
3 5 7
3 5 8
3 5 9
3 6 7
3 6 8
3 6 9
See the itertools.product documentation for details.
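itertools.product enumerates every combination, so the adjacency checks still have to be applied afterwards. For the grid problem in the question, a recursive search may be a more direct way to get "dynamic" nesting, because the depth follows the number of digits. The sketch below is only an illustration under stated assumptions: the name find_paths is hypothetical, neighbours are the four horizontally or vertically adjacent cells, and a cell is never reused within one path (for four digits this matches the question's checks).

```python
import numpy as np

def find_paths(grid, number):
    """Return every path of 4-neighbour cells spelling the digits of number."""
    digits = [int(ch) for ch in str(number)]
    rows, cols = grid.shape
    paths = []

    def extend(path):
        # A path is complete once it holds one cell per digit.
        if len(path) == len(digits):
            paths.append(list(path))
            return
        r, c = path[-1]
        target = digits[len(path)]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and grid[nr, nc] == target and (nr, nc) not in path:
                path.append((nr, nc))
                extend(path)          # one recursion level per digit
                path.pop()

    for r, c in np.argwhere(grid == digits[0]):
        extend([(int(r), int(c))])
    return paths

small = np.array([[8, 2],
                  [0, 9]])
print(find_paths(small, 829))
```

Each result is a list of (row, column) tuples, and len(find_paths(...)) plays the role of the solutions counter.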

Is there a faster way than the for loop to label matrix(3D array) in python?

I wrote code for labeling a matrix (3D array) in Python.
The concept of the code is:
check every 2 by 2 by 2 sub-matrix in the 3D array (whatever size I want)
if the sub-matrix has 1, 2, and 3 among its elements, all elements in that sub-matrix are changed to "max unique number + 1" of the matrix.
import numpy as np
def label_A(input_field):
    labeling_A = np.copy(input_field)
    labeling_test = np.zeros((input_field.shape))
    for i in range(0,input_field.shape[0]-1):
        for j in range(0,input_field.shape[1]-1):
            for k in range(0,input_field.shape[2]-1):
                test_unit = input_field[i:i+2,j:j+2,k:k+2]
                if set(np.unique(test_unit).astype(int)) >= set((1,2,3)):
                    labeling_test[i:i+2,j:j+2,k:k+2] = np.max(input_field)+1
                    labeling_A[labeling_test == np.max(input_field)+1] = np.max(input_field)+1
    return labeling_A
This is a simple example code in matrix in 3D.
example = np.random.randint(0, 10, size=(10, 10, 10))
label_example = label_A(example)
label_example
In my view, the code itself has no problem and it actually works. However, I am curious whether there is a faster way to do the same thing.
This implementation returns the suggested result and handles a (140,140,140) sized tensor in 1.8 seconds.
import numpy as np
from scipy.signal import convolve
def strange_convolve(mat, f_shape, _value_set, replace_value):
    _filter =np.ones(tuple(s*2-1 for s in f_shape))
    replace_mat = np.ones(mat.shape)
    for value in _value_set:
        value_counts = convolve((mat==value),_filter,mode='same')
        replace_mat*=(value_counts>0)
    mat[replace_mat==1]=replace_value
    return mat
example = np.random.randint(0, 8, size=(10, 10, 10))
print('same_output validation is '+str((strange_convolve(example,(2,2,2),(1,2,3),4) == label_A(example)).min()))
import time
example = np.random.randint(0, 10, size=(140, 140, 140))
timer = time.time()
strange_convolve(example,(2,2,2),(1,2,3),4)
print(time.time()-timer)
1.8871610164642334
First, you have a couple of issues with your code that can be easily resolved and sped up.
For every loop, you are recalculating np.max(input_field)+1 three times.
The larger your matrix becomes, the more noticeable the impact. Note the difference between tests A and B.
I tried running tests with the convolve example above, and while it was fast, the results were never the same as the other test (which in the setup below should have been identical). I believe it's looking for 1, 2, or 3 in a 3x3x3 block.
Label A with size of 10 --- 0:00.015628
Label B with size of 10 --- 0:00.015621
Label F with size of 10 --- 0:00.015628
Label A with size of 50 --- 0:15.984662
Label B with size of 50 --- 0:10.093478
Label F with size of 50 --- 0:02.265621
Label A with size of 80 --- 4:02.564660
Label B with size of 80 --- 2:29.439298
Label F with size of 80 --- 0:09.437868
------ Edited ------
The convolve method is definitely faster, though I believe there is some issue with the code as given by Peter.
Label A with size of 10 : 00.013985
[[ 2 10 10 10 10 4 9 0 8 7]
[ 9 10 10 10 10 0 9 8 5 9]
[ 3 8 4 0 9 4 2 8 7 1]
[ 4 7 6 10 10 4 8 8 5 4]]
Label B with size of 10 : 00.014002
[[ 2 10 10 10 10 4 9 0 8 7]
[ 9 10 10 10 10 0 9 8 5 9]
[ 3 8 4 0 9 4 2 8 7 1]
[ 4 7 6 10 10 4 8 8 5 4]]
Label Flat with size of 10 : 00.020001
[[ 2 10 10 10 10 4 9 0 8 7]
[ 9 10 10 10 10 0 9 8 5 9]
[ 3 8 4 0 9 4 2 8 7 1]
[ 4 7 6 10 10 4 8 8 5 4]]
Label Convolve with size of 10 : 00.083996
[[ 2 2 10 8 4 10 9 0 8 7]
[ 9 10 0 4 7 10 9 10 10 9]
[ 3 8 4 0 9 4 2 10 7 10]
[ 4 7 10 5 0 4 8 10 5 4]]
The OP wanted all elements of the 2x2x2 matrix set to the higher value.
Note that convolve in its present setup sets some isolated single elements, rather than following the 2x2x2 block pattern.
Below is my code:
import numpy as np
from scipy.signal import convolve
# pandas.datetime is no longer available in recent pandas; use the standard library instead
from datetime import datetime as dt
def label_A(input_field):
    labeling_A = np.copy(input_field)
    labeling_test = np.zeros((input_field.shape))
    for i in range(0,input_field.shape[0]-1):
        for j in range(0,input_field.shape[1]-1):
            for k in range(0,input_field.shape[2]-1):
                test_unit = input_field[i:i+2,j:j+2,k:k+2]
                if set(np.unique(test_unit).astype(int)) >= set((1,2,3)):
                    labeling_test[i:i+2,j:j+2,k:k+2] = np.max(input_field)+1
                    labeling_A[labeling_test == np.max(input_field)+1] = np.max(input_field)+1
    return labeling_A
def label_B(input_field):
    labeling_B = np.copy(input_field)
    labeling_test = np.zeros((input_field.shape))
    # Compute the replacement value once instead of on every iteration
    input_max = np.max(input_field)+1
    for i in range(0,input_field.shape[0]-1):
        for j in range(0,input_field.shape[1]-1):
            for k in range(0,input_field.shape[2]-1):
                test_unit = input_field[i:i+2,j:j+2,k:k+2]
                if set(np.unique(test_unit).astype(int)) >= set((1,2,3)):
                    labeling_test[i:i+2,j:j+2,k:k+2] = input_max
                    labeling_B[labeling_test == input_max] = input_max
    return labeling_B
def label_Convolve(input_field):
    _filter =np.ones([2,2,2])
    replace_mat = np.ones(input_field.shape)
    input_max = np.max(input_field)+1
    for value in (1,2,3):
        value_counts = convolve((input_field==value),_filter,mode='same')
        replace_mat*=(value_counts>0)
    # Note: this modifies input_field in place
    input_field[replace_mat==1] = input_max
    return input_field
def flat_mat(matrix):
    flat = matrix.flatten()
    dest_mat = np.copy(flat)
    mat_width = matrix.shape[0]
    mat_length = matrix.shape[1]
    mat_depth = matrix.shape[2]
    input_max = np.max(matrix)+1
    block = 0
    for w in range(mat_width*(mat_length)*(mat_depth-1)):
        if (w+1)%mat_width != 0:
            if (block+1)%mat_length == 0:
                pass
            else:
                set1 = flat[w:w+2]
                set2 = flat[w+mat_width:w+2+mat_width]
                set3 = flat[w+(mat_width*mat_length):w+(mat_width*mat_length)+2]
                set4 = flat[w+(mat_width*mat_length)+mat_width:w+(mat_width*mat_length)+mat_width+2]
                fullblock = np.array([set1, set2, set3, set4])
                blockset = np.unique(fullblock)
                if set(blockset) >= set((1,2,3)):
                    dest_mat[w:w+2] = input_max
                    dest_mat[w+mat_width:w+2+mat_width] = input_max
                    dest_mat[w+(mat_width*mat_length):w+(mat_width*mat_length)+2] = input_max
                    dest_mat[w+(mat_width*mat_length)+mat_width:w+(mat_width*mat_length)+mat_width+2] = input_max
        else:
            block += 1
    return_mat = dest_mat.reshape(mat_width, mat_length, mat_depth)
    return(return_mat)
def speedtest(matrix,matrixsize):
    starttime = dt.now()
    label_A_example = label_A(matrix)
    print(f'Label A with size of {matrixsize} : {dt.now() - starttime}')
    print(label_A_example[0][0:4], '\n')
    starttime = dt.now()
    label_B_example = label_B(matrix)
    print(f'Label B with size of {matrixsize} : {dt.now() - starttime}')
    print(label_B_example[0][0:4], '\n')
    starttime = dt.now()
    label_Inline_example = flat_mat(matrix)
    print(f'Label Flat with size of {matrixsize} : {dt.now() - starttime}')
    print(label_Inline_example[0][0:4], '\n')
    starttime = dt.now()
    label_Convolve_example = label_Convolve(matrix)
    print(f'Label Convolve with size of {matrixsize} : {dt.now() - starttime}')
    print(label_Convolve_example[0][0:4], '\n')
tests = 1 #each test will boost matrix size by 10
matrixsize = 10
for i in range(tests):
    example = np.random.randint(0, 10, size=(matrixsize, matrixsize, matrixsize))
    speedtest(example,matrixsize)
    matrixsize += 10
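The 2x2x2 rule from the question can also be vectorized with sliding windows instead of convolution. The following is a sketch rather than a benchmarked replacement: the name label_windows is hypothetical, it assumes NumPy 1.20 or newer (for sliding_window_view) and an input with at least two cells along every axis.

```python
import itertools
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def label_windows(field, values=(1, 2, 3), block=(2, 2, 2)):
    """Vectorized sketch of the 2x2x2 labeling rule (hypothetical helper)."""
    # hit[i, j, k] is True when the block starting at (i, j, k)
    # contains every value in `values`.
    hit = None
    for v in values:
        has_v = sliding_window_view(field == v, block).any(axis=(-3, -2, -1))
        hit = has_v if hit is None else (hit & has_v)
    # Spread each hit back over all cells of its block.
    mask = np.zeros(field.shape, dtype=bool)
    h0, h1, h2 = hit.shape
    for di, dj, dk in itertools.product(*(range(b) for b in block)):
        mask[di:di + h0, dj:dj + h1, dk:dk + h2] |= hit
    out = field.copy()
    out[mask] = field.max() + 1
    return out

example = np.random.default_rng(0).integers(0, 10, size=(10, 10, 10))
labeled = label_windows(example)
print(labeled.shape)
```

Unlike label_Convolve, this sketch leaves the input untouched and marks only complete 2x2x2 blocks, which is the behaviour the original loop has.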

Printing all solutions in the shape of a matrix using \n\

This function prints all possible products of the numbers from 1 to d. I want to print the results in the shape of a d×d matrix.
def example(d):
    for i in range(1,d+1):
        for l in range(1,d+1):
            print(i*l)
For d = 5, the expected output should look like:
1 2 3 4 5
2 4 6 8 10
3 6 9 12 15
4 8 12 16 20
5 10 15 20 25
You could add the values in your second for loop to a list, join the list, and finally print it.
def mul(d):
    for i in range(1, d+1):
        list_to_print = []
        for l in range(1, d+1):
            list_to_print.append(str(l*i))
        print(" ".join(list_to_print))
>>> mul(5)
1 2 3 4 5
2 4 6 8 10
3 6 9 12 15
4 8 12 16 20
5 10 15 20 25
If you want it to be printed in aligned rows and columns, have a read at Pretty print 2D Python list.
EDIT
The above example will work for both Python 3 and Python 2. However, for Python 3 (as #richard has put in the comments), you can use:
def mul(d):
    for i in range(1, d+1):
        for l in range(1, d+1):
            print(i*l, end=" ")
        print()
>>> mul(5)
1 2 3 4 5
2 4 6 8 10
3 6 9 12 15
4 8 12 16 20
5 10 15 20 25
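For aligned columns, one option is to right-justify every entry to the width of the largest product. A minimal Python 3 sketch (the name format_table is hypothetical, and building a string instead of printing directly is a choice made here for easier reuse):

```python
def format_table(d):
    """Return the d x d multiplication table with right-aligned columns."""
    width = len(str(d * d))            # the largest entry is d*d
    lines = []
    for i in range(1, d + 1):
        cells = [f"{i * j:>{width}}" for j in range(1, d + 1)]
        lines.append(" ".join(cells))
    return "\n".join(lines)

print(format_table(5))
```

Every row then has the same length, so the columns line up even when the products have different numbers of digits.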
Try this:
mm = []
ll = []
def mul(d):
    for i in range(1,d+1):
        ll = []
        for l in range(1,d+1):
            # print(i*l),
            ll.append((i*l))
        mm.append(ll)
mul(5)
for x in mm:
    print(x)
[1, 2, 3, 4, 5]
[2, 4, 6, 8, 10]
[3, 6, 9, 12, 15]
[4, 8, 12, 16, 20]
[5, 10, 15, 20, 25]

Creating the node-edge triangle adjacency graph in Python/R

How can I write an R/Python program which creates a node-edge adjacency matrix, in which rows denote nodes and columns denote edges, and an entry is one if the edge is part of a triangle and the node is part of the same triangle? I am most interested in using igraph or linkcomm for this purpose, but wouldn't mind seeing a different package or program.
I know I can use maximal.cliques(g) for locating the triangles, but I am not sure how to use this data to create the node-edge triangle adjacency matrix.
> g <- erdos.renyi.game(15, 45, type="gnm", dir=TRUE)
> triad.census(g)
[1] 113 168 38 16 13 49 23 17 7 2
[11] 2 1 2 2 2 0
> str(g)
IGRAPH D--- 15 45 -- Erdos renyi (gnm) graph
+ attr: name (g/c), type (g/c), loops
(g/x), m (g/n)
+ edges:
1 -> 3 4 6 12 13 2 -> 1 3 7
3 -> 2 5 10 15 4 -> 5 12 14
5 -> 6 7 9 6 -> 4 8 12
7 -> 5 9 12 8 -> 2 7 15
9 -> 1 4 11 13 10 -> 4 5 8
11 -> 1 2 9 12 -> 1 4 14 15
13 -> 15 14 -> 11 12
15 -> 3
> maximal.cliques(g)
[[1]]
[1] 13 15
[[2]]
[1] 13 1 9
[[3]]
[1] 2 8 7
[[4]]
[1] 2 1 3
[[5]]
[1] 2 1 11
[[6]]
[1] 3 5 10
[[7]]
[1] 3 15
[[8]]
[1] 4 14 12
[[9]]
[1] 4 10 5
[[10]]
[1] 4 5 6
[[11]]
[1] 4 5 9
[[12]]
[1] 4 1 9
[[13]]
[1] 4 1 12 6
[[14]]
[1] 5 7 9
[[15]]
[1] 6 8
[[16]]
[1] 7 12
[[17]]
[1] 8 15
[[18]]
[1] 8 10
[[19]]
[1] 9 1 11
[[20]]
[1] 11 14
[[21]]
[1] 12 15
Warning message:
In maximal.cliques(g) :
At maximal_cliques_template.h:203 :Edge directions are ignored for maximal clique calculation
Following Vincent's answer, when I use the code below I am unsure whether it finds cliques of exactly size 3 or cliques of size 3 and greater (I just need the triangles). Another problem is that this code is super slow. Any idea how to speed it up?
library(igraph)
set.seed(1)
g <- erdos.renyi.game(100, .6)
#print(g)
plot(g)
ij <- get.edgelist(g)
print(ij)
library(Matrix)
m <- sparseMatrix(
i = rep(seq(nrow(ij)), each=2),
j = as.vector(t(ij)),
x = 1
)
print(m)
# Maximal cliques of size at least 3
cl <- maximal.cliques(g)
print(cl)
cl <- cl[ sapply(cl, length) > 2 ]
print(cl)
# Function to test if an edge is part of a triangle
triangle <- function(e) {
any( sapply( cl, function(u) all( e %in% u ) ) )
}
print(triangle)
# Only keep those edges
kl <- ij[ apply(ij, 1, triangle), ]
print(kl)
# Same code as before
m <- sparseMatrix(
i = rep(seq(nrow(kl)), each=2),
j = as.vector(t(kl)),
x = 1
)
print(m)
Also, for some reason the function cocluster tells me that the output m is not a matrix. Any idea what I should do to use the sparse matrix m in the cocluster function?
>library("blockcluster")
> out<-cocluster(m,datatype="binary",nbcocluster=c(2,3))
Error in cocluster(m, datatype = "binary", nbcocluster = c(2, 3)) :
Data should be matrix.
The following gives you an edge/vertex adjacency matrix,
but for all edges, not just those included in triangles.
library(igraph)
set.seed(1)
g <- erdos.renyi.game(6, .6)
plot(g)
ij <- get.edgelist(g)
library(Matrix)
m <- sparseMatrix(
i = rep(seq(nrow(ij)), each=2),
j = as.vector(t(ij)),
x = 1
)
As you suggest, you can use maximal.cliques
to identify the edges that are part of a triangle
(equivalently, that are part of a maximal
clique of size at least 3).
# Maximal cliques of size at least 3
cl <- maximal.cliques(g)
cl <- cl[ sapply(cl, length) > 2 ]
# Function to test if an edge is part of a triangle
triangle <- function(e) {
any( sapply( cl, function(u) all( e %in% u ) ) )
}
# Only keep those edges
kl <- ij[ apply(ij, 1, triangle), ]
# Same code as before
m <- sparseMatrix(
i = rep(seq(nrow(kl)), each=2),
j = as.vector(t(kl)),
x = 1
)
m
# 5 x 5 sparse Matrix of class "dgCMatrix"
# [1,] 1 1 . . .
# [2,] . 1 1 . .
# [3,] 1 . . . 1
# [4,] . 1 . . 1
# [5,] . . 1 . 1
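Since the question also allows Python, here is a small sketch of the nodes-by-edges matrix described there, without igraph. It is an illustration under stated assumptions: the function name triangle_incidence is hypothetical, the graph is undirected with nodes numbered from 0, and there are no self-loops. An edge lies in a triangle exactly when its two end points share a neighbour, so no clique enumeration is needed, and triangles inside larger cliques are counted as well.

```python
import numpy as np

def triangle_incidence(n_nodes, edges):
    """Entry [v, e] is 1 when edge e and node v lie in a common triangle."""
    # Build neighbour sets for the undirected graph.
    adj = [set() for _ in range(n_nodes)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    m = np.zeros((n_nodes, len(edges)), dtype=int)
    for col, (u, v) in enumerate(edges):
        third = adj[u] & adj[v]        # nodes closing a triangle over (u, v)
        if third:
            for node in (u, v, *third):
                m[node, col] = 1
    return m

edges = [(0, 1), (1, 2), (0, 2), (2, 3)]   # one triangle plus a pendant edge
print(triangle_incidence(4, edges))
```

For large sparse graphs the dense array could be swapped for a sparse matrix; the neighbour-set intersection is the part that avoids the slow per-edge scan over all cliques.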
