Python: Rearrange indices from np.where() - python

I would like to rearrange the indices in a tuple that was created with np.where.
The reason is that I would like to apply values to a number of pre-selected special positions (a pipe) in a mesh. The values should be applied in the direction of flow, which runs from top left to bottom left: from (3,0) to (3,6) to (7,6) to (7,0). Currently, the index tuple ind is ordered by the automatic (row-major) sorting that np.where applies to the indices. This leads to the figure below, where the values 1:10 are applied correctly, but 11:17 are obviously in reverse order.
Is there a better way to grab the indices, or how can I rearrange the tuple so that the values are applied in the direction of flow?
import numpy as np
import matplotlib.pyplot as plt
# mesh size
nx, ny = 10, 10
# special positions
sx1, sx2, sy = .3, .7, .7
T = 1
# create mesh
u0 = np.zeros((nx, ny))
# assign values to mesh
u0[int(nx*sx1), 0:int(ny*sy)] = T
u0[int(nx*sx2), 0:int(ny*sy)] = T
u0[int(nx*sx1+1):int(nx*sx2), int(ny*sy-1)] = T
# get indices of special positions
ind = np.where(u0 == T)
# EDIT: hand code sequence
length = len(u0[int(nx*sx2), 0:int(ny*sy)])
ind[0][-length:] = np.flip(ind[0][-length:])
ind[1][-length:] = np.flip(ind[1][-length:])
# apply new values on special positions
u0[ind] = np.arange(1, len(ind[1])+1,1)
fig, ax = plt.subplots()
im = ax.imshow(u0, cmap=plt.get_cmap('RdBu_r'))
fig.colorbar(im)
plt.show()
Old image (without edit)
New image (after edit)

I think it's a fallacy to think that you can algorithmically deduce the correct "flow-sequence" of the grid points, by examining the contents of the tuple ind.
Here's an example that illustrates why:
0 0 0 0 0 0 0 0 0 0
A B C D E 0 0 0 0 0
0 0 0 0 F 0 0 0 0 0
0 0 0 0 G 0 0 0 0 0
0 0 0 I H 0 0 0 0 0
0 0 0 J K 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0
This is a schematic representation of your grid matrix, where, if you follow the letters A, B, C, etc, you will get the sequence of the flow through the grid-elements.
However, note that, no matter how smart an algorithm is, it will be unable to choose between the two possible flows:
A, B, C, D, E, F, G, H, I, J, K
and
A, B, C, D, E, F, G, H, K, J, I
So, I think you will have to record the sequence explicitly yourself, rather than deduce it from the positions of the T values in the grid.
Any algorithm will get stuck on the ambiguity at grid location H in the above example.
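One way to record the sequence explicitly is to build the index tuple leg by leg in flow order, instead of calling np.where afterwards. This is only a sketch: the names r1, r2, c_end, leg1, leg2, leg3 and path are made up here, and the pipe geometry (rows 3 and 7, last column 6) is taken from the question.

```python
import numpy as np

nx, ny = 10, 10
r1, r2, c_end = 3, 7, 6  # assumed pipe geometry, as described in the question

# Leg 1: along row r1, left to right
leg1 = [(r1, c) for c in range(0, c_end + 1)]
# Leg 2: down column c_end, between the two rows
leg2 = [(r, c_end) for r in range(r1 + 1, r2)]
# Leg 3: along row r2, right to left
leg3 = [(r2, c) for c in range(c_end, -1, -1)]

# Concatenate the legs in flow order and convert to an index tuple
path = np.array(leg1 + leg2 + leg3)
ind = (path[:, 0], path[:, 1])

u0 = np.zeros((nx, ny))
u0[ind] = np.arange(1, len(path) + 1)
```

Because ind is built in flow order, the values increase along the pipe without any flipping of sub-ranges.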

Related

How to create an adjacency matrix in pandas such that the labels are preserved when rows and cols are rearranged

I have never used pandas or numpy for this purpose before and am wondering what the idiomatic way is to construct labeled adjacency matrices in pandas.
My data comes in a shape similar to the one below. Each "uL22"-type entry is a protein, and the arrays are the neighbors of that protein. Hence (in the example below) an adjacency matrix would have 1s in the bL31 row, uL5 column, and the converse, etc.
My problem is twofold:
1) The actual dimension of the adjacency matrix is dictated by a set of protein names that is generally much larger than those contained in the nbrtree, so I'm wondering what the best way is to map my nbrtree data to that set, say a 100 by 100 matrix corresponding to the neighborhood relationships of 100 proteins.
2) I'm not quite sure how to "bind" the names (i.e. uL32 etc.) of those 100 proteins to the rows and columns of this matrix, such that when I start moving rows around the names move accordingly. (I'm planning to rearrange the adjacency matrix to have a block-diagonal structure.)
"nbrtree": {
"bL31": ["uL5"],
"uL5": ["bL31"],
"bL32": ["uL22"],
"uL22": ["bL32","bL17"],
...
"bL33": ["bL35"],
"bL35": ["bL33","uL15"],
"uL13": ["bL20"],
"bL20": ["uL13","bL21"]
}
>>> len(nbrtree)
40
I'm sure this is a manipulation that people perform daily; I'm just not quite familiar with how dataframes work, so I'm probably looking for something very obvious.
Thank you so much!
I don't fully understand your question, but from what I gather, try out this code.
from pprint import pprint as pp
import pandas as pd
dic = {"first": {
"a": ["b","d"],
"b": ["a","h"],
"c": ["d"],
"d": ["c","g"],
"e": ["f"],
"f": ["e","d"],
"g": ["h","a"],
"h": ["g","b"]
}}
col = list(dic['first'].keys())
data = pd.DataFrame(0, index = col, columns = col, dtype = int)
for x, y in dic['first'].items():
    data.loc[x, y] = 1
pp(data)
The output from this code is:
  a b c d e f g h
a 0 1 0 1 0 0 0 0
b 1 0 0 0 0 0 0 1
c 0 0 0 1 0 0 0 0
d 0 0 1 0 0 0 1 0
e 0 0 0 0 0 1 0 0
f 0 0 0 1 1 0 0 0
g 1 0 0 0 0 0 0 1
h 0 1 0 0 0 0 1 0
Note that this adjacency matrix is not symmetric, as I have taken some random data.
To absorb your labels into the dataframe, change to the following:
data = pd.DataFrame(0, index = ['index']+col, columns = ['column']+col, dtype = int)
data.loc['index'] = [0]+col
data.loc[:, 'column'] = ['*']+col
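For the two points raised in the question (a name set larger than nbrtree, and labels that follow the rows when they are moved), a minimal sketch along the same lines could look like this. The list all_names and the ordering new_order are made up for illustration; only the four nbrtree entries are taken from the question.

```python
import pandas as pd

# Hypothetical full set of protein names; it may contain names without neighbours
all_names = ["bL17", "bL31", "bL32", "uL13", "uL22", "uL5"]

# A few entries copied from the question's nbrtree
nbrtree = {
    "bL31": ["uL5"],
    "uL5": ["bL31"],
    "bL32": ["uL22"],
    "uL22": ["bL32", "bL17"],
}

# Size the matrix by the full name set, not by the keys of nbrtree
adj = pd.DataFrame(0, index=all_names, columns=all_names, dtype=int)
for protein, neighbours in nbrtree.items():
    # Raises KeyError if a neighbour is missing from all_names
    adj.loc[protein, neighbours] = 1

# Reordering rows and columns by label keeps names and values together
new_order = ["bL31", "uL5", "bL32", "uL22", "bL17", "uL13"]
reordered = adj.loc[new_order, new_order]
print(reordered)
```

Since the index and columns carry the names, any permutation applied through .loc moves the labels along with the data, which is what a block-diagonal rearrangement needs.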

Using itertools to generate an exponential binary space

I am interested in generating all binary combinations of N variables without having to implement a manual loop that iterates N times over N, each time looping over N/2, and so on.
Do we have such functionality in Python?
E.g:
I have N binary variables:
pool=['A','B','C',...,'I','J']
len(pool)=10
I would like to generate the space of 2^10=1024 combinations out of these, such as:
[A B C ... I J]
iter0 = 0 0 0 ... 0 0
iter1 = 0 0 0 ... 0 1
iter2 = 0 0 0 ... 1 0
...
iter1022 = 1 1 1 ... 1 0
iter1023 = 1 1 1 ... 1 1
You see that I don't have repetitions here, each variable is enabled once per each of these iter's sequences. How can I do that using Python's itertools?
itertools.product with the repeat parameter is the simplest answer:
for A, B, C, D, E, F, G, H, I, J in itertools.product((0, 1), repeat=10):
The values of each variable will cycle fastest on the right, and slowest on the left, so you'll get:
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 1 0 0
etc. This may be recognizable to you: It's just the binary representation of an incrementing 10 bit number. Depending on your needs, you may actually want to just do:
for i in range(1 << 10):
then mask i with 1 << 9 to get the value of A, 1 << 8 for B, and so on down to 1 << 0 (that is, 1) for J. If the goal is just to print them, you can even get more clever, by binary stringifying and then using join to insert the separator:
for i in range(1 << 10):
    print(' '.join('{:010b}'.format(i)))
    # Or letting print insert the separator:
    print(*'{:010b}'.format(i)) # If separator isn't space, pass sep='sepstring'
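To see that the two approaches above agree, here is a small sketch with a shortened pool of three variables (the short pool and the names rows_product and rows_mask are only for illustration):

```python
import itertools

pool = ["A", "B", "C"]  # shortened pool for illustration
n = len(pool)

# Approach 1: itertools.product, rightmost variable cycles fastest
rows_product = list(itertools.product((0, 1), repeat=n))

# Approach 2: count from 0 to 2**n - 1 and mask out each bit;
# bit (n - 1 - k) of i is the value of variable k
rows_mask = [
    tuple((i >> (n - 1 - k)) & 1 for k in range(n))
    for i in range(1 << n)
]

for row in rows_product:
    print(dict(zip(pool, row)))
```

Both lists contain the same 2**n rows in the same order, so either loop can be used depending on whether you want tuples or an integer counter.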

Efficient way of finding rectangle coordinates in 0-1 arrays

Say I have an MxN matrix of 0's and 1's. It may or may not be sparse.
I want a function to efficiently find rectangles in the array, where by rectangle I mean:
a set of 4 elements that are all 1's that create the 4 corners of a
rectangle, such that the sides of the rectangle are orthogonal to the
array axes. In other words, a rectangle is a set of 4 1's elements
with coordinates [row index, column index] like so: [r1,c1], [r1,c2],
[r2,c2], [r2,c1].
E.g. this setup has one rectangle:
0 0 0 1 0 1 0
0 0 0 0 0 0 0
0 1 0 0 0 0 0
1 0 0 1 0 1 0
0 0 0 0 0 0 0
0 0 0 1 0 0 1
For a given MxN array, I want a Python function F(A) that returns an array L of subarrays, where each subarray is the coordinate pair of the corner of a rectangle (and includes all of the 4 corners of the rectangle). For the case where the same element of the array is the corner of multiple rectangles, it's ok to duplicate those coordinates.
My thinking so far is:
1) find the coordinates of the apex of each right triangle in the array
2) check each right triangle apex coordinate to see if it is part of a rectangle
Step 1) can be achieved by finding those elements that are 1's and are in a column with a column sum >=2, and in a row with a row sum >=2.
Step 2) would then iterate through each coordinate determined to be the apex of a right triangle. For a given right triangle coordinate pair, it would iterate through that column, looking at every other right triangle coordinate from 1) that is in that column. For any pair of 2 right triangle points in a column, it would then check which row has the smaller row sum, to know which row would be faster to iterate through. Then it would iterate through all of the right triangle column coordinates in that row and see if the other row also has a right triangle point in that column. If it does, those 4 points form a rectangle.
I think this will work, but there will be repetition, and overall this procedure seems like it would be reasonably computationally intensive. What are some better ways for detecting rectangle corners in 0-1 arrays?
This is off the top of my head, during a 5-hour layover at LAX. Here is my algorithm:
Step 1: Search all rows for at least two ones
 |  0 0 0 1 0 1 0
 |  0 0 0 0 0 0 0
 |  0 1 0 0 0 0 0
\|/ 1 0 0 1 0 1 0
    0 0 0 0 0 0 0
    0 0 0 1 0 0 1
Output:
-> 0 0 0 1 0 1 0
   0 0 0 0 0 0 0
   0 1 0 0 0 0 0
-> 1 0 0 1 0 1 0
   0 0 0 0 0 0 0
-> 0 0 0 1 0 0 1
Step 2: For each pair of ones in a row, get the row indices of the ones in the columns corresponding to that pair. Let's say for the first row:
-> 0 0 0 1 0 1 0
you check for ones in the following columns:
      |   |
     \|/ \|/
0 0 0 1 0 1 0
0 0 0 0 0 0 0
0 1 0 0 0 0 0
1 0 0 1 0 1 0
0 0 0 0 0 0 0
0 0 0 1 0 0 1
Step 3: If both indices match, return the indices of all four. These can be easily accessed, as you know the row and index of the ones at all steps. In our case the searches in columns 3 and 5 are both going to return 3, assuming you start indexing from 0. So we get the indices for the following:
0 0 0 ->1 0 ->1 0
0 0 0   0 0   0 0
0 1 0   0 0   0 0
1 0 0 ->1 0 ->1 0
0 0 0   0 0   0 0
0 0 0   1 0   0 1
Step 4: Repeat for all pairs
Algorithm Complexity
I know you need to search over columns * rows * number of pairs, but you can always use hashmaps to make each lookup O(1), which makes the overall complexity bound by the number of pairs. Please feel free to comment with any questions.
Here's a Python implementation which is similar to @PseudoAj's solution. It processes the rows starting from the top while constructing a dict where the keys are x coordinates and the values are sets of the respective y coordinates.
For every row the following steps are done:
Generate a list of x-coordinates with 1s from current row
If length of list is less than 2 move to next row
Iterate over all coordinate pairs left, right where left < right
For every coordinate pair take intersection from dict containing processed rows
For every y coordinate in the intersection add rectangle to result
Finally update dict with coordinates from current row
Code:
from collections import defaultdict
from itertools import combinations
arr = [
[0, 0, 0, 1, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 0],
[1, 0, 0, 1, 0, 1, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 1]
]
# List corner coords
result = []
# Dict {x: set(y1, y2, ...)} of 1s in processed rows
d = defaultdict(set)
for y, row in enumerate(arr):
    # Find indexes of 1 from current row
    coords = [i for i, x in enumerate(row) if x]
    # Move to next row if less than two points
    if len(coords) < 2:
        continue
    # For every pair on this row find all pairs on previous rows
    for left, right in combinations(coords, 2):
        for top in d[left] & d[right]:
            result.append(((top, left), (top, right), (y, left), (y, right)))
    # Add coordinates on this row to processed rows
    for x in coords:
        d[x].add(y)
print(result)
Output:
[((0, 3), (0, 5), (3, 3), (3, 5))]
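Since the question asked for a function F(A) that takes an array, the same row-by-row approach can be wrapped as below. This is just a sketch assuming NumPy input; the name find_rectangles and the variable seen are made up here.

```python
from collections import defaultdict
from itertools import combinations

import numpy as np


def find_rectangles(a):
    """Return a list of corner tuples ((r1, c1), (r1, c2), (r2, c1), (r2, c2))."""
    a = np.asarray(a)
    seen = defaultdict(set)  # column index -> rows already processed that hold a 1
    result = []
    for r, row in enumerate(a):
        cols = np.flatnonzero(row).tolist()
        # Pair up the 1s of this row and look for earlier rows with 1s in both columns
        for left, right in combinations(cols, 2):
            for top in sorted(seen[left] & seen[right]):
                result.append(((top, left), (top, right), (r, left), (r, right)))
        # Register this row's 1s for the rows that follow
        for c in cols:
            seen[c].add(r)
    return result
```

Corners shared by several rectangles simply show up in several tuples, which the question said is acceptable.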

Is there any easy way to rotate the values of a matrix/array?

So, let's say I have the following matrix/array -
[0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 1 0 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0
0 0 1 1 1 1 1 1 0 0 0 0
0 0 0 1 1 1 1 0 0 0 0 0
0 0 0 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0]
It would be fairly trivial to write something that would translate these values up and down. What if I wanted to rotate it by an angle that isn't a multiple of 90 degrees? I know that it is obviously impossible to get the exact same shape (made of 1s), because of the nature of the grid. The idea that comes to mind is converting each value of 1 to a coordinate vector. Then it would amount to rotating the coordinates (which should be simpler) about a point. One could then write something which would take the rotated coordinates and compare them to the matrix grid, and if a point falls in a given box, that box is filled. I know I'll also have to find a center around which to rotate.
Does this seem like a reasonable way to do this? If anyone has a better idea, I'm all ears. I know with a small grid like this, the shape would probably be entirely different, however if I had a large shape represented by 1s, in a large grid, the difference between representations would be smaller.
First of all, rotating a shape like that with only 1's and 0's at non 90 degree angles is not really going to look much like the original at all, when it's done at such a low "resolution". However, I would recommend looking into rotation matrices. Like you said, you would probably want to find each value as a coordinate pair, and rotate it around the center. It would probably be easier if you made this a two-dimensional array. Good luck!
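A rough sketch of the rotation-matrix idea, assuming the rotation is about the centre of the array and that rotated points are snapped to the nearest cell (the function name rotate_binary is made up here):

```python
import numpy as np


def rotate_binary(a, degrees):
    """Rotate the 1s of a 0/1 array about the array centre (forward mapping)."""
    a = np.asarray(a)
    theta = np.radians(degrees)
    # Standard 2D rotation matrix, applied to (row, col) offsets from the centre
    rot = np.array([[np.cos(theta), -np.sin(theta)],
                    [np.sin(theta), np.cos(theta)]])
    centre = (np.array(a.shape) - 1) / 2.0
    coords = np.argwhere(a == 1)  # (N, 2) array of (row, col) pairs
    rotated = (coords - centre) @ rot.T + centre
    idx = np.rint(rotated).astype(int)  # snap to the nearest grid cell
    # Keep only the points that still fall inside the grid
    inside = np.all((idx >= 0) & (idx < np.array(a.shape)), axis=1)
    out = np.zeros_like(a)
    out[idx[inside, 0], idx[inside, 1]] = 1
    return out
```

Because each 1 is mapped forward independently, two points can land in the same cell and small holes can appear in the rotated shape. If that matters, an interpolating routine such as scipy.ndimage.rotate may be a better fit.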
I think this should work:
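import numpy as np
from math import sin, cos, atan2, radians
# A is the 0/1 input array from the question
i0, j0 = 0, 0  # point around which you'll rotate
alpha = radians(3)  # 3 degrees
B = np.zeros(A.shape)
for i, j in np.argwhere(A == 1):
    di = i - i0
    dj = j - j0
    dist = (di**2 + dj**2)**0.5
    ang = atan2(dj, di)
    # ang is measured from the i axis, so cos gives the new i offset and sin the new j offset
    pi = int(round(cos(ang + alpha) * dist)) + i0
    pj = int(round(sin(ang + alpha) * dist)) + j0
    # skip points that leave the array (negative indices would silently wrap around)
    if 0 <= pi < B.shape[0] and 0 <= pj < B.shape[1]:
        B[pi, pj] = 1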
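But please don't forget about out-of-range indices! NumPy raises an IndexError for an index past the end of the array, and a negative index silently wraps around to the other side.
The B array should be bigger than A, large enough to hold the rotated shape, and the rotation origin should (ideally) be in the middle of the array.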

Easy way to add a number as an image in a matrix?

I'm creating a checkerboard pattern as follows:
import numpy as np
import matplotlib.pyplot as plt
def CheckeredBoard(x=10, y=10, sq=2, xmax=None, ymax=None):
    coords = np.ogrid[0:x, 0:y]
    idx = (coords[0] // sq + coords[1] // sq) % 2
    if xmax is not None:
        idx[xmax:] = 0
    if ymax is not None:
        idx[:, ymax:] = 0
    return idx
ch = CheckeredBoard(100, 110, 10)
plt.imshow(ch)
What I would like is to add a number to some of the boxes to number them, so that when I run plt.imshow(ch) the numbers are part of the image.
The only way I can think of doing this is by using some sort of annotation and then saving the image and loading the annotated image but this seems really messy.
For example, a successful matrix would look like:
1 1 1 1 0 0 0 0
1 1 0 1 0 0 0 0
1 0 0 1 0 0 0 0
1 1 0 1 0 0 0 0
1 0 0 0 0 0 0 0
1 1 1 1 0 0 0 0
1 1 1 1 0 0 0 0
0 0 0 0 1 1 1 1
0 0 0 0 1 1 0 1
0 0 0 0 1 0 1 0
0 0 0 0 1 0 0 0
0 0 0 0 1 0 1 0
0 0 0 0 1 1 0 1
0 0 0 0 1 1 1 1
The matrix above has a 1 and an 8 in the two corners.
Appreciate any help, let me know if you want additional information.
Thanks
EDIT
Here is something closer to what I'd actually like to end up with.
Red circles added for emphasis.
What about using PIL / Pillow?
import numpy as np
import pylab
from PIL import Image, ImageDraw, ImageFont
#-- your data array
xs = np.zeros((20,20))
#-- prepare the text drawing
img = Image.fromarray(xs)
d = ImageDraw.Draw(img)
d.text( (2,2), "4", fill=255)
#-- back to array
ys = np.asarray(img)
#-- just show
pylab.imshow(ys)
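If the digits should live in the matrix itself, as in the question's example, and Pillow is not an option, the same idea can be sketched with small hand-made bitmaps in plain NumPy. Everything here is an assumption for illustration: the 5x3 glyphs in GLYPHS, the helper stamp_digit and the simplified checkered_board.

```python
import numpy as np

# Hypothetical 5x3 bitmaps for two digits (1 = ink); extend as needed
GLYPHS = {
    "1": ["010", "110", "010", "010", "111"],
    "8": ["111", "101", "111", "101", "111"],
}


def stamp_digit(board, digit, top, left, value):
    """Write the bitmap of `digit` into `board` in place, top-left corner at (top, left)."""
    glyph = np.array([[int(ch) for ch in row] for row in GLYPHS[digit]], dtype=bool)
    h, w = glyph.shape
    region = board[top:top + h, left:left + w]  # a view, so writes reach board
    region[glyph] = value
    return board


def checkered_board(x=10, y=10, sq=2):
    rows, cols = np.ogrid[0:x, 0:y]
    return (rows // sq + cols // sq) % 2


board = checkered_board(20, 20, 10)
stamp_digit(board, "1", top=2, left=3, value=1)   # light digit on a 0-valued square
stamp_digit(board, "8", top=2, left=13, value=0)  # dark digit on a 1-valued square
```

The digits are then ordinary array values, so an imshow call on board shows them without any annotation step.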
How about something like this?
n = 8
board = [[(i+j)%2 for i in range(n)] for j in range(n)]
from matplotlib import pyplot
fig = pyplot.figure()
ax = fig.add_subplot(1,1,1)
ax.imshow(board, interpolation="nearest")
from random import randint
for _ in range(10):
    i = randint(0, n-1)
    j = randint(0, n-1)
    number = randint(0,9)
    ax.annotate(str(number), xy=(i,j), color="white")
pyplot.show()
Obviously you'll have your own way of locating and choosing the numbers; other than that, the annotate functionality has everything you need.
You might need to offset the numbers. In that case you can either use a fixed size and work out how much you need to offset them by, or work out the bounding box of the squares and offset them that way.
For colouring the numbers you also have a few options: you can go with a standard colour for all, or you can colour them opposite to the square:
for _ in range(10):
    i = randint(0, n-1)
    j = randint(0, n-1)
    number = randint(0,9)
    colour = "red"
    if (i+j)%2 == 1:
        colour = "blue"
    ax.annotate(str(number), xy=(i,j), color=colour)
But honestly, I think the white option is more readable.
