Average diameter of complex shapes from pixels in df, Python - python

I have a DataFrame of multiple particles that have been assigned group numbers (1, 2, 3, 4), like this:
Groups:
[[0 0 0 1 1 1 0 0]
[0 2 0 1 1 1 0 0]
[0 0 0 1 1 1 0 0]
[0 0 0 0 1 0 0 0]
[0 3 3 0 0 4 0 0]
[0 3 0 0 0 4 0 0]
[0 0 0 0 0 4 0 0]
[0 0 0 0 0 4 0 0]]
Number of particles: 4
I have then calculated the areas of the particles and created a DataFrame (assuming 1 pixel = 1 nm):
   Particle #  Size [pixel #]  A [nm2]
1           1              10       10
2           2               1        1
3           3               3        3
4           4               4        4
Now I want to calculate the diameter of the particles. However, the particle shapes are complex, so I am looking for a method to calculate an average diameter (given that the shapes are not perfectly round) and add it as another column next to A [nm2].
Will this be possible?
Here is my full code:
import numpy as np
from skimage import measure
import pandas as pd
final = [
[0, 0, 0, 255, 255, 255, 0, 0],
[0, 255, 0, 255, 255, 255, 0, 0],
[0, 0, 0, 255, 255, 255, 0, 0, ],
[0, 0, 0, 0, 255, 0, 0, 0],
[0, 255, 255, 0, 0, 255, 0, 0],
[0, 255, 0, 0, 0, 255, 0, 0],
[0, 0, 0, 0, 0, 255, 0, 0],
[0, 0, 0, 0, 0, 255, 0, 0]
]
final = np.asarray(final)
groups, group_count = measure.label(final > 0, return_num = True, connectivity = 1)
print('Groups: \n', groups)
print(f'Number of particles: {group_count}')
df = (pd.DataFrame(dict(zip(['Particle #', 'Size [pixel #]'],
                            np.unique(groups, return_counts=True))))
        .loc[lambda d: d['Particle #'].ne(0)]
     )
pixel_nm_size = 1*1
df['A [nm2]'] = df['Size [pixel #]'] * pixel_nm_size
Any help is appreciated!

I think you are looking for regionprops.
Specifically, either equivalent_diameter, or just perimeter.
props = measure.regionprops_table(groups, properties = ['label', 'equivalent_diameter', 'perimeter'])
df = pd.DataFrame(props)
Edit: from the docs (recent scikit-image versions list this property as equivalent_diameter_area):
equivalent_diameter_area: float
The diameter of a circle with the same area as the region.
So the function takes your labeled region, measures its area, and constructs a circle with that area (there is exactly one such circle for each area).
It then reports the diameter of that circle.
You can also look at major_axis_length and minor_axis_length. These are the axis lengths of the ellipse that has the same normalized second central moments as the region.
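The equivalent-diameter definition quoted above can be reproduced by hand, which may help when checking the regionprops output. The following is a minimal sketch that uses only NumPy; the helper name equivalent_diameters and the pixel_area parameter are hypothetical and not part of scikit-image.

```python
import numpy as np

# Label image from the question (0 is background).
groups = np.array([
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 2, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 0, 0, 0],
    [0, 3, 3, 0, 0, 4, 0, 0],
    [0, 3, 0, 0, 0, 4, 0, 0],
    [0, 0, 0, 0, 0, 4, 0, 0],
    [0, 0, 0, 0, 0, 4, 0, 0],
])

def equivalent_diameters(labels, pixel_area=1.0):
    """Map each label to the diameter of the circle with the same area.

    Hypothetical helper; pixel_area is the assumed area of one pixel.
    """
    counts = np.bincount(labels.ravel())      # pixel count per label, index 0 = background
    areas = counts[1:] * pixel_area           # drop the background entry
    diameters = np.sqrt(4.0 * areas / np.pi)  # A = pi * d**2 / 4  =>  d = sqrt(4 * A / pi)
    return dict(zip(range(1, len(counts)), diameters.tolist()))

diam = equivalent_diameters(groups)
```

With 1 pixel = 1 nm, the 10-pixel particle gets a diameter of about 3.57 nm, and the values can be attached to the DataFrame with df['Particle #'].map(diam).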

IIUC, you could use a custom function to find the height/width of the bounding box and compute the average of both dimensions:
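def get_diameter(g):
    a = (groups == g)            # boolean mask of the pixels in particle g
    h = (a.sum(1) != 0).sum()    # number of rows that contain the particle
    w = (a.sum(0) != 0).sum()    # number of columns that contain the particle
    return (h + w) / 2           # mean of bounding-box height and width

df['diameter'] = df['Particle #'].map(get_diameter)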
output:
   Particle #  Size [pixel #]  A [nm2]  diameter
1           1              10       10       3.5
2           2               1        1       1.0
3           3               3        3       2.0
4           4               4        4       2.5

Related

how to create an array of (100,19) size with each row as a vector of 19 values [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0] in python?

I need to create an array of size (100,19) in Python where each row is the fixed 19-element vector [0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0].
Any suggestions?
a = np.zeros((100,19))
a[:,11] = 1
a = [0,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
b = np.array(a)
c = np.tile(a,(100,1))
c.shape
Output:
(100, 19)
You can do it with np.zeros
array0 = np.zeros((100,19))
array0[:,11] = 1
On the other hand, if you want every element to be one except in that column:
array1 = np.ones((100,19))
array1[:,11] = 0
np.full is a useful function for this purpose:
a = [0,0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0]
result=np.full((100,19),a )
result.shape
output:
(100, 19)
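The approaches above (np.zeros plus column assignment, np.tile, and np.full) should all yield the same values. A small sketch to confirm this, with variable names chosen only for illustration:

```python
import numpy as np

# The fixed 19-element row: all zeros except a one at index 11.
row = [0] * 19
row[11] = 1

via_zeros = np.zeros((100, 19))
via_zeros[:, 11] = 1                  # set the whole column at once

via_tile = np.tile(row, (100, 1))     # repeat the row 100 times
via_full = np.full((100, 19), row)    # broadcast the row into every line

# array_equal compares values, so the differing dtypes do not matter here.
same = (np.array_equal(via_zeros, via_tile)
        and np.array_equal(via_tile, via_full))
```

One difference worth knowing: np.zeros produces a float array by default, while np.tile and np.full take an integer dtype from the list, so pass dtype explicitly if the type matters.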

Identifying the number of clusters in a Python DataFrame

I am looking to identify the number of clusters of non-zeros in my DataFrame.
Here I have a DataFrame with four (4) clusters in total, but I have trouble finding code that can count them for me.
data = [
[0,0,0,255,255,255,0,0],
[0,255,0,255,255,255,0,0],
[0,0,0,255,255,255,0,0,],
[0,0,0,0,255,0,0,0],
[0,255,255,0,0,255,0,0],
[0,255,0,0,0,255,0,0],
[0,0,0,0,0,255,0,0],
[0,0,0,0,0,255,0,0]
]
df2 = pd.DataFrame(data)
Any help is appreciated!
I searched a bit myself and found this. It is a bit of trial and error without background knowledge, but I changed the number of groups in your data a few times and skimage.measure always gave the right result:
import numpy as np
from skimage import measure
data = [
[0, 0, 0, 255, 255, 255, 0, 0],
[0, 255, 0, 255, 255, 255, 0, 0],
[0, 0, 0, 255, 255, 255, 0, 0, ],
[0, 0, 0, 0, 255, 0, 0, 0],
[0, 255, 255, 0, 0, 255, 0, 0],
[0, 255, 0, 0, 0, 255, 0, 0],
[0, 0, 0, 0, 0, 255, 0, 0],
[0, 0, 0, 0, 0, 255, 0, 0]
]
arr = np.array(data)
groups, group_count = measure.label(arr == 255, return_num = True, connectivity = 1)
print('Groups: \n', groups)
print(f'Number of groups: {group_count}')
Output:
Groups:
[[0 0 0 1 1 1 0 0]
[0 2 0 1 1 1 0 0]
[0 0 0 1 1 1 0 0]
[0 0 0 0 1 0 0 0]
[0 3 3 0 0 4 0 0]
[0 3 0 0 0 4 0 0]
[0 0 0 0 0 4 0 0]
[0 0 0 0 0 4 0 0]]
Number of groups: 4
In measure.label you define the criterion for which pixels count as foreground. In your case arr == 255 works, or simply arr > 0 if the values are not always 255. connectivity needs to be set to 1 because you don't want pixels that only touch diagonally to be joined into one cluster (if you do, set it to 2). With return_num=True the result is a tuple whose second element is the number of clusters.
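To see what the connectivity setting changes, here is a small pure-Python flood fill that counts clusters with either 4-neighbour (connectivity 1) or 8-neighbour (connectivity 2) adjacency. It is only an illustration of the idea, not how skimage.measure.label is implemented, and count_clusters is a hypothetical helper name.

```python
from collections import deque

data = [
    [0, 0, 0, 255, 255, 255, 0, 0],
    [0, 255, 0, 255, 255, 255, 0, 0],
    [0, 0, 0, 255, 255, 255, 0, 0],
    [0, 0, 0, 0, 255, 0, 0, 0],
    [0, 255, 255, 0, 0, 255, 0, 0],
    [0, 255, 0, 0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0, 255, 0, 0],
    [0, 0, 0, 0, 0, 255, 0, 0],
]

def count_clusters(grid, connectivity=1):
    """Count clusters of non-zero cells using breadth-first flood fill."""
    rows, cols = len(grid), len(grid[0])
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]          # edge neighbours
    if connectivity == 2:
        steps += [(-1, -1), (-1, 1), (1, -1), (1, 1)]   # add diagonal neighbours
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0 or seen[r][c]:
                continue
            count += 1                                  # unvisited cell starts a new cluster
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in steps:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and grid[ny][nx] != 0 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count

four_connected = count_clusters(data, connectivity=1)
eight_connected = count_clusters(data, connectivity=2)
```

For this data the 4-neighbour count is 4, while the 8-neighbour count drops to 3 because the large block touches the vertical bar diagonally at rows 3 and 4.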

Creating a 2D array from a one line txt file in Python

I'm attempting to read a one-line text file into an array in Python, but I am struggling to actually transform the file contents into a 2D array. This is the text file:
6 4 0 0 1 0 0 0 2 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 3 0
The first number (6) represents the columns and the second number (4) represents the rows. Here is the code I have so far:
maze_1d_arr = open(sys.argv[1], 'r')
maze = []
maze_split = np.array([maze_1d_arr])
size_X = len(maze_split)
size_Y = len(maze_split[0])
maze_grid = [int(x) for x in maze_split[2:]]
maze = np.array(maze_grid).reshape(size_X, size_Y)
start = np.where(maze_split == 2)
end = np.where(maze_split == 3)
path = astar(maze, start, end)
print(path)
Sorry if this question has been asked before but I'm stumped at how to get it to work. Any help would be appreciated!
import numpy as np
x = np.array([6, 4, 0, 0, 1, 0, 0, 0, 2, 0, 1, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 3, 0])
print(x[2:].reshape(x[[1,0]]))
[[0 0 1 0 0 0]
[2 0 1 0 1 1]
[0 0 1 0 0 0]
[0 0 0 0 3 0]]
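The same reshape can be driven directly from the text of the file. The sketch below parses a string that stands in for the file contents (in practice it would come from something like open(sys.argv[1]).read()) and also locates the start (2) and end (3) cells; the variable names are illustrative only.

```python
import numpy as np

# Stand-in for the single line read from the file.
text = "6 4 0 0 1 0 0 0 2 0 1 0 1 1 0 0 1 0 0 0 0 0 0 0 3 0"

values = np.array([int(token) for token in text.split()])
n_cols, n_rows = int(values[0]), int(values[1])   # first number: columns, second: rows
maze = values[2:].reshape(n_rows, n_cols)         # remaining 24 numbers form the grid

# argwhere returns (row, column) index pairs; take the first match of each marker.
start = tuple(int(i) for i in np.argwhere(maze == 2)[0])
end = tuple(int(i) for i in np.argwhere(maze == 3)[0])
```

Here start and end are (row, column) tuples, which is a convenient form to hand to a path-finding function.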

How to get an object bounding box given pixel label in python?

Say I have a scene parsing map for an image, each pixel in this scene parsing map indicates which object this pixel belongs to. Now I want to get bounding box of each object, how can I implement this in python?
For a detail example, say I have a scene parsing map like this:
0 0 0 0 0 0 0
0 1 1 0 0 0 0
1 1 1 1 0 0 0
0 0 1 1 1 0 0
0 0 1 1 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
So the bounding box is:
0 0 0 0 0 0 0
1 1 1 1 1 0 0
1 0 0 0 1 0 0
1 0 0 0 1 0 0
1 1 1 1 1 0 0
0 0 0 0 0 0 0
0 0 0 0 0 0 0
Actually, in my task, just know the width and height of this object is enough.
A basic idea is to search four edges in the scene parsing map, from top, bottom, left and right direction. But there might be a lot of small objects in the image, this way is not time efficient.
A second way is to calculate the coordinates of all non-zero elements and find the max/min x/y. Then calculate width and height using these x and y.
Is there any other more efficient way to do this? Thx.
If you are processing images, you can use scipy's ndimage library.
If there is only one object in the image, you can get the measurements with scipy.ndimage.measurements.find_objects (http://docs.scipy.org/doc/scipy-0.16.1/reference/generated/scipy.ndimage.measurements.find_objects.html):
import numpy as np
from scipy import ndimage
a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 1, 1, 0, 0, 0, 0],
              [1, 1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [0, 0, 0, 0, 0, 0, 0]])
# Find the location of all objects
objs = ndimage.find_objects(a)
# Get the height and width
height = int(objs[0][0].stop - objs[0][0].start)
width = int(objs[0][1].stop - objs[0][1].start)
If there are many objects in the image, you first have to label each object and then get the measurements:
import numpy as np
from scipy import ndimage
a = np.array([[0, 0, 0, 0, 0, 0, 0],
              [0, 1, 1, 0, 0, 0, 0],
              [1, 1, 1, 1, 0, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 1, 1, 1, 0, 0],
              [0, 0, 0, 0, 0, 0, 0],
              [0, 0, 1, 1, 1, 0, 0]])  # Second object here
# Label objects
labeled_image, num_features = ndimage.label(a)
# Find the location of all objects
objs = ndimage.find_objects(labeled_image)
# Get the height and width
measurements = []
for ob in objs:
    measurements.append((int(ob[0].stop - ob[0].start), int(ob[1].stop - ob[1].start)))
If you check ndimage.measurements, you can get more measurements: center of mass, area...
using numpy:
import numpy as np
ind = np.nonzero(arr.any(axis=0))[0] # indices of non empty columns
width = ind[-1] - ind[0] + 1
ind = np.nonzero(arr.any(axis=1))[0] # indices of non empty rows
height = ind[-1] - ind[0] + 1
A bit more explanation:
arr.any(axis=0) gives a boolean array telling you whether each column is empty (False) or not (True). np.nonzero(arr.any(axis=0))[0] then extracts the non-zero (i.e. True) indices from that array. ind[0] is the first element of that array, hence the leftmost non-empty column, and ind[-1] is the last element, hence the rightmost non-empty column. The difference then gives the width, give or take 1 depending on whether you include the borders or not.
The height is computed the same way along the other axis.
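The any/nonzero approach can be wrapped in a small function and applied once per label when the map holds several objects. This is a sketch under the assumption that 0 marks the background; bbox_size is a hypothetical helper name, and the second object (label 2) is added here only for illustration.

```python
import numpy as np

def bbox_size(mask):
    """Return (height, width) of the bounding box around the True pixels."""
    rows = np.nonzero(mask.any(axis=1))[0]   # indices of non-empty rows
    cols = np.nonzero(mask.any(axis=0))[0]   # indices of non-empty columns
    if rows.size == 0:                       # nothing in the mask
        return 0, 0
    return int(rows[-1] - rows[0] + 1), int(cols[-1] - cols[0] + 1)

# Scene parsing map from the question, plus an extra object with label 2.
labels = np.array([[0, 0, 0, 0, 0, 0, 0],
                   [0, 1, 1, 0, 0, 0, 0],
                   [1, 1, 1, 1, 0, 0, 0],
                   [0, 0, 1, 1, 1, 0, 0],
                   [0, 0, 1, 1, 1, 0, 0],
                   [0, 0, 0, 0, 0, 0, 0],
                   [0, 0, 2, 2, 2, 0, 0]])

# One (height, width) pair per non-background label.
sizes = {int(label): bbox_size(labels == label)
         for label in np.unique(labels) if label != 0}
```

Each call scans the whole array, so for images with very many labels a single pass such as ndimage.find_objects is likely to be faster.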

How to search shapes in a graph?

I have an adjacency matrix am of a 5-node undirected graph, where am(i,j) = 1 means node i is connected to node j. I generated all possible versions of this 5-node graph with the following code:
import itertools
graphs = list(itertools.product([0, 1], repeat=10))
This returns a list of tuples where each element is a possible configuration of the matrix (note that I only generate these for the upper triangle, since the matrix is symmetric):
[ (0, 0, 0, 0, 0, 0, 1, 0, 1, 1),
(0, 0, 0, 0, 0, 0, 1, 1, 0, 0),
(0, 0, 0, 0, 0, 0, 1, 1, 0, 1),
(0, 0, 0, 0, 0, 0, 1, 1, 1, 0),
(0, 0, 0, 0, 0, 0, 1, 1, 1, 1),
....]
where (0, 0, 0, 0, 0, 0, 1, 1, 1, 1) actually corresponds to:
m =
0 0 0 0 0
0 0 0 0 0
0 0 1 1 1
0 0 0 0 1
0 0 0 0 0
I would like to search for all possible triangle shapes in this graph. For example, here, (2, 4), (2,5) and (4, 5) together makes a triangle shape:
m =
0 0 0 0 0
0 0 0 1 1
0 0 0 0 0
0 0 0 0 1
0 0 0 0 0
Is there a known algorithm to do such a search in a graph? Note that triangle shape is an example here, ideally I would like to find a solution that would search any particular shape, for example a square or a pentagon. How can I encode these shapes to search in the first place? Any help, reference, algorithm name is appreciated.
Your explanation of the graph representation is not entirely clear.
However, finding cycles of length k is an NP-complete problem when k is part of the input (since it includes the NP-complete Hamiltonian cycle problem).
If that is the case, then you should have a look at these posts:
Finding all cycles of a certain length in a graph
Finding all cycles in undirected graphs
But if the cycle lengths are fixed, then this problem can be solved in reasonable polynomial time.
Here is an article about this very issue:
Finding and Counting Given Length Cycles | Algorithmica
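For the fixed-length case of triangles (cycles of length 3), a brute-force check over all node triples is already polynomial. The sketch below is not taken from the linked article; it assumes the upper-triangle matrix from the question is symmetrised first, uses 0-based node indices, and find_triangles is a hypothetical helper name.

```python
import itertools
import numpy as np

# Upper-triangle matrix from the question with edges (2,4), (2,5), (4,5) in 1-based numbering.
upper = np.array([
    [0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
])
adj = upper + upper.T   # symmetric adjacency matrix of the undirected graph

def find_triangles(adj):
    """Return all node triples (0-based) whose three pairwise edges exist."""
    n = adj.shape[0]
    return [(i, j, k)
            for i, j, k in itertools.combinations(range(n), 3)
            if adj[i, j] and adj[j, k] and adj[i, k]]

triangles = find_triangles(adj)

# Cross-check: in a simple undirected graph, trace(A^3) counts each triangle 6 times.
n_triangles = int(np.trace(np.linalg.matrix_power(adj, 3))) // 6
```

For the example matrix this finds the single triangle on nodes 1, 3, 4 (0-based), which corresponds to nodes 2, 4, 5 in the question's numbering. Longer cycles such as squares or pentagons need extra care to avoid counting the same cycle more than once.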
