Python: Convert 2d point cloud to grayscale image - python

I have a variable-length array of 2D coordinate points (coming from a point cloud) which are distributed around (0,0), and I want to convert them into a 2D matrix (i.e. a grayscale image).
# have
array = [(1.0,1.1),(0.0,0.0),...]
# want
matrix = [[0,100,...],[255,255,...],...]
How would I achieve this using Python and NumPy?

It looks like matplotlib.pyplot.hist2d is what you are looking for.
It bins your data into 2-dimensional bins (with a size of your choice).
See the matplotlib documentation for hist2d; a working example is given below.
import numpy as np
import matplotlib.pyplot as plt
data = [np.random.randn(1000), np.random.randn(1000)]
plt.scatter(data[0], data[1])
Then you can call hist2d on your data, for instance like this:
plt.hist2d(data[0], data[1], bins=20)
Note that hist2d takes two 1-dimensional arrays as arguments, so you will have to reshape your data a bit before feeding it to hist2d.
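The reshaping mentioned above could look like the following. This is a minimal sketch that assumes the points arrive as a list of (x, y) tuples as in the question; the sample values and the names points, xs and ys are illustrative only.

```python
import numpy as np

# Hypothetical sample input in the question's format: a list of (x, y) tuples.
array = [(1.0, 1.1), (0.0, 0.0), (-0.5, 0.3)]

points = np.asarray(array, dtype=float)  # 2-D array of shape (N, 2)
xs, ys = points[:, 0], points[:, 1]      # two 1-D arrays, one per coordinate
```

xs and ys can then be passed on, e.g. plt.hist2d(xs, ys, bins=20).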

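Quick solution using only numpy, without the need for matplotlib and therefore without any plots:
import numpy as np
# given a 2D NumPy array "array" of shape (N, 2) and a desired image shape "[x,y]"
# histogram2d returns the counts together with the bin edges, so unpack all three
matrix, xedges, yedges = np.histogram2d(array[:,0], array[:,1], bins=[x,y])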
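To get from bin counts to the 0..255 grayscale values the question asks for, the counts still need to be scaled. The following is only a sketch under assumptions: the point cloud is replaced by random data, the bin counts 64 and 48 are arbitrary, and linear scaling to the fullest bin is just one possible choice.

```python
import numpy as np

rng = np.random.default_rng(0)
array = rng.normal(size=(1000, 2))  # stand-in for the point cloud, shape (N, 2)
x, y = 64, 48                       # assumed number of bins along x and y

counts, xedges, yedges = np.histogram2d(array[:, 0], array[:, 1], bins=[x, y])

# Scale linearly so the fullest bin maps to 255; guard against an empty histogram.
peak = counts.max()
scaled = counts / peak * 255 if peak > 0 else counts

# histogram2d puts x along the first axis; transpose so that rows correspond to y.
image = scaled.T.astype(np.uint8)
```

The result has shape (y, x) and dtype uint8, which is the layout most image libraries expect for a grayscale image.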

Related

Histogram of 2D arrays and determine array which contains highest and lowest values

I have a 2D array of shape 5 and 10. So 5 different arrays with 10 values. I am hoping to get a histogram and see which array is on the lower end versus higher end of a histogram. Hope that makes sense. I am attaching an image of an example of what I mean (labeled example).
Looking for one histogram but the histogram is organized by the distribution of the highest and lowest of each array.
I'm having trouble doing this with Python. I tried a few ways of doing this:
# setting up 2d array
import numpy as np
import matplotlib.pyplot as plt
from scipy import signal
np.random.seed(1234)
array_2d = np.random.random((5,20))
I thought you could maybe just plot all the histograms of each array (5 of them) like this:
for i in range(5):
    plt.hist(signal.detrend(array_2d[i,:],type='constant'),bins=20)
plt.show()
And then looking to see which array's histogram is furthest to the right or left, but not sure if that makes too much sense...
I also considered using .ravel to flatten the 2D array into a 1D array, which makes a nice histogram. But the values from all the arrays get mixed together, so it's difficult to tell which array is on the lower or higher end of the histogram:
plt.hist(signal.detrend(array_2d.ravel(),type='constant'),bins=20)
plt.xticks(np.linspace(-1,1,10));
How might I get a histogram of the 5 arrays (shape 5, 10) and get the range of the arrays with the lowest values versus array with highest values?
Also please let me know if this is unclear or not possible at all too haha. Thanks!
Maybe you could use a kdeplot? This would replace each input value with a small Gaussian curve and sum them.
from matplotlib import pyplot as plt
import seaborn as sns
import pandas as pd
import numpy as np
np.random.seed(1234)
array_2d = np.random.random((5, 20))
sns.kdeplot(data=pd.DataFrame(array_2d.T, columns=range(1, 6)), palette='Set1', multiple='layer')
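If a numeric answer is enough, the rows can also be ranked directly without plotting. This is a different technique from the kdeplot above, and it assumes that "lower end" and "higher end" can be judged by each row's mean; the names row_min, row_max, row_mean and order are illustrative.

```python
import numpy as np

np.random.seed(1234)
array_2d = np.random.random((5, 20))

# Per-row summary statistics: one value for each of the 5 arrays.
row_min = array_2d.min(axis=1)
row_max = array_2d.max(axis=1)
row_mean = array_2d.mean(axis=1)

# Row indices sorted from lowest to highest mean.
order = np.argsort(row_mean)
lowest, highest = order[0], order[-1]
```

row_min[lowest] and row_max[lowest] then give the value range of the array sitting at the low end, and likewise for highest.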

Scipy interpolation of non-uniform data

I have a set of data loaded in from a csv file, with 1D arrays representing the x,y,z coords of the data points, and another 1D array, T, representing the value of a field at the corresponding points. The points are not uniform in space.
I am struggling to interpolate T at a given point xi,yi,zi. scipy's interpn seems to accept T only as a 3D array, which doesn't make sense to me, as T is simply 1D data.
Any advice would be appreciated.
Edit:
Example:
import numpy as np
x = np.array([1.0,1.5,1.1,1.3,1.4])
y = np.array([1.1,1.3,1.2,1.4,1.45])
z = np.array([1.0,1.1,1.4,1.2,1.0])
T = np.array([5.0,5.1,5.4,4.6,4.9])
point = ([1.2,1.1,1.25])
from scipy.interpolate import interpn
out = interpn((x,y,z),T,point)
print(out)
Cheers

counting points in grid cells in python, np.histogramdd

I have a numpy array including the coordinates of the points in 3-dimensional space:
import numpy as np
testdata=np.array([[0.5,0.5,0.5],[0.6,0.6,0.6],[0.7,0.7,0.7],[1.5,0.5,0.5],[1.5,0.6,0.6],[0.5,1.5,0.5],[0.5,1.5,1.5]])
Each row holds the 3 coordinates (x y z) of one particle. There are 7 points in this example. Is there any Python package for gridding the 3D space and then counting the particles in each cell?
I tried np.histogramdd in this way
xcoord=testdata[:,0]
ycoord=testdata[:,1]
zcoord=testdata[:,2]
xedg=[0,1,2]
yedg=[0,1,2]
zedg=[0,1,2]
histo=np.histogramdd([xcoord,ycoord,zcoord],bins=(xedg,yedg,zedg),range=[[0,2],[0,2],[0,2]])
and it seems to be working, but the indexing is strange: the array that np.histogramdd returns has no obvious indexing with respect to the original coordinates. Is there any other way of gridding the 3D space and counting the number of points in each cell?
Not sure if this is what you are needing but you can use Pandas.
import pandas as pd
coords = [[1,2,3],[4,5,6],[7,8,9]]
df_coords = pd.DataFrame(coords)
df_coords.count()
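For what it's worth, the array returned by np.histogramdd is indexed by bin number along each axis: entry [i, j, k] counts the points whose x falls in x-bin i, y in y-bin j and z in z-bin k. A minimal sketch with the question's test data (the names edges, counts and occupied are illustrative):

```python
import numpy as np

testdata = np.array([[0.5, 0.5, 0.5], [0.6, 0.6, 0.6], [0.7, 0.7, 0.7],
                     [1.5, 0.5, 0.5], [1.5, 0.6, 0.6],
                     [0.5, 1.5, 0.5], [0.5, 1.5, 1.5]])
edges = [0, 1, 2]  # two cells per axis: [0, 1) and [1, 2]

# Passing the (N, 3) array directly is equivalent to passing the three columns.
counts, bin_edges = np.histogramdd(testdata, bins=(edges, edges, edges))

# counts[i, j, k] is the number of points with
#   edges[i] <= x < edges[i+1], edges[j] <= y < edges[j+1], edges[k] <= z < edges[k+1]
# (the last bin along each axis also includes its upper edge).
occupied = np.argwhere(counts > 0)  # (i, j, k) indices of the non-empty cells
```

Here counts[0, 0, 0] is 3 (the three points near 0.5 to 0.7), counts[1, 0, 0] is 2, and counts[0, 1, 0] and counts[0, 1, 1] are 1 each.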

What does matrix[x] for different x indicate?

While using the MNIST dataset from Kaggle, I have noticed that all the tutorials use mnist[x] for different values of x to retrieve different pictures.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
mnist=pd.read_csv(r"(dir of dataset)").values
img=mnist[1]
img.shape=(28,28)
plt.imshow(img)
plt.show()
My question is what mnist[1] retrieves. I have also noticed that mnist[-1] works, which is why I am confused.
In Python, a matrix is just an array of arrays. Note that each inner "array" could itself be another "matrix".
So "matrix[x]" simply means the (x+1)th element of your object, because indexing starts at 0. A negative index counts from the end, so "matrix[-1]" is the last element.
For a matrix that holds a dataset, the first dimension is usually the sample id.
So "matrix[x]" is the array of values belonging to the (x+1)th sample.
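A small illustration of this indexing, using a made-up 4 x 6 array as a stand-in for the dataset (not real MNIST data; the names second_row, last_row and img are illustrative):

```python
import numpy as np

# Stand-in for the dataset: 4 samples with 6 values each.
mnist = np.arange(24).reshape(4, 6)

second_row = mnist[1]   # index 1 is the second row, since indexing starts at 0
last_row = mnist[-1]    # negative indices count from the end: the last row

# Reshape the flat row into a small 2-D "image", analogous to (28, 28) for MNIST.
img = second_row.reshape(2, 3)
```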

Plot 3rd axis of a 3D numpy array

I have a 3D numpy array that is a stack of 2D (m,n) images at certain timestamps, t. So my array is of shape (t, m, n). I want to plot the value of one of the pixels as a function of time.
e.g.:
import numpy as np
import matplotlib.pyplot as plt
data_cube = []
for i in range(10):
    a = np.random.rand(100,100)
    data_cube.append(a)
So my (t, m, n) now has shape (10,100,100). Say I wanted a 1D plot of the value at index [12][12] at each of the 10 steps, I would do:
plt.plot(data_cube[:][12][12])
plt.show()
But I'm getting index out of range errors. I thought I might have my indices mixed up, but every plot I generate seems to be in the 'wrong' axis, i.e. across one of the 2D arrays, but instead I want it 'through' the vertical stack. Thanks in advance!
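Here is the solution: since you are already using numpy, convert your final list to an array and just use slicing. The problem in your case was two-fold:
First: your final data_cube was not an array but a list. For a list, you would have to iterate over the values.
Second: the slicing was incorrect. data_cube[:] just returns the whole list again, so the following [12] asks for the 13th of only 10 items, hence the index out of range error.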
import numpy as np
import matplotlib.pyplot as plt
data_cube = []
for i in range(10):
    a = np.random.rand(100,100)
    data_cube.append(a)
data_cube = np.array(data_cube) # Added this step
plt.plot(data_cube[:,12,12]) # Modified the slicing
A less verbose version that avoids iteration:
data_cube = np.random.rand(10, 100,100)
plt.plot(data_cube[:,12,12])
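To see that the slice really runs "through" the stack, here is a small check without plotting. It is a sketch with made-up data: frame t is filled with the value t, so the expected time series is easy to recognise; the names frames, as_list_copy and pixel_series are illustrative.

```python
import numpy as np

# Frame t is filled with the constant t, so pixel (12, 12) should read 0, 1, ..., 9.
frames = [np.full((100, 100), t, dtype=float) for t in range(10)]

as_list_copy = frames[:]       # on a list, [:] only copies the list of 10 frames
data_cube = np.stack(frames)   # a real 3-D array of shape (10, 100, 100)

# One value per time step for the pixel at row 12, column 12.
pixel_series = data_cube[:, 12, 12]
```

pixel_series has shape (10,) and can be passed straight to plt.plot.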
