So I tried creating a point cloud with the Open3D library in Python, and in the end it's basically just the 2 lines referenced here, but when I run my code (see below), all I get is a white screen popping up. I ran it in a Jupyter notebook, but running it as a Python script from the console didn't change anything, nor did it throw an error.
I should mention that I created the images in Blender and saved them as OpenEXR, meaning the depth values range between 0 and 4 (I've truncated the background to 4). You can see below that they are proper images, and I could also convert them to Open3D images and display them without problems.
Edit (27-03-2020): Added complete minimal example
import OpenEXR
import numpy as np
import array
import matplotlib.pyplot as plt
import open3d as o3d
%matplotlib inline
exr_img = OpenEXR.InputFile('frame0100.exr')
depth_img = array.array('f', exr_img.channel('View Layer.Depth.Z'))
r_img = array.array('f', exr_img.channel('View Layer.Combined.R'))
g_img = array.array('f', exr_img.channel('View Layer.Combined.G'))
b_img = array.array('f', exr_img.channel('View Layer.Combined.B'))
def reshape_img(img):
    return np.array(img).reshape(720, 1280)

img_array = np.dstack((reshape_img(r_img),
                       reshape_img(g_img),
                       reshape_img(b_img),
                       reshape_img(depth_img)))
#Background returns very large value, truncate it to 4
img_array[img_array == 10000000000.0] = 4
colour = o3d.geometry.Image(np.array(img_array[:, :, :3]))
depth = o3d.geometry.Image(np.array(img_array[:, :, 3]*1000).astype('uint16'))
o3d.draw_geometries([depth])
pinhole_cam = o3d.open3d.camera.PinholeCameraIntrinsic(width= 1280, height=720, cx=640,cy=360,fx=500,fy=500)
rgbd = o3d.create_rgbd_image_from_color_and_depth(colour, depth, convert_rgb_to_intensity = False, depth_scale=1000)
pcd = o3d.create_point_cloud_from_rgbd_image(rgbd, pinhole_cam)
o3d.draw_geometries([pcd])
Please overlook the hacky way of importing the data; since I am new to Open3D and produced the data myself, I did it step by step to check and confirm the data's integrity.
I assume it might have to do with my parameters for the pinhole camera. Tbh, I have no idea what the proper parameters would be, except that cx, cy should be the centre of the image and fx, fy should be sensible. As my depth values are in Blender metres but Open3D apparently expects millimetres, the scaling should make sense.
I'd appreciate any help debugging it. But if you were to point me towards a better-working library for creating 3D point clouds from images, I wouldn't mind either. The documentation I found for Open3D is lacking at best.
In short, Open3D expects your 3-channel color image to be of uint8 type.
Otherwise, it would return an empty point cloud, resulting in the blank window you see.
Update 2020-3-27, late night in my time zone:)
Now that you have provided your code, let's dive in!
From your function names, I guess you are using Open3D 0.7.0 or thereabouts. The code I provided is for 0.9.0; some function names changed and new functionality was added.
When I run your code in 0.9.0 (after some minor modifications, of course), there's a RuntimeError:
RuntimeError: [Open3D ERROR] [CreatePointCloudFromRGBDImage] Unsupported image format.
And we can see from the Open3D 0.9.0 source that your color image must either have 3 channels of 1 byte each (uint8) or 1 channel of 4 bytes (float, i.e. an intensity image):
std::shared_ptr<PointCloud> PointCloud::CreateFromRGBDImage(
        const RGBDImage &image,
        const camera::PinholeCameraIntrinsic &intrinsic,
        const Eigen::Matrix4d &extrinsic /* = Eigen::Matrix4d::Identity()*/,
        bool project_valid_depth_only) {
    if (image.depth_.num_of_channels_ == 1 &&
        image.depth_.bytes_per_channel_ == 4) {
        if (image.color_.bytes_per_channel_ == 1 &&
            image.color_.num_of_channels_ == 3) {
            return CreatePointCloudFromRGBDImageT<uint8_t, 3>(
                    image, intrinsic, extrinsic, project_valid_depth_only);
        } else if (image.color_.bytes_per_channel_ == 4 &&
                   image.color_.num_of_channels_ == 1) {
            return CreatePointCloudFromRGBDImageT<float, 1>(
                    image, intrinsic, extrinsic, project_valid_depth_only);
        }
    }
    utility::LogError(
            "[CreatePointCloudFromRGBDImage] Unsupported image format.");
    return std::make_shared<PointCloud>();
}
Otherwise, there'll be errors like I encountered.
However, in version 0.7.0, the source code is:
std::shared_ptr<PointCloud> CreatePointCloudFromRGBDImage(
        const RGBDImage &image,
        const camera::PinholeCameraIntrinsic &intrinsic,
        const Eigen::Matrix4d &extrinsic /* = Eigen::Matrix4d::Identity()*/) {
    if (image.depth_.num_of_channels_ == 1 &&
        image.depth_.bytes_per_channel_ == 4) {
        if (image.color_.bytes_per_channel_ == 1 &&
            image.color_.num_of_channels_ == 3) {
            return CreatePointCloudFromRGBDImageT<uint8_t, 3>(image, intrinsic,
                                                              extrinsic);
        } else if (image.color_.bytes_per_channel_ == 4 &&
                   image.color_.num_of_channels_ == 1) {
            return CreatePointCloudFromRGBDImageT<float, 1>(image, intrinsic,
                                                            extrinsic);
        }
    }
    utility::PrintDebug(
            "[CreatePointCloudFromRGBDImage] Unsupported image format.\n");
    return std::make_shared<PointCloud>();
}
That means Open3D 0.7.0 does not support it either, but it will only warn you, and only in debug mode! After that, it returns an empty point cloud (actually both versions do this), which explains the blank window.
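As a quick sanity check (a sketch; pcd being whatever create_point_cloud_from_rgbd_image returned), you can test for this case before opening a window:

# A blank window usually means the returned cloud is empty
if not pcd.has_points():
    print("Empty point cloud - check the colour image dtype and channels")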
Now you should know: you can set convert_rgb_to_intensity=True and succeed, though you should still normalize your color image first. Or you can convert the color image to the range [0, 255] and type uint8. Both will work.
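For example, applied to the img_array from the question (a sketch, assuming the RGB floats are in [0, 1] and using the 0.7.0-style function names):

# Option 1: 3-channel uint8 colour image
colour = o3d.geometry.Image(
    (np.clip(img_array[:, :, :3], 0, 1) * 255).astype(np.uint8))
rgbd = o3d.create_rgbd_image_from_color_and_depth(
    colour, depth, convert_rgb_to_intensity=False, depth_scale=1000)

# Option 2: normalized float colours, converted to intensity by Open3D
rgb = img_array[:, :, :3].astype(np.float32)
rgb /= max(rgb.max(), 1e-6)
rgbd = o3d.create_rgbd_image_from_color_and_depth(
    o3d.geometry.Image(rgb), depth, convert_rgb_to_intensity=True, depth_scale=1000)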
Now I hope all is clear. Hooray!
P.S. I usually find the Open3D source code easy to read. If you know C++, you can read it whenever something like this happens.
Update 2020-3-27:
I cannot reproduce your result and I don't know why it happened (you should provide a minimal reproducible example).
Using the image you provided in the comment, I wrote the following code and it works well. If it still doesn't work on your computer, maybe your Open3D is broken.
(I'm not familiar with .exr images, hence the data extraction might be ugly :)
import Imath
import array
import OpenEXR
import numpy as np
import open3d as o3d
# extract data from exr files
f = OpenEXR.InputFile('frame.exr')
FLOAT = Imath.PixelType(Imath.PixelType.FLOAT)
cs = list(f.header()['channels'].keys()) # channels
data = [np.array(array.array('f', f.channel(c, FLOAT))) for c in cs]
data = [d.reshape(720, 1280) for d in data]
rgb = np.concatenate([data[i][:, :, np.newaxis] for i in [3, 2, 1]], axis=-1)
# rgb /= np.max(rgb) # this will result in a much darker image
rgb = np.clip(rgb, 0, 1.0) # to better visualize as HDR is not supported? (np.clip returns a new array, so assign it back)
# get rgbd image
img = o3d.geometry.Image((rgb * 255).astype(np.uint8))
depth = o3d.geometry.Image((data[-1] * 1000).astype(np.uint16))
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(img, depth, convert_rgb_to_intensity=False)
# some guessed intrinsics
intr = o3d.open3d.camera.PinholeCameraIntrinsic(1280, 720, fx=570, fy=570, cx=640, cy=360)
# get point cloud and visualize
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intr)
o3d.visualization.draw_geometries([pcd])
And the result is:
Original answer:
You have misunderstood the meaning of depth_scale.
Use this line:
depth = o3d.geometry.Image(np.array(img_array[:, :, 3]*1000).astype('uint16'))
The Open3D documentation says the depth values will first be scaled and then truncated. What it actually means is that the pixel values in your depth image are divided by this number rather than multiplied, as you can see in the Open3D source:
std::shared_ptr<Image> Image::ConvertDepthToFloatImage(
        double depth_scale /* = 1000.0*/, double depth_trunc /* = 3.0*/) const {
    // don't need warning message about image type
    // as we call CreateFloatImage
    auto output = CreateFloatImage();
    for (int y = 0; y < output->height_; y++) {
        for (int x = 0; x < output->width_; x++) {
            float *p = output->PointerAt<float>(x, y);
            *p /= (float)depth_scale;
            if (*p >= depth_trunc) *p = 0.0f;
        }
    }
    return output;
}
Actually, we usually take it for granted that values in depth images are integers (I guess that's why Open3D did not point this out clearly in its documentation), since floating-point images are less common. You cannot store 1.34 metres in a .png image without losing precision, so we store 1340 in the depth image, and later processing first converts it back to 1.34.
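As a tiny sketch of that convention (made-up values):

import numpy as np

depth_m = np.array([[1.34, 2.00]], dtype=np.float32)  # metres, e.g. from Blender
depth_mm = (depth_m * 1000).astype(np.uint16)         # stored as integer millimetres
# Open3D later divides by depth_scale=1000, recovering 1.34 m and 2.00 m,
# and zeroes out anything beyond depth_trunc.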
As for the camera intrinsics of your depth image, I guess there will be configuration parameters in Blender from when you created it. I'm not familiar with Blender, so I won't talk about it. However, if you do not understand general camera intrinsics, you might want to take a look at this.
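That said, for a simple pinhole model the focal length in pixels follows from the physical camera parameters. A sketch, assuming Blender's default 50 mm lens and 36 mm sensor width (check your own scene settings):

width, height = 1280, 720
focal_mm, sensor_width_mm = 50.0, 36.0        # from Blender's camera settings
fx = fy = width * focal_mm / sensor_width_mm  # about 1778 px for these values
cx, cy = width / 2, height / 2
pinhole_cam = o3d.open3d.camera.PinholeCameraIntrinsic(width, height, fx=fx, fy=fy, cx=cx, cy=cy)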
@Jing Zhao's answer worked! However, I assume his version of Open3D is different from mine; I had to change 2 function calls like this (and changed the naming slightly):
exr_img = OpenEXR.InputFile('frame0100.exr')
cs = list(exr_img.header()['channels'].keys()) # channels
FLOAT = Imath.PixelType(Imath.PixelType.FLOAT)
img_data = [np.array(array.array('f', exr_img.channel(c, FLOAT))) for c in cs]
img_data = [d.reshape(720,1280) for d in img_data]
rgb = np.concatenate([img_data[i][:, :, np.newaxis] for i in [3, 2, 1]], axis=-1)
rgb = np.clip(rgb, 0, 1.0) # to better visualize as HDR is not supported? (assign the result back, np.clip is not in-place)
img = o3d.geometry.Image((rgb * 255).astype(np.uint8))
depth = o3d.geometry.Image((img_data[-1] * 1000).astype(np.uint16))
#####rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(img, depth, convert_rgb_to_intensity=False)
rgbd = o3d.create_rgbd_image_from_color_and_depth(img, depth, convert_rgb_to_intensity=False)
# some guessed intrinsics
intr = o3d.open3d.camera.PinholeCameraIntrinsic(1280, 720, fx=570, fy=570, cx=640, cy=360)
# get point cloud and visualize
#####pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intr)
pcd = o3d.create_point_cloud_from_rgbd_image(rgbd, intr)
o3d.visualization.draw_geometries([pcd])
Otherwise I got the following error:
AttributeError: type object 'open3d.open3d.geometry.RGBDImage' has no attribute 'create_from_color_and_depth'
Hopefully that also helps others with my Python/Open3D version. I'm not quite sure where exactly the mistake in my code was, but I am satisfied to have usable code. Thanks again to Jing Zhao!
Related
I am trying to convert some MATLAB code to Python, related to image-processing.
When I did
% matlab R2017a
nhood = true(5); % gives a 5x5 matrix of logical ones
J = imclose(Image,nhood);
in MATLAB, the result is different from when I did
import cv2 as cv
kernel = np.ones((5,5),np.uint8) # will give result like true(5)
J = cv.morphologyEx(Image,cv.MORPH_CLOSE,kernel)
in Python.
This is the result of MATLAB:
And this is for the Python:
The difference is 210 pixels; see below. The red circle shows the pixels that are 1 in the Python result but not in the MATLAB one.
Sorry if it's hard to see: my image size is 2048x2048 with values 0 and 1, and the error is just 210 pixels.
When I use another library such as skimage.morphology.closing or mahotas.close with the same parameters, I get the same result as MORPH_CLOSE.
What I want to ask is:
Am I using the wrong parameter in Python, like kernel = np.ones((5,5),np.uint8)?
If not, is there any library that will give me exactly the same result as MATLAB's imclose()?
Which of the MATLAB and Python results is correct?
I already looked at this Q&A. When I use borderValue = 0 in MORPH_CLOSE, my result has 2115 differing pixels that are 1 in MATLAB but not in Python.
[ UPDATE ]
The input image is: Input Image
The cropped view of the difference pixels is: cropped difference image
As for the difference-pixels image, it turns out the pixels are not only in that position but scattered across several positions. You can see that here.
Judging from the results, the erroneous pixels coincide with the ends of the rows or columns of the matrix.
I hope that provides more hints for this question.
This is the program in MATLAB that I use to check the error:
mask = zeros(2048,2048); % initialize error matrix
error = 0;
for x = 1:size(J_Matlab,1)
    for y = 1:size(J_Matlab,2)
        if J_Matlab(x,y) == J_Python(x,y)
            mask(x,y) = 0; % no difference
        else
            mask(x,y) = 1;
            error = error + 1;
        end
    end
end
So I load the Python data into MATLAB and compare it with the MATLAB data. If you want to check the data I used as input to the closing function, you can find it in the comment section (in the drive link).
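For reference, the same pixel-by-pixel comparison is a one-liner in NumPy (a sketch, assuming both results are loaded as equally shaped arrays):

import numpy as np

mask = (J_Matlab != J_Python)  # True wherever the two results differ
error = int(np.sum(mask))      # number of differing pixels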
For this problem, my teacher said it was OK to use either the MATLAB or the Python program, because the error is not significant. But if I find the solution, I will post it here ASAP. Thanks for the instructions, suggestions, and critiques on my first post.
I have a grayscale image that I want to apply gamma correction to. Manually, I adjust the middle slider in Photoshop's Levels, without touching the black or white points, until the median value of the histogram reaches 127 or somewhere close to it. I want to do the same through Python.
I looked for a formula that takes the gamma value as input and produces an output image. However, I only found one that adjusts the low and high values, not the gamma (or midtones).
Here, R is the red-channel value of the pixels in my image as a numpy array, H is the high input, and L the low input. I start with L=0 and H=255, and keep passing different values to the function until the returned image has a median of 127 on the histogram.
def get_levelcorrected_image(R, H, L):
    temp = ((R - L) * 255.0) / (H - L)
    temp[temp < 0] = 0
    temp[temp > 255] = 255
    temp[temp == 255] = 0  # added this line because for some reason there were a lot of white pixels where it should have been dark
    final = temp.astype('uint8')
    return final
I've tried searching the GIMP documentation for the formula behind the gamma slider, and found code that looks like this:
if (gray)
  {
    gdouble input;
    gdouble range;
    gdouble inten;
    gdouble out_light;
    gdouble lightness;

    /* Calculate lightness value */
    lightness = GIMP_RGB_LUMINANCE (gray->r, gray->g, gray->b);

    input = gimp_levels_config_input_from_color (channel, gray);

    range = config->high_input[channel] - config->low_input[channel];
    if (range <= 0)
      goto out;

    input -= config->low_input[channel];
    if (input < 0)
      goto out;

    /* Normalize input and lightness */
    inten = input / range;
    out_light = lightness / range;

    /* See bug 622054: picking pure black or white as gamma doesn't
     * work. But we cannot compare to 0.0 or 1.0 because cpus and
     * compilers are shit. If you try to check out_light using
     * printf() it will give exact 0.0 or 1.0 anyway, probably
     * because the generated code is different and out_light doesn't
     * live in a register. That must be why the cpu/compiler mafia
     * invented epsilon and defined this shit to be the programmer's
     * responsibility.
     */
    if (out_light <= 0.0001 || out_light >= 0.9999)
      goto out;

    /* Map selected color to corresponding lightness */
    config->gamma[channel] = log (inten) / log (out_light);
    config->gamma[channel] = CLAMP (config->gamma[channel], 0.1, 10.0);

    g_object_notify (G_OBJECT (config), "gamma");
I know that the gamma input value should be between 0.1 and 10, and that it's probably a log or quadratic function. I'm not sure this is the right code fragment, as it seems to take the high, low, and pixel values and produce a gamma value (I may be mistaken), whereas I want to input the gamma value and get a corrected image.
The problem is that while the image produced with this method is close to what I want, it's not the same as moving the gamma slider in, say, Photoshop or GIMP.
I'm an amateur at image manipulation with code, and this is my first question on Stack Overflow, so please forgive me for anything stupid I may have asked.
I think this is correct, but please try it out thoroughly before putting it into production:
#!/usr/bin/env python3
import numpy as np
import math
from PIL import Image
# Load starting image and ensure greyscale. Make into Numpy array for doing maths.
im = Image.open('start.png').convert('L')
imnp = np.array(im)/255
# DEBUG: print(imnp.max())
# DEBUG: print(imnp.min())
# DEBUG: print(imnp.mean())
# Calculate new gamma
gamma=math.log(imnp.mean())/math.log(0.5)
# Apply new gamma to image
new = ((imnp**(1/gamma))*255).astype(np.uint8)
# Convert back to PIL from Numpy and save
Image.fromarray(new).save('result.png')
It turns this:
into this:
If your images are large, it might be worth making a LUT (Look-Up Table) of the 256 possible greyscale values and applying that, rather than computing powers for every pixel.
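A minimal sketch of that, assuming the same uint8 image im and the gamma computed above:

import numpy as np

# Build the 256-entry table once, then map every pixel through it
lut = np.array([255 * (i / 255) ** (1 / gamma) for i in range(256)],
               dtype=np.uint8)
new = lut[np.array(im)]  # fancy indexing applies the table to the whole image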
Or, if you don't feel like writing any Python, you can just use ImageMagick, which is installed on most Linux distros and is available for macOS and Windows. So, just in a Terminal (or Command Prompt on Windows):
magick input.png -auto-gamma result.png
I have a skeleton as binary pixels, such as this:
I would like to find the coordinates of the end points of this skeleton (in this case there are four), using Open CV if applicable.
Efficiency is important as I'm analysing a number of these in real time from a video feed and need to be doing lots of other things at the same time.
(Note, apologies that the screenshot above has resizing artefacts, but it is an 8-connected skeleton I am working with.)
Given the tags of your questions and answers in your profile, I'm going to assume you want a C++ implementation. When you skeletonize an object, it should have a 1 pixel thickness. Therefore, one thing I'd suggest is to find the pixels that are non-zero in your image, then search the 8-connected neighbourhood surrounding each such pixel and count the pixels that are non-zero. If the count is exactly 2 (the pixel itself plus one neighbour), that is a candidate for a skeleton endpoint. Note that I'm also going to ignore the border so we don't go out of bounds. If the count is 1, it's a noisy isolated pixel, so we should ignore it. If it's 3 or more, then you're examining a point within the skeleton, or a point where multiple lines are connected together, so this shouldn't be an endpoint either.
I honestly can't think of an algorithm other than checking all of the skeleton pixels against this criterion... so the complexity will be O(mn), where m and n are the rows and columns of your image. For each pixel, the 8-pixel neighbourhood check takes constant time, and it is the same for every skeleton pixel you check. In practice it will run much faster than the worst case suggests, as the majority of pixels in your image will be 0 and the neighbourhood check is skipped for them.
As such, this is something I would try, assuming that your image is stored in a cv::Mat structure called im, it being a single-channel (grayscale) image of type uchar. I'm also going to store the coordinates of the skeleton end points in a std::vector. Every time we detect a skeleton endpoint, we add two integers to the vector at a time: the row and the column where we detected it.
// Declare variable to count neighbourhood pixels
int count;

// To store a pixel intensity
uchar pix;

// To store the ending co-ordinates
std::vector<int> coords;

// For each pixel in our image...
for (int i = 1; i < im.rows-1; i++) {
    for (int j = 1; j < im.cols-1; j++) {

        // See what the pixel is at this location
        pix = im.at<uchar>(i,j);

        // If not a skeleton point, skip
        if (pix == 0)
            continue;

        // Reset counter
        count = 0;

        // For each pixel in the neighbourhood
        // centered at this skeleton location...
        for (int y = -1; y <= 1; y++) {
            for (int x = -1; x <= 1; x++) {

                // Get the pixel in the neighbourhood
                pix = im.at<uchar>(i+y,j+x);

                // Count if non-zero
                if (pix != 0)
                    count++;
            }
        }

        // If count is exactly 2, add co-ordinates to vector
        if (count == 2) {
            coords.push_back(i);
            coords.push_back(j);
        }
    }
}
If you want to show the co-ordinates when you're done, just check every pair of elements in this vector:
for (int i = 0; i < coords.size() / 2; i++)
    cout << "(" << coords.at(2*i) << "," << coords.at(2*i+1) << ")\n";
To be complete, here's a Python implementation as well. I'm using some of numpy's functions to make this easier for myself. Assuming that your image is stored in img, which is also a grayscale image, and importing the OpenCV library and numpy (i.e. import cv2, import numpy as np), this is the equivalent code:
# Find row and column locations that are non-zero
(rows,cols) = np.nonzero(img)

# Initialize empty list of co-ordinates
skel_coords = []

# For each non-zero pixel...
for (r,c) in zip(rows,cols):

    # Extract an 8-connected neighbourhood
    (col_neigh,row_neigh) = np.meshgrid(np.array([c-1,c,c+1]), np.array([r-1,r,r+1]))

    # Cast to int to index into image
    col_neigh = col_neigh.astype('int')
    row_neigh = row_neigh.astype('int')

    # Convert into a single 1D array and check for non-zero locations
    pix_neighbourhood = img[row_neigh,col_neigh].ravel() != 0

    # If the number of non-zero locations equals 2, add this to
    # our list of co-ordinates
    if np.sum(pix_neighbourhood) == 2:
        skel_coords.append((r,c))
To show the co-ordinates of the end points, you can do:
print "".join(["(" + str(r) + "," + str(c) + ")\n" for (r,c) in skel_coords])
Minor note: This code is untested. I don't have C++ OpenCV installed on this machine so hopefully what I wrote will work. If it doesn't compile, you can certainly translate what I have done into the right syntax. Good luck!
A bit late, but this still might be useful for people!
There's a way of doing the exact same thing as @rayryeng suggests, but with the built-in functions of OpenCV! This makes it much smaller, and probably way faster (especially in Python, if you are using that, as I am). It is the same solution as this one.
Basically, what we are trying to find are the pixels that are non-zero and have exactly one non-zero neighbour. So we use OpenCV's built-in filter2D function to convolve the skeleton image with a custom kernel that we make. I just learned about convolution and kernels, and this page is really helpful for explaining what these things mean.
So, what kernel would work? How about
[[1, 1,1],
[1,10,1],
[1, 1,1]]?
Then, after applying this kernel, any pixel with the value 11 is one that we want!
Here is what I use:
def skeleton_endpoints(skel):
    # Make our input nice, possibly necessary.
    skel = skel.copy()
    skel[skel!=0] = 1
    skel = np.uint8(skel)

    # Apply the convolution.
    kernel = np.uint8([[1,  1, 1],
                       [1, 10, 1],
                       [1,  1, 1]])
    src_depth = -1
    filtered = cv2.filter2D(skel, src_depth, kernel)

    # Look through to find the value of 11.
    # This returns a mask of the endpoints, but if you
    # just want the coordinates, you could simply
    # return np.where(filtered == 11)
    out = np.zeros_like(skel)
    out[np.where(filtered == 11)] = 1
    return out
Edit: this technique will not work for some skeletons; for example, it misses endpoints in the "staircase" pattern
000
010
110
See comments for more info.
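Usage might look like this (a sketch; skel being your binary skeleton image):

endpoints = skeleton_endpoints(skel)
(rows, cols) = np.where(endpoints == 1)  # row/column of each endpoint
print(list(zip(rows, cols)))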
Here is my Python implementation:
import cv2
import numpy as np

path = 'sample_image.png'
img = cv2.imread(path, 0)

# Find positions of non-zero pixels
(rows, cols) = np.nonzero(img)

# Initialize empty list of coordinates
endpoint_coords = []

# Loop through all non-zero pixels
for (r, c) in zip(rows, cols):
    # Clamp the 3x3 neighbourhood to the image borders
    top = max(0, r - 1)
    right = min(img.shape[1] - 1, c + 1)
    bottom = min(img.shape[0] - 1, r + 1)
    left = max(0, c - 1)
    sub_img = img[top: bottom + 1, left: right + 1]

    # The pixel itself (255) plus exactly one neighbour (255)
    if np.sum(sub_img) == 255*2:
        endpoint_coords.append((r, c))

print(endpoint_coords)
It seems Adobe Photoshop does posterization by quantizing each color channel separately, based on the number of levels specified. For example, if you specify 2 levels, it will take the R value and set it to 0 if it is less than 128, or 255 if it is >= 128, and do the same for G and B.
Is there an efficient way to do this in python with OpenCV besides iterating through each pixel and making that comparison and setting the value separately? Since an image in OpenCV 2.4 is a NumPy ndarray, is there perhaps an efficient way to do this calculation strictly through NumPy?
Your question specifically asks about 2 levels, but what about more than 2? So I have added code below which can posterize to any number of levels.
import numpy as np
import cv2
im = cv2.imread('messi5.jpg')
n = 2 # Number of levels of quantization
indices = np.arange(0,256) # List of all colors
divider = np.linspace(0,255,n+1)[1] # we get a divider
quantiz = np.int0(np.linspace(0,255,n)) # we get quantization colors
color_levels = np.clip(np.int0(indices/divider),0,n-1) # color levels 0,1,2..
palette = quantiz[color_levels] # Creating the palette
im2 = palette[im] # Applying palette on image
im2 = cv2.convertScaleAbs(im2) # Converting image back to uint8
cv2.imshow('im2',im2)
cv2.waitKey(0)
cv2.destroyAllWindows()
This code uses what is called the palette method in NumPy, which is much faster than iterating through the pixels. You can find more details on how it can be used to speed up code here: Fast Array Manipulation in Numpy.
Below are the results I obtained for different levels:
Original Image :
Level 2 :
Level 4 :
Level 8 :
And so on...
We can do this quite neatly using numpy, without having to worry about the channels at all!
import cv2
im = cv2.imread('1_tree_small.jpg')
im[im >= 128]= 255
im[im < 128] = 0
cv2.imwrite('out.jpg', im)
output:
input:
The coolest "posterization" I have seen uses
Mean Shift Segmentation. I used the code
from the author's GitHub repo to create the following image (you need to
uncomment line 27 of Maincpp.cpp to perform the segmentation step).
Use cv::LUT(). It is the simplest and fastest way.
cv::Mat posterize(const cv::Mat &bgrmat, uint8_t lvls)
{
    cv::Mat lookUpTable(1, 256, CV_8U);
    uchar* p = lookUpTable.ptr();
    float step = 255.0f / lvls;
    for(int i = 0; i < 256; ++i)
        p[i] = static_cast<uchar>(step * std::floor(i / step));
    cv::Mat omat;
    cv::LUT(bgrmat, lookUpTable, omat);
    return omat;
}
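If you'd rather stay in Python, the same lookup-table idea might look like this with cv2.LUT (a sketch, not benchmarked against the C++ version):

import cv2
import numpy as np

def posterize(bgr, lvls):
    # Build the same 256-entry table as the C++ code above
    step = 255.0 / lvls
    lut = (step * np.floor(np.arange(256) / step)).astype(np.uint8)
    return cv2.LUT(bgr, lut)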
A generalization of fraxel's answer to n levels:
import cv2 as cv
import matplotlib.pyplot as plt

im = cv.imread("Lenna.png")
n = 5
for i in range(n):
    im[(im >= i*255/n) & (im < (i+1)*255/n)] = i*255/(n-1)

plt.imshow(cv.cvtColor(im, cv.COLOR_BGRA2RGB))
plt.show()
n = 2
n = 5
So I have an array (it's large, 2048x2048), and I would like to do some element-wise operations depending on where the elements are. I'm very confused about how to do this (I was told not to use for loops, and when I tried that my IDE froze and it was going really slow).
Onto the question:
h = aperatureimage
h[:,:] = 0
indices = np.where(aperatureimage>1)
for True in h:
    h[index] = np.exp(1j*k*z)*np.exp(1j*k*(x**2+y**2)/(2*z))/(1j*wave*z)
So I have an index, which is (I'm assuming here) essentially a 'cropped' version of my larger aperatureimage array. *Note: aperatureimage is a grayscale image converted to an array; it has a shape or text on it, and I would like to find all the 'white' regions of the aperture and perform my operation on them.
How can I access the individual x/y values of index so that I can perform my exponential operation? When I try index[:,None], the program spits out 'ValueError: broadcast dimensions too large'. I also get 'array is not broadcastable to correct shape'. Any help would be appreciated!
One more clarification: x and y are the only values I would like to change (essentially the points in my array where there is white; z, k, and whatever else are defined previously).
EDIT:
I'm not sure the code I posted above is correct; it returns two empty arrays. When I do this, though:
index = (aperatureimage==1)
print len(index)
Actually, nothing I've done so far works correctly. I have a 2048x2048 image with a 128x128 white square in the middle of it. I would like to convert this image to an array, look through all the values, and determine the index values (x,y) where the array is not black (I only have white/black; a bilevel image didn't work for me). I would then like to take all the values (x,y) where the array is not 0 and multiply them by the h[index] value listed above.
I can post more information if necessary. If you can't tell, I'm stuck.
EDIT2: Here's some code that might help - I think I have the problem above solved (I can now access members of the array and perform operations on them). But for some reason the Fx values in my for loop never increase; it loops over Fy forever...
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor

def createImage(aperature, type):
    imsize = aperature*8 # Add 0 padding to make it nice
    middle = imsize/2 # The middle (physical 0) of our image will be the imagesize/2
    im = Image.new("L", (imsize,imsize)) # Make a grayscale image with imsize*imsize pixels
    draw = ImageDraw.Draw(im) # Create a new draw method
    box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2)) # Bounding box for aperature
    if type == 'Rectangle':
        draw.rectangle(box, fill = 'white') # Draw rectangle in the box and color it white
    del draw
    return im, middle

def Diffraction(aperaturediameter = 1, type = 'Rectangle', z = 2000000, wave = .001):
    # Constants
    deltaF = 1/8 # Image will be 8mm wide
    z = 1/3.
    wave = 0.001
    k = 2*pi/wave

    # Now let's get to work
    aperature = aperaturediameter * 128 # Aperaturediameter (in mm) to some pixels
    im, middle = createImage(aperature, type) # Create an image depending on type of aperature
    aperaturearray = np.array(im) # Turn image into numpy array

    # Fourier Transform of Aperature
    Ta = np.fft.fftshift(np.fft.fft2(aperaturearray))/(len(aperaturearray))

    # Transforming and calculating of Transfer Function Method
    H = aperaturearray.copy() # Copy image so H (transfer function) has the same dimensions as aperaturearray
    H[:,:] = 0 # Set H to 0
    U = aperaturearray.copy()
    U[:,:] = 0
    index = np.nonzero(aperaturearray) # Find nonzero elements of aperaturearray
    H[index[0],index[1]] = np.exp(1j*k*z)*np.exp(-1j*k*wave*z*((index[0]-middle)**2+(index[1]-middle)**2)) # Free space transfer for ap array
    Utfm = abs(np.fft.fftshift(np.fft.ifft2(Ta*H))) # Compute intensity at distance z

    # Fourier Integral Method
    apindex = np.nonzero(aperaturearray)
    U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
    Ufim = abs(np.fft.fftshift(np.fft.fft2(U))/len(U))

    # Save image
    fim = Image.fromarray(np.uint8(Ufim))
    fim.save("PATH\Fim.jpg")
    ftfm = Image.fromarray(np.uint8(Utfm))
    ftfm.save("PATH\FTFM.jpg")
    print "that may have worked..."
    return

if __name__ == '__main__':
    Diffraction()
You'll need numpy, scipy, and PIL to work with this code.
When I run this, it goes through the code, but there is no data in the output images (everything is black). Now I have a real problem here, as I don't entirely understand the math I'm doing (this is for HW), and I don't have a firm grasp of Python.
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Should that line work for performing element-wise calculations on my array?
Could you perhaps post a minimal, yet complete, example? One that we can copy/paste and run ourselves?
In the meantime, in the first two lines of your current example:
h = aperatureimage
h[:,:] = 0
you set both 'aperatureimage' and 'h' to 0. That's probably not what you intended. You might want to consider:
h = aperatureimage.copy()
This generates a copy of aperatureimage, while your code simply points h to the same array as aperatureimage, so changing one changes the other.
Be aware that copying very large arrays might cost you more memory than you would prefer.
What I think you are trying to do is this:
import numpy as np

N = 2048
M = 64
a = np.zeros((N, N))
a[N//2-M:N//2+M, N//2-M:N//2+M] = 1  # // keeps the indices integers

x, y = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N))
b = a.copy()

indices = np.where(a > 0)
b[indices] = np.exp(x[indices]**2 + y[indices]**2)
Or something similar. This, in any case, sets some values in 'b' based on the x/y coordinates where 'a' is bigger than 0. Try visualizing it with imshow. Good luck!
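For instance (a sketch):

import matplotlib.pyplot as plt

plt.imshow(b)  # a bright square of exp values on a zero background
plt.colorbar()
plt.show()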
Concerning the edit
You should normalize your output so it fits in an 8-bit integer. Currently, one of your arrays has a maximum value much larger than 255 and the other a maximum much smaller. Try this instead:
fim = Image.fromarray(np.uint8(255*Ufim/np.amax(Ufim)))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(255*Utfm/np.amax(Utfm)))
ftfm.save("PATH\FTFM.jpg")
Also consider np.zeros_like() instead of copying and clearing H and U.
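For example (a sketch). Note that aperaturearray comes from an image and is integer-typed, so the complex values you assign later would be silently truncated (or rejected by newer NumPy) unless you change the dtype:

import numpy as np

H = np.zeros_like(aperaturearray, dtype=complex)
U = np.zeros_like(aperaturearray, dtype=complex)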
Finally, I personally very much like working with ipython when developing something like this. If you put the code from your Diffraction function at the top level of your script (in place of the if __name__ == '__main__' block), you can access the variables directly from ipython. A quick command like np.amax(Utfm) would show you that there are indeed values != 0. imshow() is always nice for looking at matrices.