It seems Adobe Photoshop does posterization by quantizing each color channel separately, based on the number of levels specified. So for example, if you specify 2 levels, then it will take the R value, and set it to 0 if your R value is less than 128 or 255 if your value is >= 128. It will do the same for G and B.
Is there an efficient way to do this in python with OpenCV besides iterating through each pixel and making that comparison and setting the value separately? Since an image in OpenCV 2.4 is a NumPy ndarray, is there perhaps an efficient way to do this calculation strictly through NumPy?
Your question specifically asks about 2 levels, but what about more than 2? So I have added code below that can posterize to any number of color levels.
import numpy as np
import cv2
im = cv2.imread('messi5.jpg')
n = 2 # Number of levels of quantization
indices = np.arange(0,256) # List of all colors
divider = np.linspace(0,255,n+1)[1] # we get a divider
quantiz = np.int0(np.linspace(0,255,n)) # we get quantization colors
color_levels = np.clip(np.int0(indices/divider),0,n-1) # color levels 0,1,2..
palette = quantiz[color_levels] # Creating the palette
im2 = palette[im] # Applying palette on image
im2 = cv2.convertScaleAbs(im2) # Converting image back to uint8
cv2.imshow('im2',im2)
cv2.waitKey(0)
cv2.destroyAllWindows()
This code uses what is often called the palette method in NumPy, which is much faster than iterating through the pixels. You can find more details on how it can be used to speed up code here: Fast Array Manipulation in Numpy
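To see why the palette lookup is fast, here is a tiny self-contained sketch of the same fancy-indexing idea on a toy array (the values are made up for illustration):
import numpy as np

# A hypothetical 2x2 single-channel "image" with values in 0..255.
toy = np.array([[0, 100], [200, 255]], dtype=np.uint8)

# A 256-entry lookup table: every possible value maps to one of two levels.
palette = np.where(np.arange(256) < 128, 0, 255).astype(np.uint8)

# Fancy indexing applies the whole table in one vectorized step.
print(palette[toy])
# [[  0   0]
#  [255 255]]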
Below are the results I obtained for different levels:
Original Image :
Level 2 :
Level 4 :
Level 8 :
And so on...
We can do this quite neatly using numpy, without having to worry about the channels at all!
import cv2
im = cv2.imread('1_tree_small.jpg')
im[im >= 128]= 255
im[im < 128] = 0
cv2.imwrite('out.jpg', im)
output:
input:
The coolest "posterization" I have seen uses
Mean Shift Segmentation. I used the code
from the author's GitHub repo to create the following image (you need to
uncomment line 27 of Maincpp.cpp to perform the segmentation step).
Use cv::LUT(). It is the simplest and fastest way.
cv::Mat posterize(const cv::Mat &bgrmat, uint8_t lvls)
{
cv::Mat lookUpTable(1, 256, CV_8U);
uchar* p = lookUpTable.ptr();
float step = 255.0f / lvls;
for(int i = 0; i < 256; ++i)
p[i] = static_cast<uchar>(step * std::floor(i / step));
cv::Mat omat;
cv::LUT(bgrmat,lookUpTable,omat);
return omat;
}
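If you prefer Python, a rough equivalent of the same lookup-table approach might look like this (a sketch; 'input.jpg' and the level count are placeholders):
import cv2
import numpy as np

def posterize(bgr, levels=4):
    # Build a 256-entry lookup table that snaps each value to its bucket.
    step = 255.0 / levels
    lut = (step * np.floor(np.arange(256) / step)).astype(np.uint8)
    # cv2.LUT applies the table to every channel independently.
    return cv2.LUT(bgr, lut)

img = cv2.imread('input.jpg')
out = posterize(img, levels=4)
cv2.imwrite('posterized.jpg', out)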
A generalization of fraxel's answer to n levels:
import cv2 as cv
import matplotlib.pyplot as plt
im = cv.imread("Lenna.png")
n = 5
for i in range(n):
im[(im >= i*255/n) & (im < (i+1)*255/n)] = i*255/(n-1)
plt.imshow(cv.cvtColor(im, cv.COLOR_BGRA2RGB))
plt.show()
n = 2
n = 5
I am trying to make my image sepia, but I get the wrong result and I can't see why. Is my sepia filter formula incorrect?
from PIL import Image
import numpy as np

im = Image.open("some.jpg")
image = np.asarray(im)
sepia_image = np.empty_like(image)
for i in range(image.shape[0]):
for j in range(image.shape[1]):
sepia_image[i][j][0] = 0.393*image[i][j][0] + 0.769*image[i][j][1] + 0.189*image[i][j][2]
sepia_image[i][j][1] = 0.349*image[i][j][0] + 0.686*image[i][j][1] + 0.168*image[i][j][2]
sepia_image[i][j][2] = 0.272*image[i][j][0] + 0.534*image[i][j][1] + 0.131*image[i][j][2]
for k in range(image.shape[2]):
if sepia_image[i][j][k] > 255:
sepia_image[i][j][k] = 255
sepia_image = sepia_image.astype("uint8")
Image.fromarray(sepia_image).show()
The image I get is this:
The problem is that your values are going out of bounds.
For example, using your formula on my example image below, the red channel in the first pixel ends up being 205*0.393 + 206*0.769 + 211*0.189, which is 278. If you are using unsigned 8-bit integers, this will overflow to 22.
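As a quick illustration of that wraparound (the pixel values here are just the ones from the example above):
import numpy as np

# 205*0.393 + 206*0.769 + 211*0.189 is roughly 278.86
value = 205 * 0.393 + 206 * 0.769 + 211 * 0.189
print(value)  # ~278.86

# Storing 278 in a uint8 wraps modulo 256: 278 - 256 = 22
print(np.array([int(value)]).astype(np.uint8))  # [22]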
To fix it, you need to use floats and clip the range back to 0 to 255, for example by using this instead of your np.empty_like() instantiation:
sepia_image = np.zeros_like(image, dtype=float)
Then, after running your loops:
sepia_image = sepia_image.astype(np.uint8)
Then your code works on my image at least.
Unsolicited advice: don't use loops
Another issue is the difficulty of debugging code like this. In general, you want to avoid loops over arrays in Python: they're slow and tend to require more code. Instead, take advantage of NumPy's elementwise maths. For example, you can use np.matmul (or the @ operator, which does the same thing) like so:
from io import BytesIO
import requests
import numpy as np
from PIL import Image
# Image CC BY-SA Leiju / Wikimedia Commons
uri = 'https://upload.wikimedia.org/wikipedia/commons/thumb/3/35/Neckertal_20150527-6384.jpg/640px-Neckertal_20150527-6384.jpg'
r = requests.get(uri)
img = Image.open(BytesIO(r.content))
# Turn this PIL Image into a NumPy array.
imarray = np.asarray(img)[..., :3] / 255
# Make a `sepia` multiplier.
sepia = np.array([[0.393, 0.349, 0.272],
[0.769, 0.686, 0.534],
[0.189, 0.168, 0.131]])
# Compute the result and clip back to 0 to 1.
imarray_sepia = np.clip(imarray @ sepia, 0, 1)
This produces:
So I tried creating a point cloud with the Open3D library in Python; in the end it's basically just the two lines referenced here, but when I run my code (see below) all I get is a white window popping up. I've run it in a Jupyter notebook, and running it as a Python script from the console didn't change anything, nor did it throw an error.
I should mention that I created the images in Blender and saved them as OpenEXR, meaning the depth values range between 0 and 4 (I've truncated the background to 4). You can see that they are proper images below, and I could also convert them to Open3D images and display them without problems.
Edit (27-03-2020): Added complete minimal example
import OpenEXR
import numpy as np
import array
import matplotlib.pyplot as plt
import open3d as o3d
%matplotlib inline
exr_img = OpenEXR.InputFile('frame0100.exr')
depth_img = array.array('f', exr_img.channel('View Layer.Depth.Z'))
r_img = array.array('f', exr_img.channel('View Layer.Combined.R'))
g_img = array.array('f', exr_img.channel('View Layer.Combined.G'))
b_img = array.array('f', exr_img.channel('View Layer.Combined.B'))
def reshape_img(img):
return np.array(img).reshape(720, 1280)
img_array = np.dstack((reshape_img(r_img),
reshape_img(g_img),
reshape_img(b_img),
reshape_img(depth_img)))
#Background returns very large value, truncate it to 4
img_array[img_array == 10000000000.0] = 4
colour = o3d.geometry.Image(np.array(img_array[:, :, :3]))
depth = o3d.geometry.Image(np.array(img_array[:, :, 3]*1000).astype('uint16'))
o3d.draw_geometries([depth])
pinhole_cam = o3d.open3d.camera.PinholeCameraIntrinsic(width= 1280, height=720, cx=640,cy=360,fx=500,fy=500)
rgbd = o3d.create_rgbd_image_from_color_and_depth(colour, depth, convert_rgb_to_intensity = False, depth_scale=1000)
pcd = o3d.create_point_cloud_from_rgbd_image(rgbd, pinhole_cam)
o3d.draw_geometries([pcd])
Please overlook the hacky way of importing the data; as I am new to Open3D and produced the data myself, I did it step by step to check and confirm the data's integrity.
I assume it might have to do with my parameters for the pinhole camera. To be honest, I have no idea what the proper parameters would be, except that cx, cy should be the centre of the image and fx, fy should be sensible. As my depth values are in Blender metres but Open3D apparently expects millimetres, the scaling should make sense.
I'd appreciate it if you could give me any help debugging it. But if you were to point me in the direction of a better working library to create 3D point clouds from images I wouldn't mind either. The documentation I found with Open3D is lacking at best.
In short, Open3D expects your 3-channel color image to be of uint8 type.
Otherwise, it would return an empty point cloud, resulting in the blank window you see.
Update 2020-3-27, late night in my time zone:)
Now that you have provided your code, let's dive in!
From your function names, I guess you are using Open3D 0.7.0 or something like that. The code I provided is for 0.9.0; some function names have changed and new functionality has been added.
When I run your code in 0.9.0 (after some minor modifications of course), there's a RuntimeError:
RuntimeError: [Open3D ERROR] [CreatePointCloudFromRGBDImage] Unsupported image format.
And we can see from the Open3D 0.9.0 source that your color image must be of 3 channels and take only 1 byte each (uint8) or be of 1 channel and take 4 bytes (float, that means intensity image):
std::shared_ptr<PointCloud> PointCloud::CreateFromRGBDImage(
const RGBDImage &image,
const camera::PinholeCameraIntrinsic &intrinsic,
const Eigen::Matrix4d &extrinsic /* = Eigen::Matrix4d::Identity()*/,
bool project_valid_depth_only) {
if (image.depth_.num_of_channels_ == 1 &&
image.depth_.bytes_per_channel_ == 4) {
if (image.color_.bytes_per_channel_ == 1 &&
image.color_.num_of_channels_ == 3) {
return CreatePointCloudFromRGBDImageT<uint8_t, 3>(
image, intrinsic, extrinsic, project_valid_depth_only);
} else if (image.color_.bytes_per_channel_ == 4 &&
image.color_.num_of_channels_ == 1) {
return CreatePointCloudFromRGBDImageT<float, 1>(
image, intrinsic, extrinsic, project_valid_depth_only);
}
}
utility::LogError(
"[CreatePointCloudFromRGBDImage] Unsupported image format.");
return std::make_shared<PointCloud>();
}
Otherwise, there'll be errors like I encountered.
However, in the version of 0.7.0, the source code is:
std::shared_ptr<PointCloud> CreatePointCloudFromRGBDImage(
const RGBDImage &image,
const camera::PinholeCameraIntrinsic &intrinsic,
const Eigen::Matrix4d &extrinsic /* = Eigen::Matrix4d::Identity()*/) {
if (image.depth_.num_of_channels_ == 1 &&
image.depth_.bytes_per_channel_ == 4) {
if (image.color_.bytes_per_channel_ == 1 &&
image.color_.num_of_channels_ == 3) {
return CreatePointCloudFromRGBDImageT<uint8_t, 3>(image, intrinsic,
extrinsic);
} else if (image.color_.bytes_per_channel_ == 4 &&
image.color_.num_of_channels_ == 1) {
return CreatePointCloudFromRGBDImageT<float, 1>(image, intrinsic,
extrinsic);
}
}
utility::PrintDebug(
"[CreatePointCloudFromRGBDImage] Unsupported image format.\n");
return std::make_shared<PointCloud>();
}
That means Open3D still does not support it, but it would only warn you. And only in debug mode!
After that, it will return an empty point cloud. (Actually both versions do this.) That explains the blank window.
Now you know that you can set convert_rgb_to_intensity=True and succeed, though you should still normalize your color image first.
Or you can convert the color image to be in range [0, 255] and of type uint8.
Both will work.
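For illustration, the two options might look roughly like this, using the 0.9.0-style API from this answer (the arrays here are random stand-ins, not the question's actual EXR data):
import numpy as np
import open3d as o3d

# Random stand-ins for the question's colour and depth data.
rgb = np.random.rand(720, 1280, 3)                                                  # float colours
depth = o3d.geometry.Image((np.random.rand(720, 1280) * 4000).astype(np.uint16))    # depth in mm

# Option 1: keep a float colour image, normalized to [0, 1], and let Open3D reduce it to intensity.
colour_float = o3d.geometry.Image((rgb / rgb.max()).astype(np.float32))
rgbd_intensity = o3d.geometry.RGBDImage.create_from_color_and_depth(
    colour_float, depth, convert_rgb_to_intensity=True)

# Option 2: scale the colours to 0..255 and cast to uint8 to keep the RGB information.
colour_uint8 = o3d.geometry.Image((np.clip(rgb, 0, 1) * 255).astype(np.uint8))
rgbd_colour = o3d.geometry.RGBDImage.create_from_color_and_depth(
    colour_uint8, depth, convert_rgb_to_intensity=False)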
Now I hope all is clear. Hooray!
P.S. Actually I usually found Open3D source code to be easy to read. And if you know C++, you could read it whenever something like this happens.
Update 2020-3-27:
I cannot reproduce your result and I don't know why it happened (you should provide a minimal reproducible example).
Using the image you provided in the comment, I wrote the following code and it works well. If it still doesn't work on your computer, maybe your Open3D is broken.
(I'm not familiar with .exr images, hence the data extraction might be ugly :)
import Imath
import array
import OpenEXR
import numpy as np
import open3d as o3d
# extract data from exr files
f = OpenEXR.InputFile('frame.exr')
FLOAT = Imath.PixelType(Imath.PixelType.FLOAT)
cs = list(f.header()['channels'].keys()) # channels
data = [np.array(array.array('f', f.channel(c, FLOAT))) for c in cs]
data = [d.reshape(720, 1280) for d in data]
rgb = np.concatenate([data[i][:, :, np.newaxis] for i in [3, 2, 1]], axis=-1)
# rgb /= np.max(rgb) # this will result in a much darker image
rgb = np.clip(rgb, 0, 1.0) # clip to [0, 1] to visualize better, as HDR is not supported?
# get rgbd image
img = o3d.geometry.Image((rgb * 255).astype(np.uint8))
depth = o3d.geometry.Image((data[-1] * 1000).astype(np.uint16))
rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(img, depth, convert_rgb_to_intensity=False)
# some guessed intrinsics
intr = o3d.open3d.camera.PinholeCameraIntrinsic(1280, 720, fx=570, fy=570, cx=640, cy=360)
# get point cloud and visualize
pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intr)
o3d.visualization.draw_geometries([pcd])
And the result is:
Original answer:
You have misunderstood the meaning of depth_scale.
Use this line:
depth = o3d.geometry.Image(np.array(img_array[:, :, 3]*1000).astype('uint16'))
The Open3D documentation says the depth values will first be scaled and then truncated. What it actually means is that the pixel values in your depth image are first divided by this number rather than multiplied by it, as you can see in the Open3D source:
std::shared_ptr<Image> Image::ConvertDepthToFloatImage(
double depth_scale /* = 1000.0*/, double depth_trunc /* = 3.0*/) const {
// don't need warning message about image type
// as we call CreateFloatImage
auto output = CreateFloatImage();
for (int y = 0; y < output->height_; y++) {
for (int x = 0; x < output->width_; x++) {
float *p = output->PointerAt<float>(x, y);
*p /= (float)depth_scale;
if (*p >= depth_trunc) *p = 0.0f;
}
}
return output;
}
Actually we usually take it for granted that values in depth images should be integers (I guess that's why Open3D did not point that out clearly in their documentation), since floating-point images are less common.
You cannot store 1.34 meters directly in a .png image without losing precision, so we store 1340 (millimeters) in the depth image and later processing first converts it back to 1.34.
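A tiny sketch of that round trip (the depth value is made up):
import numpy as np

depth_m = 1.34                              # depth in meters (Blender units)
stored = np.uint16(round(depth_m * 1000))   # stored as integer millimeters: 1340
recovered = stored / 1000.0                 # depth_scale=1000 divides it back: 1.34
print(stored, recovered)                    # 1340 1.34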
As for camera intrinsics for your depth image, I guess there'll be configuration parameters in Blender when you create it. I'm not familiar with Blender, so I'll not talk about it. However, if you do not understand general camera intrinsics, you might want to take a look at this.
@Jing Zhao's answer worked! However, I assume his version of Open3D is different from mine; I had to change two function calls like this (and changed the naming slightly):
exr_img = OpenEXR.InputFile('frame0100.exr')
cs = list(exr_img.header()['channels'].keys()) # channels
FLOAT = Imath.PixelType(Imath.PixelType.FLOAT)
img_data = [np.array(array.array('f', exr_img.channel(c, FLOAT))) for c in cs]
img_data = [d.reshape(720,1280) for d in img_data]
rgb = np.concatenate([img_data[i][:, :, np.newaxis] for i in [3, 2, 1]], axis=-1)
rgb = np.clip(rgb, 0, 1.0) # to better visualize as HDR is not supported?
img = o3d.geometry.Image((rgb * 255).astype(np.uint8))
depth = o3d.geometry.Image((img_data[-1] * 1000).astype(np.uint16))
#####rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(img, depth, convert_rgb_to_intensity=False)
rgbd = o3d.create_rgbd_image_from_color_and_depth(img, depth, convert_rgb_to_intensity=False)
# some guessed intrinsics
intr = o3d.open3d.camera.PinholeCameraIntrinsic(1280, 720, fx=570, fy=570, cx=640, cy=360)
# get point cloud and visualize
#####pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intr)
pcd = o3d.create_point_cloud_from_rgbd_image(rgbd, intr)
o3d.visualization.draw_geometries([pcd])
Otherwise I got the following error:
AttributeError: type object 'open3d.open3d.geometry.RGBDImage' has no attribute 'create_from_color_and_depth'
Hopefully that also helps others with my Python/Open3D version. Not quite sure where exactly the mistake in my code is, but I am satisfied to have usable code. Thanks again to Jing Zhao!
The end goal is to take an image and slice it up into samples that I save. The problem is that my slices are randomly coming back as black/incorrect patches. Below is a small sample program.
import scipy.ndimage as ndimage
import scipy.misc as misc
import numpy as np
image32 = misc.imread("work0.png")
patches = np.zeros((36, 8, 8))
for i in range(4):
for j in range(4):
patches[i*4 + j] = image32[i:i+8,j:j+8]
misc.imsave("{0}{1}.png".format(i,j), patches[i*4 + j])
An example of my image would be:
The 8x8 patch at (0,0) yields:
Two things:
You are initializing your patches matrix with the wrong data type. By default, numpy makes it np.float64, and if you save with that you won't get the results you expect. Specifically, as Mr. F's answer explains, some scaling is performed on floating-point images: the minimum and maximum values of the image get scaled to black and white respectively, so if a patch is completely uniform, its minimum and maximum are the same and it gets visualized as black. As such, the best thing is to respect the original image's data type, namely setting the dtype of your patches matrix to np.uint8.
Judging from your for loop indexing, you want to extract out 8 x 8 patches that are non-overlapping. This means that if you have a 32 x 32 image with 8 x 8 patches, you have 16 patches in total arranged in a 4 x 4 grid.
Therefore, you need to change the patches statement so that it has 16 in the first dimension, not 36. In addition, you'll have to adjust the way you're indexing into your image to extract out the 8 x 8 patches because right now, the patches are overlapping. Specifically, you want to make the image patch indexing go from 8*i to 8*(i+1) for the rows and 8*j to 8*(j+1) for the columns. If you substitute sample values of i and j yourself, you'll see that we get unique 8 x 8 patches for each grid in your image.
With both of the above things I noted, the modified code should be:
import scipy.ndimage as ndimage
import scipy.misc as misc
import numpy as np
image32 = misc.imread('work0.png')
patches = np.zeros((16,8,8), dtype=np.uint8) # Change
for i in range(4):
for j in range(4):
patches[i*4 + j] = image32[8*i:8*(i+1),8*j:8*(j+1)] # Change
misc.imsave("{0}{1}.png".format(i,j), patches[i*4 + j])
When I do this and take a look at the output images, I get what I expect.
To be absolutely sure, let's plot the segments using matplotlib. You've conveniently saved all of the patches in patches so it shouldn't be a problem showing what we need. However, I'll place some code in comments so that you can read in the images that were saved from disk with your above code so you can verify that it still works, regardless of looking at patches or the images on disk:
import matplotlib.pyplot as plt
plt.figure()
for i in range(4):
for j in range(4):
plt.subplot(4, 4, 4*i + j + 1)
img = patches[4*i + j]
# or you can do this:
# img = misc.imread('{0}{1}.png'.format(i,j))
img = np.dstack([img, img, img])
plt.imshow(img)
plt.show()
The weird thing about matplotlib.pyplot.imshow is that if you have a single-channel image (as in your case) with the same intensity everywhere, it gets visualized as black no matter what the colour map is, much like what we experienced with imsave. Therefore, I had to artificially make this an RGB image with all of the channels the same, so it gets visualized as grayscale before we show the image.
We get:
According to this answer the issue is that imsave normalizes the data so that the computed minimum is defined as black (and, if there is a distinct maximum, that is defined as white).
This led me to go digging as to why the suggested use of uint8 did work to create the desired output. As it turns out, in the source there is a function called bytescale that gets called internally.
Actually, imsave itself is a very thin wrapper around toimage followed by save (from the image object). Inside of toimage if mode is None (which it is by default), that's when bytescale gets invoked.
It turns out that bytescale has an if statement that checks for the uint8 data type, and if the data is in that format, it returns the data unaltered. But if not, then the data is scaled according to a max and min transformation (where 0 and 255 are the default low and high pixel values to compare to).
This is the full snippet of code linked above:
if data.dtype == uint8:
return data
if high < low:
raise ValueError("`high` should be larger than `low`.")
if cmin is None:
cmin = data.min()
if cmax is None:
cmax = data.max()
cscale = cmax - cmin
if cscale < 0:
raise ValueError("`cmax` should be larger than `cmin`.")
elif cscale == 0:
cscale = 1
scale = float(high - low) / cscale
bytedata = (data * 1.0 - cmin) * scale + 0.4999
bytedata[bytedata > high] = high
bytedata[bytedata < 0] = 0
return cast[uint8](bytedata) + cast[uint8](low)
For the blocks of your data that are all 255, cscale will be 0, which will be checked for and changed to 1. Then the line
bytedata = (data * 1.0 - cmin) * scale + 0.4999
will result in the whole image block having the float value 0.4999, which is then set to 0 by the next chunk of code (when cast from float to uint8), for example:
In [102]: np.cast[np.uint8](0.4999)
Out[102]: array(0, dtype=uint8)
You can see in the body of bytescale that there are only two possible ways to return: either your data is type uint8 and it's returned as-is, or else it goes through this kind of silly scaling process. So in the end, it is indeed correct, and good practice, to be using uint8 for the pieces of your code that specifically load from or save to an image format via these functions.
So this cascade of stuff is why you were getting all zeros in the outputted image file and why the other suggestion of using dtype=np.uint8 actually helps you. It's not because you need to avoid floating point data for images, just because of this bizarre convention to check and scale data on the part of imsave.
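To make the effect concrete, here is a small sketch that replays the same scaling formula by hand on a uniform all-255 block (pure NumPy, no scipy call):
import numpy as np

data = np.full((8, 8), 255.0)               # a uniform float block, like an all-white patch
low, high = 0, 255
cmin, cmax = data.min(), data.max()         # both are 255 here
cscale = (cmax - cmin) or 1                 # cscale is 0, so it gets bumped to 1
scale = float(high - low) / cscale
bytedata = (data - cmin) * scale + 0.4999   # every element becomes 0.4999
print(bytedata.astype(np.uint8))            # all zeros -> a black patch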
There are many questions over here which checks if two images are "nearly" similar or not.
My task is simple. With OpenCV, I want to find out if two images are 100% identical or not.
They will be of same size but can be saved with different filenames.
You can use a logical operator such as XOR. If you are using Python, you can use the following one-line function:
Python
def is_similar(image1, image2):
return image1.shape == image2.shape and not(np.bitwise_xor(image1,image2).any())
where shape is the property that gives the size of the matrix and bitwise_xor does what its name suggests. The C++ version can be written in a similar way!
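For example, the same check can be used inline like this (the file names are placeholders):
import cv2
import numpy as np

a = cv2.imread("picture1.png")
b = cv2.imread("picture2.png")

# Images are identical only if every pixel in every channel matches.
identical = a.shape == b.shape and not np.bitwise_xor(a, b).any()
print("identical" if identical else "different")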
C++
Please see @berak's code.
Notice: the Python code works for images of any dimensionality (1-D, 2-D, 3-D, ...), but the C++ version works only for 2-D images. It's easy to extend it to other dimensionalities yourself. I hope that gives you the insight! :)
Doc: bitwise_xor
EDIT: the C++ version was removed. Thanks to @Micka and @berak for their comments.
The sum of the differences should be 0 (for all channels):
bool equal(const Mat & a, const Mat & b)
{
if ( (a.rows != b.rows) || (a.cols != b.cols) )
return false;
Scalar s = sum( a - b );
return (s[0]==0) && (s[1]==0) && (s[2]==0);
}
import cv2
import numpy as np
a = cv2.imread("picture1.png")
b = cv2.imread("picture2.png")
difference = cv2.subtract(a, b)
result = not np.any(difference)
if result is True:
print("Pictures are the same")
else:
print("Pictures are different")
If they are the same files just saved under different filenames, you can check whether their checksums are identical.
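For instance, a short sketch using the standard library's hashlib (the file paths are placeholders):
import hashlib

def file_checksum(path, algo="sha256"):
    # Hash the file in chunks so large images don't need to fit in memory.
    h = hashlib.new(algo)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

same = file_checksum("picture1.png") == file_checksum("picture2.png")
print("identical files" if same else "different files")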
Import the packages we'll need: matplotlib for plotting, NumPy for numerical processing, and cv2 for our OpenCV bindings. The Structural Similarity Index (SSIM) method is already implemented for us by scikit-image, so we'll just use their implementation.
# import the necessary packages
from skimage.measure import structural_similarity as ssim
import matplotlib.pyplot as plt
import numpy as np
import cv2
Then define the compare_images function, which we'll use to compare two images using both MSE and SSIM. It takes three arguments: imageA and imageB, the two images we are going to compare, and the title of our figure.
We then compute the MSE and SSIM between the two images.
We also simply display the MSE and SSIM associated with the two images we are comparing.
def mse(imageA, imageB):
# the 'Mean Squared Error' between the two images is the
# sum of the squared difference between the two images;
# NOTE: the two images must have the same dimension
err = np.sum((imageA.astype("float") - imageB.astype("float")) ** 2)
err /= float(imageA.shape[0] * imageA.shape[1])
# return the MSE, the lower the error, the more "similar"
# the two images are
return err
def compare_images(imageA, imageB, title):
# compute the mean squared error and structural similarity
# index for the images
m = mse(imageA, imageB)
s = ssim(imageA, imageB)
# setup the figure
fig = plt.figure(title)
plt.suptitle("MSE: %.2f, SSIM: %.2f" % (m, s))
# show first image
ax = fig.add_subplot(1, 2, 1)
plt.imshow(imageA, cmap = plt.cm.gray)
plt.axis("off")
# show the second image
ax = fig.add_subplot(1, 2, 2)
plt.imshow(imageB, cmap = plt.cm.gray)
plt.axis("off")
# show the images
plt.show()
Load the images off disk using OpenCV. We'll be using the original image, a contrast-adjusted image, and our Photoshopped image.
We then convert our images to grayscale.
# load the images -- the original, the original + contrast,
# and the original + photoshop
original = cv2.imread("images/jp_gates_original.png")
contrast = cv2.imread("images/jp_gates_contrast.png")
shopped = cv2.imread("images/jp_gates_photoshopped.png")
# convert the images to grayscale
original = cv2.cvtColor(original, cv2.COLOR_BGR2GRAY)
contrast = cv2.cvtColor(contrast, cv2.COLOR_BGR2GRAY)
shopped = cv2.cvtColor(shopped, cv2.COLOR_BGR2GRAY)
We will generate a matplotlib figure, loop over our images one-by-one, and add them to our plot. Our plot is then displayed to us.
Finally, we can compare our images together using the compare_images function.
# initialize the figure
fig = plt.figure("Images")
images = ("Original", original), ("Contrast", contrast), ("Photoshopped", shopped)
# loop over the images
for (i, (name, image)) in enumerate(images):
# show the image
ax = fig.add_subplot(1, 3, i + 1)
ax.set_title(name)
plt.imshow(image, cmap = plt.cm.gray)
plt.axis("off")
# show the figure
plt.show()
# compare the images
compare_images(original, original, "Original vs. Original")
compare_images(original, contrast, "Original vs. Contrast")
compare_images(original, shopped, "Original vs. Photoshopped")
Reference- https://www.pyimagesearch.com/2014/09/15/python-compare-two-images/
I have done this task. My approach (a rough sketch of the byte-comparison steps follows the list):
Compare file sizes.
Compare EXIF data.
Compare the first 'n' bytes, where 'n' is 128 to 1024 or so.
Compare the last 'n' bytes.
Compare the middle 'n' bytes.
Compare checksums.
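Here is a hedged standard-library sketch of those steps; the paths and the chunk size n are placeholders, and the EXIF comparison is omitted since it needs an extra library:
import hashlib
import os

def quick_compare(path_a, path_b, n=512):
    # 1. Compare file sizes.
    size_a, size_b = os.path.getsize(path_a), os.path.getsize(path_b)
    if size_a != size_b:
        return False
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        # 2. Compare the first, middle, and last n bytes.
        for offset in (0, max(size_a // 2 - n // 2, 0), max(size_a - n, 0)):
            fa.seek(offset)
            fb.seek(offset)
            if fa.read(n) != fb.read(n):
                return False
        # 3. Fall back to a full checksum comparison.
        fa.seek(0)
        fb.seek(0)
        return hashlib.sha256(fa.read()).digest() == hashlib.sha256(fb.read()).digest()

print(quick_compare("picture1.png", "picture2.png"))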
So I have a large array (2048x2048), and I would like to do some element-wise operations that depend on where each element is. I'm very confused about how to do this (I was told not to use for loops, and when I tried them my IDE froze and everything ran really slowly).
Onto the question:
h = aperatureimage
h[:,:] = 0
indices = np.where(aperatureimage>1)
for True in h:
h[index] = np.exp(1j*k*z)*np.exp(1j*k*(x**2+y**2)/(2*z))/(1j*wave*z)
So I have an index, which is (I'm assuming here) essentially a 'cropped' version of my larger aperatureimage array. *Note: the aperture image is a grayscale image converted to an array; it has a shape or text on it, and I would like to find all the 'white' regions of the aperture and perform my operation on them.
How can I access the individual x/y values of index so that I can perform my exponential operation? When I try index[:,None], the program spits out 'ValueError: broadcast dimensions too large'. I also get 'array is not broadcastable to correct shape'. Any help would be appreciated!
One more clarification: x and y are the only values I would like to change (essentially the points in my array where there is white, z, k, and whatever else are defined previously).
EDIT:
I'm not sure the code I posted above is correct; it returns two empty arrays. When I do this, though:
index = (aperatureimage==1)
print len(index)
Actually, nothing I've done so far works correctly. I have a 2048x2048 image with a 128x128 white square in the middle of it. I would like to convert this image to an array, look through all the values and determine the index values (x,y) where the array is not black (I only have white/black, bilevel image didn't work for me). I would then like to take all the values (x,y) where the array is not 0, and multiply them by the h[index] value listed above.
I can post more information if necessary. If you can't tell, I'm stuck.
EDIT2: Here's some code that might help. I think I have the problem above solved (I can now access members of the array and perform operations on them). But for some reason the Fx values in my for loop never increase; it loops Fy forever...
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor
def createImage(aperature, type):
imsize = aperature*8 #Add 0 padding to make it nice
middle = imsize/2 # The middle (physical 0) of our image will be the imagesize/2
im = Image.new("L", (imsize,imsize)) #Make a grayscale image with imsize*imsize pixels
draw = ImageDraw.Draw(im) #Create a new draw method
box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2)) #Bounding box for aperature
if type == 'Rectangle':
draw.rectangle(box, fill = 'white') #Draw rectangle in the box and color it white
del draw
return im, middle
def Diffraction(aperaturediameter = 1, type = 'Rectangle', z = 2000000, wave = .001):
# Constants
deltaF = 1/8 # Image will be 8mm wide
z = 1/3.
wave = 0.001
k = 2*pi/wave
# Now let's get to work
aperature = aperaturediameter * 128 # Aperaturediameter (in mm) to some pixels
im, middle = createImage(aperature, type) #Create an image depending on type of aperature
aperaturearray = np.array(im) # Turn image into numpy array
# Fourier Transform of Aperature
Ta = np.fft.fftshift(np.fft.fft2(aperaturearray))/(len(aperaturearray))
# Transforming and calculating of Transfer Function Method
H = aperaturearray.copy() # Copy image so H (transfer function) has the same dimensions as aperaturearray
H[:,:] = 0 # Set H to 0
U = aperaturearray.copy()
U[:,:] = 0
index = np.nonzero(aperaturearray) # Find nonzero elements of aperaturearray
H[index[0],index[1]] = np.exp(1j*k*z)*np.exp(-1j*k*wave*z*((index[0]-middle)**2+(index[1]-middle)**2)) # Free space transfer for ap array
Utfm = abs(np.fft.fftshift(np.fft.ifft2(Ta*H))) # Compute intensity at distance z
# Fourier Integral Method
apindex = np.nonzero(aperaturearray)
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Ufim = abs(np.fft.fftshift(np.fft.fft2(U))/len(U))
# Save image
fim = Image.fromarray(np.uint8(Ufim))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(Utfm))
ftfm.save("PATH\FTFM.jpg")
print "that may have worked..."
return
if __name__ == '__main__':
Diffraction()
You'll need numpy, scipy, and PIL to work with this code.
When I run this, it goes through the code, but there is no data in them (everything is black). Now I have a real problem here as I don't entirely understand the math I'm doing (this is for HW), and I don't have a firm grasp on Python.
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Should that line work for performing elementwise calculations on my array?
Could you perhaps post a minimal, yet complete, example? One that we can copy/paste and run ourselves?
In the meantime, in the first two lines of your current example:
h = aperatureimage
h[:,:] = 0
you set both 'aperatureimage' and 'h' to 0. That's probably not what you intended. You might want to consider:
h = aperatureimage.copy()
This generates a copy of aperatureimage while your code simply points h to the same array as aperatureimage. So changing one changes the other.
Be aware, copying very large arrays might cost you more memory than you would prefer.
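A tiny illustration of the view-versus-copy distinction (the array here is just a toy):
import numpy as np

a = np.ones((3, 3))
h_view = a          # h_view is just another name for the same array
h_copy = a.copy()   # h_copy is independent storage

h_view[:, :] = 0
print(a.sum())      # 0.0 -- zeroing the view also zeroed 'a'
print(h_copy.sum()) # 9.0 -- the copy is untouched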
What I think you are trying to do is this:
import numpy as np
N = 2048
M = 64
a = np.zeros((N, N))
a[N/2-M:N/2+M,N/2-M:N/2+M]=1
x,y = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N))
b = a.copy()
indices = np.where(a>0)
b[indices] = np.exp(x[indices]**2+y[indices]**2)
Or something similar. This, in any case, sets some values in 'b' based on the x/y coordinates where 'a' is bigger than 0. Try visualizing it with imshow. Good luck!
Concerning the edit
You should normalize your output so it fits in the 8 bit integer. Currently, one of your arrays has a maximum value much larger than 255 and one has a maximum much smaller. Try this instead:
fim = Image.fromarray(np.uint8(255*Ufim/np.amax(Ufim)))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(255*Utfm/np.amax(Utfm)))
ftfm.save("PATH\FTFM.jpg")
Also consider np.zeros_like() instead of copying and clearing H and U.
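For example, the np.zeros_like() suggestion might look like this; note the complex dtype, which is an assumption needed so the exp(1j*...) values assigned into H and U are not truncated:
import numpy as np

aperaturearray = np.zeros((16, 16), dtype=np.uint8)  # stand-in for the real image array

# Instead of copying the image and then clearing it, allocate zeroed arrays directly.
H = np.zeros_like(aperaturearray, dtype=complex)
U = np.zeros_like(aperaturearray, dtype=complex)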
Finally, I personally very much like working with ipython when developing something like this. If you put the code from your Diffraction function at the top level of your script (in place of the if __name__ == '__main__' block), you can access the variables directly from ipython. A quick command like np.amax(Utfm) would show you that there are indeed values != 0. imshow() is always nice for looking at matrices.