Combine two normal texture maps with Python cv2

I want to combine two normal texture maps. My understanding is that one of the simplest methods is to sum the red/green channels from each normal map and then divide by the vector length.
I'm referencing a concept from here as well and trying to convert it to Python: https://blog.selfshadow.com/publications/blending-in-detail/ (the simpler UDN blending method):
float3 r = normalize(float3(n1.xy + n2.xy, n1.z));
I'm using the concept of dividing the RGB vector by its length as my "normalizing" method:
For any vector V = (x, y, z), |V| = sqrt(x*x + y*y + z*z) gives the
length of the vector. When we normalize a vector, we actually
calculate V/|V| = (x/|V|, y/|V|, z/|V|).
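In NumPy, for example, that normalization is a one-liner:

import numpy as np

v = np.array([3.0, 4.0, 0.0])
v_unit = v / np.linalg.norm(v)   # (0.6, 0.8, 0.0), which has length 1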
img1 = cv2.imread(str(base_image)).astype(np.float32)
img2 = cv2.imread(str(top_image)).astype(np.float32)
# img = img1 + img2  # Could just do this to combine r,g channels?
(b1, g1, r1) = cv2.split(img1)
(b2, g2, r2) = cv2.split(img2)
r = r1 + r2
g = g1 + g2
b = b1
r_norm = []
g_norm = []
for (_r, _g, _b) in zip(r.ravel(), g.ravel(), b.ravel()):
    _l = length(_r, _g, _b)
    r_norm.append((_r / _l) * 255)
    g_norm.append((_g / _l) * 255)
r = np.reshape(r_norm, (-1, 2048))
g = np.reshape(g_norm, (-1, 2048))
img = np.dstack((b, g, r))
cv2.imwrite(str(output_path), img)
where length is defined as:
def length(r, g, b):
    return math.sqrt(r ** 2 + g ** 2 + b ** 2)
But it's not working; I get a very gray image.
As a side note, this process is slow, so if anyone has ideas to speed up the loop (or remove it entirely), that would be awesome :). I've been pulling my hair out on this one...
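For what it's worth, a likely cause of the gray result is that normal maps store each component remapped from [-1, 1] into [0, 255]; summing and normalizing the raw 8-bit values pulls everything toward mid-gray. Below is a minimal vectorized sketch of UDN blending, assuming 8-bit BGR inputs at the base_image/top_image paths above (so x = channel 2, y = channel 1, z = channel 0); it also removes the Python loop entirely:

import cv2
import numpy as np

def udn_blend(base_path, top_path, out_path):
    # unpack from [0, 255] into [-1, 1]
    n1 = cv2.imread(str(base_path)).astype(np.float32) / 255.0 * 2.0 - 1.0
    n2 = cv2.imread(str(top_path)).astype(np.float32) / 255.0 * 2.0 - 1.0
    # UDN: sum x/y, keep the base map's z (OpenCV channel order is B, G, R)
    blended = np.dstack((
        n1[..., 0],                # z = n1.z
        n1[..., 1] + n2[..., 1],   # y = n1.y + n2.y
        n1[..., 2] + n2[..., 2],   # x = n1.x + n2.x
    ))
    # normalize every pixel's vector, then repack into [0, 255]
    blended /= np.linalg.norm(blended, axis=2, keepdims=True)
    cv2.imwrite(str(out_path), ((blended + 1.0) * 0.5 * 255.0).astype(np.uint8))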

Related

What are the inaccuracies of this 'inverse map' function in OpenCV?

I am trying to horizontally stretch an image in a very specific way. Each x-prime coordinate should follow a tangent path with respect to the original x coordinate. I believe there are two ways to do this:
1. Invert the tangent function and map it normally
2. Map the tangent function and then invert the mapping
Using this answer for map inversion, I'm trying to figure out why the two images are not the same. I know that the first method gives me the correct image that I'm looking for, so why doesn't the second method work? Is it because of the "limited precision" that @ChristophRackwitz commented on in the answer?
import cv2
import glob
import numpy as np
import math

A = -1010
B = -3.931
C = 5.258
D = 978.3
M = -193.8
N = 1740

def get_tan_func_value(x):
    return A * math.tan((((x - N) / M) + B) / C) + D

def get_inverse_tan_func_value(x):
    return M * (C * math.atan((x - D) / A) - B) + N

# answer from linked post
def invert_map(F, shape):
    I = np.zeros_like(F)
    I[:, :, 1], I[:, :, 0] = np.indices(shape)
    P = np.copy(I)
    for i in range(10):
        P += I - cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
    return P

# import image
images = glob.glob('*.jpg')
img = cv2.imread(images[0])
h, w = img.shape[:2]
map_x_tan = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)
map_x_inverse_tan = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)
map_y = np.zeros((img.shape[0], img.shape[1]), dtype=np.float32)

# x tan function map
for i in range(map_x_tan.shape[0]):
    map_x_tan[i, :] = [get_tan_func_value(x) for x in range(map_x_tan.shape[1])]

# x inverse tan function map
for i in range(map_x_inverse_tan.shape[0]):
    map_x_inverse_tan[i, :] = [get_inverse_tan_func_value(x) for x in range(map_x_inverse_tan.shape[1])]

# default y map
for j in range(map_y.shape[1]):
    map_y[:, j] = [y for y in range(map_y.shape[0])]

# convert x tan map to 2 channel (x,y) map
(xymap_tan, _) = cv2.convertMaps(map1=map_x_tan, map2=map_y, dstmap1type=cv2.CV_32FC2)

# invert the 2 channel x tan map
xymap_inverted = invert_map(xymap_tan, (h, w))

# remap and write the target image (inverse tan function with normal map)
target = cv2.remap(img, map_x_inverse_tan, map_y, cv2.INTER_LINEAR)
cv2.imwrite("target.jpg", target)

# remap and write the attempted image (normal tan function with inverted map)
attempt = cv2.remap(img, xymap_inverted, None, cv2.INTER_LINEAR)
cv2.imwrite("attempt.jpg", attempt)
Method 1: Target Image
Method 2: Attempt Image
The results show that the attempt (normal tan function with inverted map) has less stretching near the edges of the image than expected. Almost everywhere else the images are identical; only the edges differ. I did not post the original picture to save space.
I've played around with that invert_map procedure. It seems slightly susceptible to oscillation. Use this instead:
def invert_map(F):
    (h, w) = F.shape[:2]  # F has shape (h, w, 2), an "xymap"
    I = np.zeros_like(F)
    I[:, :, 1], I[:, :, 0] = np.indices((h, w))  # identity map
    P = np.copy(I)
    for i in range(10):
        correction = I - cv2.remap(F, P, None, interpolation=cv2.INTER_LINEAR)
        P += correction * 0.5
    return P
I simply damped the correction by 0.5, which makes the fixed-point iteration tamer and converges a lot faster too.
In my experiments with your tan map, I've found that 5-10 iterations are already good enough; further iterations make no further progress.
Entire notebook of my explorations: https://gist.github.com/crackwitz/67f76f8a9eff21476b080c06d20660d0
Feature request: https://github.com/opencv/opencv/issues/22120
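For reference, plugging the damped version into the pipeline above looks like this (note that this invert_map reads the shape from F itself, so the shape argument is dropped):

xymap_inverted = invert_map(xymap_tan)
attempt = cv2.remap(img, xymap_inverted, None, cv2.INTER_LINEAR)
cv2.imwrite("attempt.jpg", attempt)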

Given 2 vector magnitudes and the angle between them, find the endpoint

I have a vector AB and a vector BC in 3D, and I have the magnitudes of these two vectors. I also know the angle between vector AB and vector BC, as well as the coordinates of A and B.
I want to find the coordinates of C.
To be more specific: for a vector AB of fixed length, I want to construct many vectors with different magnitudes and a fixed angle between AB and the vectors to be constructed, using the information provided above.
One way I know of is to find the x, y and z coordinates of the endpoint C, but in 3D two angles are required and I only have one, so I am unable to think of a solution.
Please help me.
Thanks
If you want an algorithm that selects one of the infinitely many such points, maybe the following would work for you:
import numpy as np

def frame_rotation(v):
    # build an orthonormal frame whose first axis is the unit vector v
    i = np.argmin(np.abs(v))   # axis least aligned with v (abs guards against negative components)
    E = np.zeros(3)
    E[i] = 1
    U = np.empty((3, 3), dtype=float)
    U[0, :] = v
    U[2, :] = np.cross(v, E)
    U[2, :] = U[2, :] / np.linalg.norm(U[2, :])
    U[1, :] = np.cross(U[2, :], U[0, :])
    return U.T

def find_a_point_C(A, B, length_BC, angle):
    AB = B - A
    length_AB = np.linalg.norm(B - A)
    AB = AB / length_AB
    angle = angle * np.pi / 180
    # C in the local frame: at the given angle from the x-axis, in the xy-plane
    C = np.array([np.cos(angle), np.sin(angle), 0])
    C = length_BC * C
    C[0] = C[0] + length_AB   # move the tail of BC from A to B along the local x-axis
    U = frame_rotation(AB)    # rotate the local frame so its x-axis lies along AB
    return U.dot(C) + A

def cos_angle(v, w):
    return v.dot(w) / (np.linalg.norm(v) * np.linalg.norm(w))

A = np.array([1, 3, 2])
B = np.array([2, 7, 7])
length_BC = 3
angle = 60   # degrees

C = find_a_point_C(A, B, length_BC, angle)
print(C)

# test for correctness
print(np.linalg.norm(C - B))                                   # should equal length_BC
print(cos_angle(B - A, C - B))                                 # should equal cos(60 degrees)
print(cos_angle(B - A, C - B) - np.cos(angle * np.pi / 180))   # should be ~0

How can I detect a 'dark' image border and crop to it using Python (or Perl)?

I have a number of small images that are screen captures from a video. A couple of example images follow (with various types of edges present):
In short, I'm trying to crop the image to the closest part of the 'main' image, which is inside an (almost) uniformly 'black' border... or sometimes, there's a bit of a 'jittery' edge. You could think of it as going to the centre of the image and then radiating out until you hit a rectangular ('black' or 'nearly black') border.
The biggest issue, as near as I can see, is determining the location and dimensions of the 'cropping rectangle' around the image... but so far, I haven't been able to get anywhere with that.
I've tried using 'cropdetect' filters in ffmpeg; there's nothing really helpful for Perl; ...and as I'm new to Python, I still haven't worked out if there's any simple module that can do what I need. I have looked at 'scikit-image'... but was totally bamboozled by it, as I don't have a good enough knowledge of Python, let alone a sufficiently 'technical' knowledge of image formats, colour depth, manipulation techniques, etc. that would let me use 'scikit-image'.
I'd appreciate any suggestions on how to tackle this problem, even better if there's a simple way to do it. From my little bit of understanding of 'scikit-image', it seems like the 'Canny edge detect' or the 'prewitt_v'/'prewitt_h' filters might be relevant...?
I'm using Python 3.7.0 (and Active State Perl v5.20.2, if there's any way to use that), both running under Windows 8.1.
Thanks a lot for any forthcoming suggestions.
I've made an attempt at solving this... but it's not very 'robust'. It uses the luminance values of the pixels after the image has been greyscaled:-
# ----
# this will find the 'black'-ish portions and crop the image to these points
# ozboomer, 25-Apr-2020 3-May-2020
#
# ---------

import pprint
import colorsys
from PIL import Image, ImageFilter

# ---- Define local functions (I *still* don't understand why I have to put these here)

def calculate_luminances(r, g, b):
    """Return luminance values of supplied RGB and greyscale of RGB"""
    lum = (0.2126 * r) + (0.7152 * g) + (0.0722 * b)    # luminance
    H, S, V = colorsys.rgb_to_hsv(r, g, b)              # HSV for the pixel RGB
    R, G, B = colorsys.hsv_to_rgb(H, 0, V)              # ...and greyscale RGB
    glum = (0.2126 * R) + (0.7152 * G) + (0.0722 * B)   # greyscale luminance
    return (lum, glum)
# end calculate_luminances

def radial_edge(radial_vector, ok_range):
    """Return the point in the radial where the luminance marks an 'edge' """
    print("radial_edge: test range=", ok_range)
    edge_idx = -1
    i = 0
    for glum_value in radial_vector:
        print("  radial_vector: i=", i, "glum_value=", "%.2f" % round(glum_value, 2))
        if int(glum_value) in ok_range:
            print("  IN RANGE! Return i=", i)
            edge_idx = i
            break
        i = i + 1
    # endfor
    return (edge_idx)

# ---- End local function definitions

# ---- Define some constants, variables, etc

#image_file = "cap.bmp"
#image_file = "cap2.png"
#image_file = "cap3.png"
image_file = "Sample.jpg"
#image_file = "cap4.jpg"

output_file = "Cropped.png"

edge_threshold = range(0, 70)   # luminance in this range = 'an edge'

#
# The image layout:-
#
# [0,0]----------+----------[W,0]
#   |            ^            |
#   |            |            |
#   |            R3           |
#   |            |            |
#   +<--- R1 ---[C]--- R2 --->+
#   |            |            |
#   |            R4           |
#   |            |            |
#   |            v            |
# [0,H]----------+----------[W,H]
#

# -------------------------------------
# Main Routine
#

# ---- Get the image file ready for processing

try:
    im = Image.open(image_file)   # RGB.. mode
except:
    print("Unable to load image,", image_file)
    exit(1)
# Dammit, Perl, etc code is SO much less verbose:-
#   open($fh, "<", $filename) || die("\nERROR: Can't open file, '$filename'\n$!\n");

print("Image - Format, size, mode: ", im.format, im.size, im.mode)

W, H = im.size        # The (width x height) of the image
XC = int(W / 2.0)     # Approx. centre of image
YC = int(H / 2.0)
print("Image Centre: (XC,YC)=", XC, ",", YC)

# --- Define the ordinate ranges for each radial

R1_range = range(XC, -1, -1)   # Actual range: XC->0 by -1 ... along YC ordinate
R2_range = range(XC, W, 1)     #             : XC->W by +1 ... along YC ordinate
R3_range = range(YC, -1, -1)   #             : YC->0 by -1 ... along XC ordinate
R4_range = range(YC, H, 1)     #             : YC->H by +1 ... along XC ordinate

# ---- Check each radial for its 'edge' point

radial_luminance = []
for radial_num in range(1, 5):   # We'll do the 4 midlines
    radial_luminance.clear()
    if radial_num == 1:
        print("Radial: R1")
        for x in R1_range:
            R, G, B = im.getpixel((x, YC))
            [lum, glum] = calculate_luminances(R, G, B)
            print("  CoOrd=(", x, ",", YC, ") RGB=",
                  (R, G, B), "lum=", "%.2f" % round(lum, 2),
                  "glum=", "%.2f" % round(glum, 2))
            radial_luminance.append(glum)
        # end: get another radial pixel
        left_margin = XC - radial_edge(radial_luminance, edge_threshold)
    elif radial_num == 2:
        print("Radial: R2")
        for x in R2_range:
            R, G, B = im.getpixel((x, YC))
            [lum, glum] = calculate_luminances(R, G, B)
            print("  CoOrd=(", x, ",", YC, ") RGB=",
                  (R, G, B), "lum=", "%.2f" % round(lum, 2),
                  "glum=", "%.2f" % round(glum, 2))
            radial_luminance.append(glum)
        # end: get another radial pixel
        right_margin = XC + radial_edge(radial_luminance, edge_threshold)
    elif radial_num == 3:
        print("Radial: R3")
        for y in R3_range:
            R, G, B = im.getpixel((XC, y))
            [lum, glum] = calculate_luminances(R, G, B)
            print("  CoOrd=(", XC, ",", y, ") RGB=",
                  (R, G, B), "lum=", "%.2f" % round(lum, 2),
                  "glum=", "%.2f" % round(glum, 2))
            radial_luminance.append(glum)
        # end: get another radial pixel
        top_margin = YC - radial_edge(radial_luminance, edge_threshold)
    elif radial_num == 4:
        print("Radial: R4")
        for y in R4_range:
            R, G, B = im.getpixel((XC, y))
            [lum, glum] = calculate_luminances(R, G, B)
            print("  CoOrd=(", XC, ",", y, ") RGB=",
                  (R, G, B), "lum=", "%.2f" % round(lum, 2),
                  "glum=", "%.2f" % round(glum, 2))
            radial_luminance.append(glum)
        # end: get another radial pixel
        bottom_margin = YC + radial_edge(radial_luminance, edge_threshold)
# end: which radial we're processing

im.close()

crop_items = (left_margin, top_margin, right_margin, bottom_margin)
print("crop_items:", crop_items)

# ---- Crop the original image and save it

im = Image.open(image_file)
im2 = im.crop(crop_items)
im2.save(output_file, 'png')

exit(0)
# [eof]
I would expect that the radial_edge() function needs to be modified to check the surrounding pixels, to determine whether we have a real edge... because the current ok_range probably needs to be determined for each image, so there's no point in trying to automate the cropping using a script such as this.
Still looking for a robust and reliable way to attack this problem...
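One possibly more robust angle (a sketch, not a tested drop-in): instead of walking four radials out from the centre, threshold the whole greyscaled image and take the bounding box of every 'bright enough' pixel. A single dark row or column inside the picture then can't fool the edge detection. This assumes the same luminance cutoff of 70 and the Sample.jpg/Cropped.png names used above:

import numpy as np
from PIL import Image

def crop_dark_border(in_path, out_path, threshold=70):
    im = Image.open(in_path)
    grey = np.asarray(im.convert("L"))          # 8-bit luminance
    mask = grey > threshold                     # True where 'not border'
    rows = np.any(mask, axis=1)                 # rows containing any content
    cols = np.any(mask, axis=0)                 # columns containing any content
    if not rows.any():                          # whole image below threshold
        return im
    top, bottom = np.argmax(rows), len(rows) - np.argmax(rows[::-1])
    left, right = np.argmax(cols), len(cols) - np.argmax(cols[::-1])
    cropped = im.crop((left, top, right, bottom))   # box is (L, U, R, D), exclusive R/D
    cropped.save(out_path, "png")
    return cropped

crop_dark_border("Sample.jpg", "Cropped.png")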

Implementing a bilateral filter

I am trying to implement a bilateral filter from the paper Fast Bilateral Filtering for the Display of High-Dynamic-Range Images. The equation (from the paper) that implements the bilateral filter is given as:
J(s) = (1/k(s)) * sum over p of f(p - s) * g(I_p - I_s) * I_p,  where  k(s) = sum over p of f(p - s) * g(I_p - I_s)
According to what I understood,
f is a Gaussian filter
g is a Gaussian filter
p is a pixel in a given image window
s is the current pixel
Ip is the intensity at the current pixel
With this, I wrote the code to implement these equations:
import cv2
import numpy as np

img = cv2.imread("fish.png")
# image of width 239 and height 200
bl_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
i = cv2.magnitude(
    cv2.Sobel(bl_img, cv2.CV_64F, 1, 0, ksize=3),
    cv2.Sobel(bl_img, cv2.CV_64F, 0, 1, ksize=3)
)
f = cv2.getGaussianKernel(5, 0.1, cv2.CV_64F)
g = cv2.getGaussianKernel(5, 0.1, cv2.CV_64F)

rows, cols, _ = img.shape
filtered = np.zeros(img.shape, dtype=img.dtype)
for r in range(rows):
    for c in range(cols):
        ks = []
        for index in [-2, -1, 1, 2]:
            if index + c > 0 and index + c < cols - 1:
                p = img[r][index + c]
                s = img[r][c]
                i_p = i[index + c]
                i_s = i[c]
                ks.append(
                    (f * (p - s)) * (g * (i_p * i_s))  # EQUATION 7
                )
        ks = np.sum(np.array(ks))
        js = []
        for index in [-2, -1, 1, 2]:
            if index + c > 0 and index + c < cols - 1:
                p = img[r][index + c]
                s = img[r][c]
                i_p = i[index + c]
                i_s = i[c]
                js.append((f * (p - s)) * (g * (i_p * i_s)) * i_p)  # EQUATION 6
        js = np.sum(np.asarray(js))
        js = js / ks
        filtered[r][c] = js

cv2.imwrite("f.png", filtered)
But when I run this code I get an error saying:
Traceback (most recent call last):
File "bft.py", line 33, in <module>
(f * (p-s)) * (g * (i_p * i_s))
ValueError: operands could not be broadcast together with shapes (5,3) (5,239)
Did I incorrectly implement the equations? What am I missing?
There are various issues with your code. Foremost, the equation is interpreted in the wrong way: f(p-s) means evaluating the function f at p-s, and f is the Gaussian. Likewise with g. That section of the code would look like this:
weight = gaussian(p - s, sigma_f) * gaussian(i_p - i_s, sigma_g)
ks.append(weight)
js.append(weight * i_p)
Note that the two loops can be merged; this way you avoid some duplicated computation. gaussian(x, sigma) would be a function that computes the Gaussian weight at x. You need to define two sigmas, sigma_f and sigma_g: the spatial and the tonal sigma respectively.
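For concreteness, a minimal gaussian(x, sigma) along those lines (the 1/(sigma*sqrt(2*pi)) normalization constant cancels between numerator and denominator, so it can be omitted):

import math

def gaussian(x, sigma):
    return math.exp(-(x * x) / (2 * sigma * sigma))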
The second issue is in the definition of p and s. These are the coordinates of the pixel, not the value of the image at the pixel. i_p and i_s are the value of the image at those locations. p-s is basically the spatial distance between the pixel at (r,c) and the given neighbor.
The third issue is the loop over the neighborhood. The neighborhood is all pixels where gaussian(p - s, sigma_f) is not negligible. So how large the neighborhood is depends on the chosen sigma_f. You should take it at least to be ceil(2*sigma_f). Say sigma_f is 2, then you want the neighborhood to go from -4 to 4 (9 pixels). But this neighborhood is two dimensional, not one-dimensional as in your code. So you need two loops:
for ii in range(-ceil(2*sigma_f), ceil(2*sigma_f) + 1):
    if ii + c > 0 and ii + c < cols - 1:
        for jj in range(-ceil(2*sigma_f), ceil(2*sigma_f) + 1):
            if jj + r > 0 and jj + r < rows - 1:
                # compute weight here
Note that now, p-s is computed with math.sqrt(ii**2 + jj**2). But also note that the Gaussian uses x**2, so you could skip the computation of the square root by passing x**2 into your gaussian function.
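Putting those pieces together, here is a minimal sketch of the corrected loops on the greyscale image. The sigma_f and sigma_g values are placeholders to tune, and this squared-distance variant of gaussian skips the square root as noted above; it is the classic bilateral filter on intensities, not the paper's full HDR pipeline:

import math
import cv2
import numpy as np

def gaussian_sq(x2, sigma):
    # weight for a *squared* offset x2; normalization cancels in js/ks
    return math.exp(-x2 / (2 * sigma * sigma))

def bilateral(src, sigma_f=2.0, sigma_g=20.0):
    rows, cols = src.shape
    src = src.astype(np.float64)
    out = np.zeros_like(src)
    radius = math.ceil(2 * sigma_f)   # neighborhood half-width
    for r in range(rows):
        for c in range(cols):
            ks, js = 0.0, 0.0
            i_s = src[r, c]
            for jj in range(-radius, radius + 1):
                if not 0 <= r + jj < rows:
                    continue
                for ii in range(-radius, radius + 1):
                    if not 0 <= c + ii < cols:
                        continue
                    i_p = src[r + jj, c + ii]
                    # f(p - s) is the spatial weight, g(I_p - I_s) the tonal weight
                    w = gaussian_sq(ii * ii + jj * jj, sigma_f) * \
                        gaussian_sq((i_p - i_s) ** 2, sigma_g)
                    ks += w
                    js += w * i_p
            out[r, c] = js / ks   # ks > 0 always: the centre pixel has weight 1
    return out

img = cv2.imread("fish.png", cv2.IMREAD_GRAYSCALE)
cv2.imwrite("f.png", bilateral(img))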

Encrypt then decrypt pixels bits of image

I want to encrypt an image via its bits (010101...) in blocks of 128 bits (I have to run an algorithm on it), then save the new image. Then I want to read the encrypted image back, reverse the bit operation, and save it again.
I have tried several methods:
base64 encode/decode: it works, but the encrypted image is not viewable (corrupted header); when I decrypt it, it comes back to normal.
Using the PIL library: reading the image data, changing the pixel values, then creating a new image (the image is viewable, so one requirement is met), but when I decrypt it, it does not come back to its previous state.
example with PIL putpixel:
from PIL import Image

def enc(original_name, encrypted_name):
    input_image = Image.open(original_name)
    data = input_image.load()
    output_image = Image.new(input_image.mode, input_image.size)
    for x in range(input_image.size[0]):
        for y in range(input_image.size[1]):
            r, g, b = data[x, y]
            # reverse the bit order of each 8-bit channel
            r1 = int('{:08b}'.format(r)[::-1], 2)
            g1 = int('{:08b}'.format(g)[::-1], 2)
            b1 = int('{:08b}'.format(b)[::-1], 2)
            output_image.putpixel((x, y), (r1, g1, b1))
    output_image.save(encrypted_name)
    input_image.close()
    output_image.close()

def dec(encrypted_name, original_name):
    encrypted_image = Image.open(encrypted_name)
    data = encrypted_image.load()
    normal_image = Image.new(encrypted_image.mode, encrypted_image.size)
    for x in range(encrypted_image.size[0]):
        for y in range(encrypted_image.size[1]):
            r, g, b = data[x, y]
            r1 = int('{:08b}'.format(r)[::-1], 2)
            g1 = int('{:08b}'.format(g)[::-1], 2)
            b1 = int('{:08b}'.format(b)[::-1], 2)
            normal_image.putpixel((x, y), (r1, g1, b1))
    normal_image.save(original_name)
    encrypted_image.close()
    normal_image.close()

enc('images/lena.jpg', 'images/lena-enc.jpg')
dec('images/lena-enc.jpg', 'images/lena-dec.jpg')
example with putdata:
def enc(original_name, encrypted_name):
    original_image = Image.open(original_name)
    encrypted_image = Image.new(original_image.mode, original_image.size)
    raw = list(original_image.getdata())
    eraw = []
    for px in raw:
        r, g, b = px
        r1 = int('{:08b}'.format(r)[::-1], 2)
        g1 = int('{:08b}'.format(g)[::-1], 2)
        b1 = int('{:08b}'.format(b)[::-1], 2)
        eraw.append((r1, g1, b1))
    encrypted_image.putdata(eraw)
    encrypted_image.save(encrypted_name)
    original_image.close()
    encrypted_image.close()

def dec(encrypted_name, original_name):
    encrypted_image = Image.open(encrypted_name)
    raw = list(encrypted_image.getdata())
    normal_image = Image.new(encrypted_image.mode, encrypted_image.size)
    eraw = []
    for px in raw:
        r, g, b = px
        r1 = int('{:08b}'.format(r)[::-1], 2)
        g1 = int('{:08b}'.format(g)[::-1], 2)
        b1 = int('{:08b}'.format(b)[::-1], 2)
        eraw.append((r1, g1, b1))
    normal_image.putdata(eraw)
    normal_image.save(original_name)
    encrypted_image.close()
    normal_image.close()

enc('images/lena.jpg', 'images/lena1.jpg')
dec('images/lena1.jpg', 'images/lena2.jpg')
Is there any way to get a viewable encrypted image that can be decrypted back correctly?
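One detail worth checking (an observation, not a full answer): the files are saved as .jpg, and JPEG compression is lossy, so the exact pixel values enc() writes are not the values dec() reads back, even though the bit reversal itself is self-inverse. Saving the intermediate image in a lossless format such as PNG should make the round trip exact. A minimal sketch:

from PIL import Image

def reverse_bits_image(in_name, out_name):
    # bit reversal is its own inverse: the same function encrypts and decrypts
    src = Image.open(in_name).convert("RGB")
    out = Image.new("RGB", src.size)
    out.putdata([
        tuple(int('{:08b}'.format(ch)[::-1], 2) for ch in px)
        for px in src.getdata()
    ])
    out.save(out_name, "PNG")   # lossless, so pixel values survive

reverse_bits_image('images/lena.jpg', 'images/lena-enc.png')      # viewable, scrambled
reverse_bits_image('images/lena-enc.png', 'images/lena-dec.png')  # back to the original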
