I have a .txt file containing the frame-to-frame differences of a video.
The project is to remove noise and stabilize the video using these frame differences and a low-pass filter.
The Vibrated2.txt file is:
0.341486, -0.258215
0.121945, 1.27605
-0.0811261, 0.78985
-0.0269414, 1.59913
-0.103227, 0.518159
0.274445, 1.69945
, ...
How can I apply a low-pass filter to this data?
I tried this, but it didn't work:
import cv2
import numpy as np
from scipy.signal import butter, lfilter
video= cv2.VideoCapture('Vibrated2.avi')
freq = (video.get(cv2.CAP_PROP_FPS))
cutoff = 5
data = np.loadtxt('Vibrated2.txt', delimiter=',')
b, a = butter(5, (cutoff/freq), btype='low', analog=False)
data = lfilter(b, a, data)
Any help? Any idea?
I am not exactly sure how your txt file is structured, but if you want to apply a low-pass filter to your frame differencing output, I guess you want to make it binary?
def icv_check_threshold(pixel_value, desired_minimum_value):
    if pixel_value < desired_minimum_value:
        return False
    else:
        return True
And for the frame differencing:
def icv_pixel_frame_differencing(frame_1, frame_2):
    # first convert frames to numpy arrays to make it easier to work with
    first_frame = np.asarray(frame_1, dtype=np.float32)
    second_frame = np.asarray(frame_2, dtype=np.float32)
    # then compute frame dimensions
    frame_width = int(first_frame[0].size)
    frame_height = int(first_frame.size/frame_width)
    # we then create a stock image for differencing output
    frame_difference = np.zeros((frame_height, frame_width), np.uint8)
    for i in range(frame_width):
        for j in range(frame_height):
            # compute the absolute difference between the two frames
            frame_difference[j, i] = abs(first_frame[j, i] - second_frame[j, i])
            # check if the threshold = 25 is satisfied, if not set pixel value to 0, else to 255
            # comment out code below to obtain result without threshold / non-binary
            if icv_check_threshold(frame_difference[j, i], 25):
                frame_difference[j, i] = 255
            else:
                frame_difference[j, i] = 0
    cv2.imwrite("differenceC.jpg", frame_difference)
    cv2.imwrite("frame50.jpg", first_frame)
    cv2.imwrite("frame51.jpg", second_frame)
    return frame_difference
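The per-pixel loop above can probably be replaced by whole-array operations. Here is a minimal sketch, assuming both frames are single-channel arrays of the same shape; the name frame_difference_mask and the default threshold of 25 are only illustrative choices, not part of the code above.

```python
import numpy as np

def frame_difference_mask(frame_1, frame_2, threshold=25):
    """Return a uint8 image: 255 where |frame_1 - frame_2| >= threshold, else 0."""
    # Work in a signed type so the subtraction cannot wrap around for uint8 input.
    first = np.asarray(frame_1, dtype=np.int16)
    second = np.asarray(frame_2, dtype=np.int16)
    difference = np.abs(first - second)
    # Binarize the whole difference image in one step.
    return np.where(difference >= threshold, 255, 0).astype(np.uint8)
```

This avoids the Python-level double loop, which tends to be the slow part for full-size frames.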
I hope that this helps. Also here is a link to a project with frame differencing I was working on.
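If the goal is really to low-pass filter the two columns of numbers from the text file, two things in the question's code look suspicious: butter() expects the cutoff relative to the Nyquist frequency (half the frame rate), and lfilter() works along the last axis by default, which for an (n_frames, 2) array means across the two columns rather than along time. A minimal sketch under those assumptions follows; the function name lowpass_motion is hypothetical, and filtfilt is swapped in for lfilter to avoid a phase delay, which may or may not be what you want.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def lowpass_motion(data, fps, cutoff_hz=5.0, order=5):
    """Low-pass filter each column of an (n_frames, 2) array along the time axis."""
    data = np.asarray(data, dtype=float)
    # butter() expects the cutoff as a fraction of the Nyquist frequency (fps / 2).
    normalized_cutoff = cutoff_hz / (0.5 * fps)
    b, a = butter(order, normalized_cutoff, btype='low', analog=False)
    # axis=0 filters down the rows (time); the default axis=-1 would instead
    # filter across the two columns of each row.
    return filtfilt(b, a, data, axis=0)
```

With the question's variables this would be called as lowpass_motion(data, fps=freq, cutoff_hz=cutoff). The cutoff has to stay below half the frame rate, and filtfilt needs more than a handful of frames to work with.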
I would like to know how to calculate the percentage of a color in an image. The image below represents 100%:
and this is what it looks like when the level decreases:
I want to learn how to get the percentage the bar currently shows. I tried to use the Matplotlib library, but I could not get the expected result. Could anyone help me, please? I do not need a ready-made solution, just someone to teach me ...
I think you want to calculate the progress by looking at the image.
I'm not sure if there's a library for this specific thing, but here's my simple approach:
compare the two images to find up to which column they are identical, then calculate the percentage done from that. Let me demonstrate.
!wget https://i.stack.imgur.com/jnxX3.png
a = plt.imread( './jnxX3.png')
plt.imshow( a )
This loads the image with 100% completion into the variable a.
c = a
c = c[:, 0:c.shape[1] - 50]
aa = np.zeros(dtype=float, shape=(11, 50, 3))
c = np.append(c, aa, axis=1)
plt.imshow(c)
This makes a sample incomplete image, which you should have provided.
def status(complete_img, part_image):
    """inputs must be numpy arrays"""
    complete_img = complete_img[:, 1:]  # as the first pixel column doesn't belong to % completion
    part_image = part_image[:, 1:]
    counter = 0
    while counter < part_image.shape[1] and counter < complete_img.shape[1]:
        if (complete_img[:, counter] == part_image[:, counter]).all():
            counter += 1
        else:
            break
    perc = 100 * (float(counter) / complete_img.shape[1])
    return perc
status(a, c)  # this will return % columns similar in the two images
A proposition:
import numpy as np
from PIL import Image
from urllib.request import urlopen
full = np.asarray(Image.open(urlopen("https://i.stack.imgur.com/jnxX3.png")))
probe = np.asarray(Image.open(urlopen("https://i.stack.imgur.com/vx5zt.png")))
# crop the images to the same shape
# (this step should be avoided, best compare equal shaped arrays)
full = full[:,1:probe.shape[1]+1,:]
def get_percentage(full, probe, threshold):
    def profile_red(im):
        pr = im[:,:,0] - im[:,:,1]
        return pr[pr.shape[0]//2]
    def zero(arr):
        z = np.argwhere(np.abs(np.diff(np.sign(arr))).astype(bool))
        if len(z):
            return z[0,0]
        else:
            return len(arr)
    full_red = profile_red(full)
    probe_red = profile_red(probe)
    mask = full_red > threshold
    diff = full_red[mask] - probe_red[mask]
    x0 = zero(diff - threshold)
    percentage = x0 / diff.size * 100
    err = 2./diff.size * 100
    return percentage, err
print("{:.1f} p\m {:.1f} %".format(*get_percentage(full, probe, 75.0)))
Result:
94.6 p\m 2.2 %
You're looking for the Pillow library. There are two ways to measure color: Hue, Saturation, Luminance (HSL) and Red, Green, Blue (RGB). There are functions for both in the library.
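As a rough illustration of the RGB route, here is a minimal sketch that counts the share of pixels falling inside an RGB range. It uses NumPy on an array you could obtain from Pillow with np.asarray(Image.open(path).convert("RGB")); the function name color_percentage and the example bounds are assumptions for illustration, not something Pillow provides.

```python
import numpy as np

def color_percentage(rgb, lower, upper):
    """Percentage of pixels whose R, G and B all lie within [lower, upper] (inclusive)."""
    rgb = np.asarray(rgb)[..., :3]  # ignore an alpha channel if there is one
    lower = np.asarray(lower)
    upper = np.asarray(upper)
    # A pixel counts only if all three of its channels are inside their bounds.
    in_range = np.all((rgb >= lower) & (rgb <= upper), axis=-1)
    return 100.0 * in_range.mean()
```

For the bar in the question you would divide the result for the current image by the result for the 100% image, since the bar does not cover the whole picture.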
I've been working on an algorithm to convert RGB to HSI and vice versa in Python 3, which displays the resulting images and each channel using matplotlib.
The trouble is displaying the image resulting from the HSI to RGB conversion: each channel alone is displayed correctly, but when the three channels are shown together I get a weird image.
By the way, when I save the resulting image with OpenCV, it shows the image correctly.
Resulted display
What I tried, but nothing changed:
Rounding the values and setting any pixel that exceeds 1 to 1
In the HSI to RGB conversion, initializing the R, G and B arrays with ones instead of zeros
In the RGB to HSI conversion, changing the output ranges from [0,360], [0,1], [0,1] to [0,360], [0,255], [0,255], rounded or not
Using Google Colab or Spyder instead of Jupyter Notebook
Running the code in a terminal, but that only gives me blank windows
Function to display images:
def show_images(T, cols=1):
    N = len(T)
    fig = plt.figure()
    for i in range(N):
        # add_subplot needs integer grid sizes, np.ceil returns a float
        a = fig.add_subplot(int(np.ceil(N/float(cols))), cols, i+1)
        try:
            img,title = T[i]
        except ValueError:
            img,title = T[i], "Image %d" % (i+1)
        if(img.ndim == 2):
            plt.gray()
        plt.imshow(img)
        a.set_title(title)
        plt.xticks([0,img.shape[1]]), plt.yticks([0,img.shape[0]])
    fig.set_size_inches(np.array(fig.get_size_inches()) * N)
    plt.show()
Then the main function does this:
image = bgr_to_rgb(cv2.imread("rgb.png"))
img1 = rgb_to_hsi(image)
img2 = hsi_to_rgb(img1)
show_images([(image,"RGB"),
(image[:,:,0],"Red"),
(image[:,:,1],"Green"),
(image[:,:,2],"Blue")], 4)
show_images([(img1,"RGB->HSI"),
(img1[:,:,0],"Hue"),
(img1[:,:,1],"Saturation"),
(img1[:,:,2],"Intensity")], 4)
show_images([(img2,"HSI->RGB"),
(img2[:,:,0],"Red"),
(img2[:,:,1],"Green"),
(img2[:,:,2],"Blue")], 4)
Conversion RGB to HSI:
def rgb_to_hsi(img):
    zmax = 255 # max value
    # values in [0,1] (np.float64 replaces the removed np.float alias)
    R = np.divide(img[:,:,0],zmax,dtype=np.float64)
    G = np.divide(img[:,:,1],zmax,dtype=np.float64)
    B = np.divide(img[:,:,2],zmax,dtype=np.float64)
    # Hue, when R=G=B -> H=90
    a = (0.5)*np.add(np.subtract(R,G), np.subtract(R,B)) # (1/2)*[(R-G)+(R-B)]
    b = np.sqrt(np.add(np.power(np.subtract(R,G), 2) , np.multiply(np.subtract(R,B),np.subtract(G,B))))
    tetha = np.arccos( np.divide(a, b, out=np.zeros_like(a), where=b!=0) ) # when b = 0, division returns 0, so then tetha = 90
    H = (180/math.pi)*tetha # convert rad to degree
    H[B>G]=360-H[B>G]
    # saturation = 1 - 3*[min(R,G,B)]/(R+G+B), when R=G=B -> S=0
    a = 3*np.minimum(np.minimum(R,G),B) # 3*min(R,G,B)
    b = np.add(np.add(R,G),B) # (R+G+B)
    S = np.subtract(1, np.divide(a,b,out=np.ones_like(a),where=b!=0))
    # intensity = (1/3)*[R+G+B]
    I = (1/3)*np.add(np.add(R,G),B)
    return np.dstack((H, zmax*S, np.round(zmax*I))) # values between [0,360], [0,255] and [0,255]
Conversion HSI to RGB:
def f1(I,S): # I(1-S)
    return np.multiply(I, np.subtract(1,S))
def f2(I,S,H): # I[1+(ScosH/cos(60-H))]
    r = math.pi/180
    a = np.multiply(S, np.cos(r*H)) # ScosH
    b = np.cos(r*np.subtract(60,H)) # cos(60-H)
    return np.multiply(I, np.add(1, np.divide(a,b)) )
def f3(I,C1,C2): # 3I-(C1+C2)
    return np.subtract(3*I, np.add(C1,C2))
def hsi_to_rgb(img):
    zmax = 255 # max value
    # values between [0,360], [0,1] and [0,1] (np.float64 replaces the removed np.float alias)
    H = img[:,:,0]
    S = np.divide(img[:,:,1],zmax,dtype=np.float64)
    I = np.divide(img[:,:,2],zmax,dtype=np.float64)
    R,G,B = np.ones(H.shape),np.ones(H.shape),np.ones(H.shape) # values will be between [0,1]
    # for 0 <= H < 120
    B[(0<=H)&(H<120)] = f1(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)])
    R[(0<=H)&(H<120)] = f2(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)], H[(0<=H)&(H<120)])
    G[(0<=H)&(H<120)] = f3(I[(0<=H)&(H<120)], R[(0<=H)&(H<120)], B[(0<=H)&(H<120)])
    # for 120 <= H < 240
    H = np.subtract(H,120)
    R[(0<=H)&(H<120)] = f1(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)])
    G[(0<=H)&(H<120)] = f2(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)], H[(0<=H)&(H<120)])
    B[(0<=H)&(H<120)] = f3(I[(0<=H)&(H<120)], R[(0<=H)&(H<120)], G[(0<=H)&(H<120)])
    # for 240 <= H < 360
    H = np.subtract(H,120)
    G[(0<=H)&(H<120)] = f1(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)])
    B[(0<=H)&(H<120)] = f2(I[(0<=H)&(H<120)], S[(0<=H)&(H<120)], H[(0<=H)&(H<120)])
    R[(0<=H)&(H<120)] = f3(I[(0<=H)&(H<120)], G[(0<=H)&(H<120)], B[(0<=H)&(H<120)])
    return np.dstack( ((zmax*R) , (zmax*G) , (zmax*B)) ) # values between [0,255]
If you take a look at the imshow documentation of matplotlib, you will see the following lines:
X : array-like or PIL image
The image data. Supported array shapes are:
(M, N): an image with scalar data. The data is visualized using a colormap.
(M, N, 3): an image with RGB values (float or uint8).
(M, N, 4): an image with RGBA values (float or uint8), i.e. including transparency.
The first two dimensions (M, N) define the rows and columns of the image.
The RGB(A) values should be in the range [0 .. 1] for floats or [0 .. 255] for integers. Out-of-range values will be clipped to these bounds.
This tells you the ranges the data should be in. In your case, the Hue channel of the HSI image goes from 0 to 360, so every value above the allowed maximum is clipped. That is one of the reasons why OpenCV uses a Hue range of 0-180: it fits inside 8 bits.
The HSI->RGB conversion returns the image as float, so it is clipped at 1.0.
This happens only for the display, but the image will most probably also be clipped if you save it, or perhaps it gets saved as a 16-bit image.
Possible solutions:
Normalize the values to 0-1 or to 0-255 (this may change the min and max values) and then display the result (don't forget to cast it to np.uint8 in the 0-255 case).
Create a range that is always inside the possible values.
This is for display or saving purposes. If you use 0-360, save it in at least 16 bits.
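The first option could look roughly like this. It is only a sketch, assuming a float image whose channels all share one nominal maximum (255 for the HSI->RGB output in the question); the helper name to_uint8_rgb is made up for illustration.

```python
import numpy as np

def to_uint8_rgb(img_float, max_value=255.0):
    """Clip a float image expected in [0, max_value] and convert it to uint8 for display."""
    # Rescale so that max_value maps to 255, then round to the nearest integer.
    scaled = np.asarray(img_float, dtype=float) * (255.0 / max_value)
    # Clip before casting so out-of-range values cannot wrap around in uint8.
    return np.clip(np.round(scaled), 0, 255).astype(np.uint8)
```

plt.imshow(to_uint8_rgb(img2)) should then show the colors as expected. The HSI image is a different matter, because its Hue channel has a different range (0-360) than the other two and would need its own scaling.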
I want to plot some images using OpenCV, and for this I would like to glue the images together.
Imagine I have 4 pictures. The best way would be to glue them in a 2x2 image matrix.
a = img; a.shape == (48, 48)
b = img; b.shape == (48, 48)
c = img; c.shape == (48, 48)
d = img; d.shape == (48, 48)
I now use the np.reshape which takes a list such as [a,b,c,d], and then I manually put the dimensions to get the following:
np.reshape([a,b,c,d], (a.shape*2, a.shape*2)).shape == (96, 96)
The issue starts when I have 3 pictures. I kind of figured that I can take the square root of the length of the list and then the ceiling value which will yield the square matrix dimension of 2 (np.ceil(sqrt(len([a,b,c]))) == 2). I would then have to add a white image with the dimension of the first element to the list and there we go. But I imagine there must be an easier way to accomplish this for plotting, most likely already defined somewhere.
So, how can I easily combine any number of square matrices into one big square matrix?
EDIT:
I came up with the following:
def plotimgs(ls):
    shp = ls[0].shape[0] # the image's dimension
    dim = np.ceil(sqrt(len(ls))) # the amount of pictures per row AND column
    emptyimg = (ls[1]*0 + 1)*255 # used to add to the list to allow square matrix
    for i in range(int(dim*dim - len(ls))):
        ls.append(emptyimg)
    enddim = int(shp*dim) # enddim by enddim is the final matrix dimension
    # Convert to 600x600 in the end to resize the pictures to fit the screen
    newimg = cv2.resize(np.reshape(ls, (enddim, enddim)), (600, 600))
    cv2.imshow("frame", newimg)
    cv2.waitKey(10)
plotimgs([a,b,d])
Somehow, even though the dimensions are okay, it actually clones some pictures:
When I give 4 pictures, I get 8 pictures.
When I give 9 pictures, I get 27 pictures.
When I give 16 pictures, I get 64 pictures.
So rather than dim squared, I somehow get dim to the third power of images. Though, e.g.
plotimgs([a]*9) gives a picture with dimensions of 48*3 x 48*3 = 144x144, which should be correct for 9 images?
Here's a snippet that I use for doing this sort of thing:
import numpy as np
def montage(imgarray, nrows=None, border=5, border_val=np.nan):
    """
    Returns an array of regularly spaced images in a regular grid, separated
    by a border
    imgarray:
        3D array of 2D images (n_images, rows, cols)
    nrows:
        the number of rows of images in the output array. if
        unspecified, nrows = ceil(sqrt(n_images))
    border:
        the border size separating images (px)
    border_val:
        the value of the border regions of the output array (np.nan
        renders as transparent with imshow)
    """
    dims = (imgarray.shape[0], imgarray.shape[1]+2*border,
            imgarray.shape[2] + 2*border)
    X = np.ones(dims, dtype=imgarray.dtype) * border_val
    X[:,border:-border,border:-border] = imgarray
    # array dims should be [imageno,r,c]
    count, m, n = X.shape
    if nrows is not None:
        mm = nrows
        nn = int(np.ceil(count/nrows))
    else:
        mm = int(np.ceil(np.sqrt(count)))
        nn = mm
    M = np.ones((nn * n, mm * m)) * np.nan
    image_id = 0
    # range replaces the Python 2 xrange
    for j in range(mm):
        for k in range(nn):
            if image_id >= count:
                break
            sliceM, sliceN = j * m, k * n
            img = X[image_id,:, :].T
            M[sliceN:(sliceN + n), sliceM:(sliceM + m)] = img
            image_id += 1
    return np.flipud(np.rot90(M))
Example:
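# scipy.misc.lena has been removed from recent SciPy releases; any 2D grayscale array can stand in for lena()
from scipy.misc import lena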
from matplotlib import pyplot as plt
img = lena().astype(np.float32)
img -= img.min()
img /= img.max()
imgarray = np.sin(np.linspace(0, 2*np.pi, 25)[:, None, None] + img)
m = montage(imgarray)
plt.imshow(m, cmap=plt.cm.jet)
Reusing chunks from How do you split a list into evenly sized chunks? :
def chunks(l, n):
    """ Yield successive n-sized chunks from l.
    """
    # range replaces the Python 2 xrange
    for i in range(0, len(l), n):
        yield l[i:i+n]
Rewriting your function:
def plotimgs(ls):
    shp = ls[0].shape[0] # the image's dimension
    dim = int(np.ceil(np.sqrt(len(ls)))) # the amount of pictures per row AND column
    emptyimg = (ls[0]*0 + 1)*255 # used to add to the list to allow square matrix
    ls.extend((dim**2 - len(ls)) * [emptyimg]) # filling the list with missing images
    newimg = np.concatenate([np.concatenate(c, axis=0) for c in chunks(ls, dim)], axis=1)
    cv2.imshow("frame", newimg)
    cv2.waitKey(10)
plotimgs([a,b,d])
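For what it's worth, a plain np.reshape of the image list into (dim*h, dim*w) keeps the pixels in memory order, so rows of different images end up interleaved, which is probably why the original attempt looked like cloned pictures. A NumPy-only sketch that moves the axes first is below; the name tile_square and the white fill value are assumptions, and it lays the images out row by row, unlike the concatenate version above, which fills columns first.

```python
import numpy as np

def tile_square(images, fill_value=255):
    """Arrange equally sized 2D images into the smallest square grid that holds them."""
    images = [np.asarray(im) for im in images]
    h, w = images[0].shape
    dim = int(np.ceil(np.sqrt(len(images))))
    # Pad with blank images so that there are exactly dim * dim tiles.
    blank = np.full((h, w), fill_value, dtype=images[0].dtype)
    tiles = images + [blank] * (dim * dim - len(images))
    grid = np.stack(tiles).reshape(dim, dim, h, w)
    # (row, col, y, x) -> (row, y, col, x), then merge into (dim*h, dim*w).
    return grid.swapaxes(1, 2).reshape(dim * h, dim * w)
```

The result can be passed to cv2.resize and cv2.imshow as before.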
I need to do some fast thresholding of a large number of images, with a specific range for each of the RGB channels, i.e. remove (make black) all R values not in [100;110], all G values not in [80;85] and all B values not in [120;140].
Using the Python bindings to OpenCV gives me fast thresholding, but it thresholds all three RGB channels to a single value:
cv.Threshold(cv_im,cv_im,threshold+5, 100,cv.CV_THRESH_TOZERO_INV)
cv.Threshold(cv_im,cv_im,threshold-5, 100,cv.CV_THRESH_TOZERO)
Alternatively I tried to do it manually by converting the image from PIL to numpy:
arr=np.array(np.asarray(Image.open(filename).convert('RGB')).astype('float'))
for x in range(img.size[1]):
    for y in range(img.size[0]):
        bla = 0
        for j in range(3):
            if arr[x,y][j] > threshold2[j] - 5 and arr[x,y][j] < threshold2[j] + 5 :
                bla += 1
        if bla == 3:
            arr[x,y][0] = arr[x,y][1] = arr[x,y][2] = 200
        else:
            arr[x,y][0] = arr[x,y][1] = arr[x,y][2] = 0
While this works as intended, it is horribly slow!
Any ideas as to how I can get a fast implementation of this?
Many thanks in advance,
Bjarke
I think the inRange opencv method is what you are interested in. It will let you set multiple thresholds simultaneously.
So, with your example you would use
# Remember -> OpenCV stores things in BGR order
lowerBound = cv.Scalar(120, 80, 100);
upperBound = cv.Scalar(140, 85, 110);
# this gives you the mask for those in the ranges you specified,
# but you want the inverse, so we'll add bitwise_not...
cv.InRange(cv_im, lowerBound, upperBound, cv_rgb_thresh);
cv.Not(cv_rgb_thresh, cv_rgb_thresh);
Hope that helps!
You can do it with numpy in a much faster way if you don't use loops.
Here's what I came up with:
def better_way():
    img = Image.open("rainbow.jpg").convert('RGB')
    arr = np.array(np.asarray(img))
    R = [(90,130),(60,150),(50,210)]
    # each range is tested against its own channel: 0 = red, 1 = green, 2 = blue
    red_range = np.logical_and(R[0][0] < arr[:,:,0], arr[:,:,0] < R[0][1])
    green_range = np.logical_and(R[1][0] < arr[:,:,1], arr[:,:,1] < R[1][1])
    blue_range = np.logical_and(R[2][0] < arr[:,:,2], arr[:,:,2] < R[2][1])
    # np.logical_and takes only two inputs (a third positional argument is the output array)
    valid_range = red_range & green_range & blue_range
    arr[valid_range] = 200
    arr[np.logical_not(valid_range)] = 0
    outim = Image.fromarray(arr)
    outim.save("rainbowout.jpg")
import timeit
t = timeit.Timer("your_way()", "from __main__ import your_way")
print(t.timeit(number=1))
t = timeit.Timer("better_way()", "from __main__ import better_way")
print(t.timeit(number=1))
The omitted your_way function was a slightly modified version of your code above. This way runs much faster:
$ python pyrgbrange.py
10.8999910355
0.0717720985413
That's 10.9 seconds vs. 0.07 seconds.
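The same idea can be written a bit more compactly with broadcasting. This is only a sketch: the name threshold_rgb_ranges is hypothetical, it assumes an (H, W, 3) array in RGB order, and it uses inclusive bounds to match the [100;110] style ranges in the question.

```python
import numpy as np

def threshold_rgb_ranges(arr, ranges, inside_value=200):
    """Set pixels whose R, G and B each fall inside their range to inside_value, others to 0."""
    arr = np.asarray(arr)
    lower = np.array([lo for lo, hi in ranges])
    upper = np.array([hi for lo, hi in ranges])
    # Broadcasting compares every pixel's three channels against the three bounds.
    inside = np.all((arr >= lower) & (arr <= upper), axis=-1)
    out = np.zeros_like(arr)
    out[inside] = inside_value
    return out
```

For the ranges in the question the call would be threshold_rgb_ranges(arr, [(100, 110), (80, 85), (120, 140)]).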
The PIL point function takes a table of 256 values for each band of the image and uses it as a mapping table. It should be pretty fast. Here's how you would apply it in this case:
def mask(low, high):
    return [x if low <= x <= high else 0 for x in range(0, 256)]
img = img.point(mask(100,110)+mask(80,85)+mask(120,140))
Edit: The above doesn't produce the same output as your numpy example; I followed the description rather than the code. Here's an update:
def mask(low, high):
    return [255 if low <= x <= high else 0 for x in range(0, 256)]
img = img.point(mask(100,110)+mask(80,85)+mask(120,140)).convert('L').point([0]*255+[200]).convert('RGB')
This does a few conversions on the image, making copies in the process, but it should still be faster than operating on individual pixels.
If you stick to using OpenCV, then just cv.Split the image into multiple channels first and then cv.Threshold each channel individually. I'd use something like this (untested):
# Temporary images for each color channel
b = cv.CreateImage(cv.GetSize(orig), orig.depth, 1)
g = cv.CloneImage(b)
r = cv.CloneImage(b)
cv.Split(orig, b, g, r, None)
# Threshold each channel using individual lo and hi thresholds
channels = [ b, g, r ]
thresh = [ (B_LO, B_HI), (G_LO, G_HI), (R_LO, R_HI) ]
for ch, (lo, hi) in zip(channels, thresh):
    cv.Threshold(ch, ch, hi, 100, cv.CV_THRESH_TOZERO_INV)
    cv.Threshold(ch, ch, lo, 100, cv.CV_THRESH_TOZERO)
# Compose a new RGB image from the thresholded channels (if you need it)
dst = cv.CloneImage(orig)
cv.Merge(b, g, r, None, dst)
If your images are all the same size, then you can re-use the created images to save time.