How can I process images faster with Python? - python

I'd trying to write a script that will detect an RGB value on the screen then click the x,y values. I know how to perform the click but I need to process the image a lot faster than my code below currently does. Is this possible with Python?
So far I'm reading a row at a time, when x = 1920 I go onto the second row but it takes about 10 seconds to do one row. By that time the person on screen would have moved into a completely different spot and I have only done one row!
Can I speed this code up OR is there a better way to achieve what I want? If it is not possible in Python I am open to C++ options :)
import Image
x = 0
y = 0
im = Image.open("C:\Users\sean\Desktop\screen.jpg")
pix = im.load()
print im.size #get width and height of the image for iterating over
while x < 1920:
print pix[x,y] #get RGBA value of the pixel of an image
print "x is:" +str(x)
x = x + 1
print "y is: " +str(y)
if x == 1920:
x = 0
y = y + 1

Generally, you want to avoid per-pixel loops in Python. They will always be slow. To get somewhat fast image processing, you need to get used to working with matrices instead of individual pixels. You have basically two options, you can either use NumPy or OpenCV, or a combination of the two. NumPy is a generic mathemtical matrix/array library, but you can do many image-related things with it. If you need something more specific, OpenCV supports many common operations on images.

Thanks for the responses, below is the code I used, I didn't change my original. Turns out it is fast enough but printing is a very costly operation :) It finds the x and y coords of the RGB value in less than a second
#check for certain RGB in image
##need to screen grab
import Image, sys
x = 0
y = 0
im = Image.open('C:\\Users\\sean\\Desktop\\test.jpg')
pix = im.load()
print im.size #get width and height of the image for iterating over
while x < 1914:
value = pix[x,y] #get RGBA value of the pixel of an image
if value == (33, 179, 80):
#call left_click(x,y)
print x,y
x = x + 1
if x == 1914:
x = 0
y = y + 1
print "Finished"
sys.exit()

Image.getpixel is considered very slow. Instead, consider using Image.getdata. That will give you a sequence with data for all the pixels through which you can iterate.
Something like this:
import Image
import math
x = 0
y = 0
im = Image.open("IMG_2977.JPG")
(width, height) = im.size
print width
print height
pix = im.getdata()
i = 0
for pixel in pix:
print pixel
x = i % ( width )
y = math.trunc( i / width)
print "x is: {}".format(x)
print "y is: {}".format(y)
i += 1
Without printing (just storing pixel in a variable) that runs in 2 seconds of user time (0.02 seconds of processor time) on my MacBook Pro.

You may want to do one of two things here.
1. Get a single pixel from the image
In this case, you don't need to iterate through the entire file. Just use im.getpixel. #Daniel makes a valid point that this is slow in a loop, but if you just want a single pixel, this is very efficient.
from PIL import Image
im = Image.open('screenshot.png')
im.getpixel((x, y)) # Returns the colour at (x, y)
2. Process multiple pixels from the image
This is best done using NumPy as #Lukáš suggests. If you want to do something like get the average colour of the 10 x 10 grid around the pixel, for example.
You can get the data as a NumPy array using scipy.misc.fromimage
from PIL import Image
from scipy.misc import fromimage
im = Image.open('screenshot.png')
data = fromimage(img)
Let's compare the time it takes to get this data against a for loop.
In [32]: pix = im.load()
In [33]: %timeit fromimage(im)
10 loops, best of 3: 8.24 ms per loops
In [34]: %timeit [pix[x, y] for x in xrange(im.size[0]) for y in xrange(im.size[1])]
1 loops, best of 3: 637 ms per loop
To summarise:
scipy.misc.fromimage is the fastest, at ~8ms for a 1920x1080 image
Looping through pix[x, y] takes ~640ms, about 80 times slower

There is something called pyautogui and it will find the entire image on the screen within 1-5 seconds usually, which is not too fast but seems better than you current option

You can get the first half of image and second half of image in 2 threads and process these halfs, but for me it speed up only for 15%. Normal speed for me is 2,7 secs at the image of height 375 and width 483. Threads speed it up to 2,3 secs. This is why I'm searching this question.

Related

Finding lines manually in an image

I have an image (saved as a variable called canny_image) and it looks like this after preprocessing.
I am basically trying to find the distance between the first two vertical lines. I tried using the hough_line function from skimage, but it's unable to find the first line, so I thought it might be easier to solve this manually.
I am basically trying to solve this by going through each row in the image until I get to the first pixel with a value of 255, (the lines have a value of 255, while everything else is zero), and then I store the location of that pixel in an array. And I take the mode of the values in the array as the x location of the first line. I'll do the same for the 2nd line by using the first x-value as a starting point.
def find_lines(canny_image):
threshold = 255
for y in range(canny_image.shape[0]):
for x in range(canny_image.shape[1]):
if canny_image[x, y] == threshold:
return x
This is the code I wrote to get the x-location of the first line, however, I'm not getting the desired output. Any help on how to solve this will be much appreciated. Thanks!
Perhaps try something like this
# Returns an array of lines x positions in an image
def find_line_x_positions(image, lines_to_detect: int, buffer_zone: int):
threshold = 255
(height, width) = image.shape
x_position_sums = np.zeros((lines_to_detect, 2), np.double) # For each line, store x_pos sum[i,0] and point count[i,1]
for y in range(height):
buffer = 0
line_index = 0
for x in range(width):
if buffer > 0:
buffer -= 1
if (image[y, x] >= threshold) and (buffer == 0):
buffer = buffer_zone
x_position_sums[line_index, 0] += x
x_position_sums[line_index, 1] += 1
line_index += 1
if ((line_index) == lines_to_detect):
break
# Divide the x position sums by the point counts to get the average x position for each line
results = x_position_sums[np.all(x_position_sums,axis=1)]
results = np.divide(results[:,0],results[:,1])
return results
You can also try OpenCV's HoughLines() function which is simpler to implement than scikit lib. When I tested OpenCV implementation out, it seems to have a hard time finding vertical lines(within 10 degrees from vertical) but you can solve this by rotating your image X degrees and look for lines within that range of rotation.

Mark pixels of a blank image

I Want to mark the pixels,
Mark=[2, 455, 6, 556, 12, 654, 22, 23, 4,86,.....]
in such a way that it will not mark the 1st 2 pixels and then mark next 455 pixels by a color, again for next 6 pixels it will not mark and again mark the next 556 pixels by the same color and so on.
The size of the image is 500x500x3. How do I calculate these steps?
Img=np.zeros((500,500,3),dtype=np.uint8)
Your algorithm is actually in your question. By 500x500x3 I guess that you mean your image is 500 (width) on 500 (height) with 3 color channel?
It could be implemented as follows, without any optimizations:
color = (128, 50, 30)
x, y = 0, 0
for (skip, count) in [Mark[n:n+2] for n in range(len(Mark) // 2)]:
x += skip
y += x // 500 # keep track of the lines, when x > 500,
# it means we are on a new line
x %= 500 # keep the x in bounds
# colorize `count` pixels in the image
for i in range(0, count):
Img[x, y, 0] = color[0]
Img[x, y, 1] = color[1]
Img[x, y, 2] = color[2]
x += 1
y += x // 500
x %= 500 # keep the x in bounds
The zip([a for i, a in enumerate(Mark) if i % 2 == 0], [a for i, a in enumerate(Mark) if i % 2 != 0]) is a just a way to group the pairs (skip, pixel to colorize). It could definitely be improved though, I'm no Python expert.
EDIT: modified the zip() to use [Mark[n:n+2] for n in range(len(Mark) // 2)] as suggested by Peter, much simpler and easier to understand.
The easiest way is probably to convert the image to a Numpy array:
import numpy as np
na = np.array(Img)
And then use Numpy ravel() to give you a flattened (1-D) view of the array
flat = np.ravel(na)
You can now see the shape of your flat view:
print(flat.shape)
Then you can do your colouring by iterating over your array of offsets from your question. Then the good news is that, because ravel() gives you a view into your original data all the changes you make to the view will be reflected in your original data.
So, to get back to a PIL Image, all you need is:
RecolouredImg = Image.fromarray(na)
Try it out by just colouring the first ten pixels before worrying about your long list.
If you like working with Python lists (I don't), you can achieve a similar effect by using PIL getdata() to get a flattened list of pixels and then process the list against your requirements and putdata() to put them all back. The effect will be the same, just choose a method that fits how your brain works.

Fast way to iterate over sparse numpy array but only where elements are 1

I have an array representing a light source in space made as such:
source2D = np.zeros((256, 256))
with any amount of the pixels = 1. An example I have is a point source which is generated by:
source2D[126:128, 126:128] = 1
And I am running a monte carlo simulation which shoots a ray from each part of the array where the value = 1. Currently I am iterating over the entire array but I would save a lot of time by only picking out the elements where array = 1 and iterating over them. I should add that this function should be made to accept a generic 256x256 where any elements could be set to 1, so cropping the array is not an option. What is the fastest way to do this? I am also using tensorflow so if there is an implementation using that, that would also be an option
Right now my code looks somethinglike this:
while pc < 1000000:
pc+=1
# Randomize x and y as coordinates on source
x = np.random.randint(0, source2D.shape[0]) # 0 to 255 for this example
y = np.random.randint(0, source2D.shape[1]) # 0 to 255 for this example
# Shoot raycast from x,y to point on detector
Solved with hpaulj's comment:
source2D = np.zeros((256, 256)) # Testing with point source
source2D[126:128, 126:128] = 1
nonzero_entries = np.nonzero(source2D)
i = np.random.choice(nonzero_entries[0])
j = np.random.choice(nonzero_entries[1])

Numpy: Efficient mapping of single values in np arrays

I have a 3D numpy array respresenting an image in HSV color space (shape = (h=1000, w=3000, 3)).
The last dimension of the image is [H,S, V]. I want to subtract 20 from the H channel from all the pixels IF the pixel value is >20 , but leave S and V intact.
I wrote the following vectorized function:
def sub20(x):
# x is a array in the format [H,S, V]
return np.uint8([H-20, S, V])
vec= np.vectorize(sub20, otypes=[np.uint8],signature="(i)->(i)")
img2= vec(img1)
What this vectorised function does is to accept the last dimension of the image [H,S,V] and output
[H-20, S, V]
I dont know how to make it subtract 20 if H is greater than 20. it also takes 1 minute to execute. I want the script to accept live webcam feed. Is there any way to make it faster?
Thanks
You can simply slice with condition:
img1[:,:,0][img1[:,:,0]>=20] -= 20
Or also make use of np.where:
img1[:,:,0] = np.where(img1[:,:,0]>=20, img1[:,:,0]-20, img1[:,:,0])
Do you need to use the vectorize function?
Otherwise you could only use the following command:
# if you want to make change directly on same image.
img1[:,:,0] -= 20
# if you want to leave img1 in the same state.
img2 = np.array(img1)
img2[:,:,0] = img1[:,:,0] - 20
Update (12:08 - 5.4.2020)
To incorporate that values never get below 0 I would recommend to compute it in two steps as Mercury mentioned:
# if you want to make changes directly on same image.
img1[:,:,0] -= 20
img1[img1[:,:,0] < 0] = 0
# if you want to leave img1 in the same state.
img2 = np.array(img1)
img2[:,:,0] = img2[:,:,0] - 20
img2[img2[:,:,0] < 0] = 0

Numpy manipulating array of True values dependent on x/y index

So I have an array (it's large - 2048x2048), and I would like to do some element wise operations dependent on where they are. I'm very confused how to do this (I was told not to use for loops, and when I tried that my IDE froze and it was going really slow).
Onto the question:
h = aperatureimage
h[:,:] = 0
indices = np.where(aperatureimage>1)
for True in h:
h[index] = np.exp(1j*k*z)*np.exp(1j*k*(x**2+y**2)/(2*z))/(1j*wave*z)
So I have an index, which is (I'm assuming here) essentially a 'cropped' version of my larger aperatureimage array. *Note: Aperature image is a grayscale image converted to an array, it has a shape or text on it, and I would like to find all the 'white' regions of the aperature and perform my operation.
How can I access the individual x/y values of index which will allow me to perform my exponential operation? When I try index[:,None], leads to the program spitting out 'ValueError: broadcast dimensions too large'. I also get array is not broadcastable to correct shape. Any help would be appreciated!
One more clarification: x and y are the only values I would like to change (essentially the points in my array where there is white, z, k, and whatever else are defined previously).
EDIT:
I'm not sure the code I posted above is correct, it returns two empty arrays. When I do this though
index = (aperatureimage==1)
print len(index)
Actually, nothing I've done so far works correctly. I have a 2048x2048 image with a 128x128 white square in the middle of it. I would like to convert this image to an array, look through all the values and determine the index values (x,y) where the array is not black (I only have white/black, bilevel image didn't work for me). I would then like to take all the values (x,y) where the array is not 0, and multiply them by the h[index] value listed above.
I can post more information if necessary. If you can't tell, I'm stuck.
EDIT2: Here's some code that might help - I think I have the problem above solved (I can now access members of the array and perform operations on them). But - for some reason the Fx values in my for loop never increase, it loops Fy forever....
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor
def createImage(aperature, type):
imsize = aperature*8
middle = imsize/2
im = Image.new("L", (imsize,imsize))
draw = ImageDraw.Draw(im)
box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2))
import sys, os
from scipy.signal import *
import numpy as np
import Image, ImageDraw, ImageFont, ImageOps, ImageEnhance, ImageColor
def createImage(aperature, type):
imsize = aperature*8 #Add 0 padding to make it nice
middle = imsize/2 # The middle (physical 0) of our image will be the imagesize/2
im = Image.new("L", (imsize,imsize)) #Make a grayscale image with imsize*imsize pixels
draw = ImageDraw.Draw(im) #Create a new draw method
box = ((middle-aperature/2, middle-aperature/2), (middle+aperature/2, middle+aperature/2)) #Bounding box for aperature
if type == 'Rectangle':
draw.rectangle(box, fill = 'white') #Draw rectangle in the box and color it white
del draw
return im, middle
def Diffraction(aperaturediameter = 1, type = 'Rectangle', z = 2000000, wave = .001):
# Constants
deltaF = 1/8 # Image will be 8mm wide
z = 1/3.
wave = 0.001
k = 2*pi/wave
# Now let's get to work
aperature = aperaturediameter * 128 # Aperaturediameter (in mm) to some pixels
im, middle = createImage(aperature, type) #Create an image depending on type of aperature
aperaturearray = np.array(im) # Turn image into numpy array
# Fourier Transform of Aperature
Ta = np.fft.fftshift(np.fft.fft2(aperaturearray))/(len(aperaturearray))
# Transforming and calculating of Transfer Function Method
H = aperaturearray.copy() # Copy image so H (transfer function) has the same dimensions as aperaturearray
H[:,:] = 0 # Set H to 0
U = aperaturearray.copy()
U[:,:] = 0
index = np.nonzero(aperaturearray) # Find nonzero elements of aperaturearray
H[index[0],index[1]] = np.exp(1j*k*z)*np.exp(-1j*k*wave*z*((index[0]-middle)**2+(index[1]-middle)**2)) # Free space transfer for ap array
Utfm = abs(np.fft.fftshift(np.fft.ifft2(Ta*H))) # Compute intensity at distance z
# Fourier Integral Method
apindex = np.nonzero(aperaturearray)
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Ufim = abs(np.fft.fftshift(np.fft.fft2(U))/len(U))
# Save image
fim = Image.fromarray(np.uint8(Ufim))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(Utfm))
ftfm.save("PATH\FTFM.jpg")
print "that may have worked..."
return
if __name__ == '__main__':
Diffraction()
You'll need numpy, scipy, and PIL to work with this code.
When I run this, it goes through the code, but there is no data in them (everything is black). Now I have a real problem here as I don't entirely understand the math I'm doing (this is for HW), and I don't have a firm grasp on Python.
U[index[0],index[1]] = aperaturearray[index[0],index[1]] * np.exp(1j*k*((index[0]-middle)**2+(index[1]-middle)**2)/(2*z))
Should that line work for performing elementwise calculations on my array?
Could you perhaps post a minimal, yet complete, example? One that we can copy/paste and run ourselves?
In the meantime, in the first two lines of your current example:
h = aperatureimage
h[:,:] = 0
you set both 'aperatureimage' and 'h' to 0. That's probably not what you intended. You might want to consider:
h = aperatureimage.copy()
This generates a copy of aperatureimage while your code simply points h to the same array as aperatureimage. So changing one changes the other.
Be aware, copying very large arrays might cost you more memory then you would prefer.
What I think you are trying to do is this:
import numpy as np
N = 2048
M = 64
a = np.zeros((N, N))
a[N/2-M:N/2+M,N/2-M:N/2+M]=1
x,y = np.meshgrid(np.linspace(0, 1, N), np.linspace(0, 1, N))
b = a.copy()
indices = np.where(a>0)
b[indices] = np.exp(x[indices]**2+y[indices]**2)
Or something similar. This, in any case, sets some values in 'b' based on the x/y coordinates where 'a' is bigger than 0. Try visualizing it with imshow. Good luck!
Concerning the edit
You should normalize your output so it fits in the 8 bit integer. Currently, one of your arrays has a maximum value much larger than 255 and one has a maximum much smaller. Try this instead:
fim = Image.fromarray(np.uint8(255*Ufim/np.amax(Ufim)))
fim.save("PATH\Fim.jpg")
ftfm = Image.fromarray(np.uint8(255*Utfm/np.amax(Utfm)))
ftfm.save("PATH\FTFM.jpg")
Also consider np.zeros_like() instead of copying and clearing H and U.
Finally, I personally very much like working with ipython when developing something like this. If you put the code in your Diffraction function in the top level of your script (in place of 'if __ name __ &c.'), then you can access the variables directly from ipython. A quick command like np.amax(Utfm) would show you that there are indeed values!=0. imshow() is always nice to look at matrices.

Categories