I'm currently trying to build an ANN that can play the online game "Helicopter Game" (see picture below if you're unfamiliar) using only the pixels of screenshots for training.
I've built similar models in OpenAI Universe but was hoping to try my hand at training directly on an online game instead of using an emulator.
The first thing I tried was to use the Selenium screenshot method to capture 100 screenshots at 10 frames per second.
import time

for i in range(100):
    driver.save_screenshot(r'C:\Users\MyName\Desktop\Screenshots\shot' + str(i) + '.png')
    time.sleep(0.1)
But Selenium doesn't seem to be able to handle that kind of speed: it can only capture about 2 or 3 screenshots per second, even with the time delay removed, and that's before doing any preprocessing of the images.
Does anyone know of a method faster than what I'm trying to accomplish with Selenium?
You can give the MSS module a try, and in particular its example for capturing only the relevant part of the screen.
The module also works with PIL, NumPy and OpenCV for further processing; just check the docs :)
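A minimal sketch of what that capture loop could look like with MSS, converting each frame to a small grayscale array for the network. The region coordinates are placeholders; adjust them to wherever the game canvas sits on your screen.

```python
# Capture a region of the screen at ~10 FPS with MSS. MSS returns BGRA
# pixel data, so we drop the alpha channel and average the colour
# channels to get a grayscale frame.
import time

import numpy as np


def to_grayscale(frame):
    """Collapse an (H, W, 4) BGRA frame from MSS to (H, W) grayscale."""
    return frame[:, :, :3].mean(axis=2).astype(np.uint8)


def capture(n_frames=100, fps=10):
    import mss  # imported here so the helper above works without mss installed

    # Placeholder region -- change to match the game's position on screen.
    region = {"left": 0, "top": 0, "width": 640, "height": 480}
    frames = []
    with mss.mss() as sct:
        for _ in range(n_frames):
            raw = sct.grab(region)
            frames.append(to_grayscale(np.asarray(raw)))
            time.sleep(1 / fps)
    return frames
```

In practice MSS can grab a small region far faster than Selenium's full-page screenshot, which is why it suits this use case.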
I am trying to analyse a driver's face with a Raspberry Pi while he's driving. I am using a Raspberry Pi 4 and mediapipe for face recognition with Python, and I am trying to maximise performance so I can analyse as many images per second as possible. I have 3 stages: getting the image from the camera, analysing it with mediapipe, and showing the image.

My first question is: is it better to use multithreading for these 3 stages, or multiprocessing? With multithreading I was able to get images at 30 fps (the camera's limit) and to show them at that rate too, but I was only analysing them at 13 fps. I tried multiprocessing but I am not able to figure it out: the first process, getting the image, works, but the other 2 do not. This image is my class VideoGet, the first process, to show you how I did my multiprocessing, and this code is my function that calls every process together.
I was expecting multiprocessing to be the best approach.
I saw that maybe I should use a pool instead of Process objects, but I am not sure.
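One way the three stages could be wired together as separate processes is with queues between them. This is only a sketch of the hand-off structure, not the poster's actual code: the camera and mediapipe calls are replaced with placeholders, and the "display" stage runs in the main process.

```python
# Three-stage pipeline (capture -> analyse -> display) as processes
# linked by multiprocessing queues. A None sentinel signals end-of-stream.
import multiprocessing as mp


def capture(out_q, n_frames):
    for i in range(n_frames):
        out_q.put(i)          # placeholder for a camera frame
    out_q.put(None)           # sentinel: no more frames


def analyse(in_q, out_q):
    while True:
        frame = in_q.get()
        if frame is None:
            out_q.put(None)   # pass the sentinel along
            break
        out_q.put(frame * 2)  # placeholder for mediapipe work


def run_pipeline(n_frames=5):
    q1, q2 = mp.Queue(), mp.Queue()
    procs = [mp.Process(target=capture, args=(q1, n_frames)),
             mp.Process(target=analyse, args=(q1, q2))]
    for p in procs:
        p.start()
    results = []
    while True:               # the "display" stage, in the main process
        item = q2.get()
        if item is None:
            break
        results.append(item)
    for p in procs:
        p.join()
    return results
```

Whether this beats threads depends on where the bottleneck is: if mediapipe releases the GIL during inference, threads may already be close to optimal, and a Pool mainly helps when you can analyse several frames in parallel and tolerate reordering.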
I'm trying to make a program that takes an image from a specific part of my screen and then converts it to text. I know I should use OpenCV and Tesseract for the conversion, but I don't understand how I can keep feeding it a specific image from my screen. The image will change about every 1-2 seconds.
You can use pyautogui. It has a built-in screenshot function.
import pyautogui
screenshot = pyautogui.screenshot()
screenshot.save(r'./screenshot.png')
You can use a for loop and save the screenshot on the i-th iteration.
Something like screenshot.save(r'./screenshot' + str(i) + '.png')
You might want to make a separate folder to store the screenshots to keep everything clean, or you can just overwrite the same image each time.
Then you can use OpenCV and Tesseract to read the image you just saved, using the same formula as before: r'./screenshot' + str(i) + '.png'.
I should say that this may not be fast enough to run every 1-2 seconds; I haven't timed it yet. As for looking at only a specific part of the screen, pyautogui.screenshot() accepts a region argument, so you can grab just that area instead of the whole screen. If anyone has a better way to do it, or knows how to make it faster, please tell me and I will update the answer.
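Putting the pieces above together, a sketch of the capture-and-OCR loop might look like this. It assumes pyautogui and pytesseract are installed and the tesseract binary is on the PATH; the region tuple and the `shots` folder name are placeholders.

```python
# Repeatedly grab one region of the screen and run OCR on it.
import time


def region_filename(i, folder="shots"):
    """Build the path for the i-th screenshot, mirroring the answer's formula."""
    return f"./{folder}/screenshot{i}.png"


def watch(region=(0, 0, 400, 100), interval=1.5, n=10):
    import pyautogui    # requires a display, so imported inside the function
    import pytesseract

    texts = []
    for i in range(n):
        # region=(left, top, width, height) grabs only that part of the screen
        shot = pyautogui.screenshot(region=region)
        texts.append(pytesseract.image_to_string(shot))
        time.sleep(interval)
    return texts
```

Passing the PIL image straight to pytesseract avoids the save-then-reload round trip, which helps with the 1-2 second budget.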
I have this picture. I need to identify the animal in this picture as shown using an image processing algorithm. I'm thinking of using Python for this. But I don't know which algorithm to use and I don't know where to start. Where should I start?
The best place to start is fast.ai. You will find the videos you need and the code. You can do it on any computer, even a cheap laptop. You will also want to look into LIME model explanations for image classifiers.
I have a small project that I am tinkering with: a small box with my camera attached on top of it. I want to get a notification if anything is added to or removed from the box.
My original plan was to constantly take images and compare them for differences, but that approach is not working well: even comparing two images of the same scene reports differences, and I do not know why.
Can anyone suggest another way to achieve this?
Well, I can suggest a way of doing this. Basically, you can use object detection coupled with a machine learning algorithm. First, train your program to recognize the closed box: take, say, 10 pictures of it and train on those, so the program can detect when the box is closed. Then, when the box is not closed (open, missing, or something else), you can code your program to fire off a signal or whatever it is you are trying to do.

So the first step is to write code for object detection. There are numerous ways of doing this, such as Haar cascade classifiers or support vector machines. Once you have trained your program to look for the closed box, you can run it to predict what's happening in every frame of the camera feed.

Hope this answered your question! Cheers!
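Before reaching for a trained detector, it may be worth knowing why plain image comparison "fails": sensor noise changes pixel values slightly between shots, so a pixel-exact comparison always reports differences. A lighter-weight sketch is to smooth both frames and count only differences above a noise floor. This is plain NumPy for illustration; with OpenCV you would use cv2.GaussianBlur and cv2.absdiff instead, and the thresholds here are placeholders to tune.

```python
import numpy as np


def changed_fraction(a, b, noise_floor=10):
    """Fraction of pixels whose grayscale difference exceeds noise_floor,
    after a crude 2x2 box blur that averages out per-pixel sensor noise."""
    def blur(img):
        img = img.astype(np.float32)
        return (img[0::2, 0::2] + img[0::2, 1::2] +
                img[1::2, 0::2] + img[1::2, 1::2]) / 4.0

    diff = np.abs(blur(a) - blur(b))
    return float((diff > noise_floor).mean())


def box_touched(prev, curr, threshold=0.05):
    """Flag a change when more than `threshold` of the image moved."""
    return changed_fraction(prev, curr) > threshold
```

If lighting in the box is stable, this kind of thresholded differencing can be enough on its own; the ML route above earns its keep when you need to know *what* changed, not just *that* something changed.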
I am using Python 3.5.1 and OpenCV 3.0.0.
I am working on a python program that can play games, so it needs to 'see' what is going on, on the screen. How can this be achieved?
import numpy as np
import cv2

cap = cv2.VideoCapture(0)
while True:
    ret, frame = cap.read()  # frame is a NumPy array (BGR)
    # work with frames here
Is there an int a such that cv2.VideoCapture(a) will take the desktop screen as video input? I tried making it work but followed a rather clumsy approach: I captured the screen repeatedly using
import os
os.system('screencapture test.jpg')
and then opened test.jpg using cv2.imread. This was a very slow approach. Searching online, I found this question, Screen Capture with OpenCV and Python-2.7, which does the same thing but more efficiently. Still, it captures individual screenshots and processes them one by one, not a true video stream. I also found How to capture the desktop in OpenCV (i.e. turn a bitmap into a Mat)?, which I think is close to what I am trying to do, but it is in C++; if someone can help me convert it to Python, I would highly appreciate it.
The main thing is that the program will be doing something like MarI/O, so speed is a concern. Any help is appreciated; go easy on me, I am (relatively) new to OpenCV.
Thanks.
Just an update on this question in case anyone wants a solution.
Taking screenshots can be achieved with the pyautogui module:
import pyautogui
import matplotlib.pyplot as plt
image = pyautogui.screenshot()
plt.imshow(image)
If you want to read it as a stream:
while True:
    image = pyautogui.screenshot()
    # further processing
    if finished:  # your own stopping condition
        break
According to the documentation,
On a 1920 x 1080 screen, the screenshot() function takes roughly 100 milliseconds
So this solution can be used if your application does not demand a high frame rate.
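To feed the stream above into OpenCV, the PIL image pyautogui returns (RGB channel order) has to be converted to the BGR ndarray OpenCV expects. A sketch of that conversion, shown in plain NumPy; cv2.cvtColor(arr, cv2.COLOR_RGB2BGR) does the same job.

```python
import numpy as np


def pil_rgb_to_bgr(image):
    """Convert an RGB PIL image (or HxWx3 array) to OpenCV's BGR layout."""
    arr = np.asarray(image)
    return arr[:, :, ::-1].copy()  # reverse the channel axis: RGB -> BGR


def grab_frame():
    import pyautogui  # requires a display; stands in for the loop above
    return pil_rgb_to_bgr(pyautogui.screenshot())
```

With that, each iteration of the loop yields a frame you can pass to any cv2 function directly.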
Taking screenshots in a separate thread sounds like a good solution.
You can also use a virtual webcam, but that is a heavyweight solution.
Or you could capture the desktop directly with ffmpeg: https://trac.ffmpeg.org/wiki/Capture/Desktop