The end goal of the project is to take a capture of the screen and output the locations (midpoint and radius) of the circles in the screenshot. So the very first step of this project is capturing the screen and sending it through a circle-finding function.
I started here: Screen Capture with OpenCV and Python-2.7
That code does what it says, and on my machine cv2.imshow successfully displays the screenshots as it should. However, I want it to work with this example: https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_houghcircles/py_houghcircles.html
Basically, the screen-capture code works with cv2.imshow; however, I want it to work with cv2.imread so that it's compatible with the example I want to copy.
I've tried a few basic things to no avail; for reference, see below!
Attempt 1: http://prntscr.com/n8tpyh
printscreen_pil = ImageGrab.grab()
printscreen_numpy = np.array(printscreen_pil.getdata(), dtype='uint8')
img = cv2.imread(printscreen_numpy, 0)
Errors on cv2.imread with message
TypeError: bad argument type for built-in operation
Attempt 2: http://prntscr.com/n8tqtp
from mss import mss
mon = {'top':160 , 'left': 160 , 'width': 200, 'height': 200 }
sct = mss()
sct.get_pixels(mon)
Errors on sct.get_pixels with message AttributeError: 'MSS' object has no attribute 'get_pixels'
Attempt 3: https://prnt.sc/n8tw7w
img_grab = ImageGrab.grab(bbox=(0,0,500,500))
img = np.array(img_grab)
#img = cv2.imread(img)
img = cv2.medianBlur(img,5)
cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR)
Errors on cimg = cv2.cvtColor(img,cv2.COLOR_GRAY2BGR) with a very long message, which I will summarize as (-2:Unspecified error) and Invalid number of channels in input image: 'VScn::contains(scn)' where 'scn' is 3. For more info, please open the screenshot; if you can't, drop a comment and I'll type that chunk out for you!
Attempt 4: http://prntscr.com/n8u0jc
cords = {'top':40 , 'left': 0 , 'width': 800, 'height': 640 }
with mss() as sct:
    img = np.array(sct.grab(cords))
img = cv2.medianBlur(img,5)
cimg = cv2.cvtColor(img , cv2.COLOR_GRAY2BGR)
Errors on cimg = cv2.cvtColor(img , cv2.COLOR_GRAY2BGR) with a message that I will again summarize as Invalid number of channels in input image: 'VScn::contains(scn)' where 'scn' is 4
Thanks Everyone!!!!
Based on your attempt 4, use this to define cimg:
cimg = cv2.cvtColor(img, cv2.COLOR_BGRA2GRAY)
Related
I am trying to use a webcam to capture a photo and resize it using cv2.resize(), but I am getting the error below. I am using Google Colab with tensorflow==2.4.1, tensorflow-gpu==2.4.1, opencv-python and matplotlib. cv2.resize is supposed to take a NumPy array and the target dimensions as parameters, but every time I use it, it gives me this error. When I comment it out, the code works fine.
def take_photo(filename='photo.jpg', quality=0.8):
    js = Javascript('''
        async function takePhoto(quality) {
            const div = document.createElement('div');
            const capture = document.createElement('button');
            capture.textContent = 'Capture';
            div.appendChild(capture);
            const video = document.createElement('video');
            video.style.display = 'block';
            const stream = await navigator.mediaDevices.getUserMedia({video: true});
            document.body.appendChild(div);
            div.appendChild(video);
            video.srcObject = stream;
            await video.play();
            // Resize the output to fit the video element.
            google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);
            // Wait for Capture to be clicked.
            await new Promise((resolve) => capture.onclick = resolve);
            const canvas = document.createElement('canvas');
            canvas.width = video.videoWidth;
            canvas.height = video.videoHeight;
            canvas.getContext('2d').drawImage(video, 0, 0);
            stream.getVideoTracks()[0].stop();
            div.remove();
            return canvas.toDataURL('image/jpeg', quality);
        }
        ''')
    display(js)
    # get photo data
    data = eval_js('takePhoto({})'.format(quality))
    # get OpenCV format image
    img = js_to_image(data)
    # grayscale img
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    print(type(gray))
    #print(gray.shape)
    # get face bounding box coordinates using Haar Cascade
    faces = face_cascade.detectMultiScale(gray)
    # draw face bounding box on image
    for (x,y,w,h) in faces:
        img = cv2.rectangle(img,(x,y),(x+w,y+h),(255,0,0),2)
    # save image
    cv2.imwrite(filename, img)
    return filename
## js_to_data function
# function to convert the JavaScript object into an OpenCV image
def js_to_image(js_reply):
    """
    Params:
        js_reply: JavaScript object containing image from webcam
    Returns:
        img: OpenCV BGR image
    """
    # decode base64 image
    image_bytes = b64decode(js_reply.split(',')[1])
    # convert bytes to numpy array
    jpg_as_np = np.frombuffer(image_bytes, dtype=np.uint8)
    #Resize image
    jpg_as_np= cv2.resize(jpg_as_np,(250,250))
    jpg_as_np.shape
    # decode numpy array into OpenCV BGR image
    img = cv2.imdecode(jpg_as_np, flags=1)  # error here
    return img
# function to convert OpenCV Rectangle bounding box image into base64 byte string to be overlayed on video stream
def bbox_to_bytes(bbox_array):
    """
    Params:
        bbox_array: Numpy array (pixels) containing rectangle to overlay on video stream.
    Returns:
        bytes: Base64 image byte string
    """
    # convert array into PIL image
    bbox_PIL = PIL.Image.fromarray(bbox_array, 'RGBA')
    iobuf = io.BytesIO()
    # format bbox into png for return
    bbox_PIL.save(iobuf, format='png')
    # format return string
    bbox_bytes = 'data:image/png;base64,{}'.format((str(b64encode(iobuf.getvalue()), 'utf-8')))
    return bbox_bytes
from numpy.lib import type_check
try:
    filename = take_photo()
    print(type(filename))
    print('Saved to {}'.format(filename))
Error: OpenCV(4.6.0) /io/opencv/modules/imgcodecs/src/loadsave.cpp:818: error: (-215:Assertion failed) buf.checkVector(1, CV_8U) > 0 in function 'imdecode_'
I'm trying to make a little weather forecast program in Python which gives you an image with an overview of the weather.
To get the weather, I'm using OpenWeatherMap, and it works OK for me. For the image, I'm using PIL to paste the weather icon, but for some reason part of it is not being pasted. Here you can see what the icon should be: https://openweathermap.org/img/wn/04n@2x.png, and here's how it appeared in the image that came out of my script:
Here's the part of the code that generates the image:
def drawImage(d):
    img=PIL.Image.open("base.png")
    url=f"https://openweathermap.org/img/wn/{d['icon']}@2x.png"
    weatherIcon=Image.open(requests.get(url, stream=True).raw)
    print(url)
    img.paste(weatherIcon, (00, 10))
    now=datetime.now()
    name="boards/"+now.strftime('%d_%m_%Y_%H_%M_%S')+".png"
    img.save(name)
    return name
Some notes on this code:
The base.png is just a 720x720 blank image
The d that gets passed in is a dictionary with all the information, but here it only needs the icon, so I'll give this example: {"icon": "04n"}
I got the URL for the image from the website of OpenWeatherMap, see documentation: https://openweathermap.org/weather-conditions
This is happening because the icon image you download has transparency (an alpha channel). To remove that, you can use this answer.
I've simplified it slightly. Define the following function:
def remove_transparency(im, bg_colour=(255, 255, 255)):
    if im.mode in ('RGBA', 'LA') or (im.mode == 'P' and 'transparency' in im.info):
        # Convert first: a palette image has no 'A' band until it is converted.
        im = im.convert('RGBA')
        alpha = im.getchannel('A')
        bg = Image.new("RGBA", im.size, bg_colour + (255,))
        bg.paste(im, mask=alpha)
        return bg
    else:
        return im
and call it in your code:
weatherIcon=Image.open(requests.get(url, stream=True).raw)
print(url)
weatherIcon = remove_transparency(weatherIcon)
img.paste(weatherIcon, (00, 10))
You might want to adjust that bg_colour parameter.
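An alternative that keeps the board's own background visible, given here as a sketch rather than as part of the answer above: pass the icon as its own mask, so that Pillow uses its alpha channel when pasting. The two images below are synthetic stand-ins for base.png and the downloaded icon.

```python
from PIL import Image

# Stand-ins: a white 720x720 board and a 100x100 icon that is fully
# transparent except for an opaque red square in the middle.
board = Image.new("RGB", (720, 720), (255, 255, 255))
icon = Image.new("RGBA", (100, 100), (0, 0, 0, 0))
icon.paste((255, 0, 0, 255), (25, 25, 75, 75))

# The third argument is the mask; an RGBA image supplies its alpha band.
board.paste(icon, (0, 10), icon)

print(board.getpixel((5, 15)))    # under a transparent part of the icon
print(board.getpixel((50, 60)))   # under the opaque part of the icon
```

In the question's code this would correspond to img.paste(weatherIcon, (00, 10), weatherIcon), assuming the downloaded icon opens in RGBA mode (otherwise convert it with weatherIcon.convert('RGBA') first).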
This code works for users whose profile pictures are in .png format. However, for users with animated .gif profile pictures, the code does not work. It gives this error:
OSError(f"cannot write mode {mode} as PNG") from e
OSError: cannot write mode PA as PNG
I attempted to change all .png to .gif, but I still had trouble:
ValueError: image has wrong mode
This is the aforementioned code, which only works with the .png format.
class avatar(commands.Cog):
    def __init__(self, client):
        self.client = client
    @commands.Cog.listener()
    async def on_member_join(self, member):
        guild = self.client.get_guild(GUILD_ID)
        general_channel = guild.get_channel(CHANNEL_ID)
        url = requests.get(member.avatar_url)
        avatar = Image.open(BytesIO(url.content))
        avatar = avatar.resize((285,285))
        bigsize = (avatar.size[0] * 3, avatar.size[1] * 3)
        mask = Image.new('L', bigsize, 0)
        draw = ImageDraw.Draw(mask)
        draw.ellipse((0, 0) + bigsize, fill=255)
        mask = mask.resize(avatar.size, Image.ANTIALIAS)
        avatar.putalpha(mask)
        output = ImageOps.fit(avatar, mask.size, centering=(1420, 298))
        output.putalpha(mask)
        output.save('avatar.png')
        img = Image.open('welcomealpha.png')
        img.paste(avatar,(1408,265), avatar)
        img.save('wel.png')
        file = discord.File('wel.png')
        channel = self.client.get_channel(CHANNEL_ID)
        await channel.send(file=file)
        guild = self.client.get_guild(GUILD_ID)
        channel = guild.get_channel(CHANNEL_ID)
Could it be that the bot doesn't know how to discern between .gif & .png ? If that's the case, what would be the most efficient way for the bot to recognize which profile picture format each new user has in order to manipulate image/gif accordingly to its format?
The error message is quite clear here: your original Image object has mode P, i.e. it's a palettised image. When adding an alpha channel as you did, you get mode PA. As Pillow tells you, saving Image objects with mode PA as PNG is not supported. Since you only want to save a static PNG without any animation, I assume it's safe to convert the Image object to mode RGB right at the beginning, so that you end up with an RGBA mode Image object, which can be saved as PNG without any problems.
I took the following excerpt from your code and added the conversion to mode RGB:
from PIL import Image, ImageDraw, ImageOps
avatar = Image.open('homer.gif').convert('RGB')
avatar = avatar.resize((285, 285))
bigsize = (avatar.size[0] * 3, avatar.size[1] * 3)
mask = Image.new('L', bigsize, 0)
draw = ImageDraw.Draw(mask)
draw.ellipse((0, 0) + bigsize, fill=255)
mask = mask.resize(avatar.size, Image.ANTIALIAS)
avatar.putalpha(mask)
output = ImageOps.fit(avatar, mask.size, centering=(1420, 298))
output.putalpha(mask)
output.save('avatar.png')
The GIF input is Homer; the corresponding Image object has mode P:
The exported PNG is the following; it seems to be the first frame of the GIF:
----------------------------------------
System information
----------------------------------------
Platform: Windows-10-10.0.16299-SP0
Python: 3.9.1
Pillow: 8.1.0
----------------------------------------
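On the question of telling the formats apart, here is a minimal sketch, assuming Pillow: the format attribute of an opened image reports the container (for example 'GIF' or 'PNG'), and converting to RGB up front gives an RGBA image after putalpha regardless of the source format. The helper name circular_avatar and the in-memory GIF are illustrative. Image.LANCZOS is used in place of Image.ANTIALIAS because Pillow 10 removed the latter.

```python
from io import BytesIO

from PIL import Image, ImageDraw


def circular_avatar(raw_bytes, size=(285, 285)):
    """Return (source_format, RGBA image) with a circular alpha mask applied."""
    avatar = Image.open(BytesIO(raw_bytes))
    source_format = avatar.format  # e.g. 'GIF' or 'PNG'
    # Converting first avoids palette modes; only the current frame is kept.
    avatar = avatar.convert("RGB").resize(size)
    # Draw the mask at 3x size and shrink it for smoother edges.
    big = (size[0] * 3, size[1] * 3)
    mask = Image.new("L", big, 0)
    ImageDraw.Draw(mask).ellipse((0, 0) + big, fill=255)
    avatar.putalpha(mask.resize(size, Image.LANCZOS))
    return source_format, avatar


# Stand-in for the downloaded avatar bytes: a small GIF built in memory.
buf = BytesIO()
Image.new("RGB", (64, 64), (200, 120, 40)).save(buf, format="GIF")
fmt, result = circular_avatar(buf.getvalue())

# Saving as PNG works because the image is now in RGBA mode.
out = BytesIO()
result.save(out, format="PNG")
print(fmt, result.mode)
```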
Well, actually I don't really understand how to use the OpenCL T-API, and I'm still a newbie at it. In the documentation
https://www.learnopencv.com/opencv-transparent-api/
cv2.UMat is applied to an image file that is then read. In my case, I want to use the OpenCL T-API on my recognizer.predict line, because the image is captured and processed while streaming from the camera.
recognizer = cv2.face.LBPHFaceRecognizer_create()
#colec = cv2.face.MinDistancePredictCollector()
recognizer.read("trainer_data_array.yml")
labels = {"persons_name":0}
with open("labels.pickle", "rb") as f:
    og_labels = pickle.load(f)
    labels = {v:k for k,v in og_labels.items()}
cap = cv2.VideoCapture(0)
while(True):
    #video cap
    ret, frame = cap.read()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.5, minNeighbors=5)
    for (x,y,w,h) in faces:
        #print(x,y,w,h)
        roi_gray = gray[y:y+h, x:x+w]
        roy_color = frame[y:y+h, x:x+w]
        #recognize how?
        id_ , conf = recognizer.predict(roi_gray) #some error, some say cuz its opencv 3.1.0 bug
        #solution : up opencv to 3.3 or just use MinDistancePredictCollector(...)
        if conf>=45 and conf<=85:
            print(id_)
            print(labels[id_])
            font = cv2.FONT_HERSHEY_SIMPLEX
            name = labels[id_]
            color = (255,255,255)
            stroke = 2
            cv2.putText(frame,name,(x,y),font,1,color,stroke,cv2.LINE_AA)
        elif conf > 85:
            print("unknown")
Can someone help me with how to do it? If I just put it in raw, like id_ , conf = cv2.UMat(recognizer.predict(roi_gray))
it gives me the error 'cv2.UMat' object is not iterable
Without the T-API this program may still give a good frame rate, but after many modifications or additional processing steps it will run at a low frame rate when detecting/recognizing people's faces.
This is why I want to use OpenCL, so that when it runs on the GPU it may give me a pretty good frame rate.
What I want
I wish to resize a gray image and publish it so I can use it in another ROS node. However, I encounter a problem with the channel information, which comes from either OpenCV or CvBridge.
Error
When listening to a camera (webcam/Kinect) and converting the image to 'mono8' (gray), you get a shape of (rows, columns, channels) in which channels = 1. For some reason, if you save this image and read it again, suddenly channels = 3. Why is this important? If you use cv2.resize(image, (x, y)) on an image with 3 channels, the output image keeps its third dimension (channels = 3); however, when there is only 1 channel, this information is lost and the output is two-dimensional. The problem with this is that CvBridge won't work without channel information.
The following code works, because cv2.resize is performed on 3 channels:
#!/usr/bin/env python
PKG = 'something'
import roslib; roslib.load_manifest(PKG)
import rospy
import cv2
import sys
from sensor_msgs.msg import Image
from cv_bridge import CvBridge, CvBridgeError
class Test:
    def __init__(self):
        self.image_sub = rospy.Subscriber("/camera/rgb/image_raw",Image, self.callback)
        self.image_sub = rospy.Subscriber("test_image",Image, self.callback2)
        self.image_pub = rospy.Publisher("test_image", Image)
        self.bridge = CvBridge()
    def callback(self, image):
        try:
            cv_image = self.bridge.imgmsg_to_cv2(image, 'mono8')
        except CvBridgeError, e:
            print e
        print cv_image.shape ### output: (480, 640, 1)
        cv2.imshow("Test", cv_image)
        cv2.imwrite("Test.png", cv_image)
        cv2.waitKey(3)
        test2 = cv2.imread("Test.png")
        print test2.shape ### output: (480, 640, 3)
        cv2.imshow("Test 2",test2)
        cv2.waitKey(3)
        # dsize is (width, height), so the resulting shape is (rows=240, cols=250, 3)
        test2 = cv2.resize(test2,(250,240))
        print test2.shape ### output: (240, 250, 3)
        self.image_pub.publish(self.bridge.cv2_to_imgmsg(test2))
    def callback2(self, image):
        try:
            cv_image = self.bridge.imgmsg_to_cv2(image)
        except CvBridgeError, e:
            print e
        cv2.imshow("Test3", cv_image)
        cv2.waitKey(3)
def main(args):
    test = Test()
    rospy.init_node('image_converter', anonymous=True)
    try:
        rospy.spin()
    except KeyboardInterrupt:
        print "Shutting down"
    cv2.destroyAllWindows()
if __name__ == '__main__':
    main(sys.argv)
However the following doesn't work (this time trying to publish the resize):
print cv_image.shape ### output: (480, 640, 1)
cv2.imshow("Test", cv_image)
cv2.imwrite("Test.png", cv_image)
cv2.waitKey(3)
# resize the 1-channel image directly; the channel axis is dropped
test2 = cv2.resize(cv_image,(250,240))
print test2.shape ### output: (240, 250)
self.image_pub.publish(self.bridge.cv2_to_imgmsg(test2)) ### ERROR
test2 = cv2.imread("Test.png")
print test2.shape
cv2.imshow("Test 2",test2)
cv2.waitKey(3)
Other Error
Changing 'mono8' to '8UC3' gives the following error: [yuv422] is a color format but [8UC3] is not so they must have the same OpenCV type, CV_8UC3, CV16UC1
My real question
How can I resize a gray image and publish this in ROS without either losing channel information or converting it somehow to 3 channels? My only concern is that I can send the resized information, the number of channels is not important to me.
Information
Ubuntu 12.04,
ROS Hydro,
OpenCV 2.4.9
In the master branch of CvBridge it's now fixed that you can send images without channel information: https://github.com/ros-perception/vision_opencv/issues/49.
For the people who are not on the master branch:
#Ugly gray 3 channel hack for CvBridge (old version)
#Save image
cv2.imwrite(self.dir_image_save+'tempface.png', cv2_image)
#Load
cv2_image = cv2.imread(self.dir_image_save+'tempface.png')
Now you have a 3-channel gray image (but it's ugly).
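The reload produces three channels because cv2.imread loads images in colour by default. A sketch of an in-memory alternative to writing and re-reading a file, assuming only NumPy: either re-add the channel axis that cv2.resize drops, or stack the gray plane three times. Both helper names are hypothetical, and whether an old CvBridge release accepts the resulting arrays has not been tested here.

```python
import numpy as np


def with_channel_axis(gray):
    """Return a (rows, cols, 1) view of a 2-D gray image; pass 3-D input through."""
    return gray[:, :, np.newaxis] if gray.ndim == 2 else gray


def gray_to_three_channels(gray):
    """Return a (rows, cols, 3) gray image without touching the disk."""
    plane = gray.reshape(gray.shape[:2])  # accepts (rows, cols) or (rows, cols, 1)
    return np.dstack([plane] * 3)


# Stand-in for the 2-D result of cv2.resize(cv_image, (250, 240)).
resized = np.zeros((240, 250), dtype=np.uint8)
print(with_channel_axis(resized).shape)
print(gray_to_three_channels(resized).shape)
```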