I've trained a YOLOv4 neural network to detect pens. Training went fine, but when I try to test my model, it just doesn't work. Here's what it's showing:
Here's the command:
!./darknet detector test data/obj.data cfg/yolov4-custom.cfg /mydrive/YOLOV4/training/yolov4-custom_last.weights /mydrive/image1.jpeg
imShow('predictions.jpg')
I don't know if this is relevant, but here's the imShow function:
def imShow(path):
    import cv2
    import matplotlib.pyplot as plt
    %matplotlib inline
    image = cv2.imread(path)
    height, width = image.shape[:2]
    resized_image = cv2.resize(image,(3*width, 3*height), interpolation = cv2.INTER_CUBIC)
    fig = plt.gcf()
    fig.set_size_inches(18, 10)
    plt.axis("off")
    plt.imshow(cv2.cvtColor(resized_image, cv2.COLOR_BGR2RGB))
    plt.show()
Someone please help me fix this issue. Thanks a lot!
I'm trying to convert a plotly express figure to image, then use this image to save it on a power point slide. This is my code:
import plotly.express as px
import plotly.io as pio
from pptx import Presentation
wide_df = px.data.medals_wide()
fig = px.bar(wide_df, x="nation", y=["gold", "silver", "bronze"], title="Wide-Form Input, relabelled",
labels={"value": "count", "variable": "medal"})
# Convert the figure to a bytes object
img_bytes = pio.to_image(fig, format='png')
ppt = Presentation(
"template.pptx"
)
slide = ppt.slides[3]
placeholder = slide.placeholders[13]
placeholder.insert_picture(
img_bytes
)
But I'm getting the following error message:
'bytes' object has no attribute 'seek'
You should be able to do:
import io
...
...
placeholder.insert_picture(io.BytesIO(img_bytes))
The clue was the error message that there is no seek attribute, which is a method that "file-like" objects have in Python, and io.BytesIO() is a way of making a bunch of data appear to come from, and behave like, a file. Documentation is here.
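As a minimal sketch of the difference described above: a plain bytes object has no seek method, while io.BytesIO wraps the same data in a file-like object that does. The short byte string here is only a hypothetical stand-in for the PNG data returned by pio.to_image.

```python
import io

# Hypothetical stand-in for the PNG bytes returned by pio.to_image()
img_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 8

# A bytes object is just data; it has no file-like methods
bytes_has_seek = hasattr(img_bytes, "seek")

# BytesIO makes the same data behave like an open binary file
stream = io.BytesIO(img_bytes)
stream_has_seek = hasattr(stream, "seek")

header = stream.read(8)   # read the first 8 bytes
stream.seek(0)            # rewind to the start, as a file consumer would
rewound = stream.read(8)  # the same 8 bytes again

print(bytes_has_seek, stream_has_seek, header == rewound)
```

Any API that expects a path or a file-like object can then be given the stream instead of the raw bytes.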
I'm trying to load the pre-trained u2net_human_seg.onnx model in my Python program to use it for better background removal.
When I try it, I get an error: _pickle.UnpicklingError: invalid load key, '\x08'.
My code:
import rembg
from PIL import Image
import torch
import numpy as np
def remove_background(input_path):
    input = Image.open(input_path)
    output = remove(input)
    output.save(input_path)
def remove(input):
    input_tensor = torch.from_numpy(np.array(input)).float()
    output = model(input_tensor)
    output_image = Image.fromarray(output.detach().numpy())
    return output_image
if __name__ == '__main__':
    model = torch.load("models/u2net_human_seg.onnx")
    remove_background("images/test.jpg")
My input is test.jpg, in which 2 people are visible. The paths should be correct...
I didn't find similar cases online as no one seems to load a model in a custom application.
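The error message suggests that the file is being parsed as a pickle: torch.load expects a file written by torch.save, while an .onnx file is a protocol buffer, which is usually loaded with an ONNX runtime instead. The sketch below only reproduces the same message with the standard library, on the assumption that the file starts with the byte 0x08 (as the message indicates). The byte string is a hypothetical stand-in, not real model data.

```python
import pickle

# Hypothetical stand-in for the first bytes of an .onnx file;
# 0x08 is not a valid pickle opcode, so unpickling fails immediately
fake_onnx_bytes = b"\x08\x07\x12\x07pytorch"

try:
    pickle.loads(fake_onnx_bytes)
    message = None
except pickle.UnpicklingError as err:
    message = str(err)

print(message)
```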
I want to detect a tumor as well as the brain in an MRI scan and create a mask on both of them.
I have used the following code to create a mask on the tumor only.
The link to the notebook is:
https://colab.research.google.com/github/pysource7/utilities/blob/master/Train_Mask_RCNN_(DEMO).ipynb#scrollTo=SyzLXzF5BqiN
Please tell me how to make it work for multiple classes.
I am new to this field and would be very grateful if someone could help.
This is what I have tried for training the model.
%tensorflow_version 1.x
!pip install --upgrade h5py==2.10.0
!wget https://pysource.com/extra_files/Mask_RCNN_basic_1.zip
!unzip Mask_RCNN_basic_1.zip
import sys
sys.path.append("/content/Mask_RCNN/mrcnn")
from m_rcnn import *
%matplotlib inline
# Extract Images
images_path = "images.zip"
annotations_path = "annotations.json"
extract_images(os.path.join("/content/",images_path), "/content/dataset")
dataset_train = load_image_dataset(os.path.join("/content/", annotations_path), "/content/dataset", "train")
dataset_val = load_image_dataset(os.path.join("/content/", annotations_path), "/content/dataset", "val")
class_number = dataset_train.count_classes()
print('Train: %d' % len(dataset_train.image_ids))
print('Validation: %d' % len(dataset_val.image_ids))
print("Classes: {}".format(class_number))
# Load image samples
display_image_samples(dataset_train)
# Load Configuration
config = CustomConfig(class_number)
#config.display()
model = load_training_model(config)
# Start Training
# This operation might take a long time.
train_head(model, dataset_train, dataset_train, config)
I was practicing OpenCV on Google Colaboratory because I don't know how to use OpenCV on a GPU. When I run OpenCV on my own hardware, it uses a lot of CPU, so I moved to Google Colaboratory.
The link to my notebook is here.
If you don't want to open it, here is the code:
import cv2
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
cap = cv2.VideoCapture(0)
while True:
    _, img = cap.read()
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
    cv2.imshow('img', img)
    k = cv2.waitKey(30) & 0xff
    if k==27:
        break
cap.release()
The same code worked fine on my PC, but not on Google Colaboratory. The error is:
---------------------------------------------------------------------------
error Traceback (most recent call last)
<ipython-input-5-0d9472926d8c> in <module>()
6 while True:
7 _, img = cap.read()
----> 8 gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
9 faces = face_cascade.detectMultiScale(gray, 1.1, 4)
10 for (x, y, w, h) in faces:
error: OpenCV(4.1.2) /io/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'
PS: I have the haarcascade file in the same directory as my notebook in Google Colaboratory.
How do I deal with this? If that's not possible, is there any "concrete" way to run OpenCV on my CUDA-enabled GPU instead of the CPU? Thanks in advance!
_src.empty() means that there was a problem getting a frame from the camera, so img is None, and when the code calls cvtColor(None, ...) it fails the !_src.empty() assertion.
You should check that img is not None, because cv2 doesn't raise an error when it can't get a frame from the camera or read an image from a file. Sometimes a camera also needs time to "warm up" and can give a few empty frames (None).
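A minimal sketch of such a check, assuming a reader that returns an (ok, frame) pair the way cap.read() does. The helper name read_valid_frame and the fake reader are hypothetical and are only there so the example runs without a camera.

```python
def read_valid_frame(read, max_attempts=10):
    """Call read() until it returns a usable frame or the attempts run out."""
    for _ in range(max_attempts):
        ok, frame = read()
        if ok and frame is not None:
            return frame
    return None  # the caller must handle the "no frame" case


# Hypothetical stand-in for cap.read: two empty "warm-up" frames, then a real one
_results = iter([(False, None), (False, None), (True, "frame")])

def fake_read():
    return next(_results)


frame = read_valid_frame(fake_read)
print(frame)
```

With a real camera this would be called as read_valid_frame(cap.read), and cvtColor would only run when the result is not None.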
VideoCapture(0) reads frames from a camera directly connected to the computer that runs the code. When you run the code on the Google Colaboratory server, that means a camera connected directly to that server (not your local camera), but the server doesn't have a camera, so VideoCapture(0) can't work on Google Colaboratory.
cv2 can't get images from your local camera when it runs on a server. Your web browser may have access to your camera, but it needs JavaScript to grab a frame and send it to the server, and the server needs code to receive this frame.
I searched to see whether Google Colaboratory can access a local webcam, and it seems they created a script for this, Camera Capture. The first cell defines the function take_photo(), which uses JavaScript to access your camera and display it in the browser; the second cell uses this function to display the image from the local camera and take a snapshot.
You should use this function instead of VideoCapture(0) to work with your local camera on the server.
BTW: Below take_photo() there is also information about cv2.imshow(), because it also works only with a monitor directly connected to the computer that runs the code (and this computer has to run a GUI, like Windows on Windows or X11 on Linux). When you run it on a server, it tries to display on a monitor directly connected to the server, but a server usually runs without a monitor (and without a GUI).
Google Colaboratory has a special replacement which displays the image in the web browser:
from google.colab.patches import cv2_imshow
BTW: If you have a problem loading the haarcascades .xml file, you may need to add the folder to the filename. cv2 has a special variable for this: cv2.data.haarcascades
import os
import cv2
path = os.path.join(cv2.data.haarcascades, 'haarcascade_frontalface_default.xml')
face_cascade = cv2.CascadeClassifier(path)
You can also see what is in this folder
import os
filenames = os.listdir(cv2.data.haarcascades)
filenames = sorted(filenames)
print('\n'.join(filenames))
EDIT:
I created code which can get frames from the local webcam one by one, without using a button and without saving to a file. The problem is that it is slow, because it still has to send each frame from the local web browser to the Google Colab server and later back to the local web browser.
Python code with JavaScript functions:
#
# based on: https://colab.research.google.com/notebooks/snippets/advanced_outputs.ipynb#scrollTo=2viqYx97hPMi
#
from IPython.display import display, Javascript
from google.colab.output import eval_js
from base64 import b64decode, b64encode
import numpy as np
import cv2
def init_camera():
    """Create objects and functions in HTML/JavaScript to access local web camera"""
    js = Javascript('''
    // global variables to use in both functions
    var div = null;
    var video = null; // <video> to display stream from local webcam
    var stream = null; // stream from local webcam
    var canvas = null; // <canvas> for single frame from <video> and convert frame to JPG
    var img = null; // <img> to display JPG after processing with `cv2`
    async function initCamera() {
      // place for video (and eventually buttons)
      div = document.createElement('div');
      document.body.appendChild(div);
      // <video> to display video
      video = document.createElement('video');
      video.style.display = 'block';
      div.appendChild(video);
      // get webcam stream and assign it to <video>
      stream = await navigator.mediaDevices.getUserMedia({video: true});
      video.srcObject = stream;
      // start playing stream from webcam in <video>
      await video.play();
      // Resize the output to fit the video element.
      google.colab.output.setIframeHeight(document.documentElement.scrollHeight, true);
      // <canvas> for frame from <video>
      canvas = document.createElement('canvas');
      canvas.width = video.videoWidth;
      canvas.height = video.videoHeight;
      //div.appendChild(canvas); // there is no need to display it to get the image (but you can display it for testing)
      // <img> for image after processing with `cv2`
      img = document.createElement('img');
      img.width = video.videoWidth;
      img.height = video.videoHeight;
      div.appendChild(img);
    }
    async function takeImage(quality) {
      // draw frame from <video> on <canvas>
      canvas.getContext('2d').drawImage(video, 0, 0);
      // stop webcam stream
      //stream.getVideoTracks()[0].stop();
      // get data from <canvas> as JPG image encoded as base64 and with header "data:image/jpg;base64,"
      return canvas.toDataURL('image/jpeg', quality);
      //return canvas.toDataURL('image/png', quality);
    }
    async function showImage(image) {
      // it needs string "-DATA-ENCODED-BASE64"
      // it will replace previous image in `<img src="">`
      img.src = image;
      // TODO: create <img> if it doesn't exist,
      // TODO: use `id` to use different `<img>` for different image - like `name` in `cv2.imshow(name, image)`
    }
    ''')
    display(js)
    eval_js('initCamera()')
def take_frame(quality=0.8):
    """Get frame from web camera"""
    data = eval_js('takeImage({})'.format(quality)) # run JavaScript code to get image (JPG as string base64) from <canvas>
    header, data = data.split(',') # split header ("data:image/jpg;base64,") and base64 data (JPG)
    data = b64decode(data) # decode base64
    data = np.frombuffer(data, dtype=np.uint8) # create numpy array with JPG data
    img = cv2.imdecode(data, cv2.IMREAD_UNCHANGED) # uncompress JPG data to array of pixels
    return img
def show_frame(img, quality=0.8):
    """Put frame as <img src="data:image/jpg;base64,...."> """
    ret, data = cv2.imencode('.jpg', img) # compress array of pixels to JPG data
    data = b64encode(data) # encode base64
    data = data.decode() # convert bytes to string
    data = 'data:image/jpg;base64,' + data # join header ("data:image/jpg;base64,") and base64 data (JPG)
    eval_js('showImage("{}")'.format(data)) # run JavaScript code to put image (JPG as string base64) in <img>
    # argument in `showImage` needs `" "`
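The transport used between the browser and Python here is a base64 data URL: a header, a comma, and the base64-encoded image bytes. A minimal standard-library sketch of that round trip is below. The helper names to_data_url and from_data_url are hypothetical, and the four bytes are only a stand-in for real JPEG data.

```python
from base64 import b64decode, b64encode


def to_data_url(raw, mime="image/jpeg"):
    """Wrap raw bytes in a base64 data URL: header + "," + payload."""
    return "data:{};base64,{}".format(mime, b64encode(raw).decode())


def from_data_url(url):
    """Split off the header and decode the base64 payload back to bytes."""
    header, payload = url.split(",", 1)
    return header, b64decode(payload)


raw = bytes([0xFF, 0xD8, 0xFF, 0xE0])  # hypothetical stand-in for JPEG bytes
url = to_data_url(raw)
header, decoded = from_data_url(url)

print(header)
print(decoded == raw)
```

Base64 makes the payload about a third larger than the raw bytes, which adds to the per-frame cost of sending images back and forth.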
And the code which uses it in a loop:
#
# based on: https://colab.research.google.com/notebooks/snippets/advanced_outputs.ipynb#scrollTo=zo9YYDL4SYZr
#
#from google.colab.patches import cv2_imshow # I don't use it but own function `show_frame()`
import cv2
import os
face_cascade = cv2.CascadeClassifier(os.path.join(cv2.data.haarcascades, 'haarcascade_frontalface_default.xml'))
# init JavaScript code
init_camera()
while True:
    try:
        img = take_frame()
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        #cv2_imshow(gray) # it creates new image for every frame (it doesn't replace previous image) so it is useless
        #show_frame(gray) # it replaces previous image
        faces = face_cascade.detectMultiScale(gray, 1.1, 4)
        for (x, y, w, h) in faces:
            cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
        #cv2_imshow(img) # it creates new image for every frame (it doesn't replace previous image) so it is useless
        show_frame(img) # it replaces previous image
    except Exception as err:
        print('Exception:', err)
I don't use from google.colab.patches import cv2_imshow because it always adds a new image to the page instead of replacing the existing image.
The same code as Notebook on Google Colab:
https://colab.research.google.com/drive/1j7HTapCLx7BQUBp3USiQPZkA0zBKgLM0?usp=sharing
A possible problem in the code is that you need to give the full path to the file when loading the Haar cascade.
face_cascade = cv2.CascadeClassifier('/User/path/to/opencv/data/haarcascades/haarcascade_frontalface_default.xml')
The Colab issue with OpenCV has been known for quite some time, and the same question has also been asked here.
As stated here, you can use cv2_imshow to display the image, but you want to process camera frames.
from google.colab.patches import cv2_imshow
img = cv2.imread('logo.png', cv2.IMREAD_UNCHANGED)
cv2_imshow(img)
One possible answer:
Insert the Camera Capture snippet and use its take_photo method, but you need to modify the method.
import cv2
from google.colab.patches import cv2_imshow
face_cascade = cv2.CascadeClassifier('/opencv/data/haarcascades/haarcascade_frontalface_default.xml')
try:
    filename = take_photo()  # from the Camera Capture snippet; returns the saved file name
    img = cv2.imread(filename)  # load the photo as a BGR array so cv2 can process it
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, 1.1, 4)
    for (x, y, w, h) in faces:
        cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
    cv2_imshow(img)  # cv2_imshow takes only the image, not a window name
except Exception as err:
    print(str(err))
The above code requires editing: since there is no direct way to use VideoCapture, you have to modify take_photo.
I'm trying to reproduce some results from the paper describing the Grad-CAM method, using Keras with the Tensorflow-GPU backend, and I get totally incorrect labels.
I've captured a screenshot of figure 1(a) from that paper and am trying to get the pretrained VGG16 from Keras Applications to classify it.
Here is my image:
Here is my code (a cell from the Jupyter notebook). Part of the code was copied from the Keras manuals.
import imageio
from matplotlib import pyplot as plt
from skimage.transform import resize
from keras import activations
from keras.applications import VGG16
from keras.applications.vgg16 import preprocess_input, decode_predictions
# Build the VGG16 network with ImageNet weights
model = VGG16(weights='imagenet', include_top=True)
%matplotlib inline
dog_img = imageio.imread(r"F:\tmp\Opera Snapshot_2018-09-24_133452_arxiv.org.png")
dog_img = dog_img[:, :, 0:3] # Opera has added alpha channel
dog_img = resize(dog_img, (224, 224, 3))
x = np.expand_dims(dog_img, axis=0)
x = preprocess_input(x, mode='tf')
pred = model.predict(x)
decode_predictions(pred)
Output:
[[('n03788365', 'mosquito_net', 0.017053505),
('n03291819', 'envelope', 0.015034639),
('n15075141', 'toilet_tissue', 0.012603286),
('n01737021', 'water_snake', 0.010620943),
('n04209239', 'shower_curtain', 0.009625845)]]
However, when I submit the same image to the online service run by the paper's authors, http://gradcam.cloudcv.org/classification, I see the correct label, "Boxer".
Here is the output from something that they call "Terminal":
Completed the Classification Task
"Time taken for inference in torch: 9.0"
"Total time taken: 9.12565684319"
{"classify_gcam": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gcam_243.png", "execution_time": 9.0, "label": 243.0, "classify_gb_gcam": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gb_gcam_243.png", "classify_gcam_raw": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gcam_raw_243.png", "input_image": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/Opera Snapshot_2018-09-24_133452_arxiv.org.png", "pred_label": 243.0, "classify_gb": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gb_243.png"}
Completed the Classification Task
"Time taken for inference in torch: 9.0"
"Total time taken: 9.05940508842"
{"classify_gcam": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gcam_243.png", "execution_time": 9.0, "label": 243.0, "classify_gb_gcam": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gb_gcam_243.png", "classify_gcam_raw": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gcam_raw_243.png", "input_image": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/Opera Snapshot_2018-09-24_133452_arxiv.org.png", "pred_label": 243.0, "classify_gb": "./media/grad_cam/classification/86560f84-bfe5-11e8-a657-22000b4a9274/classify_gb_243.png"}
Job published successfully
Publishing job to Classification Queue
Starting classification job on VGG_ILSVRC_16_layers.caffemodel
Job published successfully
Publishing job to Classification Queue
Starting classification job on VGG_ILSVRC_16_layers.caffemodel
I use Anaconda Python 64-bit, on Windows 7.
Versions of relevant software on my PC:
keras 2.2.2 0
keras-applications 1.0.4 py36_1
keras-base 2.2.2 py36_0
keras-preprocessing 1.0.2 py36_1
tensorflow 1.10.0 eigen_py36h849fbd8_0
tensorflow-base 1.10.0 eigen_py36h45df0d8_0
What am I doing wrong? How can I get the boxer label?
Apparently you cannot use the following line:
dog_img = dog_img[:, :, 0:3] # Opera has added alpha channel
So I loaded the image using a utility in Keras called load_img, which doesn't add the alpha channel.
The complete code:
import imageio
from matplotlib import pyplot as plt
from skimage.transform import resize
import numpy as np
from keras import activations
from keras.applications import VGG16
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input, decode_predictions
# Build the VGG16 network with ImageNet weights
model = VGG16(weights='imagenet', include_top=True)
dog_img = image.img_to_array(image.load_img(r"F:\tmp\Opera Snapshot_2018-09-24_133452_arxiv.org.png", target_size=(224, 224)))
x = np.expand_dims(dog_img, axis=0)
x = preprocess_input(x)
pred = model.predict(x)
print(decode_predictions(pred))
[[('n02108089', 'boxer', 0.29122102), ('n02108422', 'bull_mastiff', 0.199128), ('n02129604', 'tiger', 0.10050287), ('n02123159', 'tiger_cat', 0.09733449), ('n02109047', 'Great_Dane', 0.056869864)]]
Considering that all the output probabilities are very low and more or less equally distributed circa 0.01, my guess is that you are pre-processing the image incorrectly and passing some sort of scrambled image that looks like noise to model.predict(). Try to debug and imshow the image right before you predict().
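Besides displaying the image, it can help to print a few statistics of the array right before predict(). The sketch below assumes NumPy is available. The helper describe_input and the random arrays are hypothetical stand-ins, not part of the original code. It shows how the same batch looks with pixel values in the 0-255 range versus the 0-1 range, which is one mismatch a preprocessing function may not expect.

```python
import numpy as np


def describe_input(x):
    """Return basic statistics of an array about to be passed to model.predict()."""
    x = np.asarray(x)
    return {
        "shape": x.shape,
        "dtype": str(x.dtype),
        "min": float(x.min()),
        "max": float(x.max()),
    }


# Hypothetical stand-ins for an image batch in two different value ranges
rng = np.random.default_rng(0)
pixels_0_255 = rng.integers(0, 256, size=(1, 224, 224, 3)).astype("float32")
pixels_0_1 = pixels_0_255 / 255.0

stats_255 = describe_input(pixels_0_255)
stats_01 = describe_input(pixels_0_1)
print(stats_255)
print(stats_01)
```

If the reported range does not match what the preprocessing step expects, the model may effectively see noise even though the image displays correctly.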