Google Cloud Vision Logo Detection API - failing to identify logos - python

I am trying to scrape images from websites and use the Google Cloud Vision API to detect whether an image on the site is a logo. It works if I give it a logo like Apple's, but it doesn't seem to work for well-known non-Fortune-500 tech company logos (e.g., LaunchDarkly, LogDNA), despite the images clearly being logos. Is it supposed to work for any type of logo, or only large brands? Is there a solution out there better suited to my needs?
import io
from google.cloud import vision

client = vision.ImageAnnotatorClient()

with io.open('./img.png', 'rb') as image_file:
    content = image_file.read()

image = vision.types.Image(content=content)
response = client.logo_detection(image=image)
logos = response.logo_annotations

for logo in logos:
    print(logo.description)
    print(logo.score)

if response.error.message:
    raise Exception(
        '{}\nFor more info on error messages, check: '
        'https://cloud.google.com/apis/design/errors'.format(
            response.error.message))

As explained in the documentation, Logo Detection detects popular product logos; it is expected not to detect logos the model has not been trained on.
A solution you can try within GCP is AutoML Vision. This product lets you retrain GCP's models to classify your images according to your own labels. You can create a dataset with the logos you need to detect and retrain the models with it. Its interface is simple enough to use even without any machine-learning expertise.
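To give an idea of what querying such a retrained model looks like: the sketch below is a rough outline, not a definitive implementation. The google-cloud-automl client surface has changed across versions (check the docs for yours), and the project/model IDs and the `best_logo_match` helper are placeholders introduced for illustration.

```python
def best_logo_match(predictions, threshold=0.5):
    """Pick the highest-scoring predicted label at or above a threshold.
    predictions: list of (label, score) pairs. Returns None if nothing passes."""
    above = [(label, score) for label, score in predictions if score >= threshold]
    return max(above, key=lambda ls: ls[1], default=None)

def predict_logo(image_path, project_id, model_id):
    # Deferred import so the pure helper above stays usable without the SDK.
    from google.cloud import automl

    client = automl.PredictionServiceClient()
    # AutoML Vision models are typically deployed in us-central1.
    name = client.model_path(project_id, "us-central1", model_id)
    with open(image_path, "rb") as f:
        payload = {"image": {"image_bytes": f.read()}}
    response = client.predict(name=name, payload=payload)
    return best_logo_match(
        [(r.display_name, r.classification.score) for r in response.payload])
```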

Related

Is there a way to pass annotated image to cloud vision API for reading text from images?

For my project to read text data from image, I am using the Google Cloud Vision API. It reads everything accurately but there is an issue. The API reads text displayed on the product as well. I don't want that.
Can anyone give me an idea on how to solve this?

Where does Custom Vision store training images

I developed an Image Classification Model in CustomVision.AI
I want to download all of the training images used to train the model
I used the Training API and was able to retrieve the "HTML" location of all the images. However, I'd like to use a script to actually download the images from that location to a local drive, and I'm better at running scripts than writing them.
I was also trying to figure out whether the images are stored in an Azure resource or whether Custom Vision uses its own storage for the images; I'd like to move them over to an Azure blob.
I'm not really a "programmer", more of a high-level technology manager, but I am comfortable running scripts and some Python code.
"I was also trying to figure out if the images are stored in an azure resource"

I suppose they are, because when I checked the image URI, the images turned out to be stored in a blob. You can use the browser developer tools (F12) directly to inspect the picture URI; its format is a blob URL containing the image ID.
If you want to get the URIs of all images, you can use these APIs: GetTaggedImages and GetUntaggedImages. They return all image information, including the URI.
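For the download step the asker mentioned, here is a minimal stdlib-only sketch, assuming the blob URIs have already been collected (e.g. from GetTaggedImages). The function names and the output directory are my own, not part of any API.

```python
import os
from urllib.parse import urlparse
from urllib.request import urlretrieve

def local_name(uri, out_dir):
    """Map a blob URI to a local file path, using the last path segment as the name."""
    return os.path.join(out_dir, os.path.basename(urlparse(uri).path))

def download_all(uris, out_dir="training_images"):
    """Download every image URI into out_dir (created if missing)."""
    os.makedirs(out_dir, exist_ok=True)
    for uri in uris:
        urlretrieve(uri, local_name(uri, out_dir))
```

The downloaded files keep the image ID from the blob URL as their filename, which makes it easy to re-upload them to your own Azure blob container later.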

How can I get predicted images URL from azure?

I'm using Azure Microsoft Custom Vision.
I've already created my algorithm, and what I need now is the URL of my predicted images.
I'm aware that I can get the training images with methods written in Training API (get_tagged_images), but now I'm trying to get the URL of the prediction image. In the Prediction API, there are no getters.
If I inspect the predicted image in Azure Custom Vision Portal, I can find the blob URL, but I'm unable to get that URL through a method.
How can I get the predicted image URL?
The images are available through the QueryPredictions API in the Training API.
The REST documentation is here.
The Python documentation is here.
Here's what your code might look like:
from azure.cognitiveservices.vision.customvision.training import CustomVisionTrainingClient
from azure.cognitiveservices.vision.customvision.training.models import PredictionQueryToken

# Set your region
endpoint = 'https://<your region>.api.cognitive.microsoft.com'
# Set your Training API key
training_key = '<your training key>'
# Set your Project ID
project_id = '<your project id>'

# Query the stored prediction images
trainer = CustomVisionTrainingClient(training_key, endpoint=endpoint)
token = PredictionQueryToken()
response = trainer.query_predictions(project_id, token)

# Get the image URLs, for example
urls = [result.original_image_uri for result in response.results]
It seems that the links to the API references in your description are not correct. There are several versions of the Azure Custom Vision APIs, as in the figure below; you can refer to https://<your region, such as southcentralus>.dev.cognitive.microsoft.com/docs/services/?page=2 to see them. The APIs for getting training images belong to the training stage.
So if you want to get the URLs of the training images, you first need to find out which version of Custom Vision Training you are using. As far as I know, you can see the version information on the Overview and Quick start tabs of your subscription in the Azure portal. For example, my Custom Vision version is 1.0, as in the figures below.
Fig 1. Overview tab
Fig 2. Quick start tab, and click the API reference to see its documents related to the version
So there are three APIs that satisfy your needs, as shown in the figure below.
Here is my sample code to list all tagged images via GetAllTaggedImages (v1.0).
import json
import requests

projectId = "<your project id from project settings of Cognitive portal>"
endpoint = f"https://southcentralus.api.cognitive.microsoft.com/customvision/v1.0/Training/projects/{projectId}/images/tagged/all"
print(endpoint)

headers = {
    'Training-key': '<key from keys tab of Azure portal or project settings of Cognitive portal>',
}
resp = requests.get(endpoint, headers=headers)
print(resp.text)

images = json.loads(resp.text)
image_urls = (image['ImageUri'] for image in images)
for image_url in image_urls:
    print(image_url)
Hope it helps.

Differences between Google Cloud Vision OCR in browser demo and via python

I am fairly new to the Google Cloud Vision API so my apologies if there is an obvious answer to this. I am noticing that for some images I am getting different OCR results between the Google Cloud Vision API Drag and Drop (https://cloud.google.com/vision/docs/drag-and-drop) and from local image detection in python.
My code is as follows
import io

# Imports the Google Cloud client library
from google.cloud import vision
from google.cloud.vision import types

# Instantiates a client
client = vision.ImageAnnotatorClient()

# The name of the image file to annotate
file_name = "./test0004a.jpg"

# Loads the image into memory
with io.open(file_name, 'rb') as image_file:
    content = image_file.read()

image = types.Image(content=content)
response = client.text_detection(image=image)
texts = response.text_annotations

print('Texts:')
for text in texts:
    # print('\n"{}"'.format(text.description.encode('utf-8')))
    print('\n"{}"'.format(text.description.encode('ascii', 'ignore')))
    vertices = (['({},{})'.format(vertex.x, vertex.y)
                 for vertex in text.bounding_poly.vertices])
    print('bounds: {}'.format(','.join(vertices)))
A sample image that highlights this is attached ("Sample Image").
The python code above doesn't return anything, but in the browser using drag and drop it correctly identifies "2340" as the text.
Shouldn't both Python and the browser return the same result? And if not, why not? Do I need to include additional parameters in the code?
The issue here is that you are using TEXT_DETECTION instead of DOCUMENT_TEXT_DETECTION, which is the feature being used in the Drag and Drop example page that you shared.
By changing the method (to document_text_detection()), you should obtain the desired results (I have tested it with your code, and it did work):
# Using TEXT_DETECTION
response = client.text_detection(image=image)
# Using DOCUMENT_TEXT_DETECTION
response = client.document_text_detection(image=image)
Although both methods can be used for OCR, as presented in the documentation, DOCUMENT_TEXT_DETECTION is optimized for dense text and documents. The image you shared is not a high-quality one and the text is not clear, so for this type of image DOCUMENT_TEXT_DETECTION may offer better performance than TEXT_DETECTION.
See some other examples where DOCUMENT_TEXT_DETECTION worked better than TEXT_DETECTION. In any case, note that this may not always be the case, and TEXT_DETECTION may still give better results under certain conditions:
Getting Different Data on using Demo and Actual API
Google Vision API text detection strange behaviour
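To see the difference on a given file, you can run both features side by side. This is a minimal sketch, assuming the google-cloud-vision client is installed and credentials are configured; `pick_longer` and `compare_ocr` are hypothetical helper names, and `vision.Image` is the spelling used by newer client versions (older versions used `vision.types.Image`).

```python
def pick_longer(a, b):
    """Return whichever OCR transcript recovered more text."""
    return a if len(a) >= len(b) else b

def compare_ocr(path):
    # Deferred import so the pure helper above stays usable without the SDK.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    with open(path, 'rb') as f:
        image = vision.Image(content=f.read())

    # Run the same image through both OCR features.
    sparse = client.text_detection(image=image)
    dense = client.document_text_detection(image=image)
    sparse_text = sparse.text_annotations[0].description if sparse.text_annotations else ""
    dense_text = dense.text_annotations[0].description if dense.text_annotations else ""
    return pick_longer(sparse_text, dense_text)
```

For the "2340" sample above, the expectation is that only the dense variant returns a non-empty transcript, so the comparison makes the difference between the two features visible.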

How do I use Google Cloud Vision API to return the image with the highest confidence for a particular label?

I have a list of external URLs (.jpg or .png images) and want to send those as requests to the Google Cloud Vision API for label detection. I want the image with the highest confidence for a particular label(s) returned first. Basically I would like to sort images in descending order of confidence for a label (such as car).
So far I've figured out how to annotate images stored locally but am trying to figure out how I can feed it a list of external image URLs and sort them by confidence for 'car'.
You can send a request with several images if you save them in Google Cloud Storage, for example, but be aware of the total size limit of 8 MB per request.
Then you can save the metadata locally and order it as you want; the Google Vision API doesn't natively give you the functionality you are asking for.
Reference:
https://cloud.google.com/vision/docs/best-practices
The newest version of the Python Google Vision SDK allows you to send external URLs, per their documentation: https://cloud.google.com/vision/docs/detecting-labels#detecting_labels_in_a_remote_image.
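Putting the two answers together, here is a rough sketch of ranking remote image URLs by their confidence for one label. It assumes a recent google-cloud-vision client (which exposes `vision.Image` and `vision.ImageSource` directly) and configured credentials; `sort_by_label_score` is a pure helper name of my own, not part of the API.

```python
def sort_by_label_score(results, label):
    """results: list of (url, {description: score}) pairs.
    Returns the URLs sorted by descending score for `label` (0.0 if absent)."""
    ranked = sorted(results, key=lambda r: r[1].get(label, 0.0), reverse=True)
    return [url for url, _scores in ranked]

def rank_remote_images(urls, label="car"):
    # Deferred import so the sorting helper stays usable without the SDK.
    from google.cloud import vision

    client = vision.ImageAnnotatorClient()
    results = []
    for url in urls:
        # Point the API at the external URL instead of local bytes.
        image = vision.Image(source=vision.ImageSource(image_uri=url))
        response = client.label_detection(image=image)
        scores = {a.description.lower(): a.score
                  for a in response.label_annotations}
        results.append((url, scores))
    return sort_by_label_score(results, label)
```

Images that lack the label entirely simply sort to the end with a score of 0.0, which matches the "descending order of confidence" the question asks for.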
