Annotating image file to text using googleapiclient

Annotating image file to text using googleapiclient - python

I am trying to annotate a local image file using google cloud services. I followed the instructions given here https://cloud.google.com/natural-language/docs/reference/libraries, and setup the google API. The given test examples on the page executed without any problem. However, when I am trying to actually annotate a file I am getting error, here is the code I am using:
files = [];
files.append("/opt/lampp/htdocs/test.jpg");
def get_text_from_files(fileNames):
texts = detect_text(fileNames);
def detect_text(fileNames):
max_results = 6;
num_retries=3;
service = googleapiclient.discovery.build('language', 'v1');
batch_request = [];
for filename in fileNames:
request = {
'image': {},
'features': [{
'type': 'TEXT_DETECTION',
'maxResults': max_results,
}]
}
with open(filename, 'rb') as image_file:
request['image']['content'] = base64.b64encode(image_file.read()).decode('UTF-8');
batch_request.append(request);
request = service.images().annotate(body={'requests': batch_request});
try:
responses = request.execute(num_retries=num_retries)
if 'responses' not in responses:
return {};
text_response = {};
for filename, response in zip(
input_filenames, responses['responses']):
if 'error' in response:
logging.error('API Error for {}: {}'.format(
filename,
response['error'].get('message', '')))
continue
text_response[filename] = response.get('textAnnotations', [])
return text_response;
except googleapiclient.errors.HttpError as e:
print ('Http Error for {}: {}', e)
except KeyError as e2:
print ('Key error: {}', e2)
get_text_from_files(files);
But I am getting error, I have given the stack trace below:
Traceback (most recent call last):
File "test.py", line 68, in <module>
get_text_from_files(pdf);
File "test.py", line 21, in get_text_from_files
texts = detect_text(fileNames);
File "test.py", line 41, in detect_text
request = service.images().annotate(body={'requests': batch_request});
AttributeError: 'Resource' object has no attribute 'images'
Thanks in advance.

Note that you are using the wrong Google API Client Python Library. You are using the Natural Language API, while the one that you want to use is the Vision API. The error message AttributeError: 'Resource' object has no attribute 'images' indicates that the resource associated to the Language API does not have any images attribute. In order to solve this issue, it should be enough to do the following change:
# Wrong API being used
service = googleapiclient.discovery.build('language', 'v1');
# Correct API being used
service = googleapiclient.discovery.build('vision', 'v1');
In this Google API Client Libraries page you will find the whole list of available APIs and their names and versions available. And here, there's the complete documentation for the Vision API legacy API Client Library.
Finally, let me recommend you the usage of the idiomatic Client Libraries instead of the legacy API Client Libraries. They are much more intuitive to use and there are some good documentation references in their GitHub page.

Related

TrasnslationAPI a bytes-like object is required, not 'Repeated'

I am trying to translate a pdf document from english to french using google translation api and python, however I get a type error.
Traceback (most recent call last):
File "C:\Users\troberts034\Documents\translate_test\translate.py", line 42, in <module>
translate_document()
File "C:\Users\troberts034\Documents\translate_test\translate.py", line 33, in translate_document
f.write(response.document_translation.byte_stream_outputs)
TypeError: a bytes-like object is required, not 'Repeated'
I have a feeling that it has something to do with writing to the file as binary, but I open it as binary too so I am unsure what the issue is. I want it to take a pdf file that has english text and edit the text and translate it to french using the api. Any ideas whats wrong?
from google.cloud import translate_v3beta1 as translate
def translate_document():
client = translate.TranslationServiceClient()
location = "global"
project_id = "translatedocument"
parent = f"projects/{project_id}/locations/{location}"
# Supported file types: https://cloud.google.com/translate/docs/supported-formats
with open("C:/Users/###/Documents/translate_test/test.pdf", "rb") as document:
document_content = document.read()
document_input_config = {
"content": document_content,
"mime_type": "application/pdf",
}
response = client.translate_document(
request={
"parent": parent,
"target_language_code": "fr-FR",
"document_input_config": document_input_config,
}
)
# To output the translated document, uncomment the code below.
f = open('test.pdf', 'wb')
f.write(response.document_translation.byte_stream_outputs)
f.close()
# If not provided in the TranslationRequest, the translated file will only be returned through a byte-stream
# and its output mime type will be the same as the input file's mime type
print("Response: Detected Language Code - {}".format(
response.document_translation.detected_language_code))
translate_document()

I think there is a bug on the sample code (I'm assuming you got the sample from the Cloud Translate API documentation).
To fix your code, you do need to use response.document_translation.byte_stream_outputs[0]. So basically changing this line:
f.write(response.document_translation.byte_stream_outputs)
by:
f.write(response.document_translation.byte_stream_outputs[0])
then your code will work.

Python getting results from Azure Storage Table with azure-data-tables

I am trying to query an Azure storage table to get all rows to turn into a table on a web site, however I cannot get the entries from the table, I get the same error every time "azure.core.exceptions.HttpResponseError: The requested operation is not implemented on the specified resource."
For code I am following the examples here and it is not working as expected.
from azure.data.tables import TableServiceClient
from azure.core.credentials import AzureNamedKeyCredential
def read_storage_table():
credential = AzureNamedKeyCredential(os.environ["AZ_STORAGE_ACCOUNT"], os.environ["AZ_STORAGE_KEY"])
service = TableServiceClient(endpoint=os.environ["AZ_STORAGE_ENDPOINT"], credential=credential)
client = service.get_table_client(table_name=os.environ["AZ_STORAGE_TABLE"])
entities = client.query_entities(query_filter="PartitionKey eq 'tasksSeattle'")
client.close()
service.close()
return entities
Then calling the function.
table = read_storage_table()
for record in table:
for key in record.keys():
print("Key: {}, Value: {}".format(key, record[key]))
And that returns:
Traceback (most recent call last):
File "C:\Program Files\Python310\Lib\site-packages\azure\data\tables\_models.py", line 363, in _get_next_cb
return self._command(
File "C:\Program Files\Python310\Lib\site-packages\azure\data\tables\_generated\operations\_table_operations.py", line 386, in query_entities
raise HttpResponseError(response=response, model=error)
azure.core.exceptions.HttpResponseError: Operation returned an invalid status 'Not Implemented'
Content: {"odata.error":{"code":"NotImplemented","message":{"lang":"en-US","value":"The requested operation is not implemented on the specified resource.\nRequestId:cd29feda-1002-006b-679c-3d39e8000000\nTime:2022-03-22T03:27:00.5993216Z"}}}
Using a similar function I am able to write to the table. But even trying entities = client.list_entities() I get the same error. I'm at a loss.

KrunkFu thank you for identifying and sharing the solution here. Posting the same into answer section to help other community members.
replacing https://<accountname>.table.core.windows.net/<table>, with
https://<accountname>.table.core.windows.net to the query solved the
issue

Getting KeyError: 'Endpoint' error in Python when calling Custom Vision API

from azure.cognitiveservices.vision.customvision.prediction import CustomVisionPredictionClient
from msrest.authentication import CognitiveServicesCredentials
from azure.cognitiveservices.vision.customvision import prediction
from PIL import Image
endpoint = "https://southcentralus.api.cognitive.microsoft.com/"
project_id = "projectidhere"
prediction_key = "predictionkeyhere"
predict = CustomVisionPredictionClient(prediction_key, endpoint)
with open("c:/users/paul.barbin/pycharmprojects/hw3/TallowTest1.jpg", mode="rb") as image_data:
tallowresult = predict.detect_image(project_id, "test1", image_data)
Python 3.7, and I'm using Azure Custom Vision 3.1? (>azure.cognitiveservices.vision.customvision) (3.1.0)
Note that I've seen the same question on SO but no real solution. The posted answer on the other question says to use the REST API instead.
I believe the error is in the endpoint (as stated in the error), and I've tried a few variants - with the slash, without, using an environment variable, without, I've tried appending various strings to my endpoint but I keep getting the same message. Any help is appreciated.
Full error here:
Traceback (most recent call last):
File "GetError.py", line 15, in <module>
tallowresult = predict.detect_image(project_id, "test1", image_data)
File "C:\Users\paul.barbin\PycharmProjects\hw3\.venv\lib\site-packages\azure\cognitiveservices\vision\customvision\prediction\operations\_custom_vision_
prediction_client_operations.py", line 354, in detect_image
request = self._client.post(url, query_parameters, header_parameters, form_content=form_data_content)
File "C:\Users\paul.barbin\PycharmProjects\hw3\.venv\lib\site-packages\msrest\service_client.py", line 193, in post
request = self._request('POST', url, params, headers, content, form_content)
File "C:\Users\paul.barbin\PycharmProjects\hw3\.venv\lib\site-packages\msrest\service_client.py", line 108, in _request
request = ClientRequest(method, self.format_url(url))
File "C:\Users\paul.barbin\PycharmProjects\hw3\.venv\lib\site-packages\msrest\service_client.py", line 155, in format_url
base = self.config.base_url.format(**kwargs).rstrip('/')
KeyError: 'Endpoint'

CustomVisionPredictionClient takes two required, positional parameters: endpoint and credentials. Endpoint needs to be passed in before credentials, try swapping the order:
predict = CustomVisionPredictionClient(endpoint, prediction_key)

Why does this python script work on my local machine but not on Heroku?

there. I'm building a simple scraping tool. Here's the code that I have for it.
from bs4 import BeautifulSoup
import requests
from lxml import html
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import datetime
scope = ['https://spreadsheets.google.com/feeds']
credentials = ServiceAccountCredentials.from_json_keyfile_name('Programming
4 Marketers-File-goes-here.json', scope)
site = 'http://nathanbarry.com/authority/'
hdr = {'User-Agent':'Mozilla/5.0'}
req = requests.get(site, headers=hdr)
soup = BeautifulSoup(req.content)
def getFullPrice(soup):
divs = soup.find_all('div', id='complete-package')
price = ""
for i in divs:
price = i.a
completePrice = (str(price).split('$',1)[1]).split('<', 1)[0]
return completePrice
def getVideoPrice(soup):
divs = soup.find_all('div', id='video-package')
price = ""
for i in divs:
price = i.a
videoPrice = (str(price).split('$',1)[1]).split('<', 1)[0]
return videoPrice
fullPrice = getFullPrice(soup)
videoPrice = getVideoPrice(soup)
date = datetime.date.today()
gc = gspread.authorize(credentials)
wks = gc.open("Authority Tracking").sheet1
row = len(wks.col_values(1))+1
wks.update_cell(row, 1, date)
wks.update_cell(row, 2, fullPrice)
wks.update_cell(row, 3, videoPrice)
This script runs on my local machine. But, when I deploy it as a part of an app to Heroku and try to run it, I get the following error:
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/client.py", line 219, in put_feed
r = self.session.put(url, data, headers=headers)
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/httpsession.py", line 82, in put
return self.request('PUT', url, params=params, data=data, **kwargs)
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/httpsession.py", line 69, in request
response.status_code, response.content))
gspread.exceptions.RequestError: (400, "400: b'Invalid query parameter value for cell_id.'")
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "AuthorityScraper.py", line 44, in
wks.update_cell(row, 1, date)
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/models.py", line 517, in update_cell
self.client.put_feed(uri, ElementTree.tostring(feed))
File "/app/.heroku/python/lib/python3.6/site-packages/gspread/client.py", line 221, in put_feed
if ex[0] == 403:
TypeError: 'RequestError' object does not support indexing
What do you think might be causing this error? Do you have any suggestions for how I can fix it?

There are a couple of things going on:
1) The Google Sheets API returned an error: "Invalid query parameter value for cell_id":
gspread.exceptions.RequestError: (400, "400: b'Invalid query parameter value for cell_id.'")
2) A bug in gspread caused an exception upon receipt of the error:
TypeError: 'RequestError' object does not support indexing
Python 3 removed __getitem__ from BaseException, which this gspread error handling relies on. This doesn't matter too much because it would have raised an UpdateCellError exception anyways.
My guess is that you are passing an invalid row number to update_cell. It would be helpful to add some debug logging to your script to show, for example, which row it is trying to update.
It may be better to start with a worksheet with zero rows and use append_row instead. However there does seem to be an outstanding issue in gspread with append_row, and it may actually be the same issue you are running into.

I encountered the same problem. BS4 works fine at a local machine. However, for some reason, it is way too slow in the Heroku server resulting into giving error.
I switched to lxml and it is working fine now.
Install it by command:
pip install lxml
A sample code snippet is given below:
from lxml import html
import requests
getpage = requests.get("https://url_here")
gethtmlcontent = html.fromstring(getpage.content)
data = gethtmlcontent.xpath('//div[#class = "class-name"]/text()')
#this is a sample for fetching data from the dummy div
data = data[0:n] # as per your requirement
#now inject the data into django tmeplate.

Upload file to GCS from Appengine (Endpoints). AttributeTypeError:

I'm trying to upload a file to GCS from Appengine Endpoints. I'm using Python. When the file ends to upload, shows an error " AttributeError: 'str' object has no attribute 'ToMessage' ".
So, if I go to GCS File Explorer, in the browser, I see the recently filename uploaded but its size is 0K.
This is my model:
class File(EndpointsModel):
_message_fields_schema = ('blob', 'url')
blob = ndb.BlobKeyProperty() #stored in GCS
url = ndb.StringProperty()
enable = ndb.BooleanProperty(default=True)
def create_file(filename):
file_info = blobstore.FileInfo(filename)
filename = '/gs'+ str(file_info.filename.blob)
gcs.open(secrets.BUCKET_NAME +'/' + filename, 'w').close()
return blobstore.create_gs_key(filename)
So, what I need to do to upload correctly a file to GCS from Appengine Endpoints.
Traceback:
ERROR 2014-11-25 20:35:22,654 service.py:191] Encountered unexpected error from ProtoRPC method implementation: AttributeError ('str' object has no attribute 'ToMessage')
Traceback (most recent call last):
File "/home/alpocr/workspace/google_appengine/lib/protorpc-1.0/protorpc/wsgi/service.py", line 181, in protorpc_service_app
response = method(instance, request)
File "/home/alpocr/workspace/google_appengine/lib/endpoints-1.0/endpoints/api_config.py", line 1332, in invoke_remote
return remote_method(service_instance, request)
File "/home/alpocr/workspace/google_appengine/lib/protorpc-1.0/protorpc/remote.py", line 412, in invoke_remote_method
response = method(service_instance, request)
File "/home/alpocr/workspace/mall4g-backend/libs/endpoints_proto_datastore/ndb/model.py", line 1429, in EntityToRequestMethod
response = response.ToMessage(fields=response_fields)
AttributeError: 'str' object has no attribute 'ToMessage'

It sounds like you have defined the return type correctly for your endpoints method, and it's expecting to turn the result into a Message object, but the endpoints method code is actually returning a string. Can you post the endpoints method that is called when this error occurs?
Either that or endpoints proto model is acting weird when you (somewhere in your code) assign a string value to one of its properties. When it tries to convert it to a Message (and thus recursively to turn its properties into Messages), it finds the String and bugs out. It's hard to tell without seeing the affected endpoint method's code.
UPDATE: Also, checking the source of endpoints_proto_datastore, we see the following comment above the line that bugs:
# If developers using a custom request message class with
# response_fields to create a response message class for them, it is
# up to them to return an instance of the current EndpointsModel
# class. If not, their API users will receive a 503 from an uncaught
# exception.
Could this apply to you?

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.

Annotating image file to text using googleapiclient - python

Related

TrasnslationAPI a bytes-like object is required, not 'Repeated'

Python getting results from Azure Storage Table with azure-data-tables

Getting KeyError: 'Endpoint' error in Python when calling Custom Vision API

Why does this python script work on my local machine but not on Heroku?

Upload file to GCS from Appengine (Endpoints). AttributeTypeError:

Categories

Resources