From: https://www.textrazor.com/docs/python#Dictionary
I implemented the following code:
client = textrazor.TextRazor("API_key", extractors=["words", "phrases", "topics", "relations","entities"])
client.set_classifiers(["textrazor_iab"])
manager = textrazor.DictionaryManager('API_key')
manager.create_dictionary({'id':'dict_ID'})
new_entity_type_data = {'type': ['cpp_developer']}
manager.add_entries('dict_ID', [{'id': 'DEV1', 'text': 'Andrei Alexandrescu', 'data':new_entity_type_data}, {'id': 'DEV2', 'text':'Bjarne Stroustrup', 'data':new_entity_type_data}])
client.set_entity_dictionaries(['dict_ID'])
response = client.analyze('Although it is very early in the process, higher-level parallelism is slated to be a key theme of the next version of C++, says Bjarne Stroustrup')
I get the following error when running response.entities():
AttributeError: 'NoneType' object has no attribute 'encode'
When I don't use the custom dictionary, I get the following output:
[TextRazor Entity b'Bjarne Stroustrup' at positions [26, 27],
TextRazor Entity b'C++' at positions [23]]
I tested this with different sentences, and different entities in the custom dictionary.
I get the same error every time the entity that I added to the custom dictionary occurs in the sentence I am analyzing.
If I create a custom dictionary, but there are no words present in the sentence that are entities in the dictionary, there is no error thrown.
This indicates that the entity is recognized using the custom dictionary; otherwise the error wouldn't be thrown. But for some reason, this entity doesn't have a data type.
So my question could probably be rephrased as: how do I add a valid data type to an entity that I add to a custom dictionary?
I got a response from TextRazor:
I can reproduce this and there does appear to be a bug in our Python client when printing entities.
When printing an entity we generate a string that assumes the entity has an ID; custom entities do not, causing this unhelpful error. We'll fix this in the client. The bug only occurs when printing the entity; you can still access the matched custom entities as normal:
for entity in response.entities():
    print(entity.matched_positions, entity.data, entity.custom_entity_id)
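Until the client fix ships, a small guard when printing avoids the crash. A minimal sketch (it assumes entity.id is None for custom-dictionary matches, which is what triggers the encode error):

for entity in response.entities():
    # Custom-dictionary entities have no disambiguated id, so fall back to
    # the custom entity id and whatever data was attached to the entry.
    label = entity.id if entity.id else entity.custom_entity_id
    print(label, entity.matched_positions, entity.data)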
def get_ad_full_details(ad_json, current_topics, ad_group_ad, ad):
    mandatory_data = {
        "ad_group_ad.ad.responsive_search_ad.headlines": ad_group_ad.ad.responsive_search_ad.headlines,
        "ad_group_ad.ad.responsive_search_ad.descriptions": ad_group_ad.ad.responsive_search_ad.descriptions,
    }
    ad_json["mandatory_data"] = mandatory_data
When I run json.dumps(ad_json), I get AttributeError: 'google.protobuf.pyext._message.RepeatedCompositeCo' object has no attribute 'DESCRIPTOR'. I've tried to follow this post but it still gives the same error.
I've tried to iterate over the ad_group_ad.ad.responsive_search_ad.headlines repeated proto field and map it to its text values, but the code fails.
Any idea how I can fetch the "text" member of this repeated proto field?
I could use a regex, but I thought there might be an easier way.
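For what it's worth, a plain list comprehension over the repeated field is usually enough. A minimal sketch, assuming headlines and descriptions are repeated AdTextAsset messages with a text field (as in the Google Ads API):

import json

# Pull the plain strings out of the repeated proto fields so that the
# resulting dict contains only JSON-serializable values.
headlines = [asset.text for asset in ad_group_ad.ad.responsive_search_ad.headlines]
descriptions = [asset.text for asset in ad_group_ad.ad.responsive_search_ad.descriptions]

ad_json["mandatory_data"] = {
    "headlines": headlines,
    "descriptions": descriptions,
}
json.dumps(ad_json)  # lists of strings serialize without protobuf errors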
I am trying to execute the following code to push data to Salesforce using the simple_salesforce Python library:
from simple_salesforce import Salesforce
staging_df = hive.execute("select * from hdmni")
staging_df = staging_df.toPandas()
# # staging_df['birth_date']= staging_df['birth_date'].dt.date
staging_df['birth_date'] = staging_df['birth_date'].astype(str)
staging_df['encounter_start_date'] = staging_df['encounter_start_date'].astype(str)
staging_df['encounter_end_date'] = staging_df['encounter_end_date'].astype(str)
bulk_data = []
for row in staging_df.itertuples():
    d = row._asdict()
    del d['Index']
    bulk_data.append(d)
sf = Salesforce(password='', username='', security_token='')
sf.bulk.Delivery_Detail__c.insert(bulk_data)
I am getting this error while trying to send the dictionary to Salesforce:
SalesforceMalformedRequest: Malformed request
https://subhotutorial-dev-ed.my.salesforce.com/services/async/38.0/job/7500o00000HtWP6AAN/batch/7510o00000Q15TnAAJ/result.
Response content: {'exceptionCode': 'InvalidBatch',
'exceptionMessage': 'Records not processed'}
There's something about your query that is not correct. While I don't know your use case, by reading this line, you can tell that you are attempting to insert into a custom object/entity in Salesforce:
sf.bulk.Delivery_Detail__c.insert(bulk_data)
The reason you can tell is because of the __c suffix, which gets appended onto custom objects and fields (that's two underscores, by the way).
Since you're inserting into a custom object, your fields would have to be custom, too. And note, you've not appended that suffix onto them.
Note: Every custom object/entity in Salesforce does come with a few standard fields to support system features like record key (Id), record name (Name), audit fields (CreatedById, CreatedDate, etc.). These wouldn't have a suffix. But none of the fields you reference are any of these standard system fields...so the __c suffix would be expected.
I suspect that what Salesforce is expecting in your insert operation are field names like this:
Birth_Date__c
Encounter_Start_Date__c
Encounter_End_Date__c
These are referred to as the API name for both objects and fields, and anytime code interacts with them (whether via integration, or on code that executes directly on the Salesforce platform) you need to make certain you're using this API name.
Incidentally, you can retrieve this API name in a number of ways. Probably the easiest is to log into your Salesforce org, and under Setup > Object Manager > [some object] > Fields and Relationships you can view the details of each field, including the API name.
You can also use the SObject describe APIs, either in native Apex code or via integration with either the REST or SOAP API. For the same object as my UI example above, the describe REST endpoint lives at https://[domain]/services/data/v47.0/sobjects/Expense__c/describe.
Looking at the docs for the simple-salesforce Python library you're using, they've surfaced the describe API; you can find some info under Other Options. You would invoke it as sf.SObject.describe(), where "SObject" is the actual object you want the information about. For instance, in your case you would use:
sf.Delivery_Detail__c.describe()
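For example, you can pull the API names of every field out of the describe result (a small sketch; the exact names returned depend on your org):

# List the API names of all fields on the custom object; handy for
# confirming the exact __c names before building the bulk payload.
describe_result = sf.Delivery_Detail__c.describe()
field_names = [field['name'] for field in describe_result['fields']]
print(field_names)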
As a good first troubleshooting step when interacting with a Salesforce object, I'd always recommend double-checking that you're referencing the API name correctly. I can't tell you how many times I've bumped into little things like an extra or missing underscore, especially with the __c suffix.
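Putting that together with your original script, a minimal sketch might look like this (the __c field names below are assumptions on my part; confirm them with the describe call above):

# Rename the DataFrame columns to the (assumed) Salesforce API names
# before building the bulk payload.
field_map = {
    'birth_date': 'Birth_Date__c',
    'encounter_start_date': 'Encounter_Start_Date__c',
    'encounter_end_date': 'Encounter_End_Date__c',
}
staging_df = staging_df.rename(columns=field_map)

bulk_data = []
for row in staging_df.itertuples():
    d = row._asdict()
    del d['Index']
    bulk_data.append(d)

sf.bulk.Delivery_Detail__c.insert(bulk_data)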
I'm trying to use the Google Protobuf API found here, and I'm having trouble with the built-in PrintField() method, whose documentation reads:
PrintField(field, value, out, indent=0, as_utf8=False, as_one_line=False)
Print a single field name/value pair. For repeated fields, the value should be a single element.
After merging my message, I'm able to print out the fully merged layout. However, I'd like the specific field/value pair and I'm a bit unsure how to go about doing that as I can't find any full fledged internet examples.
I have tried the following:
proto.PrintField(1, 1, cStringIO.StringIO())
proto.PrintField('field1', 'subfield', cStringIO.StringIO())
Where my message looks like:
message field1 {subfield = 1;}
Running it yields the following error: "AttributeError: 'int' object has no attribute 'is_extension'". The error is the same in both cases; the only change is 'int' versus 'str'.
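For reference, PrintField expects a FieldDescriptor object (not an int or a string) as its first argument. A minimal sketch under that assumption, using ListFields() to obtain descriptor/value pairs for a populated message called proto:

import io
from google.protobuf import text_format

# ListFields() yields (FieldDescriptor, value) pairs for every populated
# field, which is exactly what PrintField expects.
out = io.StringIO()
for field_descriptor, value in proto.ListFields():
    if field_descriptor.label == field_descriptor.LABEL_REPEATED:
        # For repeated fields, PrintField wants one element at a time.
        for element in value:
            text_format.PrintField(field_descriptor, element, out)
    else:
        text_format.PrintField(field_descriptor, value, out)
print(out.getvalue())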
I want to update records in the collection books.
I want to create a new field whose name and value come from variables.
for book in db.books.find():
    title = book['title']
    author, value = getAuthor(title)
    db.dataset.update({"_id": book['_id']}, {"$set": {author: value}})
When I did this I got the error WriteError: The update path contains an empty field name. It is not allowed, which is not true because both variables have values. I googled and "resolved" this issue by enclosing author in []. So the code now looks like this:
for book in db.books.find():
    title = book['title']
    author, value = getAuthor(title)
    db.dataset.update({"_id": book['_id']}, {"$set": {[author]: value}})
But now I am getting this error which I am not able to resolve:
TypeError: unhashable type: 'list'
Has anyone encountered this problem before? How can I resolve it?
It sounds like getAuthor() is returning nothing for its first value, so author is getting set to nothing. From what I can see, you did not resolve the error, you just changed it to a different one. By making it say [author] (though it's been a while since I've been in Python), I believe you're just trying to set the key to an empty list, or a list with an empty string as its only value, depending on what author is.
If I were you, I would print out what author is, or do some debugging and figure out what you're getting back from getAuthor(). Since I can't see that code, nor the data in your database, I'm not sure how to help further without more information.
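As a concrete debugging step, here is a minimal sketch that logs and skips books for which getAuthor() returns no usable field name (it uses PyMongo's update_one, which replaces the deprecated update):

for book in db.books.find():
    title = book['title']
    author, value = getAuthor(title)
    if not author:
        # getAuthor() returned nothing usable; an empty key here is what
        # produces the "empty field name" error in $set.
        print('getAuthor returned no author for title:', title)
        continue
    db.dataset.update_one({"_id": book['_id']}, {"$set": {author: value}})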
I'm learning KairosDB with a Cassandra backend, and I came across the following issue:
I'm trying to save metrics in the following fashion:
available_methods = json.dumps(available_methods)
data = []
for definition in archive_policy.definition:
    data.append({'name': '%s-archives' % metric,
                 'timestamp': time.time(),
                 'value': archive_policy.back_window,
                 'tags': {'metric': self._to_hex(metric),
                          'timespan': float(definition.timespan),
                          'granularity_points': '%s_%s' % (
                              float(definition.granularity),
                              definition.points),
                          'aggregation_methods': available_methods}})
As a result, the metric isn't posted with all the information; only the name is in it. I tried to post the metric with 'aggregation_methods' set to a plain string and it worked.
So the question is: is it possible to save a dict or JSON value in tags?
For the record, I'm using pyKairosDB python client.
thx
Actually, not all characters are allowed in metric names and tag keys/values.
cf. http://kairosdb.github.io/kairosdocs/FAQ.html#why-can-tags-only-handle-ascii-characters-and
This will evolve in future versions to become more permissive, but some particular characters should remain forbidden, such as the colon (:) and the equals sign (=), because they are used as field separators when constructing the Cassandra row key from the tags and metric name.
cf. http://kairosdb.github.io/kairosdocs/CassandraSchema.html
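If you really need to carry a structured value in a tag, one workaround (a sketch of mine, not an official KairosDB recommendation) is to encode the JSON so that forbidden characters such as ':' and '=' cannot appear in the tag value, much like the hex-encoded metric name in your snippet:

import binascii
import json

available_methods = ['mean', 'max', 'min']  # example value

# Hex-encode the JSON so the tag value contains only [0-9a-f] characters.
encoded = binascii.hexlify(
    json.dumps(available_methods).encode('utf-8')).decode('ascii')
tags = {'aggregation_methods': encoded}

# To read it back later:
decoded = json.loads(binascii.unhexlify(encoded).decode('utf-8'))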
May I ask why you need to put a JSON object in a tag?