pyArango bulkImport_json is complaining about improper indicies - python

I'm testing the ability to store PyTest results, generated by the json plugin for that test harness, into ArangoDB. I am attempting to import as follows
import pyArango.connection as adbConn
dbConn = adbConn.Connection(...)
db = dbConn['mydb']
collection = db.collections['PyTestResults']
collection.bulkImport_json('/path/to/results.json')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.8/site-packages/pyArango/collection.py", line 777, in bulkImport_json
errorMessage = "At least: %d errors. The first one is: '%s'\n\n more in <this_exception>.data" %
(len(data), data[0]["errorMessage"])
TypeError: string indices must be integers
What isn't making sense is that the JSON file is properly formed. In fact, using the regular Python JSON module, it works just fine:
import json
with open('/path/to/results.json') as fd:
data = json.load(fd)
print(data)
This works. The beginning of the file is
{"report":
{"environment":
{
"Python": "3.6.9", "Platform": "Linux-4.4.0-17763-Microsoft-x86_64-with-Ubuntu-18.04-bionic"
},
It seems that the library, pyArango, is wanting the keys to be integers. I tried this, that is I tried changing "report" to 0. However, this resulted in invalidating the JSON structure.
How is one to use the pyArango library to import JSON? The overall structure of this JSON file doesn't look much different than any of the examples in this page. Any pointers are greatly appreciated.

Related

TrasnslationAPI a bytes-like object is required, not 'Repeated'

I am trying to translate a pdf document from english to french using google translation api and python, however I get a type error.
Traceback (most recent call last):
File "C:\Users\troberts034\Documents\translate_test\translate.py", line 42, in <module>
translate_document()
File "C:\Users\troberts034\Documents\translate_test\translate.py", line 33, in translate_document
f.write(response.document_translation.byte_stream_outputs)
TypeError: a bytes-like object is required, not 'Repeated'
I have a feeling that it has something to do with writing to the file as binary, but I open it as binary too so I am unsure what the issue is. I want it to take a pdf file that has english text and edit the text and translate it to french using the api. Any ideas whats wrong?
from google.cloud import translate_v3beta1 as translate
def translate_document():
client = translate.TranslationServiceClient()
location = "global"
project_id = "translatedocument"
parent = f"projects/{project_id}/locations/{location}"
# Supported file types: https://cloud.google.com/translate/docs/supported-formats
with open("C:/Users/###/Documents/translate_test/test.pdf", "rb") as document:
document_content = document.read()
document_input_config = {
"content": document_content,
"mime_type": "application/pdf",
}
response = client.translate_document(
request={
"parent": parent,
"target_language_code": "fr-FR",
"document_input_config": document_input_config,
}
)
# To output the translated document, uncomment the code below.
f = open('test.pdf', 'wb')
f.write(response.document_translation.byte_stream_outputs)
f.close()
# If not provided in the TranslationRequest, the translated file will only be returned through a byte-stream
# and its output mime type will be the same as the input file's mime type
print("Response: Detected Language Code - {}".format(
response.document_translation.detected_language_code))
translate_document()
I think there is a bug on the sample code (I'm assuming you got the sample from the Cloud Translate API documentation).
To fix your code, you do need to use response.document_translation.byte_stream_outputs[0]. So basically changing this line:
f.write(response.document_translation.byte_stream_outputs)
by:
f.write(response.document_translation.byte_stream_outputs[0])
then your code will work.

How to print from response object

I'm currently trying to get data with python from the Internet. Fortunately the Website I want to Approach Hands me the data in json Format.
Now this is my Code:
from GetWebJson import simple_get
import json
raw_data = simple_get('http://api.openweathermap.org/data/2.5/weather?q=Hamburg,de')
raw_data.encoding
data_json = raw_data.json
print(data_json["temp"])
The simple_get function executes "Get" from the requests library and Returns the Response object.
After formatting the raw data to json In want to print the value of the key "temp". But when i debug this Code I get the following error:
Traceback (most recent call last):
File "C:\Users\nikhi\source\repos\WebScraper\WebScraper\WebScraper.py", line 7, in
print(data_json["temp"])
TypeError: 'method' object is not subscriptable
Can't I use the variable data_json like a dict? If not, how do I convert the json, which contains lists and dicts to a printable Format?

Executing arbitrary Python code (user submitted) inside an Azure Function

I'm trying to run a sample function that allows a user to execute arbitrary code
Note: I"m assuming this is ok because Azure Functions will by default provide a sandbox. (And the end user will need to write code with dataframes, objects etc. I've looked into pypy.org but don't think I need it as I am not worried about attacks that use it as a spambot or something):
import os
import json
import ast
print('==============in python function========================')
postreqdata = json.loads(open(os.environ['req']).read())
response = open(os.environ['res'], 'w')
response.write("hello world from "+postreqdata['name'])
response.close()
logic = (postreqdata['logic'])
eval(logic)
but I keep getting the following output/error:
2018-01-17T09:09:08.949 ==============in python function========================
2018-01-17T09:09:09.207 Exception while executing function: Functions.ccfinopsRunModel. Microsoft.Azure.WebJobs.Script: Traceback (most recent call last):
File "D:\home\site\wwwroot\ccfinopsRunModel\run.py", line 12, in <module>
eval(logic)
File "<string>", line 1
print('code sent from client')
^
SyntaxError: invalid syntax
.
My POST request body contains the following:
{
"name": "Python Function App",
"logic": "print('code sent from client')"
}
So the "logic" variable is being read in, and eval() is trying to interpret the string as python code, but it is causing a Syntax Error where there appears to be none.
What am I doing wrong? If there was a restriction on 'eval' I'm assuming it would say that instead of "Syntax Error"
Thanks for any help you can provide!
Use exec to run your code. eval is used evaluating expressions.
logic = (postreqdata['logic'])
exec(logic)
Also can try sending your code as multi-line string as below,
>>> s = '''
for i in range(3):
print("i")
'''
>>> exec(s)
0
1
2

Python jsonpickle error: 'OrderedDict' object has no attribute '_OrderedDict__root'

I'm hitting this exception with jsonpickle, when trying to pickle a rather complex object that unfortunately I'm not sure how to describe here. I know that makes it tough to say much, but for what it's worth:
>>> frozen = jsonpickle.encode(my_complex_object_instance)
>>> thawed = jsonpickle.decode(frozen)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Python/2.7/site-packages/jsonpickle/__init__.py",
line 152, in decode
return unpickler.decode(string, backend=backend, keys=keys)
:
:
File "/Library/Python/2.7/site-packages/jsonpickle/unpickler.py",
line 336, in _restore_from_dict
instance[k] = value
File "/Library/Python/2.7/site-packages/botocore/vendored/requests/packages/urllib3/packages/ordered_dict.py",
line 49, in __setitem__
root = self.__root
AttributeError: 'OrderedDict' object has no attribute '_OrderedDict__root'
I don't find much of assistance when googling the error. I do see what looks like the same issue was resolved at some time past for simpler objects:
https://github.com/jsonpickle/jsonpickle/issues/33
The cited example in that report works for me:
>>> jsonpickle.decode(jsonpickle.encode(collections.OrderedDict()))
OrderedDict()
>>> jsonpickle.decode(jsonpickle.encode(collections.OrderedDict(a=1)))
OrderedDict([(u'a', 1)])
Has anyone ever run into this themselves and found a solution? I ask with the understanding that my case may be "differently idiosynchratic" than another known example.
The requests module for me seems to be running into problems when I .decode(). After looking at the jsonpickle code a bit, I decided to fork it and change the following lines to see what was going on (and I ended up keeping a private copy of jsonpickle with the changes so I can move forward).
In jsonpickle/unpickler.py (in my version it's line 368), search for the if statement section in the method _restore_from_dict():
if (util.is_noncomplex(instance) or
util.is_dictionary_subclass(instance)):
instance[k] = value
else:
setattr(instance, k, value)
and change it to this (it will logERROR the ones that are failing and then you can either keep the code in place or change your OrderedDict's version that have __root)
if (util.is_noncomplex(instance) or
util.is_dictionary_subclass(instance)):
# Currently requests.adapters.HTTPAdapter is using a non-standard
# version of OrderedDict which doesn't have a _OrderedDict__root
# attribute
try:
instance[k] = value
except AttributeError as e:
import logging
import pprint
warnmsg = 'Unable to unpickle {}[{}]={}'.format(pprint.pformat(instance), pprint.pformat(k), pprint.pformat(value))
logging.error(warnmsg)
else:
setattr(instance, k, value)

how to use TTreeReader in PyROOT

I'm trying to get up and running using the TTreeReader approach to reading TTrees in PyROOT. As a guide, I am using the ROOT 6 Analysis Workshop (http://root.cern.ch/drupal/content/7-using-ttreereader) and its associated ROOT file (http://root.cern.ch/root/files/tutorials/mockupx.root).
from ROOT import *
fileName = "mockupx.root"
file = TFile(fileName)
tree = file.Get("MyTree")
treeReader = TTreeReader("MyTree", file)
After this, I am a bit lost. I attempt to access variable information using the TTreeReader object and it doesn't quite work:
>>> rvMissingET = TTreeReaderValue(treeReader, "missingET")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/user/ROOT/v6-03-01/root/lib/ROOT.py", line 198, in __call__
result = _root.MakeRootTemplateClass( *newargs )
SystemError: error return without exception set
Where am I going wrong here?
TTreeReaderValue is a templated class, as shown in the example on the TTreeReader documentation, so you need to specify the template type.
You can do this with
rvMissingET = ROOT.TTreeReaderValue(ROOT.Double)(treeReader, "missingET")
The Python built-ins can be used for int and float types, e.g.
rvInt = ROOT.TTreeReaderValue(int)(treeReader, "intBranch")
rvFloat = ROOT.TTreeReaderValue(float)(treeReader, "floatBranch")
Also note that using TTreeReader in PyROOT is not recommended. (If you're looking for faster ntuple branch access in Python, you might look in to the Ntuple class I wrote.)

Categories