Python GitHub update_file throws UnknownObjectException (404)

I am trying to update the contents of a file on GitHub with the following code:
file = repo.get_contents('test.csv', ref="main")
data = file.decoded_content.decode("utf-8")
data += "\n test;test;test"
repo.update_file(file.path, "automatic update", data, file.sha, branch='main')
It returns a cryptic error stack with this last line: github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/reference/repos#create-or-update-file-contents"}
The get_contents call seems to work, as it lets me print the original content of the file. Any pointers as to why the update_file method is not working?
TIA!
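One way to surface the full error detail is to catch the exception around the update call. A minimal, self-contained sketch, assuming PyGithub; the token placeholder and repository name are hypothetical:

from github import Github, GithubException

# Hypothetical placeholders: supply your own token and repository name.
gh = Github("YOUR_TOKEN")
repo = gh.get_repo("owner/repo")

file = repo.get_contents("test.csv", ref="main")
data = file.decoded_content.decode("utf-8")
data += "\n test;test;test"

try:
    repo.update_file(file.path, "automatic update", data, file.sha, branch="main")
except GithubException as e:
    # The status and data attributes carry the full API error response,
    # which often reveals whether the 404 is a path, branch, or permission problem.
    print(e.status, e.data)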

Document AI process document fails with invalid argument when processing docs from GCS

I am getting an error very similar to the one below, but I am not in the EU:
Document AI: google.api_core.exceptions.InvalidArgument: 400 Request contains an invalid argument
When I use raw_document and process a local PDF file, it works fine. However, when I specify a PDF file at a GCS location, it fails.
Error message:
the processor name: projects/xxxxxxxxx/locations/us/processors/f7502cad4bccdd97
the form process request: name: "projects/xxxxxxxxx/locations/us/processors/f7502cad4bccdd97"
inline_document {
  uri: "gs://xxxx/temp/test1.pdf"
}
Traceback (most recent call last):
  File "C:\Python39\lib\site-packages\google\api_core\grpc_helpers.py", line 66, in error_remapped_callable
    return callable_(*args, **kwargs)
  File "C:\Python39\lib\site-packages\grpc\_channel.py", line 946, in __call__
    return _end_unary_response_blocking(state, call, False, None)
  File "C:\Python39\lib\site-packages\grpc\_channel.py", line 849, in _end_unary_response_blocking
    raise _InactiveRpcError(state)
grpc._channel._InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
    status = StatusCode.INVALID_ARGUMENT
    details = "Request contains an invalid argument."
    debug_error_string = "{"created":"@1647296055.582000000","description":"Error received from peer ipv4:142.250.80.74:443","file":"src/core/lib/surface/call.cc","file_line":1070,"grpc_message":"Request contains an invalid argument.","grpc_status":3}"
>
Code:
client = documentai.DocumentProcessorServiceClient(client_options=opts)
# The full resource name of the processor, e.g.:
# projects/project-id/locations/location/processor/processor-id
# You must create new processors in the Cloud Console first
name = f"projects/{project_id}/locations/{location}/processors/{processor_id}"
print(f'the processor name: {name}')
# document = {"uri": gcs_path, "mime_type": "application/pdf"}
document = {"uri": gcs_path}
inline_document = documentai.Document()
inline_document.uri = gcs_path
# inline_document.mime_type = "application/pdf"
# Configure the process request
# request = {"name": name, "inline_document": document}
request = documentai.ProcessRequest(
    inline_document=inline_document,
    name=name
)
print(f'the form process request: {request}')
result = client.process_document(request=request)
I do not believe I have permission issues on the bucket, since the same setup works fine for a document classification process on the same bucket.
This is a known issue for Document AI, and it is already reported in this issue tracker. Unfortunately the only workaround for now is one of the following (see the sketch after this list):
Option 1: Download your file, read it as bytes, and use process_document() with a raw_document. See Document AI local processing for the sample code.
Option 2: Use batch_process_documents(), since by default it only accepts files from GCS. Use this if you don't want the extra step of downloading the file.
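A minimal sketch of Option 1, assuming the google-cloud-storage client library and a PDF input; the function name is hypothetical, and the bucket/blob names are taken from the question's example URI:

from google.cloud import documentai, storage

def process_gcs_pdf(client, processor_name, bucket_name, blob_name):
    # Download the PDF from GCS as bytes, then process it as a raw document.
    blob = storage.Client().bucket(bucket_name).blob(blob_name)
    raw_document = documentai.RawDocument(
        content=blob.download_as_bytes(),
        mime_type="application/pdf",
    )
    request = documentai.ProcessRequest(
        name=processor_name,
        raw_document=raw_document,
    )
    return client.process_document(request=request)

result = process_gcs_pdf(client, name, "xxxx", "temp/test1.pdf")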
This is still an issue 5 months later. Something not mentioned in the accepted answer (and I could be wrong, but it seems to me) is that batch processes can only output their results to GCS, so you will still incur the extra step of downloading something from a bucket: the input document under Option 1, or the result under Option 2. On top of that, you will end up having to clean up the bucket if you don't want the results there. So under many circumstances, Option 2 won't present much of an advantage, other than that the result download will probably be smaller than the input file download.
I'm using the client library in a Python Cloud Function and I'm affected by this issue. I'm implementing Option 1 because it seems simplest, and I'm holding out for the fix. I also considered using the Workflows client library to fire a Workflow that runs a Document AI process, or calling the Document AI REST API, but it's all very suboptimal.

AttributeError: 'dict' object has no attribute 'append' Trying to write to .JSON file

I have a script where I am trying to write to a .json file called alerted_comments.json.
I had this working on Windows, but since moving it to Ubuntu I am getting the error in the title.
I am kind of new at this and am a little stuck. I know that it's an issue with lists and dictionaries, but I cannot figure it out. Any help would be appreciated.
I am using Python 3.6 on Ubuntu.
My current code related to this problem is below:
with open(save_path, 'r') as fp:
    alerted_comments = json.load(fp)

for comment in comment_stream:
    if comment.id in alerted_comments:
        continue
    if comment.author:  # if comment author hasn't deleted
        if comment.author.name in ignore_users:
            continue
    alerted_comments.append(comment.id)
    if len(alerted_comments) > 100:
        alerted_comments = alerted_comments[-100:]
    with open(save_path, 'w') as fp:
        json.dump(alerted_comments, fp)
else:
    # You'll probably want to be more discerning than "not 200",
    # but that's fine for now.
    raise SlackError('Request to Slack returned an error %s, the response is:\n%s' % (response.status_code, response.text))

if __name__ == '__main__':
    while True:
        try:
            main(save_path='alerted_comments.json')
        except Exception as e:
            print('There was an error: {}'.format(str(e)))
        time.sleep(60)  # wait for 60 seconds before restarting
What is meant to happen is: the script reads a keyword and then puts the comment ID for that keyword into alerted_comments.json, so it remembers it and does not keep repeating the same keyword.
From the error we can tell that we are working with a dict instance where a list was assumed, so the problem is in the JSON that was passed or in our assumption about its structure.
From the conversation it becomes clear that the initial JSON file has
{}
(an empty object literal) as its contents, which becomes a Python dict after deserialization with json.load. Since it should be an empty array instead, we can replace the file contents with
[]
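A minimal sketch of a defensive loader, assuming the state file should hold a JSON array of comment IDs (the function name is hypothetical):

import json

def load_alerted_comments(save_path):
    try:
        with open(save_path, 'r') as fp:
            data = json.load(fp)
    except FileNotFoundError:
        return []
    # If the file was seeded with "{}" (a dict), fall back to an empty list.
    return data if isinstance(data, list) else []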

How to print a message in the terminal when debugging the Django app?

In the Django backend I have the following lines of code:
csvData = request.GET.get('csvData')
print(csvData)
X = pd.DataFrame(csvData)
It throws the following error message that I see in the terminal:
X = pd.DataFrame(csvData)
  File "/Users/terr/anaconda3/lib/python3.6/site-packages/pandas/core/frame.py", line 422, in __init__
    raise ValueError('DataFrame constructor not properly called!')
However, the output of print(csvData) does not appear in the terminal. How can I print it out there?
The above code should work. If it isn't working, it could be that the csvData variable points to an empty string. Try deleting the last line (the pd.DataFrame call) and just printing csvData, so that you can see what happens when the error is not thrown.
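A minimal sketch of that suggestion: repr() makes None or an empty string visible, and flush=True ensures the output reaches the terminal immediately:

csvData = request.GET.get('csvData')
# repr() distinguishes None, '' and whitespace-only values at a glance.
print(repr(csvData), flush=True)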

How to avoid or skip error 400 in python while calling the API

Note: I have written my code after referring to a few examples on Stack Overflow, but I still could not get the required output.
I have a Python script in which a loop iterates over an Instagram API. I give the user_id as an input to the API, which returns the number of posts, the number of followers, and the number of accounts followed. Each time it gets a response, I load it into a JSON schema and append the values to the lists data1, data2 and data3.
The issue is: some accounts are private, and the API call is not allowed for them. When I run the script in the IDLE Python shell, it gives this error:
Traceback (most recent call last):
  File "<pyshell#144>", line 18, in <module>
    beta=json.load(url)
  File "C:\Users\rnair\AppData\Local\Programs\Python\Python35\lib\site-packages\simplejson-3.8.2-py3.5-win-amd64.egg\simplejson\__init__.py", line 455, in load
    return loads(fp.read(),
  File "C:\Users\rnair\AppData\Local\Programs\Python\Python35\lib\tempfile.py", line 483, in func_wrapper
    return func(*args, **kwargs)
ValueError: read of closed file
But the JSON contains this:
{
  "meta": {
    "error_type": "APINotAllowedError",
    "code": 400,
    "error_message": "you cannot view this resource"
  }
}
My code is:
for r in range(307, 601):
    var = r, sheet.cell(row=r, column=2).value
    xy = var[1]
    ij = str(xy)
    if xy == "Account Deleted":
        data1.append('null')
        data2.append('null')
        data3.append('null')
        continue
    myopener = Myopen()
    try:
        url = myopener.open('https://api.instagram.com/v1/users/' + ij + '/?access_token=641567093.1fb234f.a0ffbe574e844e1c818145097050cf33')
    except urllib.error.HTTPError as e:  # I want the change here
        data1.append('Private Account')
        data2.append('Private Account')
        data3.append('Private Account')
        continue
    beta = json.load(url)
    item = beta['data']['counts']
    data1.append(item['media'])
    data2.append(item['followed_by'])
    data3.append(item['follows'])
I am using Python version 3.5.2. The main question is: if the loop runs and a particular call is blocked with this error, how do I skip it and keep running the next iterations? Also, if the account is private, I want to append "Private Account" to the lists.
It looks like the code that actually fetches the URL is within your custom type, Myopen (which is not shown). It also looks like it's not throwing the HTTPError you are expecting, since your json.load line is still being executed (and leading to the ValueError that is being thrown).
If you want your error-handling block to fire, you would need to check the response status code within Myopen and, if it's != 200, throw the HTTPError you are expecting instead of whatever it does now.
I'm not personally familiar with FancyURLopener, but it looks like it supports a getcode method. Maybe try something like this instead of expecting an HTTPError:
url = myopener.open('yoururl')
if url.getcode() == 400:
    data1.append('Private Account')
    data2.append('Private Account')
    data3.append('Private Account')
    continue
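Alternatively, a hedged sketch using plain urllib.request instead of the custom opener, assuming dropping Myopen is acceptable: urlopen() raises HTTPError on 4xx responses, so the except block fires the way the question expects:

import json
import urllib.request
import urllib.error

try:
    resp = urllib.request.urlopen('https://api.instagram.com/v1/users/' + ij + '/?access_token=...')
except urllib.error.HTTPError:
    # Private account (or any other 4xx): record placeholders and move on.
    data1.append('Private Account')
    data2.append('Private Account')
    data3.append('Private Account')
else:
    beta = json.loads(resp.read().decode('utf-8'))
    item = beta['data']['counts']
    data1.append(item['media'])
    data2.append(item['followed_by'])
    data3.append(item['follows'])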

Google Drive indexableText for a video in Python

I am trying to add indexable text to a video already in Drive. From the Drive SDK docs I used the patch method.
def add_indexable_text(service, file_id, text_to_index):
    try:
        file = {'indexableText': {'text': text_to_index}}
        updated_file = service.files().patch(
            fileId=file_id,
            body=file,
            fields='indexableText').execute()
        return updated_file
    except errors.HttpError, error:
        print 'An error occurred: %s' % error
        return None
I'm not seeing any errors in the terminal, but when I search Drive for the metadata text there are no results. I've tried waiting for a while, but that didn't seem to help/matter. The video says it was updated when I ran the script, so something happened. Thanks for any help you can provide.
Update: It seems I can only add indexableText to a video once but not change it afterward.
The code looks fine to me. Try using the "Try it" feature of the docs; does that work?
https://developers.google.com/drive/v2/reference/files/patch
Maybe the request went through but failed, which would not raise an HttpError.
What's the return value from a typical call?
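A minimal sketch for checking that return value, in the same Python 2 style as the snippet above, assuming the same service object and a known file_id. One assumption worth noting: indexableText appears to be write-only in the Drive API, so the returned resource may not echo it back.

# Hypothetical call to inspect what patch() actually returns.
result = add_indexable_text(service, file_id, 'test search words')
if result is None:
    print 'patch() raised HttpError'
else:
    print 'patch() returned: %r' % result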
