GEE Python API: Export image to Google Drive fails

Using the GEE Python API in an application running on App Engine (on localhost), I am trying to export an image to a file in Google Drive. The task seems to start and complete successfully, but no file is created in Google Drive.
I have tried executing the equivalent JavaScript code in the GEE Code Editor and that works: the file is created in Google Drive.
In Python, I have tried various ways of starting the task, but the result is always the same: the task completes, yet no file is created.
My python code is as follows:
import ee
import time

landsat = ee.Image('LANDSAT/LC08/C01/T1_TOA/LC08_123032_20140515').select(['B4', 'B3', 'B2'])
geometry = ee.Geometry.Rectangle([116.2621, 39.8412, 116.4849, 40.01236])
task_config = {
    'description': 'TEST_todrive_desc',
    'scale': 30,
    'region': geometry,
    'folder': 'GEEtest'
}
task = ee.batch.Export.image.toDrive(landsat, 'TEST_todrive', task_config)
ee.batch.data.startProcessing(task.id, task.config)
# Note: I also tried task.start() instead of this last line, but the problem is the same: task completed, no file created.
# Print the task list repeatedly
for i in range(10):
    tasks = ee.batch.Task.list()
    print(tasks)
    time.sleep(5)
In the printed task list, the status of the task goes from READY to RUNNING and then COMPLETED. But after completion no file is created in Google Drive in my folder "GEEtest" (nor anywhere else).
What am I doing wrong?

I think the file is being generated and stored in the Google Drive of the 'Service Account' used by the Python API, not in the personal account that you normally use with the web Code Editor.

You can't pass a dictionary of arguments directly like that in Python. You need to pass it using the kwargs convention (do a web search for more info). Basically, you just need to prefix the task_config argument with a double asterisk, like this:
task = ee.batch.Export.image.toDrive(landsat, 'TEST_todrive', **task_config)
Then proceed as you have (I assume your use of task.config rather than task_config in the following line is a typo). Also note that you can query the task directly (e.g. with task.status()), which may give more information about when and why the task failed. This isn't well documented as far as I can tell, but you can read about it in the API code.
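For reference, here is a minimal sketch of how the corrected flow might look, polling task.status() directly rather than the task list. It assumes ee.Initialize() succeeds with suitable credentials, and it leaves 'description' out of the kwargs dict because the second positional argument already supplies it:
import ee
import time

ee.Initialize()  # assumes authentication is already configured

landsat = ee.Image('LANDSAT/LC08/C01/T1_TOA/LC08_123032_20140515').select(['B4', 'B3', 'B2'])
geometry = ee.Geometry.Rectangle([116.2621, 39.8412, 116.4849, 40.01236])

# 'description' is already given as the second positional argument below,
# so it is omitted here to avoid passing it twice.
task_config = {
    'scale': 30,
    'region': geometry,
    'folder': 'GEEtest'
}

task = ee.batch.Export.image.toDrive(landsat, 'TEST_todrive', **task_config)
task.start()

# status() reports the state (READY, RUNNING, COMPLETED, FAILED) and,
# on failure, an error message.
while task.status()['state'] in ('READY', 'RUNNING'):
    time.sleep(5)
print(task.status())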

Related

OpenAI GPT3 Search API not working locally

I am using the Python client for the GPT-3 search model on my own JSON Lines files. When I run the code in a Google Colab notebook for test purposes, it works fine and returns the search responses. But when I run the code on my local machine (Mac M1) as a web application (running on localhost) using Flask for the web service functionality, it gives the following error:
openai.error.InvalidRequestError: File is still processing. Check back later.
This error occurs even if I implement the exact same example as given in the OpenAI documentation. The link to the search example is given here.
It runs perfectly fine on the local machine and in the Colab notebook if I use the completion API that is used by the GPT-3 playground. (code link here)
The code that I have is as follows:
import openai
openai.api_key = API_KEY
file = openai.File.create(file=open(jsonFileName), purpose="search")
response = openai.Engine("davinci").search(
    search_model="davinci",
    query=query,
    max_rerank=5,
    file=file.id
)

for res in response.data:
    print(res.text)
Any idea why this strange behaviour occurs and how I can solve it? Thanks.
The problem was on this line:
file = openai.File.create(file=open(jsonFileName), purpose="search")
It returns immediately with a file ID and a status of uploaded, which makes it seem like the upload and file processing are complete. I then passed that file ID to the search API, but in reality the file had not finished processing, so the search API threw the error openai.error.InvalidRequestError: File is still processing. Check back later.
It worked in Google Colab because the openai.File.create call and the search call were in two different cells, which gave the file time to finish processing as I executed the cells one by one. When I put all of the same code in one cell, it gave me the same error there.
So I had to introduce a wait of 4-7 seconds (depending on the size of the data), with time.sleep(5) after the openai.File.create call and before the openai.Engine("davinci").search call, and that solved the issue. :)
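A minimal sketch of that workaround, using the same (legacy) client calls as above; the sleep duration is a rough guess and may need tuning for larger files:
import time

import openai

openai.api_key = API_KEY  # assumed to be defined elsewhere, as in the question

file = openai.File.create(file=open(jsonFileName), purpose="search")

# Give the backend a few seconds to finish processing the uploaded file
# before referencing it in the search call.
time.sleep(5)

response = openai.Engine("davinci").search(
    search_model="davinci",
    query=query,
    max_rerank=5,
    file=file.id
)

for res in response.data:
    print(res.text)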

Run Python code on Google Form submission or Google Sheet update

I've created a Google Form. When somebody fills it out and submits, I would like to run some Python code using the submission answers.
My current system has a Google Sheet tied to the Google Form, which is constantly updated with new answers. Then I have an infinite loop running in Python on my PC that pulls the sheet contents into a database and checks whether they have changed. If they have, it runs code with the new entry.
This process requires me to have an infinite loop running constantly on my PC, which isn't great.
Anyone got any suggestions?
This can be easily solved by making use of Cloud Functions!
Google Cloud Functions is a serverless environment for building and connecting cloud services. It is used to create single-purpose functions that respond to events without managing a server or a runtime; they are triggered whenever an event being watched fires.
Therefore, you should start by creating a Cloud Function in the Google Cloud Platform console, configuring it to your needs, and entering your Python code into the editor along with its dependencies.
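As a rough illustration only (the function name and the way the data is read are placeholders, not taken from the original setup), an HTTP-triggered Cloud Function in Python could look like this:
# main.py -- the entry point configured for the Cloud Function.
def handle_form_submission(request):
    """HTTP-triggered function; `request` is a flask.Request object."""
    # With the GET call shown in the Apps Script below there is no body;
    # with a POST you would read request.get_json() instead.
    params = request.args.to_dict()

    # ... run the Python code that previously lived in the local polling loop ...
    print(f"Received parameters: {params}")

    return "OK", 200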
Afterwards, you can easily trigger this function from Apps Script, by using the following function (assuming that triggerCloudFunction is the function that gets run whenever a form submission is sent):
function triggerCloudFunction() {
  const response = UrlFetchApp.fetch(
    'CLOUD_FUNCTION_TRIGGER_URL',
    {
      method: 'GET',
      muteHttpExceptions: true,
      headers: {
        'Authorization': `Bearer ${ScriptApp.getIdentityToken()}`
      }
    }
  );
  console.log(response.getContentText());
}
In order to make this work appropriately, you will also have to include the openid scope in the appsscript.json manifest file.
Reference
Cloud Functions.

Google Sheets Python Quickstart requires sign in regularly

I have set up a script which pulls data from a number of spreadsheets using the Google Sheets API in Python. All works perfectly. However, I noticed that it requires me to manually sign in to my Google account every now and then. More precisely, this happens after a custom clean of my PC's temporary files or a cleaning of my browser's cache data.
However, this gets in the way of running my script autonomously. On the other hand, I cannot and shouldn't let the cache and other temporary PC files pile up just because of this single script. Sure, I can manually approve it every time, but that defeats the point of automation, since I'd need to always be around.
My question is, how can I identify the file where the Python Quickstart application's data is stored and set up an exception to that folder so that Google Chrome, CCleaner and such applications do not remove the data stored within?
I use Windows 10.
Did you check https://developers.google.com/sheets/api/quickstart/python?
The quickstart uses pickle to store your credentials, so you should only need to confirm the login manually once:
The file token.pickle stores the user's access and refresh tokens, and is
created automatically when the authorization flow completes for the first
time.
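For orientation, the relevant part of that quickstart looks roughly like the sketch below. Pointing TOKEN_PATH at a folder that your cleanup tools ignore (the path here is only an example) should keep the saved credentials from being wiped:
import os
import os.path
import pickle

from google.auth.transport.requests import Request
from google_auth_oauthlib.flow import InstalledAppFlow

SCOPES = ['https://www.googleapis.com/auth/spreadsheets.readonly']
# Example path only: keep token.pickle somewhere excluded from CCleaner and
# browser-cache cleanups, rather than in the script's working directory.
TOKEN_PATH = 'C:/Users/<you>/Documents/sheets-script/token.pickle'

creds = None
if os.path.exists(TOKEN_PATH):
    with open(TOKEN_PATH, 'rb') as token:
        creds = pickle.load(token)

# If there are no (valid) credentials available, let the user log in once,
# then save the token for subsequent runs.
if not creds or not creds.valid:
    if creds and creds.expired and creds.refresh_token:
        creds.refresh(Request())
    else:
        flow = InstalledAppFlow.from_client_secrets_file('credentials.json', SCOPES)
        creds = flow.run_local_server(port=0)
    os.makedirs(os.path.dirname(TOKEN_PATH), exist_ok=True)
    with open(TOKEN_PATH, 'wb') as token:
        pickle.dump(creds, token)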

Why am I not able to pass a parameter to an AWS Batch job?

I have a Dockerfile with the following lines:
FROM python
COPY sysargs.py /
CMD python sysargs.py --date Ref::date
And my python file sysargs.py looks like this:
import sys
print('The command line arguments are:')
a = sys.argv[1]
print(a)
I just want to pass a date parameter and print the date, but after passing the date value I get "Ref::date" as the output.
Can someone tell me what I have done wrong?
I am trying to replicate what is described in how to retrieve aws batch parameter value in python?.
In the context of your Dockerfile, Ref::date is just a string, hence that is what the Python script prints as output.
If you want date to be a value passed in from an external source, then you can use ARG. Take 10 minutes to read through this guide.
My experience with AWS Batch "parameters"
I have been working on a project for about 4 months now. One of my tasks was to connect several AWS services together to process, in the cloud, the last file that an application had uploaded to an S3 bucket.
What I needed
The way this works is the following. Through a website, a user uploads a file that is sent to a back-end server and then to an S3 bucket. This event triggers an AWS Lambda function, which creates and runs an instance of an AWS Batch job that had already been defined (based on a Docker image); the job retrieves the file from the S3 bucket, processes it, and saves some results to a database. By the way, all of my code is written in Python.
Everything worked like a charm until I found it really hard to get, inside the Python script executed in the Docker container run by the AWS Batch job, the filename of the S3 object that generated the event as a parameter.
What I did
After a lot of research and development, I came up with a solution for my problem. The issue was that the word "parameter", for AWS Batch jobs, does not mean what a user might expect. Instead, we need to use containerOverrides, the way I show below: defining an environment variable inside the running container by providing a name/value pair for that variable.
# At some point we had defined aws_batch like this:
#
# aws_batch = boto3.client(
#     service_name="batch",
#     region_name='<OurRegion>',
#     aws_access_key_id='<AWS_ID>',
#     aws_secret_access_key='<AWS_KEY>',
# )
aws_batch.submit_job(
    jobName='TheJobNameYouWant',
    jobQueue='NameOfThePreviouslyDefinedQueue',
    jobDefinition='NameOfThePreviouslyDefinedJobDefinition',
    # parameters={              # THIS DOES NOT WORK
    #     'FILENAME': FILENAME  # THIS DOES NOT WORK
    # },                        # THIS DOES NOT WORK
    containerOverrides={
        'environment': [
            {
                'name': 'filename',
                'value': 'name_of_the_file.png'
            },
        ],
    },
)
This way, from my Python script inside the Docker container, I could access the environment variable's value using the well-known os.getenv('<ENV_VAR_NAME>') function.
You can also check on your AWS console, under the Batch menu, both the Job configuration and Container details tabs, to make sure everything makes sense. The container the job runs will never see the job parameters, but it will see the environment variables.
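Inside the container, the script can then read that value with a few lines like these (the variable name matches the containerOverrides example above):
import os

# 'filename' is the name set in containerOverrides['environment'] above.
filename = os.getenv('filename')
if filename is None:
    raise RuntimeError("The expected environment variable 'filename' was not set")

print(f'Processing file: {filename}')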
Final notes
I do not know if there is a better way to solve this; for now, I am sharing with the community something that does work.
I have tested it myself, and the main idea came from reading the links I list below:
AWSBatchJobs parameters (Use just as context info)
submit_job function AWS Docs (Ideal to learn about what kind of actions we are allowed to do or configure when creating a job)
I honestly hope this helps you and wish you a happy coding!

How does wit-ai connect with python files in my computer?

I just started with Wit-ai and I'm trying to make the weather forecast bot from the quickstart. The quickstart mentions that the bot's action (getForecast) should be a function defined in a Python file (.py) on my computer. However, I'm not sure how Wit-ai connects with the Python files on my computer. How does Wit-ai know which file to run when a function is called?
PS: I have downloaded the pywit examples and read through the code, but I still don't see how the Wit-ai platform finds the correct function in the correct file.
You specify your actions, keyed by the action names used in Wit, and pass them when constructing the client:
actions = {
    'send': send,
    'merge': merge,
    'select-joke': select_joke,
}
client = Wit(access_token=access_token, actions=actions)
Wit matches these keys against the action names used in the stories you have trained. You can check https://github.com/wit-ai/pywit/blob/master/examples/joke.py for the Python code and its corresponding Wit story at https://wit.ai/patapizza/example-joke
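For a rough idea of the shape of that example (the actions/converse API it relies on has since been deprecated in newer versions of Wit and pywit, so treat this only as a sketch of the old quickstart style):
from wit import Wit

def send(request, response):
    # Called whenever the bot has something to say back to the user.
    print(response['text'])

def merge(request):
    # Custom action: inspect entities and update the conversation context.
    context = request['context']
    return context

def select_joke(request):
    context = request['context']
    # ... pick a joke and store it in the context ...
    return context

actions = {
    'send': send,
    'merge': merge,
    'select-joke': select_joke,
}

client = Wit(access_token='YOUR_WIT_ACCESS_TOKEN', actions=actions)
client.interactive()  # chat with the bot from the terminal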
