OpenAI GPT3 Search API not working locally - python

I am using the Python client for the GPT-3 search model with my own JSONLines files. When I run the code in a Google Colab notebook for test purposes, it works fine and returns the search responses. But when I run the code on my local machine (Mac M1) as a web application (running on localhost) using Flask for the web service functionality, it gives the following error:
openai.error.InvalidRequestError: File is still processing. Check back later.
This error occurs even if I implement the exact same example given in the OpenAI documentation. The link to the search example is given here.
It runs perfectly fine on my local machine and in the Colab notebook if I use the Completion API that the GPT-3 playground uses. (code link here)
The code that I have is as follows:
import openai

openai.api_key = API_KEY

file = openai.File.create(file=open(jsonFileName), purpose="search")

response = openai.Engine("davinci").search(
    search_model="davinci",
    query=query,
    max_rerank=5,
    file=file.id
)

for res in response.data:
    print(res.text)
Any idea why this strange behaviour is occurring and how I can solve it? Thanks.

The problem was on this line:
file = openai.File.create(file=open(jsonFileName), purpose="search")
The call returns immediately with a file ID and a status of uploaded, which makes it look as though the upload and file processing are complete (the returned file object is misleading in this respect). I then passed that file ID to the search API, but in reality the file had not finished processing, so the search API threw the error openai.error.InvalidRequestError: File is still processing. Check back later.
It worked in Google Colab because the openai.File.create call and the search call were in two different cells, which gave the file time to finish processing as I executed the cells one by one. If I put all of the same code in a single cell, it gave me the same error there.
So I had to introduce a wait of roughly 4-7 seconds (depending on the size of your data), e.g. time.sleep(5), after the openai.File.create call and before the openai.Engine("davinci").search call, and that solved the issue. :)
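For reference, a minimal sketch of that workaround, assuming the older openai 0.x Python client (API_KEY, jsonFileName, and query are the placeholders from the question). Instead of a fixed sleep you could also poll openai.File.retrieve until the file reports a processed status, which is an assumption on my part rather than something from the original answer:

import time
import openai

openai.api_key = API_KEY  # placeholder from the question

# Upload the JSONLines file for search
uploaded = openai.File.create(file=open(jsonFileName), purpose="search")

# Option 1: fixed wait (what solved it in the answer above)
time.sleep(5)

# Option 2 (assumption): poll until the file reports "processed" instead of sleeping blindly
while openai.File.retrieve(uploaded.id).status != "processed":
    time.sleep(1)

response = openai.Engine("davinci").search(
    search_model="davinci",
    query=query,
    max_rerank=5,
    file=uploaded.id
)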

Related

Why doesn't spotipy come to an end when using the search function?

I recently started using the Spotify API wrapper spotipy in Python (with the Spyder 4.1.4 IDE and Anaconda) to get song IDs from Spotify given a song name and artist. For that I use the search function defined in the API as follows:
import api_config
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
sp = spotipy.Spotify(
    auth_manager=SpotifyClientCredentials(
        client_id=api_config.APP_CLIENT_ID,
        client_secret=api_config.APP_CLIENT_SECRET
    )
)
query = "foo"
res = sp.search(q=query, limit=1)
When executing this script, it should send a GET request to Spotify as if I had typed the query into the search bar, but unfortunately it seems to get stuck: it never finishes and never returns the search result. Furthermore, when I try to cancel the execution, the script keeps running and I am not able to execute the same or another script without restarting the kernel. There is also no error message telling me what could be causing the problem.
I already tried limiting the query to a single word instead of the whole song title and artist, to rule out a malformed query string.
To make sure there is no problem with the network connection, the spotipy installation, or Spotify authentication, I tried several other spotipy calls, and they all worked fine; only the search function hangs.
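One way to narrow this down (a sketch of my own, not something from the original question): the spotipy client accepts a requests_timeout argument and a retries count, so bounding the underlying HTTP request should at least raise an exception instead of hanging indefinitely. The timeout value below is illustrative:

import api_config
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# requests_timeout bounds each HTTP request; retries=0 disables automatic retries,
# so a stalled request surfaces quickly as an exception.
sp = spotipy.Spotify(
    auth_manager=SpotifyClientCredentials(
        client_id=api_config.APP_CLIENT_ID,
        client_secret=api_config.APP_CLIENT_SECRET
    ),
    requests_timeout=10,
    retries=0
)

res = sp.search(q="foo", limit=1)
print(res["tracks"]["items"][0]["id"])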

Google Cloud Function succeeds, but not showing expected output

I am testing out a Cloud Function and I have things set up, but the output is not appearing as expected (nothing is being saved into Cloud Storage and my print statements are not showing up). My code and requirements are below. I have set up the Cloud Function with an HTTP request trigger type, unauthenticated invocations, and a runtime service account that has write access to Cloud Storage. I have verified that I am calling the correct entry point.
logs
2022-03-22T18:52:02.749482564Z test-example vczj9p85h5m2 Function execution started
2022-03-22T18:52:04.148507183Z test-example vczj9p85h5m2 Function execution took 1399 ms. Finished with status code: 200
main.py
import requests
from google.cloud import storage
import json

def upload_to_gsc(data):
    print("saving to cloud storage")
    client = storage.Client(project="my-project-id")
    bucket = client.bucket("my-bucket-name")
    blob = bucket.blob("subfolder/name_of_file")
    blob.upload_from_string(data)
    print("data uploaded to cloud storage")

def get_pokemon(request):
    url = "https://pokeapi.co/api/v2/pokemon?limit=100&offset=200"
    data = requests.get(url).json()
    output = [i.get("name") for i in data["results"]]
    data = json.dumps(output)
    upload_to_gsc(data=data)
    print("saved data!")
requirements.txt
google-cloud-storage
requests==2.26.0
As #JackWotherspoon mentioned, be sure to double-check your project ID, bucket name, and entry point if you have a case like mine. In my case, I recreated the Cloud Function, tested it, and it worked again.
As #dko512 mentioned in comments, issue was resolved by recreating and redeploying the Cloud Function.
Posting the answer as community wiki for the benefit of the community that might encounter this use case in the future.
Feel free to edit this answer for additional information.
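As an extra sanity check (not part of the original answer; the project ID, bucket, and object path below are the placeholders from the question), you can verify from a local script whether the object was actually written to the bucket after redeploying:

from google.cloud import storage

client = storage.Client(project="my-project-id")   # placeholder project ID
bucket = client.bucket("my-bucket-name")           # placeholder bucket name
blob = bucket.blob("subfolder/name_of_file")       # placeholder object path

# exists() issues a GET against the object; True means the upload really happened.
print("object exists:", blob.exists())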

Fast API call out of system resource error

I have built an application for some processing, and its code has been wrapped with FastAPI so that other people can use the functionality. The API accepts only 2 inputs and returns a single JSON output. The code works fine until about 500 hits to the API, after which the following error starts to show up in the log file:
FastAPI version == 0.68.1
Code structure:

GEE Python API: Export image to Google Drive fails

Using the GEE Python API in an application running on App Engine (on localhost), I am trying to export an image to a file in Google Drive. The task seems to start and complete successfully, but no file is created in Google Drive.
I have tried executing the equivalent JavaScript code in the GEE code editor, and that works: the file is created in Google Drive.
In Python, I have tried various ways to start the task, but it always gives me the same result: the task completes but no file is created.
My Python code is as follows:
import time
import ee
# ee.Initialize() is assumed to have been called earlier in the application.

landsat = ee.Image('LANDSAT/LC08/C01/T1_TOA/LC08_123032_20140515').select(['B4', 'B3', 'B2'])
geometry = ee.Geometry.Rectangle([116.2621, 39.8412, 116.4849, 40.01236])

task_config = {
    'description': 'TEST_todrive_desc',
    'scale': 30,
    'region': geometry,
    'folder': 'GEEtest'
}

task = ee.batch.Export.image.toDrive(landsat, 'TEST_todrive', task_config)
ee.batch.data.startProcessing(task.id, task.config)
# Note: I also tried task.start() instead of this last line but the problem is the same, task completed, no file created.

# Printing the task list successively
for i in range(10):
    tasks = ee.batch.Task.list()
    print(tasks)
    time.sleep(5)
In the printed task list, the status of the task goes from READY to RUNNING and then COMPLETED. But after completion no file is created in Google Drive in my folder "GEEtest" (nor anywhere else).
What am I doing wrong?
I think the file is being generated and stored in the Google Drive of the 'Service Account' used by the Python API, not in your private account that is normally used with the web code editor.
You can't pass a dictionary of arguments directly like that in Python. You need to unpack it using the kwargs convention (do a web search for more info). Basically, you just need to prefix the task_config argument with a double asterisk like this:
task = ee.batch.Export.image.toDrive(landsat, 'TEST_todrive', **task_config)
Then proceed as you have (I assume your use of task.config rather than task_config in the following line is a typo). Also note that you can query the task directly (using e.g. task.status()), and it may give more information about when and why the task failed. This isn't well documented as far as I can tell, but you can read about it in the API code.
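For instance, a small polling loop along these lines (a sketch, reusing landsat and task_config from the question) prints the task's state and any error message instead of dumping the whole task list:

import time

task = ee.batch.Export.image.toDrive(landsat, 'TEST_todrive', **task_config)
task.start()

# Poll the task until Earth Engine reports a terminal state.
while True:
    status = task.status()
    print(status['state'])
    if status['state'] in ('COMPLETED', 'FAILED', 'CANCELLED'):
        # On failure, 'error_message' explains why the export produced no file.
        print(status.get('error_message', 'no error message'))
        break
    time.sleep(5)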

Use Google Cloud Scheduler to get a Pandas Data Frame loaded into Google Big Query?

I'm following the documentation from the link below.
https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_gbq.html#pandas.DataFrame.to_gbq
Everything is set up fine in my data frame. Now, I'm trying to export it to GBQ. Here's a one-liner that should pretty much work... but it doesn't...
pandas_gbq.to_gbq(my_df, 'table_in_gbq', 'my_project_id', chunksize=None, reauth=False, if_exists='append', private_key=False, auth_local_webserver=True, table_schema=None, location=None, progress_bar=True, verbose=None)
I'm having a lot of trouble getting Cloud Scheduler to run the job successfully. The scheduler runs, and I see a message saying the 'Result' was a 'Success', but no data is actually loaded into BigQuery. When I run the job on the client side, everything is fine. When I run it from the server, no data gets loaded. I'm guessing the credentials are throwing it off, but I can't tell for sure what's going on. All I know for sure is that Google says the job runs successfully, but no data is loaded into my table.
My question is, how can I modify this to run using Google Cloud Scheduler, with minimal security, so I can rule out some kind of security issue? Or, otherwise determine exactly what's going on here?
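One thing worth trying (an assumption on my part, not something from the original post): pass explicit service-account credentials to pandas-gbq so the server-side run authenticates the same way every time. The key file path and dataset name below are hypothetical placeholders, and note that the destination must be given as dataset.tablename:

import pandas_gbq
from google.oauth2 import service_account

# Hypothetical key file for a service account with BigQuery write access.
credentials = service_account.Credentials.from_service_account_file(
    "service_account_key.json"
)

pandas_gbq.to_gbq(
    my_df,
    "my_dataset.table_in_gbq",   # destination in dataset.tablename form
    project_id="my_project_id",
    if_exists="append",
    credentials=credentials
)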
