I am getting this error, which my code raises deliberately, while preprocessing some data:
9:46:56.323 PM default_model Function execution took 6008 ms, finished with status: 'crash'
9:46:56.322 PM default_model Traceback (most recent call last):
File "/user_code/main.py", line 31, in default_model
train, endog, exog, _, _, rawDf = preprocess(ledger, apps)
File "/user_code/Wrangling.py", line 73, in preprocess
raise InsufficientTimespanError(args=(appDf, locDf))
That's occurring here:
async def default_model(request):
    request_json = request.get_json()
    if not request_json:
        return '{"error": "empty body." }'
    if 'transaction_id' in request_json:
        transaction_id = request_json['transaction_id']
    apps = []  # array of apps whose predictions we want, or empty for all
    if 'apps' in request_json:
        apps = request_json['apps']
    modelUrl = None
    if 'files' in request_json:
        try:
            files = request_json['files']
            modelUrl = getModelFromFiles(files)
        except:
            return package(transaction_id, error="no model to execute")
    else:
        return package(transaction_id, error="no model to execute")
    if 'ledger' in request_json:
        ledger = request_json['ledger']
        try:
            train, endog, exog, _, _, rawDf = preprocess(ledger, apps)
            # ...
        except InsufficientTimespanError as err:
            return package(transaction_id, error=err.message, appDf=err.args[0], locDf=err.args[1])
And preprocess is correctly throwing my custom error:
def preprocess(ledger, apps=[]):
    """
    convert ledger from the server, which comes in as an array of csv entries.
    normalize/resample timeseries, returning dataframes
    """
    appDf, locDf = splitLedger(ledger)
    if len(appDf) < 3 or len(locDf) < 3:
        raise InsufficientDataError(args=(appDf, locDf))
    endog = appDf['app_id'].unique().tolist()
    exog = locDf['location_id'].unique().tolist()
    rawDf = normalize(appDf, locDf)
    trainDf = cutoff(rawDf.copy(), apps)
    rawDf = cutoff(rawDf.copy(), apps, trim=False)
    # TODO - uncomment when on realish data
    if len(trainDf) < 2 * WEEKS:
        raise InsufficientTimespanError(args=(appDf, locDf))
The thing is, the call is in a try/except block precisely because I want to trap the error and return a payload describing it, rather than crashing with a 500 error. But it's crashing on my custom error anyway, inside the try block, right on the line that calls preprocess.
This must be a failure on my part to conform to proper Python, but I'm not sure what I am doing wrong. The environment is Python 3.7.
Here's where that error is defined, in Wrangling.py:
class WranglingError(Exception):
    """Base class for other exceptions"""
    pass


class InsufficientDataError(WranglingError):
    """insufficient data to make a prediction"""

    def __init__(self, message='insufficient data to make a prediction', args=None):
        super().__init__(message)
        self.message = message
        self.args = args


class InsufficientTimespanError(WranglingError):
    """insufficient timespan to make a prediction"""

    def __init__(self, message='insufficient timespan to make a prediction', args=None):
        super().__init__(message)
        self.message = message
        self.args = args
And here is how main.py declares (imports) it:
from Wrangling import preprocess, InsufficientDataError, InsufficientTimespanError, DataNotNormal, InappropriateValueToPredict
Your preprocess function is declared async. This means the code in it isn't actually run where you call preprocess; instead, it runs when the returned coroutine is eventually awaited or handed to an event loop (e.g. asyncio.run). Because the place where it actually runs is no longer inside the try block in default_model, the exception is not caught.
You could fix this in a few ways:
make preprocess not async, or
make default_model async too, and await preprocess.
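To illustrate the general behaviour described above, here is a minimal, self-contained sketch (hypothetical names, not the asker's code): calling an async function only creates a coroutine object, so an exception raised in its body escapes where the coroutine is actually run, not at the call site that a try block may be wrapping.

import asyncio

class CustomError(Exception):
    pass

async def boom():
    raise CustomError("raised inside the coroutine body")

def caller():
    try:
        coro = boom()  # nothing executes yet, so nothing is raised here
    except CustomError:
        print("never reached: the try block only wrapped coroutine creation")
        return
    asyncio.run(coro)  # the body of boom() runs (and raises) here, outside the try

caller()  # CustomError propagates from asyncio.run()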
Do the line numbers in the error match up with the line numbers in your code? If not, is it possible that you are seeing the error from a version of the code before you added the try...except?
I have inherited a luigi framework and I'm trying to debug some stuff and add features. The first script runs this command:
yield FinalizeData(environment=environment)
and then within a second script, we have:
#FinalizeData requires UploadData to have run
#UploadData requires ValidateData to have run
#ValidateData requires PullData to have run
#PullData is the first class that fires
I am trying to debug a few things that are happening in the ValidateData class, but in order to run it I need PullData to execute first, and PullData contains a SQL query that takes about an hour to run and ultimately generates a .PKL file. Because I already have this .PKL file, I would like to "skip" this piece and go directly to the second UploadData class. I am not sure how to do that, however.
Here is the first PullData class:
class PullData(luigi.Task):
    environment = luigi.Parameter(default='dev')

    def requires(self):
        resp = list()
        tests = {
            'dev': TestDevConnection(self.instance, role='user'),
        }
        resp.append(tests.get(self.environment))
        return resp

    def output(self):
        return luigi.LocalTarget(f'.out/{CUR_DATE}_{self.environment}_{self.instance}_Data.pkl')

    def run(self):
        try:
            mysql_dict = dict(environment=self.environment, role='user', instance=self.instance)
            conn = get_conn(self.environment, role='user', instance=self.instance)
        except Exception as e:
            log.error_job(mysql_dict, self.environment, self.instance + '_details', e)
            raise e
        sql = "select * from foo.bar"
        log.start_job(mysql_dict, self.environment, self.instance + '_details')
        try:
            data = pd.read_sql(sql, conn)
        except Exception as e:
            log.error_job(mysql_dict, self.environment, self.instance + '_details', e)
            raise e
        with open(self.output().path, 'wb') as out:
            data.to_pickle(out, compression=None)
Moving on to the Validate class:
class ValidateData(luigi.Task):
    environment = luigi.Parameter(default='dev')

    def requires(self):
        return {'data': PullData(self.environment, self.instance)}

    def output(self):
        return luigi.LocalTarget(f'.out/{CUR_DATE}_{self.environment}_{self.instance}_AGSCS_ValidateData.txt')

    def run(self):
        with open(self.input()['data'].path, 'rb') as base_data:
            data = pd.read_pickle(base_data, compression=None)
        try:
            assert len(data) > 0, "SQL Pull contains no data"
        except Exception as e:
            log.complete_job(dict(environment=self.environment, role='user', instance=self.instance), self.environment, self.instance + '_details', e)
            raise e

        #### HERE IS WHERE I AM LOOKING TO ADD ADDITIONAL VALIDATIONS ####

        with self.output().open('w') as out:
            summary = f"""Validation Completed Successfully with {len(data)} records."""
            out.write(summary)
Basically, I would like to know how to tell the ValidateData class that if the .PKL file that PullData generates is already there, it should not run PullData again and should just proceed with the validation (or, equivalently, tell PullData that if the .PKL file already exists, it should not attempt to re-pull the data; either works for me).
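For reference, a minimal sketch of how Luigi decides whether to run a task at all: the default Task.complete() just checks whether every target returned by output() already exists, and tasks that are complete are skipped. The class and path below are a simplified, hypothetical stand-in, not the real PullData:

import luigi

class PullData(luigi.Task):
    def output(self):
        # Luigi's default complete() returns True when this target exists,
        # and complete tasks (including required ones) are not re-run.
        return luigi.LocalTarget('.out/existing_data.pkl')  # hypothetical path

    def run(self):
        ...  # only executes when the target above is missing

So, assuming the default complete() behaviour, one low-tech option is to make the existing .PKL file match what output() returns (for example by copying or renaming it to the date-stamped path the real PullData expects); Luigi will then treat PullData as done and move straight on to ValidateData.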
I'm using OpenAPI to define my API. I just added this security scheme:
securitySchemes:
  api_key:
    type: apiKey
    name: X-Auth
    in: header
    x-apikeyInfoFunc: apikey_auth
where apikey_auth is defined like this
def apikey_auth(token, required_scopes):
    decrypted_token = None
    try:
        decrypted_token = mydecrypter.decrypt(token)
    except InvalidToken:
        raise OAuthProblem('Invalid token')
    return {'decrypted_token': decrypted_token}
Now I'd like to use this authentication for my actual endpoints, which are defined in the OpenAPI spec like this:
/myendpoint:
  get:
    operationId: operation
    #more stuff
    security:
      - api_key: []
When I call myendpoint now, the authentication runs and works as expected. What I would like is for the return value of apikey_auth to be passed into the call of operation, so I can access decrypted_token in operation like this:
def operation(decrypted_token):
    data = get_data_for_token(decrypted_token)
    return data
Does anyone have an idea whether this is possible somehow, without adding an extra parameter to the endpoint definition?
Solved. For whatever reason, it works with these changes:
def apikey_auth(token, required_scopes):
    decrypted_token = None
    try:
        decrypted_token = mydecrypter.decrypt(token)
    except InvalidToken:
        raise OAuthProblem('Invalid token')
    return {'sub': decrypted_token}

def operation(user):
    data = get_data_for_token(user)
    return data
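If this spec is served with Connexion (an assumption; the x-apikeyInfoFunc vendor extension suggests it), the likely reason the change works is that Connexion treats the dict returned by the security function as the token info, maps its 'sub' (or 'uid') value to the handler's user argument, and will also pass the whole dict as token_info when the handler's signature asks for it. A small sketch under that assumption:

def apikey_auth(token, required_scopes):
    decrypted_token = mydecrypter.decrypt(token)  # error handling omitted for brevity
    # 'sub' is the key Connexion maps to the handler's `user` parameter
    return {'sub': decrypted_token}

def operation(user, token_info):
    # user == token_info['sub'] == the decrypted token returned above
    return get_data_for_token(user)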
@admin.register(Book)
class BookAdmin(ImportExportActionModelAdmin):
    resource_class = BookResource

    def get_import_form(self):
        return CustomImportForm

    def get_resource_kwargs(self, request, *args, **kwargs):
        rk = super().get_resource_kwargs(request, *args, **kwargs)
        rk['input_author'] = None
        if request.POST:
            author = request.POST.get('input_author', None)
            if author:
                request.session['input_author'] = author
            else:
                try:
                    author = request.session['input_author']
                except KeyError as e:
                    raise Exception("Context failure on row import" + {e})
            rk['input_author'] = author
        return rk
I have this code in a Django admin page, but I'm getting an error during the export. Can anyone let me know where the issue is?
Your issue is on this line:
raise Exception("Context failure on row import" + {e})
The '{e}' creates a set containing the exception and then tries to concatenate that set to the message string, which fails. You can get rid of the error by formatting the exception into the string instead, for example with str(e) or an f-string.
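For example, the corrected line might look like this (same message, with the exception formatted into the string):

raise Exception(f"Context failure on row import: {e}")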
I am working on a Flask project and I am using marshmallow to validate user input.
Below is a code snippet:
def create_user():
    in_data = request.get_json()
    data, errors = Userschema.load(in_data)
    if errors:
        return (errors), 400
    fname = data.get('fname')
    lname = data.get('lname')
    email = data.get('email')
    password = data.get('password')
    cpass = data.get('cpass')
When I eliminate the errors part, the code works perfectly. When I run it as it is, I get the following error:
builtins.ValueError
ValueError: too many values to unpack (expected 2)

Traceback (most recent call last)
  File "/home/..project-details.../venv3/lib/python3.6/site-packages/flask/app.py", line 2000, in __call__
    return self.wsgi_app(environ, start_response)
Note: The var in_data is a dict.
Any ideas??
I recommend you check your dependency versions.
Per the Marshmallow API reference, schema.load returns:
Changed in version 3.0.0b7: This method returns the deserialized data rather than a (data, errors) duple. A ValidationError is raised if invalid data are passed.
I suspect Python is trying to unpack the returned dict (a single object) into two variables. Iterating a dict yields its keys, so when the deserialized payload has more than two keys you get "too many values to unpack (expected 2)". The below reproduces the error:

d = {'fname': 'a', 'lname': 'b', 'email': 'c'}
a, b = d  # ValueError: too many values to unpack (expected 2)
print("%s : %s" % (a, b))  # never reached
According to the documentation for the most recent version (3.17.1), the way to handle validation errors is as follows:

from marshmallow import ValidationError

try:
    result = UserSchema().load({"name": "John", "email": "foo"})
except ValidationError as err:
    print(err.messages)  # => {"email": ['"foo" is not a valid email address.']}
    print(err.valid_data)  # => {"name": "John"}
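Applying that pattern to the create_user view from the question, a rough sketch (assuming marshmallow 3.x, Flask's request and jsonify, and that Userschema is a schema instance as in the original snippet) could look like:

from flask import request, jsonify
from marshmallow import ValidationError

def create_user():
    in_data = request.get_json()
    try:
        data = Userschema.load(in_data)  # marshmallow 3 returns only the data
    except ValidationError as err:
        return jsonify(err.messages), 400
    fname = data.get('fname')
    lname = data.get('lname')
    # ... and so on, as in the original code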
I am trying to create a client used mainly to test out the responses of a server asynchronously. I have created a function that basically waits for the next response from the server; if a requestId is provided when this function is called, it will look for the next response with that requestId. Here is the function:
def getNextResponse(self, requestId = None):
    logger = logging.getLogger(__name__)
    self.acknowledge += 1
    logger.info("requestId ack for this response: {}".format(requestId))
    while(not self.response):
        pass
    self.acknowledge -= 1
    logger.info("requestId unset for this response: {}".format(requestId))
    message = json.loads(self.messagesList[len(self.messagesList)-1])
    if(requestId != None):
        while(requestId != message['body']['requestId']):
            self.acknowledge += 1
            while(not self.response):
                pass
            self.acknowledge -= 1
            message = self.messagesList[len(self.messagesList)-1]
    self.startMonitor -= 1
    return message['body']
I also have helper functions for each command that can be sent to the engine; below is one such helper function, for a ping command:
def ping(self, sessionId = None, requestId = None, version="1.0"):
    result = {
        "method": "Ping"
    }
    if(None != version):
        result['version'] = version
    if(None != sessionId):
        result['sessionId'] = sessionId
    if(None != requestId):
        result['requestId'] = requestId
    logger = logging.getLogger(__name__)
    logger.info("Message Sent: " + json.dumps(result, indent=4))
    self.startMonitor += 1
    self.ws.send(json.dumps(result))
    message = self.getNextResponse(requestId = requestId)
    return message
It basically sets up a JSON object containing all the parameters the server expects and then sends the entire JSON message to the server. After it has been sent, I call getNextResponse to await a response from the server. The requestId is None by default, so if no requestId is provided it will just look for the very next response returned by the server. Since this can be quite inconsistent because of other asynchronous commands, one can provide a unique requestId for the command so that the response from the server will also contain this requestId, thus making each response unique.
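To make that correlation concrete, here is a hypothetical exchange (the field values are invented for illustration; only the shape matches the code above, which sends the request dict and reads message['body']['requestId'] from each parsed response):

Request sent by ping:
{"method": "Ping", "version": "1.0", "requestId": "12345678-97a2-11e6-9346-fde5d2234523"}

Matching response, arriving as a JSON string in self.messagesList:
{"body": {"requestId": "12345678-97a2-11e6-9346-fde5d2234523", "status": "ok"}}

getNextResponse keeps waiting until message['body']['requestId'] equals the requestId it was given, then returns message['body'].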
In my test case I am generating a random requestId by using:
def genRequestId(self):
    x = random.randint(10000000, 99999999)
    print x
    reqId = str(x) + "-97a2-11e6-9346-fde5d2234523"
    return reqId
The problem I encountered is that sometimes (seemingly at random), when I call ping in one of my test cases, I get this error:
message = self.getNextResponse(requestId = requestId)
TypeError: string indices must be integers, not str
I am quite confused by this error; the requestId I am generating inside ping is supposed to be a string, and I am not indexing into it in any way. I have tried removing the keyword argument like so:
message = self.getNextResponse(requestId)
But I am still getting this error. The traceback doesn't go any deeper than the getNextResponse call, which leads me to believe the error is coming from inside the ping function when I call it. Any help would be greatly appreciated!
EDIT: Here is the error
Traceback (most recent call last):
  File "ctetest./RegressionTest/WebsocketTest\test_18_RecTimer.py", line 385, in test009_recTimer_Start_withAudio
    response = client.endSession(sessionId = sessionId, requestId = requestId_2)
  File "ctetest./RegressionTest/WebsocketTest../.././CTESetupClass\WebsocketClient.py", line 528, in endSession
    message = self.getNextResponse(requestId)
  File "ctetest./RegressionTest/WebsocketTest../.././CTESetupClass\WebsocketClient.py", line 49, in wrapper
    raise ret
TypeError: string indices must be integers, not str
You have two statements in your code that look very similar:
message = json.loads(self.messagesList[len(self.messagesList)-1])
and then further down:
message = self.messagesList[len(self.messagesList)-1]
The first sets message to a parsed JSON object (probably a dict), whereas the second assigns message the raw string. I'm assuming this is not intended, and it is the cause of your error: on the next pass through the loop, message['body'] indexes into a string, which raises "string indices must be integers".
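A minimal sketch of the fix along those lines, parsing the latest entry in the inner loop exactly like the first assignment does (attribute names taken from the asker's getNextResponse):

if(requestId != None):
    while(requestId != message['body']['requestId']):
        self.acknowledge += 1
        while(not self.response):
            pass
        self.acknowledge -= 1
        # parse the JSON text, just like the first assignment
        message = json.loads(self.messagesList[len(self.messagesList)-1])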