How to know when someone returns a HIT? - python

For an ExternalQuestion, when a worker views a HIT in preview mode, the URL that is requested is something like:
/mturk?assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE&hitId=3FSEU3P2NR0J4ISYGCVR597YQFLRRR
Then, when the worker accepts the HIT, the assignmentId is updated and a workerId is added:
/mturk/?assignmentId=384PI804XS1ASN65RQHJZ77QLSES0H&hitId=3B9XR6P1WEVFQNSWCA0S33G3YCPBJ7&workerId=A1D23ERS0X4J9D&turkSubmitTo=https%3A%2F%2Fworkersandbox.mturk.com
Is there a way to know if a HIT is returned and not finished? I tried emulating this behavior as a worker, and no request was sent to my URL. How would I tell, then?

This was recently asked on the AWS Developer Forum. I'll copy a modified version of my answer from there:
You can use the Notifications API to trigger a notification every time a worker accepts an assignment. You could then catalog these notifications and compare them to the set of actual responses.
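The comparison described above amounts to a set difference. The following is a minimal sketch, assuming each notification has already been parsed into a dict with the hypothetical keys `event_type` and `assignment_id`; the actual payload format and the delivery mechanism are not shown in the original answer, and the event type strings used here should be checked against the Notifications API documentation.

```python
def unsubmitted_assignments(events):
    """Return IDs of assignments that were accepted but never submitted.

    `events` is an iterable of dicts with the hypothetical keys
    "event_type" and "assignment_id" (an assumed, pre-parsed format).
    """
    accepted = set()
    submitted = set()
    for event in events:
        if event["event_type"] == "AssignmentAccepted":
            accepted.add(event["assignment_id"])
        elif event["event_type"] == "AssignmentSubmitted":
            submitted.add(event["assignment_id"])
    # Anything accepted but not submitted was returned, abandoned,
    # or is still in progress.
    return accepted - submitted
```

Note that an assignment a worker is still working on also shows up in this set, so the check is only meaningful once the assignment duration has elapsed.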
If you are hosting your HIT on your own server, you could configure your server to log every view of a HIT. A preview would be logged with an ASSIGNMENT_ID_NOT_AVAILABLE value for the assignmentId, while an accepted assignment that is later returned would register a workerId and an assignmentId that was never submitted to MTurk. For HITs hosted by AWS (e.g., those created via the requester user interface, or set up as QuestionForm or HTMLQuestion HITs via the API), this option is not available to you.
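As a rough illustration of the logging approach, the sketch below classifies an incoming request URL as a preview or an accepted assignment, using the query parameters shown in the question. The function name `classify_view` and the shape of the returned dict are assumptions made for this example; where the result gets stored is left open.

```python
from urllib.parse import urlparse, parse_qs

# Placeholder value MTurk sends for assignmentId while a HIT is only previewed
# (taken from the example URL in the question).
PREVIEW_ASSIGNMENT_ID = "ASSIGNMENT_ID_NOT_AVAILABLE"


def classify_view(url):
    """Extract the MTurk query parameters and label the view."""
    params = parse_qs(urlparse(url).query)

    def first(name):
        # parse_qs returns a list per key; take the first value or None.
        return params.get(name, [None])[0]

    assignment_id = first("assignmentId")
    is_preview = assignment_id is None or assignment_id == PREVIEW_ASSIGNMENT_ID
    return {
        "status": "preview" if is_preview else "accepted",
        "hit_id": first("hitId"),
        "assignment_id": None if is_preview else assignment_id,
        "worker_id": first("workerId"),
    }
```

Logging every "accepted" record and later comparing the logged assignmentId values against the assignments actually submitted to MTurk would reveal the ones that were returned or abandoned.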

Related

How to handle high response time

There are two different services. One service (Django) receives the request from the front end and then calls an API in the other service (Flask).
But the response time of the Flask service is high, and if the user navigates to another page, that request is canceled.
Should this be a background task or a pub/sub pattern? If so, how can I run it in the background and then tell the user "here is your last result"?
You have two main options:
Make an initial request to a "simple" Django view, which loads a skeleton HTML page with a spinner. Some JS on that page then triggers an XHR request to a second Django view, which makes the call to the other service (Flask). This way you can properly alert your user that loading takes time and handle the exit on the browser side (ask for confirmation before leaving, abort the request, and so on).
If possible, cache the result of the Flask service, so you don't need to call it on each page load.
You can combine those two solutions by calling the service in an asynchronous request and caching its result (depending on context, you may need to customize the cache, for example per logged-in user).
The first solution can also be adapted to use pub/sub, websockets, or similar, but a classic XHR seems fine for your case.
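The caching idea can be sketched with a small time-based cache keyed per user. This is only an illustration under assumptions: the class `TTLCache`, its `get_or_compute` method, and the key layout are hypothetical names for this example, and in a real Django project the built-in cache framework would normally fill this role.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._entries = {}           # key -> (stored_at, value)

    def get_or_compute(self, key, compute):
        """Return the cached value for `key`, calling `compute()` on a miss."""
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        value = compute()            # the slow call, e.g. the Flask service
        self._entries[key] = (now, value)
        return value
```

A view could then call something like `cache.get_or_compute((user_id, "report"), call_flask_service)`, where `call_flask_service` stands in for whatever function performs the slow request.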
On our project, we have a couple of time-expensive endpoints. Our solution was similar to a previous answer:
Once we receive a request, we call a Celery task that does the expensive work asynchronously. We do not wait for its result and return a quick response to the user. The Celery task sends its progress and results to the user via WebSockets, and the frontend handles these WS messages. The benefit of this approach is that the work does not consume CPU on our backend; it consumes CPU on the Celery worker, which runs on another machine.

How to get Cognito Identity Id in Post Confirmation Lambda Trigger in Python using Amplify React?

I'm working on a ReactJS project and using Amplify for Signup/Signin. On signup, I have a post confirmation lambda trigger in Python that stores the user information (username, cognito id, etc.) in an on-prem database. I would like to also store the identity id, but I can't seem to find it in the event or context variable. I can find the identity id by calling Auth.currentCredentials() in React after the user has signed in, but would like to get this information during the signup process.
Any help on this would be appreciated. Thank you.
I had this same issue, and found that it is indeed not available in the auth trigger because the user has to authenticate to retrieve it, as you said. There is also not a way (that I could find) to grab this information using the AWS admin SDK.
I resorted to running a small check after the user logs into the app and doing a call to save the identityId where I needed it. The purpose was to allow other users to access the user's media after logging in, by using the user's own identityId with amplify to pull a profile picture.
Hope this helps.
Yes, the client app can get the identityId via Auth.currentCredentials().identityId, but that is not secure, because anybody can modify the code in the client app. If you rely on the client app as your source of truth for identityId, anybody can set identityId to that of another user, for example, and then log in as them.
One way to get the identityId in the post confirmation trigger lambda function is to call an API hosted on Amazon's API Gateway. The post confirmation lambda calls the API, sending the newly confirmed user's credentials. The API has a lambda behind it, and the code of that lambda has access to the identityId of whoever called the API, in a variable tied to the incoming request 'req', namely:
req.apiGateway.event.requestContext.identity.cognitoIdentityId
So one secure way to get identityId in the post confirmation lambda function would be to call an API and ask the API to return the identityId in that variable -- all done on the server.
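Since the question asks about Python, here is a minimal sketch of what the lambda behind the API might look like. It assumes a Lambda proxy integration in which the caller's identity arrives under `event["requestContext"]["identity"]["cognitoIdentityId"]`, the Python counterpart of the `req.apiGateway.event...` path above; the response shape and the 403 fallback are choices made for this example, not something the original answer specifies.

```python
import json


def lambda_handler(event, context):
    """Return the Cognito identity ID of whoever called the API."""
    # Tolerate missing or null sections so a malformed event yields a
    # clean 403 instead of an unhandled exception.
    request_context = event.get("requestContext") or {}
    identity = request_context.get("identity") or {}
    identity_id = identity.get("cognitoIdentityId")

    if identity_id is None:
        return {
            "statusCode": 403,
            "body": json.dumps({"error": "no Cognito identity on request"}),
        }
    return {
        "statusCode": 200,
        "body": json.dumps({"identityId": identity_id}),
    }
```

The post confirmation lambda would call this endpoint with the new user's credentials and read `identityId` from the JSON body.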
However, please note that there is currently an open issue about the post confirmation lambda not receiving permission from the API (403 errors) - this only happens when you set up all your stuff via Amplify, as opposed to a manual setup. If you use Amplify to set up all your lambda functions you would have to wait for this issue to be resolved: https://github.com/aws-amplify/amplify-cli/issues/6589 before you try the strategy described here to get identityId in the post confirmation trigger lambda function.
You can save the identityId in a custom attribute.

What’s the correct way to run a long-running task in Django whilst returning a page to the user immediately?

I’m writing a tiny Django website that’s going to provide users with a way to delete all their contacts on Flickr.
It’s mainly an exercise to learn about Selenium, rather than something actually useful — because the Flickr API doesn’t provide a way to delete contacts, I’m using Selenium to make an actual web browser do the actual deleting of contacts.
Because this might take a while, I’d like to present the user with a message saying that the deleting is being done, and then notify them when it’s finished.
In Django, what’s the correct way to return a web page to the user immediately, whilst performing a task on the server that continues after the page is returned?
Would my Django view function use the Python threading module to make the deleting code run in another thread whilst it returns a page to the user?
Consider using a task queue. One of the most popular solutions in the Django community is Celery with RabbitMQ.
When I needed this, I set up another Python process that communicated with Django via xmlrpc. This other process took care of the long requests and could report the status of each one. The Django views called that process (via xmlrpc) to queue jobs and query job status. I wrote a couple of proper JSON views in Django to query the xmlrpc process, and updated the HTML page using asynchronous JavaScript calls to those views (aka Ajax).
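The "queue a job, then poll its status" pattern from that answer can be sketched with the standard library alone. This is not the xmlrpc setup described above: it swaps in an in-process thread pool purely for illustration, and `JobRegistry` with its methods is a hypothetical name. In-process threads disappear when the web worker restarts, which is one reason the answers here suggest a separate process or Celery.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class JobRegistry:
    """Runs jobs in background threads and reports their status by ID."""

    def __init__(self, max_workers=2):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._futures = {}  # job_id -> Future

    def submit(self, func, *args):
        """Queue `func(*args)` and return a job ID immediately."""
        job_id = uuid.uuid4().hex
        self._futures[job_id] = self._executor.submit(func, *args)
        return job_id

    def status(self, job_id):
        """Return a JSON-friendly dict describing the job's state."""
        future = self._futures.get(job_id)
        if future is None:
            return {"state": "unknown"}
        if not future.done():
            return {"state": "running"}
        error = future.exception()
        if error is not None:
            return {"state": "failed", "error": str(error)}
        return {"state": "done", "result": future.result()}

    def shutdown(self):
        """Wait for queued jobs to finish and release the threads."""
        self._executor.shutdown(wait=True)
```

One view would call `submit` and return the job ID with the page; a second JSON view would return `status(job_id)` for the page's JavaScript to poll.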

Django - Consuming a RESTful service asynchronously

I need to create a Django web portal in which users can select and run ad-hoc reports by providing values, via forms, for parameters defined in each specific report. The view that processes the user’s report execution requests needs to make RESTful service calls to a remote Jasper Reports Server where the actual output is generated.
I have already written the client to make the RESTful service calls to the remote server. Depending on how large the report is the service calls can take several minutes.
What is the best method for making the service call after the user’s form has been validated, so that the call is processed asynchronously (in the background) and the user can continue to use the web portal while their report is being generated?
Do I need to make an AJAX call when the parameters form is submitted or should I start a new thread for the RESTful client in the view after the form has validated? Or something else?
django-celery is a popular choice for async tasks. I usually use greenlets, as I'm used to them.
Then to notify the user you can use the notification framework to tell the client that something is done.

Securing RESTapi in flask

The app I'm developing uses a lot of ajax calls. Unfortunately, I hit a snag when researching how to restrict access to the API. For example:
I have a table that makes an ajax call to http://site/api/tasks/bob. I need to make sure that only bob, logged in, can read that table (otherwise somebody who knows the pattern might request to see bob's tasks by simply entering the URL in the browser).
On a different page, the same table needs to be able to call http://site/api/tasks/all and show the tasks of all users (only an admin should be able to do that).
Thank you for your time reading this and maybe answering it.
The thousand-foot view is you need to authenticate the user either with:
A) HTTP-Auth (either basic or digest) on each request.
B) Server-side sessions. (The user authenticates and receives a session key. Their user information is stored in the session backend on the server, attached to that key. Once they have a session, they can make requests passing their session key back to you, either in the URL or in a cookie, and the information they have access to is returned to them.)
Flask has a pair of useful extensions that deal with a large part of this sort of thing - check out Flask-Login and Flask-Principal to see examples of how authorization can be added to a Flask application.
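Once the user is authenticated by either method, the authorization rule from the question reduces to a small check. The sketch below is framework-agnostic and rests on assumptions: the `User` class, the `is_admin` flag, and the function `can_read_tasks` are hypothetical names for this example, and in a Flask app the current user would come from the session or an extension such as Flask-Login.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    username: str
    is_admin: bool = False


def can_read_tasks(current_user: Optional[User], requested: str) -> bool:
    """Decide whether `current_user` may read /api/tasks/<requested>."""
    if current_user is None:
        # Not logged in: never allowed.
        return False
    if requested == "all":
        # The combined view is reserved for admins.
        return current_user.is_admin
    # A per-user view is only visible to that user.
    return current_user.username == requested
```

The API view would call this before querying the database and respond with a 403 when it returns False. Because "all" doubles as a reserved path segment here, a real application would need to prevent anyone from registering that username.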
