I do chatbot on Dialogflow and connect Heroku with python in Github. How can I prevent Heroku from sleeping by python? Could you share the code with me?
If you are using the Free Plan you cannot prevent the Dyno to sleep (after 30 min inactivity).
There is a workaround if your Chatbot runs on a web site: in this case you could send a dummy request (just to start the Dyno) to your webhook hosted on Heroku when the user access the page, for example
<body onload="wakeUpCall();">
<script language="javascript">
function wakeUpCall() {
var xhr2 = new XMLHttpRequest();
xhr2.open("GET", "https://mywebhook.herokuapp.com/", true);
xhr2.send(null);
}
</script>
It is not obviously a perfect approach (it works only if you control the client and it relies on the Dyno starting before the chatbot sends data to the webhook), but it is an option if you want to keep working with the Free plan.
First some things to keep in mind before you try to use the free dyno for something it wasn't intended to be used for:
Heroku provides 1000 free hours a month. This is enough to only run single Heroku dyno at the free tier level. If you need to avoid the startup delay for two apps, then you'll need to pay for at least one of them.
Heroku still only allows for a single free dyno to run on your app. This you might lose traffic when you you are pushing new code (since the free web down has to shut down so you can built a new one).
There are undoubtably other issues as well, but those are the main ones I can think of offhand.
Now the solution:
Really, you just need something to ping your site at least once every 30 minutes. You could write a script for this, but there is an extremely useful tool that already does something like this that provides more benefit to you.
That would be the Availability (or Uptime) Monitoring tool. This is a tool that ensures your site is "still up and running" by pinging a URL every X minutes and ensuring that the response is a valid, expected response (IE: 200 status code and/or checking for certain text on the page). These often also provide the benefit of contacting you if it receives unexpected response (almost certainly an error) for too long.
Here is an example of an availability monitor:
https://uptimerobot.com/
Related
Background:
I have an application that is supposed to automate some infrastructure & OS-heavy tasks that happen on a network file system (for example: mounting volumes, shutting down / bringing up servers, creating directories, moving data around, ssh-ing etc). Ultimately there are a lot of OS-level commands that need to be run in a sequence for each action. Our consumer/client likely does not know this sequence, but knows "I want to do X task".
Tech stack: Python/Django
I have been tasked with setting this application up but am perplexed on the best way to approach for modularity from the API standpoint & just overall design. Currently, we have a similar application that is a SOAP-style (rpc) but the way it is written is not very modular. Like for example, one function will have a ton of random hardcoded subprocess commands - not the approach I want to emulate here.
Initially I was leaning more towards REST API since Django has a nice django rest framework plug-in, but am having trouble modelling these very action-oriented tasks. The more I read other things online, the more I come to believe I really need to think of every little action as a resource with the client having to GET/POST/PUT to each of these to keep things very modular but when I boil that down further it looks like I may need to set up 15+ endpoints for each situation needed and the client likely isn't going to want to call all 15 endpoints to get their singular behavior they want. That being said - moving to rpc so users can have one endpoint that 'moves the moon on a single call' might not be the best approach either.
I think one of the issues I see is our application is doing a lot of work on a file system, not all contained within our application's database. I reckon that's kind of a central point of this application, but I have trouble modelling things that require file system actions outside our application's database.
Question 1:
One example action that our client might want to call would be responsible for ssh-ing to a remote server and running a command. How might you model this in REST?
Question 2:
How do you all model file system actions in your applications?
Question 3:
After reviewing the above, does RPC seem like the better option?
Other:
Any other help or feedback (even in generally is much appreciated).
REST is similar to SOAP in a sense that you call operations in SOAP and REST just maps those operations to web resources and HTTP methods.
For example
z DoSSHStuffOnARemoteServer(x=1,y=2)
vs
POST /RemoteServer/SSHStuff {x:1,y:2}
If it timeouts, because it takes a lot of time, then you can do
202 accepted
{type: "transaction": href: "/RemoteServer/SSHStuff/123", status: "pending"}
and poll it in every 5-10 mins or use websockets to update the status. After it is done:
200 ok
{type: "transaction": href: "/RemoteServer/SSHStuff/123", status: "done", result: {z:3}}
So there is no magic. Just keep in mind that REST is in the presentation layer of your application, it returns view models and the entire structure is connected to the application services if you do DDD. It should not reflect the database structure unless you have an anemic domain model, or others call it thick client. Normally I would not say anything about RemoteServer/SSHStuff, just tell the client what will be done and stay silent about how it will be done. They don't need to know anything about how you store data or how many servers you have with what protocols and applications. It should not be their concern. The only thing they need to know what will be done, how long it takes to respond and what will be the response. The other part is irrelevant to them and it is a security risk if you share too much of it. When we design an interface like an interface for a REST service or just an OOP interface we always do it to hide implementation details. I hope that helps, have a nice day!
I'm starting with ASK development. I'm a little confused by some behavior and I would like to know how to debug errors from the "service simulator" console. How can I get more information on the The remote endpoint could not be called, or the response it returned was invalid. errors?
Here's my situation:
I have a skill and three Lambda functions (ARN:A, ARN:B, ARN:C). If I set the skill's endpoint to ARN:A and try to test it from the skill's service simulator, I get an error response: The remote endpoint could not be called, or the response it returned was invalid.
I copy the lambda request, I head to the lambda console for ARN:A, I set the test even, paste the request from the service simulator, I test it and I get a perfectly fine ASK response. Then I head to the lambda console for ARN:B and I make a dummy handler that returns exactly the same response that ARN:A gave me from the console (literally copy and paste). I set my skill's endpoint to ARN:B, test it using the service simulator and I get the anticipated response (therefore, the response is well formatted) albeit static. I head to the lambda console again and copy and paste the code from ARN:A into a new ARN:C. Set the skill's endpoint to ARN:C and it works perfectly fine. Problem with ARN:C is that it doesn't have the proper permissions to persist data into DynamoDB (I'm still getting familiar with the system, not sure wether I can share an IAM role between different lambdas, I believe not).
How can I figure out what's going on with ARN:A? Is that logged somewhere? I can't find any entry in cloudwatch/logs related to this particular lambda or for the skill.
Not sure if relevant, I'm using python for my lambda runtime, the code is (for now) inline on the web editor and I'm using boto3 for persisting to DynamoDB.
tl;dr: The remote endpoint could not be called, or the response it returned was invalid. also means there may have been a timeout waiting for the endpoint.
I was able to narrow it down to a timeout.
Seems like the Alexa service simulator (and the Alexa itself) is less tolerant to long responses than the lambda testing console. During development I had increased the timeout of ARN:1 to 30 seconds (whereas I believe the default is 3 seconds). The DynamoDB table used by ARN:1 has more data and it takes slightly longer to process than ARN:3 which has an almost empty table. As soon as I commented out some of the data loading stuff it was running slightly faster and the Alexa service simulator was working again. I can't find the time budget documented anywhere, I'm guessing 3 seconds? I most likely need to move to another backend, DynamoDB+Python on lambda is too slow for very trivial requests.
Unrelated to python, but I have found the same issue occurs for me if I do not have a handler for a specified intent:
# lets pretend intentName is actually 'FooBarIntent'
if (intentName == 'TestIntent') {
handleTestRequest(intent, session, callback);
} else {
throw "Invalid intent";
}
From here, amazon was barking that my lambda was invalid. For others it could indicate an error is being thrown earlier in the stack.
You can also log out your lambda errors with aws cloudwatch which will reveal any warnings or errors.
check out my repo, alexa lambda starter kit for a a simple hello world ask/lambda example.
I think the problem you having for ARN:1 is you probably didn't set a trigger to alexa skill in your lambda function.
Or it can be the alexa session timeout which is by default set to 8 seconds.
My guess would be that you missed a step on setup. There's one where you have to set the "event source". IF you don't do that, I think you get that message.
But the debug options are limited. I wrote EchoSim (the original one on GitHub) before the service simulator was written and, although it is a bit out of date, it does a better job of giving diagnostics.
Lacking debug options, the best is to do what you've done. Partition and re-test. Do static replies until you can work out where the problem is.
I know this is a real open-ended question, but I'm new to python and am building a simple one off web app to give a non technical team some self service capabilities. This team has a bunch of repetitive tasks that they kick over to another team that are just begging to be automated, like restarting a few processes on remote hosts, grep logs, cleanup old files, deploy/restart new versions of an application, get current running versions, etc. The users will be clicking buttons and watching the output in the GUI, they WILL NOT be manually entering commands to run (I know this is dangerous). Any new tasks will be scripted and added to the app from the technical support team.
Now, the only piece I'm not sure on is how to get (near) real time output from the commands back to the GUI. I've built a very similar app in PHP in the past, and what I did was flush the output of the remote commands to a db, and then would poll the db with ajax and append new output. It was pretty simple and worked great even though the output would come back in chunks (I had the output written to the GUI line by line, so it looked like it was real time). Is there a better way to do this? I was thinking of using web sockets to push the output of the command back to the GUI. Good idea? Bad idea? Anything better with a python library? I can also use nodejs, if that makes any difference, but I'm new to both languages (I do already have a simple python flask application up and running that acts as an API to glue together a few business applications, not a big deal to re-write in node).
This is a broad question, but I'll give you few clues.
Nice example is LogIo. Once you are willing to run some commands and than push output to GUI, using Node.js becomes natural approach. This app may contain few elements:
part one that runs commands and harvests output and pushes it to
part two that receives output and saves it to DB/files. After save, this part is throwing event to
part three, that should be a websocket server, which will handle users that are online and distribute events to
part four, which would be preoperly scripted GUI that is able to connect via websocket to part three, log-in user, receive events and broadcast them to other GUI elements.
Once I assume you feel stronger with PHP than python, for you easiest approach would be to create part two as a PHP service to handle input (save harvested output to db) and than, let say use UDP package to part three's UDP listening-socket.
Part one would be python script to just get command output and bypass it properly to part two. It should be as easy to hadle as usual grep case:
tail -f /var/log/apache2/access.log | /usr/share/bin/myharvester
at some point of developing it you will be in demand of passing there also user or unical task id as parameter after myharvester.
Tricky but easier than you think will be to create a Node.js cript as part three. As a single instance script it should be able to receive input and bypass it to users as events. I've commited comething like this before:
var config = {};
var app = require('http').createServer().listen(config.server.port);
var io = require('socket.io').listen(app);
var listenerDgram = require('dgram').createSocket('udp4');
listenerDgram.bind(config.listeners.udp.port);
var sprintf = require('sprintf').sprintf;
var users = [];
app.on('error', function(er) {
console.log(sprintf('[%s] [ERROR] HTTP Server at port %s has thrown %s', Date(), config.server.port, er.toString()));
process.exit();
});
listenerDgram.on('error', function(er) {
console.log(sprintf('[%s] [ERROR] UDP Listener at port %s has thrown %s', Date(), config.listeners.udp.port, er.toString()));
process.exit();
});
listenerDgram.on('message', function(msg, rinfo) {
// handling, let's say, JSONized msg from part two script,
// buildinf a var frame and finally
if(user) {
// emit to single user based on what happened
// inside this method
users[user].emit('notification', frame);
} else {
// emit to all users
io.emit('notification', frame);
}
});
io.sockets.on('connection', function(socket) {
// handling user connection here and pushing users' sockets to
// users aray.
});
This scrap is basic example of not filled with logic what-you-need. Script should be able to open UDP listener on given port and to listen for users running into it within websockets. Honestly, once you become good in Node.js, you may want to fix both part two + part three with it, what will take UDP part off you as harvester will push output directly to script, that maintains websocket inside it. But it has a drawback of duplicating some logic from other back-end as CRM.
Last (fourth) part would be to implement web interface with JavaScript inside, that connects currently logged user to socket server.
I've used similar approach before, and it is working real-time, so we can show our Call-Center employees information about incoming call before even phone actually start to ring. Finally solutions (not counting interface of CRM) closes in two scripts - dedicated CRM API part (where all logic happen) to handle events from Asterisk and Node.js event forwarder.
Newbie on appengine and I really don't know how to phrase the question which sadly results in me not knowing what keywords to google and I hope that i really do get help other than the bashing that a lot of people do.
I'm confused between the behavior of appengine online and the appengine on the local server.
Background info:
Btw this is in Python
Initially i assumed that , when needed or as authored
an instance of the app or module will be created.
And that instance will be the one serving multiple requests from different clients.
In this behavior any initialization code will only be run once.
But in the local development server.
Every time i add something new, specially in the main.py,
the server is able to catch the new changes,
then on browser-refresh be able to run it.
This made me think, wait...
Does it run the entire script over and over again
on every request?
Question:
Does an instance/module run the entire code on every request or is this just an added behavior to the dev server to make development easier?
Both your assumptions - about behaviour in production and development - are wrong.
In production, GAE spins up instances as required. This may be in response to increased load, or the host may simply decide after a certain amount of time to recycle an instance by killing it and starting a new one. Initialization code will always be run whenever a new instance is started.
In development, you only get a single instance. However, the server watches your file system for changes. If it detects a change to the code itself, it will restart itself, and therefore re-run the initialization code. But if you don't make any code changes between requests, the existing process continues indefinitely, and init code will not be re-run.
I have Django running in Apache via mod_wsgi. I believe Django is caching my pages server-side, which is causing some of the functionality to not work correctly.
I have a countdown timer that works by getting the current server time, determining the remaining countdown time, and outputting that number to the HTML template. A javascript countdown timer then takes over and runs the countdown for the user.
The problem arises when the user refreshes the page, or navigates to a different page with the countdown timer. The timer appears to jump around to different times sporadically, usually going back to the same time over and over again on each refresh.
Using HTTPFox, the page is not being loaded from my browser cache, so it looks like either Django or Apache is caching the page. Is there any way to disable this functionality? I'm not going to have enough traffic to worry about caching the script output. Or am I completely wrong about why this is happening?
[Edit] From the posts below, it looks like caching is disabled in Django, which means it must be happening elsewhere, perhaps in Apache?
[Edit] I have a more thorough description of what is happening: For the first 7 (or so) requests made to the server, the pages are rendered by the script and returned, although each of those 7 pages seems to be cached as it shows up later. On the 8th request, the server serves up the first page. On the 9th request, it serves up the second page, and so on in a cycle. This lasts until I restart apache, when the process starts over again.
[Edit] I have configured mod_wsgi to run only one process at a time, which causes the timer to reset to the same value in every case. Interestingly though, there's another component on my page that displays a random image on each request, using order('?'), and that does refresh with different images each time, which would indicate the caching is happening in Django and not in Apache.
[Edit] In light of the previous edit, I went back and reviewed the relevant views.py file, finding that the countdown start variable was being set globally in the module, outside of the view functions. Moving that setting inside the view functions resolved the problem. So it turned out not to be a caching issue after all. Thanks everyone for your help on this.
From my experience with mod_wsgi in Apache, it is highly unlikely that they are causing caching. A couple of things to try:
It is possible that you have some proxy server between your computer and the web server that is appropriately or inappropriately caching pages. Sometimes ISPs run proxy servers to reduce bandwidth outside their network. Can you please provide the HTTP headers for a page that is getting cached (Firebug can give these to you). Headers that I would specifically be interested in include Cache-Control, Expires, Last-Modified, and ETag.
Can you post your MIDDLEWARE_CLASSES from your settings.py file. It possible that you have a Middleware that performs caching for you.
Can you grep your code for the following items "load cache", "django.core.cache", and "cache_page". A *grep -R "search" ** will work.
Does the settings.py (or anything it imports like "from localsettings import *") include CACHE_BACKEND?
What happens when you restart apache? (e.g. sudo services apache restart). If a restart clears the issue, then it might be apache doing caching (it is possible that this could also clear out a locmen Django cache backend)
Did you specifically setup Django caching? From the docs it seems you would clearly know if Django was caching as it requires work beforehand to get it working. Specifically, you need to define where the cached files are saved.
http://docs.djangoproject.com/en/dev/topics/cache/
Are you using a multiprocess configuration for Apache/mod_wsgi? If you are, that will account for why different responses can have a different value for the timer as likely that when timer is initialised will be different for each process handling requests. Thus why it can jump around.
Have a read of:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
Work out in what mode or configuration you are running Apache/mod_wsgi and perhaps post what that configuration is. Without knowing, there are too many unknowns.
I just came across this:
Support for Automatic Reloading To help deployment tools you can
activate support for automatic reloading. Whenever something changes
the .wsgi file, mod_wsgi will reload all the daemon processes for us.
For that, just add the following directive to your Directory section:
WSGIScriptReloading On