Trace speed issues in python running on google app engine - python

I'm new to python, django and google app engine. All great tools and have been enjoying working with them.
However, on my production site its taking 4 seconds to load a webpage, which I think is horrible and needs to be less than a second. I've also verified the long amount of time is in the request to the get the page, not downloading any media files.
First thought is yes, it still has the first start issues any gae app would, I'm not trying to fix those. I understand that the first time you hit your website after uploading a new version it needs to load up the code for the first time. Additionally, if your site isn't visited often then this happens alot. All of this I'm aware of and not trying get more info on.
My site is relatively simple and its not loading big data or displaying complicated designs. And on my localhost it runs extremely fast. I should also point out that I'm using Django nonrel, which is a great tool that allows me to develop quickly with django on gae: http://www.allbuttonspressed.com/projects/django-nonrel
The problem I'm having is that its taking way to long for pages to load in production and I need to get to the bottom of it. I'm sure I've coded something poorly, but I'm not familiar enough with python and gae to know the best debugging practices, especially if it only seems to have issues in production.
So for a newbie python / django / google app engine developer, how do I quickly and easily find what functions are taking so much time?

Use appstats.

Related

Django vs Flask - Which framework is more powerful considering I want to design an end to end IOT solution

Planning to build an end to end IOT application. I know that we can do it using Django and Flask. Which one is a better choice and how it is better?
I was in your position 2 months ago. Let me save you some trouble.
Flask may seem the more flexible or direct path as it is easy to have something up and running in moments. Whereas Django, at times, can feel like an entirely different language, and takes much more time and effort to get running. Still, the problem lies in getting everything to play nicely with flask and other packages. This will be tricky and Django shines in this area. Once the project is rolling, it just works, and generally continues to do so unless the packages utilized in your app become deprecated or you break it.
Both are viable but any project of scale will be better off being maintained by Django while you focus on implementation. Good luck.

Architecture question for app which serves data from SQLite in memory

I am building a dynamic <div> into my (very low-traffic) website, which I want to display only some data from a memory-based SQLite database running within a Python program on the server. Being a novice in web tech, I can't decide which technologies and principles should go into this project.
Right now, the only decided-upon technologies are Python and Apache. Python, at the very least, needs to be constantly running to fetch data from the external source, format it, and enter it into the database. Problem #1 is where this database should reside. Ideally, I would like it in RAM, since the database will update both often and around the clock. Then the question becomes, "How does one retrieve the data?". Note: the query will never change; I want the web page to receive the same JSON structure only with up-to-date values. From here, I see two options with the first, again, being ideal:
1) Perform some simple "hey someone wants the stuff" interaction with the Python program (remember that this program will be running) whenever someone loads the page, which is responded to with the JSON data. This should be fairly easy with WebSockets, but I understand they have fallen out of favor.
2) Have the Python program periodically create/update an HTML file, which the page loads with jQuery. I could do this with my current knowledge, but I find it inelegant and it would be accepting several compromises such as increased disk read/writes and possibly out-of-date data unless read/writes are increased even further, essentially rendering the benefits from a memory database useless.
So, is my ideal case feasible? Can I implement an API into my Python program to listen for requests? Would the request be made with jQuery? Node.js? PHP? Maybe even with Apache? Do I bypass Python by manipulating the VFS? The techs available feel overwhelming and most online resources only detail generating HTML with Python (Django, Flask, etc.).
Thank you!
WSGI is the tech I was looking for!

Best Python Web Framework for my API Server Needs

I am working on developing two systems:
A system that will constantly retrieve economic data from a 3rd party data feed and push it into a MySQL DB (using sqlalchemy)
A server that will allow anyone to query the data in the db over a JSON AJAX API (similar to Yelp or Yahoo API for example)
I have two main questions:
Which Python framework should I use in 2)? Pyramid is my first choice, but if you strongly suggest against it or in favor of something else like Django or Pylons I am definitely wiling to consider it.
Should I develop the two system separately? Or should 1) be a part of 2), running within the framework (using crontab or celery for example)?
Depends on what stage you are at, I would suggest to develop 2 systems because the load to pull data from 3rd party and the load to handle the API would be different. You can scale them into a different types of nodes if you want.
Django-Tastypie (https://github.com/toastdriven/django-tastypie) is not bad, it supports all JSON, XML and YAML. Also you can add OAuth easily. Though, Django itself maybe a bit heavy for your needs at this time.
You might want to check out web2py's new functionality for easily generating RESTful API's, particularly its parse_as_rest and smart_query functions. You might also consider using web2py's database abstraction layer to handle #1.
If you need any help, ask on the mailing list.
I agree with Anthony, you should look at Web2Py. It is very easy to get started, very low learning cure and easy to deploy on many systems including Linux, Windows and Amazon.
So far I have found nothing that Web2Py can not do. But more importantly it does things how you would think they should be done, so if you are not sure, very often a guess is good enough and it just works. If you do get stuck, it has by far the best and most up to date documentation for any Python Web Framework.
Even with all it's great features, easy use and up to date documentation, you will also find that the web2py user group on Google, is like having a paid for help desk staffed 24 hours a day. Most questions are answered with a couple minutes and Massimo (The original creator of Web2Py) goes out of his way not only to help, but to implement new ideas, suggestions and bug fixes within days of them being raised in the group.

Python/GAE social network / cms?

After much research, I've come up with a list of what I think might be the best way of putting together a Python based social network/cms, but have some questions about how some of these components fit together.
Before I ask about the particular components, here are some of the key features of the site to be built:
a modern almost desktop-like gui
future ability to host an advanced html5 sub-application (ex.http://www.lucidchart.com)
high scalability both for functionality and user load
user ability to password protect and permission manage content on per item/group basis
typical social network features
ability to build a scaled down mobile version in the future
Here's the list of tools I'm considering using:
Google App Engine
Python
Django
Pinax
Pyjamas
wxPython
And the questions:
Google App Engine -- this is an attempt to cut to the chase as many pieces of the puzzle seem to be in place.
Question: Am I limiting my options with this choice? Example: datastore not being relational? Should I wait
for SQL support under the Business version?
Python -- I considered 'drupal' at first, but in the end decided that being dependent on modules that may or
may not exist tomorrow + limitations of its templating system are a no-no. Learning its API, too, would be useless elsewhere
whereas Python seems like a swiss army knife of languages -- good for almost anything.
Question: v.2.5.2 is required by GAE, but python.org recommends 2.5.5. Which do I install?
Django -- v.0.96 is built into GAE. You seem to be able to upgrade it.
Questions: Any reason not to upgrade to the latest version? Ways to get around the lack of HTML5 support?
Pinax (http://pinaxproject.com) Rides on top of Django and appears to provide most of the social network functionality
anyone would want.
Question: Reasons NOT to use it? Alternatives?
Pyjamas and wxPython -- this is the part that gets a little confusing. The basic idea behind these is the ability
to build a GUI. I've considered Silverlight and Flash, before the GAE/Python route, but a few working versions of
HTML5 apps convinced me that enough of it ALREADY runs on the latest batch of browsers to chose the HTML5/Javascript
route instead.
Question: How do I extend/supplement Python/Django to build an app-like HTML5 interface? Are Pyjamas and wxPython
the way to go? Or should I change my thinking completely?
Answers to some/any of these questions would be of great help. Please excuse my ignorance if any of this doesn't make much sense.
My last venture into web programming was a decent sized LAMP website some 5-6 years ago. On the desktop side of things,
my programming experience boils down to very high level scripting languages that I keep on learning to accomplish very specific
tasks :)
As someone who has deployed a Django site to GAE, I can tell you that you are not going to reach the ideal solution. Django on GAE misses some of the best aspects of Django because the ORM doesn't work right. The best compromise may be to use Django-nonrel to add the features back in.
This introduces it's own problems though: because of the large number of files and memory used by a Django app you're code will be unloaded from memory quickly after the app becomes idle. That means that visitors will frequently hit an approximately 6 second delay on the first page view after the site's code has been unloaded from memory while GAE uncompresses the zipped modules. Once your site is busy this won't be a problem, but while your site is still young and unknown it will cause the appearance of performance problems. :-(
Second, I've also worked for a company that built a custom CMS and can tell you that the first 80% is pretty easy, especially with modern frameworks. However, the rest can be quite challenging. For example, user roles and custom content types are two challenging aspects. Therefore strongly consider standing on the backs of giants and finding a CMS or CMS framework that almost perfectly meets your needs and then extend it to do that extra bit you need.
So, that said, answering your points:
Yes, you're limiting your options but that may be OK. Most developers are more comfortable with the relational model than the nosql model. Therefore more open source software is built with it in mind. Also, GAE is a closed source platform which is also a deterrent to open source developers. App Engine Oil is a CMS framework that may suit you well and is optimized for App Engine. Also look at web2py which has support for GAE.
I've found myself to be extremely productive with Python. I used to write a lot of PHP now I find it ugly. That said, think about the total line count of code you'll have to write. If you can make Drupal work with high quality pre-made modules you may find yourself only needing 1/10th of the code. By the way, the trick with Drupal is to mainly use only high quality modules. Look at the history, make sure not to use development versions. Try to contact the authors on IRC. I'm not saying you should use Drupal but it is possible to have a reliable site with it (for example, whitehouse.gov)
You're in the classic GAE/Django problem. If you use 0.96 you get great performance but you miss a lot of the great 1.0+ features and you don't get the ORM and all of it's benefits. If you use a newer version of Django you get the performance/memory problems mentioned above.
I'm about to investigate pinax for my company. I've done a very cursor glance at it. I don't know if it has good support for non relational model backends. You'll probably need to look at django-nonrel. However know that you're going to be investing in relatively untried solutions here. A small percentage of Django users use Pinax and an even smaller percentage, if any, use it on a nonrelational backend. Therefore you're going to be in the highly experimental scenario you mentioned in point 2 above.
I can't offer personal experience on it. I've investigated pyjamas a few times. However I like writing HTML CSS and JS. I like to have control. I like progressive enhancement and knowing what users will see if they don't have the full capabilities. Also, I think any new app that doesn't explicitly address mobile clients is implicitly shooting themself in the foot. As many as 15% of Internet users only use the Internet via their smart phone. What kind of experience will they get with pyjamas?
You didn't mention this, but one thing I consider when choosing a platform is vendor lockin and portability. If you develop your solution for GAE and find that you're not able to do what you want, will you be able to port it to another solution elsewhere? How much work will it take? If you code heavily for GAE or make commitments to its architecture, you're stuck with it or with rewriting to move. Using Django or Web2py can help mitigate this.
That said, the big benefit of Python GAE is that you get to be very productive, see your results instantly, get hosting for free while your site is small and get excellent scalability. These are not small things. There is great value there.

Why use Django on Google App Engine?

When researching Google App Engine (GAE), it's clear that using Django is wildly popular for developing in Python on GAE. I've been scouring the web to find information on the costs and benefits of using Django, to find out why it's so popular. While I've been able to find a wide variety of sources on how to run Django on GAE and the various methods of doing so, I haven't found any comparative analysis on why Django is preferable to using the webapp framework provided by Google.
To be clear, it's immediately apparent why using Django on GAE is useful for developers with an existing skillset in Django (a majority of Python web developers, no doubt) or existing code in Django (where using GAE is more of a porting exercise). My team, however, is evaluating GAE for use on an all-new project and our existing experience is with TurboGears, not Django.
It's been quite difficult to determine why Django is beneficial to a development team when the BigTable libraries have replaced Django's ORM, sessions and authentication are necessarily changed, and Django's templating (if desirable) is available without using the entire Django stack.
Finally, it's clear that using Django does have the advantage of providing an "exit strategy" if we later wanted to move away from GAE and need a platform to target for the exodus.
I'd be extremely appreciative for help in pointing out why using Django is better than using webapp on GAE. I'm also completely inexperienced with Django, so elaboration on smaller features and/or conveniences that work on GAE are also valuable to me.
Django probably isn't the right choice for you, if you're sure that GAE is right for you. The strengths of the two technologies don't align very well - you completely lose a lot of Django's wonderful orm on GAE, and if you do use it, you write code that isn't really directly suitable to bigtable and the way GAE works.
The thing about GAE is that it gets the great scalability by forcing you to write code that scales easily from the ground up. You just can't do a number of things that scale poorly (of course, you can still write poorly scaling code, but you avoid some pitfalls). The tradeoff is that you really end up coding around the framework, if you use something like Django which is designed for a different environment.
If you see yourself ever leaving GAE for any reason, becoming invested in the infrastructure there is a problem for you. Coding for bigtable means that it will be harder to move to a different architecture (though the apache project is working to solve this for you with the HBase component of the Hadoop project). It would still be a lot of work to transition off of GAE.
What's the driving motivator behind using GAE, besides being a Google product, and a cool buzzword? Is there a reason that scaling using something like mediatemple's offering is unlikely to work well for you? Are you sure that the ways that GAE scales are right for your application? How does the cost compare to dedicated servers, if you're expecting to get to that performance realm? Can you solve your problem well using the tools GAE provides, as compared to a more traditional load-balanced server setup?
All this said, unless you absolutely positively need the borderline-ridiculous scaling that GAE offers, I'd personally suggest not letting that particular service structure your choice of framework. I like Django, so I'd say you should use it, but not on GAE.
Edit (June 2010):
As an update to this comment sometime later:
Google has announced sql-like capabilitys for GAE that aren't free, but will let you easily do things like run SQL-style commands to generate reports on your data.
Additionally, there are upcoming changes to the GAE query language which will allow complex queries in a far easier fashion. Look at the videos from Google I/O 2010.
Furthermore, there is work being done during the Summer of Code 2010 project which should bring no-sql support to django core, and by extension, make working with GAE significantly easier.
GAE is becoming more attractive as a hosting platform.
Edit (August 2011):
And Google just raised the cost to most users of the platform significantly by changing the pricing structure. The lockin problem has gotten better (if your application is big enough you can deploy the apache alternatives), but for most applications, running servers or VPS deployments is cheaper.
Very few people really have bigdata problems. "Oh my startup might scale someday" isn't a bigdata problem. Build stuff now and get it out the door using the standard tools.
We use django on our appengine instances mostly when we have to serve actual websites to the user. It has a great template engine, url routing and all the request/response/error handling built in. So even while we can't use the magic orm/admin stuff it has a lot going for it.
For api services we built something very simple on top of webob. It's far more lightweight because it doesn't need everything that django offers, and therefore a little quicker in some situations.
I've done lots of projects on GAE. Some in django, some in their normal framework.
For small things, I usually use their normal framework for simplicity and quickness. Like http://stdicon.com, http://yaml-online-parser.appspot.com/, or http://text-twist.appspot.com/.
For large things, I go with django to take advantage of all the nice middleware and plugins. Like http://metaward.com.
Basically my litmus test is Will this take me more than 2 weeks to write and be a REAL software project? If so, go with django for the addons.
It has the added benefit of, if your project is badly suited for BigTable then you quickly port off (like I did Is BigTable slow or am I dumb?)
I think all this answers are a bit obsolete.
Now you can use Google Cloud SQL
Django is a popular third-party Python web framework. When coupled
with Google Cloud SQL, all of its functionality can be fully supported
by applications running on App Engine. Support for using Google Cloud
SQL with Django is provided by a custom Django database backend which
wraps Django's MySQL backend.
https://cloud.google.com/python/django/appengine
one more fresh news is, that there is BETA support for PostgreSQL
I have experience using Django and not GAE. From my experiences with Django it was a very simplistic setup and the deployment process was incredibly easy in terms of web projects. Granted I had to learn Python to really get a good hold on things, but at the end of the day I would use it again on a project. This was almost 2 years ago before it reached 1.0 so I my knowledge is a bit outdated.
If you are worried about changing platforms, then this would be a better choice I suppose.
I cannot answer the question but you may want to look into web2py. It is similar to Django in many respects but its database abstraction layer works on GAE and supports most of the GAE functionality (not all but we try to catch up). In this way if GAE works for you great, if it does not, you can move your code to a different db (SQLite, MySQL, PostgreSQL, Oracle, MSSQL, FireBird, DB2, Informix, Ingres, and - soon - Sybase and MongoDB).
If you decide to run you app outside of GAE, you can still use Django. You won't really have that much luck with the GAE webapp
I am still very new to Google App engine development, but the interfaces Django provides do appear much nicer than the default. The benefits will depend on what you are using to run Django on the app engine. The Google App Engine Helper for Django allows you to use the full power of the Google App Engine with some Django functionality on the side.
Django non-rel attempts to provide as much of Django's power as possible, but running on the app-engine for possible extra scalability. In particular, it includes Django models (one of Django's core features), but this is a leaky abstraction due to the differences between relational databases and bigtable. There will most likely be tradeoffs in functionality and efficiency, as well as an increased number of bugs and quirks. Of course, this might be worth it in circumstances like those described in the question, but otherwise would strongly recommend using the helper at the start as then you have the option of moving towards either pure app-engine or Django non-rel later. Also, if you do switch to Django non-rel, your increased knowledge of how app engine works will be useful if the Django abstraction ever breaks - certainly much more useful than knowledge of the quirks/workarounds for Django non-rel if you swap the other way.

Categories