Delivering Python Processed data to the web

Delivering Python Processed data to the web - python

I have developed a python program that parses a webpage and creates a new text document with the parsed data. I want to deliver this new information to the web. I have no idea where to start with something like this. Are there any free options where I can have a site automatically call this python code upon request and update the new data to its page? Or is the only feasible solution here to have my own website/server that uses my code? I'm honestly pretty overwhelmed with many of the options when I try to begin doing a web-search for a solution like this. I have done a decent amount of application programming before so i'm confident in my ability to learn new things, but web protocols are all new to me so its hard to find a starting point.
Ultimately I want this python code to run automatically, or per request of a user, and deliver to the data to them. It could even be through an email, although that is probably less practical.

I personally have good experience using Google Appengine (and its free for a limited amount of requests). The downside is that it does not allow C-extensions or Python3.
If you want to host your own server, tornado is a good option I think. Tornado supports both Python2 and Python3.

There are a great deal of options available.. from 'traditional' virtual server or website hosts like a2hosting or godaddy to 'Cloud Application Hosts' such as Amazon EC2, Heroku or OpenShift.
For your case, and without knowing more, I would suggest that an application hosting is more appropriate, and that you should take a look at Heroku and Openshift in particular.
Define carefully what you want to achieve (how the users access your application, what they see, how they interact with it... etc..) and then evaluate these options based on those requirements.
Most offer a free trial, or even free services, depending on what you need! Good luck

If you've never worked with web technologies before this will be a overwhelming task, since there's a lot of different technologies involved, and many possible ways to combine them.
You'll probably want to start by familiarizing yourself with the very basics of the HTTP protocol.
Then you should read a bit on CGI server-side programming (the article also has a quick overview on HTTP).
Python can run both on CGI and WSGI (if the server provider allows such access), so you may also want to read about WSGI.
Once you grasp all these concepts, you should check this question for actual python techniques.
Also, since you seem to be under the impression you must pay to have a website/app deployed, you should know there are companies that host python apps for free

Related

What's the best way to get continuous data from another program in Django?

Here's the setup: On a single-board computer with a very rudimentary linux I'm running a Django app. This app is, when a button is pressed or as a response to the data described below, supposed to call either a function from a library written in C, or a compiled C program, to write data to system memory at a specified address, poke/peek like. (Python doesn't seem to be able to do that natively).
The Django app should also display data, continuously, which is being read from the memory from the same library / program.
My question now is how to even begin with setting up the scenario described above. Is this even possible with a web app? Is a Django or more fundamentally any web framework even the right approach here? I'm at a bit of a loss here, since I've spent quite a few hours now trying to figure out how to do this while not getting the most basic starting point...
Disclaimer: I'm pretty new to the entire web framework thing, and more importantly web development in general, so sorry if this is a bad question as in, I could have easily found information on this topic online, but I couldn't really find a good starting point on this.

I wanted to add a comment but not enough space... anyway
You can write a native extension in C for Python that could do what you need, check this.
Now for the fact of displaying data continuously this is kind of vague, if this C library is switching this hypothetical address, very often and very fast you have to update a browser client as fast as possible.
I think websockets would do the trick but they are js related, so I think NodeJs would be a better candidate for the server side of your application instead of Django.
If you want to stick to Django you can also expose an URL with the generated address value and have a webpage continuously (with a little Interval) checking that URL using a simple ajax call, kind of ugly and inefficient but would work.
Anyway IMHO your best bet is for websockets because with them you have a fullduplex communication between client and server.
Good Luck with your project.
Info:
Websockets in Django with socket.io
Nodejs socket.io

What tracking solutions are available for server side code?

I'm working on a tracking proxy (for want of a better term) written in Python. It's a simple http (wsgi) application that will run on one (maybe more) server and accepts event data from a desktop client. This service would then forward the tracking data on to some actual tracking platform (DeskMetrics, MixPanel, Google Analytics) so that we don't have to deal with the slicing and dicing of data.
The reason for this implementation is that it's much easier and faster to make changes to a server process that we control rather than having to ensure every client in the wild gets updated if the tracking backend changes in some way.
I've been looking up info on the various options and I was hoping somebody here would have some good advice from their own experiences. Ideally we'd be able to use Google Analytics as it's free for any amount of usage but paid options are fine.
My only real requirement is either a good Python library or a well documented api that I can write a wrapper for (this seems somewhat lacking in GA when it comes to triggering events through any method other than their js or other provided libs).
N.B. We're not really tracking server code so something like NewRelic isn't appropriate, we're just decoupling a desktop application from the specifics of the tracking backend.

We ran into this same problem a bunch of times, we ended up building a suite of server-side analytics libraries to make this easier.
Segment.io has libraries for Python, Ruby, Java, Node, .NET and PHP that abstract the APIs for Mixpanel, KISSmetrics, Google Analytics and a bunch of other analytics services.
You could integrate the Python library once, and then send your data wherever you want. The data is proxied through Segment.io's hosted service. Hopefully this cleans up the mess of integrating a bunch of libraries, each with slightly different APIs. (The service is free for the first million events.)

Have you tried anything below?
The Google Data APIs Python Client Library has source specific to analytics
http://code.google.com/p/gdata-python-client/
http://code.google.com/p/gdata-python-client/source/browse/#hg%2Fsamples%2Fanalytics
https://developers.google.com/gdata/articles/python_client_lib
You might be able to borrow from these sources as well;
Google has something they are working on for mobile and source is available in PHP, JSP, ASP.net and Perl: https://developers.google.com/analytics/devguides/collection/other/mobileWebsites
I also came accross this in PHP http://code.google.com/p/php-ga/
As for others:
KissMetrics: http://support.kissmetrics.com/apis/python
MixPanel: https://mixpanel.com/docs/integration-libraries/python
DeskMetrics: don't seem to have python, http://docs.deskmetrics.com/index.html
Sorry I cannot provide information based off extensive experience with anything python related other then providing a few of these resources. I would be interested to see what you come up with.

Server Topology Help - Django and Twisted Possibility?

I am currently working on a complex web interface and backend, that will need to address several issues.
Scalablility
multiple deployments of varying load demands
Very structured authorization groups
Different views for different user groups
admin panel
user/content management
Large managed database
current
long term stored data (histories)
Data Updates
Polling
Ex. Search queries, static pages/files, report generation per request
Pushing (likely websockets)
Ex. Real-time notifications
Varying protocols
Ex. HTTP, SSL, Websockets
I would like to use Python, because I have grown to really enjoy the language, and I am considering some combo of Django and Twisted.
I have some experience with Django, which I love for its MVT style of application programming, its authorization models, its admin panel, and its database API. However, it is not so strong in some of the data requirements that I need, in particular, the real-time aspects.
Now, I have not really used Twisted before, but I have seen many interesting things to it. In particular the async aspects, and the ability to run many protocols.
The problems in getting the two to work together are obvious in that Django is a blocking server and Twisted is designed to be non-blocking. I have seen some topics stating using the two together is possible and have had success with it. It also seems possible to run both and proxy them to accept different urls, but getting the authentication over the two may become tricky?
Having said all of that, I would like to ask if I am on the right track for implementing this system, as well as suggestions on how to use the two together, alternatives, or if I should just kick one out (at this point, I guess it'd have to be Django, because the real time stuff is necessary). I should mention that I have written some of the preliminary data models and views in Django already.
I am quite experienced on the client side of things (JS,CSS,HTML), but I am not so savvy in the server side of things. Any input would be helpful, thanks.

You can definitely use Twisted with Django. Several projects have used the two together to good effect. twistd web --wsgi provides a basic way to get it set up, and there's a great example with more bells and whistles, like static content by Alex Clemesha on github.

Mod_wsgi versus fapws3 - Django

is there a difference between using FAPWS3 and MOD_WSGI when dealing with Django?
FAPWS3 seems alot faster when serving requests toward Python scripts. I would like to know if I'm missing out anything. :)
Any ideas?

The underlying web server is not the bottleneck, it is your application and database access. The differences between any underlying web server are going to very minimal or non existent in the context of an actual full application stack. You cannot base decisions on hello world type tests as they are pretty meaningless. Decisions should therefore be based on the quality and stability of the hosting solutions under load, as well as ease of configuration and support, including your own competence to manage a particular setup. If you have no idea how to configure and support a particular web server properly, eg., Apache, then why would you use it.

here is the best explanation what i ever seen in the web at the moment.
http://nichol.as/benchmark-of-python-web-servers
Quote from nichol.as
When you are just interested in quickly hosting your threaded
application you really can’t go wrong with Apache ModWSGI. Even though
Apache ModWSGI might put a little more strain on your memory
requirements there is a lot to go for in terms of functionality. For
example, protecting part of your website by using a LDAP server is as
easy as enabling a module. Standalone CherryPy also shows great
performance and functionality and is really a viable (fully Python)
alternative which can lower memory requirements.
When you are a little more adventurous you can look at uWSGI and
FAPWS3, they are relatively new compared to CherryPy and ModWSGI but
they show a significant performance increase and do have lower memory
requirements.

TFS Webservice Documentation

We use a lot of of python to do much of our deployment and would be handy to connect to our TFS server to get information on iteration paths, tickets etc. I can see the webservice but unable to find any documentation. Just wondering if anyone knew of anything?

The web services are not documented by Microsoft as it is not an officially supported route to talk to TFS. The officially supported route is to use their .NET API.
In the case of your sort of application, the course of action I usually recommend is to create your own web service shim that lives on the TFS server (or another server) and uses their API to talk to the server but allows you to present the data in a nice way to your application.
Their object model simplifies the interactions a great deal (depending on what you want to do) and so it actually means less code over-all - but better tested and testable code and also you can work around things such as the NTLM auth used by the TFS web services.
Hope that helps,
Martin.

So, this question is friggin' old, but let me take a whack at it (since it keeps coming up in my google searches).
There's no officiall supported API for the on premise TFS (the MSFT hosted one has http://www.visualstudio.com/en-us/integrate/api/overview).
That said, you can always use Fiddler (http://www.telerik.com/fiddler) or something like it to inspect the calls that the web client for TFS is making to the server and do your magic to turn those into the scripts in python you want.
You'll need to run your python scripts under a service account that has TFS privs appropriate to what it is trying to do (read, update, confugure... whatever).
Since it sounds like you are just trying to read from TFS, this might be a really easy way for you to get what you want since an HTTP get to
http://yourserver/tfs/yourcollection/yourproject/_workitems#id=yourworkitemid
will hand you back (halfway) sane html payloads.
If you want lists of iterations or teams or whatever, then your service account needs to have the appropriate admin privileges and hit things like
http://yourserver/tfs/yourcollection/yourproject/_admin/_iterations
and use that response.

We Keep Coding

Python is a programming language that lets you work quickly and integrate systems more effectively.