I'm trying to figure out how to build a TCP proxy on GAE (Google App Engine). I would ordinarily do it using twisted networking engine but GAE doesn't allow frameworks. I'm also pretty new to internet and networking technologies in general.
Basically I have a proxy server and I'd like to use GAE as a TCP proxy to relay everything to the primary proxy server. All the GAE front ends are connected to the back end by google fiber, so if I make the back end near the primary proxy server, it should make it super fast regardless of where I'm connecting from.
Unfortunately GAE doesn't allow me to control ports at all and everything that I'm reading either tells me how to configure a TCP proxy on a server that I'm in complete control of or how to configure a proxy where I type the url into a webpage in the browser. Something along the lines of making a personal http://www.hidemyass.com/proxy/ type of website.
I'd like to set it up so I can simply tell chrome to ignore certificate errors (it connects to a dynamic IP using HTTPS so there's no way to sign it but I trust myself) and put the proxy info into chrome.
Edit: I'd prefer to write it in python but I can do any language
Thanks in advance
P.S. Please don't give answers like just use GoAgent or tor or something. They don't fulfill my purpose.
If you're simply trying to proxy HTTP requests like GoAgent does then have a look at the URLFetch documentation for Google App Engine.
URL Fetch Python API Overview
If you're trying to proxy anything else, then Daniel is correct.
This isn't the sort of thing you can use GAE for.
I don't know where you got the idea that GAE "doesn't allow frameworks". Of course it does, anything that speaks WSGI (eg Django, Flask, Pylons) is fine. But GAE is a web platform: it's not an appropriate place to try and write any sort of bare-metal networking platform. Apart from anything else, bandwidth on GAE is fairly expensive.
And also I don't know where you think the GAE "front ends" are, as opposed to the "back ends". GAE is not split that way, AFAIK.
I don't really understand what exactly you are trying to do, but it sounds like a content delivery network (CDN) like Akamai might be more appropriate.
I'm currently working on a University project that needs to be implemented with a Client - Server model.
I had experiences in the past where I was managing the communication at socket level and that really sucked.
I was wondering if someone could suggest an easy to use python framework that I can use for that purpose.
I don't know what kind of details you may need to answer so I'm just going to describe the project briefly.
Communication should happen over HTTP, possibly HTTPS.
The server does not need to send data back or invoke methods on the clients, it just collects data
Many clients send data concurrently to the server, which needs to distinguish the sender, process the data accordingly and put the result in a database.
You can use something like Flask or Django. Both frameworks are fairly easy to work with. Flask is much easier than Django IMO, although Django has a built-in authentication layer that you can use, albeit one that is more difficult to apply in a client/server scenario like yours.
I would personally use Flask and JWT (JSON Web Tokens), which will allow you to give a token to each client for authentication with the server, which will also let you differentiate between clients, and you can use HTTPS for your SSL/TLS requirement. It is tons easier to implement this, and although I like django better for what it brings to the table, it is probably overkill to have you learn it for a single assignment.
For Flask with SSL, here is a quick rundown of that.
For JWT with Flask, here is that.
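To make the token idea above concrete, here is a minimal sketch of issuing and checking an HS256-signed JWT using only the standard library. The function names (`issue_token`, `client_from_token`) and the use of the `sub` claim for the client ID are my own assumptions for illustration. There is no expiry or header validation here, so for real use a maintained library such as PyJWT is the safer route.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data):
    # JWTs use URL-safe base64 without the trailing "=" padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text):
    # Restore the padding that _b64url stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input, secret):
    digest = hmac.new(secret, signing_input.encode("utf-8"), hashlib.sha256).digest()
    return _b64url(digest)


def issue_token(client_id, secret):
    """Return a compact HS256 JWT whose 'sub' claim names the client (hypothetical helper)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode("utf-8"))
    payload = _b64url(json.dumps({"sub": client_id}, separators=(",", ":")).encode("utf-8"))
    signing_input = header + "." + payload
    return signing_input + "." + _sign(signing_input, secret)


def client_from_token(token, secret):
    """Return the client ID if the signature checks out, otherwise None."""
    try:
        header, payload, signature = token.split(".")
    except ValueError:
        return None  # not three dot-separated parts
    expected = _sign(header + "." + payload, secret)
    # Constant-time comparison on bytes avoids leaking timing information.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return json.loads(_b64url_decode(payload))["sub"]
```

The server would call `issue_token` once per client when registering it, then call `client_from_token` on every incoming request to find out who sent the data.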
You can use any database system you would like.
If I understood you correctly you can use any web framework in python. For instance, you can use Flask (I use it and I like it). Django is also a popular choice among the python web frameworks. However, you shouldn't be limited to only these two. There are plenty of them out there. Just google for them.
The implementation of the client depends on what kind of communication there will be between the clients and the server - I don't have enough details here. I only know it's unidirectional.
The client can be a browser accessing your web application written in Flask, where users send only POST requests to the server. However, even here the communication will be bidirectional (the clients need to open the page, which means the server sends responses back to the client), and that violates your initial requirement.
Alternatively, it can be a dedicated client written in Python that sends specific requests to your server over HTTP/HTTPS. For instance, your client can use the requests package to send HTTP requests.
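As a rough sketch of such a dedicated client, the snippet below builds a JSON POST with the standard library's `urllib.request` rather than the requests package, just to keep it dependency-free. The endpoint URL, the bearer-token header and the function names are assumptions for illustration, not something the question specifies.

```python
import json
import urllib.request


def build_report_request(server_url, token, reading):
    """Build (but do not send) a JSON POST carrying one reading."""
    body = json.dumps(reading).encode("utf-8")
    return urllib.request.Request(
        server_url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumption: the server identifies the sender by a bearer token.
            "Authorization": "Bearer " + token,
        },
    )


def send_report(server_url, token, reading, timeout=10):
    """Send the reading and return the HTTP status code."""
    request = build_report_request(server_url, token, reading)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

A client would then call something like `send_report("https://server.example/readings", token, {"temperature": 21.5})`, where the URL is a placeholder for your own server.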
I've set up a free account on Google App Engine, and I currently have something like this deployed:
import webapp2

class MainHandler(webapp2.RequestHandler):
    def get(self):
        # Issues an HTTP redirect, so the browser ends up at the target address
        self.redirect('http://x.x.x.x:9000/')

This works and accomplishes what I was after in the basic sense, but since it's just issuing an HTTP redirect I don't get my fancy Google domain name, and the browser ends up showing the IP address (and port) of the final server. I am aware of why this happens, but was hoping for a solution that would preserve the domain name (and leave the port hidden).
Normally for something like this, you'd just have a rewrite rule in Apache, but that only works if both URLs are hosted by that same server. When the two servers are different, you'd probably go with some transparent proxy (Squid?), but I don't have a server capable of hosting that (this is for personal use, and though my router is ddwrt, I've had no luck getting squid installed on it).
So is there a Python one-liner that lets me proxy to a single address but is smart enough to mangle resource requests and send along any request headers? I've found various solutions for writing proxies in Python, but they seem overly complicated because they're intended to be general purpose.
This isn't even easy to google, since the obvious keywords all bring back too many results with only slightly relevant results.
You are looking for a reverse proxy setup. Here is one that I have used before. https://code.google.com/p/bs2grproxy/
You can either set up the DNS to point your domain directly to the IP address, or you can use urlfetch.
However, please keep in mind that urlfetch has quota and limitations [1]. It might not be worth it just to have a "pretty domain/URL".
[1] https://cloud.google.com/appengine/docs/quotas#UrlFetch
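For a feel of what the fetch-and-relay approach involves, here is a minimal single-upstream reverse proxy written as a plain WSGI app. It is only a sketch: it uses the standard library's `urllib.request` for the outbound call (on App Engine you would swap `default_fetch` for a wrapper around urlfetch), the helper names are made up, and it does not rewrite absolute links inside the returned HTML.

```python
import http.client
import urllib.error
import urllib.request

# Hop-by-hop headers describe one connection only, so a proxy must not forward them.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}


def upstream_url(upstream_base, path, query=""):
    """Map an incoming path and query string onto the upstream server."""
    url = upstream_base.rstrip("/") + "/" + path.lstrip("/")
    return url + ("?" + query if query else "")


def forwardable_headers(environ):
    """Rebuild request headers from the WSGI environ, minus Host and hop-by-hop ones."""
    headers = {}
    for key, value in environ.items():
        if key.startswith("HTTP_"):
            name = key[5:].replace("_", "-").title()
            if name.lower() not in HOP_BY_HOP and name.lower() != "host":
                headers[name] = value
    return headers


def default_fetch(url, method, headers, body):
    """Fetch from the upstream; returns (status, header list, body bytes)."""
    request = urllib.request.Request(url, data=body or None, headers=headers, method=method)
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status, list(response.headers.items()), response.read()
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx; relay those responses instead of failing.
        return err.code, list(err.headers.items()), err.read()


def make_proxy_app(upstream_base, fetch=default_fetch):
    """Return a WSGI app that relays every request to upstream_base."""
    def app(environ, start_response):
        url = upstream_url(upstream_base, environ.get("PATH_INFO", "/"),
                           environ.get("QUERY_STRING", ""))
        length = int(environ.get("CONTENT_LENGTH") or 0)
        body = environ["wsgi.input"].read(length) if length else b""
        status, headers, payload = fetch(url, environ["REQUEST_METHOD"],
                                         forwardable_headers(environ), body)
        out = [(k, v) for k, v in headers
               if k.lower() not in HOP_BY_HOP and k.lower() != "content-length"]
        out.append(("Content-Length", str(len(payload))))
        reason = http.client.responses.get(status, "Unknown")
        start_response("%d %s" % (status, reason), out)
        return [payload]
    return app
```

Something like `app = make_proxy_app('http://x.x.x.x:9000')` would then stand in for the redirect handler. Every request still counts against the fetch quota mentioned above.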
I am currently working on a project where we need to establish communication like an ESB, between a REST API and the apps services on a small scale.
Scenario:
Assume a web app front end (e.g. Django/Python or Ruby/Rails) and services that are accessible via a HTTP RESTful request.
How can I:
make it configurable which web services are called on a web request depending on the request and not requiring code changes (through keys for example)
encapsulate or implement the services in a way to make it easy to manage them e.g. start/stop etc.
I have been looking at spring.io, but can't work out whether it could be used for this.
I am open to all suggestions,
Thanks
From what I understand, you want an authorisation solution.
In Rails, Pundit and CanCanCan are very popular. You could also implement it from scratch. Here is a screencast to help you get started.
I need to figure out which IP address my application is actually connecting to when it makes a urlfetch to a provided domain. My application on the production server is having problems connecting to a domain but connecting works perfectly fine using the SDK on my computer. I am trying to debug this problem and it occurred to me that Google App Engine may be resolving the domain to a different IP address than my local computer is.
If I had access to the socket library this would be as simple as socket.gethostbyname('thedomainiwant.com') but unfortunately Google does not allow the socket library on App Engine.
Any ideas?
If there is a solution that requires Java or Go on App Engine I am willing to try that too.
Update June 26, 2011:
I changed the production code to use the IP directly right away just to get this working (and it did) but this is not a good long term solution as I don't control the server I am making urlfetches to so the IP may change on me without warning.
Returned headers would not be helpful in this case because whatever IP address the production instance is resolving the domain to is not responding at all and the request times out.
If the server I am doing urlfetches to was blocking App Engine then doing an urlfetch by IP would not work either...but it does work. Also, I talked to the team managing the server and they confirmed they are not blocking App Engine. I am still pestering them for more info but it does not seem to be a problem on that end.
Update July 7, 2011:
Google has confirmed that there was a problem on their end that affected my application. They have applied a work around and are working on a fix. See here:
http://code.google.com/p/googleappengine/issues/detail?id=5244
There's currently no way to do name resolution on App Engine. You'll have to call an external service over HTTP if you want to do that.
Take a look at the response headers, you might get a HOST header back with exactly this info.
Otherwise, why not just use the raw IPs for your connections while you're diagnosing this?
You can use web services that perform DNS lookup. You can embed the address in the URL, like this:
http://www.dnswatch.info/dns/dnslookup?la=en&host=HOST_HERE&type=A&submit=Resolve
(replace HOST_HERE with the hostname) and then parse the result. Unfortunately the response is HTML, but even a simple regex should do. You can also try to find a service that offers XML or similar output; there are a lot of such services, so just search for "dnslookup" on Google.
We use a lot of Python for our deployment, and it would be handy to connect to our TFS server to get information on iteration paths, tickets, etc. I can see the web service but I'm unable to find any documentation. Does anyone know of anything?
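A rough sketch of the "build the URL, then regex the HTML" idea follows. The query parameters are taken from the URL above. The layout of the returned page is not something I can vouch for, so the parser simply pulls out anything that looks like a valid IPv4 address, and both function names are made up for illustration.

```python
import re
from urllib.parse import urlencode

LOOKUP_BASE = "http://www.dnswatch.info/dns/dnslookup"

# Four dot-separated octets, each limited to 0-255.
_OCTET = r"(?:25[0-5]|2[0-4]\d|1?\d?\d)"
IPV4 = re.compile(r"\b(?:" + _OCTET + r"\.){3}" + _OCTET + r"\b")


def lookup_url(host):
    """Build the lookup URL for an A-record query on the given host."""
    params = {"la": "en", "host": host, "type": "A", "submit": "Resolve"}
    return LOOKUP_BASE + "?" + urlencode(params)


def extract_ipv4(html):
    """Return the distinct IPv4 addresses found in the page, in order of appearance."""
    found = []
    for address in IPV4.findall(html):
        if address not in found:
            found.append(address)
    return found
```

On App Engine you would fetch `lookup_url('thedomainiwant.com')` with urlfetch and pass the response body to `extract_ipv4`. Expect to filter the result, since the page may contain other addresses (for example the name server's own IP).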
The web services are not documented by Microsoft as it is not an officially supported route to talk to TFS. The officially supported route is to use their .NET API.
In the case of your sort of application, the course of action I usually recommend is to create your own web service shim that lives on the TFS server (or another server) and uses their API to talk to the server but allows you to present the data in a nice way to your application.
Their object model simplifies the interactions a great deal (depending on what you want to do) and so it actually means less code over-all - but better tested and testable code and also you can work around things such as the NTLM auth used by the TFS web services.
Hope that helps,
Martin.
So, this question is friggin' old, but let me take a whack at it (since it keeps coming up in my google searches).
There's no officially supported API for on-premises TFS (the MSFT-hosted one has http://www.visualstudio.com/en-us/integrate/api/overview).
That said, you can always use Fiddler (http://www.telerik.com/fiddler) or something like it to inspect the calls that the TFS web client makes to the server, and then do your magic to turn those into the Python scripts you want.
You'll need to run your Python scripts under a service account that has TFS privs appropriate to what it is trying to do (read, update, configure... whatever).
Since it sounds like you are just trying to read from TFS, this might be a really easy way for you to get what you want since an HTTP get to
http://yourserver/tfs/yourcollection/yourproject/_workitems#id=yourworkitemid
will hand you back (halfway) sane html payloads.
If you want lists of iterations or teams or whatever, then your service account needs to have the appropriate admin privileges and hit things like
http://yourserver/tfs/yourcollection/yourproject/_admin/_iterations
and use that response.