Multithreaded Downloading Through Proxies In Python - python

What would be the best library for multithreaded harvesting/downloading with multiple proxy support? I've looked at Tkinter and it looks good, but there are so many libraries. Does anyone have a specific recommendation? Many thanks!

Twisted

Is this something you can't just do by passing a URL to newly spawned threads and calling urllib2.urlopen in each one, or is there a more specific requirement?
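The thread-per-URL approach described above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes Python 3, where `urllib.request` replaces `urllib2`, and it uses a thread pool instead of spawning threads by hand. The names `fetch` and `download_all`, the round-robin proxy assignment, and the proxy URL format are illustrative assumptions, not something from the question.

```python
import itertools
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url, proxy=None, timeout=10):
    """Download one URL, optionally through an HTTP proxy such as 'http://host:port'."""
    # An empty dict tells ProxyHandler to use no proxy at all
    proxies = {"http": proxy, "https": proxy} if proxy else {}
    opener = urllib.request.build_opener(urllib.request.ProxyHandler(proxies))
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def download_all(urls, proxies=(None,), workers=8, fetch=fetch):
    """Fetch every URL on a thread pool, assigning proxies round-robin.

    Returns a dict mapping each URL to its body, or to the exception raised.
    The fetch parameter is injectable so the function can be tried offline.
    """
    def safe_fetch(url, proxy):
        # Capture failures per URL so one bad proxy does not abort the batch
        try:
            return fetch(url, proxy)
        except Exception as exc:
            return exc

    assigned = list(zip(urls, itertools.cycle(proxies)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [(url, pool.submit(safe_fetch, url, proxy)) for url, proxy in assigned]
        return {url: future.result() for url, future in futures}
```

Calling `download_all(urls, proxies=["http://proxy1:8080", "http://proxy2:8080"])` (hypothetical proxy addresses) would send the first URL through the first proxy, the second through the second, and so on. For a very large number of URLs, an event-driven library such as Twisted may scale better than threads.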

Also take a look at http://scrapy.org/, which is a scraping framework built on top of twisted.

Related

how to use tornado and django on open swift

Recently I have been working on openswift with Django (1.8) and Python (3.3). I was wondering whether it is possible to use WebSockets with Tornado alongside Django, because Tornado would handle the requests in an asynchronous I/O handler. Any good suggestions?
There is django-channels, which is a WebSocket implementation based on Django. I would recommend taking a look. Those guys really thought about how to change Django's request-response paradigm so that it can deal with persistent connections.
There is a nice overview. They also wrote down the concepts in detail.
Read that one regarding the comparison with Tornado.

Python - Server and browser-client

I have written a python server that does a task depending on the input given by the user through a client. Unfortunately, this requires the user to use the terminal.
I'd like the user to use a browser instead to send the data to the server. How would I go about this?
Does anyone here have suggestions? Perhaps even an example?
Thank you all in advance.
This is a very subjective question and depends on what exactly you are trying to achieve, but if you want to write a program with an embedded HTTP server, you could use either Tornado or Twisted. I've spent some time with both and found that Tornado is a bit cleaner and easier to write a web API with, but Twisted is more versatile if you want to handle different types of network connections.
Answering my question for future reference or other people with similar requests.
All of the requirements for this can be found in the standard module BaseHTTPServer (renamed http.server in Python 3).
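A minimal sketch of what such a server might look like, assuming Python 3, where the module is called `http.server`. The form field name `task`, the `run_task` function, and port 8000 are hypothetical placeholders for whatever the real server does; none of them come from the original post.

```python
import html
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs

# A bare HTML form that posts back to the same URL
FORM = b"""<form method="post">
<input name="task"><input type="submit" value="Send">
</form>"""


def run_task(text):
    """Hypothetical stand-in for whatever the server does with the user's input."""
    return text.upper()


class TaskHandler(BaseHTTPRequestHandler):
    def _reply(self, body):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Serve the input form to the browser
        self._reply(FORM)

    def do_POST(self):
        # Read the URL-encoded form body and pass the field to the task
        length = int(self.headers.get("Content-Length", 0))
        fields = parse_qs(self.rfile.read(length).decode("utf-8"))
        result = run_task(fields.get("task", [""])[0])
        # Escape the result so user input is not echoed back as raw HTML
        self._reply(html.escape(result).encode("utf-8"))


def serve(port=8000):
    """Run the server until interrupted."""
    HTTPServer(("localhost", port), TaskHandler).serve_forever()
```

After calling `serve()`, opening `http://localhost:8000/` in a browser should show the form, and submitting it should display the task's result. This is fine for local experiments; for anything exposed to other users, a framework such as Tornado or Twisted is probably the safer choice.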

Using proxies with Unirest

I'm using the Unirest library for making async web requests with Python. I've read the documentation, but I wasn't able to find if I can use proxy with it. Maybe I'm just blind and there's a way to use it with Unirest?
Or is there some other way to specify proxy for Python? Proxies should be changed from script itself after making some requests, so this way should allow me to do it.
Thanks in advance.
I know nothing about Unirest, but in all the scripts I wrote that required proxy support I used the SocksiPy (http://socksipy.sourceforge.net) module. It supports HTTP, SOCKS4 and SOCKS5, and it's really easy to use. :)
Would something like this work for you?
[1] https://github.com/obriencj/python-promises
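On the question of changing proxies from within the script: for plain HTTP proxies, the standard library alone may be enough. The following is a minimal sketch, assuming Python 3 and `urllib.request`; it does not use Unirest or SocksiPy. The class name `RotatingProxyClient`, the `requests_per_proxy` parameter, and the rotation policy are illustrative assumptions.

```python
import itertools
import urllib.request


class RotatingProxyClient:
    """Switch to the next proxy after a fixed number of requests."""

    def __init__(self, proxies, requests_per_proxy=5):
        self._proxies = itertools.cycle(proxies)
        self._limit = requests_per_proxy
        self._count = 0
        self.current = None
        self._opener = None
        self.rotate()

    def rotate(self):
        """Move to the next proxy and build a fresh opener for it."""
        self.current = next(self._proxies)
        handler = urllib.request.ProxyHandler(
            {"http": self.current, "https": self.current}
        )
        self._opener = urllib.request.build_opener(handler)
        self._count = 0

    def opener(self):
        """Return the opener for the next request, rotating first if the limit is reached."""
        if self._count >= self._limit:
            self.rotate()
        self._count += 1
        return self._opener

    def get(self, url, timeout=10):
        """Fetch a URL through whichever proxy is currently active."""
        with self.opener().open(url, timeout=timeout) as response:
            return response.read()
```

With `client = RotatingProxyClient(["http://proxy1:8080", "http://proxy2:8080"])` (hypothetical addresses), each `client.get(url)` call goes through the current proxy, and `client.rotate()` can also be called by hand, for example after a failed request. The requests here are blocking; they would need to be run on threads to get the asynchronous behaviour the question mentions.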

Can I use socket.io with twisted.web?

I'm writing a web application using Python's twisted.web on the server side.
On the frontend side I would like to use Ajax for displaying real time updates of events which are happening in the server.
There is a lot of information out there on how this can be done, so I realized I need to pick a JavaScript library that would make my life easier.
socket.io seems to be a good choice since it supports several browsers and transport mechanisms, but by reading their examples it seems it can only work with node.js?
So, does anyone know if it's possible to use socket.io with twisted.web?
If so, any links for a good example/tutorial would be also welcome.
You could try https://github.com/DesertBus/sockjs-twisted or, if you need SocketIO for a specific reason, it wouldn't be difficult to port TornadIO2 to Cyclone. You might find this issue interesting.
You need something server side to integrate with the socket.io script on the client side. The servers that I know of that are written in Python and do this all use Tornado. You could look at an implementation like Tornadio (https://github.com/MrJoes/tornadio) and see what methods and classes they used to hook Tornadio and Tornado together. This would give you a pretty good idea of how to integrate it with your twisted.web server.
We've just switched away from socket.io to sockJS (which is also compatible with Tornado) and have seen large performance improvements.

Distributed python

What is the best python framework to create distributed applications? For example to build a P2P app.
I think you mean "Networked Apps"? Distributed means an app that can split its workload among multiple worker clients over the network.
You probably want Twisted. There is a P2P framework for Twisted called "Vertex". While not actively maintained, it does allow you to tunnel through NATs and make connections directly between users in a very abstract way; if there were more interest in this sort of thing I'm sure it would be more actively maintained.
You could check out pyprocessing, which is included in the standard library as of Python 2.6 under the name multiprocessing. It allows you to run tasks on multiple processes using an API similar to threading.
You could download the source of BitTorrent for starters and see how they did it.
http://download.bittorrent.com/dl/
If it's something where you're going to need tons of threads and need better concurrent performance, check out Stackless Python. Otherwise you could just use the SOAP or XML-RPC protocols. In response to Ben's post, if you don't want to look over the BitTorrent source, you could just look at the article on the BitTorrent protocol.
