error: can't start new thread - python

I have a site that runs with the following configuration:
Django + mod_wsgi + Apache
In one of the user's requests, I send another HTTP request to another service, using Python's httplib library.
But sometimes this service doesn't answer for too long, and httplib's timeout doesn't work. So I create a thread, send the request to the service in that thread, and join it after 20 seconds (20 seconds is the timeout of the request). This is how it works:
import base64
import httplib
import threading

class HttpGetTimeOut(threading.Thread):
    def __init__(self, **kwargs):
        self.config = kwargs
        self.resp_data = None
        self.exception = None
        super(HttpGetTimeOut, self).__init__()

    def run(self):
        h = httplib.HTTPSConnection(self.config['server'])
        h.connect()
        sended_data = self.config['sended_data']
        h.putrequest("POST", self.config['path'])
        h.putheader("Content-Length", str(len(sended_data)))
        h.putheader("Content-Type", 'text/xml; charset="utf-8"')
        if 'base_auth' in self.config:
            base64string = base64.encodestring('%s:%s' % self.config['base_auth'])[:-1]
            h.putheader("Authorization", "Basic %s" % base64string)
        h.endheaders()
        try:
            h.send(sended_data)
            self.resp_data = h.getresponse()
        except httplib.HTTPException, e:
            self.exception = e
        except Exception, e:
            self.exception = e
something like this...
And I use it with this function:
getting = HttpGetTimeOut(**req_config)
getting.start()
getting.join(COOPERATION_TIMEOUT)
if getting.isAlive():  # maybe needs a lock here
    getting._Thread__stop()
    raise ValueError('Timeout')
else:
    if getting.resp_data:
        r = getting.resp_data
    else:
        if getting.exception:
            raise ValueError('Request Exception')
        else:
            raise ValueError('Undefined exception')
And everything works fine, but sometimes I start catching this exception:
error: can't start new thread
at the line of starting new thread:
getting.start()
and the next and the final line of traceback is
File "/usr/lib/python2.5/threading.py", line 440, in start
_start_new_thread(self.__bootstrap, ())
So my question is: what is happening?
Thanks to all, and sorry for my poor English. :)

The "can't start new thread" error almost certainly due to the fact that you have already have too many threads running within your python process, and due to a resource limit of some kind the request to create a new thread is refused.
You should probably look at the number of threads you're creating; the maximum number you will be able to create will be determined by your environment, but it should be in the order of hundreds at least.
It would probably be a good idea to re-think your architecture here; seeing as this is running asynchronously anyhow, perhaps you could use a pool of threads to fetch resources from another site instead of always starting up a thread for every request.
Another improvement to consider is your use of Thread.join and Thread.stop; this would probably be better accomplished by providing a timeout value to the constructor of HTTPSConnection.

You are starting more threads than your system can handle. There is a limit to the number of threads that can be active in one process.
Your application is starting threads faster than the threads are running to completion. If you need to start many threads, you need to do it in a more controlled manner; I would suggest using a thread pool, as in the sketch below.
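A minimal sketch of that idea, using only the stdlib Queue and threading modules (the pool size of 10 and the fetch() placeholder are assumptions, not code from the question):

import threading
import Queue

def fetch(config):
    pass  # placeholder for the actual httplib request logic from the question

task_queue = Queue.Queue()

def worker():
    while True:
        config = task_queue.get()
        try:
            fetch(config)
        finally:
            task_queue.task_done()

for _ in range(10):  # a fixed, small number of threads created once at startup
    t = threading.Thread(target=worker)
    t.setDaemon(True)
    t.start()

# each incoming request only enqueues work instead of starting a thread
task_queue.put({'server': 'example.com', 'path': '/service'})
task_queue.join()

The point is that the number of threads is fixed at startup, so a burst of requests can no longer exhaust the per-process thread limit.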

I was running into a similar situation, but my process needed a lot of threads running to take care of a lot of connections.
I counted the number of threads with the command:
ps -fLu user | wc -l
It displayed 4098.
I switched to the user and looked at the system limits:
sudo -u myuser -s /bin/bash
ulimit -u
Got 4096 as response.
So, I edited /etc/security/limits.d/30-myuser.conf and added the lines:
myuser hard nproc 16384
myuser soft nproc 16384
Restarted the service and now it's running with 7017 threads.
P.S. I have a 32-core server and I'm handling 18k simultaneous connections with this configuration.

I think the best way in your case is to set a socket timeout instead of spawning a thread:
h = httplib.HTTPSConnection(self.config['server'],
                            timeout=self.config['timeout'])
Also, you can set a global default timeout with the socket.setdefaulttimeout() function.
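For example (the 20-second value below just mirrors the question's COOPERATION_TIMEOUT and is only an assumption):

import socket

# every socket created after this call inherits the 20-second timeout
socket.setdefaulttimeout(20)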
Update: See the answers to the question Is there any way to kill a Thread in Python? (there are several quite informative ones) to understand why. Thread._Thread__stop() doesn't terminate the thread, but rather sets an internal flag so that it's considered already stopped.

I completely rewrote the code from httplib to pycurl.
import StringIO
import pycurl

c = pycurl.Curl()
c.setopt(pycurl.FOLLOWLOCATION, 1)
c.setopt(pycurl.MAXREDIRS, 5)
c.setopt(pycurl.CONNECTTIMEOUT, CONNECTION_TIMEOUT)
c.setopt(pycurl.TIMEOUT, COOPERATION_TIMEOUT)
c.setopt(pycurl.NOSIGNAL, 1)
c.setopt(pycurl.POST, 1)
c.setopt(pycurl.SSL_VERIFYHOST, 0)
c.setopt(pycurl.SSL_VERIFYPEER, 0)
c.setopt(pycurl.URL, "https://" + server + path)
c.setopt(pycurl.POSTFIELDS, sended_data)
b = StringIO.StringIO()
c.setopt(pycurl.WRITEFUNCTION, b.write)
c.perform()
something like that.
And I'm testing it now. Thanks to all of you for the help.

If you are trying to set a timeout, why don't you use urllib2?
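For example, something along these lines should work on Python 2.6+, where urlopen accepts a timeout argument (the URL, payload, and 20-second timeout are placeholders, not the asker's real values):

import urllib2

req = urllib2.Request('https://example.com/path', data='<xml/>',
                      headers={'Content-Type': 'text/xml; charset="utf-8"'})
try:
    response = urllib2.urlopen(req, timeout=20)  # raises after 20 seconds with no answer
    body = response.read()
except urllib2.URLError, e:
    print 'request failed: %s' % e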

I'm running a Python script on my machine only to copy and convert some files from one format to another, and I want to maximize the number of running threads to finish as quickly as possible.
Note: this is not a good workaround from an architecture perspective if you aren't using it for a quick script on a specific machine.
In my case, I checked the maximum number of running threads that my machine could run before I got the error; it was 150.
I added this code before starting a new thread. It checks whether the maximum number of running threads has been reached; if so, the app waits until some of the running threads finish, then it starts the new thread:
while threading.active_count() > 150:
    time.sleep(5)
mythread.start()

If you are using a ThreadPoolExecutor, the problem may be that your max_workers is higher than the number of threads allowed by your OS.
It seems that the executor keeps the information about the last executed threads in the process table, even if the threads are already done. This means that when your application has been running for a long time, it will eventually register in the process table as many threads as ThreadPoolExecutor.max_workers.
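If that is your situation, a hedged sketch of keeping max_workers well below the OS limit might look like this (fetch, urls, and the cap of 32 are placeholders):

from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    return url  # placeholder for real work

urls = ['https://example.com/%d' % i for i in range(1000)]

# keep the worker count well below the per-process thread limit
with ThreadPoolExecutor(max_workers=32) as executor:
    results = list(executor.map(fetch, urls))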

As far as I can tell, it's not a Python problem. Your system somehow cannot create another thread (I had the same problem and couldn't start htop in another CLI session via ssh).
The answer from Fernando Ulisses dos Santos is really good. I just want to add that there are other tools limiting the number of processes and memory usage "from the outside". It's pretty common for virtual servers. The starting point is your vendor's interface, or you might have luck finding some information in files like
/proc/user_beancounters


Set function timeout without having to use contextlib [duplicate]

I looked online and found some SO discussions and ActiveState recipes for running some code with a timeout. It looks like there are some common approaches:
Use a thread that runs the code, and join it with a timeout. If the timeout elapses, kill the thread. This is not directly supported in Python (it uses the private _Thread__stop function), so it is bad practice.
Use signal.SIGALRM, but this approach does not work on Windows!
Use a subprocess with a timeout, but this is too heavy: what if I want to start interruptible tasks often? I don't want to fire a process for each one!
So, what is the right way? I'm not asking about workarounds (e.g. use Twisted and async IO), but about the actual way to solve the actual problem: I have some function and I want to run it only with some timeout. If the timeout elapses, I want control back. And I want it to work on Linux and Windows.
A completely general solution to this really, honestly does not exist. You have to use the right solution for a given domain.
If you want timeouts for code you fully control, you have to write it to cooperate. Such code has to be able to break up into little chunks in some way, as in an event-driven system. You can also do this by threading if you can ensure nothing will hold a lock too long, but handling locks right is actually pretty hard.
If you want timeouts because you're afraid code is out of control (for example, if you're afraid the user will ask your calculator to compute 9**(9**9)), you need to run it in another process. This is the only easy way to sufficiently isolate it. Running it in your event system or even a different thread will not be enough. It is also possible to break things up into little chunks similar to the other solution, but requires very careful handling and usually isn't worth it; in any event, that doesn't allow you to do the same exact thing as just running the Python code.
What you might be looking for is the multiprocessing module. If subprocess is too heavy, then this may not suit your needs either.
import time
import multiprocessing

def do_this_other_thing_that_may_take_too_long(duration):
    time.sleep(duration)
    return 'done after sleeping {0} seconds.'.format(duration)

pool = multiprocessing.Pool(1)
print 'starting....'
res = pool.apply_async(do_this_other_thing_that_may_take_too_long, [8])
for timeout in range(1, 10):
    try:
        print '{0}: {1}'.format(timeout, res.get(timeout))
    except multiprocessing.TimeoutError:
        print '{0}: timed out'.format(timeout)
print 'end'
If it's network related you could try:
import socket
socket.setdefaulttimeout(number)
I found this with eventlet library:
http://eventlet.net/doc/modules/timeout.html
from eventlet.timeout import Timeout

timeout = Timeout(seconds, exception)
try:
    ...  # execution here is limited by timeout
finally:
    timeout.cancel()
For "normal" Python code, that doesn't linger prolongued times in C extensions or I/O waits, you can achieve your goal by setting a trace function with sys.settrace() that aborts the running code when the timeout is reached.
Whether that is sufficient or not depends on how co-operating or malicious the code you run is. If it's well-behaved, a tracing function is sufficient.
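A minimal sketch of that idea (the names run_with_timeout and CodeTimeout are invented here, and it only works for code that keeps triggering trace events, i.e. pure Python):

import sys
import time

class CodeTimeout(Exception):
    pass

def run_with_timeout(func, seconds):
    start = time.time()

    def tracer(frame, event, arg):
        # raising here propagates into the traced code and aborts it
        if time.time() - start > seconds:
            raise CodeTimeout('timed out after %s seconds' % seconds)
        return tracer  # keep receiving line events in each new frame

    sys.settrace(tracer)  # note: tracing noticeably slows the traced code down
    try:
        return func()
    finally:
        sys.settrace(None)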
Another way is to use faulthandler:
import time
import faulthandler

faulthandler.enable()
try:
    faulthandler.dump_tracebacks_later(3)
    time.sleep(10)
finally:
    faulthandler.cancel_dump_tracebacks_later()
N.B.: The faulthandler module is part of the stdlib in Python 3.3.
If you're running code that you expect to die after a set time, then you should write it properly so that there aren't any negative effects on shutdown, no matter if it's a thread or a subprocess. A command pattern with undo would be useful here.
So, it really depends on what the thread is doing when you kill it. If it's just crunching numbers, who cares if you kill it? If it's interacting with the filesystem and you kill it, then maybe you should really rethink your strategy.
What is supported in Python when it comes to threads? Daemon threads and joins. Why does Python let the main thread exit if you've joined a daemon while it's still active? Because it's understood that someone using daemon threads will (hopefully) write the code in a way that it won't matter when that thread dies. Giving a timeout to a join and then letting main die, and thus taking any daemon threads with it, is perfectly acceptable in this context.
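As a small sketch of that pattern (the 60-second job and the 5-second join are arbitrary values chosen here):

import threading
import time

def work():
    time.sleep(60)  # stand-in for the real long-running job

t = threading.Thread(target=work)
t.daemon = True     # dies together with the main thread
t.start()
t.join(5)           # give it at most 5 seconds
if t.is_alive():
    print "timed out; exiting now takes the daemon thread down with the process"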
I've solved it in this way:
For me it worked great (on Windows and not heavy at all). I hope it will be useful for someone)
import threading
import time

class LongFunctionInside(object):
    lock_state = threading.Lock()
    working = False

    def long_function(self, timeout):
        self.working = True
        timeout_work = threading.Thread(name="thread_name", target=self.work_time, args=(timeout,))
        timeout_work.setDaemon(True)
        timeout_work.start()
        while True:  # endless/long work
            time.sleep(0.1)  # at this rate the CPU is almost not used
            if not self.working:  # if working flipped to False, the timeout fired
                break
        self.set_state(True)

    def work_time(self, sleep_time):
        # thread function that just sleeps for the specified time; on wake-up it
        # checks whether the function is still working and, if so, flips the secured variable to False
        time.sleep(sleep_time)
        if self.working:
            self.set_state(False)

    def set_state(self, state):  # secured state change
        while True:
            self.lock_state.acquire()
            try:
                self.working = state
                break
            finally:
                self.lock_state.release()

lw = LongFunctionInside()
lw.long_function(10)
The main idea is to create a thread that will just sleep in parallel to the "long work" and, on wake-up (after the timeout), change the secured state variable; the long function checks the secured variable during its work.
I'm pretty new to Python programming, so if that solution has fundamental errors, like resource, timing, or deadlock problems, please respond))
Solving it with the 'with' construct and merging the solution from
Timeout function if it takes too long to finish
with this thread, which works better:
import threading, time

class Exception_TIMEOUT(Exception):
    pass

class linwintimeout:

    def __init__(self, f, seconds=1.0, error_message='Timeout'):
        self.seconds = seconds
        self.thread = threading.Thread(target=f)
        self.thread.daemon = True
        self.error_message = error_message

    def handle_timeout(self):
        raise Exception_TIMEOUT(self.error_message)

    def __enter__(self):
        try:
            self.thread.start()
            self.thread.join(self.seconds)
        except Exception, te:
            raise te

    def __exit__(self, type, value, traceback):
        if self.thread.is_alive():
            return self.handle_timeout()

def function():
    while True:
        print "keep printing ...", time.sleep(1)

try:
    with linwintimeout(function, seconds=5.0, error_message='exceeded timeout of %s seconds' % 5.0):
        pass
except Exception_TIMEOUT, e:
    print " attention !! exceeded timeout, giving up ... %s " % e

Using celery to process huge text files

Background
I'm looking into using celery (3.1.8) to process huge text files (~30GB each). These files are in fastq format and contain about 118M sequencing "reads", each of which is essentially a combination of a header, a DNA sequence, and a quality string. Also, these sequences are from a paired-end sequencing run, so I'm iterating two files simultaneously (via itertools.izip). What I'd like to be able to do is take each pair of reads, send them to a queue, and have them be processed on one of the machines in our cluster (don't care which) to return a cleaned-up version of the read, if cleaning needs to happen (e.g., based on quality).
I've set up celery and rabbitmq, and my workers are launched as follows:
celery worker -A tasks --autoreload -Q transient
and configured like:
from kombu import Queue

BROKER_URL = 'amqp://guest@godel97'
CELERY_RESULT_BACKEND = 'rpc'
CELERY_TASK_SERIALIZER = 'pickle'
CELERY_RESULT_SERIALIZER = 'pickle'
CELERY_ACCEPT_CONTENT = ['pickle', 'json']
CELERY_TIMEZONE = 'America/New_York'
CELERY_ENABLE_UTC = True
CELERYD_PREFETCH_MULTIPLIER = 500
CELERY_QUEUES = (
    Queue('celery', routing_key='celery'),
    Queue('transient', routing_key='transient', delivery_mode=1),
)
I've chosen to use an rpc backend and pickle serialization for performance, as well as not
writing anything to disk in the 'transient' queue (via delivery_mode).
Celery startup
To set up the celery framework, I first launch the rabbitmq server (3.2.3, Erlang R16B03-1) on a 64-way box, writing log files to a fast /tmp disk. Worker processes (as above) are launched on each node on the cluster (about 34 of them) ranging anywhere from 8-way to 64-way SMP for a total of 688 cores. So, I have a ton of available CPUs for the workers to use to process of the queue.
Job submission/performance
Once celery is up and running, I submit the jobs via an ipython notebook as below:
files = [foo, bar]
f1 = open(files[0])
f2 = open(files[1])
res = []
count = 0
for r1, r2 in izip(FastqGeneralIterator(f1), FastqGeneralIterator(f2)):
    count += 1
    res.append(tasks.process_read_pair.s(r1, r2))
    if count == 10000:
        break
t.stop()
g = group(res)
for task in g.tasks:
    task.set(queue="transient")
This takes about 1.5s for 10000 pairs of reads. Then, I call delay on the group to submit to the workers, which takes about 20s, as below:
result = g.delay()
Monitoring with rabbitmq console, I see that I'm doing OK, but not nearly fast enough.
Question
So, is there any way to speed this up? I mean, I'd like to see at least 50,000 read pairs processed every second rather than 500. Is there anything obvious that I'm missing in my celery configuration? My worker and rabbit logs are essentially empty. Would love some advice on how to get my performance up. Each individual read pair processes pretty quickly, too:
[2014-01-29 13:13:06,352: INFO/Worker-1] tasks.process_read_pair[95ec7f2f-0143-455a-a23b-c032998951b8]: HWI-ST425:143:C04A5ACXX:3:1101:13938:2894 1:N:0:ACAGTG HWI-ST425:143:C04A5ACXX:3:1101:13938:2894 2:N:0:ACAGTG 0.00840497016907 sec
Up to this point
So up to this point, I've googled all I can think of with celery, performance, routing, rabbitmq, etc. I've been through the celery website and docs. If I can't get the performance higher, I'll have to abandon this method in favor of another solution (basically dividing up the work into many smaller physical files and processing them directly on each compute node with multiprocessing or something). It would be a shame to not be able to spread this load out over the cluster, though. Plus, this seems like an exquisitely elegant solution.
Thanks in advance for any help!
Not an answer but too long for a comment.
Let's narrow the problem down a little...
Firstly, try skipping all your normal logic/message preparation and just do the tightest possible publishing loop with your current library. See what rate you get. This will identify if it's a problem with your non-queue-related code.
If it's still slow, set up a new python script but use amqplib instead of celery. I've managed to get it publishing at over 6000/s while doing useful work (and json encoding) on a mid-range desktop, so I know that it's performant. This will identify if the problem is with the celery library. (To save you time, I've snipped the following from a project of mine and hopefully not broken it when simplifying...)
from amqplib import client_0_8 as amqp

try:
    lConnection = amqp.Connection(
        host=###,
        userid=###,
        password=###,
        virtual_host=###,
        insist=False)
    lChannel = lConnection.channel()
    Exchange = ###

    for i in range(100000):
        lMessage = amqp.Message("~130 bytes of test data..........................................................................................................")
        lMessage.properties["delivery_mode"] = 2
        lChannel.basic_publish(lMessage, exchange=Exchange)

    lChannel.close()
    lConnection.close()
except Exception as e:
    # Fail
    pass
Between the two approaches above you should be able to track down the problem to one of the Queue, the Library or your code.
Reusing the producer instance should give you some performance improvement:
with app.producer_or_acquire() as producer:
    task.apply_async(producer=producer)
Also the task may be a proxy object and if so must be evaluated for every invocation:
task = task._get_current_object()
Using group will automatically reuse the producer and is usually what you would
do in a loop like this:
process_read_pair = tasks.process_read_pair.s
g = group(
    process_read_pair(r1, r2)
    for r1, r2 in islice(
        izip(FastqGeneralIterator(f1), FastqGeneralIterator(f2)), 0, 1000)
)
result = g.delay()
You can also consider installing the librabbitmq module which is written in C.
The amqp:// transport will automatically use it if available (or it can be specified manually using librabbitmq://):
pip install librabbitmq
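With the broker from the question, that would presumably look something like this (only the scheme changes; the hostname is taken from the question above):

BROKER_URL = 'librabbitmq://guest@godel97'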
Publishing messages directly using the underlying library may be faster
since it will bypass the celery routing helpers and so on, but I would not
think it was that much slower. If so there is definitely room for optimization in Celery,
as I have mostly focused on optimizing the consumer side so far.
Note also that you may want to process multiple DNA pairs in the same task,
as using coarser task granularity may be beneficial for CPU/memory caches and so on,
and it will often saturate parallelization anyway since that is a finite resource.
NOTE: The transient queue should be durable=False
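A hedged sketch of that coarser granularity, reusing the question's iteration but batching 1000 pairs per task (process_read_batch is a hypothetical task that would loop over the pairs itself):

batch, res = [], []
for r1, r2 in izip(FastqGeneralIterator(f1), FastqGeneralIterator(f2)):
    batch.append((r1, r2))
    if len(batch) == 1000:
        res.append(tasks.process_read_batch.s(batch))  # hypothetical batch task
        batch = []
if batch:
    res.append(tasks.process_read_batch.s(batch))
result = group(res).delay()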
One option you have is that the reads are highly compressible, so you could replace the following
res.append(tasks.process_read_pair.s(r1, r2))
by
res.append(tasks.process_bytes.s(
    zlib.compress(pickle.dumps((r1, r2), protocol=pickle.HIGHEST_PROTOCOL), 1)))  # compression level 1
and calling pickle.loads(zlib.decompress(obj)) on the other side.
It should win by a big factor for long enough DNA sequences; if they are not long enough, you can group them in chunks into an array, which you dump and compress.
Another win can be to use ZeroMQ for transport if you aren't doing so yet.
I'm not sure what process_bytes should be.
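As a purely hypothetical sketch (not from the original answer), the worker side could simply reverse the compression and hand off to the existing per-pair logic:

import pickle
import zlib

@app.task()
def process_bytes(blob):
    r1, r2 = pickle.loads(zlib.decompress(blob))
    return process_read_pair(r1, r2)  # reuse the existing per-pair cleaning logic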
Again, not an answer, but too long for comments. Per Basic's comments/answer below, I set up the following test using the same exchange and routing as my application:
from amqplib import client_0_8 as amqp

try:
    lConnection = amqp.Connection()
    lChannel = lConnection.channel()
    Exchange = 'celery'

    for i in xrange(1000000):
        lMessage = amqp.Message("~130 bytes of test data..........................................................................................................")
        lMessage.properties["delivery_mode"] = 1
        lChannel.basic_publish(lMessage, exchange=Exchange, routing_key='transient')

    lChannel.close()
    lConnection.close()
except Exception as e:
    print e
You can see that it's rocking right along.
I guess now it's up to finding out the difference between this and what's going on inside of celery
I added amqp into my logic, and it's fast. FML.
from amqplib import client_0_8 as amqp

try:
    import stopwatch
    lConnection = amqp.Connection()
    lChannel = lConnection.channel()
    Exchange = 'celery'

    t = stopwatch.Timer()
    files = [foo, bar]
    f1 = open(files[0])
    f2 = open(files[1])
    res = []
    count = 0
    for r1, r2 in izip(FastqGeneralIterator(f1), FastqGeneralIterator(f2)):
        count += 1
        #res.append(tasks.process_read_pair.s(args=(r1, r2)))
        #lMessage = amqp.Message("~130 bytes of test data..........................................................................................................")
        lMessage = amqp.Message(" ".join(r1) + " ".join(r2))
        res.append(lMessage)
        lMessage.properties["delivery_mode"] = 1
        lChannel.basic_publish(lMessage, exchange=Exchange, routing_key='transient')
        if count == 1000000:
            break
    t.stop()
    print "added %d tasks in %s" % (count, t)

    lChannel.close()
    lConnection.close()
except Exception as e:
    print e
So, I made a change to submit an async task to celery in the loop, as below:
res.append(tasks.speed.apply_async(args=("FML",), queue="transient"))
The speed method is just this:
@app.task()
def speed(s):
    return s
Submitting the tasks this way, I'm slow again!
So, it doesn't appear to have anything to do with:
How I'm iterating to submit to the queue
The message that I'm submitting
but rather, it has to do with the queueing of the function?!?! I'm confused.
Again, not an answer, but more of an observation. By simply changing my backend from rpc to redis, I more than triple my throughput:

Sharing variables across multiple mod_wsgi processes/threads

Can you have an object shared across multiple WSGI threads/processes (and in a manner that would work on both *NIX and Windows)?
The basic premise:
(1) I have a WSGI front end that will connect to a back end server. I have a serialization class, that contains rules on how to serialize/unserialize various objects, including classes specific to this project. As such, it needs to have some setup telling it how to handle custom objects. However, it is otherwise stateless - multiple threads can access it at the same time to serialize their data without issue.
(2) There are sockets to connect to the back end. I'd rather have a socket pool than create/destroy every time there's a connection.
Note: I don't mind a solution where there are multiple instances of (1) and (2), but ideally, I'd like them created/initialized as few times as possible. I'm not sure about the internals, but if a thread loops rather than being closed and having the server reopen one on a new request, it would be fine to have one data set per thread (and hence the socket and serializer are initialized once per thread, but reused for each subsequent call it handles). Actually, having one socket per thread, if that's how it works, would be best, since I wouldn't need a socket pool and wouldn't have to deal with mutexes.
Note: this is not sessions and has nothing to do with sessions. This shouldn't care who is making the call to the server. It's only about tweaking performance on systems that have slow thread creation, or a lot of memory but relatively slow CPUs.
Edit 2: The below code will give some info on how your system shares variables. You'll need to load the page a few times to get some diagnostics...
from datetime import *;
from threading import *;
import thread;
from cgi import escape;
from os import getpid;

count = 0;
responses = [];
lock = RLock();
start_ident = "%08X::%08X" % (thread.get_ident(), getpid());
show_env = False;

def application(environ, start_response):
    status = '200 OK';
    this_ident = "%08X::%08X" % (thread.get_ident(), getpid());

    lock.acquire();
    current_response = """<HR>
<B>Request Number</B>: """ + str(count) + """<BR>
<B>Request Time</B>: """ + str(datetime.now()) + """<BR>
<B>Current Thread</B>: """ + this_ident + """<BR>
<B>Initializing Thread</B>: """ + start_ident + """<BR>
Multithread/Multiprocess: """ + str(environ["wsgi.multithread"]) + "/" + str(environ["wsgi.multiprocess"]) + """<BR>
"""
    global count;
    count += 1;
    global responses;
    responses.append(current_response)
    if(len(responses) >= 100):
        responses = responses[1:];
    lock.release();

    output = "<HTML><BODY>";
    if(show_env):
        output += "<H2>Environment</H2><TABLE><TR><TD>Key</TD><TD>Value</TD></TR>";
        for k in environ.keys():
            output += "<TR><TD>" + escape(k) + "</TD><TD>" + escape(str(environ[k])) + "</TD></TR>";
        output += "</TABLE>";

    output += "<H2>Response History</H2>";
    for r in responses:
        output += r;
    output += "</BODY></HTML>"

    response_headers = [('Content-type', 'text/html'),
                        ('Content-Length', str(len(output)))]
    start_response(status, response_headers)
    return [output]
For some reading on how mod_wsgi process/threading model works see:
http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading
Pay particular note to the section on building a portable application.
The comments there aren't really any different no matter what WSGI server you use. All WSGI servers also fall into one of those multi-process/single-process, multi-thread/single-thread categories.
By my reading of http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading, if you have multiprocessing and multithreading on (as with the worker MPM, or if you have
WSGIDaemonProcess example processes=2 threads=25
), then you have BOTH problems: multiple threads mean you could share a variable, but it would only be shared within each of the 2 processes. There is no real way to share vars between processes unless you explicitly have another daemon (NON-APACHE) handling the message passing and requests.
Let's say you have a simple need for a database connection pool. With the above configuration, you'd have two pools, each serving 25 threads. This is fine for most people, as threads are lightweight and processes aren't (supposedly).
So, how to implement this?
In one of your modules, create a variable that instantiates a connection pool. Have each thread (really, the code that services an individual request) get a connection at an appropriate time, use it, and return it to the pool. Don't forget the last part or you'll run out of connections quickly. (A rough sketch of this option follows after these alternatives.)
Create another daemon (not Apache-related). Instantiate an array of shared memory. Into this array put objects that consist of a db connection and a process id (null to start). In a while-True loop, listen for connections on a socket, and when you get one, spawn a subprocess, passing in the shared array and the number of the array element. The subprocess fills in the process id it knows, handles the request, closes any cursors, then removes its process id and exits.
Hire a programmer familiar with WSGI and db connection pooling to do it for you ;-) .
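A rough sketch of the first option, assuming a module-level pool built on the stdlib Queue module (make_connection and the pool size are placeholders; each mod_wsgi process gets its own copy, shared by that process's threads):

# pool.py
import Queue

POOL_SIZE = 5
_pool = Queue.Queue()

def make_connection():
    return object()  # placeholder for a real back-end socket/DB connection

for _ in range(POOL_SIZE):
    _pool.put(make_connection())

def get_connection():
    return _pool.get()      # blocks if every connection is in use

def put_connection(conn):
    _pool.put(conn)         # always return the connection, or the pool drains

A request handler would then call get_connection(), use the connection, and return it with put_connection() in a finally block.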

Python consumes 99% of CPU running eventlet

I have posted to the python and eventlet mailing list already so I apologize if I seem impatient.
I am running eventlet 0.9.16 on a Small (not micro) reserved ubuntu 11.10 aws instance.
I have a socketserver that is similar to the echo server from the examples in the eventlet documentation. When I first start running the code, everything seems fine, but I have been noticing that after 10 or 15 hours the cpu usage goes from about 1% to 99+%. At that point I am unable to make further connections to the socketserver.
This is the code that I am running:
def socket_listener(self, port, socket_type):
    L.LOGG(self._CONN, 0, H.func(), 'Action:Starting|SocketType:%s' % socket_type)
    listener = eventlet.listen((self._host, port))
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    pool = eventlet.GreenPool(20000)
    while True:
        connection, address = listener.accept()
        connection.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        L.LOGG(self._CONN, 0, H.func(), 'IPAddress:%s|GreenthreadsFree:%s|GreenthreadsRunning:%s' % (str(address[0]), str(pool.free()), str(pool.running())))
        pool.spawn_n(self.spawn_socketobject, connection, address, socket_type)
    listener.shutdown(socket.SHUT_RDWR)
    listener.close()
The L.LOGG method simply logs the supplied parameters to a mysql table.
I am running the socket_listener in a thread like so:
def listen_phones(self):
    self.socket_listener(self._port_phone, 'phone')

t_phones = Thread(target=self.listen_phones)
t_phones.start()
From my initial google searches I thought the issue might be similar to the bug reported at https://lists.secondlife.com/pipermail/eventletdev/2008-October/000140.html but I am using a new version of eventlet so surely that cannot be it?
If listener.accept() is non-blocking, you should put the thread to sleep for a small amount of time, so that the os scheduler can dispatch work to other processes. Do this by putting
time.sleep(0.03)
at the end of your while True loop.
Sorry for late reply.
There was no code like listener.setblocking(0); therefore, it MUST behave as blocking, and no sleep should be required.
Also, please use a tool like ps or top to at least ensure that it's the Python process that eats all the CPU.
If the issue still persists, please, report it to one of these channels, whichever you like:
https://bitbucket.org/which_linden/eventlet/issues/new
https://github.com/eventlet/eventlet/issues/new
email to eventletdev@lists.secondlife.com

running multiple threads in python, simultaneously - is it possible?

I'm writing a little crawler that should fetch a URL multiple times; I want all of the threads to run at the same time (simultaneously).
I've written a little piece of code that should do that.
import thread
import time
from urllib2 import Request, urlopen, URLError, HTTPError

def getPAGE(FetchAddress):
    attempts = 0
    while attempts < 2:
        req = Request(FetchAddress, None)
        try:
            response = urlopen(req, timeout=8)  # fetching the url
            print "fetched url %s" % FetchAddress
        except HTTPError, e:
            print 'The server didn\'t do the request.'
            print 'Error code: ', str(e.code) + " address: " + FetchAddress
            time.sleep(4)
            attempts += 1
        except URLError, e:
            print 'Failed to reach the server.'
            print 'Reason: ', str(e.reason) + " address: " + FetchAddress
            time.sleep(4)
            attempts += 1
        except Exception, e:
            print 'Something bad happened in getPAGE.'
            print 'Reason: ', str(e.reason) + " address: " + FetchAddress
            time.sleep(4)
            attempts += 1
        else:
            try:
                return response.read()
            except:
                print "there was an error with response.read()"
                return None
    return None

url = ("http://www.domain.com",)
for i in range(1, 50):
    thread.start_new_thread(getPAGE, url)
From the Apache logs it doesn't seem like the threads are running simultaneously; there's a little gap between requests. It's almost undetectable, but I can see that the threads are not really parallel.
I've read about the GIL. Is there a way to bypass it without calling C/C++ code?
I can't really understand how threading is possible with the GIL. Does Python basically interpret the next thread as soon as it finishes with the previous one?
Thanks.
As you point out, the GIL often prevents Python threads from running in parallel.
However, that's not always the case. One exception is I/O-bound code. When a thread is waiting for an I/O request to complete, it would typically have released the GIL before entering the wait. This means that other threads can make progress in the meantime.
In general, however, multiprocessing is the safer bet when true parallelism is required.
I've read about GIL, is there a way to bypass it with out calling a C\C++ code?
Not really. Functions called through ctypes will release the GIL for the duration of those calls. Functions that perform blocking I/O will release it too. There are other similar situations, but they always involve code outside the main Python interpreter loop. You can't let go of the GIL in your Python code.
You can use an approach like this to create all threads, have them wait for a condition object, and then have them start fetching the url "simultaneously":
#!/usr/bin/env python
import threading
import datetime
import urllib2

allgo = threading.Condition()

class ThreadClass(threading.Thread):
    def run(self):
        allgo.acquire()
        allgo.wait()
        allgo.release()
        print "%s at %s\n" % (self.getName(), datetime.datetime.now())
        url = urllib2.urlopen("http://www.ibm.com")

for i in range(50):
    t = ThreadClass()
    t.start()

allgo.acquire()
allgo.notify_all()
allgo.release()
This would get you a bit closer to having all fetches happen at the same time, BUT:
The network packets leaving your computer will pass along the ethernet wire in sequence, not at the same time,
Even if you have 16+ cores on your machine, some router, bridge, modem or other equipment in between your machine and the web host is likely to have fewer cores, and may serialize your requests,
The web server you're fetching stuff from will use an accept() call to respond to your request. For correct behavior, that is implemented using a server-global lock to ensure only one server process/thread responds to your query. Even if some of your requests arrive at the server simultaneously, this will cause some serialisation.
You will probably get your requests to overlap to a greater degree (i.e. others starting before some finish), but you're never going to get all of your requests to start simultaneously on the server.
You can also look at things like the future of PyPy, where we will have software transactional memory (thus doing away with the GIL). This is all just research and intellectual scoffing at the moment, but it could grow into something big.
If you run your code with Jython or IronPython (and maybe PyPy in the future), it will run in parallel
