Multithreading in zip password cracker - python

I am learning how to crack zip files using dictionary attacks. This is the code:
import zipfile
from threading import Thread

def extractFile(zFile, password):
    try:
        zFile.extractall(pwd=password)
        print '[+] Found password ' + password + '\n'
    except:
        pass

def main():
    zFile = zipfile.ZipFile('evil.zip')
    passFile = open('dictionary.txt')
    for line in passFile.readlines():
        password = line.strip('\n')
        extractFile(zFile, password)

if __name__ == '__main__':
    main()
I then added threading to it:
import zipfile
from threading import Thread

def extractFile(zFile, password):
    try:
        zFile.extractall(pwd=password)
        print '[+] Found password ' + password + '\n'
    except:
        pass

def main():
    zFile = zipfile.ZipFile('evil.zip')
    passFile = open('dictionary.txt')
    for line in passFile.readlines():
        password = line.strip('\n')
        t = Thread(target=extractFile, args=(zFile, password))
        t.start()

if __name__ == '__main__':
    main()
However, when I time the two programs, it takes 90 seconds to complete the first but nearly 300 seconds to complete the second. The dictionary contains 459,026 entries. I am baffled as to why this happens. I also tried limiting the threads to 10, 20, and so on, but the plain loop is still faster in every case. Can anybody explain why this is so? Also, is there any chance to improve the program at all?
EDIT
I tried slicing as suggested by Ray as follows:
import sys
import zipfile
from threading import Thread

def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in xrange(0, len(l), n):
        yield l[i:i+n]

def extractFile(zFile, passwords):
    for password in passwords:
        try:
            zFile.extractall(pwd=password)
            print '[+] Found password ' + password + '\n'
            sys.exit(0)
        except:
            continue

def main():
    zFile = zipfile.ZipFile('evil.zip')
    with open('dictionary.txt', 'rb') as pass_file:
        passwords = [i.strip() for i in pass_file]
    passes = list(chunks(passwords, 10))
    for pas in passes:
        t = Thread(target=extractFile, args=(zFile, pas))
        t.start()

if __name__ == '__main__':
    main()
It still takes 3-4 minutes.

One reason why this does not work properly with multiprocessing: you must open the zip file in each subprocess, otherwise sharing file handles can hurt you. Then create only a handful of subprocesses (say, 2 * number of cores), and let a single subprocess test multiple passwords.
Thus we get:
import zipfile
from multiprocessing import Process

def extract_file(passwords):
    with zipfile.ZipFile('evil.zip') as zipf:
        for password in passwords:
            try:
                zipf.extractall(pwd=password)
                print('[+] Found password {}\n'.format(password))
            except Exception as e:
                pass

def main():
    with open('dictionary.txt', 'rb') as pass_file:
        passwords = [i.strip() for i in pass_file]
    N_PROC = 8
    for i in range(N_PROC):
        p = Process(target=extract_file, args=[passwords[i::N_PROC]])
        p.start()

if __name__ == '__main__':
    main()

Can anybody explain why this is so?
I think that, in addition to the problem of the Global Interpreter Lock (GIL), you might be using the threads incorrectly.
Judging from the loop, you're starting a completely new thread for every password line in your file, i.e. just to make a single attempt. Starting a new thread for only a single attempt is, as you've discovered, expensive and does not work out as you expected. If you did this with multiprocessing, it would be even slower, because creating a completely new process just for a single try is even more expensive than creating a thread.
Is there any chance to improve the program at all?
I suggest you:
break up the passwords into several sub-lists/groups (i.e. slicing)
create a thread (or process) for each of these sub-lists
let each thread/process consume a group (i.e. make multiple attempts and get more out of them)
For example, if you have 100 lines in the file, you could break it up into 4 parts (i.e. 25 passwords per sub-list) and use these to feed 4 threads/processes (i.e. one for each sub-list).
Using multiprocessing here would be advantageous because you can avoid the GIL. However, keep in mind that you'd still have multiple processes accessing the same file simultaneously, so make sure you account for this when trying to extract the file, etc.
You should take care not to overwhelm your PC cores. You might want to use a process pool (see python docs) and limit the amount of processes you create to the number of cores in your PC as a maximum (perhaps your_core_count - 1 to keep it responsive).
Then, as each process consumes a sub-list and terminates, a new process is created (or existing one re-assigned, if using a process pool) to handle yet another sub-list waiting in your queue. If one of the children completes successfully, then you might want to get the parent process to kill all the other children to avoid unnecessary resource usage.
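To make that concrete, here is a minimal sketch of the chunked, pool-based approach described above, assuming Python 3 and the same evil.zip and dictionary.txt from the question; the worker name, chunk size, and process count are illustrative, not prescriptive:
import multiprocessing as mp
import zipfile

def try_passwords(passwords):
    # Each worker opens its own handle on the archive, so no file
    # handles are shared between processes.
    with zipfile.ZipFile('evil.zip') as zf:
        for password in passwords:
            try:
                zf.extractall(pwd=password)
                return password  # success: report back to the parent
            except Exception:
                continue
    return None

def main():
    with open('dictionary.txt', 'rb') as f:
        passwords = [line.strip() for line in f]

    # Split the dictionary into one sub-list per worker.
    n_workers = max(mp.cpu_count() - 1, 1)
    chunk = len(passwords) // n_workers + 1
    groups = [passwords[i:i + chunk] for i in range(0, len(passwords), chunk)]

    with mp.Pool(n_workers) as pool:
        for result in pool.imap_unordered(try_passwords, groups):
            if result is not None:
                print('[+] Found password {}'.format(result))
                pool.terminate()  # stop the remaining workers once one succeeds
                break

if __name__ == '__main__':
    main()
Each worker consumes a whole sub-list, so the per-process startup cost is paid only a handful of times instead of once per password.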

Related

Python Threading issue, not working and I need to make sure threads dont read same line

So this is the first time I am playing around with threading, so please bear with me here. In my main application (which I will implement this into), I need to add multithreading to my script. The script will read account info from a text file, then log in and do some tasks with that account. I need to make sure that threads aren't reading the same line from the accounts text file, since that would screw everything up, and I'm not quite sure how to do that.
from multiprocessing import Queue, Process
from threading import Thread
from time import sleep

urls_queue = Queue()
max_process = 10

def dostuff():
    with open('acc.txt', 'r') as accounts:
        for account in accounts:
            account.strip()
            split = account.split(":")
            a = {
                'user': split[0],
                'pass': split[1],
                'name': split[2].replace('\n', ''),
            }
            sleep(1)
            print(a)
    for i in range(max_process):
        urls_queue.put("DONE")

def doshit_processor():
    while True:
        url = urls_queue.get()
        if url == "DONE":
            break

def main():
    file_reader_thread = Thread(target=dostuff)
    file_reader_thread.start()
    procs = []
    for i in range(max_process):
        p = Process(target=doshit_processor)
        procs.append(p)
        p.start()
    for p in procs:
        p.join()
    print('all done')
    # wait for all tasks in the queue
    file_reader_thread.join()

if __name__ == '__main__':
    main()
So at the moment I don't think the threading is even working, because it's printing one account per second even with 10 threads. It should be printing 10 accounts per second, which it isn't, which has me confused. Also, I am not sure how to make sure that threads won't pick the same account line. Help from a big brain is much appreciated.
The problem is that you create a single thread to generate the data for your processes but then don't post that data to the queue. You sleep in that single thread, so you see one item generated per second and then... nothing, because the items are never queued. It seems that all you are doing is creating a process pool, and the built-in multiprocessing.Pool should work for you.
I've set the pool "chunk size" low so that workers are only given one work item at a time. This is good for workflows where processing time can vary per work item. By default, the pool optimizes for the case where processing times are roughly equivalent and instead tries to reduce interprocess communication overhead.
Your data looks like a colon-separated file, and you can use the csv module to cut down the parsing work too. This smaller script should do what you want.
import multiprocessing as mp
from time import sleep
import csv

max_process = 10

def doshit_processor(row):
    sleep(1)  # if you want to simulate work
    print(row)

def main():
    with open('acc.txt', newline='') as accounts:
        table = list(csv.DictReader(accounts, fieldnames=('user', 'pass', 'name'),
                                    delimiter=':'))
    with mp.Pool(max_process) as pool:
        pool.map(doshit_processor, table, chunksize=1)
    print('all done')

if __name__ == '__main__':
    main()

Problem with Multiprocessing and Deadlocking in Python3

I'm having a problem with my multiprocessing, and I'm afraid it's a rather simple fix that I'm just not implementing correctly. I've been researching the things that can cause the problem, but all I'm really finding is people recommending a queue to prevent this, and that doesn't seem to be stopping it (again, I may just be implementing the queue incorrectly). I've been at this a couple of days now and was hoping I could get some help.
Thanks in advance!
import csv
import multiprocessing as mp
import os
import queue
import sys
import time

import connections
import packages
import profiles

def execute_extract(package, profiles, q):
    # This is the package execution for the extract
    # It fires fine and will print the starting message below
    started_at = time.monotonic()
    print(f"Starting {package.packageName}")
    try:
        oracle_connection = connections.getOracleConnection(profiles['oracle'], 1)
        engine = connections.getSQLConnection(profiles['system'], 1)
        path = os.path.join(os.getcwd(), 'csv_data', package.packageName + '.csv')
        cursor = oracle_connection.cursor()

        if os.path.exists(path):
            os.remove(path)

        f = open(path, 'w')
        chunksize = 100000
        offset = 0
        row_total = 0
        csv_writer = csv.writer(f, delimiter='^', lineterminator='\n')

        # I am having to do some data cleansing. I know this is not the most efficient
        # way to do this, but currently it is what I am limited to
        while True:
            cursor.execute(package.query + f'\r\n OFFSET {offset} ROWS\r\n FETCH NEXT {chunksize} ROWS ONLY')
            test = cursor.fetchone()
            if test is None:
                break
            else:
                while True:
                    row = cursor.fetchone()
                    if row is None:
                        break
                    else:
                        new_row = list(row)
                        new_row.append(package.sourceId[0])
                        new_row.append('')
                        i = 0
                        for item in new_row:
                            if type(item) == float:
                                new_row[i] = int(item)
                            elif type(item) == str:
                                new_row[i] = item.encode('ascii', 'replace')
                            i += 1
                        row = tuple(new_row)
                        csv_writer.writerow(row)
                        row_total += 1
            offset += chunksize
        f.close()

        # I know that execution is at least reaching this point. I can watch the CSV files grow as more and more
        # rows are added to the files for all the packages. What I never get are either the success message or
        # error message below, and there are never any entries placed in the tables
        query = f"BULK INSERT {profiles['system'].database.split('_')[0]}_{profiles['system'].database.split('_')[1]}_test_{profiles['system'].database.split('_')[2]}.{package.destTable} FROM \"{path}\" WITH (FIELDTERMINATOR='^', ROWTERMINATOR='\\n');"
        engine.cursor().execute(query)
        engine.commit()

        end_time = time.monotonic() - started_at
        print(
            f"{package.packageName} has completed. Total rows inserted: {row_total}. Total execution time: {end_time} seconds\n")
        os.remove(path)
    except Exception as e:
        print(f'An error has occurred for package {package.packageName}.\r\n {repr(e)}')
    finally:
        # Here is where I am trying to add an item to the queue so the get method in the main def will pick it up
        # and remove it from the queue
        q.put(f'{package.packageName} has completed')
        if oracle_connection:
            oracle_connection.close()
        if engine:
            engine.cursor().close()
            engine.close()

if __name__ == '__main__':
    # Setting mp creation type
    ctx = mp.get_context('spawn')
    q = ctx.Queue()

    # For the ETL I generate a list of class objects that hold relevant information. profs contains a list of
    # connection objects (credentials, connection strings, etc); packages contains the information to run the
    # extract (destination tables, query string, package name for logging, etc)
    profs = profiles.get_conn_vars(sys.argv[1])
    packages = packages.get_etl_packages(profs)

    processes = []

    # I'm trying to track both individual package execution time and overall time so I can get an estimate on rows
    # per second
    start_time = time.monotonic()

    sqlConn = connections.getSQLConnection(profs['system'])
    # Here I'm executing a SQL command to truncate all my staging tables to ensure they are empty and will not
    # generate any key violations
    sqlConn.execute(
        f"USE[{profs['system'].database.split('_')[0]}_{profs['system'].database.split('_')[1]}_test_{profs['system'].database.split('_')[2]}]\r\nExec Sp_msforeachtable @command1='Truncate Table ?',@whereand='and Schema_Id=Schema_id(''my_schema'')'")

    # Here is where I start generating a process per package to try and get all packages to run simultaneously
    for package in packages:
        p = ctx.Process(target=execute_extract, args=(package, profs, q,))
        processes.append(p)
        p.start()

    # Here is my attempt at managing the queue. This is a monstrosity of fixes I've tried to get this to work
    results = []
    while True:
        try:
            result = q.get(False, 0.01)
            results.append(result)
        except queue.Empty:
            pass
        allExited = True
        for t in processes:
            if t.exitcode is None:
                allExited = False
                break
        if allExited & q.empty():
            break

    for p in processes:
        p.join()

    # Closing out the end time and writing the overall execution time in minutes.
    end_time = time.monotonic() - start_time
    print(f'Total execution time of {end_time / 60} minutes.')
I can't be sure why you are experiencing a deadlock (I am not at all convinced it is related to your queue management), but I can say for sure that you can simplify your queue management logic if you do either of two things:
Method 1
Ensure that your worker function, execute_extract, puts something on the results queue even in the case of an exception (I would recommend placing the Exception object itself). Then your entire main-process loop that begins with while True: and attempts to get the results can be replaced with:
results = [q.get() for _ in range(len(processes))]
You are guaranteed that there will be a fixed number of messages on the queue equal to the number of processes created.
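A minimal sketch of that worker-side pattern, where do_the_work is a hypothetical stand-in for the real extract logic rather than a function from the question:
def execute_extract(package, profiles, q):
    try:
        result = do_the_work(package, profiles)  # hypothetical helper for the real work
        q.put(result)
    except Exception as e:
        # Put the exception itself on the queue, so the parent still
        # receives exactly one message per process.
        q.put(e)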
Method 2 (even simpler)
Simply reverse the order: first wait for the subprocesses to complete, then process the results queue. You don't know how many messages will be on the queue, but you aren't processing it until all the processes have returned, so however many messages are on the queue is all you will ever get. Just retrieve them until the queue is empty:
for p in processes:
    p.join()

results = []
while not q.empty():
    results.append(q.get())
At this point I would normally suggest that you use a multiprocessing pool class such as multiprocessing.Pool which does not require an explicit queue to retrieve results. But make either of these changes (I suggest Method 2, as I cannot see how it can cause a deadlock since only the main process is running at this point) and see if your problem goes away. I am not, however, guaranteeing that your issue is not somewhere else in your code. While your code is overly complicated and inefficient, it is not obviously "wrong." At least you will know whether your problem is elsewhere.
And my question for you: What does it buy you to do everything using a context acquired with ctx = mp.get_context('spawn') instead of just calling the methods on the multiprocessing module itself? If your platform had support for a fork call, which would be the default context, would you not want to use that?
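For reference, a minimal sketch of the multiprocessing.Pool shape suggested above, assuming execute_extract is refactored to return its status message instead of writing to a queue (the worker body is elided; only the orchestration is shown):
import multiprocessing as mp

def execute_extract(package, profiles):
    # ...same extract logic as before, but return the status
    # instead of calling q.put(...)
    return f'{package.packageName} has completed'

def run_packages(packages, profs):
    # The pool collects each worker's return value itself, so no
    # explicit queue or exitcode polling is needed.
    with mp.Pool() as pool:
        results = pool.starmap(execute_extract,
                               [(pkg, profs) for pkg in packages])
    return results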

Multiple stdout w/ flush going on in Python threading

I have a small piece of code that I made to test out and hopefully debug the problem without having to modify the code in my main applet in Python. This has led me to build this code:
#!/usr/bin/env python
import sys, threading, time

def loop1():
    count = 0
    while True:
        sys.stdout.write('\r thread 1: ' + str(count))
        sys.stdout.flush()
        count = count + 1
        time.sleep(.3)
        pass
    pass

def loop2():
    count = 0
    print ""
    while True:
        sys.stdout.write('\r thread 2: ' + str(count))
        sys.stdout.flush()
        count = count + 2
        time.sleep(.3)
        pass

if __name__ == '__main__':
    try:
        th = threading.Thread(target=loop1)
        th.start()
        th1 = threading.Thread(target=loop2)
        th1.start()
        pass
    except KeyboardInterrupt:
        print ""
        pass
    pass
My goal with this code is to have both of these threads displaying output to stdout (with flushing) at the same time, side by side or something similar. The problem is, I assume, that since each one is flushing, by default it overwrites the other thread's string. I don't quite know how to get this to work, if it is even possible.
If you just run one of the threads, it works fine. However, I want to be able to run both threads, each with its own string, at the same time in the terminal output. Here is a picture displaying what I'm getting:
terminal screenshot
Let me know if you need more info. Thanks in advance.
Instead of allowing each thread to output to stdout, a better solution is to have one thread control stdout exclusively. Then provide a threadsafe channel for the other threads to dispatch data to be output.
One good method to achieve this is to share a Queue between all threads. Ensure that only the output thread is accessing data after it has been added to the queue.
The output thread can store the last message from each other thread and use that data to format stdout nicely. This can include clearing the output to display something like the following, and updating it as each thread generates new data.
Threads
#1: 0
#2: 0
Example
Some decisions were made to simplify this example:
There are gotchas to be wary of when giving arguments to threads.
Daemon threads terminate themselves when the main thread exits. They are used to avoid adding complexity to this answer. Using them on long-running or large applications can pose problems. Other questions discuss how to exit a multithreaded application without leaking memory or locking system resources. You will need to think about how your program needs to signal an exit. Consider using asyncio to save yourself these considerations.
No newlines are used because \r carriage returns cannot clear the whole console. They only allow the current line to be rewritten.
import queue, threading
import time, sys

q = queue.Queue()
keepRunning = True

def loop_output():
    thread_outputs = dict()
    while keepRunning:
        try:
            thread_id, data = q.get_nowait()
            thread_outputs[thread_id] = data
        except queue.Empty:
            # because the queue is used to update, there's no need to wait or block.
            pass
        pretty_output = ""
        for thread_id, data in thread_outputs.items():
            pretty_output += '({}:{}) '.format(thread_id, str(data))
        sys.stdout.write('\r' + pretty_output)
        sys.stdout.flush()
        time.sleep(1)

def loop_count(thread_id, increment):
    count = 0
    while keepRunning:
        msg = (thread_id, count)
        try:
            q.put_nowait(msg)
        except queue.Full:
            pass
        count = count + increment
        time.sleep(.3)
        pass
    pass

if __name__ == '__main__':
    try:
        th_out = threading.Thread(target=loop_output)
        th_out.start()

        # make sure to use args, not pass arguments directly
        th0 = threading.Thread(target=loop_count, args=("Thread0", 1))
        th0.daemon = True
        th0.start()

        th1 = threading.Thread(target=loop_count, args=("Thread1", 3))
        th1.daemon = True
        th1.start()

        # Keep the main thread alive to wait for KeyboardInterrupt
        while True:
            time.sleep(.1)
    except KeyboardInterrupt:
        print("Ended by keyboard stroke")
        keepRunning = False
        for th in [th0, th1]:
            th.join()
Example Output:
(Thread0:110) (Thread1:330)

Multiprocessing acting up in Python 3 [closed]

I was messing around with a zip file cracker and decided to use the multiprocessing module to speed the process up. It was a complete pain since it was my first time using the module and I don't even fully understand it yet. However, I got it to work.
The problem is that it doesn't complete the word list; it just stops at random points during the word list, and if the password is found it continues to go through the word list instead of just stopping the process.
Does anyone know why it's exhibiting this behaviour?
Source Code For ZipFile Cracker
#!/usr/bin/env python3
import multiprocessing as mp
import zipfile # Handeling the zipfile
import sys # Command line arguments, and quiting application
import time # To calculate runtime

def usage(program_name):
    print("Usage: {0} <path to zipfile> <dictionary>".format(program_name))
    sys.exit(1)

def cracker(password):
    try:
        zFile.extractall(pwd=password)
        print("[+] Password Found! : {0}".format(password.decode('utf-8')))
        pool.close()
    except:
        pass

def main():
    global zFile
    global pool

    if len(sys.argv) < 3:
        usage(sys.argv[0])

    zFile = zipfile.ZipFile(sys.argv[1])
    print("[*] Started Cracking")
    startime = time.time()
    pool = mp.Pool()

    for i in open(sys.argv[2], 'r', errors='ignore'):
        pswd = bytes(i.strip('\n'), 'utf-8')
        pool.apply_async(cracker, (pswd,))
        print(pswd)

    runtime = round(time.time() - startime, 5)
    print("[*] Runtime:", runtime, 'seconds')
    sys.exit(0)

if __name__ == "__main__":
    main()
You are terminating your program too early. To test this out, add a harmless time.sleep(10) in the cracker method and observe your program still terminating within a second.
Call join to wait for the pool to finish:
pool = mp.Pool()
for i in open(sys.argv[2], 'r', errors='ignore'):
    pswd = bytes(i.strip('\n'), 'utf-8')
    pool.apply_async(cracker, (pswd,))

pool.close()  # Indicate that no more data is coming
pool.join()   # Wait for pool to finish processing

runtime = round(time.time() - startime, 5)
print("[*] Runtime:", runtime, 'seconds')
sys.exit(0)
Additionally, once you find the right password, calling close just indicates that no more future tasks are coming - all tasks already submitted will still be done. Instead, call terminate to kill the pool without processing any more tasks.
Furthermore, depending on the implementation details of multiprocessing.Pool, the global variable pool may not be available when you need it (and its value isn't serializable anyway). To solve this problem, you can use a callback, as in:
def cracker(password):
    try:
        zFile.extractall(pwd=password)
    except RuntimeError:
        return
    return password

def callback(found):
    if found:
        pool.terminate()

...

pool.apply_async(cracker, (pswd,), callback=callback)
Of course, since you now look at the result all the time, apply is not the right way to go. Instead, you can write your code using imap_unordered:
with open(sys.argv[2], 'r', errors='ignore') as passf, \
     multiprocessing.Pool() as pool:
    passwords = (line.strip('\n').encode('utf-8') for line in passf)
    for found in pool.imap_unordered(cracker, passwords):
        if found:
            break
Instead of using globals, you may also want to open the zip file (and create a ZipFile object) in each process, by using an initializer for the pool. Even better (and way faster), forgo all of the I/O in the first place and read just the bytes you need once and then pass them on to the children.
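To illustrate that last suggestion, here is a minimal sketch (an assumption on my part, not code from the answer above) that reads the archive from disk exactly once and gives each worker its own in-memory ZipFile through a pool initializer:
import io
import multiprocessing as mp
import zipfile

def init_worker(archive_bytes):
    # Each worker builds its own ZipFile over an in-memory copy of the
    # archive, so no file handles are shared between processes.
    global zfile
    zfile = zipfile.ZipFile(io.BytesIO(archive_bytes))

def try_password(password):
    try:
        zfile.extractall(pwd=password)
        return password
    except Exception:
        return None

def crack(archive_path, passwords):
    with open(archive_path, 'rb') as f:
        archive_bytes = f.read()  # the only disk read of the archive
    with mp.Pool(initializer=init_worker, initargs=(archive_bytes,)) as pool:
        for found in pool.imap_unordered(try_password, passwords):
            if found:
                return found
    return None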
phihag's answer is the correct solution.
I just wanted to provide an additional detail regarding calling terminate() when you've found the correct password. The pool variable in cracker() was not defined when I ran the code. So trying to invoke it from there simply threw an exception:
NameError: name 'pool' is not defined
(My fork() experience is weak, so I don't completely understand why the global zFile is copied to the child processes successfully while pool is not. Even if it were copied, it would not be the same pool in the parent process, right? So any methods invoked on it would have no effect on the real pool in the parent process. Regardless, I prefer this advice listed within the multiprocessing module's Programming guidelines section: Explicitly pass resources to child processes.)
My suggestion is to make cracker() return the password if it is correct, otherwise return None. Then pass a callback to apply_async() that records the correct password, as well as terminating the pool. Here's my take at modifying your code to do this:
#!/usr/bin/env python3
import multiprocessing as mp
import zipfile # Handeling the zipfile
import sys # Command line arguments, and quiting application
import time # To calculate runtime
import os

def usage(program_name):
    print("Usage: {0} <path to zipfile> <dictionary>".format(program_name))
    sys.exit(1)

def cracker(zip_file_path, password):
    print('[*] Starting new cracker (pid={0}, password="{1}")'.format(os.getpid(), password))
    try:
        time.sleep(1) # XXX: to simulate the task taking a bit of time
        with zipfile.ZipFile(zip_file_path) as zFile:
            zFile.extractall(pwd=bytes(password, 'utf-8'))
        return password
    except:
        return None

def main():
    if len(sys.argv) < 3:
        usage(sys.argv[0])

    print('[*] Starting main (pid={0})'.format(os.getpid()))
    zip_file_path = sys.argv[1]
    password_file_path = sys.argv[2]
    startime = time.time()
    actual_password = None

    with mp.Pool() as pool:
        def set_actual_password(password):
            nonlocal actual_password
            if password:
                print('[*] Found password; stopping future tasks')
                pool.terminate()
                actual_password = password

        with open(password_file_path, 'r', errors='ignore') as password_file:
            for pswd in password_file:
                pswd = pswd.strip('\n')
                pool.apply_async(cracker, (zip_file_path, pswd,), callback=set_actual_password)

        pool.close()
        pool.join()

    if actual_password:
        print('[*] Cracked password: "{0}"'.format(actual_password))
    else:
        print('[*] Unable to crack password')

    runtime = round(time.time() - startime, 5)
    print("[*] Runtime:", runtime, 'seconds')
    sys.exit(0)

if __name__ == "__main__":
    main()
Here's an implementation of the advice from @phihag's and @Equality 7-2521's answers:
#!/usr/bin/env python3
"""Brute force zip password.

Usage: brute-force-zip-password <zip archive> <passwords>
"""
import sys
from multiprocessing import Pool
from time import monotonic as timer
from zipfile import ZipFile

def init(archive):  # run at the start of a worker process
    global zfile
    zfile = ZipFile(open(archive, 'rb'))  # open file in each process once

def check(password):
    assert password
    try:
        with zfile.open(zfile.infolist()[0], pwd=password):
            return password  # assume success
    except Exception as e:
        if e.args[0] != 'Bad password for file':
            # assume all other errors happen after the password was accepted
            raise RuntimeError(password) from e

def main():
    if len(sys.argv) != 3:
        sys.exit(__doc__)  # print usage

    start = timer()
    # decode passwords using the preferred locale encoding
    with open(sys.argv[2], errors='ignore') as file, \
         Pool(initializer=init, initargs=[sys.argv[1]]) as pool:  # use all CPUs
        # check passwords encoded using utf-8
        passwords = (line.rstrip('\n').encode('utf-8') for line in file)
        passwords = filter(None, passwords)  # filter empty passwords
        for password in pool.imap_unordered(check, passwords, chunksize=100):
            if password is not None:  # found
                print("Password: '{}'".format(password.decode('utf-8')))
                break
        else:
            sys.exit('Unable to find password')

    print('Runtime: %.5f seconds' % (timer() - start,))

if __name__ == "__main__":
    main()
Note:
each worker process has its own ZipFile object and the zip file is opened once per process: it should make it more portable (Windows support) and improve time performance
the content is not extracted: check(password) tries to open and immediately closes an archive member on success: it is safer and it should improve time performance (no need to create directories, etc)
all errors except 'Bad password for file' while decrypting the archive member are assumed to happen after the password is accepted: the rationale is to avoid silencing unexpected errors -- each exception should be considered individually
check(password) expects nonempty passwords
chunksize parameter may drastically improve performance
a rare for/else syntax is used, to report cases when the password is not found
the with-statement calls pool.terminate() for you

python watchdog for threads

I'm writing a simple app which reads about a million lines from a file, copies those lines into a list, and, if the next line differs from the previous one, runs a thread to do some job with that list. The thread's job is based on TCP sockets, sending and receiving commands via the telnet lib.
Sometimes my application hangs and does nothing. I wrapped all telnet operations in try-except statements, and the socket reads and writes have timeouts.
I thought about writing a watchdog which would do sys.exit() or something similar on that hang condition, but for now I have no idea how to create it. If you can point me in the right direction, that would be great.
For that file I'm creating 40 threads. The pseudocode looks like this:
lock = threading.Lock()
no_of_jobs = 0

class DoJob(threading.Thread):
    def start(self, cond, work):
        self.work = work
        threading.Thread.start(self)

    def run(self):
        global lock
        global no_of_jobs
        lock.acquire()
        no_of_jobs += 1
        lock.release()
        # do some job, if error or if finished, decrement no_of_jobs under lock
        (...)

main:
    # starting conditions:
    with open(sys.argv[1]) as targetsfile:
        head = [targetsfile.next() for x in xrange(1)]
    s = head[0]
    prev_cond = s[0]
    work = []

    for line in open(sys.argv[1], "r"):
        cond = line[0]
        if prev_cond != cond:
            while no_of_jobs >= MAX_THREADS:
                time.sleep(1)
            DoJob(cond, work)
            prev_cond = cond
            work = None
            work = []
        work.append(line)

    # last job:
    DoJob(cond, work)

    while threading.activeCount() > 1:
        time.sleep(1)
best regards
J
I have successfully used code like below in the past (from a python 3 program I wrote):
import sys
import threading

def die():
    print('ran for too long. quitting.')
    for thread in threading.enumerate():
        if thread.isAlive():
            try:
                thread._stop()
            except:
                pass
    sys.exit(1)

if __name__ == '__main__':
    # bunch of app-specific code...

    # setup max runtime
    die = threading.Timer(2.0, die)  # quit after 2 seconds
    die.daemon = True
    die.start()

    # after work is done
    die.cancel()
