Channel: Python multiprocessing Pool map and imap - Stack Overflow

Python multiprocessing Pool map and imap


I have a multiprocessing script using pool.map that works. The problem is that the tasks do not all take the same time to finish, so some worker processes sit idle while they wait for all the others to finish (same problem as in this question). Some files are processed in less than a second; others take minutes (or hours).

If I understand the manual (and this post) correctly, pool.imap does not wait for all the processes to finish: as soon as one is done, it hands that worker a new file to process. When I try that, the script races through the list of files: the small ones are processed as expected, but the large files (which take more time to process) don't finish before the script ends (are they killed without notice?). Is this normal behavior for pool.imap, or do I need to add more commands or parameters? When I add time.sleep(100) in the else branch as a test, more of the large files get processed, but the other processes sit idle. Any suggestions? Thanks.

def process_file(infile):
    #read infile
    #compare things in infile
    #acquire Lock, save things in outfile, release Lock
    #delete infile

def main():
    #nprocesses = 8
    global filename
    pathlist = ['tmp0', 'tmp1', 'tmp2', 'tmp3', 'tmp4', 'tmp5', 'tmp6', 'tmp7', 'tmp8', 'tmp9']
    for d in pathlist:
        os.chdir(d)
        todolist = []
        for infile in os.listdir():
            todolist.append(infile)
        try:
            p = Pool(processes=nprocesses)
            p.imap(process_file, todolist)
        except KeyboardInterrupt:
            print("Shutting processes down")
            # Optionally try to gracefully shut down the worker processes here.
            p.close()
            p.terminate()
            p.join()
        except StopIteration:
            continue
        else:
            time.sleep(100)
            os.chdir('..')
        p.close()
        p.join()

if __name__ == '__main__':
    main()
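For comparison, here is a minimal sketch of what consuming the imap result looks like. pool.imap and pool.imap_unordered return a lazy iterator right away, and looping over that iterator blocks until each result is available. The names process_item and run_all and the sleep durations are made up for illustration (time.sleep stands in for the real file processing); this is not the code from the question, and I can't say for sure it is the cause of the behavior described above.

```python
import time
from multiprocessing import Pool


def process_item(seconds):
    # Hypothetical stand-in for process_file: sleep to simulate uneven work.
    time.sleep(seconds)
    return seconds


def run_all(durations, nprocesses=4):
    finished = []
    with Pool(processes=nprocesses) as pool:
        # imap_unordered returns a lazy iterator immediately; looping over it
        # blocks until every task has produced a result, so the pool is not
        # torn down while work is still pending.
        for result in pool.imap_unordered(process_item, durations):
            finished.append(result)
    return finished


if __name__ == "__main__":
    # One "large" task and several "small" ones (assumed values).
    done = run_all([0.2, 0.01, 0.02, 0.01])
    print(sorted(done))
```

With imap_unordered the results come back in completion order, so a free worker picks up the next item while a slow one is still running; plain imap hands out work the same way but yields results in input order.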
