RuntimeError on Windows when trying Python multiprocessing

239

I am writing my very first formal Python program using threading and multiprocessing on a Windows machine. I am unable to launch the processes, though; Python gives the following message. The thing is, I am not launching my threads in the main module. The threads are handled in a separate module, inside a class.

EDIT: By the way, this code runs fine on Ubuntu, but not on Windows.

RuntimeError: 
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.
            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:
                if __name__ == '__main__':
                    freeze_support()
                    ...
            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

My original code is pretty long, but I was able to reproduce the error in an abridged version. It is split into two files. The first is the main module, which does very little other than import the module that handles processes/threads and call a method. The second module is where the meat of the code is.


testMain.py:

import parallelTestModule

extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)

parallelTestModule.py:

import multiprocessing
from multiprocessing import Process
import threading

class ThreadRunner(threading.Thread):
    """ This class represents a single instance of a running thread"""
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run(self):
        print self.name,'\n'

class ProcessRunner:
    """ This class represents a single instance of a running process """
    def runp(self, pid, numThreads):
        mythreads = []
        for tid in range(numThreads):
            name = "Proc-"+str(pid)+"-Thread-"+str(tid)
            th = ThreadRunner(name)
            mythreads.append(th) 
        for i in mythreads:
            i.start()
        for i in mythreads:
            i.join()

class ParallelExtractor:    
    def runInParallel(self, numProcesses, numThreads):
        myprocs = []
        prunner = ProcessRunner()
        for pid in range(numProcesses):
            pr = Process(target=prunner.runp, args=(pid, numThreads)) 
            myprocs.append(pr) 
#        if __name__ == 'parallelTestModule':    # This didn't work
#        if __name__ == '__main__':              # This obviously doesn't work
#        multiprocessing.freeze_support()        # Added after seeing the error, to no avail
        for i in myprocs:
            i.start()

        for i in myprocs:
            i.join()

3

  • @doctorlove I run it as python testMain.py

    – NG Algo

    Aug 13, 2013 at 9:14

  • 2

    Sure. You need an if __name__ == '__main__': guard; see the answers and the docs.

    Aug 13, 2013 at 9:15

  • 1

    @NGAlgo Your script was very helpful to me while I was debugging a problem with pymongo and multiprocessing. Thanks!

    – Clay

    Dec 4, 2013 at 1:45

354

On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.

Modified testMain.py:

import parallelTestModule

if __name__ == '__main__':    
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)
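
To see the re-import happen, here is a minimal single-file sketch. It is not the OP's code: the names label and work are made up for illustration, and it assumes Python 3.4 or later so that the "spawn" start method (the one Windows uses) can be requested explicitly on any platform.

```python
import multiprocessing

# Module-level code runs in the parent and again in every spawned child,
# because each child re-imports this file before calling its target.
print("importing main module, __name__ =", __name__)


def label(pid):
    """Return the name used for a worker (hypothetical helper)."""
    return "Proc-" + str(pid)


def work(pid):
    """Target executed inside each child process."""
    print(label(pid), "running")


if __name__ == '__main__':
    # Requesting "spawn" explicitly reproduces the Windows behaviour
    # on other platforms as well.
    ctx = multiprocessing.get_context("spawn")
    procs = [ctx.Process(target=work, args=(pid,)) for pid in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

The module-level print should appear once for the parent and once per child, while the guarded block runs only in the parent. Moving the process creation out of the guard is what triggers the RuntimeError from the question.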

13

  • 7

    (smacks his palm against his forehead) Doh! It works!!!! Thank you so much! I was missing the fact that it is the original main module that gets re-imported! All this time I was trying the __name__ == check right before where I launched my processes.

    – NG Algo

    Aug 13, 2013 at 9:17


  • 1

    I cannot seem to import ‘parallelTestModule’. I’m using Python 2.7. Should it work out of the box?

    – Jonny

    Jan 28, 2016 at 14:42

  • 2

    @Jonny The code for parallelTestModule.py is part of the question.

    Jan 29, 2016 at 6:44

  • 1

    @DeshDeepSingh The code snippet is not a stand-alone example; it is a modification of OP’s code

    Jul 20, 2018 at 12:56

  • 1

    @DeshDeepSingh That module is part of the question.

    Jul 20, 2018 at 13:05

40

Try putting your code inside an if __name__ == '__main__': block in testMain.py:

import parallelTestModule

if __name__ ==  '__main__':
  extractor = parallelTestModule.ParallelExtractor()
  extractor.runInParallel(numProcesses=2, numThreads=4)

See the docs:

"For an explanation of why (on Windows) the if __name__ == '__main__' 
part is necessary, see Programming guidelines."

which say

“Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such a starting a
new process).”

… by using if __name__ == '__main__'
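
One common way to follow that guideline is to keep everything with side effects in a main() function and call it only from the guard. The sketch below assumes Python 3; report, child and main are hypothetical names, not part of the OP's modules.

```python
import multiprocessing


def report(pid):
    """Build the message a child prints (hypothetical helper)."""
    return "Proc-{} started".format(pid)


def child(pid):
    """Target executed inside each child process."""
    print(report(pid))


def main(num_processes=2):
    """Start the children and wait for them; returns their exit codes."""
    procs = [multiprocessing.Process(target=child, args=(pid,))
             for pid in range(num_processes)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return [p.exitcode for p in procs]


if __name__ == '__main__':
    # Only needed when the program is frozen into a Windows executable;
    # it has no effect otherwise, so it is safe to leave in.
    multiprocessing.freeze_support()
    main()
```

Importing this file does nothing except define functions, so a new interpreter can load it without starting more processes.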

    14

    Though the earlier answers are correct, there's a small complication worth pointing out.

    If your main module imports another module that defines global variables or class attributes and initializes them with (or using) new objects, you may have to guard that import in the same way:

    if __name__ ==  '__main__':
      import my_module
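
If guarding the import is awkward, a different technique (not the one this answer describes) is to stop creating objects at import time and build them on first use. This is only a sketch under assumptions: _resource and get_resource are hypothetical, and the dict stands in for whatever object the imported module would otherwise create at module level.

```python
import multiprocessing

# Hypothetical stand-in for an object that is expensive or unsafe
# to create while the module is being imported.
_resource = None


def get_resource():
    """Create the object on first use instead of at import time."""
    global _resource
    if _resource is None:
        _resource = {"owner": multiprocessing.current_process().name}
    return _resource


def work(pid):
    # A process that has not created the object yet builds its own copy here.
    print(pid, get_resource()["owner"])


if __name__ == '__main__':
    procs = [multiprocessing.Process(target=work, args=(pid,))
             for pid in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

With this layout the module can be imported by a child process without side effects, because nothing is constructed until get_resource() is called.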