[python] Using multiprocessing.Process with a maximum number of simultaneous processes

I think Semaphore is what you are looking for, it will block the main process after counting down to 0. Sample code:

from multiprocessing import Process
from multiprocessing import Semaphore
import time

def f(name, sema):
    print('process {} starting doing business'.format(name))
    # simulate a time-consuming task by sleeping
    time.sleep(5)
    # `release` will add 1 to `sema`, allowing other 
    # processes blocked on it to continue
    sema.release()

if __name__ == '__main__':
    concurrency = 20
    total_task_num = 1000
    sema = Semaphore(concurrency)
    all_processes = []
    for i in range(total_task_num):
        # once 20 processes are running, the following `acquire` call
        # will block the main process since `sema` has been reduced
        # to 0. This loop will continue only after one or more 
        # previously created processes complete.
        sema.acquire()
        p = Process(target=f, args=(i, sema))
        all_processes.append(p)
        p.start()

    # inside main process, wait for all processes to finish
    for p in all_processes:
        p.join()

The following code is more structured since it acquires and releases sema in the same function. However, it will consume too much resources if total_task_num is very large:

from multiprocessing import Process
from multiprocessing import Semaphore
import time

def f(name, sema):
    print('process {} starting doing business'.format(name))
    # `sema` is acquired and released in the same
    # block of code here, making code more readable,
    # but may lead to problem.
    sema.acquire()
    time.sleep(5)
    sema.release()

if __name__ == '__main__':
    concurrency = 20
    total_task_num = 1000
    sema = Semaphore(concurrency)
    all_processes = []
    for i in range(total_task_num):
        p = Process(target=f, args=(i, sema))
        all_processes.append(p)
        # the following line won't block after 20 processes
        # have been created and running, instead it will carry 
        # on until all 1000 processes are created.
        p.start()

    # inside main process, wait for all processes to finish
    for p in all_processes:
        p.join()

The above code will create total_task_num processes but only concurrency processes will be running while other processes are blocked, consuming precious system resources.

Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to multithreading

How can compare-and-swap be used for a wait-free mutual exclusion for any shared data structure? Waiting until the task finishes What is the difference between Task.Run() and Task.Factory.StartNew() Why is setState in reactjs Async instead of Sync? What exactly is std::atomic? Calling async method on button click WAITING at sun.misc.Unsafe.park(Native Method) How to use background thread in swift? What is the use of static synchronized method in java? Locking pattern for proper use of .NET MemoryCache

Examples related to multiprocessing

Passing multiple parameters to pool.map() function in Python Dead simple example of using Multiprocessing Queue, Pool and Locking Using multiprocessing.Process with a maximum number of simultaneous processes Multiprocessing a for loop? RuntimeError on windows trying python multiprocessing How to use multiprocessing queue in Python? Shared-memory objects in multiprocessing Python multiprocessing PicklingError: Can't pickle <type 'function'> multiprocessing.Pool: When to use apply, apply_async or map? How to troubleshoot an "AttributeError: __exit__" in multiproccesing in Python?