[python] How to run functions in parallel?

I researched first and couldn't find an answer to my question. I am trying to run multiple functions in parallel in Python.

I have something like this:

files.py

import time
import common  # common is a util class that handles all the IO stuff

dir1 = 'C:\\folder1'
dir2 = 'C:\\folder2'
filename = 'test.txt'
addFiles = [25, 5, 15, 35, 45, 25, 5, 15, 35, 45]

def func1():
    c = common.Common()
    for i in range(len(addFiles)):
        c.createFiles(addFiles[i], filename, dir1)
        c.getFiles(dir1)
        time.sleep(10)
        c.removeFiles(addFiles[i], dir1)
        c.getFiles(dir1)

def func2():
    c = common.Common()
    for i in range(len(addFiles)):
        c.createFiles(addFiles[i], filename, dir2)
        c.getFiles(dir2)
        time.sleep(10)
        c.removeFiles(addFiles[i], dir2)
        c.getFiles(dir2)

I want to call func1 and func2 and have them run at the same time. The functions do not interact with each other or operate on the same object. Right now I have to wait for func1 to finish before func2 can start. How do I do something like the below:

process.py

from files import func1, func2

runBothFunc(func1(), func2())

I want to be able to create both directories at close to the same time because every minute I am counting how many files are being created. If a directory isn't there yet, it will throw off my timing.

This question is related to python multithreading multiprocessing

The answer is


It seems you have a single function that you need to call with two different parameters. This can be done elegantly using a combination of concurrent.futures and map with Python 3.2+:

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def sleep_secs(seconds):
  time.sleep(seconds)
  print(f'{seconds} has been processed')
  return seconds

secs_list = [2, 4, 6, 8, 10, 12]

Now, if your operation is I/O-bound, you can use the ThreadPoolExecutor like this:

with ThreadPoolExecutor() as executor:
  results = executor.map(sleep_secs, secs_list)

Note how map is used here to map your function to the list of arguments.

Now, if your function is CPU-bound, you can use the ProcessPoolExecutor instead:

with ProcessPoolExecutor() as executor:
  results = executor.map(sleep_secs, secs_list)

If you are not sure, you can simply try both and see which one gives you better results.
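For example, one quick way to compare them is to time the same workload under each executor. The sketch below is my own illustration (it redefines sleep_secs without the print so the timing output stays clean); the if __name__ == '__main__' guard is needed because ProcessPoolExecutor starts new worker processes:

import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def sleep_secs(seconds):
    time.sleep(seconds)
    return seconds

secs_list = [2, 4, 6, 8, 10, 12]

if __name__ == '__main__':  # guard required because ProcessPoolExecutor spawns worker processes
    for executor_cls in (ThreadPoolExecutor, ProcessPoolExecutor):
        start = time.perf_counter()
        with executor_cls() as executor:
            list(executor.map(sleep_secs, secs_list))
        print(f'{executor_cls.__name__} took {time.perf_counter() - start:.2f}s')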

Finally, if you are looking to print out your results, you can simply do this:

with ThreadPoolExecutor() as executor:
  results = executor.map(sleep_secs, secs_list)
  for result in results:
    print(result)

In 2021, the easiest way is to use asyncio:

import asyncio, time

async def say_after(delay, what):
    await asyncio.sleep(delay)
    print(what)

async def main():

    task1 = asyncio.create_task(
        say_after(4, 'hello'))

    task2 = asyncio.create_task(
        say_after(3, 'world'))

    print(f"started at {time.strftime('%X')}")

    # Wait until both tasks are completed (should take
    # around 4 seconds, the longer of the two delays.)
    await task1
    await task2

    print(f"finished at {time.strftime('%X')}")


asyncio.run(main())
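Equivalently, asyncio.gather can schedule both coroutines and wait for them in a single call; a short variant of the same example (reusing say_after and the imports from above):

async def main():
    print(f"started at {time.strftime('%X')}")
    # gather runs both coroutines concurrently and waits for both to finish
    await asyncio.gather(say_after(4, 'hello'), say_after(3, 'world'))
    print(f"finished at {time.strftime('%X')}")

asyncio.run(main())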

References:

[1] https://docs.python.org/3/library/asyncio-task.html


If you are a Windows user and using Python 3, then this post will help you do parallel programming in Python. When you run the usual multiprocessing library's Pool code, you will get an error regarding the main function in your program. This is because Windows has no fork() functionality. The post below gives a solution to the mentioned problem.

http://python.6.x6.nabble.com/Multiprocessing-Pool-woes-td5047050.html
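For reference, the usual direct fix for that error is to guard the Pool creation with if __name__ == "__main__":, since Windows has no fork() and instead re-imports your module in every child process. A minimal sketch of that standard fix:

from multiprocessing import Pool

def cube(x):
    return x ** 3

if __name__ == "__main__":  # required on Windows: child processes re-import this module
    with Pool(processes=2) as pool:
        print(pool.map(cube, range(1, 7)))  # [1, 8, 27, 64, 125, 216]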

Since I was using Python 3, I changed the program a little, like this (saved as poolable.py, which is imported below):

# poolable.py
# Serializes a function's code, defaults, closure and dict with marshal so the
# function can be reconstructed inside worker processes that cannot pickle it.
from types import FunctionType
import marshal

def _applicable(*args, **kwargs):
  name = kwargs['__pw_name']
  code = marshal.loads(kwargs['__pw_code'])
  gbls = globals() #gbls = marshal.loads(kwargs['__pw_gbls'])
  defs = marshal.loads(kwargs['__pw_defs'])
  clsr = marshal.loads(kwargs['__pw_clsr'])
  fdct = marshal.loads(kwargs['__pw_fdct'])
  func = FunctionType(code, gbls, name, defs, clsr)
  func.fdct = fdct
  del kwargs['__pw_name']
  del kwargs['__pw_code']
  del kwargs['__pw_defs']
  del kwargs['__pw_clsr']
  del kwargs['__pw_fdct']
  return func(*args, **kwargs)

def make_applicable(f, *args, **kwargs):
  if not isinstance(f, FunctionType): raise ValueError('argument must be a function')
  kwargs['__pw_name'] = f.__name__  # edited
  kwargs['__pw_code'] = marshal.dumps(f.__code__)   # edited
  kwargs['__pw_defs'] = marshal.dumps(f.__defaults__)  # edited
  kwargs['__pw_clsr'] = marshal.dumps(f.__closure__)  # edited
  kwargs['__pw_fdct'] = marshal.dumps(f.__dict__)   # edited
  return _applicable, args, kwargs

def _mappable(x):
  x,name,code,defs,clsr,fdct = x
  code = marshal.loads(code)
  gbls = globals() #gbls = marshal.loads(gbls)
  defs = marshal.loads(defs)
  clsr = marshal.loads(clsr)
  fdct = marshal.loads(fdct)
  func = FunctionType(code, gbls, name, defs, clsr)
  func.fdct = fdct
  return func(x)

def make_mappable(f, iterable):
  if not isinstance(f, FunctionType): raise ValueError('argument must be a function')
  name = f.__name__    # edited
  code = marshal.dumps(f.__code__)   # edited
  defs = marshal.dumps(f.__defaults__)  # edited
  clsr = marshal.dumps(f.__closure__)  # edited
  fdct = marshal.dumps(f.__dict__)  # edited
  return _mappable, ((i,name,code,defs,clsr,fdct) for i in iterable)

With these helpers in place, the problematic code also changes a little, like this:

from multiprocessing import Pool
from poolable import make_applicable, make_mappable

def cube(x):
  return x**3

if __name__ == "__main__":
  pool    = Pool(processes=2)
  results = [pool.apply_async(*make_applicable(cube,x)) for x in range(1,7)]
  print([result.get(timeout=10) for result in results])

And I got the output:

[1, 8, 27, 64, 125, 216]
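The poolable module above also defines make_mappable; going by its signature, it should work with pool.map in the same way. This is an untested sketch on my part:

from multiprocessing import Pool
from poolable import make_mappable

def cube(x):
  return x**3

if __name__ == "__main__":
  pool = Pool(processes=2)
  print(pool.map(*make_mappable(cube, range(1, 7))))  # expected: [1, 8, 27, 64, 125, 216]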

I think this post may be useful for some Windows users.


If your functions are mainly doing I/O work (and less CPU work) and you have Python 3.2+, you can use a ThreadPoolExecutor:

from concurrent.futures import ThreadPoolExecutor

def run_io_tasks_in_parallel(tasks):
    with ThreadPoolExecutor() as executor:
        running_tasks = [executor.submit(task) for task in tasks]
        for running_task in running_tasks:
            running_task.result()

run_io_tasks_in_parallel([
    lambda: print('IO task 1 running!'),
    lambda: print('IO task 2 running!'),
])

If your functions are mainly doing CPU work (and less I/O work) and you have Python 2.6+, you can use the multiprocessing module:

from multiprocessing import Process

def run_cpu_tasks_in_parallel(tasks):
    running_tasks = [Process(target=task) for task in tasks]
    for running_task in running_tasks:
        running_task.start()
    for running_task in running_tasks:
        running_task.join()

run_cpu_tasks_in_parallel([
    lambda: print('CPU task 1 running!'),
    lambda: print('CPU task 2 running!'),
])
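One caveat: on Windows (and on any platform that uses the spawn start method, such as macOS by default on Python 3.8+), the target passed to Process must be picklable, so lambdas won't work, and the processes must be created under an if __name__ == "__main__": guard. A sketch of the same example with top-level functions:

from multiprocessing import Process

def cpu_task_1():
    print('CPU task 1 running!')

def cpu_task_2():
    print('CPU task 2 running!')

def run_cpu_tasks_in_parallel(tasks):
    running_tasks = [Process(target=task) for task in tasks]
    for running_task in running_tasks:
        running_task.start()
    for running_task in running_tasks:
        running_task.join()

if __name__ == '__main__':
    run_cpu_tasks_in_parallel([cpu_task_1, cpu_task_2])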

This can be done elegantly with Ray, a system that allows you to easily parallelize and distribute your Python code.

To parallelize your example, you'd need to define your functions with the @ray.remote decorator, and then invoke them with .remote.

import ray

ray.init()

dir1 = 'C:\\folder1'
dir2 = 'C:\\folder2'
filename = 'test.txt'
addFiles = [25, 5, 15, 35, 45, 25, 5, 15, 35, 45]

# Define the functions. 
# You need to pass every global variable used by the function as an argument.
# This is needed because each remote function runs in a different process,
# and thus it does not have access to the global variables defined in 
# the current process.
@ray.remote
def func1(filename, addFiles, dir):
    # func1() code here...
    pass

@ray.remote
def func2(filename, addFiles, dir):
    # func2() code here...
    pass

# Start two tasks in the background and wait for them to finish.
ray.get([func1.remote(filename, addFiles, dir1), func2.remote(filename, addFiles, dir2)]) 

If you pass the same argument to both functions and the argument is large, a more efficient way to do this is to use ray.put(). This avoids serializing the large argument twice and creating two in-memory copies of it:

largeData_id = ray.put(largeData)

ray.get([func1.remote(largeData_id), func2.remote(largeData_id)])

Important - If func1() and func2() return results, you need to rewrite the code as follows:

ret_id1 = func1.remote(filename, addFiles, dir1)
ret_id2 = func2.remote(filename, addFiles, dir2)
ret1, ret2 = ray.get([ret_id1, ret_id2])

There are a number of advantages of using Ray over the multiprocessing module. In particular, the same code will run on a single machine as well as on a cluster of machines. For more advantages of Ray see this related post.


There's no way to guarantee that two functions will execute in sync with each other, which seems to be what you want.

The best you can do is to split up the function into several steps, then wait for both to finish at critical synchronization points using Process.join like @aix's answer mentions.

This is better than time.sleep(10) because you can't guarantee exact timings. By waiting explicitly, you're saying that both functions must be done executing a step before moving on to the next one, instead of assuming a step will be done within 10 seconds, which isn't guaranteed given whatever else is going on on the machine.
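A minimal sketch of that idea, with placeholder step functions I've made up for illustration: each step runs in both directories in parallel, and Process.join acts as the synchronization point before the next step begins.

from multiprocessing import Process

def setup_step(directory):
    # placeholder step 1: e.g. create the directory and the initial files
    print(f'setting up {directory}')

def main_step(directory):
    # placeholder step 2: the long-running work
    print(f'processing {directory}')

def run_step_in_parallel(step, dirs):
    processes = [Process(target=step, args=(d,)) for d in dirs]
    for p in processes:
        p.start()
    for p in processes:
        p.join()  # synchronization point: wait for every worker to finish this step

if __name__ == '__main__':
    dirs = ['C:\\folder1', 'C:\\folder2']
    run_step_in_parallel(setup_step, dirs)  # both setups finish before...
    run_step_in_parallel(main_step, dirs)   # ...either main step starts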

