If you use an operating system that uses copy-on-write `fork()` semantics (like any common Unix), then as long as you never alter your data structure it will be available to all child processes without taking up additional memory. You will not have to do anything special (except make absolutely sure you don't alter the object).
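For example, here is a minimal sketch of relying on copy-on-write (assuming the `fork` start method, which is only available on Unix; `big_data` and `worker` are just placeholder names). Keep in mind that CPython's reference counting can still touch, and therefore copy, pages of objects you only read:

```python
import multiprocessing as mp

# Large read-only structure created before forking; with copy-on-write
# fork() semantics the children share these pages until something writes to them.
big_data = list(range(10_000_000))

def worker(index):
    # Read-only access: the parent's pages are not duplicated.
    return big_data[index]

if __name__ == "__main__":
    mp.set_start_method("fork")  # the default on Linux; not available on Windows
    with mp.Pool(4) as pool:
        print(pool.map(worker, [0, 100, 1000]))
```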
The most efficient thing you can do for your problem would be to pack your array into an efficient array structure (using `numpy` or `array`), place that in shared memory, wrap it with `multiprocessing.Array`, and pass that to your functions. This answer shows how to do that.
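A rough sketch of that approach might look like the following (the array size, dtype, and worker function are illustrative, not taken from your code):

```python
import multiprocessing as mp
import numpy as np

def worker(shared_arr, i):
    # Re-wrap the shared buffer as a numpy array (no copy) inside the child.
    arr = np.frombuffer(shared_arr.get_obj(), dtype=np.float64)
    print(arr[i])

if __name__ == "__main__":
    # Allocate an array of doubles in shared memory; lock=True gives a synchronized wrapper.
    shared_arr = mp.Array('d', 1000, lock=True)
    np.frombuffer(shared_arr.get_obj(), dtype=np.float64)[:] = np.arange(1000)

    p = mp.Process(target=worker, args=(shared_arr, 42))
    p.start()
    p.join()
```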
If you want a writeable shared object, then you will need to wrap it with some kind of synchronization or locking. `multiprocessing` provides two ways of doing this: shared memory (suitable for simple values, arrays, or ctypes) or a `Manager` proxy, where one process holds the memory and a manager arbitrates access to it from other processes (even over a network).
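For the shared-memory route, a minimal sketch with explicit locking could look like this (the shared counter is only an illustrative example):

```python
import multiprocessing as mp

def increment(counter, n):
    for _ in range(n):
        # get_lock() returns the lock protecting the shared value.
        with counter.get_lock():
            counter.value += 1

if __name__ == "__main__":
    counter = mp.Value('i', 0)  # a shared int living in shared memory
    procs = [mp.Process(target=increment, args=(counter, 10_000)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(counter.value)  # 40000 with locking; likely less without it
```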
The `Manager` approach can be used with arbitrary Python objects, but will be slower than the equivalent using shared memory because the objects need to be serialized/deserialized and sent between processes.
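And a comparable sketch using a `Manager` proxy (again, the dict and worker are placeholders):

```python
import multiprocessing as mp

def add_result(shared_dict, key, value):
    # Each access goes through the manager process via a proxy,
    # which serializes the data and sends it over a pipe/socket.
    shared_dict[key] = value

if __name__ == "__main__":
    with mp.Manager() as manager:
        shared_dict = manager.dict()  # proxy to a dict held by the manager process
        procs = [mp.Process(target=add_result, args=(shared_dict, i, i * i))
                 for i in range(4)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        print(dict(shared_dict))  # e.g. {0: 0, 1: 1, 2: 4, 3: 9}
```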
There is a wealth of parallel processing libraries and approaches available in Python. `multiprocessing` is an excellent and well-rounded library, but if you have special needs perhaps one of the other approaches may be better.