What is copy-on-write?

Question

I would like to know what copy-on-write is and what it is used for. The term "copy-on-write array" is mentioned several times in the Sun JDK tutorials, but I didn't understand what it meant.

User · Accepted Answer

I was going to write up my own explanation but this Wikipedia article pretty much sums it up.

Here is the basic concept:

Copy-on-write (sometimes referred to as "COW") is an optimization strategy used in computer programming. The fundamental idea is that if multiple callers ask for resources which are initially indistinguishable, you can give them pointers to the same resource. This function can be maintained until a caller tries to modify its "copy" of the resource, at which point a true private copy is created to prevent the changes becoming visible to everyone else. All of this happens transparently to the callers. The primary advantage is that if a caller never makes any modifications, no private copy need ever be created.
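As a rough illustration of the idea quoted above, here is a minimal Python sketch. `CowList` and its method names are hypothetical, invented for this example; it is not how any particular library implements the strategy. Copies share one underlying list, and a handle takes a private copy only when it is first written to.

```python
class CowList:
    """Hypothetical list wrapper that shares its buffer until the first write."""

    def __init__(self, items=()):
        self._items = list(items)
        self._shared = False  # True while another handle may use the same buffer

    def copy(self):
        # Cheap "copy": hand out a second handle to the same underlying list.
        other = CowList.__new__(CowList)
        other._items = self._items
        other._shared = True
        self._shared = True
        return other

    def __getitem__(self, index):
        # Reads never trigger a copy.
        return self._items[index]

    def __setitem__(self, index, value):
        if self._shared:
            # First write: take a private copy so other handles are unaffected.
            # (Conservative: a handle may copy once even if it is the last sharer.)
            self._items = list(self._items)
            self._shared = False
        self._items[index] = value

    def shares_buffer_with(self, other):
        return self._items is other._items


a = CowList([1, 2, 3])
b = a.copy()
assert a.shares_buffer_with(b)       # nothing has been copied yet
b[0] = 99                            # the write triggers the private copy
assert not a.shares_buffer_with(b)
assert a[0] == 1 and b[0] == 99
```

If `b` were only ever read, the private copy would never be made, which is the advantage the quote describes.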

Here is also a common application of COW:

The COW concept is also used in maintenance of instant snapshots on database servers like Microsoft SQL Server 2005. Instant snapshots preserve a static view of a database by storing a pre-modification copy of data when underlying data are updated. Instant snapshots are used for testing uses or moment-dependent reports and should not be used to replace backups.
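The snapshot behaviour described in the quote can be sketched roughly as follows. This is a toy model under stated assumptions: `Database`, `Snapshot`, and the idea of numbered pages are hypothetical and do not reflect how SQL Server implements snapshots. The snapshot stores a page only when the live page is about to be overwritten.

```python
_ABSENT = object()  # marks a page that did not exist when the snapshot was taken


class Database:
    """Toy page store; pages are addressed by number (hypothetical model)."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.snapshots = []

    def create_snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, page_no, data):
        # Before overwriting, let each snapshot keep the pre-modification page.
        for snap in self.snapshots:
            snap.preserve(page_no)
        self.pages[page_no] = data


class Snapshot:
    def __init__(self, db):
        self._db = db
        self.saved = {}  # only pages that changed after the snapshot was taken

    def preserve(self, page_no):
        if page_no not in self.saved:
            self.saved[page_no] = self._db.pages.get(page_no, _ABSENT)

    def read(self, page_no):
        # Changed pages come from the saved copies, the rest from the live data.
        if page_no in self.saved:
            data = self.saved[page_no]
        else:
            data = self._db.pages.get(page_no, _ABSENT)
        if data is _ABSENT:
            raise KeyError(page_no)
        return data


db = Database({1: "alice", 2: "bob"})
snap = db.create_snapshot()
db.write(2, "carol")
assert db.pages[2] == "carol"    # live data sees the update
assert snap.read(2) == "bob"     # snapshot still sees the old page
assert snap.read(1) == "alice"   # unchanged page is read from the live data
assert len(snap.saved) == 1      # only the modified page was copied
```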

User · Answer

It is a memory-protection concept. The operating system creates an extra copy when a child process modifies data, so the updated data is not reflected in the parent's data.

User · Answer

Just to provide another example: Mercurial uses copy-on-write to make cloning local repositories a really "cheap" operation. The principle is the same as the other examples, except that you're talking about physical files instead of objects in memory. Initially, a clone is not a duplicate but a hard link to the original. As you change files in the clone, copies are written to represent the new version.
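The hard-link idea can be tried out with a short Python sketch. This is only an illustration of the principle, not Mercurial's actual code; the helper names `clone_file` and `write_clone` are made up for the example, and it assumes a filesystem that supports hard links.

```python
import os
import tempfile


def clone_file(src, dst):
    # "Clone" by hard-linking: both names refer to the same data on disk.
    os.link(src, dst)


def write_clone(path, text):
    # Break the link before writing: put the new version in a fresh file,
    # then rename it over the clone's name. The original stays untouched.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)


with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "original.txt")
    clone = os.path.join(d, "clone.txt")
    with open(original, "w") as f:
        f.write("v1")

    clone_file(original, clone)
    assert os.path.samefile(original, clone)       # one copy of the data, two names

    write_clone(clone, "v2")
    assert not os.path.samefile(original, clone)   # the clone now has its own file
    with open(original) as f:
        assert f.read() == "v1"
```

Writing to the clone in place (opening it with mode `"w"`) would modify the shared data, which is why the sketch writes a new file and renames it.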

User · Answer

A good example is Git, which uses a strategy to store blobs. Why does it use hashes? Partly because these are easier to perform diffs on, but also because it makes it simpler to optimise a COW strategy. When you make a new commit with few file changes, the vast majority of objects and trees will not change. Therefore the commit will, through various pointers made of hashes, reference a bunch of objects that already exist, making the storage space required to store the entire history much smaller.
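To make the storage argument concrete, here is a small Python sketch of a content-addressed store. It is a simplification under stated assumptions: `ObjectStore` is a hypothetical class, and the object and tree formats are invented for the example rather than taken from Git. Because objects are keyed by the hash of their content, a second commit adds only what changed.

```python
import hashlib


class ObjectStore:
    """Hypothetical content-addressed store: hash of content -> content."""

    def __init__(self):
        self.objects = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha1(content).hexdigest()
        self.objects.setdefault(key, content)  # identical content is stored once
        return key

    def commit(self, files: dict) -> str:
        # A "tree" here is just a sorted listing of "name blob-hash" lines.
        entries = [f"{name} {self.put(data)}" for name, data in sorted(files.items())]
        return self.put("\n".join(entries).encode())


store = ObjectStore()
first = store.commit({"a.txt": b"hello", "b.txt": b"world"})
assert len(store.objects) == 3   # two blobs and one tree

second = store.commit({"a.txt": b"hello", "b.txt": b"world!"})
assert len(store.objects) == 5   # only one new blob and one new tree were added
assert first != second
```

The unchanged `a.txt` blob is referenced by both trees but stored once.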

User · Answer

I shall not repeat the same answer on copy-on-write. I think Andrew's answer and Charlie's answer have already made it very clear. I will give you an example from the OS world, just to mention how widely this concept is used. We can use fork() or vfork() to create a new process. vfork follows the idea behind copy-on-write: the child process created by vfork shares the data and code segments with the parent process instead of receiving its own copy. (Strictly speaking, vfork shares the parent's memory outright rather than copying it on write, and modern fork implementations are themselves typically built on copy-on-write pages.) This speeds up forking. vfork is meant to be used when you call exec immediately after vfork. So vfork creates a child process that shares the data and code segments with its parent, but when we call exec, it loads the image of a new executable into the address space of the child process.
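Python's `os` module exposes `fork` but not `vfork`, so the sketch below uses `os.fork` on a POSIX system. It only demonstrates the visible effect: a write made in the child is not seen by the parent. Whether the kernel implements this with copy-on-write pages is an implementation detail that the example cannot observe. The function name `fork_and_modify` is made up for the illustration.

```python
import os


def fork_and_modify():
    data = ["parent"]
    read_fd, write_fd = os.pipe()

    pid = os.fork()
    if pid == 0:
        # Child: this write affects only the child's view of memory.
        try:
            os.close(read_fd)
            data[0] = "child"
            os.write(write_fd, data[0].encode())
        finally:
            os._exit(0)  # leave immediately, without running parent cleanup code

    # Parent: collect what the child saw, then look at our own copy.
    os.close(write_fd)
    child_value = os.read(read_fd, 64).decode()
    os.close(read_fd)
    os.waitpid(pid, 0)
    return data[0], child_value


parent_value, child_value = fork_and_modify()
assert parent_value == "parent"
assert child_value == "child"
```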

User · Answer

It's also used in Ruby "Enterprise Edition" as a neat way of saving memory.

User · Answer

I found this good article about zval in PHP, which mentioned COW too: Copy On Write (abbreviated as "COW") is a trick designed to save memory. It is used more generally in software engineering. It means that PHP will copy the memory (or allocate a new memory region) when you write to a symbol, if this one was already pointing to a zval.
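The behaviour described in that quote can be modelled loosely in Python. This is a simplified, hypothetical model (`Zval` and `SymbolTable` here are toy classes, not PHP's engine structures): assignment shares one value container and bumps a reference count, and a write separates the symbol only if the container is shared.

```python
class Zval:
    """Toy value container with a reference count (hypothetical model)."""

    def __init__(self, value):
        self.value = value
        self.refcount = 1


class SymbolTable:
    def __init__(self):
        self._symbols = {}

    def _unbind(self, name):
        old = self._symbols.pop(name, None)
        if old is not None:
            old.refcount -= 1

    def assign(self, target, source):
        # Like `$target = $source`: share the container, do not copy the value.
        zval = self._symbols[source]
        if self._symbols.get(target) is zval:
            return
        self._unbind(target)
        zval.refcount += 1
        self._symbols[target] = zval

    def set(self, name, value):
        zval = self._symbols.get(name)
        if zval is not None and zval.refcount == 1:
            zval.value = value            # sole owner: write in place
            return
        # New symbol, or a shared container: allocate a separate one on write.
        self._unbind(name)
        self._symbols[name] = Zval(value)

    def get(self, name):
        return self._symbols[name].value

    def zval(self, name):
        return self._symbols[name]


table = SymbolTable()
table.set("a", "hello")
table.assign("b", "a")
assert table.zval("a") is table.zval("b")       # shared, refcount is 2
assert table.zval("a").refcount == 2

table.set("b", "bye")                           # write triggers the separation
assert table.zval("a") is not table.zval("b")
assert table.get("a") == "hello" and table.get("b") == "bye"
assert table.zval("a").refcount == 1
```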

User · Answer

"Copy on write" means more or less what it sounds like: everyone has a single shared copy of the same data until it's written, and then a copy is made. Usually, copy-on-write is used to resolve concurrency sorts of problems. In ZFS, for example, data blocks on disk are allocated copy-on-write: as long as there are no changes, you keep the original blocks, and a change touches only the affected blocks. This means the minimum number of new blocks are allocated. These changes are also usually implemented to be transactional, i.e., they have the ACID properties. This eliminates some concurrency issues, because then you're guaranteed that all updates are atomic.
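A toy Python model can show why block-level copy-on-write gives both minimal allocation and atomic updates. It is a sketch under stated assumptions: `BlockStore` and `File` are hypothetical classes, and nothing here reflects ZFS's real on-disk format. Existing blocks are never overwritten; an update writes new blocks for the changed positions and then publishes a new root in a single step.

```python
class BlockStore:
    """Toy append-only block store (hypothetical, not ZFS's on-disk format)."""

    def __init__(self):
        self.blocks = []  # a block is never overwritten once written

    def allocate(self, data):
        self.blocks.append(data)
        return len(self.blocks) - 1  # the block's address


class File:
    def __init__(self, store, chunks):
        self.store = store
        # The root is an immutable tuple of block addresses.
        self.root = tuple(store.allocate(chunk) for chunk in chunks)

    def read(self, root=None):
        root = self.root if root is None else root
        return [self.store.blocks[addr] for addr in root]

    def update(self, changes):
        # Write new blocks only for the changed positions...
        new_root = list(self.root)
        for index, data in changes.items():
            new_root[index] = self.store.allocate(data)
        # ...then publish them with one rebinding: all changes or none.
        self.root = tuple(new_root)


store = BlockStore()
f = File(store, ["a", "b", "c", "d"])
old_root = f.root

f.update({1: "B", 2: "C"})
assert len(store.blocks) == 6                     # only two new blocks allocated
assert f.read() == ["a", "B", "C", "d"]
assert f.read(old_root) == ["a", "b", "c", "d"]   # old version is still intact
```

Keeping `old_root` around is effectively a snapshot: the old blocks were never modified, so the previous version remains readable.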

User · Answer

Copy-on-write is a technique to reduce the memory usage of resource copies using deferred copy. The resource copies are initially virtual (i.e. they share memory) and only become real (i.e. they have their own memory) on the first write operation, hence the name "copy-on-write".

Below is a Python implementation of the copy-on-write technique using the proxy design pattern. A ValueProxy object (the proxy) implements the copy-on-write technique by:

- having an attribute bound to an immutable Value object (the subject);
- translating copy requests to the creation of a new ValueProxy object sharing the same subject attribute as the original ValueProxy object;
- forwarding read requests to the subject attribute;
- translating write requests to the creation of a new immutable Value object with the new state and the rebinding of the subject attribute to this new immutable Value object.

```python
import abc


class BaseValue(abc.ABC):
    @abc.abstractmethod
    def read(self):
        raise NotImplementedError

    @abc.abstractmethod
    def write(self, data):
        raise NotImplementedError


class Value(BaseValue):
    def __init__(self, data):
        self.data = data

    def read(self):
        return self.data

    def write(self, data):
        pass


class ValueProxy(BaseValue):
    def __init__(self, subject):
        self.subject = subject

    def read(self):
        return self.subject.read()

    def write(self, data):
        self.subject = Value(data)

    def clone(self):
        return ValueProxy(self.subject)


v1 = ValueProxy(Value('foo'))
v2 = v1.clone()  # shares the immutable Value object between the copies
assert v1.subject is v2.subject
v2.write('bar')  # creates a new immutable Value object with the new state
assert v1.subject is not v2.subject
```
