Why doesn't C++ have a garbage collector?

Question

First of all, I'm not asking this question because of the merits of garbage collection. My main reason for asking is that I know Bjarne Stroustrup has said that C++ will have a garbage collector at some point.

With that said, why hasn't it been added? There are already some garbage collectors for C++. Is this just one of those "easier said than done" things? Or are there other reasons it hasn't been added (and won't be added in C++11)?

Cross-links: Garbage collectors for C++

Just to clarify, I understand the reasons why C++ didn't have a garbage collector when it was first created. I'm wondering why the collector can't be added in.

User · Answer

SHORT ANSWER: We don't know how to do garbage collection efficiently (with minor time and space overhead) and correctly all the time (in all possible cases).

LONG ANSWER: Just like C, C++ is a systems language; that is, it is used when you are writing system code, e.g. an operating system. In other words, C++ is designed, just like C, with the best possible performance as the main target. The language standard will not add any feature that might hinder the performance objective.

This raises the question: why does garbage collection hinder performance? The main reason is that, when it comes to implementation, we (computer scientists) do not know how to do garbage collection with minimal overhead in all cases. Hence it's impossible for the C++ compiler and runtime system to perform garbage collection efficiently all the time. On the other hand, a C++ programmer should know his design/implementation, and he's the best person to decide how best to do the garbage collection.

Last, if control (hardware, details, etc.) and performance (time, space, power, etc.) are not the main constraints, then C++ is not the right tool. Other languages might serve better and offer more (hidden) runtime management, with the necessary overhead.

User · Answer

Stroustrup made some good comments on this at the 2013 Going Native conference.

Just skip to about 25m50s in this video. (I'd recommend watching the whole video actually, but this skips to the stuff about garbage collection.)

When you have a really great language that makes it easy (and safe, and predictable, and easy-to-read, and easy-to-teach) to deal with objects and values in a direct way, avoiding (explicit) use of the heap, then you don't even want garbage collection.

With modern C++, and the stuff we have in C++11, garbage collection is no longer desirable except in limited circumstances. In fact, even if a good garbage collector is built into one of the major C++ compilers, I think that it won't be used very often. It will be easier, not harder, to avoid the GC.

He shows this example:

    void f(int n, int x)
    {
        Gadget *p = new Gadget{n};
        if (x < 100) throw SomeException{};
        if (x < 200) return;
        delete p;
    }

This is unsafe in C++. But it's also unsafe in Java! In C++, if the function returns early, the delete will never be called. But if you had full garbage collection, such as in Java, you merely get a suggestion that the object will be destructed "at some point in the future" (Update: it's even worse than this. Java does not promise to call the finalizer ever - it may never be called). This isn't good enough if Gadget holds an open file handle, or a connection to a database, or data which you have buffered for writing to a database at a later point. We want the Gadget to be destroyed as soon as it's finished, in order to free these resources as soon as possible. You don't want your database server struggling with thousands of database connections that are no longer needed - it doesn't know that your program is finished working.

So what's the solution? There are a few approaches. The obvious approach, which you'll use for the vast majority of your objects, is:

    void f(int n, int x)
    {
        Gadget p = {n};  // Just leave it on the stack (where it belongs!)
        if (x < 100) throw SomeException{};
        if (x < 200) return;
    }

This takes fewer characters to type. It doesn't have new getting in the way. It doesn't require you to type Gadget twice. The object is destroyed at the end of the function. If this is what you want, this is very intuitive. Gadgets behave the same as int or double. Predictable, easy-to-read, easy-to-teach. Everything is a 'value'. Sometimes a big value, but values are easier to teach because you don't have this 'action at a distance' thing that you get with pointers (or references).

Most of the objects you make are for use only in the function that created them, and perhaps passed as inputs to child functions. The programmer shouldn't have to think about 'memory management' when returning objects, or otherwise sharing objects across widely separated parts of the software.

Scope and lifetime are important. Most of the time, it's easier if the lifetime is the same as the scope. It's easier to understand and easier to teach. When you want a different lifetime, it should be obvious reading the code that you're doing this, by use of shared_ptr for example. (Or returning (large) objects by value, leveraging move semantics or unique_ptr.)

This might seem like an efficiency problem. What if I want to return a Gadget from foo()? C++11's move semantics make it easier to return big objects. Just write Gadget foo() { ... } and it will just work, and work quickly. You don't need to mess with && yourself; just return things by value and the language will often be able to do the necessary optimizations. (Even before C++03, compilers did a remarkably good job at avoiding unnecessary copying.)

As Stroustrup said elsewhere in the video (paraphrasing): "Only a computer scientist would insist on copying an object, and then destroying the original. (audience laughs) Why not just move the object directly to the new location? This is what humans (not computer scientists) expect."

When you can guarantee only one copy of an object is needed, it's much easier to understand the lifetime of the object. You can pick what lifetime policy you want, and garbage collection is there if you want. But when you understand the benefits of the other approaches, you'll find that garbage collection is at the bottom of your list of preferences.

If that doesn't work for you, you can use unique_ptr, or failing that, shared_ptr. Well-written C++11 is shorter, easier-to-read, and easier-to-teach than many other languages when it comes to memory management.
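
The claim that the stack version cleans up on every exit path can be checked with a small sketch. The Gadget and SomeException types below are hypothetical stand-ins (the talk does not define them), and the live counter exists only so that destruction can be observed:

```cpp
#include <stdexcept>

// Hypothetical stand-in for the Gadget in the example above.
// It counts live instances so that cleanup can be observed.
struct Gadget {
    static int live;                 // number of Gadgets currently alive
    int id;
    explicit Gadget(int n) : id(n) { ++live; }
    ~Gadget() { --live; }
    Gadget(const Gadget&) = delete;  // keep the count simple: no copies
    Gadget& operator=(const Gadget&) = delete;
};
int Gadget::live = 0;

// Hypothetical exception type, assumed for illustration.
struct SomeException : std::runtime_error {
    SomeException() : std::runtime_error("x < 100") {}
};

// Same control flow as f above, with the Gadget held by value.
void f(int n, int x)
{
    Gadget p{n};
    if (x < 100) throw SomeException{};  // destructor runs during unwinding
    if (x < 200) return;                 // destructor runs on early return
}                                        // destructor runs on normal exit

// Calls f(n, x), swallows the exception if one is thrown, and reports
// how many Gadgets are still alive afterwards.
int live_after_f(int n, int x)
{
    try { f(n, x); } catch (const SomeException&) {}
    return Gadget::live;
}
```

Calling live_after_f with x below 100, between 100 and 200, and above 200 exercises the throwing, early-return and normal paths; in each case no Gadget should remain alive.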

User · Answer

All the technical talk is overcomplicating the concept.

If you put GC into C++ for all memory automatically, then consider something like a web browser. The web browser must load a full web document AND run web scripts. You can store web script variables in the document tree. In a BIG document in a browser with lots of tabs open, this means that every time the GC must do a full collection it must also scan all the document elements.

On most computers this means that PAGE FAULTS will occur. So the main reason, to answer the question, is that PAGE FAULTS will occur. You will recognise this as the moment your PC starts making lots of disk accesses. This is because the GC must touch lots of memory in order to prove which pointers are invalid. When you have a bona fide application using lots of memory, having to scan all objects on every collection is havoc because of the PAGE FAULTS. A page fault (in the sense used here) is when a page of virtual memory has to be read back into RAM from disk.

So the correct solution is to divide an application into the parts that need GC and the parts that do not. In the web browser example above, if the document tree was allocated with malloc but the JavaScript ran with GC, then every time the GC kicks in it only scans a small portion of memory, and the PAGED OUT parts of the document tree's memory do not need to get paged back in.

To further understand this problem, look up virtual memory and how it is implemented in computers. It is all about the fact that 2GB is available to the program when there is not really that much RAM. On modern computers with 2GB of RAM for a 32-bit system it is not such a problem, provided only one program is running.

As an additional example, consider a full collection that must trace all objects. First you must scan all objects reachable via roots. Second, scan all the objects visible in step 1. Then scan waiting destructors. Then go to all the pages again and switch off all invisible objects. This means that many pages might get swapped out and back in multiple times.

So my answer, to keep it short, is that the number of PAGE FAULTS which occur as a result of touching all the memory makes full GC for all objects in a program infeasible, and so the programmer must view GC as an aid for things like scripts and database work, but do normal things with manual memory management.

And the other very important reason, of course, is global variables. In order for the collector to know that a global variable pointer is in the GC it would require specific keywords, and thus existing C++ code would not work.

User · Answer

To answer most "why" questions about C++, read Design and Evolution of C++.

User · Answer

Imposing garbage collection is really a low-level to high-level paradigm shift.

If you look at the way strings are handled in a language with garbage collection, you will find they ONLY allow high-level string manipulation functions and do not allow binary access to the strings. Simply put, all string functions first check the pointers to see where the string is, even if you are only reading out a single byte. So if you are doing a loop that processes each byte in a string in a language with garbage collection, it must compute the base location plus offset for each iteration, because it cannot know when the string has moved. Then you have to think about heaps, stacks, threads, etc.

User · Answer

The idea behind C++ was that you would not pay any performance penalty for features that you don't use. So adding garbage collection would have meant having some programs run straight on the hardware the way C does and some within some sort of runtime virtual machine.

Nothing prevents you from using some form of smart pointers that are bound to some third-party garbage collection mechanism. I seem to recall Microsoft doing something like that with COM, and it didn't go too well.

User · Answer

To add to the debate here.

There are known issues with garbage collection, and understanding them helps in understanding why there is none in C++.

1. Performance?

The first complaint is often about performance, but most people don't really realize what they are talking about. As illustrated by Martin Beckett, the problem may not be performance per se, but the predictability of performance.

There are currently 2 families of GC that are widely deployed:

- Mark-And-Sweep kind
- Reference-Counting kind

Mark And Sweep is faster (less impact on overall performance) but it suffers from a "freeze the world" syndrome: i.e. when the GC kicks in, everything else is stopped until the GC has made its cleanup. If you wish to build a server that answers in a few milliseconds... some transactions will not live up to your expectations :)

The problem of Reference Counting is different: reference counting adds overhead, especially in multi-threaded environments, because you need to have an atomic count. Furthermore there is the problem of reference cycles, so you need a clever algorithm to detect those cycles and eliminate them (generally implemented by a "freeze the world" too, though less frequent). In general, as of today, this kind (even though normally more responsive, or rather, freezing less often) is slower than Mark And Sweep.

I have seen a paper by Eiffel implementers who were trying to implement a Reference Counting garbage collector that would have similar global performance to Mark And Sweep without the "Freeze The World" aspect. It required a separate thread for the GC (typical). The algorithm was a bit frightening (at the end) but the paper did a good job of introducing the concepts one at a time and showing the evolution of the algorithm from the "simple" version to the full-fledged one. Recommended reading, if only I could put my hands back on the PDF file...

2. Resource Acquisition Is Initialization (RAII)

It's a common idiom in C++ that you wrap the ownership of resources within an object to ensure that they are properly released. It's mostly used for memory since we don't have garbage collection, but it's also useful for many other situations:

- locks (multi-thread, file handle, ...)
- connections (to a database, another server, ...)

The idea is to properly control the lifetime of the object:

- it should be alive as long as you need it
- it should be killed when you're done with it

The problem of GC is that while it helps with the former and ultimately guarantees the latter... this "ultimately" may not be sufficient. If you release a lock, you'd really like it to be released now, so that it does not block any further calls!

Languages with GC have two workarounds:

- don't use GC when stack allocation is sufficient: it's normally done for performance reasons, but in our case it really helps since the scope defines the lifetime
- the using construct... but it's explicit (weak) RAII, while in C++ RAII is implicit so that the user CANNOT unwittingly make the error (by omitting the using keyword)

3. Smart Pointers

Smart pointers often appear as a silver bullet for handling memory in C++. Oftentimes I have heard: we don't need GC after all, since we have smart pointers.

One could not be more wrong.

Smart pointers do help: auto_ptr and unique_ptr use RAII concepts, extremely useful indeed. They are so simple that you can write them yourself quite easily.

When one needs to share ownership, however, it gets more difficult: you might share among multiple threads and there are a few subtle issues with the handling of the count. Therefore, one naturally goes toward shared_ptr.

It's great, that's what Boost is for after all, but it's not a silver bullet. In fact, the main issue with shared_ptr is that it emulates a GC implemented by Reference Counting, but you need to implement the cycle detection all by yourself... Urg.

Of course there is this weak_ptr thingy, but I have unfortunately already seen memory leaks despite the use of shared_ptr because of those cycles... and when you are in a multi-threaded environment, it's extremely difficult to detect!

4. What's the solution?

There is no silver bullet, but as always, it's definitely feasible. In the absence of GC one needs to be clear on ownership:

- prefer having a single owner at one given time, if possible
- if not, make sure that your class diagram does not have any cycle pertaining to ownership, and break them with subtle application of weak_ptr

So indeed, it would be great to have a GC... however it's no trivial issue. And in the meantime, we just need to roll up our sleeves.
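
The shared_ptr cycle problem and the weak_ptr remedy from points 3 and 4 can be reproduced in a few lines. This is only a sketch: the node types and the alive counter are hypothetical names introduced for illustration, and the first function deliberately breaks its cycle by hand at the end so that the demo itself does not leak.

```cpp
#include <memory>
#include <utility>

// Counts node objects that have been constructed but not yet destroyed.
static int alive = 0;

// Hypothetical node whose links are owning in both directions.
struct StrongNode {
    std::shared_ptr<StrongNode> other;
    StrongNode() { ++alive; }
    ~StrongNode() { --alive; }
};

// Hypothetical node with an owning forward link and a non-owning back link.
struct WeakNode {
    std::shared_ptr<WeakNode> next;
    std::weak_ptr<WeakNode> prev;
    WeakNode() { ++alive; }
    ~WeakNode() { --alive; }
};

// Builds a two-node cycle of owning pointers, drops the local handles and
// returns how many nodes survived (i.e. would have leaked).
int survivors_with_strong_cycle()
{
    alive = 0;
    StrongNode* raw = nullptr;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;        // cycle: neither count can reach zero
        raw = a.get();
    }
    int survivors = alive;   // both nodes are still alive here
    // Break the cycle manually so this demo does not actually leak.
    std::shared_ptr<StrongNode> tmp = std::move(raw->other);
    return survivors;        // tmp is released on return, freeing both nodes
}

// Same shape, but the back link is a weak_ptr, so no ownership cycle exists.
int survivors_with_weak_back_link()
{
    alive = 0;
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->next = b;
        b->prev = a;         // weak: does not keep a alive
    }
    return alive;
}
```

With owning links in both directions, both nodes outlive every handle to them; with the weak back link, both are destroyed as soon as the handles go out of scope. Nothing in the language flags the first case, which is the "cycle detection all by yourself" point made above.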

User · Answer

One of the biggest reasons that C++ doesn't have built-in garbage collection is that getting garbage collection to play nice with destructors is really, really hard. As far as I know, nobody really knows how to solve it completely yet. There are a lot of issues to deal with:

- Deterministic lifetimes of objects (reference counting gives you this, but GC doesn't, although it may not be that big of a deal).
- What happens if a destructor throws when the object is being garbage collected? Most languages ignore this exception, since there's really no catch block to be able to transport it to, but this is probably not an acceptable solution for C++.
- How to enable/disable it? Naturally it'd probably be a compile-time decision, but code that is written for GC vs. code that is written for NOT GC is going to be very different and probably incompatible. How do you reconcile this?

These are just a few of the problems faced.

User · Answer

If you want automatic garbage collection, there are good commercial and public-domain garbage collectors for C++. For applications where garbage collection is suitable, C++ is an excellent garbage collected language with a performance that compares favorably with other garbage collected languages. See The C++ Programming Language (4th Edition) for a discussion of automatic garbage collection in C++. See also Hans-J. Boehm's site for C and C++ garbage collection (archive).

Also, C++ supports programming techniques that allow memory management to be safe and implicit without a garbage collector. I consider garbage collection a last choice and an imperfect way of handling for resource management. That does not mean that it is never useful, just that there are better approaches in many situations.

Source: http://www.stroustrup.com/bs_faq.html#garbage-collection

As for why it doesn't have it built in: if I remember correctly, it was invented before GC was the thing, and I don't believe the language could have had GC for several reasons (e.g. backwards compatibility with C).

Hope this helps.

User · Answer

What type? Should it be optimised for embedded washing machine controllers, cell phones, workstations or supercomputers? Should it prioritise GUI responsiveness or server loading? Should it use lots of memory or lots of CPU?

C/C++ is used in just too many different circumstances. I suspect something like Boost smart pointers will be enough for most users.

Edit - Automatic garbage collectors aren't so much a problem of performance (you can always buy more server); it's a question of predictable performance. Not knowing when the GC is going to kick in is like employing a narcoleptic airline pilot: most of the time they are great - but when you really need responsiveness!

User · Answer

Implicit garbage collection could have been added in, but it just didn't make the cut. Probably due to not just implementation complications, but also due to people not being able to come to a general consensus fast enough.

A quote from Bjarne Stroustrup himself:

"I had hoped that a garbage collector which could be optionally enabled would be part of C++0x, but there were enough technical problems that I have to make do with just a detailed specification of how such a collector integrates with the rest of the language, if provided. As is the case with essentially all C++0x features, an experimental implementation exists."

There is a good discussion of the topic here.

General overview:

C++ is very powerful and allows you to do almost anything. For this reason it doesn't automatically push many things onto you that might impact performance. Garbage collection can be easily implemented with smart pointers (objects that wrap pointers with a reference count, which auto delete themselves when the reference count reaches 0).

C++ was built with competitors in mind that did not have garbage collection. Efficiency was the main concern that C++ had to fend off criticism from in comparison to C and others.

There are 2 types of garbage collection...

Explicit garbage collection:

C++0x will have garbage collection via pointers created with shared_ptr. If you want it you can use it; if you don't want it you aren't forced into using it. You can currently use boost::shared_ptr as well if you don't want to wait for C++0x.

Implicit garbage collection:

It does not have transparent garbage collection though. It will be a focus point for future C++ specs though.

Why TR1 doesn't have implicit garbage collection?

There are a lot of things that TR1 of C++0x should have had. Bjarne Stroustrup in previous interviews stated that TR1 didn't have as much as he would have liked.
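
As a rough illustration of the "explicit" kind described above, the sketch below records the reference count of a std::shared_ptr as owners come and go. Tracked, CountTrace and trace_counts are hypothetical names made up for this example, and use_count() is only reliable here because the code is single-threaded.

```cpp
#include <memory>

// Hypothetical payload that records its own destruction in a caller-owned flag.
struct Tracked {
    bool* destroyed;
    explicit Tracked(bool* flag) : destroyed(flag) {}
    ~Tracked() { *destroyed = true; }
};

// Snapshot of the reference count at each step of the demo.
struct CountTrace {
    long one_owner;
    long two_owners;
    long after_first_reset;
    bool destroyed_while_owned;
    bool destroyed_at_end;
};

CountTrace trace_counts()
{
    bool destroyed = false;
    CountTrace t{};

    auto first = std::make_shared<Tracked>(&destroyed);
    t.one_owner = first.use_count();           // a single owner

    auto second = first;                       // copying shares ownership
    t.two_owners = first.use_count();

    first.reset();                             // one owner lets go
    t.after_first_reset = second.use_count();
    t.destroyed_while_owned = destroyed;       // object must still exist

    second.reset();                            // last owner lets go
    t.destroyed_at_end = destroyed;            // destructor has now run
    return t;
}
```

The object is destroyed at the exact point where the last owner releases it, not at some later collection cycle, which is the main behavioural difference from a tracing collector.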

User · Answer

Mainly for two reasons:

- Because it doesn't need one (IMHO)
- Because it's pretty much incompatible with RAII, which is the cornerstone of C++

C++ already offers manual memory management, stack allocation, RAII, containers, automatic pointers, smart pointers... That should be enough. Garbage collectors are for lazy programmers who don't want to spend 5 minutes thinking about who should own which objects or when resources should be freed. That's not how we do things in C++.

User · Answer

Though this is an old question, there's still one problem that I don't see anybody having addressed at all: garbage collection is almost impossible to specify.

In particular, the C++ standard is quite careful to specify the language in terms of externally observable behavior, rather than how the implementation achieves that behavior. In the case of garbage collection, however, there is virtually no externally observable behavior.

The general idea of garbage collection is that it should make a reasonable attempt at assuring that a memory allocation will succeed. Unfortunately, it's essentially impossible to guarantee that any memory allocation will succeed, even if you do have a garbage collector in operation. This is true to some extent in any case, but particularly so in the case of C++, because it's (probably) not possible to use a copying collector (or anything similar) that moves objects in memory during a collection cycle.

If you can't move objects, you can't create a single, contiguous memory space from which to do your allocations -- and that means your heap (or free store, or whatever you prefer to call it) can, and probably will, become fragmented over time. This, in turn, can prevent an allocation from succeeding, even when there's more memory free than the amount being requested.

It might be possible to come up with some guarantee that says (in essence) that if you repeat exactly the same pattern of allocation repeatedly, and it succeeded the first time, it will continue to succeed on subsequent iterations, provided that the allocated memory became inaccessible between iterations. That's such a weak guarantee it's essentially useless, but I can't see any reasonable hope of strengthening it.

Even so, it's stronger than what has been proposed for C++. The previous proposal [warning: PDF] (that got dropped) didn't guarantee anything at all. In 28 pages of proposal, what you got in the way of externally observable behavior was a single (non-normative) note saying:

[ Note: For garbage collected programs, a high quality hosted implementation should attempt to maximize the amount of unreachable memory it reclaims. --end note ]

At least for me, this raises a serious question about return on investment. We're going to break existing code (nobody's sure exactly how much, but definitely quite a bit), place new requirements on implementations and new restrictions on code, and what we get in return is quite possibly nothing at all?

Even at best, what we get are programs that, based on testing with Java, will probably require around six times as much memory to run at the same speed they do now. Worse, garbage collection was part of Java from the beginning -- C++ places enough more restrictions on the garbage collector that it will almost certainly have an even worse cost/benefit ratio (even if we go beyond what the proposal guaranteed and assume there would be some benefit).

I'd summarize the situation mathematically: this is a complex situation. As any mathematician knows, a complex number has two parts: real and imaginary. It appears to me that what we have here are costs that are real, but benefits that are (at least mostly) imaginary.
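
The fragmentation argument can be made concrete with a toy model. The sketch below is an assumption-laden simplification, not a real allocator: the heap is a row of equal-sized cells, objects never move, and an allocation needs a contiguous run of free cells. All names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Toy non-moving heap: one element per fixed-size cell, true = in use.
using Heap = std::vector<bool>;

// Total number of free cells, regardless of where they are.
std::size_t free_cells(const Heap& heap)
{
    std::size_t n = 0;
    for (bool used : heap) {
        if (!used) ++n;
    }
    return n;
}

// Length of the longest run of adjacent free cells.
std::size_t largest_free_run(const Heap& heap)
{
    std::size_t best = 0;
    std::size_t run = 0;
    for (bool used : heap) {
        run = used ? 0 : run + 1;
        if (run > best) best = run;
    }
    return best;
}

// Without moving objects, a request succeeds only if a long enough run exists.
bool can_allocate(const Heap& heap, std::size_t cells)
{
    return largest_free_run(heap) >= cells;
}

// 16 cells, all allocated, then every other cell freed again.
Heap make_fragmented_heap()
{
    Heap heap(16, true);
    for (std::size_t i = 0; i < heap.size(); i += 2) {
        heap[i] = false;
    }
    return heap;
}
```

In the fragmented heap, half of the cells are free, yet a request for just two adjacent cells cannot be satisfied. A moving (compacting) collector could slide the live cells together; a collector that must leave objects where they are cannot.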

User · Answer

When we compare C++ with Java, we see that C++ was not designed with implicit garbage collection in mind, while Java was.

Having things like arbitrary C-style pointers is not only bad for GC implementations; restricting them to make GC work would also destroy backward compatibility for a large amount of legacy C++ code.

In addition to that, C++ is a language that is intended to run as a standalone executable instead of having a complex run-time environment.

All in all: yes, it might be possible to add garbage collection to C++, but for the sake of continuity it is better not to do so.

User · Answer

tl;dr: Because modern C++ doesn't need garbage collection.

Bjarne Stroustrup's FAQ answer on this matter says:

"I don't like garbage. I don't like littering. My ideal is to eliminate the need for a garbage collector by not producing any garbage. That is now possible."

The situation, for code written these days (C++17 and following the official Core Guidelines), is as follows:

- Most memory ownership-related code is in libraries (especially those providing containers).
- Most use of code involving memory ownership follows the RAII pattern, so allocation is made on construction and deallocation on destruction, which happens when exiting the scope in which something was allocated.
- You do not explicitly allocate or deallocate memory directly.
- Raw pointers do not own memory (if you've followed the guidelines), so you can't leak by passing them around.
- If you're wondering how you're going to pass the starting addresses of sequences of values in memory - you'll be doing that with a span; no raw pointer needed.
- If you really need an owning "pointer", you use C++'s standard-library smart pointers - they can't leak (barring shared_ptr reference cycles), and are decently efficient (although the ABI can get in the way of that). Alternatively, you can pass ownership across scope boundaries with "owner pointers". These are uncommon and must be used explicitly; but when adopted - they allow for nice static checking against leaks.

"Oh yeah? But what about...

... if I just write code the way we used to write C++ in the old days?"

Indeed, you could just disregard all of the guidelines and write leaky application code - and it will compile and run (and leak), same as always.

But it's not a "just don't do that" situation, where the developer is expected to be virtuous and exercise a lot of self control; it's just not simpler to write non-conforming code, nor is it faster to write, nor is it better-performing. Gradually it will also become more difficult to write, as you would face an increasing "impedance mismatch" with what conforming code provides and expects.

"... if I reinterpret_cast? Or do complex pointer arithmetic? Or other such hacks?"

Indeed, if you put your mind to it, you can write code that messes things up despite playing nice with the guidelines. But:

- You would do this rarely (in terms of places in the code, not necessarily in terms of fraction of execution time).
- You would only do this intentionally, not accidentally.
- Doing so will stand out in a codebase conforming to the guidelines.
- It's the kind of code in which you would bypass the GC in another language anyway.

"... library development?"

If you're a C++ library developer then you do write unsafe code involving raw pointers, and you are required to code carefully and responsibly - but these are self-contained pieces of code written by experts (and more importantly, reviewed by experts).

So, it's just like Bjarne said: there's really no motivation to collect garbage generally, as you all but make sure not to produce garbage. GC is becoming a non-problem with C++.

That is not to say GC isn't an interesting problem for certain specific applications, when you want to employ custom allocation and de-allocation strategies. For those you would want custom allocation and de-allocation, not a language-level GC.
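
A minimal sketch of the ownership conventions listed above, assuming C++14 or later for std::make_unique. Widget, read_value, store and demo_ownership are hypothetical names; the point is only that ownership moves explicitly through unique_ptr while the raw pointer merely observes.

```cpp
#include <memory>
#include <utility>
#include <vector>

// Hypothetical resource type that counts live instances.
struct Widget {
    static int live;
    int value;
    explicit Widget(int v) : value(v) { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Non-owning access: the raw pointer only observes and never deletes.
int read_value(const Widget* w)
{
    return w->value;
}

// Ownership is handed over explicitly; the caller must std::move into this.
void store(std::vector<std::unique_ptr<Widget>>& owner, std::unique_ptr<Widget> w)
{
    owner.push_back(std::move(w));
}

// Creates a Widget, transfers it into a container, reads it through a
// non-owning pointer, and lets scope exit release everything.
int demo_ownership()
{
    int seen = 0;
    {
        std::vector<std::unique_ptr<Widget>> owner;
        auto w = std::make_unique<Widget>(42);
        const Widget* observer = w.get();   // observes, does not own
        store(owner, std::move(w));         // w is empty from here on
        seen = read_value(observer);        // valid: owner keeps the Widget alive
    }                                       // vector destroyed, Widget freed
    return seen;
}
```

There is no new or delete in the calling code, and the only way to transfer the Widget is the visible std::move, which is what makes ownership mistakes stand out in review.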

User · Answer

One of the fundamental principles behind the original C language is that memory is composed of a sequence of bytes, and code need only care about what those bytes mean at the exact moment that they are being used. Modern C allows compilers to impose additional restrictions, but C includes--and C++ retains--the ability to decompose a pointer into a sequence of bytes, assemble any sequence of bytes containing the same values into a pointer, and then use that pointer to access the earlier object.

While that ability can be useful--or even indispensable--in some kinds of applications, a language that includes that ability will be very limited in its ability to support any kind of useful and reliable garbage collection. If a compiler doesn't know everything that has been done with the bits that made up a pointer, it will have no way of knowing whether information sufficient to reconstruct the pointer might exist somewhere in the universe. Since it would be possible for that information to be stored in ways that the computer wouldn't be able to access even if it knew about them (e.g. the bytes making up the pointer might have been shown on the screen long enough for someone to write them down on a piece of paper), it may be literally impossible for a computer to know whether a pointer could possibly be used in the future.

An interesting quirk of many garbage-collected frameworks is that an object reference is not defined by the bit patterns contained therein, but by the relationship between the bits held in the object reference and other information held elsewhere. In C and C++, if the bit pattern stored in a pointer identifies an object, that bit pattern will identify that object until the object is explicitly destroyed. In a typical GC system, an object may be represented by a bit pattern 0x1234ABCD at one moment in time, but the next GC cycle might replace all references to 0x1234ABCD with references to 0x4321BABE, whereupon the object would be represented by the latter pattern. Even if one were to display the bit pattern associated with an object reference and then later read it back from the keyboard, there would be no expectation that the same bit pattern would be usable to identify the same object (or any object).
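
The pointer round trip described above can be sketched as follows. The hide and reveal helpers are hypothetical, and the XOR mask is an extra step assumed purely for illustration, so that the stored bytes no longer look like any valid address to a collector scanning memory.

```cpp
#include <array>
#include <cstddef>
#include <cstring>

// Raw storage large enough for the bytes of one int pointer.
using PointerBytes = std::array<unsigned char, sizeof(int*)>;

// Decomposes a pointer into bytes and scrambles them with a fixed mask.
PointerBytes hide(int* p)
{
    PointerBytes bytes{};
    std::memcpy(bytes.data(), &p, sizeof p);
    for (unsigned char& b : bytes) {
        b = static_cast<unsigned char>(b ^ 0x5Au);
    }
    return bytes;
}

// Undoes the scrambling and reassembles the original pointer value.
int* reveal(PointerBytes bytes)
{
    for (unsigned char& b : bytes) {
        b = static_cast<unsigned char>(b ^ 0x5Au);
    }
    int* p = nullptr;
    std::memcpy(&p, bytes.data(), sizeof p);
    return p;
}
```

A caller can allocate an int, pass it through hide, null out the original pointer, and later get a usable pointer back from reveal. Between those two calls, no variable in the program holds the object's address in recognisable form, so a collector that reclaimed the "unreachable" object would break code the language currently permits.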
