I stumbled upon Stack Overflow question Memory leak with std::string when using std::list<std::string>, and one of the comments says this:
Stop using
new
so much. I can't see any reason you used new anywhere you did. You can create objects by value in C++ and it's one of the huge advantages to using the language.
You do not have to allocate everything on the heap.
Stop thinking like a Java programmer.
I'm not really sure what he means by that.
Why should objects be created by value in C++ as often as possible, and what difference does it make internally?
Did I misinterpret the answer?
This question is related to
c++
memory-management
heap
new-operator
c++-faq
new
is the new goto
.
Recall why goto
is so reviled: while it is a powerful, low-level tool for flow control, people often used it in unnecessarily complicated ways that made code difficult to follow. Furthermore, the most useful and easiest to read patterns were encoded in structured programming statements (e.g. for
or while
); the ultimate effect is that the code where goto
is the appropriate way to is rather rare, if you are tempted to write goto
, you're probably doing things badly (unless you really know what you're doing).
new
is similar — it is often used to make things unnecessarily complicated and harder to read, and the most useful usage patterns can be encoded have been encoded into various classes. Furthermore, if you need to use any new usage patterns for which there aren't already standard classes, you can write your own classes that encode them!
I would even argue that new
is worse than goto
, due to the need to pair new
and delete
statements.
Like goto
, if you ever think you need to use new
, you are probably doing things badly — especially if you are doing so outside of the implementation of a class whose purpose in life is to encapsulate whatever dynamic allocations you need to do.
Consider a "careful" user who remembers to wrap objects in smart pointers:
foo(shared_ptr<T1>(new T1()), shared_ptr<T2>(new T2()));
This code is dangerous because there is no guarantee that either shared_ptr
is constructed before either T1
or T2
. Hence, if one of new T1()
or new T2()
fails after the other succeeds, then the first object will be leaked because no shared_ptr
exists to destroy and deallocate it.
Solution: use make_shared
.
This is no longer a problem: C++17 imposes a constraint on the order of these operations, in this case ensuring that each call to new()
must be immediately followed by the construction of the corresponding smart pointer, with no other operation in between. This implies that, by the time the second new()
is called, it is guaranteed that the first object has already been wrapped in its smart pointer, thus preventing any leaks in case an exception is thrown.
A more detailed explanation of the new evaluation order introduced by C++17 was provided by Barry in another answer.
Thanks to @Remy Lebeau for pointing out that this is still a problem under C++17 (although less so): the shared_ptr
constructor can fail to allocate its control block and throw, in which case the pointer passed to it is not deleted.
Solution: use make_shared
.
new
allocates objects on the heap. Otherwise, objects are allocated on the stack. Look up the difference between the two.
I see that a few important reasons for doing as few new's as possible are missed:
new
has a non-deterministic execution timeCalling new
may or may not cause the OS to allocate a new physical page to your process this can be quite slow if you do it often. Or it may already have a suitable memory location ready, we don't know. If your program needs to have consistent and predictable execution time (like in a real-time system or game/physics simulation) you need to avoid new
in your time critical loops.
new
is an implicit thread synchronizationYes you heard me, your OS needs to make sure your page tables are consistent and as such calling new
will cause your thread to acquire an implicit mutex lock. If you are consistently calling new
from many threads you are actually serialising your threads (I've done this with 32 CPUs, each hitting on new
to get a few hundred bytes each, ouch! that was a royal p.i.t.a. to debug)
The rest such as slow, fragmentation, error prone, etc have already been mentioned by other answers.
Objects created by new
must be eventually delete
d lest they leak. The destructor won't be called, memory won't be freed, the whole bit. Since C++ has no garbage collection, it's a problem.
Objects created by value (i. e. on stack) automatically die when they go out of scope. The destructor call is inserted by the compiler, and the memory is auto-freed upon function return.
Smart pointers like unique_ptr
, shared_ptr
solve the dangling reference problem, but they require coding discipline and have other potential issues (copyability, reference loops, etc.).
Also, in heavily multithreaded scenarios, new
is a point of contention between threads; there can be a performance impact for overusing new
. Stack object creation is by definition thread-local, since each thread has its own stack.
The downside of value objects is that they die once the host function returns - you cannot pass a reference to those back to the caller, only by copying, returning or moving by value.
In C++, it takes but a single instruction to allocate space -- on the stack -- for every local scope object in a given function, and it's impossible to leak any of that memory. That comment intended (or should have intended) to say something like "use the stack and not the heap".
The reason why is complicated.
First, C++ is not garbage collected. Therefore, for every new, there must be a corresponding delete. If you fail to put this delete in, then you have a memory leak. Now, for a simple case like this:
std::string *someString = new std::string(...);
//Do stuff
delete someString;
This is simple. But what happens if "Do stuff" throws an exception? Oops: memory leak. What happens if "Do stuff" issues return
early? Oops: memory leak.
And this is for the simplest case. If you happen to return that string to someone, now they have to delete it. And if they pass it as an argument, does the person receiving it need to delete it? When should they delete it?
Or, you can just do this:
std::string someString(...);
//Do stuff
No delete
. The object was created on the "stack", and it will be destroyed once it goes out of scope. You can even return the object, thus transfering its contents to the calling function. You can pass the object to functions (typically as a reference or const-reference: void SomeFunc(std::string &iCanModifyThis, const std::string &iCantModifyThis)
. And so forth.
All without new
and delete
. There's no question of who owns the memory or who's responsible for deleting it. If you do:
std::string someString(...);
std::string otherString;
otherString = someString;
It is understood that otherString
has a copy of the data of someString
. It isn't a pointer; it is a separate object. They may happen to have the same contents, but you can change one without affecting the other:
someString += "More text.";
if(otherString == someString) { /*Will never get here */ }
See the idea?
Two reasons:
delete
it later, or it will cause a memory leak.I tend to disagree with the idea of using new "too much". Though the original poster's use of new with system classes is a bit ridiculous. (int *i; i = new int[9999];
? really? int i[9999];
is much clearer.) I think that is what was getting the commenter's goat.
When you're working with system objects, it's very rare that you'd need more than one reference to the exact same object. As long as the value is the same, that's all that matters. And system objects don't typically take up much space in memory. (one byte per character, in a string). And if they do, the libraries should be designed to take that memory management into account (if they're written well). In these cases, (all but one or two of the news in his code), new is practically pointless and only serves to introduce confusions and potential for bugs.
When you're working with your own classes/objects, however (e.g. the original poster's Line class), then you have to begin thinking about the issues like memory footprint, persistence of data, etc. yourself. At this point, allowing multiple references to the same value is invaluable - it allows for constructs like linked lists, dictionaries, and graphs, where multiple variables need to not only have the same value, but reference the exact same object in memory. However, the Line class doesn't have any of those requirements. So the original poster's code actually has absolutely no needs for new
.
When you use new, objects are allocated to the heap. It is generally used when you anticipate expansion. When you declare an object such as,
Class var;
it is placed on the stack.
You will always have to call destroy on the object that you placed on the heap with new. This opens the potential for memory leaks. Objects placed on the stack are not prone to memory leaking!
I think the poster meant to say You do not have to allocate everything on the
heap
rather than the the stack
.
Basically objects are allocated on the stack (if the object size allows, of course) because of the cheap cost of stack-allocation, rather than heap-based allocation which involves quite some work by the allocator, and adds verbosity because then you have to manage data allocated on the heap.
To a great extent, that's someone elevating their own weaknesses to a general rule. There's nothing wrong per se with creating objects using the new
operator. What there is some argument for is that you have to do so with some discipline: if you create an object you need to make sure it's going to be destroyed.
The easiest way of doing that is to create the object in automatic storage, so C++ knows to destroy it when it goes out of scope:
{
File foo = File("foo.dat");
// do things
}
Now, observe that when you fall off that block after the end-brace, foo
is out of scope. C++ will call its dtor automatically for you. Unlike Java, you don't need to wait for the GC to find it.
Had you written
{
File * foo = new File("foo.dat");
you would want to match it explicitly with
delete foo;
}
or even better, allocate your File *
as a "smart pointer". If you aren't careful about that it can lead to leaks.
The answer itself makes the mistaken assumption that if you don't use new
you don't allocate on the heap; in fact, in C++ you don't know that. At most, you know that a small amout of memory, say one pointer, is certainly allocated on the stack. However, consider if the implementation of File is something like
class File {
private:
FileImpl * fd;
public:
File(String fn){ fd = new FileImpl(fn);}
then FileImpl
will still be allocated on the stack.
And yes, you'd better be sure to have
~File(){ delete fd ; }
in the class as well; without it, you'll leak memory from the heap even if you didn't apparently allocate on the heap at all.
One notable reason to avoid overusing the heap is for performance -- specifically involving the performance of the default memory management mechanism used by C++. While allocation can be quite quick in the trivial case, doing a lot of new
and delete
on objects of non-uniform size without strict order leads not only to memory fragmentation, but it also complicates the allocation algorithm and can absolutely destroy performance in certain cases.
That's the problem that memory pools where created to solve, allowing to to mitigate the inherent disadvantages of traditional heap implementations, while still allowing you to use the heap as necessary.
Better still, though, to avoid the problem altogether. If you can put it on the stack, then do so.
The core reason is that objects on heap are always difficult to use and manage than simple values. Writing code that are easy to read and maintain is always the first priority of any serious programmer.
Another scenario is the library we are using provides value semantics and make dynamic allocation unnecessary. Std::string
is a good example.
For object oriented code however, using a pointer - which means use new
to create it beforehand - is a must. In order to simplify the complexity of resource management, we have dozens of tools to make it as simple as possible, such as smart pointers. The object based paradigm or generic paradigm assumes value semantics and requires less or no new
, just as the posters elsewhere stated.
Traditional design patterns, especially those mentioned in GoF book, use new
a lot, as they are typical OO code.
new()
shouldn't be used as little as possible. It should be used as carefully as possible. And it should be used as often as necessary as dictated by pragmatism.
Allocation of objects on the stack, relying on their implicit destruction, is a simple model. If the required scope of an object fits that model then there's no need to use new()
, with the associated delete()
and checking of NULL pointers.
In the case where you have lots of short-lived objects allocation on the stack should reduce the problems of heap fragmentation.
However, if the lifetime of your object needs to extend beyond the current scope then new()
is the right answer. Just make sure that you pay attention to when and how you call delete()
and the possibilities of NULL pointers, using deleted objects and all of the other gotchas that come with the use of pointers.
One more point to all the above correct answers, it depends on what sort of programming you are doing. Kernel developing in Windows for example -> The stack is severely limited and you might not be able to take page faults like in user mode.
In such environments, new, or C-like API calls are prefered and even required.
Of course, this is merely an exception to the rule.
Source: Stackoverflow.com