A binary tree is a tree data structure in which each node has at most two child nodes, usually distinguished as "left" and "right". Nodes with children are parent nodes, and child nodes may contain references to their parents. Outside the tree, there is often a reference to the "root" node (the ancestor of all nodes), if it exists. Any node in the data structure can be reached by starting at root node and repeatedly following references to either the left or right child. In a binary tree a degree of every node is maximum two.
Binary trees are useful, because as you can see in the picture, if you want to find any node in the tree, you only have to look a maximum of 6 times. If you wanted to search for node 24, for example, you would start at the root.
This search is illustrated below:
You can see that you can exclude half of the nodes of the entire tree on the first pass. and half of the left subtree on the second. This makes for very effective searches. If this was done on 4 billion elements, you would only have to search a maximum of 32 times. Therefore, the more elements contained in the tree, the more efficient your search can be.
Deletions can become complex. If the node has 0 or 1 child, then it's simply a matter of moving some pointers to exclude the one to be deleted. However, you can not easily delete a node with 2 children. So we take a short cut. Let's say we wanted to delete node 19.
Since trying to determine where to move the left and right pointers to is not easy, we find one to substitute it with. We go to the left sub-tree, and go as far right as we can go. This gives us the next greatest value of the node we want to delete.
Now we copy all of 18's contents, except for the left and right pointers, and delete the original 18 node.
To create these images, I implemented an AVL tree, a self balancing tree, so that at any point in time, the tree has at most one level of difference between the leaf nodes (nodes with no children). This keeps the tree from becoming skewed and maintains the maximum O(log n)
search time, with the cost of a little more time required for insertions and deletions.
Here is a sample showing how my AVL tree has kept itself as compact and balanced as possible.
In a sorted array, lookups would still take O(log(n))
, just like a tree, but random insertion and removal would take O(n) instead of the tree's O(log(n))
. Some STL containers use these performance characteristics to their advantage so insertion and removal times take a maximum of O(log n)
, which is very fast. Some of these containers are map
, multimap
, set
, and multiset
.
Example code for an AVL tree can be found at http://ideone.com/MheW8