[algorithm] How to efficiently build a tree from a flat structure?

I have a bunch of objects in a flat structure. These objects have an ID and a ParentID property so they can be arranged in trees. They are in no particular order. Each ParentID property does not necessarily matches with an ID in the structure. Therefore their could be several trees emerging from these objects.

How would you process these objects to create the resulting trees ?

I'm not so far from a solution but I'm sure it is far from optimal...

I need to create these trees to then insert Data into a database, in proper order.

There are no circular references. A Node is a RootNode when ParentID == null or when ParentID can't be found in the other objects

This question is related to algorithm tree language-agnostic

The answer is


Here is a simple JavaScript algorithm to parse a flat table into a parent/child tree structure that runs in N time:

var table = [
    {parent_id: 0, id: 1, children: []},
    {parent_id: 0, id: 2, children: []},
    {parent_id: 0, id: 3, children: []},
    {parent_id: 1, id: 4, children: []},
    {parent_id: 1, id: 5, children: []},
    {parent_id: 1, id: 6, children: []},
    {parent_id: 2, id: 7, children: []},
    {parent_id: 7, id: 8, children: []},
    {parent_id: 8, id: 9, children: []},
    {parent_id: 3, id: 10, children: []}
];

var root = {id:0, parent_id: null, children: []};
var node_list = { 0 : root};

for (var i = 0; i < table.length; i++) {
    node_list[table[i].id] = table[i];
    node_list[table[i].parent_id].children.push(node_list[table[i].id]);
}

console.log(root);

I wrote a generic solution in C# loosely based on @Mehrdad Afshari answer:

void Example(List<MyObject> actualObjects)
{
  List<TreeNode<MyObject>> treeRoots = actualObjects.BuildTree(obj => obj.ID, obj => obj.ParentID, -1);
}

public class TreeNode<T>
{
  public TreeNode(T value)
  {
    Value = value;
    Children = new List<TreeNode<T>>();
  }

  public T Value { get; private set; }
  public List<TreeNode<T>> Children { get; private set; }
}

public static class TreeExtensions
{
  public static List<TreeNode<TValue>> BuildTree<TKey, TValue>(this IEnumerable<TValue> objects, Func<TValue, TKey> keySelector, Func<TValue, TKey> parentKeySelector, TKey defaultKey = default(TKey))
  {
    var roots = new List<TreeNode<TValue>>();
    var allNodes = objects.Select(overrideValue => new TreeNode<TValue>(overrideValue)).ToArray();
    var nodesByRowId = allNodes.ToDictionary(node => keySelector(node.Value));

    foreach (var currentNode in allNodes)
    {
      TKey parentKey = parentKeySelector(currentNode.Value);
      if (Equals(parentKey, defaultKey))
      {
        roots.Add(currentNode);
      }
      else
      {
        nodesByRowId[parentKey].Children.Add(currentNode);
      }
    }

    return roots;
  }
}

For anyone interested in a C# version of Eugene's solution, note that node_list is accessed as a map, so use a Dictionary instead.

Keep in mind that this solution only works if table is sorted by parent_id.

var table = new[]
{
    new Node { parent_id = 0, id = 1 },
    new Node { parent_id = 0, id = 2 },
    new Node { parent_id = 0, id = 3 },
    new Node { parent_id = 1, id = 4 },
    new Node { parent_id = 1, id = 5 },
    new Node { parent_id = 1, id = 6 },
    new Node { parent_id = 2, id = 7 },
    new Node { parent_id = 7, id = 8 },
    new Node { parent_id = 8, id = 9 },
    new Node { parent_id = 3, id = 10 },
};

var root = new Node { id = 0 };
var node_list = new Dictionary<int, Node>{
    { 0, root }
};

foreach (var item in table)
{
    node_list.Add(item.id, item);
    node_list[item.parent_id].children.Add(node_list[item.id]);
}

Node is defined as follows.

class Node
{
    public int id { get; set; }
    public int parent_id { get; set; }
    public List<Node> children = new List<Node>();
}

Are you stuck using only those attributes? If not, it might be nice to create an array of child nodes, where you can cycle through all these objects once to build such attributes. From there, select the node with children but no parents and iteratively build your tree from the top down.


Vague as the question seems to me, I would probably create a map from the ID to the actual object. In pseudo-java (I didn't check whether it works/compiles), it might be something like:

Map<ID, FlatObject> flatObjectMap = new HashMap<ID, FlatObject>();

for (FlatObject object: flatStructure) {
    flatObjectMap.put(object.ID, object);
}

And to look up each parent:

private FlatObject getParent(FlatObject object) {
    getRealObject(object.ParentID);
}

private FlatObject getRealObject(ID objectID) {
    flatObjectMap.get(objectID);
}

By reusing getRealObject(ID) and doing a map from object to a collection of objects (or their IDs), you get a parent->children map too.


here is a ruby implementation:

It will catalog by attribute name or the result of a method call.

CatalogGenerator = ->(depth) do
  if depth != 0
    ->(hash, key) do
      hash[key] = Hash.new(&CatalogGenerator[depth - 1])
    end
  else
    ->(hash, key) do
      hash[key] = []
    end
  end
end

def catalog(collection, root_name: :root, by:)
  method_names = [*by]
  log = Hash.new(&CatalogGenerator[method_names.length])
  tree = collection.each_with_object(log) do |item, catalog|
    path = method_names.map { |method_name| item.public_send(method_name)}.unshift(root_name.to_sym)
  catalog.dig(*path) << item
  end
  tree.with_indifferent_access
end

 students = [#<Student:0x007f891d0b4818 id: 33999, status: "on_hold", tenant_id: 95>,
 #<Student:0x007f891d0b4570 id: 7635, status: "on_hold", tenant_id: 6>,
 #<Student:0x007f891d0b42c8 id: 37220, status: "on_hold", tenant_id: 6>,
 #<Student:0x007f891d0b4020 id: 3444, status: "ready_for_match", tenant_id: 15>,
 #<Student:0x007f8931d5ab58 id: 25166, status: "in_partnership", tenant_id: 10>]

catalog students, by: [:tenant_id, :status]

# this would out put the following
{"root"=>
  {95=>
    {"on_hold"=>
      [#<Student:0x007f891d0b4818
        id: 33999,
        status: "on_hold",
        tenant_id: 95>]},
   6=>
    {"on_hold"=>
      [#<Student:0x007f891d0b4570 id: 7635, status: "on_hold", tenant_id: 6>,
       #<Student:0x007f891d0b42c8
        id: 37220,
        status: "on_hold",
        tenant_id: 6>]},
   15=>
    {"ready_for_match"=>
      [#<Student:0x007f891d0b4020
        id: 3444,
        status: "ready_for_match",
        tenant_id: 15>]},
   10=>
    {"in_partnership"=>
      [#<Student:0x007f8931d5ab58
        id: 25166,
        status: "in_partnership",
        tenant_id: 10>]}}}

Based on the answer of Mehrdad Afshari and the comment of Andrew Hanlon for a speedup, here is my take.

Important difference to the original task: A root node has ID==parentID.

class MyObject
{   // The actual object
    public int ParentID { get; set; }
    public int ID { get; set; }
}

class Node
{
    public List<Node> Children = new List<Node>();
    public Node Parent { get; set; }
    public MyObject Source { get; set; }
}

List<Node> BuildTreeAndGetRoots(List<MyObject> actualObjects)
{
    var lookup = new Dictionary<int, Node>();
    var rootNodes = new List<Node>();

    foreach (var item in actualObjects)
    {
        // add us to lookup
        Node ourNode;
        if (lookup.TryGetValue(item.ID, out ourNode))
        {   // was already found as a parent - register the actual object
            ourNode.Source = item;
        }
        else
        {
            ourNode = new Node() { Source = item };
            lookup.Add(item.ID, ourNode);
        }

        // hook into parent
        if (item.ParentID == item.ID)
        {   // is a root node
            rootNodes.Add(ourNode);
        }
        else
        {   // is a child row - so we have a parent
            Node parentNode;
            if (!lookup.TryGetValue(item.ParentID, out parentNode))
            {   // unknown parent, construct preliminary parent
                parentNode = new Node();
                lookup.Add(item.ParentID, parentNode);
            }
            parentNode.Children.Add(ourNode);
            ourNode.Parent = parentNode;
        }
    }

    return rootNodes;
}

JS version that returns one root or an array of roots each of which will have a Children array property containing the related children. Does not depend on ordered input, decently compact, and does not use recursion. Enjoy!

// creates a tree from a flat set of hierarchically related data
var MiracleGrow = function(treeData, key, parentKey)
{
    var keys = [];
    treeData.map(function(x){
        x.Children = [];
        keys.push(x[key]);
    });
    var roots = treeData.filter(function(x){return keys.indexOf(x[parentKey])==-1});
    var nodes = [];
    roots.map(function(x){nodes.push(x)});
    while(nodes.length > 0)
    {

        var node = nodes.pop();
        var children =  treeData.filter(function(x){return x[parentKey] == node[key]});
        children.map(function(x){
            node.Children.push(x);
            nodes.push(x)
        });
    }
    if (roots.length==1) return roots[0];
    return roots;
}


// demo/test data
var treeData = [

    {id:9, name:'Led Zep', parent:null},
    {id:10, name:'Jimmy', parent:9},
    {id:11, name:'Robert', parent:9},
    {id:12, name:'John', parent:9},

    {id:8, name:'Elec Gtr Strings', parent:5},
    {id:1, name:'Rush', parent:null},
    {id:2, name:'Alex', parent:1},
    {id:3, name:'Geddy', parent:1},
    {id:4, name:'Neil', parent:1},
    {id:5, name:'Gibson Les Paul', parent:2},
    {id:6, name:'Pearl Kit', parent:4},
    {id:7, name:'Rickenbacker', parent:3},

    {id:100, name:'Santa', parent:99},
    {id:101, name:'Elf', parent:100},

];
var root = MiracleGrow(treeData, "id", "parent")
console.log(root)

Here is java solution of the answer by Mehrdad Afshari.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class Tree {

    Iterator<Node> buildTreeAndGetRoots(List<MyObject> actualObjects) {
        Map<Integer, Node> lookup = new HashMap<>();
        actualObjects.forEach(x -> lookup.put(x.id, new Node(x)));
        //foreach (var item in lookup.Values)
        lookup.values().forEach(item ->
                {
                    Node proposedParent;
                    if (lookup.containsKey(item.associatedObject.parentId)) {
                        proposedParent = lookup.get(item.associatedObject.parentId);
                        item.parent = proposedParent;
                        proposedParent.children.add(item);
                    }
                }
        );
        //return lookup.values.Where(x =>x.Parent ==null);
        return lookup.values().stream().filter(x -> x.parent == null).iterator();
    }

}

class MyObject { // The actual object
    public int parentId;
    public int id;
}

class Node {
    public List<Node> children = new ArrayList<Node>();
    public Node parent;
    public MyObject associatedObject;

    public Node(MyObject associatedObject) {
        this.associatedObject = associatedObject;
    }
}

Vague as the question seems to me, I would probably create a map from the ID to the actual object. In pseudo-java (I didn't check whether it works/compiles), it might be something like:

Map<ID, FlatObject> flatObjectMap = new HashMap<ID, FlatObject>();

for (FlatObject object: flatStructure) {
    flatObjectMap.put(object.ID, object);
}

And to look up each parent:

private FlatObject getParent(FlatObject object) {
    getRealObject(object.ParentID);
}

private FlatObject getRealObject(ID objectID) {
    flatObjectMap.get(objectID);
}

By reusing getRealObject(ID) and doing a map from object to a collection of objects (or their IDs), you get a parent->children map too.


one elegant way to do this is to represent items in the list as string holding a dot separated list of parents, and finally a value:

server.port=90
server.hostname=localhost
client.serverport=90
client.database.port=1234
client.database.host=localhost

When assembling a tree, you would end up with something like:

server:
  port: 90
  hostname: localhost
client:
  serverport=1234
  database:
    port: 1234
    host: localhost

I have a configuration library that implements this override configuration (tree) from command line arguments (list). The algorithm to add a single item to the list to a tree is here.


Vague as the question seems to me, I would probably create a map from the ID to the actual object. In pseudo-java (I didn't check whether it works/compiles), it might be something like:

Map<ID, FlatObject> flatObjectMap = new HashMap<ID, FlatObject>();

for (FlatObject object: flatStructure) {
    flatObjectMap.put(object.ID, object);
}

And to look up each parent:

private FlatObject getParent(FlatObject object) {
    getRealObject(object.ParentID);
}

private FlatObject getRealObject(ID objectID) {
    flatObjectMap.get(objectID);
}

By reusing getRealObject(ID) and doing a map from object to a collection of objects (or their IDs), you get a parent->children map too.


It's not exactly the same as what the asker looked for, but I had difficulty wrapping my head around the ambiguously phrased answers provided here, and I still think this answer fits under the title.

My answer is for mapping a flat structure to a directly-on-object tree where all you have is a ParentID on each object. ParentID is null or 0 if it is a root. Opposite of the asker, I assume all valid ParentID's point to something else in the list:

var rootNodes = new List<DTIntranetMenuItem>();
var dictIntranetMenuItems = new Dictionary<long, DTIntranetMenuItem>();

//Convert the flat database items to the DTO's,
//that has a list of children instead of a ParentID.
foreach (var efIntranetMenuItem in flatIntranetMenuItems) //List<tblIntranetMenuItem>
{
    //Automapper (nuget)
    DTIntranetMenuItem intranetMenuItem =
                                   Mapper.Map<DTIntranetMenuItem>(efIntranetMenuItem);
    intranetMenuItem.Children = new List<DTIntranetMenuItem>();
    dictIntranetMenuItems.Add(efIntranetMenuItem.ID, intranetMenuItem);
}

foreach (var efIntranetMenuItem in flatIntranetMenuItems)
{
    //Getting the equivalent object of the converted ones
    DTIntranetMenuItem intranetMenuItem = dictIntranetMenuItems[efIntranetMenuItem.ID];

    if (efIntranetMenuItem.ParentID == null || efIntranetMenuItem.ParentID <= 0)
    {
        rootNodes.Add(intranetMenuItem);
    }
    else
    {
        var parent = dictIntranetMenuItems[efIntranetMenuItem.ParentID.Value];
        parent.Children.Add(intranetMenuItem);
        //intranetMenuItem.Parent = parent;
    }
}
return rootNodes;

java version

// node
@Data
public class Node {
    private Long id;
    private Long parentId;
    private String name;
    private List<Node> children = new ArrayList<>();
}

// flat list to tree
List<Node> nodes = new ArrayList();// load nodes from db or network
Map<Long, Node> nodeMap = new HashMap();
nodes.forEach(node -> {
  if (!nodeMap.containsKey(node.getId)) nodeMap.put(node.getId, node);
  if (nodeMap.containsKey(node.getParentId)) {
    Node parent = nodeMap.get(node.getParentId);
    node.setParentId(parent.getId());
    parent.getChildren().add(node);
  }
});

// tree node
List<Node> treeNode = nodeMap .values().stream().filter(n -> n.getParentId() == null).collect(Collectors.toList());

Found an awesome JavaScript version here: http://oskarhane.com/create-a-nested-array-recursively-in-javascript/

Let’s say you have an array like this:

const models = [
    {id: 1, title: 'hello', parent: 0},
    {id: 2, title: 'hello', parent: 0},
    {id: 3, title: 'hello', parent: 1},
    {id: 4, title: 'hello', parent: 3},
    {id: 5, title: 'hello', parent: 4},
    {id: 6, title: 'hello', parent: 4},
    {id: 7, title: 'hello', parent: 3},
    {id: 8, title: 'hello', parent: 2}
];

And you want to have the objects nested like this:

const nestedStructure = [
    {
        id: 1, title: 'hello', parent: 0, children: [
            {
                id: 3, title: 'hello', parent: 1, children: [
                    {
                        id: 4, title: 'hello', parent: 3, children: [
                            {id: 5, title: 'hello', parent: 4},
                            {id: 6, title: 'hello', parent: 4}
                        ]
                    },
                    {id: 7, title: 'hello', parent: 3}
                ]
            }
        ]
    },
    {
        id: 2, title: 'hello', parent: 0, children: [
            {id: 8, title: 'hello', parent: 2}
        ]
    }
];

Here’s a recursive function that makes it happen.

function getNestedChildren(models, parentId) {
    const nestedTreeStructure = [];
    const length = models.length;

    for (let i = 0; i < length; i++) { // for-loop for perf reasons, huge difference in ie11
        const model = models[i];

        if (model.parent == parentId) {
            const children = getNestedChildren(models, model.id);

            if (children.length > 0) {
                model.children = children;
            }

            nestedTreeStructure.push(model);
        }
    }

    return nestedTreeStructure;
}

Usuage:

const models = [
    {id: 1, title: 'hello', parent: 0},
    {id: 2, title: 'hello', parent: 0},
    {id: 3, title: 'hello', parent: 1},
    {id: 4, title: 'hello', parent: 3},
    {id: 5, title: 'hello', parent: 4},
    {id: 6, title: 'hello', parent: 4},
    {id: 7, title: 'hello', parent: 3},
    {id: 8, title: 'hello', parent: 2}
];
const nestedStructure = getNestedChildren(models, 0);

Here is java solution of the answer by Mehrdad Afshari.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

public class Tree {

    Iterator<Node> buildTreeAndGetRoots(List<MyObject> actualObjects) {
        Map<Integer, Node> lookup = new HashMap<>();
        actualObjects.forEach(x -> lookup.put(x.id, new Node(x)));
        //foreach (var item in lookup.Values)
        lookup.values().forEach(item ->
                {
                    Node proposedParent;
                    if (lookup.containsKey(item.associatedObject.parentId)) {
                        proposedParent = lookup.get(item.associatedObject.parentId);
                        item.parent = proposedParent;
                        proposedParent.children.add(item);
                    }
                }
        );
        //return lookup.values.Where(x =>x.Parent ==null);
        return lookup.values().stream().filter(x -> x.parent == null).iterator();
    }

}

class MyObject { // The actual object
    public int parentId;
    public int id;
}

class Node {
    public List<Node> children = new ArrayList<Node>();
    public Node parent;
    public MyObject associatedObject;

    public Node(MyObject associatedObject) {
        this.associatedObject = associatedObject;
    }
}

It's not exactly the same as what the asker looked for, but I had difficulty wrapping my head around the ambiguously phrased answers provided here, and I still think this answer fits under the title.

My answer is for mapping a flat structure to a directly-on-object tree where all you have is a ParentID on each object. ParentID is null or 0 if it is a root. Opposite of the asker, I assume all valid ParentID's point to something else in the list:

var rootNodes = new List<DTIntranetMenuItem>();
var dictIntranetMenuItems = new Dictionary<long, DTIntranetMenuItem>();

//Convert the flat database items to the DTO's,
//that has a list of children instead of a ParentID.
foreach (var efIntranetMenuItem in flatIntranetMenuItems) //List<tblIntranetMenuItem>
{
    //Automapper (nuget)
    DTIntranetMenuItem intranetMenuItem =
                                   Mapper.Map<DTIntranetMenuItem>(efIntranetMenuItem);
    intranetMenuItem.Children = new List<DTIntranetMenuItem>();
    dictIntranetMenuItems.Add(efIntranetMenuItem.ID, intranetMenuItem);
}

foreach (var efIntranetMenuItem in flatIntranetMenuItems)
{
    //Getting the equivalent object of the converted ones
    DTIntranetMenuItem intranetMenuItem = dictIntranetMenuItems[efIntranetMenuItem.ID];

    if (efIntranetMenuItem.ParentID == null || efIntranetMenuItem.ParentID <= 0)
    {
        rootNodes.Add(intranetMenuItem);
    }
    else
    {
        var parent = dictIntranetMenuItems[efIntranetMenuItem.ParentID.Value];
        parent.Children.Add(intranetMenuItem);
        //intranetMenuItem.Parent = parent;
    }
}
return rootNodes;

Most of the answers assume you are looking to do this outside of the database. If your trees are relatively static in nature and you just need to somehow map the trees into the database, you may want to consider using nested set representations on the database side. Check out books by Joe Celko (or here for an overview by Celko).

If tied to Oracle dbs anyway, check out their CONNECT BY for straight SQL approaches.

With either approach, you could completely skip mapping the trees before you load the data up in the database. Just thought I would offer this up as an alternative, it may be completely inappropriate for your specific needs. The whole "proper order" part of the original question somewhat implies you need the order to be "correct" in the db for some reason? This might push me towards handling the trees there as well.


Are you stuck using only those attributes? If not, it might be nice to create an array of child nodes, where you can cycle through all these objects once to build such attributes. From there, select the node with children but no parents and iteratively build your tree from the top down.


Most of the answers assume you are looking to do this outside of the database. If your trees are relatively static in nature and you just need to somehow map the trees into the database, you may want to consider using nested set representations on the database side. Check out books by Joe Celko (or here for an overview by Celko).

If tied to Oracle dbs anyway, check out their CONNECT BY for straight SQL approaches.

With either approach, you could completely skip mapping the trees before you load the data up in the database. Just thought I would offer this up as an alternative, it may be completely inappropriate for your specific needs. The whole "proper order" part of the original question somewhat implies you need the order to be "correct" in the db for some reason? This might push me towards handling the trees there as well.


Are you stuck using only those attributes? If not, it might be nice to create an array of child nodes, where you can cycle through all these objects once to build such attributes. From there, select the node with children but no parents and iteratively build your tree from the top down.


I can do this in 4 lines of code and O(n log n) time, assuming that Dictionary is something like a TreeMap.

dict := Dictionary new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | (dict at: each parent) addChild: each].
root := dict at: nil.

EDIT: Ok, and now I read that some parentIDs are fake, so forget the above, and do this:

dict := Dictionary new.
dict at: nil put: OrderedCollection new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | 
    (dict at: each parent ifAbsent: [dict at: nil]) 
          add: each].
roots := dict at: nil.

Most of the answers assume you are looking to do this outside of the database. If your trees are relatively static in nature and you just need to somehow map the trees into the database, you may want to consider using nested set representations on the database side. Check out books by Joe Celko (or here for an overview by Celko).

If tied to Oracle dbs anyway, check out their CONNECT BY for straight SQL approaches.

With either approach, you could completely skip mapping the trees before you load the data up in the database. Just thought I would offer this up as an alternative, it may be completely inappropriate for your specific needs. The whole "proper order" part of the original question somewhat implies you need the order to be "correct" in the db for some reason? This might push me towards handling the trees there as well.


Most of the answers assume you are looking to do this outside of the database. If your trees are relatively static in nature and you just need to somehow map the trees into the database, you may want to consider using nested set representations on the database side. Check out books by Joe Celko (or here for an overview by Celko).

If tied to Oracle dbs anyway, check out their CONNECT BY for straight SQL approaches.

With either approach, you could completely skip mapping the trees before you load the data up in the database. Just thought I would offer this up as an alternative, it may be completely inappropriate for your specific needs. The whole "proper order" part of the original question somewhat implies you need the order to be "correct" in the db for some reason? This might push me towards handling the trees there as well.


For anyone interested in a C# version of Eugene's solution, note that node_list is accessed as a map, so use a Dictionary instead.

Keep in mind that this solution only works if table is sorted by parent_id.

var table = new[]
{
    new Node { parent_id = 0, id = 1 },
    new Node { parent_id = 0, id = 2 },
    new Node { parent_id = 0, id = 3 },
    new Node { parent_id = 1, id = 4 },
    new Node { parent_id = 1, id = 5 },
    new Node { parent_id = 1, id = 6 },
    new Node { parent_id = 2, id = 7 },
    new Node { parent_id = 7, id = 8 },
    new Node { parent_id = 8, id = 9 },
    new Node { parent_id = 3, id = 10 },
};

var root = new Node { id = 0 };
var node_list = new Dictionary<int, Node>{
    { 0, root }
};

foreach (var item in table)
{
    node_list.Add(item.id, item);
    node_list[item.parent_id].children.Add(node_list[item.id]);
}

Node is defined as follows.

class Node
{
    public int id { get; set; }
    public int parent_id { get; set; }
    public List<Node> children = new List<Node>();
}

I wrote a generic solution in C# loosely based on @Mehrdad Afshari answer:

void Example(List<MyObject> actualObjects)
{
  List<TreeNode<MyObject>> treeRoots = actualObjects.BuildTree(obj => obj.ID, obj => obj.ParentID, -1);
}

public class TreeNode<T>
{
  public TreeNode(T value)
  {
    Value = value;
    Children = new List<TreeNode<T>>();
  }

  public T Value { get; private set; }
  public List<TreeNode<T>> Children { get; private set; }
}

public static class TreeExtensions
{
  public static List<TreeNode<TValue>> BuildTree<TKey, TValue>(this IEnumerable<TValue> objects, Func<TValue, TKey> keySelector, Func<TValue, TKey> parentKeySelector, TKey defaultKey = default(TKey))
  {
    var roots = new List<TreeNode<TValue>>();
    var allNodes = objects.Select(overrideValue => new TreeNode<TValue>(overrideValue)).ToArray();
    var nodesByRowId = allNodes.ToDictionary(node => keySelector(node.Value));

    foreach (var currentNode in allNodes)
    {
      TKey parentKey = parentKeySelector(currentNode.Value);
      if (Equals(parentKey, defaultKey))
      {
        roots.Add(currentNode);
      }
      else
      {
        nodesByRowId[parentKey].Children.Add(currentNode);
      }
    }

    return roots;
  }
}

Based on the answer of Mehrdad Afshari and the comment of Andrew Hanlon for a speedup, here is my take.

Important difference to the original task: A root node has ID==parentID.

class MyObject
{   // The actual object
    public int ParentID { get; set; }
    public int ID { get; set; }
}

class Node
{
    public List<Node> Children = new List<Node>();
    public Node Parent { get; set; }
    public MyObject Source { get; set; }
}

List<Node> BuildTreeAndGetRoots(List<MyObject> actualObjects)
{
    var lookup = new Dictionary<int, Node>();
    var rootNodes = new List<Node>();

    foreach (var item in actualObjects)
    {
        // add us to lookup
        Node ourNode;
        if (lookup.TryGetValue(item.ID, out ourNode))
        {   // was already found as a parent - register the actual object
            ourNode.Source = item;
        }
        else
        {
            ourNode = new Node() { Source = item };
            lookup.Add(item.ID, ourNode);
        }

        // hook into parent
        if (item.ParentID == item.ID)
        {   // is a root node
            rootNodes.Add(ourNode);
        }
        else
        {   // is a child row - so we have a parent
            Node parentNode;
            if (!lookup.TryGetValue(item.ParentID, out parentNode))
            {   // unknown parent, construct preliminary parent
                parentNode = new Node();
                lookup.Add(item.ParentID, parentNode);
            }
            parentNode.Children.Add(ourNode);
            ourNode.Parent = parentNode;
        }
    }

    return rootNodes;
}

java version

// node
@Data
public class Node {
    private Long id;
    private Long parentId;
    private String name;
    private List<Node> children = new ArrayList<>();
}

// flat list to tree
List<Node> nodes = new ArrayList();// load nodes from db or network
Map<Long, Node> nodeMap = new HashMap();
nodes.forEach(node -> {
  if (!nodeMap.containsKey(node.getId)) nodeMap.put(node.getId, node);
  if (nodeMap.containsKey(node.getParentId)) {
    Node parent = nodeMap.get(node.getParentId);
    node.setParentId(parent.getId());
    parent.getChildren().add(node);
  }
});

// tree node
List<Node> treeNode = nodeMap .values().stream().filter(n -> n.getParentId() == null).collect(Collectors.toList());

I can do this in 4 lines of code and O(n log n) time, assuming that Dictionary is something like a TreeMap.

dict := Dictionary new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | (dict at: each parent) addChild: each].
root := dict at: nil.

EDIT: Ok, and now I read that some parentIDs are fake, so forget the above, and do this:

dict := Dictionary new.
dict at: nil put: OrderedCollection new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | 
    (dict at: each parent ifAbsent: [dict at: nil]) 
          add: each].
roots := dict at: nil.

Are you stuck using only those attributes? If not, it might be nice to create an array of child nodes, where you can cycle through all these objects once to build such attributes. From there, select the node with children but no parents and iteratively build your tree from the top down.


Here is a simple JavaScript algorithm to parse a flat table into a parent/child tree structure that runs in N time:

var table = [
    {parent_id: 0, id: 1, children: []},
    {parent_id: 0, id: 2, children: []},
    {parent_id: 0, id: 3, children: []},
    {parent_id: 1, id: 4, children: []},
    {parent_id: 1, id: 5, children: []},
    {parent_id: 1, id: 6, children: []},
    {parent_id: 2, id: 7, children: []},
    {parent_id: 7, id: 8, children: []},
    {parent_id: 8, id: 9, children: []},
    {parent_id: 3, id: 10, children: []}
];

var root = {id:0, parent_id: null, children: []};
var node_list = { 0 : root};

for (var i = 0; i < table.length; i++) {
    node_list[table[i].id] = table[i];
    node_list[table[i].parent_id].children.push(node_list[table[i].id]);
}

console.log(root);

The accepted answer looks way too complex to me so I am adding a Ruby and NodeJS versions of it

Suppose that flat nodes list have the following structure:

nodes = [
  { id: 7, parent_id: 1 },
  ...
] # ruby

nodes = [
  { id: 7, parentId: 1 },
  ...
] # nodeJS

The functions which will turn the flat list structure above into a tree look the following way

for Ruby:

def to_tree(nodes)

  nodes.each do |node|

    parent = nodes.find { |another| another[:id] == node[:parent_id] }
    next unless parent

    node[:parent] = parent
    parent[:children] ||= []
    parent[:children] << node

  end

  nodes.select { |node| node[:parent].nil? }

end

for NodeJS:

const toTree = (nodes) => {

  nodes.forEach((node) => {

    const parent = nodes.find((another) => another.id == node.parentId)
    if(parent == null) return;

    node.parent = parent;
    parent.children = parent.children || [];
    parent.children = parent.children.concat(node);

  });

  return nodes.filter((node) => node.parent == null)

};

I can do this in 4 lines of code and O(n log n) time, assuming that Dictionary is something like a TreeMap.

dict := Dictionary new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | (dict at: each parent) addChild: each].
root := dict at: nil.

EDIT: Ok, and now I read that some parentIDs are fake, so forget the above, and do this:

dict := Dictionary new.
dict at: nil put: OrderedCollection new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | 
    (dict at: each parent ifAbsent: [dict at: nil]) 
          add: each].
roots := dict at: nil.

The accepted answer looks way too complex to me so I am adding a Ruby and NodeJS versions of it

Suppose that flat nodes list have the following structure:

nodes = [
  { id: 7, parent_id: 1 },
  ...
] # ruby

nodes = [
  { id: 7, parentId: 1 },
  ...
] # nodeJS

The functions which will turn the flat list structure above into a tree look the following way

for Ruby:

def to_tree(nodes)

  nodes.each do |node|

    parent = nodes.find { |another| another[:id] == node[:parent_id] }
    next unless parent

    node[:parent] = parent
    parent[:children] ||= []
    parent[:children] << node

  end

  nodes.select { |node| node[:parent].nil? }

end

for NodeJS:

const toTree = (nodes) => {

  nodes.forEach((node) => {

    const parent = nodes.find((another) => another.id == node.parentId)
    if(parent == null) return;

    node.parent = parent;
    parent.children = parent.children || [];
    parent.children = parent.children.concat(node);

  });

  return nodes.filter((node) => node.parent == null)

};

one elegant way to do this is to represent items in the list as string holding a dot separated list of parents, and finally a value:

server.port=90
server.hostname=localhost
client.serverport=90
client.database.port=1234
client.database.host=localhost

When assembling a tree, you would end up with something like:

server:
  port: 90
  hostname: localhost
client:
  serverport=1234
  database:
    port: 1234
    host: localhost

I have a configuration library that implements this override configuration (tree) from command line arguments (list). The algorithm to add a single item to the list to a tree is here.


Python solution

def subtree(node, relationships):
    return {
        v: subtree(v, relationships) 
        for v in [x[0] for x in relationships if x[1] == node]
    }

For example:

# (child, parent) pairs where -1 means no parent    
flat_tree = [
     (1, -1),
     (4, 1),
     (10, 4),
     (11, 4),
     (16, 11),
     (17, 11),
     (24, 17),
     (25, 17),
     (5, 1),
     (8, 5),
     (9, 5),
     (7, 9),
     (12, 9),
     (22, 12),
     (23, 12),
     (2, 23),
     (26, 23),
     (27, 23),
     (20, 9),
     (21, 9)
    ]

subtree(-1, flat_tree)

Produces:

{
    "1": {
        "4": {
            "10": {}, 
            "11": {
                "16": {}, 
                "17": {
                    "24": {}, 
                    "25": {}
                }
            }
        }, 
        "5": {
            "8": {}, 
            "9": {
                "20": {}, 
                "12": {
                    "22": {}, 
                    "23": {
                        "2": {}, 
                        "27": {}, 
                        "26": {}
                    }
                }, 
                "21": {}, 
                "7": {}
            }
        }
    }
}

here is a ruby implementation:

It will catalog by attribute name or the result of a method call.

CatalogGenerator = ->(depth) do
  if depth != 0
    ->(hash, key) do
      hash[key] = Hash.new(&CatalogGenerator[depth - 1])
    end
  else
    ->(hash, key) do
      hash[key] = []
    end
  end
end

def catalog(collection, root_name: :root, by:)
  method_names = [*by]
  log = Hash.new(&CatalogGenerator[method_names.length])
  tree = collection.each_with_object(log) do |item, catalog|
    path = method_names.map { |method_name| item.public_send(method_name)}.unshift(root_name.to_sym)
  catalog.dig(*path) << item
  end
  tree.with_indifferent_access
end

 students = [#<Student:0x007f891d0b4818 id: 33999, status: "on_hold", tenant_id: 95>,
 #<Student:0x007f891d0b4570 id: 7635, status: "on_hold", tenant_id: 6>,
 #<Student:0x007f891d0b42c8 id: 37220, status: "on_hold", tenant_id: 6>,
 #<Student:0x007f891d0b4020 id: 3444, status: "ready_for_match", tenant_id: 15>,
 #<Student:0x007f8931d5ab58 id: 25166, status: "in_partnership", tenant_id: 10>]

catalog students, by: [:tenant_id, :status]

# this would out put the following
{"root"=>
  {95=>
    {"on_hold"=>
      [#<Student:0x007f891d0b4818
        id: 33999,
        status: "on_hold",
        tenant_id: 95>]},
   6=>
    {"on_hold"=>
      [#<Student:0x007f891d0b4570 id: 7635, status: "on_hold", tenant_id: 6>,
       #<Student:0x007f891d0b42c8
        id: 37220,
        status: "on_hold",
        tenant_id: 6>]},
   15=>
    {"ready_for_match"=>
      [#<Student:0x007f891d0b4020
        id: 3444,
        status: "ready_for_match",
        tenant_id: 15>]},
   10=>
    {"in_partnership"=>
      [#<Student:0x007f8931d5ab58
        id: 25166,
        status: "in_partnership",
        tenant_id: 10>]}}}

Vague as the question seems to me, I would probably create a map from the ID to the actual object. In pseudo-java (I didn't check whether it works/compiles), it might be something like:

Map<ID, FlatObject> flatObjectMap = new HashMap<ID, FlatObject>();

for (FlatObject object: flatStructure) {
    flatObjectMap.put(object.ID, object);
}

And to look up each parent:

private FlatObject getParent(FlatObject object) {
    getRealObject(object.ParentID);
}

private FlatObject getRealObject(ID objectID) {
    flatObjectMap.get(objectID);
}

By reusing getRealObject(ID) and doing a map from object to a collection of objects (or their IDs), you get a parent->children map too.


JS version that returns one root or an array of roots each of which will have a Children array property containing the related children. Does not depend on ordered input, decently compact, and does not use recursion. Enjoy!

// creates a tree from a flat set of hierarchically related data
var MiracleGrow = function(treeData, key, parentKey)
{
    var keys = [];
    treeData.map(function(x){
        x.Children = [];
        keys.push(x[key]);
    });
    var roots = treeData.filter(function(x){return keys.indexOf(x[parentKey])==-1});
    var nodes = [];
    roots.map(function(x){nodes.push(x)});
    while(nodes.length > 0)
    {

        var node = nodes.pop();
        var children =  treeData.filter(function(x){return x[parentKey] == node[key]});
        children.map(function(x){
            node.Children.push(x);
            nodes.push(x)
        });
    }
    if (roots.length==1) return roots[0];
    return roots;
}


// demo/test data
var treeData = [

    {id:9, name:'Led Zep', parent:null},
    {id:10, name:'Jimmy', parent:9},
    {id:11, name:'Robert', parent:9},
    {id:12, name:'John', parent:9},

    {id:8, name:'Elec Gtr Strings', parent:5},
    {id:1, name:'Rush', parent:null},
    {id:2, name:'Alex', parent:1},
    {id:3, name:'Geddy', parent:1},
    {id:4, name:'Neil', parent:1},
    {id:5, name:'Gibson Les Paul', parent:2},
    {id:6, name:'Pearl Kit', parent:4},
    {id:7, name:'Rickenbacker', parent:3},

    {id:100, name:'Santa', parent:99},
    {id:101, name:'Elf', parent:100},

];
var root = MiracleGrow(treeData, "id", "parent")
console.log(root)

I can do this in 4 lines of code and O(n log n) time, assuming that Dictionary is something like a TreeMap.

dict := Dictionary new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | (dict at: each parent) addChild: each].
root := dict at: nil.

EDIT: Ok, and now I read that some parentIDs are fake, so forget the above, and do this:

dict := Dictionary new.
dict at: nil put: OrderedCollection new.
ary do: [:each | dict at: each id put: each].
ary do: [:each | 
    (dict at: each parent ifAbsent: [dict at: nil]) 
          add: each].
roots := dict at: nil.

Found an awesome JavaScript version here: http://oskarhane.com/create-a-nested-array-recursively-in-javascript/

Let’s say you have an array like this:

const models = [
    {id: 1, title: 'hello', parent: 0},
    {id: 2, title: 'hello', parent: 0},
    {id: 3, title: 'hello', parent: 1},
    {id: 4, title: 'hello', parent: 3},
    {id: 5, title: 'hello', parent: 4},
    {id: 6, title: 'hello', parent: 4},
    {id: 7, title: 'hello', parent: 3},
    {id: 8, title: 'hello', parent: 2}
];

And you want to have the objects nested like this:

const nestedStructure = [
    {
        id: 1, title: 'hello', parent: 0, children: [
            {
                id: 3, title: 'hello', parent: 1, children: [
                    {
                        id: 4, title: 'hello', parent: 3, children: [
                            {id: 5, title: 'hello', parent: 4},
                            {id: 6, title: 'hello', parent: 4}
                        ]
                    },
                    {id: 7, title: 'hello', parent: 3}
                ]
            }
        ]
    },
    {
        id: 2, title: 'hello', parent: 0, children: [
            {id: 8, title: 'hello', parent: 2}
        ]
    }
];

Here’s a recursive function that makes it happen.

function getNestedChildren(models, parentId) {
    const nestedTreeStructure = [];
    const length = models.length;

    for (let i = 0; i < length; i++) { // for-loop for perf reasons, huge difference in ie11
        const model = models[i];

        if (model.parent == parentId) {
            const children = getNestedChildren(models, model.id);

            if (children.length > 0) {
                model.children = children;
            }

            nestedTreeStructure.push(model);
        }
    }

    return nestedTreeStructure;
}

Usuage:

const models = [
    {id: 1, title: 'hello', parent: 0},
    {id: 2, title: 'hello', parent: 0},
    {id: 3, title: 'hello', parent: 1},
    {id: 4, title: 'hello', parent: 3},
    {id: 5, title: 'hello', parent: 4},
    {id: 6, title: 'hello', parent: 4},
    {id: 7, title: 'hello', parent: 3},
    {id: 8, title: 'hello', parent: 2}
];
const nestedStructure = getNestedChildren(models, 0);

Python solution

def subtree(node, relationships):
    return {
        v: subtree(v, relationships) 
        for v in [x[0] for x in relationships if x[1] == node]
    }

For example:

# (child, parent) pairs where -1 means no parent    
flat_tree = [
     (1, -1),
     (4, 1),
     (10, 4),
     (11, 4),
     (16, 11),
     (17, 11),
     (24, 17),
     (25, 17),
     (5, 1),
     (8, 5),
     (9, 5),
     (7, 9),
     (12, 9),
     (22, 12),
     (23, 12),
     (2, 23),
     (26, 23),
     (27, 23),
     (20, 9),
     (21, 9)
    ]

subtree(-1, flat_tree)

Produces:

{
    "1": {
        "4": {
            "10": {}, 
            "11": {
                "16": {}, 
                "17": {
                    "24": {}, 
                    "25": {}
                }
            }
        }, 
        "5": {
            "8": {}, 
            "9": {
                "20": {}, 
                "12": {
                    "22": {}, 
                    "23": {
                        "2": {}, 
                        "27": {}, 
                        "26": {}
                    }
                }, 
                "21": {}, 
                "7": {}
            }
        }
    }
}

Examples related to algorithm

How can I tell if an algorithm is efficient? Find the smallest positive integer that does not occur in a given sequence Efficiently getting all divisors of a given number Peak signal detection in realtime timeseries data What is the optimal algorithm for the game 2048? How can I sort a std::map first by value, then by key? Finding square root without using sqrt function? Fastest way to flatten / un-flatten nested JSON objects Mergesort with Python Find common substring between two strings

Examples related to tree

Tree implementation in Java (root, parents and children) Build tree array from flat array in javascript Binary Search Tree - Java Implementation Difference between "Complete binary tree", "strict binary tree","full binary Tree"? Tree view of a directory/folder in Windows? Definition of a Balanced Tree Difference between binary tree and binary search tree How to create a collapsing tree table in html/css/js? How to search JSON tree with jQuery Non-recursive depth first search algorithm

Examples related to language-agnostic

IOException: The process cannot access the file 'file path' because it is being used by another process Peak signal detection in realtime timeseries data Match linebreaks - \n or \r\n? Simple way to understand Encapsulation and Abstraction How can I pair socks from a pile efficiently? How do I determine whether my calculation of pi is accurate? What is ADT? (Abstract Data Type) How to explain callbacks in plain english? How are they different from calling one function from another function? Ukkonen's suffix tree algorithm in plain English Private vs Protected - Visibility Good-Practice Concern