1

While trying to understand binary (search) trees, I stumbled upon the question of whether a BST can be stored in a flat array efficiently (i.e., without wasting space). After google'ing and searching in forums (including SO), I didn't find anything that satisfied me. Most people want to store BTs (l=2*i+1, r=2*i+2) or accept the fact that empty children of a node are represented as nil values (and stored like that, unnecessarily eating space).

So I wrote my own algorithm, harnessing the fact that child nodes in BSTs have a particular ordering property (smaller than root: left, larger than root: right).

To store a BST, I just linearly write it to an array, depth-first. That happens in O(n), recursing into the tree's child nodes. The array is then exactly n entries long, n being the number of nodes in the BST.

To reconstruct the BST from the array, I follow two rules when encountering a new entry in the array:

  • The next left node (if existing) is the next smaller than the current one.

  • The next right node (if existing) is the next one ...

    • ... bigger than the current one
    • ... no larger than the smallest parent key when last branching left
    • ... smaller than parent if in parent's left branch

This also happens in O(n).

This works surprisingly well for very, very large trees of random structure (as long as they are BSTs). Empirical tests have never shown any problems. I made a working code example in C++ here, the two conversion functions being toArray and fromArray.

Now to my question: Does this generalize? I am afraid to oversee crucial problems with this. The fact that I haven't found anything else on that on the net makes me wonder whether

a) I'm too stupid to find it,

b) It's common sense and nobody talks about it or

c) It's plain wrong.

I'd be very grateful for anybody that is proficient on the topic to enlighten me on this.

Thanks a lot already.

EDIT:

After google'ing some more and completely ignoring the array detail, I found a couple of valid solutions. The most prominent one (as explained here and here) was to store the BST using pre-order traversal.

8
  • How do heaps help? Their ordering property is completely different from BSTs. In heaps, the keys of the children are either always bigger (min heap) or smaller (max heap) than the parent's key. With BSTs, the left child is always less that the parent while the right child is always larger. Can you elaborate on your comment? Commented Jan 30, 2016 at 20:07
  • @JanWinkler: You can construct a BST using a greater-than operator for ordering instead of less-than. Commented Jan 30, 2016 at 20:18
  • @Blastfurnace What do you mean? Can you give an example? How does this work for nodes that don't have both children? Commented Jan 30, 2016 at 20:19
  • @JanWinkler: It sounded good when I wrote it. :) Well I think I was thinking of structure as function of array index. Anyway, since I see now that you considered that first of all, just disregard, please. Commented Jan 30, 2016 at 20:22
  • @JanWinkler: I'm just saying that a BST ordering can be reversed. Nothing requires the left child to be less than the parent and the right greater as long as the same ordering is used for all operations. Commented Jan 30, 2016 at 20:25

1 Answer 1

1

Eventually, I found the solution myself. The technique to use for reconstructing a BST from an array that was created from the BST depth-first is based on pre-order traversal.

I found the solution here and implemented it in my own BST class here. The essential part goes as follows:

bool BinarySearchTree::fromArrayPreorderTraversal(std::vector<unsigned int> vecArray, unsigned int& unCurrentIndex, unsigned int unMin, unsigned int unMax) {
  bool bResult = false;

  if(unCurrentIndex < vecArray.size()) {
    unsigned int unVal = vecArray[unCurrentIndex];

    if(unVal > unMin && unVal < unMax) {
      bResult = true;
      this->setKey(unVal);
      unCurrentIndex++;

      if(unCurrentIndex < vecArray.size()) {
        if(!this->left(true)->fromArrayPreorderTraversal(vecArray, unCurrentIndex, unMin, unVal)) {
          this->setLeft(NULL);
        }

        if(!this->right(true)->fromArrayPreorderTraversal(vecArray, unCurrentIndex, unVal, unMax)) {
          this->setRight(NULL);
        }
      }
    }
  }

  return bResult;
}

Here, a BST of the form:

      10
     /  \
    5    20
   / \     \
  2   7     30
 /
1

Is available as a pre-order traversal array:

10 5 2 1 7 20 30

The parent node comes first, then the left child, then the right child (recursively). This is a depth-first representation of the BST.

The algorithm gets the valid range of keys to accept for a new node on the left or right side as parameters (here unMin and unMax) and based on that, decides whether the node should be created or not. The full BST including its original structure is reconstructed this way.

The time complexity of this (according to the reference above) is O(n). As no space is wasted, the space complexity is O(n) as well.

This algorithm is far more elegant than the one I proposed in the question. It also seems to be the general way to do this reconstruction.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.