I am writing some data to a text file(stored in disk) as output from my program. I want to organize the data in the text file in a search tree format so that it facilitates efficient search and replace(through the program itself). I would like to know how to implement the tree structure to be stored in a disk memory.
-
which language are you working in?user4987274– user49872742015-06-23 18:34:42 +00:00Commented Jun 23, 2015 at 18:34
-
Your question is confusing. Are you hoping to modify a text file in-place? Can you show us an example of what you're trying to accomplish?Jim Mischel– Jim Mischel2015-06-23 18:58:17 +00:00Commented Jun 23, 2015 at 18:58
-
@JimMischel I am trying to understand the practical application of search trees. Just would like to know how to implement a tree structure that can be pushed back and forth between RAM and disk memory. (Correct me if my understanding of the approach is wrong).user3531992– user35319922015-06-23 19:11:05 +00:00Commented Jun 23, 2015 at 19:11
1 Answer
One of the main practical difficulties of using a tree data-structure on disk is that with naive binary trees data will be "far apart" and trying to access this data will likely cause thrashing as your hard drive attempts to continuously access different locations on disk.
The classic solution to this problem is to use B-trees. The basic idea behind B-trees is that reads from disk are expensive so you should use them as little as possible. This is accomplished by using large nodes; instead of storing only two children, B-trees can have m children. This greatly increases the entropy of each node meaning that it takes far fewer reads to access you data.
Some more reading on B-trees can be found here, the pictures are particularly helpful in my opinion, and several implementations on B-trees can be found here.