2

The title pretty much.

I created a suffix array in O(n) time using the DC3 algorithm. I then created an LCP array using Kasai's algorithm in O(n) time. Now I need to create a suffix tree from the two arrays I have. How does one do that? I looked at journal papers and looked around using Google but I could not find a way to do it.

There is one coursera video that I came across that describes the process but they do not state what method they use and I doubt it is a linear time algorithm.

1 Answer 1

4

It's actually very simple. The suffix array tells you the sequence of suffixes that you encounter if you do a left-to-right depth-first traversal of the suffix tree. The LCP array tells you how far up you need to go before starting a new edge corresponding to the next suffix. Assuming the string s has some unique character at the end (so each suffix is represented by a leaf node), the algorithm looks roughly like this:

let root = new node
let p = new node
make p a child of root with edge label S[0] (the least suffix)
for (int i = 1; i < s.size(); i++) {
    let l = LCP[i-1] (the LCP length of S[i] and S[i-1])
    let prevP = null
    while ((depth of p) > l) {
        // note that "depth" means total edge length from root to p, not
        // total number of edges from root to p
        prevP := p
        p := (parent of p)
    }
    if ((depth of p) == l) {
        let q = new node
        make q a child of p, with edge label S[i][l...]
        p := q
    } else {
        // we need to "split" the edge from p to prevP
        let q = new node
        let prevDepth = depth of prevP
        unlink prevP from p
        make q a child of p, with edge label S[i-1][(depth of p)...(l - 1)]
        make prevP a child of q, with edge label S[i-1][l...(prevDepth - 1)]
        let r = new node
        make r a child of q, with edge label S[i][l...]
        p := r
    }
}
return root
Sign up to request clarification or add additional context in comments.

4 Comments

Thank you for your answer! Does this algorithm have a name?
@hussainsagar I have no idea.
I have a doubt about what you mean by depth. You have said depth is the total number of edges from the root to p and not edge length. How are edge length and the number of edges different?
@hussainsagar Sorry, when I talk about the length of an edge, I mean the length of the substring label that that edge carries. For the string "abcabd", the suffix tree would have a node that is linked to the root with an edge with label "ab". So that node's depth would be 2 since "ab" has length 2, even though it's only 1 edge.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.