0

To find the number of occurrences of a given string P ( length m ) in a text T ( length N )

We must use binary search against the suffix array of T.

The issue with using standard binary search ( without the LCP information ) is that in each of the O(log N) comparisons you need to make, you compare P to the current entry of the suffix array, which means a full string comparison of up to m characters. So the complexity is O(m*log N).

The LCP-LR array helps improve this to O(m+log N). know more

How we precompute LCP-LR array from LCP array?

And How does LCP-LR help in finding the number of occurrences of a pattern?

Please Explain the Algorithm with Example

Thank you

2

2 Answers 2

1
// note that arrSize is O(n)
// int arrSize = 2 * 2 ^ (log(N) + 1) + 1; // start from 1

// LCP = new int[N];
// fill the LCP...
// LCP_LR = new int[arrSize];
// memset(LCP_LR, maxValueOfInteger, arrSize);
// 

// init: buildLCP_LR(1, 1, N);
// LCP_LR[1] == [1..N]
// LCP_LR[2] == [1..N/2]
// LCP_LR[3] == [N/2+1 .. N]

// rangeI = LCP_LR[i]
//   rangeILeft  = LCP_LR[2 * i]
//   rangeIRight = LCP_LR[2 * i + 1]
// ..etc
void buildLCP_LR(int index, int low, int high)
{
    if(low == high)
    {
        LCP_LR[index] = LCP[low];
        return;
    }

    int mid = (low + high) / 2;

    buildLCP_LR(2*index, low, mid);
    buildLCP_LR(2*index+1, mid + 1, high);

    LCP_LR[index] = min(LCP_LR[2*index], LCP_LR[2*index + 1]);
}

Reference: https://stackoverflow.com/a/28385677/1428052

Sign up to request clarification or add additional context in comments.

3 Comments

We can construct LCP Array using segment tree in O(n) but each operation on segment tree takes O(log n). How can we improve this to O(1) ? Is it possible?
the LCP-LR should precompute all possible results so you won't need to divide it to find what you need
The computation of mid = (low + high) / 2 will overflow on large arrays. Consider changing it to mid = low + (high - low) / 2.
0

Not having enough reps to comment so posting. Is anybody able to create the LCP-LR using @Abhijeet Ashok Muneshwar solution. For ex for text- mississippi the Suffix array-

0 1 2 3 4 5 6 7 8 9 10

10 7 1 4 0 9 8 3 6 2 5

The LCP array will be

0 1 2 3 4 5 6 7 8 9 10

1 1 4 0 0 1 0 2 1 3 0

And LCP-LR will be

0 1 2 3 4 5 6 7 8 9 10

1 1 0 4 0 0 0 0 0 1 3

But the LCP-LR obtained using the code is not same as above. To the method buildLCP_LR i am passing index=0, low=0, high=n

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.