0

I was solving

https://www.spoj.com/problems/BEADS/

above question at SPOJ. I have stated the relevant information below:

Problem Statement: The description of the necklace is a string A = a1a2 ... am specifying sizes of the particular beads, where the last character am is considered to precede character a1 in circular fashion. The disjoint point i is said to be worse than the disjoint point j if and only if the string aiai+1 ... ana1 ... ai-1 is lexicografically smaller than the string ajaj+1 ... ana1 ... aj-1. String a1a2 ... an is lexicografically smaller than the string b1b2 ... bn if and only if there exists an integer i, i <= n, so that aj=bj, for each j, 1 <= j < i and ai < bi.

Output: For each test case, print exactly one line containing only one integer -- number of the bead which is the first at the worst possible disjoining, i.e. such i, that the string A[i] is lexicographically smallest among all the n possible disjoinings of a necklace. If there are more than one solution, print the one with the lowest i.

Now the solution is using SUFFIX ARRAY. Input string s, and concat with itself, s'=s+s ,since I have to sort cyclic suffixes of array. Then create a suffix array on s', and output the smallest index that points to a suffix of original s, i.e., index < len(s).

But there is a problem I face. I was appending '$' character to get SA, but I was getting wrong answer. After looking online, I found 1 solution that had appended '}' to string. I found that ascii('$') < ascii('a') < ascii('z') < ascii('}')

But i don't understand how this will make a difference, why this is accepted answer and haven;t found a case where this will make a difference. The solution (AC) can be found here:

Link to Code

    #include <bits/stdc++.h>
using namespace std;

string s;int n;
bool cmp_init(int a, int b)
{
    return s[a]<s[b] || (s[a]==s[b] && a<b);
}

int jmp;
vector<int> pos;
bool cmp(int a, int b)
{
    return pos[a]<pos[b] || (pos[a]==pos[b] && pos[(a+jmp)%n]<pos[(b+jmp)%n]);
}

int main() {
    int tc;cin>>tc;
    while(tc--){
    cin>>s;
    int m=s.size();
    s=s+s+"{";
    n=s.size();
    
    vector<int> SA(n,0);
    for(int i=0;i<n;i++)SA[i]=i;
    sort(SA.begin(), SA.end(), cmp_init);
    pos.assign(n,0);
    
    for(int i=1 , c=0;i<n;i++)pos[SA[i]]=(s[SA[i]]==s[SA[i-1]])?c:++c;
        
    for(jmp=1;jmp<=n;jmp*=2)
    {
        
        sort(SA.begin(), SA.end(), cmp);
        
        vector<int> tmp(n,0);
    
        for(int i=1 , c=0;i<n;i++)tmp[SA[i]]=(pos[SA[i]]==pos[SA[i-1]] && pos[(SA[i]+jmp)%n]==pos[(SA[i-1]+jmp)%n])?c:++c;

        for(int i=0;i<n;i++)pos[i]=tmp[i];

    }
    
        for(int i=0;i<n;i++)if(SA[i]<m){cout<<SA[i]+1<<"\n";break;}
    }
    
}


PS.: I have found that SA construction code is correct, only problem is with last character appending. Normally we append '$' in SA construction.

1 Answer 1

0

The difference is in the last condition:

If there are more than one solution, print the one with the lowest i.

Consider input "abab".

The correct answer is 0, which you get when you append '}', because "abababab}" is less than all of its suffixes.

If you append '$', you get the wrong answer, because "ab$" < "abab$" < "ababab$" < "abababab$".

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks, I too suspected, this condition was the culprit, but wasn't able to understand how there could be more than 1 solution. Now I have gotit

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.