Find if string contains the substring without Java.substring

Question

Design a function hasCheated(String s1, String s2, int N) that returns true if the two strings, s1, s2 have a common substring of at least length N. Without using .contains, .substring

Was the desired approach to convert to char arrays or something like that?

I am considering finding the longest common substring and then seeing it again N. What are some approaches for this?

You are apparently looking for a String searching algorithm. — Andy Turner
– Andy Turner, Commented Apr 27, 2017 at 7:59
"What are some approaches for finding the longest common substring?" Why don't you try searching the web? There are already answers out there. See longest common substring. — Andreas
– Andreas, Commented Apr 27, 2017 at 8:09

Peter Lawrey · Accepted Answer · 2017-04-27 10:00:54Z

2

Finding the longest common substring first will be slower than just checking whether there is a match of at least a given size. You can stop as soon as you get a match, even if there could have been a longer one.

I would build a hash set (or a hash map if the requirement is more complex) which represent all the sub strings of length N for one (without actually creating those substrings) and use it to scan strings of length N for likely matches.

You could do this in O(M) time where M is the length of longest String.

You can create a substring class like this.

class Substring {
    final String s;
    final int offset, length, hashCode;

    Substring(String s, int offset, int length) {
        this.s = s;
        this.offset = offset;
        this.length = length;
        this.hashCode = hashCode(s, offset, length); // define to taste
    }

    public int hashCode() { return hashCode; }

    public boolean equals(Object o) {
       if (!(o instanceof Substring)) return false;
       Substring ss = (Substring) s;
       if (hashCode != ss.hashCode || length != ss.length) return false;
       for (int i = 0; i < length; i++) 
           if (s.charAt(i) != ss.s.charAt(i))
               return false;
       return true;
    }
}

To build the HashSet, you can do the following in O(n) where n is s1.length()

Set<Substring> substrings = new HashSet<>();
for (int i = 0; i < s1.length() - n; i++) {
    Substring ss = new Substring(s1, i, n);
    substrings.add(ss);
}

To do the search for matches, you can do the following in O(n) where n is s2.length()

for (int i = 0; i < s2.length() - n; i++) {
    Substring ss = new Substring(s2, i, n);
    if (substrings.contains(ss))
         return true; // found a match.
}

edited Apr 27, 2017 at 10:00

answered Apr 27, 2017 at 7:54

Peter Lawrey

535k83 gold badges770 silver badges1.2k bronze badges

Sign up to request clarification or add additional context in comments.

2 Comments

Andreas Over a year ago

Perhaps, to be a useful answer, you should explain further how you would build a hashmap of substrings without creating substrings.

Peter Lawrey Over a year ago

@Andreas added a simple example.

Kaidul · Accepted Answer · 2017-04-27 08:03:59Z

1

So, without contains, substring I am assuming you mean not to use any Java APIs. Here is an implementation using dynamic programming:

public static boolean hashCheated(String a, String b, int N) {
    int m = a.length();
    int n = b.length();

    int max = 0;

    int[][] dp = new int[m][n];

    for(int i=0; i<m; i++){
        for(int j=0; j<n; j++){
            if(a.charAt(i) == b.charAt(j)){
                if(i==0 || j==0){
                    dp[i][j]=1;
                }else{
                    dp[i][j] = dp[i-1][j-1]+1;
                }

                if(max < dp[i][j])
                    max = dp[i][j];
            }

        }
    }

    return (max >= N);
}

edited Apr 27, 2017 at 8:03

answered Apr 27, 2017 at 7:56

Kaidul

16k17 gold badges84 silver badges158 bronze badges

4 Comments

Peter Lawrey Over a year ago

This is O(m*n) for time and memory usage.

Kaidul Over a year ago

yes, this is one of the many approaches possible without using Java APIs and every approach has time-space tradeoff. May be not the best approach.

Peter Lawrey Over a year ago

If you consider time to get a working solution (i.e. time of developer), it's pretty good. :)

Andreas Over a year ago

What do you mean by "not to use any Java APIs"? Your code uses both length() and charAt(), which are both Java API methods. They are no less "Java APIs" than contains() and substring().

Collectives™ on Stack Overflow

Find if string contains the substring without Java.substring

2 Answers 2

2 Comments

4 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

2 Comments

4 Comments

Your Answer

Sign up or log in

Post as a guest

Related