0

I'm trying to solve this problem:

A string has a majority character if the majority of the positions in the string are the same character. For example, ababa has the majority character a (three of five positions), but abab and abcd do not.

Every non-empty string can be partitioned into one or more substrings that have majority characters, because if nothing else, it can be partitioned into single-character substrings.

Given a non-empty string with length ≤ 5000, find the minimum number of substrings that it can be partitioned into such that each of the substrings has a majority character.

For example, if the string is abca then the answer is 4 (a, b, c, a); if the string is ababa then the answer is 1 (ababa); and if the string is aaaabbbcd then the answer is 2 (aaaa, bbbcd).

I've written a solution using dynamic programming, and I believe that the basic idea is correct, but it's too slow: it requires O(length3) time, and the length can be as high as 5000. How can I speed this up? Do I need to fundamentally change my approach?

func minSubstrings(_ s: String) -> Int {
  let chars = Array(s)
  let length = chars.count
  guard length > 0 else { return 0 }

  var dp = Array(repeating: Int.max, count: length + 1)
  dp[0] = 0

  func isValidSubstring(start: Int, end: Int) -> Bool {
    var freq: [Character: Int] = [:]
    for i in start..<end {
      let char = chars[i]
      freq[char, default: 0] += 1
    }
    let substringLength = end - start
    for count in freq.values {
      if count > substringLength / 2 {
        return true
      }
    }
    return false
  }

  for end in 1...length {
    for start in 0..<end {
      if isValidSubstring(start: start, end: end) {
        dp[end] = min(dp[end], dp[start] + 1)
      }
    }
  }

  return dp[length]
}

For anyone who may have a similar problem, I put the modified code below. The complexity is quadratic.

func minSubstrings(_ s: String) -> Int {
  let chars = Array(s)
  let length = chars.count
  guard length > 0 else { return 0 }
  
  var dp = Array(repeating: Int.max, count: length + 1)
  dp[0] = 0
  
  for start in 0..<length {
    var freq: [Character: Int] = [:]
    var maxFreq = 0
    var majorityChar: Character? = nil
    
    for end in start..<length {
      let char = chars[end]
      freq[char, default: 0] += 1
      if freq[char]! > maxFreq {
        maxFreq = freq[char]!
        majorityChar = char
      }
      
      let substringLength = end - start + 1
      
      if maxFreq > substringLength / 2 {
        dp[end + 1] = min(dp[end + 1], dp[start] + 1)
      }
    }
  }
  
  return dp[length]
}
1
  • Please move your added solution into an answer. It doesn't belong in the question. Commented Sep 26, 2024 at 13:40

2 Answers 2

1

The dynamic-programming part of your solution is fine, but the problem is that the isValidSubstring function is too slow — and there's no way to make it fast in its current form.

The reason is that by treating isValidSubstring(a, b) and isValidSubstring(a, b + 1) as completely unrelated computations, you force the latter to start over without any of the information you gathered in the former.

Instead, for each value of start, you can do a single O(length) pass to determine which substrings starting at start are "valid" ones. You can do that by keeping a single freq throughout this pass (so that each new end only has to update the one element), and by using the fact that the only characters that can possibly be a majority character in s[start..<end+1] are s[end] and the majority character in s[start..<end] (so that each new end only needs to check those two elements in freq). That gives you an O(length2) algorithm.

Sign up to request clarification or add additional context in comments.

2 Comments

Thank you for such a detailed explanation! Indeed, everything turned out to be very simple.
@clearcut3000: My pleasure!
0

You see both of your cycles do string analysis for a substring as an independent entity. But each of the cycle iteration is just doing one thing - either add or remove one character to your analysis set, so basically you'll not need to build freq dictionary every time, you just modify it on each cycle by adding/removing corresponding characters.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.