Educative

Why use a map to store character frequency when its never used?

The question only requires us to identify the number of distinct characters in the subsequence at any given point in time. The individual frequencies of the characters don’t actually matter. So why waste space storing them and also updating them when they are never used? Code becomes much simpler if we use a set.

Code below:

def longest_substring_with_k_distinct(string, k):
  string_start = 0
  longest_length = 0
  for string_end in range(len(string)):
        current_word = string[string_start: string_end + 1]
        num_distincts = len(set(current_word))
        if num_distincts == k:
               longest_length = max(longest_length, string_end - string_start + 1)

        while len(set(current_word)) > k:
              string_start += 1
              current_word = string[string_start: string_end + 1]
  
  return longest_length

Agreed.

Java edition based on Set

import java.util.*;

class LongestSubstringKDistinct {
  public static int findLength(String str, int k) {
    if (str == null) {
      return -1;
    }
    int ret = 0;
    Set<Character> chars = new HashSet<>();
    for (int left = 0, right = 0; right < str.length(); right++) {
      chars.add(str.charAt(right));
      if (chars.size() > k) {
        char c = str.charAt(left);
        while (left <= right && str.charAt(left) == c) {
          left++;
        }
        chars.remove(c);
      }
      ret = Math.max(ret, right - left + 1);
    }
    return ret;
  }
}