The question only requires us to identify the number of distinct characters in the subsequence at any given point in time. The individual frequencies of the characters don’t actually matter. So why waste space storing them and also updating them when they are never used? Code becomes much simpler if we use a set.
Code below:
def longest_substring_with_k_distinct(string, k):
string_start = 0
longest_length = 0
for string_end in range(len(string)):
current_word = string[string_start: string_end + 1]
num_distincts = len(set(current_word))
if num_distincts == k:
longest_length = max(longest_length, string_end - string_start + 1)
while len(set(current_word)) > k:
string_start += 1
current_word = string[string_start: string_end + 1]
return longest_length