Longest Substring with K Distinct Characters

Sliding Window

Problem

Given a string, find the length of the longest substring in it with no more than K distinct characters.

For example:

Input: String="araaci", K=2
Output: 4
Explanation: The longest substring with no 
more than '2' distinct characters is "araa".
Input: String="araaci", K=1
Output: 2
Explanation: The longest substring with no
more than '1' distinct characters is "aa".
Input: String="cbbebi", K=3
Output: 5
Explanation: The longest substrings with no
more than '3' distinct characters are "cbbeb"
& "bbebi".

Thought Process

  • How can we keep track of distinct characters?

    • By using a dictionary and keeping track of character frequency

  • We use a dictionary to keep track of distinct characters along with their count

  • We only move the sliding window forward when the length of the dictionary surpasses k because this means we have more distinct characters than k.

    • We shrink the sliding window until we are left with 'k' distinct characters in our dictionary

Solution

def longest_substring_with_k_distinct(str, k):
  distinct = {}
  start = 0
  maxLength = 0

  for i in range(len(str)):
    if str[i] not in distinct:
      distinct[str[i]] = 0
    distinct[str[i]]+=1

    while len(distinct) > k:
      windowStart = str[start]
      distinct[windowStart]-=1
      if distinct[windowStart] == 0:
        del distinct[windowStart]
      start+=1
    
    maxLength = max(maxLength, i-start+1)
  return maxLength

Key Points

  • Using a dynamic window that is static according to the numbers of distnict characters we have

  • The length of our dictionary controls when we move/increment the sliding window

  • We use a while loop in order to continuously shrink the sliding window until we meet our condition (which in this case is k distinct characters)

Time complexity

Last updated