Top K Elements

Min-Heap

Problem

Given an unsorted array of numbers, find the ‘K’ largest numbers in it.

For example:

Input: [3, 1, 5, 12, 2, 11], K = 3
Output: [5, 12, 11]
Input: [5, 12, 11, -1, 12], K = 3
Output: [12, 11, 12]

Thought Process

  • The best data structure that comes to mind to keep track of 'K' elements is heap

  • How a heap works (min heap) in Python is that each time, the smallest of the heap elements is popped. Whenever elements are pushed or popped, heap structure is maintained. The heap[0] element returns the smallest each time or pop the smallest.

  • Since we are looking for "Top K elements" a min-heap is the best option. Why min-heap is the best option because it will have the smallest elements in the array and we don't want these, so we can discard these in place for the bigger elements.

Solution

from heapq import *


def find_k_largest_numbers(nums, k):
  heap = []
  
  # put first 'K' numbers in the min heap
  for i in range(k):
    heappush(heap, nums[i])

  for i in range(k, len(nums)):
    if nums[i] > heap[0]:
      heappop(minHeap)
      heappush(heap, nums[i])

  # the heap has the top 'K' numbers, return them in a list
  return list(heap)

Key Points

  • Put the first 'K' numbers in the min-heap as these will be the initial values and the root (which is the smallest) is what we will compare to.

  • We go through the remaining numbers in the array, if the number from the array is bigger than the top(smallest) number of the min-heap, remove the top number from the heap and add the number from the array

  • Min-heap essentially "bubbles up" the smaller numbers to the top so we can discard them

Time Complexity

  • Time: O(Nlog(K))O(N*log(K)) because it will take us O(log(K))O(log(K)) to extract the minimum number from the min-heap. So the overall time complexity of our algorithm will be O(Klog(K)+(NK)log(K))O(K*log(K)+(N-K)*log(K)) since, first, we insert ‘K’ numbers in the heap and then iterate through the remaining numbers and at every step, in the worst case, we need to extract the minimum number and insert a new number in the heap. This is asumpotically equal to O(Nlog(K))O(N*log(K))

  • Space: O(K)O(K) since we need to store the top ‘K’ numbers in the heap.

Last updated

Was this helpful?