[python] What do I use for a max-heap implementation in Python?

Python includes the heapq module for min-heaps, but I need a max heap. What should I use for a max-heap implementation in Python?

This question is related to python data-structures heap recursive-datastructures

The answer is


If you are inserting keys that are comparable but not int-like, you could potentially override the comparison operators on them (i.e. <= become > and > becomes <=). Otherwise, you can override heapq._siftup in the heapq module (it's all just Python code, in the end).


Allowing you to chose an arbitrary amount of largest or smallest items

import heapq
heap = [23, 7, -4, 18, 23, 42, 37, 2, 8, 2, 23, 7, -4, 18, 23, 42, 37, 2]
heapq.heapify(heap)
print(heapq.nlargest(3, heap))  # [42, 42, 37]
print(heapq.nsmallest(3, heap)) # [-4, -4, 2]

class MaxHeap:

    def __init__(self, items=[]):
        self.heap = [0]
        self.items = items
        for item in items:
            self.heap.append(item)
            self.__floatUp(len(self.heap)-1)

    """ 
    self.heap = [0] heap index stars form 1
    self.heap.append(item) append items from list to heap
    append items to it's proper place
    """

    def push(self, data):
        self.heap.append(data)
        self.__floatUp(len(self.heap)-1)
        
    """
    Append data to the end of the list the using float up put it to the proper place
    """


    def peek(self):
        if self.heap[1]:
            return self.heap[1]
        else:
            return False
        
        
    
    """
    To return the first element form list
    """

    def pop(self):
        if (len(self.heap)) > 2:
            self.__swap(1,len(self.heap)-1)
            max = self.heap.pop()
            self.__bubbleDown(1)

        elif (len(self.heap)) == 2:
            max = self.heap.pop()
        else:
            return False
        return max
    
    """
    If length is greater than two then swap first and last value as only the last value can be removed from the tree
    Place the new first item after removing of the item to its proper place using bubble down
    """
    
    
    def __swap(self, i, j):
        self.heap[i], self.heap[j] = self.heap[j], self.heap[i]

    """
    Swap is used in a bubble down and bubble up for proper placing of values is list
    """
    
    def __floatUp(self, index):
        parent = index // 2
        if index <= 1:
            return
        elif self.heap[index] > self.heap[parent]:
            self.__swap(index, parent)
            self.__floatUp(parent)
    
    """
    if value and parents , where parent < value , swap where parent, value = value,parent
    After swap again check parent > value == True if false then again continue
    """

    def __bubbleDown(self, index):
        left = index * 2
        right = index * 2 + 1
        largest = index

        if len(self.heap) > left and self.heap[largest] < self.heap[left]:
            largest = left

        if len(self.heap) > right and self.heap[largest] < self.heap[right]:
            largest = right

        if largest != index:
            self.__swap(index, largest)
            self.__bubbleDown(largest)
    """
    self.heap[largest] < self.heap[left] : largest = left
    current value and lower left value is bigger largest = left
    self.heap[largest] > self.heap[right]
    current value and lower value right value if lower right value is bigger than largest value
    
    after getting the largest value we will swap larger value and smaller value and them again check that the current value is 
    bigger than is down value then again swap space and check again until every node is settled at its the correct place 
    """
    
    def __str__(self):
        return str(self.heap)
    
    """
    To str representation of the list
    """

m = MaxHeap([95, 3, 21])

m.push(10)

print(m)

print(m.pop())

print(m.peek())

The easiest and ideal solution

Multiply the values by -1

There you go. All the highest numbers are now the lowest and vice versa.

Just remember that when you pop an element to multiply it with -1 in order to get the original value again.


The solution is to negate your values when you store them in the heap, or invert your object comparison like so:

import heapq

class MaxHeapObj(object):
  def __init__(self, val): self.val = val
  def __lt__(self, other): return self.val > other.val
  def __eq__(self, other): return self.val == other.val
  def __str__(self): return str(self.val)

Example of a max-heap:

maxh = []
heapq.heappush(maxh, MaxHeapObj(x))
x = maxh[0].val  # fetch max value
x = heapq.heappop(maxh).val  # pop max value

But you have to remember to wrap and unwrap your values, which requires knowing if you are dealing with a min- or max-heap.

MinHeap, MaxHeap classes

Adding classes for MinHeap and MaxHeap objects can simplify your code:

class MinHeap(object):
  def __init__(self): self.h = []
  def heappush(self, x): heapq.heappush(self.h, x)
  def heappop(self): return heapq.heappop(self.h)
  def __getitem__(self, i): return self.h[i]
  def __len__(self): return len(self.h)

class MaxHeap(MinHeap):
  def heappush(self, x): heapq.heappush(self.h, MaxHeapObj(x))
  def heappop(self): return heapq.heappop(self.h).val
  def __getitem__(self, i): return self.h[i].val

Example usage:

minh = MinHeap()
maxh = MaxHeap()
# add some values
minh.heappush(12)
maxh.heappush(12)
minh.heappush(4)
maxh.heappush(4)
# fetch "top" values
print(minh[0], maxh[0])  # "4 12"
# fetch and remove "top" values
print(minh.heappop(), maxh.heappop())  # "4 12"

You can use

import heapq
listForTree = [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]    
heapq.heapify(listForTree)             # for a min heap
heapq._heapify_max(listForTree)        # for a maxheap!!

If you then want to pop elements, use:

heapq.heappop(minheap)      # pop from minheap
heapq._heappop_max(maxheap) # pop from maxheap

This is a simple MaxHeap implementation based on heapq. Though it only works with numeric values.

import heapq
from typing import List


class MaxHeap:
    def __init__(self):
        self.data = []

    def top(self):
        return -self.data[0]

    def push(self, val):
        heapq.heappush(self.data, -val)

    def pop(self):
        return -heapq.heappop(self.data)

Usage:

max_heap = MaxHeap()
max_heap.push(3)
max_heap.push(5)
max_heap.push(1)
print(max_heap.top())  # 5

The easiest way is to invert the value of the keys and use heapq. For example, turn 1000.0 into -1000.0 and 5.0 into -5.0.


I have created a heap wrapper that inverts the values to create a max-heap, as well as a wrapper class for a min-heap to make the library more OOP-like. Here is the gist. There are three classes; Heap (abstract class), HeapMin, and HeapMax.

Methods:

isempty() -> bool; obvious
getroot() -> int; returns min/max
push() -> None; equivalent to heapq.heappush
pop() -> int; equivalent to heapq.heappop
view_min()/view_max() -> int; alias for getroot()
pushpop() -> int; equivalent to heapq.pushpop

I implemented a max heap version of heapq and submitted it to PyPI. (Very slight change of heapq module CPython code.)

https://pypi.python.org/pypi/heapq_max/

https://github.com/he-zhe/heapq_max

Installation

pip install heapq_max

Usage

tl;dr: same as heapq module except adding ‘_max’ to all functions.

heap_max = []                           # creates an empty heap
heappush_max(heap_max, item)            # pushes a new item on the heap
item = heappop_max(heap_max)            # pops the largest item from the heap
item = heap_max[0]                      # largest item on the heap without popping it
heapify_max(x)                          # transforms list into a heap, in-place, in linear time
item = heapreplace_max(heap_max, item)  # pops and returns largest item, and
                                    # adds new item; the heap size is unchanged

In case if you would like to get the largest K element using max heap, you can do the following trick:

nums= [3,2,1,5,6,4]
k = 2  #k being the kth largest element you want to get
heapq.heapify(nums) 
temp = heapq.nlargest(k, nums)
return temp[-1]

Following up to Isaac Turner's excellent answer, I'd like put an example based on K Closest Points to the Origin using max heap.

from math import sqrt
import heapq


class MaxHeapObj(object):
    def __init__(self, val):
        self.val = val.distance
        self.coordinates = val.coordinates

    def __lt__(self, other):
        return self.val > other.val

    def __eq__(self, other):
        return self.val == other.val

    def __str__(self):
        return str(self.val)


class MinHeap(object):
    def __init__(self):
        self.h = []

    def heappush(self, x):
        heapq.heappush(self.h, x)

    def heappop(self):
        return heapq.heappop(self.h)

    def __getitem__(self, i):
        return self.h[i]

    def __len__(self):
        return len(self.h)


class MaxHeap(MinHeap):
    def heappush(self, x):
        heapq.heappush(self.h, MaxHeapObj(x))

    def heappop(self):
        return heapq.heappop(self.h).val

    def peek(self):
        return heapq.nsmallest(1, self.h)[0].val

    def __getitem__(self, i):
        return self.h[i].val


class Point():
    def __init__(self, x, y):
        self.distance = round(sqrt(x**2 + y**2), 3)
        self.coordinates = (x, y)


def find_k_closest(points, k):
    res = [Point(x, y) for (x, y) in points]
    maxh = MaxHeap()

    for i in range(k):
        maxh.heappush(res[i])

    for p in res[k:]:
        if p.distance < maxh.peek():
            maxh.heappop()
            maxh.heappush(p)

    res = [str(x.coordinates) for x in maxh.h]
    print(f"{k} closest points from origin : {', '.join(res)}")


points = [(10, 8), (-2, 4), (0, -2), (-1, 0), (3, 5), (-2, 3), (3, 2), (0, 1)]
find_k_closest(points, 3)

Extending the int class and overriding __lt__ is one of the ways.

import queue
class MyInt(int):
    def __lt__(self, other):
        return self > other

def main():
    q = queue.PriorityQueue()
    q.put(MyInt(10))
    q.put(MyInt(5))
    q.put(MyInt(1))
    while not q.empty():
        print (q.get())


if __name__ == "__main__":
    main()

To elaborate on https://stackoverflow.com/a/59311063/1328979, here is a fully documented, annotated and tested Python 3 implementation for the general case.

from __future__ import annotations  # To allow "MinHeap.push -> MinHeap:"
from typing import Generic, List, Optional, TypeVar
from heapq import heapify, heappop, heappush, heapreplace


T = TypeVar('T')


class MinHeap(Generic[T]):
    '''
    MinHeap provides a nicer API around heapq's functionality.
    As it is a minimum heap, the first element of the heap is always the
    smallest.
    >>> h = MinHeap([3, 1, 4, 2])
    >>> h[0]
    1
    >>> h.peek()
    1
    >>> h.push(5)  # N.B.: the array isn't always fully sorted.
    [1, 2, 4, 3, 5]
    >>> h.pop()
    1
    >>> h.pop()
    2
    >>> h.pop()
    3
    >>> h.push(3).push(2)
    [2, 3, 4, 5]
    >>> h.replace(1)
    2
    >>> h
    [1, 3, 4, 5]
    '''
    def __init__(self, array: Optional[List[T]] = None):
        if array is None:
            array = []
        heapify(array)
        self.h = array
    def push(self, x: T) -> MinHeap:
        heappush(self.h, x)
        return self  # To allow chaining operations.
    def peek(self) -> T:
        return self.h[0]
    def pop(self) -> T:
        return heappop(self.h)
    def replace(self, x: T) -> T:
        return heapreplace(self.h, x)
    def __getitem__(self, i) -> T:
        return self.h[i]
    def __len__(self) -> int:
        return len(self.h)
    def __str__(self) -> str:
        return str(self.h)
    def __repr__(self) -> str:
        return str(self.h)


class Reverse(Generic[T]):
    '''
    Wrap around the provided object, reversing the comparison operators.
    >>> 1 < 2
    True
    >>> Reverse(1) < Reverse(2)
    False
    >>> Reverse(2) < Reverse(1)
    True
    >>> Reverse(1) <= Reverse(2)
    False
    >>> Reverse(2) <= Reverse(1)
    True
    >>> Reverse(2) <= Reverse(2)
    True
    >>> Reverse(1) == Reverse(1)
    True
    >>> Reverse(2) > Reverse(1)
    False
    >>> Reverse(1) > Reverse(2)
    True
    >>> Reverse(2) >= Reverse(1)
    False
    >>> Reverse(1) >= Reverse(2)
    True
    >>> Reverse(1)
    1
    '''
    def __init__(self, x: T) -> None:
        self.x = x
    def __lt__(self, other: Reverse) -> bool:
        return other.x.__lt__(self.x)
    def __le__(self, other: Reverse) -> bool:
        return other.x.__le__(self.x)
    def __eq__(self, other) -> bool:
        return self.x == other.x
    def __ne__(self, other: Reverse) -> bool:
        return other.x.__ne__(self.x)
    def __ge__(self, other: Reverse) -> bool:
        return other.x.__ge__(self.x)
    def __gt__(self, other: Reverse) -> bool:
        return other.x.__gt__(self.x)
    def __str__(self):
        return str(self.x)
    def __repr__(self):
        return str(self.x)


class MaxHeap(MinHeap):
    '''
    MaxHeap provides an implement of a maximum-heap, as heapq does not provide
    it. As it is a maximum heap, the first element of the heap is always the
    largest. It achieves this by wrapping around elements with Reverse,
    which reverses the comparison operations used by heapq.
    >>> h = MaxHeap([3, 1, 4, 2])
    >>> h[0]
    4
    >>> h.peek()
    4
    >>> h.push(5)  # N.B.: the array isn't always fully sorted.
    [5, 4, 3, 1, 2]
    >>> h.pop()
    5
    >>> h.pop()
    4
    >>> h.pop()
    3
    >>> h.pop()
    2
    >>> h.push(3).push(2).push(4)
    [4, 3, 2, 1]
    >>> h.replace(1)
    4
    >>> h
    [3, 1, 2, 1]
    '''
    def __init__(self, array: Optional[List[T]] = None):
        if array is not None:
            array = [Reverse(x) for x in array]  # Wrap with Reverse.
        super().__init__(array)
    def push(self, x: T) -> MaxHeap:
        super().push(Reverse(x))
        return self
    def peek(self) -> T:
        return super().peek().x
    def pop(self) -> T:
        return super().pop().x
    def replace(self, x: T) -> T:
        return super().replace(Reverse(x)).x


if __name__ == '__main__':
    import doctest
    doctest.testmod()

https://gist.github.com/marccarre/577a55850998da02af3d4b7b98152cf4


Examples related to python

programming a servo thru a barometer Is there a way to view two blocks of code from the same file simultaneously in Sublime Text? python variable NameError Why my regexp for hyphenated words doesn't work? Comparing a variable with a string python not working when redirecting from bash script is it possible to add colors to python output? Get Public URL for File - Google Cloud Storage - App Engine (Python) Real time face detection OpenCV, Python xlrd.biffh.XLRDError: Excel xlsx file; not supported Could not load dynamic library 'cudart64_101.dll' on tensorflow CPU-only installation

Examples related to data-structures

Program to find largest and second largest number in array golang why don't we have a set datastructure How to initialize a vector with fixed length in R C compiling - "undefined reference to"? List of all unique characters in a string? Binary Search Tree - Java Implementation How to clone object in C++ ? Or Is there another solution? How to check queue length in Python Difference between "Complete binary tree", "strict binary tree","full binary Tree"? Write code to convert given number into words (eg 1234 as input should output one thousand two hundred and thirty four)

Examples related to heap

Android Gradle Could not reserve enough space for object heap How to increase application heap size in Eclipse? Increase JVM max heap size for Eclipse Command-line Tool to find Java Heap Size and Memory Used (Linux)? how to increase java heap memory permanently? Finding the median of an unsorted array Find running median from a stream of integers Object creation on the stack/heap? How can building a heap be O(n) time complexity? Why should C++ programmers minimize use of 'new'?

Examples related to recursive-datastructures

What do I use for a max-heap implementation in Python?