My program seems to suffer from linear access to dictionaries, its run-time grows exponentially even though the algorithm is quadratic.
I use a dictionary to memoize values. That seems to be a bottleneck.
This is evidence of a bug in your memoization method.