I agree with @GregoryKlopper that the right way to solve the general problem of finding Waldo (or any object of interest) in an arbitrary image would be to train a supervised machine learning classifier. Using many positive and negative labeled examples, an algorithm such as Support Vector Machine, Boosted Decision Stump or Boltzmann Machine could likely be trained to achieve high accuracy on this problem. Mathematica even includes these algorithms in its Machine Learning Framework.
The two challenges with training a Waldo classifier would be:
A quick Google image search turns up some good data -- I'm going to have a go at collecting some training examples and coding this up right now!
However, even a machine learning approach (or the rule-based approach suggested by @iND) will struggle for an image like the Land of Waldos!