do wget -r -l0 www.oldsite.com
Then just find www.oldsite.com
would reveal all urls, I believe.
Alternatively, just serve that custom not-found page on every 404 request! I.e. if someone used the wrong link, he would get the page telling that page wasn't found, and making some hints about site's content.