This is a pretty old question, but since I've dealt with both recently here are my 2c:
Merge sort needs about N log N comparisons on average. For already (or almost) sorted arrays this drops to about 1/2 N log N, since while merging we (almost) always pick from the "left" part N/2 times and then just copy the remaining N/2 elements of the right part. Additionally, I can speculate that already sorted input makes the processor's branch predictor shine by guessing almost all branches correctly, thus preventing pipeline stalls.
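To make the 1/2 N log N figure concrete, here is a minimal sketch of a top-down merge sort that counts element comparisons. The names `merge_sort` and `count_merge_comparisons` are mine, made up for illustration, and this is not the implementation I benchmarked. It assumes log base 2 and N a power of two, so that each level of merging costs exactly N/2 comparisons on sorted input.

```python
import random


def merge_sort(items, counter):
    """Return a sorted copy of items; counter[0] accumulates comparisons."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid], counter)
    right = merge_sort(items[mid:], counter)

    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        counter[0] += 1  # one element comparison per loop iteration
        if right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    # One side is exhausted: copy the rest without any comparisons.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


def count_merge_comparisons(items):
    """Hypothetical helper: return (sorted copy, number of comparisons)."""
    counter = [0]
    result = merge_sort(items, counter)
    return result, counter[0]


if __name__ == "__main__":
    n = 1024  # power of two, so log2(n) == 10
    _, sorted_cost = count_merge_comparisons(list(range(n)))
    shuffled = list(range(n))
    random.shuffle(shuffled)
    _, random_cost = count_merge_comparisons(shuffled)
    print("sorted input:", sorted_cost)
    print("random input:", random_cost)
```

On sorted input every merge exhausts the left half first, so the count comes out at N/2 per level. On shuffled input it lands much closer to N log N.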
Quick sort requires about 1.38 N log N comparisons on average. It does not benefit greatly from an already sorted array in terms of comparisons (however, it does in terms of swaps and probably in terms of branch prediction inside the CPU).
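For comparison, here is a rough sketch of a quicksort with a random pivot that counts comparisons the same way. The function name `quicksort_comparisons`, the Lomuto-style partition and the explicit stack are my choices for illustration, not what any libc does. The 1.38 constant is asymptotic (2 ln 2 is about 1.386, with log base 2), so for small N the measured ratio will come out lower because of lower-order terms.

```python
import random


def quicksort_comparisons(items, rng):
    """Return (sorted copy, number of element comparisons).

    Sketch only: random pivot, Lomuto-style partition, explicit stack
    instead of recursion. rng is a random.Random instance.
    """
    data = list(items)
    comparisons = 0
    stack = [(0, len(data) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        # Pick a random pivot and move it to the end of the range.
        p = rng.randint(lo, hi)
        data[p], data[hi] = data[hi], data[p]
        pivot = data[hi]
        store = lo
        for k in range(lo, hi):
            comparisons += 1  # one comparison against the pivot
            if data[k] < pivot:
                data[store], data[k] = data[k], data[store]
                store += 1
        # Put the pivot into its final position.
        data[store], data[hi] = data[hi], data[store]
        stack.append((lo, store - 1))
        stack.append((store + 1, hi))
    return data, comparisons


if __name__ == "__main__":
    n = 1024
    rng = random.Random()
    _, sorted_cost = quicksort_comparisons(list(range(n)), rng)
    shuffled = list(range(n))
    rng.shuffle(shuffled)
    _, random_cost = quicksort_comparisons(shuffled, rng)
    print("sorted input:", sorted_cost)
    print("random input:", random_cost)
```

With a random pivot the count for sorted input is in the same ballpark as for shuffled input, which is the "does not benefit greatly" part. Note that this partition scheme does perform fewer swaps that actually move data when the input is already sorted.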
My benchmarks on a fairly modern processor show the following:
When the comparison function is a callback (as in the libc qsort() implementation), quicksort is slower than mergesort by 15% on random input and by 30% on an already sorted array of 64-bit integers.
On the other hand, if the comparison is not a callback, my experience is that quicksort outperforms mergesort by up to 25%.
However, if your (large) array has very few unique values, merge sort starts gaining over quicksort in either case.
So maybe the bottom line is: if the comparison is expensive (e.g. a callback function, comparing strings, or comparing many parts of a structure where you mostly get to the second, third or fourth "if" before finding a difference), chances are that you will be better off with merge sort. For simpler tasks quicksort will be faster.
That said, everything said previously still holds:
- Quicksort can degrade to N^2, but Sedgewick claims that with a good randomized implementation, the computer performing the sort is more likely to be struck by lightning than the sort is to go N^2.
- Mergesort requires extra space.