[hadoop] Difference between Pig and Hive? Why have both?

I believe that the real answer to your question is that they are/were independent projects and there was no centrally coordinated goal. They were in different spaces early on and have grown to overlap with time as both projects expand.

Paraphrased from the Hadoop O'Reilly book:

Pig: a dataflow language and environment for exploring very large datasets.

Hive: a distributed data warehouse