[hadoop] Setting the number of map tasks and reduce tasks

  • Use -D property=value rather than -D property = value (no whitespace around the equals sign). For example, -D mapred.reduce.tasks=value works fine.

  • Setting the number of map tasks doesn't always take effect: the value is only a hint, and the actual number of maps depends on the split size and the InputFormat used.

  • Setting the number of reduce tasks, on the other hand, will definitely override the number of reduces set in the cluster or client-side configuration.
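
To see why the map count is only a hint, here is a rough standalone sketch of the split arithmetic for one splittable file. It is modelled on the old-API (org.apache.hadoop.mapred) FileInputFormat rule, splitSize = max(minSize, min(goalSize, blockSize)), where goalSize is the file size divided by the requested number of maps. The class name SplitEstimate and its method names are made up for illustration and are not part of Hadoop, and the 1 GiB file and 128 MiB block size are assumed example values. Other InputFormats, and the newer mapreduce API, compute splits differently.

```java
// Standalone sketch, no Hadoop dependency. SplitEstimate and its methods are
// hypothetical names; they only imitate the old-API FileInputFormat arithmetic.
class SplitEstimate {

    // The last split is allowed to be up to 10% larger than the split size.
    private static final double SPLIT_SLOP = 1.1;

    // splitSize = max(minSize, min(goalSize, blockSize))
    public static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    // Estimates the number of splits (and so map tasks) for one non-empty file.
    public static long estimateSplits(long fileSize, long requestedMaps,
                                      long minSize, long blockSize) {
        if (fileSize <= 0) {
            throw new IllegalArgumentException("sketch assumes a non-empty file");
        }
        // The requested map count only influences the goal size.
        long goalSize = fileSize / Math.max(requestedMaps, 1);
        // Clamp the minimum to 1 byte so the split size can never be zero.
        long splitSize = computeSplitSize(goalSize, Math.max(minSize, 1), blockSize);

        long splits = 0;
        long remaining = fileSize;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++; // the tail becomes the final split
        }
        return splits;
    }

    public static void main(String[] args) {
        long MiB = 1024L * 1024L;
        long fileSize = 1024 * MiB;   // assumed example: 1 GiB input file
        long blockSize = 128 * MiB;   // assumed example: 128 MiB blocks

        // Asking for 2 maps still gives 8 splits: the block size caps the split size.
        System.out.println(estimateSplits(fileSize, 2, 1, blockSize));
        // Asking for 16 maps gives 16 splits: the goal size (64 MiB) is below the block size.
        System.out.println(estimateSplits(fileSize, 16, 1, blockSize));
        // Raising the minimum split size to 256 MiB gives 4 splits.
        System.out.println(estimateSplits(fileSize, 2, 256 * MiB, blockSize));
    }
}
```

So under these assumptions, asking for fewer maps than there are blocks changes nothing, while asking for more maps or raising the minimum split size does change the count. The reduce count involves no such calculation, which is why that setting is simply taken as given.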