[hadoop] What is the best way to start and stop the Hadoop ecosystem from the command line?

I see there are several ways we can start the Hadoop ecosystem:

  1. start-all.sh & stop-all.sh, which warn that they are deprecated and that start-dfs.sh & start-yarn.sh should be used instead.

  2. start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh

  3. hadoop-daemon.sh namenode/datanode and yarn-daemon.sh resourcemanager
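For reference, here is how each style is typically invoked (a sketch, assuming Hadoop's script directory is on the PATH):

# 1. all-in-one scripts (deprecated)
start-all.sh
stop-all.sh

# 2. per-subsystem scripts: HDFS and YARN separately
start-dfs.sh
start-yarn.sh
stop-yarn.sh
stop-dfs.sh

# 3. per-daemon scripts, run on each node individually
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
yarn-daemon.sh start resourcemanager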

EDIT: I think there have to be some specific use cases for each command.

This question is related to hadoop

The answer is


Starting

start-dfs.sh (starts the namenode and the datanode)
start-mapred.sh (starts the jobtracker and the tasktracker)

Stopping

stop-dfs.sh
stop-mapred.sh
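For example, a typical session on the master node might look like this (a sketch for the Hadoop 1.x layout; HDFS is started before MapReduce, and the two are stopped in the reverse order, mirroring what start-all.sh does internally):

# start HDFS first, then MapReduce
start-dfs.sh
start-mapred.sh

# stop in the reverse order
stop-mapred.sh
stop-dfs.sh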

From the Hadoop documentation:

start-all.sh 

This will start up a Namenode, Datanode, Jobtracker and a Tasktracker on your machine.

start-dfs.sh

This will bring up HDFS with the Namenode running on the machine you ran the command on. On such a machine you would need start-mapred.sh to start the Jobtracker separately.
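You can check which daemons actually came up with the JDK's jps tool (the class names below are what a Hadoop 1.x pseudo-distributed setup would typically show; the process IDs are illustrative):

jps
# expected output includes lines such as:
# 12345 NameNode
# 12346 DataNode
# 12347 SecondaryNameNode
# 12348 JobTracker
# 12349 TaskTracker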

start-all.sh/stop-all.sh have to be run on the master node.

You would use start-all.sh on a single-node cluster (i.e. where all the services run on the same node; the namenode is also the datanode, and it is the master node).

In a multi-node setup,

You would run start-all.sh on the master node, and it would start what is necessary on the slaves as well.

Alternatively,

Use start-dfs.sh on the node you want the Namenode to run on. This will bring up HDFS with the Namenode running on the machine you ran the command on and Datanodes on the machines listed in the slaves file.

Use start-mapred.sh on the machine you plan to run the Jobtracker on. This will bring up the Map/Reduce cluster with Jobtracker running on the machine you ran the command on and Tasktrackers running on machines listed in the slaves file.
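The slaves file referred to above is a plain list of worker hostnames, one per line, that the start scripts read on the master (the hostnames below are hypothetical):

# conf/slaves on the master node
slave1.example.com
slave2.example.com

# then, on the machine that should run the Namenode:
start-dfs.sh
# and on the machine that should run the Jobtracker:
start-mapred.sh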

hadoop-daemon.sh, as stated by Tariq, is used on each individual node. The master node will not start the services on the slaves. In a single-node setup this acts the same as start-all.sh. In a multi-node setup you will have to access each node (master as well as slaves) and run it on each of them.
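As a sketch, starting a Hadoop 1.x multi-node cluster daemon by daemon would look like this, running each command on the node named in the comment (in a YARN setup the resourcemanager and nodemanager would be started with yarn-daemon.sh instead):

# on the master node
hadoop-daemon.sh start namenode
hadoop-daemon.sh start jobtracker

# on each slave node
hadoop-daemon.sh start datanode
hadoop-daemon.sh start tasktracker

# stopping uses the same scripts, e.g. on a slave:
hadoop-daemon.sh stop tasktracker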

Have a look at the start-all.sh script: it calls the config script, followed by the dfs and mapred start scripts.
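Roughly, the Hadoop 1.x start-all.sh boils down to the following (a simplified sketch, not the verbatim script):

bin=`dirname "$0"`                                 # directory the script lives in
. "$bin"/hadoop-config.sh                          # load the shared configuration
"$bin"/start-dfs.sh --config $HADOOP_CONF_DIR      # start the HDFS daemons
"$bin"/start-mapred.sh --config $HADOOP_CONF_DIR   # start the MapReduce daemons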