[hadoop] What is the best way to start and stop the Hadoop ecosystem from the command line?

I see there are several ways we can start the Hadoop ecosystem:

  1. start-all.sh & stop-all.sh, which warn that they are deprecated and that start-dfs.sh & start-yarn.sh should be used instead.

  2. start-dfs.sh, stop-dfs.sh and start-yarn.sh, stop-yarn.sh

  3. hadoop-daemon.sh namenode/datanode and yarn-daemon.sh resourcemanager
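For reference, here is how each style is typically invoked (a sketch, assuming Hadoop's script directory is on the PATH):

# 1. all-in-one scripts (deprecated)
start-all.sh
stop-all.sh

# 2. per-subsystem scripts: HDFS and YARN separately
start-dfs.sh
start-yarn.sh
stop-yarn.sh
stop-dfs.sh

# 3. per-daemon scripts, run on each node individually
hadoop-daemon.sh start namenode
hadoop-daemon.sh start datanode
yarn-daemon.sh start resourcemanager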

EDIT: I think there have to be some specific use cases for each command.

This question is related to hadoop

The answer is


Starting

start-dfs.sh (starts the namenode and the datanode)
start-mapred.sh (starts the jobtracker and the tasktracker)

Stopping

stop-dfs.sh
stop-mapred.sh
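For example, a typical session on the master node might look like this (a sketch for the Hadoop 1.x layout; HDFS is started before MapReduce, and the two are stopped in the reverse order, mirroring what start-all.sh does internally):

# start HDFS first, then MapReduce
start-dfs.sh
start-mapred.sh

# stop in the reverse order
stop-mapred.sh
stop-dfs.sh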

From the Hadoop documentation:

start-all.sh 

This will start up a Namenode, Datanode, Jobtracker and a Tasktracker on your machine.

start-dfs.sh

This will bring up HDFS with the Namenode running on the machine you ran the command on. On such a machine you would need start-mapred.sh to start the Jobtracker separately.
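You can check which daemons actually came up with the JDK's jps tool (the class names below are what a Hadoop 1.x pseudo-distributed setup would typically show; the process IDs are illustrative):

jps
# expected output includes lines such as:
# 12345 NameNode
# 12346 DataNode
# 12347 SecondaryNameNode
# 12348 JobTracker
# 12349 TaskTracker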

start-all.sh/stop-all.sh have to be run on the master node.

You would use start-all.sh on a single-node cluster (i.e. where all the services run on the same node; the namenode is also the datanode, and it is the master node).

In a multi-node setup,

You would run start-all.sh on the master node, and it would start what is necessary on the slaves as well.

Alternatively,

Use start-dfs.sh on the node you want the Namenode to run on. This will bring up HDFS with the Namenode running on the machine you ran the command on and Datanodes on the machines listed in the slaves file.

Use start-mapred.sh on the machine you plan to run the Jobtracker on. This will bring up the Map/Reduce cluster with Jobtracker running on the machine you ran the command on and Tasktrackers running on machines listed in the slaves file.
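The slaves file referred to above is a plain list of worker hostnames, one per line, that the start scripts read on the master (the hostnames below are hypothetical):

# conf/slaves on the master node
slave1.example.com
slave2.example.com

# then, on the machine that should run the Namenode:
start-dfs.sh
# and on the machine that should run the Jobtracker:
start-mapred.sh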

hadoop-daemon.sh, as stated by Tariq, is used on each individual node. The master node will not start the services on the slaves. In a single-node setup this acts the same as start-all.sh. In a multi-node setup you will have to access each node (master as well as slaves) and run it on each of them.
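As a sketch, starting a Hadoop 1.x multi-node cluster daemon by daemon would look like this, running each command on the node named in the comment (in a YARN setup the resourcemanager and nodemanager would be started with yarn-daemon.sh instead):

# on the master node
hadoop-daemon.sh start namenode
hadoop-daemon.sh start jobtracker

# on each slave node
hadoop-daemon.sh start datanode
hadoop-daemon.sh start tasktracker

# stopping uses the same scripts, e.g. on a slave:
hadoop-daemon.sh stop tasktracker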

Have a look at the start-all.sh script: it calls the config script, followed by the dfs and mapred start scripts.
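Roughly, the Hadoop 1.x start-all.sh boils down to the following (a simplified sketch, not the verbatim script):

bin=`dirname "$0"`                                 # directory the script lives in
. "$bin"/hadoop-config.sh                          # load the shared configuration
"$bin"/start-dfs.sh --config $HADOOP_CONF_DIR      # start the HDFS daemons
"$bin"/start-mapred.sh --config $HADOOP_CONF_DIR   # start the MapReduce daemons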