
I'm trying to get Hadoop and Hive to run locally on my Linux system, but when I run jps, I notice that the DataNode service is missing:

    vaughn@vaughn-notebook:/usr/local/hadoop$ jps
    2209 NameNode
    2682 ResourceManager
    3084 Jps
    2510 SecondaryNameNode

If I run bin/hadoop datanode, the following error occurs:

    17/07/13 19:40:14 INFO datanode.DataNode: registered UNIX signal handlers for [TERM, HUP, INT]
    17/07/13 19:40:14 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    17/07/13 19:40:15 WARN datanode.DataNode: Invalid dfs.datanode.data.dir /home/cloudera/hdata/dfs/data : 
    ExitCodeException exitCode=1: chmod: changing permissions of '/home/cloudera/hdata/dfs/data': Operation not permitted

        at org.apache.hadoop.util.Shell.runCommand(Shell.java:559)
        at org.apache.hadoop.util.Shell.run(Shell.java:476)
        at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:723)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:812)
        at org.apache.hadoop.util.Shell.execCommand(Shell.java:795)
        at org.apache.hadoop.fs.RawLocalFileSystem.setPermission(RawLocalFileSystem.java:646)
        at org.apache.hadoop.fs.FilterFileSystem.setPermission(FilterFileSystem.java:479)
        at org.apache.hadoop.util.DiskChecker.mkdirsWithExistsAndPermissionCheck(DiskChecker.java:140)
        at org.apache.hadoop.util.DiskChecker.checkDir(DiskChecker.java:156)
        at org.apache.hadoop.hdfs.server.datanode.DataNode$DataNodeDiskChecker.checkDir(DataNode.java:2285)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2327)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2309)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2201)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2248)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2424)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2448)
    17/07/13 19:40:15 FATAL datanode.DataNode: Exception in secureMain
    java.io.IOException: All directories in dfs.datanode.data.dir are invalid: "/home/cloudera/hdata/dfs/data/" 
        at org.apache.hadoop.hdfs.server.datanode.DataNode.checkStorageLocations(DataNode.java:2336)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:2309)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:2201)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:2248)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:2424)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:2448)
    17/07/13 19:40:15 INFO util.ExitUtil: Exiting with status 1
    17/07/13 19:40:15 INFO datanode.DataNode: SHUTDOWN_MSG: 
    /************************************************************
    SHUTDOWN_MSG: Shutting down DataNode at vaughn-notebook/127.0.1.1
    ************************************************************/

That directory seems unusual, but I don't think there's anything technically wrong with it. Here are the permissions on the directory:

    vaughn@vaughn-notebook:/usr/local/hadoop$ ls -ld /home/cloudera/hdata/dfs/data
    drwxrwxrwx 2 root root 4096 Jul 13 19:14 /home/cloudera/hdata/dfs/data

I also removed everything in the tmp folder and formatted the HDFS NameNode. Here is my hdfs-site.xml file:

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/cloudera/hdata/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/cloudera/hdata/dfs/data</value>
  </property>
</configuration>

And my core-site.xml file:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/cloudera/hdata</value>
  </property>
</configuration>

In my googling, I've seen some suggest running "sudo chown hduser:hadoop -R /usr/local/hadoop_store", but when I do that I get the error "chown: invalid user: ‘hduser:hadoop’". Do I have to create this user and group? I'm not really familiar with the process. Thanks in advance for any assistance.

1 Comment

  • What's the user account you are using to start/stop the Hadoop services? (Commented Jul 14, 2017 at 0:53)

4 Answers


1. sudo chown vaughn:hadoop -R /usr/local/hadoop_store

   where hadoop is the group name. Use

   grep vaughn /etc/group

   in your terminal to see your group name.

2. Clean the temporary directories.

3. Format the NameNode.

Hope this helps.
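The steps above can be sketched as a small pre-flight check plus the fix itself. This is only an illustration: the data directory path comes from the question, and `check_data_dir` is a made-up helper name, not anything Hadoop ships.

```shell
#!/bin/sh
# The DataNode calls chmod on dfs.datanode.data.dir at startup, so the
# directory must be *owned* by the account that runs Hadoop -- mode 777
# alone is not enough (chmod by a non-owner fails, as in the question).
check_data_dir() {
    dir=$1
    # stat -c %U prints the owning user (GNU coreutils)
    owner=$(stat -c %U "$dir" 2>/dev/null)
    if [ "$owner" = "$(id -un)" ] && [ -w "$dir" ]; then
        echo "OK: $dir is owned and writable by $owner"
    else
        echo "FIX: sudo chown -R $(id -un) $dir"
    fi
}

check_data_dir /home/cloudera/hdata/dfs/data
```

If the check prints FIX, run the suggested chown, clean the temporary directories, re-run `hdfs namenode -format`, and start HDFS again with `start-dfs.sh`.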


1 Comment

Thanks, this did it!

Looks like a permission issue: the user that starts the DataNode must have write access to the directories listed in dfs.datanode.data.dir.

Try executing the command below before starting the DataNode service:

sudo chmod -R 777 /home/cloudera/hdata/dfs

You can also update the owner and group using the chown command; that's the better option.

Edit

If the DataNode still fails to start, update the file ownership using the command below before starting it again:

sudo chown -R vaughn:root /home/cloudera/hdata/dfs
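To see whether the ownership change actually took effect before retrying, you can inspect the directory; the path is the one from the question, and `stat -c` is the GNU coreutils form.

```shell
# Show owner, group, and mode of the DataNode data directory.
# The owner should be the account that runs start-dfs.sh, not root.
stat -c 'owner=%U group=%G mode=%a' /home/cloudera/hdata/dfs/data \
  || echo "directory does not exist yet"
```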

1 Comment

Thanks for your replies, sachin. I did already set the permissions on the dfs folder to 777; I think I forgot to mention that. I tried it again just to be sure, and the DataNode still does not start. The account I use to start the services is called 'vaughn', but I'm not sure how I would use chown in this context.

1. sudo chown -R vaughn:hadoop /usr/local/hadoop_store

2. Delete the datanode and namenode directories in hadoop_store.

3. stop-dfs.sh and stop-yarn.sh

4. hdfs namenode -format

5. start-dfs.sh and start-yarn.sh

Hope it'll help.



One more possible reason, which was the case for me: the HDFS directory path shown in the folder properties contained my user name twice, i.e. home/hadoop/hadoop/hdfs, and I had added that same doubled path in hdfs-site.xml. As a fix, I removed the extra hadoop/ segment, changed the value to home/hadoop/hdfs, and this resolved my problem.
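A quick way to catch a doubled path segment like this is to print the value actually configured and compare it with what exists on disk. A minimal sketch, assuming the /usr/local/hadoop layout from the question and that the `<value>` sits on the line after the `<name>` line; if the `hdfs` command is on your PATH, `hdfs getconf -confKey dfs.datanode.data.dir` does the same job.

```shell
# Extract the <value> that follows the dfs.datanode.data.dir <name> line.
conf=/usr/local/hadoop/etc/hadoop/hdfs-site.xml
[ -f "$conf" ] \
  && sed -n '/dfs.datanode.data.dir/{n;s/.*<value>\(.*\)<\/value>.*/\1/p;}' "$conf" \
  || echo "hdfs-site.xml not found at $conf"
```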
