
I'm trying to run the example found here: http://thysmichels.com/2012/01/31/java-based-hdfs-api-tutorial/

But when I compile the Java program, I get errors saying the packages don't exist, e.g.:

error: package org.apache.hadoop.conf does not exist
import org.apache.hadoop.conf.Configuration;

Hadoop 1.0.4 is installed and works fine. Every tutorial I've looked at for working with HDFS just starts with a program like the one in the link above and doesn't mention any special prerequisites. So what do I need to do to make this compile? I assume I need to edit my classpath to point to the appropriate packages, but I don't know where those are located.

Also I'm running Ubuntu 12.04, Hadoop 1.0.4 on a single node cluster following the instructions here: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/

3 Comments
  • How are you compiling the program? (Ant, Maven, IDE, command line?) You'll get this error if the hadoop-core jar isn't on the classpath. Commented Dec 3, 2012 at 1:49
  • And can you post the exact command line you are using to compile the code (edit your question rather than posting it as a comment)? Commented Dec 3, 2012 at 11:37
  • I'm just gonna put it as a comment since I don't think you'll be notified if I just do an edit: javac HDFSExample.java Commented Dec 4, 2012 at 0:13

1 Answer


I'd suggest you brush up on some Java compilation basics.

You need to do more than just javac HDFSExample.java: you also need to include the dependency jars on the classpath, something more like javac -cp hadoop-core-1.0.4.jar HDFSExample.java
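As a concrete sketch (assuming Hadoop is installed at /usr/local/hadoop, the path used in the Michael Noll tutorial linked in the question; adjust to your layout):

    #> javac -cp /usr/local/hadoop/hadoop-core-1.0.4.jar HDFSExample.java
    #> java -cp .:/usr/local/hadoop/hadoop-core-1.0.4.jar:/usr/local/hadoop/lib/* HDFSExample

The second command matters because hadoop-core itself depends on the jars shipped in Hadoop's lib/ directory (commons-logging among them), so those need to be on the classpath at runtime as well.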

Personally, I'd recommend looking into a build tool (such as Maven or Ant) or an IDE, as this will make things far less painful once you start organizing your code into packages and depending on multiple external libraries.

EDIT: For example, the Maven configuration is as simple as this (OK, I'm not including some of the other boilerplate pom declarations):

<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.0.4</version>
        <scope>provided</scope>
    </dependency>
</dependencies>

Then to compile and package into a jar:

#> mvn package
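
The jar ends up under target/ by default. To run it against your single-node cluster you can then use Hadoop's own launcher, which puts all of Hadoop's jars on the classpath for you (the jar name and main class here are placeholders for whatever your pom and code define):

    #> bin/hadoop jar target/hdfs-example-1.0.jar HDFSExample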

2 Comments

Well, actually I just changed my classpath to include the hadoop-core jar instead of doing it on the command line. The reason I asked the question was so you would tell me which jar actually held those specific packages for the import, not because I don't know how to link to external dependencies. Now the problem is that when it runs I get an error saying: "Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/commons/logging/LogFactory". Again, I'm assuming this is a classpath issue, so if you happen to know which jar contains LogFactory it would be appreciated.
commons-logging. But seriously, look into Maven: you just specify the hadoop-core dependency and it will figure out the other things that Hadoop depends on.
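
For example, with the hadoop-core dependency from the answer declared in your pom, you can list everything Maven resolves transitively (commons-logging included) with:

    #> mvn dependency:tree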
