2

I am new to Hadoop. I have written a word count program that takes a single input file and writes a single output file. Now I want to take two files as input and write the output to a single file. I tried this:

FileInputFormat.setInputPaths(conf, new Path(args[0]), new Path(args[1]));
FileOutputFormat.setOutputPath(conf, new Path(args[2]));

This is the command in terminal:

hadoop jar test.jar Driver /user/in.txt /user/sample.txt /user/out

When I run this, it takes sample.txt as the output directory and fails with:

Output directory hdfs://localhost:9000/user/sample.txt already exists

Can anyone help me with this?

2
  • Just wanted to know how you built the jar. Is it a normal jar or a runnable jar? If it is a runnable jar, you don't have to mention the driver class name. Commented May 21, 2015 at 6:25
  • It was a runnable jar, so I removed Driver and it worked. Thanks Aman. Commented May 21, 2015 at 11:25

2 Answers

2

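Maybe it is because Driver is being taken as your first argument. With a runnable jar, the main class comes from the jar's manifest, so everything after the jar name is passed to main as an argument, and args[2] ends up being /user/sample.txt. Try it like this: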

hadoop jar test.jar /user/in.txt /user/sample.txt /user/out
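To make the index shift concrete, here is a minimal plain-Java sketch (no Hadoop dependency). The class name ArgShiftDemo and the helper outputPath are hypothetical, invented for illustration; the helper only mimics how the question's driver reads args[2] as the output path, and adds an argument-count check that would have surfaced the problem earlier.

```java
import java.util.Arrays;

public class ArgShiftDemo {
    // Hypothetical helper: mirrors the question's driver, where args[0] and
    // args[1] are inputs and args[2] is the output, but rejects extra arguments.
    public static String outputPath(String[] args) {
        if (args.length != 3) {
            throw new IllegalArgumentException(
                "Expected <input1> <input2> <output>, got " + Arrays.toString(args));
        }
        return args[2];
    }

    public static void main(String[] args) {
        // With a runnable jar, "Driver" is passed through as an ordinary argument.
        String[] withDriver = {"Driver", "/user/in.txt", "/user/sample.txt", "/user/out"};
        // Unchecked code that reads args[2] sees the second input file.
        System.out.println(withDriver[2]); // prints /user/sample.txt

        // Without the class name, the indices line up as intended.
        String[] withoutDriver = {"/user/in.txt", "/user/sample.txt", "/user/out"};
        System.out.println(outputPath(withoutDriver)); // prints /user/out
    }
}
```

A length check like this in the real driver would turn the confusing "Output directory ... already exists" message into an explicit usage error.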

1

If all the input files are in one folder, as you mentioned (/user), then replace

hadoop jar test.jar Driver /user/in.txt /user/sample.txt /user/out

with this

hadoop jar test.jar Driver /user /user/out

This takes all the files inside the /user directory as input and writes the output to the /user/out folder in HDFS.
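If you would rather keep listing files explicitly, one option is to treat the last argument as the output and everything before it as input, so the driver works with any number of input paths. The sketch below is plain Java with hypothetical names (PathArgs, inputs, output); it is not taken from the original driver. As far as I know, the comma-separated string it builds is the form accepted by the String overload of FileInputFormat.setInputPaths, but check that against the Hadoop version you are running.

```java
import java.util.Arrays;

public class PathArgs {
    // Hypothetical helper: every argument except the last, joined with commas.
    public static String inputs(String[] args) {
        requireAtLeastTwo(args);
        return String.join(",", Arrays.copyOfRange(args, 0, args.length - 1));
    }

    // Hypothetical helper: the last argument is the output directory.
    public static String output(String[] args) {
        requireAtLeastTwo(args);
        return args[args.length - 1];
    }

    private static void requireAtLeastTwo(String[] args) {
        if (args.length < 2) {
            throw new IllegalArgumentException("Usage: <input>... <output>");
        }
    }

    public static void main(String[] args) {
        String[] example = {"/user/in.txt", "/user/sample.txt", "/user/out"};
        System.out.println(inputs(example)); // prints /user/in.txt,/user/sample.txt
        System.out.println(output(example)); // prints /user/out
    }
}
```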
