I am new to Hadoop and Python, and I am trying to write a mapper for logs in Common Log Format:

[training@localhost code]$ ls 
access_log  mapper2.py  mylocalfile.txt
reducer2.py  reducer4.py  testfile
cat    mapper.py   practice.py   reducer3.py  reducer.py
[training@localhost code]$ vim reducer4.py
[training@localhost code]$ hs mapper2.py reducer2.py access_log output2
packageJobJar: [mapper2.py, reducer2.py, /tmp/hadoop-training/hadoop-unjar2368120810978008335/] [] /tmp/streamjob7636411608115265060.jar tmpDir=null
17/01/05 05:50:32 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
17/01/05 05:50:32 WARN snappy.LoadSnappy: Snappy native library is available
17/01/05 05:50:32 INFO snappy.LoadSnappy: Snappy native library loaded
17/01/05 05:50:32 INFO mapred.JobClient: Cleaning up the staging area hdfs://0.0.0.0:8020/var/lib/hadoop-hdfs/cache/mapred/mapred/staging/training/.staging/job_201701041500_0009
17/01/05 05:50:32 ERROR security.UserGroupInformation: PriviledgedActionException as:training (auth:SIMPLE) cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://0.0.0.0:8020/user/training/access_log
17/01/05 05:50:32 ERROR streaming.StreamJob: Error Launching job : Input path does not exist: hdfs://0.0.0.0:8020/user/training/access_log
Streaming Command Failed!

I don't understand why it can't find the file. What am I doing wrong?
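
One thing the log above suggests checking: `ls` shows `access_log` in the local working directory, while the job looks for it at `hdfs://0.0.0.0:8020/user/training/access_log`, i.e. under the user's HDFS home directory. The snippet below is only a sketch of that path resolution, under the assumption that a relative input path is resolved against `/user/<user>` on HDFS. The helper names `resolve_hdfs_path` and `upload_command` are hypothetical, and the NameNode address is copied from the error message. It only builds the `hadoop fs -put` command as a list and does not run it.

```python
import posixpath


def resolve_hdfs_path(path, user, namenode="hdfs://0.0.0.0:8020"):
    """Hypothetical helper: show where a job input path would point on HDFS.

    Assumption: a relative path is resolved against the user's HDFS home
    directory, /user/<user>, which matches the path in the error message.
    """
    if not path.startswith("/"):
        path = posixpath.join("/user", user, path)
    return namenode + path


def upload_command(local_file):
    """Hypothetical helper: build (not run) a command that copies a local
    file into HDFS under the same relative name."""
    return ["hadoop", "fs", "-put", local_file, local_file]


if __name__ == "__main__":
    # Where the streaming job looked for the input
    print(resolve_hdfs_path("access_log", "training"))
    # A command that could place the local file there
    print(" ".join(upload_command("access_log")))
```

If the file turns out to be missing from HDFS (something `hadoop fs -ls` should show), running the printed `hadoop fs -put` command from the directory that contains `access_log` and then re-running the job would be the next thing to try.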
