
Is it possible to walk HDFS using Python? If it's possible, how can I do it? Thanks!

  • What...have you...tried? Commented Dec 4, 2012 at 14:42
  • Haven't tried anything yet, but I'm looking at code.google.com/p/libpyhdfs. Basically, I want to walk HDFS and add partitions to a Hive table. Commented Dec 4, 2012 at 14:46
  • You will frequently get a more positive response if you make it appear you have applied a little bit of thought or effort to the topic before posting, even if it is just to link to the documentation of the libraries you have been looking at. Commented Dec 4, 2012 at 15:02

2 Answers


hdfs3, which is based on libhdfs3, supports this:

from hdfs3 import HDFileSystem
hdfs = HDFileSystem()
# Iterate over the result of walk() so each entry can be inspected.
for entry in hdfs.walk('/path/to/directory'):
    print(entry)
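
Since the goal mentioned in the comments is to add Hive partitions, here is a minimal sketch of how walk output might be turned into ALTER TABLE statements. It assumes things the question does not state: that the directories follow a key=value layout (for example dt=2012-12-04), and that the walk yields os.walk-style (dirpath, dirnames, filenames) tuples. The function name hive_partition_statements, the table name and the paths are all made up for illustration.

```python
import posixpath


def hive_partition_statements(walk_iter, table, root):
    """Yield one ALTER TABLE statement per leaf directory under `root`.

    `walk_iter` is any iterable of (dirpath, dirnames, filenames) tuples,
    the shape produced by os.walk. Assumption: partition directories are
    named key=value.
    """
    root = root.rstrip('/') or '/'
    for dirpath, dirnames, filenames in walk_iter:
        # Treat only leaf directories that contain files as partitions.
        if dirnames or not filenames:
            continue
        relative = posixpath.relpath(dirpath, root)
        # Keep only path components that look like key=value.
        parts = [p.split('=', 1) for p in relative.split('/') if '=' in p]
        if not parts:
            continue
        spec = ', '.join("{}='{}'".format(k, v) for k, v in parts)
        yield ("ALTER TABLE {} ADD IF NOT EXISTS PARTITION ({}) "
               "LOCATION '{}'").format(table, spec, dirpath)


# Stand-in data shaped like a walk result; swap in a real walk of HDFS.
example_walk = [
    ('/data/events', ['dt=2012-12-04'], []),
    ('/data/events/dt=2012-12-04', [], ['part-00000']),
]
for statement in hive_partition_statements(example_walk, 'events', '/data/events'):
    print(statement)
```

If your version of hdfs3 yields tuples of that shape from walk(), you could pass hdfs.walk('/data/events') in place of example_walk. Check what your installed version returns first, since the function relies on that shape.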

There are some libraries you can take a look at:

2 Comments

In terms of walking the file structure, which one would you recommend?
Well, I haven't really used either of them. But both have a directory-listing method, so you can use either one. pydoop seems to be better documented, though.
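
To illustrate the point about listing methods: if a library only gives you a way to list one directory, a walk can be built on top of it. This is a rough sketch, not any library's API. list_dir is a hypothetical callable that you would write as a thin wrapper around your library's listing call, returning (name, is_directory) pairs for a given path.

```python
def walk_with_listing(list_dir, top):
    """Yield (dirpath, dirnames, filenames) tuples, like os.walk.

    `list_dir` is a hypothetical callable: given a directory path, it
    returns an iterable of (name, is_directory) pairs.
    """
    stack = [top]
    while stack:
        current = stack.pop()
        dirnames, filenames = [], []
        for name, is_directory in list_dir(current):
            (dirnames if is_directory else filenames).append(name)
        yield current, dirnames, filenames
        # Push children in reverse so they are visited in listing order.
        for name in reversed(dirnames):
            stack.append(current.rstrip('/') + '/' + name)


# In-memory stand-in for a file system, used only to show the call shape.
fake_tree = {
    '/a': [('b', True), ('f1', False)],
    '/a/b': [('f2', False)],
}
for entry in walk_with_listing(lambda path: fake_tree[path], '/a'):
    print(entry)
```

An explicit stack is used instead of recursion so that very deep directory trees do not hit Python's recursion limit.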
