Is it possible to walk HDFS using Python? If it's possible, how can I do it? Thanks!
- What...have you...tried? – Marcin, Dec 4, 2012 at 14:42
- Haven't tried anything yet, but I'm looking at code.google.com/p/libpyhdfs. Basically I want to walk the HDFS and add partitions to a HIVE table. – darcyy, Dec 4, 2012 at 14:46
- You will frequently get a more positive response if you make it appear you have applied a little bit of thought or effort to the topic before posting, even if it is just to link to the documentation of the libraries you have been looking at. – Marcin, Dec 4, 2012 at 15:02
2 Answers
hdfs3, which is based on libhdfs3, supports this:
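from hdfs3 import HDFileSystem

# With no arguments, the connection settings come from hdfs3's defaults / local Hadoop config.
hdfs = HDFileSystem()
# walk() has to be iterated to actually see the entries under the path.
for entry in hdfs.walk('/path/to/directory'):
    print(entry)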
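Since the goal mentioned in the comments is adding partitions to a Hive table, here is a rough sketch of how the walk results could be turned into ALTER TABLE statements. It makes several assumptions that are not in the original answer: that the walk yields os.walk-style (root, dirs, files) tuples, that partition directories follow Hive's key=value naming, and that the table name and paths (all hypothetical here) match your layout. It only builds the statements as strings; running them against Hive is left out.

```python
import posixpath


def partition_statements(walk_entries, table_root, table_name):
    """Yield one ALTER TABLE statement per leaf partition directory.

    walk_entries is assumed to be an iterable of os.walk-style
    (root, dirs, files) tuples; table_root and table_name are
    hypothetical values supplied by the caller.
    """
    for root, dirs, files in walk_entries:
        if dirs:
            continue  # not a leaf directory, keep descending
        rel = posixpath.relpath(root, table_root)
        if rel == '.':
            continue  # the table root itself is not a partition
        parts = rel.split('/')
        if not all('=' in part for part in parts):
            continue  # skip directories that are not key=value partitions
        spec = ', '.join(
            "{}='{}'".format(*part.split('=', 1)) for part in parts
        )
        yield "ALTER TABLE {} ADD IF NOT EXISTS PARTITION ({}) LOCATION '{}'".format(
            table_name, spec, root
        )
```

You would pass it something like hdfs.walk('/path/to/table') together with the same path as table_root, then send each statement to Hive with whatever client you use.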
2 Comments
darcyy
In terms of walking the file structure, which one would you recommend?
Faruk Sahin
Well, I haven't really used either of them. But both have a method for listing a directory, so you can use either. pydoop seems to be better documented, though.
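If the library you pick only gives you a directory listing call, a recursive walk can be built on top of it. This is just a sketch: list_dir is a hypothetical callable that returns (name, is_dir) pairs for a path, so you would need to wrap the real listing method of pydoop or libpyhdfs to fit that shape.

```python
import posixpath


def walk(list_dir, top):
    """Yield (root, dirs, files) tuples, similar to os.walk.

    list_dir(path) is a hypothetical callable returning an iterable
    of (name, is_dir) pairs; adapt it to the library you use.
    """
    dirs, files = [], []
    for name, is_dir in list_dir(top):
        (dirs if is_dir else files).append(name)
    yield top, dirs, files
    # Recurse into each subdirectory, top-down.
    for name in dirs:
        yield from walk(list_dir, posixpath.join(top, name))
```

Passing the listing function in as an argument keeps the walk independent of any one HDFS client, so you can try it against a fake directory tree first.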