
I want to fetch stocks' end-of-day (EOD) prices daily from Yahoo/Google Finance. These prices should be stored directly in a file on HDFS.

I can later create an external table on top of it (using Hive) and use it for further analysis.

So I am not looking for a basic MapReduce job, since I don't have any input file as such. Are there any connectors available in Python that can write data to Hadoop?

1 Answer

Start by dumping your data into a local file, then find a way to upload that file to HDFS.
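For the first step, a minimal sketch of dumping rows to a local file with the standard `csv` module — the ticker symbols and prices below are made-up placeholders standing in for whatever you actually fetched from Yahoo/Google Finance:

```python
import csv

# Hypothetical EOD rows (date, symbol, close) fetched earlier
# from your data source -- replace with your real data.
rows = [
    ("2016-01-04", "AAPL", 105.35),
    ("2016-01-04", "GOOG", 741.84),
]

# Dump the data to a local file first; upload it to HDFS afterwards.
with open("data.txt", "w", newline="") as f:
    writer = csv.writer(f)
    for row in rows:
        writer.writerow(row)
```

A plain delimited text file like this is also exactly what a Hive external table can sit on later.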

  • If you are running your job on an "edge node" (i.e. a Linux box that is not part of the cluster but has all the Hadoop clients installed and configured), then you have the good old HDFS command-line interface:

hdfs dfs -put data.txt /user/johndoe/some/hdfs/dir/
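Since the question asks about Python specifically: on an edge node the simplest "connector" is just shelling out to that same CLI. A sketch, where `hdfs_put` is a hypothetical helper and the paths are the ones from the command above:

```python
import subprocess

def put_command(local_path, hdfs_dir):
    # Build the same CLI invocation as "hdfs dfs -put ...".
    return ["hdfs", "dfs", "-put", local_path, hdfs_dir]

def hdfs_put(local_path, hdfs_dir):
    """Upload a local file to HDFS by shelling out to the Hadoop CLI.

    Only works where the Hadoop clients are installed and configured
    (i.e. on an edge node).
    """
    result = subprocess.run(
        put_command(local_path, hdfs_dir),
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        raise RuntimeError("hdfs put failed: " + result.stderr)

# Usage (on an edge node):
# hdfs_put("data.txt", "/user/johndoe/some/hdfs/dir/")
```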

  • If you are running your job anywhere else, use an HTTP library (or the good old curl command line) to connect to the HDFS REST service -- this could be either WebHDFS or HttpFS, depending on how the cluster has been set up -- and upload the file with a PUT request:

http://namenode:port/webhdfs/v1/user/johndoe/some/hdfs/dir/data.txt?op=CREATE&overwrite=false

(with the content of "data.txt" as the payload, of course)
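In Python, note that WebHDFS file creation is a two-step dance: the NameNode answers the first PUT with a 307 redirect pointing at a DataNode, and a second PUT sends the actual content to that DataNode. A sketch using only the standard library -- the host name `namenode`, port, HDFS path, and user are placeholders to adjust for your cluster:

```python
import http.client
import urllib.parse

def webhdfs_create_path(hdfs_path, user):
    # Request path for the WebHDFS CREATE operation, matching the URL above.
    return "/webhdfs/v1{}?op=CREATE&overwrite=false&user.name={}".format(
        hdfs_path, user
    )

def webhdfs_upload(namenode, port, hdfs_path, user, data):
    # Step 1: PUT to the NameNode; it replies 307 with the DataNode URL
    # in the Location header (http.client never follows redirects itself).
    nn = http.client.HTTPConnection(namenode, port)
    nn.request("PUT", webhdfs_create_path(hdfs_path, user))
    location = nn.getresponse().getheader("Location")
    nn.close()
    # Step 2: PUT the file content to the DataNode we were redirected to.
    loc = urllib.parse.urlsplit(location)
    dn = http.client.HTTPConnection(loc.netloc)
    dn.request("PUT", loc.path + "?" + loc.query, body=data)
    status = dn.getresponse().status
    dn.close()
    return status  # 201 Created on success

# Usage (against a live cluster):
# with open("data.txt", "rb") as f:
#     webhdfs_upload("namenode", 50070,
#                    "/user/johndoe/some/hdfs/dir/data.txt", "johndoe", f)
```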


2 Comments

BTW: when using a REST service against an HA cluster, you must try each NameNode until you find the active one.
BTW: when using a REST service against a secure cluster, you must set up Kerberos SPNEGO authentication -- and optionally store the Hadoop delegation token for the duration of the session.
