We need a process to pull data from the Hadoop Distributed File System (HDFS) into a relational database (PostgreSQL) on a regular basis. We will need to transfer several million records per hour, and I am looking for the industry-standard way to move data out of HDFS. Does anyone have any suggestions? The idea is for a web app to interact with PostgreSQL, which will hold the aggregated data.
1 Answer
Sqoop is built for the purpose of moving data between relational data stores and Hadoop. Specifically, you want sqoop-export.
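As a rough illustration, a scheduled job could assemble the sqoop export command line like this. This is only a sketch: the host, database, table name, HDFS directory, user, and password file are all hypothetical placeholders, and it assumes the PostgreSQL JDBC driver is available to Sqoop and that the target table already exists with columns matching the comma-delimited files in HDFS.

```python
import shlex


def build_sqoop_export(jdbc_url, table, export_dir, username,
                       password_file, num_mappers=4, field_sep=","):
    """Return the argument list for a `sqoop export` run.

    Every value passed in is supplied by the caller; nothing here is
    taken from the original question.
    """
    return [
        "sqoop", "export",
        "--connect", jdbc_url,               # JDBC URL of the target database
        "--username", username,
        "--password-file", password_file,    # file holding the DB password
        "--table", table,                    # existing target table
        "--export-dir", export_dir,          # HDFS directory with the input files
        "--input-fields-terminated-by", field_sep,
        "--num-mappers", str(num_mappers),   # number of parallel export tasks
    ]


# Hypothetical values, for illustration only.
cmd = build_sqoop_export(
    jdbc_url="jdbc:postgresql://db.example.com:5432/reports",
    table="hourly_aggregates",
    export_dir="/user/etl/hourly_aggregates",
    username="etl_user",
    password_file="/user/etl/.pgpass",
)

# Print the command so it can be inspected or pasted into a cron/Oozie job.
print(shlex.join(cmd))
```

The list form can be handed to subprocess.run(cmd, check=True) on a machine where Sqoop is installed; raising --num-mappers increases parallelism at the cost of more concurrent connections to PostgreSQL.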
1 Comment
user1666942
Thanks for your reply, Donald! But I thought Sqoop didn't support data export into PostgreSQL at this time.