I'm a beginner with Hadoop, and one of my colleagues asked me for help migrating some PostgreSQL tables to Hadoop. Since I don't have much experience with PostgreSQL (I know databases in general, though), I am not sure what the best way to do this migration would be. One of my ideas was to export the tables as JSON data and then process them from Hadoop, as in this example: http://www.codeproject.com/Articles/757934/Apache-Hadoop-for-Windows-Platform. Are there better ways to import data (tables and databases) from PostgreSQL into Hadoop?
2 Answers
Sqoop (http://sqoop.apache.org/) is a tool made precisely for this: it transfers data between relational databases and Hadoop. Go through the documentation; Sqoop is probably the easiest way to move your tables into HDFS.
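To give a rough idea of what a Sqoop import looks like, here is a minimal sketch in Python that only assembles the command line for importing a single table. The host, database, table, user, and HDFS directory are made-up placeholders, and the helper `build_sqoop_import` is hypothetical, not part of Sqoop. It assumes Sqoop is installed on the Hadoop side and that the PostgreSQL JDBC driver is available to it; check the Sqoop documentation for the options that fit your version.

```python
import shlex


def build_sqoop_import(host, database, table, username, target_dir,
                       port=5432, num_mappers=1):
    """Return the argv list for a `sqoop import` of one PostgreSQL table.

    Hypothetical helper for illustration: it builds the command but
    does not run it.
    """
    jdbc_url = f"jdbc:postgresql://{host}:{port}/{database}"
    return [
        "sqoop", "import",
        "--connect", jdbc_url,        # JDBC connection string for PostgreSQL
        "--username", username,
        "-P",                         # prompt for the password on the console
        "--table", table,             # source table to import
        "--target-dir", target_dir,   # destination directory in HDFS
        "--num-mappers", str(num_mappers),  # number of parallel map tasks
    ]


if __name__ == "__main__":
    # All values below are placeholders, not taken from the question.
    cmd = build_sqoop_import("localhost", "mydb", "customers",
                             "postgres", "/user/hadoop/customers")
    print(shlex.join(cmd))
```

The printed command can be pasted into a shell on the machine where Sqoop is installed. A single mapper keeps the sketch simple; with more mappers, Sqoop needs a column to split the table on (the primary key by default, or one named with `--split-by`).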
3 Comments
Amar
What exactly do you mean by Windows? Is your PostgreSQL installed on a Windows machine? In general, Sqoop runs on the side where your Hadoop cluster is, since it runs a MapReduce job to pull data from the database and then writes it to HDFS.
user1680859
Yes, it is on a Windows machine, and I am also running Hadoop on Windows.
Amar
Then you should be able to run it. I am not sure what the exact syntax would be, but this can be done.