I have very large XML files, sometimes over 100 MB. I need to populate my Elasticsearch database with the information from these files. My server is written in Node.js. What's the best way to go about doing this?
2 Answers
There are a couple of ways you could achieve your goal:
Load and parse your XML in a Node.js program, and use the elasticsearch node module to index the parsed XML into Elasticsearch. For files this large, a streaming XML parser will keep you from holding the whole file in memory. You might want to look into the bulk index API in particular for speedy indexing.
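As a minimal sketch of the bulk API side of this: the `_bulk` endpoint accepts NDJSON, with an action line followed by a source line for each document. Here plain objects stand in for records parsed out of your XML (in a real program they would come from a streaming parser such as the `sax` npm module, and you would send the body via the client's `bulk` method); the index name and fields are made up for illustration.

```javascript
// Build an Elasticsearch _bulk request body (NDJSON) from parsed records.
// Each document contributes two lines: an action/metadata line and the
// document source itself.
function buildBulkBody(records, indexName) {
  const lines = [];
  for (const record of records) {
    lines.push(JSON.stringify({ index: { _index: indexName } }));
    lines.push(JSON.stringify(record));
  }
  // The bulk API requires a trailing newline after the final line.
  return lines.join('\n') + '\n';
}

// Example: two parsed records become a four-line bulk body.
const body = buildBulkBody(
  [{ title: 'First doc' }, { title: 'Second doc' }],
  'my-xml-index'
);
console.log(body);
```

Sending one bulk request per few thousand records (rather than one request per document) is usually where the speedup comes from.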
Use Logstash to set up a pipeline that reads from the XML files and indexes them into Elasticsearch. Logstash is a plugin-based system with plugins for the input, filter, and output stages of the pipeline, similar to the extract, transform, and load stages of an ETL pipeline. You might want to look into the file input plugin, the xml filter plugin, and the elasticsearch output plugin.
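A rough outline of what that pipeline config could look like (the path, index name, and `target` field are placeholders; note the file input reads line by line, so XML documents spanning multiple lines need a multiline codec or similar handling):

```
input {
  file {
    path => "/data/xml/*.xml"
    start_position => "beginning"
  }
}
filter {
  xml {
    source => "message"
    target => "doc"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "my-xml-index"
  }
}
```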
3 Comments
I found a free e-book called Exploring Elasticsearch, and there is a chapter on piping almost 10 GB of Wikipedia XML data into an Elasticsearch database: http://exploringelasticsearch.com/searching_wikipedia.html I plan to use this in conjunction with the elasticsearch node module.