I want to parse XML with Spark, so I am using the Databricks spark-xml library. A sample of the XML is as follows:
<Transactions>
<Transaction>
<transid>1111</transid>
</Transaction>
<Transaction>
<transid>2222</transid>
</Transaction>
</Transactions>
<Payments>
<Payment>
<Id>123</Id>
</Payment>
<Payment>
<Id>456</Id>
</Payment>
</Payments>
Code to parse:
val transNestedDF = sqlContext.read.format("com.databricks.spark.xml").option("rowTag","Transactions").load("trans_nested.xml")
transNestedDF.registerTempTable("TransNestedTbl")
sqlContext.sql("select Transaction[0].transid from TransNestedTbl").collect()
The file has no single root tag, and I can't specify multiple row tags. If I have to process both Transactions and Payments in a single read, using a single DataFrame like the one above, how can I achieve that?
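For context, the only workaround I can see is one read per row tag. Below is a minimal sketch of that, assuming a spark-shell session where sqlContext already exists and the spark-xml package is on the classpath. The names path, transDF and paymentsDF are just illustrative, and I have not verified how the "Transaction" row tag behaves next to the similarly named Transactions wrapper.

```scala
// Sketch only: assumes spark-shell with sqlContext defined and spark-xml available.
val path = "trans_nested.xml"

// First pass: one row per <Transaction> element.
val transDF = sqlContext.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "Transaction")
  .load(path)

// Second pass over the same file: one row per <Payment> element.
val paymentsDF = sqlContext.read
  .format("com.databricks.spark.xml")
  .option("rowTag", "Payment")
  .load(path)

// Each DataFrame exposes the child elements of its row tag as columns.
transDF.select("transid").show()
paymentsDF.select("Id").show()
```

This scans the file twice and produces two DataFrames, which is what I would like to avoid.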
Any help would be appreciated.