I am reading multiple files from an HDFS directory, and for each file the generated data is printed using:
frequencies.foreach { case (key, count) => println(s"$key: $count") }
The printed output for File1.txt is:
'text': 45
'data': 100
'push': 150
The keys can differ for other files, for example (File2.txt):
'data': 45
'lea': 100
'jmp': 150
The keys are not necessarily the same across files. I want the data from all files written to a single .csv file, with one column per key and 0 wherever a file lacks that key, in the following format:
Filename   text  data  push  lea  jmp
File1.txt  45    100   150   0    0
File2.txt  0     45    0     100  150
....
Can someone please help me find a solution to this problem?
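
To make the target layout concrete, here is a minimal sketch in plain Scala of the pivot step only. It assumes each file's frequencies have already been collected into an in-memory sequence of (key, count) pairs with the quotes stripped from the keys, and that the result is small enough to hold in memory. The names `FrequenciesToCsv` and `toCsvLines` are hypothetical, and commas are used as the separator since the target is a .csv file.

```scala
object FrequenciesToCsv {
  // Hypothetical helper: turns (fileName, counts) pairs into CSV lines.
  // Assumption: counts for each file fit in memory on the driver.
  def toCsvLines(perFile: Seq[(String, Seq[(String, Int)])]): Seq[String] = {
    // Union of all keys, kept in order of first appearance across files.
    val columns = perFile.flatMap { case (_, counts) => counts.map(_._1) }.distinct

    val header = ("Filename" +: columns).mkString(",")

    val rows = perFile.map { case (fileName, counts) =>
      val lookup = counts.toMap
      // A key that does not occur in this file is written as 0.
      (fileName +: columns.map(c => lookup.getOrElse(c, 0).toString)).mkString(",")
    }

    header +: rows
  }

  def main(args: Array[String]): Unit = {
    // Sample data copied from the two files shown above.
    val perFile = Seq(
      "File1.txt" -> Seq("text" -> 45, "data" -> 100, "push" -> 150),
      "File2.txt" -> Seq("data" -> 45, "lea" -> 100, "jmp" -> 150)
    )
    toCsvLines(perFile).foreach(println)
    // Prints:
    // Filename,text,data,push,lea,jmp
    // File1.txt,45,100,150,0,0
    // File2.txt,0,45,0,100,150
  }
}
```

The header is built from the union of keys first, so every row has the same number of columns regardless of which keys a given file contains. The resulting lines could then be written out with whatever writer fits the setup.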