I'm looking for help parsing a text file. Below is a sample from it. The file is essentially a list of names that I would like to turn into a CSV file. It looks like this:

Membership Date:  Jan 1, 1999        
Sponsors:  Mary Muray, Judy White,
                                    Ronald Zurch,
                                   Nina Lin,
                                   Nathan Garton,
                                   Howard Ross  
    Comments:  This are great members to have on our team.

Here is the expected output, with quotes ("):

"Membership Date:  Jan 1, 1999",
"Sponsors:  Mary Muray, Judy White, Ronald Zurch, Nina Lin, Nathan Garton, Howard Ross",
"Comments:  This are great members to have on our team."

Note that the output has three fields, and that the line feeds are removed from the sponsors field so that all names end up in one field.

My code looks like this:

import scala.io.Source

val filename: String = "/data/members.csv"
val lines = Source.fromFile(filename).getLines().toList
val ToLines = lines.dropWhile(line => !line.startsWith("Sponsor: ")).takeWhile(line => !line.startsWith("Comments: ")).toSeq

The last line of code puts each line of the file into its own element of the sequence, so every name ends up in a separate element. I need help getting all the names into a single element, so that when I save the results as a CSV, the sponsors field contains all of its names. Let me know if this does not make sense.

3 Answers

1

Your code will not put each name in its own element of the list; it will have each line as an element. You also need to use split(",") to separate the names into their own elements. After that, you can use mkString(", ") to merge the list into a single string. Here is some code that does this, along with trimming whitespace and removing empty list elements. Note that the file has Sponsors: while your dropWhile has Sponsor:; these need to be consistent for it to work properly.

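val sponsors = lines
  // Trim before matching, since lines in the sample may carry leading whitespace
  .dropWhile(line => !line.trim.startsWith("Sponsors:"))
  .takeWhile(line => !line.trim.startsWith("Comments:"))
  .flatMap(_.split(","))  // one element per name
  .map(_.trim)
  .filter(_.nonEmpty)     // drop empty elements, e.g. from blank lines
  .mkString(", ")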

This will give a single string as such:

Sponsors:  Mary Muray, Judy White, Ronald Zurch, Nina Lin, Nathan Garton, Howard Ross

Adding the date and comments to the sponsors:

val data = lines.head.trim()
val comments = lines.last.trim()

val members = List(data, sponsors, comments).map(s => "\"" + s + "\"").mkString(",\n")

Will give you a string as follows:

"Membership Date:  Jan 1, 1999",
"Sponsors:  Mary Muray, Judy White, Ronald Zurch, Nina Lin, Nathan Garton, Howard Ross",
"Comments:  This are great members to have on our team."

Depending on what you want to do with it you can modify the above code for the final result.
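
If the goal is a conventional CSV file, one possible modification is to put all three fields on a single line and to double any embedded quote characters, which is the convention described in RFC 4180. The following is only a minimal sketch, assuming the three fields have already been extracted as above; `quote` and `toCsvLine` are hypothetical helper names, not part of the code above.

```scala
object CsvRecordSketch {
  // Hypothetical helper: wrap one field in quotes, doubling any embedded double quotes.
  def quote(field: String): String =
    "\"" + field.replace("\"", "\"\"") + "\""

  // Hypothetical helper: join already-extracted fields into a single CSV line.
  def toCsvLine(fields: Seq[String]): String =
    fields.map(quote).mkString(",")

  def main(args: Array[String]): Unit = {
    // The three fields from the sample, written out literally for illustration.
    val fields = List(
      "Membership Date:  Jan 1, 1999",
      "Sponsors:  Mary Muray, Judy White, Ronald Zurch, Nina Lin, Nathan Garton, Howard Ross",
      "Comments:  This are great members to have on our team."
    )
    println(toCsvLine(fields))
  }
}
```

Because each field is quoted, the commas between the sponsor names stay inside one field when the line is read back by a CSV parser.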

1

I know this is not an elegant approach, but I tried to solve it with a plain loop instead of the built-in collection functions. The logic can be tweaked to fit your actual requirements.

import scala.collection.mutable.ArrayBuffer
import scala.io.{BufferedSource, Source}

val file: BufferedSource = Source.fromFile("file name")
val result = ArrayBuffer.empty[String]
val temp = new StringBuilder

for (line <- file.getLines()) {
  // A line containing ':' starts a new field, so flush the one collected so far
  if (temp.toString.contains(":") && line.contains(":")) {
    result.append("\"" + temp.toString + "\"")
    temp.clear()
  }
  temp.append(line.trim)
}
file.close()

// Flush the last field
if (temp.nonEmpty) result.append("\"" + temp.toString + "\"")

result.foreach(println)

Output

"Membership Date:  Jan 1, 1999"
"Sponsors:  Mary Muray, Judy White,Ronald Zurch,Nina Lin,Nathan Garton,Howard Ross"
"Comments:  This are great members to have on our team."

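The output above has no space after the commas that end each continuation line, because the trimmed lines are appended back to back. One possible tweak, shown here only as a sketch, is to add a single space before each continuation line. It runs on in-memory lines rather than a file, and `joinRecords` is a hypothetical name introduced for illustration.

```scala
import scala.collection.mutable.ArrayBuffer

object LoopJoinSketch {
  // Hypothetical helper: the same loop idea, but continuation lines are
  // joined to the buffered text with a single space.
  def joinRecords(lines: Iterator[String]): List[String] = {
    val result = ArrayBuffer.empty[String]
    val temp = new StringBuilder
    for (line <- lines) {
      // A line with ':' starts a new field once the buffer already holds one.
      if (temp.toString.contains(":") && line.contains(":")) {
        result.append("\"" + temp.toString + "\"")
        temp.clear()
      }
      // Separate a continuation line from the text collected so far.
      if (temp.nonEmpty) temp.append(" ")
      temp.append(line.trim)
    }
    // Flush whatever is left in the buffer.
    if (temp.nonEmpty) result.append("\"" + temp.toString + "\"")
    result.toList
  }

  def main(args: Array[String]): Unit = {
    val sample = List(
      "Membership Date:  Jan 1, 1999        ",
      "Sponsors:  Mary Muray, Judy White,",
      "                   Ronald Zurch,",
      "                   Nina Lin,",
      "                   Nathan Garton,",
      "                   Howard Ross  ",
      "    Comments:  This are great members to have on our team."
    )
    joinRecords(sample.iterator).foreach(println)
  }
}
```
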
2 Comments

I mean this in the nicest possible way (I do this all the time myself), but that is pretty much Java code and not Scala code.
@Phil Yes, I agree with you.

1

It seems to me that you could be a little more flexible in identifying what is a new line and what is a line continuation.

io.Source.fromFile("members.csv")
  .getLines()
  .foldLeft(List.empty[String]) { (all, line) =>
    // A line containing ": " starts a new field; anything else continues the previous one
    if (line.contains(": ")) line.trim :: all
    else all.head + " " + line.trim :: all.tail
  }.reverse.mkString("\"", "\",\n\"", "\"")

A single call to mkString() adds all the requested quote marks and comma separators.
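
One caveat: the fold calls all.head, which throws on an empty list, so it relies on the first line containing ": ". A possible variation, sketched here under that observation, uses pattern matching so that a leading continuation line simply starts a field of its own. It works on in-memory lines, and `joinFields` is a hypothetical name, not something from the code above.

```scala
object FoldJoinSketch {
  // Hypothetical helper: fold lines into fields, newest field at the head,
  // then reverse to restore the original order.
  def joinFields(lines: Iterator[String]): List[String] =
    lines.foldLeft(List.empty[String]) {
      // Continuation line with a field already open: extend that field.
      case (current :: done, line) if !line.contains(": ") =>
        (current + " " + line.trim) :: done
      // Otherwise (a new field, or nothing open yet): start a new field.
      case (all, line) =>
        line.trim :: all
    }.reverse

  def main(args: Array[String]): Unit = {
    val sample = List(
      "Membership Date:  Jan 1, 1999        ",
      "Sponsors:  Mary Muray, Judy White,",
      "                   Ronald Zurch,",
      "                   Nina Lin,",
      "                   Nathan Garton,",
      "                   Howard Ross  ",
      "    Comments:  This are great members to have on our team."
    )
    // Same quoting and separators as the single mkString call above.
    println(joinFields(sample.iterator).mkString("\"", "\",\n\"", "\""))
  }
}
```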
