Part of my input looks like the following.

Name
John Doe
Sons
Name
Son of John 
28
:
Name
Jane Doe
Daughters
Name
Daughter of Jane 
32
...
...

My parser looks like this:

rep("Name" ~> rep("[A-Z ]+[a-z ]+".r)  ~> ("Sons Name" | "Daughters Name") ~> "[0-9]+") 

But it looks like the regex rep("[A-Z ]+[a-z ]+".r) also consumes Name, Daughter of Jane and Son of John, which results in the following error:

failure: `Daughters ' expected but `2' found

Is there a simple way to fix this?
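
The over-matching can be checked without the parser itself. By default RegexParsers skips whitespace (including newlines) before each token, and the name pattern accepts the header keywords as well as the names, so rep(...) keeps consuming lines until it reaches the digits. A minimal sketch using only scala.util.matching.Regex; the object and method names are illustrative and not part of the original parser:

```scala
import scala.util.matching.Regex

object RegexOverlapDemo {
  // The name pattern from the question, unchanged.
  val namePattern: Regex = "[A-Z ]+[a-z ]+".r

  // RegexParsers applies a regex anchored at the current position,
  // which is what findPrefixOf does here.
  def matchedPrefix(line: String): Option[String] =
    namePattern.findPrefixOf(line)

  def main(args: Array[String]): Unit = {
    // Header keywords and data lines taken from the sample input.
    val lines = List("Name", "Sons", "Daughters", "John Doe", "28")
    for (line <- lines)
      println(s"'$line' -> ${matchedPrefix(line)}")
  }
}
```

Name, Sons and Daughters each match in full, while 28 does not match at all, which is consistent with the parser only stopping once it reaches the age.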

  • Do you want the parser to return the number matched (I assume that's the age of a person) for each person, such that the expected parsed result in the given example would be List(28, 32)? Commented Jan 13, 2015 at 22:56
  • I want the parser to return the Name, Daughter/Son Name and their age. Commented Jan 14, 2015 at 3:32

1 Answer

I've reformulated your parser a bit and made some of the regular expressions more explicit. I've also set skipWhitespace to false, since that gives you finer-grained control over what each piece matches. I don't know whether this is the most idiomatic way to tackle your problem, but it works. Hope it helps.

import scala.util.parsing.combinator._

object Parser extends RegexParsers {

  override val skipWhitespace = false

  val word = """[A-Za-z]+""".r
  val separator = """\s+""".r    
  val colon = """(\s+:\s+)?""".r // optional colon
  val ws = """[^\S\n]+""".r      // all whitespace except newline
  val age = "[0-9]+".r

  val name = (repsep(word, ws) <~ separator) ^^ (_.mkString(" "))
  val nameHeader = "Name" ~ separator
  val childNameHeader = ("Daughters" | "Sons") ~ separator ~ nameHeader

  val person = nameHeader ~> name ~ (childNameHeader ~> name) ~ age <~ colon ^^ (p => (p._1._1, p._1._2, p._2))
  val persons = rep(person)

}

object Main extends App {

  val input  =
    """Name
      |John Doe
      |Sons
      |Name
      |Son of John
      |28
      |:
      |Name
      |Jane Doe
      |Daughters
      |Name
      |Daughter of Jane
      |32""".stripMargin

  val result = Parser.parse(Parser.persons, input)
  // prints '[13.3] parsed: List((John Doe,Son of John,28), (Jane Doe,Daughter of Jane,32))'
  println(result)
}
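
If the goal is to work with the name, the child's name and the age as values rather than print the raw ParseResult, one possible variant is sketched below. It assumes the scala-parser-combinators library is on the classpath. The Person case class, the PersonParser object and the parsePersons helper are hypothetical names introduced for illustration. It also swaps parse for parseAll, so trailing input that cannot be parsed is reported as a failure instead of being ignored.

```scala
import scala.util.parsing.combinator._

// Hypothetical record type; the name and fields are illustrative.
case class Person(name: String, childName: String, age: Int)

object PersonParser extends RegexParsers {

  override val skipWhitespace = false

  private val word      = """[A-Za-z]+""".r
  private val separator = """\s+""".r
  private val colon     = """(\s+:\s+)?""".r // optional record separator
  private val ws        = """[^\S\n]+""".r   // all whitespace except newline
  private val age       = "[0-9]+".r

  private val name: Parser[String] =
    (repsep(word, ws) <~ separator) ^^ (_.mkString(" "))
  private val nameHeader      = "Name" ~ separator
  private val childNameHeader = ("Daughters" | "Sons") ~ separator ~ nameHeader

  // Same grammar as above, but the result is destructured with a
  // pattern match and the age is converted to an Int.
  val person: Parser[Person] =
    (nameHeader ~> name) ~ (childNameHeader ~> name) ~ age <~ colon ^^ {
      case parent ~ child ~ years => Person(parent, child, years.toInt)
    }

  val persons: Parser[List[Person]] = rep(person)

  // Right with the parsed records, or Left with the failure message and position.
  def parsePersons(input: String): Either[String, List[Person]] =
    parseAll(persons, input) match {
      case Success(people, _) => Right(people)
      case ns: NoSuccess      => Left(s"${ns.msg} at ${ns.next.pos}")
    }
}
```

Calling PersonParser.parsePersons(input) on the sample input from Main should then give a Right containing two Person values, which can be pattern matched or mapped over like any other Either.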