Parsing - Returning meaningful error messages from a parser written with Scala Parser Combinators

I am trying to write a parser in Scala using Parser Combinators. If I match recursively,

def body: Parser[Body] =
("begin" ~> statementList  )  ^^ {
     case s => {   new Body(s); }
}

def statementList : Parser[List[Statement]] = 
  ("end" ^^ { _ => List() } )|
  (statement ~ statementList ^^ { case statement ~ statementList => statement :: statementList  })

then I get good error messages whenever a statement is malformed.
However, this code is long and ugly, so I would like to write this instead:

def body: Parser[Body] =
("begin" ~> statementList <~ "end"  )  ^^ {
   case s => {   new Body(s); }
}

def statementList : Parser[List[Statement]] = 
    rep(statement)

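This code works, but it prints a meaningful message only when the error is in the FIRST statement. If the error is in a later statement, the message becomes very hard to use, because the parser expects the entire erroneous statement to be replaced by the "end" token: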

Exception in thread "main" java.lang.RuntimeException: [4.2] error: "end" expected but "let" found

 let b : string = x(3,b,"WHAT???",!ERRORHERE!,7 ) 

 ^ 

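My question: is there a way to make rep and repsep produce meaningful error messages, with the caret placed at the actual error position rather than at the beginning of the repeated fragment?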

Best answer: You can do this by combining a "home-made" rep method with non-backtracking statements inside it. For example:

scala> object X extends RegexParsers {
     |   def myrep[T](p: => Parser[T]): Parser[List[T]] = p ~! myrep(p) ^^ { case x ~ xs => x :: xs } | success(List())
     |   def t1 = "this" ~ "is" ~ "war"
     |   def t2 = "this" ~! "is" ~ "war"
     |   def t3 = "begin" ~ rep(t1) ~ "end"
     |   def t4 = "begin" ~ myrep(t2) ~ "end"
     | }
defined module X

scala> X.parse(X.t4, "begin this is war this is hell end")
res13: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
[1.27] error: `war' expected but ` ' found

begin this is war this is hell end
                          ^

scala> X.parse(X.t3, "begin this is war this is hell end")
res14: X.ParseResult[X.~[X.~[String,List[X.~[X.~[String,String],String]]],String]] =
[1.19] failure: `end' expected but ` ' found

begin this is war this is hell end
                  ^
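
The same idea can be carried over to the begin/end grammar from the question. The following is only a minimal sketch: the statement shape (`let <name> = <number>`) and the names `StatementParser`, `ident` and `number` are assumptions made for illustration, since the question does not show the real `statement` parser. It relies on `~!` committing to the current alternative, so that a failure after `let` becomes an error that `|` does not backtrack over.

```scala
import scala.util.parsing.combinator.RegexParsers

object StatementParser extends RegexParsers {
  // Hypothetical AST; the question's real Statement and Body classes are not shown.
  case class Statement(name: String, value: Int)
  case class Body(statements: List[Statement])

  // Hypothetical token parsers, assumed for this sketch only.
  def ident: Parser[String] = """[a-zA-Z_]\w*""".r
  def number: Parser[Int] = """\d+""".r ^^ (_.toInt)

  // Once "let" has matched, ~! commits: anything that fails afterwards is
  // reported as an Error at that position instead of a backtrackable Failure.
  def statement: Parser[Statement] =
    "let" ~! ident ~! "=" ~! number ^^ { case _ ~ name ~ _ ~ value => Statement(name, value) }

  // The home-made repetition from the answer above.
  def myrep[T](p: => Parser[T]): Parser[List[T]] =
    p ~! myrep(p) ^^ { case x ~ xs => x :: xs } | success(List())

  def body: Parser[Body] =
    ("begin" ~> myrep(statement) <~ "end") ^^ { s => Body(s) }
}
```

Calling `StatementParser.parseAll(StatementParser.body, "begin let a = 1 let b = oops end")` should then report the problem near `oops` rather than at the start of the second `let`. The exact column and message text may vary between library versions, as the whitespace handling in error positions has changed over time.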