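As a follow-up to this question, I am now trying to parse an expression language that has variables and case … of … expressions. The syntax should be indentation-based:
> An expression can span multiple lines, as long as every line is indented relative to the first one; i.e. the following should be parsed as a single application: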
f x y
  z
 q
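> Each alternative of a case expression needs to be on its own line, indented relative to the case keyword. Right-hand sides can span multiple lines.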
case E of
  C -> x
  D -> f x
    y
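This should be parsed into a single case with two alternatives, with x and f x y as the right-hand sides.
I have simplified my code to the following: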
import qualified Text.Megaparsec.Lexer as L
import Text.Megaparsec hiding (space)
import Text.Megaparsec.Char hiding (space)
import Text.Megaparsec.String
import Control.Monad (void)
import Control.Applicative

data Term = Var String
          | App [Term]
          | Case Term [(String, Term)]
          deriving Show

space :: Parser ()
space = L.space (void spaceChar) empty empty

name :: Parser String
name = try $ do
    s <- some letterChar
    if s `elem` ["case", "of"]
      then fail $ unwords ["Unexpected: reserved word", show s]
      else return s

term :: Parser () -> Parser Term
term sp = App <$> atom `sepBy1` try sp
  where
    atom = choice [ caseBlock
                  , Var <$> L.lexeme sp name
                  ]

    caseBlock = L.lineFold sp $ \sp' ->
        Case <$>
        (L.symbol sp "case" *> L.lexeme sp (term sp) <* L.symbol sp' "of") <*>
        alt sp' `sepBy` try sp' <* sp

    alt sp' = L.lineFold sp' $ \sp'' ->
        (,) <$> L.lexeme sp' name <* L.symbol sp' "->" <*> term sp''
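As you can see, I'm trying to use the technique from this answer to separate the alternatives with whitespace parsed by sp', which has to be more indented than the case keyword.
Problems
This seems to work for a single expression made up only of applications: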
λ» parseTest (L.lineFold space term) "x y\n z"
App [Var "x",Var "y",Var "z"]
It doesn't work for lists of such expressions using the technique from the linked answer:
λ» parseTest (L.lineFold space $ \sp -> (term sp `sepBy` try sp)) "x\n y\nz"
3:1:
incorrect indentation (got 1, should be greater than 1)
case expressions fail when I try to use line folding:
λ» parseTest (L.lineFold space term) "case x of\n C -> y\n D -> z"
1:5:
Unexpected: reserved word "case"
case works when there is no line fold around the outermost expression, but only with a single alternative:
λ» parseTest (term space) "case x of\n C -> y\n z"
App [Case (App [Var "x"]) [("C",App [Var "y",Var "z"])]]
However, case fails as soon as I have more than one alternative:
λ» parseTest (term space) "case x of\n C -> y\n D -> z"
3:2:
incorrect indentation (got 2, should be greater than 2)
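What am I doing wrong?
Best answer
I'm answering because I promised to look at this. The question poses quite a difficult problem for Parsec-like parsers in their current state. Making it fully work would take more time than I have available, but in the time I could spend on this question, this is as far as I got: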
module Main (main) where

import Control.Applicative
import Control.Monad (void)
import Text.Megaparsec
import Text.Megaparsec.String
import qualified Data.List.NonEmpty as NE
import qualified Text.Megaparsec.Lexer as L

data Term = Var String
          | App [Term]
          | Case Term [(String, Term)]
          deriving Show

scn :: Parser ()
scn = L.space (void spaceChar) empty empty

sc :: Parser ()
sc = L.space (void $ oneOf " \t") empty empty

name :: Parser String
name = try $ do
  s <- some letterChar
  if s `elem` ["case", "of"]
    then (unexpected . Label . NE.fromList) ("reserved word \"" ++ s ++ "\"")
    else return s

manyTerms :: Parser [Term]
manyTerms = many pTerm

pTerm :: Parser Term
pTerm = caseBlock <|> app -- parse a term first

caseBlock :: Parser Term
caseBlock = L.indentBlock scn $ do
  void (L.symbol sc "case")
  t <- Var <$> L.lexeme sc name -- not sure what sort of syntax case of
       -- case expressions should have, so simplified to vars for now
  void (L.symbol sc "of")
  return (L.IndentSome Nothing (return . Case t) alt)

alt :: Parser (String, Term)
alt = L.lineFold scn $ \sc' ->
  (,) <$> L.lexeme sc' name <* L.symbol sc' "->" <*> pTerm -- (1)

app :: Parser Term
app = L.lineFold scn $ \sc' ->
  App <$> ((Var <$> name) `sepBy1` try sc' <* scn)
  -- simplified here, with some effort should be possible to go from Var to
  -- more general Term in applications
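Your original grammar is left-recursive: every term can be either a case expression or an application, and if it is an application, its first part can again be either a case expression or an application, and so on. You need to deal with that somehow.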
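One common way to deal with this kind of recursion is to make an application a non-empty sequence of atoms, where an atom always consumes some input before it recurses. The following is only a minimal sketch of that shape, not the code from this answer: it uses ReadP from base instead of Megaparsec so that it runs without extra packages, it ignores indentation entirely, and the names lexeme and parseTerm as well as the parenthesised atom are assumptions added for illustration.

```haskell
import Data.Char (isAlpha)
import Text.ParserCombinators.ReadP

data Term = Var String
          | App [Term]
  deriving (Show, Eq)

-- Run a parser, then skip any trailing whitespace.
lexeme :: ReadP a -> ReadP a
lexeme p = p <* skipSpaces

-- An application is one or more atoms. It never starts by parsing
-- another term, so the parser cannot loop without consuming input.
term :: ReadP Term
term = App <$> many1 atom

-- An atom is a variable or a parenthesised term (the parentheses are a
-- hypothetical extension, used here to show where recursion is safe).
atom :: ReadP Term
atom = lexeme (Var <$> munch1 isAlpha)
   <++ between (lexeme (char '(')) (lexeme (char ')')) term

-- Parse a whole string, returning Nothing if it is not a single term.
parseTerm :: String -> Maybe Term
parseTerm s =
  case readP_to_S (skipSpaces *> term <* eof) s of
    ((t, _) : _) -> Just t
    []           -> Nothing
```

In GHCi, parseTerm "f (g x) y" should give Just (App [Var "f",App [Var "g",Var "x"],Var "y"]). The recursion into term happens only after the opening parenthesis has been consumed, which is what keeps the grammar from being left-recursive.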
Here is a session:
λ> parseTest pTerm "x y\n z"
App [Var "x",Var "y",Var "z"]
λ> parseTest pTerm "x\n y\nz"
App [Var "x",Var "y"]
λ> parseTest manyTerms "x\n y\nz"
[App [Var "x",Var "y"],App [Var "z"]]
λ> parseTest pTerm "case x of\n C -> y\n D -> z"
Case (Var "x") [("C",App [Var "y"]),("D",App [Var "z"])]
λ> parseTest pTerm "case x of\n C -> y\n  z"
3:3:
incorrect indentation (got 3, should be equal to 2)
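The last result is caused by (1) in the code. Introducing a parameter to app would make it impossible to use without thinking about its context (it would no longer be a stand-alone expression, but a part of something else), so app runs its own line fold, and that fold starts at the column of y. We can see that if you indent z relative to the start of the y application, rather than relative to the whole alternative, it works: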
λ> parseTest pTerm "case x of\n C -> y\n        z"
Case (Var "x") [("C",App [Var "y",Var "z"])]
In the end, case expressions with several alternatives work:
λ> parseTest pTerm "case x of\n C -> y\n D -> z"
Case (Var "x") [("C",App [Var "y"]),("D",App [Var "z"])]
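My advice is to take a look at some preprocessor and use Megaparsec together with it. The tools in Text.Megaparsec.Lexer are not that easy to apply in this case, but they are the best we could come up with, and they work fine for simple indentation-sensitive grammars.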
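As a rough idea of what such a preprocessing step could look like, the sketch below turns the raw text into a tree of lines before any real parsing happens: each line owns the more-indented lines that follow it. This is an assumption about one possible design, not something taken from Megaparsec or from a specific preprocessor. The names Block, blocks and layout are hypothetical, only spaces count as indentation, and blank lines are dropped.

```haskell
-- One source line (without its leading spaces) together with the
-- more-indented lines that follow it.
data Block = Block String [Block]
  deriving (Show, Eq)

-- Build the tree from (indentation, text) pairs: every line that is
-- indented more than the current one becomes part of its children.
blocks :: [(Int, String)] -> [Block]
blocks [] = []
blocks ((n, l) : rest) = Block l (blocks children) : blocks siblings
  where
    (children, siblings) = span ((> n) . fst) rest

-- Split the input into lines, measure the leading spaces of each line,
-- and drop lines that contain nothing but spaces.
layout :: String -> [Block]
layout src = blocks
  [ (length indent, body)
  | l <- lines src
  , let (indent, body) = span (== ' ') l
  , not (null body)
  ]
```

In GHCi, layout "case x of\n  C -> y\n  D -> f x\n    y" should give [Block "case x of" [Block "C -> y" [],Block "D -> f x" [Block "y" []]]]. A later parsing pass could then treat the children of a case line as alternatives and the children of any other line as continuation lines of the same application, without tracking columns inside the parser itself.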