There is an increasing interest in extensible languages, (domain-specific) language extensions, and mechanisms for their specification and implementation. One challenge is to develop tools that allow non-expert programmers to add an eclectic set of language extensions to a host language. We describe mechanisms for composing and analyzing concrete syntax specifications of a host language and extensions to it. These specifications consist of context-free grammars with each terminal symbol mapped to a regular expression, from which a slightly modified LR parser and context-aware scanner are generated. Traditionally, conflicts are detected when a parser is generated from the composed grammar, but this comes too late since it is the non-expert programmer directing the composition of independently developed extensions with the host language. The primary contribution of this paper is a modular analysis that is performed independently by each extension designer on her extension (composed alone with the host language). If each extension passes this modular analysis, then the language composed later by the programmer will compile with no conflicts or lexical ambiguities. Thus, extension writers can verify that their extension will safely compose with others and, if not, fix the specification so that it will. This is possible due to the context-aware scanner's lexical disambiguation and a set of reasonable restrictions limiting the constructs that can be introduced by an extension. The restrictions ensure that the parse table states can be partitioned so that each state can be attributed to the host language or a single extension.
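The lexical disambiguation mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the terminal names, the regular expressions, and the `scan` function are hypothetical, and the set of valid terminals is passed in by hand rather than read from a real LR parse table. The idea shown is only that the scanner tries the regular expressions of terminals the parser can accept in its current state, so a keyword introduced by one extension does not clash with host-language identifiers in states where that keyword is not expected.

```python
import re

# Hypothetical terminals, each mapped to a regular expression.
# "TableKeyword" stands in for a keyword introduced by some extension.
TERMINALS = {
    "Identifier": re.compile(r"[A-Za-z_]\w*"),
    "Number": re.compile(r"\d+"),
    "TableKeyword": re.compile(r"table"),
}


class LexicalAmbiguity(Exception):
    """Raised when several valid terminals match the same longest lexeme."""


def scan(text, pos, valid_terminals):
    """Return (terminal, lexeme) for the longest match at `pos`.

    Only terminals in `valid_terminals` are tried. In a real
    context-aware scanner this set would come from the current
    LR parser state; here it is supplied by the caller (assumption).
    """
    best_names, best_len = [], 0
    for name in sorted(valid_terminals):
        match = TERMINALS[name].match(text, pos)
        if match is None or match.end() == pos:
            continue  # no match, or an empty match
        length = match.end() - pos
        if length > best_len:
            best_names, best_len = [name], length
        elif length == best_len:
            best_names.append(name)
    if not best_names:
        raise SyntaxError(f"no valid terminal matches at position {pos}")
    if len(best_names) > 1:
        raise LexicalAmbiguity(f"{best_names} all match at position {pos}")
    return best_names[0], text[pos:pos + best_len]
```

With this sketch, `scan("table x", 0, {"Identifier"})` treats `table` as an identifier, while `scan("table x", 0, {"TableKeyword"})` treats it as the extension keyword. If both terminals are valid in the same state, the function raises `LexicalAmbiguity`, which corresponds to the kind of lexical ambiguity the abstract says the modular analysis is meant to rule out.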
|Original language||English (US)|
|Number of pages||12|
|Journal||ACM SIGPLAN Notices|
|State||Published - Jun 2009|
|Event||2009 ACM SIGPLAN Conference on Programming Language Design and Implementation, PLDI'09 - Dublin, Ireland; Duration: Jun 15 2009 → Jun 20 2009|
- Context-aware scanning
- Extensible languages
- Grammar composition
- Language composition
- LR parsing