What follows after lexical analysis?
I’m working on a toy compiler (for some simple language like PL/0) and I have my lexer up and running. At this point I should start working on building the parse tree, but before I start I was wondering: How much information can one gather from just the string of tokens? Here’s what I gathered so far:
What are “Json.org”-like spec graphs called, and how can I generate them?
In http://www.json.org Douglas Crockford shows the specs of the JSON format in two interesting ways:
Practical reference for learning about graph reduction
Are there any practical references (with actual examples) for getting started implementing a small, lazy functional programming language with graph reduction? A reference that included the lexing and parsing steps would be especially helpful.
What is the name of a grammar which can change its tokenizer in mid parse?
I was creating a language and discovered that my language's tokenizer would have to change depending on where in the parse it is.
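One common way to get this behaviour is a tokenizer with modes, where the active rule set changes as input is consumed. The sketch below is only an illustration: the two modes (`code` and `string`), their token rules, and the `tokenize` function are all hypothetical and do not come from the question. Here the lexer switches modes itself when it sees a quote; a parser-driven variant would instead let the parser choose the mode before asking for each token.

```python
import re

# Hypothetical rule sets, one per mode; neither mode comes from the question.
MODES = {
    "code": re.compile(r'\s*(?:(?P<NUMBER>\d+)|(?P<IDENT>[A-Za-z_]\w*)|(?P<QUOTE>"))'),
    "string": re.compile(r'(?P<TEXT>[^"]+)|(?P<QUOTE>")'),
}

def tokenize(source):
    """Return (kind, text) pairs, switching rule sets at each double quote."""
    mode, pos, tokens = "code", 0, []
    while pos < len(source):
        match = MODES[mode].match(source, pos)
        if match is None:
            # Trailing whitespace in code mode is fine; anything else is an error.
            if mode == "code" and source[pos:].strip() == "":
                break
            raise SyntaxError(f"unexpected character at position {pos} in mode {mode!r}")
        kind = match.lastgroup
        tokens.append((kind, match.group(kind)))
        pos = match.end()
        if kind == "QUOTE":
            # The same characters now tokenize differently: "if 42" inside
            # quotes is a single TEXT token rather than IDENT and NUMBER.
            mode = "string" if mode == "code" else "code"
    return tokens
```

With this sketch, the input `x "if 42" 7` yields an `IDENT`, a `QUOTE`, one `TEXT` token for `if 42`, a closing `QUOTE`, and a `NUMBER`.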
Is it possible to create a single tokenizer to parse this?
This extends off this other Q&A thread, but is going into details that are out of scope from the original question.
Choosing a parser for a code beautifier
I’m in the planning stage of making a code beautifier (similar to AStyle or Uncrustify) – originally I was going to just contribute to one of those projects, but reviewing their source led me to the conclusion that I have different design goals and that their source is written in a way that makes it difficult for an outsider to easily contribute. AStyle, for example, instead of building some sort of AST, uses over 100 state variables such as isInComment, foundClassHeader, isLineReady, etc.
What is the proper way to distinguish between keywords and identifiers?
I’m aware that most modern languages use reserved words to prevent things like keywords from being used as identifiers.
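A widely used approach with reserved words is to scan every identifier-shaped word the same way and then look it up in a keyword table. The sketch below assumes a small PL/0-style keyword set picked purely for illustration; `KEYWORDS`, `IDENT_RE`, and `classify_words` are hypothetical names, not part of any particular compiler.

```python
import re

# Hypothetical PL/0-style keyword set, chosen only for illustration.
KEYWORDS = {"begin", "end", "if", "then", "while", "do", "var", "const"}

# Assumed identifier shape: a letter or underscore, then letters, digits, underscores.
IDENT_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def classify_words(source):
    """Return (kind, text) pairs for each identifier-shaped word in source.

    Every word is scanned with the same rule; the keyword table decides
    afterwards whether it is a KEYWORD or an IDENT. Other characters
    (operators, digits, whitespace) are skipped by this sketch.
    """
    tokens = []
    for match in IDENT_RE.finditer(source):
        word = match.group()
        kind = "KEYWORD" if word in KEYWORDS else "IDENT"
        tokens.append((kind, word))
    return tokens
```

Because the whole word is matched before the lookup, a name such as `iffy` is classified as an identifier even though it starts with the keyword `if`.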
Concatenating strings given a BNF grammar
<Definition> ::= <Name> <LeftPar> <param> <RightPar>
<Name> ::= <Letter><LetterTail>
<LetterTail> ::= <Letter><LetterTail> | ''
A question that confused me while doing the derivation is the following. Let's say that after matching f++u++n we match the last char 'c' with <LetterTail> and then match c with <Letter>. When we peek now, it is '(' but how come we […]
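The excerpt is cut off, so the following is only a sketch of how a recursive-descent recognizer might handle the `<Name>` and `<LetterTail>` rules quoted above. It assumes `<Letter>` means any character for which Python's `str.isalpha()` is true; the function names `parse_name` and `parse_letter_tail` are hypothetical. The point it illustrates is that when the lookahead is not a letter, `<LetterTail>` takes its empty alternative and consumes nothing.

```python
def parse_name(text, pos=0):
    """Match <Name> ::= <Letter><LetterTail>; return (name, next_pos)."""
    if pos >= len(text) or not text[pos].isalpha():
        raise SyntaxError(f"expected a letter at position {pos}")
    end = parse_letter_tail(text, pos + 1)
    return text[pos:end], end

def parse_letter_tail(text, pos):
    """Match <LetterTail> ::= <Letter><LetterTail> | ''; return next_pos."""
    # Peek at the next character without consuming it.
    if pos < len(text) and text[pos].isalpha():
        # First alternative: consume one letter, then recurse.
        return parse_letter_tail(text, pos + 1)
    # The lookahead is not a letter (for example '('), so the empty
    # alternative is chosen and the position is left unchanged.
    return pos
```

Calling `parse_name("func(x)")` stops at index 4, leaving the `(` for whatever rule comes next in `<Definition>`.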
Clarification about Grammars , Lexers and Parsers
Background info (May Skip): I am working on a task we have been set at uni in which we have to design a grammar for a DSL we have been provided with. The grammar must be in BNF or EBNF. Among other things, we are being evaluated on the lexical rules in the grammar and the parsing rules, such as whether the rules are suitable for the language subset, how comprehensive these rules are, how clear the rules are, etc.