oas/doc/parser_grammar.txt
omicron 242fd9baa5 Fix grammar not being able to disambiguate some instructions
When two identifiers follow each other, they could be either two
instruction mnemonics or one mnemonic followed by an operand. To fix
this, TOKEN_NEWLINE has been reintroduced as a semantic token. The
grammar now allows empty statements, and every instruction and directive
has to end in a newline. Labels do not have to end in a newline.

In addition to updating the grammar, the implementation of tokenlist,
ast and parser has been updated to reflect these changes.
2025-04-16 12:34:44 +02:00
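
The disambiguation described in the commit message can be sketched as follows. This is only an illustration in Python, not the project's tokenlist or parser code. All token kind names except the newline token, and the helpers `tokenize` and `split_statements`, are hypothetical. The sketch also assumes that a mnemonic without operands (such as `ret`) is a valid instruction, which the commit message implies.

```python
import re

# Hypothetical token kinds; only the newline token is named in the commit message.
TOKEN_SPEC = [
    ("NEWLINE", r"\n"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("NUMBER", r"[0-9]+"),
    ("COLON", r":"),
    ("COMMA", r","),
    ("DOT", r"\."),
    ("SKIP", r"[ \t]+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Turn source text into (kind, text) pairs, keeping newlines as tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character {source[pos]!r}")
        if match.lastgroup != "SKIP":  # spaces and tabs carry no meaning
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def split_statements(tokens):
    """Group tokens into labels, directives and instructions."""
    statements = []
    i = 0
    while i < len(tokens):
        kind, text = tokens[i]
        if kind == "NEWLINE":
            i += 1  # empty statement
            continue
        next_kind = tokens[i + 1][0] if i + 1 < len(tokens) else None
        if kind == "IDENTIFIER" and next_kind == "COLON":
            statements.append(("label", [text]))  # labels need no newline
            i += 2
            continue
        # Instructions and directives run up to the next newline token.
        j = i
        while j < len(tokens) and tokens[j][0] != "NEWLINE":
            j += 1
        if j == len(tokens):
            raise ValueError("instruction or directive must end in a newline")
        name = "directive" if kind == "DOT" else "instruction"
        statements.append((name, [t for _, t in tokens[i:j]]))
        i = j + 1
    return statements
```

With the newline kept as a token, `"ret\nret\n"` splits into two instructions, while `"ret ret\n"` is read as one mnemonic followed by one operand. Without the newline token both inputs would produce the same token stream.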

<program> ::= <statement>*
<statement> ::= <label> | <directive> | <instruction> | <newline>
<label> ::= <identifier> <colon>
<directive> ::= <dot> <section_directive> <newline>
<section_directive> ::= <section> <identifier>
<instruction> ::= <identifier> <operands>? <newline>
<operands> ::= <operand> ( <comma> <operand> )*
<operand> ::= <register> | <immediate> | <memory>
<immediate> ::= <number> | <label_reference>
<number> ::= <octal> | <binary> | <decimal> | <hexadecimal>
<label_reference> ::= <identifier>
<memory> ::= <lbracket> <memory_expression> <rbracket>
<memory_expression> ::= <label_reference> | <register_expression>
<register_expression> ::= <register> <register_index>? <register_offset>?
<register_index> ::= <plus> <register> <asterisk> <number>
<register_offset> ::= <plus_or_minus> <number>
<plus_or_minus> ::= <plus> | <minus>
/* The following are lexer identifier tokens whose string value must match the given literal */
<section> ::= "section"
<register> ::= "rax" | "rbx" | "rcx" | "rdx" | "rsi" | "rdi" | "rbp" | "rsp" |
               "r8" | "r9" | "r10" | "r11" | "r12" | "r13" | "r14" | "r15"
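
To illustrate how the `<memory>` rules above fit together, here is a small recursive-descent sketch in Python. It is not the project's parser: the function `parse_memory`, the dictionary layout it returns, and the use of plain strings as tokens are all assumptions made for this example. It handles decimal numbers only, since the lexical form of the other `<number>` variants is not given here. Because both `<register_index>` and `<register_offset>` may start with `<plus>`, the sketch looks one token past the plus sign to decide between them.

```python
REGISTERS = {
    "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
}


def parse_memory(tokens):
    """Parse <memory> from a list of token strings (hypothetical token format)."""
    pos = 0

    def peek(offset=0):
        index = pos + offset
        return tokens[index] if index < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        token = peek()
        if token is None or (expected is not None and token != expected):
            raise ValueError(f"expected {expected or 'a token'}, got {token!r}")
        pos += 1
        return token

    def take_number():
        token = take()
        if not token.isdigit():  # decimal only in this sketch
            raise ValueError(f"expected a number, got {token!r}")
        return int(token)

    take("[")
    if peek() in REGISTERS:
        # <register_expression> ::= <register> <register_index>? <register_offset>?
        node = {"base": take(), "index": None, "scale": None, "offset": 0}
        # A plus followed by a register starts <register_index>.
        if peek() == "+" and peek(1) in REGISTERS:
            take("+")
            node["index"] = take()
            take("*")
            node["scale"] = take_number()
        # Any remaining plus or minus starts <register_offset>.
        if peek() in ("+", "-"):
            sign = -1 if take() == "-" else 1
            node["offset"] = sign * take_number()
    else:
        # <label_reference> ::= <identifier>
        label = take()
        if not label.isidentifier():
            raise ValueError(f"expected a label, got {label!r}")
        node = {"label": label}
    take("]")
    if pos != len(tokens):
        raise ValueError("trailing tokens after ']'")
    return node
```

For example, the tokens of `[rax + rbx * 8 - 16]` yield base `rax`, index `rbx`, scale `8` and offset `-16`, while `[rsp + 8]` yields only a base and an offset of `8`.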