When two identifiers follow each other, they could be two instruction
mnemonics or one instruction mnemonic and one operand. To resolve this
ambiguity, TOKEN_NEWLINE has been reintroduced as a semantic token. The
grammar now allows empty statements and requires every instruction and
directive to end in a newline. Labels do not have to end in a newline.
In addition to updating the grammar, the implementation of tokenlist,
ast and parser has been updated to reflect these changes.
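
A minimal sketch of how a newline token removes the ambiguity. Only
TOKEN_NEWLINE comes from this change; TOKEN_IDENTIFIER, TOKEN_COLON,
count_instructions() and the "identifier followed by a colon" label
syntax are assumptions made for illustration, not the project's parser.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token kinds; only TOKEN_NEWLINE is named in the text. */
enum token_kind { TOKEN_IDENTIFIER, TOKEN_COLON, TOKEN_NEWLINE };

/*
 * Count the instructions in a token stream. Returns -1 when an
 * instruction is not terminated by a newline or a colon is misplaced.
 */
int count_instructions(const enum token_kind *toks, size_t n)
{
    int count = 0;
    size_t i = 0;

    assert(toks != NULL || n == 0);

    while (i < n) {
        if (toks[i] == TOKEN_NEWLINE) {
            i++;                    /* empty statement */
        } else if (toks[i] == TOKEN_IDENTIFIER &&
                   i + 1 < n && toks[i + 1] == TOKEN_COLON) {
            i += 2;                 /* label: no newline required */
        } else if (toks[i] == TOKEN_IDENTIFIER) {
            i++;                    /* instruction mnemonic */
            while (i < n && toks[i] == TOKEN_IDENTIFIER)
                i++;                /* identifiers on the same line are operands */
            if (i >= n || toks[i] != TOKEN_NEWLINE)
                return -1;          /* instruction must end in a newline */
            i++;
            count++;
        } else {
            return -1;              /* colon without a preceding identifier */
        }
    }
    return count;
}
```

With the newline as a terminator, "identifier identifier newline" is
one instruction with one operand, while "identifier newline identifier
newline" is two instructions.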
When a number had a suffix, the lexer state did not record the number of
characters consumed for the suffix. This left the lexer state 2-3
characters short in its line location reporting until it encountered a
newline character. It did not otherwise corrupt the state of the lexer.
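
A rough sketch of the idea behind the fix: every consumed character,
including the suffix, goes through one helper that also advances the
column. The struct layout, the function names and the suffix syntax (a
letter followed by letters or digits, e.g. "u8") are assumptions for
illustration only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical lexer state; field names are assumptions. */
struct lexer {
    const char *src;  /* NUL-terminated input */
    size_t pos;       /* offset into src */
    size_t column;    /* 1-based column used for error reporting */
};

/* Consume one character and keep the reported column in sync. */
static void lexer_advance(struct lexer *lex)
{
    lex->pos++;
    lex->column++;
}

/* Lex digits plus an optional suffix; returns the characters consumed. */
size_t lex_number(struct lexer *lex)
{
    size_t start;

    assert(lex != NULL && lex->src != NULL);
    start = lex->pos;

    while (isdigit((unsigned char)lex->src[lex->pos]))
        lexer_advance(lex);

    /* The suffix is consumed through the same helper as the digits,
     * so the column cannot fall behind the read position. */
    if (lex->pos > start && isalpha((unsigned char)lex->src[lex->pos])) {
        while (isalnum((unsigned char)lex->src[lex->pos]))
            lexer_advance(lex);
    }
    return lex->pos - start;
}
```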
- Expose all errors in the header file so any user of the API can test
for specific error conditions
- Mark all static error pointers as const
- Move generic errors into error.h
- Name all errors err_modulename_* for errors that belong to a specific
module and err_* for generic errors.
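
A small sketch of what this convention could look like from the
caller's side. It assumes errors are const pointers to message strings
that callers compare by address; the error names, messages and
lexer_expect_digit() are hypothetical and only follow the
err_modulename_* / err_* pattern.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Generic error, as it might be declared in error.h (hypothetical). */
const char *const err_out_of_memory = "Out of memory";

/* Module-specific error, named err_modulename_* (hypothetical). */
const char *const err_lexer_unexpected_character = "Unexpected character";

/* Returns NULL on success or a pointer to one of the errors above. */
const char *lexer_expect_digit(int c)
{
    if (c < 0 || c > 255 || !isdigit(c))
        return err_lexer_unexpected_character;
    return NULL;
}

/* Because the errors are exposed, a caller can test for a specific
 * condition by comparing pointers instead of parsing message text. */
void example_caller(void)
{
    const char *err = lexer_expect_digit('x');

    assert(err == err_lexer_unexpected_character);
    assert(err != err_out_of_memory);
}
```

In a real header the definitions would be extern declarations, with
the definitions living in the corresponding .c file.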
Split most of the work off into make/base.mk so that thin wrappers can
be created around it, each building with different instrumentation in
its own build directory.
Create wrappers for the following:
- release build
- debug build
- afl++ fuzzing build
- static analysis with clang
- clang memory sanitizer
- clang address/undefined sanitizer
The buffer length len and the requested number of tokens n are mixed up
in an invalid comparison. This causes all valid requests for n < len
tokens to be denied and all invalid requests for n > len tokens to be
accepted. This may cause a buffer overflow if the caller requests more
tokens than the buffer has space for.
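
A minimal sketch of the corrected check, assuming a function that
copies n tokens into a caller-provided buffer with room for len tokens.
The name copy_tokens(), the int token type and the -1 return value are
placeholders, not the project's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy n tokens from src into buf, which has room for len tokens.
 * Returns 0 on success and -1 if the request does not fit.
 */
int copy_tokens(int *buf, size_t len, const int *src, size_t n)
{
    assert(buf != NULL && src != NULL);

    /* Reject only requests that exceed the buffer. With the operands
     * swapped, fitting requests are rejected and oversized ones pass. */
    if (n > len)
        return -1;

    memcpy(buf, src, n * sizeof *buf);
    return 0;
}
```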
The linked list is doubly linked so the parser can look forward into it
and error reporting can look backward.
This commit also reworks main to use the tokenlist instead of dealing
with the lexer manually.
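
A sketch of a doubly linked token list along the lines described above.
The struct and function names and the int token kind are assumptions
for illustration; the real tokenlist may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node: next serves parser lookahead, prev serves
 * error reporting that needs earlier tokens for context. */
struct token_node {
    int kind;
    struct token_node *prev;
    struct token_node *next;
};

struct tokenlist {
    struct token_node *head;
    struct token_node *tail;
};

/* Append a token to the end of the list. Returns 0, or -1 if the
 * allocation fails. */
int tokenlist_append(struct tokenlist *list, int kind)
{
    struct token_node *node;

    assert(list != NULL);
    node = malloc(sizeof *node);
    if (node == NULL)
        return -1;

    node->kind = kind;
    node->next = NULL;
    node->prev = list->tail;
    if (list->tail != NULL)
        list->tail->next = node;
    else
        list->head = node;
    list->tail = node;
    return 0;
}

/* Release every node and reset the list to empty. */
void tokenlist_free(struct tokenlist *list)
{
    struct token_node *node;

    assert(list != NULL);
    node = list->head;
    while (node != NULL) {
        struct token_node *next = node->next;
        free(node);
        node = next;
    }
    list->head = NULL;
    list->tail = NULL;
}
```

From any node, the parser can follow next to peek ahead, and error
reporting can follow prev to recover the tokens that came before.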