54 Commits

Author SHA1 Message Date
6f78d26ea1 Change the n argument of lexer_shift_buffer to size_t from int
Some checks failed
Validate the build / validate-build (push) Failing after 35s
2025-04-17 15:12:56 +02:00
1a79bf050e Remove unused ast_node_free_value
Values are all inside the ast struct and require no cleanup other than
freeing the ast struct.
2025-04-17 15:10:36 +02:00
26cb374c1d Update gitignore, add /build and remove old build artifacts 2025-04-17 15:09:29 +02:00
d97cfb97be Implement printing the encoding in main
All checks were successful
Validate the build / validate-build (push) Successful in 33s
2025-04-16 23:10:17 +02:00
99c9dcd985 Incomplete second pass encoding 2025-04-16 23:10:09 +02:00
7e9c1bfda2 Add bytes type and tests
bytes_t is a local (automatic) allocation array that carries the length
and capacity with it.
2025-04-16 23:10:09 +02:00
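The idea described in 7e9c1bfda2, an automatically allocated array that carries its own length and capacity, can be sketched roughly as follows. This is only an illustration under assumptions: the names `bytes_sketch_t`, `BYTES_SKETCH_LOCAL` and `bytes_sketch_append` are hypothetical, and the layout of the real `bytes_t` is not shown in this log.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for bytes_t: the view carries len and cap with it. */
typedef struct {
    size_t len;     /* number of bytes currently stored */
    size_t cap;     /* total number of bytes that fit */
    uint8_t *data;  /* points at storage owned by the enclosing block */
} bytes_sketch_t;

/* Assumption: back the array with a compound literal, which has automatic
 * storage duration when used inside a function body. n must be a constant. */
#define BYTES_SKETCH_LOCAL(n) \
    ((bytes_sketch_t){ .len = 0, .cap = (n), .data = (uint8_t[(n)]){0} })

/* Append n bytes; refuse (and leave b unchanged) if they do not fit. */
static bool bytes_sketch_append(bytes_sketch_t *b, const uint8_t *src, size_t n) {
    assert(b != NULL && src != NULL);
    if (n > b->cap - b->len) {
        return false;
    }
    memcpy(b->data + b->len, src, n);
    b->len += n;
    return true;
}
```

Because the storage lives in the caller's block, nothing has to be freed; the trade-off is that the value must not outlive that block.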
d8ae126e9a Add opcode encoding value for NODE_INSTRUCTION entries in the AST 2025-04-16 23:10:09 +02:00
68dcd9dcce Add first encoding pass
First pass collects all the symbols and interprets number and register
tokens into usable data for the later passes.
2025-04-16 23:10:00 +02:00
dcf90b72e0 Add register and number values to AST nodes 2025-04-16 23:10:00 +02:00
2cf69f5e18 Add initial limited opcode data 2025-04-16 23:09:47 +02:00
d59559d327 Add registers data table
Change the validated primitive parse_register so that it uses the data
table instead
2025-04-16 13:46:19 +02:00
ac14925a0a Add symbols tests 2025-04-16 13:46:19 +02:00
2a7bb479ac initial symbol table implementation 2025-04-16 13:46:19 +02:00
ef22c0b620 Add .import and .export to the input test file 2025-04-16 13:46:19 +02:00
8c0e9926c5 Make main properly return with failure on parsing errors 2025-04-16 13:46:19 +02:00
d3d69b82d5 Add .import and .export directive to the grammar and parser 2025-04-16 13:46:10 +02:00
dc210e409c fix parse_immediate to accept label_reference instead of identifier 2025-04-16 13:41:28 +02:00
00272d69bf Add regression test for parse zero operands at eof
All checks were successful
Validate the build / validate-build (push) Successful in 30s
2025-04-16 13:16:55 +02:00
2385d38608 Prune the parse tree of NODE_NEWLINE after parsing succeeds 2025-04-16 13:01:02 +02:00
242fd9baa5 Fix grammar not being able to disambiguate some instructions
When two identifiers follow each other it could be two instruction
mnemonics or one instruction mnemonic and one operand. To fix this
TOKEN_NEWLINE has been reintroduced as a semantic token. The grammar has
been changed to allow empty statements and every instruction and
directive has to end in a newline. Labels do not have to end in a
newline.

In addition to updating the grammar, the implementation of tokenlist,
ast and parser has been updated to reflect these changes.
2025-04-16 12:34:44 +02:00
1574ec6249 Fix parse_consecutive behavior when the token stream runs out 2025-04-16 12:13:02 +02:00
92c63092a1 Add regression test for trivia at the head of tokenlist
All checks were successful
Validate the build / validate-build (push) Successful in 29s
2025-04-09 01:17:09 +02:00
5560de2904 Make sure parse skips past initial trivia in the tokenlist 2025-04-09 01:15:51 +02:00
2bea87b39a Run tests in the validate gitea action
All checks were successful
Validate the build / validate-build (push) Successful in 29s
2025-04-06 09:23:25 +02:00
2eb7b3c2f1 use llvm to generate test coverage 2025-04-06 09:17:51 +02:00
f1f4c93a8e Fix bug in lexer_next_number not correctly tracking character number
All checks were successful
Validate the build / validate-build (push) Successful in 28s
When a number has a suffix the lexer state didn't record the number of
characters consumed for this suffix. This left the lexer state 2-3
characters short in its line location reporting until it encountered a
newline character. It did not otherwise corrupt the state of the lexer.
2025-04-05 01:41:40 +02:00
27099c9899 Add initial unit tests
- Add µnit source and header files
- Add test target to the build system
- Implement a thorough lexer test suite
- Implement a minimal AST test suite
2025-04-05 01:37:04 +02:00
3fead8017b Rename lexer errors 2025-04-05 01:37:04 +02:00
af66790cff Clean up error definitions, location and expose them in the headers
- Expose all errors in the header file so any user of the API can test
  for the specific error conditions
- Mark all static error pointers as const
- Move generic errors into error.h
- Name all errors err_modulename_* for errors that belong to a specific
  module and err_* for generic errors.
2025-04-05 01:37:04 +02:00
cb8768b1d0 Make clangd aware of the _POSIX_C_SOURCE define in the build system 2025-04-05 01:37:04 +02:00
1571c52012 Add some building documentation that clarifies the make targets
All checks were successful
Validate the build / validate-build (push) Successful in 26s
2025-04-04 02:18:11 +02:00
0f9ced8eb1 Rework the build system to be more modular
Split most of the work off into make/base.mk and allow for easy wrappers
to be created around it that can build with different instrumentation
in their own build directory.

Create wrappers for the following:
 - release build
 - debug build
 - afl++ fuzzing build
 - static analysis with clang
 - clang memory sanitizer
 - clang address/undefined sanitizer
2025-04-04 02:18:02 +02:00
0d3881f680 Update the test input file to contain all AST nodes
All checks were successful
Validate the build / validate-build (push) Successful in 36s
2025-04-02 21:41:27 +02:00
5ea942024f add functionality to main to parse and print the ast 2025-04-02 20:57:02 +02:00
b4757e008c Add parse_result_wrap to wrap a result with another parent node
Use the new wrap function to wrap numbers and immediate nodes
2025-04-02 20:57:02 +02:00
b70b6896bf Partial parser implementation 2025-04-02 20:56:59 +02:00
6ca7bb3661 Fix incorrect size comparison in lexer_consume_n
The buffer length len and the requested number of tokens n are mixed up
in an invalid comparison. This causes all valid requests for n < len
tokens to be denied and all invalid requests for n > len tokens to be
accepted. This may cause a buffer overflow if the caller requests more
characters than they provide space for.
2025-04-02 20:41:49 +02:00
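The size check described in 6ca7bb3661 can be illustrated with a minimal sketch. The function `consume_n_sketch` and its signature are hypothetical and are not the real `lexer_consume_n`; accepting `n == len` is also an assumption, since the log does not say whether the real function reserves room for a terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for lexer_consume_n: copy n characters from src into
 * buffer, which has room for len characters. Returns false when the request
 * does not fit. */
static bool consume_n_sketch(const char *src, size_t n, char *buffer, size_t len) {
    assert(src != NULL && buffer != NULL);
    /* A comparison written the other way round (rejecting when n < len)
     * would deny requests that fit and accept requests that overflow,
     * which matches the behaviour the commit message describes. */
    if (n > len) {
        return false;
    }
    memcpy(buffer, src, n);
    return true;
}
```

With a 4-byte buffer, a request for 3 characters succeeds and a request for 5 is refused before any copy takes place.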
d424c0f886 Add a parser combinator to parse a delimited list 2025-04-02 20:41:49 +02:00
c66489dd90 Add basic parser combinators 2025-04-02 20:41:49 +02:00
44fa66c2b7 Add "primitive" parsers for all the non-trivia tokens in the lexer grammar 2025-04-02 20:41:42 +02:00
c48adb1306 Add basic parser utilities 2025-04-02 20:38:35 +02:00
5fb6ebef28 Add functions to skip over trivia in a tokenlist 2025-04-02 11:59:24 +02:00
bbdcad024f Add function to print the AST 2025-04-02 11:50:25 +02:00
935da30257 Add basic AST functionality 2025-04-02 11:35:53 +02:00
34ace36920 Add a parser grammar 2025-04-02 11:28:58 +02:00
bd37ddaeea Add tokenlist, a linked list of lexer tokens
The linked list is doubly linked so the parser can look forward into it
and error reporting can look backward.

This commit also reworks main to use the tokenlist instead of dealing
with the lexer manually.
2025-03-31 18:43:34 +02:00
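The doubly linked layout described in bd37ddaeea, with forward links for parser lookahead and backward links for error reporting, might look roughly like this. The type and function names are hypothetical, and a plain string stands in for the real lexer token; the actual tokenlist API is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical list entry: next serves lookahead, prev serves error context. */
typedef struct token_entry_sketch {
    const char *text;                 /* stand-in for the lexer token */
    struct token_entry_sketch *next;  /* following token, NULL at the tail */
    struct token_entry_sketch *prev;  /* preceding token, NULL at the head */
} token_entry_sketch_t;

typedef struct {
    token_entry_sketch_t *head;
    token_entry_sketch_t *tail;
} tokenlist_sketch_t;

/* Append entry at the tail, wiring up both directions. */
static void tokenlist_sketch_append(tokenlist_sketch_t *list,
                                    token_entry_sketch_t *entry) {
    assert(list != NULL && entry != NULL);
    entry->next = NULL;
    entry->prev = list->tail;
    if (list->tail != NULL) {
        list->tail->next = entry;
    } else {
        list->head = entry;  /* first entry becomes the head */
    }
    list->tail = entry;
}
```

The sketch leaves allocation to the caller so that it stays short; a real implementation would also need to decide who owns and frees the entries.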
42da7b1d05 Move err_allocation_failed into error.c and make it available to
everyone.
2025-03-31 18:43:34 +02:00
75fc72c35d Add action to run validation on every commit
All checks were successful
Validate the build / validate-build (push) Successful in 23s
Adds some flags to the makefile to make it build on alpine with a
different libc
2025-03-31 14:36:15 +02:00
5cdb60d395 Remove peek function 2025-03-30 22:51:47 +02:00
e5830daac9 Add documentation comments to the lexer code 2025-03-30 22:51:15 +02:00