These posts describe the development of NORCAL: a C compiler that targets the NES, with its 6502-like CPU.
January 15, 2019
I am setting up my toolchain: compiler, simulator, test harness.
January 17, 2019
For running automated tests, I've put together a very simple "headless" NES simulator. I took an open source 6502 CPU simulator and configured its memory map to match the NES. To specify the program, you give the sim a standard iNES file, just like any other NES emulator. This won't work for testing any graphical or special-purpose hardware -- only the CPU, RAM, and ROM. This should be enough to test all compiler features.
January 18, 2019
These are my priorities for the compiler:
Obviously, they will often conflict.
January 22, 2019
Parsing isn't interesting, but I need to be able to parse source code in order to run automated tests. For now, to allow me to use a very simple (and quick to implement) parser, programs will be written in a simple S-expression syntax. Hopefully I can get this parser working in just a couple days so that I can move onto code generation.
January 23, 2019
As I hoped, the simple parser was easy to implement. I've moved on to generating code for loading and storing values through pointers. For example:
*0x6000 = 123; *0x6000 = *0x6000;
Simple, untyped assignment expressions.
These kind of expressions won't work once the compiler starts enforcing C's type system, but for now they are useful test cases.
February 4, 2019
It didn't take long for me to get tired of the temporary syntax, and it didn't extend well to handling blocks and function declarations. I've now written a lexer and parser that will parse legitimate C grammar.
May 5, 2020
While I haven't kept up this project log, progress on the compiler continues. Here is a brief overview of the progress since the last update:
It is (hopefully) nearing completion: in February I compiled a fairly large homebrew game and ran it successfully on an NES emulator (Mesen). So why is the project still not done? There are two big remaining issues.
First, the February version of the compiler allocated all function parameters and local variables as globals, instead of on a stack. This was a simple solution that I hoped would be good enough, but it wasn't: the test program is too large, and this approach couldn't fit all its variables in the NES' limited RAM. Global, permanent allocation of all variables consumes a lot of memory, makes the code bigger (because of all the absolute addressing), and means that functions can't be re-entrant, since they only have one set of local variables.
Second, I still haven't reached the point of implementing all the optimization special cases that I will need in order to generate acceptable 6502 code.
The first issue is the most severe one. I believe it requires serious changes to the structure of the code generator, which are in progress. I will probably write about the new approach in future posts. The optimization issue, which is endlessly deferred, I am actually less worried about. I am even looking forward to it; figuring out how to generate clever code for the 6502 is really the original purpose of this project.
I think that I've finally found a good way to describe why writing a compiler for the 6502 is difficult. When I've previously tried to explain it to people, I would just wave my hands around and talk about the irregular registers and the numerous specialized address modes, but none of that seemed to convey the fundamental issue.
Here is my new explanation: It is difficult to generate good 6502 code because you need so much context. Look at this C statement:
n = n + 1;
In this statement there are three "primary" expressions and two operators. (Primary expressions are things like variable names and numeric literals; they are "primary" because they aren't made up of sub-expressions.) The expressions are the variable name n
-- appearing twice -- and the number one. The operators are "add" and "store".
If I try to compile one element at a time... how does that work? What code do I generate for n
in isolation? One classic solution is to push operands onto a stack, and then pop them and use them as inputs to operators. The pseudo-code would look something like this:
PUSH #n PUSH n PUSH #1 JSR _add JSR _store
There are a bunch of problems with this. How is "PUSH" actually implemented? The 6502 doesn't have a very convenient way to manipulate data on a stack. If there is a uniform stack of operands, what size are the operands? If they are two bytes, then most operations will be needlessly slow, because word-sized operations on the 6502 are slowed than byte-sized ones. If data elements on the stack are one byte, then you can't easily push pointers -- which are common in C -- on the stack.
; push address of "n" DEX DEX LDA #n STA (0,X) LDA #n+1 STA (1,X) ; push "n" DEX DEX LDA n STA (0,X) LDA n+1 STA (1,X) ; push "1" DEX DEX LDA #1 STA (0,X) LDA n+1 STA (1,X) ; add JSR _add ; store JSR _store
This looks terrible and will have terrible performance; probably about 100 cycles.
You can generate better code by looking at more context. Consider x + 1
; this is a routine kind of operation on the 6502.
LDA n CLC ADC #1
And then if you are moderately intelligent and look at the left-side n =
all together, you only need one more instruction, STA n
. This is vastly better code, and is something that you might find in reasonable 6502 source code.
However, this is the vital problem: a C compiler has to handle every possible expression, and the code generated above doesn't work for everything. What if instead of "n" there is a complicated sub-expression? What if the operands are words, not bytes? What if the operation is division instead of addition? So this solution isn't universal, and it isn't even the best possible code!
To generate really good 6502 code, you have to look at the entire statement at once. If you do, you will see that there is a solution that is obviously optimal:
INC n
This is very easy for a human programmer to see, but hard to get the compiler to see. This isn't a pattern that you can use to compile C expressions in general, it is incredibly specific! If the variable on the left side was something else, you would have to do something fairly different.
Hopefully this explains why it is tricky to generate good code for the 6502. The compiler must generate correct code for all inputs -- all valid C programs -- but it must also recognize the numerous and intricate special situations in which a much better result is possible.
After a year and a half of development, NORCAL can compile Hackmatch for the NES! At this point, we can finish implementing the game itself.
And now, time for a new foolish programming language project.
HACK*MATCH has been published!
If you have comments, questions, or suggestions regarding this topic, please email me at holmak+blog@gmail.com!