CompilerX32 - Compiler Workshop
Alpha Preview: Native x86-32 Self-Hosting Compiler

CompilerX32 v0.1.0 ALPHA

Multi-Language Frontend - Unified IR - Native x86

A native x86-32 compiler for PowerBASIC-compatible BASIC with a complete multi-pass preprocessor (macros, equates, conditionals, MACROTEMP). Four syntax worlds — BASIC, C, ASM, PILOT — coexist in the same source file through a language-switching stack. No GCC. No LLVM. Just pure native machine code.

Active Development — PBWin host build: 1.1 MB EXE, 0 errors
Download: coming soon | Help File (CHM) | Multi-Pass Preprocessor: COMPLETE

PBWin host compile: 1,118,208 bytes

Multi-pass preprocessor (6 modules)

Macro expansion with MACROTEMP

Built-in macros (__LINE__, __COUNTER__, ...)

C-preprocessor features (##, #, #undef)

635 PB11 keywords in lexer

Self-hosting: in progress

7-stage pipeline: Lex → Parse → Sem → IR → CodeGen → Asm → Link
4 language frontends: BASIC, C, ASM, PILOT with nested switching
635 PB11 keywords, 1,564 constants, 27 data types, 13 operators
~2,500 lines of preprocessor code across 6 dedicated modules

What Changed in This Update

Multi-Pass Preprocessor Complete

The Pass 2 EXPAND engine is fully implemented. It runs a stability loop that repeats until a pass leaves the source unchanged, so macros can reference macros defined later in the source without phase errors or artificial limits. The engine handles line continuation (_), equate expansion, built-in macros, and MACROTEMP.

Macro Engine with Parameters

Parameterized macros refer to their arguments with \1 through \16. MACROTEMP generates a unique name per expansion (e.g. tmp_MT0001). MACRO FUNCTION return expressions are supported. Recursion protection prevents infinite expansion, and nested macro expansions resolve automatically.
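As a rough illustration of the substitution described above, the following Python sketch models \1..\16 replacement and MACROTEMP renaming. It is not CompilerX32's code: the class MacroExpander, its expand method, and the use of a single expansion counter are assumptions made for the example. Only the \N syntax and the tmp_MT0001 naming pattern are taken from the text.

```python
import re


class MacroExpander:
    """Toy model of \\1..\\16 substitution with MACROTEMP renaming (hypothetical)."""

    MAX_ARGS = 16  # the text allows \1 through \16

    def __init__(self):
        # Assumption: one counter shared by all expansions drives the suffix.
        self.expansion_count = 0

    def expand(self, body, args, temps=()):
        if len(args) > self.MAX_ARGS:
            raise ValueError("a macro takes at most 16 arguments")
        self.expansion_count += 1
        suffix = f"_MT{self.expansion_count:04d}"

        # Rename MACROTEMP identifiers first, so that argument text inserted
        # afterwards is never renamed by accident.
        for name in temps:
            unique = name + suffix
            body = re.sub(rf"\b{re.escape(name)}\b", lambda m, u=unique: u, body)

        def substitute(match):
            index = int(match.group(1))
            if index > len(args):
                raise ValueError(f"\\{index} used but only {len(args)} arguments given")
            return args[index - 1]

        # Two-digit indices (10..16) are tried before single digits.
        return re.sub(r"\\(1[0-6]|[1-9])", substitute, body)
```

Under these assumptions, expanding the body of a SWAP-style macro ("tmp = \1", "\1 = \2", "\2 = tmp") with arguments x and y and the temp name tmp gives tmp_MT0001 = x, x = y, y = tmp_MT0001. A second expansion uses tmp_MT0002, so the two temporaries cannot collide.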

Built-in Macros

Eight built-in macros are always available: __LINE__, __FILE__, __DATE__, __TIME__, __COUNTER__ (monotonic counter), __CX32__ ("CompilerX32"), __CX32_VERSION__ ("0.1.0"), __PBWIN__ (1).
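The behaviour of three of these built-ins can be sketched in a few lines of Python. This is only an illustration: BuiltinExpander and expand_line are hypothetical names, and __DATE__ and __TIME__ are left out because their output format is not stated here.

```python
import re


class BuiltinExpander:
    """Toy expansion of __LINE__, __FILE__ and __COUNTER__ (hypothetical names)."""

    def __init__(self, filename):
        self.filename = filename
        self.counter = 0  # __COUNTER__ starts at 0 and only ever increases

    def expand_line(self, text, line_number):
        def replace(match):
            name = match.group(1)
            if name == "LINE":
                return str(line_number)
            if name == "FILE":
                return f'"{self.filename}"'
            # __COUNTER__: hand out the current value, then advance it.
            value = self.counter
            self.counter += 1
            return str(value)

        return re.sub(r"\b__(LINE|FILE|COUNTER)__\b", replace, text)
```

Because the counter lives on the expander object rather than on a single line, two uses of __COUNTER__ on the same line receive different values, and later lines continue from where the earlier ones stopped.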

C-Preprocessor in #CCODE Blocks

Inside #CCODE / #CINLINE blocks: ## token pasting (foo##bar → foobar), # stringification (#x → "x"), #undef, #error, #pragma, #warning, #line.
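For readers unfamiliar with the two C operators, here is a minimal Python sketch of how ## and # act on a macro body. It follows the usual C order (stringify and substitute parameters first, paste afterwards). The function names are hypothetical, and the sketch deliberately ignores string literals already present in the body, so it should not be read as the compiler's implementation.

```python
import re

# Either '#name' (a single '#', not part of '##') or a plain identifier.
_TOKEN = re.compile(r"(?<!#)#(?!#)\s*([A-Za-z_]\w*)|([A-Za-z_]\w*)")


def quote(text):
    """Wrap text in double quotes, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def expand_c_macro_body(body, args):
    """Apply '#', parameter substitution and '##' to a macro body (toy model)."""

    def replace(match):
        stringified, plain = match.group(1), match.group(2)
        if stringified is not None:
            # '#name': only macro parameters are stringified.
            if stringified in args:
                return quote(args[stringified])
            return match.group(0)
        # Plain identifier: substitute it if it is a parameter.
        return args.get(plain, plain)

    substituted = _TOKEN.sub(replace, body)
    # '##': join the neighbouring tokens, dropping whitespace around the operator.
    return re.sub(r"\s*##\s*", "", substituted)
```

With no parameters, the body foo##bar becomes foobar, and with a parameter x bound to the text x, the body #x becomes "x", matching the two examples given above.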

Preprocessor Architecture

Raw Source (.bas)
  |
  |-- PASS 1: COLLECT --
  |     #INCLUDE / #CINCLUDE  -> load and recursively scan
  |     MACRO / #define       -> macro table
  |     %EQUATE / $EQUATE     -> equate table
  |     #OVERRIDE             -> function signature overrides
  |     #ASM...#ENDASM        -> VERBATIM (skip)
  |
  |-- PASS 2: EXPAND (stability loop) --
  |     1. Line continuation (_) resolution
  |     2. Equate expansion (%NAME, $NAME)
  |     3. Built-in macros (__LINE__, __COUNTER__, etc.)
  |     4. User macro expansion (\1..\16, MACROTEMP)
  |     5. C-features in #CCODE (##, #, #undef)
  |     6. New #INCLUDE detection -> restart from Pass 1
  |     Repeat until source stabilizes (max 32,768 passes)
  |
  |-- PASS 3: CONDITIONALS --
  |     #ifdef / #ifndef / #if / #elif / #else / #endif
  |     #SELECT / #CASE / #DEFAULT / #ENDSELECT
  |     #ERROR / #PRINT
  |
  v
Fully expanded text -> Lexer -> Parser -> Semantic -> IR -> Backend -> PE32

The stability loop ensures that macro definition order does not matter. A macro can reference another macro defined later in the source. If macro expansion reveals new #INCLUDE directives, the entire pipeline restarts to collect and expand the newly included content.
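The idea of a stability loop (a fixed-point iteration) can be sketched as follows. This Python example is an assumption-laden simplification: it handles only parameterless macros, does not model the #INCLUDE restart, and the names expand_once and expand_until_stable are invented for the illustration. The 32,768-pass cap is the figure quoted in the pass diagram.

```python
import re

MAX_PASSES = 32768  # cap quoted for the Pass 2 loop


def expand_once(text, macros):
    """One pass: replace every occurrence of a known macro name with its body."""
    if not macros:
        return text
    names = "|".join(re.escape(name) for name in macros)
    pattern = re.compile(rf"\b({names})\b")
    return pattern.sub(lambda m: macros[m.group(1)], text)


def expand_until_stable(text, macros, max_passes=MAX_PASSES):
    """Repeat expand_once until a pass changes nothing; return (text, passes)."""
    for pass_number in range(1, max_passes + 1):
        expanded = expand_once(text, macros)
        if expanded == text:
            return text, pass_number  # fixed point reached
        text = expanded
    raise RuntimeError(f"source did not stabilize within {max_passes} passes")


# AREA refers to WIDTH and HEIGHT, which are "defined later" in the table.
macros = {"AREA": "(WIDTH * HEIGHT)", "WIDTH": "4", "HEIGHT": "SIDE", "SIDE": "3"}
result, passes = expand_until_stable("r = AREA", macros)
print(result, passes)
```

Here the text settles on r = (4 * 3) after four passes: three that change it and one that confirms nothing is left to expand. Definition order plays no role because every pass consults the full macro table. In this sketch a self-referential macro is stopped only by the pass cap, whereas the text describes dedicated recursion protection.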

Multi-Language Example

FUNCTION PBMAIN() AS LONG
    LOCAL x AS LONG
    x = 10

    ' Switch to C
    #CCODE
    int y = x + 5;
    printf("C says: %d\n", y);
    #ENDC

    ' Switch to ASM
    #ASM
    MOV EAX, [x]
    ADD EAX, 5
    MOV [x], EAX
    #ENDASM

    ' Nested: BASIC -> C -> ASM -> BASIC
    #CCODE
    int z = x + 10;
    #ASM
    PUSH EAX
    MOV EAX, [z]
    #ENDASM
    #ENDC

    FUNCTION = x
END FUNCTION

BASIC variables are accessible by name in both C and ASM blocks. The language-switching stack tracks nested modes automatically. Mismatched closers produce errors showing the opening line.
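One way to picture the language-switching stack is the Python sketch below. It is a guess at the general technique, not the compiler's code: track_languages, LanguageSwitchError, and the exact error wording are hypothetical. The directive names come from the frontend table, the 32,768-level depth from the status list, and the single-line !line ASM form is not modelled.

```python
MAX_DEPTH = 32768  # nesting depth quoted for the switching stack

OPENERS = {"#CCODE": "C", "#CINLINE": "C", "#ASM": "ASM",
           "#PILOT": "PILOT", "#PCODE": "PILOT"}
CLOSERS = {"#ENDC": "C", "#ENDASM": "ASM", "#ENDPILOT": "PILOT"}


class LanguageSwitchError(Exception):
    """Raised for mismatched, stray, or missing closing directives."""


def track_languages(lines):
    """Return the language in effect after each line has been processed."""
    stack = [("BASIC", 0)]  # (language, line where the block was opened)
    modes = []
    for number, line in enumerate(lines, start=1):
        words = line.split()
        directive = words[0].upper() if words else ""
        if directive in OPENERS:
            if len(stack) >= MAX_DEPTH:
                raise LanguageSwitchError(f"line {number}: nesting too deep")
            stack.append((OPENERS[directive], number))
        elif directive in CLOSERS:
            language, opened_at = stack[-1]
            if len(stack) == 1:
                raise LanguageSwitchError(
                    f"line {number}: {directive} without an open block")
            if language != CLOSERS[directive]:
                raise LanguageSwitchError(
                    f"line {number}: {directive} cannot close the "
                    f"{language} block opened on line {opened_at}")
            stack.pop()
        modes.append(stack[-1][0])
    if len(stack) > 1:
        language, opened_at = stack[-1]
        raise LanguageSwitchError(
            f"{language} block opened on line {opened_at} is never closed")
    return modes
```

Storing the opening line number next to each language is what allows the error for a mismatched closer to point back at the directive that opened the block.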

Macro Expansion Example

MACRO ADD(x, y) = (\1 + \2)
MACRO DOUBLE(x) = ADD(\1, \1)

MACRO SWAP(a, b)
    MACROTEMP tmp
    tmp = \1
    \1 = \2
    \2 = tmp
END MACRO

DIM x AS LONG, y AS LONG
x = 10: y = 20
SWAP(x, y)                              ' tmp -> tmp_MT0001 (unique per expansion)

DIM r AS LONG
r = DOUBLE(5)                           ' Expands to: r = (5 + 5) = 10

PRINT "Line: " + STR$(__LINE__)         ' __LINE__ -> actual line number
PRINT "File: " + __FILE__               ' __FILE__ -> "source.bas"
PRINT "Counter: " + STR$(__COUNTER__)   ' 0, 1, 2, ... each use

Language Frontends

Language   | Switch Into                 | Switch Out           | Features
PowerBASIC | (default)                   | #CCODE, #ASM, #PILOT | Full PB11 syntax + MODULE/ENDMODULE + dynamic UDT strings
C          | #CCODE, #CINLINE, #CINCLUDE | #ENDC                | C subset: switch, unions, function pointers, casts, ternary, ## and #
ASM        | #ASM, !line                 | #ENDASM              | x86 inline assembly with PB variable access, Lab_ prefix for labels
PILOT      | #PILOT, #PCODE              | #ENDPILOT            | Educational language: T/A/M/Y/N/C/J/U/E commands

Current Status

Implemented:
PBWin host compile (0 errors, 0 warnings)
7-stage pipeline (all stages implemented)
Backend32: 51 encoder files (SSE, BMI, MOV, Jumps)
Multi-pass preprocessor (6 modules, ~2,500 lines)
Macro expansion with \1..\16, MACROTEMP
Built-in macros (8 builtins)
C-preprocessor features (##, #, #undef, #error)
Line continuation (_) with language awareness
Equate expansion (%NAME, $NAME)
635 PB11 keywords in lexer
Language switching stack (32,768 levels)

In progress or planned:
Self-hosting: Gen1 in progress
String runtime: stabilization
Full C syntax support
Gen2/Gen3 self-hosting chain
Test coverage expansion
Public alpha release

Architecture — From Source to PE32

Source (.bas with #CCODE / #ASM / #PILOT blocks)
  |-> Multi-Pass Preprocessor (macros, equates, includes, conditionals)
  |-> Lexer (language-mode-aware tokenization)
  |-> Parser (unified AST, Pratt expression parsing)
  |-> Semantic Analysis (type checking, scope resolution)
  |-> IR Generation (unified intermediate representation)
  |-> x86 CodeGen (IR -> x86 instruction selection, regalloc)
  |-> Assembler (x86 encoding: SSE, BMI, CMOV, LZCNT, ...)
  |-> Linker (COFF -> PE32 executable generation)
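Since the parser stage is described as using Pratt expression parsing, a compact Python sketch of that technique may help. It is a generic textbook-style example, not CompilerX32's parser: the binding powers, the tuple-shaped AST, and the function names are all assumptions chosen for the illustration.

```python
import re

# Higher binding power means the operator binds more tightly.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
PREFIX_POWER = 30  # unary minus binds tighter than any binary operator


def tokenize(source):
    """Split an expression into numbers, identifiers, operators and parentheses."""
    return re.findall(r"\d+|[A-Za-z_]\w*|[-+*/()]", source)


def parse_expression(tokens, min_power=0):
    """Pratt parser: consume tokens from the front and return a nested tuple AST."""
    if not tokens:
        raise SyntaxError("unexpected end of expression")
    token = tokens.pop(0)

    # Prefix position: literal, identifier, parenthesis, or unary minus.
    if token == "(":
        left = parse_expression(tokens, 0)
        if not tokens or tokens.pop(0) != ")":
            raise SyntaxError("missing ')'")
    elif token == "-":
        left = ("neg", parse_expression(tokens, PREFIX_POWER))
    elif token in BINDING_POWER or token == ")":
        raise SyntaxError(f"unexpected token {token!r}")
    else:
        left = token

    # Infix position: keep absorbing operators that bind tighter than the caller's.
    while tokens and tokens[0] in BINDING_POWER and BINDING_POWER[tokens[0]] > min_power:
        operator = tokens.pop(0)
        right = parse_expression(tokens, BINDING_POWER[operator])
        left = (operator, left, right)
    return left


print(parse_expression(tokenize("x + 5 * 2")))
```

The strict comparison against min_power makes equal-precedence operators associate to the left, so 1 - 2 - 3 parses as (1 - 2) - 3. Adding an operator only requires a new table entry, which is one common reason to prefer this style of expression parser.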

Alpha Development

CompilerX32 is in active development. The complete multi-pass preprocessor is implemented and compiles cleanly with PBWin. The self-hosting compiler, multi-language test suite, and documentation will be available for download with the first public alpha release.