Compiler Optimizations

Overview

MD Engine includes experimental compiler optimization passes that can reduce ROM size and improve runtime performance. These passes analyze your visual scripts after they've been converted to an internal representation (IR) and apply transformations before generating the final C code.

All optimization passes can be toggled individually in Settings > Build Options and are enabled by default.

Warning: These optimizations are marked as Experimental. If you encounter build errors or unexpected runtime behavior with them enabled, disable the passes one at a time to isolate the issue.

How It Works

When you build your project, the compilation pipeline works in stages:

  1. Visual Scripts → IR — Your event scripts are converted into a structured Intermediate Representation (Stmt/Expr tree).
  2. Optimization Passes — The IR is analyzed and transformed (inlining, CSE, LICM, dedup).
  3. IR → C Code — The optimized IR is rendered to C source files.
  4. C → Assembly → ROM — GCC compiles the C code with its own optimizations (-O2/-O3), then the assembler and linker produce the final ROM.

The compiler's optimization passes complement GCC — they operate on game-specific patterns that GCC cannot see (like repeated actor property reads or identical script bodies across actors).
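
To make stage 3 more concrete, here is a minimal sketch of rendering a small expression tree to C text. The `Expr` layout, the `ExprKind` names, and `render_expr` are hypothetical and only illustrate the idea of an Expr tree; MD Engine's actual IR is not shown in this document.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily simplified expression node (not the engine's real IR). */
typedef enum { EXPR_CONST, EXPR_VAR, EXPR_BINOP } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int value;                 /* EXPR_CONST: literal; EXPR_VAR: index into TBCE_Variables */
    char op;                   /* EXPR_BINOP: operator character such as '+' or '>' */
    const struct Expr *lhs;    /* EXPR_BINOP only */
    const struct Expr *rhs;    /* EXPR_BINOP only */
} Expr;

/* Renders e into out (capacity cap) as C source text.
   Sub-expressions longer than 63 characters are truncated in this sketch. */
static int render_expr(const Expr *e, char *out, size_t cap)
{
    assert(e != NULL);
    if (e->kind == EXPR_CONST)
        return snprintf(out, cap, "%d", e->value);
    if (e->kind == EXPR_VAR)
        return snprintf(out, cap, "TBCE_Variables[%d]", e->value);

    char lhs[64];
    char rhs[64];
    render_expr(e->lhs, lhs, sizeof lhs);
    render_expr(e->rhs, rhs, sizeof rhs);
    return snprintf(out, cap, "(%s %c %s)", lhs, e->op, rhs);
}
```

An optimization pass would rewrite the tree (for example, replacing a repeated subtree with a temporary) before a renderer like this one runs.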

Optimization Passes

Script Inlining

What it does: Replaces calls to small synchronous scripts with the script's body directly at the call site, eliminating function call overhead.

Example: If you have a utility script "Reset Temp Variables" that just sets two variables to zero, the compiler emits the script's body in place of the function call:

// Before: function call
res = script_70(&globalCtx);

// After: inlined body
TBCE_Temp[6] = 0;
TBCE_Temp[7] = 0;

When it helps:

  • Projects with many small reusable scripts called from multiple places.
  • Eliminates the overhead of parameter marshalling and function dispatch for trivial scripts.

Current limitations:

  • Only inlines scripts with no parameters (V_VAR_N, AC_PARAM_N).
  • Scripts must not use actor locals (__self, __a0).
  • Scripts with early returns (inside if/switch) are skipped.
  • Only synchronous scripts qualify (async scripts with suspend points cannot be inlined).

Tip: Enable Advanced Script Inlining to lift the first three restrictions: it handles parameters, actor locals, and early returns. Async scripts still cannot be inlined.
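
The four limitations of basic inlining can be read as a single eligibility check. The sketch below is an assumption about how such a check might look; `ScriptInfo` and `can_inline_basic` are hypothetical names, and any size limit on "small" scripts is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of a script as an inliner might see it. */
typedef struct {
    int  param_count;        /* number of V_VAR_N / AC_PARAM_N parameters */
    bool uses_actor_locals;  /* references __self, __a0, ... */
    bool has_early_return;   /* a return nested inside if/switch */
    bool is_async;           /* contains suspend points */
} ScriptInfo;

/* Returns true only when every documented restriction of basic inlining holds. */
static bool can_inline_basic(const ScriptInfo *s)
{
    assert(s != NULL);
    return s->param_count == 0
        && !s->uses_actor_locals
        && !s->has_early_return
        && !s->is_async;
}
```

Under this reading, a script like "Reset Temp Variables" passes the check, while a one-parameter "Play SFX" script would be left to Advanced Script Inlining.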

Setting: Enable Script Inlining (Experimental)


Advanced Script Inlining

What it does: Extends basic inlining to handle scripts that take parameters (V_VAR_N, AC_PARAM_N) and use actor locals (__self, __a0). The pass keeps the callee's declarations but replaces their initializers with the actual call-site values, and wraps the result in a scoped block.

Constant specialization: When a parameter is a compile-time literal (e.g., SFX ID 11), switch/if statements on that parameter are folded — only the matching branch is emitted.

Example (Play SFX with literal param):

// Before: call to a 630-line function
TBCE_scriptParams[0] = 11;
TBCE_variableParams[0] = 2;
res = script_12(&globalCtx);

// After: constant-folded inline (only case 11 emitted)
{
    if ((TBCE_Variables[53] > 0))
    {
        TBCE_SoundPlayPCM(sound_gun_slash, TBCE_SFX_CHANNEL_AUTO, 6, false);
    }
}

When it helps:

  • Scripts called with literal values (SFX IDs, animation indices, constant flags).
  • Eliminates parameter marshalling (global array writes + TBCE_UnpackVariable calls).
  • Combined with constant folding, can reduce 630-line functions to a few lines at each call site.
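
The effect of constant specialization can be checked on a toy example. In the sketch below, `play_sfx_generic` is a hypothetical stand-in for a large switch-based script, and `play_sfx_specialized_11` is what a specializer could emit at a call site that passes the literal 11. The case values and return values are invented for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for a large "Play SFX" script: picks a priority
   based on the SFX ID and one game variable. */
static int play_sfx_generic(int sfx_id, int ammo)
{
    switch (sfx_id) {
    case 10: return 4;
    case 11: return (ammo > 0) ? 6 : 0;
    case 12: return 2;
    default: return 0;
    }
}

/* Specialized form for sfx_id == 11: the switch is folded away and only
   the body of case 11 remains. */
static int play_sfx_specialized_11(int ammo)
{
    return (ammo > 0) ? 6 : 0;
}

/* Self-check: for SFX 11 the specialized body must agree with the generic one. */
static void check_specialization(void)
{
    for (int ammo = -1; ammo <= 2; ammo++)
        assert(play_sfx_generic(11, ammo) == play_sfx_specialized_11(ammo));
}
```

The specialized function no longer needs the `sfx_id` parameter at all, which is why the parameter marshalling disappears along with the unused branches.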

Early return handling: Scripts with return statements inside if/switch/for/while blocks are supported — each early return is converted to a goto past the inlined block. For example:

// Before inlining: script_78 (Destroy Tile) has early returns
if (collision == DIR_DOWN) {
    setTile(0);
    return true; // early return
}
setTile(0);
clearCollision();

// After inlining: returns become gotos
{
    if (collision == DIR_DOWN) {
        setTile(0);
        goto __inline_end_0; // was: return true
    }
    setTile(0);
    clearCollision();
}
__inline_end_0:;

Current limitations:

  • Async scripts are still excluded; synchronous scripts qualify regardless of how many parameters they take.
  • The callee must be under the size threshold (20 statements, counting nested ones) after body extraction.

Setting: Enable Advanced Script Inlining (Experimental)


Common Subexpression Elimination (CSE)

What it does: Finds expressions that appear two or more times across different statements within the same scope and hoists them into a temporary variable, so they're computed only once.

Example:

// Before: TBCE_ActorGetX called 3 times
if ((TBCE_ActorGetX(__self) > 100)) { ... }
if ((TBCE_ActorGetX(__self) < 200)) { ... }
TBCE_Temp[0] = TBCE_ActorGetX(__self);

// After: computed once
s16 __cse0 = TBCE_ActorGetX(__self);
if ((__cse0 > 100)) { ... }
if ((__cse0 < 200)) { ... }
TBCE_Temp[0] = __cse0;

When it helps:

  • Scripts that read the same actor property (position, size, animation) multiple times.
  • Collision checks that compute the same map index repeatedly.
  • Any script with repeated array accesses like TBCE_Variables[N].

How it stays safe:

  • Only hoists expressions whose dependencies aren't modified between uses.
  • Writes to a variable invalidate any CSE that reads it.
  • Uses a whitelist of known-pure functions (getters like TBCE_ActorGetX, TBCE_ActorGetY, TBCE_ActorOnScreen) that are safe to hoist.
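
The safety rules above can be sketched as two small helpers: a purity lookup and an invalidation step. This is a simplified illustration, not the engine's implementation; `CseEntry`, `is_pure_call`, and `invalidate_on_write` are hypothetical, and the whitelist contains only the three getters named in this document.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Getters named in this document as known-pure; the real list is presumably longer. */
static const char *const PURE_FUNCTIONS[] = {
    "TBCE_ActorGetX", "TBCE_ActorGetY", "TBCE_ActorOnScreen",
};

/* Returns true if a call to the named function is safe to hoist. */
static bool is_pure_call(const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < sizeof PURE_FUNCTIONS / sizeof PURE_FUNCTIONS[0]; i++)
        if (strcmp(PURE_FUNCTIONS[i], name) == 0)
            return true;
    return false;
}

/* Hypothetical record of a hoisted expression that reads one TBCE_Variables slot. */
typedef struct {
    int  reads_var;   /* index N of the TBCE_Variables[N] the expression reads */
    bool available;   /* true while the cached temporary is still valid */
} CseEntry;

/* A write to TBCE_Variables[written_var] invalidates every entry that reads it. */
static void invalidate_on_write(CseEntry *entries, size_t count, int written_var)
{
    for (size_t i = 0; i < count; i++)
        if (entries[i].reads_var == written_var)
            entries[i].available = false;
}
```

A pass built this way would reuse a temporary only while its entry is still marked available, and would recompute the expression after any invalidating write.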

Setting: Enable CSE & LICM (Experimental)


Loop-Invariant Code Motion (LICM)

What it does: Identifies expressions inside for and while loop conditions that don't change across iterations and moves them before the loop.

Example:

// Before: localVars[3] read every iteration
for (TBCE_Temp[5] = (localVars[3] - 1); TBCE_Temp[5] <= (localVars[3] + 1); TBCE_Temp[5] += 1)
{
    // loop body doesn't modify localVars[3]
}

// After: computed once before the loop
s16 __licm0 = localVars[3];
for (TBCE_Temp[5] = (__licm0 - 1); TBCE_Temp[5] <= (__licm0 + 1); TBCE_Temp[5] += 1)
{
    // ...
}

When it helps:

  • For-loops that use actor properties or variable reads in their condition.
  • Nested loops where the outer variable is used in the inner loop's bounds.

How it stays safe:

  • Tracks all variables written inside the loop body (including nested if/switch).
  • Uses precise alias analysis — writing localVars[1] does NOT block hoisting localVars[3] (different constant index).
  • Skips hoisting if any raw code in the loop could affect the expression.
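
The alias rule above (different constant indices never conflict) can be sketched as follows. `INDEX_UNKNOWN`, `may_alias`, and `can_hoist` are hypothetical names. The sketch is also more conservative than the rule described here: it refuses to hoist whenever the loop body contains any raw code, rather than checking whether that code could affect the expression.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INDEX_UNKNOWN (-1)   /* hypothetical marker for a non-constant index */

/* Two accesses to the same array may alias unless both indices are
   known constants and differ. */
static bool may_alias(int index_a, int index_b)
{
    if (index_a == INDEX_UNKNOWN || index_b == INDEX_UNKNOWN)
        return true;         /* unknown index: assume the worst */
    return index_a == index_b;
}

/* An expression reading localVars[read_index] can be hoisted only if no
   write in the loop body may alias it and the body has no raw code. */
static bool can_hoist(int read_index, const int *written, size_t written_count,
                      bool body_has_raw_code)
{
    assert(written != NULL || written_count == 0);
    if (body_has_raw_code)
        return false;
    for (size_t i = 0; i < written_count; i++)
        if (may_alias(read_index, written[i]))
            return false;
    return true;
}
```

With this check, a loop that writes only `localVars[1]` still allows `localVars[3]` to be hoisted, while a write through a computed index blocks it.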

Setting: Shares toggle with CSE — Enable CSE & LICM (Experimental)


Parameterized Script Dedup

What it does: Finds scripts that are structurally identical except for a few differing constants (such as actor IDs). Instead of emitting 20 copies of the same function, it emits one shared implementation with the varying values as parameters, plus thin one-line wrapper functions.

Example:

// Before: 20 near-identical ~800-line functions (only the actor ID differs)
bool actor_167_update(TBCE_ScriptContext* ctx) { /* 800 lines with AC_ACTOR_167 */ }
bool actor_168_update(TBCE_ScriptContext* ctx) { /* 800 lines with AC_ACTOR_168 */ }
// ... 18 more copies

// After: 1 shared function + 20 one-line wrappers
static bool __shared_update_0(TBCE_ScriptContext* ctx, u16 __p0) { /* 800 lines with __p0 */ }
bool actor_167_update(TBCE_ScriptContext* ctx) { return __shared_update_0(ctx, AC_ACTOR_167); }
bool actor_168_update(TBCE_ScriptContext* ctx) { return __shared_update_0(ctx, AC_ACTOR_168); }

When it helps:

  • Projects with many actors that share the same event logic (e.g., enemies, collectibles, NPCs with identical behavior).
  • Significantly reduces ROM size: 20 copies of 800 lines become roughly 800 + 20 lines.

How it works internally:

  • Compares scripts' IR trees structurally, replacing leaf values with placeholders.
  • Groups scripts with identical "skeletons" (same structure, different constants).
  • Diffs the IR to find exactly which values vary.
  • Supports up to 4 varying parameters per group.

Tip: GCC will typically inline the thin wrapper functions at -O2/-O3 wherever they are called directly, so the runtime overhead of this optimization is usually negligible while the ROM savings remain.
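
The skeleton comparison described under "How it works internally" might look roughly like the sketch below. It models a script as a flat array of nodes rather than a tree, and counts each differing position separately, whereas a real pass would presumably map repeated occurrences of the same constant to a single parameter. `Node` and `diff_skeleton` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VARYING 4   /* the documented per-group limit */

/* Hypothetical flattened IR node: a structural opcode or a constant leaf. */
typedef struct {
    bool is_const;
    int  value;         /* opcode id when !is_const, literal when is_const */
} Node;

/* Compares two scripts of equal length n. Returns the number of constant
   leaves that differ (their positions are stored in varying[]), or -1 if
   the skeletons differ or more than MAX_VARYING leaves vary. */
static int diff_skeleton(const Node *a, const Node *b, size_t n,
                         size_t varying[MAX_VARYING])
{
    assert(a != NULL && b != NULL);
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        if (a[i].is_const != b[i].is_const)
            return -1;                 /* structure differs */
        if (!a[i].is_const) {
            if (a[i].value != b[i].value)
                return -1;             /* different opcode */
            continue;
        }
        if (a[i].value != b[i].value) {
            if (count == MAX_VARYING)
                return -1;             /* too many differences to parameterize */
            varying[count++] = i;
        }
    }
    return count;
}
```

Each position recorded in `varying` would become one extra parameter (`__p0`, `__p1`, ...) of the shared function, with the wrappers passing the per-script constants.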

Setting: Enable Parameterized Script Dedup (Experimental)

Troubleshooting

When to Disable Passes

If you experience issues after a build:

  1. Build fails with C compiler errors (e.g., undeclared variables, type mismatches) — an optimization pass generated invalid code. Disable passes one at a time to find which one.
  2. Game behavior changes (e.g., collisions not working, scripts not executing correctly) — a pass may have incorrectly transformed your logic. Disable and report the issue.
  3. Build succeeds but ROM is unexpectedly large — the parameterized dedup might not be catching your scripts (check if they have raw code differences).

Known Limitations

| Pass | Limitation |
| --- | --- |
| Script Inlining (Basic) | Only parameterless sync scripts without actor locals or early returns |
| Advanced Script Inlining | Handles params, actor locals, and early returns; still requires sync + under size threshold |
| CSE | Won't hoist across raw code blocks or non-pure function calls |
| LICM | Only hoists from for/while conditions, not arbitrary loop body expressions |
| Param Dedup | Skips groups where differences are in raw code (e.g., preprocessor guards); max 4 varying params |

Settings Reference

These settings are found in Settings > Build Options:

| Setting | Default | Description |
| --- | --- | --- |
| Enable Script Inlining | On | Inline small parameterless sync scripts directly at call sites |
| Enable Advanced Script Inlining | On | Extend inlining to parameterized scripts with constant folding and early return → goto conversion |
| Enable CSE & LICM | On | Hoist repeated expressions and loop-invariant code into temporaries |
| Enable Parameterized Script Dedup | On | Merge structurally identical scripts into shared functions |

All four default to On. Use the Restore Defaults button in Build Options to re-enable them all.