C++20 coroutines have lovely1 syntax. They are also a terrible fit for game development.2
If you’ve ever tried to use them for boss scripts, dialogue, or AI behaviors – anywhere you want straight-line code that pauses for a few frames – you’ve probably hit the same wall I did: opaque handles, heap allocations3, hidden compiler lowering, and – most damning for games – no way to serialize a paused coroutine to disk.
In this article, I will present
sfex::Coroutine: a ~200-line stackless
macro-based coroutine library built around a variant of the classic
switch + __LINE__ trick. Like my previously
discussed sfex::Profiler,
these coroutines are meant to be simple and lightweight.
what’s our goal?
Our goal is to implement coroutines that meet these criteria:
Allocation-free: The entire coroutine state is one
intplus whatever members4 you put on the struct.Trivially serializable: Save the integer state and the struct’s data, and that’s it. Mid-cutscene saving and loading just works™.
Deterministic: No hidden compiler lowering; the generated control flow could be written by hand.
Composable: Coroutines can
awaitother coroutines, all in straight-line code.Tiny: The whole runtime is a small header you can read in one sitting.
The full implementation is part of my fork of
SFML, in SfexCoroutine.hpp.
Here are a few examples that showcase the feature, that you can directly
play here in your browser5!
🕹️ Shoot-em-up boss fight
🕹️ Coreographed top-down stealth game
🕹️ Dialogue cutscene
the trouble with C++20 coroutines
Before we get to work, let’s see what’s wrong with
co_await for game programming.
Unpredictable heap allocations: The frame of a C++20 coroutine is allocated dynamically. The standard provides “Heap Allocation eLision Optimization” (HALO) as an allowed optimization, but it is not guaranteed by the language.
Compilers do it inconsistently6, and any change to the coroutine body or to where its handle escapes can silently re-introduce the allocation.
Unoptimized builds might not get HALO at all – so your debug build might silently be allocating per-coroutine on the hot path while your
-O2build might not. For a real-time game with tight latency deadlines, “might” is an uncomfortable term.7As far as I know, there’s no attribute that you can apply to a coroutine to ensure that HALO is applied.
Opaque handles and hidden state:
std::coroutine_handleis a pointer to compiler-managed state.There is no portable way to ask: “what’s in this coroutine’s frame? what local variables does it have? where in the source code is it suspended right now?”
That makes serialization impossible without compiler hooks.8 For a game where the player can press F5 to quick-save at any moment, that’s a deal-breaker.
Serialization is not only useful for player-facing features such as quick saving, but is also very important during development. Being able to deterministically replay save states greatly helps with iteration speed and debugging, and being able to pause/introspect the state of the program at any time and interact with it is a massive productivity boost.
C++20 coroutines are well-shaped for transient work: an async HTTP request that flips a “loaded” flag when it completes, a generator that yields a finite sequence and is then discarded, a one-off scheduled task. The coroutine runs, eventually has some side effect on your real game state, and goes away. Its internal state during the suspension is scaffolding – you don’t reach into it, you just wait for the side effect.
Game logic is not transient. The boss’s behaviour is part of the boss’s data: “we’re paused on phase 3, with 0.3 seconds of wait remaining” must travel alongside the boss’s HP and position through saves, quick-loads, and network sync.
C++20 coroutines push hard against this: the only state model you get is “internal compiler scaffolding, you don’t get to look at it”.
- Cumbersome customization machinery: To get a
working coroutine type you need to write a
promise_type, anawaitable, deal withawait_transform,final_suspend,initial_suspend, return-object construction, and so on. Even minimal implementations ofgenerator<T>involve dozens of lines of plumbing.9
For a game I want a coroutine that is part of an object’s data. When the object dies, the coroutine dies with it. When I serialize the object to a save buffer, the coroutine’s state goes with it. When the optimizer is off, there is no extra cost compared to the equivalent state machine. C++20 coroutines do not provide any of these guarantees out of the box.
Let’s build something that does.
a tiny cutscene
Imagine a simple scripted exchange between two characters:
- “Hero: I finally got you.”
- Wait one second.
- “Villain: You’re too late!”
- Wait half a second.
- “Hero: We’ll see about that!”
- Wait one second.
- Hide the dialogue UI.
Here’s how this looks as an sfex::Coroutine:
struct DialogueScene : sfex::Coroutine
{
Yield operator()(World& world)
{
SFEX_CO_BEGIN;
world.showText("Hero", "I finally got you.");
SFEX_CO_YIELD(Wait{1.0f});
world.showText("Villain", "You're too late!");
SFEX_CO_YIELD(Wait{0.5f});
world.showText("Hero", "We'll see about that!");
SFEX_CO_YIELD(Wait{1.0f});
world.hideText();
SFEX_CO_RETURN(Done{});
SFEX_CO_END;
}
};It nicely reads top-to-bottom. Each
SFEX_CO_YIELD(Wait{...}) means “pause this coroutine
for this many seconds, then come back here”. You can save the game
between any two yields and reload it: the coroutine resumes at the right
line, with all its members intact. There is no compiler magic – the
runtime cost is one integer state field that tracks the
current resumption point, plus a switch.
Driving it from the main loop is equally simple:
DialogueScene scene;
float waitTimer = 0.f;
while (true)
{
const float dt = clock.restart().asSeconds();
if (waitTimer > 0.f)
{
waitTimer -= dt;
}
else
{
scene(world).match(
[&](NextFrame) { /* ...run again next frame... */ },
[&](Wait w) { waitTimer = w.seconds; },
[&](Done) { /* ...proceed to next scene... */ });
}
// ... draw, display ...
}Quick note on the types: Yield is a
std::variant<NextFrame, Wait, Done>, and
.match(...) is a thin wrapper around
std::visit using overloaded lambdas. The point is that the
three yield kinds are explicit data – the driver
pattern-matches on them and decides what to do: ignore, sleep, or
stop.
the same thing as a state machine
It would be easy to make the coroutine look good by comparing it against a (deliberately) bloated FSM. To be fair, here’s the minimal state-machine equivalent: one step counter and one wait timer.
struct DialogueSceneFSM
{
int step = 0;
float waitTimer = 0.f;
bool tick(World& world, float dt)
{
if (waitTimer > 0.f)
{
waitTimer -= dt;
return true;
}
switch (step++)
{
case 0: world.showText("Hero", "I finally got you."); waitTimer = 1.0f; return true;
case 1: world.showText("Villain", "You're too late!"); waitTimer = 0.5f; return true;
case 2: world.showText("Hero", "We'll see about that!"); waitTimer = 1.0f; return true;
case 3: world.hideText(); return false;
}
return false;
}
};For this particular shape – a flat “do X, wait, do Y, wait” sequence with no loops or branches – the step-counter FSM is short, readable, and saves to the same number of bytes as the coroutine.
IMHO, the coroutine clearly wins the moment your script starts orchestrating visual state on top of the dialogue.
Let’s actually draw the two characters on screen and have something happen between the lines of dialogue: the villain takes three menacing paces toward the hero, with a brief pause between each step. The dialogue resumes once the approach is finished.
Here’s the part of the coroutine that handles the menacing approach:
struct CutsceneScene : sfex::Coroutine
{
int paceIdx = 0;
float t = 0.f;
sf::Vec2f paceStartPos;
Yield operator()(World& world)
{
SFEX_CO_BEGIN;
// ...
for (paceIdx = 0; paceIdx < 3; ++paceIdx) // take three steps forward
{
paceStartPos = world.villainPos;
t = 0.f;
while (t < 1.f) // smoothly move by 100px per step
{
t += world.dt;
world.villainPos.x = paceStartPos.x - 100.f * t;
SFEX_CO_YIELD(NextFrame{});
}
SFEX_CO_YIELD(Wait{0.30f}); // menacing pause between paces
}
world.showText("Hero", "Don't come any closer!");
SFEX_CO_YIELD(Wait{1.5f});
SFEX_CO_RETURN(Done{});
SFEX_CO_END;
}
};Three nested control-flow constructs: an outer for over
the three paces, an inner while that does the smooth
interpolation, a Wait after each pace.
Three pieces of persistent state: paceIdx,
t, paceStartPos.
Again, you can read it top to bottom and visually see the temporal flow of events.
Very importantly, those three values have to be struct
members, not function-locals inside operator().
There’s a good reason – the rules and
footguns section below covers it – but the rule of thumb is short:
anything that needs to survive a yield goes on the
struct.
Let’s now try to compare the above solution with an FSM-based one. A
clever FSM author might create a Tween helper to clean up
the per-frame interpolation:
struct Tween
{
sf::Vec2f from, to;
float duration = 0.f, t = 0.f;
void start(sf::Vec2f f, sf::Vec2f tt, float d)
{
from = f;
to = tt;
duration = d;
t = 0.f;
}
bool active() const
{
return t < 1.f;
}
sf::Vec2f tick(float dt)
{
t += dt;
return from + (to - from) * t;
}
};That handles a single smooth movement nicely. But the outer
“do this three times with a menacing pause between each”
doesn’t fit in Tween; it needs its own state. The painful
region of the FSM looks roughly like this:
case N: // outer loop entry
paceIdx = 0;
step = N + 1;
continue;
case N + 1: // start next pace
if (paceIdx >= 3) { step = N + 3; continue; } // exit loop
villainTween.start(world.villainPos, world.villainPos + sf::Vec2f{-100.f, 0.f}, 0.35f);
step = N + 2;
return true;
case N + 2: // tick the tween + post-pause
if (villainTween.active())
{
world.villainPos = villainTween.tick(dt);
return true;
}
waitTimer = 0.30f;
++paceIdx;
step = N + 1; // back to top of loop
return true;Three coordinated case branches plus a paceIdx member,
just to express
for (paceIdx = 0; paceIdx < 3; ++paceIdx). The
Tween helper covered one level of “thing happening
over time”; the outer loop is the second level, and the FSM has
to grow another mini-state-machine on top.
In contrast, the coroutine just has one int paceIdx; and
lets ordinary C++ control flow do its job.10
The full cutscene discussed above is in CoroutineDialogue.cpp,
and playable in the browser at the top of the article.
My claim is: coroutines aren’t dramatically better for trivial scripts, but they become so the moment we have more than one layer of “thing happening over time”.
how does it work?
Every SFEX coroutine expands to a switch-based state machine you’d be able to write by hand – the macros just hide the bookkeeping.
Here is the body of DialogueScene::operator() after
preprocessing (lightly cleaned up to drop some noise we’ll talk about in
a moment):
Yield DialogueScene::operator()(World& world)
{
switch (state) // SFEX_CO_BEGIN
{ //
case 0: //
world.showText("Hero", "I finally got you.");
state = 1; // SFEX_CO_YIELD(Wait{1.0f})
return Wait{1.0f}; //
//
case 1: //
world.showText("Villain", "You're too late!");
state = 2; // SFEX_CO_YIELD(Wait{0.5f})
return Wait{0.5f}; //
//
case 2: //
world.showText("Hero", "We'll see about that!");
state = 3; // SFEX_CO_YIELD(Wait{1.0f})
return Wait{1.0f}; //
//
case 3: //
world.hideText();
state = 0; // SFEX_CO_END
return Done{}; //
} //
//
std::unreachable(); //
}What to notice:
caselabels can be nested anywhere inside the switch: This is legal C and C++. Acaselabel can sit arbitrarily deep inside the switch’s body – between two normal statements, inside a block, inside a loop – as long as no otherswitchopens between the dispatch and the label. We use this to drop a fresh case label after every yield so the nextswitch (state)jumps right back to where we left off.Each yield is two halves: The “going-to-sleep” half (
state = N; return Wait{...};) runs when the coroutine is currently executing. The “waking-up” half (case N:) runs when the driver re-enters after the wait expires. The two halves are placed adjacent to each other in source order, so the body reads top-to-bottom even though execution leaves and re-enters multiple times.The resumption point is just an
int state: That’s literally all that’s needed for the coroutine to know where to resume. The rest of the “coroutine frame” – locals, persistent counters, anything that has to survive a yield – is just regular member data on your struct. The compiler can’t change this representation; it’s an integer plus whatever you wrote.std::unreachable()tells the compiler “control flow never reaches here”: Without it, GCC and Clang would warn about a missing return.
In the real macro expansion, every yield is wrapped in a
do { ... } while(0):
// `SFEX_CO_YIELD` expansion:
do
{
state = 1;
return Wait{1.0f};
case 1:;
}
while (0);This is the standard macro-hygiene trick: it makes the macro expand
to a single statement, so writing
if (cond) SFEX_CO_YIELD(...); followed by an
else does the right thing. The case label sitting inside
the do block is still part of the enclosing
switch (rule 1 above), so jumping to it from outside the
do works fine.
These are the actual macro definitions:
// 1. Open the `switch` statement
// 2. Record the initial value of `__COUNTER__`
#define SFEX_CO_BEGIN \
static constexpr int _sfex_base = __COUNTER__; \
switch (state) { case 0:;
// 1. Update the `state` to the next resumption point
// 2. Return (yield) the specified `value`
// 3. Open the `case` for the next resumption point
#define SFEX_CO_YIELD(value) \
do { \
state = (__COUNTER__ + 1) - _sfex_base; \
return value; \
case (__COUNTER__ - _sfex_base):; \
} while (0)
// 1. Reset the coroutine state
// 2. Return the specified `value`
#define SFEX_CO_RETURN(value) \
do \
{ \
state = 0; \
return value; \
} while (0)
// 1. Syntactically close the `switch` statement
// 2. Add the "unreachable" sentinel
#define SFEX_CO_END \
} \
\
std::unreachable();…counter?
You might be wondering: why __COUNTER__ and not
__LINE__?
The Protothreads library by
Adam Dunkels (providing
“extremely lightweight stackless threads designed for severely
memory constrained systems”) uses __LINE__ to generate
unique state values, making the state values more readable – “we’re
paused at line 42” is easy to audit. I specifically chose not to
use __LINE__, and the reason is serialization.
Recall that the main pitch of this design is that the entire
coroutine state is an int field. You save it to disk, you
load it back, you resume right where you left off. That works only if
the integer means the same thing across save and load.
With __LINE__, the state value is the source-code line
number of the yield.
Any cosmetic edit to the source file shifts those numbers.11
Add a blank line above the function… Reformat with
clang-format… Insert a // ... comment…
Now the integer ‘47’ in the save file – which used to mean “paused at the third yield” – means something completely different. Save files written before the edit resume mid-function at a random instruction. Not good.
__COUNTER__ does not have this problem. It increments
per use, not per source line, so cosmetic edits don’t shift the
values. As long as you don’t add or remove yield points, the numbering
for every existing yield stays stable. Even when you do add a yield,
only the yields after the new one shift – everything before
keeps its identity.12
We relativize the counter to the start of each function thanks to
_sfex_base so the case labels start at
1, 2, 3, ... per coroutine and don’t depend on what comes
before in the translation unit.
The only price we pay versus __LINE__ is that state
values are no longer human-readable. For my use case, save-file
stability matters far more than peeking at an integer in the
debugger.
However, while __COUNTER__ survives cosmetic
edits to the function body, it does not survive
meaningful edits. The moment you ship a patch that adds or
removes a yield, every state value below the change shifts – saves
written under the previous build will resume into the wrong yield site.
There’s no easy way to avoid this in the design (the same hazard applies
to FSMs, too).13
Two practical workarounds:
Tag every save with a coroutine-revision number (a constant you bump whenever you touch a coroutine), and either refuse to load saves with a mismatched revision or keep an older version of the coroutine in the source code for backwards compatibility.
Save the high-level coroutine identity as data, e.g. “we’re at the start of phase 3 of fight 2” – alongside the integer state, and rebuild the integer from the data on load when revisions don’t match.
composing coroutines: enemy AI
A scripted dialogue is a nice introductory example, but the real value shows up when you start composing coroutines. Here are two firing patterns for an enemy in a shoot-em-up (a la Touhou):
struct RingFire : sfex::Coroutine
{
int rings = 4;
int i = 0;
Yield operator()(World& world, Enemy& self)
{
SFEX_CO_BEGIN;
for (i = 0; i < rings; ++i)
{
world.spawnBulletRing(self.pos, /* count */ 8, /* speed */ 120.f);
SFEX_CO_YIELD(Wait{0.4f});
}
SFEX_CO_RETURN(Done{});
SFEX_CO_END;
}
};
struct AimedBurst : sfex::Coroutine
{
int shots = 5;
int i = 0;
Yield operator()(World& world, Enemy& self)
{
SFEX_CO_BEGIN;
for (i = 0; i < shots; ++i)
{
world.spawnBulletAimed(self.pos, world.player.pos, /* speed */ 220.f);
SFEX_CO_YIELD(Wait{0.2f});
}
SFEX_CO_RETURN(Done{});
SFEX_CO_END;
}
};If I want my enemy to cycle forever between the two
patterns, with a one-second pause between each, it’s as easy as using a
while loop:
struct EnemyAI : sfex::Coroutine
{
RingFire ring;
AimedBurst aimed;
Yield operator()(World& world, Enemy& self)
{
SFEX_CO_BEGIN;
while (true)
{
ring = {};
SFEX_CO_AWAIT(ring(world, self));
SFEX_CO_YIELD(Wait{1.0f});
aimed = {};
SFEX_CO_AWAIT(aimed(world, self));
SFEX_CO_YIELD(Wait{1.0f});
}
SFEX_CO_END;
}
};SFEX_CO_AWAIT(child(...)) is the composition primitive:
it runs the child to completion, propagating whatever the child yields
up to the driver.
From the parent’s point of view, awaiting a sub-coroutine looks identical to executing the sub-coroutine’s sequence of yields inline. Yields propagate through arbitrarily many levels.
Each sub-coroutine is owned by its parent as a member –
EnemyAI contains ring and aimed.
The ring = {} line resets the sub-coroutine’s state before
each cycle, which is just a normal copy-assign. There’s no allocation,
nor any opaque handle. When EnemyAI is destroyed,
ring and aimed go with it.
Internally, SFEX_CO_AWAIT expands to the same
state = N; case N:; pattern from YIELD, with a
check in the case body:
do
{
state = 1; // set resumption point to `1`
case 1: // begin resumption point `1`
{
auto _sfex_res = ring(world, self);
if (!isFinished(_sfex_res))
return _sfex_res; // child still running -- propagate its yield
}
}
while (0);Each call: tick the child once, grab its yield.
If the child isn’t done, return the yield up to the parent’s caller (so a
Wait{0.4f}from insideRingFirepropagates all the way out to the driver, which sleeps the parent’s parent for 0.4 seconds).If the child is done, fall through past the
if, past thedo { } while (0), and on to the next statement – which, in the loop above, isWait{1.0f}between cycles.
That’s the whole composition mechanism: nested case
labels and yields that bubble up by being returned.
Now let’s compare to the FSM equivalent. To match the cycle
behaviour, every for loop and every yield in the
sub-coroutines becomes its own state. Even just the
RingFire part:
struct EnemyAIFSM
{
enum class Phase
{
// Embedded `RingFire`:
Ring_Init, Ring_Fire, Ring_Wait, Ring_Done,
// Pause:
WaitAfterRing,
// Embedded `AimedBurst`:
Aimed_Init, Aimed_Fire, Aimed_Wait, Aimed_Done,
// Pause:
WaitAfterAimed,
};
Phase phase = Phase::Ring_Init;
int i = 0;
float waitTimer = 0.f;
void tick(World& world, Enemy& self, float dt)
{
while (true)
{
switch (phase)
{
case Phase::Ring_Init:
i = 0;
phase = Phase::Ring_Fire;
continue;
case Phase::Ring_Fire:
if (i >= 4)
{
phase = Phase::Ring_Done;
continue;
}
world.spawnRing(self.pos, 8, 120.f);
waitTimer = 0.4f;
phase = Phase::Ring_Wait;
return;
case Phase::Ring_Wait:
waitTimer -= dt;
if (waitTimer > 0.f) return;
++i;
phase = Phase::Ring_Fire;
continue;
case Phase::Ring_Done:
waitTimer = 1.0f;
phase = Phase::WaitAfterRing;
return;
case Phase::WaitAfterRing:
waitTimer -= dt;
if (waitTimer > 0.f) return;
phase = Phase::Aimed_Init;
continue;
// ... the `Aimed_*` cases follow the same pattern ...
}
}
}
};Composition disappears in a non-hierarchical FSM: it has to
flatten RingFire and AimedBurst into states
alongside its own. Each child becomes inlined as a sub-region of the
parent’s enum, and the parent has to know which child it’s currently
running. Adding a third pattern – say, a sweeping laser between rings
and aimed shots – means another set of states and another set of
transitions to wire up, while the coroutine version is two new
lines.
Hierarchical state machines and behaviour trees offer richer alternatives – structured composition on top of FSMs – but their composition primitives are explicit, while coroutines reuse C++’s existing control flow. The coroutine version has the smallest surface area and is, in my opinion, the simplest option (but, perhaps, not the most powerful).
rules and footguns
This macro trickery is small but has many sharp edges. In rough order of how often they bite:
Locals that must outlive a yield have to be struct members. When the
switchjumps to acaselabel deep inside the function, it skips over any local-variable declarations between the function entry and the label, so a local declared after the function entry but before a yield is silently re-initialized every time the coroutine is called. If the local sits inside a nested scope (e.g. inside aforloop), the compiler will reject the code with a “jump into protected scope”14 error instead.The convention is short and absolute: anything that must persist across yields becomes a struct member. The compiler does not catch the silent-reset case; you have to remember.
For locals that don’t need to survive the yield – per-iteration scratch values that the rest of the loop body wants to be
const– there’s a less heavy-handed fix: wrap them in their own inner{ ... }so the scope ends before the yield. The case label insideSFEX_CO_YIELDthen lands at a point where the locals aren’t in scope, and the “jump into protected scope” rule simply doesn’t apply.
RAII guards do not span yields. A
std::lock_guard, file handle, orunique_ptrdeclared as a local before a yield is destroyed at the yield, because the coroutine returns from the function. “Suspending” really means “returning, withstateset so we know where to come back”. The stack unwinds; RAII fires.If you need a resource held across a yield, put it on the coroutine struct – its destructor runs when the coroutine struct itself is destroyed, not at every yield. Same workaround as locals.
While unintuitive, I don’t think this is a big problem in practice, as you’d probably use C++20 coroutines for any sort of resource-management task that involves locks, files, or networking.
__COUNTER__is shared with the rest of the translation unit. The relativization trick (__COUNTER__ - _sfex_base) handles__COUNTER__uses outside the function: the base captures the counter at function entry, so external uses cancel out. But it does not protect against__COUNTER__uses inside the same function body. Every coroutine yield uses two__COUNTER__increments; if a macro between two yields also bumps the counter, the case labels for everything below shift.Few macros use
__COUNTER__in practice, but if you mix SFEX coroutines with another__COUNTER__-driven construct in the same function the failure mode is a runtime jump to the wrong yield site, not a compile error.Adding or removing a yield invalidates older save files. The state integer is an offset into the coroutine’s switch dispatch – adding or removing a yield earlier in the function shifts every state value below it. The same hazard applies to any FSM whose enum you change. Discussed earlier, but here for completeness.
Debugging is awkward. After a yield, the program counter “zips” back into the switch dispatch on the next call rather than continuing in source order, so a stepping debugger doesn’t follow the logical control flow. The call stack shows the current C++ invocation, not the logical nesting of awaited sub-coroutines either – a deeply-nested
AWAITchain looks like one frame in the debugger.For non-trivial debugging, a logged
statevalue plus a per-coroutine name ("EnemyAI:state=3") is more informative than a stepping session. The state integer is small, stable, and hand-readable once you know which yield is which.
None of the above is fatal. The first rule is the most annoying and frequent one.
what’s next
We’ve seen:
Why C++20 coroutines aren’t a good fit for game logic.
A ~200-line header that gives us lightweight coroutines with allocation-free state and trivial save/load.
How the macros expand under the hood, and why
__COUNTER__beats__LINE__.Composition via
SFEX_CO_AWAIT, and how it crushes the FSM equivalent for branching scripts.
If there’s interest in the topic, in a follow-up post I could cover:
Parallel composition:
AWAIT_ALLandAWAIT_ANYfor “fire bullets while dashing” or “race a script against a timeout”.Generic time and yield types: How the same primitives work for tick-counter games or non-real-time simulations.
If you want to read ahead, the full implementation of the shoot-em-up example lives in VRSFML, and uses parallel composition primitives.
shameless self-promotion
I offer training, mentoring, and consulting services.
If you are interested, check out romeo.training, alternatively you can reach out at
mail (at) vittorioromeo (dot) comor on Twitter.Check out my games on Steam, powered by VRSFML!
BubbleByte: A clicker-idle game where you recruit cats to pop bubbles. That’s it. Or is it…?
Open Hexagon: A fast-paced open-source arcade game with user-created content – the de-facto community-driven spiritual successor to Terry Cavanagh’s critically acclaimed Super Hexagon.
My book “Embracing Modern C++ Safely” is available from all major resellers.
- For more information, read this interview: “Why 4 Bloomberg engineers wrote another C++ book”
For C++ standards.↩︎
This isn’t unique to C++. Unity’s C#
IEnumeratorcoroutines have, in my opinion, similar problems for the same structural reasons: they hide their state behind iterator objects you can’t introspect, serialize, copy, or network. I can’t prove it, but I’d bet a non-trivial fraction of bugs in Unity games come from coroutine overuse – and a non-trivial fraction of save / multiplayer / replay headaches come from the same place.↩︎Yes, in theory the heap allocations can be elided via HALO (P0981). In practice, you can’t rely on it.↩︎
Both “allocation-free” and “trivially serializable” are claims about the coroutine machinery, not about whatever members you choose to put on the coroutine struct. If you stick a
std::stringorstd::vectoron it, those types have their own allocations and their own non-trivial serialization story – the coroutine doesn’t change that. The same caveat applies to pointers and references: they would need to be re-resolved on load, just like in any other game object.↩︎My fork of SFML natively supports Emscripten as a target. The exact same code works both on desktop and the browser.↩︎
The optimization can also silently break from a compiler update to another, see LLVM issue #64586 for an example.↩︎
You can provide a custom allocator by overriding
promise_type::operator new, but this adds more plumbing/complexity. It also doesn’t solve the rest of the problems.↩︎Maybe a future reflection API on top of P2996 will eventually let us peek inside the coroutine frame, but there’s no such proposal, and it would take years from now to become usable.↩︎
Here is a minimal implementation of a
generator<T>– adding iterator support would require even more code.↩︎The
Tweenhelper is kind of a tiny one-shot coroutine – start, tick-per-frame, done. The FSM author isn’t avoiding the coroutine pattern; they’re reinventing a more limited version of it for one specific case (linear motion). For arbitrary script-side control flow, they’d end up reinventing the rest too.↩︎You can mitigate this with relativization (
state = __LINE__ - first_line), the way Protothreads does internally. That gets you stability against edits outside the function – adding includes, reordering functions in the file, etc. – but it does not help with edits between yields inside the function. Aprintfinserted between two yields still shifts every state value below it.__COUNTER__survives that case unchanged.↩︎There’s a secondary failure mode for
__LINE__: two yields on the same physical source line (e.g.SFEX_CO_YIELD(...); SFEX_CO_YIELD(...);) collide and produce a duplicatecaselabel, which is a compile error.↩︎One possible solution is have users manually specify the
statevalue for resumption points. This is clearly ugly/cumbersome, and requires having the foresight of “leaving some space” between those values in case a newSFEX_CO_YIELDis going to be added in the middle.↩︎Any jump, including a jump to a
switchcase, is not allowed to skip over a declaration that initializes a variable, except for a few specific cases. Violating this rule will cause a compilation error.↩︎