What are the pros and cons of structuring an application as a pipeline of functions mutating a shared struct?


I could not find anything regarding this question except some lecture notes about game design and a book which describes something similar but not quite the same.

General Description

The approach is as follows.

There are two kinds of state in an application: Core State and Derived State. Core State is any state that cannot be readily derived from other state. Derived State is any state that is not core. As a trivial example, if you have N bananas and M apples, you can easily calculate how many fruits S you have with S = N + M. Thus N and M are core state, and S is derived.
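As a minimal sketch of that distinction (the `Fruit` type and its `total` method are hypothetical names used only for illustration), the core values are stored and the derived value is computed on demand:

```rust
/// Core state: values that cannot be computed from anything else.
struct Fruit {
    bananas: u32, // N
    apples: u32,  // M
}

impl Fruit {
    /// Derived state: S = N + M, recomputed on demand rather than stored.
    fn total(&self) -> u32 {
        self.bananas + self.apples
    }
}
```

Because `total` is never stored, it cannot drift out of sync with the two core fields.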

The main application components are Data, Systems and the Main Loop.

Data is represented as a single struct that contains all of the core state of the application, with every field publicly accessible. All fields within this struct should ideally be primitive types, PODs, generic collections (a hash set, a list, etc.) or utility types from the standard library (e.g. Option and Result in Rust).

A System is a function that is called in the Main Loop. It must never return a value; instead, it mutates the parts of Data that are passed into it. It must also never call another system directly. The second requirement means that systems can only communicate by modifying Data (or shared private state when grouped in a class).

Systems can be grouped into classes with private state, which should ideally be Derived. Instances of such a class are created before the Main Loop; their internal state must be initialized either during construction or in the Main Loop.
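A rough sketch of such a grouping might look like the following. The `DuckStats` type, its methods, the `Duck` definition with a `u32` hunger field, and the use of `Vec<String>` for log messages are all assumptions made for illustration:

```rust
// Hypothetical slice of Data: only the field this group of systems needs.
struct Duck {
    hunger: u32,
}

/// A group of systems sharing private state. `hungry_count` is Derived:
/// it can always be rebuilt from the ducks themselves.
struct DuckStats {
    hungry_count: usize,
}

impl DuckStats {
    /// Created once, before the Main Loop.
    fn new() -> Self {
        DuckStats { hungry_count: 0 }
    }

    /// System 1: refreshes the private derived state from core state.
    fn recount(&mut self, ducks: &[Duck]) {
        self.hungry_count = ducks.iter().filter(|d| d.hunger > 0).count();
    }

    /// System 2: reads the private state and reports only through Data.
    fn report(&self, log_messages: &mut Vec<String>) {
        log_messages.push(format!("{} hungry ducks", self.hungry_count));
    }
}
```

Neither method returns a value or calls the other; the Main Loop would call `recount` and then `report`, passing in the relevant fields of Data.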

The Main Loop is simply the outermost loop inside the main function that keeps the program running. This loop must be located in the main function and not hidden behind a function call.

Example

Below is a simple example of what code written in this fashion may look like. It is written in Rust, but the approach is not specific to Rust.

// Farm Data
#[derive(Default)]
struct Data {
    pub keep_going: bool,
    pub money: u32,
    pub time: Time,
    pub sheep: Vec<Sheep>,
    pub cows: Vec<Cow>,
    pub ducks: Vec<Duck>,
    pub food: u32,
    pub last_error: Option<FarmError>,
    pub log_messages: Vec<Log>,
}

fn feed_ducks(ducks: &mut [Duck], food: &mut u32, last_error: &mut Option<FarmError>) {
    for duck in ducks {
        if *food >= 2 {
            *food -= 2;
            duck.hunger -= 1;
        } else {
            *last_error = Some(FarmError::new("not enough food"));
        }
    }
}

// If you want to record logs, you must pass the array as a parameter to the place
// where it is needed.
// Using globals to push logs directly from any part of the code is discouraged
// as all application state should reside in Data.
fn sell_cows(
    cows: &mut Vec<Cow>,
    money: &mut u32,
    last_error: &mut Option<FarmError>,
    logs: &mut Vec<Log>,
) {
    // Sell the cows, get money.
}

fn buy_food(food: &mut u32, money: &mut u32) {
    // Buy food to feed ducks (using money).
}

fn shear_sheep(sheep: &mut [Sheep], last_error: &mut Option<FarmError>) {
    // Shear the sheep.
}

fn rest(logs: &mut Vec<Log>) {
    // Rest
}

fn print_logs(logs: &[Log]) {
    for log in logs {
        println!("{log}");
    }
}

fn clear_logs(logs: &mut Vec<Log>) {
    logs.clear();
}

fn check_bankrupt(money: u32, keep_going: &mut bool) {
    if money == 0 {
        *keep_going = false;
    }
}

fn main() {
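    // The derived Default sets keep_going to false, so the loop would never
    // run; set it explicitly.
    let mut data = Data {
        keep_going: true,
        ..Data::default()
    };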
    while data.keep_going {
        if data.time == Time::Morning {
            feed_ducks(&mut data.ducks, &mut data.food, &mut data.last_error);
        } else if data.time != Time::Night {
            sell_cows(
                &mut data.cows,
                &mut data.money,
                &mut data.last_error,
                &mut data.log_messages,
            );
            shear_sheep(&mut data.sheep, &mut data.last_error);
            buy_food(&mut data.food, &mut data.money);
        } else {
            rest(&mut data.log_messages);
        }
        check_bankrupt(data.money, &mut data.keep_going);
        print_logs(&data.log_messages);
        clear_logs(&mut data.log_messages);
    }
}

My questions are the following:

  1. Can this way of structuring an application survive in the real world and be used for almost any application imaginable? What are its pros and cons (including the most obvious ones) for a game versus, say, a microservice or other application?
  2. Can any of the problems be removed if some of the constraints were lifted?
  3. Does this approach scale to larger (game or non-game) projects? Can it make them simpler than using the current best practices?


This sort of thing does show up from time to time in data pipelines, compilers, and certain scripts.

Can this way of structuring an application survive in the real world and be used for almost any application imaginable?

No.

What are its pros and cons (including the most obvious ones) for a game versus, say, a microservice or other application?

Among other things, it has the pros and cons that come with public, global, mutable state (see “Why is Global State so Evil?”).

Can any of the problems be removed if some of the constraints were lifted?

Probably, but the worst of the problems surround the core idea: a single blob of state that all functions can access.

Does this approach scale to larger (game or non-game) projects?

No. This will scale extremely poorly. It is also a poor idea for most games, which need to selectively load content as the game progresses. Very few games can fit all of their assets in memory at the same time.

Maybe “Event Store” or “Event Sourcing” is a comparable pattern: https://learn.microsoft.com/en-us/azure/architecture/patterns/event-sourcing

I think the classic example is: “You have a document which you are editing. You add text, delete a paragraph, make some other text bold… Now you want to undo.”

If you have saved the start state and the various commands or events which have been applied to it, you can replay them up to any point you want, much like a Git repository.
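A minimal sketch of that replay idea, assuming a tiny text document and two made-up event kinds (`Event`, `apply` and `replay` are hypothetical names, not taken from any event sourcing library):

```rust
/// Hypothetical editing commands for a tiny text document.
#[derive(Debug)]
enum Event {
    Append(String),
    DeleteLast(usize), // remove the last n characters
}

/// Applies one event to the document state. Pure: no I/O, no side effects.
fn apply(mut doc: String, event: &Event) -> String {
    match event {
        Event::Append(text) => doc.push_str(text),
        Event::DeleteLast(n) => {
            let keep = doc.chars().count().saturating_sub(*n);
            doc = doc.chars().take(keep).collect();
        }
    }
    doc
}

/// Rebuilds the document as it was after the first `upto` events.
fn replay(start: &str, events: &[Event], upto: usize) -> String {
    events.iter().take(upto).fold(start.to_string(), apply)
}
```

Under these assumptions, undo is just `replay` with `upto` reduced by one; nothing has to be reversed because the start state and the event list are never modified.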

However, the obvious downside is that storing all the events takes up far more space than simply storing the end state. At some point you run out of space, or simply don’t need to keep years’ worth of changes.

Less obvious is the complexity of implementation. Say that whenever I set some text to bold I want to send an email: “text was set to bold!”. Now if I replay the event, should I send the email? How do I know not to send the email, but DO apply some other knock-on event, say “change colour of text to green”?
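One possible way to handle this, offered only as a sketch and not as something the pattern prescribes, is to have the event handler return external effects as data instead of performing them. Live code then executes the effects, while replay discards them. All names here (`Doc`, `Event`, `Effect`, `apply`, `replay`) are hypothetical:

```rust
/// Hypothetical event for the bold-text example.
enum Event {
    SetBold,
}

/// External effects, described as data so the caller decides whether to run them.
#[derive(Debug, PartialEq)]
enum Effect {
    SendEmail(String), // must not be repeated on replay
}

#[derive(Default, Debug, PartialEq)]
struct Doc {
    bold: bool,
    green: bool,
}

/// Updates state, including internal knock-on changes, and returns
/// the external effects the event would like to trigger.
fn apply(doc: &mut Doc, event: &Event) -> Vec<Effect> {
    match event {
        Event::SetBold => {
            doc.bold = true;
            doc.green = true; // knock-on state change: always reapplied
            vec![Effect::SendEmail("text was set to bold!".to_string())]
        }
    }
}

/// Replays history: state changes are reapplied, external effects are dropped.
fn replay(events: &[Event]) -> Doc {
    let mut doc = Doc::default();
    for event in events {
        let _ignored = apply(&mut doc, event);
    }
    doc
}
```

This does not make the complexity go away. Someone still has to classify every consequence of an event as either a state change or an external effect, and that classification is exactly where the difficulty lies.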

Say I change an event to fix a bug: “feed the ducks” should also feed any geese present. Do I replay the new version of the event on all the current data? Or do I keep both versions and remember when it changed?

Then you get all the problems related to functional programming, i.e. it just doesn’t match up with real life. You have to wait for buttons to be clicked, exceptions to be thrown, animations to play, environmental factors to be read in, etc.
