ErlMUD Commentary

Decomposition

Now that we have identified the role each element of the game serves we can think more clearly about how to implement them. Our goal is to avoid putting too much responsibility (really, too much complex behavior) into a single part of the system. The last thing we want to do is accidentally mix platform and gameplay concerns or sprinkle bits of data or rule systems into our game elements or simulation mechanics.

To avoid either hard-coding a value into a process or forcing too much intimacy between two modules we will often need to write auxiliary processes or modules to handle a specific aspect of a certain game element's duties. It is common for a non-trivial system to not just be a system of independent processes, but for each meta element of a system to itself be a group of linked processes which function together to accomplish a single conceptual task. In terms of Erlang we will find that links, monitors and system processes (where trap_exit is true) are not merely convenient in demonstrating certain programming concepts; they are the very cables and mortar that let us build and segregate parts of a large system without letting any individual piece grow too complex to understand.

In the last chapter we discussed some general insights into software and how those can be applied to the specific case of game system designs which require world simulation. That discussion could apply to a system written in assembler, C, Lisp, C++, Erlang or whatever. What follows is a detailed description of each game element (locations, objects and mobs) as it applies to a D&D style MUD game world, and some thoughts about how we might implement them in a strict message-passing/actor-model world like Erlang's. From this point on "game world" is assumed to mean "a MUD" and "implementation" is assumed to mean "written in Erlang".

Locations

A location is a self-contained place within the game world where actions which immediately affect mobs take place. Our guideline is that if a mob performs a physical, verbal, or emotive ("emote") action it should only be visible to or have a direct effect on another mob in the same location. This should be distinct from the chat system and probably most other actions that can occur at a distance.

There are two basic ways to implement locations. They could be entries in a universal registry where mobs, objects and associations among locations can be tracked in tables. Alternatively, they could be processes which do their own entity tracking and are directly responsible for arbitration of physical actions.

If we use a global registry then we have the problem that nearly every action in the world depends on this registry. If this were a single process it would be a pretty busy one, but guaranteeing consistency would be relatively simple. If it were a worker pool the load could be distributed, but consistency would be much more difficult to guarantee and the registry store itself would still be a bottleneck. Either way, we would be putting all our eggs in one basket. If this registry were to die so would the MUD. This goes against the principle of fault tolerance and is precisely the sort of thing Erlang was designed to avoid.

Fault-tolerance aside, consider how complex such a monster registry would become. Once we give too much responsibility to a single piece of code the tendency is to increase the number of responsibilities it handles over time. This is against every good practice we know as developers, but this is what we tend to do under the pressures of real life, business, and expectant users. It is quicker to hack "just one more" feature into an already overloaded system than to sort out a tangle of pre-existing responsibilities and then implement an additional feature orthogonally. One Registry To Rule Them All is actually a pretty normal design in an OOP or imperative style MUD, but I don't think it is a good fit for Erlang.

The other way, where each location is a process, makes more sense to me. If locations are processes then we avoid creating any bottlenecks and crashes can occur at a small enough scale that we can probably recover without causing a MUD-wide crisis. It is also easier for me to reason about because it is closer to being a genuine simulation of a world. We will try this first.

Each location will need to keep track of whatever mobs, objects and connections to other locations are present. If a location crashes its neighbors should be notified but carry on. The location's job is to arbitrate activity within the location (someone closes a door, drops an item, mob attacks mob, etc.), answer queries about its current state (who/what is here, what exits are available, etc.), and broadcast location-wide notifications (a mob enters the room, says something out loud, etc.). This is a fairly simple list of responsibilities that we can probably model in a single process without getting confused.

There is a lingering problem, though. Locations will need to track the status of adjacent locations and arbitrate the movement of mobs between themselves and those other places. This means two identical processes will have to communicate, and this communication will need to be synchronous. That means we need to watch out for deadlocks. Anytime two identical processes communicate synchronously this can happen because they both have identical "send foo, then listen for bar..." steps which can get out of sync (if they both "send foo" at exactly the same time, for example). Oh noes!

My way of avoiding this kind of deadlock is usually to create another process that takes care of the synchronous communication on behalf of one of the two principal processes. This lets me design a protocol that isn't guaranteed to have a deadlock built in. Whether our protocol actually avoids deadlocks is a different issue, but this way at least we're not certain to have one.

In this case I think it makes sense to create a "way" process that represents a one-way path between two locations. This lets us create interesting maps without adding new logic to the locations themselves (one-way portals, for example), and a method for implementing other game elements like shuttable/lockable doors, hidden paths, temporary portals, fuel-operated warp gateways, or whatever. Separating the idea of a "way" from a "location" gives us a lot of advantages both in terms of modeling the game world and in preventing kinks from emerging in our messaging protocol.
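Here is a minimal sketch of that idea, just to make the shape of the protocol concrete. Everything in it is an assumption made for illustration: the module name way_sketch, the message shapes, and the notion that a mob is nothing more than an atom in a list. The point it tries to show is that a location never waits on anything, while a way waits only on its destination location, so no cycle of waiting processes can form between two locations. It ignores what happens if a way dies mid-transfer.

```erlang
-module(way_sketch).
-export([start_location/1, start_way/1, occupants/1, move/3]).

%% Hypothetical minimal location holding a list of mob names.
start_location(Mobs) ->
    spawn(fun() -> location(Mobs) end).

%% A one-way path leading to the location To.
start_way(To) ->
    spawn(fun() -> way(To) end).

%% A location answers every message immediately and never waits
%% on another process, so two locations can never block each other.
location(Mobs) ->
    receive
        {occupants, From, Ref} ->
            From ! {Ref, Mobs},
            location(Mobs);
        {arrive, Way, Ref, Mob} ->
            Way ! {Ref, ok},
            location([Mob | Mobs]);
        {depart, From, Ref, Way, Mob} ->
            case lists:member(Mob, Mobs) of
                true ->
                    %% Hand the mob to the way and carry on.
                    Way ! {transfer, From, Ref, Mob},
                    location(lists:delete(Mob, Mobs));
                false ->
                    From ! {Ref, {error, not_here}},
                    location(Mobs)
            end
    end.

%% The way does the waiting, but only ever on its destination.
way(To) ->
    receive
        {transfer, From, Ref, Mob} ->
            ToRef = make_ref(),
            To ! {arrive, self(), ToRef, Mob},
            receive
                {ToRef, ok} -> From ! {Ref, ok}
            end,
            way(To)
    end.

%% Ask a location who is present.
occupants(Location) ->
    Ref = make_ref(),
    Location ! {occupants, self(), Ref},
    receive {Ref, Mobs} -> Mobs after 1000 -> timeout end.

%% Ask a location to send Mob through Way; the way confirms arrival.
move(Location, Way, Mob) ->
    Ref = make_ref(),
    Location ! {depart, self(), Ref, Way, Mob},
    receive {Ref, Result} -> Result after 1000 -> timeout end.
```

Starting two locations and a way with start_way(B), then calling move(A, Way, dog), should leave the dog listed in B and gone from A. A one-way portal is simply a way with no counterpart pointing back.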

Something we glossed over up to now is how to instantiate the locations. Where do they come from? Since we will probably have a map defined in permanent storage that gets pulled during startup/reload, a traditional supervisor will probably be a good fit. We can assume, then, that we'll probably need at least a "location manager" process that spawns locations and restarts them if they die untimely deaths. There might be other things we want a manager to do, like alter the map dynamically, but we don't have to worry with that for now. At this point it is sufficient to say that our second "unseen" element is a location manager that supervises the locations and that the locations are workers at the bottom of the supervision tree.

So there we have it, we've identified our first "unseen" elements: ways and location managers. Let's take a closer look at these and see if they reveal any more necessary components we have been blind to.

Location Manager

Our locations have to come from somewhere and to do that the location manager has to know what locations to create and how they are supposed to be connected. That means we have to think about how we want to represent the map data in Erlang terms. We'll worry with the details of internal data representation later when we actually start hacking around on a prototype location manager. What is important now is that we create a clear distinction between using data and manipulating data.

So what should our location manager not do? It should not generate, store or edit map data. Some other part of the system should be responsible for those things. It is probably not even a good idea to make the location manager know how to read map files directly, as we still don't know anything about how we're going to store the map data (files? database? map data service?). It is best to completely separate marshalled data tasks — everything to do with touching it in an external representation — from the task of turning a map file into an Erlang term that represents a map.

But wait! Now is a good time to stop and remember the YAGNI principle: You Ain't Gonna Need It.

We don't even have a working system yet and I'm postulating the future existence of data services. It is extremely important that we consider the distinction between data operations and operations on data, but it is even more important that we don't waste an inordinate amount of time on things that we don't (and may never) need. We can't run off on a technical tangent and start writing database bindings just because we might need them in the future. What we need to spend our time on is getting a basic system working first, but in a way that isn't blind to the future. My preferred way to deal with this sort of thing is to hide the concept of "retrieving data" behind a function call and move on. Later, if we actually do wind up with a data service then we can change that function and make it do more interesting things without touching the rest of the location manager code.
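As a rough sketch of what "hide it behind a function call" might look like, assume a hypothetical module map_source with a hard-coded world map. The data shape here (a tagged list of locations and one-way connections) is invented purely for the example and is not a commitment to any real format. The caller only ever sees world_map/0, so swapping the body for a file read or a query to a data service later would not disturb the calling code.

```erlang
-module(map_source).
-export([world_map/0, location_ids/0]).

%% Hypothetical map data, hard-coded for now.
%% locations: {LocationId, Description}
%% ways:      {FromLocationId, ToLocationId}, one entry per direction
world_map() ->
    [{locations, [{1, "A dusty crossroads"}, {2, "A quiet meadow"}]},
     {ways, [{1, 2}, {2, 1}]}].

%% Callers depend only on world_map/0, never on where the data lives.
location_ids() ->
    {locations, Locations} = lists:keyfind(locations, 1, world_map()),
    [Id || {Id, _Description} <- Locations].
```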

We've established that the location manager is not a data manager. But what are its responsibilities? At startup the location manager will be spawned as its own process by something higher up the system hierarchy. It will retrieve a definition of the world map, spawn each location as a linked process, and trap their exits. After spawning the locations it will loop, keeping track of live locations against the initial map configuration, and restart dead locations as necessary.
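The loop just described can be sketched in raw Erlang along the following lines. This is only an illustration under assumptions: the module name locman_sketch, the stand-in location process that does nothing but wait to be crashed, and the whereis message are all made up for the example. What it is meant to show is the combination of spawn_link and trap_exit that lets the manager notice a dead location and start a replacement under the same ID.

```erlang
-module(locman_sketch).
-export([start/1, whereis_location/2]).

%% Start a manager for the given list of location IDs.
start(LocIds) ->
    spawn(fun() -> init(LocIds) end).

init(LocIds) ->
    %% Trapping exits turns a linked location's death into a message.
    process_flag(trap_exit, true),
    Live = [{Id, spawn_location(Id)} || Id <- LocIds],
    loop(Live).

%% Live :: [{LocationId, Pid}]
loop(Live) ->
    receive
        {'EXIT', Pid, _Reason} ->
            case lists:keyfind(Pid, 2, Live) of
                {Id, Pid} ->
                    %% Restart the dead location under the same ID.
                    New = {Id, spawn_location(Id)},
                    loop(lists:keyreplace(Id, 1, Live, New));
                false ->
                    loop(Live)
            end;
        {whereis, From, Ref, Id} ->
            Reply = case lists:keyfind(Id, 1, Live) of
                        {Id, Found} -> {ok, Found};
                        false -> error
                    end,
            From ! {Ref, Reply},
            loop(Live)
    end.

spawn_location(Id) ->
    spawn_link(fun() -> location(Id) end).

%% Stand-in location: does nothing except crash on request.
location(Id) ->
    receive
        crash -> exit({crashed, Id})
    end.

%% Ask the manager for the current pid of a location.
whereis_location(Manager, Id) ->
    Ref = make_ref(),
    Manager ! {whereis, self(), Ref, Id},
    receive {Ref, Reply} -> Reply after 1000 -> timeout end.
```

Sending crash to a location and asking for the same ID again a moment later should return a different, live pid.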

We may want to build some more features into the location manager as we go, like maybe a way to ask for the pid of a location at some particular coordinate or location ID, but the above is the basic gist of what we want a location manager to do.

Ways

We recognized earlier that locations will have to arbitrate the movement of mobs around the map and that this will require synchronous communication among locations, which invites deadlock. Our way of avoiding that is to create an intermediary that functions as a one-way path between two locations. This process can handle synchronous communication on behalf of one location and provide a way to elevate the message protocol from being symmetric synchronous to asymmetric synchronous. But how does movement arbitration work? Which "direction" does a way "point"? How do locations know which ways are relevant? How do they get started? How do we write ways so they are easy to understand and manage?

The verbiage of ways can be a bit confusing. Each way is an exit from one location and an entrance to another. From the perspective of the way, however, its entrance is the departed location's exit, and its exit is the arrived location's entrance. To keep things straight I will mostly discuss ways from the perspective of the locations involved, not the ways themselves (but pay attention).

Ways are spawned by, linked to and logically extend the location on their exit side. In other words, ways are one-way entrances that belong to the location they provide a path toward. Exits from a location belong to whatever location is on the other side, and are monitored by the location they provide an exit from. If a way dies its location and all the other entrance ways to that location die as well. If an exit from a location dies the monitoring location knows that the attached location on the other side must have died as well and removes it from its list of available exits. There is no need to close an entering way if the location on the other side dies: the way can still exist, it will just be idle since no mobs can enter from that direction until the related location respawns.

When a location is spawned it will spawn and link its entrance ways. Way data includes the ID of the "in" and "out" locations on each side. Based on this the location can send an asynchronous message to each of its adjacent locations with the identity of the entrance facing it so it can begin monitoring the newly spawned way as an available exit. The location will also ask for a list of what exits are supposed to be available to it and attempt to start monitoring them on its own. Because all locations both broadcast their exits to adjacent locations and attempt to monitor their own exits on startup we don't have to worry about the order of location or way spawning, because eventually all exits from all locations will be monitored either by request or by the initial action of the spawned location.

Wait! We have a gap in functionality. Above I wrote that locations will request a list of the exits that should be available to them. Ask who? The location manager? It has a complete picture of what locations exist and what the map should look like, so it could search through every location definition and find what exits should be available by checking every single entrance on the map. Just because it could do that doesn't mean it should, though. This would be an ugly procedure if the data were left in pure map form, but we would be making the entire module ugly if we start adding a way registry to the system process that is supposed to be a location manager. It is much cleaner to instead create a "way manager" that keeps track of active ways and can be queried by "entrance" or "exit". To populate it with current data each way should register with the way manager as part of its initialization and the way manager should monitor each of the ways. The way manager is not a supervisor, though, and does not restart downed ways. It is just a simple utility for answering queries about ways. That's one more "unseen" element we hadn't thought about before: the way manager.
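A way manager of this kind could be sketched roughly as follows. The module name wayman_sketch, the message shapes and the {Pid, From, To} bookkeeping are assumptions for illustration only. The sketch registers ways, monitors them, forgets them when a 'DOWN' message arrives, and answers the two queries mentioned above. It never restarts anything. In a real system each way would call the registration function on itself during initialization.

```erlang
-module(wayman_sketch).
-export([start/0, register_way/4, exits_from/2, entrances_to/2]).

start() ->
    spawn(fun() -> loop([]) end).

%% Ways :: [{WayPid, FromLocationId, ToLocationId}]
loop(Ways) ->
    receive
        {register, Pid, From, To} ->
            %% Monitor, not link: a dying way must not take us down.
            erlang:monitor(process, Pid),
            loop([{Pid, From, To} | Ways]);
        {'DOWN', _MRef, process, Pid, _Reason} ->
            %% Forget dead ways; restarting them is not our job.
            loop(lists:keydelete(Pid, 1, Ways));
        {exits_from, Caller, Ref, LocId} ->
            Caller ! {Ref, [P || {P, F, _T} <- Ways, F =:= LocId]},
            loop(Ways);
        {entrances_to, Caller, Ref, LocId} ->
            Caller ! {Ref, [P || {P, _F, T} <- Ways, T =:= LocId]},
            loop(Ways)
    end.

%% Record that WayPid leads from location From to location To.
register_way(Manager, WayPid, From, To) ->
    Manager ! {register, WayPid, From, To},
    ok.

%% Ways that are exits from LocId.
exits_from(Manager, LocId) ->
    call(Manager, exits_from, LocId).

%% Ways that are entrances to LocId.
entrances_to(Manager, LocId) ->
    call(Manager, entrances_to, LocId).

call(Manager, Query, LocId) ->
    Ref = make_ref(),
    Manager ! {Query, self(), Ref, LocId},
    receive {Ref, Reply} -> Reply after 1000 -> timeout end.
```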

Way Manager

Why are we making a way manager instead of having locations act as their own way managers? The main irritation with this is the problem of data consistency. For each location to know its entrances and exits on startup it must be given that data. To do that either the location manager must receive separate way and location data and assemble them prior to spawning locations, or every location definition will have to maintain a copy of its own entrance and exit data. Because every location's entrance is another location's exit, this means that we will be duplicating data everywhere and that means we have to guarantee data consistency somehow. That sucks. If we are going to separate the location definitions from the way definitions anyway, we may as well have the way definitions managed by a queryable registry, and if we are going to do that we've already done all the work necessary for locations to determine their own way information by asking the registry.

There are, of course, other techniques we could use to make way information available, but at the beginning here it is easier to write a way manager and skip designing any more message protocol pieces than to be clever. I'm much more worried about outsmarting myself early on than I am writing a slightly inefficient way manager. Just like anything else in the design, we can revise the design of this part if we come up with something better later on.

Objects

Any inanimate thing in the world is an object: weapons, armor, consumables, light sources, money, clothing, food, containers, fountains, wagons, furniture, etc. Different types of objects may have special behaviors, like food that spoils over time, a weapon that wears out, or a piece of clothing that can only be worn on the head. Objects need a way of being created, destroyed, picked up, placed in containers, used as containers, used as ingredients or food, and subjected to any other in-game action we might come up with.

This tells us that the design of objects is not simple and probably will not fit within a single module. Perhaps a food object should be defined differently than a weapon object, but adhere to a generic object interface. Maybe objects should be a central digest of dispatchable behaviors instead of being processes. We're not sure at this point, but it is certain that being able to interact with objects in a natural way in the game is a system requirement.

I've got three basic ways in mind about how to implement objects. One is to make a central object registry that keeps track of all objects across the game world. Another is to have a library of object type definitions we can copy into a "live object" registry maintained by each interacting entity based on the type that is needed, and mobs, locations and container objects could pass the relevant data around to represent associations (passing an object would be passing the tuple that represents it). And yet another way is to make every object a process of its own, the way we do with locations.

The central registry style suffers from the same problems a central location registry would, so that's out. If we thought a location registry would become a complexity monster and system bottleneck, imagine how beastly and busy an object registry would be! There is a lot less travel and position checking in a MUD than object-influenced activity (every combat action might have to account for weapon, all clothing/armor, and total carried weight, for example), not to mention the variety of behaviors that would have to be coded into the object registry to implement diverse things like ingredients, food, equipment, etc. No, thanks.

Passing references around seems a nice solution at first, but this isn't the way Erlang works: we can't share memory. The entire tuple representing the object would have to be passed between locations, mobs, and other objects as messages, and each of these process types would have to know how to interact with them. Passing would also be another case of mandatory synchronous messaging between potentially identical processes, and we already know that's inviting complexity. Incidentally, screwing up the procedure of passing pointers or creating/destroying structs/objects to represent item passing is the source of an entire category of game bugs. Most item "dupe" bugs and money exploits are rooted in buggy data passing procedures that represent item transfer.

Naked data passing also breaks data encapsulation. If the shape of the object data ever changed that would cause a lot of surgery throughout several other modules. Record syntax could help us there a bit, but then we'd have to share record definitions across source files, and that gets wonky. (Actually, we'll see a version of this style once, and evolve away from it to point out why it is important to avoid.) A better way would be to wrap the object data behind a functional interface (that is, a functional implementation of objects as "abstract data types" or ADTs), and this might be a workable solution. But where would we put the influences of objects themselves?

Let's say we wanted to implement food that decays over time. We would need to write some logic in location code that updates the state of the food it contains to effect the decay, and this same logic would need to be available to mob and container object code. It's not that we couldn't put the actual logic in the food module, it's that we would have to remember to call it from within each containing process on a similar timer, and that's a major pain. Decay would be a property internal to food, and we would be exposing a lot of details of food to other modules just because we're putting the abstraction barrier farther away than an actor model environment allows for. This would be true whether we wrote an ADT module to wrap food as a type or made food a tuple as an OOP-style object by including functions as "methods" to operate on itself as parts of the tuple. No matter where those functions are something must call them, and since the data is tucked behind the encapsulation barrier that Erlang processes represent, the containing processes must do the calling. And where is the code for container objects now? Are container objects processes and other objects are not? How would wearable containers (like backpacks) be implemented without a lot of hackery?

On closer examination this is actually not sounding like such a good idea. It's simple to think about passing a tuple that represents an object in the game world, but not so simple as soon as we want those objects to exhibit some sort of behavior of their own. This method makes us expend a lot of effort swimming against the tide of Erlang's concurrency-oriented world.

That leaves us with designing objects as their own processes. This gives us the freedom we desire and makes thinking about objects much more simple. We can code "object logic" directly into them and give them their own processing loops, timers, state, etc. But what about the different categories of objects? How do we make food, weapons and containers fit into the same definition? We might not be able to. I'm not so sure that all objects will fit into a single module; we may need several to keep our code simple and understandable. But that's OK, because there are no constraints on process type with regard to message passing.

What we really need is a messaging protocol for objects. That way objects of a certain type can identify themselves when asked, have their own behaviors with regard to messages received and sent, and all look the same from the outside despite their internal differences. This is essentially the actor model version of duck typing. An oven could be a finite state machine, a container a service process, and food could be some new "food behavior" we have yet to invent. The point of defining objects as a protocol and not some specific process type is that it permits us to defer the details of any particular object type until later. We can play with this protocol early on by creating some generic object type at the outset which will grow to represent our basic object message protocol, and keep in mind that implementing special object types will require new modules that adhere to the basic protocol. Of course, to design a protocol we have to determine what to talk about first, and that means we have to establish some basic attributes common to all objects.

Note that we're suddenly dealing with the actor version of type class definitions. We probably won't write a different module for every single weapon in the game; that would suck. We want instead to write a single weapon module that represents all weapons, or maybe every weapon of a particular weapon category, and let the data we plug into a spawned weapon make it unique. We may use the Erlang concept of behaviors to ensure we don't screw up our object definitions, but that's an issue for a later time once we actually have a basic system working.

Here we have two unseen elements: generic object attributes, and the object messaging protocol. We knew at the beginning that we wanted to have objects in the game, but until now it wasn't clear that "objects" would have to be a category of process definitions which adhere to a messaging protocol as opposed to a single type of process or data structure.

Object Attributes

Here we want to iron out the attributes common to all objects in enough detail that we can write a basic system and demonstrate that it works, but no more. If we don't think through this part we'll never be able to create a coherent object protocol. Actually, our object protocol is likely to be ridiculous at first no matter what, but it will at least be a solid foundation rather than a practically useless concept proof that leaves out critical elements (like forgetting that all objects have weight because we only thought of weight in the context of encumbrance penalties for armor or whatever).

Let's start with the way we think of objects in the real world to get our MUD world on solid footing. Objects are made of matter. Matter takes up space and has mass. Mass means weight. Space means volume. We can work these into the game system without overcomplicating things. Or can we? If we implement volume we also have to implement volume checks (object A is light enough, but not small enough to fit into container B, etc.). That's probably not so hard to do in code, but it's a bit more detail than most gamers are used to troubling with (players of spreadsheets cleverly disguised as games aside, of course). Gameplay tradeoffs will definitely become more complex if we include volume considerations. This could be good or bad, depending on whether the resulting gameplay decisions are interesting or merely annoying. We shouldn't forget about this volume idea, but for now I just want to stick with weight as a basic property of objects; as I mentioned above, at this point we want only enough detail to prove to ourselves the system works.

Objects also have to carry enough information to identify their category or ilk (weapon, food, chest, etc.), name, target aliases, and description. For example, a player targets a rock by calling it a "rock" but when he examines it more closely a more detailed description of the rock should be displayed. Descriptions are our only way of conveying anything about the game world to the players since we're not implementing any graphics.

So there we have it, at least for now: objects have a category, a name, a set of aliases, a description and a weight. All objects exhibit these properties, regardless of type. We can implement a very primitive game world where this is true, and come back and make this more interesting a little later.

Object Messaging Protocol

The basic object protocol needs to be general enough that every object of any type can conform to it without making simple object implementations insane, but not so rigid that it interferes with whatever unique things a special object type should do. There is no doubt that the protocol itself will have to get ironed out a bit as we go along, but a good place to start with building a message protocol is basic interrogation of the object's properties. This is something we have a head start on since we already decided what the most basic object attributes are.

We're not going to get into the details of message construction here because the messages will change a lot as ErlMUD evolves from raw Erlang to OTP. What I want to establish now is the basic semantics of each message, and leave the details of how each message is built until later.

Objects need to respond to inquiries about their current state. So far the only things they can answer are questions about their attributes. What else might objects need to send or receive messages about? Objects don't tell a location they have been placed there, whatever is doing the placing is responsible for that. Objects are not responsible for knowing their current location (this may need to change, but not yet). We currently lack a concept of in-game ownership (Elder Scrolls games are an interesting reference for ownership effects), so possession equates to ownership. There really isn't anything else to do but create a way to ask an object about itself and receive an answer. A simple synchronous message is probably OK. Some way to say "tell me attribute X" and the object should send back attribute X; something along the lines of receiving {ask, {attribute, AttName}} and responding with {AttName, Value}. We'll iron out ways of actually doing this in code later, including thinking about situations where asynchronous or synchronous messages make the most sense. The only other thing we are certain all objects will be doing with data is receiving their initial information when they are spawned.
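To get a feel for the semantics (not the final message format), here is a small sketch of an object process that answers attribute questions. The {ask, {attribute, AttName}} request and the {AttName, Value} answer come from the paragraph above. Everything else is an assumption for the example: the module name object_sketch, the From/Ref envelope wrapped around the request so the caller can match its reply, the use of a map to hold attributes, and answering undefined for an attribute the object doesn't have.

```erlang
-module(object_sketch).
-export([start/1, ask/2]).

%% Spawn an object with its initial attributes, for example
%% #{category => misc, name => "a rock", weight => 2}.
start(Attributes) when is_map(Attributes) ->
    spawn(fun() -> loop(Attributes) end).

loop(Attributes) ->
    receive
        {From, Ref, {ask, {attribute, AttName}}} ->
            %% Unknown attributes are answered with 'undefined'
            %% (an assumption; the text does not say what to do).
            Value = maps:get(AttName, Attributes, undefined),
            From ! {Ref, {AttName, Value}},
            loop(Attributes)
    end.

%% Synchronous "tell me attribute X".
ask(Object, AttName) ->
    Ref = make_ref(),
    Object ! {self(), Ref, {ask, {attribute, AttName}}},
    receive
        {Ref, Answer} -> Answer
    after 1000 ->
        timeout
    end.
```

Any process that answers this one message the same way looks like an object from the outside, whatever it happens to be doing internally.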

Wait. Spawned? Who does that? Oh, darn. We need an object manager.

So the object message protocol is mercifully simple and we've discovered another "unseen" element: the object manager.

Object Manager

Objects have to come from somewhere, and where better than an object manager? The object manager will be started after the locations have been initialized, receive a list of what objects should exist, spawn each one in turn, and place it wherever it is supposed to load. The act of "placement" involves notifying the containing element that it is receiving an object, and the containing element must add it to its tracking inventory and confirm receipt to the object manager. Without sending a confirmation the object manager can't know whether the object is in inventory somewhere, accessible to the mechanics of gameplay, or effectively orphaned.
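The placement handshake might look something like the sketch below. The module name placement_sketch, the stand-in container process and the message shapes are all assumptions invented for the example. The caller monitors the container for the duration of the request, so it can tell the difference between a confirmed placement, a container that is dead, and one that simply never answers.

```erlang
-module(placement_sketch).
-export([start_container/0, place/2, inventory/1]).

%% Stand-in for any containing element (location, mob or container).
start_container() ->
    spawn(fun() -> container([]) end).

container(Inventory) ->
    receive
        {place, From, Ref, Object} ->
            %% Track the object, then confirm receipt to the placer.
            From ! {Ref, received},
            container([Object | Inventory]);
        {inventory, From, Ref} ->
            From ! {Ref, Inventory},
            container(Inventory)
    end.

%% Place Object in Container and wait for confirmation.
place(Container, Object) ->
    %% The monitor reference doubles as the request tag.
    MRef = erlang:monitor(process, Container),
    Container ! {place, self(), MRef, Object},
    receive
        {MRef, received} ->
            erlang:demonitor(MRef, [flush]),
            ok;
        {'DOWN', MRef, process, _Pid, _Reason} ->
            %% The container is gone: the object is effectively orphaned.
            {error, orphaned}
    after 1000 ->
        erlang:demonitor(MRef, [flush]),
        {error, unconfirmed}
    end.

inventory(Container) ->
    Ref = make_ref(),
    Container ! {inventory, self(), Ref},
    receive {Ref, Inventory} -> Inventory after 1000 -> timeout end.
```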

The object manager's task is a bit more complex than the location or way managers. In particular, where to load objects is not a simple thing. A "load" could be in a location, a mob's inventory or in a container object's inventory. There may also be a world total for objects of type X or global ratio between two objects which have probability X::Y of loading in a particular spot, etc. For now we're going to make the loading task flat and boring, but keep in mind that object loads are usually not as simple as that in a game, especially when it comes to re-loading objects which have been destroyed or consumed.

In addition to the non-trivial nature of object loads, the object manager is responsible for tracking live objects, restarting them if they crash, and notifying the containing element of the new identity of the recovered object. It is the object manager's job to rectify the situation if an object goes missing in error, as with a crash. Objects must be monitored by their containing element and removed when they crash or are consumed. This responsibility goes the other direction as well. If a containing element crashes the objects it contained should either crash as well or notify the object manager of their situation lest the population of instantiated objects becomes unhinged from the game world. Our objects are so simple right now that it's probably not possible for one to crash, but as specialized objects grow in behavioral complexity the likelihood for an object to crash will drastically increase.

Hrm... wait a second. Because we can't know how complex the interactions between containing elements and contained objects might eventually become (perhaps the presence of a cursed item will cause a container to become magically trapped or a magic item contributes a non-trivial buff to the mob holding it, etc.) we have to fall back on Erlang fundamentals. The purpose of "let it crash" and crash recovery is not to make a system that can't be damaged, it is to triage the damage so we can safely return to a known state. In this case, we can't be sure how complicated object interactions will become, so the safest thing to do is have containing elements kill their contained objects whenever they crash. So containers, whether locations, objects or mobs, should either link to their objects in inventory (so they all die together) or be monitored by the object manager so it can kill whatever else was contained. The simplest solution is to have everything die at once, and that means that if anything in a location dies, everything else in the location dies. This means bugs are location-wide grenades, but the catastrophe stops there. Objects, locations and mobs can follow whatever their respawn processes are without having to write a bunch of condition-checking code or trying to maintain a registry of internal information about object state just in case we have to restart them. After all, we're not intending to write objects that crash!
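The "everything dies together" option falls out of plain links almost for free. The sketch below is only an illustration: the module name linked_sketch and the stand-in container and object processes are made up. The container spawns its contents with spawn_link and does not trap exits, so an abnormal exit in any one of them travels across the links and takes the container and its other contents down too.

```erlang
-module(linked_sketch).
-export([start_container/1, contents/1]).

%% Start a container holding Count freshly spawned, linked objects.
start_container(Count) ->
    spawn(fun() ->
                  Objects = [spawn_link(fun object/0)
                             || _ <- lists:seq(1, Count)],
                  container(Objects)
          end).

%% The container does not trap exits, so a crash in any linked
%% object kills it, and its death in turn kills the other objects.
container(Objects) ->
    receive
        {contents, From, Ref} ->
            From ! {Ref, Objects},
            container(Objects)
    end.

%% Stand-in object: does nothing except crash on request.
object() ->
    receive
        crash -> exit(crashed)
    end.

contents(Container) ->
    Ref = make_ref(),
    Container ! {contents, self(), Ref},
    receive {Ref, Objects} -> Objects after 1000 -> timeout end.
```

Sending crash to one object of a two-object container should leave none of the three processes alive a moment later, which is exactly the grenade behavior described above.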

Another advantage of the "kill it all" approach is that we are certain to notice this sort of thing, which means that bugs in the system will get flushed out into the open rather quickly. It is acceptable to let a crash blow up a location so long as we have a clear understanding that the flames of chaos can never spread beyond that location and subject the entire MUD to collapse when something goes wrong.

Mobs

The denizens of the MUD. Whereas objects may exhibit some behavior, mobs initiate actions deliberately. They are the world creatures, the NPCs, the player avatars, the predators and the prey. Every beast, player, NPC, demon, perpetually kidnapped princess and all other sentient beings fall into this category. Anything that is attackable, killable, aggressive, prone to flee, talk-to-able, bound by in-game laws, factionalizable, grouppable, emotionally conscious, ownable as a pet, passengerable, or definable as friend or foe is a mob. Without them the MUD would be a pretty boring place.

Mobs can be arbitrarily complex, much like objects, and while a general definition of mob may suffice at the outset they will likely define a category of modules rather than a single module that can do everything. Also like objects, mobs will require a basic message protocol that lets us write a variety of mob types that do different things but interact with the system in a similar way. The main difference is that while it wasn't so obvious at first that objects should be processes it was always clear that mobs would be. In both cases we find ourselves settling on an initial, generic implementation of each so we can explore the message protocol in more detail, knowing that we will either be writing other modules that adhere to that protocol later on or writing extra modules that extend the generic one.

The similarities between objects and mobs only go so far. Objects don't have volition so they don't have players or AI controlling them. Mobs do things. How do we make those actions happen? What determines what action happens and when? If we put a dog mob somewhere, will he ever leave? Will he chase rabbits on his own? Where does that go? Do we write a new module every time we come up with a new mob type (and then have to change all of them if we ever decide to change the mob messaging protocol)? Ouch! Something is clearly missing here. We need something that controls the mobs, a place where we can tuck the logic of volition. Ah ha! Another "unseen" element!

While it would certainly be possible to write the AI elements directly into a mob, that would pose some problems. For example, every mob would need different AI or else the world will be flat and inflexible. Having an evil miniboss and the town blacksmith share the same AI would pose a problem. It also limits our options. What if we want to create a game world of pre-existing mobs where players dynamically take control of mobs? It's not traditional, but it's an exciting idea. What if game staff (usually called "immortals" or "imms") want to enhance roleplay possibilities by temporarily taking control of a village NPC and providing some live interaction? We can't do those things very easily if the controlling code and the mob code are intermixed. Well, to be specific, it might be possible to implement that sort of a system exactly once and never change it again, but smooth evolution of a codebase is one of the most important things we have to think about as developers and it's no good to tie a big Gordian Knot of concerns right from the start (just because this is what usually happens in the wild doesn't mean we have to make the same mistake). Besides, good AI work is hard enough on its own and it makes sense to create a concrete distinction between AI and mob code.

To perform actions mobs will have to know where to send messages that represent actions, be able to observe their situation, maintain an inventory, know what state they are in (awake, sleeping, in combat), and have some mechanism of determining the effects of received actions that may change their state. Mobs must have some way of being spawned, hurt, healed, killed, and respawned.

These are all either communication or internal state issues. Interacting with anything outside a mob's own inventory means interacting with something in another element's inventory, and this requires a query to that containing entity prior to taking an action. Most of the time this means a location's inventory of objects, mobs or available exits, but there are exceptions with global systems like chat. Interaction with anything in a mob's inventory first requires a lookup in the mob's internal inventory prior to taking an action. Any internal influences on a mob's actions require lookups into the current state of the mob's attributes prior to calculating the effect of the influence. That means we need to work out how these queries and lookups will work, and also need to decide what attributes mobs will have.

Mobs are probably going to be the most complex things in the MUD. Not only are they extended by a controller (and the player version of a controller will have to either include or connect to network code), they are also the only elements of the game that define direct actions and reactions with every other element in the game. Ouch. The basic procedures for accomplishing things within a mob are relatively simple; it is the variety of messages and actions that mobs must implement as part of their basic behavior that makes things complicated. To start with we will define a very minimal set of mob behaviors to make sure we're not going the wrong direction. As ErlMUD gains features it will naturally become a more complex system, but mobs, being the center of gameplay, will likely experience an explosion in complexity relative to other elements of the game. We will explore some ways to manage this complexity as ErlMUD evolves.

Mob Attributes

We already know that mobs are the active force of the game. We also know we're using a standardish RPG style system so character traits are what determine the outcome of nearly every action. Mob attributes are not going to be simple, but we aren't yet concerned with implementing the game system in full. We haven't discussed the combat system, action calculations or anything else of the sort because we already identified that most of this stuff can probably be wrapped up in minigames or temporary operating modes. We don't have to bother with implementing these aspects of gameplay to validate our architecture.

Right now we only require a basic world where mobs can move around, talk to each other, use the chat system, manipulate basic objects, and be controlled by a basic AI or player controller. That's it. If we get these parts correct, implementing a combat system on top of it should not be hard. Good combat systems are hard enough to get right by themselves, so I don't even want to think about that until we've got a basic system functioning. It's really an orthogonal task, and one that will be a lot easier to focus on when the time comes if the world and the mobs in it already work.

So what are basic mob attributes? Mobs will eventually require a rather extensive set of attributes and other state, but for now we're just trying to make them walk around without crashing. It's probably a good idea to stick with what we are familiar with and make them similar to basic object attributes for now: category, name, aliases, description and weight. Actually, we can leave weight off for now, because we don't care. At the moment we care more about what mobs do than what they are, so we'll just leave it at that.
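To make that list concrete, here is one way the four attributes might be carried around. This is only a sketch under my own assumptions: the module name mob_attrs, the choice of a map rather than a record or tuple, and the answers_to/2 helper are all made up for illustration.

```erlang
-module(mob_attrs).
-export([new/4, answers_to/2]).

%% Bundle the four basic attributes into a map (weight deliberately omitted).
new(Category, Name, Aliases, Description) ->
    #{category    => Category,
      name        => Name,
      aliases     => Aliases,
      description => Description}.

%% True if Word is the mob's name or one of its aliases.
answers_to(Word, #{name := Name, aliases := Aliases}) ->
    lists:member(Word, [Name | Aliases]).
```

The helper hints at why aliases exist at all: when a player types a word, something has to decide which mob in the room it refers to.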

Mob Messaging Protocol

The mob protocol is a bit complex because mobs won't work with just one protocol, they require several. Mobs are the hub of nearly all interaction in the game and being the center of activity means that mobs define as many protocols as the number of activities in which they engage. We should start with the minimum to permit a game world to work without getting lost in a jungle of message protocols.

Let's see how minimal of a list we can come up with and still prove that the world works: movement, observation, getting and putting objects, and chat.

That's it. We're leaving out all sorts of stuff like trade, combat, equipment, time, thieving, justice, etc. and just about every other conceivable gameplay element we might want later on. That's because we're just working on the most basic mob aspects right now, not the advanced parts of gameplay (much less building an actual game on top of all that yet).

Let's think through these for a moment.

Movement is probably the most complicated because it involves a collection of processes working together to accomplish a task. If we put a mob in a location and it wants to move we need some way to indicate this to the location (the location is the arbitration authority for this action). The location must check that the mob is in its inventory and that the outgoing way is available and open, and let it know what is going on. The way must let the receiving location know who is coming. The outgoing location needs to confirm that it dropped its record of the outgoing mob and the incoming location should acknowledge its receipt. As we can see, movement is non-trivial. It must be synchronous so that we can avoid asynchronous corner cases (like travel-related messages arriving really late, or the outgoing room sending a mob into empty space because it receives no confirmation of mob receipt, etc). What I'm really beating around the bush about is our need for a two-phase commit. We can cheat a little (failure in the MUD isn't the same as failure in accounting software), but a two-phase commit is basically what we are looking at when transferring a mob or object from one containing element to another.
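A rough sketch of the transfer itself may help. It cheats a lot: it is not a full two-phase commit, it ignores exits and doors entirely, and the move/3 function plays the part of the way from inside the calling process instead of being a process of its own. The module name way_sketch and the release/accept messages are assumptions made up for this illustration. The point is only the shape of the thing: every step is a synchronous call, and if the receiving location never confirms, the mob is put back where it came from.

```erlang
-module(way_sketch).
-export([start_location/1, mobs/1, move/3]).

%% A location here is just a process holding a list of mob names.
start_location(Mobs) ->
    spawn(fun() -> location(Mobs) end).

mobs(Loc) ->
    call(Loc, mobs).

%% Play the part of a way: take Mob out of From, hand it to To, and
%% put it back in From if To never confirms receipt.
move(Mob, From, To) ->
    case call(From, {release, Mob}) of
        ok ->
            case call(To, {accept, Mob}) of
                ok ->
                    ok;
                ArrivalError ->
                    call(From, {accept, Mob}),
                    {error, {arrival_failed, ArrivalError}}
            end;
        DepartureError ->
            {error, {departure_failed, DepartureError}}
    end.

%% Synchronous request; a dead or silent location yields an error term.
call(Pid, Request) ->
    Ref = erlang:monitor(process, Pid),
    Pid ! {self(), Ref, Request},
    receive
        {Ref, Reply} ->
            erlang:demonitor(Ref, [flush]),
            Reply;
        {'DOWN', Ref, process, Pid, Reason} ->
            {down, Reason}
    after 1000 ->
        erlang:demonitor(Ref, [flush]),
        timeout
    end.

location(Mobs) ->
    receive
        {From, Ref, mobs} ->
            From ! {Ref, Mobs},
            location(Mobs);
        {From, Ref, {release, Mob}} ->
            case lists:member(Mob, Mobs) of
                true ->
                    From ! {Ref, ok},
                    location(lists:delete(Mob, Mobs));
                false ->
                    From ! {Ref, not_here},
                    location(Mobs)
            end;
        {From, Ref, {accept, Mob}} ->
            From ! {Ref, ok},
            location([Mob | Mobs])
    end.
```

Even this toy has an unhandled corner: if the rollback call to the source location fails too, the mob is simply gone. That is exactly the sort of case the "blow up the whole location" rule is meant to absorb.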

Observation is much easier than this. It can even be done asynchronously in cases where the answer is not required to accomplish another action. Synchronous situations will be things like when a mob enters a location and gets its first view of the current contents (the action of non-blind entry requires an answer to observation queries). The mob asks another element to tell it something, waits for the answer, and then carries on with whatever it was up to. Asynchronous situations are things like a player asking to refresh the location view, check inventory, or peek at another mob's inventory, and especially when an observable event occurs in the same location. The controller sends the request to the mob, the mob sends a request for information, and whenever the mob receives an asynchronous information message it passes it to the controller. Because this asynchronous communication doesn't require an initiating request, the location can broadcast observations to all mobs when another mob enters, performs a visible action, or says something out loud. The most important thing to note here is that there are two types of observation protocols: synchronous and asynchronous query and receipt.

Getting and putting objects is similar to movement, but in this case instead of mobs changing from one location's inventory to another in a transaction managed by a way, an object is changing from one entity's inventory to another. When something is picked up it should be removed from the location's inventory. When it is dropped it should be added to the location and removed from the mob. If it is handed to another mob it should be removed from the giver and added to the recipient. Not too hard a task, but one which once again must be structured as a synchronous transaction to prevent weird cases like adding random objects to an inventory based on asynchronous receipt of old messages, or dropping objects from inventory even though the receiving process doesn't exist. Making this synchronous also means that two mobs can't get the same thing at once and accidentally dupe it in the process because the location will block on the pickup message, remove the object, and then handle the other mob's get action message. We do have a complication, though: two mobs passing an object will introduce the symmetric synchronous protocol deadlock problem again, and so we should create an intermediate process to handle the transaction the same way the ways do for locations. This is a process we can just spawn and kill once its work is done. We'll call these "hands" for lack of a better term. Drat! Another "unseen" module!

Chat would be hard to implement if it were synchronous. But it won't be! Ha ha! We'll just fire chat messages off into space and assume everything works out. With a bazillion chat messages flying at once I don't really care if one gets lost, and it won't crash the game system. Anyway, the Erlang runtime guarantees delivery of a message so long as its target is alive, and that is really all we care about. This promise is actually quite amazing. It makes creating things like chat systems very simple. But the chat system is not critical to the game world, just the users. (Famous last words: "just the users".) We can defer the global chat system until we decide whether it is really a mob or a controller issue, but make sure we do implement an asynchronous method for "saying" things in a location and having other mobs "hear" it.
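The two observation styles and the fire-and-forget "say" fit in one small sketch. As before, the names here (observe_sketch, the enter/say/look messages, the {observe, Event} wrapper) are assumptions for illustration and not ErlMUD's real protocol. look/1 is the synchronous query: the caller blocks until the location answers. enter/2 and say/3 are asynchronous: they return immediately, and the location broadcasts the event to whoever else is present.

```erlang
-module(observe_sketch).
-export([start/0, enter/2, look/1, say/3]).

%% A location that tracks which mob processes are present.
start() ->
    spawn(fun() -> loop([]) end).

%% Asynchronous: announce that Mob has entered, no reply expected.
enter(Loc, Mob) ->
    Loc ! {enter, Mob},
    ok.

%% Asynchronous: Mob says Text out loud, no reply expected.
say(Loc, Mob, Text) ->
    Loc ! {say, Mob, Text},
    ok.

%% Synchronous: block until the location reports who is present.
look(Loc) ->
    Ref = make_ref(),
    Loc ! {look, self(), Ref},
    receive
        {Ref, Mobs} -> Mobs
    after 1000 ->
        timeout
    end.

loop(Mobs) ->
    receive
        {enter, Mob} ->
            broadcast(Mobs, {entered, Mob}),
            loop([Mob | Mobs]);
        {say, Mob, Text} ->
            %% The speaker does not need to hear itself.
            broadcast(lists:delete(Mob, Mobs), {said, Mob, Text}),
            loop(Mobs);
        {look, From, Ref} ->
            From ! {Ref, Mobs},
            loop(Mobs)
    end.

%% Push an unsolicited observation to every listed mob.
broadcast(Mobs, Event) ->
    lists:foreach(fun(Mob) -> Mob ! {observe, Event} end, Mobs).
```

Because messages between one pair of processes arrive in the order they were sent, a mob that calls enter/2 and then look/1 will always see itself in the answer, with no extra synchronization.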

Hands

Hands are to mobs as ways are to locations. They provide a way to make symmetric synchronous messages avoid deadlock and offload the complexity of transaction handling. Hands will probably be used a lot less frequently than ways, so we can spawn them and let them die when their job is done without any issues. Because hands will probably be fairly similar to ways, at least in some respects, we will avoid thinking about them exhaustively here, commit to implementing ways first, and then work on hands once we've worked our way around the details of transactional transfer in ways. Because there will be quite a bit of inventory transferral in our MUD it's a good idea to be alert to opportunities to abstract the procedure of two-phase commit or transactional transfer so we can invoke it where needed instead of rewriting the same basic thing (possibly wrong) over and over.

Mob Controllers

Mob controllers represent a mob's volition and separate this idea from its corporeal existence. Mobs should be written to require exactly one controller be active at a time in the same way that real animals are only under control of one consciousness at a time (at least in the sense of "There can only be one!"). When a mob is spawned it will initially lack a controller and therefore spawn an AI unless a player controller is spawning the mob to begin with. What flavor of AI to spawn can be a part of the mob's definition, even subject to a randomizing factor (as in 70% of all cats are nuts, 30% docile).

This has some natural benefits. When we want to let a player take over it can spawn a controller to replace the existing one, and if the player logs off the mob can respawn its default controller. If a player's ISP flakes out or he switches from laptop to phone to play he can just log in again on a new device and that will cause a new controller to spawn which will (violently) replace the previous one. It makes no difference whether the previous controller was an AI or a player controller. The mob doesn't care and neither should we as designers. This simple rule means we don't have to write any special "takeover" procedure for mobs.
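The "violent replacement" rule is short enough to sketch. The module name mob_control and its messages are assumptions made up for this illustration, and the controllers are idle stand-ins with no behavior. The mob holds exactly one controller pid; on takeover it unlinks the old controller first (so the kill does not bounce back and take the mob with it), kills it, and links to the new one. The sketch does not cover respawning a default AI when a player logs off.

```erlang
-module(mob_control).
-export([start/0, take_over/2, controller/1, idle/0]).

%% Spawn a mob that starts out under a (stand-in) default AI controller.
start() ->
    spawn(fun() ->
        AI = spawn_link(fun idle/0),
        loop(AI)
    end).

%% Replace whatever controller the mob has with NewController.
take_over(Mob, NewController) ->
    call(Mob, {take_over, NewController}).

%% Ask the mob which controller it currently has.
controller(Mob) ->
    call(Mob, controller).

call(Mob, Request) ->
    Ref = make_ref(),
    Mob ! {Ref, self(), Request},
    receive
        {Ref, Reply} -> Reply
    after 1000 ->
        timeout
    end.

loop(Controller) ->
    receive
        {Ref, From, {take_over, New}} ->
            %% Unlink before killing, or the exit signal would kill us too.
            unlink(Controller),
            exit(Controller, kill),
            link(New),
            From ! {Ref, ok},
            loop(New);
        {Ref, From, controller} ->
            From ! {Ref, Controller},
            loop(Controller)
    end.

%% Stand-in controller: no volition at all, it just waits.
idle() ->
    receive
        stop -> ok
    end.
```

The mob never asks what kind of controller it is replacing, which is the whole point: AI or player, the old one is simply gone.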

Obviously mobs and their controllers will pass quite a few messages, more than I care to detail here. For now we only need to cut the unknown complexity down a bit by thinking about what categories of messages will be passed, from which direction they will initiate, and whether they should be synchronous or asynchronous.

In Erlang it's usually a good idea to default to asynchronous messages as much as possible. Fire-and-forget is simple to the operator. The problem is that while asynchronous messaging is always easy for the sender, it can be confusing for the receiver, and so is not always a good fit. We should default to writing everything asynchronously, and when cases arise where that is impractical switch to synchronous messaging (excluding cases where we know in advance that a transaction is called for). This is the way we will approach controller/mob messaging for now.

We know we will have commands (controller→mob), results (mob→controller), incoming info (mob→controller), incoming chat (mob→controller), and outgoing chat (controller→mob). We also know that mobs and controllers should be linked, so if one dies the other does as well (which, if we stick to the location-wide linkage discussed above, should also kill the location and everything in it). So far I don't see anything that should obviously be synchronous, except perhaps commands and responses. But with two processes linked I still don't see the necessity of synchronous messaging; if the mob takes a while to complete an action the controller shouldn't be blocked (that would limit our options for external mode or minigame possibilities which don't involve the mob directly).

So we'll go with asynchronous messaging for everything to begin with, and examine the protocols that result later on to determine whether this remains feasible. Since locations and everything in them explode when anything goes wrong we should fairly quickly be able to detect the presence of corner cases and bugs in the future.

Mob Manager

Oh look, another manager. In the same way the object manager receives information about what objects should exist and spawns whatever is necessary, the mob manager receives similar information and spawns whatever mobs are necessary. As discussed above, once a mob is spawned it will make sure it has a controller, so we don't need to worry about that sort of thing here. The primary job of the mob manager is respawning crashed mobs and keeping mob populations to appropriate levels.

That last bit about maintaining mob populations is the same issue we encountered with the object manager, where it must decide when to load objects based on what objects have been destroyed in the course of world events. The mob manager must similarly spawn mobs to replace ones that have died in the course of world events, but where object reloading is mostly trivial (a torch disappears and another appears, big deal) the mob respawn strategy requires finesse from the outset. We don't want "highly available" minibosses respawning the instant they are killed. We can do this by keeping a total count registry of mob/type, or by making each non-player mob instance have its own place in a registry (so you get exactly one-for-one replacement, maybe even contiguous AI history — an interesting idea), and assigning each a respawn delay, a set of possible respawn locations, or both. I'm inclined to try the "total count" version first, but based on whatever data storage representation makes the most sense early on I might switch to the contiguous registry idea. There isn't really an Erlangy way of handling this sort of issue, so we will have to explore on our own to determine what makes the most sense.
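The "total count" version boils down to a subtraction per mob type. Here is a minimal sketch of just that bookkeeping, assuming (my assumption, not a settled design) that the manager keeps two maps of mob type to count: how many we want and how many are alive. The module name mob_quota is made up, and respawn delays and locations are left out entirely.

```erlang
-module(mob_quota).
-export([shortfall/2]).

%% Desired and Live are maps of MobType => Count.
%% Returns a map of only those types that are under quota,
%% with the number of mobs still missing for each.
shortfall(Desired, Live) ->
    Missing = maps:map(
        fun(Type, Want) -> Want - maps:get(Type, Live, 0) end,
        Desired),
    maps:filter(fun(_Type, N) -> N > 0 end, Missing).
```

With five rats wanted and three alive, the manager learns it owes the world two rats; a type that is at or over quota does not appear in the result at all.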

Wet Clay, not Stone

None of this is set in stone. One of the wonderful things about Erlang's functional nature and strict message passing is the ease of hacking around until a system feels right. This means while we will begin writing a system that looks like the one described above, we might discover that it doesn't feel right as we go along or that some part of it is inviting unwarranted complexity into the model (attentive readers may have already spotted one glaring point of redundancy...).

Writing concurrent systems that share data is a lot like carving something in stone or cutting a diamond in that changing a part of the system can require a huge amount of work to make sure that the change fits within the surrounding system. Systems that adhere to the actor model, on the other hand, are more like working with wet clay. So don't fixate on any particular part of the system and be bold in your own experiments (that's what git branches are for!).