The Lightweight Nature of Erlang Processes

Understanding the difference between Erlang processes and OS processes can be a bit confusing at first, partly because the term “process” means something different in each case, and partly because the semantics of programming terms have become polluted by marketing, political and religious wars. A post to the Erlang questions mailing list asking why Erlang processes are so fast and OS processes are so slow reminded me of this today.

Erlang processes are more similar to the “objects” found in most OOP languages than to the “processes” managed by an OS kernel, but have proper message-passing semantics added on in a way that abstracts the OS network, pipe and socket mechanisms. We wouldn’t be surprised if the Python runtime handled its objects with less overhead than the OS kernel handles a process, of course, and it should come as no surprise that the Erlang runtime handles its processes with less overhead than the OS kernel does. After all, a Python “object” and an Erlang “process” are very nearly the same thing underneath.

Most OOP runtimes implement “objects” as a special syntactical form of a higher-order function, one that forms a closure around its state, includes pointers to methods as a part of that state (usually with their own special syntax that abstracts the difference between a label, a pointer and a variable) and returns a dispatch function which manages access to its internal methods. Once you get down to assembly, this is the only way things work anyhow (and on von Neumann architectures there is exactly zero difference between data, pointers to data, instructions and pointers to a next instruction). If you strip that special syntax away there is no practical difference between directly writing a higher-order function that does this and using the special class definition syntax.

Even in a higher-level language the higher-order functional nature of an “object’s” class definition can be illustrated. For example, the following Python class and function definitions are equivalent.

class Point():
    def __init__(self, x=0, y=0):
        self.x = x
        self.y = y

    def set_x(self, x):
        self.x = x

    def set_y(self, y):
        self.y = y

    def get_x(self):
        return self.x

    def get_y(self):
        return self.y


def gen_point(x=0, y=0):
    coords = {"x": x, "y": y}

    def set_x(x):
        coords["x"] = x

    def set_y(y):
        coords["y"] = y

    def get_x():
        return coords["x"]

    def get_y():
        return coords["y"]

    def dispatch(message, value=0):
        if message == "set x":
            set_x(value)
        elif message == "set y":
            set_y(value)
        elif message == "get x":
            return get_x()
        elif message == "get y":
            return get_y()
        else:
            return "Bad message"

    return dispatch

We would be utterly unsurprised that both the class definition and the function definition return entities that are lighter weight than OS processes. This is not so far from being the difference between Erlang processes and OS processes.

Of course, the above code is ridiculous to do in Python either way. The whole point of the language is to let you avoid dealing with this exact sort of code. Also, Python has certain scoping rules which are designed to minimize the confusion surrounding variable masking in dynamic languages — and the use of a dictionary to hold the (X, Y) state is a hack to get around this. (A more complete example that uses explicit returns and reassignment is available here.)
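
As a hedged aside, Python 3's `nonlocal` statement offers another way around the scoping rule mentioned above, without the dictionary. The sketch below is not the linked example, and `gen_point_nonlocal` is a name made up here purely for illustration.

```python
def gen_point_nonlocal(x=0, y=0):
    """Closure-based point that rebinds x and y directly (Python 3 only)."""

    def dispatch(message, value=0):
        # nonlocal tells Python that x and y belong to the enclosing
        # gen_point_nonlocal scope, so assignment rebinds them there
        # instead of creating new local names inside dispatch.
        nonlocal x, y
        if message == "set x":
            x = value
        elif message == "set y":
            y = value
        elif message == "get x":
            return x
        elif message == "get y":
            return y
        else:
            return "Bad message"

    return dispatch


p = gen_point_nonlocal(1, 2)
p("set x", 10)
print(p("get x"), p("get y"))  # prints: 10 2
```

The state still lives in the closure; only the mechanism for rebinding it differs from the dictionary version.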

For a more direct example, consider how this can be done in Guile/Scheme:

(define (point x y)
  (define (setter coord value)
    (cond ((eq? coord 'x) (set! x value))
          ((eq? coord 'y) (set! y value))))
  (define (getter coord)
    (cond ((eq? coord 'x) x)
          ((eq? coord 'y) y)))
  (define (dispatch m)
    (cond ((eq? m 'set) setter)
          ((eq? m 'get) getter)
          (else (error "point: Unknown request"))))
  dispatch)

OOP packages for Lisps wrap this technique in a way that abstracts away the boilerplate and makes it less messy, but it’s the same idea. This can be done in assembler or C directly as well. Equivalent examples are a bit longer, so you’ll have to take my word for it. (A commented version of the Guile example above can be found here.)

While OOP languages typically focus on access to state and access to methods as state, Erlang focuses like a laser on the idea of message passing. Easy, universal access to state in OOP languages makes it natural to do things like share state, usually by doing something innocent like declaring a name in an internal scope that points to an independent object from somewhere outside.

Erlang forbids this, and forces all data to either be a part of the definitions that describe the process (things declared in functions or their arguments), or go through messages. Combined with recursive returns and assignment in a fresh scope (akin to the last Python example in the extra code file) this means state is effectively mutable and side effects can occur without violating single assignment, but that everything that changes must change in an explicit way.
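
To make the “recursive returns and assignment in a fresh scope” idea a little more concrete, here is a rough Python sketch (it is not Erlang, and it is not the code from the extra file). `point_loop` and the message tuples are hypothetical names chosen for illustration, and the mailbox is simply a list of messages consumed one at a time. No name is ever reassigned: each recursive call gets fresh `x` and `y` bindings.

```python
def point_loop(x, y, mailbox, replies=()):
    """Handle messages one at a time, passing new state to the next call."""
    if not mailbox:
        # No messages left: return the final state and collected replies.
        return (x, y), replies

    message, rest = mailbox[0], mailbox[1:]

    if message[0] == "set_x":
        # "Changing" x means recursing with a new value bound to x.
        return point_loop(message[1], y, rest, replies)
    if message[0] == "set_y":
        return point_loop(x, message[1], rest, replies)
    if message[0] == "get":
        # Reply with a snapshot of the state as it is at this point.
        return point_loop(x, y, rest, replies + ((x, y),))
    # Unknown messages are dropped; state carries over unchanged.
    return point_loop(x, y, rest, replies)


state, replies = point_loop(0, 0, [("set_x", 5), ("get",), ("set_y", 7), ("get",)])
print(state)    # prints: (5, 7)
print(replies)  # prints: ((5, 0), (5, 7))
```

Python does not eliminate tail calls, so this form is bounded by the recursion limit and is only meant to show the shape of the idea.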

This restriction comes at the cost of requiring a sophisticated routing and filtering system. Erlang has an unusually complete message concept, going far beyond the “signals and slots” style found in some of the more interesting OOP systems. In fact, Erlang goes so far with the idea that it abstracts messages, filters, a process scheduler and the entire network layer with it. And hence we have a very safe environment for concurrent processing, using “processes” that certainly feel like OS type processes, but are actually named locations Erlang’s runtime keeps track of in the same way an OOP runtime does objects, functions and other declared thingies. They feel like OS processes because of the way Erlang handles access to them, in the same way that Java objects feel like my mother-in-law’s purse because of the way the JVM handles access to them, but underneath they are much more alike each other than either is to OS processes.
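
A toy illustration of “named locations the runtime keeps track of” might look like the following Python sketch. It assumes a single-threaded, cooperative scheduler built on generators; `ToyRuntime`, `counter` and `collector` are all invented for this example, and the real Erlang runtime does far more (preemption, memory isolation, distribution). The point is only that each “process” here is an ordinary heap object plus a mailbox, so spawning thousands of them is cheap.

```python
from collections import deque


class ToyRuntime:
    """A toy, single-threaded stand-in for a runtime that tracks 'processes'."""

    def __init__(self):
        self.mailboxes = {}   # pid -> deque of pending messages
        self.processes = {}   # pid -> generator object (a plain heap object)
        self.ready = deque()  # one pid entry per undelivered message
        self.next_pid = 0

    def spawn(self, behaviour, *args):
        pid = self.next_pid
        self.next_pid += 1
        self.mailboxes[pid] = deque()
        proc = behaviour(self, pid, *args)
        next(proc)            # run up to the first 'receive' (yield)
        self.processes[pid] = proc
        return pid

    def send(self, pid, message):
        # Sending only queues data for the target; nothing runs yet.
        self.mailboxes[pid].append(message)
        self.ready.append(pid)

    def run(self):
        # Deliver one message at a time until nothing is pending.
        while self.ready:
            pid = self.ready.popleft()
            proc = self.processes.get(pid)
            if proc is None:
                continue      # process has exited; its message is ignored
            try:
                proc.send(self.mailboxes[pid].popleft())
            except StopIteration:
                del self.processes[pid]


def counter(runtime, pid, count):
    while True:
        message = yield       # 'receive': wait for the next message
        if message[0] == "add":
            count += message[1]
        elif message[0] == "report":
            runtime.send(message[1], ("count", count))


def collector(runtime, pid, log):
    # 'log' is an outside list, used only so results can be observed here.
    while True:
        message = yield
        log.append(message)


rt = ToyRuntime()
log = []
sink = rt.spawn(collector, log)
for _ in range(10000):
    pid = rt.spawn(counter, 0)
    rt.send(pid, ("add", 2))
    rt.send(pid, ("add", 3))
    rt.send(pid, ("report", sink))
rt.run()
print(len(log), log[0])  # prints: 10000 ('count', 5)
```

Ten thousand of these “processes” are created and run without the OS kernel being asked for anything beyond ordinary memory, which is the sense in which they resemble objects more than OS processes.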

In the end, all this stuff is just long lines of bits standing in memory. The special thing is the rules we invent for ourselves that tell us how to interpret those bits. Within those rules we have various ways of declaring our semantics, but in the end the lines of bits don’t care if you think of them as “objects”, as “processes”, as “closures”, as “structs with pointers to code and data” or as “lists of lists with their own embedded processing rules”. OSes have particularly heavy sets of rules regarding how bits are accessed and moved around. Runtimes tend not to. Erlang “processes” are of a kind with Python “objects”, so we shouldn’t be surprised that they are significantly lighter weight than the “processes” found in the OS.

3 thoughts on “The Lightweight Nature of Erlang Processes”

  1. I think at a philosophical level Erlang processes are quite similar to OS processes in their isolation, independence and how they communicate via messages and signals. Yes, Erlang processes are very much lighter and more like OO objects in this respect, but considering how they behave they are more like OS processes.

  2. @Robert Virding:
    Philosophically, perhaps. On the other hand, it is entirely valid to view operating systems as userland applications when run within a QEMU container. From this perspective the differences between an operating system, an application environment and (more generally) an execution environment blur considerably. This is analogous to the blurring of “data” and “code”, “object” and “process”, and “class” and “function”.

    My point is that we’ve made up meanings for these words, but the semantics are overloaded to the point that we wind up confusing ourselves. We try to understand “why an Erlang process is lighter weight than an OS process” because we’re tripped up over this term “process”. In the same way anyone who started programming after 1998 or so gets confused when reading pre-Java-era texts because the term “object” was used in a much broader fashion (and still is in, say, the Lisp community) before the Java marketing machine changed the meaning of the term.

    Erlang and the OS are fundamentally different things; the execution environment dictates the way processes are handled. Even if our interaction with the two may be superficially similar in some (many?) ways, they are fundamentally different, and Erlang processes are, underneath, akin to OO objects. In this I am implying that interface does not equate to identity, which is, admittedly, a point of view that assumes that programmers do not adhere wholeheartedly to duck typing in their minds. This is easily disputed.

  3. Hi Craig,

    Although 90% of the things went over my head, the thing that came to my mind was why can’t an OS be programmed in a language like Erlang.

    I really think I need SICP just to get a little hang of your vocab :)

    Some day, I am sure I will grok all your blogs. :)

    Thx for the blog.
