The Intellectual Wilderness
There is nothing more useless than doing efficiently that which should not be done at all.

2016.12.7 12:53

Erlangers! USE LABELS! (aka “Stop Writing Punched-in-the-Face Code Blocks”)

Filed under: Computing — zxq9 @ 12:53

Do you write lambdas directly inline in the argument list of various list functions or list comprehensions? Do you ever do it even though the fun itself, or the other arguments or return assignment/assertion for the call are too long and force you to scrunch that lambda’s definition up into an inline-multiline ball of wild garbage? YOU DO? WTF?!?!? AHHHH!

First off, realize this is incredibly impolite to other people and your future self whenever you do it. There is a big difference for the human reading between:

%%% From shitty_inline.erl

do_whatever(Keys, SomeParameter) ->
    lists:foreach(fun(K) -> case external_lookup(K) of
                  {ok, V} -> do_side_effecty_thing(V, SomeParameter);
                  {error, R} -> report_some_failure(R)
          end end, Keys).


%%% From shitty_listcomp.erl

do_whatever(Keys, SomeParameter) ->
    [fun(K) -> case external_lookup(K) of
        {ok, V} -> do_side_effecty_thing(V, SomeParameter);
        {error, R} -> report_some_failure(R) end end(Key) || Key <- Keys],
    ok.


%%% From less_shitty_listcomp.erl

do_whatever(Keys, SomeParameter) ->
    ExecIfFound = fun(K) -> case external_lookup(K) of
            {ok, V} -> do_side_effecty_thing(V, SomeParameter);
            {error, R} -> report_some_failure(R)
        end end,
    [ExecIfFound(Key) || Key <- Keys],
    ok.


%%% From labeled_lambda.erl

do_whatever(Keys, SomeParameter) ->
    ExecIfFound =
        fun(Key) ->
            case external_lookup(Key) of
                {ok, Value}     -> do_side_effecty_thing(Value, SomeParameter);
                {error, Reason} -> report_some_failure(Reason)
            end
        end,
    lists:foreach(ExecIfFound, Keys).


%%% From isolated_functions.erl

-spec do_whatever(Keys, SomeParameter) -> ok
    when Keys          :: [some_kind_of_key()],
         SomeParameter :: term().

do_whatever(Keys, SomeParameter) ->
    ExecIfFound = fun(Key) -> maybe_do_stuff(Key, SomeParameter) end,
    lists:foreach(ExecIfFound, Keys).

maybe_do_stuff(Key, Param) ->
    case external_lookup(Key) of
        {ok, Value}     -> do_side_effecty_thing(Value, Param);
        {error, Reason} -> report_some_failure(Reason)
    end.

Which versions force your eyes to do less jumping around? How about which version lets you most naturally understand each component of the code independently? Which is more universal? What does code like this translate to after erlc has a go at it?

Are any of these difficult to read? No, of course not. Every version of this is pretty darn basic and common: you need a listy operation but require a closure over some in-scope state to make it work right, so you really do need a lambda instead of being able to look all suave with a fun some_function/1 type thing. So we agree, taken by itself, any version of this is easy to comprehend. But when you are reading through hundreds of these sorts of things at once to understand wtf is going on in a project while also remembering a bunch of other trash code that is lying around and has side effects while trying to recall some detail of a standard while the phone is ringing… things change.

Do I really care which way you do it? In a toy case like this, no. In actual code I have to care about forever and ever — absolutely, yes I do, especially if the lambdas get larger. The fifth version is my definite preference, but the fourth will do just fine also.

(Or even the third, maybe. I tend to disagree with the semantic confusion of using a list comprehension to effect a loop over a list of values only for the side effects without returning a value, partly because this is semantically ambiguous, and also because whenever possible I like every expression of my code to either be an assignment or an assertion (so every line should normally have a = on it). In other words, use lists:foreach/2 in these cases, not a list comp. I especially disagree with using a listcomp here because the main utility of a list comprehension is normally to achieve a closure over local state, but here we are just calling another closure. So semantic fail there, twice.)
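To make the semantic difference concrete, here is a minimal sketch (the module name and the Print fun are made up for illustration). It shows that lists:foreach/2 returns just ok, while the list comprehension builds a list of return values that nobody asked for:

```erlang
-module(foreach_vs_listcomp).
-export([demo/0]).

demo() ->
    %% Hypothetical side-effecty fun; io:format/2 returns ok.
    Print = fun(X) -> io:format("~p~n", [X]) end,
    %% lists:foreach/2 is called purely for side effects and returns ok.
    ok = lists:foreach(Print, [1, 2, 3]),
    %% The comprehension performs the same side effects, but also builds
    %% a list of each call's return value.
    [ok, ok, ok] = [Print(X) || X <- [1, 2, 3]],
    ok.
```

Both lines print the same three numbers; only the first one says "loop for effect" to the reader.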

But what about my lolspeed?!?

I don’t know, but let’s see. I’ve created five modules, based on the above examples:

  1. shitty_inline.erl
  2. shitty_listcomp.erl
  3. less_shitty_listcomp.erl
  4. labeled_lambda.erl
  5. isolated_functions.erl

These all call the same helpers that do basically nothing important other than having actual side effects when called (they call io:format/2). What we are interested in here is the generated assembler. What is the cost of introducing these labels that help the humans out VS leaving things all messy the way we imagine might be faster for the runtime?
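For anyone who wants to compile the examples themselves, stubs along these lines would do. This is only a sketch under assumptions: the helper names come from the examples above, but the bodies (and the rule that even integers are "found") are invented here, and in the examples they are called as local functions, so each module would need its own copy or an import:

```erlang
-module(label_helpers).
-export([external_lookup/1, do_side_effecty_thing/2, report_some_failure/1]).

%% Hypothetical lookup: even integers are "found", anything else is not.
external_lookup(Key) when is_integer(Key), Key rem 2 =:= 0 ->
    {ok, Key * 10};
external_lookup(Key) ->
    {error, {not_found, Key}}.

%% The only "real" work is the side effect of printing.
do_side_effecty_thing(Value, Param) ->
    io:format("value: ~p, param: ~p~n", [Value, Param]).

report_some_failure(Reason) ->
    io:format("lookup failed: ~p~n", [Reason]).
```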

It turns out that just like with using assignments to document your code, there is zero cost to label functions. For example, here is the assembler for shitty_inline.erl side-by-side with labeled_lambda.erl:

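[Screenshot: assembler listings for shitty_inline.erl and labeled_lambda.erl, side by side. Caption: Oooh, look. The exact same stuff!]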

(This is a screenshot, a text file with the contents shown is here: label_example_comparison.txt)

See? All that annoying-to-read inline lambdaness buys you absolutely nothing. You’re not helping the compiler, you’re not helping the runtime, and you are hurting your future self and anyone you want to work with on the same code later. (Note: You can generate preprocessed source listings with erlc -P and erlc -E, and assembler output with erlc -S. Here is the manpage. Play around with it a bit; BEAM and EVM are amazing platforms, wide open for exploration!)
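The same listings can also be produced from inside the Erlang shell, since compile:file/2 accepts 'P', 'E' and 'S' as options. A rough sketch (the module name listings and the function generate/1 are made up for this example):

```erlang
-module(listings).
-export([generate/1]).

%% File is a source file name without the .erl extension,
%% e.g. "labeled_lambda". Each option writes a listing file
%% (.P, .E or .S) instead of producing object code.
generate(File) ->
    [{Opt, compile:file(File, [Opt])} || Opt <- ['P', 'E', 'S']].
```

Calling listings:generate("shitty_inline") and listings:generate("labeled_lambda") and then diffing the two .S files is a quick way to repeat the comparison.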

So use labels.

As for execution speed… all of these perform basically the same, except for the last one, isolated_functions.erl. Here is the assembler for that one: isolated_functions.S. This outperforms the others, though to a relatively insignificant degree. Of course, it is only an “insignificant degree” until that part of the program is the most critical part of whatever your program does — then even a 10% difference may be a really huge win for you. In those cases it is worth it to refactor to test the speed of different representations against each version of the runtime you happen to be using — and all thoughts on mere style have to take a backseat. But this is never the case for the vast majority of our code.
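If you do land in one of those cases, a crude timing harness is enough to get a first impression. The following is only a sketch, not the method used for the numbers above: bench_sketch and time_us/2 are hypothetical names, and timer:tc/1 measures wall-clock time, so io:format-heavy helpers will dominate whatever you measure.

```erlang
-module(bench_sketch).
-export([time_us/2]).

%% Runs Fun() N times and returns the total elapsed time in microseconds.
time_us(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    {Micros, ok} = timer:tc(fun() -> repeat(Fun, N) end),
    Micros.

%% Plain tail-recursive loop so the harness itself adds little overhead.
repeat(_Fun, 0) -> ok;
repeat(Fun, N)  -> _ = Fun(), repeat(Fun, N - 1).
```

Something like bench_sketch:time_us(fun() -> labeled_lambda:do_whatever(Keys, Param) end, 100000), repeated for each module and each runtime version you care about, gives numbers you can actually compare.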

(I’ve read reports in the past that indicate 99% of our performance bottlenecks tend to reside in less than 1% of our code by line count — but I can’t recall the names of any just now. If you happen to find a reference, let me know so I can update this little parenthetical blurb with some hard references.)

My point here is that breaking every lambda out into separate named functions isn’t always worth it. Sometimes an in-place lambda really is more idiomatic and easier to understand simply because you can see everything right there in the same function body. What you don’t want to see is multi-line lambdas squashed into argument lists, which make things hard to read and compile to exactly the same result as labeling that lambda with a meaningful variable name on another line and then referring to it where it is invoked later.


  1. “1337” :-)

    Comment by Mahesh Paolini-Subramanya — 2016.12.7 21:43 @ 21:43

  2. I’ve always assumed the alternatives resulted in the same basic code, and I’ve always preferred the in-place lambda simply because it is easier to read. Thanks for demonstrating that readability does not come at the cost of performance.

    Comment by Nathan Fiedler — 2016.12.8 03:20 @ 03:20

  3. This is nice to see. I find myself increasingly doing things like avoiding anonymous functions, using maybe_do_stuff-type functions, even trying to eradicate case statements. Some of it might not be entirely rational (I just don’t like case statements) but I end up with code that I like looking at. It’s nice to see that (a) I might not be completely nuts and (b) it probably doesn’t harm my lolspeed scores.



    Comment by Ivan — 2016.12.12 23:37 @ 23:37

  4. Hi, Ivan.
    It is certainly possible to go overboard (and a really trivial case like the contrived toy example above is pretty easy to grok in any version), but the more production code I write that I have to read later and share with others, and the less trivial the cases get, the more important strict adherence to this style becomes. I’ve never met a new Erlanger who thought this was important, and I’ve yet to meet full-time production coders with more than a few years’ experience who have tried both ways and prefer reading the snaggletoothed style.

    Regarding anonymous functions…
    As a general rule I don’t use them, but they have one critical place of importance, and that is closing over some things that are in the current scope in a way that curries arguments out so you can change an explicit “pass the entire universe through each call” recursive routine into a list operation. For example, if I have a choice between making a substantial section of a module a huge fold operation where the accumulator is the process’ entire state record or currying 90% of that out by closing over only the relevant bits, of course the lambda is the more clear route. But I never write those sort of lambdas without labeling them within the function body.

    Regarding case
    I don’t really like cases, either. Sometimes pulling all the cases out into function head matches is very readable. Sometimes it is awkward. I don’t particularly like matching a function on a result of a standard lib call of type {ok, Value} | {error, Reason}, for example. I’ll usually use a case in these situations. Nested cases, though, are almost always awkward — and that’s where I draw the line. Every time I see a nested case I see an instance where the essence of some complex decision has not been fully decomposed and refactoring it has a very strong tendency to make not only the code more clear, but the nature of the problem more obvious — more often than not that turns out to be a huge win that extends far beyond simply removing an eyesore in the code.

    It’s funny… I remember a few years ago totally not understanding why the old guys felt this way. And now I’m preaching it myself. Hah! Sort of like how when you get older you catch yourself saying stuff your parents used to say… and then realize maybe they weren’t as dumb as you thought they were! My views on some Erlang things will no doubt continue to evolve with time. I’ve been alive long enough now to have changed my mind on so many things that I can’t believe that all of my opinions are objectively right — but the things I write about here are things I have found to be most effective in working with teams and working after myself.

    That’s a long way of saying that no, you are not nuts! (And the NEC, Tsuriai, Inaka, etc. coding standards authors agree with you — as do the likes of Garrett Smith, ROK and a few other big hats — and that’s a pretty good indicator.)

    Comment by zxq9 — 2016.12.13 07:04 @ 07:04
