Monthly Archives: December 2016

Erlangers! USE LABELS! (aka “Stop Writing Punched-in-the-Face Code Blocks”)

Do you write lambdas directly inline in the argument list of various list functions or list comprehensions? Do you ever do it even though the fun itself, or the other arguments or return assignment/assertion for the call are too long and force you to scrunch that lambda’s definition up into an inline-multiline ball of wild shit? YOU DO? WTF?!?!? AHHHH!

First off, realize this makes you look like a douchebag for not being polite to other people or your future self whenever you do it. There is a big difference for the human reading between:

%%% From shitty_inline.erl

do_whatever(Keys, SomeParameter) ->
    lists:foreach(fun(K) -> case external_lookup(K) of
                  {ok, V} -> do_side_effecty_thing(V, SomeParameter);
                  {error, R} -> report_some_failure(R)
                end
          end, Keys
    ).

and

%%% From shitty_listcomp.erl

do_whatever(Keys, SomeParameter) ->
    [fun(K) -> case external_lookup(K) of
        {ok, V} -> do_side_effecty_thing(V, SomeParameter);
        {error, R} -> report_some_failure(R) end end(Key) || Key <- Keys],
    ok.

and

%%% From less_shitty_listcomp.erl

do_whatever(Keys, SomeParameter) ->
    ExecIfFound = fun(K) -> case external_lookup(K) of
            {ok, V} -> do_side_effecty_thing(V, SomeParameter);
            {error, R} -> report_some_failure(R)
        end
    end,
    [ExecIfFound(Key) || Key <- Keys],
    ok.

and

%%% From labeled_lambda.erl

do_whatever(Keys, SomeParameter) ->
    ExecIfFound =
        fun(Key) ->
            case external_lookup(Key) of
                {ok, Value}     -> do_side_effecty_thing(Value, SomeParameter);
                {error, Reason} -> report_some_failure(Reason)
            end
        end,
    lists:foreach(ExecIfFound, Keys).

and

%%% From isolated_functions.erl

-spec do_whatever(Keys, SomeParameter) -> ok
    when Keys          :: [some_kind_of_key()],
         SomeParameter :: term().

do_whatever(Keys, SomeParameter) ->
    ExecIfFound = fun(Key) -> maybe_do_stuff(Key, SomeParameter) end,
    lists:foreach(ExecIfFound, Keys).

maybe_do_stuff(Key, Param) ->
    case external_lookup(Key) of
        {ok, Value}     -> do_side_effecty_thing(Value, Param);
        {error, Reason} -> report_some_failure(Reason)
    end.

Which versions force your eyes to do less jumping around? How about which version lets you most naturally understand each component of the code independently? Which is more universal? What does code like this translate to after erlc has a go at it?

Are any of these difficult to read? No, of course not. Every version of this is pretty darn basic and common: you need a listy operation but require a closure over some in-scope state to make it work right, so you really do need a lambda instead of being able to look all suave with a fun some_function/1 type thing. So we agree, taken by itself, any version of this is easy to comprehend. But when you are reading through hundreds of these sorts of things at once to understand wtf is going on in a project while also remembering a bunch of other shit code that is lying around and has side effects while trying to recall some detail of a standard while the phone is ringing… things change.
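To make the "closure vs. fun some_function/1" distinction concrete, here is a minimal sketch. The module and function names (closure_sketch, double_all/1, tag_all/2) are made up for illustration and are not from the modules discussed in this post.

```erlang
%%% From closure_sketch.erl (hypothetical example)
-module(closure_sketch).
-export([double_all/1, tag_all/2]).

%% No in-scope state is needed, so a plain function reference works.
double_all(Numbers) ->
    lists:map(fun double/1, Numbers).

double(N) -> N * 2.

%% Tag is local state that each call needs, so a closure has to capture it.
%% The lambda gets a label on its own line before it is used.
tag_all(Tag, Items) ->
    Tagger = fun(Item) -> {Tag, Item} end,
    lists:map(Tagger, Items).
```

The second function is the situation every do_whatever/2 above is in: SomeParameter plays the role of Tag.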

Do I really care which way you do it? In a toy case like this, no. In actual code I have to care about forever and ever — absolutely, yes I do. The fifth version is my definite preference, but the fourth will do just fine also.

(Or even the third, maybe. I tend to disagree with the semantic confusion of using a list comprehension to effect a loop over a list of values only for the side effects without returning a value, partly because this is semantically ambiguous, and also because whenever possible I like every expression of my code to either be an assignment or an assertion (so every line should normally have a = on it). In other words, use lists:foreach/2 in these cases, not a list comp. I especially disagree with using a listcomp when the main utility of using a list comprehension is normally to achieve a closure over local state, but here we are just calling another closure; so semantic fail there, twice.)
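A small sketch of the semantic difference, assuming a throwaway module name (foreach_vs_listcomp) and a trivial side effect. The list comprehension evaluates to a list of whatever the side-effecting call returned, while lists:foreach/2 says "side effects only" and evaluates to ok.

```erlang
%%% From foreach_vs_listcomp.erl (hypothetical example)
-module(foreach_vs_listcomp).
-export([with_listcomp/1, with_foreach/1]).

%% The value of this expression is a list of io:format/2 results,
%% one ok per key, which nobody asked for.
with_listcomp(Keys) ->
    Print = fun(Key) -> io:format("key: ~p~n", [Key]) end,
    [Print(Key) || Key <- Keys].

%% The value of this expression is just ok.
with_foreach(Keys) ->
    Print = fun(Key) -> io:format("key: ~p~n", [Key]) end,
    lists:foreach(Print, Keys).
```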

But what about my lolspeed?!?

I don’t know, but let’s see. I’ve created five modules, based on the above examples:

  1. shitty_inline.erl
  2. shitty_listcomp.erl
  3. less_shitty_listcomp.erl
  4. labeled_lambda.erl
  5. isolated_functions.erl

These all call the same helpers that do basically nothing important other than having actual side effects when called (they call io:format/2). What we are interested in here is the generated assembler. What is the cost of introducing these labels that help the humans out VS leaving things all messy the way we imagine might be faster for the runtime?
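For anyone who wants to try this at home, here is a rough sketch of what one such module could look like with stub helpers. The helper bodies below are my assumptions for illustration (the module name helpers_sketch and the "even keys are found" rule are invented); the only detail carried over from the description above is that the helpers do their side effects through io:format/2.

```erlang
%%% From helpers_sketch.erl (hypothetical stand-in, not the original test code)
-module(helpers_sketch).
-export([do_whatever/2]).

do_whatever(Keys, SomeParameter) ->
    ExecIfFound =
        fun(Key) ->
            case external_lookup(Key) of
                {ok, Value}     -> do_side_effecty_thing(Value, SomeParameter);
                {error, Reason} -> report_some_failure(Reason)
            end
        end,
    lists:foreach(ExecIfFound, Keys).

%% Assumed stub: even integers are "found", anything else is an error.
external_lookup(Key) when is_integer(Key), Key rem 2 =:= 0 ->
    {ok, Key * 10};
external_lookup(Key) ->
    {error, {not_found, Key}}.

%% Both helpers exist only for their side effect and return ok.
do_side_effecty_thing(Value, Param) ->
    io:format("doing ~p with ~p~n", [Value, Param]).

report_some_failure(Reason) ->
    io:format("failed: ~p~n", [Reason]).
```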

It turns out that just like with using assignments to document your code, there is zero cost to label functions. For example, here is the assembler for shitty_inline.erl side-by-side with labeled_lambda.erl:

Oooh, look. The exact same stuff!

(This is a screenshot, a text file with the contents shown is here: label_example_comparison.txt)

See? All that annoying-to-read inline lambdaness buys you absolutely nothing. You’re not helping the compiler, you’re not helping the runtime, and you are hurting your future self and anyone you want to work with on the same code later. (Note: You can generate preprocessed source listings with erlc -P and erlc -E, and assembler output with erlc -S. Here is the manpage. Play around with it a bit, BEAM and EVM are amazing platforms, wide open for exploration!)
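The same listings can also be produced from inside Erlang with compile:file/2, whose 'P', 'E' and 'S' options correspond to the erlc flags. The wrapper below is only a sketch; the module name listing_sketch and the function listings/1 are made up here.

```erlang
%%% From listing_sketch.erl (hypothetical helper)
-module(listing_sketch).
-export([listings/1]).

%% Source is a file name without the .erl extension, e.g. "labeled_lambda".
%% Writes Source.P, Source.E and Source.S into the current directory.
%% Crashes with a badmatch if the source does not compile.
listings(Source) ->
    WriteListing =
        fun(Option) ->
            {ok, _} = compile:file(Source, [Option])
        end,
    lists:foreach(WriteListing, ['P', 'E', 'S']).
```

Running listing_sketch:listings("shitty_inline") and listing_sketch:listings("labeled_lambda") and then diffing the two .S files is one way to repeat the comparison yourself.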

So use labels.

As for execution speed… all of these perform basically the same, except for the last one, isolated_functions.erl. Here is the assembler for that one: isolated_functions.S. This outperforms the others, though to a relatively insignificant degree. Of course, it is only an “insignificant degree” until that part of the program is the most critical part of whatever your program does — then even a 10% difference may be a really huge win for you. In those cases it is worth it to refactor to test the speed of different representations against each version of the runtime you happen to be using — and all thoughts on mere style have to take a backseat. But this is never the case for the vast majority of our code.
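If you want to measure this sort of thing yourself, a crude harness built on timer:tc/1 is enough to get started. This is a sketch under my own assumptions, not the method used for the numbers above: bench_sketch and time_us/2 are invented names, and with helpers that print through io:format/2 the I/O will likely swamp any difference between the variants, so treat the results with suspicion.

```erlang
%%% From bench_sketch.erl (hypothetical timing harness)
-module(bench_sketch).
-export([time_us/2]).

%% Calls Fun N times and returns the mean wall-clock microseconds per call.
time_us(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    Run = fun() -> repeat(Fun, N) end,
    {Micros, ok} = timer:tc(Run),
    Micros / N.

repeat(_Fun, 0) ->
    ok;
repeat(Fun, N) ->
    _ = Fun(),
    repeat(Fun, N - 1).
```

Usage would look something like bench_sketch:time_us(fun() -> isolated_functions:do_whatever(Keys, Param) end, 10000), repeated for each of the five modules on the runtime version you actually deploy.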

(I’ve read reports in the past that indicate 99% of our performance bottlenecks tend to reside in less than 1% of our code by line count — but I can’t recall the names of any just now. If you happen to find a reference, let me know so I can update this little parenthetical blurb with some hard references.)

My point here is that breaking every lambda out into a separate named function isn’t always worth it; sometimes an in-place lambda really is more idiomatic and easier to understand simply because you can see everything right there in the same function body. What you don’t want to see is multi-line lambdas squashed into argument lists that make things hard to read and give you the exact same result once compiled as labeling that lambda with a meaningful variable name on another line in the code and then referring to it where it is invoked later.