Concurrent Programming in Erlang - Scaling Up (week three)

The Concurrent Programming in Erlang course ended this week, and it's been informative. A ton of resources were shared (both in the course itself and in the comments), but wow, I spent much more than the 5 hours a week they said to expect. That's a good thing though - lots of stuff learned and lots more to read up on. Check out my notes from week 1 and week 2.

Scaling Apps

A major topic this week was the ability to quickly scale apps. That could mean adding more CPUs, memory and other resources into the mix for Erlang to take advantage of. Or it might just mean breaking up work into more processes, which could take better advantage of the resources already available to the Erlang VM.
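As a rough way to see what the VM has to work with, the sketch below asks the runtime how many schedulers are online (by default this usually matches the number of logical cores). The module name cores and the function report/0 are made up for illustration.

```erlang
-module(cores).
-export([report/0]).

%% Hypothetical helper: prints and returns the number of schedulers
%% the VM can currently run Erlang processes on.
report() ->
    Online = erlang:system_info(schedulers_online),
    io:format("Schedulers online: ~p~n", [Online]),
    Online.
```

If this reports more than one scheduler, processes that are runnable at the same time can actually execute in parallel.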

Here's a quick app I wrote that accepts a list of "messages", which it then echoes back out. It spawns one process for each batch of up to 3 messages, so every 3 additional messages means one more process doing the work. These processes could end up running on separate cores in a multi-core CPU, which would make the app run faster.

-module(scaling).

-export([start_router/0, init_router/0, process_message/1]).

-define(MESSAGE_LIMIT, 3).

start_router() ->
  register(scaling, spawn(scaling, init_router, [])).

init_router() ->
  process_flag(trap_exit, true),
  loop().

loop() ->
  receive
    {messages, Messages} ->
      %% Ceiling division: one process per batch of up to ?MESSAGE_LIMIT messages.
      io:format("Generating ~p processes to handle ~p incoming messages.~n",
                [(length(Messages) + ?MESSAGE_LIMIT - 1) div ?MESSAGE_LIMIT, length(Messages)]),
      spawn_message_handlers(Messages),
      loop();
    {quit, Reason} ->
      io:format("The server and linked processes will now quit. Reason: ~p~n", [Reason]);
    {'EXIT', Pid, Reason} ->
      io:format("Received EXIT signal from ~p: ~p~n", [Pid, Reason]),
      loop()
  end.

spawn_message_handlers([]) ->
  ok;
spawn_message_handlers(AllMessages) when length(AllMessages) =< ?MESSAGE_LIMIT ->
  spawn_link(scaling, process_message, [AllMessages]);
spawn_message_handlers(AllMessages) ->
  {Messages, RestOfMessages} = lists:split(?MESSAGE_LIMIT, AllMessages),
  spawn_link(scaling, process_message, [Messages]),
  spawn_message_handlers(RestOfMessages).

process_message(Messages) ->
  lists:foreach(fun(Message) ->
                  io:format("Process ~p handling message: ~p~n", [self(), Message])
                end, Messages).

You can see in the output below how 3 messages will print from the same process (PID), but then a 4th will come from a new PID. Likewise, 8 messages will be split up among 3 processes.

1> c(scaling).
{ok,scaling}

2> scaling:start_router().
true

3> scaling ! {messages, []}.
Generating 0 processes to handle 0 incoming messages.
{messages,[]}

4> scaling ! {messages, ["incoming", "transmission"]}.
Generating 1 processes to handle 2 incoming messages.
{messages,["incoming","transmission"]}
Process <0.45.0> handling message: "incoming"
Process <0.45.0> handling message: "transmission"
Received EXIT signal from <0.45.0>: normal

6> scaling ! {messages, ["one", "two", "red", "blue"]}.
Generating 2 processes to handle 4 incoming messages.
{messages,["one","two","red","blue"]}
Process <0.47.0> handling message: "one"
Process <0.48.0> handling message: "blue"
Process <0.47.0> handling message: "two"
Received EXIT signal from <0.48.0>: normal
Process <0.47.0> handling message: "red"
Received EXIT signal from <0.47.0>: normal

7> scaling ! {messages, [1,2,3,4,5,6,7,8]}.
Generating 3 processes to handle 8 incoming messages.
{messages,[1,2,3,4,5,6,7,8]}
Process <0.50.0> handling message: 1
Process <0.51.0> handling message: 4
Process <0.52.0> handling message: 7
Process <0.50.0> handling message: 2
Process <0.51.0> handling message: 5
Process <0.52.0> handling message: 8
Process <0.50.0> handling message: 3
Process <0.51.0> handling message: 6
Received EXIT signal from <0.52.0>: normal
Received EXIT signal from <0.50.0>: normal
Received EXIT signal from <0.51.0>: normal

8> scaling ! {quit, "All done!"}.
The server and linked processes will now quit. Reason: "All done!"
{quit,"All done!"}
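The batching that spawn_message_handlers/1 performs can be looked at in isolation. Below is a minimal sketch, assuming a made-up module chunks with a hypothetical chunk/2 that returns the batches as a list instead of spawning a process for each one.

```erlang
-module(chunks).
-export([chunk/2]).

%% Split List into sublists of at most N elements, preserving order.
%% Mirrors the three clauses of spawn_message_handlers/1.
chunk([], _N) ->
    [];
chunk(List, N) when length(List) =< N ->
    [List];
chunk(List, N) ->
    {Head, Tail} = lists:split(N, List),
    [Head | chunk(Tail, N)].
```

Calling chunks:chunk([1,2,3,4,5,6,7,8], 3) returns [[1,2,3],[4,5,6],[7,8]], which lines up with the three PIDs in the last run above.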

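I didn't try it, but it's also possible to spawn processes on separate nodes (everything in the above program runs in a single node). Start a second Erlang shell with a name using erl -sname some_name, then call spawn/4, which takes a node name as its first argument: spawn(Node, Module, Function, Args). The shell you call it from needs a name too, both nodes need to share the same cookie, and the module has to be loadable on the remote node. From there, processes on the two nodes should be able to message each other just like local ones.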
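Here's an untested-across-nodes sketch of what that spawn/4 call might look like. The module remote_demo and its functions run_on/2 and handle/1 are hypothetical names, not something from the course.

```erlang
-module(remote_demo).
-export([run_on/2, handle/1]).

%% Hypothetical helper: spawn handle/1 on Node and return the new pid.
%% Node is an atom such as 'some_name@myhost' (a made-up name). This
%% module must be loadable on that node, and both nodes must share
%% the same cookie.
run_on(Node, Messages) ->
    spawn(Node, ?MODULE, handle, [Messages]).

%% Print each message along with the pid and node doing the work.
handle(Messages) ->
    lists:foreach(fun(Message) ->
                    io:format("~p on ~p handling: ~p~n",
                              [self(), node(), Message])
                  end, Messages).
```

Passing node() as the first argument runs everything locally, which is an easy way to check the code before involving a second shell. With a real remote node, the printed node name should be the remote one, even though the output should still show up in the calling shell because the spawned process inherits the caller's group leader.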

OTP (i.e. not reinventing the wheel)

We also learned about OTP and the gen_server behavior. I was losing steam by this point so no code to show for this, but basically everything we've been doing up to this point with messaging and concurrency is built into OTP to save us from having to reinvent the wheel.

The pieces provided by OTP, such as the gen_server behavior, are sort of like contracts or agreements. You create a module and declare in your code that it implements a behavior, which commits you to implementing that behavior's callback functions (the compiler warns about any required ones you leave out). Later on, some other piece of code can use your module and, since you said you implement the behavior, it knows those functions must be available.
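To make the contract idea concrete, here's a minimal sketch of a gen_server that echoes messages back and counts them. This isn't code from the course; the module name echo_server and its API functions are invented for illustration, and only the callbacks this example needs are implemented (older OTP releases may warn about the ones left out).

```erlang
-module(echo_server).
-behaviour(gen_server).

%% Client API (hypothetical names)
-export([start_link/0, echo/1, count/0]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2]).

start_link() ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [], []).

%% Synchronous request: the caller blocks until handle_call/3 replies.
echo(Message) ->
    gen_server:call(?MODULE, {echo, Message}).

count() ->
    gen_server:call(?MODULE, count).

%% The state is just the number of messages echoed so far.
init([]) ->
    {ok, 0}.

handle_call({echo, Message}, _From, Count) ->
    {reply, Message, Count + 1};
handle_call(count, _From, Count) ->
    {reply, Count, Count}.

%% No asynchronous messages are used in this sketch.
handle_cast(_Msg, Count) ->
    {noreply, Count}.
```

The receive loop, the registration and the reply plumbing that the scaling module did by hand are all handled by gen_server here; the module only supplies the callbacks.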

Macros and more

Simon covered a ton of other topics too. An interesting one is macros. Here's something neat you can do with them - define a macro that accepts a function name and runs that function twice on the given input. I can imagine a few ways to abuse this though, and I wouldn't be happy working in a module that had too many of these.

-module(macros).
-export([sample/1]).

-define(TWICE(F), (fun(P) -> F(F(P)) end)).

sample(InitValue) ->
    (?TWICE(increment))(InitValue).

increment(X) ->
    X + 1.

1> c(macros).
{ok,macros}
2> macros:sample(2).
4
3> macros:sample(4).
6
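For comparison, the same thing can be done without a macro by passing a fun to an ordinary function. This is just a sketch, with twice as a made-up module name. The macro version works by textual substitution (?TWICE(increment) expands to a fun that calls increment(increment(P))), whereas here the function is passed around as a value.

```erlang
-module(twice).
-export([twice/1, sample/1]).

%% Returns a fun that applies F two times to its argument.
twice(F) ->
    fun(P) -> F(F(P)) end.

sample(InitValue) ->
    (twice(fun increment/1))(InitValue).

increment(X) ->
    X + 1.
```

twice:sample(2) returns 4, same as the macro version.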

There were some videos from Francesco Cesarini and Joe Armstrong too. This one in particular from Joe is pretty eye-opening - if it doesn't prove Erlang does what it does well, then I don't know what does.

"We're controlling a large proportion of the world's telecoms networks using this stuff, so it has to be right. You know, it has to run and reboot itself, and it has to correct errors automatically, with nobody there. Otherwise, half the world's telecoms networks would fail."

Resources

As usual, lots of good resources this week. Simon shared links to some books and tools, and other students posted links in the comments, so plenty to check out.

Tools

  • erlyberly - debugger for Erlang, Elixir and LFE using Erlang tracing
  • CutEr - a concolic unit testing tool for Erlang
  • McErlang - a model checker for programs written in Erlang
  • Dialyzer - a static analyzer for type errors
  • IntelliJ IDEA Plugin - turns IntelliJ IDEA (and RubyMine, PyCharm, WebStorm, etc.) into an Erlang IDE.
  • erlide - an IDE for Erlang, powered by Eclipse
  • Wrangler - an interactive refactoring tool, integrated into both emacs and Eclipse
  • Rebar3 - an Erlang tool that makes it easy to create, develop, and release Erlang libraries, applications, and systems in a repeatable manner
  • dbg - text-based trace facility built into Erlang
  • Erlang.mk - build tool using Makefiles
  • BEAM Toolbox - a list of tools and libraries that are useful for projects in BEAM languages like Efene, Erlang, LFE and Elixir


Grant Winney
