Sunday, 15 November 2009

100,000 tasklets: Erlang and Go

Purely out of interest, and to see how Erlang and Google's Go compare on my puny laptop (a Dell Mini 10v), I wrote an Erlang version of the example application from the post "100,000 tasklets: Stackless and Go".


Basically, it creates a chain of 100,000 microthreads (tasklet, goroutine, process ... take your pick), sends a value in one end and waits for the result at the other end. The number is incremented by each microthread it passes through.


The code comes in two parts. Firstly, a chain.erl module:


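-module(chain).
-export([run/1]).

%% Build the chain, send 0 in at one end and wait for the result.
run(Num) ->
    Tail = chain(Num, self()),
    Tail ! 0,
    receive Result -> Result end.

%% No more processes to spawn: return the pid at the front of the chain.
chain(0, Tail) ->
    Tail;

%% Spawn a process that forwards to Tail, then put it in front of the chain.
chain(Num, Tail) ->
    chain(Num-1, spawn(fun() -> f(Tail) end)).

%% Each process waits for one number, adds 1 and passes it on.
f(Tail) ->
    receive
        Num -> Tail ! Num+1
    end.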

And secondly a simple escript to start it off (mostly to make it easy to run under time):


#!/usr/bin/env escript
%%! +P 1000000 -smp disable
-export([main/1]).

main([]) ->
    Result = chain:run(100000),
    io:format("~p~n", [Result]).

And the run times:


$ time ./chain 
100000

real 0m1.520s
user 0m1.012s
sys 0m0.468s
$ time ./go-chain 
100000

real 0m3.371s
user 0m1.672s
sys 0m1.000s

A couple of things to point out/mention:


  • The Erlang code is just beautiful (well, maybe not the escript so much ;-)). To me, it's more readable than either the Python or Go versions.
  • I turned SMP off. Yes, it's an optimisation, but the tasks run in series so extra schedulers are never going to help.
  • The Go version was compiled and linked using 8g and 8l.
  • The Go version didn't always complete and sometimes took a *very* long time ... just not when running under time for some reason.
  • Go seemed to use about 3x as much memory.

What does this prove? Absolutely nothing! Firstly, it's an unrealistic application. Also, Go is really quite new and I'm sure performance and memory use will improve in the coming months.


So why did I do this? Simply, because I really enjoy playing with Erlang and Go is interesting and a hot topic. (In my opinion anything with concurrency built into the runtime is onto a good thing ... I sure wish we didn't have to resort to Twisted, Stackless, greenlets or generator hacks in Python.)

4 comments:

EY said...

How does it scale if you bump it up to 1,000,000 tasklets? Does it increase by the same constant factor for all three languages?

Matt Goodall said...

I managed to push the Erlang version to 1300000 by shutting down everything and running from a terminal. Anything more than that and the Linux OOM killer kicked in and did its job ;-).

Sadly, I could only get the Go version up to about 260000 before memory was exhausted.

At 250000, Erlang took 0m2.401s and Go took 4m31.750s although the Go times were quite variable.

For reference, Erlang with 1000000 processes took 0m31.461s.

I don't have Stackless installed so can't provide timings for that, sorry.

ThomasH said...

Just for your entertainment: This all goes back (I think) to a very entertaining talk Joe Armstrong gave a couple of years ago at a small conference (LL2 at MIT), entitled "Concurrency oriented programming in Erlang", and I was really amused to see it re-surface in the Go presentation. It's very entertaining and still instructive, so if you have an hour and want something to smile about, give it a go. It's the first talk of the morning session, so you don't have to wait through the video. But have his slides alongside :).

Matt Goodall said...

@ThomasH. I've seen the ring of processes a few times now as well as various Joe Armstrong talks but I don't think I've seen that one before. Thanks!