14 Languages - Erlang

2015-08-30

The fifth language - Erlang

Introduction

Erlang is a functional language built to handle concurrency very well. It is not a pure functional language, however. Some non-functional things are possible to do.

Erlang is dynamically typed. It tries to infer the types of variables from the context of the variable. It is also strongly typed: a mismatch in types will generate an error at runtime.

The language is heavily influenced by Prolog. This will be rather apparent when we go through the its syntax.

Executing Erlang programs

Erlang programs can be run in the interpreter provided by the development kit. To do so, just invoke the erl executable. A prompt like this will appear:

Eshell V6.2  (abort with ^G)
1>

Listing 1: The prompt of the Erlang shell.

At the prompt, Erlang programs can be entered. E.g.

1> 1 + 2.
3
2>

Listing 2: Entering a program at the prompt of the Erlang shell.

If you don't want to use the shell, Erlang programs can be compiled by supplying -compiled command line arg to erl. Compiling a file containing an Erlang program (which should have an .erl extension by convention) will produce a .bean file. The Erlang program can then be executed using the -noshell argument to erl. For example, if you have a program in the file first_program.erl that looks like this

-module(first_program).
-export([first/0]).

first() ->
 io:fwrite("Then first Erlang program!").

Listing 3: Program to exemplify compiling. Then it can be compiled by issuing the following command: erl -compile first_program.erl. The program can then be executed using this command line:

erl -noshell -s first_program first -s init stop
...
Then first Erlang program!

Listing 4: Running a compiled Erlang program.

Note that the name of the module defined in the file must be the same as the name of the file it is defined in.

A program can also be compiled from inside the Erlang interpreter. To do this the c() function is used from within the shell. The previous program could for example be compiled like this:

Eshell V6.2  (abort with ^G)
1> c(first_program).
{ok,first_program}

Listing 5: Compiling a program from within the shell.

To invoke a compiled function just call it supplying both the module name and function name separated by a colon.

2> first_program:first().
Then first Erlang program!
ok

Listing 6: Invoking the compiling a program from within the shell.

Basic Syntax

Just as in Prolog, each statement and expression must be terminated by a single dot.

Comments

Comments in Erlang are created using %. % This is a comment.

Hello World!

Atoms

Just like in Prolog, an atom is a constant where the value of the constant is equal to its name. Atoms always start with a lower case letter.

Variables

Variables in Erlang always have names starting with an uppercase letter.

6> AVar = 42.
...
42

Listing 7: Defining a variable

The variables in Erlang are single assignment variables. What does that mean? It means that once you have assigned a value to a variable, you cannot change its value. If you try to assign another value to the variable, Erlang will refuse to do so and print an error message. Variables for which no assignment have been made are called unbound variables while variables having values assigned to them are called bound variables.

Note that you can assign the same value to a variable over and over again. You just can't assign a different value to the variable after its original initialization.

Lists

Lists are defined using by enclosing a number of elements in brackets. The elements should, as usual, be separated by commas. E.g. [1,2,3].

An interesting fact is that a string is in fact a list:

2> [76, 65, 82, 83].
"LARS"

Listing 8: A string is in reality a list.

The elements of a list can be of different types and can be as many as will fit into the memory of the computer.

A list can be split into a head an a tail just like in Prolog.


1> AList = [1, 2, 3].
[1,2,3]
2> [Head|Tail] = AList.
[1,2,3]
3> Head.
1
4> Tail.
[2,3]

Listing 9: Splitting a list into head and tail.

The [Head|Tail] construct can also be used as the right hand side of a statement. If used so it constructs a new list having Head as its first element and the elements of the list Tail as, well, the tail.

3> L = [1 | [2,3]].
[1,2,3]
4> L.
[1,2,3]

Listing 10: List construction using the [Head|Tail] construct.

Tuples

Tuples are fixed size lists, as in many other program languages. They are defining by enclosing the elements in curly brackets:

4> {1,2,3,4}.       
{1,2,3,4}

Listing 11: A tuple.

Tuples can be used to implement functionality that is provided by hash tables in other languages. We could for example create a data structure representing a person like this:

{person, {name, "Lars"}, {age, 42}}.

Listing 12: Tuples used to implement hash table like data structure.

In the example above I've used the atoms person, name and age to as the keys in the "hash map". The person atom seems redundant, but is useful if the program contains more than one hash table like this. Why is that? It has to do with the way data is extracted from a tuple, which is what we will discuss next.

Note that we can mix tuples and lists. We could for example let the tuple above contain a list of persons (then it probably would be a good idea to call the tuple persons).

Pattern matching

Pattern matching works similar in Erlang to how it works in Prolog. An atom will match itself. If any variables is present in a match expression, Erlang will examine the data structure matched against and try to assign values to the variables. Note that only a single solution, if any, is returned. Not all possible solutions as in Prolog. Let's look at an example.

We can perform pattern matching against the tuple in the previous section by assigning it to a variable and then using that in a match expression.

Eshell V6.2  (abort with ^G)
1> Lars = {person, {name, "Lars"}, {age, 42}}.
{person,{name,"Lars"},{age,42}}
2> {person, {name, LarsName}, {age, LarsAge}} = Lars.
{person,{name,"Lars"},{age,42}}
3> LarsName.
"Lars"
4> LarsAge.
42

Listing 13: Using pattern matching to retrieve values from a tuple.

In the example above, we see how the Erlang interpreter checks if both sides of the equals sign can be made equal, which they can, and by that process assigns values to any variables in the match expression.

If the expression did not match the data structure, an error message will be printed and no variable assignment would be made:

5> {person, {name, KalleName}, {shoeSize, KalleShoeSize}} = Lars.   
** exception error: no match of right hand side value 
                    {person,{name,"Lars"},{age,42}}
6> KalleName.
* 1: variable 'KalleName' is unbound

Listing 14: Variables not assigned when a matching fails.

Functions

A function belong to a module. A module is defined in a file with the extension .erl. In the beginning of the file, the name of the module and the functions the module exposes are defined. Let's look again at the example we saw a few sections ago

-module(first_program).
-export([first/0]).

first() ->
 io:fwrite("Then first Erlang program!")

Listing 15: Example of a module containing one function.

Here we define a module called first_program. It exports one function, called first. The /0 after the function name indicates that the function does not take any parameters. Had it takes one param, the export statement would have been first/1 instead.

If a module exports more than one function, several export statements are used.

The body of the function comes after the name of the function and the mandatory -> in the function definition. Several statements can be included in the body by separating them with commas. The value of the last expression in the body is returned as the value of the function call.

Several variants of a function can be defined, each matching a specific input. The different variant is separated by a semicolon. The last variant should be ended by a dot. To exemplify this we write a variant of the factorial program where we in addition to calculating the value we also print the number we which currently calculating the factorial for.

-module(factorial).
-export([fact/1]).
-export([print_number/1]).

fact(0) -> 1;
fact(N) ->
   print_number(N), N * fact(N - 1).

print_number(N) ->
      io:fwrite(integer_to_list(N)), io:fwrite("
").

Listing 16: Factorial program that print the current number.

In the program above, we define two functions: print_number that just prints the number it receives as input on a separate line on std out and fact that invokes print_number in addition to calculating the factorial. Note that we separate the invocation of the print_number and recursive call to fact by a comma. This is how several statements are executed in sequence in a function: just separate them with commas.

Running the program to calculate 5! gives the following output:

3> factorial:fact(5). 
5
4
3
2
1
120

Listing 17: Using the factorial program to calculate 5!.

It can be noted that Erlang executes recursive programs insanely quickly. Using the factorial program to calculate 1000! takes a tenth of a second or so (I haven't timed it, but it seems almost instantaneous).

Anonymous functions

An anonymous, no-name, function can be created using the fun keyword. Such functions can be assigned to variables, passed to other functions and returned from other functions. Let's create such a function, assign it to a variable and using that variable to executing the function:

1> TimesTwo = fun(N) -> N * 2 end.
#Fun
2> TimesTwo(7).
14

Listing 18: Creating and invoking an anonymous function.

Case

Case clauses use pattern matching to determine which of several paths to take. A case statement is started with case Variable of. Then comes a list of cases where the matching value is separated from the resulting value by an arrow ->. The different cases are separated from each other by a semicolon. Note that the last matching case should not have a trailing semicolon. The whole statement is then ended by end.. You can add an underscore as a wildcard to catch all cases where no explicit match could be made.

An example:

1> SuiteAsString = "clubs".
"clubs"
2> case SuiteAsString of    
2>   "spades" -> spades;    
2>   "hearts" -> hearts;    
2>   "diamonds" -> diamonds;
2>   "clubs" -> clubs;
2>   "_" -> spades
2> end.                     
clubs

If-clauses

An if clause uses one or more guards to determine which path to select. A guard is just a boolean expression. Let's code up an example:

1> X = 42.
42
2> if
2> X < 40 ->
2> io:fwrite("Less than 40
");              
2> X > 40 ->
2> io:fwrite("Greater than 40
") 
2> end.
Greater than 40
ok

Listing 19: Basic if clause.

Note that every "part" of the if clause but the last is terminated by a semicolon.

If more than one statement should be executed in a branch, the statements should be separated by commas. Here's a small function that just demonstrates how to have several statements in each branch.

-module(if_test).
-export([iftest/1]).

iftest(Y) ->
  X = Y + 2,
  if
    X < 40 ->
      io:fwrite("Less than 40
"),
      io:fwrite("Will terminate
");
    X > 40 ->
        io:fwrite("Greater than 40
"),
        io:fwrite("Will not terminate
")
      end.
...
18> c(if_test).
{ok,if_test}
19> if_test:iftest(37).
Less than 40
Will terminate
ok
20> if_test:iftest(39).
Greater than 40
Will not terminate
ok

Listing 20: If clause having several statements in each branch.

If an if clause does not cover all possible values, Erlang throws an exception:

21> if_test:iftest(38).
** exception error: no true branch found when evaluating an if expression
     in function  if_test:iftest/1 (if_test.erl, line 6)

Listing 21: An error occurs if executing an if clause with a non-covered value.

To provide a default, catch-all case, include a true part in the if clause:

-module(if_test).
-export([iftest/1]).

iftest(Y) ->
  X = Y + 2,
  if
    X < 40 ->
      io:fwrite("Less than 40
"),
      io:fwrite("Will terminate
");
    X > 40 ->
        io:fwrite("Greater than 40
"),
        io:fwrite("Will not terminate
");
    true ->
        io:fwrite("Executing the default branch
")
      end.
...
22> c(if_test).
{ok,if_test}
23> if_test:iftest(38).
Executing the default branch
ok

Listing 22: if clause with a true branch.

foreach

The foreach function defined in the lists module can be used to iterate over the elements of a list. An example of when this could be useful if you want to print the elements of the list.

1> Numbers = [1,2,3,4,5].
2> lists:foreach(fun(X) -> io:format("Element is: "), io:format("~p", [X]), io:format("
") end, Numbers).
Element is: 1
Element is: 2
Element is: 3
Element is: 4
Element is: 5
ok

Note that the elements of the list are not modified by foreach.

map

map is a function defined on lists. It can be used to apply a function to all members of the list. E.g. to apply the function that multiplies an input value by two that we defined in the section about anonymous functions above to a list of integers, the following code can be used:

3> Numbers = [1,2,3,4,5].
7> lists:map(TimesTwo, Numbers).                    
[2,4,6,8,10]

Listing 23: Using map to applying a function to all elements of a list.

Of course, we can define and use an anonymous function directly in the call to map :

11> lists:map(fun(X) -> X * 3 end, Numbers).  
[3,6,9,12,15]

Listing 24: Defining an anonymous function directly in the call to map.

filter

The function lists:filter is used for forming a new list from an old one, where the new list only contains a subset of the elements of the old list. The selection is made based the evaluation of a predicate: if the predicate evaluates to true for an element, the element will be included in the new list, if it evaluates to false, it will not.

As an example, we construct a small program that selects all elements greater than two from a list of integers:

1> L = [1,2,3,4,5].
[1,2,3,4,5]
2> GreaterThanTwo = fun(X) -> X > 2 end.
#Fun
3> lists:filter(GreaterThanTwo, L).
[3,4,5]

Listing 25: Using filter to selects elements greater than two.

any, all

lists:any and lists:all can be used to determine if any respectively all of the elements in a list fulfills a predicate.

We can use the predicate function and the list from the previous section to test the lists:any and lists:all functions:

5> lists:any(GreaterThanTwo, L).
true
6> lists:all(GreaterThanTwo, L).
false

Listing 26: Example of any() and all().

takewhile, dropwhile

The functions lists:takewhile and lists:dropwhile can be used either extract (takewhile) or discard (dropwhile) elements from the beginning of a list.

9> LessThanThree = fun(X) -> X < 3 end.
#Fun
10> lists:takewhile(LessThanThree, L).  
[1,2]
11> lists:dropwhile(LessThanThree, L).
[3,4,5]

Listing 27: Example of takewhile() and dropwhile().

foldl

foldl can be used to accumulate some kind of result from a list. It can be a sum, a product or something else. To do this foldl uses an accumulator function that is applied to each element in the list in turn. The accumulator function receives the currently accumulated value as inparam and uses that value and the value of the element to determine the new accumulated value. As no accumulated value exists from the beginning, foldl takes a separate argument, which represents the value to start accumulating from. This would typically be 0 for a summing operation and 1 for a multiplying one.

The most basic example of foldl is determining the sum of a list of integers. We use the same list as in the previous sections.

14> L.
[1,2,3,4,5]
15> lists:foldl(SumFunction, 0, L).
15

Listing 28: Using foldl() to sum the values in a list.

Features

Bit matching

You can pack the values of a number of variables into a bitmap using the <<VariableName1:numberOfBitsForVariable1, VariableName2:numberOfBitsForVariable2,...>> syntax. I.e. to pack three variables into a bit pattern of 12 bits the following code can be used:

1> A = 7.
7
2> B = 14.
14
3> C = 29.
29
4> BitMap = <<A:3, B:4, C:5>>.
<<253,13:4>>

Listing 29: Packing data into a bitmap.

To unpack data the reverse syntax is used:

5> <<APrim:3, BPrim:4, CPrim:5>> = BitMap.
<<253,13:4>>
6> APrim.
7
7> BPrim.
14
8> CPrim.
29

Listing 30: Unpacking data from a bitmap.

List comprehension

Basically a list comprehension creates a new list from one or more source lists. What's special with list comprehension is that using a compact syntax, the elements used from the source lists can be filtered and/or transformed before being used to form the new list.

As a simple first example, lets use a list comprehension to create a new list from a source list where one is added to each element in the source list. Here no filtering of the source list is performed and only one simple transformation (the addition of 1 to the source elements).

1> [X + 1 || X <- [1,2,3,4]].
[2,3,4,5]

Listing 31: Simple example of a list comprehension.

The stuff on the right hand side of || decides what source elements to include. The stuff on the left of || determines what transformations to apply to those elements before including them in the resulting list.

Another way to explain it is that the right hand side sets up variables (in this example only on variable X is set up) that can be used in the left hand side. The variables are assigned a number of values (here 1,2,3 and 4). The left hand side performs transformations the values specified by the variables and the results of those transformations become the elements of the new list.

Several variables can be set up on the left side. If that is done, all permutations of the values given to the variables will be processed by the list comprehension. For example to create a list of 2-tuples where the first element of the tuples is 1 or 2, and the second element is 100 or 200, the following code can be used:

2> [{X, Y} || X <- [1,2], Y <- [100, 200]].
[{1,100},{1,200},{2,100},{2,200}]

Listing 32: List comprehension using two generators.

Each such expression Variable <- SomeList is called a generator.

Each generator can have one or more filter that filters what elements from the source list that will be passed to the left hand side. We can rewrite our first example of a list comprehension so that only elements greater than one will be passed to the left hand side (which will add 1 to the values it receives):

3> [X + 1 || X <- [1,2,3,4], X > 1].       
[3,4,5]

Listing 33: List comprehension with a filter included.

Each generator can have several filters. To exemplify this, we add a second filter to our example so that integers in the [2,3] interval is used a source values:

7> [X + 1 || X <- [1,2,3,4], X > 1, X < 4].
[3,4]

Listing 34: List comprehension with one generator having two filters.

Concurrency

The main strength of Erlang lies in its concurrency support. It is possible to write really, really robust concurrent programs in it. The basic idea is that a program starts one or more processes which encapsulates the things that should be processed concurrently, and one (or more) monitor process(es) that monitors the worker processes. If one of the worker processes crashes, the monitor process will be informed and can then start a new worker process to fill in for the one that crashed. The monitor processses can in turn be monitored by a set of "meta-monitor processes" and so on. This way of writing concurrent software, it done correctly, will lead to really robust programs as we don't have to anticipate the different kinds of errors that might occur. We just let the worker process crash and start a new one when this happens. The Erlang "slogan" of this was of writing programs is Let it crash!.

A note about terminology: in Erlang a process is a small self-contained virtual machine that can execute Erlang functions.

So how do you write such a program in Erlang? It turns out that you only need to know a few additional keywords:

receive : Receive a message from another (or its own) process.
! : Send a message to another (or its own) process.
spawn : Create a new process.
link : Link the current process with another process in order to monitor that process.
spawn_link : Spawn a new process and link this process to that process in order to monitor it.
register : Associate a specific process with a symbol. The symbol can then be used to send messages to the process.

For correctness sake, it should be said that ! and receive exchange messages via a mailbox. I.e. ! sends a message to a mailbox, from which receives then fetches it.

As an example, let's write a small program that can be started to execute in a separate process and that listens to messages from other processes:

-module(worker_service).
-export([start/0, rpc/1]).

start() ->
  register(workerService, spawn(fun loop/0)).

rpc(RequestParameter) ->
  workerService ! {self(), RequestParameter},
  receive
    {workerService, Response} ->
      Response
  end.

loop() ->
  receive
    {From, Request} ->
      io:format("Received request: ~p~n", [Request]),
      From ! {workerService, "Thank you for your message!"},
      loop()
  end.

Listing 35: Message listener program.

We have included a function start() that spawns a process and runs the loop() function in it. It also registers the name 'workerService' with the spawned process, Another function that we've included is rpc() which sends a message to the process spawned by start().

The main function of the module is loop(). It is the method that receives and responds to messages. Our implementation of the message listener is extremely simple: we just log that we've received a message, send a return message containing a nice greeting back to the sender and go back to listening for new messages.

Now let's write function that supervises the workerService process and restarts it if it should crash:

-module(worker_supervisor).
-export([start/0]).

start() ->
  register(workerSupervisor, spawn(fun loop/0)).

loop() ->
  process_flag(trap_exit, true),
  link(whereis(workerService)),
  receive
      {'EXIT', From, Reason} ->
      io:format("Worker service died with reason ~p~n", [Reason]),
      worker_service:start(),
      loop()
  end.

Listing 36: Program that supervises the worker service.

In this program we've included a function start(), that analogously to the start() function in the worker_service progra, spawns a new process in which it listens to incoming messages. It also registers a name for the newly spawned process so that we can easily access it should we want.

The loop() function is where the interesting stuff happens. First we tell the Erlang runtime system that we want to receive an exit signal when one of the processes we are linked to exists. Then we set up a link to the worker_service process that the previous program spawns off. After that we starts to listen to messages. More specifically we listens to EXIT messages. Such a message will be sent by the runtime system when the when the workerServie process exists. When having received such a message, the listener will print a message, start up a new worker process and rerun the loop() method again. Pretty nifty eh?

Example output:

1> c(worker_supervisor).
worker_supervisor.erl:11: Warning: variable 'From' is unused
{ok,worker_supervisor}
2> worker_service:start().
true
3> worker_supervisor:start().
true
4> worker_service:rpc("Hello!").
Received request: "Hello!"
"Thank you for your message!"
5> exit(whereis(workerService), kill).
Worker service died with reason killed
true
6> worker_service:rpc("Are you alive?").
Received request: "Are you alive?"
"Thank you for your message!"

Listing 37: Sample output from running the worker and supervisor programs.

We've used the built in exit() and whereis() functions to find and kill the worker process, and we can see that when we did just that, the supervisor received the exit signal and took apropriate action. I.e. started a new worker process.

Gotchas

Forgetting that the last part of a case statement or an if statement should not end with a semicolon.

FizzBuzz in Erlang

-module(fizzBuzz).
-export([start/0]).

start() -> f(1).

f(101) -> void;
f(N) ->
  if
    N rem 3 =:= 0 andalso N rem 5 =:= 0 ->
      io:format("FizzBuzz");
    N rem 3 =:= 0 ->
      io:format("Fizz");
    N rem 5 =:= 0 ->
      io:format("Buzz");
    true -> io:format("~B", [N])
  end,
  io:format("~n"),
  f(N +1).

What I like about Erlang

It executes (tail) recursive programs extremely quickly.
Its concurrency support. It is quite easy to set up a process that monitors another process and restarts it if it crashes.
List comprehensions. Almost like in Python ;)

What I don't like about Erlang

It's hard (at least for a beginner) to remember what terminating character, if any, to end each line with.

Onwards, upwards

I must say that I like Erlang. I'm not sure why. It sometimes got a pretty strange syntax, probably because its Prolog inheritance, but on the whole, it is a really nice programming language. And its concurrency support is...marvelous :)

Now its time for another exciting language: Clojure!

Previous parts in the series

1. Ruby
2. Io
3. Prolog
4. Scala

Links

[1] Erlang homepage