
Wednesday, November 25, 2009

Monads explained to imperative programmers

(See also: answers to this slashdot comment)

I come from an imperative background myself. Let's see if my explanation of I/O and state handling in pure functional languages, based on imperative metaphors, makes more sense to you. Please let me know if you find the following useful; I might expand it into my own personal "monad tutorial" (writing one seems to be a tradition when learning Haskell).

- Monads are program structures that represent processes (one particular kind of process for each kind of monad), but in a different way than functions represent processes. I actually think of monads as little virtual machines embedded in the functional language, encapsulating all the boring bookkeeping and boilerplate that must be done in any program alongside the interesting business logic.

- A monad is used as a framework: you program a sequence (or 'pipeline') of instructions in an imperative style (the do-block), using some vocabulary defined by the monad (the monadic type constructor, and the return and bind operators). These blocks can then be instantiated and passed to the monad, to be executed according to the control logic programmed in the monad definition. Side effects during the sequence of actions in the do-block are instructions evaluated outside the do-block, but inside the (purely functional) monad code.
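As a sketch of how the do-block sugar works (the names `addInputs` and `addInputs'` are mine, purely for illustration), here is a do-block in the Maybe monad, a computation that can fail, next to its desugared form as a chain of binds:

```haskell
import Text.Read (readMaybe)

-- A do-block in the Maybe monad: each step may fail (Nothing), and
-- the monad's bind operator (>>=) does the "abort on failure"
-- bookkeeping behind the scenes.
addInputs :: String -> String -> Maybe Int
addInputs a b = do
  x <- readMaybe a   -- the whole pipeline yields Nothing if this fails
  y <- readMaybe b
  return (x + y)

-- The same block with the syntactic sugar expanded: a chain of binds.
addInputs' :: String -> String -> Maybe Int
addInputs' a b =
  readMaybe a >>= \x ->
  readMaybe b >>= \y ->
  return (x + y)
```

Here `addInputs "2" "3"` yields `Just 5`, while `addInputs "2" "oops"` yields `Nothing` without ever reaching the addition: the "abort" control logic lives in the Maybe monad, not in the block.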

- As for I/O and state handling, remember that you have the IO and State monads, and that each monad represents one kind of computation. IO represents *processes that interact with the real world*, and state monads represent *processes that depend on an internal state, which changes as the process evolves*. The trick to using them in a pure functional language is that the actual side effect is not spread all over your program: it's encapsulated inside the monad definition.
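To make "encapsulated inside the monad definition" concrete, here is a minimal hand-rolled sketch of a State monad (the real one lives in Control.Monad.State; this toy version is mine). A stateful process is just a function from the current state to a (result, new state) pair, and bind threads the state from one step to the next:

```haskell
-- A minimal State monad: a stateful "process" is a function
-- from the current state to (result, new state).
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State $ \s -> let (a, s') = g s in (f a, s')

instance Applicative (State s) where
  pure a = State $ \s -> (a, s)
  State f <*> State g = State $ \s ->
    let (h, s')  = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad (State s) where
  -- bind runs the first step, then feeds the new state to the next.
  State g >>= f = State $ \s ->
    let (a, s') = g s
    in runState (f a) s'

get :: State s s
get = State $ \s -> (s, s)

put :: s -> State s ()
put s = State $ \_ -> ((), s)

-- An imperative-looking counter: reads and updates the hidden state.
tick :: State Int Int
tick = do
  n <- get
  put (n + 1)
  return n
```

Then `runState (tick >> tick >> tick) 0` evaluates to `(2, 3)`: the last tick returns the old counter 2, and the final state is 3. No variable was ever mutated; the "mutation" is just state-passing hidden inside bind.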

- The good thing about this programming style is that you can actually *do* imperative programming, but in a much more controlled manner: instead of relying on the "generic VM" embedded in your imperative programming language (with all its risk of incorrect side effects), you execute your code in the controlled micro-VM defined by the monad. That way, the only side effects allowed inside the do-block are those that you enable yourself. Monads work like aspects in OOP: they represent one particular kind of unrelated logic (such as exceptions, concurrency...) that must be processed in the background of your main computation, and that can alter its control flow. If you want your block to handle several aspects at once, you can always wrap several monads one inside another with monad transformers.
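For instance, here is a small sketch (the names `pop` and `sumTwo` are mine) that combines two aspects using the standard transformers library that ships with GHC: a StateT layer for mutable state, wrapped around Maybe for failure:

```haskell
import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, get, put, runStateT)

-- Two aspects stacked: mutable state (StateT [Int]) around
-- possible failure (Maybe).
pop :: StateT [Int] Maybe Int
pop = do
  xs <- get
  case xs of
    []     -> lift Nothing          -- the failure aspect aborts everything
    (y:ys) -> put ys >> return y    -- the state aspect updates the stack

sumTwo :: StateT [Int] Maybe Int
sumTwo = do
  a <- pop
  b <- pop
  return (a + b)
```

`runStateT sumTwo [10, 20, 30]` gives `Just (30, [30])`, while `runStateT sumTwo [5]` gives `Nothing`: the second pop's failure propagates through the whole stack, without any explicit error handling in `sumTwo`.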

For example, the List type, when used as a monad, represents a computation that can return more than one result (or fewer than one, which is the empty list). Inside a do-block where one instruction has a List result, the following instruction gets executed *once for each element in the list*, with each element as its input. This is used, for example, in list comprehensions:

do { x <- [1..3] ; y <- [1..x] ; return (x,y) } ---> [ (1,1) , (2,1) , (2,2) , (3,1) , (3,2) , (3,3) ]

Is Haskell a "pure functional" language?

1) Haskell is purely functional, but that means "everything can be described as functions, even side effects". Side effects happen inside do-blocks during the monad pipeline evaluation, but they are side effects only with respect to the do-block, not with respect to the language (the language views them as function calls over the monadic values).

2) Thanks to monads, you can confine side effects to do-blocks inside the IO monad (whose low-level system calls are then defined as language primitives embedded in the read/write functions, the same way it's done in imperative languages with read/write subroutines). If you expand the syntactic sugar of the do-block, it gets transformed into a series of function calls, in which the monad state is one input parameter and the new state is part of the result. To fully understand how this works, I'm afraid you should also learn about continuations, which complement monads in order to describe imperative execution in terms of functions.
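A toy model of that expansion (an illustration of the idea only, not how GHC actually implements IO): an "action" is a function that takes the world state and returns a result together with the new world state, and bind threads that state from one call to the next:

```haskell
-- Toy model: an "IO action" is a function from the world state
-- to a (result, new world state) pair.
newtype World = World { output :: [String] } deriving (Eq, Show)

type MyIO a = World -> (a, World)

-- "Printing" is a pure function that returns an extended world.
myPutStrLn :: String -> MyIO ()
myPutStrLn msg w = ((), World (output w ++ [msg]))

-- Bind: run the first action, pass the new world to the next one.
bindIO :: MyIO a -> (a -> MyIO b) -> MyIO b
bindIO m f = \w -> let (a, w') = m w in f a w'

returnIO :: a -> MyIO a
returnIO a = \w -> (a, w)

-- do { putStrLn "hello"; putStrLn "world" } would desugar to:
program :: MyIO ()
program = myPutStrLn "hello" `bindIO` (\_ -> myPutStrLn "world")
```

Running `program (World [])` yields a world whose `output` is `["hello", "world"]`: every "side effect" is really just the new-state half of a function result.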

Are side-effects part of the functional language, or something external to it?

We must get a bit philosophical here, in the sense that it depends on how you define "computation". It has been demonstrated that imperative and functional programming are mathematically equivalent (i.e. both have the power of Turing machines). This means that you can represent functional languages using an imperative program, but also that you can represent imperative languages using a functional program.

In fact this is the trick used by Turing himself to settle the decision problem, even before computers existed! Mathematics is purely side-effect free, and yet it's used to describe all possible computations. There's a branch of it called "temporal logic" that is related to how continuations and the State monad work. Logic assertions are universally true, but you can create assertions about the world in a particular state that are not true for other states. By passing the transformed state from one logic expression to the next (or from one function to the next), you can represent state changes. Similarly, continuations are functions that represent *all the instructions left to be executed in the program*, and they take the current program state as a parameter.
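A minimal continuation-passing sketch (the function names are mine): instead of returning, each function receives "the rest of the program" as an explicit parameter and hands its result to it:

```haskell
-- Each CPS function takes an extra argument k: the continuation,
-- i.e. everything left to be executed after this step.
addCPS :: Int -> Int -> (Int -> r) -> r
addCPS x y k = k (x + y)

squareCPS :: Int -> (Int -> r) -> r
squareCPS x k = k (x * x)

-- (2 + 3)^2 written as a pipeline of continuations: each step
-- explicitly names "what happens next".
pipeline :: (Int -> r) -> r
pipeline k = addCPS 2 3 (\s -> squareCPS s k)
```

`pipeline id` yields `25`, and `pipeline show` yields `"25"`: the caller decides what the rest of the program does with the result, which is exactly the control-flow flexibility that monadic do-blocks are compiled down to.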

Which style you use is really up to personal preference. Modern Haskell is showing that we could switch the entire imperative theoretical basis of programming, as defined during the '40s and '50s, to a functional basis, retaining everything that can be done with state changes while gaining the reasoning properties of functional programming.
