The expression 17 is of type int and is a value already. The expression 1 + 2 * 3 + 4 is of type int and evaluates to the value 11, since the multiplication is done first. The expression 800 / 80 / 8 has type int. It is the same as (800 / 80) / 8 rather than 800 / (80 / 8) and evaluates to 1.

The expression 400 > 200 has type bool because this is the type of the result of the comparison operator >. It evaluates to true. Similarly, 1 <> 1 has type bool and evaluates to false. The expression true || false is of type bool and evaluates to true since one of the operands is true. Similarly, true && false evaluates to false since one of the operands is false. The expression if true then false else true evaluates to false since the first (then) part of the conditional expression is chosen, and takes the place of the entire expression.

The expression ’%’ is of type char and is already a value. The expression ’a’ + ’b’ has no type – it gives a type error because the + operator does not operate on characters.

The mod operator is of higher precedence than the + operator. So 1 + 2 mod 3 and 1 + (2 mod 3) are the same expression, evaluating to 1 + 2 which is 3, but (1 + 2) mod 3 is the same as 3 mod 3, which is 0.

The expression evaluates to 11. The programmer seems to be under the impression that spacing affects evaluation order. It does not, and so this use of space is misleading.

The expression max_int + 1 evaluates to a number equal to min_int. Likewise, min_int - 1 evaluates to a number equal to max_int. The number line “wraps around”. This leads to the odd situation that max_int + 1 < max_int evaluates to true. It follows that when writing programs, we must be careful about what happens when numbers may be very large or very small.

We will talk about such exceptions later in the book. They are used for program errors which cannot necessarily be found just by looking at the program text, but are only discovered during evaluation.

This illustrates how even simple mathematical operators require careful specification when programming – we must be explicit about the rules.

It prevents unexpected values: what would happen if an integer other than 1 and 0 was calculated in the program – what would it mean? It is better just to use a different type. We can then show more easily that a program is correct.

The lowercase characters are in alphabetical order, for example ’p’ < ’q’ evaluates to true. The uppercase characters are similarly ordered. The uppercase letters are all “smaller” than the lowercase characters, so for example ’A’ < ’a’ evaluates to true. For type bool, false is considered “less than” true.

2. Names and Functions

Just take in an integer and return the number multiplied by ten. The function takes and returns an integer, so the type is int → int.

We must take two integer arguments, and use the && and <> operators to test if they are both non-zero. So the result will be of type bool. The whole type will therefore be int → int → bool.

OCaml

# let both_non_zero x y =
x <> 0 && y <> 0;;
val both_non_zero : int -> int -> bool = <fun>

Our function should take an integer, and return another one (the sum). So, it is type will be int → int. The base case is when the number is equal to 1. Then, the sum of all numbers from 1…1 is just 1. If not, we add the argument to the sum of all the numbers from 1…(n − 1).

OCaml

# let rec sum n =
if n = 1 then 1 else n + sum (n - 1);;
val sum : int -> int = <fun>

The function is recursive, so we used let rec instead of let. What happens if the argument given is zero or negative?

The function will have type int → int → int. A number to the power of 0 is 1. A number to the power of 1 is itself. Otherwise, the answer is the current n multiplied by n^x − 1.

OCaml

# let rec power x n =
if n = 0 then 1 else
(if n = 1 then x else
x * power x (n - 1));;
val power : int -> int -> int = <fun>

Notice that we had to put one if … then … else inside the else part of another to cope with the three different cases. The parentheses are not actually required, though, so we may write it like this:

OCaml

# let rec power x n =
if n = 0 then 1 else
if n = 1 then x else
x * power x (n - 1);;
val power : int -> int -> int = <fun>

In fact, we can remove the case for n = 1 since power x 1 will reduce to x * power x 0 which is just x.

The function isconsonant will have type char → bool. If a lower case character in the range ’a’…’z’ is not a vowel, it must be a consonant. So we can reuse the isvowel function we wrote earlier, and negate its result using the not function:

OCaml

# let isconsonant c = not (isvowel c);;
val isconsonant : char -> bool = <fun>

The expression is the same as let x = 1 in (let x = 2 in x + x), and so the result is 4. Both instances of x in x + x evaluate to 2 since this is the value assigned to the name x in the nearest enclosing let expression.

We could simply return 0 for a negative argument. The factorial of 0 is 1, so we can change that too, and say our new function finds the factorial of any non-negative number:

OCaml

# let rec factorial x =
if x < 0 then 0 else
if x = 0 then 1 else
x * factorial (x - 1);;
val factorial : int -> int = <fun>

Note that factorial can fail in other ways too – if the number gets too big and “wraps around”. For example, on the author’s computer, factorial 40 yields -2188836759280812032.

3. Case by Case

We can just pattern match on the boolean. It does not matter, in this instance, which order the two cases are in.

(A match case belongs to its nearest enclosing match). So the expression evaluates to 5.

These two solutions have differing behaviour upon erroneous arguments (such as punctuation). Can you see why?

4. Listing Things

But we can perform the same trick as before, by reversing the cases, to reduce their number:

This is like counting the length of a list, but we only count if the current element is true.

We can use an accumulating argument in an auxiliary function to make a tail recursive version:

To make a palindrome from any list, we can append it to its reverse. To check if a list is a palindrome, we can compare it for equality with its reverse (the comparison operators work over almost all types).

We pattern match with three cases. The empty list, where we have reached the last element, and where we have yet to reach it.

We can build a tail recursive version by adding an accumulating list, and reversing it when finished (assuming a tail recursive rev, of course!)

The empty list cannot contain the element; if there is a non-empty list, either the head is equal to the element we are looking for, or if not, the result of our function is just the same as the result of recursing on the tail.

Note that we are using the property that the || operator only evaluates its right hand side if the left hand side is false to limit the recursion – it really does stop as soon as it finds the element.

If a list is empty, it is already a set. If not, either the head exists somewhere in the tail or it does not; if it does exist in the tail, we can discard it, since it will be included later. If not, we must include it.

The first part of the evaluation of rev takes time proportional to the length of the list, processing each element once. However, when the lists are appended together, the order of the operations is such that the first argument becomes longer each time. The @ operator, as we know, also takes time proportional to the length of its first argument. And so, this accumulating of the lists takes time proportional to the square of the length of the list.

By using an additional accumulating argument, we can write a version which operates in time proportional to the length of the list.

5. Sorting Things

Simply add an extra let to define a name representing the number we will take or drop:

The argument to take or drop is length l / 2 which is clearly less than or equal to length l for all possible values of l. Thus, take and drop always succeed. In our case, take and drop are only called with length l more than 1, due to the pattern matching.

We may simply replace the <= operator with the >= operator in the insert function.

We require a function of type α list → bool. List of length zero and one are, by definition, sorted. If the list is longer, check that its first two elements are in sorted order. If this is true, also check that the rest of the list is sorted, starting with the second element.

Lists are compared starting with their first elements. If the elements differ, they are compared, and that is the result of the comparison. If both have the same first element, the second elements are considered, and so on. If the end of one list is reached before the other, the shorter list is considered smaller. For example:

These are the same principles you use to look up a word in a dictionary: compare the first letters – if same, compare the second etc. So, when applied to the example in the question, it has the effect of sorting the words into alphabetical order.

We have renamed the second argument of the insert function to avoid confusion.

Chapter 6. Functions upon Functions upon Functions

Our function will have type char list → char list. We just match on the argument list: if it is empty, we are done. If it starts with an exclamation mark, we output a period, and carry on. If not, we output the character unchanged, and carry on:

To use map instead, we write a simple function calm_char to process a single character. We can then use map to build our main function:

This avoids the explicit recursion of the original, and so it is easier to see what is going on.

We require a function apply f n x which applies function f a total of n times to the initial value x. The base case is when n is zero.

The function f must take and return the same type α, since its result in one iteration is fed back in as its argument in the next. Therefore, the argument x and the final result must also have type α. For example, for α = int, we might have a power function:

We can add an extra argument to the insert function, and use that instead of the comparison operator:

We cannot use map here, because the result list will not necessarily be the same length as the argument list. The function will have type (α → bool) → α list → α list.

For example, filter (fun x -> x mod 2 = 0) [1; 2; 4; 5] evaluates to [2; 4].

For example, we can see if all elements of a list are positive: for_all (fun x -> x > 0) [1; 2; -1] evaluates to false. Notice that we are relying on the fact that && only evaluates its right hand side when the left hand side is true to limit the recursion.

The function will have type (α → β) → α list list → β list list. We use map on each element of the list.

We have used explicit recursion to handle the outer list, and map to handle each inner list.

Chapter 7. When Things go Wrong

The function smallest_inner takes a currently smallest found integer, a boolean value found indicating if we have found any suitable value or not, and the list of integers. It is started with max_int as the current value, so that any number is smaller than it, and false for found because nothing has been found yet.

Thus, the function raises an exception in the case of an empty list, or one which is non-empty but contains no positive integer, and otherwise returns the smallest positive integer in the list.

We just surround the call to smallest with an exception handler for Not_found.

We write a function sqrt_inner which, given a test number x and a target number n squares x and tests if it is more than n. If it is, the answer is x - 1. The test number will be initialized at 1. The function sqrt raises our exception if the argument is less than zero, and otherwise begins the testing process.

Chapter 8. Looking Things Up

Since the keys must be unique, the number of different keys is simply the length of the list representing the dictionary – so we can just use the usual length function.

The type is the same as for the add function. However, if we reach the end of the list, we raise an exception, since we did not manage to find the entry to replace.

The function takes a list of keys and a list of values and returns a dictionary. So it will have type α list → β list → (α × β) list.

This will have the type (α × β) list → α list × β list. For the first time, we need to return a pair, building up both result lists element by element. This is rather awkward, since we will need the tails of both of the eventual results, so we can attach the new heads. We can do this by pattern matching.

Here’s a sample evaluation (we cannot really show it in the conventional way, so you must work through it whilst looking at the function definition):

Since the inner pattern match has only one form, and is complete, we can use let instead:

We can use our member function which determines whether an element is a member of a list, building up a list of the keys we have already seen, and adding to the result list of key-value pairs only those with new keys.

We pattern match on the first list – if it is empty, the result is simply b. Otherwise, we add the first element of the first list to the union of the rest of its elements and the second list.

We can verify that the elements of dictionary a have precedence over the elements of dictionary b by noting that add replaces a value if the key already exists.

Chapter 9. More with Functions

The function g a b c has type α → β → γ → δ which can also be written α → (β → (γ → δ)). Thus, it takes an argument of type α and returns a function of type β → (γ → δ) which, when you give it an argument of type β returns a function of type γ → δ which, when you give it an argument of type γ returns something of type δ. And so, we can supply just one or two arguments to the function g (which is called partial application), or supply all three at once. When we write let g a b c = … this is just shorthand for let g = fun a -> fun b -> fun c -> …

The type of member is α → α list → bool, so if we supply the first argument, the type of member x must be α list → bool. We can use the partially-applied member function and map to produce a list of boolean values, one for each list in the argument, indicating whether or not that list contains the element. Then, we can use member again to make sure there are no false booleans in the list.

Which do you think is clearer? Why do we check for the absence of false rather than the presence of true?

The function ( / ) 2 resulting from the partial application of the / operator is the function which divides two by a given number, not the function which divides a given number by two. We can define a reverse divide function…

The function map has type (α → β) → α list → β list. The function mapl we wrote has type (α → β) → α list list → β list list. So the function mapll will have type (α → β) → α list list list → β list list list. It may be defined thus:

It is not possible to write a function which would map a function f over a list, or list of lists, or list of lists of lists depending upon its argument, because every function in OCaml must have a single type. If a function could map f over an α list list it must inspect its argument enough to know it is a list of lists, thus it could not be used on a β list unless β = α list.

We can write a function to truncate a single list using our take function, being careful to deal with the case where there is not enough to take, and then use this and map to build truncate itself.

Here we have used partial application of truncate to build a suitable function for map. Note that we could use exception handling instead of calling length, saving time:

First, define a function which takes the given number and a list, returning the first element (or the number if none). We can then build the main function, using partial application to make a suitable function to give to map:

Chapter 10. New Kinds of Data

We need two constructors – one for squares, which needs just a single integer (the length of a side), and one for rectangles which needs two integers (the width and height, in that order):

The name of our new type is rect. A rect is either a Square or a Rectangle. For example,

This will be a function of type rect → rect. Squares remain unaltered, but if we have a rectangle with a bigger width than height, we rotate it by ninety degrees.

We will use map to perform our rotation on any rects in the argument list which need it. We will then use the sorting function from the previous chapter which takes a custom comparison function so as to just compare the widths.

We follow the same pattern as for the list type, being careful to deal with exceptional circumstances:

Chapter 11. Growing Trees

Our function will have type α → α tree → bool. It takes a element to look for, a tree holding that kind of element, and returns true if the element is found, or false otherwise.

Note that we have placed the test x = y first of the three to ensure earliest termination upon finding an appropriate element.

Our function will have type α tree → α tree. A leaf flips to a leaf. A branch has its left and right swapped, and we must recursively flip its left and right sub-trees too.

We can check each part of both trees together. Leaves are considered equal, branches are equal if their left and right sub-trees are equal.

There will be no key clashes, because the argument should already be a dictionary. If it is not, earlier keys are preferred since insert replaces existing keys.

We can make list dictionaries from both tree dictionaries, append them, and build a new tree from the resultant list.

The combined list may not be a dictionary (because it may have repeated keys), but tree_of_list will prefer keys encountered earlier. So, we put entries from t’ after those from t.

We will use a list for the sub-trees of each branch, with the empty list signifying there are no more i.e. that this is the bottom of the tree. Thus, we only need a single constructor.

type ’a mtree = Branch of ’a * ’a mtree list

In fact, when there is only one pattern to match, we can put it directly in place of the function’s argument, simplifying these definitions:

Chapter 12. In and Out

OCaml

# [1; 2; 3];;
- : int list = [1; 2; 3]
# print_integers [1; 2; 3];;
[1; 2; 3; ]- : unit = ()

There is an extra space after the last element, and a semicolon too. We can fix this, at the cost of a longer program:

OCaml

# [1; 2; 3];;
- : int list = [1; 2; 3]
# print_integers [1; 2; 3];;
[1; 2; 3]- : unit = ()

We must deal with the exception raised when read_int attempts to read something which is not an integer, as before. When that exception is caught, we try again, by recursively calling ourselves. The function ends when three integers are input correctly, returning them as a tuple.

You may wonder why we used nested let … in … structures rather than just writing (read_int (), read_int (), read_int ()) – the evaluation order of a tuple is not specified and OCaml is free to do what it wants.

We ask the user how many dictionary entries will be entered, eliminating the need for a special “I have finished” code. First, a function to read a given number of integer–string pairs, dealing with the usual problem of malformed integers:

And now, asking the user how many entries there will be, and calling our first function:

Notice that we defined, raised, and handled our own exception BadNumber to deal with the user asking to read a negative number of dictionary entries – this would cause read_dict_number to fail to return.

If we write a function to build the list of integers from 1 to n (or the empty list if n is zero):

We can then write a function to output a table of a given size to an output channel.

Look at this carefully. We are using nested calls to iter to build the two-dimensional table from one-dimensional lists. Can you separate this into more than one function? Which approach do you think is more readable?

We can test write_table_channel most easily by using the built-in output channel stdout which just writes to the screen:

OCaml

# write_table_channel stdout 5;;
1 2 3 4 5
2 4 6 8 10
3 6 9 12 15
4 8 12 16 20
5 10 15 20 25
- : unit = ()

Now we just need to wrap it in a function to open an output file, write the table, and close the output, dealing with any errors which may arise.

In addition to raising Invalid_argument in the case of a negative number, we handle all possible exceptions to do with opening, writing to, and closing the file, re-raising them as our own, predefined one. Is this good style?

We write a simple function to count the lines in a channel by taking a line, ignoring it, and adding one to the result of taking another line; our recursion ends when an End_of_file exception is raised – it is caught and 0 ends the summation.

The main function countlines just opens the file, calls the first function, and closes the file. Any errors are caught and re-raised using the built-in Failure exception.

As usual, let us write a function to deal with channels, and then deal with opening and closing files afterward. Our function takes an input channel and an output channel, adds the line read from the input to the output, follows it with a newline character, and continues. It only ends when the End_of_file exception is raised inside input_line and caught.

Now we wrap it up, remembering to open and close both files and deal with the many different errors which might occur.

Chapter 13. Putting Things in Boxes

Two references, x and y, of type int ref have been created. Their initial values are 1 and 2. Their final values are 2 and 4. The type of the expression is int because this is the type of !x + !y, and the result is 6.

The expression [ref 5; ref 5] is of type int ref list. It contains two references each containing the integer 5. Changing the contents of one reference will not change the contents of the other. The expression let x = ref 5 in [x; x] is also of type int ref list and also contains two references to the integer 5. However, altering one will alter the other:

OCaml

# let r = let x = ref 5 in [x; x];;
val r : int ref list = [{contents = 5}; {contents = 5}]
# match r with h::_ -> h := 6;;
Warning 8 [partial-match]: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
[]
- : unit = ()
# r;;
- : int ref list = [{contents = 6}; {contents = 6}]

We can write a function forloop which takes a function of type int → α (where alpha would normally be unit), together with the start and end numbers:

OCaml

# forloop print_int 2 10;;
2345678910- : unit = ()
# forloop print_int 2 2;;
2- : unit = ()

Note that this works for the empty array, because a for construct where the second number is less than the first never executes its expression.

Since we wish to reverse the array in place, our function will have type α array → unit. Our method is to proceed from the first element to the half-way point, swapping elements from either end of the array. If the array has odd length, the middle element will not be altered.

We will represent the int array array as an array of columns so that a.(x).(y) is the element in column x and row y.

The difference between the codes for ’a’ and ’A’, or ’z’ and ’Z’ is 32, so we add or subtract as appropriate. Codes not in those ranges are unaltered.

Periods, exclamation marks and question marks may appear in multiples, leading to a wrong answer. The number of characters does not include newlines. It is not clear how quotations would be handled. Counting the words by counting spaces is inaccurate – a line with ten words will count only nine.

Chapter 14. The Other Numbers

We calculate the ceiling and floor, and return the closer one, being careful to make sure that a point equally far from the ceiling and floor is rounded up.

The behaviour with regard to values such as infinity and nan is fine, since it always returns the result of either floor or ceil.

The whole part is calculated using the built-in floor function. We return a tuple, the first number being the whole part, the second being the original number minus the whole part. In the case of a negative number, we must be careful – floor always rounds downward, not toward zero!

We need to determine at which column the asterisk will be printed. It is important to make sure that the range 0…1 is split into fifty equal sized parts, which requires some careful thought. Then, we just print enough spaces to pad the line, add the asterisk, and a newline character.

No allowance has been made here for bad arguments (for example, b smaller than a). Can you extend our program to move the zero-point to the middle of the screen, so that the sine function can be graphed even when its result is less than zero?

Chapter 15. The OCaml Standard Library

To make a tail-recursive one, we can use an accumulator, reversing each list as we append it, and reversing the result. List.rev is tail-recursive already.

We can use List.mem, partially applied, to map over the list of lists. We then make sure that false is not in the resultant list, again with List.mem.

The String.iter function calls a user-supplied function of type char → unit on each character of the string. We can use this to increment a counter when an exclamation mark is found.

We can use the String.map function, which takes a user-supplied function of type char → char and returns a new string, where each character is the result of the mapping function on the character in the same place in the old string.

Notice that we have taken advantage of partial application to erase the last argument as usual.

We can use the functions create, add_string, and contents from the Buffer module together with the usual list iterator List.iter:

We repeatedly check if the string we are looking for is right at the beginning of the string to be searched. If not, we chop one character off the string to be searched, and try again. Every time we find a match, we increment a counter.

You might consider that writing this function with lists of characters rather than strings would be easier. Unfortunately, it would be slow, and these kinds of searching tasks are often required to be very fast.

Chapter 16. Building Bigger Programs

First, we extend the Textstat module to allow frequencies to be counted and expose it through the interface. Here is the interface q1.mli:

We can write two little functions – one to read all the lines from a file, and one to write them. The main function, then, reads the command line to find the input and output file names, reads the lines from the input, reverses the list of lines, and writes them out. If a problem occurs, the exception is printed out. If the command line is badly formed, we print a usage message and exit.

Note that there is a problem if the file has no final newline – it will end up with one. How might you solve that?

We can simply do something (or nothing) a huge number of times using a for loop.

On many systems, typing time followed by a space and the usual command will print out on the screen how long the program took to run. For example, on the author’s computer:

$ ocamlc bigloop.ml -o bigloop
$ time ./bigloop

real  0m1.896s
user  0m1.885s
sys   0m0.005s

$ ocamlopt bigloop.ml -o bigloop
$ time ./bigloop

real  0m0.022s
user  0m0.014s
sys   0m0.003s

You can see that, when compiled with ocamlc, it takes 1.9s to run, but when compiled with ocamlopt just 0.022s.

We can get all the lines in the file using our getlines function from question two. The main function simply calls string_in_line on each line, printing it if true is returned.

The interesting function is string_in_line. To see if term is in line we start at position 0. The condition for the term having been found is a combination of boolean expressions. The first ensures that we are not so far through the string that the expression could not possibly fit at the current position. The second checks to see if the term is found at the current position by using the function String.sub from the OCaml Standard Library. If not, we carry on.

Answers to Questions

1. Starting Off

1

2

3

4

5

6

7

8