The expression `17`

is of type **int** and is a value already.
The expression `1 + 2 * 3 + 4`

is of type **int** and evaluates to the
value `11`

, since the multiplication is done first. The
expression `800 / 80 / 8`

has type **int**. It is the same as
`(800 / 80) / 8`

rather than `800 / (80 / 8)`

and
evaluates to `1`

.

The expression `400 > 200`

has type **bool** because this is the type
of the result of the comparison operator `>`

. It evaluates
to `true`

. Similarly, `1 <> 1`

has type
**bool** and evaluates to
`false`

. The expression `true || false`

is of type
**bool** and evaluates to
`true`

since one of the operands is true. Similarly,
`true && false`

evaluates to `false`

since
one of the operands is false. The expression
`if true then false else true`

evaluates to
`false`

since the first (`then`

) part of the
conditional expression is chosen, and takes the place of the entire
expression.

The expression `’%’`

is of type **char** and is already a value. The
expression `’a’ + ’b’`

has no type – it gives a type error
because the `+`

operator does not operate on characters.

The `mod`

operator is of higher precedence than the
`+`

operator. So `1 + 2 mod 3`

and
`1 + (2 mod 3)`

are the same expression, evaluating to
`1 + 2`

which is `3`

, but
`(1 + 2) mod 3`

is the same as `3 mod 3`

, which is
`0`

.

The expression evaluates to `11`

. The programmer seems to
be under the impression that spacing affects evaluation order. It does
not, and so this use of space is misleading.

The expression `max_int + 1`

evaluates to a number equal
to `min_int`

. Likewise, `min_int - 1`

evaluates to
a number equal to `max_int`

. The number line “wraps around”.
This leads to the odd situation that
`max_int + 1 < max_int`

evaluates to `true`

. It
follows that when writing programs, we must be careful about what
happens when numbers may be very large or very small.

OCaml accepts the program, but complains when it is run:

` OCaml`

`# 1 / 0;;`

`Exception: Division_by_zero.`

We will talk about such *exceptions* later in the book. They
are used for program errors which cannot necessarily be found just by
looking at the program text, but are only discovered during
evaluation.

For `x mod y`

:

when

*y*= 0, OCaml prints`Exception: Division_by_zero`

when

*y*< > 0,*x*< 0, the result will be negativewhen

*y*< > 0,*x*= 0, the result will be zero

This illustrates how even simple mathematical operators require careful specification when programming – we must be explicit about the rules.

It prevents unexpected values: what would happen if an integer other
than `1`

and `0`

was calculated in the program –
what would it mean? It is better just to use a different type. We can
then show more easily that a program is correct.

The lowercase characters are in alphabetical order, for example
`’p’ < ’q’`

evaluates to `true`

. The uppercase
characters are similarly ordered. The uppercase letters are all
“smaller” than the lowercase characters, so for example
`’A’ < ’a’`

evaluates to `true`

. For type **bool**, `false`

is
considered “less than” `true`

.

Just take in an integer and return the number multiplied by ten. The
function takes and returns an integer, so the type is **int →
int**.

` OCaml`

`# let times_ten x = x * 10;;`

`val times_ten : int -> int = <fun>`

We must take two integer arguments, and use the
`&&`

and `<>`

operators to test if
they are both non-zero. So the result will be of type **bool**. The whole type will
therefore be **int → int →
bool**.

` OCaml`

`# let both_non_zero x y =`

` x <> 0 && y <> 0;;`

`val both_non_zero : int -> int -> bool = <fun>`

Our function should take an integer, and return another one (the
sum). So, it is type will be **int → int**. The base case is when
the number is equal to 1. Then, the sum
of all numbers from 1…1 is just 1. If not, we add the argument to the sum of
all the numbers from 1…(*n*−1).

` OCaml`

`# let rec sum n =`

` if n = 1 then 1 else n + sum (n - 1);;`

`val sum : int -> int = <fun>`

The function is recursive, so we used
** let rec** instead of

`let`

The function will have type **int
→ int → int**. A number to the power
of 0 is 1. A number to the power of 1 is itself. Otherwise, the answer is the
current *n* multiplied by *n*^{x − 1}.

` OCaml`

`# let rec power x n =`

` if n = 0 then 1 else`

` (if n = 1 then x else`

` x * power x (n - 1));;`

`val power : int -> int -> int = <fun>`

Notice that we had to put one `if`

`… `

`then`

`… `

** else** inside the

`else`

` OCaml`

`# let rec power x n =`

` if n = 0 then 1 else`

` if n = 1 then x else`

` x * power x (n - 1);;`

`val power : int -> int -> int = <fun>`

In fact, we can remove the case for `n = 1`

since
`power x 1`

will reduce to `x * power x 0`

which
is just `x`

.

The function `isconsonant`

will have type **char →
bool**. If a lower case character in the range
`’a’`

…`’z’`

is not a vowel, it must be a
consonant. So we can reuse the `isvowel`

function we wrote
earlier, and negate its result using the `not`

function:

` OCaml`

`# let isconsonant c = not (isvowel c);;`

`val isconsonant : char -> bool = <fun>`

The expression is the same as
`let`

` x = 1 `

`in`

` (`

`let`

` x = 2 `

`in`

` x + x)`

,
and so the result is `4`

. Both instances of `x`

in
`x + x`

evaluate to `2`

since this is the value
assigned to the name `x`

in the nearest enclosing
** let** expression.

We could simply return 0 for a negative argument. The factorial of 0 is 1, so we can change that too, and say our new function finds the factorial of any non-negative number:

` OCaml`

`# let rec factorial x =`

` if x < 0 then 0 else`

` if x = 0 then 1 else`

` x * factorial (x - 1);;`

`val factorial : int -> int = <fun>`

Note that `factorial`

can fail in other ways too – if the
number gets too big and “wraps around”. For example, on the author’s
computer, `factorial 40`

yields
`-2188836759280812032`

.

We can just pattern match on the boolean. It does not matter, in this instance, which order the two cases are in.

Recall our solution from the previous chapter:

Modifying it to use pattern matching:

Again, modifying our solution from the previous chapter:

This is the same as

(A match case belongs to its nearest enclosing
** match**). So the expression evaluates to

`5`

.We write two functions of type **char
→ bool** like this:

Alternatively, we might write:

These two solutions have differing behaviour upon erroneous arguments (such as punctuation). Can you see why?

This is similar to `odd_elements`

:

But we can perform the same trick as before, by reversing the cases, to reduce their number:

This is like counting the length of a list, but we only count if the
current element is `true`

.

We can use an accumulating argument in an auxiliary function to make a tail recursive version:

To make a palindrome from any list, we can append it to its reverse. To check if a list is a palindrome, we can compare it for equality with its reverse (the comparison operators work over almost all types).

We pattern match with three cases. The empty list, where we have reached the last element, and where we have yet to reach it.

We can build a tail recursive version by adding an accumulating list,
and reversing it when finished (assuming a tail recursive
`rev`

, of course!)

The empty list cannot contain the element; if there is a non-empty list, either the head is equal to the element we are looking for, or if not, the result of our function is just the same as the result of recursing on the tail.

Note that we are using the property that the `||`

operator
only evaluates its right hand side if the left hand side is false to
limit the recursion – it really does stop as soon as it finds the
element.

If a list is empty, it is already a set. If not, either the head exists somewhere in the tail or it does not; if it does exist in the tail, we can discard it, since it will be included later. If not, we must include it.

For example, consider the evaluation of
`make_set [4; 5; 6; 5; 4]`

:

The first part of the evaluation of `rev`

takes time
proportional to the length of the list, processing each element once.
However, when the lists are appended together, the order of the
operations is such that the first argument becomes longer each time. The
`@`

operator, as we know, also takes time proportional to the
length of its first argument. And so, this accumulating of the lists
takes time proportional to the square of the length of the list.

By using an additional accumulating argument, we can write a version which operates in time proportional to the length of the list.

For the same list:

Simply add an extra ** let** to define a
name representing the number we will take or drop:

The argument to `take`

or `drop`

is
`length l / 2`

which is clearly less than or equal to
`length l`

for all possible values of `l`

. Thus,
`take`

and `drop`

always succeed. In our case,
`take`

and `drop`

are only called with
`length l`

more than 1, due to the pattern matching.

We may simply replace the `<=`

operator with the
`>=`

operator in the `insert`

function.

The `sort`

function is unaltered.

We require a function of type ** α list → bool**. List of length zero
and one are, by definition, sorted. If the list is longer, check that
its first two elements are in sorted order. If this is true, also check
that the rest of the list is sorted, starting with the second
element.

We can reverse the cases to simplify:

Lists are compared starting with their first elements. If the elements differ, they are compared, and that is the result of the comparison. If both have the same first element, the second elements are considered, and so on. If the end of one list is reached before the other, the shorter list is considered smaller. For example:

`[1] < [2] < [2; 1] < [2; 2]`

These are the same principles you use to look up a word in a dictionary: compare the first letters – if same, compare the second etc. So, when applied to the example in the question, it has the effect of sorting the words into alphabetical order.

The ** let rec** construct can be nested
just like the

`let`

We have renamed the second argument of the `insert`

function to avoid confusion.

Our function will have type **char
list → char list**. We
just match on the argument list: if it is empty, we are done. If it
starts with an exclamation mark, we output a period, and carry on. If
not, we output the character unchanged, and carry on:

To use `map`

instead, we write a simple function
`calm_char`

to process a single character. We can then use
`map`

to build our main function:

This avoids the explicit recursion of the original, and so it is easier to see what is going on.

The `clip`

function is of type **int →
int** and is easy to write:

Now we can use map for the `cliplist`

function:

Just put the body of the clip function inside an anonymous function:

We require a function `apply f n x`

which applies function
`f`

a total of `n`

times to the initial value
`x`

. The base case is when `n`

is zero.

Consider the type:

The function `f`

must take and return the same type *α*, since its result in one iteration
is fed back in as its argument in the next. Therefore, the argument
`x`

and the final result must also have type *α*. For example, for *α* = **int**, we might
have a power function:

So `power a b`

calculates *a*^{b}.

We can add an extra argument to the `insert`

function, and
use that instead of the comparison operator:

Now we just need to rewrite the `sort`

function.

We cannot use map here, because the result list will not necessarily
be the same length as the argument list. The function will have type
**( α →
bool) →
α list → α
list**.

For example,
`filter (`

`fun`

` x -> x mod 2 = 0) [1; 2; 4; 5]`

evaluates to `[2; 4]`

.

The function will have type **( α → bool) → α
list → bool**.

For example, we can see if all elements of a list are positive:
`for_all (`

`fun`

` x -> x > 0) [1; 2; -1]`

evaluates to `false`

. Notice that we are relying on the fact
that `&&`

only evaluates its right hand side when the
left hand side is true to limit the recursion.

The function will have type **( α → β) → α
list list → β list list**. We use

`map`

on each element of the list.We have used explicit recursion to handle the outer list, and
`map`

to handle each inner list.

The function `smallest_inner`

takes a currently smallest
found integer, a boolean value `found`

indicating if we have
found any suitable value or not, and the list of integers. It is started
with `max_int`

as the current value, so that any number is
smaller than it, and `false`

for `found`

because
nothing has been found yet.

Thus, the function raises an exception in the case of an empty list, or one which is non-empty but contains no positive integer, and otherwise returns the smallest positive integer in the list.

We just surround the call to `smallest`

with an exception
handler for `Not_found`

.

We write a function `sqrt_inner`

which, given a test
number `x`

and a target number `n`

squares
`x`

and tests if it is more than `n`

. If it is,
the answer is `x - 1`

. The test number will be initialized at
`1`

. The function `sqrt`

raises our exception if
the argument is less than zero, and otherwise begins the testing
process.

We wrap up the function, handle the `Complex`

exception
and return.

Since the keys must be unique, the number of different keys is simply
the length of the list representing the dictionary – so we can just use
the usual `length`

function.

The type is the same as for the `add`

function. However,
if we reach the end of the list, we raise an exception, since we did not
manage to find the entry to replace.

The function takes a list of keys and a list of values and returns a
dictionary. So it will have type ** α list → β
list → (α × β)
list**.

This will have the type **( α × β) list
→ α list × β
list**. For the first time, we need to return a pair,
building up both result lists element by element. This is rather
awkward, since we will need the tails of both of the eventual results,
so we can attach the new heads. We can do this by pattern matching.

Here’s a sample evaluation (we cannot really show it in the conventional way, so you must work through it whilst looking at the function definition):

Since the inner pattern match has only one form, and is complete, we
can use ** let** instead:

We can use our `member`

function which determines whether
an element is a member of a list, building up a list of the keys we have
already seen, and adding to the result list of key-value pairs only
those with new keys.

How long does this take to run? Consider how long `member`

takes.

We pattern match on the first list – if it is empty, the result is
simply `b`

. Otherwise, we add the first element of the first
list to the union of the rest of its elements and the second list.

We can verify that the elements of dictionary `a`

have
precedence over the elements of dictionary `b`

by noting that
`add`

replaces a value if the key already exists.

The function `g a b c`

has type *α* →
*β* → *γ*
→ *δ* which can also be written *α* →
(*β*
→ (*γ* →
*δ*)). Thus, it takes an argument of type *α* and returns a function of type
*β* → (*γ* →
*δ*)
which, when you give it an argument of type *β* returns a function of type *γ* →
*δ* which, when you give it an
argument of type *γ* returns
something of type *δ*. And so,
we can apply just one or two arguments to the function `g`

(which is called partial application), or apply all three at once. When
we write `let`

` g a b c = …`

this
is just shorthand for
`let`

` g = `

`fun`

` a -> `

`fun`

` b -> `

`fun`

` c -> …`

The type of `member`

is ** α → α
list → bool**, so if we
partially apply the first argument, the type of

`member x`

must be `member`

function and `map`

to
produce a list of boolean values, one for each list in the argument,
indicating whether or not that list contains the element. Then, we can
use `member`

again to make sure there are no
`false`

booleans in the list.We could also write:

Which do you think is clearer? Why do we check for the absence of
`false`

rather than the presence of `true`

?

The function `( / ) 2`

resulting from the partial
application of the `/`

operator is the function which divides
two by a given number, not the function which divides a given number by
two. We can define a reverse divide function…

`let`

` rdiv x y = y / x`

…which, when partially applied, does what we want.

The function `map`

has type **( α →
β)
→ α list → β
list**. The function

`mapl`

we wrote has type
`mapll`

will have
type But, as discussed, we may remove the `l`

s too:

It is not possible to write a function which would map a function
`f`

over a list, or list of lists, or list of lists of lists
depending upon its argument, because every function in OCaml must have a
single type. If a function could map `f`

over an ** α
list list** it must inspect its argument enough to know it
is a list of lists, thus it could not be used on a

We can write a function to truncate a single list using our
`take`

function, being careful to deal with the case where
there is not enough to take, and then use this and `map`

to
build `truncate`

itself.

Here we have used partial application of `truncate`

to
build a suitable function for `map`

. Note that we could use
exception handling instead of calling length, saving time:

You might, however, reflect on whether or not this is good style.

First, define a function which takes the given number and a list,
returning the first element (or the number if none). We can then build
the main function, using partial application to make a suitable function
to give to `map`

:

We need two constructors – one for squares, which needs just a single integer (the length of a side), and one for rectangles which needs two integers (the width and height, in that order):

The name of our new type is rect. A
rect is either a `Square`

or a `Rectangle`

. For
example,

We pattern match on the argument:

This will be a function of type rect → rect. Squares remain unaltered, but if we have a rectangle with a bigger width than height, we rotate it by ninety degrees.

We will use `map`

to perform our rotation on any rects in the argument list which need it. We
will then use the sorting function from the previous chapter which takes
a custom comparison function so as to just compare the widths.

For example, packing the list of rects

`[Square 6; Rectangle (4, 3); Rectangle (5, 6); Square 2]`

will give

`[Square 2; Rectangle (3, 4); Rectangle (5, 6); Square 6]`

We follow the same pattern as for the **list** type, being careful to
deal with exceptional circumstances:

We can use our `power`

function from earlier:

We can just wrap up the previous function:

Our function will have type *α* →
*α* tree → **bool**. It takes a
element to look for, a tree holding that kind of element, and returns
`true`

if the element is found, or `false`

otherwise.

Note that we have placed the test `x = y`

first of the
three to ensure earliest termination upon finding an appropriate
element.

Our function will have type *α* tree → *α*
tree. A leaf flips to a leaf. A branch has its left and right
swapped, and we must recursively flip its left and right sub-trees
too.

We can check each part of both trees together. Leaves are considered equal, branches are equal if their left and right sub-trees are equal.

We can use the tree insertion operation repeatedly:

There will be no key clashes, because the argument should already be
a dictionary. If it is not, earlier keys are preferred since
`insert`

replaces existing keys.

We can make list dictionaries from both tree dictionaries, append them, and build a new tree from the resultant list.

The combined list may not be a dictionary (because it may have
repeated keys), but `tree_of_list`

will prefer keys
encountered earlier. So, we put entries from `t’ `

after
those from `t`

.

We will use a list for the sub-trees of each branch, with the empty list signifying there are no more i.e. that this is the bottom of the tree. Thus, we only need a single constructor.

`type`

` ’a mtree = Branch `

`of`

` ’a * ’a mtree list`

So, now we can define `size`

, `total`

, and
`map`

.

In fact, when there is only one pattern to match, we can put it directly in place of the function’s argument, simplifying these definitions:

A first attempt might be:

However, there are two problems:

` OCaml`

`# [1; 2; 3];;`

`- : int list = [1; 2; 3]`

`# print_integers [1; 2; 3];;`

`[1; 2; 3; ]- : unit = ()`

There is an extra space after the last element, and a semicolon too. We can fix this, at the cost of a longer program:

Now, the result is correct:

` OCaml`

`# [1; 2; 3];;`

`- : int list = [1; 2; 3]`

`# print_integers [1; 2; 3];;`

`[1; 2; 3]- : unit = ()`

We must deal with the exception raised when `read_int`

attempts to read something which is not an integer, as before. When that
exception is caught, we try again, by recursively calling ourselves. The
function ends when three integers are input correctly, returning them as
a tuple.

You may wonder why we used nested
`let`

` … `

`in`

` … `

structures rather than
just writing `(read_int (), read_int (), read_int ())`

– the
evaluation order of a tuple is not specified and OCaml is free to do
what it wants.

We ask the user how many dictionary entries will be entered, eliminating the need for a special “I have finished” code. First, a function to read a given number of integer–string pairs, dealing with the usual problem of malformed integers:

And now, asking the user how many entries there will be, and calling our first function:

Notice that we defined, raised, and handled our own exception
`BadNumber`

to deal with the user asking to read a negative
number of dictionary entries – this would cause
`read_dict_number`

to fail to return.

If we write a function to build the list of integers from 1 to *n* (or the empty list if *n* is zero):

We can then write a function to output a table of a given size to an output channel.

Look at this carefully. We are using nested calls to
`iter`

to build the two-dimensional table from
one-dimensional lists. Can you separate this into more than one
function? Which approach do you think is more readable?

We can test `write_table_channel`

most easily by using the
built-in output channel `stdout`

which just writes to the
screen:

` OCaml`

`# write_table_channel stdout 5;;`

`1 2 3 4 5`

`2 4 6 8 10`

`3 6 9 12 15`

`4 8 12 16 20`

`5 10 15 20 25`

`- : unit = ()`

Now we just need to wrap it in a function to open an output file, write the table, and close the output, dealing with any errors which may arise.

In addition to raising `Invalid_argument`

in the case of a
negative number, we handle all possible exceptions to do with opening,
writing to, and closing the file, re-raising them as our own, predefined
one. Is this good style?

We write a simple function to count the lines in a channel by taking
a line, ignoring it, and adding one to the result of taking another
line; our recursion ends when an `End_of_file`

exception is
raised – it is caught and 0 ends the
summation.

The main function `countlines`

just opens the file, calls
the first function, and closes the file. Any errors are caught and
re-raised using the built-in `Failure`

exception.

As usual, let us write a function to deal with channels, and then
deal with opening and closing files afterward. Our function takes an
input channel and an output channel, adds the line read from the input
to the output, follows it with a newline character, and continues. It
only ends when the `End_of_file`

exception is raised inside
`input_line`

and caught.

Now we wrap it up, remembering to open and close both files and deal with the many different errors which might occur.

Two references, `x`

and `y`

, of type
**int ref** have been
created. Their initial values are 1 and
2. Their final values are 2 and 4. The
type of the expression is **int** because this is the type
of `!x + !y`

, and the result is 6.

The expression `[ref 5; ref 5]`

is of type **int ref list**. It contains two
references each containing the integer 5. Changing the contents of one
reference will not change the contents of the other. The expression
`let`

` x = ref 5 `

`in`

` [x; x]`

is also of type **int ref
list** and also contains two references to the integer 5.
However, altering one will alter the other:

` OCaml`

`# let r = let x = ref 5 in [x; x];;`

`val r : int ref list = [{contents = 5}; {contents = 5}]`

`# `

`match r with h::_ -> h := 6;;`

`Warning 8 [partial-match]: this pattern-matching is not exhaustive.`

`Here is an example of a case that is not matched:`

`[]`

`- : unit = ()`

`# r;;`

`- : int ref list = [{contents = 6}; {contents = 6}]`

We can write a function `forloop`

which takes a function
of type **int → α** (where alpha would
normally be

For example:

` OCaml`

`# forloop print_int 2 10;;`

`2345678910- : unit = ()`

`# forloop print_int 2 2;;`

`2- : unit = ()`

`[|1; 2; 3|]`

: **int
array**

`[|true; false; true|]`

: **bool array**

`[|[|1|]|]`

: **(int array)
array** which is **int
array array**

`[|[1; 2; 3]; [4; 5; 6]|]`

: **int list array**

`[|1; 2; 3|].(2)`

: **int**, has value 2

`[|1; 2; 3|].(2) <- 4`

: **unit**, updates the array to
`[|1; 2; 4|]`

We use a ** for** construct:

Note that this works for the empty array, because a
** for** construct where the second number is
less than the first never executes its expression.

Since we wish to reverse the array in place, our function will have
type ** α array → unit**. Our method is to
proceed from the first element to the half-way point, swapping elements
from either end of the array. If the array has odd length, the middle
element will not be altered.

We will represent the **int array
array** as an array of columns so that
`a.(x).(y)`

is the element in column `x`

and row
`y`

.

Note that the result is correct for `table 0`

.

The difference between the codes for `’a’`

and
`’A’`

, or `’z’`

and `’Z’`

is 32, so we
add or subtract as appropriate. Codes not in those ranges are
unaltered.

Periods, exclamation marks and question marks may appear in multiples, leading to a wrong answer. The number of characters does not include newlines. It is not clear how quotations would be handled. Counting the words by counting spaces is inaccurate – a line with ten words will count only nine.

We calculate the ceiling and floor, and return the closer one, being careful to make sure that a point equally far from the ceiling and floor is rounded up.

The behaviour with regard to values such as `infinity`

and
`nan`

is fine, since it always returns the result of either
`floor`

or `ceil`

.

The function returns another point, and is simple arithmetic.

The whole part is calculated using the built-in `floor`

function. We return a tuple, the first number being the whole part, the
second being the original number minus the whole part. In the case of a
negative number, we must be careful – `floor`

always rounds
downward, not toward zero!

Notice that we are using the unary operator `-.`

to make
the number positive.

We need to determine at which column the asterisk will be printed. It is important to make sure that the range 0…1 is split into fifty equal sized parts, which requires some careful thought. Then, we just print enough spaces to pad the line, add the asterisk, and a newline character.

No allowance has been made here for bad arguments (for example,
`b`

smaller than `a`

). Can you extend our program
to move the zero-point to the middle of the screen, so that the sine
function can be graphed even when its result is less than zero?

A non-tail-recursive one is simple:

To make a tail-recursive one, we can use an accumulator, reversing
each list as we append it, and reversing the result.
`List.rev`

is tail-recursive already.

We can use `List.mem`

, partially applied, to map over the
list of lists. We then make sure that `false`

is not in the
resultant list, again with `List.mem`

.

The `String.iter`

function calls a user-supplied function
of type **char → unit** on each character of
the string. We can use this to increment a counter when an exclamation
mark is found.

The contents of the counter is then the result of the function.

We can use the `String.map`

function, which takes a
user-supplied function of type **char
→ char** and returns a
new string, where each character is the result of the mapping function
on the character in the same place in the old string.

Notice that we have taken advantage of partial application to erase the last argument as usual.

Looking at the documentation for the `String`

module we
find the following:

So, by using the empty string as a separator, we have what we want:

We can use the functions `create`

,
`add_string`

, and `contents`

from the
`Buffer`

module together with the usual list iterator
`List.iter`

:

The initial size of the buffer, 100, is arbitrary.

We repeatedly check if the string we are looking for is right at the beginning of the string to be searched. If not, we chop one character off the string to be searched, and try again. Every time we find a match, we increment a counter.

You might consider that writing this function with lists of characters rather than strings would be easier. Unfortunately, it would be slow, and these kinds of searching tasks are often required to be very fast.

First, we extend the `Textstat`

module to allow
frequencies to be counted and expose it through the interface. Here is
the interface `q1.mli`

:

Here is its implementation `q1.ml`

:

Here is the main program:

We can write two little functions – one to read all the lines from a file, and one to write them. The main function, then, reads the command line to find the input and output file names, reads the lines from the input, reverses the list of lines, and writes them out. If a problem occurs, the exception is printed out. If the command line is badly formed, we print a usage message and exit.

Note that there is a problem if the file has no final newline – it will end up with one. How might you solve that?

We can simply do something (or nothing) a huge number of times using
a ** for** loop.

On many systems, typing `time`

followed by a space and the
usual command will print out on the screen how long the program took to
run. For example, on the author’s computer:

`$ ocamlc bigloop.ml -o bigloop`

`$ time ./bigloop`

`real 0m1.896s`

`user 0m1.885s`

`sys 0m0.005s`

`$ ocamlopt bigloop.ml -o bigloop`

`$ time ./bigloop`

`real 0m0.022s`

`user 0m0.014s`

`sys 0m0.003s`

You can see that, when compiled with `ocamlc`

, it takes
1.9s to run, but when compiled with `ocamlopt`

just
0.022s.

We can get all the lines in the file using our `getlines`

function from question two. The main function simply calls
`string_in_line`

on each line, printing it if
`true`

is returned.

The interesting function is `string_in_line`

. To see if
`term`

is in `line`

we start at position 0. The
condition for the term having been found is a combination of boolean
expressions. The first ensures that we are not so far through the string
that the expression could not possibly fit at the current position. The
second checks to see if the term is found at the current position by
using the function `String.sub`

from the OCaml Standard
Library. If not, we carry on.