Myhill, Church and Others: Recollections and Views about Finite Automata#

Arto Salomaa#

I have worked in many areas of theoretical computer science, most often from a mathematical point of view. The area to which I have liked to return again and again is Automata Theory by which I mean mainly finite automata. I like to say something about it also on this Academia Europaea web page. Incidentally, Samuel Eilenberg always told me that "automata theory" is a wrong term, one does not say "groups theory". My presentation here has an informal and unsystematic character. The purpose is to present some views and recollections, also technical observations, about events somehow connected with finite automata. Note here the double meaning of the word event. Apart from "happening", it can mean also - as it did in old times - a "language", such as the language accepted by an automaton.


The famous red—cover Princeton book Automata Studies appeared in 1956. It was used by John Myhill in his seminar in Berkeley during the spring 1957. I was a student there, and my first acquaintance with finite automata and regular languages stems from those days. Myhill gave also some lectures himself. He was very impressive, to say the least, But he was also in some sense out of this world, and his behavior could be very irregular. Myhill’s speech was not easy to understand, at least not for me. It was referred to as the "Birmingham accent". My work in his seminar was about self-reproducing automata. The automata worked in the plane, where necessary components, such as power supplies, were randomly scattered around. The final construction was quite detailed, with some 30 states and instructions for each specific configuration, a welding operation and so forth.

No lecture notes were available for the seminar. However, later in 1957 Myhill published notes "Theorems on the Representation of Events" that became available also as a technical report. To see how things were viewed those days, I quote here a passage from the beginning of the notes.

The type of automata we shall consider are exemplified by combination locks, cash registers, digital computers. These automata have the following properties: they are capable of a certain finite number of states (both the internal mechanism and the and output): these states are discrete (i.e. the automaton passes abruptly from one state into another): what state the machine is in at any time depends only on its state immediately before and upon the immediately preceding input.

They will be called finite discrete deterministic automata. Here ”finite” refers to the extent of the machine in space - i.e. the machine is an assemblage of finitely many components each occupying a finite amount of space. This is in opposition to e.g. a Turing machine which is equipped with an infinitely long tape; each square of the tape counts as a component of the machine and can be either printed or not printed at any particular time. A Turing machine however is (like the automata under consideration) discrete, i.e. each part is capable only finitely many states, and each part passes abruptly from one state into another, this distinguishes discrete automata (finite or infinite) from such continuous automata as conventional analogue computers. Finally the machines we consider are deterministic rather than probabilistic; i.e. the internal state of the machine is determined completely (not merely with a certain probability) by its immediately preceding state and by the immediately preceding input. This represents an idealization of actually existing machines, which however is fairly well realized in some cases (combination locks, cash registers) less well in others (digital computers, dial telephones).

Rather than define separately the words "finite", "discrete", "deterministic", "automaton" (which words involve us in considerable problems), we reserve the word "automaton" to mean "finite discrete deterministic automaton".

In the actual formal definitions in the Myhill notes, an automaton is a quadruple consisting of the state set, input alphabet, initial state and transition function. Thus, the final state set is missing. The events represented (in current terminology, languages accepted) are specified by pairs (A,Q’), where A is an automaton and Q' is a subset of the state set of A. This means that several events can be represented in the same automaton; by varying the final state set. In those days this was the common practice in definitions. I followed it in my book Theory of Automata, written 1966-67.

Quite many of the early papers on automata, including papers by Myhill, A.Paz, B. Peleg and J.Giveon, were sponsored by programs of U.S. Air Force or U.S. Office of Naval Research. I am stating this just as a fact. I do not want to enter into intricate speculations concerning whether there were some specific expectations about the theory among certain circles, let alone the even more involved historical considerations about the eventual difficulties caused by such a sponsorship for researchers in autorriata theory in countries (certainly including Finland), where the official government politics regarded USAF as a rather unfriendly institution.

It is interesting to read also the ”Publication Review”’ following the Abstract of a paper by Myhill, written by LEONARD M. BUTSCH; JR., a major in the USAF:

The content of this report represents the scientific findings of an Air Force sponsored program. It does not direct any specific application thereof. The report is approved for publication to achieve an exchange and stimulation of ideas.


It is not customary to associate Alonzo Church with finite automata. Yet his invited one-hour lecture at the International Congress of Mathematicians in Stockholm 1962 was about

recent work in the application of mathematical logic to finite automata, and especially of mathematical logic beyond propositional calculus.

After Stockholm, Church visited also Finland, attending a conference on modal and many-valued logics. Later on, he visited Turku in the middle of 70's. Having studied quite much his monumental Introduction to Mathematical Logic, which appeared in 1956 and had also Volume One on its title page, I asked him whether Volume Two would come out one of these years. Church gave an answer that was not definite. As we know now, it never came out.

Church liked to write on blackboard when lecturing. His big, curly hand-writing was easy to follow. At the aforementioned ICM in 1962 the audience was very big, maybe more than a thousand people, This meant that Church had to use slides. Basically, he just read through the slides, This caused a friend of mine to remark that Church may well be the world’s greatest logician but is certainly not the world’s greatest lecturer!

I will quote below a few passages of the aforementioned Stockholm lecture. Church’s finite automata are pretty much the same as those of Myhill. However, Church has the additional logical (predicate calculus) aspect in a way similar as Biichi or Kobrinskij and Trakhtenbrot. In the early days of automata theory, Russian literature was quite strong. Church refers to the work of P. Ehrenfest, V.I. Shestakoff and S.A. Yanovskaya. The work of Trakhtenbrot was not known to Church at the time of the lecture but was added to a later version of his paper. The following excerpts give a pretty good view of Church’s definitions.

A finite automaton consists of a finite number of elements, each of which is capable of a finite number of different states. Time is measured in discrete instants, t = 0,1,2,..., beginning at an initial instant and extending indefinitely. The elements are distinguished as input elements, intermediate elements, and output elements. The states of the input elements, at any instant, constitute the input state, at that instant, similarly constitute the output state; and the states of the intermediate and output elements constitute the internal state. Except the input elements, whose states are imposed from outside, the state of any element at a given iristaiit is completely determined, in some non-circular way, by (1) the states of certain other elements at the given instant, and (2) the states of certain elements, itself and others, at instants which precede the given instant by at least 1 and by not more than a certain fixed maximum span h + 1.

(More fully, the foregoing statement is a definition of "discrete synchronous deterministic finite automaton with outputs". But the longer phrase is for the sake of distinction from a number of other notions with which we are not here concerned, and let us therefore say simply ”finite automaton” or "automaton".)

..., the sequence of input states from t = O to t = t0 is called an input sequence, and the sequence of output states from t = 0 to t = t0 is called an output sequence. The function which, for an input sequence from t = 0 to t = t0, as argument, has as value the output state at t = t0 is the behavior function. ... A class of input sequences is an event. An event is represented by an automaton if the set of values of the behavior function that belong to the event is disjoint from the set of values of the behavior function for other input sequences.

Here the notions and terminology concerning representation of events are strikingly similar to those used by Myhill. Observe also the point made already earlier: several events can be represented by the same automaton.

Church also introduced the language of regular expressions in the standard way. However, his main concern here was the development of a class of logical formulas describing the behavior of finite automata, in a way similar to the one applied by Buchi and Trakhtenbrot. Church was interested not only in applications of logic to finite automata, but also in the reverse application by which

the consideration of automata is made to yield results that belong to the field of mathematical logic.

It is well known that this approach has developed into a very big field of research, still active in our days.


There was a period in my life, about ten years around 1980, when I was very much involved in research with Hermann Maurer and Derick Wood. Much of the work we did was about or involved automata theory. Rather than going into specific details of the work from those days, I mention a recent result where I have been involved. It is very much in the “MSW spirit": I believe today we would be interested in theoretical matters like that, also very much related to Finite automata.

What I mean is the characterization of words by numerical quantities, such as the number of subword occurrences in a word. The specific result I have in mind could be called the Cauchy inequality for subwords. It resembles the well-known Cauchy inequality in algebra and analysis, and also the proof (omitted here) is somewhat similar.

The word u being a subword of a word w means that w, as a a sequence of letters, contains u as a subsequence. More formally, there exist words x1,..., xk and y0 ,.., , yk, some of them possibly empty, such that

u = x1...xk and w = y0x1y1...xk yk

We use the notation │wu for the number of occurrences of the word u as a subword of the word w. Occurrences can be viewed as vectors. If │u│ = t, each occurrence of u in w can be identified as the t-tuple (i1,...,it) of increasing positive integers, where for 1 ≤ jt, the jth letter of u is the ijth letter of w. For instance, the 5 occurrences of u = abc in w = abcbcacab are

(1,2,3), (1,2,5), (1,2,7), (1,4, 5), (1,4, 7),

We are now ready to state the Cauchy inequality for subwords:

wywxyz ≤ │wxywyz

It is valid for all words w,x,y, z. It can be claimed to be a really fundamental property of words, because of its generality and because it reduces to equality in a great variety of cases.

A Problem of a Long Open Standing#

I want to mention also an open problem about finite automata. This one dates already from the mid-60’s and is referred to as the Černý Conjecture.

Let (Q, Σ, δ) be a finite (deterministic) automaton. The cardinality of the state set Q is n, there are no specified initial or final states, The transition function δ is extended to concern words and subsets of Q. Thus, δ(Q1, w) stands for the set of states the automaton is in, after starting from some state in Q1 and reading w.

A word w is synchronizing if the set δ(Q, w) consists of one state only. An automaton is synchnizable if it has a synchronizing word. These notions are connected with the classical theory about experiments with finite automata. Even if you don’t know the state an automaton is in, you just apply a synchronizing word, after which you have a complete control of the situation. You can also view the graph of an automaton as a labyrinth, where you are lost. If you then follow the letters of a synchronizing word (and have the global knowledge of the graph of the automaton), you have found your way, The Černý Conjecture asserts that every synchronizable automaton has a synchronizing word of length at most (n — 1)2.

Only one general example, valid for all n, is known where the upper bound is actually reached, contrary to the upper bound, for instance, in the Cauchy inequality, Therefore, the upper bound does not seem natural to me, and I strongly believe that the Conjecture is wrong. Finding counterexamples is not easy since very little is known about the area around the upper bound.


There are nowadays many yearly conference series dealing with automata, some of them directed towards specific kinds of applications. Recently l have myself been quite much involved with so-called Holter automata. About the future I quote the famous saying by Turing:

We can see only a short distance ahead, but we can see plenty there that needs to be done.

Turku, Finland, April 2010
Arto Salomaa

Any further pages in alphabetic order of their title as created by you.

Just click at "Create new page", then type a short title and click OK, then add information on the empty page presented to you (including maybe a picture from your harddisk or a pdf-file by using the "Upload" Button) and finally click at "Save".

Imprint Privacy policy « This page (revision-60) was last changed on Monday, 31. May 2010, 14:40 by Maurer Hermann
  • operated by