There are two widely divergent theories about the relation of speech to language. The more conventional view holds that the elements of speech are sounds that rely for their production and perception on two wholly separate processes, neither of which is distinctly linguistic. Accordingly, the primary motor and perceptual representations are inappropriate for linguistic purposes until a cognitive process of some sort has connected them to language and to each other. The less conventional view takes the speech elements to be articulatory gestures that are the primary objects of both production and perception. Those gestures form a natural class that serves a linguistic function and no other. Therefore, their representations are immediately linguistic, requiring no cognitive intervention to make them appropriate for use by the other components of the language system. The less conventional view provides the more plausible answers to three important questions: (1) How was the necessary parity between speaker and listener established in evolution, and how has it been maintained? (2) How does speech meet the special requirements that underlie its ability, unique among natural communication systems, to encode an indefinitely large number of meanings? (3) What biological properties of speech make it easier than the reading and writing of its alphabetic transcription?