At the lexical level, a typical human-computer dialogue in an aural-only spoken language system consists of two stages: system output and user input. As in human-human conversation, a large proportion of turn-taking cues are given by lapses in talk. Unfortunately, in telephone-based automated spoken dialogues, silences on the system's part are not so easily interpreted. A pilot experiment examined the recogniser's listening and processing states and showed that auditory icons representing these states caused fewer incorrect user responses than the control condition. However, where system prompts explicitly requested a response, icons were unnecessary provided that talkover was supported. The effectiveness of the auditory representations also interacted strongly with caller expertise, suggesting that expert users may require a period of acclimatisation to the sounds, since they tend to attend to them because of their novelty. Conversely, novice users with no prior experience responded correctly.