The inner ear contains sensory epithelia that detect head movements, gravity and sound. It is unclear how to derive these sensory epithelia from pluripotent stem cells, a process that will be critical for modeling inner ear disorders or developing cell-based therapies for profound hearing loss and balance disorders 1,2. To date, attempts to derive inner ear mechanosensitive hair cells and sensory neurons have resulted in inefficient or incomplete phenotypic conversion of stem cells into inner ear-like cells 3–7. A key insight lacking from these previous studies is the importance of the non-neural and pre-placodal ectoderm, two critical precursors during inner ear development 8–11. Here we report the step-wise differentiation of inner ear sensory epithelia from mouse embryonic stem cells (ESCs) in three-dimensional culture 12,13. We show that by recapitulating in vivo development with precise temporal control of BMP, TGFβ and FGF signaling, ESC aggregates transform sequentially into non-neural, pre-placodal and otic placode-like epithelia. Remarkably, in a self-organized process that mimics normal development, vesicles containing prosensory cells emerge from the presumptive otic placodes and give rise to hair cells bearing stereocilia bundles and a kinocilium. Moreover, these stem cell-derived hair cells exhibit functional properties of native mechanosensitive hair cells and form specialized synapses with sensory neurons that have also arisen from ESCs in the culture. Finally, we demonstrate how these vesicles are structurally and biochemically comparable to developing vestibular end organs. Our data thus establish a novel in vitro model of inner ear differentiation that can be used to gain deeper insight into inner ear development and disorder.