Which topics spark the most heated debates on social media? Identifying these topics is a first step towards creating systems that pierce echo chambers. In this paper, we perform the first systematic methodological study of controversy detection using social-media network structure and content. Unlike previous work, rather than identifying controversy in a single hand-picked topic and relying on domain-specific knowledge, we focus on comparing topics in any domain. Our approach to quantifying controversy is a graph-based three-stage pipeline, which involves (i) building a conversation graph about a topic, which represents alignment of opinion among users; (ii) partitioning the conversation graph to identify potential sides of the controversy; and (iii) measuring the amount of controversy from characteristics of the graph. We perform an extensive comparison of controversy measures, as well as of graph-building approaches and data sources. We use both controversial and non-controversial topics on Twitter, as well as other external datasets. We find that our new random-walk-based measure outperforms existing ones in capturing the intuitive notion of controversy, and show that content features are far less helpful for this task.
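
The three-stage pipeline can be illustrated with a minimal sketch. Everything below is an assumption made for illustration rather than the method evaluated in this paper: the function names are hypothetical, the graph is built from undirected endorsement pairs, the partitioner is a simple two-seed breadth-first heuristic standing in for a real graph-partitioning algorithm, and the score is a simplified fixed-length random-walk quantity, not the random-walk-based measure proposed here.

```python
from collections import deque


def build_graph(endorsements):
    """Stage (i): undirected conversation graph from (user, user) pairs.

    Assumption: each pair is an endorsement (for example a retweet).
    """
    adj = {}
    for u, v in endorsements:
        if u == v:
            continue  # ignore self-endorsements
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def bfs_distances(adj, source):
    """Hop distance from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adj[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def partition(adj):
    """Stage (ii): split nodes into two sides (illustrative heuristic only).

    Two far-apart seeds are found with a double BFS sweep; each node joins
    the closer seed, with ties going to the first side.
    """
    start = min(adj, key=lambda n: (-len(adj[n]), n))
    d0 = bfs_distances(adj, start)
    seed_y = max(d0, key=lambda n: (d0[n], n))
    dy = bfs_distances(adj, seed_y)
    seed_x = max(dy, key=lambda n: (dy[n], n))
    dx = bfs_distances(adj, seed_x)
    inf = float("inf")
    side_x = {n for n in adj if dx.get(n, inf) <= dy.get(n, inf)}
    return side_x, set(adj) - side_x


def stay_probability(adj, side, steps):
    """Probability that a walk started uniformly in `side` ends in `side`."""
    mass = {n: 1.0 / len(side) for n in side}
    for _ in range(steps):
        nxt = {}
        for node, p in mass.items():
            share = p / len(adj[node])
            for nb in adj[node]:
                nxt[nb] = nxt.get(nb, 0.0) + share
        mass = nxt
    return sum(p for n, p in mass.items() if n in side)


def controversy_score(adj, side_x, side_y, steps=5):
    """Stage (iii): simplified random-walk score (an assumption).

    Near 1 when walks rarely cross sides, near 0 when they mix freely.
    """
    if not side_x or not side_y:
        return 0.0
    p_xx = stay_probability(adj, side_x, steps)
    p_yy = stay_probability(adj, side_y, steps)
    return p_xx * p_yy - (1 - p_xx) * (1 - p_yy)
```

Under these assumptions, a graph made of two dense groups joined by a single edge scores well above a single fully connected group, whose score stays close to zero.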