Economists and biologists have proposed a distinction between two mechanisms, "strong" and "weak" reciprocity, that may explain the evolution of human sociality. Weak reciprocity theorists emphasize the benefits of long-term cooperation and the use of low-cost strategies to deter free-riders. Strong reciprocity theorists, in contrast, claim that cooperation in social dilemma games can be sustained by costly punishment mechanisms, even in one-shot and finitely repeated games. To support this claim, they have generated a large body of evidence concerning the willingness of experimental subjects to punish uncooperative free-riders at a cost to themselves. In this article, I distinguish between a "narrow" and a "wide" reading of the experimental evidence. Under the narrow reading, punishment experiments are just useful devices to measure psychological propensities in controlled laboratory conditions. Under the wide reading, they replicate a mechanism that also supports cooperation in "real-world" situations outside the laboratory. I argue that the wide interpretation must be tested using a combination of laboratory data and evidence about cooperation "in the wild." In spite of some often-repeated claims, there is no evidence that cooperation in the small egalitarian societies studied by anthropologists is enforced by means of costly punishment. Moreover, studies by economic and social historians show that social dilemmas in the wild are typically solved by institutions that coordinate punishment, reduce its cost, and extend the horizon of cooperation. The lack of field evidence for costly punishment suggests important constraints on what forms of cooperation can or cannot be sustained by means of decentralized policing.