In science, multiple measures of the same constructs can be useful, but they are unlikely to all be equally valid indicators. In psychological assessment, the many popular personality inventories available in the marketplace may also be useful, but their comparative validity has long remained unassessed. This is the first comprehensive comparison of 11 such multiscale instruments against each of three types of criteria: clusters of behavioral acts, descriptions by knowledgeable informants, and clinical indicators potentially associated with various types of psychopathology. Using 1,000 bootstrap resampling analyses of a sample of roughly 700 adult research participants, we assess the relative predictability of each criterion and the comparative validity of each inventory. Although criterion predictability varied widely, most inventories exhibited quite similar cross-validities when averaged across all three types of criteria. There were, however, important differences between inventories in their predictive capabilities for particular criteria. We discuss the factors that lead to differential validity across predictors and criteria.
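
To make the idea of a bootstrap-based cross-validity concrete, the following is a minimal sketch, not the procedure actually used in this study, whose details are not given above. It assumes that a criterion is regressed on an inventory's scale scores by ordinary least squares within each bootstrap sample, and that the resulting weights are then applied to the participants left out of that sample. The function names (`fit_ols`, `bootstrap_cross_validity`) and the out-of-bag design are illustrative assumptions.

```python
import random
from statistics import mean


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0


def fit_ols(X, y):
    """Least-squares weights (intercept first) from the normal equations.

    Assumes the scale scores are not perfectly collinear in the sample.
    """
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [u - f * v for u, v in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def predict(w, X):
    """Apply intercept-first weights w to each row of scale scores."""
    return [w[0] + sum(wi * xi for wi, xi in zip(w[1:], r)) for r in X]


def bootstrap_cross_validity(X, y, n_boot=1000, seed=0):
    """Mean out-of-bag correlation between predicted and observed criterion.

    Hypothetical design: weights are estimated on each bootstrap sample and
    evaluated only on participants not drawn into that sample.
    """
    rng = random.Random(seed)
    n = len(y)
    rs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        drawn = set(idx)
        oob = [i for i in range(n) if i not in drawn]
        if len(oob) < 3:  # too few held-out cases to correlate
            continue
        w = fit_ols([X[i] for i in idx], [y[i] for i in idx])
        pred = predict(w, [X[i] for i in oob])
        rs.append(pearson(pred, [y[i] for i in oob]))
    return mean(rs)
```

Under these assumptions, `X` would hold one row of scale scores per participant for a single inventory and `y` the corresponding criterion scores. Repeating the call for each inventory and criterion would give a table of cross-validities that could be averaged by criterion type or compared across inventories.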