As high-speed computers and sophisticated software packages for data linkage become increasingly available, investigators from nearly every arena are creating massive databases for epidemiologic and comparative effectiveness research (CER). Decisions made during database construction have a major impact on the accuracy and completeness of the data. Because these databases may be used to inform health-care decisions, it is vital that we increase their transparency. This requires a thorough description of the record linkage strategy implemented and an evaluation of linked and unlinked records so that potential biases can be addressed.