The CD20 (B1) molecule is a differentiation Ag found only on the surface of B lymphocytes. This structurally unique phosphoprotein plays a role in the regulation of human B cell proliferation and differentiation. In this report, genomic DNA clones containing the human CD20 gene were isolated and the structure of the CD20 gene was determined. Southern blot analysis revealed that CD20 mRNA is transcribed from a single-copy gene. The CD20 gene is 16 kb long and is composed of eight exons. The first exon marks the major transcription initiation site, as determined by primer extension and S1 nuclease analysis. The translation initiation codon is located within the third exon. Exon VIII encodes the COOH terminus of the CD20 protein and the long 3' untranslated region. Three forms of CD20 mRNA were identified, all of which encode an identical protein product. The dominant 2.8-kb form results from usage of exons I through VIII, whereas a second form that is 263 bp shorter has exon I spliced into an internal 3' splice site within exon III, thereby skipping exon II. A minor 3.4-kb mRNA species most likely results from an uncharacterized upstream exon(s) splicing into an internal 3' splice site located in exon I. Nucleotide sequences of cDNA clones representative of each of these RNA forms are presented. The 5' splice site following exon V was found to diverge from the consensus splice sequence. A relationship between the individual peptides encoded by the six coding exons and structurally distinct regions of the CD20 protein is likely.