Mental health diagnostics increasingly seeks biological markers that can be used alongside advanced machine learning approaches. However, it is difficult to identify a biological marker of disease when the traditional diagnostic labels themselves are not necessarily valid. To begin to address this, we worked with brain imaging data from over 1400 individuals, comprising healthy controls, psychosis patients, and their unaffected first-degree relatives, and we assumed that the diagnostic labelling process may be noisy. We detected label noise by classifying the data multiple times with a support vector machine classifier and flagging those individuals whom all classifiers unanimously assigned to a class other than their given label. We then assigned a new diagnostic label to these individuals, based on the biological data, using an iterative data cleansing approach. Simulation results showed that our method was highly accurate in identifying label noise. We further evaluated the method with a deep learning model, whose performance improved on the cleansed dataset. Both diagnostic and Biotype categories showed a large percentage of noisy labels, with the largest amount of relabeling occurring between healthy controls and individuals with bipolar disorder or schizophrenia, as well as among the unaffected close relatives. Extraction of imaging features highlighted regional brain changes associated with each group. In sum, this approach represents an initial step towards methods that need not assume existing mental health diagnostic categories are always valid, but instead allow us to leverage this information while acknowledging that there are mis-assignments.
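The unanimous-vote idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: to keep it dependency-free, a nearest-centroid classifier stands in for the support vector machine, the repeated cross-validation scheme is an assumption, and all function names and parameters (`flag_unanimous_mislabels`, `iterative_cleanse`, `n_runs`, `n_folds`, `max_rounds`) are hypothetical.

```python
import random
from collections import Counter


def fit_centroids(X, y):
    """Return {label: mean feature vector} for the training subjects."""
    sums, counts = {}, Counter(y)
    for features, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for j, value in enumerate(features):
            acc[j] += value
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict(centroids, features):
    """Assign the label whose centroid is nearest (squared Euclidean distance)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def flag_unanimous_mislabels(X, y, n_runs=10, n_folds=5, seed=0):
    """Flag subjects whose held-out prediction disagrees with the given label
    in every run. Returns {subject index: majority predicted label}.

    Assumption: the nearest-centroid model is a stand-in for an SVM, and
    repeated k-fold splitting is one possible way to "classify multiple times".
    """
    rng = random.Random(seed)
    n = len(X)
    votes = [[] for _ in range(n)]  # one held-out prediction per subject per run
    for _ in range(n_runs):
        order = list(range(n))
        rng.shuffle(order)
        for k in range(n_folds):
            held_out = order[k::n_folds]
            held_out_set = set(held_out)
            train = [i for i in order if i not in held_out_set]
            centroids = fit_centroids([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                votes[i].append(predict(centroids, X[i]))
    flagged = {}
    for i, preds in enumerate(votes):
        # Unanimity: no classifier ever agreed with the given label.
        if preds and all(p != y[i] for p in preds):
            flagged[i] = Counter(preds).most_common(1)[0][0]
    return flagged


def iterative_cleanse(X, y, max_rounds=5, **kwargs):
    """Relabel flagged subjects and repeat until nothing is flagged
    (or max_rounds is reached). Returns the cleansed label list."""
    labels = list(y)
    for _ in range(max_rounds):
        flagged = flag_unanimous_mislabels(X, labels, **kwargs)
        if not flagged:
            break
        for i, new_label in flagged.items():
            labels[i] = new_label
    return labels
```

Requiring unanimity rather than a simple majority is a conservative choice: a subject is relabeled only when no classifier trained without that subject ever agrees with the original diagnosis, which limits how often correctly labeled but atypical subjects are changed.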