In data from laser interferometric gravitational-wave detectors, transient noise with non-stationary and non-Gaussian features occurs at a high rate. It often destabilizes the detector and can mask or mimic gravitational-wave signals. Transient noise shows a variety of characteristics in the time–frequency representation, which are thought to reflect its environmental and instrumental origins. Classifying transient noise can therefore offer clues for identifying its origin and improving detector performance. One approach is supervised learning. In general, however, supervised learning requires annotated training data, and it is difficult to ensure objectivity in the classification and in the definition of new classes. By contrast, unsupervised learning can reduce the annotation work for the training data and ensure objectivity in the classification and in the definition of new classes. In this study, we propose an unsupervised learning architecture for the classification of transient noise that combines a variational autoencoder and invariant information clustering. To evaluate the effectiveness of the proposed architecture, we used the dataset (time–frequency two-dimensional spectrogram images and labels) from the first observation run of the Laser Interferometer Gravitational-wave Observatory (LIGO), prepared by the Gravity Spy project. The classes produced by the proposed architecture were consistent with the labels annotated by the Gravity Spy project, and the results indicate that previously unrevealed classes may exist.
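For orientation, the quantity that invariant information clustering maximizes can be sketched as follows. This is a minimal illustration of the standard mutual-information objective between paired soft cluster assignments, not the implementation used in this study. The function name `iic_mutual_information`, the argument names, and the `eps` smoothing are assumptions introduced here, and the inputs are assumed to be class-probability vectors computed from an input and from a transformed copy of it.

```python
import numpy as np

def iic_mutual_information(p, p_pair, eps=1e-12):
    """Mutual information between paired soft cluster assignments.

    p, p_pair: arrays of shape (n_samples, n_classes); each row is a
    probability vector. Names and eps are illustrative assumptions.
    """
    p = np.asarray(p, dtype=float)
    p_pair = np.asarray(p_pair, dtype=float)
    n = p.shape[0]

    # Empirical joint distribution over (class of input, class of pair).
    joint = p.T @ p_pair / n
    # Symmetrize, since the order within a pair carries no meaning.
    joint = (joint + joint.T) / 2.0
    # Avoid log(0), then renormalize so the entries sum to one.
    joint = np.clip(joint, eps, None)
    joint /= joint.sum()

    row = joint.sum(axis=1, keepdims=True)  # marginal of the first view
    col = joint.sum(axis=0, keepdims=True)  # marginal of the second view
    return float(np.sum(joint * (np.log(joint) - np.log(row) - np.log(col))))
```

In training, the negative of this value would serve as the loss. It is largest when the two views agree on confident assignments that are spread evenly across classes, and it is zero when the assignments carry no information.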