The first step is data pre-processing to retrieve the sequences from the raw data. The second step is to encode the sequences using one-hot encoding to make the data readable for the network. The third step is the construction of the neural network model, and the last step is to classify each sequence as methylated or non-methylated.
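The encoding step can be illustrated with a minimal sketch. It assumes a four-letter RNA alphabet in the order A, C, G, U, maps T to U, and leaves unknown symbols as all-zero rows; the helper name `one_hot_encode` and these conventions are illustrative assumptions, not details taken from the paper.

```python
# Assumed nucleotide order; the paper does not state which order it uses.
NUCLEOTIDES = "ACGU"


def one_hot_encode(sequence):
    """Return one 4-element row per nucleotide (hypothetical helper)."""
    encoded = []
    # Normalise case and treat DNA-style T as U (an assumption).
    for base in sequence.upper().replace("T", "U"):
        row = [0] * len(NUCLEOTIDES)
        index = NUCLEOTIDES.find(base)
        if index >= 0:  # unknown symbols such as N stay all-zero
            row[index] = 1
        encoded.append(row)
    return encoded


example = one_hot_encode("GGACU")
for base, row in zip("GGACU", example):
    print(base, row)
```

Each sequence of length L becomes an L x 4 binary matrix, which is the numerical form a network can consume directly.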
The most common post-transcriptional modification, N6-methyladenosine (m6A), is associated with a number of crucial biological processes. The precise detection of m6A sites across the genome is critical for revealing its regulatory function and providing new insights into drug design. Although both experimental and computational methods for detecting m6A sites have been introduced, these conventional methods are laborious and expensive. Furthermore, only a handful of these models are capable of detecting m6A sites in various tissues. Therefore, a more generic and optimized computational method for detecting m6A sites in different tissues is required. In this paper, we propose a universal model based on a deep neural network (DNN), named TS-m6A-DL, which can classify m6A sites in several tissues of humans (Homo sapiens), mice (Mus musculus), and rats (Rattus norvegicus). To extract RNA sequence features and convert the input into a numerical format for the network, we used the one-hot encoding method. The model was tested using fivefold cross-validation, and its stability was assessed on independent datasets. The proposed model, TS-m6A-DL, achieved accuracies in the range of 75–85% using fivefold cross-validation and 72–84% on the independent datasets. Finally, to assess the generalization of the model, we performed cross-species testing, in which the model achieved state-of-the-art results.
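The fivefold cross-validation protocol mentioned above can be sketched as follows. This is a generic illustration under stated assumptions: the function name `five_fold_splits`, the shuffling seed, and the way indices are dealt into folds are hypothetical and do not reproduce the authors' exact splitting procedure.

```python
import random


def five_fold_splits(n_samples, n_folds=5, seed=0):
    """Yield (train_indices, validation_indices) for each fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducibility
    # Fold i takes every n_folds-th shuffled index starting at position i,
    # so fold sizes differ by at most one sample.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        validation = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, validation


splits = list(five_fold_splits(23))
for fold_number, (train, validation) in enumerate(splits, start=1):
    print(fold_number, len(train), len(validation))
```

Every sample is used for validation exactly once and for training in the remaining four folds; the reported cross-validation accuracy would then typically be summarized over the five validation folds.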