      A novel biclustering approach with iterative optimization to analyze gene expression data

      Dove Medical Press
      pearson’s correlation coefficient, biclustering, microarray data, genetic algorithm


          Abstract

Background

The complete sequencing of the genomes of many organisms has led to the launch of various omics studies. Among these, the advent of deoxyribonucleic acid (DNA) microarray technology has enabled the expression levels of numerous genes to be monitored at a time, under many different growth conditions. This technique is now widely used in diverse types of biological research, such as identifying disease markers, reconstructing cellular signaling pathways, and inferring gene regulatory networks. DNA microarray technology has also provided numerous biological insights.1–3 Data generated from even a few array measurements are quite complex, and the amount of microarray data available in public databases is increasing dramatically, owing to the efficiency and rapid improvement of DNA microarray technologies. As a result, the interpretation of DNA microarray data obtained under a large number of conditions has become a challenging problem.

As the first step in analyzing a large dataset, researchers usually search for similar patterns within the data. In the case of DNA microarray data, similar patterns of gene expression are often investigated by using cluster analyses, such as K-means clustering4 and hierarchical clustering.5 Although clustering can provide considerable biological information, conventional clustering algorithms may not be suitable for some analyses of microarray data, for the following two reasons. Firstly, many genes encode proteins involved in several functional activities at a time, but conventional clustering methods cannot identify these genes, because they only allow a gene to belong to one cluster at a time, instead of multiple clusters.
Secondly, it is difficult to find genes that are co-expressed under a few specific conditions but differently expressed under other conditions, because the similarity of genes in conventional clustering is determined from the entire expression data.6,7 With respect to these shortcomings, biclustering is more effective than conventional clustering, since it clusters genes and conditions simultaneously, and a gene (or a condition) can be involved in multiple clusters at a time.7

The concept of biclustering was first proposed by Hartigan,8 and Cheng and Church9 applied it to search for the most homogeneously expressed genes over certain sets of conditions by using greedy search algorithms.9 Most biclustering algorithms have been implemented with greedy search algorithms,1,10,11 to reduce the calculation costs. Finding a maximum bicluster is known to be a nondeterministic polynomial time (NP)-complete problem, ie, one that is solvable in polynomial time on a nondeterministic Turing machine but for which no deterministic polynomial-time algorithm is known,12 so a greedy search algorithm is required in actual applications to provide efficient approximations. Usually, one greedy search results in one bicluster, and the greedy search is applied to the data repeatedly, while preventing the reproduction of similar biclusters, to obtain a set of various biclusters as the final output.

Biclustering has also been implemented by using a genetic algorithm (GA), to find a practical balance between bicluster quality and calculation cost. A GA emulates an evolutionary process to obtain nearly optimal solutions.13 Initially, a set of candidate solutions is prepared; each solution is called a chromosome. The chromosomes evolve by exchanging their parts and changing some elements into a different state, and elite chromosomes are selected to survive as the parents of the next generation.
This evolution and selection process is repeated over a number of generations to yield an optimal solution.13 Bleuler et al14 first applied GA to biclustering, employing a binary string (representing whether or not each gene or condition belongs to a bicluster) as the representation of a chromosome. To avoid redundancy among the resulting biclusters, Bleuler et al introduced a special selection operator called environment selection. Chakraborty and Maka15 developed a similar GA-based biclustering method that differs in chromosome initialization: the initial chromosomes are prepared by K-means clustering. These methods find an optimum set of biclusters from one GA search. With such methods, it would be difficult to obtain a set of various, nonredundant biclusters, because only the better chromosomes survive the selection process of GA, and thus the resulting biclusters tend to converge to similar results in the later generations.14,15

Another type of GA-based biclustering, Sequential Evolutionary Biclustering (SEBI), has a distinct strategy. SEBI initially applies GA to select the optimal bicluster, and then this process is repeated so that the genes and the conditions in the biclusters already selected are less likely to be selected again. Thus, although SEBI would generate a set of diverse biclusters, it de-emphasizes the overlap of biclusters, a significant feature of biclustering.16

In the present study, we propose BIGA as the basis of a novel biclustering approach. BIGA attempts to progressively divide the large input dataset into small datasets by applying GA iteratively, as SEBI does. Instead of evaluating a set of biclusters, GA is applied to each division process. Therefore, the resulting biclusters are substantially diverse. In addition, BIGA introduces an overlap state that is explicitly defined in a ternary digit (or trit) encoding of the chromosome.
In this study, the algorithm is described, the performance of BIGA is compared with those of six existing biclustering algorithms, and the biological relevance of BIGA is evaluated by using gene ontology (GO) enrichment analyses. Finally, we conclude that BIGA is a powerful and practical solution for biclustering with high-dimensional data.

Material and methods

Definition of biclusters

BIGA accepts a set of gene expression data in the matrix form $D = (G, C)$, comprising $N$ rows of genes $G = \{g_1, g_2, \ldots, g_N\}$ and $M$ columns of conditions or samples $C = \{c_1, c_2, \ldots, c_M\}$, where $N$ and $M$ are the total numbers of genes and conditions, respectively. All genes will be clustered into $K$ overlapping biclusters $B = \{B_1, B_2, \ldots, B_K\}$, and each bicluster $B_i$ corresponds to a submatrix $B_i = (X, Y)$ of $D$, where $X \subseteq G$ and $Y \subseteq C$. The sizes of $X$ and $Y$, ie, the numbers of genes and conditions of a bicluster, are denoted by $n$ and $m$, where $n \le N$ and $m \le M$, respectively.

Binary-iterative genetic algorithm

In order to decompose $D$ into $B$ systematically, a binary tree was introduced. Generally, a binary tree comprises nodes and directed edges, in which each node can be extended to at most two child nodes.17 In this work, we regard each bicluster as a node and each edge as a parent–child relationship between a pair of biclusters. We designated the method as BIGA. BIGA consists of the following three steps. A schematic diagram of BIGA is shown in Figure 1.

Step 1: A division of microarray data is represented by a string, a sequence of trits (0, 1, or 2) of length $n + m$, where $n$ and $m$ are the numbers of genes and conditions in the parent bicluster, respectively. The trits 0, 1, and 2 mean that the associated gene or condition is contained in $b_{left}$, in $b_{right}$, or in both, respectively. This means that one string can encode the division of one bicluster into two biclusters, while allowing overlap.
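The trit encoding of Step 1 can be illustrated with a minimal Python sketch. The function name `decode_division` and the list-based representation are hypothetical and not taken from the original implementation; the sketch also assumes that a child bicluster consists of the genes and conditions whose trit matches that child or equals 2 (both), which is one reading of the encoding described above.

```python
def decode_division(trits, genes, conditions):
    """Decode a trit string into two child biclusters (b_left, b_right).

    Assumed convention: 0 = left only, 1 = right only, 2 = both.
    The first len(genes) trits describe genes, the rest conditions.
    """
    n, m = len(genes), len(conditions)
    if len(trits) != n + m:
        raise ValueError("expected one trit per gene and per condition")
    if any(t not in (0, 1, 2) for t in trits):
        raise ValueError("trits must be 0, 1, or 2")

    gene_trits, cond_trits = trits[:n], trits[n:]

    def pick(items, codes, side):
        # Keep items assigned to this side or to both sides.
        return [item for item, t in zip(items, codes) if t in (side, 2)]

    b_left = (pick(genes, gene_trits, 0), pick(conditions, cond_trits, 0))
    b_right = (pick(genes, gene_trits, 1), pick(conditions, cond_trits, 1))
    return b_left, b_right


# Three genes and three conditions; g2 and c2 are shared by both children.
left, right = decode_division(
    [0, 2, 1, 0, 2, 1], ["g1", "g2", "g3"], ["c1", "c2", "c3"]
)
print(left)   # (['g1', 'g2'], ['c1', 'c2'])
print(right)  # (['g2', 'g3'], ['c2', 'c3'])
```

Under this reading, a matrix cell whose gene and condition fall on different sides belongs to neither child.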
An example of this encoding is shown in Figure 1A. The “|” symbol serves as a spacer between the genes and the conditions, for clarity. The string is equivalent to the division illustrated by the matrix (microarray data, or a bicluster) in the middle of Figure 1A. In the matrix, the rows and the columns correspond to the genes and the conditions, respectively. Each cell of the matrix belongs to $b_{left}$ (blue cell), $b_{right}$ (red), or both (violet), under the decoding rule shown in Figure 1B. The white cells are ignored because they are not co-expressed with the colored cells. Consequently, the bicluster shown in the middle of Figure 1A is divided into the two biclusters on the right of Figure 1A.

Step 2: To search for the best chromosome (the best trit string), representing the optimal division of a bicluster, GA is performed (rectangles in Figure 1C). In the GA procedure, a mutation and a crossover are introduced into each chromosome. In a mutation, a trit on a chromosome is changed to 0, 1, or 2; in a crossover, two chromosomes exchange corresponding parts with each other. Chromosomes with higher fitness scores (described in the following section) survive into the next generation, and all other chromosomes are discarded. GA was implemented with the Java Genetic Algorithms Package (JGAP),18 with a mutation rate of 0.01 and a crossover rate of 0.5. Finally, the best chromosome after 100 generations of GA (the underlined string in the rectangle) is selected, based on the fitness score (see the next section). The best chromosome is then decoded into two biclusters ($b_{left}$ and $b_{right}$). We decide whether to continue with further decompositions after the evaluation of the biclusters, as follows.

Step 3: Evaluation of biclusters. For each child bicluster, the numbers of genes and conditions, the average Pearson’s correlation coefficient (PCC), and the parent–child redundancy are examined to decide whether we should quit or continue the decomposition.
Subsequently, the bicluster is either accepted as an element of the final biclusters, $B$, or discarded. We calculate the PCC of every gene pair in a bicluster and average them (the average PCC). The parent–child redundancy is defined as the ratio of the number of genes of the child bicluster ($n'$) to that of the parent bicluster ($n$). Therefore, a small parent–child redundancy indicates that the child bicluster contains far fewer genes than the parent, and a large parent–child redundancy means that the number of genes in the child bicluster is almost the same as that of the parent. The average PCC and the parent–child redundancy are abbreviated as $C$ and $R$, respectively. The decision process is illustrated in Figure 1D. Briefly, the process employs four rules:

(I) We quit the decomposition and accept the bicluster if $C$ is higher than the threshold $\tau_c$.
(II) We quit the decomposition and discard the bicluster if the bicluster is “small,” which is judged by the thresholds $\tau_n$ and $\tau_m$ for $n'$ and $m'$, respectively.
(III) We also quit the decomposition and discard the bicluster if the redundancy, $R$, is small ($R < \tau_r$) or large ($R > 1 - \tau_r$). The large-redundancy condition was employed to reduce the calculation cost, because a child bicluster that is similar to its parent bicluster and has a low $C$ is not expected to produce promising results.
(IV) Otherwise, we continue the decomposition.

The four thresholds, $\tau_n$, $\tau_m$, $\tau_c$, and $\tau_r$, were empirically determined as 30, 10, 0.65, and 0.15, respectively (see Table S1). The Greek symbols in Figure 1D indicate the rule applied in each decision. In Figure 1C, the accepted and discarded biclusters are marked by + and – symbols, respectively. The bicluster to be decomposed is marked by a * symbol. Figure 1C indicates that four biclusters are accepted.

Fitness function

In general, large biclusters including co-expressed genes across many specific conditions are preferable.
The average PCC of a bicluster was employed to evaluate the gene co-expression. Furthermore, the relative area $A$ of the bicluster, defined by $A = (n'/n)^{\alpha}\,(m'/m)^{\beta}$ using the gene and condition numbers of the child ($n'$, $m'$) and parent ($n$, $m$) biclusters, was used to evaluate the size of a bicluster. Two parameters, the gene-weight ($\alpha$) and the condition-weight ($\beta$), were introduced to control the balance between the number of genes and that of the conditions ($0 < \alpha, \beta < 1$) in the relative area $A$. The fitness function of a chromosome was defined as follows (Equation 1):

$$f(c) = A(b_{left})\,C(b_{left}) + A(b_{right})\,C(b_{right}) \qquad (1)$$

where $c$, $b_i$ ($i$ = left or right), $A(b)$, and $C(b)$ denote a chromosome, one of the child biclusters, the relative area of child bicluster $b$, and the average PCC of child bicluster $b$, respectively. The balance between $\alpha$ and $\beta$ was important in order to select biologically meaningful biclusters when using $f(c)$. Since a high average PCC for a large number of genes is obtained rather easily when only a small number of conditions is considered, a certain number of conditions should be required for each bicluster, to ensure the biological significance. The values of $\alpha$ and $\beta$ were varied and evaluated empirically, and finally 0.3 and 0.5 were chosen, respectively (see the results in Table S1).

Assessment procedure

Six existing methods were compared to evaluate the performance of BIGA: the Cheng and Church algorithm (CC),9 Statistical-Algorithmic Method for Bicluster Analysis (SAMBA),19,20 order-preserving submatrix (OPSM),1 iterative signature algorithm (ISA),11 binary inclusion-maximal biclustering algorithm (BIMAX),21 and SEBI.16 SEBI was selected as a representative of the GA-based biclustering approaches,15,16 because SEBI adopts an outstanding system to reduce the redundancy of biclusters and performs iterative evolutionary searches like BIGA. The five other methods are based on greedy searches.
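As an illustration of the fitness function in Equation 1 above, the following Python sketch computes the relative area, the average PCC, and $f(c)$ for a candidate division. The function names, the list-of-rows data layout, and the choice to return 0.0 for a constant expression profile are assumptions made for this sketch, not details of the original JGAP-based implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant profile counts as uncorrelated
    return sxy / sqrt(sxx * syy)


def average_pcc(data, genes, conds):
    """Average PCC over all gene pairs, restricted to the given conditions."""
    profiles = [[data[g][c] for c in conds] for g in genes]
    pairs = list(combinations(profiles, 2))
    if not pairs:
        return 0.0
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def relative_area(n_child, m_child, n_parent, m_parent, alpha=0.3, beta=0.5):
    """A = (n'/n)^alpha * (m'/m)^beta, with the weights chosen in the text."""
    return (n_child / n_parent) ** alpha * (m_child / m_parent) ** beta


def fitness(data, parent, left, right, alpha=0.3, beta=0.5):
    """f(c) = A(b_left) C(b_left) + A(b_right) C(b_right) (Equation 1)."""
    n, m = len(parent[0]), len(parent[1])
    return sum(
        relative_area(len(genes), len(conds), n, m, alpha, beta)
        * average_pcc(data, genes, conds)
        for genes, conds in (left, right)
    )


# Toy matrix: rows are genes, columns are conditions (made-up values).
data = [[1, 2, 3, 4], [2, 4, 6, 8], [4, 3, 2, 1], [8, 6, 4, 2]]
parent = ([0, 1, 2, 3], [0, 1, 2, 3])
left = ([0, 1], [0, 1, 2, 3])   # two perfectly correlated genes
right = ([2, 3], [0, 1, 2, 3])  # two perfectly correlated genes
score = fitness(data, parent, left, right)
```

Because the exponents are smaller than 1, a child that keeps only a fraction of the parent's genes or conditions is penalized less than linearly, and the smaller gene weight (0.3 versus 0.5) makes losing conditions costlier than losing genes.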
Data provided by Gasch et al22 were used for the analyses of Saccharomyces cerevisiae. The dataset contained 2993 genes and 173 stress conditions; thus, the data size was large and abundant annotations were available. Prelic et al21 used this dataset to evaluate algorithms, and the resultant sets of biclusters for the five greedy-search algorithms are publicly available. These bicluster sets were obtained for comparison with our results. Neither the results of SEBI for the data nor SEBI itself is publicly available. The framework of SEBI was therefore re-implemented in a second experiment.16 Note that there might be some minor differences between SEBI and the re-implemented SEBI. Henceforth, we refer to our implementation as mySEBI.

The sets of biclusters were evaluated in terms of the following four points. Since PCC is a widely used parameter to assess the similarity of expression patterns, the distribution of the average PCC of all biclusters was examined. One may consider the mean square residual (MSR) of biclusters9 to be useful as an indicator of the coherence of biclusters, but in many biological datasets PCC is better than MSR at finding the functional relevance of genes,23–26 for example, involvement in the same pathway or participation in the same protein complex.27,28 The existing methods do not necessarily optimize the correlation of biclusters, and the bicluster sets derived from other algorithms can contain biclusters showing strong anti-correlation (ie, genes expressed inversely). The absolute value of PCC was used to evaluate such biclusters for comparisons.

Coverage and overlap are also important measures to evaluate the biclustering, as higher coverage and lower overlap are preferable for further biological analyses. Previous studies29 used “cell coverage,” calculated as the percentage of the area (genes × conditions) covered by the biclusters, and “cell overlap,” calculated by measuring the intersection areas of the biclusters.
In this study, “gene coverage” and “gene overlap” were adopted, because a high cell coverage can be achieved even with a high coverage of conditions and a low coverage of genes, and such a result is not biologically significant. In addition, cell overlap ignores the overlap of genes shared by any two biclusters if the conditions in the biclusters are completely different. Gene coverage is defined as the ratio of the genes that are assigned to any bicluster to all genes, and gene overlap is the ratio of the total genes overlapping in multiple biclusters to the genes assigned to any bicluster (Equation 2):

$$\text{Gene overlap} = \frac{\sum_{i=1}^{K} |X_i| - \left|\bigcup_{i=1}^{K} X_i\right|}{\left|\bigcup_{i=1}^{K} X_i\right|} \qquad (2)$$

where $X_i$ is the set of genes in bicluster $B_i$ and $K$ is the number of biclusters. Gene coverage can evaluate the ability of an algorithm to decide the cluster for each gene, and gene overlap can measure the ability of an algorithm to specify the clusters for genes that are not necessarily involved in multiple biological processes.

The biological significance of the results was also evaluated by measuring the GO enrichment. More precisely, FuncAssociate (2.0; Roth Laboratories, Harvard University, Boston, MA), a tool for finding overrepresented GO terms in a set of genes, was utilized. Using this tool, we performed Fisher’s exact test to determine the probability of the appearance of genes associated with a GO term in each bicluster.30 FuncAssociate calculates an adjusted P-value (Padj) from simulations, rather than by applying a multiple-testing correction. Padj is the probability of obtaining at least one false positive for any desired cutoff. We considered a biologically significant bicluster to be one that is relevant to at least one GO term with a statistically significant appearance (namely, Padj less than the significance level). The number of such biclusters, relative to the total number of biclusters (the GO enrichment), was used to evaluate each algorithm. A previous study by Prelic et al21 evaluated the biological relevance of existing algorithms, using the GO enrichment.
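The gene coverage and gene overlap measures defined above (Equation 2) can be sketched in a few lines of Python. The function names and the toy gene sets are hypothetical and serve only to illustrate the definitions; each bicluster is represented solely by its gene set $X_i$.

```python
def gene_coverage(gene_sets, n_genes_total):
    """Fraction of all genes that are assigned to at least one bicluster."""
    covered = set().union(*map(set, gene_sets))
    return len(covered) / n_genes_total


def gene_overlap(gene_sets):
    """Equation 2: (sum of |X_i| - |union of X_i|) / |union of X_i|."""
    covered = set().union(*map(set, gene_sets))
    if not covered:
        return 0.0  # assumption: no assigned genes means no overlap
    total_memberships = sum(len(set(x)) for x in gene_sets)
    return (total_memberships - len(covered)) / len(covered)


# Toy example: three biclusters drawn from a 10-gene dataset.
gene_sets = [{"g1", "g2", "g3"}, {"g2", "g3", "g4"}, {"g3", "g5"}]
coverage = gene_coverage(gene_sets, n_genes_total=10)  # 5 of 10 genes
overlap = gene_overlap(gene_sets)                      # (8 - 5) / 5
```

Read this way, the gene overlap plus one equals the average number of biclusters to which each covered gene belongs, so an overlap of 0 means that no gene is shared between biclusters.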
Results and discussion

Biclusters for the Saccharomyces cerevisiae microarray data

With the selected parameters and thresholds, BIGA found 164 biclusters from the S. cerevisiae microarray data. The average numbers of genes and conditions in the biclusters are 92.25 and 23.65, respectively (Table 1). The detailed statistics of each bicluster are provided in Table S2. The properties of the biclusters obtained by other methods are also summarized in Table 1.

Performance evaluation

The distribution of the average PCCs of the biclusters obtained by each biclustering algorithm is shown in the boxplot (Figure 2A). The thick line around the middle of the box indicates the median of the average PCCs. The top and bottom of the box indicate the upper and the lower quartiles, respectively. The circles show the outliers (more than 1.5 times the upper quartile or less than 1.5 times the lower quartile from the median). The whiskers indicate the range of the data between the maximum and the minimum values, excluding the outliers. According to the plots, OPSM performs the best, with a very small deviation in the average PCCs. Apart from OPSM, BIGA outperforms the other methods when compared by the median of the average PCC. One might argue that, because the fitness function of BIGA takes the average PCC into account (Equation 1), the high average PCC of BIGA is unsurprising. However, note that the results are not necessarily satisfactory if the optimization procedure does not work well, or if the balance between the average PCC and the area of the bicluster in Equation 1 is inappropriate. Next, we used the Wilcoxon signed-rank test to examine whether the distribution of the average PCCs of BIGA is significantly better than those of the other algorithms.31 The results showed that BIGA detects significantly more co-expressed genes in biclusters than the other methods, except for OPSM (the highest P-value is only $5.4 \times 10^{-6}$, against SAMBA).
To clarify the performance, the expression profiles of the four best biclusters, with the highest average PCCs, are shown in Figure S1. Note that the reason for the highest performance of OPSM is related to the gene coverage, which is discussed below.

The gene coverage and the gene overlap are shown in Figure 2B and 2C, respectively. BIGA achieved the fourth-highest gene coverage among the seven algorithms (Figure 2B). SAMBA could classify almost 100% of the genes into biclusters, but each bicluster contained more than 900 genes (Table 1) with extremely high overlap (Figure 2C), which will make the succeeding experimental or bioinformatics analyses difficult. mySEBI could produce a set of biclusters that included 95% of all genes with a small amount of overlap. The Cheng and Church algorithm (CC) showed the best gene coverage (highest) and overlap (lowest). The results indicate that the techniques to reduce the redundancy of biclusters in SEBI and CC are efficient for gaining high coverage and low overlap. However, the average PCCs of the biclusters produced by both algorithms were very low (Figure 2A). OPSM produced biclusters with the highest correlation (Figure 2A), but failed to achieve higher gene coverage, due to the small number of clusters (Table 1). The average PCCs of OPSM and BIGA are high, because both methods adopt gene co-expression in the target function. By contrast, CC and SEBI adopt MSR instead of PCC. Although MSR can sometimes identify coherent biclusters, it is not necessarily efficient for achieving higher correlations of genes.

BIGA yielded the second-largest gene overlap, 6.29 (Figure 2C), which may imply that the biclusters of BIGA are mutually similar. To examine the similarity of the biclusters more directly, the pairwise overlap (PO) of two biclusters, defined by $PO = |X_i \cap X_j| / |X_i \cup X_j|$, where $X_i$ and $X_j$ are the sets of genes in biclusters $B_i$ and $B_j$, respectively, was measured and plotted in Figure 3A.
The median of the POs for BIGA was not very large, as compared with those of the other methods, indicating that the biclusters determined by BIGA are not necessarily similar. Moreover, the variety of the biclusters was investigated by using the single-linkage clustering method, with the distance between two biclusters defined as 1.0 − PO. At each cut-off distance, the number of clusters was counted and normalized by the total number of biclusters; we call this value the fraction of independent biclusters (FIB). When the cut-off distance is sufficiently small, no biclusters are merged and the FIB is 1.0. This state indicates that the biclusters are independent and diverse. On the other hand, when the cut-off distance is sufficiently large, most of the biclusters may be merged together, and the FIB will converge to 0.0. This state means that all of the biclusters are judged as being similar to each other. We consider a higher FIB to be an indicator of the variety of the resultant biclusters. According to the plot (Figure 3B), the FIBs of SAMBA and ISA are obviously low over almost the whole cut-off distance range, showing that their biclusters are rather similar. The FIBs of OPSM show that its ability to detect diverse biclusters is moderate. CC, mySEBI, BIMAX, and BIGA provided a wider variety of biclusters than the other algorithms when the cut-off distance was less than 0.5.

In summary, the average bicluster determined by BIGA contains many genes that are shared with other biclusters (Figure 2C); however, when focusing on each pair of biclusters, only a small number of genes are shared (Figure 3A). Consequently, the biclusters determined by BIGA seem to be independent (Figure 3B), and cover most of the genes efficiently (Figure 2B).
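The pairwise overlap and the fraction of independent biclusters discussed in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions: single-linkage clustering at a fixed cut-off is computed as the connected components of the graph that links two biclusters whenever 1.0 − PO is at most the cut-off, and the function names and the toy gene sets are hypothetical rather than taken from the original analysis.

```python
from itertools import combinations


def pairwise_overlap(x_i, x_j):
    """PO = |X_i intersect X_j| / |X_i union X_j| for two gene sets."""
    x_i, x_j = set(x_i), set(x_j)
    union = x_i | x_j
    return len(x_i & x_j) / len(union) if union else 0.0


def fraction_independent(gene_sets, cutoff):
    """Clusters at the given cut-off divided by the number of biclusters."""
    k = len(gene_sets)
    if k == 0:
        raise ValueError("at least one bicluster is required")
    parent = list(range(k))  # union-find forest over bicluster indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Single linkage: merge any pair whose distance is within the cut-off.
    for i, j in combinations(range(k), 2):
        if 1.0 - pairwise_overlap(gene_sets[i], gene_sets[j]) <= cutoff:
            parent[find(i)] = find(j)

    return len({find(i) for i in range(k)}) / k


gene_sets = [{"a", "b", "c"}, {"b", "c", "d"}, {"x", "y"}]
po = pairwise_overlap(gene_sets[0], gene_sets[1])  # 2 shared of 4 genes
fib_small = fraction_independent(gene_sets, 0.1)   # nothing merges
fib_mid = fraction_independent(gene_sets, 0.5)     # first two merge
fib_large = fraction_independent(gene_sets, 1.0)   # everything merges
```

With this normalization the smallest attainable value is 1/K rather than exactly 0.0, which approaches 0.0 when the number of biclusters K is large.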
Evaluation of biological relevance by gene ontology enrichment analyses

In the study by Prelic et al21 on the evaluation of existing methods using GO enrichment, OPSM showed the best performance (100% of the biclusters were significant at the 0.05 significance level). However, it only produced twelve biclusters (Table 1), and thus the gene coverage was the lowest (Figure 2B). Less than half of the biclusters produced by CC were judged to be significant,21 probably because CC cannot detect biclusters with a higher average PCC (Figure 2A). The percentages of significant biclusters from mySEBI are 93%, 81%, 69%, and 42% at the 0.05, 0.01, 0.005, and 0.001 significance levels, respectively. By contrast, 94.5% of the biclusters produced by BIGA were judged to be significant at the 0.05 significance level. This value changed to 88.4%, 86.0%, and 79.3% at the 0.01, 0.005, and 0.001 significance levels, respectively. The performance of BIGA is almost the same as those of BIMAX and ISA in GO enrichment,21 but BIGA outperforms them in gene coverage (Figure 2B).

The enriched GO terms at the 0.001 significance level revealed functional relationships among the biclusters produced by BIGA. Among the 122 GO-enriched terms, ribosome-related terms (ribosome GO:0005840, ribosomal subunit GO:0033279, etc) are abundant in many biclusters (50 biclusters). This observation is consistent with the fact that 60% of transcription is devoted to ribosomal ribonucleic acid (RNA),32 because genes with higher expression levels tend to be clustered. Apart from the ribosome-related terms, primary metabolic (GO:0044238), translation (GO:0006412), protein-related (GO:0044267, GO:0019538), macromolecule-related (GO:0009059, GO:0034645, GO:0044260, GO:0043170), and biopolymer-related (GO:0043283, GO:0034960, GO:0043284, GO:0034961) processes also frequently appeared in several biclusters.
This indicates that the genes involved in these terms are primary or essential in many biological processes. For each bicluster, the five GO terms that are most enriched at the 0.001 significance level and the five most specific GO terms among them are shown in Table S2.

Furthermore, the novel aspects of the biclusters identified by BIGA were examined. For each bicluster defined by BIGA, the PO against all biclusters identified by the other five methods was measured, and the maximum PO was derived (Table S2). The highest value of the maximum POs was at most 0.12, indicating that the biclusters defined by BIGA are quite different from those determined by the other methods. To explore the relationships of the genes that were detected only by BIGA, we examined the biclusters of BIGA that were not similar to any of the other biclusters; that is, the biclusters with maximum pairwise similarity scores < 0.05. In bicluster 109 (maximum PO = 0.039, with bicluster 29 of CC), 16 out of 86 genes, eg, SAS3 (YBL052C), TEF2 (YBR118W), and SWD3 (YBR175W), are involved in a cellular nitrogen metabolic process (GO:0034641) and are co-expressed under twelve conditions. In bicluster 118 (0.037, with bicluster 56 of CC), 26 out of 66 genes, eg, RRN6 (YBL014C), ORC2 (YBR060C), and PAF1 (YBR279W), are involved in an RNA metabolic process (GO:0016070). In bicluster 160 (0.037, with bicluster 24 of ISA), 33 out of 74 genes, such as HEK2 (YBL032W), ROX3 (YBL093C), and SIF2 (YBR103W), are related to a nucleic acid metabolic process (GO:0090304). These results demonstrate that BIGA is useful for revealing the functional relevance underlying the biclusters. Furthermore, some genes belonged to the same bicluster even though they lacked known co-functional evidence (see the biclusters in Table S2 without significant GO terms). These genes represent promising experimental targets that bridge biological processes exhibiting co-expression under specific conditions.
Conclusion

The development of biclustering algorithms has allowed biologists to start unraveling the underlying functional mechanisms in living organisms. We propose BIGA as an alternative biclustering technique, since it was designed to address the conventional problems of the pre-existing methods. Biclustering is obviously advantageous in accounting for the overlap state among clusters, but the suitable amount of overlap is still ambiguous, and different algorithms often produce solutions with various degrees of overlap. We tried to develop a novel chromosome-encoding mode that explicitly defines the overlap between biclusters. BIGA revealed that the most frequently appearing genes express their functions in fundamental and essential biological processes, such as translation. A microarray often consists of relatively few conditions, with respect to a large number of genes. The weighting of genes and conditions diminishes the bias between the numbers of genes and conditions, which helps to eliminate unreliable results, such as biclusters with very few conditions. We also applied an alternative index, the average PCC, which reflects the biological meaning, rather than the MSR, to measure the goodness of a bicluster. The analysis of GO enrichment demonstrated that most of our biclusters were significant, with one or more enriched GO terms. When evaluated against the five pre-existing algorithms, BIGA performed well in most of the properties with good balance, although it did not show the best performance for all criteria. A pairwise comparison of our biclusters with those obtained by the other algorithms revealed the novel aspects of the biclusters that are distinct from those of the other methods. Since biological systems are quite complicated, resulting in high-dimensional data, it is quite difficult to answer all biological questions with a single approach. For new discoveries, we recommend the application of several approaches, including BIGA.
Supplementary data

Table S1 Parameter determination

The columns from Genes to Overlap describe the goodness of the biclusters.

Parameter | Value | Genes | Conditions | Correlation | Biclusters | Coverage | Overlap
α | 0.1 | 72.15 | 22.84 | 0.74 | 111 | 0.59 | 3.53
α | 0.3 | 92.25 | 23.65 | 0.71 | 164 | 0.69 | 6.29
α | 0.5 | 102.22 | 24.42 | 0.7 | 252 | 0.67 | 11.82
τ r | 0.1 | 81.22 | 21.51 | 0.73 | 355 | 0.74 | 11.97
τ r | 0.15 | 92.25 | 23.65 | 0.71 | 164 | 0.69 | 6.29
τ r | 0.2 | 109.86 | 25.07 | 0.69 | 57 | 0.58 | 2.59
τ r | 0.25 | 128.13 | 32.5 | 0.71 | 8 | 0.22 | 0.53
τ r | 0.3 | 163 | 45 | 0.67 | 1 | 0.05 | 0
τ c | 0.60 | 100.62 | 22.17 | 0.69 | 145 | 0.71 | 5.9
τ c | 0.65 | 92.25 | 23.65 | 0.71 | 164 | 0.69 | 6.29
τ c | 0.70 | 83.84 | 22.69 | 0.74 | 178 | 0.61 | 7.09

Notes: (A) Impact of the gene-weight parameter α on the goodness of biclusters (τ n = 30, τ m = 10, τ c = 0.65, τ r = 0.15, and β = 0.5). (B) Impact of the redundancy threshold τ r on the goodness of biclusters (τ n = 30, τ m = 10, τ c = 0.65, and α = 0.3, β = 0.5). (C) Impact of the correlation threshold τ c on the goodness of biclusters (τ n = 30, τ m = 10, τ r = 0.15, and α = 0.3, β = 0.5).

Figure S1 Expression profiles of biclusters 1 (A), 2 (B), 3 (C), and 4 (D), in descending order of the average Pearson’s correlation coefficient. Note: The x-axis represents the series of conditions; eg, the number 8 denotes the 8th condition.
Table S2 Detailed statistics of resulting biclusters (sorted by descending order of average PCC)

Bicluster ID | Number of genes | Number of conditions | Average PCC | Minimum adjusted P-value of GO enrichment | Number of enriched GO terms | Five most significant GO terms | Five most specific GO terms | Highest pairwise similarity score
1 | 47 | 10 | 0.87 | <0.001 | 2 | GO:0003674 molecular_function; GO:0032991 macromolecular complex | – | 0.044
2 | 74 | 28 | 0.81 | <0.001 | 3 | GO:0003674 molecular_function; GO:0032991 macromolecular complex; GO:0043234 protein complex | – | 0.067
3 | 85 | 21 | 0.80 | <0.001 | 14 | GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0022618 ribonucleoprotein complex assembly | GO:0007114 cell budding; GO:0022618 ribonucleoprotein complex assembly; GO:0032505 reproduction of a single-celled organism; GO:0042257 ribosomal subunit assembly; GO:0043933 macromolecular complex subunit organization | 0.070
4 | 71 | 32 | 0.80 | | 12 | GO:0030529 ribonucleoprotein complex; GO:0032991 macromolecular complex; GO:0005840 ribosome; GO:0044445 cytosolic part; GO:0006412 translation | GO:0022625 cytosolic large ribosomal subunit | 0.093
5 | 74 | 18 | 0.80 | 0.001 | 1 | GO:0005737 cytoplasm | GO:0005737 cytoplasm | 0.050
6 | 50 | 7 | 0.80 | – | 0 | – | – | 0.043
7 | 79 | 24 | 0.80 | <0.001 | 8 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0005840 ribosome | GO:0009072 aromatic amino acid family metabolic process | 0.073
8 | 52 | 16 | 0.79 | – | 0 | – | – | 0.032
9 | 56 | 4 | 0.79 | – | 0 | – | – | 0.041
10 | 87 | 21 | 0.79 | <0.001 | 5 | GO:0003674 molecular_function; GO:0006412 translation; GO:0009987 cellular process; GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process | GO:0044249 cellular biosynthetic process; GO:0009058 biosynthetic process | 0.068
11 | 72 | 20 | 0.79 | <0.001 | 5 | GO:0032991 macromolecular complex; GO:0003674 molecular_function; GO:0009987 cellular process; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | – | 0.060
12 | 78 | 26 | 0.79 | <0.001 | 6 | GO:0032040 small-subunit processome; GO:0030686 90S preribosome; GO:0042254 ribosome biogenesis; GO:0030684 preribosome; GO:0022613 ribonucleoprotein complex biogenesis | GO:0032040 small-subunit processome; GO:0022613 ribonucleoprotein complex biogenesis; GO:0042254 ribosome biogenesis; GO:0030684 preribosome; GO:0030686 90S preribosome | 0.074
13 | 74 | 14 | 0.79 | <0.001 | 1 | GO:0003674 molecular_function | – | 0.048
14 | 83 | 33 | 0.78 | <0.001 | 19 | GO:0044445 cytosolic part; GO:0006412 translation; GO:0022625 cytosolic large ribosomal subunit; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | GO:0015934 large ribosomal subunit; GO:0022625 cytosolic large ribosomal subunit; GO:0044249 cellular biosynthetic process; GO:0009058 biosynthetic process | 0.080
15 | 86 | 23 | 0.78 | <0.001 | 2 | GO:0003674 molecular_function; GO:0032991 macromolecular complex | – | 0.056
16 | 49 | 18 | 0.78 | <0.001 | 10 | GO:0044238 primary metabolic process; GO:0016070 RNA metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0043283 biopolymer metabolic process; GO:0030529 ribonucleoprotein complex | GO:0008152 metabolic process; GO:0016070 RNA metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0044237 cellular metabolic process | 0.059
17 | 92 | 23 | 0.78 | <0.001 | 12 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0034621 cellular macromolecular complex subunit organization | GO:0034621 cellular macromolecular complex subunit organization; GO:0034660 ncRNA metabolic process; GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”; GO:0016070 RNA metabolic process; GO:0044237 cellular metabolic process | 0.072
18 | 77 | 25 | 0.78 | <0.001 | 4 | GO:0003674 molecular_function; GO:0044445 cytosolic part; GO:0009987 cellular process; GO:0032991 macromolecular complex | – | 0.050
19 | 77 | 21 | 0.78 | <0.001 | 5 | GO:0003674 molecular_function; GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0030529 ribonucleoprotein complex; GO:0015935 small ribosomal subunit | GO:0015935 small ribosomal subunit | 0.062
20 | 59 | 12 | 0.78 | <0.001 | 1 | GO:0044238 primary metabolic process | – | 0.046
21 | 84 | 30 | 0.77 | <0.001 | 10 | GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0005840 ribosome | GO:0005737 cytoplasm | 0.073
22 | 53 | 11 | 0.77 | 0.001 | 1 | GO:0044238 primary metabolic process | – | 0.058
23 | 81 | 28 | 0.77 | <0.001 | 11 | GO:0032991 macromolecular complex; GO:0043283 biopolymer metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0043234 protein complex; GO:0043170 macromolecule metabolic process | GO:0051246 regulation of protein metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0032268 regulation of cellular protein metabolic process; GO:0043234 protein complex | 0.059
24 | 61 | 21 | 0.77 | – | 0 | – | – | 0.039
25 | 82 | 13 | 0.77 | <0.001 | 1 | GO:0003674 molecular_function | – | 0.045
26 | 103 | 24 | 0.76 | <0.001 | 9 | GO:0044238 primary metabolic process; GO:0003674 molecular_function; GO:0009987 cellular process; GO:0005840 ribosome; GO:0003735 structural constituent of ribosome; GO:0045182 translation regulator activity | GO:0003743 translation initiation factor activity; GO:0045182 translation regulator activity; GO:0008135 “translation factor activity, nucleic acid binding”; GO:0032268 regulation of cellular protein metabolic process; GO:0043234 protein complex | 0.077
27 | 93 | 27 | 0.76 | <0.001 | 19 | GO:0044238 primary metabolic process; GO:0003735 structural constituent of ribosome; GO:0009987 cellular process; GO:0005840 ribosome; GO:0003735 structural constituent of ribosome | GO:0015935 small ribosomal subunit; GO:0008152 metabolic process; GO:0043229 intracellular organelle; GO:0043226 organelle; GO:0022627 cytosolic small ribosomal subunit | 0.098
28 
65 11 0.76 <0.001 1 GO:0003674 molecular_function – 0.045 29 78 32 0.76 <0.001 2 GO:0003674 molecular_functionGO:0032991 macromolecular complex – 0.077 30 62 19 0.76 <0.001 6 GO:0009058 biosynthetic processGO:0044249 cellular biosynthetic processGO:0044238 primary metabolic processGO:0032991 macromolecular complexGO:0044445 cytosolic part GO:0044249 cellular biosynthetic processGO:0009058 biosynthetic process 0.056 31 89 19 0.76 <0.001 12 GO:0009058 biosynthetic processGO:0044249 cellular biosynthetic processGO:0043284 biopolymer biosynthetic processGO:0009059 macromolecule biosynthetic processGO:0044238 primary metabolic process GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”GO:0034961 cellular biopolymer biosynthetic processGO:0034645 cellular macromolecule biosynthetic processGO:0016070 RNA metabolic processGO:0009059 macromolecule biosynthetic process 0.063 32 91 30 0.76 <0.001 10 GO:0017111 nucleoside-triphosphatase activityGO:0016462 pyrophosphatase activityGO:0016817 “hydrolase activity, acting on acid anhydrides”GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides”GO:0044238 primary metabolic process GO:0017111 nucleoside-triphosphatase activityGO:0016462 pyrophosphatase activityGO:0016817 “hydrolase activity, acting on acid anhydrides”GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides”GO:0034470 ncRNA processing 0.081 33 105 34 0.76 <0.001 8 GO:0009058 biosynthetic processGO:0032991 macromolecular complexGO:0009987 cellular processGO:0006412 translationGO:0044445 cytosolic part GO:0009058 biosynthetic process 0.098 34 105 28 0.75 <0.001 16 GO:0032991 macromolecular complexGO:0044267 cellular protein metabolic processGO:0006412 translationGO:0009987 cellular processGO:0043234 protein complex GO:0044444 cytoplasmic partGO:0044424 intracellular partGO:0043234 protein complexGO:0009058 biosynthetic process 0.088 35 110 25 0.75 <0.001 29 
GO:0032991 macromolecular complexGO:0016070 RNA metabolic processGO:0044238 primary metabolic processGO:0009987 cellular processGO:0005198 structural molecule activity GO:0019438 aromatic compound biosynthetic processGO:0006396 RNA processingGO:0034470 ncRNA processingGO:0034660 ncRNA metabolic processGO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process” 0.085 36 66 16 0.75 <0.001 8 GO:0032991 macromolecular complexGO:0003735 structural constituent of ribosomeGO:0033279 ribosomal subunitGO:0005198 structural molecule activityGO:0006412 translation GO:0022627 cytosolic small ribosomal subunit 0.069 37 71 10 0.75 0.001 1 GO:0044085 cellular component biogenesis GO:0044085 cellular component biogenesis 0.068 38 59 14 0.74 <0.001 3 GO:0003674 molecular_functionGO:0005198 structural molecule activityGO:0032991 macromolecular complex – 0.040 39 58 16 0.74 <0.001 13 GO:0044249 cellular biosynthetic processGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0009058 biosynthetic processGO:0043284 biopolymer biosynthetic process GO:0000462 “maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”GO:0030490 maturation of SSU-rRNAGO:0034961 cellular biopolymer biosynthetic processGO:0034645 cellular macromolecule biosynthetic processGO:0022627 cytosolic small ribosomal subunit 0.048 40 83 36 0.74 <0.001 8 GO:0044445 cytosolic partGO:0006412 translationGO:0043229 intracellular organelleGO:0043226 organelleGO:0043228 nonmembrane-bounded organelle GO:0043229 intracellular organelleGO:0043226 organelle 0.076 41 78 23 0.74 <0.001 5 GO:0032991 macromolecular complexGO:0043234 protein complexGO:0003674 molecular_functionGO:0044238 primary metabolic processGO:0009987 cellular process GO:0043234 protein complex 0.069 42 113 26 0.74 <0.001 23 GO:0044445 cytosolic partGO:0030529 ribonucleoprotein complexGO:0005198 structural molecule activityGO:0033279 ribosomal subunitGO:0006412 
translation GO:0006913 nucleocytoplasmic transportGO:0051169 nuclear transportGO:0005622 intracellularGO:0005737 cytoplasmGO:0010608 posttranscriptional regulation of gene expression 0.080 43 90 22 0.74 <0.001 18 GO:0032991 macromolecular complexGO:0022627 cytosolic small ribosomal subunitGO:0030684 preribosomeGO:0030686 90S preribosomeGO:0030529 ribonucleoprotein complex GO:0044249 cellular biosynthetic processGO:0009058 biosynthetic process 0.081 44 89 25 0.74 <0.001 6 GO:0044238 primary metabolic processGO:0032991 macromolecular complexGO:0009987 cellular processGO:0034621 cellular macromolecular complex subunit organizationGO:0016070 RNA metabolic process GO:0034621 cellular macromolecular complex subunit organizationGO:0016070 RNA metabolic process 0.061 45 92 28 0.74 <0.001 8 GO:0019538 protein metabolic processGO:0044267 cellular protein metabolic processGO:0032268 regulation of cellular protein metabolic processGO:0005737 cytoplasmGO:0051246 regulation of protein metabolic process GO:0005737 cytoplasmGO:0010608 posttranscriptional regulation of gene expressionGO:0051246 regulation of protein metabolic processGO:0006417 regulation of translationGO:0032268 regulation of cellular protein metabolic process 0.057 46 106 28 0.74 <0.001 12 GO:0009987 cellular processGO:0006412 translationGO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0044238 primary metabolic process GO:0022627 cytosolic small ribosomal subunit 0.089 47 106 36 0.74 <0.001 14 GO:0030529 ribonucleoprotein complexGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0005840 ribosomeGO:0032991 macromolecular complex GO:0016462 pyrophosphatase activityGO:0016817 “hydrolase activity, acting on acid anhydrides”GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides” 0.100 48 109 25 0.74 <0.001 23 GO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0044445 cytosolic partGO:0009987 cellular 
processGO:0005840 ribosome GO:0005622 intracellularGO:0022625 cytosolic large ribosomal subunitGO:0010608 posttranscriptional regulation of gene expressionGO:0051246 regulation of protein metabolic processGO:0006417 regulation of translation 0.083 49 99 27 0.74 <0.001 24 GO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0005840 ribosomeGO:0005198 structural molecule activityGO:0006412 translation GO:0034961 cellular biopolymer biosynthetic processGO:0034645 cellular macromolecule biosynthetic processGO:0022627 cytosolic small ribosomal subunitGO:0034960 cellular biopolymer metabolic processGO:0009059 macromolecule biosynthetic process 0.082 50 89 24 0.73 <0.001 10 GO:0030529 ribonucleoprotein complexGO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0005840 ribosomeGO:0043228 nonmembrane-bounded organelle GO:0005488 binding 0.074 51 86 15 0.73 <0.001 3 GO:0003674 molecular_functionGO:0009987 cellular processGO:0000166 nucleotide binding GO:0000166 nucleotide binding 0.065 52 141 35 0.73 <0.001 18 GO:0006412 translationGO:0032991 macromolecular complexGO:0009058 biosynthetic processGO:0009987 cellular processGO:0044249 cellular biosynthetic process GO:0006082 organic acid metabolic processGO:0019752 carboxylic acid metabolic processGO:0005737 cytoplasmGO:0009059 macromolecule biosynthetic processGO:0043284 biopolymer biosynthetic process 0.119 53 107 31 0.73 <0.001 20 GO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0005198 structural molecule activity GO:0007010 cytoskeleton organizationGO:0015935 small ribosomal subunitGO:0022627 cytosolic small ribosomal subunitGO:0006417 regulation of translationGO:0032268 regulation of cellular protein metabolic process 0.062 54 68 24 0.73 0.001 6 GO:0009987 cellular processGO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0043229 intracellular organelleGO:0043226 organelle 
GO:0043229 intracellular organelleGO:0043226 organelle 0.045 55 128 26 0.73 <0.001 21 GO:0032991 macromolecular complexGO:0006412 translationGO:0044267 cellular protein metabolic processGO:0019538 protein metabolic processGO:0044238 primary metabolic process GO:0016043 cellular component organizationGO:0065007 biological regulationGO:0050789 regulation of biological processGO:0050794 regulation of cellular processGO:0009059 macromolecule biosynthetic process 0.089 56 101 32 0.73 <0.001 15 GO:0032991 macromolecular complexGO:0030529 ribonucleoprotein complexGO:0044445 cytosolic partGO:0009987 cellular processGO:0005840 ribosome GO:0022625 cytosolic large ribosomal subunitGO:0044424 intracellular part 0.099 57 107 32 0.73 <0.001 11 GO:0032991 macromolecular complexGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0044238 primary metabolic processGO:0009987 cellular process GO:0043170 macromolecule metabolic process 0.091 58 111 33 0.72 <0.001 11 GO:0032991 macromolecular complexGO:0009987 cellular processGO:0019538 protein metabolic processGO:0006412 translationGO:0043228 nonmembrane-bounded organelle GO:0043234 protein complex 0.099 59 92 27 0.72 <0.001 11 GO:0009987 cellular processGO:0044238 primary metabolic processGO:0032991 macromolecular complexGO:0032268 regulation of cellular protein metabolic processGO:0044445 cytosolic part GO:0010608 posttranscriptional regulation of gene expressionGO:0016070 RNA metabolic processGO:0051246 regulation of protein metabolic processGO:0006417 regulation of translationGO:0044424 intracellular part 0.106 60 111 33 0.72 <0.001 7 GO:0032991 macromolecular complexGO:0009987 cellular processGO:0044445 cytosolic partGO:0044238 primary metabolic processGO:0006412 translation – 0.078 61 76 15 0.72 <0.001 2 GO:0003674 molecular_functionGO:0009987 cellular process – 0.050 62 94 20 0.72 <0.001 6 GO:0032991 macromolecular complexGO:0032268 regulation of cellular protein metabolic 
processGO:0044238 primary metabolic processGO:0051246 regulation of protein metabolic processGO:0009987 cellular process GO:0051246 regulation of protein metabolic processGO:0032268 regulation of cellular protein metabolic process 0.057 63 83 24 0.72 <0.001 13 GO:0022627 cytosolic small ribosomal subunitGO:0032991 macromolecular complexGO:0015935 small ribosomal subunitGO:0044445 cytosolic partGO:0030686 90S preribosome GO:0030686 90S preribosomeGO:0015935 small ribosomal subunitGO:0044422 organelle partGO:0044446 intracellular organelle partGO:0022627 cytosolic small ribosomal subunit 0.083 64 126 28 0.72 <0.001 39 GO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0044238 primary metabolic processGO:0005840 ribosomeGO:0030529 ribonucleoprotein complex GO:0015934 large ribosomal subunitGO:0044464 cell partGO:0034961 cellular biopolymer biosynthetic processGO:0034645 cellular macromolecule biosynthetic processGO:0022625 cytosolic large ribosomal subunit 0.094 65 45 12 0.72 – 0 – – 0.045 66 100 32 0.72 <0.001 8 GO:0005198 structural molecule activityGO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0006412 translationGO:0009987 cellular process – 0.080 67 124 29 0.72 <0.001 15 GO:0032991 macromolecular complexGO:0043234 protein complexGO:0009058 biosynthetic processGO:0009987 cellular processGO:0043284 biopolymer biosynthetic process GO:0010608 posttranscriptional regulation of gene expressionGO:0006417 regulation of translationGO:0009059 macromolecule biosynthetic processGO:0043284 biopolymer biosynthetic processGO:0044424 intracellular part 0.097 68 111 37 0.72 <0.001 9 GO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0006412 translationGO:0009987 cellular processGO:0043228 nonmembrane-bounded organelle – 0.099 69 51 21 0.71 – 0 – – 0.059 70 106 30 0.71 <0.001 21 GO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0044267 cellular protein metabolic processGO:0019538 protein metabolic processGO:0005198 structural 
molecule activity GO:0034960 cellular biopolymer metabolic processGO:0009059 macromolecule biosynthetic processGO:0043284 biopolymer biosynthetic processGO:0044260 cellular macromolecule metabolic processGO:0043234 protein complex 0.065 71 46 12 0.71 – 0 – – 0.047 72 126 36 0.71 <0.001 17 GO:0009987 cellular processGO:0044238 primary metabolic processGO:0016462 pyrophosphatase activityGO:0016817 “hydrolase activity, acting on acid anhydrides”GO:0016818 “hydrolase activity, acting on acid anhydrides, in phosphorus-containing anhydrides” GO:0017076 purine nucleotide bindingGO:0032553 ribonucleotide bindingGO:0032555 purine ribonucleotide bindingGO:0000166 nucleotide bindingGO:0017111 nucleoside-triphosphatase activity 0.101 73 87 25 0.71 <0.001 8 GO:0032991 macromolecular complexGO:0009987 cellular processGO:0044238 primary metabolic processGO:0030529 ribonucleoprotein complexGO:0016070 RNA metabolic process GO:0016070 RNA metabolic processGO:0043170 macromolecule metabolic process 0.070 74 112 30 0.71 <0.001 18 GO:0032991 macromolecular complexGO:0006412 translationGO:0044238 primary metabolic processGO:0044424 intracellular partGO:0009058 biosynthetic process GO:0010468 regulation of gene expressionGO:0010556 regulation of macromolecule biosynthetic processGO:0010608 posttranscriptional regulation of gene expressionGO:0006417 regulation of translationGO:0044424 intracellular part 0.085 75 116 31 0.71 <0.001 13 GO:0032991 macromolecular complexGO:0005198 structural molecule activityGO:0044445 cytosolic partGO:0044238 primary metabolic processGO:0009987 cellular process GO:0005737 cytoplasmGO:0043234 protein complex 0.093 76 68 14 0.71 <0.001 7 GO:0022627 cytosolic small ribosomal subunitGO:0015935 small ribosomal subunitGO:0006412 translationGO:0044445 cytosolic partGO:0003735 structural constituent of ribosome GO:0015935 small ribosomal subunitGO:0022627 cytosolic small ribosomal subunit 0.074 77 86 20 0.71 <0.001 3 GO:0003674 molecular_functionGO:0022627 cytosolic 
small ribosomal subunitGO:0032991 macromolecular complex GO:0022627 cytosolic small ribosomal subunit 0.052 78 104 39 0.71 <0.001 23 GO:0032991 macromolecular complexGO:0030529 ribonucleoprotein complexGO:0006412 translationGO:0044238 primary metabolic processGO:0008135 “translation factor activity, nucleic acid binding” GO:0003743 translation initiation factor activityGO:0045182 translation regulator activityGO:0008135 “translation factor activity, nucleic acid binding”GO:0016070 RNA metabolic processGO:0034960 cellular biopolymer metabolic process 0.108 79 90 23 0.71 <0.001 9 GO:0006412 translationGO:0044267 cellular protein metabolic processGO:0019538 protein metabolic processGO:0032991 macromolecular complexGO:0005840 ribosome – 0.060 80 108 36 0.71 <0.001 7 GO:0032991 macromolecular complexGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0005198 structural molecule activityGO:0030529 ribonucleoprotein complex – 0.078 81 90 24 0.71 <0.001 11 GO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0022625 cytosolic large ribosomal subunitGO:0044238 primary metabolic processGO:0043283 biopolymer metabolic process GO:0022625 cytosolic large ribosomal subunitGO:0043234 protein complexGO:0043170 macromolecule metabolic process 0.067 82 106 33 0.71 <0.001 21 GO:0044238 primary metabolic processGO:0034960 cellular biopolymer metabolic processGO:0009987 cellular processGO:0043283 biopolymer metabolic processGO:0044260 cellular macromolecule metabolic process GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”GO:0008152 metabolic processGO:0043229 intracellular organelleGO:0043226 organelleGO:0034960 cellular biopolymer metabolic process 0.084 83 129 31 0.71 <0.001 18 GO:0032991 macromolecular complexGO:0006412 translationGO:0005198 structural molecule activityGO:0005840 ribosomeGO:0044445 cytosolic part GO:0005488 bindingGO:0005622 intracellularGO:0022625 cytosolic large ribosomal 
subunitGO:0044422 organelle partGO:0044446 intracellular organelle part 0.091 84 129 28 0.71 <0.001 22 GO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0009058 biosynthetic processGO:0044249 cellular biosynthetic processGO:0006412 translation GO:0009059 macromolecule biosynthetic processGO:0043284 biopolymer biosynthetic processGO:0044424 intracellular partGO:0044237 cellular metabolic processGO:0044249 cellular biosynthetic process 0.098 85 77 38 0.71 <0.001 12 GO:0030529 ribonucleoprotein complexGO:0044445 cytosolic partGO:0032991 macromolecular complexGO:0033279 ribosomal subunitGO:0043228 nonmembrane-bounded organelle GO:0005622 intracellularGO:0022625 cytosolic large ribosomal subunit 0.074 86 109 28 0.70 <0.001 6 GO:0009987 cellular processGO:0006412 translationGO:0044445 cytosolic partGO:0044238 primary metabolic process 0.090 87 78 21 0.70 0.001 8 GO:0010468 regulation of gene expressionGO:0010556 regulation of macromolecule biosynthetic processGO:0060255 regulation of macromolecule metabolic processGO:0031326 regulation of cellular biosynthetic processGO:0009889 regulation of biosynthetic process GO:0019222 regulation of metabolic processGO:0060255 regulation of macromolecule metabolic processGO:0009889 regulation of biosynthetic processGO:0031323 regulation of cellular metabolic processGO:0031326 regulation of cellular biosynthetic process 0.055 88 100 24 0.70 <0.001 19 GO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0006412 translationGO:0005840 ribosomeGO:0003735 structural constituent of ribosome GO:0044422 organelle partGO:0044446 intracellular organelle partGO:0044260 cellular macromolecule metabolic processGO:0044237 cellular metabolic process 0.073 89 82 24 0.70 <0.001 13 GO:0044445 cytosolic partGO:0006417 regulation of translationGO:0010608 posttranscriptional regulation of gene expressionGO:0032268 regulation of cellular protein metabolic processGO:0051246 regulation of protein metabolic process GO:0009889 
regulation of biosynthetic processGO:0031323 regulation of cellular metabolic processGO:0031326 regulation of cellular biosynthetic processGO:0010468 regulation of gene expressionGO:0010556 regulation of macromolecule biosynthetic process 0.060 90 77 27 0.70 <0.001 5 GO:0003674 molecular_functionGO:0005198 structural molecule activityGO:0009987 cellular processGO:0044238 primary metabolic processGO:0044445 cytosolic part – 0.050 91 97 22 0.70 <0.001 17 GO:0044445 cytosolic partGO:0006412 translationGO:0032991 macromolecular complexGO:0009987 cellular processGO:0044238 primary metabolic process GO:0009059 macromolecule biosynthetic processGO:0043284 biopolymer biosynthetic processGO:0044249 cellular biosynthetic process 0.088 92 110 28 0.70 <0.001 6 GO:0009987 cellular processGO:0032991 macromolecular complexGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelle – 0.090 93 94 29 0.70 <0.001 15 GO:0032991 macromolecular complexGO:0006417 regulation of translationGO:0010608 posttranscriptional regulation of gene expressionGO:0032268 regulation of cellular protein metabolic processGO:0051246 regulation of protein metabolic process GO:0005083 small GTPase regulator activityGO:0030695 GTPase regulator activityGO:0005737 cytoplasmGO:0010608 posttranscriptional regulation of gene expressionGO:0051246 regulation of protein metabolic process 0.067 94 113 34 0.70 <0.001 32 GO:0009058 biosynthetic processGO:0044249 cellular biosynthetic processGO:0006412 translationGO:0009987 cellular processGO:0044238 primary metabolic process GO:0019222 regulation of metabolic processGO:0060255 regulation of macromolecule metabolic processGO:0009889 regulation of biosynthetic processGO:0031323 regulation of cellular metabolic processGO:0031326 regulation of cellular biosynthetic process 0.075 95 94 23 0.70 <0.001 4 GO:0006412 translationGO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelle – 0.062 96 104 31 0.70 
<0.001 10 GO:0006412 translationGO:0009987 cellular processGO:0044238 primary metabolic processGO:0016070 RNA metabolic processGO:0034660 ncRNA metabolic process GO:0034470 ncRNA processingGO:0034660 ncRNA metabolic processGO:0016070 RNA metabolic process 0.107 97 51 13 0.70 <0.001 1 GO:0003674 molecular_function – 0.043 98 154 32 0.70 <0.001 14 GO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0005840 ribosomeGO:0032991 macromolecular complexGO:0030529 ribonucleoprotein complex GO:0005575 cellular_componentGO:0044464 cell partGO:0010556 regulation of macromolecule biosynthetic processGO:0005737 cytoplasmGO:0010608 posttranscriptional regulation of gene expression 0.115 99 117 30 0.70 <0.001 11 GO:0005622 intracellularGO:0009987 cellular processGO:0044238 primary metabolic processGO:0006412 translationGO:0019538 protein metabolic process GO:0005622 intracellularGO:0022627 cytosolic small ribosomal subunitGO:0032268 regulation of cellular protein metabolic process 0.100 100 110 28 0.70 <0.001 8 GO:0009987 cellular processGO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0044445 cytosolic partGO:0009058 biosynthetic process GO:0044249 cellular biosynthetic process 0.092 101 139 34 0.70 <0.001 16 GO:0005198 structural molecule activityGO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0006412 translationGO:0009987 cellular process GO:0044422 organelle partGO:0044446 intracellular organelle partGO:0051246 regulation of protein metabolic processGO:0006417 regulation of translationGO:0032268 regulation of cellular protein metabolic process 0.100 102 98 28 0.69 <0.001 48 GO:0032991 macromolecular complexGO:0044238 primary metabolic processGO:0006412 translationGO:0043284 biopolymer biosynthetic processGO:0005840 ribosome GO:0006333 chromatin assembly or disassemblyGO:0006446 regulation of translational initiationGO:0003743 translation initiation factor activityGO:0019222 regulation of metabolic 
processGO:0045182 translation regulator activity 0.085 103 71 18 0.69 <0.001 1 GO:0003674 molecular_function – 0.053 104 105 21 0.69 <0.001 5 GO:0008150 biological_processGO:0009987 cellular processGO:0003674 molecular_functionGO:0032991 macromolecular complexGO:0043234 protein complex GO:0043234 protein complex 0.058 105 140 32 0.69 <0.001 16 GO:0032991 macromolecular complexGO:0043234 protein complexGO:0044238 primary metabolic processGO:0009987 cellular processGO:0044445 cytosolic part GO:0005575 cellular_componentGO:0044464 cell partGO:0010608 posttranscriptional regulation of gene expressionGO:0043226 organelleGO:0051246 regulation of protein metabolic process 0.098 106 41 12 0.69 – 0 – – 0.035 107 101 25 0.69 <0.001 24 GO:0044238 primary metabolic processGO:0005198 structural molecule activityGO:0032991 macromolecular complexGO:0005840 ribosomeGO:0044445 cytosolic part GO:0034645 cellular macromolecule biosynthetic processGO:0022625 cytosolic large ribosomal subunitGO:0034960 cellular biopolymer metabolic processGO:0044260 cellular macromolecule metabolic processGO:0009059 macromolecule biosynthetic process 0.080 108 99 21 0.69 <0.001 9 GO:0032991 macromolecular complexGO:0019538 protein metabolic processGO:0044267 cellular protein metabolic processGO:0044238 primary metabolic processGO:0006412 translation GO:0044424 intracellular part 0.080 109 86 12 0.69 <0.001 7 GO:0044267 cellular protein metabolic processGO:0009987 cellular processGO:0019538 protein metabolic processGO:0032991 macromolecular complexGO:0043229 intracellular organelle GO:0043229 intracellular organelleGO:0043226 organelle 0.039 110 118 30 0.69 <0.001 17 GO:0043228 nonmembrane-bounded organelleGO:0043232 intracellular nonmembrane-bounded organelleGO:0044445 cytosolic partGO:0032991 macromolecular complexGO:0009987 cellular process GO:0044422 organelle partGO:0044446 intracellular organelle partGO:0044424 intracellular part 0.093 111 98 15 0.69 <0.001 5 GO:0016043 cellular component 
organizationGO:0009987 cellular processGO:0006996 organelle organizationGO:0032991 macromolecular complexGO:0008150 biological_process GO:0006996 organelle organizationGO:0016043 cellular component organization 0.041 112 157 43 0.69 <0.001 38 GO:0044238 primary metabolic processGO:0030529 ribonucleoprotein complexGO:0009987 cellular processGO:0032991 macromolecular complexGO:0006412 translation GO:0015934 large ribosomal subunitGO:0030686 90S preribosomeGO:0044464 cell partGO:0034961 cellular biopolymer biosynthetic processGO:0015935 small ribosomal subunit 0.108 113 116 34 0.68 <0.001 21 GO:0009058 biosynthetic processGO:0032991 macromolecular complexGO:0044249 cellular biosynthetic processGO:0006412 translationGO:0009987 cellular process GO:0000105 histidine biosynthetic processGO:0006547 histidine metabolic processGO:0009075 histidine family amino acid metabolic processGO:0009076 histidine family amino acid biosynthetic processGO:0009059 macromolecule biosynthetic process 0.084 114 69 13 0.68 0.001 1 GO:0009987 cellular process – 0.053 115 96 21 0.68 <0.001 5 GO:0003674 molecular_functionGO:0009987 cellular processGO:0022627 cytosolic small ribosomal subunitGO:0044267 cellular protein metabolic processGO:0019538 protein metabolic process GO:0022627 cytosolic small ribosomal subunit 0.050 116 38 9 0.68 – 0 – – 0.041 117 109 30 0.68 <0.001 9 GO:0032991 macromolecular complexGO:0044445 cytosolic partGO:0009987 cellular processGO:0006412 translationGO:0005198 structural molecule activity GO:0043234 protein complex 0.076 118 66 17 0.68 0.001 1 GO:0009987 cellular process – 0.037 119 104 27 0.68 <0.001 5 GO:0003674 molecular_functionGO:0009987 cellular processGO:0044445 cytosolic partGO:0032991 macromolecular complexGO:0008150 biological_process – 0.072 120 122 36 0.68 <0.001 38 GO:0044238 primary metabolic processGO:0032991 macromolecular complexGO:0033279 ribosomal subunitGO:0008152 metabolic processGO:0043283 biopolymer metabolic process GO:0022613 
ribonucleoprotein complex biogenesis; GO:0042254 ribosome biogenesis; GO:0044085 cellular component biogenesis; GO:0034961 cellular biopolymer biosynthetic process; GO:0015935 small ribosomal subunit | 0.097
121 | 74 | 16 | 0.68 | 0.001 | 8 | GO:0022627 cytosolic small ribosomal subunit; GO:0044445 cytosolic part; GO:0032991 macromolecular complex; GO:0043332 mating projection tip; GO:0044463 cell projection part | GO:0043332 mating projection tip; GO:0044463 cell projection part; GO:0022627 cytosolic small ribosomal subunit | 0.089
122 | 126 | 38 | 0.68 | <0.001 | 35 | GO:0032991 macromolecular complex; GO:0044445 cytosolic part; GO:0006412 translation; GO:0009987 cellular process; GO:0043234 protein complex | GO:0008135 “translation factor activity, nucleic acid binding”; GO:0034961 cellular biopolymer biosynthetic process; GO:0034645 cellular macromolecule biosynthetic process; GO:0043229 intracellular organelle; GO:0044422 organelle part | 0.106
123 | 83 | 18 | 0.68 | <0.001 | 3 | GO:0003674 molecular_function; GO:0043234 protein complex; GO:0032991 macromolecular complex | GO:0043234 protein complex | 0.053
124 | 119 | 31 | 0.67 | <0.001 | 8 | GO:0032991 macromolecular complex; GO:0006412 translation; GO:0009987 cellular process; GO:0005488 binding; GO:0044422 organelle part | GO:0005488 binding; GO:0044422 organelle part; GO:0044446 intracellular organelle part | 0.093
125 | 133 | 41 | 0.67 | <0.001 | 27 | GO:0009987 cellular process; GO:0032991 macromolecular complex; GO:0033279 ribosomal subunit; GO:0044238 primary metabolic process; GO:0006412 translation | GO:0015935 small ribosomal subunit; GO:0043229 intracellular organelle; GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0043226 organelle | 0.092
126 | 132 | 25 | 0.67 | <0.001 | 18 | GO:0044238 primary metabolic process; GO:0016070 RNA metabolic process; GO:0044237 cellular metabolic process; GO:0009987 cellular process; GO:0008152 metabolic process | GO:0031125 rRNA 3′-end processing; GO:0043628 ncRNA 3′-end processing; GO:0034660 ncRNA metabolic process; GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”; GO:0008152 metabolic process | 0.080
127 | 57 | 14 | 0.67 | – | 0 | – | – | 0.042
128 | 51 | 18 | 0.67 | <0.001 | 1 | GO:0003674 molecular_function | – | 0.044
129 | 77 | 25 | 0.67 | <0.001 | 5 | GO:0009987 cellular process; GO:0043933 macromolecular complex subunit organization; GO:0034621 cellular macromolecular complex subunit organization; GO:0003674 molecular_function; GO:0034622 cellular macromolecular complex assembly | GO:0034622 cellular macromolecular complex assembly; GO:0043933 macromolecular complex subunit organization; GO:0034621 cellular macromolecular complex subunit organization | 0.048
130 | 75 | 22 | 0.67 | <0.001 | 4 | GO:0044238 primary metabolic process; GO:0016070 RNA metabolic process; GO:0003674 molecular_function; GO:0043283 biopolymer metabolic process | GO:0016070 RNA metabolic process | 0.067
131 | 106 | 26 | 0.67 | <0.001 | 6 | GO:0003674 molecular_function; GO:0043229 intracellular organelle; GO:0032991 macromolecular complex; GO:0043226 organelle; GO:0044238 primary metabolic process | GO:0043229 intracellular organelle; GO:0043226 organelle | 0.076
132 | 133 | 25 | 0.67 | <0.001 | 21 | GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0044445 cytosolic part; GO:0005198 structural molecule activity | GO:0005488 binding; GO:0005622 intracellular; GO:0044424 intracellular part; GO:0044249 cellular biosynthetic process | 0.097
133 | 128 | 35 | 0.67 | <0.001 | 22 | GO:0032991 macromolecular complex; GO:0006412 translation; GO:0005198 structural molecule activity; GO:0005840 ribosome; GO:0043284 biopolymer biosynthetic process | GO:0034961 cellular biopolymer biosynthetic process; GO:0034645 cellular macromolecule biosynthetic process; GO:0009059 macromolecule biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0043234 protein complex | 0.096
134 | 107 | 28 | 0.67 | <0.001 | 19 | GO:0005198 structural molecule activity; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0044445 cytosolic part; GO:0005840 ribosome | GO:0005737 cytoplasm; GO:0015935 small ribosomal subunit; GO:0022627 cytosolic small ribosomal subunit; GO:0032268 regulation of cellular protein metabolic process | 0.074
135 | 109 | 24 | 0.66 | <0.001 | 17 | GO:0009058 biosynthetic process; GO:0044238 primary metabolic process; GO:0044249 cellular biosynthetic process; GO:0032991 macromolecular complex; GO:0043284 biopolymer biosynthetic process | GO:0003676 nucleic acid binding; GO:0006139 “nucleobase, nucleoside, nucleotide and nucleic acid metabolic process”; GO:0008152 metabolic process; GO:0006417 regulation of translation; GO:0009059 macromolecule biosynthetic process | 0.078
136 | 72 | 16 | 0.66 | <0.001 | 9 | GO:0000462 “maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”; GO:0030490 maturation of SSU-rRNA; GO:0022627 cytosolic small ribosomal subunit; GO:0006412 translation; GO:0043228 nonmembrane-bounded organelle | GO:0000462 “maturation of SSU-rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”; GO:0030490 maturation of SSU-rRNA; GO:0022627 cytosolic small ribosomal subunit | 0.050
137 | 113 | 24 | 0.66 | <0.001 | 11 | GO:0044238 primary metabolic process; GO:0030529 ribonucleoprotein complex; GO:0005840 ribosome; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | GO:0008152 metabolic process; GO:0044237 cellular metabolic process | 0.080
138 | 48 | 12 | 0.66 | – | 0 | – | – | 0.033
139 | 58 | 13 | 0.66 | – | 0 | – | – | 0.041
140 | 135 | 37 | 0.66 | <0.001 | 14 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | GO:0005488 binding; GO:0044424 intracellular part; GO:0044237 cellular metabolic process; GO:0043170 macromolecule metabolic process | 0.101
141 | 103 | 21 | 0.66 | <0.001 | 10 | GO:0009987 cellular process; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle; GO:0043229 intracellular organelle; GO:0043226 organelle | GO:0065007 biological regulation; GO:0050789 regulation of biological process; GO:0050794 regulation of cellular process; GO:0043229 intracellular organelle; GO:0043226 organelle | 0.063
142 | 164 | 32 | 0.66 | <0.001 | 26 | GO:0044238 primary metabolic process; GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0006396 RNA processing; GO:0016070 RNA metabolic process | GO:0003824 catalytic activity; GO:0006396 RNA processing; GO:0030684 preribosome; GO:0030686 preribosome; GO:0034470 ncRNA processing | 0.091
143 | 90 | 18 | 0.66 | <0.001 | 21 | GO:0032991 macromolecular complex; GO:0019538 protein metabolic process; GO:0044238 primary metabolic process; GO:0043283 biopolymer metabolic process; GO:0044267 cellular protein metabolic process | GO:0008152 metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0044237 cellular metabolic process; GO:0043234 protein complex | 0.064
144 | 101 | 20 | 0.66 | <0.001 | 3 | GO:0009987 cellular process; GO:0003674 molecular_function; GO:0008150 biological_process | – | 0.052
145 | 122 | 4 | 0.66 | <0.001 | 2 | GO:0008150 biological_process; GO:0003674 molecular_function | – | 0.045
146 | 121 | 32 | 0.66 | <0.001 | 14 | GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0009058 biosynthetic process; GO:0044249 cellular biosynthetic process; GO:0009987 cellular process | GO:0009059 macromolecule biosynthetic process; GO:0043284 biopolymer biosynthetic process; GO:0044249 cellular biosynthetic process | 0.061
147 | 121 | 30 | 0.66 | <0.001 | 6 | GO:0003824 catalytic activity; GO:0032991 macromolecular complex; GO:0044238 primary metabolic process; GO:0030684 preribosome | GO:0003824 catalytic activity; GO:0030684 preribosome | 0.088
148 | 104 | 22 | 0.66 | <0.001 | 23 | GO:0044238 primary metabolic process; GO:0034660 ncRNA metabolic process; GO:0034470 ncRNA processing; GO:0031125 rRNA 3′-end processing; GO:0009987 cellular process | GO:0000459 exonucleolytic trimming during rRNA processing; GO:0000467 “exonucleolytic trimming to generate mature 3′-end of 5.8S rRNA from tricistronic rRNA transcript (SSU-rRNA, 5.8S rRNA, LSU-rRNA)”; GO:0000469 cleavages during rRNA processing; GO:0006364 rRNA processing; GO:0016072 rRNA metabolic process | 0.070
149 | 140 | 19 | 0.66 | <0.001 | 14 | GO:0044238 primary metabolic process; GO:0019538 protein metabolic process; GO:0044267 cellular protein metabolic process; GO:0032991 macromolecular complex; GO:0005737 cytoplasm | GO:0044464 cell part; GO:0005737 cytoplasm; GO:0006417 regulation of translation; GO:0032268 regulation of cellular protein metabolic process; GO:0043170 macromolecule metabolic process | 0.069
150 | 116 | 30 | 0.65 | <0.001 | 14 | GO:0019538 protein metabolic process; GO:0032991 macromolecular complex; GO:0044267 cellular protein metabolic process; GO:0044445 cytosolic part; GO:0005198 structural molecule activity | GO:0022625 cytosolic large ribosomal subunit; GO:0043234 protein complex | 0.079
151 | 61 | 21 | 0.65 | <0.001 | 1 | GO:0003674 molecular_function | – | 0.051
152 | 62 | 15 | 0.65 | <0.001 | 1 | GO:0003674 molecular_function | – | 0.041
153 | 85 | 27 | 0.65 | <0.001 | 5 | GO:0016070 RNA metabolic process; GO:0003674 molecular_function; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0034660 ncRNA metabolic process | GO:0034660 ncRNA metabolic process; GO:0016070 RNA metabolic process | 0.072
154 | 142 | 33 | 0.65 | <0.001 | 12 | GO:0030529 ribonucleoprotein complex; GO:0044445 cytosolic part; GO:0032991 macromolecular complex; GO:0033279 ribosomal subunit; GO:0043228 nonmembrane-bounded organelle | GO:0005622 intracellular; GO:0043229 intracellular organelle; GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0043226 organelle | 0.099
155 | 54 | 12 | 0.65 | – | 0 | – | – | 0.039
156 | 71 | 15 | 0.65 | <0.001 | 6 | GO:0043283 biopolymer metabolic process; GO:0044238 primary metabolic process; GO:0034960 cellular biopolymer metabolic process; GO:0043170 macromolecule metabolic process; GO:0044260 cellular macromolecule metabolic process | GO:0034960 cellular biopolymer metabolic process; GO:0044260 cellular macromolecule metabolic process; GO:0043170 macromolecule metabolic process | 0.052
157 | 103 | 34 | 0.65 | <0.001 | 21 | GO:0032991 macromolecular complex; GO:0009987 cellular process; GO:0044445 cytosolic part; GO:0043228 nonmembrane-bounded organelle; GO:0043232 intracellular nonmembrane-bounded organelle | GO:0015934 large ribosomal subunit; GO:0022625 cytosolic large ribosomal subunit; GO:0051246 regulation of protein metabolic process; GO:0044424 intracellular part; GO:0032268 regulation of cellular protein metabolic process | 0.079
158 | 84 | 19 | 0.65 | <0.001 | 6 | GO:0005198 structural molecule activity; GO:0005488 binding; GO:0044445 cytosolic part; GO:0009987 cellular process; GO:0032991 macromolecular complex | GO:0005488 binding | 0.074
159 | 103 | 20 | 0.65 | <0.001 | 10 | GO:0032991 macromolecular complex; GO:0034621 cellular macromolecular complex subunit organization; GO:0044238 primary metabolic process; GO:0009987 cellular process; GO:0043933 macromolecular complex subunit organization | GO:0065003 macromolecular complex assembly; GO:0034622 cellular macromolecular complex assembly; GO:0043933 macromolecular complex subunit organization; GO:0034621 cellular macromolecular complex subunit organization | 0.063
160 | 74 | 7 | 0.65 | 0.001 | 3 | GO:0044422 organelle part; GO:0044446 intracellular organelle part; GO:0009987 cellular process | GO:0044422 organelle part; GO:0044446 intracellular organelle part | 0.037
161 | 57 | 7 | 0.64 | <0.001 | 1 | GO:0003674 molecular_function | – | 0.048
162 | 87 | 6 | 0.63 | <0.001 | 1 | GO:0003674 molecular_function | – | 0.048
163 | 75 | 5 | 0.61 | <0.001 | 2 | GO:0032991 macromolecular complex; GO:0003674 molecular_function | – | 0.045
164 | 56 | 10 | 0.54 | – | 0 | – | – | 0.033

Notes: Specific GO terms were selected from each bicluster as follows. (1) We hypothesize that if a GO term appears in only a small number of biclusters (ie, 1 of 4 biclusters), it is specific to those biclusters. (2) There are 164 biclusters. By the proportion test at the 0.05 significance level, 1 of 4 biclusters corresponds to 31 of 164 biclusters. (3) Therefore, GO terms that appear fewer than 32 times are considered specific terms.

                Author and article information

                Journal
                PMID: 23055751
                PMCID: PMC3459542
                DOI: 10.2147/AABC.S32622

                Bioinformatics & Computational biology
                Pearson’s correlation coefficient, biclustering, microarray data, genetic algorithm
