      Is Open Access

      Application of Fuzzy Clustering for Text Data Dimensionality Reduction

      Preprint


          Abstract

          Large textual corpora are often represented by a document-term frequency matrix whose elements are term frequencies; however, this matrix has two problems: sparsity and high dimensionality. Four dimensionality reduction strategies are used to address these problems. Of the four, unsupervised feature transformation (UFT) is a popular and efficient strategy that maps the terms of the document-term frequency matrix to a new basis. Although several UFT-based methods have been developed, fuzzy clustering has not been considered for dimensionality reduction. This research explores fuzzy clustering as a new UFT-based approach to create a lower-dimensional representation of documents. The performance of fuzzy clustering, with and without global term weighting methods, is shown to exceed that of principal component analysis and singular value decomposition. This study also explores the effect of different fuzzifier values on fuzzy clustering for dimensionality reduction.
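
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes standard fuzzy c-means with Euclidean distance, and it assumes that each document's vector of cluster memberships serves as its lower-dimensional representation. The function name `fuzzy_c_means`, the toy matrix `X`, and all parameter defaults are hypothetical choices made for illustration; `m` plays the role of the fuzzifier.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means (illustrative sketch, not the paper's code).

    X          : (n_docs, n_terms) document-term frequency matrix
    n_clusters : number of clusters, i.e. the reduced dimensionality
    m          : fuzzifier (> 1); larger values give softer memberships
    Returns the (n_docs, n_clusters) membership matrix and the centers.
    """
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalised so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # Cluster centers: membership-weighted means of the documents.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]

        # Euclidean distance from every document to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # guard against division by zero

        # Membership update: u_ij proportional to d_ij^(-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)

        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return U, centers

# Toy document-term matrix (assumed data): 6 documents, 5 terms.
X = np.array([
    [3, 2, 0, 0, 0],
    [2, 3, 1, 0, 0],
    [4, 1, 0, 0, 0],
    [0, 0, 0, 3, 2],
    [0, 0, 1, 2, 3],
    [0, 0, 0, 4, 1],
], dtype=float)

# Each document is reduced from 5 term counts to 2 membership degrees.
U, centers = fuzzy_c_means(X, n_clusters=2, m=2.0)
```

Under these assumptions, rerunning the call with a different `m` (for example 1.5 or 3.0) is one way to observe how the fuzzifier changes the sharpness of the reduced representation.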


                Author and article information

                Journal
                20 September 2019
                Article
                1909.10881

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                arXiv admin note: text overlap with arXiv:1712.05997
                cs.CL cs.LG stat.AP stat.CO stat.ML

                Theoretical computer science,Applications,Machine learning,Artificial intelligence,Mathematical modeling & Computation
