      A detection method for android application security based on TF-IDF and machine learning

      research-article
      PLoS ONE
      Public Library of Science

          Abstract

          Android is the most widely used mobile operating system (OS), and a large number of third-party Android application (app) markets have emerged. The lack of regulation in these third-party markets has prompted research institutions to propose a variety of malware detection techniques. However, because both malware and the Android system keep evolving, it is difficult to design a detection method that remains efficient and effective over a long period. Moreover, adopting more features increases the complexity of the model and the computational cost of the system. Permissions play a vital role in the security of Android apps. Term Frequency-Inverse Document Frequency (TF-IDF) is used to assess the importance of a word to a file in a file set or corpus. Static analysis does not need to run the app and can extract its permissions efficiently and accurately. Based on these observations, this paper proposes a new static detection method based on TF-IDF and machine learning. The system permissions are extracted from the manifest file of the Android application package (APK). The TF-IDF algorithm is used to calculate the permission value (PV) of each permission and the sensitivity value of APK (SVOA) of each app. The SVOA and the number of used permissions are then learned and tested by machine learning. 6070 benign apps and 9419 malware samples are used to evaluate the proposed approach. The experimental results show that using only dangerous permissions, or only the number of used permissions, cannot accurately distinguish malicious apps from benign ones. For malware detection, the proposed approach achieves up to 99.5% accuracy, and learning and training take only 0.05 s. For malware family detection, the accuracy is 99.6%. For unknown/new samples, the detection accuracy is 92.71%. Compared with other state-of-the-art approaches, the proposed approach is more effective at detecting malware and malware families.
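
          The abstract does not give the exact PV and SVOA formulas, so the following is only a minimal sketch of the general idea, under stated assumptions. Here PV is assumed to be the share of malware apps that request a permission multiplied by a smoothed inverse frequency over benign apps, and SVOA is assumed to be the sum of the PVs of the permissions an app requests. The function names, the smoothing, and the toy permission sets are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Set, Tuple

# Toy permission sets (hypothetical data, one set per app).
MALWARE: List[Set[str]] = [{"SEND_SMS", "INTERNET"}, {"SEND_SMS", "READ_CONTACTS"}]
BENIGN: List[Set[str]] = [{"INTERNET"}, {"INTERNET", "CAMERA"}]


def permission_values(malware: List[Set[str]], benign: List[Set[str]]) -> Dict[str, float]:
    """Assumed PV: malware request frequency times smoothed inverse benign frequency.

    Both lists must be non-empty.
    """
    all_permissions: Set[str] = set().union(*malware, *benign)
    values: Dict[str, float] = {}
    for perm in all_permissions:
        # "Term frequency": fraction of malware apps that request the permission.
        tf = sum(perm in app for app in malware) / len(malware)
        # "Inverse document frequency": rarer among benign apps means a larger weight.
        benign_df = sum(perm in app for app in benign)
        idf = math.log((1 + len(benign)) / (1 + benign_df))
        values[perm] = tf * idf
    return values


def app_features(app: Set[str], pv: Dict[str, float]) -> Tuple[float, int]:
    """Return (assumed SVOA, number of used permissions) for one app."""
    svoa = sum(pv.get(perm, 0.0) for perm in app)
    return svoa, len(app)


if __name__ == "__main__":
    pv = permission_values(MALWARE, BENIGN)
    for app in MALWARE + BENIGN:
        print(sorted(app), app_features(app, pv))
```

          The resulting two-element feature vector per app (SVOA and permission count) is what a classifier would be trained on. Under this weighting, a permission that every benign app requests contributes nothing to SVOA, while one that is common in malware and rare in benign apps contributes the most.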

                Author and article information

                Contributors
                Role: Funding acquisition, Methodology, Project administration, Software, Writing – original draft, Writing – review & editing
                Role: Methodology, Supervision, Writing – review & editing
                Role: Supervision, Writing – review & editing
                Role: Resources, Supervision
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                11 September 2020
                Volume: 15
                Issue: 9
                Article: e0238694
                Affiliations
                [1] Institute of Information Engineering, Anhui Xinhua University, Hefei, Anhui, China
                [2] School of Big Data & Software Engineering, Chongqing University, Chongqing, China
                [3] Department of Modern Mechanics, University of Science and Technology of China, Hefei, Anhui, China
                Xidian University, China
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Author information
                http://orcid.org/0000-0002-9426-4191
                http://orcid.org/0000-0003-2568-9628
                Article
                PONE-D-19-28572
                10.1371/journal.pone.0238694
                7485785
                32915836
                498b7eb5-2333-402d-a11c-c137805da88b
                © 2020 Yuan et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 6 November 2019
                Accepted: 22 August 2020
                Page count
                Figures: 5, Tables: 11, Pages: 19
                Funding
                Funded by: University Natural Science Research Project of Anhui Province (CN)
                Award ID: KJ2017A621
                Funded by: University Natural Science Research Project of Anhui Province (funder ID: http://dx.doi.org/10.13039/501100009558)
                Award ID: KJ2014A096
                Funded by: Anhui Xinhua University (funder ID: http://dx.doi.org/10.13039/501100012423)
                Award ID: 2016JPKCX05
                Funded by: Anhui Department of Education (funder ID: http://dx.doi.org/10.13039/501100010814)
                Award ID: 2017JYXM0533
                The work was partially supported by the Natural Science Foundation of the Anhui Provincial Education Office (Grant Nos. KJ2017A621 and KJ2014A096), the Quality Engineering Project of Anhui Xinhua University (Grant No. 2016JPKCX05), and the Quality Engineering Projects of Colleges and Universities in Anhui (Grant No. 2017JYXM0533).
                Categories
                Research Article
                Computer and Information Sciences > Computer Security
                Computer and Information Sciences > Artificial Intelligence > Machine Learning
                Computer and Information Sciences > Software Engineering > Computer Software > Apps
                Engineering and Technology > Software Engineering > Computer Software > Apps
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms > Machine Learning Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms > Machine Learning Algorithms
                Computer and Information Sciences > Artificial Intelligence > Machine Learning > Machine Learning Algorithms
                Computer and Information Sciences > Software Engineering > Computer Software > Operating Systems
                Engineering and Technology > Software Engineering > Computer Software > Operating Systems
                Computer and Information Sciences > Information Technology > Text Mining
                Physical Sciences > Mathematics > Applied Mathematics > Algorithms
                Research and Analysis Methods > Simulation and Modeling > Algorithms
                Engineering and Technology > Equipment > Communication Equipment > Cell Phones
                Custom metadata
                The data underlying the results presented in the study are available from the Drebin Dataset (https://www.sec.tu-bs.de/~danarp/drebin/).
