      ALOHA: Auxiliary Loss Optimization for Hypothesis Augmentation

      Preprint

          Abstract

          Malware detection is a popular application of Machine Learning for Information Security (ML-Sec), in which an ML classifier is trained to predict whether a given file is malware or benignware. Parameters of this classifier are typically optimized such that outputs from the model over a set of input samples most closely match the samples' true malicious/benign (1/0) target labels. However, there are often a number of other sources of contextual metadata for each malware sample, beyond an aggregate malicious/benign label, including multiple labeling sources and malware type information (e.g., ransomware, trojan, etc.), which we can feed to the classifier as auxiliary prediction targets. In this work, we fit deep neural networks to multiple additional targets derived from metadata in a threat intelligence feed for Portable Executable (PE) malware and benignware, including a multi-source malicious/benign loss, a count loss on multi-source detections, and a semantic malware attribute tag loss. We find that incorporating multiple auxiliary loss terms yields a marked improvement in performance on the main detection task. We also demonstrate that these gains likely stem from a more informed neural network representation and are not due to a regularization artifact of multi-target learning. Our auxiliary loss architecture yields a significant reduction in detection error rate (false negatives) of 42.6% at a false positive rate (FPR) of \(10^{-3}\) when compared to a similar model with only one target, and a decrease of 53.8% at \(10^{-5}\) FPR.
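          The multi-target objective described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract names a main malicious/benign target, a multi-source malicious/benign loss, a count loss on multi-source detections, and an attribute tag loss, but gives no formulas. The binary cross-entropy terms, the Poisson form of the count loss, the dictionary keys, and the loss weights below are all assumptions made for illustration.

```python
import numpy as np


def binary_cross_entropy(p, y, eps=1e-7):
    """Mean binary cross-entropy between probabilities p and 0/1 labels y."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(np.mean(-(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))))


def poisson_nll(rate, count, eps=1e-7):
    """Mean Poisson negative log-likelihood, dropping the constant log(count!) term.

    Assumption: the abstract only says "count loss"; a Poisson loss is one
    common choice for count targets and is used here purely as an example.
    """
    rate = np.maximum(rate, eps)
    return float(np.mean(rate - count * np.log(rate)))


def combined_loss(outputs, targets, weights):
    """Weighted sum of the main loss and three auxiliary losses.

    All keys ("malware", "vendors", "count", "tags") are hypothetical names.
    """
    terms = {
        # Main task: aggregate malicious/benign label, shape (batch,)
        "malware": binary_cross_entropy(outputs["malware"], targets["malware"]),
        # Auxiliary: one malicious/benign label per labeling source, shape (batch, n_sources)
        "vendors": binary_cross_entropy(outputs["vendors"], targets["vendors"]),
        # Auxiliary: number of sources that flagged the sample, shape (batch,)
        "count": poisson_nll(outputs["count"], targets["count"]),
        # Auxiliary: semantic attribute tags (e.g. ransomware, trojan), shape (batch, n_tags)
        "tags": binary_cross_entropy(outputs["tags"], targets["tags"]),
    }
    total = sum(weights[name] * value for name, value in terms.items())
    return total, terms


if __name__ == "__main__":
    # Toy batch of two samples; every number here is made up.
    outputs = {
        "malware": np.array([0.9, 0.2]),
        "vendors": np.array([[0.8, 0.7, 0.9], [0.1, 0.3, 0.2]]),
        "count": np.array([2.5, 0.4]),
        "tags": np.array([[0.7, 0.1], [0.2, 0.1]]),
    }
    targets = {
        "malware": np.array([1.0, 0.0]),
        "vendors": np.array([[1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]),
        "count": np.array([3.0, 0.0]),
        "tags": np.array([[1.0, 0.0], [0.0, 0.0]]),
    }
    weights = {"malware": 1.0, "vendors": 0.1, "count": 0.1, "tags": 0.1}
    total, terms = combined_loss(outputs, targets, weights)
    print(total, terms)
```

          In this sketch only the "malware" output would be used at detection time; the auxiliary outputs exist to shape the shared representation during training, which is the effect the abstract attributes the gains to.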

                Author and article information

                Date: 13 March 2019
                arXiv: 1903.05700

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                Pre-print of a manuscript submitted to Usenix Security Symposium 2019
                cs.CR cs.LG stat.ML

                Security & Cryptology, Machine learning, Artificial intelligence
