      Is Open Access

      You are your Metadata: Identification and Obfuscation of Social Media Users using Metadata Information

      Preprint

          Abstract

          Metadata are associated with most of the information we produce in our daily interactions and communication in the digital world. Yet, surprisingly, metadata are often still categorized as non-sensitive. Indeed, in the past, researchers and practitioners have mainly focused on the problem of identifying a user from the content of a message. In this paper, we use Twitter as a case study to quantify the uniqueness of the association between metadata and user identity and to understand the effectiveness of potential obfuscation strategies. More specifically, we analyze atomic fields in the metadata and systematically combine them in an effort to classify new tweets as belonging to an account, using different machine learning algorithms of increasing complexity. We demonstrate that, through the application of a supervised learning algorithm, we are able to identify any user in a group of 10,000 with approximately 96.7% accuracy. Moreover, if we broaden the scope of our search and consider the 10 most likely candidates, we increase the accuracy of the model to 99.22%. We also found that data obfuscation is hard and ineffective for this type of data: even after perturbing 60% of the training data, it is still possible to classify users with an accuracy higher than 95%. These results have strong implications for the design of metadata obfuscation strategies, for example for data set release, not only for Twitter but, more generally, for most social media platforms.
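The abstract reports two measurements: accuracy when the true account must be among the k most likely candidates, and accuracy after a fraction of the training data has been perturbed. The sketch below illustrates both ideas in plain Python. It is not the authors' implementation: the function names, the per-candidate score dictionaries, the field names `followers` and `lang`, and the perturbation rule (replacing each field with a value drawn from a random record) are all assumptions made for illustration.

```python
import random


def top_k_accuracy(scores, truths, k):
    """Fraction of items whose true label is among the k highest-scored candidates.

    scores: one dict per item, mapping candidate label -> classifier score.
    truths: the true label of each item, in the same order.
    """
    hits = 0
    for row, truth in zip(scores, truths):
        # Rank candidates by descending score and keep the best k.
        top = sorted(row, key=row.get, reverse=True)[:k]
        if truth in top:
            hits += 1
    return hits / len(truths)


def perturb(records, fraction, rng):
    """Return a copy of records with a given fraction of rows perturbed.

    Hypothetical rule: every field of a selected row is replaced by the
    value of the same field taken from a randomly chosen record.
    """
    out = [dict(r) for r in records]          # leave the input untouched
    n_perturb = round(fraction * len(out))
    for i in rng.sample(range(len(out)), n_perturb):
        for field in out[i]:
            out[i][field] = rng.choice(records)[field]
    return out


if __name__ == "__main__":
    # Toy scores for three tweets over three candidate accounts.
    scores = [
        {"a": 0.7, "b": 0.2, "c": 0.1},
        {"a": 0.5, "b": 0.3, "c": 0.2},
        {"a": 0.1, "b": 0.3, "c": 0.6},
    ]
    truths = ["a", "b", "a"]
    for k in (1, 2, 3):
        print(k, top_k_accuracy(scores, truths, k))

    # Hypothetical metadata records; 60% of rows are perturbed.
    records = [{"followers": i, "lang": "en"} for i in range(5)]
    print(perturb(records, 0.6, random.Random(0)))
```

In an experiment of the kind the abstract describes, a classifier would be retrained on the perturbed records and then scored with `top_k_accuracy` on untouched test data, so the effect of the perturbation fraction on accuracy can be read off directly.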

                Author and article information

                Journal
                27 March 2018
                Article
                1803.10133

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                10 pages, 13 figures. Accepted in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, USA
                cs.CR cs.AI cs.SI
