
      From Human to Data to Dataset: Mapping the Traceability of Human Subjects in Computer Vision Datasets


          Abstract

          Computer vision is a "data hungry" field. Researchers and practitioners who work on human-centric computer vision, like facial recognition, emphasize the necessity of vast amounts of data for more robust and accurate models. Humans are seen as a data resource which can be converted into datasets. The necessity of data has led to a proliferation of gathering data from easily available sources, including "public" data from the web. Yet the use of public data has significant ethical implications for the human subjects in datasets. We bridge academic conversations on the ethics of using publicly obtained data with concerns about privacy and agency associated with computer vision applications. Specifically, we examine how practices of dataset construction from public data (not only from websites, but also from public settings and public records) make it extremely difficult for human subjects to trace their images as they are collected, converted into datasets, distributed for use, and, in some cases, retracted. We discuss two interconnected barriers current data practices present to providing an ethics of traceability for human subjects: awareness and control. We conclude with key intervention points for enabling traceability for data subjects. We also offer suggestions for an improved ethics of traceability to enable both awareness and control for individual subjects in dataset curation practices.


                Author and article information

                Journal
                Proceedings of the ACM on Human-Computer Interaction
                Proc. ACM Hum.-Comput. Interact.
                Association for Computing Machinery (ACM)
                2573-0142
                April 14 2023
                April 16 2023
                April 14 2023
                Volume: 7
                Issue: CSCW1
                Pages: 1-33
                Affiliations
                [1 ]University of Colorado Boulder, Boulder, CO, USA
                [2 ]University of California Berkeley, Berkeley, CA, USA
                [3 ]Google, New York, NY, USA
                Article
                10.1145/3579488
                © 2023

                http://www.acm.org/publications/policies/copyright_policy#Background
