
      Deep Supervised, but Not Unsupervised, Models May Explain IT Cortical Representation


          Abstract

          Inferior temporal (IT) cortex in human and nonhuman primates serves visual object recognition. Computational object-vision models, although continually improving, do not yet reach human performance. It is unclear to what extent the internal representations of computational models can explain the IT representation. Here we investigate a wide range of computational model representations (37 in total), testing their categorization performance and their ability to account for the IT representational geometry. The models include well-known neuroscientific object-recognition models (e.g. HMAX, VisNet) along with several models from computer vision (e.g. SIFT, GIST, self-similarity features, and a deep convolutional neural network). We compared the representational dissimilarity matrices (RDMs) of the model representations with the RDMs obtained from human IT (measured with fMRI) and monkey IT (measured with cell recording) for the same set of stimuli (not used in training the models). Better performing models were more similar to IT in that they showed greater clustering of representational patterns by category. In addition, better performing models also more strongly resembled IT in terms of their within-category representational dissimilarities. Representational geometries were significantly correlated between IT and many of the models. However, the categorical clustering observed in IT was largely unexplained by the unsupervised models. The deep convolutional network, which was trained by supervision with over a million category-labeled images, reached the highest categorization performance and also best explained IT, although it did not fully explain the IT data. Combining the features of this model with appropriate weights and adding linear combinations that maximize the margin between animate and inanimate objects and between faces and other objects yielded a representation that fully explained our IT data. Overall, our results suggest that explaining IT requires computational features trained through supervised learning to emphasize the behaviorally important categorical divisions prominently reflected in IT.
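
          For readers unfamiliar with the representational similarity analysis described above, the following is a minimal sketch of how a model representation can be compared with IT via representational dissimilarity matrices. The data, the use of correlation distance, and the Spearman comparison are illustrative assumptions for this sketch; the paper's actual analyses were performed with its own pipeline (see the RSA toolbox linked under Custom metadata).

          import numpy as np
          from scipy.spatial.distance import pdist
          from scipy.stats import spearmanr

          def rdm(patterns):
              # Condensed representational dissimilarity matrix (RDM) over stimuli:
              # pairwise correlation distance (1 - Pearson r) between response patterns.
              return pdist(patterns, metric="correlation")

          def rdm_correlation(rdm_a, rdm_b):
              # Rank correlation between two RDMs; only the pairwise dissimilarities
              # are compared, never the raw feature spaces themselves.
              rho, _ = spearmanr(rdm_a, rdm_b)
              return rho

          # Hypothetical data: 96 stimuli, model features vs. measured IT patterns.
          rng = np.random.default_rng(0)
          model_features = rng.standard_normal((96, 4096))  # e.g. a deep-net layer
          it_patterns = rng.standard_normal((96, 300))      # e.g. fMRI voxels or cells

          print(rdm_correlation(rdm(model_features), rdm(it_patterns)))

          Because RDMs abstract away from the particular measurement channels (units, voxels, cells), this comparison works even though model and brain representations live in different feature spaces.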

          Author Summary

          Computers cannot yet recognize objects as well as humans can. Computer vision might learn from biological vision. However, neuroscience has yet to explain how brains recognize objects and must draw from computer vision for initial computational models. To make progress with this chicken-and-egg problem, we compared 37 computational model representations to representations in biological brains. The more similar a model representation was to the high-level visual brain representation, the better the model performed at object categorization. Most models did not come close to explaining the brain representation, because they missed categorical distinctions between animates and inanimates and between faces and other objects, which are prominent in primate brains. A deep neural network model that was trained by supervision with over a million category-labeled images and represents the state of the art in computer vision came closest to explaining the brain representation. Our brains appear to impose upon the visual input certain categorical divisions that are important for successful behavior. Brains might learn these divisions through evolution and individual experience. Computer vision similarly requires learning with many labeled images so as to emphasize the right categorical divisions.


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Computational Biology (PLoS Comput Biol)
                Publisher: Public Library of Science (San Francisco, USA)
                ISSN: 1553-734X (print); 1553-7358 (electronic)
                Published: 6 November 2014 (November 2014 issue)
                Volume 10, Issue 11: e1003915
                Affiliations
                [1] Medical Research Council, Cognition and Brain Sciences Unit, Cambridge, United Kingdom
                University College London, United Kingdom
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: SMKR NK. Performed the experiments: SMKR NK. Analyzed the data: SMKR NK. Wrote the paper: SMKR NK. Implemented the models: SMKR.

                Article
                Manuscript ID: PCOMPBIOL-D-14-00531
                DOI: 10.1371/journal.pcbi.1003915
                PMC: 4222664
                PMID: 25375136
                f50d6bd6-dc80-4c77-b55c-4ded35843144
                Copyright © 2014

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 26 March 2014
                Accepted: 11 September 2014
                Page count
                Pages: 29
                Funding
                This work was funded by Cambridge Overseas Trust and Yousef Jameel Scholarship to SMKR; and by the Medical Research Council of the UK (programme MC-A060-5PR20) and a European Research Council Starting Grant (ERC-2010-StG 261352) to NK. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences
                Computational Biology
                Computational Neuroscience
                Neuroscience
                Computer and Information Sciences
                Computer Vision
                Custom metadata
                The authors confirm that all data underlying the findings are fully available without restriction. The data have been used in previous studies, including a recent PLOS Computational Biology paper (‘A Toolbox for Representational Similarity Analysis’, Nili et al. 2014), and are already available at: http://www.mrc-cbu.cam.ac.uk/methods-and-resources/toolboxes/.

                Quantitative & Systems Biology
