
      Quasar and galaxy classification in Gaia Data Release 2


          ABSTRACT

          We construct a supervised classifier based on Gaussian Mixture Models to probabilistically classify objects in Gaia data release 2 (GDR2) using only photometric and astrometric data in that release. The model is trained empirically to classify objects into three classes – star, quasar, galaxy – for G ≥ 14.5 mag down to the Gaia magnitude limit of G = 21.0 mag. Galaxies and quasars are identified for the training set by a cross-match to objects with spectroscopic classifications from the Sloan Digital Sky Survey. Stars are defined directly from GDR2. When allowing for the expectation that quasars are 500 times rarer than stars, and galaxies 7500 times rarer than stars (the class imbalance problem), samples classified with a threshold probability of 0.5 are predicted to have purities of 0.43 for quasars and 0.28 for galaxies, and completenesses of 0.58 and 0.72, respectively. The purities can be increased up to 0.60 by adopting a higher threshold. Not accounting for this expected low frequency of extragalactic objects (the class prior) would give both erroneously optimistic performance predictions and severely impure samples. Applying our model to all 1.20 billion objects in GDR2 with the required features, we classify 2.3 million objects as quasars and 0.37 million objects as galaxies (with individual probabilities above 0.5). The small number of galaxies is due to the strong bias of the satellite detection algorithm and on-ground data selection against extended objects. We infer the true number of quasars and galaxies – as these classes are defined by our training set – to be 690 000 and 110 000, respectively (±50 per cent). The aim of this work is to see how well extragalactic objects can be classified using only GDR2 data. Better classifications should be possible with the low resolution spectroscopy (BP/RP) planned for GDR3.
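          The effect of the class prior described in the abstract can be illustrated with a short Python sketch. This is not the authors' code: the function names, the balanced training prior, and the confusion matrix are hypothetical values chosen for illustration. Only the relative frequencies (quasars 500 times rarer than stars, galaxies 7500 times rarer) come from the abstract. The sketch rescales posterior probabilities from an assumed balanced training prior to the expected prior via Bayes' rule, and computes purity from a confusion matrix weighted by the class fractions.

```python
import numpy as np

def adjust_posteriors(post, train_prior, target_prior):
    """Rescale class posteriors from the training prior to a target prior.

    post: array of shape (n_objects, n_classes), rows summing to 1.
    Multiplies each class probability by target_prior / train_prior and
    renormalizes each row (standard Bayes prior replacement).
    """
    post = np.asarray(post, dtype=float)
    weights = np.asarray(target_prior, dtype=float) / np.asarray(train_prior, dtype=float)
    adjusted = post * weights
    return adjusted / adjusted.sum(axis=-1, keepdims=True)

def purity(confusion, prior):
    """Purity (precision) per assigned class.

    confusion[i, j] = P(assigned class j | true class i), rows summing to 1.
    prior[i] = expected fraction of true class i in the target population.
    """
    confusion = np.asarray(confusion, dtype=float)
    prior = np.asarray(prior, dtype=float)
    joint = confusion * prior[:, None]          # P(true i, assigned j)
    return np.diag(joint) / joint.sum(axis=0)   # P(true j | assigned j)

# Class order: star, quasar, galaxy.
# Relative frequencies taken from the abstract: 1 : 1/500 : 1/7500.
expected_prior = np.array([1.0, 1.0 / 500.0, 1.0 / 7500.0])
expected_prior /= expected_prior.sum()

# ASSUMPTION: a balanced training set (equal class fractions).
balanced_prior = np.full(3, 1.0 / 3.0)

# HYPOTHETICAL confusion matrix, for illustration only (not from the paper).
confusion = np.array([
    [0.998, 0.001, 0.001],   # true stars
    [0.300, 0.700, 0.000],   # true quasars
    [0.200, 0.000, 0.800],   # true galaxies
])

purity_balanced = purity(confusion, balanced_prior)
purity_expected = purity(confusion, expected_prior)

# A single object that looks like a confident quasar under the balanced prior.
example = adjust_posteriors([[0.1, 0.8, 0.1]], balanced_prior, expected_prior)

if __name__ == "__main__":
    print("purity, balanced prior:", purity_balanced)
    print("purity, expected prior:", purity_expected)
    print("adjusted posterior:", example[0])
```

          With the same confusion matrix, the quasar and galaxy purities drop sharply once the expected rarity of extragalactic objects is taken into account, which is the qualitative point the abstract makes about erroneously optimistic performance predictions.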


                Author and article information

                Journal: Monthly Notices of the Royal Astronomical Society
                Publisher: Oxford University Press (OUP)
                ISSN: 0035-8711, 1365-2966
                Dates: December 2019; December 21 2019; October 21 2019
                Volume: 490
                Issue: 4
                Pages: 5615-5633
                Affiliations
                [1 ]Max Planck Institute for Astronomy, Königstuhl 17, D-69117 Heidelberg, Germany
                DOI: 10.1093/mnras/stz2947
                © 2019

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
