      Assessing the Quality and Cleaning of a Software Project Dataset: An Experience Report


      10th International Conference on Evaluation and Assessment in Software Engineering (EASE)

      Evaluation and Assessment in Software Engineering (EASE)

      10 - 11 April 2006

      Software Engineering, Data Quality, Filtering, Polishing, Robust Algorithms


          Abstract

          OBJECTIVE – The aim is to assess the impact of noise on predictive accuracy by comparing noise handling techniques.

          METHOD – We describe the process of cleaning a large software management dataset that initially comprised more than 10,000 projects. Data quality is assessed mainly through feedback from the data provider and manual inspection of the data. Three methods of noise correction (polishing, noise elimination and robust algorithms) are compared with one another in terms of accuracy. Noise detection was undertaken using a regression tree model.

          RESULTS – The three noise correction methods are compared, and differences in their accuracy were noted.

          CONCLUSIONS – The results demonstrated that polishing improves classification accuracy compared with the noise elimination and robust algorithm approaches.
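As a rough illustration of the workflow the METHOD paragraph describes, the sketch below flags suspicious records with a very small regression tree and then applies two of the named treatments: noise elimination (drop the flagged records) and polishing (replace the flagged value with the tree's prediction). The one-split tree, the residual `tolerance`, all function names and the toy size/effort data are assumptions made for illustration only; none of them is taken from the paper. The robust algorithms approach is not shown, since it leaves the data unchanged and relies on a noise-tolerant learner instead.

```python
from statistics import mean


def fit_stump(xs, ys):
    """Fit a one-split regression tree by minimising squared error.

    Hypothetical stand-in for a full regression tree model.
    """
    candidates = sorted(set(xs))[1:]  # skip the smallest x so the left side is never empty
    if not candidates:
        raise ValueError("need at least two distinct x values")
    best = None
    for threshold in candidates:
        left = [y for x, y in zip(xs, ys) if x < threshold]
        right = [y for x, y in zip(xs, ys) if x >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x < threshold else right_mean


def flag_noise(xs, ys, predict, tolerance):
    """Flag records whose target is far from the tree's prediction."""
    return [abs(y - predict(x)) > tolerance for x, y in zip(xs, ys)]


def eliminate(xs, ys, flags):
    """Noise elimination: drop every flagged record."""
    return [(x, y) for x, y, flagged in zip(xs, ys, flags) if not flagged]


def polish(xs, ys, flags, predict):
    """Polishing: keep every record, but replace flagged targets with the prediction."""
    return [(x, predict(x) if flagged else y) for x, y, flagged in zip(xs, ys, flags)]


# Toy data (invented): project size vs. effort, with one suspicious record at size 4.
sizes = [1, 2, 3, 4, 10, 11, 12, 13]
efforts = [10, 11, 9, 50, 100, 101, 99, 102]

predict = fit_stump(sizes, efforts)
flags = flag_noise(sizes, efforts, predict, tolerance=20)
cleaned = eliminate(sizes, efforts, flags)
polished = polish(sizes, efforts, flags, predict)
```

Elimination shrinks the dataset, whereas polishing keeps its size and changes only the flagged values. That trade-off is one reason such techniques are compared on accuracy.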


                Author and article information

                Contributors
                Conference
                April 2006
                Pages: 1-7
                Affiliations
                Brunel University, UK
                Article
                10.14236/ewic/EASE2006.14
                © Gernot Liebchen et al. Published by BCS Learning and Development Ltd. 10th International Conference on Evaluation and Assessment in Software Engineering (EASE), Keele University, UK

                This work is licensed under a Creative Commons Attribution 4.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

                10th International Conference on Evaluation and Assessment in Software Engineering (EASE)
                EASE
                10
                Keele University, UK
                10 - 11 April 2006
                Electronic Workshops in Computing (eWiC)
                Evaluation and Assessment in Software Engineering (EASE)
                Product
                Product Information: 1477-9358 BCS Learning & Development
                Self URI (journal page): https://ewic.bcs.org/
                Categories
                Electronic Workshops in Computing
