
      Is Open Access

      Maximising data retention from the ISBSG repository

      Proceedings article, 12th International Conference on Evaluation and Assessment in Software Engineering (EASE)
      26 - 27 June 2008
      Empirical software engineering, ISBSG repository, data formalisation, effort prediction, regression, FPA

            Abstract

            BACKGROUND: In 1997 the International Software Benchmarking Standards Group (ISBSG) began to collect data on software projects. Since then they have provided copies of their repository to researchers and practitioners through a sequence of releases of increasing size. PROBLEM: Questions over the quality and completeness of the data in the repository have led some researchers to discard a substantial proportion of the observations and to discount some variables when modelling, among other things, software development effort. In some cases the details of the discarding of data have received little mention and minimal justification. METHOD: We describe the process we used in attempting to maximise the amount of data retained for modelling software development effort at the project level, based on previously completed projects that had been sized using IFPUG/NESMA function point analysis (FPA) and recorded in the repository. RESULTS: Through justified formalisation of the data set and domain-informed refinement we arrive at a final usable data set comprising 2862 (of 3024) observations across thirteen variables. CONCLUSION: A methodical approach to the pre-processing of data can help to ensure that as much data as possible is retained for modelling. Assuming that the data does reflect one or more underlying models, such retention should increase the likelihood of robust models being developed.


            Author and article information

            Contributors
            Conference
            June 2008: 1-10
            Affiliations
            [0001] School of Computing and Mathematical Sciences, Auckland University of Technology

            Private Bag 92006, Auckland 1142, New Zealand
            Article
            10.14236/ewic/EASE2008.3
            9e613c1a-2a1f-4c4b-9964-72704ccf1773
            © Kefu Deng et al. Published by BCS Learning and Development Ltd. 12th International Conference on Evaluation and Assessment in Software Engineering (EASE)

            This work is licensed under a Creative Commons Attribution 4.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

            University of Bari, Italy
            Electronic Workshops in Computing (eWiC)

            1477-9358 BCS Learning & Development

            Self URI (article page): https://www.scienceopen.com/hosted-document?doi=10.14236/ewic/EASE2008.3
            Self URI (journal page): https://ewic.bcs.org/
            Categories
            Electronic Workshops in Computing

            Applied computer science, Computer science, Security & Cryptology, Graphics & Multimedia design, General computer science, Human-computer interaction
