      Is Open Access

      Combining classifiers for robust PICO element detection

      research-article


          Abstract

          Background

          Formulating a clinical information need in terms of its four atomic parts, Population/Problem, Intervention, Comparison and Outcome (known as PICO elements), facilitates searching for a precise answer within a large medical citation database. However, using PICO-defined items in the information retrieval process requires the search engine to detect and index PICO elements in the collection so that the system can retrieve relevant documents.

          Methods

          In this study, we tested multiple supervised classification algorithms and their combinations for detecting PICO elements within medical abstracts. Using the structural descriptors that are embedded in some medical abstracts, we have automatically gathered large training/testing data sets for each PICO element.
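One way to picture the automatic gathering step is a minimal sketch that reads section headings in a structured abstract and uses them as labels. The heading-to-element mapping, the regular expression, and the function name below are illustrative assumptions; the abstract does not list the headings the study actually used.

```python
import re

# Hypothetical mapping from section headings to PICO elements (an assumption,
# not the study's actual list of structural descriptors).
HEADING_TO_PICO = {
    "PATIENTS": "P",
    "PARTICIPANTS": "P",
    "INTERVENTION": "I",
    "INTERVENTIONS": "I",
    "OUTCOME MEASURES": "O",
    "MAIN OUTCOME MEASURES": "O",
}

# An uppercase heading followed by a colon, e.g. "PARTICIPANTS:".
HEADING_RE = re.compile(r"([A-Z][A-Z ]+):")


def label_sections(abstract):
    """Return (label, text) pairs for sections whose heading maps to a PICO element."""
    # With a capturing group, re.split yields
    # [preamble, heading1, text1, heading2, text2, ...].
    parts = HEADING_RE.split(abstract)
    examples = []
    for heading, text in zip(parts[1::2], parts[2::2]):
        label = HEADING_TO_PICO.get(heading.strip())
        if label is not None:
            examples.append((label, text.strip()))
    return examples


# Made-up structured abstract used only to exercise the function.
sample = (
    "BACKGROUND: Asthma is common. "
    "PARTICIPANTS: 120 adults with asthma. "
    "INTERVENTION: Inhaled budesonide twice daily. "
    "MAIN OUTCOME MEASURES: Lung function at 12 weeks. "
    "RESULTS: Improved."
)
for label, text in label_sections(sample):
    print(label, text)
```

Sections whose heading is not in the mapping (here BACKGROUND and RESULTS) are simply skipped, so only text with an explicit structural cue becomes a labeled example.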

          Results

          Combining multiple classifiers through a weighted linear combination of their prediction scores achieves promising results, with F-measures of 86.3% for P, 67% for I and 56.6% for O.
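A weighted linear combination of prediction scores can be sketched as follows. The abstract does not say which classifiers were combined, how the weights were chosen, or how the final decision was made, so the weights, scores, and the "pick the highest combined score" rule here are all assumptions for illustration.

```python
def combine_scores(scores, weights):
    """Weighted linear combination of one candidate's classifier scores."""
    if len(scores) != len(weights):
        raise ValueError("need exactly one weight per classifier score")
    return sum(w * s for w, s in zip(weights, scores))


def best_candidate(score_rows, weights):
    """Index of the candidate (e.g. a sentence) with the highest combined score."""
    combined = [combine_scores(row, weights) for row in score_rows]
    return max(range(len(combined)), key=lambda i: combined[i])


# Hypothetical weights for three classifiers and scores for two sentences.
weights = [0.5, 0.3, 0.2]
score_rows = [
    [0.9, 0.2, 0.4],  # sentence 0 -> 0.45 + 0.06 + 0.08 = 0.59
    [0.3, 0.8, 0.7],  # sentence 1 -> 0.15 + 0.24 + 0.14 = 0.53
]
print(best_candidate(score_rows, weights))  # prints 0
```

The point of the weighting is that a classifier that is strong for one element can count for more than a weaker one, instead of every classifier getting an equal vote.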

          Conclusions

          Our experiments on the identification of PICO elements showed that the task is very challenging. Nevertheless, the performance of our identification method is competitive with previously published results, and shows that the P element can be detected with high accuracy while the I and O elements are detected with lower accuracy.

          Most cited references


          Answering Clinical Questions with Knowledge-Based and Statistical Techniques


            Sentence retrieval for abstracts of randomized controlled trials

            Background

            The practice of evidence-based medicine (EBM) requires clinicians to integrate their expertise with the latest scientific research. But this is becoming increasingly difficult with the growing numbers of published articles. There is a clear need for better tools to improve clinicians' ability to search the primary literature. Randomized clinical trials (RCTs) are the most reliable source of evidence documenting the efficacy of treatment options. This paper describes the retrieval of key sentences from abstracts of RCTs as a step towards helping users find relevant facts about the experimental design of clinical studies.

            Method

            Using Conditional Random Fields (CRFs), a popular and successful method for natural language processing problems, sentences referring to Intervention, Participants and Outcome Measures are automatically categorized. This is done by extending a previous approach for labeling sentences in an abstract for general categories associated with scientific argumentation or rhetorical roles: Aim, Method, Results and Conclusion. Methods are tested on several corpora of RCT abstracts. First, structured abstracts with headings specifically indicating Intervention, Participant and Outcome Measures are used. Also, a manually annotated corpus of structured and unstructured abstracts is prepared for testing a classifier that identifies sentences belonging to each category.

            Results

            Using CRFs, sentences can be labeled for the four rhetorical roles with F-scores from 0.93 to 0.98. This outperforms the use of Support Vector Machines. Furthermore, sentences can be automatically labeled for Intervention, Participant and Outcome Measures, in unstructured and structured abstracts where the section headings do not specifically indicate these three topics. F-scores of up to 0.83 and 0.84 are obtained for Intervention and Outcome Measure sentences.

            Conclusion

            Results indicate that some of the methodological elements of RCTs are identifiable at the sentence level in both structured and unstructured abstract reports. This is promising in that sentences labeled automatically could potentially form concise summaries, assist in information retrieval and finer-grained extraction.

              A method of extracting the number of trial participants from abstracts describing randomized controlled trials.

              We have developed a method for extracting the number of trial participants from abstracts describing randomized controlled trials (RCTs); the number of trial participants may be an indication of the reliability of the trial. The method depends on statistical natural language processing. The number of interest was determined by a binary supervised classification based on a support vector machine algorithm. The method was trialled on 223 abstracts in which the number of trial participants was identified manually to act as a gold standard. Automatic extraction resulted in 2 false-positive and 19 false-negative classifications. The algorithm was capable of extracting the number of trial participants with an accuracy of 97% and an F-measure of 0.84. The algorithm may improve the selection of relevant articles in regard to question-answering, and hence may assist in decision-making.
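For readers who want to see how an F-measure relates to false-positive and false-negative counts, here is a minimal sketch of the standard balanced F-measure (F1). The abstract reports 2 false positives and 19 false negatives but not the number of true positives, so `tp=55` below is purely an assumed value, picked so that the result lands near the reported 0.84; it is not a figure from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and balanced F-measure (F1) from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# fp and fn come from the abstract; tp is an ASSUMED, illustrative value.
precision, recall, f1 = precision_recall_f1(tp=55, fp=2, fn=19)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because false negatives outnumber false positives, recall is the limiting factor in this setting, which is why the F-measure sits well below the reported accuracy.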

                Author and article information

                Journal
                Title: BMC Medical Informatics and Decision Making
                Abbreviated title: BMC Med Inform Decis Mak
                Publisher: BioMed Central
                ISSN: 1472-6947
                Published: 15 May 2010
                Volume: 10
                Article number: 29
                Affiliations
                [1] DIRO, University of Montreal, CP. 6128, succursale Centre-ville, Montreal, H3C 3J7 Quebec, Canada
                [2] Department of Family Medicine, McGill University, 515 Pine Avenue, Montreal, H2W 1S4 Quebec, Canada
                Article
                Publisher ID: 1472-6947-10-29
                DOI: 10.1186/1472-6947-10-29
                PMCID: PMC2891622
                PMID: 20470429
                Record ID: 8139b62f-8e49-4738-9170-aa67e8975ca4
                Copyright ©2010 Boudin et al; licensee BioMed Central Ltd.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 7 December 2009
                : 15 May 2010
                Categories
                Research Article

                Bioinformatics & Computational biology
