Text-Level Structure of Research Papers: Implications for Text-Based Information Processing Systems

This paper discusses the implications of text-level structure for text-based information processing systems. The text-level structure of research papers is described by a set of typical functional components, such as background, purpose, and methods, and by their order in a text. To suggest various applications, retrieval and passage extraction experiments were conducted using a manually structure-tagged fulltext database of research papers. The results show that searching full-length texts with text-level structure achieved higher precision than searching without it. The paper also shows examples of extracted passages and suggests applications of text-level structure in text-based information systems, including passage extraction, browsing, and navigation within and across texts.


Introduction
Text is not a mere collection of sentences but a set of concepts carefully constructed by the author to convey a message effectively. Each sentence in a text is interpreted in relation to the other sentences in the text. Although current information retrieval systems usually focus on terms or sentence-level structures, text-level phenomena, such as the global structure of the text, relations among the passages within a text, and anaphoric references, as well as characteristic aspects of the text type, should be considered in the design of text-based information processing systems.
Phenomena within units larger than a sentence have been investigated in discourse linguistics and text linguistics. Among the various approaches to text-level structure, this paper focuses on the typical structure of the genre or text type. It is known that each text type has a typical structure. Many researchers have shown that research papers and their abstracts possess a highly typical structure [1][2][3][4][5]. Such a structure can be described by a set of typical components of the text type and their order in a text [2]. "Background", "reference to previous research", "purpose", "methods", "results", "discussion", and "conclusion" are examples of typical components of research papers. The set of such components is somewhat predictable for each text type. It is sometimes called the functional structure of the text, since each component often represents the function or role that a part of the text plays in the whole. It is a natural structure of the informational content of the text, familiar as a kind of social convention to the people who use the text type to communicate, and it is therefore expected to be usable by users in a search process [6].
In information retrieval, a search that uses text-level structure is expected to be more effective than one that does not. The improvement comes not only from detecting the central theme of a text, but also from distinguishing the role or function each concept plays in the text and the relationships among the concepts [4,7].
In addition, the text-level structure of research papers has been suggested as a model of the scientific research process [17] and as a suitable frame for interacting with a user about the user's situation in the research process [6]. Some of the typical components of research papers also form part of the criteria for judging the relevance or appropriateness of retrieved documents to users' tasks or problems [29][30]. Users' situations, problems, or tasks are integral components of information needs [31][32]. The situation of each user is more complex than text-level structure can represent, but the structure is a small step towards deriving related factors from texts and incorporating them into information systems. Moreover, users' situations can change dynamically during interaction with information systems [33][34]. Text-level structure enables more flexible interaction by providing various ways of displaying, browsing, or navigating within and among texts through links other than topical or semantic relationships or the logical structure of the document style. It may help users gain insight into the content of databases through interaction.

Research Goals
Based on previous research, and in order to suggest various applications of text-level structure, this paper reports experiments on a pilot system built on a fulltext database of Japanese research papers tagged with text-level structure. The experiments cover full-length text retrieval both with and without ranked output, passage extraction, and comparison of extracted passages. The paper also discusses the implications of text-level structure analysis for text-based information processing systems.
Passage retrieval is convenient for users who wish to obtain just the passage immediately relevant to their needs rather than the long text that includes it. It is also reported to improve both precision and recall [35]. But a passage is not an independent entity like a whole text. It should be interpreted and examined for validity in relation to the rest of the text. Text-level structure can be used to describe the relations among the passages within a text.
Readers examine texts or passages, then compare and integrate their contents. On this basis, readers carry out information work such as writing papers, decision making, and problem solving. Comparison of texts or passages may lead to the discovery of new relationships among them [36]. This paper is especially interested in applications that may support human information work by comparing passages extracted from different texts.
Comparison or summarization of multiple texts based on a knowledge-based system [37] or on message understanding techniques [38] has been reported. The applications suggested here, however, are based on "light" text processing, which keeps the text in its original form and facilitates users' strategic or analytical reading and searching by providing flexible ways of display and navigation through text-level structure. Moreover, satisfactory performance of the knowledge-based approach can only be obtained when the domain is restricted to a narrow subject area. The approach of this paper is limited to the genre of research papers, but no restriction on subject area is presupposed.
The next section describes the methods of analyzing text-level structure, and section 3 describes the methods and results of the experiments. Section 4 discusses the implications of text-level structure for text-based information processing systems and suggests future studies.

Analysis of Text-Level Structure of Research Papers
Text-level structure, as discussed above, is analyzed using a set of analytical categories that represent the typical components of the text type. The structure of each text is described by the occurrence of categories and their order in the text.

The Categories : Typical Components of Research Papers
The set of Categories is presented in Figure 1. It was prepared through content analysis of 40 writing manuals and 127 Japanese research papers selected from four disciplines (medicine, physics, economics, and Japanese literature), and through interviews with researchers [19][20]. It has been modified by applying it to samples from other disciplines [26][27][39][40][41] and through inter-coder consistency tests [41]. Each Category represents the role or function that a part of the text plays in the text, the relationship between a part of the text and the whole text, and the relationship between parts of the text and parts of other texts.
The Categories are arranged hierarchically. Categories at the most specific level are assigned to each sentence. They can be translated into an upper level via hierarchy links when the overall structure of the text is studied. The characteristics and definitions of upper-level Categories are inherited by the more specific ones through the hierarchy links. This inheritance was also used in the rules for automatic detection of the Categories.
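The translation of sentence-level Categories into an upper level can be sketched as follows. This is only an illustration: it assumes that the hierarchy is encoded in the Category codes themselves, so that "B21" lies under "B2", which lies under "B". That prefix convention is inferred from the codes quoted in this paper, not taken from the full Category list in Figure 1, and the tag sequence is hypothetical.

```python
def ancestors(code):
    """Return the Category code followed by its upper-level codes.

    Assumption: each upper level is obtained by dropping the last
    character of the code (e.g. "A312" -> "A31" -> "A3" -> "A").
    """
    return [code[:i] for i in range(len(code), 0, -1)]


def to_level(code, level):
    """Translate a Category code to the given level (1 = top level)."""
    return code[:level] if len(code) >= level else code


def structure_at_level(sentence_codes, level):
    """Describe a text by the order of Categories at one level,
    merging consecutive sentences that share the same upper Category."""
    result = []
    for code in sentence_codes:
        upper = to_level(code, level)
        if not result or result[-1] != upper:
            result.append(upper)
    return result


# Hypothetical sequence of sentence-level tags for one paper.
tags = ["A11", "A21", "A312", "B21", "B31", "B41", "C1"]
print(ancestors("A312"))
print(structure_at_level(tags, 1))
print(structure_at_level(tags, 2))
```

Collapsing to the top level gives the coarse order of components, while the second level keeps distinctions such as "B2" versus "B3" that matter for the searches reported later.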
The Categories were assigned to all the sentences in the sample. That is, the informational content of research papers from various disciplines is described with the same repertory of Categories, which can serve as a common framework for them. The text-level structure of each research paper was described by the occurrence and order of Categories in it. The observed structures were more complex than the "IMRD" ("Introduction, Methods, Results, and Discussion") organization, which is frequently mentioned as the structure of research papers in writing manuals [42] and is roughly reflected in the section headings of empirical research papers. These patterns were nevertheless eventually categorized and described by combinations of "basic patterns", "sub-patterns", repetition or omission of "sub-patterns", and "exception patterns". The detailed list of the patterns and the number of occurrences at each level of Categories have been reported [19][20].
A relatively small number of indicative clue phrases for each Category were revealed through the analysis. These phrases and the patterns of Categories were used in the rules for automatic Category detection.
Text structure is one of the fundamental characteristics of the informational content of texts and can be applied to various processing methods, such as information retrieval, automatic abstracting, and information extraction. However, manual assignment of the Categories is time-consuming and labour-intensive, which makes it impractical in operational settings. Automatic detection of Categories is therefore a prerequisite for application.

Automatic Detection of the Categories
The feasibility of automatic Category detection was demonstrated in earlier studies [24][26][27][28]. Three kinds of rules were used in these studies: (a) indicative clue phrases, (b) Category order, and (c) Category scope. The rules using indicative clue phrases are essentially templates. They may include groups of indicative clues or phrases, the Category of the previous sentence, and citations. The rules for Category scope identify how far a Category continues in the text. These are mainly based on conjunctions and anaphoric expressions. Some rules provide stronger evidence than others, and each rule is assigned a weight. The weight is initially calculated as the probability, in the training corpus, that a sentence falls into the specific Category when it matches the premise of the rule, and it has been adjusted during successive trials with the corpus.
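The initial weighting of rules described above can be sketched as follows. This is a simplified illustration, not the rule set used in the studies: the clue phrases are English stand-ins, the training sentences are invented, only clue-phrase rules are modelled (no order or scope rules), and combining evidence by summing weights per Category is an assumption made for the sketch.

```python
from collections import defaultdict

# Hypothetical clue-phrase rules: (rule id, clue phrase, Category).
RULES = [
    ("r1", "it has been reported", "A11"),
    ("r2", "however", "A21"),
    ("r3", "in order to", "A312"),
    ("r4", "we studied", "A312"),
]

# Invented training sentences with manually assigned Categories.
TRAINING = [
    ("It has been reported that IFN improves histology.", "A11"),
    ("However, some problems remain.", "A21"),
    ("However, it has been reported that relapse is common.", "A11"),
    ("In order to discuss the method, we studied 14 patients.", "A312"),
]


def rule_matches(clue, sentence):
    """Premise of a clue-phrase rule: the phrase occurs in the sentence."""
    return clue in sentence.lower()


def estimate_weights(rules, training):
    """Weight = P(sentence has the rule's Category | rule premise matches)."""
    matched = defaultdict(int)
    correct = defaultdict(int)
    for sentence, gold in training:
        for rule_id, clue, category in rules:
            if rule_matches(clue, sentence):
                matched[rule_id] += 1
                if gold == category:
                    correct[rule_id] += 1
    return {rule_id: correct[rule_id] / matched[rule_id] for rule_id in matched}


def detect(sentence, rules, weights):
    """Return the Category with the highest summed rule weight, or None."""
    scores = defaultdict(float)
    for rule_id, clue, category in rules:
        if rule_matches(clue, sentence):
            scores[category] += weights.get(rule_id, 0.0)
    if not scores:
        return None
    return max(scores, key=scores.get)


weights = estimate_weights(RULES, TRAINING)
print(weights)
print(detect("However, it has been reported that IFN is effective.", RULES, weights))
```

In this toy corpus the rule for "however" matches two sentences but is correct for only one, so it receives a lower weight than the rule for "it has been reported" and loses when both match the same sentence.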
The results of these studies indicate the feasibility of automatic assignment of Categories using surface-level natural language processing. Based on this, the next section describes the results of the experiments using a pilot system with a structure-tagged fulltext database, and suggests various applications for text-based information processing systems.

Database and Search Engine
The experimental fulltext database consisted of fifty Japanese research papers on Viral Hepatitis type C, which was one of the corpora used in automatic Category detection [26]. They were selected systematically from an operational large-scale medical bibliographic/abstract database with the conditions "published in 1991", "containing the term 'Viral Hepatitis type C' or 'non A non B Viral Hepatitis'", "original papers", and "written in Japanese". The fulltext of the papers was OCRed after gaining permission from the publishers.
Tags for the Categories, which represent the typical functional components of research papers shown in Figure 1, and for the components of the logical structure of documents, i.e., <article>, <title>, <sec> for section, <p> for paragraph, etc., were assigned manually. An example of a database record is shown in Figure 2.
Tags represent the components of the logical structure of documents, such as <article> for the article as a whole, <title> for title, <abs> for abstract, <body> for body of text, <sec> for section, <h1> for section heading, <p> for paragraph, <s> for sentence, and <cap> for caption of a figure, and the Categories shown in Figure 1, such as <A11>, <A21>, <B21>, and so on.

Figure 2. English Translation of an Example of the Experimental Fulltext
<article><title>Interferon Therapy of Non A Non B Hepatitis</title> <body><sec><h1>Introduction</h1> <p><s><A11>It has been reported that interferon (IFN) treatment declines the serum transaminase level and improves the histological condition.</A11></s><s><A21>However there are some problems in the methods of administration since many cases whose level of serum transaminase increased after continuous administration of IFN for four weeks were reported.</A21></s></p><p><s><A312>In order to discuss the more effective method of IFN administration, the authors conducted the survey

The search engine is OpenText 6 (OpenText Corp., Canada), which is a fast text searching system. In this system, a database is seen as one long string and the queries are based on sistring, i.e. a semi-infinite string that starts at that position and extends arbitrarily far to the right, or to the end of the text [43]. The part of text enclosed by a beginning tag <tag> and an ending tag </tag>, which are specified in the data dictionary, is called a region. It is a unit for a search and results set. "Including" and "within" operations among regions are available. With this engine, complex queries with structure relationship can be processed, for example, "any articles in which the term 'rat' occurred in the Category of 'B21.Attribute of subject'", or "any paragraphs in which word 'weight' occurred in 'C1.original evidence of the article' in the articles retrieved by this query", and so on.
The search with statistical ranked-output uses OpenText's "RankMode Relevance1", which "ranks the members of the returned set based on the term frequency, the document length, and the total number of all words" [44].
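The idea of regions and of the "including" and "within" operations described in this section can be sketched as follows. This is not OpenText's implementation or query syntax: the function names, the offset-based representation, and the two-article sample string are all hypothetical, and only well-nested tags are handled.

```python
import re
from collections import namedtuple

Region = namedtuple("Region", "tag start end")
TAG = re.compile(r"<(/?)([A-Za-z0-9]+)>")


def find_regions(text):
    """Return every <tag>...</tag> span as a Region of character offsets."""
    regions, open_tags = [], []
    for m in TAG.finditer(text):
        closing, tag = m.group(1), m.group(2)
        if not closing:
            open_tags.append((tag, m.end()))
        elif open_tags and open_tags[-1][0] == tag:
            _, start = open_tags.pop()
            regions.append(Region(tag, start, m.start()))
    return regions


def regions_of(regions, tag):
    """Select the regions carrying a given tag."""
    return [r for r in regions if r.tag == tag]


def term_positions(text, term):
    """Treat each occurrence of a term as a small region."""
    return [Region("term", m.start(), m.end())
            for m in re.finditer(re.escape(term), text)]


def within(inner, outer):
    """Members of `inner` that lie inside some member of `outer`."""
    return [i for i in inner
            if any(o.start <= i.start and i.end <= o.end for o in outer)]


def including(outer, inner):
    """Members of `outer` that contain some member of `inner`."""
    return [o for o in outer
            if any(o.start <= i.start and i.end <= o.end for i in inner)]


# Hypothetical two-article database seen as one long string.
db = ("<article><s><B21>Male rat models were used.</B21></s>"
      "<s><C1>Body weight fell.</C1></s></article>"
      "<article><s><A11>The rat is a common model.</A11></s></article>")

regions = find_regions(db)
# "Any articles in which the term 'rat' occurred in the Category B21".
rat_in_b21 = within(term_positions(db, "rat"), regions_of(regions, "B21"))
hits = including(regions_of(regions, "article"), rat_in_b21)
print([db[h.start:h.end] for h in hits])
```

Because both operations return lists of regions, their results can be fed into further operations, which is what allows queries to be chained on the result of an earlier query.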

Procedure
The purpose of this experiment is to test the effectiveness of text-level structure in fulltext searching both with statistical ranked-output and without it.
The searches without ranked output were conducted with three strategies: (a) without any specification of Categories or logical structure, (b) using components of the logical document structure (title, abstract, same paragraph, same sentence, and title/caption of table/figure), and (c) using the Categories. In strategy (c), the Categories were combined with higher-level ones as shown in Figure 3. Strategy (a) is based on co-occurrence of search terms in a text, and the others are based on term occurrence in specific Categories or components in a text.
The searches with ranked output were conducted with two strategies: (a) without any specification of Categories or logical structure, and (b) ranking based on term occurrence in the Categories "B.validity and Method" and "C1.Evidences", which were effective in searching without ranked output, and ranking based on a combination of term occurrence in these Categories and in the whole text.
Sixteen search topics were collected from medical researchers for the experiment. Nine topics were omitted because the number of relevant documents for them was less than four or more than 25% of the entire database. Search statements were constructed manually and included synonyms.

Results
The results of the searches without ranked output are shown in Figure 3. In general, precision increased when Categories or components were specified. "A3.Research Topic" marks the parts of the text in which the central concepts of the text, or the purposes of the research reported in the article, are usually stated. Specifying this Category increased precision sharply, but recall dropped. "B.Methods and C1.Evidence" was effective in increasing precision while maintaining recall. Specifying the second-level Categories "B1", "B2", "B3", "B4", and "B5" instead of "B" increased precision without a significant decrease in recall. The results of the searches with statistical ranked output are shown in Figure 4. Searches using the Categories "B.Methods" and "C1.Evidence", and those combining them with term occurrence in the whole text, were more effective than searches that did not use any Category. The effects of the Categories differed from one search topic to another.
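For reference, the two measures used in this comparison can be computed as in the sketch below. The document identifiers and retrieved sets are invented purely to show the arithmetic; they are not results from the experiment.

```python
def precision_recall(retrieved, relevant):
    """Precision = hits / retrieved, recall = hits / relevant."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Invented example: five relevant documents for one topic.
relevant = {1, 2, 3, 4, 5}
whole_text = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}   # terms anywhere in the text
with_category = {1, 2, 3, 4, 6}                # terms in a chosen Category

print(precision_recall(whole_text, relevant))
print(precision_recall(with_category, relevant))
```

In this toy case, restricting term occurrence to a Category removes more non-relevant than relevant documents, so precision rises while recall falls slightly, which is the kind of trade-off discussed above.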
Combining each term in a query with one of the detailed Categories is expected to increase the discrimination and precision of a search, because most of the concepts in the search topics are related to specific Categories under "B.Methods" or "C1.Evidence". For example, when a combination of term ¦ Categories is written in parentheses, one of the search topics can be represented as "(the difference in the effect ¦ B41.Items measured or C1.Evidence) of (interferon therapy ¦ B31.Procedure of Intervention) of (viral hepatitis type C ¦ B21.Attributes of Subjects) with (sex ¦ B21.Attributes of Subjects)". Selecting appropriate Categories for each search term required expertise. Further investigation is needed on the automatic formulation of the search statement from users' input.
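A term ¦ Category query of this kind can be sketched as a list of (term, Categories) pairs, each of which must be satisfied by at least one tagged sentence. The sketch below is hypothetical: the two documents are invented, matching is plain substring search on English text, and a Category is taken to cover its subcategories by code prefix, which is an assumption about the coding scheme.

```python
def matches(document, query):
    """document: list of (category_code, sentence_text) pairs.
    query: list of (term, allowed_category_prefixes) pairs.
    Every query pair must be satisfied by at least one sentence."""
    def satisfied(term, prefixes):
        return any(
            term.lower() in text.lower()
            and any(code.startswith(p) for p in prefixes)
            for code, text in document
        )
    return all(satisfied(term, prefixes) for term, prefixes in query)


# The search topic from the text, reduced to single illustrative terms.
query = [
    ("difference", ["B41", "C1"]),
    ("interferon", ["B31"]),
    ("hepatitis type C", ["B21"]),
    ("sex", ["B21"]),
]

# Invented documents, each a list of tagged sentences.
doc_a = [
    ("B21", "Patients with viral hepatitis type C were grouped by sex and age."),
    ("B31", "Interferon was administered for four weeks."),
    ("C1", "No significant difference was found between the groups."),
]
doc_b = [
    ("A11", "Interferon therapy of hepatitis type C has been reported."),
    ("C1", "A difference by sex was observed."),
]

print(matches(doc_a, query))
print(matches(doc_b, query))
```

The second document contains all of the terms, but "interferon" appears only in background material, so the Category restriction excludes it.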
One of the medical researchers who provided search topics said, "…I'm interested in the papers on interferon therapy of Viral Hepatitis which uses Davis' criteria for evaluating the prognosis status, because I'm currently writing a paper which uses this criteria, so I'd like to know any other papers that use the same criteria and compare the results..." He also said, "...Before our project team had decided the criteria for prognosis evaluation, we collected the papers on interferon therapy of Viral Hepatitis as much as we could, then scanned just the parts of methods used for evaluating the prognosis. And if a candidate was found in a paper, we examined whether the setting reported in the paper was appropriate for ours or not." This comment is an example of how relevance judgment and the use of texts differ according to the user's situation. The Categories discussed here can be used both for searching for texts that state specific measurement criteria, like "prognosis evaluation criteria", and for displaying specific parts, like "methods", of retrieved texts.

Experiment 2: Passage Extraction
The set of Categories can be used as a framework to extract passages with a specific role or function in a text. It facilitates analytical comparison of informational content across texts and allows users to gain insight into the content of databases, hence supporting users' information work such as decision making or problem solving. Sentences with a specific Category, or with a combination of Categories and terms, can be extracted from the retrieved set or from the database. For example, comparing passages describing "methodology" or "term definition" across the texts in the retrieved set is expected to be effective in comparing the standpoints of the papers and the trends among them.
Examples are shown in Appendices 1 and 2. In order to keep cohesion within each extracted passage, the previous sentence was extracted together with the target sentence when the target sentence had a demonstrative or indefinite pronoun as its first word or immediately after an initial conjunction, or when the previous sentence was the topic sentence of the paragraph.
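The cohesion rule described above can be sketched as follows. The study worked on Japanese text, so the English word lists here are stand-ins, the paragraph and the "C2" code are invented, and treating the first sentence of a paragraph as its topic sentence is an assumption made for the sketch.

```python
# English stand-ins for the Japanese cue words used in the study (assumption).
PRONOUNS = {"this", "these", "that", "those", "it", "they", "such"}
CONJUNCTIONS = {"however", "therefore", "moreover", "but", "and"}


def first_words(sentence, n=2):
    """Lower-cased first n words with surrounding punctuation removed."""
    return [w.strip(".,;:()").lower() for w in sentence.split()[:n]]


def needs_previous(sentence):
    """True if the sentence opens with a pronoun, or with a conjunction
    immediately followed by a pronoun."""
    words = first_words(sentence)
    if not words:
        return False
    if words[0] in PRONOUNS:
        return True
    return len(words) > 1 and words[0] in CONJUNCTIONS and words[1] in PRONOUNS


def extract(paragraph, wanted):
    """paragraph: list of (category, sentence); wanted: set of Categories.
    Returns the indexes of the sentences to extract, in text order."""
    picked = set()
    for i, (category, sentence) in enumerate(paragraph):
        if category in wanted:
            picked.add(i)
            # Previous sentence is kept for cohesion, or because it is the
            # (assumed) topic sentence at position 0.
            if i > 0 and (needs_previous(sentence) or i - 1 == 0):
                picked.add(i - 1)
    return sorted(picked)


# Invented paragraph with hypothetical Category tags.
paragraph = [
    ("B31", "Interferon was administered to 14 patients."),
    ("B41", "Serum GPT was measured monthly."),
    ("C1", "These values fell in 10 patients."),
    ("C2", "Further study is needed."),
]

print(extract(paragraph, {"C1"}))
print(extract(paragraph, {"B41"}))
print(extract(paragraph, {"C2"}))
```

Extracting by index rather than by text keeps the passage in its original order, so a pulled-in previous sentence always precedes the sentence that refers back to it.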

Implication of Text-Level Structure for Text-Based Information Processing Systems
The size of the database is one of the fundamental factors to consider in the effectiveness of information retrieval. Since the database used in this study was small, even though its records were systematically selected from an operational large-scale database, we cannot draw firm conclusions regarding search effectiveness. However, the results of the experiments indicated some promising lines of investigation relating to text-level structure.
Text-level structure makes available flexible ways of display and interaction with the system, such as passage extraction, browsing and navigation within and among texts, and comparison of passages extracted across texts. These help users gain insight into the content of a database or of retrieved records.
The Categories proposed in this series of studies have the potential to enhance the interaction between the user and the system by providing links, other than topical, semantic, or logical structural ones, to components within the text or in other texts. For example, suppose that while browsing a text a user finds a particular passage interesting. A search for similar texts or passages can then be conducted with a function of the search engine, based on the terms and Categories occurring in the passage. This turns the search process into continuous browsing and searching in a highly interactive setting.
Another possible application is to use the approach for analyzing text-level structure, and the concepts of the Categories, with elaborated templates. This is expected to be applicable to information extraction. Information extraction is one of the fundamental and promising techniques for automatic abstracting, text summarization, integration, and question answering. Some comments from the participant suggest a relationship between text-level structure and users' situations in the research process. Further research is needed on this.

Further Studies
The next step will be experiments with users on a large-scale database. To utilize all the possible search functions in an interactive setting, the design of the graphical user interface is one of the crucial points. In developing the user interface, the level of Categories used for search, the unit of display, the methods of query formation using the Categories, and the way the Categories are exposed to users should be decided based on users' cognition and on search effectiveness. An evaluation method for such a system also requires investigation. Applicability to texts in languages other than Japanese also needs to be investigated.

English translation of extracted passage from #1
Secondly, we studied factors in 10 patients whose s-GPT were normalized after Interferon treatment and in 4 whose s-GPT were not normalized among the 14 patients (10 with continuous administration and 4 with intermittent administration) (Table 1). No significant difference was found between the two groups in terms of sex, age, history of blood transfusion, general liver function tests or serum OD level for anti-HCV.

English translation of extracted passage from #2
Multivariate analysis of the factors influencing the normalization of aminotransferase with IFN therapy was conducted. Table 5 presents the results of the studies on the parameters of age, sex, pretreatment histology, administration of IFN (continuous administration for 4 weeks, 6 weeks or more, continuous and intermittent, intermittent). Significant differences were found in 5 of these factors. These were the total dosage of IFN, age when the treatment started, administration of IFN, and pretreatment histology. The normalization ratio of GPT by IFN was higher in the patients whose total IFN dosage was more than 400MU, whose age was less than 35, with continuous and intermittent administration, whose pretreatment histology was CH2A, and who were female.

English translation of extracted passage from #9
Next, we studied the patients who responded to IFN treatment and those who did not (Table 5). For the analysis of relevant factors, no significant difference was found with regard to age, sex, or pretreatment level of GPT, but the number of patients with CAH2B was significantly larger among those who did not respond than those who did.

English translation of extracted passage from #27
3. Analysis of the patient factors in those who responded after 6 month-intermittent administration of IFN and those who did not (Table 1)
The age of those who responded was significantly younger (p<0.01), 40.3±12.3 years versus 50.1±7.7 years for the nonresponders. The periods between blood transfusion and IFN treatment, and the duration of hepatitis were significantly shorter for the responders (p<0.05), 66±112.3 months and 24±33.2 months, respectively, versus 218.9±152.4 and 62.8±37 months for the nonresponders.

English translation of extracted passage from #43
Univariate analysis between the groups (Table 2)
The ages of the responders ranged from 22 to 28, 42±14.5 years old on average, i.e. significantly younger than the nonresponders who were from 51 to 67, 58.1±5.7 years old on average(p<0.01).
Among the patients who were anti-HCV positive, significant differences were found between those who responded to the treatment and those who did not in terms of age, pretreatment 2-5 AS activity, maximum 2-5AS activity, and serum GPT level.

(b) anti-HCV positive rate
English translation of extracted passage from #10
It has been reported that the anti-HCV positive rate in Japanese is 1.1% to 1.2% based on surveys of blood donors [KATA90a].

Figure 3: Results of Full-Length Text Search without Ranked-output (Average)