Web application for development of domain information space for thematic information retrieval and reuse

This paper addresses the problem of creating an information space of a subject domain for subsequent information retrieval and reuse in various subject domains. An ontology-based approach to the collaborative development of the domain information space using personal human cognitive spaces is proposed. Within the framework of this approach, an information space ontology was constructed and a prototype editor for developing the domain information space from personal human cognitive spaces was developed.


INTRODUCTION
One of the important stages in any research work or learning process is the investigation of the subject domain. Thematic information retrieval is the search for information resources that satisfy an information need, and it is one of the ways to explore a subject domain. Exploring information resources that differ in coverage of the subject domain, complexity level, and form of presentation, and that are focused on different goals and outcomes, improves the efficiency of learning. In scientific research, open information resources are important in the early stages of the process: studying the state of the art in the subject domain, identifying research problems and goals, and stating the specific research tasks.
The subject domain can have a rather complex structure and cover a huge set of concepts and relations between them, which makes studying these resources extremely difficult for a person who is not familiar with the domain. In this case, it is advisable to create a domain model while studying the resources, or to use and extend domain models created by other people. An important and urgent problem, therefore, is to automate the collaborative development of the domain information space for different subject domains. This domain information space should include a description of the subject domain structure as well as the information resources available for this subject domain.

THEMATIC INFORMATION RETRIEVAL
The accuracy of thematic information retrieval depends heavily on how well the document semantics are represented in the form of a document search pattern. To define the search pattern appropriately, one needs the document context, which defines the basic domain concepts and the semantic relationships between them.
Existing approaches (keywords and key phrases, descriptors, classifiers, and rubricators) do not adequately represent the domain concepts and their interrelationships, nor do they describe the domain in terms of the information needs of specific user groups.
Thus, developing a document search pattern that adequately represents the subject domain in terms of the specific needs of the subjects of information retrieval is a highly relevant current problem.
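The difference between a keyword-based description and a search pattern built from concepts and relationships can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `SearchPattern` structure, the Jaccard-based `similarity` measure, the `relation_weight` parameter, and the example concepts are hypothetical and are not taken from the approach described in this paper.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# A relationship is a (concept, relation name, concept) triple (assumed form).
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class SearchPattern:
    """Hypothetical search pattern: domain concepts plus their relationships."""
    concepts: FrozenSet[str]
    relations: FrozenSet[Triple]


def jaccard(a: frozenset, b: frozenset) -> float:
    """Share of common elements; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def similarity(query: SearchPattern, document: SearchPattern,
               relation_weight: float = 0.5) -> float:
    """Weighted mix of concept overlap and relationship overlap (assumed measure)."""
    concept_score = jaccard(query.concepts, document.concepts)
    relation_score = jaccard(query.relations, document.relations)
    return (1 - relation_weight) * concept_score + relation_weight * relation_score


query = SearchPattern(
    concepts=frozenset({"loop", "array"}),
    relations=frozenset({("loop", "iterates_over", "array")}),
)
# Same concepts and the same relationship as the query.
doc_a = SearchPattern(
    concepts=frozenset({"loop", "array"}),
    relations=frozenset({("loop", "iterates_over", "array")}),
)
# Same concepts, but a different relationship between them.
doc_b = SearchPattern(
    concepts=frozenset({"loop", "array"}),
    relations=frozenset({("array", "is_allocated_in", "loop")}),
)

print(similarity(query, doc_a))  # 1.0
print(similarity(query, doc_b))  # 0.5
```

A comparison on concepts alone would rate both documents as equally relevant. Only the relationship component separates them, which is the gap in keyword-style descriptions that the text points out.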

STATE OF THE ART
Information support of learning implies the use of appropriate information and educational resources. The problem is to find, among a great number of available resources, those that achieve the learning objectives and meet the learner's requirements, and to provide an effective personal learning trajectory toward the required learning outcomes. The OpenCourseWare approach requires that learning content be presented as a combination of reusable and remixable learning objects. The number of low-quality learning objects on the Internet is growing fast, which makes content search and filtering challenging and labor-intensive. The efficiency of learning resource retrieval can be improved by annotating resources with metadata based on appropriate metadata standards and ontological models (5) (6) (7) (8).
In scientific research, electronic libraries (such as ACM Digital Library, IEEE Xplore Digital Library, SpringerLink, ISI Web of Science) give access to high-quality peer-reviewed scientific papers, but they provide access only by subscription and do not support semantic search within subject domains. The OpenScience approach involves publishing open research, campaigning for open access, encouraging scientists to practice open notebook science, and generally making it easier to publish and communicate scientific knowledge. Services like arXiv.org allow authors to publish papers and provide open access to them. Services like ShareLaTeX (9), Authorea (10) and Intech (11) support collaborative writing and/or publishing of papers. Scientific social networks like ResearchGate (12) and Academia (13) allow researchers to share publications and to connect and collaborate with co-authors and specialists in related fields. Some software tools (like Mendeley Desktop, Zotero) allow users to create personal repositories of information resources to use in research and cite in subsequent publications. During the research process, domain models (conceptual models, mind maps, ontologies) are created using various software tools. Some of these tools allow information resources to be associated with the components of the domain models, but none of them allows sharing the resulting model together with its resources, or searching the models and resources created and gathered by other researchers.
According to (3) (4), a cognitive space is the set of concepts and relations among them held by a human. The cognitive space can be individual or shared by a group of people. Using modern software tools, the cognitive space can be mapped onto a conceptual model represented as a mind map, topic map, concept map (conceptual diagram), or ontology. An information space (3) (4) is the set of objects and relations among them held by an information system. The components of the information space for the information retrieval task include concepts, documents, words, and relations among words and documents. The information space should therefore be consistent with the cognitive space of particular individuals or groups.
The essential and pressing problem is to create an information space for a subject domain that is relevant to the personal cognitive space of the subject of the information process.
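The two definitions above can be written down as simple data structures. The following is a minimal sketch under stated assumptions: the class names, the `Relation` triple, and the `coverage` function are hypothetical. In particular, the text does not define how consistency between the two spaces is measured, so concept coverage is used here only as one possible stand-in.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass(frozen=True)
class Relation:
    """A named link between two elements of a space (assumed representation)."""
    source: str
    kind: str
    target: str


@dataclass
class CognitiveSpace:
    """Concepts and relations among them held by a person or a group."""
    owner: str
    concepts: Set[str] = field(default_factory=set)
    relations: Set[Relation] = field(default_factory=set)


@dataclass
class InformationSpace:
    """Objects and relations among them held by an information system."""
    concepts: Set[str] = field(default_factory=set)
    documents: Set[str] = field(default_factory=set)
    words: Set[str] = field(default_factory=set)
    relations: Set[Relation] = field(default_factory=set)


def coverage(cognitive: CognitiveSpace, information: InformationSpace) -> float:
    """Fraction of the owner's concepts present in the information space.

    This is an assumed, deliberately simple proxy for consistency.
    """
    if not cognitive.concepts:
        return 1.0
    shared = cognitive.concepts & information.concepts
    return len(shared) / len(cognitive.concepts)


student = CognitiveSpace(
    owner="student",
    concepts={"variable", "loop", "recursion"},
    relations={Relation("loop", "uses", "variable")},
)
library = InformationSpace(
    concepts={"variable", "loop"},
    documents={"doc-1"},
    words={"for", "while"},
    relations={Relation("doc-1", "mentions", "loop")},
)

# Two of the student's three concepts are covered by the information space.
print(round(coverage(student, library), 2))  # 0.67
```

A low coverage value would indicate that the information space lacks concepts this person works with, so it would need to be extended before it could serve that person's retrieval tasks.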

COGNITIVE-INFORMATION SPACE EDITOR
In (1) (2) an approach is proposed that represents the subject domain as a set of concepts and relationships between them and links these concepts to the appropriate information resources (documents). As a result, a cognitive-information space of the subject domain is formed, and a collection of information resources that meets the requirements of a particular user group is generated in this space. To evaluate the appropriateness of the proposed approach, a prototype of the Cognitive-information space editor was created, which implements the following basic functionality (Fig. 1):
1. creation of the domain cognitive space as a set of domain concepts and relationships between them;
2. creation of the domain cognitive-information space by setting association relationships between the concepts and information resources, with evaluation of the resource relevance and of the concept localization in the resource;
3. viewing of the existing cognitive-information space and its parameters.
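The three editor functions listed above can be sketched as a small in-memory model. This is an illustration under assumptions, not the prototype's actual design: the class and method names, the 0.0 to 1.0 relevance scale, the free-text `location` field, and the set of reported parameters are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class Association:
    """Link between a concept and an information resource."""
    concept: str
    resource: str
    relevance: float  # assumed scale: 0.0 (irrelevant) to 1.0 (fully relevant)
    location: str     # where the concept appears in the resource (assumed free text)


@dataclass
class CognitiveInformationSpace:
    concepts: Set[str] = field(default_factory=set)
    relationships: Set[Tuple[str, str, str]] = field(default_factory=set)
    associations: List[Association] = field(default_factory=list)

    # Function 1: build the cognitive space from concepts and relationships.
    def add_concept(self, name: str) -> None:
        self.concepts.add(name)

    def relate(self, source: str, kind: str, target: str) -> None:
        for concept in (source, target):
            if concept not in self.concepts:
                raise KeyError(f"unknown concept: {concept}")
        self.relationships.add((source, kind, target))

    # Function 2: associate a concept with a resource, with relevance and location.
    def associate(self, concept: str, resource: str,
                  relevance: float, location: str) -> None:
        if concept not in self.concepts:
            raise KeyError(f"unknown concept: {concept}")
        if not 0.0 <= relevance <= 1.0:
            raise ValueError("relevance must be between 0.0 and 1.0")
        self.associations.append(Association(concept, resource, relevance, location))

    # Function 3: view the space and its parameters.
    def resources_for(self, concept: str) -> List[Association]:
        """Associations of one concept, most relevant first."""
        found = [a for a in self.associations if a.concept == concept]
        return sorted(found, key=lambda a: a.relevance, reverse=True)

    def parameters(self) -> Dict[str, int]:
        return {
            "concepts": len(self.concepts),
            "relationships": len(self.relationships),
            "resources": len({a.resource for a in self.associations}),
            "associations": len(self.associations),
        }


space = CognitiveInformationSpace()
space.add_concept("loop")
space.add_concept("array")
space.relate("loop", "iterates_over", "array")
space.associate("loop", "tutorial.pdf", 0.9, "section 2")
space.associate("loop", "reference.html", 0.4, "chapter 7")
space.associate("array", "tutorial.pdf", 0.7, "section 3")

print([a.resource for a in space.resources_for("loop")])
print(space.parameters())
```

Storing relevance and location on the association, rather than on the resource, lets the same document be highly relevant to one concept and marginal to another.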

CONCLUSION AND FUTURE WORK
The proposed representation of the document search pattern as a fragment of the cognitive-information space makes it possible to take into account the relationships between the concepts of the subject domain and to represent these concepts in the information resources in terms of the information needs of different user groups.
The developed prototype of the Cognitive-information space editor is planned to be used to create a cognitive-information space for the Programming languages subject domain and to support manual and automated thematic search for students learning programming at VSTU.

Figure 1: Cognitive-information space editor: Use case diagram