satisfy their information

This paper discusses how the mass digitisation of music has led to the emerging discipline of Music Information Retrieval (MIR), which has focussed more on systems than on users, and identifies information need for work purposes as the focus of planned research. A literature review provides an overview of developments in MIR and points out its multidisciplinary nature, which causes problems in evaluation and retrieval. Two types of system, content-based and context-based, are discussed. It is suggested that each type meets differing user needs depending on the user's level of specialism or interest, and that information behaviour and need differ according to the type of user. Evaluation is discussed, noting the historical links with text retrieval while proposing that music retrieval has sufficient additional complexities to justify its own discipline. A discussion of user research suggests that both content and context should be considered, and that different users respond to music in different ways, leading to a requirement for systems which reflect a variety of approaches and interpretations, needs and uses. It is proposed that a range of music industry professionals be studied through semi-structured interviews and observation in order to investigate their information needs and behaviour, and that the systems they use be evaluated by the existing techniques of precision and recall as well as from interview and observation data. Interview questions will be based on a semiotic music analysis framework. Analysis and discussion of the data will refer to existing information need models and a reflexive communication model, while a cognitive information seeking and retrieval model will ground the research in current thinking. It is planned that the analysis will allow the researcher to determine whether an ideal MIR system can serve the needs of the music industry professional. Finally, discussion issues are raised which highlight the holistic focus and interdisciplinary approach of the project.


INTRODUCTION
When searching for music, users follow a number of search strategies depending on whether they are looking for known items or for unspecified items that suit certain contextual criteria. Digitisation has made Music Information Retrieval (MIR) increasingly important because users now have access to a large number of globally and locally situated documents, which are accessible from databases and via the internet. Although MIR is an emerging discipline, there is little existing research into user needs, particularly those of users who need music for work purposes.

This paper offers a brief overview of the MIR literature, discussing the three areas of systems, evaluation and user needs. It goes on to propose research which would identify the information needs of music industry professionals, examine their information seeking behaviour and the way they communicate and resolve their needs, and recommend ways to improve their search results in the context of Information Retrieval (IR) theory and tools. Using an inductive approach and qualitative analysis, the writer plans to investigate information seeking needs and behaviour by questionnaire, interview and observation; to evaluate some existing systems by traditional Information Retrieval techniques; and to use the results to determine whether a general MIR system would be able to satisfy the needs of music industry professionals. This reflects the Interactive Information Seeking, Retrieval and Behavioural model of Ingwersen and Järvelin (2005:261), which calls for a holistic approach to Information Seeking and Retrieval (IS&R) research. The section on methodology details a qualitative approach for user interviews and established IR evaluation for systems, while the interviews will be based on a semiotic framework derived from popular music analysis theory. It is also proposed that a reflexive communication model will be used to discuss meaning in music. Finally, issues are raised for discussion regarding the scope and focus of the project and its interdisciplinary approach.

The researcher's extensive employment background in the commercial music industry, combined with recent Masters-level Library and Information Studies and a dissertation on the user needs of a folk music library, puts him in a unique position, as he has an understanding of the context and access to a wide range of normally inaccessible professionals.

BCS IRSG Symposium: Future Directions in Information Access (FDIA 2007)

Music Information Retrieval
The discipline of IR arose in the 1950s. When computers started to be widely used for managing information, the term was used to describe 'the process in which users put questions to information systems and consequently get some answers' (Hjørland, 1998), or 'the optimal relationship between the input and the output of the information retrieval system' (McLane, 1996). Full-text searching now means that IR includes finding information as well as finding the document that contains it. The increasing availability of digital information packages has led to a widespread focus on multimedia (text, images, music, video) information retrieval. Although it was proposed in the 1960s by Kassler (1966), Music Information Retrieval is a relatively new discipline, having become more firmly established at the end of the 1990s. A multidisciplinary approach is required to fully appreciate MIR. Futrelle and Downie (2002) proposed that this multidisciplinary approach is partly due to the range of possible representations of music (symbolic, audio, visual and metadata) and their complexity (the numerous facets of music such as harmony, polyphony and timbre have significant effects on MIR). Byrd and Crawford (2002) discuss how this complexity causes problems with the evaluation of systems and recognise that little research has been done in the area of user needs. Downie (2003) discusses how research has to look at a wide variety of issues to deal with these difficulties and focuses on two types of MIR system, analytic/production systems and locating systems:

Systems
• Analytic/production systems aim for representational completeness and are designed for specialists who wish to make a detailed study of music. They are predominantly content-based.
• Locating systems (context-based) include OPACs, databases and search engines and aim for broad and extensive coverage. These systems have fewer access points but are good at finding known items. Adding full text to a locating system improves search results, and search queries are easier to construct than in analytic/production systems.

Typke et al (2005) examined a wide range of MIR systems for content-based music searching and found two types of analytic/production system:
• Searching audio data by extracting perceptually relevant features such as loudness, pitch or tone; audio fingerprinting; set-based methods and self-organising maps.
• Searching notated music using distance measures and indexing in string-based methods for monophonic melodies, set-based methods for polyphonic music and probabilistic matching.

They propose that these types of system would be useful for query by example, such as 'query-by-humming', for giving musicologists and composers greater insight into composition structures, and for evaluating copyright issues such as plagiarism. However, this research suggests a lack of insight into the needs of the professional user. A music industry-related investigation was carried out by Pachet and Cazaly (2000), who proposed a taxonomy of genres, but there is no indication that this was tested on a user base. Although Orio (2006) discussed various issues relating to the music industry professional in his recent thorough tutorial and review of MIR, these were not investigated in depth. Selfridge-Field (2000) discusses the importance of queries in the discipline: queries are likely to be 'fuzzy'; music can be coded in different ways; and the performance is important. She also points out that curiosity is a key thread.

Similarity and recommendation have been explored from the user's point of view by Vignoli and Pauws (2005), who found that users prefer to have control over their own definitions. Some software applications are designed to recommend music that interests the listener (Uitdenbogerd and Schyndel 2002; Cunningham, Downie and Bainbridge 2005; Celma, Ramirez and Herrera 2005), and a variety of websites offer this as a service to the recreational user (including last.fm and Pandora) and to the business user (Ricall, BMG Music Search, KPM).
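
As an illustration of the string-based matching of monophonic melodies surveyed by Typke et al (2005), the following minimal Python sketch ranks melodies by the edit distance between their pitch-interval sequences. It is not taken from any of the systems reviewed: the representation of a melody as a list of MIDI pitch numbers, the unit edit costs, the example tunes, and the names `intervals`, `edit_distance` and `rank_melodies` are all assumptions made for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def intervals(pitches: Sequence[int]) -> List[int]:
    """Convert MIDI pitch numbers into successive semitone intervals.

    Working on intervals rather than absolute pitches makes the comparison
    independent of the key in which a query is sung or played.
    """
    return [b - a for a, b in zip(pitches, pitches[1:])]


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance with unit costs (an illustrative assumption)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def rank_melodies(query: Sequence[int],
                  collection: Dict[str, Sequence[int]]) -> List[Tuple[int, str]]:
    """Return (distance, title) pairs, closest melody first."""
    q = intervals(query)
    return sorted((edit_distance(q, intervals(p)), title)
                  for title, p in collection.items())


# Hypothetical two-item collection; the query is tune_a transposed up a tone.
collection = {
    "tune_a": [60, 62, 64, 65, 67],
    "tune_b": [60, 60, 67, 67, 69, 69, 67],
}
ranking = rank_melodies([62, 64, 66, 67, 69], collection)
```

A query-by-humming front end would need to transcribe audio into pitches before this step, and real queries are likely to be 'fuzzy', which is why a tolerant distance measure is used rather than exact matching.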

Evaluation
MIR systems evaluation is an important research area but has been variable and unfocussed. A lack of large test databases means that any evaluation carried out on one database is not comparable with evaluations carried out on others. This was also the case with text retrieval and was widely discussed in the 1960s (van Rijsbergen, 1979), with the Cranfield tests leading to the establishment of an evaluation methodology (Chowdhury 2004) and to the Text REtrieval Conference (TREC), which provides a selection of large collections for text-based IR experiments (TREC 2007). Although the traditional text-based IR evaluation measures (precision and recall) have been used (Foote 1999, Downie 1999, Uitdenbogerd and Zobel 1999), they have been found wanting, partly because of the numerous facets of the information but also because of the different analytic approaches taken (Downie 2003). Warner (2000) suggested that a more appropriate way to evaluate IR systems may be to focus on exploratory capability or cognitive control rather than on precision and recall. Byrd and Crawford (2002) recommend finding an alternative to the Cranfield tests, and Typke et al (2005) propose a 'TREC-like comparison of matching algorithms' to combat the problems of different approaches and different databases. This is supported by Downie (2003), who points out the need for professional music librarians to be involved in the development of TREC-like music query records and also introduces the idea of using 'n-grams', or summaries of elements of music similar to words, which are then tested using traditional techniques (Downie 1999). Although there have been problems caused by intellectual property issues (Downie 2003), this has led to the establishment of the music evaluation programme MIREX (Music Information Retrieval Evaluation eXchange), which offers MIR researchers a standard test bed for testing analysis and retrieval algorithms.

Futrelle and Downie (2002) state that it is very important to look at user needs in MIR. They highlight examples of research that ignore users, even when they are the subject, and discuss valuable user needs research such as Itoh's (2000) examination of transaction logs of OPAC searches, which found that performance and genre were more important search terms than performer and title and recommended a more detailed qualitative survey into search purpose and satisfaction levels. The MELDEX usage investigation (McPherson and Bainbridge 2001) found that the greater part of the searches examined were by text and involved browsing. Baumann and Kluter (2002) investigated 100 users and found that non-musical listeners often used lyrics to search. Kim and Belkin (2002) found that systems must be able to accommodate verbalised emotional responses if they are to be of any value to the non-musical user. This was discussed further by Airey (2005), who investigated the use of emotive terms in democratic indexing and found that listeners do not respond in the same way to the same piece of music. Queries were also discussed by Cunningham (2002), who confirmed that MIR systems that reflect user need can only be built if users are investigated, and found from her observations that users prefer a combination of browsing and known-item searching. She developed this further by investigating Google Answers (Cunningham 2005), finding that users predominantly require bibliographic metadata and that fuzzy queries were also widely used. This was supported and developed further by Lee and Downie's (2004) extensive investigations, which found that users refer to collective knowledge (reviews, recommendations, ratings) and prefer to have access to content-metadata (musical and bibliographical) and context-metadata (relational and associative). One of the most cited investigations into user needs was by Downie and Cunningham (2002), who studied 161 postings to an old-time music Usenet newsgroup. Four categories emerged from their analysis of the queries: information need, desired outcomes, intended uses of information, and social and contextual elements. They recommended that these be studied in more detail when designing MIR systems. Downie (2003) also reported that music queries should reflect the library reference interview and that professional librarians must be involved in the development of MIR systems if they are to satisfy user needs. Cunningham et al (2004) found that systems should be easy to use and able to learn from users, reflecting their idiosyncratic use of metadata when describing the contents of music files. The focus of user needs research to date has been on recreational users' 'Daily-life Tasks' (Ingwersen and Järvelin, 2005) rather than on music industry professionals' 'Work Tasks', which may differ.
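
The 'n-gram' idea noted in the evaluation discussion above (Downie 1999), in which short fragments of music are treated like words, can be sketched in Python as follows. This is an illustrative assumption about how such fragments might be formed, not a reconstruction of Downie's method: melodies are taken to be lists of MIDI pitch numbers, each n-gram is a run of `n` successive semitone intervals, and the names `interval_ngrams` and `build_index` and the example tunes are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set


def interval_ngrams(pitches: Sequence[int], n: int = 3) -> List[str]:
    """Return overlapping n-grams of semitone intervals as text 'words'."""
    ivs = [b - a for a, b in zip(pitches, pitches[1:])]
    return ["_".join(str(i) for i in ivs[k:k + n])
            for k in range(len(ivs) - n + 1)]


def build_index(collection: Dict[str, Sequence[int]],
                n: int = 3) -> Dict[str, Set[str]]:
    """Build an inverted index from each n-gram to the titles containing it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title, pitches in collection.items():
        for gram in interval_ngrams(pitches, n):
            index[gram].add(title)
    return index


# Hypothetical collection: a rising tune and a falling tune.
collection = {
    "tune_a": [60, 62, 64, 65, 67],
    "tune_b": [67, 65, 64, 62, 60],
}
index = build_index(collection)

# A short query fragment (tune_a's opening, transposed up a tone).
query_grams = interval_ngrams([62, 64, 66, 67])
matches: Set[str] = set().union(*(index.get(g, set()) for g in query_grams))
```

Once music is reduced to such 'words', the retrieved titles can be scored with the same precision and recall measures used in text retrieval, which is what makes the approach attractive for comparison with established IR evaluation.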

DESCRIPTION OF PROPOSED RESEARCH INCLUDING MAIN RESEARCH QUESTIONS
This research will investigate user information needs in the community of music industry professionals, and evaluate how they are satisfied. It is then proposed to use these findings to recommend an MIR system for the community. There are important applications for this throughout the music industry, where increased accessibility to music is expected to lead to increased revenues. The focus will be on how professionals search for polyphonic Western music in large collections. Representatives of a number of industry sectors will be consulted, including publishers, record companies, film music supervisors and webmasters. After developing an overview of user needs and searching methods and drawing ideas from existing retrieval systems, the research will consider whether an MIR system can be developed that encompasses the needs of the various types of music industry user.

Aims
a) To use a clear understanding of the conceptual and theoretical models in Information Retrieval to evaluate music industry professionals' user needs.
b) To recommend improved ways in which music industry professionals can search for digital music in large collections.
c) To propose an MIR system for the music industry professional community.

Objectives
a) To review the literature relating to MIR and evaluate how it relates to traditional Information Retrieval.
b) To identify music industry professional users of MIR systems and investigate their information needs and how they express those needs.
c) To evaluate whether the results of their searching meet those needs.
d) To identify and evaluate various retrieval systems used by the music industry.
e) To use this information in proposing an ideal MIR system.

Hypothesis
The ideal MIR system for the music industry professional combines content-based and context-based indexing.

Scope and definition
This project will investigate UK-based music industry professional users of Western commercial polyphonic music in the commercial sector between 2006 and 2009.

OUTLINE OF CONCEPTUAL / THEORETICAL MODELS THAT WILL BE USED
The research will follow the recommendations of Ingwersen and Järvelin (2005) by investigating the subjects' Information Seeking and Retrieval (IS&R) behaviour from a Cognitive Viewpoint. It will attempt to take a holistic approach to answering the research question by reference to their Interactive Information Seeking, Retrieval and Behavioural processes model (Ingwersen and Järvelin, 2005:261), which suggests that context is a key influence in IS&R. The key user models of Dervin and Nilan (1986), Ellis (1989), Kuhlthau (1991) and Wilson (1999) will be referred to. A reflexive communication model based on Tagg (1999) will be used to discuss how meaning in music is communicated, and the work of Tagg (1999), Middleton (1990) and Stefani (1987, in Middleton 1990) will be used to generate a semiotic analysis checklist. This checklist will then be applied to different types of users to provide an insight into how context affects this area of MIR.

RESEARCH METHODOLOGY
Inductive research will be used to find out who the users are and to investigate their needs. Some statistical analysis will be used in analysing the demography of the music industry user base and in evaluating their MIR systems using the traditional information retrieval measures of recall and precision. There will also be qualitative research, and a behaviourist approach will be taken, investigating the subjects' perception of the world and their different opinions on music. Real people will be studied in the real world by observation, questionnaire and interview.

Investigation of user needs
a) Conduct a questionnaire-based survey focusing on the way industry people search for and use music. Possible questions may include:
• Do you use music in your job, and how often? (to help establish whether they search for music in their job)
• What do you use it for? (to help establish different social/work groups)
• Do you have to select it from a wide range of music choices? (to indicate range of choice and relevance)
• What type of system do you use for this? (to identify different systems)
• Do you use words / sound / audience feedback etc to make this choice? (to determine content / context)
b) This questionnaire will be sent to a wide range of industry professionals on an existing database established by the writer. It will also be made available on a website designed by the writer and publicised in the trade press and through industry word-of-mouth and professional networks. The target is around 100 usable responses.
c) Interview a sample of those who use music frequently, using semi-structured interviews, to find out more about their use of systems, their search strategies and the way they express their needs.
d) Analyse these interviews, looking for recurring themes, using Discourse Analysis and a semiotic framework.
e) Observe a sample of respondents in the workplace using their systems.

Evaluation of existing music IR systems
a) Identify existing MIR systems: There is a wide range of MIR systems available. Some are designed specifically for the organisation that uses them, some are commercial software for professionals, and others are freely available to the consumer but have been adapted by the professional user to meet their requirements.
b) Find out what systems music industry professionals use: Do they use bespoke, specialist or adapted systems?
c) Undertake established IR evaluation (precision and recall) testing on them: Once the systems have been identified through desk and questionnaire / interview research, they will be evaluated according to existing criteria (precision / recall).
d) Analyse and discuss the results: The quantitative and qualitative results of this analysis will give an indication of the value of the existing systems.
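
For reference, the precision and recall measures named in step c) can be computed as in the minimal Python sketch below. Precision is the proportion of retrieved items that are relevant, and recall is the proportion of relevant items that are retrieved. The function name `precision_recall`, the track identifiers and the convention of returning 0.0 for an empty set are assumptions made for illustration; the relevance judgements would in practice come from the professionals being studied.

```python
from typing import Iterable, Tuple


def precision_recall(retrieved: Iterable[str],
                     relevant: Iterable[str]) -> Tuple[float, float]:
    """Return (precision, recall) for one query.

    retrieved: identifiers returned by the system under evaluation.
    relevant:  identifiers judged relevant to the query by the user.
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    # Returning 0.0 when a set is empty is a convention assumed here.
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Hypothetical query: four tracks returned, three judged relevant overall.
precision, recall = precision_recall(
    retrieved=["t1", "t2", "t3", "t4"],
    relevant=["t2", "t4", "t7"],
)
```

In this example two of the four retrieved tracks are relevant, and two of the three relevant tracks were found. Recall in particular depends on knowing the full set of relevant items in a collection, which is one reason the lack of shared test databases makes MIR evaluations hard to compare.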
Use this information to propose an MIR system for the professional user community
The information generated by the research into the professional user needs and search strategies will provide the researcher with the opportunity to propose a system and make recommendations about what this would include and how it would be evaluated.

PARTICULAR ISSUES HIGHLIGHTED FOR DISCUSSION
a) Is this too ambitious, i.e. should it focus on just one aspect of the IS&R model, e.g. the way the users express their request and / or query and how it relates to their context?
b) Methodology: the author proposes the use of a semiotic framework to analyse how users express their information needs when describing pieces of music. This is designed to elicit the meaning given to a piece of music by the user and incorporates musical elements such as timbre, key and tempo as well as aspects of communication such as the relationship between the transmitter and the receiver, their levels of musical expertise, etc.
c) A social semiotic approach will be taken to the research, acknowledging that music has meaning when communication takes place, i.e. within context as well as from its content.
d) This relates to a reflexive feedback communication model based on the work of Tagg (1999), which shows how meaning in music is communicated within socio-cultural codes and musical competences, and that this information can travel from composer/performer to listener and back again.