Towards an Evidence-Based Understanding of Electronic Data Sources

Systematic Literature Reviews and Systematic Mapping Studies are relatively new forms of secondary studies in software engineering. Identifying relevant papers from various Electronic Data Sources (EDS) is one of the key activities of conducting these kinds of studies. Hence, the selection of EDS for searching the potentially relevant papers is an important decision, which can affect a study’s coverage of relevant papers. Researchers usually select EDS mainly based on personal knowledge, experience, and preferences and/or recommendations by other researchers. We believe that building an evidence-based understanding of EDS can enable researchers to make more informed decisions about the selection of EDS. This paper reports our initial effort towards this end. We propose an initial set of metrics for characterizing the EDS from the perspective of the needs of secondary studies. We explain the usage and benefits of the proposed metrics using the data gathered from two secondary studies. We also tried to synthesize the data from the two studies and that from literature to provide initial evidence-based heuristics for EDS selection.


INTRODUCTION
Secondary studies, such as Systematic Literature Reviews (SLR) and Systematic Mapping Studies (SMS), are the core methods of Evidence-Based Software Engineering (EBSE) [1], which is expected to help practitioners make informed decisions about technology selection and adoption [2]. Recently, there has been growing interest in performing SLR and SMS in Software Engineering (SE). Retrieval of relevant papers is an important concern in any SLR or SMS [3], because this activity determines the coverage of the relevant papers considered. Automatic search, i.e., executing search strings on Electronic Data Sources (EDS; the abbreviation is used as both singular and plural, depending on context), is the dominant method for identifying relevant papers. (The alternative is manual search, i.e., manually browsing journals or conference proceedings. Manual search has several advantages over automatic search; however, due to the vast volume of available literature, it is usually too time-consuming.) However, researchers' selection of EDS is usually based on personal preferences or ad hoc experience rather than on evidence. We assert that building an evidence-based understanding of EDS is essential for improving the current state of practice in identifying primary studies through automatic searches. This paper reports our initial efforts towards such an evidence-based understanding of EDS.

ELECTRONIC DATA SOURCES (EDS)
We have collected a list of EDS from a library of published SLR and SMS maintained by our group, and identified the EDS most often used by SE researchers. We classify EDS into two main categories: index engines and publishers' sites. Index engines mainly index the work published by various publishers. Some popular index engines are WoS, EI, GS, SCOPUS, and CS. Publishers' sites are the online literature databases that publishers provide to facilitate easy retrieval of their published literature. Some popular publishers' sites are IEEE, ACM, SD, SL, and WIS. Some EDS (e.g., ACM) function as both an index engine and a publisher's site.

METRICS FOR CHARACTERIZING EDS
In the context of primary study search, precision and recall are two widely used metrics in SLR. They are usually used to measure the effectiveness of a search string when applied to a particular EDS or set of EDS, and to measure the effectiveness of the whole search phase. However, they are not adequate for measuring the effectiveness of EDS directly.
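For reference, the two metrics can be written as below. This is a sketch under the assumption that the usual information-retrieval definitions are intended; the set names "relevant" (all relevant papers) and "retrieved" (all papers returned by the search) are labels introduced here for illustration.

```latex
\[
\text{precision} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{retrieved}|},
\qquad
\text{recall} = \frac{|\text{relevant} \cap \text{retrieved}|}{|\text{relevant}|}
\]
```

Both quantities describe one search as a whole, which is why they say little about how the individual EDS in a combination relate to each other.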
In the context of SLR or SMS, researchers often search for potentially relevant papers in multiple EDS. Cases where researchers rely on only one EDS for identifying relevant papers are rare, if they occur at all. Rather, researchers usually select a set of EDS, and it is the combination of the selected EDS that provides the expected coverage of the available relevant papers. Thus, attention should be paid to the combination of several EDS as a whole rather than to a single EDS in isolation. Following this principle, we propose three metrics for characterizing EDS, to help researchers performing secondary studies understand EDS better. The three metrics are overlap, overall contribution, and exclusive contribution. They are summarized below.
• Overall contribution represents the percentage of the relevant papers returned by a certain EDS out of the total relevant papers. This metric can help identify the dominant EDS.
• Overlap indicates the papers that are returned by multiple EDS. The overlap can be represented in an overlap matrix.
• Exclusive contribution represents the papers that will be missed if a certain EDS is not searched, i.e., the relevant papers returned by that EDS only. This metric can help researchers decide which EDS to omit if they cannot afford to search all EDS because of limited resources. It also indicates the number of papers that may be missed when a certain EDS is not searched.
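The three metrics above can be computed directly from the sets of relevant papers returned by each EDS. The following is a minimal sketch, not the procedure used in our cases: the function names, the dictionary layout, and the paper identifiers (`p1` to `p5`) are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def overall_contribution(results):
    """Map each EDS to (count, percent) of all relevant papers that it returned."""
    total = len(set().union(*results.values()))
    if total == 0:
        return {}
    return {eds: (len(papers), 100.0 * len(papers) / total)
            for eds, papers in results.items()}


def overlap_matrix(results):
    """Map each unordered pair of EDS to the number of relevant papers both returned."""
    names = sorted(results)
    return {(a, b): len(results[a] & results[b])
            for a, b in combinations(names, 2)}


def exclusive_contribution(results):
    """Map each EDS to (count, percent) of relevant papers that no other EDS returned."""
    total = len(set().union(*results.values()))
    if total == 0:
        return {}
    out = {}
    for eds, papers in results.items():
        others = set().union(*(p for name, p in results.items() if name != eds))
        only_here = papers - others
        out[eds] = (len(only_here), 100.0 * len(only_here) / total)
    return out


# Hypothetical toy data: relevant papers (after selection) returned by each EDS.
results = {
    "EI": {"p1", "p2", "p3"},
    "WoS": {"p2", "p3", "p4"},
    "SL": {"p3", "p5"},
}

print(overall_contribution(results))
print(overlap_matrix(results))
print(exclusive_contribution(results))
```

Only the relevant papers enter the computation, so the sketch assumes that duplicates have been resolved to a common identifier across EDS before the sets are built.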

TWO CASES
We use two secondary studies performed in our group as cases, and have analyzed the search results obtained in the two studies using the three metrics described above. The first study is an SLR of Variability Management (VM) in software product lines (hereafter "SLR of VM"). The results of this study have been published in [4]. The second study is an SMS of Architectural Knowledge Management (hereafter "SMS of AKM"). It is at the stage of data analysis and reporting. In the following sections, we briefly describe each of these two cases.

SLR of VM (Case 1)
This study aimed to review the status of evaluation of VM approaches in software product lines (an overview of VM approaches is presented in [5]). The search string used is as below: <<software AND (product line OR product lines OR product family OR product families) AND (variability OR variation OR variant)>> The search string was run on the following 7 EDS: IEEE, ACM, CS, SD, EI, SL, and WoS. From all sources, 628 papers were found after removing the duplicates, of which 97 papers were finally selected.
The data of the metrics are presented in Table 1 to Table 3.

SMS of AKM (Case 2)
The search was run on the following 7 EDS: ACM, IEEE, WoS, SD, EI, WIS, and SL. The searched EDS returned 1962 results in total. After applying the inclusion and exclusion criteria, 130 papers were selected. The results of the analysis using the three metrics are presented in Table 4, Table 5, and Table 6, respectively.

A SYNTHESIS OF EXISTING EVIDENCES
We tried to synthesize the existing evidence about the three metrics. We performed an ad hoc literature search to find evidence related to EDS selection, and found the work by Bailey et al. [6] and Dybå et al. [7]. It is worth noting that Bailey et al. [6] concluded that "very little overlap was found" between the EDS. However, when we looked into the data presented in their paper, we found that overlap does exist between index engines and publishers' sites (e.g., GS has considerable overlap with ACM and IEEE). Overlap between different index engines exists as well, whereas overlap between different publishers' sites is rare. We assert that classifying EDS into index engines and publishers' sites is necessary to increase the clarity of the data analysis and to draw concrete and fine-grained conclusions.
Many researchers (including the authors) believe that a suitable selection of EDS is topic-specific because, e.g., area-specific conferences and journals are published by particular publishers. However, the existing evidence indicates that certain rules or patterns do exist (e.g., as summarized above). Given the limited number of cases studied in this paper, we cannot claim the above items as rules at this stage; however, our findings imply that some underlying rules or patterns may exist, which encourages us to collect more evidence for further validation.

The Usage of the Measurement Results
The results from analyzing the search results using the proposed three metrics can help increase researchers' evidence-based understanding of EDS for conducting secondary studies. For example, the results can provide researchers with evidence-based information for EDS selection questions such as: (a) Which EDS should be searched for relevant papers? (b) How many relevant papers will be missed if a certain EDS is not searched? (c) How should different EDS be prioritized during the search? The measurement results from a particular SLR or SMS will directly benefit researchers in the area in which that SLR or SMS was performed. Even though the values of the metrics can vary over time and topic, and are unknown before the search, the synthesis of the metrics from multiple SLRs still shows some observable phenomena (e.g., as summarized in Section 5). These can be useful for researchers doing SLRs on new topics, at least as evidence-based heuristics. To make this more concrete, we describe some scenarios here.
1) Suppose a researcher is going to do an SLR in VM, a subtopic of VM, or an area closely related to VM. (It is not uncommon for multiple SLRs to be conducted in the same area; for example, there are multiple SLRs in cost estimation.) The data for the three metrics in Case 1 will be useful to this researcher, especially if he or she is new to the area.
2) Bjørnson and Dingsøyr [8] performed an SLR on knowledge management in software engineering. They searched only IEEE, ACM, EI, and WoS, based on their opinion that articles retrieved from SD, SL, KO, and WIS are also returned by either WoS or EI Compendex. Based on the existing evidence (as summarized in Section 5), this opinion does not appear to hold strictly: readers will have a clue that the study may have missed some papers by not searching SD, SL, KO, and WIS, although probably only a very limited number.
3) According to our interviews with researchers worldwide, some of them have difficulty accessing SL, so searching SL is hard for them. Based on the synthesized results, these researchers will know that, if they search EI and WoS but not SL, probably only a very limited number of papers will be missed. They can then decide whether or not to search SL according to the literature coverage their review requires and the resources they have available.
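Scenarios such as 2) and 3) amount to asking which relevant papers a reduced set of EDS would have missed in an earlier, fuller search. A minimal sketch of that check is shown below; the function name, the data layout, and the paper identifiers are hypothetical, and the result only describes the earlier search from which the data come, not a guarantee for a new topic.

```python
def missed_papers(results, searched):
    """Return the relevant papers not returned by any of the EDS in `searched`.

    `results` maps each EDS name to the set of relevant papers it returned
    in an earlier search; `searched` lists the EDS one intends to keep.
    """
    all_relevant = set().union(*results.values())
    found = set().union(*(results[name] for name in searched))
    return all_relevant - found


# Hypothetical toy data from an earlier search that covered all three EDS.
results = {
    "EI": {"p1", "p2", "p3"},
    "WoS": {"p2", "p3", "p4"},
    "SL": {"p3", "p5"},
}

# Papers that would have been missed by searching EI and WoS but not SL.
print(missed_papers(results, ["EI", "WoS"]))
```

With a single EDS left out and all others kept, the size of the returned set equals the exclusive contribution of the omitted EDS.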

Cautions
An EDS with a low exclusive contribution may still return a few important papers that are not returned by any other searched EDS. When the results need to be exhaustive, the EDS with low exclusive contributions may therefore still need to be searched. Changes to an EDS (e.g., changes in indexing scope) need to be considered when using results from studies performed before those changes. In practice, the selection of EDS also needs to take other factors into consideration (e.g., the usability of an EDS).

SUMMARY
Retrieval of relevant papers is an important problem in any evidence-based discipline [3]. The selection of EDS can have a significant effect on the coverage of relevant papers for a secondary study. However, the selection of EDS currently relies mainly on researchers' preferences and ad hoc experience rather than on evidence-based decisions. This paper reports our initial efforts aimed at improving this situation. We have proposed a set of three metrics to characterize EDS, and presented two cases in which we applied them. We have also synthesized the data from our two cases and from the literature. The results can serve as initial evidence-based heuristics for EDS selection. We have also discussed how the results of the analysis using these three metrics can be used. This work may increase researchers' awareness of the characteristics of EDS and of the impact of their combination on search performance.
We assert that if researchers report the results for the proposed three metrics as byproducts of their SLR or SMS (in our experience, only very limited extra effort is required), the results will benefit researchers interested in the particular area in which the SLR or SMS is performed. The synthesized results from those SLR and SMS will gradually form an evidence base that can help researchers make evidence-based decisions on EDS selection, which in turn will advance the state of practice of secondary studies in software engineering.

For Case 1, Table 1 presents the overall contribution of each EDS. EI has the largest overall contribution (i.e., 42, 43.3%). This means that 43.3% of the relevant papers can be retrieved by searching EI only. Table 2 presents the overlap matrix. The number in each cell is the number of papers returned by both of the EDS indicated by the cell's column and row headers; hence, it represents the extent of the overlap between the two EDS. For example, the overlap between EI and WoS is 15 papers, which means that 15 papers appear in the results of both EI and WoS. Table 3 presents the exclusive contribution. For example, the exclusive contribution of SL is 1 (1.01%), which means that 1 (1.01%) relevant paper will be missed if SL is not searched. If researchers cannot afford to search all databases, they can consider omitting SL first.

Table 1: Overall contribution of each EDS in Case 1

Table 2: The overlap matrix of Case 1

Table 3: Exclusive contribution of each EDS in Case 1

Table 4: Overall contribution of each EDS in Case 2

Table 5: The overlap matrix of Case 2

Table 6: Exclusive contribution of each EDS in Case 2

Bailey et al. [6] mainly focus on one (i.e., overlap) of the three proposed metrics. The data reported by Bailey et al. [6] cover three SLRs in three different areas. Dybå et al. [7] reported that articles retrieved from SD, SL, KO, and WIS were also returned by either WoS or EI Compendex, based on their SLR in agile software development. Based on this evidence, which covers six SLRs in six different research areas, we had the following findings: