14th International Conference on Evaluation and Assessment in Software Engineering (EASE)
12 - 13 April 2010
Background. Literal and theoretical replications are important for evaluating and assessing empirical results. However, replications are still rare in software engineering, and external replications, i.e., those performed by researchers other than the original ones, are rarer still. Aim. This paper discusses the difficulties encountered and the lessons learned from performing two literal replications of an experiment involving human subjects. Results. Our results apparently contradict the conclusions of the original experiment. However, several differences in context made it difficult to achieve valid comparability. Conclusion. Experiments involving human subjects should collect and report as much qualitative context information as possible, so that the results can be related to the conditions under which the hypotheses were found to hold. Moreover, given the difficulties encountered in this study, literal replication does not seem to be the best strategy for experiments involving human subjects in software engineering.