The use of software to support teachers and students represents a considerable portion of UK schools’ annual budget spending and, while not a new concept, has been increasing strongly over recent years (Manning, 2017; Vickers, 2017). While some of the larger, more established publishers and software providers supply solutions of this nature, the majority of tools taking hold in the domestic market are developed by small and medium-sized enterprises (SMEs) and start-ups, usually founded by individuals with a background in teaching (Cukurova et al., 2019).
Tassomai, as one such education software provider, was built with the original intention of supporting the self-directed study of the founder’s own students working in core subjects at Key Stages 3 to 4 (for students aged 11 to 16). A strong word-of-mouth-driven adoption across British schools saw Tassomai grow to be used in over five hundred UK schools, with a focus primarily on supporting students through their Key Stage 4 science courses in preparation for terminal assessment at GCSE. During the academic year ending in summer 2018, students on the programme generated around 1.5 million rows of data daily through interactions with Tassomai content.
This fast growth in terms of usage created a large data set with which the company is able to run continuous internal analyses for the purpose of adapting both its content and its systematized algorithmic techniques to strengthen the existing product offering, and to inform further product development to support more students in new subjects or age groups.
A priority for Tassomai was the automation of internal analyses to assess the requirements for developments in the product and to monitor in real time the effect of such changes on both learning outcomes and user engagement. In seeking to build their education research capabilities, Tassomai worked with UCL EDUCATE between March and August 2018 to design and implement a project that could not only provide valuable insight to the company and the wider education community, but also serve as a project template for further research studies in the future.
Research and evidence in educational technology
The movement towards an increased dependence on technology in education has brought into sharper focus the need for providers to demonstrate evidence of impact, and for practitioners to seek better evidence in their procurement processes.
Increasingly, companies such as Tassomai have sought to expand their research function in order to better demonstrate and communicate the nature of their intervention’s impact, to discover aspects of their intervention that require improvement, and to ascertain the contexts and implementation scenarios in which their product can have the greatest effect.
UCL EDUCATE identified this need in both the industry and the market, and led the way in training technology founders in research skills and methodology; more recently, it has taken similar initiatives to train schools in discerning better-evidenced solutions.
Over the period of this collaboration, Tassomai joined a structured programme of seminars, clinics and discussions with the EDUCATE research and business teams at UCL Knowledge Lab. The team was able to consult and receive advice at all stages of the experimental research design and analysis. This advice concerned key concepts in experimental study designs for education; the creation of a logic model; the choice of topic for experimentation; considerations of experimental design to address validity; interest to the community; impact on the product’s development; ethical concerns for learners; channels and methods for dissemination; and the communication of the findings.
Every EDUCATE participant is allocated a research mentor, an experienced academic who will act as the first point of contact for the participant during their time on the programme, engaging actively with them during the research training, clinic sessions and other networking events.
EDUCATE research mentors help participants to define their research goals, and then support them to work towards them. A mentor will share knowledge, provide encouragement, and aim to inspire the participant to achieve their goals. The research mentor:
supports the participant through the EDUCATE research training opportunities
provides the participant with advice about the most relevant literature to read
oversees the participant’s overall research training progress using the relevant project platform.
The research mentor also offers guidance about ethical considerations, and detailed records of all meetings with EDUCATE participants are provided within three working days of each meeting to enable effective communication within the project team.
At the start of the programme, the research goals of the company are related to defining the product or service that provides the main context for the research (via the logic model) and, once this has been discussed and refined, a realistic research proposal is developed in line with the guidance and resources provided in the research training programme.
The details of the research training programme are presented in the introductory paper to this collection by Clark-Wilson et al. (2021).
Mentoring support normally ends when the participant has produced a draft research proposal and can get their research under way, although some companies may not quite be ready to do so. In the case of Tassomai, the collaboration continued longer than for most other participants, allowing time to reflect on the collaborative process and share the learning through the publication of this paper.
In addition to the research aspects, the UCL EDUCATE programme supported Tassomai in the growth and development of their team with regard to hiring data analysts, engineers, operations/HR managers and product designers. Numerous members of Tassomai staff also attended seminars to improve their understanding of research, pedagogy, business development, presentation skills and marketing.
These seminars helped Tassomai to develop a logic model and a hypothesis to test through their research; additionally, user research and marketing team members learned better survey techniques to glean more insight from their questionnaires, and the strategic team received support in developing a lean canvas for business development and presentation skills for investment pitching. This work expanded Tassomai’s skill set from technological innovation to embedding programme evaluation in their business.
The business of conducting the research itself, and producing presentations at conferences and for this journal, also required in-depth collaboration with research mentors through the EDUCATE programme.
Principles behind Tassomai’s design and function
Tassomai is designed to both raise student attainment and support teachers with live assessment data through a computer-marked homework system.
For the student, it provides rapid formative assessment by asking multiple-choice questions with immediate corrective feedback and consequent adaptation. This is served through web software or a mobile app, and students log in to their own account to access exercises that are tailored to their individual profile.
For teachers, the software provides a detailed interactive ‘heat map’ of student knowledge based on their answers, which allows them to plan and target intervention, plenary and auxiliary assignments. Parents, likewise, receive analysis through weekly emails. These learning analytics dashboards provide teachers and parents with opportunities to facilitate learner progression and personalized support.
Typically, Tassomai is purchased by a school for entire cohorts, and students are encouraged by the app and by their teachers to complete a daily task on four days each week. The task will vary (depending on the individual’s usage patterns) between 15 and 40 correct question answerings, with bonus targets unlocked on completion. The game-style design of the user interface aims to reward students’ consistent and sustained usage. Research shows that exposing students to the main elements of teaching content on different occasions – that is, spacing learning over time – often increases the amount of information that learners can remember (Pashler et al., 2007). Alongside the game-like nature of quizzes, these features help learners and engage them with the platform.
In reporting usage to teachers, Tassomai ranks their students with the emphasis on the quality of their work rather than quantity. These considerations allow the app to steer behaviour in order better to achieve the ideal performance of Tassomai’s interleaving, spaced repetition algorithm (see Pashler et al., 2007, recommendations 1 and 2).
Students attempt questions in batched ‘quizzes’, each formed around a particular thematic area of their course specification. The questions are written with the intention that they will teach the student as well as test them – the aim being that the assessment content itself acts as the learning resource. The process of answering questions in a test or quiz can facilitate learning in the context of classroom instruction, and it helps increase the rate at which information is remembered (Roediger and Karpicke, 2006).
Rather than ask ‘Which chemical group contains chlorine?’ with the options being [1, 2, 7 or 8], Tassomai’s questions build familiar assertive statements in the student’s mind by asking questions such as:
Students attempting the question initially may not know which group contains chlorine – indeed, they may not know it is an element, or that elements are shown on the periodic table; the exposure to the question and Tassomai’s feedback and consequent adaptation aims to facilitate student learning.
On their first attempt, the student may guess, and would receive corrective feedback immediately telling them that chlorine is part of group 7. This question is then scheduled for quick repetition with support: the thematic nature of the quiz composition design and Tassomai’s internal content scaffolding means that other, related questions will cluster with this one on the next viewing.
Thus, the next quiz on the topic of elements might not only include the same question, where the student will be more likely to correctly identify chlorine as group 7, but will also contain questions such as:
Chlorine is an element found in group 7 of the periodic table. Group 7 elements are known as the … [halogens, noble gases, rare earth metals, alkali metals].
The HALOGENS are elements in group —— and include the element ——.
Likewise, similar questions will exist to teach and test knowledge of other halogens and alkali metals (which appear in the examples above as distractors), and beyond this to teach and test the required knowledge of trends within these groups for melting/boiling points, ion formation, bonding, reactivity and physical appearance.
Tassomai schedules each quiz based on the student’s recent and historical performance in the area, using this to inform the relative interleaving and spacing of topics and sub-topics, and setting the appropriate level of challenge. The composition of each quiz is then designed to incorporate correction and confirmation of previous questions seen, introduction of new material or variant questions, and the occasional re-examination of past-mastered material to assess a sensitive rate of recycling.
The design of the software has been built primarily around the principle that frequent, regular, spaced, low-stakes quizzing was the key to increasing student basal knowledge and raising attainment. Adding in some more variation in student experience beyond the quizzing interface was something that the product team at Tassomai wanted to investigate, not only for the potential direct learning benefits, but also as a driver to increased student retention and engagement, which might have its own impact on long-term learning.
Importance of feedback for learning and implications for Tassomai
The process of assessment helps learners build the knowledge, skills and understanding to achieve the learning outcomes and it is ‘central to the student experience’ (Hernández, 2012). During the learning process, as part of the continuous assessment approach, and at the end of teaching and learning activities, assessment provides the learners with feedback on the proficiency of their performance against a set of criteria.
Effective feedback for students is a key strategy in learning and teaching, and timely and appropriate feedback is most useful for learners. The need for timely feedback is even more important in online learning environments, where learners have limited or no opportunities to ask for help or receive face-to-face feedback. Wiggins (2012) lists attributes for useful feedback that lead to improved learning, which emphasize the timely, tangible and specific. Making feedback an embedded part of a solution, and continually refocusing the student on actions that can improve their performance, is therefore crucial.
Student engagement with feedback is one of the key elements for students’ learning and achievement (Price et al., 2011), and to help students to use feedback and successfully learn from it, they need to be able to understand it. In the online environment, there is an even greater need to be precise when offering feedback as it may be the only feedback the learner receives.
Tassomai’s product provides instant corrective/confirmative feedback to students on each interaction that they have with the product as it stands, as well as the delayed feedback methods of changes in scaffolding, interleaving and spaced repetition.
The Tassomai and EDUCATE teams decided to investigate the potential of providing the teaching content as pre-emptive instructive feedback prior to quizzing, rather than as a consequence of student error. The hypothesis was that this would increase the likelihood of student engagement with the feedback, give students more agency, and give students more direct reward as, having engaged with the video, they would be more able to apply that knowledge in the coming quiz.
Design of the instructive video content
As an enhancement to its existing feedback structure, Tassomai sought to accelerate student learning and engagement by the introduction of targeted instructional video content. The development team at Tassomai had in the past hesitated to create a volume of video content for a number of reasons. Past research with teachers highlighted concerns that video content risked creating a passive viewer experience. A further consideration was that, since Tassomai is used primarily on students’ mobile devices, the data requirements of video content might significantly increase the monetary cost to the user and adversely impact those students from less wealthy backgrounds.
The team felt that Tassomai was uniquely placed to provide video content that mitigated these concerns, since it could offer content that was highly tailored to students’ requirements (with attendant impact on metacognitive processes) with a high degree of efficiency. The ability to offer videos at the start of a themed quiz, rather than as a piece of therapeutic ‘consequential’ feedback, strengthened the concept that Tassomai provides highly tailored content that teaches students.
The intention of this research was to present highly focused instructive video resources to specific students who Tassomai’s data identified as having a low probability of correctly answering a scheduled problem question.
Tassomai wished to assess the impact on attainment that viewing the video would have: what is the effect of viewing a pre-emptive instructive video on a student’s likelihood of correctly answering a problem question? (Research Question 1)
Second, Tassomai was interested in the effect of the video on knowledge retention: having initially answered the question correctly, would the watching of the video have any effect, positive or negative, on the student’s rate of forgetting? That is, would the use of the video to answer the question correctly in the first instance mean lesser or greater likelihood of remembering the answer after a period of abeyance? (Research Question 2)
In addition, the company wished to ascertain the level of appetite for video content: would a student, when offered such a video, opt to view it and, having done so, watch the video in its entirety?
The opportunity to systematically research a new pedagogical approach as a feature in a piece of education software, simultaneously assessing appetite and impact, was one that presented some exciting methodological potentials and implications from the analysis. Candidates in the trial were not aware that they were participating in research – merely that they were being offered a new feature within their normal learning and revision practice. That the feature was able to be offered to a random sample, and that the assessment of impact from the intervention formed part of the normal user experience, removed many questions of selection bias, influence or sampling concerns, and gave the research team confidence that their results could genuinely inform the product development decisions that would arise from the study. This approach is aligned with the General Data Protection Regulation (EU) 2016/679 and the ethics of compiling data from student engagement in a depersonalized and anonymous format for the purpose of analysing user performance, behaviours and trends and developing products (www.tassomai.com/terms). It facilitated rigorous evaluation of the potential harms and benefits of an app without interrupting the user experience.
The first steps in the design of the experiment were to identify problem questions themselves, and to develop the criteria on which Tassomai could identify candidates for the trial.
The content team analysed the question database and filtered questions by their membership of the most popular courses in order to design videos that could be offered to as many students as possible. Next, content was ranked by Tassomai’s measure of difficulty. This is a measure that is constantly evaluated on a per-question basis by analysis of all global student answerings.
The content team then selected a shortlist of questions for discussion. Six questions were selected, each relating to different topic areas (thus it would be impossible that two explainer videos could be offered for the same quiz). These are referred to in the experiment as the LIST or as LIST questions.
The selection of questions and provision of feedback was another area where the two teams collaborated and exchanged ideas. Ethical concerns were at the forefront of discussions: the subjects of the trial were real students preparing for vital terminal examinations. Tassomai and EDUCATE were in agreement that any intervention offered had to make as positive a learning impact as possible.
Tassomai records all answers given by students for each question; thus, it was relatively straightforward, for each of the LIST questions, to query the database for individuals who had given an incorrect response to that question by selecting one of the three distractors at their most recent attempt. In Tassomai database notation, this is shown as **W. These identified students are referred to as CANDIDATES.
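The selection rule described above can be illustrated with a minimal sketch. It assumes a simplified answer log held in memory; the `Answer` record, its field names and the question identifiers are hypothetical and do not reflect Tassomai's actual database schema or query language.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Answer:
    student_id: str    # anonymized identifier (hypothetical field)
    question_id: str   # hypothetical question key
    answered_at: int   # any sortable timestamp, e.g. epoch seconds
    correct: bool


def find_candidates(answers, list_question_ids):
    """Map each LIST question to the students whose most recent attempt was wrong."""
    latest = {}
    for a in answers:
        if a.question_id not in list_question_ids:
            continue  # only LIST questions are of interest
        key = (a.student_id, a.question_id)
        if key not in latest or a.answered_at > latest[key].answered_at:
            latest[key] = a

    candidates = {q: set() for q in list_question_ids}
    for (student_id, question_id), a in latest.items():
        if not a.correct:  # the '**W' condition: latest attempt incorrect
            candidates[question_id].add(student_id)
    return candidates


# Illustrative data only.
log = [
    Answer("s1", "q_halogens", 1, True),
    Answer("s1", "q_halogens", 2, False),  # s1's latest attempt is wrong
    Answer("s2", "q_halogens", 1, False),
    Answer("s2", "q_halogens", 3, True),   # s2 has since answered correctly
    Answer("s3", "q_other", 1, False),     # not a LIST question
]
print(find_candidates(log, {"q_halogens"}))  # {'q_halogens': {'s1'}}
```

Only the most recent attempt matters in this sketch, so a student who erred earlier but has since answered correctly is not selected.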
Making the videos
Each minute-long video was planned in collaboration with content specialists, who gave feedback as to whether they felt the topic was adequately explained and clearly communicated. The team used a blackboard wall background on which some explanatory text or pictures were drawn; the videos were filmed on a phone camera with close-mic audio recorded separately and synced in post-production.
The videos were then subtitled (for special needs students and ease of use in public with sound off) and uploaded to Tassomai’s server (Figure 1).
The decision process for the experiment (Figure 2) formed the design brief.
As set out in the flowchart in Figure 2, CANDIDATES were marked in the database if they satisfied the conditions of being **W for any of the LIST questions. At this stage, they were randomly assigned as being TEST or CONTROL CANDIDATES.
Students using Tassomai come from schools in all contexts of the UK secondary school system, and from all around the country, but Tassomai holds no personal details on the students themselves. Students entered the trial only on the basis that they had made themselves CANDIDATES through their engagement in the learning program.
It should be noted that, since there were six videos, any individual student might be a CANDIDATE for anything from zero to all six of the videos, and that, within those several candidacies, they might be a TEST CANDIDATE in some and a CONTROL CANDIDATE in others. It should also be noted that the assignment of CANDIDATES to the TEST or CONTROL groups was not equal, as we anticipated a high rate of attrition in the TEST group as many were expected to skip viewing. Likewise, disqualification from the later stages of trial through error in earlier parts was expected to reduce sample sizes unequally between the two groups. The probability of assignment to one group or the other was set and adjusted with the aim of having a comparable sample size in each group at the end of the trial.
Of some 3,000 CANDIDATES, 2,220 were assigned as TEST learners, while 823 formed the CONTROL group. Of this initial set, the final numbers of students passing through all stages of the trial were 172 TEST and 123 CONTROL CANDIDATES.
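A minimal sketch of unequal random assignment follows, assuming each candidacy (one student paired with one LIST question) is assigned independently with a tunable probability. The function names and the value 0.7 are illustrative assumptions; the probability actually used in the trial, and how it was adjusted, are not reported here.

```python
import random


def assign_group(rng, p_test):
    """Return 'TEST' with probability p_test, otherwise 'CONTROL'."""
    return "TEST" if rng.random() < p_test else "CONTROL"


def assign_candidacies(candidacies, p_test, seed=0):
    """Assign every (student_id, question_id) candidacy independently.

    Because assignment is per candidacy, one student can be a TEST
    CANDIDATE for one video and a CONTROL CANDIDATE for another.
    """
    rng = random.Random(seed)  # seeded only to make the sketch reproducible
    return {c: assign_group(rng, p_test) for c in candidacies}


# Illustrative candidacies; p_test=0.7 is an assumed value.
candidacies = [("s1", "q1"), ("s1", "q2"), ("s2", "q1"), ("s3", "q4")]
groups = assign_candidacies(candidacies, p_test=0.7)
for candidacy, group in groups.items():
    print(candidacy, group)
```

Raising `p_test` over-samples the TEST group, which is one way to compensate for the higher attrition expected among students who decline or abandon the video.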
In either case (TEST or CONTROL), CANDIDATES have the LIST question scheduled. The scheduling of the question means it overrides the ordinary selection and shuffling processes of the Tassomai quizzing algorithm, forcing the question to appear at the designed point directly after the video event. This is referred to in the experiment as the CHECK question. Following a period of abeyance (referred to as the EMBARGO period), students were again asked the same LIST question, known as the RETENTION question.
Offering the video and measuring its viewing
Both CONTROL and TEST students would therefore be posed the tricky LIST question on commencement of the next relevant quiz (that is, question 1), but in the case of TEST students, they would first be offered the video to view in advance of the question (Figure 3).
Giving the students a choice to watch or not was crucial for the product research (measuring appetite). Furthermore, the researchers, under advice of research mentors at UCL EDUCATE, felt that there was an ethical imperative, since many students consume Tassomai on their personal mobile devices and the viewing of video content could be disruptive, or could cost the individual in terms of their data allowance.
Tassomai recorded student interaction with the video prompt and the video itself to measure in each case the extent to which the student had viewed the video. A tolerance was set that qualified a student as having sufficiently watched the video; any TEST CANDIDATES who had refused the video or consumed less than 80 per cent of it were disregarded.
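The qualification rule can be expressed as a short sketch. It assumes that viewing is logged as seconds watched against the video length; those inputs and the function names are hypothetical, while the 80 per cent threshold is the one stated above.

```python
def watch_ratio(seconds_watched, video_length):
    """Fraction of the video consumed, capped at 1.0."""
    if video_length <= 0:
        raise ValueError("video_length must be positive")
    return min(seconds_watched / video_length, 1.0)


def qualifies(accepted, seconds_watched, video_length, threshold=0.8):
    """A TEST CANDIDATE stays in the trial only if they accepted the video
    and watched at least the threshold fraction of it."""
    if not accepted:
        return False  # refused the video prompt
    return watch_ratio(seconds_watched, video_length) >= threshold


print(qualifies(True, 48, 60))   # True: exactly 80 per cent of a 60-second video
print(qualifies(True, 47, 60))   # False: just under the threshold
print(qualifies(False, 60, 60))  # False: the offer was declined
```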
During this stage, both teams worked together to decide how and at what intervals the learners were offered the feedback, and how they were selected for this purpose following educational research guidelines, so that no learners were treated unfairly.
The CHECK question
Immediately following the video, students were asked the relevant CHECK question. This served the purpose in the experiment of measuring the extent to which the video had helped students’ attainment (the first of the research questions).
The CHECK question was forced by the scheduling script to appear as the first question in the quiz; this kept the question closely linked to the TEST students’ experience of the video just viewed, and also avoided potential leakage from TEST or CONTROL students quitting quizzes part-way through and missing the question altogether.
If a student quit the quiz without answering the CHECK question, they were disregarded.
If not disqualified for failing to answer the CHECK question, CANDIDATES in both TEST and CONTROL groups would then be shown the question a second time, without the pre-emptive video (known as the RETENTION question) after a certain amount of time (known as the EMBARGO period).
The RETENTION question
Since the spacing of topics and questions within topics is a function of student performance in that topic, the experiment design took steps to avoid a bias that might skew results by student ability.
The scheduling script forced the CHECK question to be the first to follow the video; it then embargoed that question from appearing for a randomized period of between two and seven days. Following that, it then forced the question to appear one more time as question 1 in a quiz (that is, the RETENTION question).
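The embargo step might be sketched as follows. This is an illustration only: it assumes the randomized period is drawn as a whole number of days, which the description does not specify, and the function names are hypothetical rather than taken from the scheduling script.

```python
import random
from datetime import datetime, timedelta


def schedule_retention(check_answered_at, rng, min_days=2, max_days=7):
    """Return the earliest time the RETENTION question may be shown.

    Assumption: the embargo is a whole number of days, drawn uniformly
    from min_days to max_days inclusive.
    """
    embargo_days = rng.randint(min_days, max_days)
    return check_answered_at + timedelta(days=embargo_days)


def is_embargoed(now, embargo_until):
    """True while the question must be withheld from quizzes."""
    return now < embargo_until


rng = random.Random(42)  # seeded only to make the sketch reproducible
answered = datetime(2018, 5, 1, 16, 30)
due = schedule_retention(answered, rng)
print(due - answered)               # somewhere between 2 and 7 days
print(is_embargoed(answered, due))  # True
```

Drawing the interval at random, rather than deriving it from performance in the topic, keeps the gap between the CHECK and RETENTION questions independent of student ability.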
All responses to both CHECK and RETENTION questions were recorded and analysed in the context of each student’s status as a TEST or CONTROL CANDIDATE, and against their viewing of the video.
Data were anonymized and exported from the Tassomai database for analysis, which processed answers from both the TEST and CONTROL groups in three stages.
The first stage provided an overview of the two populations of students and assessed the extent to which they progressed through the stages of the experiment or, through error or failure to qualify, dropped out.
The second stage measured the effect of the video on attainment by comparing the two cohorts by their answers to the CHECK questions. The third stage measured the effect of the video on retention by examining the performance on the RETENTION questions.
The variables were tested for normality using the Kolmogorov–Smirnov test and Q–Q plots. The non-parametric tests used for continuous and categorical variables were the Mann–Whitney U and Chi-squared tests respectively. Effects with p-values below 0.05 were considered significant.
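For readers unfamiliar with the test applied to the categorical outcomes, the sketch below computes a Pearson Chi-squared statistic for a 2×2 table (group by correct/incorrect) without a continuity correction. The counts are purely illustrative and are not the study's data, and the study's own software and settings are not stated. With one degree of freedom, the p-value equals erfc(√(χ²/2)), which keeps the example within the Python standard library.

```python
import math


def chi_squared_2x2(a, b, c, d):
    """Pearson Chi-squared test for the table [[a, b], [c, d]].

    Rows are groups (e.g. TEST, CONTROL); columns are outcomes
    (e.g. correct, incorrect). No continuity correction is applied.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of the Chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative counts only: 70 of 100 correct versus 28 of 100 correct.
chi2, p = chi_squared_2x2(70, 30, 28, 72)
print(round(chi2, 2), p < 0.001)  # 35.29 True
```

Statistical libraries may apply a continuity correction to 2×2 tables by default, so their output can differ slightly from this uncorrected version.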
Impact from the videos
The effect of video viewing on CHECK question success
All students who watched the video were immediately given the CHECK question; there were 458 recorded answerings in response. Three video viewers were lost with no answer to the CHECK question, which the researchers assume was a result of them suffering connection issues or quitting the site.
Analysis of the scheduled CHECK questions to measure the effect of the videos on attainment shows first that a significantly greater (Chi-squared test, p < 0.001) proportion of the TEST cohort answered the CHECK question correctly, compared with the CONTROL cohort (Figure 4a and Figure 4b).
The effect of video viewing on RETENTION question success
Only those students who answered the CHECK question correctly – blue sectors of the pie chart – progressed to be considered for analysis on retention (the RETENTION question). Thus, the analysis that follows compares the 70.7 per cent from the TEST group to the 27.6 per cent from the CONTROL group to measure the extent to which they can continue to correctly answer the question after a period of abeyance.
Because many students completed their examinations at this time of year (and their access to Tassomai therefore ended), a proportion (46.5 per cent) of both TEST and CONTROL groups who qualified for the RETENTION question did not answer it before their course terminated.
Analysing students’ ability to answer the question correctly a second time, having answered it correctly at the CHECK stage, the data showed that a significantly greater (Chi-squared test, p < 0.001) proportion of the TEST cohort answered the RETENTION question correctly, compared with the CONTROL cohort (Figure 5a and Figure 5b).
The distribution of time elapsed between the CHECK and RETENTION questions did not differ significantly between the two groups (Mann–Whitney U test, p = 0.11; Figure 6).
Engagement with the video offering
Users identified for any particular instructive feedback video as being part of the TEST group were offered that video on the first instance that they launched the relevant quiz. There were 2,220 offers to students to receive video feedback.
Figure 7 shows that 36.2 per cent of these users (804) opted to engage with the content when it was offered; 47 of these were lost (‘Error’), where the researchers assume connectivity issues prevented the video from being served to the client device.
For the purpose of this study, we assume a user watched the video if they had a watch ratio of 0.8 or above. Of the 757 non-error students who started the video, 461 watched at least 80 per cent of it, qualifying them for the next stage of the research; 401 watched the video in its entirety.
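The reported counts can be reconciled with a few lines of arithmetic. The sketch below uses only figures stated in this paper (offers, acceptances, errors, qualifying viewers and the three viewers lost before the CHECK question); the variable names are illustrative.

```python
offers = 2220            # video offers made to TEST CANDIDATES
accepted = 804           # offers where the student opted to view
errors = 47              # accepted offers where the video was not served
qualified = 461          # viewers with a watch ratio of 0.8 or above
watched_in_full = 401    # viewers who watched the whole video
lost_before_check = 3    # qualified viewers with no CHECK answer recorded

started = accepted - errors
acceptance_rate = accepted / offers
check_answers = qualified - lost_before_check

print(started)                   # 757
print(f"{acceptance_rate:.1%}")  # 36.2%
print(check_answers)             # 458
```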
The researchers were surprised that the rate of acceptance for the video was as high as it was (36.2 per cent), and more so that significant engagement occurred in such a high proportion (60.1 per cent) of those who started the video. This paved the way to produce more video content for more ‘tricky’ questions.
Discussion and conclusion
It was clear from this investigation that students benefited from viewing targeted instructive video content. Although the effect on attainment in the short term was unsurprising, the extent of the effect was beyond what had been predicted by the researchers.
Given that the probability of success following the video was more than double that of students who had not viewed the video but had only learned through the normal corrective feedback (70.7 per cent against 27.6 per cent), the expectation from the research was that these video CANDIDATES might perform less well after an interval than their counterparts who had arrived at their correct answer to the CHECK question with the help of past corrective feedback alone. Measuring performance in the RETENTION question of only those students who had succeeded in the CHECK question, it was beyond expectations to discover that performance in the TEST group not only matched that of the CONTROL group, but was significantly better. Regarding engagement with videos, it was felt that uptake at 36.2 per cent was considerably above expectations, with a watch rate of 60.1 per cent being far in excess of what was anticipated.
The conclusion from this experiment was that instructive videos offered to students as pre-emptive to a quiz, and triggered by indicative errors in previous work, would not only be expected to be well used by students, but would also have marked impact on their attainment and knowledge retention in the longer term. As a result, the company has incorporated a wider roll-out of 150 videos into their product strategy for the following academic year, with a view to continuing growth of the content offering in future.
Product design implications
Embarking on a wider roll-out of the feature would also give rise to several other areas of interest for research, where the initially basic designs can be iterated upon and their effects measured. These would include:
new designs for prompting the video (where the company felt it was too easy for a student to mistakenly dismiss the video by pressing the ‘tempting’ green button)
information to explain to the student more concretely why the video was being offered
opportunity to view videos on demand from a shortlist, or as a follow-up to a quiz
appetite for explainer videos at different times of day, or under conditions such as ‘only when I’m connected to WiFi’
appetite for video contributors among Tassomai’s community of several thousand teachers.
Caveats and areas for further investigation
It was noted that the LIST questions, being generally of high difficulty, were unlikely to be accessed by students of lower ability. From a product perspective, the company would need to produce instructive videos for the whole range of question difficulties in order to serve all students.
Since the analysis pools the results for all questions, the researchers were aware that variations in video effectiveness may be discovered when stratifying results by problem question, instructional video and learner engagement behaviour.
The effect of these new videos on answering success may vary from what was measured here: either being less effective since the questions are not initially particularly taxing, or more effective if they are aiding students of lower ability who perhaps experience less of a benefit from the inbuilt corrective feedback.
The research for this project was conducted ‘manually’ by querying the database and performing statistical analyses on large data sets. The intention of the project was not only to give useful insight to the company’s education product development, but also to build a facility to automate analysis.
Therefore, as the instructional video offering rolls out into the product, the education team intends to build automatic live reporting on the viewings of each video, and the effect on success that each video provides. This will be reportable to the company and to the video author, and will allow the team to select the most effective videos for CANDIDATES and at the most appropriate moment.
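One possible shape for such per-video reporting is sketched below. It is an assumption about how the aggregation might look, not a description of the company's system: the `VideoEvent` record, its fields and the chosen metrics are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoEvent:
    video_id: str
    watched: bool        # viewer met the viewing threshold
    check_correct: bool  # follow-up question answered correctly


def video_report(events):
    """Aggregate offers, qualifying views and post-view success per video."""
    totals = defaultdict(lambda: {"offers": 0, "views": 0, "correct_after_view": 0})
    for e in events:
        row = totals[e.video_id]
        row["offers"] += 1
        if e.watched:
            row["views"] += 1
            if e.check_correct:
                row["correct_after_view"] += 1

    report = {}
    for video_id, row in totals.items():
        views = row["views"]
        report[video_id] = {
            **row,
            "view_rate": views / row["offers"],
            # Success rate is undefined when nobody has watched the video yet.
            "success_rate": row["correct_after_view"] / views if views else None,
        }
    return report


# Illustrative events only.
events = [
    VideoEvent("v_halogens", True, True),
    VideoEvent("v_halogens", True, False),
    VideoEvent("v_halogens", False, False),
    VideoEvent("v_forces", False, False),
]
for video_id, stats in video_report(events).items():
    print(video_id, stats)
```

A live version would also need a comparison group for each video, since a success rate among viewers alone does not show what the video added.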
This project began with real uncertainty about the value of a novel product: would it provide a passive experience or added value? This question was resolved by collaborating across commerce and the public sector, to combine technology and social science. Essential edtech resources included a large data set of user interactions with which the company is able to run continuous internal analyses for product development. Academics contributed social research methodology, including a consideration of ethical principles in the study design. Together, the two supported experimental study designs to evaluate the performance with routine data. Finalizing the research design required both teams to understand the other’s point of view and incorporate requirements for both the business and the educational aspects of the research design. The resulting achievement of this SME–university collaboration was moving from real uncertainty about the value of videos, and teachers’ concerns about creating a passive viewer experience, to confidence in a rigorously tested app that helped students answer questions correctly as they prepared for their examinations.