Technology is advancing rapidly, increasing the distribution of information in electronic rather than hard-copy formats. Consequently, electronic materials are becoming increasingly prevalent in educational settings. For example, many textbooks are available as audiobooks.1 Although providing educational materials in auditory rather than text formats may be more economical and convenient, it may not support maximal comprehension. Several studies have shown that reading text generates better comprehension and retention of information than listening to audio.2–5 This result is more pronounced with difficult material2 and among older children.3 Varao Sousa et al.5 proposed that the increased cognitive effort and physical engagement involved in actively reading text compared with passively listening to audio may explain this effect. Indeed, research has demonstrated decreased self-reported mind wandering during more active tasks such as silent and oral reading compared with listening.5–8
Because blind braille readers read through a different modality (i.e., touch), they provide an interesting subpopulation through which to explore the role of presentation modality for text comprehension. For several decades, researchers have explored the relative comprehension of information when it is presented in braille versus auditory formats. The findings to date have been inconsistent (Table 1). This inconsistency is largely due to varying parameters and methodologies, which include participant age,9 type and difficulty of reading material,10 amount of practice,13 audio speed,10–13 type of comprehension assessment13 and questions,14 and experimental design. All studies that found equivalent comprehension for braille and auditory formats used multiple-choice tests to evaluate comprehension and used a between-subjects design. Multiple-choice tests, however, probe recognition of correct answers rather than recall, potentially simplifying the comprehension exercise.15 Furthermore, using a between-subjects design to assess comprehension across different presentation modes is problematic, as participants have different baseline comprehension levels and varying degrees of experience. Thus, the groups being compared are nonhomogenous in ways other than material format, which may significantly confound the results. These confounds necessitate a reinvestigation of this research question that uses a controlled, within-subjects design, which is addressed by the present study.
Although prior research has investigated how comprehension is impacted by reading hard-copy braille compared with listening to compressed speech and/or audiobooks,9–13 technological advancements in the intervening years have augmented these with two additional options: refreshable braille displays and screen reading software. Refreshable braille displays are devices that connect to a computer via a USB port and translate text presented on a computer into braille one line at a time, making braille access more convenient and transportable. Screen reading software uses a synthesized voice and keyboard commands to read information presented on a computer screen. Although the synthesized voice quality and speech rate have improved from compressed speech,9,10,12,13 problems persist with word pronunciation and intonation. The present study is the first to test comprehension across four presentation modes: hard-copy braille, braille display, voice actor, and screen reader. This comparison will aid in understanding how to best support reading comprehension among individuals who are blind, a group already impacted by low high school and college graduation rates and underemployment.16,17 It will also elucidate how effective braille is for processing reading materials compared with two audio-based technologies.
In summary, the present study has four aims:
- To replicate the previous finding that actively reading text supports better comprehension compared with passively listening among sighted individuals5–8 using a novel comprehension assessment
- To correct methodological flaws in extant research on braille and audio comprehension among blind individuals
- To incorporate modern and popular assistive technologies to access reading materials
- To explore the hypothesis that tasks requiring greater cognitive effort and physical engagement improve comprehension
To address each of these aims, we analyzed the findings in two different studies. The first study (study 1) explored whether active reading modes (text or braille) generate better comprehension than do passive reading modes (listening to a voice actor) among sighted and blind individuals or whether the comprehension advantage from reading text compared with listening is domain specific to vision. The second study (study 2) investigated how currently used assistive technology may impact the reading comprehension of braille readers. This project used a within-subjects design, and a novel free-response comprehension assessment was developed to investigate comprehension among blind and sighted individuals. To our knowledge, this is the first project that uses the same comprehension assessment to make comparisons between blind and sighted participants.
We hypothesized that for sighted individuals text would be superior to auditory comprehension, supporting and replicating previous research.2–5 If reading text supports comprehension due to effortful processing, then blind individuals should perform better in the braille compared with the auditory condition, as reading braille is more physically engaging and cognitively effortful than passively listening.18 If, however, one of the alternate results—equivalent braille and auditory comprehension or better auditory comprehension—is found for blind participants, then reading text may promote better comprehension among the sighted owing to a domain-specific property of vision.
We hypothesized that comprehension among blind individuals would be better when reading passages using hard-copy braille or a braille display compared with auditory formats. This finding would support the link between better comprehension and more cognitively effortful and physically engaging tasks. Alternate findings include equivalent comprehension in all four presentation modes or an advantage for listening, which could be explained by the superior auditory memory of blind individuals.19
An interesting consideration is the potential comprehension difference between the two braille (hard-copy and braille display) and two audio (voice actor and screen reader) formats. We expected that using a braille display would lead to poorer comprehension relative to hard-copy braille because of more difficult spatial navigation, as braille displays require readers to regress one line at a time and to rely on their spatial memory to clarify information. We also hypothesized that individuals would comprehend better when listening to a voice actor compared with a screen reader because of the synthesized voice's poorer speech quality.
Thirty-four blind (19 to 71 years; median, 45 years) and 31 sighted (18 to 64 years; median, 22 years) individuals participated. The Smith-Kettlewell Eye Research Institute and University of California, Berkeley, institutional review boards approved the experimental procedures, and all individuals provided informed written consent before their participation. Furthermore, this research followed the tenets of the Declaration of Helsinki.
All participants were native English speakers, received at least a high school diploma, had normal hearing, and did not have any additional disabilities. All blind participants were fluent contracted braille readers. Some blind participants had no visual function, some could see light and/or light and shadow direction, and a few could detect the presence of large forms (Table 2).
Participants completed a pre-experiment survey that included questions about their age, visual impairment (if applicable), field of study, education level, and amount of experience reading text/braille and listening to audio. The amount of experience reading versus listening was calculated by asking participants how many hours per week they engage in these activities for school/work and leisure.
Isolated Word Reading
Each participant completed the Wechsler Individual Achievement Test—Third Edition isolated word reading tests: word reading and pseudoword decoding.20 Although these tests have been used to assess the reading skills of sighted individuals, we transcribed all materials in Unified English Braille for blind participants. Both tests begin with short, simple words and progress to longer, more complicated words. Each item was scored for correct verbal pronunciation by comparing recorded participant responses with a recorded answer key. These reading tests were used to confirm text and braille reading skills. Participants with scores below the 10th percentile for their age on these tests were excluded from final data analysis. Two of the 34 blind and none of the sighted participants were excluded based on this criterion.
Reading and Listening Material
Two kinds of passages were used: practice and experimental. There were seven practice passages, which were short, literary stories acquired from the Gray Oral Reading Fluency Test, Fifth Edition.21 These practice passages were written at a fourth- to fifth-grade reading level, according to the Flesch-Kincaid Readability Test.22
For the experimental passages, permission from Pearson publishing was obtained to use four different excerpts from their high-school Prentice Hall Biology eTextbook.23 Scientific passages were selected, as blind individuals are underrepresented in science, technology, engineering, and mathematics disciplines,24 emphasizing the importance of enhancing comprehension of material in a science, technology, engineering, and mathematics field. These passages were between 130 and 141 words. According to the Flesch-Kincaid Readability Test,22 all passages were at a 12th-grade reading level. The passage topics included zones of the ocean (passage A), relative dating of fossils (passage B), how viruses produce disease (passage C), and invertebrate body symmetry (passage D). The experimenters developed eight free-response questions per experimental passage (32 questions total). Half were “literal” questions, which had answers stated verbatim in the passage, and half were “inferential” questions, which had responses that required using passage information to infer answers that were not explicitly stated.
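As a rough illustration of the readability check, the Flesch-Kincaid grade level is 0.39 × (words per sentence) + 11.8 × (syllables per word) − 15.59. The sketch below applies that formula; the vowel-group syllable counter and the function names are simplifying assumptions, not the tool used in this study, so its grades will differ somewhat from those of a dictionary-based calculator.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables by counting vowel groups (heuristic assumption)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent (e.g., "zone"), except in "-le" endings.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid grade level of a passage of English text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)
```

Very short, simple text can yield a grade below zero; the formula is only meaningful for passages of reasonable length such as those used here.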
Reading and Listening Devices
For sighted participants, passages were presented in two different formats: text and audio. For the text format, passages were typed and printed on 8.5 × 11-inch paper (12-point, Times New Roman font). For the audio format, a professionally trained male voice actor was recorded reading each passage and all comprehension questions in a soundproof room at the Smith-Kettlewell Eye Research Institute. These recordings were played aloud for participants during the study.
For blind individuals, passages were presented in four different formats: (1) hard-copy braille (Unified English Braille, embossed on 11 × 11.5-inch paper), (2) refreshable braille display (Brailliant BI 32, HumanWare, United Kingdom), (3) voice actor, and (4) screen reader (NonVisual Desktop Access, NV Access, Australia). The NonVisual Desktop Access screen reader was used with the Eloquence and Vocalizer Expressive add-on, which improved the voice clarity.
All individuals gave written consent to participate and be recorded. First, they verbally completed the pre-experiment survey. A video camera recorded their responses and was focused on participants' hands only.
Then, all participants were administered the Wechsler Individual Achievement Test—Third Edition tests, and upon completion, the experimenter read the study instructions aloud. The experimental design was within-subjects; that is, all individuals completed all presentation mode conditions for their participant group. For sighted participants, passages were presented in text and in audio (voice actor) formats. For blind participants, each of the four experimental passages was presented in one of the four presentation modes. The presentation mode–experimental passage pairings and the presentation mode conditions were pseudorandomized using a Latin square design and counterbalanced across participants. Furthermore, the order of free-response questions for each passage was randomized for each participant.
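One way to picture the counterbalancing is with a cyclic Latin square, in which every presentation mode appears exactly once in each row and each column. The sketch below is an assumption about how such a scheme could be built; the exact square, the passage ordering, and the helper names are illustrative and are not taken from the study protocol.

```python
MODES = ["hard-copy braille", "braille display", "voice actor", "screen reader"]
PASSAGES = ["A", "B", "C", "D"]


def cyclic_latin_square(items):
    """Return an n x n Latin square built by rotating the item list."""
    n = len(items)
    return [[items[(row + col) % n] for col in range(n)] for row in range(n)]


def assign_conditions(participant_index):
    """Pair each passage with a mode, cycling through the rows of the square.

    Hypothetical helper: participant k uses row k mod 4, so across every
    block of four participants each passage is read once in each mode.
    """
    square = cyclic_latin_square(MODES)
    row = square[participant_index % len(MODES)]
    return list(zip(PASSAGES, row))
```

For example, `assign_conditions(0)` pairs passage A with hard-copy braille, whereas `assign_conditions(1)` pairs passage A with the braille display.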
Before reading or listening to the experimental passages, participants completed one or two different practice sessions. Two practices were used for the voice actor, braille display, and screen reader conditions. The first practice familiarized participants with using relevant technological devices and did not require comprehension of any information in the passage. During this first practice, participants selected their preferred speech rate for the audio conditions. In all presentation mode conditions, participants were not timed and could make regressions, or go back in the text or audio, to clarify information. Allowing for regressions and speech rate adjustments simulated realistic reading, thereby enhancing the generalizability and ecological validity of the findings. For audio conditions, regressions were made using a key press on a keyboard.
The second practice session familiarized participants with the experimental procedure. This was the only practice for the text and hard-copy braille conditions because no technological training was required. Participants silently read or listened to the passage and answered two free-response questions, which were previously recorded by the voice actor and were played aloud. Participants were also given the questions in text or braille. Once the questions were played, participants could no longer return to the passage. Responses were given verbally. Participants said the word “next” to indicate that they wanted to progress to the next question.
After each practice, participants completed the experimental conditions, during which they silently read or listened to the scientific passages for comprehension. They were instructed to notify the experimenter once they finished. Then, the voice actor recordings of eight different free-response questions were played aloud. Participants were given the questions in text or braille, and once the questions were played, they could no longer return to the passage. Participants verbally responded and said “next” to progress to the next question.
Verbal responses were transcribed and scored for accuracy using Gradescope (Gradescope Inc., Berkeley, CA), a tool that allows for online grading.25 In Gradescope, the grading rubric is continuously visible, lending strength to interrater reliability. Two of the authors independently graded all responses. Their scores were compared to calculate interrater reliability using the Cohen κ value,26 which measures interrater agreement while considering the possibility of the agreement occurring by chance.27 For sighted participants' scores, κ = 0.984, and for blind participants' scores, κ = 0.949. Because κ expresses the observed agreement beyond chance as a proportion of the maximum possible agreement beyond chance (rather than the raw percentage of matching scores), these values indicate high interrater agreement.
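For readers unfamiliar with the statistic, Cohen's κ is (p_o − p_e) / (1 − p_e), where p_o is the observed proportion of matching scores and p_e is the proportion expected by chance from each rater's marginal score frequencies. The following is a minimal sketch of that calculation; the function name and the toy scores in the usage note are illustrative and are not data from this study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty item set")
    n = len(rater_a)
    # Observed agreement: share of items given identical scores.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per score.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category throughout
    return (observed - expected) / (1 - expected)
```

With toy scores `[1, 1, 0, 0]` and `[1, 1, 0, 1]`, the raters match on 75% of items, but chance agreement is 50%, so κ is 0.5.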
Assessment Tool Analysis
Because a novel assessment tool was created, it was important to confirm that the questions accurately assess comprehension. Scores were analyzed using Rasch model estimation with the “CRASCH” package in R version 3.4.4.28,29 Scores predicted by the Rasch model were compared with the actual scores from the blind and sighted participants, separately, yielding an infit mean square statistic for each item that reflects the amount of randomness of the assessment tool. Twenty-four of the original 32 items had acceptable infit values30,31 for both blind and sighted participants, allowing for comparisons between these two groups, and were used in the following data analyses.
The total comprehension score was calculated by summing the scores of each fitted question for a passage and dividing by the total possible points for that passage. Total scores were analyzed using linear mixed-effects regression with the lme4 package in R version 3.4.4.29,32 Linear mixed-effects regression was used instead of repeated-measures analysis of variance because it controls for the variance associated with random factors.33,34 Because participants had different comprehension abilities, each participant's scores were not independent across presentation modes.35 This nonindependence was resolved by including a random effect of participant in all regression models, which assumed a different “baseline” comprehension level for each participant.35
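The random-intercept structure described above can be written out as follows. This is a generic sketch of such a model, and the notation is an assumption introduced for illustration: y_ij is the total score of participant i in condition j, x_ij collects the fixed-effect predictors, and u_i is the participant-specific shift in baseline comprehension.

```latex
y_{ij} = \beta_0 + \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + u_i + \varepsilon_{ij},
\qquad u_i \sim \mathcal{N}(0, \sigma_u^2),
\qquad \varepsilon_{ij} \sim \mathcal{N}(0, \sigma_\varepsilon^2)
```

In lme4 formula syntax, a model of this shape would look something like `total_score ~ mode + (1 | participant)`, where the variable names are hypothetical.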
All analyses started with a null model that included total score as the dependent variable and participant as the random intercept effect. Fixed effects were incrementally added to investigate whether the model fit was improved by using χ2 tests on the log-likelihood values to compare the different models. The final model was selected if it revealed a significant P value and minimized the Akaike information criterion value. The fixed effects tested in all models include the following: presentation mode, passage topic, participant age, presentation mode and passage topic interaction, education level, primary field of study, experience using text/braille and audio, time duration of each condition, and, if applicable, the age at which the participant learned how to read braille (i.e., braille age) and visual impairment onset age.
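The model comparison step can be sketched as follows, assuming two nested models whose log-likelihoods are already known. The likelihood-ratio statistic is twice the difference in log-likelihood and is referred to a χ2 distribution; for brevity, this sketch only handles the case in which the fuller model adds a single parameter (1 degree of freedom). The function names and the numbers in the usage note are illustrative, not values from this study.

```python
import math


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: lower values indicate a better trade-off."""
    return 2 * n_params - 2 * log_likelihood


def likelihood_ratio_test_1df(loglik_null: float, loglik_full: float):
    """Chi-square likelihood-ratio test for nested models differing by one parameter.

    Returns (statistic, p_value). For 1 degree of freedom the chi-square
    survival function reduces to erfc(sqrt(x / 2)).
    """
    statistic = 2 * (loglik_full - loglik_null)
    p_value = math.erfc(math.sqrt(max(statistic, 0.0) / 2))
    return statistic, p_value
```

For instance, if adding one fixed effect raises the log-likelihood from −100 to −98, the statistic is 4.0 with P ≈ .046, and the Akaike information criterion drops from 206 (three parameters) to 204 (four parameters), so the fuller model would be retained under the selection rule described above.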
Estimated marginal means, or the mean response for each factor adjusted for the other variables in the model, were computed for all models in this study using the “emmeans” package in R version 3.4.4.29,36 The 95% confidence intervals and differences between conditions were computed using these values, and the differences were assessed using a Bonferroni confidence level adjustment and a Holm P adjustment method for multiple comparisons.
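The two multiplicity adjustments named above can be illustrated with a short sketch. The Holm method multiplies the smallest of m P values by m, the next by m − 1, and so on, while enforcing that adjusted values never decrease; the Bonferroni adjustment widens each confidence interval so that the family-wise level is preserved. The function names are illustrative, and the code is a stand-alone approximation of the idea rather than the emmeans implementation.

```python
def holm_adjust(p_values):
    """Return Holm-adjusted P values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Step-down multiplier: m for the smallest P value, 1 for the largest.
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # keep adjusted values monotone
        adjusted[idx] = running_max
    return adjusted


def bonferroni_level(confidence: float, n_comparisons: int) -> float:
    """Per-interval confidence level that keeps the family-wise level."""
    return 1 - (1 - confidence) / n_comparisons
```

With four presentation modes there are six pairwise comparisons, so a family-wise 95% level corresponds to `bonferroni_level(0.95, 6)`, or roughly 99.2% per interval.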
The goal of this analysis was to investigate whether the findings replicate previous research5–8 using our free-response comprehension assessment tool. For sighted participants, the fixed effects used in the final model included presentation mode (F1,93 = 5.9, P = .02), passage topic (F3,93 = 7.2, P < .001), and age (F1,31 = 8.5, P = .007), along with a random intercept of participant. The model met all linear mixed-effects regression assumptions. Passage topic was controlled to assess the differences in total score between presentation modes. The average total score for sighted participants was significantly better with the text (mean, 74.8%; 95% confidence interval, 70.5 to 79.1%) compared with voice actor format (mean, 69.7%; 95% confidence interval, 65.4 to 74.0%; P = .02; Fig. 1, left). This result replicates what has been found in previous research.2–5
The difference in average comprehension between only the hard-copy braille and voice actor formats was assessed, as these two conditions are most analogous to the text and voice actor conditions used for the sighted participant analysis described previously. The aim was to see whether a similar finding exists among blind participants. For blind participants, in addition to the random intercept of participant, the fixed effects included in the final model were as follows: presentation mode (F1,32 = 5.5, P = .03), braille age (F1,32 = 6.3, P = .02), and participant age (F1,32 = 13.1, P < .001). This model met all linear mixed-effects regression assumptions.
Average comprehension was significantly better with hard-copy braille (mean, 70.4%; 95% confidence interval, 63.3 to 77.5%) compared with voice actor format (mean, 61.9%; 95% confidence interval, 54.7 to 69.0%; P = .03; Fig. 1, right). Furthermore, there was a stronger negative correlation between participant age and total score (r = −0.400) than between braille age and total score (r = −0.254).
Study 1 Summary
On average, sighted and blind participants comprehended the scientific passages better when they were presented in text or braille compared with when a voice actor read them aloud.
In this study, the impact of four presentation modes (hard-copy braille, braille display, voice actor, screen reader) on comprehension among blind participants was investigated.
As in the blind participant regression model in study 1, the final model had a random intercept effect of participant and the following fixed effects: participant age (F1,32 = 17.8, P < .0001), presentation mode (F3,96 = 5.1, P = .002), passage topic (F3,96 = 4.3, P = .007), interaction between presentation mode and passage topic (F9,74.9 = 2.6, P = .01), and braille age (F1,32 = 4.3, P = .05). This model met all linear mixed-effects regression assumptions.
Differences in total score between presentation modes were assessed, controlling for passage topic. Average total score was significantly better with hard-copy braille (mean, 70.6%; 95% confidence interval, 63.4 to 77.7%) than with a screen reader (mean, 60.7%; 95% confidence interval, 53.5 to 67.9%; P = .02) and better with the braille display (mean, 69.7%; 95% confidence interval, 62.5 to 76.9%) than with a screen reader (P = .04; Fig. 2). In addition, the average comprehension score in the voice actor condition (mean, 62.5%; 95% confidence interval, 55.3 to 69.6%) was not significantly different from that in the hard-copy braille (P = .10) or the braille display (P = .07) conditions. Then, presentation mode was held constant to examine differences in comprehension between passages. A significantly greater average comprehension score was found for passage C (mean, 70.3%; 95% confidence interval, 63.1 to 77.4%) compared with D (mean, 60.1%; 95% confidence interval, 52.9 to 67.2%; P = .02).
Although insufficient observations existed in each passage–presentation mode pair to statistically analyze the interaction, the trends in the data suggest that average braille comprehension was better than average auditory comprehension for passages A, B, and D. There existed no comprehension differences, however, between presentation formats for passage C. This result likely occurred because the topic covered by this passage was familiar to the participants and therefore too simple. Taken together, these findings highlight the need to consider the passage topic and difficulty level when investigating the impact of presentation mode on comprehension in the future.
This project investigated how differences in presentation mode influence comprehension of scientific material. In study 1, regression analyses revealed that, for sighted and blind individuals, reading scientific passages in text and hard-copy braille, respectively, generated better comprehension on average compared with listening to a voice actor. This result supports the hypothesis that increasing task effort and physical engagement enhances comprehension.5–8
The findings from study 2 with only blind individuals, however, complicated this interpretation. That is, if superior comprehension is supported by more physically engaging and cognitively effortful tasks, then both braille conditions (hard-copy and braille display) should have yielded higher comprehension scores compared with both audio conditions. Although the results revealed that the average comprehension when listening to the voice actor was more similar to that when listening to a screen reader than that when using either braille format, the difference between each braille condition and the voice actor condition was not statistically significant; this was unexpected, given the results from study 1. To explore this further, the effect sizes were calculated for these differences using a formula for Cohen d for mixed-effects models.37 If the effect sizes were small (i.e., d = 0.2),38 this would support the lack of a significant comprehension difference between each braille condition and the voice actor condition.
The observed effect size between hard-copy braille and voice actor conditions was d = 0.561, and that between the braille display and voice actor conditions was d = 0.501. Because these are considered medium effect sizes,38 a possibility is that—despite attempting to control for extraneous factors and using a sample size calculated a priori for sufficient power—the number of participants tested was not adequate for the precision required to make the comprehension difference statistically significant.
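A common way to obtain a standardized effect size from a mixed-effects model, and the approach we understand the cited tutorial37 to describe, is to divide the difference between condition means by the square root of the summed variance components (random effects plus residual). The sketch below assumes a model with only a participant random intercept and a residual term; the function name and the numbers in the usage note are invented for illustration and are not estimates from this study.

```python
import math


def mixed_model_cohen_d(mean_difference, variance_components):
    """Standardized effect size: mean difference over the pooled model SD.

    variance_components: iterable of variances from the fitted model,
    e.g., (participant intercept variance, residual variance).
    """
    total_variance = sum(variance_components)
    if total_variance <= 0:
        raise ValueError("variance components must sum to a positive value")
    return mean_difference / math.sqrt(total_variance)
```

For example, a hypothetical 8-point difference with a participant variance of 100 and a residual variance of 156 gives d = 8 / √256 = 0.5, which falls in the conventional medium range.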
Furthermore, the lack of significant difference between the hard-copy braille and voice actor conditions in study 2 could possibly be explained by the significant interaction between passage topic and presentation mode, which was not significant in study 1. Future research should further examine the extent to which the impact of presentation mode on comprehension depends on passage topic. Although comprehension among blind participants may not necessarily be explained by increased physical engagement, the findings from study 2 emphasize the need to continue producing scientific materials in braille, notwithstanding technological advancements that present information in audio. The results also suggest that the poor diction offered by synthesized speech may impair comprehension of nonfiction passages.
Although the findings presented here highlight the benefit of using braille formats to access reading materials, they do not recommend discounting the use of audio-based assistive technology in general. That is, the current results involving blind individuals particularly apply to fluent braille readers' comprehension of high-school biology passages using four different presentation modes. They do not imply that reading comprehension is impaired by using audio-based assistive technology. Indeed, it may be challenging for individuals who become blind later in life to learn braille, so they depend heavily on auditory means of accessing reading materials. There exist many other audio-based assistive technologies not tested here that may yield different results with the present experimental methodology.
In summary, the results from these studies aim to provide information on the relationship between altering the way reading materials are presented and comprehension; they are not meant to serve as a strict guideline for informing individuals how to access their reading materials, as this is ultimately at the discretion of the reader. In addition, the findings suggest that braille should not be replaced but rather supplemented by audio-based assistive technology in the education system.
Novel Contributions to Education
A novel comprehension assessment was developed to directly compare the results from blind and sighted individuals. This comparison lends insight into the mechanism of comprehension by addressing whether text comprehension is domain specific to vision or due to the increased effort hypothesis. If the superiority in comprehension is due to the latter, then blind participants' comprehension should be better when material is presented in braille than in audio.
These studies also incorporated widely used assistive technology. Because these devices tend to be more convenient, investigating the tradeoff between comprehension and accessibility is essential. In addition, these studies allowed for personalized adjustments to settings and for making regressions. Previous research has not allowed for regressions during audio conditions, although individuals made regressions during text/braille conditions. Because participants could use their preferred settings and make regressions, these studies represented a more realistic setting, thereby enhancing the applicability of the results to education.
These studies used a within-subjects design, producing more substantive results compared with a between-subjects design, because the comprehension differences between presentation modes were not biased because of varying preferences and/or experience with the different modes and devices. By using a within-subjects design, each participant acted as his or her own “control,” as preference and experience level were held constant across presentation modes. The results presented here therefore are due to differences between conditions.
Challenges of Creating Assessments
Many challenges come with developing an assessment tool that accurately assesses comprehension. To control for passage difficulty, the Flesch-Kincaid Readability Test22 was performed on each passage before experimentation. However, there still existed significant differences in comprehension between passage topics for blind and sighted individuals. One way to prevent these effects a priori is by collecting sufficient pilot data; however, at least 30 blind and 30 sighted participants were needed to develop fit statistics for the comprehension questions via Rasch modeling, and it was difficult to recruit fluent braille readers. This difficulty limited our capacity to test enough fluent braille readers for pilot data. Despite this shortcoming, this study was the first of its kind to take appropriate statistical measures to analyze questions that accurately measured comprehension for each group. Interestingly, fit statistics for blind and sighted individuals were different, highlighting the need to consider differences in participant groups when developing assessment tools.
1. Business Wire. Audible.com Announces a Tenfold Increase in Audiobooks Produced through the Audiobook Creation Exchange (ACX); 2013. Available at: http://www.businesswire.com/news/home/20130130005427/en/Audible.com-Announces-Tenfold-Increase-Audiobooks-Produced-Audiobook. Accessed March 18, 2018.
2. Carver ME. Listening versus Reading. In: Cantril H, Allport GW, eds. The Psychology of Radio. New York: Harper & Brothers; 1935.
3. Caughran AM. The Effect on Language Comprehension of Three Methods of Presentation [Doctoral Dissertation]. Columbia, MO: The University of Missouri-Columbia; 1953.
4. Green R. Remembering Ideas from Text: The Effect of Modality of Presentation. Br J Educ Psychol 1981;51:83–9.
5. Varao Sousa TL, Carriere JS, Smilek D. The Way We Encounter Reading Material Influences how Frequently We Mind Wander. Front Psychol 2013;4:892.
6. Forster S, Lavie N. Harnessing the Wandering Mind: The Role of Perceptual Load. Cognition 2009;111:345–55.
7. Thomson DR, Besner D, Smilek D. In Pursuit of Off-task Thought: Mind Wandering-performance Trade-offs while Reading Aloud and Color Naming. Front Psychol 2013;4:360.
8. Smallwood J, McSpadden M, Schooler JW. The Lights Are on but no One's Home: Meta-awareness and the Decoupling of Attention when the Mind Wanders. Psychon Bull Rev 2007;14:527–33.
9. Lowenfeld B. Braille Talking Book Reading: A Comparative Study. New York: The American Printing House for the Blind, Inc.; 1945.
10. Foulke E, Amster CH, Nolan CY, et al. The Comprehension of Rapid Speech by the Blind. Except Child 1962;29:134–42.
11. Hughes CC. A Comparison of Braille and Compressed Speech as Learning Modes for Legally Blind Adults [Doctoral Dissertation]. Corvallis, OR: Oregon State University; 1979.
12. Tuttle DW. A Comparison of Three Reading Media for the Blind: Braille, Normal Recording and Compressed Speech. Educ Visual Handicap 1972;4:40–4.
13. Nolan CY. Reading and Listening in Learning by the Blind. Except Child 1963;29:313–6.
14. Edmonds CJ, Pring L. Generating Inferences from Written and Spoken Language: A Comparison of Children with Visual Impairment and Children with Sight. Br J Dev Psychol 2006;24:337–51.
15. Birenbaum M, Tatsuoka KK. Open-ended versus Multiple-choice Response Formats—It Does Make a Difference for Diagnostic Purposes. Appl Psychol Measur 1987;11:385–95.
16. National Federation of the Blind (NFB). Blindness Statistics: Educational Attainment (US); 2018. Available at: https://nfb.org/blindness-statistics. Accessed March 12, 2018.
17. Bell EC, Mino NM. Employment Outcomes for Blind and Visually Impaired Adults. J Blind Innov Res 2015;5.
18. Büchel C, Price C, Frackowiak RS, et al. Different Activation Patterns in the Visual Cortex of Late and Congenitally Blind Subjects. Brain 1998;121(Pt 3):409–19.
19. Röder B, Rösler F, Neville HJ. Auditory Memory in Congenitally Blind Adults: A Behavioral-electrophysiological Investigation. Brain Res Cogn Brain Res 2001;11:289–303.
20. Wechsler D. Wechsler Individual Achievement Test. 3rd ed. San Antonio, TX: Psychological Corporation; 2009.
21. Wiederholt JL, Bryant BR. Gray Oral Reading Test—Fifth Edition (GORT-5): Examiner's Manual. Austin, TX: Pro-Ed; 2012.
22. Kincaid JP, Fishburne RP Jr., Rogers RL, Chissom BS. Derivation of New Readability Formulas (Automated Readability Index, Fog Count and Flesch Reading Ease Formula) for Navy Enlisted Personnel. Orlando, FL: Institute for Simulation and Training; 1975;56:1–32.
23. Miller KR, Levine JS. Prentice Hall Biology, Student Edition. 2008C. Prentice Hall Online Edition, Pearson Education, Inc. Available at: ocas.pearsonschool.com/ph/cd/0-13-115540-7/. Accessed May 9, 2017.
24. Supalo CA, Isaacson MD, Lombardi MV. Making Hands-on Science Learning Accessible for Students Who Are Blind or Have Low Vision. J Chem Educ 2013;91:195–9.
25. Singh A, Karayev S, Gutowski K, et al. Gradescope: A Fast, Flexible, and Fair System for Scalable Assessment of Handwritten Work. Proc ACM 2017:81–8.
26. Gwet K. Cohen's Kappa Index of Inter-rater Reliability; 2002. Available at: http://psych.unl.edu/psycrs/handcomp/hckappa.PDF. Accessed March 6, 2018.
27. Fleiss JL, Cohen J, Everitt BS. Large Sample Standard Errors of Kappa and Weighted Kappa. Psychol Bull 1969;72:323.
28. Arneson A, Freund R, Toyama Y, Wilson M. Construct Mapping with Rasch, CRASCH Package for R. GitHub Repository; 2015. Available at: github.com/amyarneson/crasch. Accessed March 6, 2018.
29. R: A Language and Environment for Statistical Computing: Version 3.4.4 [Computer Program]. Vienna, Austria: R Foundation for Statistical Computing; 2018. Available at: http://www.R-project.org/. Accessed March 6, 2018.
30. Adams RJ, Khoo ST. Quest. Melbourne, Australia: Australian Council for Education Research; 1996.
31. Wilson M. Constructing Measures: An Item Response Modeling Approach; 2005. Available at: https://ebookcentral-proquest-com.libproxy.berkeley.edu/lib/berkeley-ebooks. Accessed April 18, 2018.
32. Bates D, Mächler M, Bolker B, et al. Fitting Linear Mixed-effects Models Using Lme4. J Stat Software 2015;67:1–48.
33. Baayen RH, Davidson DJ, Bates DM. Mixed-effects Modeling with Crossed Random Effects for Subjects and Items. J Memory Lang 2008;59:390–412.
34. Judd CM, Westfall J, Kenny DA. Treating Stimuli as a Random Factor in Social Psychology: A New and Comprehensive Solution to a Pervasive but Largely Ignored Problem. J Pers Soc Psychol 2012;103:54–69.
35. Winter B. Linear Models and Linear Mixed Effects Models in R with Linguistic Applications; 2013. arXiv preprint:1308.5499. Available at: https://arxiv.org/ftp/arxiv/papers/1308/1308.5499.pdf. Accessed March 2, 2018.
36. Lenth R, Singmann H, Love J, Buerkner P, Herve M. Emmeans: Estimated Marginal Means, AKA Least-squares Means. R Package Version 1.1.3; 2018. Available at: https://CRAN.R-project.org/package=emmeans. Accessed May 4, 2018.
37. Brysbaert M, Stevens M. Power Analysis and Effect Size in Mixed Effects Models: A Tutorial. J Cognit 2018;1. Available at: https://www.journalofcognition.org/articles/10.5334/joc.10/. Accessed November 30, 2018.
38. Cohen J. Statistical Power Analysis for the Behavioral Sciences. New York, NY: Routledge Academic; 1988.