The Internet is increasingly used as a medium for gathering and exchanging health information. Healthcare professionals and organizations need to consider barriers that may exist within their patient-oriented Web applications. One approach to making the Web more accessible for those with lower health literacy may be to supplement textual content with audio annotation using text-to-speech engines, allowing for the creation of a virtual surrogate reader. One challenge is that, with numerous text-to-speech engines on the market, objective measures of quality are difficult to obtain. To facilitate comparisons of text-to-speech engines, we developed an open-source Web application that measures user reaction times, subjective quality ratings, and accuracy in completing tasks across different audio files created by text-to-speech engines. We successfully built and piloted this Web application; significant differences were found in subjective ratings of quality across three text-to-speech engines priced at different levels. However, no significant differences were found in reaction times or accuracy between these engines. Future avenues of research include exploring more complex tasks, usability issues related to implementing text-to-speech features, and applied health promotion and education opportunities among vulnerable populations.
Author Affiliations: School of Nursing (Drs Wolpin, Berry, Kurth, and Lober), School of Medicine (Drs Berry and Lober), and School of Public Health and Community Medicine (Drs Kurth and Lober), University of Washington, Seattle.
This study was funded by the University of Washington's School of Nursing Research Intramural Funding Program.
Corresponding author: Seth Wolpin, PhD, MPH, RN, Department of Biobehavioral Nursing and Health Systems, University of Washington, Box 357266, Seattle, WA 98195-7266 (firstname.lastname@example.org).