This study develops and evaluates open-source software, called NimbleMiner, that allows clinicians to interact with word embedding models with the goal of creating lexicons of similar terms. As a case study, the system was used to identify similar terms for patient fall history from homecare visit notes (N = 1 149 586) extracted from a large US homecare agency. Several experiments with the parameters of word embedding models were conducted to identify the most time-effective, highest-quality model. The most effective models used a larger word window (width of 10) and presented users with about the 50 most similar candidate terms for each term the user had validated as true. NimbleMiner can assist in building a thorough vocabulary of fall history terms in about 2 hours. For domains like nursing, this approach could offer a valuable tool for rapid lexicon enrichment and discovery.
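The interactive loop described above (the user validates a term, and the system then proposes the most similar terms for that validated term) can be illustrated with a minimal sketch. This is not NimbleMiner's implementation: the toy vectors, the function names `top_similar` and `expand_lexicon`, and the `accept` callback that stands in for the clinician's judgment are all hypothetical, and a real system would query a word embedding model trained on the clinical notes rather than a hand-written dictionary.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_similar(term, vectors, k):
    """Return the k terms whose vectors are closest to `term` by cosine similarity."""
    scores = [(other, cosine(vectors[term], vec))
              for other, vec in vectors.items() if other != term]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:k]]

def expand_lexicon(seeds, vectors, accept, k=50):
    """Grow a lexicon by showing the reviewer the top-k neighbors of each accepted term.

    `accept` is a hypothetical stand-in for the clinician's yes/no decision.
    """
    queue = [s for s in seeds if s in vectors]
    lexicon, rejected = set(queue), set()
    while queue:
        term = queue.pop()
        for candidate in top_similar(term, vectors, k):
            if candidate in lexicon or candidate in rejected:
                continue  # each candidate is reviewed only once
            if accept(candidate):
                lexicon.add(candidate)
                queue.append(candidate)  # accepted terms seed further suggestions
            else:
                rejected.add(candidate)
    return lexicon

# Toy 3-dimensional vectors, invented purely for illustration.
TOY_VECTORS = {
    "fall": [0.9, 0.1, 0.0],
    "fell": [0.85, 0.15, 0.05],
    "slipped": [0.8, 0.2, 0.1],
    "tripped": [0.75, 0.25, 0.1],
    "insulin": [0.0, 0.1, 0.95],
    "glucose": [0.05, 0.1, 0.9],
}
FALL_TERMS = {"fall", "fell", "slipped", "tripped"}

# Simulated reviewer: accepts only terms that describe a fall.
lexicon = expand_lexicon(["fall"], TOY_VECTORS, accept=lambda t: t in FALL_TERMS, k=3)
print(sorted(lexicon))
```

The default `k=50` mirrors the number of candidate terms per validated term that the study found most effective; the window width of 10 is a training-time parameter of the embedding model and therefore does not appear in this sketch.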
Author Affiliations: School of Nursing, Columbia University, New York (Drs Topaz and Murga and Ms Bar-Bachar); Harvard Medical School & Brigham and Women's Hospital, Boston, MA (Dr Topaz); The Visiting Nurse Service of New York (Ms McDonald and Dr Bowles); and School of Nursing, University of Pennsylvania, Philadelphia (Dr Bowles).
The authors have disclosed that they have no significant relationships with, or financial interest in, any commercial companies pertaining to this article.
Corresponding author: Maxim Topaz, PhD, RN, 560 W 168th St, New York, NY 10032 (firstname.lastname@example.org).
Online date: September 3, 2019