
Feature Articles

Content Validation of a Quality and Safety Education for Nurses–Based Clinical Evaluation Instrument

Altmiller, Gerry EdD, APRN, ACNS-BC

doi: 10.1097/NNE.0000000000000307


Many educators use Quality and Safety Education for Nurses (QSEN) as a framework for developing teaching strategies that support quality improvement and patient safety. In the 10 years since its formation, QSEN has significantly influenced prelicensure nursing education, yet there is a dearth of evaluative data regarding its efficacy. As tools and teaching strategies are developed based on the QSEN competencies, educators need to provide evidence to support the value of their implementation. This article discusses the development of a QSEN-based clinical evaluation instrument for implementation in prelicensure nursing education and describes the process of establishing content validation for its items.

The QSEN competencies identify the knowledge, skills, and attitudes needed by nurses to meet the demands of the health care environment, emphasizing patient-centered care, collaboration with other members of the health care team, evidence-based practice, quality improvement, safety, and the integrated use of informatics.1 Applying these concepts to clinical performance evaluation provides a clear framework for objective analysis of student competency in the clinical setting. In addition, the QSEN competencies organize the expectations for clinical practice into coherent categories so that students can more accurately identify areas of strength and weakness.

Developing the Clinical Evaluation Instrument

The purpose of this study was to develop a valid and reliable instrument to measure competency in prelicensure clinical nursing courses. Evaluation of clinical competency is a high-stakes assessment associated with a deliberate outcome or consequence, which makes it necessary to use a valid method of measurement.2 The senior-level acute care nursing course was chosen as the course in which to introduce and pilot the instrument. The goal was to reduce error in the measurement process and score assignment for clinical evaluation by providing an objective means with which to assign a score. The QSEN competencies were chosen as the framework for the instrument because of their applicability to practice.

Instrument development frequently consists of 2 phases; the first is creation of the new instrument with the goal of providing a comprehensive conceptualization of the construct, based on first-hand knowledge and thorough review of the literature.3 The second phase involves rigorous review of the instrument by content experts. The process of this instrument development began with a detailed review and gap analysis of existing instruments that specifically used the QSEN competencies as a framework to measure the construct of student nurse clinical competency. At the time this project began, 3 QSEN-based clinical evaluation instruments were located on the QSEN Web site or in the literature.4-6 Although all were described well, little information was available about each instrument's validity and reliability. The developers of the instruments were contacted and asked permission to review their instrument and incorporate some aspects into a newly developed instrument. All were agreeable to this request.

As development of the QSEN-based clinical evaluation instrument began, the 6 categories of the QSEN competencies provided the headings. A seventh heading was added to address professional role development because of its importance in clinical instruction. Forty-four performance items based on the knowledge, skills, and attitudes that describe the QSEN competencies were created and grouped under 1 of the 7 headings. Each item was followed by a number in parentheses that indicated the student learning outcome of the course with which that specific item was associated.

The literature indicates that clinical evaluation utilized in prelicensure nursing programs is frequently perceived by students as having a high degree of subjectivity.7-9 The process varies among schools, and depending on the program, clinical evaluation can be a pass/fail rating or require a grade assignment. With our previous clinical evaluation form, a grade assignment was computed by averaging the ratings, which ranged from 1 to 4 (lowest to highest), across all items listed on the clinical evaluation. There was no clear criterion to determine what constituted a rating of 4 versus what constituted a lesser rating. With the development of new performance criteria came the opportunity to reorganize the grading system to decrease subjectivity. The 1 to 4 rating scale was kept, and clear scoring criteria were adopted and modified from existing QSEN-based clinical evaluations to facilitate a more objective score assignment during the evaluation process. This clinical evaluation rating scale was included as a page of the evaluation instrument (Table). Four specific items that included safe medication administration, communicating patient changes promptly and appropriately, maintaining performance at the expected level, and protecting confidentiality were deemed essential skills and marked with an asterisk. These criteria reflected critical knowledge, skills, and attitudes related directly to patient safety, therefore requiring an acceptable level of performance at all times.

Table. Clinical Evaluation Rating Scale

Once the items were developed and the grading scale clarified, the instrument was reviewed for face validity by faculty members of the school of nursing. Faculty were asked to comment directly on the evaluation instrument; all feedback was welcomed. Most faculty chose to participate and offered feedback, which provided the basis for additional refinement of items to increase clarity and add meaning before conducting a rigorous review by content experts.


Data Collection Procedure

A panel of 6 expert nurse educators was recruited to score the specific items of the QSEN-based clinical evaluation instrument. Two reviewers were well versed in QSEN, having been members of one of QSEN's pilot schools and associated with the organization since. Another reviewer was from an early adopter school and had created multiple teaching strategies published on the QSEN Web site. These 3 reviewers were doctorally prepared and teaching in nursing programs at prominent universities in the midwestern, southeastern, and mid-Atlantic United States. A fourth reviewer who was master's prepared held a faculty position at the school of nursing and was familiar with QSEN through her work with simulation. The remaining 2 reviewers were experienced master's-prepared clinical nursing adjunct faculty, teaching part-time at the school of nursing while holding full-time hospital practice positions. Although these 2 reviewers did not have specific knowledge of the QSEN competencies, both were well informed through their practice position about the concepts the competencies represent. The selection of reviewers with varied levels of QSEN knowledge was purposeful to ensure that the clinical evaluation not only aligned with the QSEN competencies but also aligned well with clinical practice. Participation in the review was incentivized by a $50 honorarium for completion of each review, which was estimated to take about 30 minutes. The honorariums were funded by a small grant awarded by the nursing school.

Data collection occurred over 2 rounds of reviews completed by all 6 content reviewers to determine the final version of the instrument. Both reviews followed the same execution process. For each review, the experts were provided clear written directions about the purpose of the review and directions for scoring the individual items. An opportunity to have questions answered was provided. The directions asked the reviewer to rate the level of agreement with the relevance (appropriateness) of the item to be included in a QSEN-based clinical evaluation instrument for senior-level nursing students. Reviewers were asked for detailed comments regarding individual items and the overall instrument during each review. During the first round of review, the focus was to determine whether the items thoroughly addressed the domain of clinical performance evaluation and if the construct of clinical performance was clearly represented by the items. The second round of review served to clarify that reviewer feedback was accurately reflected in the refinement of the items and to assess the content validity of the items and scale as a whole.


Establishing content validity had 2 aims: the first was achieving expert consensus that the items were relevant for inclusion in a QSEN-based clinical evaluation instrument for senior-level nursing students, and the second was reducing error in the measurement process by increasing clarity of items. The approach chosen to assess content validity was the content validity index (CVI), a process used for calculating content validity based on ratings of relevance by an expert panel. The CVI is the most widely used method of determining content validity for multi-item scales in nursing research.10 Considered a process to compute consensus estimates, the CVI quantifies the extent to which experts agree; a low level of agreement indicates that the instrument does not create a shared understanding of a construct.

The CVI was used to quantify the degree of relevance for each item, as well as to compute a value for the overall instrument. Item levels were calculated based on scores assigned by 6 expert nurse educators using a 4-point ordinal scale with the following values: 1 = not relevant, 2 = somewhat relevant, 3 = quite relevant, and 4 = highly relevant. The item CVI (I-CVI) was computed as the number of experts giving a rating of 3 or 4, divided by the number of experts, in this case 6, which indicated the proportion of agreement about an item's relevance. Polit and Beck10 suggested that when there are 4 or fewer experts, 100% agreement is required, but with 5 or more experts, investigators of new instruments can tolerate 1 rating of “not relevant” for an item to be considered valid, allowing for a modest amount of disagreement.
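The I-CVI calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the study (which was carried out in Microsoft Excel); the function name `item_cvi` and the example ratings are hypothetical and are not study data.

```python
from typing import Sequence


def item_cvi(ratings: Sequence[int]) -> float:
    """Return the I-CVI: the proportion of experts rating an item 3 or 4."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    if any(r not in (1, 2, 3, 4) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 4")
    # Ratings of 3 (quite relevant) and 4 (highly relevant) count as agreement.
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)


# Hypothetical ratings from 6 experts for a single item (not study data).
example_ratings = [4, 4, 3, 4, 2, 4]
print(round(item_cvi(example_ratings), 2))  # 5 of 6 experts rated 3 or 4 -> 0.83
```

With 6 experts, the only possible I-CVI values are multiples of 1/6, so a single rating below 3 yields 5/6, or about 0.83.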

The scale-level CVI (S-CVI) was computed using 2 methods, one determining universal agreement (UA) and the other averaging (Ave) item-level agreement across experts. Scale developers use a criterion of 0.80 or greater and 0.90 or greater, respectively, as the lower limit of acceptability for scale-level values.6,13 Because the CVI does not adjust for chance agreement, a modified kappa (κ) coefficient was calculated for each item to index agreement on relevance after adjusting for chance.
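The 2 scale-level methods can be illustrated with a short Python sketch. It assumes ratings are stored as one list per item with one rating per expert; the function name `scale_cvi` and the 4-item rating matrix are hypothetical and serve only to show how the 2 values can differ.

```python
from typing import Dict, List


def scale_cvi(ratings_by_item: List[List[int]]) -> Dict[str, float]:
    """Compute S-CVI/UA and S-CVI/Ave from per-item expert ratings (1 to 4)."""
    if not ratings_by_item or any(not item for item in ratings_by_item):
        raise ValueError("every item needs at least one rating")
    # Item-level CVI: proportion of experts rating the item 3 or 4.
    i_cvis = [sum(1 for r in item if r >= 3) / len(item) for item in ratings_by_item]
    # Universal agreement: proportion of items rated 3 or 4 by every expert.
    ua = sum(1 for value in i_cvis if value == 1.0) / len(i_cvis)
    # Averaging method: mean of the item-level CVIs.
    ave = sum(i_cvis) / len(i_cvis)
    return {"S-CVI/UA": ua, "S-CVI/Ave": ave}


# Hypothetical ratings: 4 items, each rated by 6 experts (not study data).
hypothetical = [
    [4, 4, 3, 4, 4, 4],
    [3, 4, 4, 4, 2, 4],  # one expert rated this item below 3
    [4, 3, 4, 4, 4, 3],
    [4, 4, 4, 3, 4, 4],
]
result = scale_cvi(hypothetical)
print({name: round(value, 3) for name, value in result.items()})
```

In this hypothetical case, a single dissenting rating lowers S-CVI/UA to 0.75 while S-CVI/Ave stays near 0.96, which shows why the universal agreement method is the more conservative of the two.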

Data Analysis

Microsoft Excel was used to calculate mean scores and CVI for each item included in each of the reviews. This item-level information was used to refine or discard items. The CVI was calculated by dichotomizing ratings into relevant (quite relevant or highly relevant, ratings 3 and 4) and not relevant (not relevant or somewhat relevant, ratings 1 and 2). Comments from reviewers formed the basis for adjustments to increase clarity of items. Scale CVI was calculated using 2 methods to determine the content validity of the entire scale. The proportion agreement was calculated to show items judged as relevant across the 6 experts. A modified kappa (κ) coefficient, an index of agreement among reviewers that an item is relevant after adjusting for chance, was calculated for each item to further support content validity.


After the first review, adjustments were made to specific items based on CVI scores and reviewer feedback. Of the 44 items included in the first round, 4 items were rated as not relevant for a QSEN-based clinical evaluation instrument with CVI values of 0.57 (out of possible 1). Those items were eliminated or adjusted based on feedback from reviewers regarding syntax that would clarify item meanings.

After adjustments were completed, the same 6 expert educators participated in the second review of the revised 43-item QSEN-based clinical evaluation instrument. The second review yielded 42 items with a CVI of 0.83 or higher, indicating that those items were content valid and therefore appropriate for the final version of the instrument (see Table, Supplemental Digital Content 1).

The S-CVI of the final 42-item instrument was then computed using 2 methods. The first method followed the requirements of UA, which calculates the proportion of items on the instrument that achieved a rating of 3 or 4 by all content experts. The proportion can range from 0 to 1. The S-CVI/UA for the final version of this instrument was 0.88 (37 of 42 items), meaning that 88% of the items were rated as relevant (rated as 3 or 4) by all 6 experts. The second approach to measure S-CVI of the instrument was to compute the I-CVI for each item on the scale and then calculate the average (Ave) I-CVI of all 42 items. Using this approach, Polit and Beck10 recommend an S-CVI of 0.90 or higher for an instrument to be judged as having excellent content validity. Averaging across the 42 items of the final version of this instrument, the S-CVI/Ave yielded a score of 0.979 (out of possible 1). Such close agreement between these 2 different calculation methods for S-CVI does not always occur but was attributed to the fact that 37 of the 42 items had an I-CVI value of 1, meaning all 6 experts scored the item as relevant (3 or 4).
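The reported S-CVI/Ave can be checked with a short worked calculation. It assumes that the 5 items without an I-CVI of 1 each had an I-CVI of 5/6 (about 0.83); this follows from the reported minimum of 0.83 with 6 experts but is not stated item by item in the text.

```latex
\text{S-CVI/Ave}
  = \frac{1}{42}\sum_{i=1}^{42} \text{I-CVI}_i
  = \frac{37(1) + 5\left(\tfrac{5}{6}\right)}{42}
  \approx \frac{41.17}{42}
  \approx 0.980
```

Using the rounded item value of 0.83 instead of 5/6 gives 41.15/42, or about 0.9798, which is consistent with the reported 0.979 up to rounding.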

For this study, to address chance agreement as well as augment content validity, a modified kappa (κ), a coefficient that indexes agreement among reviewers that the item is relevant after adjusting for chance, was reported for each item (see Table, Supplemental Digital Content 1). A modified kappa of greater than or equal to 0.78 is considered excellent.11 For each of the 42 items included in the final instrument, the modified kappa ranged from 0.81 to 1.
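The modified kappa can be sketched as follows, assuming the formulation described by Polit, Beck, and Owen (reference 11): the probability of chance agreement is taken from a binomial with a 0.5 probability of a relevant rating, and the I-CVI is adjusted by it. The function name `modified_kappa` is hypothetical, and the sketch is an illustration rather than the calculation performed in the study.

```python
from math import comb


def modified_kappa(n_relevant: int, n_experts: int) -> float:
    """Adjust an item's I-CVI for chance agreement on relevance.

    n_relevant: number of experts rating the item 3 or 4.
    n_experts: total number of experts who rated the item.
    """
    if n_experts <= 0 or not 0 <= n_relevant <= n_experts:
        raise ValueError("need 0 <= n_relevant <= n_experts and n_experts > 0")
    i_cvi = n_relevant / n_experts
    # Probability that exactly n_relevant of n_experts agree by chance,
    # assuming each expert is equally likely to rate relevant or not relevant.
    p_chance = comb(n_experts, n_relevant) * 0.5 ** n_experts
    return (i_cvi - p_chance) / (1 - p_chance)


print(round(modified_kappa(5, 6), 3))  # 5 of 6 experts agree -> 0.816
print(modified_kappa(6, 6))            # all 6 experts agree -> 1.0
```

With 6 experts, 5 ratings of relevant give a modified kappa of about 0.82 when the unrounded I-CVI of 5/6 is used, and about 0.81 when the rounded value of 0.83 is used, in line with the lower bound reported above.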

An additional index that supports scale-level content validity is proportion of relevance. It is a measure of the proportion of experts that agree on the relevance of all items included in the scale, very similar to what the CVI does. For this instrument, the proportion agreement of items judged as relevant across the expert educators was 0.965, well above the defensible minimal standard of 0.80.


The CVI is a well-established validity index widely used by nurse researchers. Process measures that support the content validation in this study included establishing face validity with school faculty before having nurse experts conduct a review, providing detailed instructions to the reviewers for quantifying the items during the instrument review, and using an identical process for the execution of both instrument reviews.11,12 Inviting detailed comments from the reviewers and including content experts from colleges and universities around the country who have published or presented nationally in the content area add rigor to the validation process.2 Descriptions and qualifications of the expert nurse educators who participated in the study establish their expertise to serve as reviewers to rate the items of a QSEN-based clinical evaluation instrument.

The data collected in this study provided content validation for this newly developed instrument. The high level of agreement among the 6 expert reviewers supports that the content is relevant and appropriate for inclusion in a QSEN-based clinical evaluation instrument to define and evaluate student clinical performance. High item-level scores, despite the varying knowledge level of the QSEN competencies by the expert nurse educator reviewers, suggest that the QSEN competencies provide a relevant framework for clinical practice evaluation and that the items included in this QSEN-based clinical evaluation instrument provide valid measures for contemporary nursing education and practice.

Nursing Implications

This QSEN-based clinical evaluation instrument provides a means to assign a score to student competency for the knowledge, skills, and attitudes associated with professional nursing practice, thereby providing an objective manner with which to assign student grades for clinical practice courses. The organization of this instrument lends itself to adaptability by many nursing programs. Items focused on nursing theories and completion of required written work before and after the clinical experience can be tailored to specific program requirements. A measured score requires greater precision with scoring than a pass/fail designation, which this instrument supports in its current state. The instrument can be modified to a pass/fail or satisfactory/unsatisfactory scoring system with minor adjustments.

Standardization reduces variation in practice, and the same applies to nursing education. Standardizing clinical evaluation based on a framework such as QSEN, which aligns well with nursing education and practice, clarifies the expectation for competent practitioners. Determining content validity of an evaluation instrument is essential when it is used in a high-stakes assessment of student performance.2 Implementing a valid QSEN-based clinical evaluation instrument supports the work of nurse educators as they set the standard for competent student nurse performance.


This study establishes content validity for this QSEN-based clinical evaluation instrument. Content validity is only 1 aspect of an effective clinical evaluation instrument. A next step would include establishing construct validity to indicate that the instrument effectively measures the construct of clinical success. Future studies will be needed to provide further validity and reliability data as the instrument is adapted and leveled to multiple nursing courses, using language that demonstrates progression.


When developing an instrument, nurse educators need to address content validation and provide evidence that the construct is thoroughly addressed by the instrument. Evidence exists to support content validation of this QSEN-based clinical evaluation instrument. Applying the QSEN competencies to clinical performance evaluation provides a clear and organized framework for objective analysis of student performance in the clinical setting. The high degree of agreement among expert nurse educators demonstrates that the items included in this QSEN-based clinical evaluation instrument provide a relevant framework for contemporary nursing education and practice.


I would like to thank Amanda Eymard, DNS, APRN, CNE, PMHNP-BC, from Nicholls State University, Linda Flores, MSN, CEN, from Western University of Health Sciences, and JoAnn Mulready-Shick, EdD, RN, CNE, ANEF, from University of Massachusetts Boston for allowing me to review their clinical evaluation instruments and adopt aspects in the creation of this QSEN-based clinical evaluation instrument.


References

1. Cronenwett L, Sherwood G, Barnsteiner J, et al. Quality and safety education for nurses. Nurs Outlook. 2007;55(3):122-131.
2. Rutherford-Hemming T. Determining content validity and reporting a content validity index for simulation scenarios. Nurs Educ Perspect. 2015;36(6):389-393.
3. Orts-Cortes MI, Moreno-Casbas T, Squires A, Fuentelsaz-Gallego C, Macia-Soler L, Gonzalez-Maria E. Content validity of the Spanish version of the Practice Environment Scale of the Nursing Work Index. Appl Nurs Res. 2013;26:e5-e9.
4. Flores L, Shakhshir P, Lopez M. Clinical evaluation tools embodying AACN BSN essentials and 6 QSEN KSAs. Quality and Safety Education for Nurses Teaching Strategy. 2014. Available at Accessed May 13, 2015.
5. Mulready-Shick J. Integrating QSEN into clinical evaluation tools. Quality and Safety Education for Nurses teaching strategy. 2012. Available at Accessed May 13, 2015.
6. Eymard A, Lyons R, Davis A. Clinical performance evaluation tools utilizing the QSEN competencies. Quality and Safety Education for Nurses teaching strategies. 2012. Available at Accessed May 13, 2015.
7. Altmiller G. Student perceptions of incivility in nursing education: implications for educators. Nurs Educ Perspect. 2012;33(1):15-20.
8. Delprato D. Students' voices: The lived experience of faculty incivility as a barrier to professional formation in associate degree nursing education. Nurse Educ Today. 2013;33(3):286-290.
9. Lasiter S, Marchiondo L, Marchiondo K. Student narrative of faculty incivility. Nurs Outlook. 2012;60(3):121-126.
10. Polit DF, Beck CT. Nursing research: Generating and assessing evidence for nursing practice. 10th ed. Philadelphia, PA: Wolters Kluwer; 2017.
11. Polit DF, Beck CT, Owen SV. Is the CVI an acceptable indicator of content validity? Appraisal and recommendations. Res Nurs Health. 2007;30:459-467.
12. Brink Y, Louw Q. Clinical instruments: Reliability and validity critical appraisal. J Eval Clin Pract. 2011;18:1126-1132.

Keywords: clinical evaluation tool; instrument development; nursing education; QSEN; rating scale


Copyright © 2017 Wolters Kluwer Health, Inc. All rights reserved.