Research Report

Validation of a Self-Report Clinical Decision-Making Tool Using Rasch Analysis

Macauley, Kelly PT, EdD, DPT, CCS, GCS; Brudvig, Tracy PT, PhD, DPT, OCS; Barry, Amanda PT, DPT, ATC; Lufkin, Olivia PT, DPT; McEnroy, Kevin PT, DPT; Milinazzo, Andrew PT, DPT

Journal of Physical Therapy Education: September 2018 - Volume 32 - Issue 3 - p 248-257
doi: 10.1097/JTE.0000000000000019



The ability of a practicing clinician to navigate seamlessly through an episode of care depends on the development of their clinical decision-making (CDM) skills. Clinical decision making is defined as “reasoning that results in action.”1,2 Clinical decision making is a multifaceted skill, comprising critical thinking, problem solving, and the use of evidence-based practice to deliver appropriate and effective patient care.3 The development of CDM is dynamic and occurs across several practice domains and through social and professional enculturation.4,5 Decision making develops over time and is initially governed by hypothetico-deductive reasoning.3,6,7 In the hypothetico-deductive model, a clinician gathers some information about a patient and then generates hypotheses about the cause of the chief complaint. As a clinician's decision-making skills progress, the process shifts from a hypothesis-oriented model to an evolving process where new information emerges and is immediately integrated into a clinician's thoughts, thereby guiding the clinician's decisions.6

Although the terms clinical decision making (CDM), critical thinking, and clinical reasoning are often used interchangeably, each represents a different process. Clinical reasoning is a complex process that results in a judgment that leads to an action.8 Clinical reasoning also takes into account contextual factors as part of the process and develops over time.9 Clinical reasoning incorporates the ability to frame and solve problems relative to patient care.10-12 Its components include critical thinking12 and decision making.10 Critical thinking is “the disciplined activity of evaluating arguments or propositions and making judgments that can guide the development of beliefs and taking action.”13 Clinical decision making occurs as a result of the clinical reasoning process.14 Like clinical reasoning, CDM develops over time.15 Although CDM is an end product of clinical reasoning, this study is focused solely on the development and assessment of CDM skills. To differentiate CDM from critical thinking and clinical reasoning, CDM will be characterized as “reasoning that results in action.”1,2

Clinical practice requires CDM. A physical therapist needs to make evidence-based choices and to follow a course of action, as physical therapists are tasked with making multiple decisions within one patient interaction.16 The decisions physical therapists make extend across both patient and practice management.2 Physical therapists also need to make decisions rapidly, often with incomplete data and/or competing goals, which adds another layer of complexity.16 These clinical practice requirements demonstrate why students and clinicians must learn and hone their CDM skills.

Knowledge and experience help to guide CDM.3,6 Therefore, novice clinicians have different CDM skills than expert practitioners.3,7,17,18 Jette et al19 defined the CDM characteristics of novice, entry-level clinicians in an acute care environment. For experienced therapists to consider students to be entry level, the entry-level students needed to demonstrate an ability to synthesize information, react to changes, or modify their management quickly. This required an ability to adapt, demonstrate flexibility, and to continuously evaluate, while maintaining a broad view. The novices needed to choose appropriate examination and intervention techniques while providing a rationale. Last, novices needed to recognize when something was outside of their scope of practice and advocate for assistance from more experienced therapists or other health care professionals.

Entry-level clinicians struggle with controlling a busy environment.18 As a result, they make decisions to give them more time rather than focusing on making decisions that would lead to optimal patient experiences or outcomes.18 Novice clinicians are also uncomfortable with not having all the information when faced with making decisions, especially when attempting to predict patient outcomes.18

In contrast to novices, experts have an improvisational style of patient care instead of a formal CDM process.7,18 When confronted with a frantic environment or lack of information, experts are able to control the environment and make decisions when faced with uncertainty.18 Expert therapists do not get bogged down in medical information and are able to better relate their CDM process to the patient's functional status than entry-level therapists. This allows the experts to make joint decisions with patients and family members.5 May and Dennis6 found that even within experts, there are differences in decision-making processes depending on the content expertise. Therapists treating patients with neurologic disorders used a preceptive data gathering style and intuitive data processing system. Therapists using a preceptive data gathering style move organically between systems based on the information provided and emerging patterns, and intuitive processing allows the therapist to consider all possibilities simultaneously. Experts in orthopedic physical therapy followed a process similar to the hypothetico-deductive style.

Patel17 described four ways expert physicians make decisions differently than novices: experts possess highly efficient ways of keeping information in working memory when they need to solve problems; experts have a large number of “domain specific rules” that help them to make decisions; experts use heuristics derived from a strong knowledge base; and they tend toward forward reasoning (facts to a solution). Similarly, in physical therapy, experts have meaningful recall of information, patterns, and relationships that help them make successful decisions.18

Another important distinction between experts and novices is that when experts make decisions, the patient's needs and goals are considered first.5 Experts are also better able to navigate or control competing demands.18 The patient is an important source of information, necessary for the decision-making process for experts.5

The current literature defines CDM at the novice or entry level, longitudinally through the first 2 years of practice,4,20 and the expert level of practice. However, the CDM development process is not well defined in students or prenovice clinicians, and the timeframe for development is unknown.3 Because CDM is an important component of physical therapy education and practice, and changes as a student develops into a novice clinician, it is beneficial to have tools that assess and guide a student's progression.

Tools That Assess CDM

There are limited tools available to assess the CDM process.15 May and Dennis created a tool to measure experts' CDM process.6 The researchers created items for the tool based on qualitative interviews of expert therapists. They developed six categories based on existing research in cognitive styles: receptive, preceptive, systematic, intuitive, affect or belief, and knowledge. The six categories of statements were then assessed on a four-point Likert scale. A pilot study was conducted on physical therapists in the United States and Australia, followed by a factor analysis of the tool. The results yielded a final tool with 15 demographic questions, 14 questions related to information sources, and 36 questions related to styles of decision making. The final statements on the tool are expert physical therapy practice based, limiting its application to students in academic settings.

Some CDM assessment tools have been identified in nursing research. These include the nursing performance simulation instrument,21 the Clinical Decision Making in Nursing Scale,22 the Participation in Decision Activities Questionnaire,23 and a 56-item instrument based on decision-making theories.24 The use of these tools within physical therapist education is not feasible, as the items are specific to the nursing field and nursing practice. They also do not assess change in CDM skills over time.25

CDM Tool Development

Because of a lack of appropriate assessments, Brudvig and Macauley25 developed a tool for evaluating CDM in physical therapist students. The original 25-item scale was designed to measure two constructs: CDM and clinical skills (CS). We believed that there was an advantage to using existing items from established scales given that these items had already undergone rigorous assessment demonstrating their potential utility.26 Therefore, we investigated using components of the Physical Therapist Clinical Performance Instrument (PTCPI).19 Physical therapist students' performance during clinical experiences is assessed using the PTCPI. The items on the PTCPI are organized into three themes: professional practice, patient management, and practice management.27 Each theme requires CDM by the physical therapist. In addition, the PTCPI was developed using key hallmark physical therapy documents, including The Normative Model for Physical Therapist Professional Education, The Guide to Physical Therapists Practice, Vision 2020, The Commission on Accreditation of Physical Therapy Education Programs Evaluative Criteria, and Professionalism in Physical Therapy: Core Values. Together, these documents establish what physical therapy practice encompasses, which includes CDM.

However, the PTCPI does not assess CDM independently. Rather, CDM is embedded in multiple items. By contrast, the sample behaviors describing each domain on the PTCPI touch on the cognitive, affective, and psychomotor domains of CDM. Therefore, the scale items were adapted from sample behaviors on the PTCPI (Figure 1). Sample behaviors were selected from the PTCPI based on components of CDM, clinical reasoning, and critical thinking found in the literature.2-7 Examples of words or implications in the sample behaviors that we felt related to these constructs are as follows: knowledge, assessing and reassessing, synthesizing, evidence, referral, interpreting results, patient-centered care, diagnosis, prioritize, and reflection. Content validity was assessed by a group of 10 expert physical therapists. The group included expert clinicians, clinical instructors, and faculty.

Figure 1. Item Modification History
Figure 1-A. Item Modification History

Because important components of CDM are self-assessment and reflection skills,3,7 and these skills develop over time,7 we created a self-report tool. The CDM tool can assist with improving self-assessment skills. McMillan and Hearn28 suggested that self-assessment requires a cyclical movement between self-monitoring, self-judgment, and learning goals. Self-monitoring includes focused attention on performance or thought processes or reflection on action.29 Self-judgment is moving oneself toward an identified behavior or benchmark. The CDM tool gives a student the opportunity to reflect on the construct of CDM while also providing a continuum on which they can catalog their progress. Their self-assessment of their CDM skills can be calibrated by other feedback received from clinical or academic faculty, peers, or examination results to help them achieve their learning goals.

The first version of the tool included 25 items, on which students rated their perception of their CDM and clinical skill abilities on a 6-point Likert scale. The anchors on the Likert scale were chosen to allow students who had no clinical experience to evaluate their perceptions of their ability, a departure from previous CDM assessment tools.

In a pilot study, students' perceptions of their own CDM and CS were compared with those of their clinical instructors on their terminal internship.25 Most students rated themselves below or equal to their clinical instructors' ratings. The findings were consistent with previous research indicating that students tend to rate themselves lower than a supervisor.30,31 However, Zell and Krizan31 found that students do have at least moderate insight into their abilities, when assessing both global and specific skills. Therefore, the researchers felt further testing was warranted because the results supported that students were consistent in their assessment of their perception of their CDM abilities.

In a follow-up, cross-sectional study, a larger sample of students was included (interns and students in their first, second, and third years of the curriculum).32 A narrative question was added to help further validate the tool: “Please describe your CS and clinical decision making in a paragraph.” This narrative question was posed to both interns and their respective clinical instructors. In addition, qualitative data were gathered from focus groups of students on the development of their CDM and CS as interns. Analysis of the quantitative results demonstrated a gradual increase in the median ratings of CDM scores as students proceeded through the curriculum, providing evidence for its validity. The tool also demonstrated high internal consistency. Qualitative analysis of the responses to the narrative included identifying themes. The themes primarily reflected aspects of CDM development captured in the tool. Based on these results, we felt that the tool was measuring one construct: CDM. Although the results of the follow-up study provided evidence of construct validity and reliability, they did not address the psychometric properties of the tool. Therefore, further evidence for the validity of the tool was needed.

The purpose of this study, therefore, is to provide evidence to support the validity of a tool to assess the development and level of CDM skills in Doctor of Physical Therapy (DPT) students.


Study participants included members of the DPT classes of 2013–2019 at the MGH Institute of Health Professions (MGH IHP). In phase 1, 488 DPT students from the classes of 2013–2017 were contacted and data were collected between August 2013 and August 2014. In phase 2, 209 members of the DPT classes of 2015–2017 were targeted. Data were collected between January 2015 and February 2015. In phase 3, a total of 361 members from the classes of 2013–2018 were contacted. Only 81% of the class of 2013 and 98% of the class of 2014 were emailed because of outdated contact information. Data were collected between April 2015 and November 2015. Phase 4 included a total of 406 members from the classes of 2013–2019. Data were collected between May 2016 and July 2016. See Figure 2 for details regarding the timeline for data collection.

Figure 2. Timeline for Tool Development


Validation of the survey tool occurred over 4 phases of data collection (Figure 2). All data collections were cross-sectional, and some DPT classes in the curriculum were surveyed in multiple phases. If a participant responded more than once during a phase, the most recent response was included for the analysis. This only occurred on three occasions.

A description of the study and the survey tool was emailed to recruit participants. Weekly reminders were sent for 1 month. Consent was implied if the participant completed the survey. Study data were collected and managed using Research Electronic Data Capture (REDCap).33 Research Electronic Data Capture is a secure, web-based application designed to support data capture for research.

Descriptive statistics and frequencies were used to describe demographic data. Although summed Likert scale data can be considered a continuous measurement for which parametric data analysis is appropriate, nonparametric statistics were used because the data were not normally distributed. The Kruskal-Wallis test was used to determine differences in survey scores between class years, with a post hoc Mann–Whitney test used to assess differences between classes. Statistical analysis was performed using SPSS, version 21.34
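As an illustration of the omnibus test named above, the following is a minimal sketch of the Kruskal-Wallis H statistic in plain Python. It is not the SPSS procedure used in the study; the function names and the toy data are assumptions made for the example, and the P value (obtained from a chi-square distribution with the returned degrees of freedom) is deliberately left out.

```python
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Return (H, df) for a list of groups, with the standard tie correction."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)

    weighted = 0.0
    start = 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        weighted += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * weighted - 3 * (n + 1)

    # Tie correction: 1 - sum(t^3 - t) / (n^3 - n) over groups of tied values.
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical; H is undefined")
    return h / correction, len(groups) - 1


# Hypothetical summed scores for three class years (not study data).
h_stat, df = kruskal_wallis_h([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

With seven class years, as in phase 4, the same function would return six degrees of freedom.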

Each phase of data collection included a Rasch analysis, performed using WINSTEPS software. Rasch analysis creates mathematical models that assess rating scales. Rasch analysis was originally designed as a psychometric tool for use in the social sciences because of its ability to measure abstract concepts, and it has been adapted for use in rehabilitation sciences.36 For the present study, Rasch analysis proved to be useful for the evaluation of construct validity, face validity, and item analysis.

Rasch analysis uses a test of fit between the subject data obtained and the Rasch Model to determine the validity of an outcome measure. The Rasch model converts ordinal data into an interval scale, allowing comparisons among observations.37 The model presents subject ability level along with the tool item difficulty, thereby providing a measure of the effectiveness of a tool's ability to measure a given construct.38 Rasch analyses are useful because they compare the observed survey responses against an ideal model to help researchers understand where the flaws exist in the assessed survey tool.39 Based on the results obtained from all analyses, modifications were made to the survey in preparation for the next phase of data collection.
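For orientation, the simplest (dichotomous) form of the Rasch model can be written as below. This is the standard textbook form, not necessarily the polytomous model fitted in WINSTEPS for this study; the symbols \theta_n (person ability), \delta_i (item difficulty), and P_{ni} (probability that person n endorses item i) are notation assumed here for illustration.

```latex
P_{ni} = \Pr(X_{ni} = 1 \mid \theta_n, \delta_i)
       = \frac{\exp(\theta_n - \delta_i)}{1 + \exp(\theta_n - \delta_i)},
\qquad
\ln\!\left(\frac{P_{ni}}{1 - P_{ni}}\right) = \theta_n - \delta_i
```

Because the log-odds (logit) of a response depends only on the difference between ability and difficulty, persons and items can be located on a common interval scale, which is what the item maps display.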

The Spaulding Rehabilitation Hospital's Institutional Review Board (IRB), the IRB for the MGH IHP, approved all phases of this study.


Phase 1

The overall response rate was 46%, yielding a total of 144 responses. The response rates for 2013–2017 ranged from 27% to 60% between classes. The survey tool showed excellent internal consistency (Cronbach's α = 0.993).40
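For readers unfamiliar with the internal consistency statistic reported here, the following minimal sketch computes Cronbach's α from a respondents-by-items table. It uses the usual variance-based formula with sample variances; the function name and the toy ratings are assumptions for illustration and are not study data.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent).

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(rows[0])
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least two items and equal-length rows")
    item_variances = [variance([r[i] for r in rows]) for i in range(k)]
    total_variance = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


# Hypothetical ratings from three respondents on two items.
alpha = cronbach_alpha([[1, 2], [2, 1], [3, 3]])
```

Values very close to 1, such as the one reported above, can also reflect item redundancy, which is consistent with the clustering seen on the item map.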

Rasch Analysis

The infit statistic results indicated that most items were clustered together with mean square values between 0.5 and 1.5, demonstrating that the items were positively contributing to the model and assessing one construct (Figure 3A). One item, performs interventions effectively, demonstrated a mean square value > 2, indicating that the item was not assisting in the measurement of the construct and degrading the measurement.41 Although we initially set out to measure two constructs, CDM and CS, after we reviewed all the items, we used consensus to reaffirm that the construct being measured was solely CDM.
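The infit statistic discussed above is an information-weighted mean square: the sum of squared residuals divided by the sum of model variances. The sketch below assumes that the model-expected scores and variances have already been produced by a fitted Rasch model (they are made-up numbers here), and the function names are hypothetical. The 0.5 to 1.5 and greater-than-2 bands come from the text; the labels for the remaining bands are an assumption based on commonly cited fit guidelines.

```python
def infit_mean_square(observed, expected, variances):
    """Infit mean square for one item across respondents.

    observed:  ratings actually given
    expected:  model-expected ratings for the same respondents
    variances: model variances of those ratings
    """
    if not (len(observed) == len(expected) == len(variances)):
        raise ValueError("inputs must have the same length")
    squared_residuals = sum((x - e) ** 2 for x, e in zip(observed, expected))
    return squared_residuals / sum(variances)


def interpret_mean_square(mnsq):
    """Map a mean square onto the interpretation bands described in the text."""
    if mnsq > 2.0:
        return "degrades measurement"
    if mnsq > 1.5:
        return "unproductive, not degrading"  # assumed band, not from the text
    if mnsq >= 0.5:
        return "productive for measurement"
    return "overfit: less productive, not degrading"  # assumed label


# Made-up values for two respondents on one item.
mnsq = infit_mean_square([1, 0], [0.5, 0.5], [0.25, 0.25])
label = interpret_mean_square(mnsq)
```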

Figure 3. Infit Plots From Rasch Analysis

The probability curve failed to demonstrate overlapping sinusoidal graphs for the six levels of the Likert scale (Figure 4A).25 This finding indicated category disordering or an inability of the tool to consistently predict a person's score on the tool given their ability level. An ideal probability curve demonstrates a consistent ability to predict a person's score.
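To make the idea of category probability curves concrete, the sketch below computes category probabilities under the Andrich rating scale model and checks whether the category thresholds are ordered. The study does not state which polytomous Rasch model was fitted, so the choice of model, the function names, and the threshold values are all assumptions for illustration.

```python
import math


def category_probabilities(theta, item_difficulty, thresholds):
    """Probabilities of categories 0..m under the rating scale model."""
    cumulative = [0.0]
    for tau in thresholds:
        cumulative.append(cumulative[-1] + (theta - item_difficulty - tau))
    peak = max(cumulative)  # subtract the maximum for numerical stability
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


def disordered_thresholds(thresholds):
    """Index pairs where a threshold is not higher than the one before it."""
    return [(j, j + 1) for j in range(len(thresholds) - 1)
            if thresholds[j + 1] <= thresholds[j]]


def modal_categories(thresholds, item_difficulty=0.0):
    """Categories that are the most probable response somewhere in -6..6 logits."""
    modal = set()
    for step in range(121):
        theta = -6.0 + 0.1 * step
        probs = category_probabilities(theta, item_difficulty, thresholds)
        modal.add(max(range(len(probs)), key=probs.__getitem__))
    return modal


# Ordered thresholds: every category is most probable over some ability range.
ordered = modal_categories([-1.0, 1.0])
# Disordered thresholds: the middle category is never the most probable one.
disordered = modal_categories([1.0, -1.0])
```

A category that is never the most probable response at any ability level is the pattern that motivates collapsing adjacent rating levels.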

Figure 4. Probability Curve for One Item on the Likert Scale

Figure 5A represents the item map. The numbers in the left column, −6 to 9, are the logit scores or the conversion of the tool to a continuous measure for analysis. Nine represents the highest score and −6 the lowest. The column of “x”s adjacent to the logits represents the subjects who completed the tool. The range of participant CDM abilities corresponds to the logits and is noted to be from −6 to 9. On the right side of Figure 5A are the items from the tool. The tight cluster suggested that the tool was measuring a narrow range of abilities, from approximately −1.5 to 1.5 logits. There were no items difficult enough to measure participants with high ability and no items easy enough to measure participants with low ability. This demonstrated possible ceiling and floor effects of the measure. The clustering of the items on the tool also indicated item redundancy.

Figure 5. Item Maps From Rasch Analysis

Modifications to Survey

The results of the Rasch analysis indicated that the survey primarily measured one construct, CDM. From this point forward, we will refer to the tool as the CDM tool. The recruitment email was changed to reflect the single construct. Based on the results of the Rasch analyses, several modifications were made to the CDM tool (Figures 1 and 6). The Likert scale was collapsed from a 6-point scale to a 5-point scale to rectify the category disordering. Rasch modeling allows the researcher to combine levels of the Likert scale that are not predicting subjects' abilities well to determine the number of levels necessary to optimize discrimination of the tool and reduce category disordering. The overlap between levels seen on the probability curves indicated that fewer response levels could create a more discriminating tool. In addition, the anchors of the Likert scale were modified because of probability curve results. After data analysis, we found that many new students were assessing themselves at the highest CDM levels on the scale. It was unlikely that their abilities matched their perceptions. The change in language shifted the participant's focus from their level of agreement with each item to assessing the level of assistance required to complete the item. We changed the frame of reference to attempt to promote more accurate reflection in new students. We felt that the level of assistance was less abstract and could be understood more easily without context or experience. The stem of the items was changed from the level of agreement with the tool items to “how often do you seek advice, support, or intervention from a more experienced physical therapist.” It was anticipated that the changes to the scale would provide a better distinction between responses and improve the ability to predict a person's score based on their ability. We decided to keep the item that was not assisting in the measurement of the construct despite its lack of contribution. Andrich42 proposed that a balance must be attained between the mathematical model created and how it accounts for the data. In other words, we felt that it was important not to make too many changes based on one iteration of the model.

Figure 6. Likert Scale Level Modification History

Phase 2

The overall response rate was 32.5%, yielding a total of 68 responses. Thirty-one percent were in the class of 2015, 36% in the class of 2016, and 33% in the class of 2017. The survey tool showed excellent internal consistency (Cronbach's α = 0.993).

Rasch Analysis

All items had a mean square near one and a fit index of 0, indicating a unidimensional construct. The item identified after phase 1 as negatively contributing to the model was now useful to the measurement and included in subsequent analyses (Figure 3B). The probability curves demonstrated similar category disordering as seen in phase 1, indicating that survey items were not effectively measuring subject ability (Figure 4A). Results of the item map were similar to phase 1 (Figure 5A). There was a clustering of item logit scores around 0 compared with a wide range of subject ability logit scores. Again, there was redundancy between the items.

Modifications to Survey

The Likert scale levels were modified again (Figures 1 and 6). The focus on the levels was kept at how much assistance the rater needed, but the choices were made more explicit. The Likert scale was also modified from a 5-point scale to a 6-point scale. It was felt that these anchors would be easier for students to understand and apply because they are less abstract. To increase student understanding, it was felt that six levels were needed for the new scale. We felt that switching the frame of reference would be particularly useful for novice students learning to reflect on their performance. Last, the wording was modified for several items on the tool to enhance clarity.

Phase 3

The overall response rate was 36.0%, yielding a total of 130 responses. The response rates for 2013–2018 ranged from 21.5% to 49.3% across the classes.

Rasch Analysis

The probability curves showed evidence of category disordering: levels 2, 3, and 4 demonstrated overlap and minimal distinction, few participants selected level 1, and levels 5 and 6 differentiated well. The findings indicated an improvement in scale functioning from previous versions, but further modifications were required for the Likert scale levels (Figure 6).

The infit statistics continued to show clustering into one group, in which some items showed mean squares near 0 with a fit index near 0. This pattern of a mean square less than one implies overfit.39 Overfit is caused by too little variation in the response pattern and is most likely attributable to redundant items. Similarly, the item maps continued to show a narrow range of item difficulty compared with subject ability (Figure 5A), which suggested redundancy.

Redundancy can limit the ability to distinguish between items.39 Research has demonstrated that condensing underused item-rating categories can improve infit statistics.37 Multiple item combinations were analyzed in WINSTEPS using various systematic methods of eliminating redundant items or collapsing underused item ratings. Iterations were performed sequentially with the phase 3 data set. In addition, a factor analysis was performed to attempt to eliminate redundant items. None of the analyses suggested pathways to effectively reduce the number of items.

Modifications to Survey

Attempts at reducing the item redundancy through item analysis were unsuccessful. Therefore, a decision-making theoretical framework was sought in the literature to guide the reduction in items. Decision-making models from health professions, military, gaming industries, business, aviation, and transportation were considered.43-47 All models described similar steps of the decision-making process. We chose the “DECIDE model” by consensus because of its simplicity and easy application to the CDM tool (Appendix A, Supplemental Digital Content 1).

The DECIDE model is a decision-making framework used by health care managers. There are six stages encapsulated in the DECIDE acronym (Appendix A, Supplemental Digital Content 1). To apply the DECIDE model to the current CDM tool, we categorized each item on the CDM tool into one of the six stages of the model. The two items that best represented each stage of the DECIDE model were selected by consensus among the research team, reducing the tool from 25 items to 12 items. The wording on several items was modified to enhance clarity and align better with the DECIDE model.

Last, the Likert scale was modified from a six-point scale to a four-point scale. After analysis of the probability curves, the first level was removed from the Likert scale because it was rarely chosen by students to describe their perception of their CDM ability. The last level distinguished students' abilities clearly and was left untouched. There was considerable disorganization among the remaining categories. Rasch analysis allows the data on a Likert scale to be collapsed to determine which model is the best fit. After further analysis, it was determined that collapsing the second and third levels into one category, and keeping the fourth and fifth levels untouched, yielded the best model (Figure 4).
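The collapse described above can be sketched as a simple recode of existing six-level responses. The mapping below follows the text (levels 2 and 3 merged; levels 4, 5, and 6 kept distinct), but the new numeric codes, the function name, and the handling of level 1 responses (treated here as missing) are assumptions made for illustration; the text does not say how previously collected level 1 responses were treated.

```python
# Hypothetical recode table: old six-point level -> new four-point level.
# Level 1 was dropped from the scale; mapping it to None (missing) is an
# assumption, not something stated in the study.
RECODE_6_TO_4 = {1: None, 2: 1, 3: 1, 4: 2, 5: 3, 6: 4}


def collapse_responses(responses, recode=RECODE_6_TO_4):
    """Recode a list of six-point ratings onto the collapsed four-point scale."""
    unknown = [r for r in responses if r not in recode]
    if unknown:
        raise ValueError(f"unexpected rating levels: {unknown}")
    return [recode[r] for r in responses]


collapsed = collapse_responses([1, 2, 3, 4, 5, 6])
```

In practice, several candidate recode tables would be compared on model fit before one is chosen, as the text describes.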

Phase 4

The final version of the CDM tool included 12 items rated on a four-point Likert scale reflecting the perceived ability to perform the tasks described (Appendix C, Supplemental Digital Content 1). Scores were compiled by adding the values assigned to the Likert scale items. Higher scores indicated better CDM skills. The overall response rate was 44.8%, yielding a total of 182 responses. The response rates for 2013–2019 ranged from 24.5% to 74.6% in the respective classes.
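A minimal sketch of the scoring rule follows. It assumes the four Likert levels are coded 1 through 4, which the text does not state; under that assumption, total scores would range from 12 to 48. The function name is hypothetical.

```python
N_ITEMS = 12
MIN_LEVEL, MAX_LEVEL = 1, 4  # assumed numeric coding of the four levels


def cdm_total_score(ratings):
    """Sum the 12 item ratings after checking count and range."""
    if len(ratings) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} ratings, got {len(ratings)}")
    if any(not (MIN_LEVEL <= r <= MAX_LEVEL) for r in ratings):
        raise ValueError("ratings must lie between 1 and 4")
    return sum(ratings)


score = cdm_total_score([1, 2, 3, 4] * 3)
```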

Rasch Analysis

Figure 3B shows the infit plot for phase 4 of the data collection. The plot demonstrated that all mean-square values were within 0.5–1.5 and contributing productively to the tool.41 The findings indicated that the tool continued to assess one construct. The Rasch model for the CDM tool accounts for 78.9% of the variance.

Figure 5B is an item map demonstrating an improved ability of the tool to assess a range of person abilities compared with previous versions of the tool. The phase 4 version of the tool captured abilities between logit scores of −6.82 and 7.43 but still had ceiling and floor effects. The floor and ceiling effects account for 22% of respondents, approximately 11% each. Data from the classes of 2013–2015 contributed to the ceiling effect, consistent with the lack of significant difference observed with the Mann–Whitney test. The ability level of the class of 2019 was not consistently captured and contributed to the floor effect.

Figure 4B is a representative probability curve demonstrating clear delineation between each level of the Likert scale. In other words, if a person has high ability, the tool predicts that the respondent will likely select the highest level on the Likert scale. Therefore, version 4 of the CDM tool showed an improved ability to predict subject scores given their ability compared with earlier versions.

At this stage in tool development, we examined the construct validity using known-group differences. A Kruskal-Wallis test demonstrated a statistically significant difference in CDM tool raw scores between the different class years (χ2 = 142.582, df = 6, P ≤ .001). See Appendix B, Supplemental Digital Content 1. The findings confirmed that the survey distinguishes CDM ability between students with different amounts of experience in the DPT program. Post hoc Mann–Whitney tests compared CDM tool scores by class year. A statistically significant difference in scores was found between each class year, except between the class years 2013–2015. Therefore, the tool was not able to differentiate CDM ability for students who were more than 1 year removed from completing their terminal internship.
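The post hoc comparisons can be sketched as a Mann–Whitney U statistic computed for every pair of class years. The code below uses the direct pairwise-counting definition of U and omits P values, which would come from the U distribution or a normal approximation. The function names and the scores are hypothetical, not study data.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """Return (U_x, U_y): counts of pairs won by each group, ties counted as 0.5."""
    u_x = 0.0
    for a in x:
        for b in y:
            if a > b:
                u_x += 1.0
            elif a == b:
                u_x += 0.5
    return u_x, len(x) * len(y) - u_x


def pairwise_u(scores_by_class):
    """U statistics for every pair of class years in a dict of score lists."""
    return {
        (c1, c2): mann_whitney_u(scores_by_class[c1], scores_by_class[c2])
        for c1, c2 in combinations(sorted(scores_by_class), 2)
    }


# Hypothetical summed CDM scores for three class years.
results = pairwise_u({2017: [40, 42, 45], 2018: [30, 33, 36], 2019: [14, 18, 20]})
```

Seven class years give 21 pairwise comparisons; a U of 0 for one group means that none of its scores exceeded any score in the other group.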


Clinical decision making is an integral component of physical therapist education and clinical practice.19 Until now, there was no tool in physical therapist education to assess the development of CDM at the beginning of and throughout a physical therapist student's education. The previous tools in physical therapy or other disciplines contained items centered on expert clinical practice, which is not relevant to new students. The CDM tool can be used as an outcome measure throughout physical therapy curricula to track student progress or as a means of feedback on different learning activities. This tool can assess the effectiveness of both didactic and clinical education activities. It identifies students who are struggling with their perceptions of CDM so that interventions can be implemented to support the students before starting clinical affiliations.

The tool assesses students' perceptions of their CDM. We do not believe it is possible to directly measure students' CDM abilities as new physical therapist students because they do not have context yet to rate themselves. Both their self-assessment and reflection skills as well as their content knowledge need to be developed before direct measurement of CDM is possible. We observed this in early versions of the CDM tool, but by modifying the Likert scale to levels that would be more meaningful to students with no physical therapy knowledge, new students could more accurately rate their perceptions of their abilities compared with their clinical instructors. After modification of the tool, 89% of the new students' perceptions of their abilities were categorized on the Rasch analysis, indicating that the descriptors gave the appropriate context for the students to rate themselves. Another set of students was assessed before their first full-time clinical experience, and 100% of their perceptions of their CDM ability were captured by the tool. Therefore, we believe that the tool can be used with preclinical students to assess their perceptions of their CDM skills.

The tool loses its ability to distinguish perceived CDM abilities 1 year after internship ends or 1.5–2 years after graduation. This may be the point at which a less generic tool or content-specific cases would be a more appropriate way to directly measure CDM. It is possible that some students are beginning to transition away from being novice practitioners and are progressing toward expert practice.5,6

The CDM tool has undergone a meticulous development process. Spector49 suggests a five-phase process for developing a summated scale, which includes defining the construct, designing the scale, pilot testing, item analysis, and validation. The CDM tool was developed based on the PTCPI27 and pilot tested.25 Known-groups validation of the tool was performed first,32 followed by rigorous item analysis in the current four-phase study.

The results of the final Rasch analysis yielded a functional tool. The levels of the Likert scale differentiated well between different participant abilities. The tool now assesses a wide range of abilities, from novice students to entry-level practitioners. The redundancy in the tool was eliminated, which created a concise 12-item tool. The summed scores on the tool differentiated well between class years, which provided known-groups construct validation. Last, the tool is reliable with high levels of internal consistency.
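As an illustration of what "internal consistency" can mean for a summated 12-item scale, the sketch below computes Cronbach's alpha from a small response matrix. This is an assumption made for illustration only: the text above does not state which reliability statistic was used (Rasch software typically also reports person separation reliability), and the function name `cronbach_alpha` and the sample ratings are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item ratings.

    alpha = k / (k - 1) * (1 - sum(item variances) / variance(total scores))
    """
    k = len(responses[0])  # number of items
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    # Transpose so that each tuple holds every respondent's rating of one item.
    items = list(zip(*responses))
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical ratings: 4 respondents x 3 items (not data from the study).
sample = [
    [1, 2, 1],
    [2, 3, 2],
    [3, 3, 4],
    [4, 5, 4],
]
print(f"alpha = {cronbach_alpha(sample):.2f}")
```

Values closer to 1 indicate that the items vary together across respondents, which is the property summarized as high internal consistency.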

The CDM tool has benefits compared with other tools. As previously mentioned, the tool does not require experience with clinical practice, which allows students to use it immediately on entering a physical therapist education program. The items are not specific to physical therapy practice; therefore, the tool could possibly be used in other professions or in interprofessional activities. The CDM tool has only 12 items, making it easy to implement and use repeatedly. Because it analyzes the process of decision making instead of specific tasks, the tool is flexible and can be used across practice settings. The CDM tool is a self-assessment tool, which is an asset because interrater reliability is not a concern. A student provides continuous assessment of their own abilities over time, which is a reliable measure.31

A significant difference was found in DPT students' CDM scores between all class years, with the exception of 2013 through 2015. The present study is the first in physical therapy to measure CDM skills quantitatively and therefore capture the stepwise progression of skills. A ceiling effect explains the inability to distinguish between graduates who are more than 1 year out from completing their internship. Black et al4 found that there are common experiences that contribute to the development of novice physical therapists. Perhaps by 1 year after graduation, novice physical therapists have developed their perception of CDM to a level outside the tool's sensitivity. A floor effect limits the ability to capture perceived CDM skills for all new students. Students entering the DPT program come from very diverse backgrounds, making it difficult to develop a tool that captures low levels of CDM. Despite these limitations, the Rasch analysis demonstrates that the tool effectively measures a wide variety of abilities.


The CDM tool uses self-report for data collection. Self-report measures have benefits, including high internal consistency, the opportunity to provide precise construct development, convenience, cost efficiency, and easy implementation in a clinical or classroom setting.50,51 At the same time, self-report can create a response bias because students' self-ratings may be inaccurate.40 Students may underestimate or overestimate their performance to remain consistent with socially accepted norms.52 Brudvig and Macauley25 found that students consistently underrated their abilities on earlier versions of the CDM tool. Future study of the response bias in the tool is warranted.

A testing effect is a possible threat to internal validity. Repeated testing of a group of subjects increases the likelihood that they gain knowledge of the tool and its purpose. However, repeated testing likely did not affect the results because the CDM tool was modified between phases of administration.

The current study used a convenience sample, which limits the generalizability of the results. Convenience sampling helped control costs, maximized response rates, and assisted with timely data collection and survey modifications.53 Future research will require partnerships to assess a more heterogeneous sample.

The current study used a cross-sectional design. A longitudinal design would assess how the tool responds for individuals over time. In addition, the information gathered would describe the tool's responsiveness to specific learning activities and clinical affiliations. Obtaining values for minimal detectable change or minimal clinically important difference would further improve the utility of the tool. For example, these values could indicate when students are ready to move to the next phase of the curriculum.
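To make the idea of a minimal detectable change concrete, the sketch below applies the commonly used formulas SEM = SD × √(1 − r) and MDC95 = 1.96 × √2 × SEM, where r is a test-retest reliability coefficient. No such values were obtained in this study; the function names and the numbers passed in are hypothetical and serve only to show the arithmetic.

```python
import math


def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - r), with r a test-retest reliability coefficient."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def minimal_detectable_change(sd, reliability, z=1.96):
    """MDC = z * sqrt(2) * SEM; z = 1.96 gives the 95% confidence level.

    The sqrt(2) term accounts for measurement error in both the
    baseline score and the follow-up score.
    """
    return z * math.sqrt(2.0) * standard_error_of_measurement(sd, reliability)


# Hypothetical inputs (not results from the study): a between-student SD of
# 6.0 points on the summed score and a test-retest reliability of 0.90.
mdc95 = minimal_detectable_change(sd=6.0, reliability=0.90)
print(f"MDC95 = {mdc95:.2f} points")
```

Under these assumed inputs, a change in a student's summed score smaller than the printed value could not be distinguished from measurement error at the 95% confidence level.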

The stem of the Likert scale is a potential limitation. The stem asks students to rate “how often” they need advice, support, or intervention. The Likert scale levels prompt the students to reflect on the amount of assistance required. This may be considered a mismatch; however, we feel that support or advice implies the level of assistance required.

Recommendations for Future Research

The CDM tool needs to be assessed in other student populations and longitudinally to further validate it. These populations include physical therapist students in other programs and students in other professions. The CDM tool also needs to be used in different contexts, such as after a learning activity or a course, to further assess its utility. The tool may be able to determine CDM competency levels or cut points that are needed before initiating clinical practice, but this will need further testing. Last, validating the tool in other health professions or interprofessional activities that require similar decision-making processes would be beneficial.


Clinical decision making is recognized as a vital component of physical therapy practice.1,3 The present study provides validation of a self-report tool for assessing CDM skills in DPT students. The results showed that the tool can differentiate between scores across all class years of the DPT curriculum and up to 1 year after graduation. The tool has the potential to allow students and educators to target specific areas for further development of skills. With further validation in a more heterogeneous sample, the tool could be a convenient, cost-efficient way of assessing CDM skills in DPT students.


The authors would like to acknowledge Dr. Marianne Beninato, PT, DPT, PhD, for her guidance in understanding Rasch analysis and WINSTEPS. We would also like to thank Barth Riley for his assistance with data analysis.


1. Higgs J, Jones M, eds. Clinical Reasoning in the Health Professions. 2nd ed. Boston, MA: Elsevier Health Sciences; 2000.
2. Wainwright SF, McGinnis PQ. Factors that influence the clinical decision-making of rehabilitation professionals in long-term care settings. J Allied Health. 2009;38:143-151.
3. Wainwright SF, Shepard KF, Harman LB, Stephens J. Factors that influence the clinical decision making of novice and experienced physical therapists. Phys Ther. 2011;91:87-101.
4. Black LL, Jensen GM, Mostrom E, et al. The first year of practice: An investigation of the professional learning and development of promising novice physical therapists. Phys Ther. 2010;90:1758-1773.
5. Jensen GM, Gwyer J, Shepard KF, Hack LM. Expert practice in physical therapy. Phys Ther. 2000;80:28-43;discussion 44-52.
6. May BJ, Dennis JK. Expert decision making in physical therapy: A survey of practitioners. Phys Ther. 1991;71:190-202;discussion 202-206.
7. Embrey DG, Guthrie MR, White OR, Dietz J. Clinical decision making by experienced and inexperienced pediatric physical therapists for children with diplegic cerebral palsy. Phys Ther. 1996;76:20-33.
8. Benner P, Hooper-Kyriakidis P, Stannard D. Thinking in action: An overview. In: Clinical Wisdom and Interventions in Critical Care: A Thinking in Action Approach. Philadelphia, PA: WB Saunders; 1994:4-13.
9. Nikopoulou-Smyrni P, Nikopoulos CK. A new integrated model of clinical reasoning: Development, description and preliminary assessment in patients with stroke. Disabil Rehabil. 2007;29:1129-1138.
10. Gilliland S, Wainwright SF. Patterns of clinical reasoning in physical therapist students. Phys Ther. 2017;97:499-511.
11. Edwards I, Jones M, Carr J, Braunack-Mayer A, Jensen GM. Clinical reasoning strategies in physical therapy. Phys Ther. 2004;84:312-330;discussion 331-335.
12. Huhn K, Black L, Jensen GM, Deutsch JE. Tracking change in critical thinking skills. J Phys Ther Educ. 2013;27:26.
13. Huitt W. Critical thinking: An overview. Educ Psychol Interactive. 1998. Accessed September 1, 2016.
14. Magistro CM. Clinical decision making in physical therapy: A practitioner's perspective. Phys Ther. 1989;69:525-534.
15. Wainwright SF, Gwyer J. (How) can we understand the development of clinical reasoning? J Phys Ther Educ. 2017;31:4-6.
16. Smith M, Higgs J, Ellis E. Factors influencing clinical decision making. In: Higgs J, Jones M, Loftus S, Christensen N, eds. Clinical Reasoning in the Health Professions. 3rd ed. Boston, MA: Elsevier; 2008:89-100.
17. Patel VL, Groen GJ. Developmental accounts of the transition from medical student to doctor: Some problems and suggestions. Med Educ. 1991;25(6):527-535.
18. Jensen GM, Shepard KF, Gwyer J, Hack LM. Attribute dimensions that distinguish master and novice physical therapy clinicians in orthopedic settings. Phys Ther. 1992;72(10):711-722.
19. Jette DU, Bertoni A, Coots R, Johnson H, McLaughlin C, Weisbach C. Clinical instructors' perceptions of behaviors that comprise entry-level clinical performance in physical therapist students: A qualitative study. Phys Ther. 2007;87(7):833-843.
20. Hayward LM, Black LL, Mostrom E, Jensen GM, Ritzline PD, Perkins J. The first two years of practice: A longitudinal perspective on the learning and professional development of promising novice physical therapists. Phys Ther. 2013;93(3):369-383.
21. Gover VF. The NPSI: A nursing performance simulation instrument. Nurs Res Conf. 1972;8:9-44.
22. Jenkins HM. A research tool for measuring perceptions of clinical decision making. J Prof Nurs. 1985;1(4):221-229.
23. Mirsaidi G, Lakdizaji S, Ghojazadeh M. How nurses participate in clinical decision-making process. J Appl Environ Biol Sci. 2012;2:620-624.
24. Lauri S, Salanterä S. Developing an instrument to measure and describe clinical decision making in different nursing fields. J Prof Nurs. 2002;18(2):93-100.
25. Brudvig T, Macauley K. Clinical decision making tool for DPT students. Acad Exchange Q. 2015;19(2):61-66.
26. Keszei A, Novak M, Streiner DL. Introduction to health measurement scales. J Psychosomatic Res. 2010;68:319-323.
27. Roach KE, Frost JS, Francis NJ, Giles S, Nordrum JT, Delitto A. Validation of the revised physical therapist clinical performance instrument (PT CPI): Version 2006. Phys Ther. 2012;92:416-428.
28. McMillan JH, Hearn J. Student self-assessment: The key to stronger student motivation and higher achievement. Educ Horizons. 2008;87(1):40-49.
29. Schon D. The Reflective Practitioner: How Professionals Think in Action. 1st ed. New York, NY: Basic Books; 1983:384.
30. Falchikov N, Boud D. Student self-assessment in higher education: A meta-analysis. Rev Educ Res. 1989;59(4):395-430.
31. Zell E, Krizan Z. Do people have insight into their abilities? A metasynthesis. Perspect Psychol Sci. 2014;9(2):111-125.
32. Brudvig T, Macauley K, Segal N. Measuring clinical decision-making and clinical skills in DPT students across a curriculum. J Allied Health. 2017;46(1):23-27.
33. Harris PA, Taylor R, Thielke R, Payne J, Gonzalez N, Conde JG. Research electronic data capture (REDCap)—A metadata-driven methodology and workflow process for providing translational research informatics support. J Biomed Inform. 2008;42(2):377-381.
34. IBM. SPSS Statistics [Computer Program]. Armonk, NY: IBM; 2012;21.
35. Linacre JM. WINSTEPS Rasch Software. 2014;3.81.0.
36. Tesio L. Measuring behaviours and perceptions: Rasch analysis as a tool for rehabilitation research. J Rehabil Med. 2003;35(3):105-115.
37. Kornetti DL, Fritz KL, Chiu Y, Light KE, Velozo CA. Rating scale analysis of the Berg Balance Scale. Arch Phys Med Rehabil. 2004;85(7):1128-1135.
38. Bond TG, Fox CM. Applying the Rasch Model: Fundamental Measurement in the Human Sciences. 2nd ed. New York, NY: Routledge; 2007.
39. Green KE, Frantom CG. Survey development and validation with the Rasch model. Paper presented at: International Conference on Questionnaire Development, Evaluation, and Testing; November 14-17, 2002; Charleston, SC. Accessed September 1, 2016.
40. Portney LG, Watkins MP. Foundations of Clinical Research: Applications to Practice. 3rd ed. Upper Saddle River, NJ: Pearson/Prentice Hall; 2009.
41. Linacre JM. What do infit and outfit, mean-square and standardized mean? Institute for Objective Measurement Web site. Updated 2017. Accessed January 26, 2017.
42. Andrich D. Rasch Models for Measurement. Vol 68. Newbury Park, CA: Sage University Paper; 1988.
43. Jensen RS. The boundaries of aviation psychology, human factors, aeronautical decision making, situation awareness, and crew resource management. Int J Aviat Psychol. 1997;7(4):259-267.
44. Marr JJ. The Military Decision-Making Process: Making Better Decisions Versus Making Decisions Better. Fort Leavenworth, KS: School of Advanced Military Studies at United States Army Command and General Staff College; 2000.
45. Gullickson T, Ramser P. Review of decision making in action: Models and methods. Contemp Psychol. 1993;38(12):1335.
46. Glass BD, Maddox WT, Love BC. Real-time strategy game training: Emergence of a cognitive flexibility trait. PLoS One. 2013;8(8):e70350.
47. Federal Aviation Administration. Aeronautical decision-making. In: Pilot's Handbook of Aeronautical Knowledge. Oklahoma City, OK: US Department of Transportation; 2008:chap 17-1 to 17-4.
48. Guo K. DECIDE: A decision-making model for more effective decision making by health care managers. Health Care Manag. 2008;27(2):118-127.
49. Spector PE. Summated Rating Scale Construction: An Introduction. Newbury Park, CA: Sage Publications; 1992:73.
50. Nunes V, Neilson J, O'Flynn N, et al. Clinical Guidelines and Evidence Review for Medicines Adherence: Involving Patients in Decisions About Prescribed Medicines and Supporting Adherence. London, United Kingdom: National Collaborating Centre for Primary Care and Royal College of General Practitioners; 2009.
51. Fulmer SM, Frijters J. A review of self-report and alternative approaches in the measurement of student motivation. Educ Psychol Rev. 2009;21:219.
52. van de Mortel TF. Faking it: Social desirability response bias in self-report research. Aust J Adv Nurs. 2008;25(4):40-48.
53. Kelley K, Clark B, Brown V, Sitzia J. Good practice in the conduct and reporting of survey research. Int J Qual Health Care. 2003;15(2):261-266.

Clinical decision making; Self-assessment tool; Physical therapy education

© 2018 Academy of Physical Therapy Education, APTA