AI and the NPTE: Using State-of-the-Art Tools to Advance Assessment  

As artificial intelligence (AI) and expert systems have moved from the experimentation phase to individuals’ desktops, FSBPT has integrated some of these tools into the NPTE development process. This article is based on a presentation by Teresa Briedwell, Marcia Himes, Sara Maher, Anissa Davis, and Lorin Mueller at the 2024 Annual Education Meeting.

In recent years, the integration of artificial intelligence (AI) and expert systems into various sectors has revolutionized traditional practices, bringing about significant improvements in efficiency, accuracy, and overall quality. The National Physical Therapy Examination (NPTE) is no exception. 

The NPTE is a critical component in the licensure process for physical therapists and physical therapist assistants in the United States. It ensures that candidates possess the necessary knowledge and skills to provide safe and effective care. Traditionally, the development and maintenance of the NPTE have relied heavily on human expertise, involving extensive manual processes. However, the advent of AI has introduced new possibilities for enhancing the examination's quality and efficiency. 

Currently, FSBPT strongly discourages writers from using AI to write their own items. In fact, all writers must sign an agreement stating that they will not use AI tools. This is for various reasons, including copyright issues as well as quality and experimental concerns. Rather than having item writers try AI tools on their own, FSBPT is working with established vendors in this area to examine the feasibility of using AI tools (generalized computer-based tools that learn to accomplish tasks successfully) and expert systems (specific, mature tools that perform a task very reliably) to support efficient NPTE development.

AI in Item Writing and Review  

One promising application is FSBPT’s use of Alpine Intelligence, an item screening and feedback tool designed to identify items that do not conform to the style and structure criteria of the NPTE. The primary objective of Alpine Intelligence was to allow item writers to concentrate on the content of physical therapy rather than the intricacies of the NPTE Style Manual. This tool flags various issues, including repetition, negative phrasing, grammar errors, superfluous language, the use of pronouns, the length and balance of answer options, and cueing between the question stem and the options. Additionally, it can highlight terms that are no longer considered essential areas of practice and should not be included in the NPTE. 
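To make the kinds of checks described above concrete, here is a minimal rule-based sketch in Python. It is not how Alpine Intelligence works, and the word lists, the `screen_item` function, and the thresholds are hypothetical choices made for illustration; the NPTE Style Manual rules are not reproduced here. The sample item is also made up.

```python
import re

# Hypothetical rule set for illustration only. These lists are NOT the
# NPTE Style Manual and NOT the checks implemented by Alpine Intelligence.
NEGATIVE_WORDS = {"not", "except", "never", "least"}
PRONOUNS = {"he", "she", "they", "you", "his", "her", "their", "your"}


def words(text):
    """Return lowercase word tokens, ignoring punctuation."""
    return re.findall(r"[a-z']+", text.lower())


def screen_item(stem, options, max_length_ratio=2.0):
    """Return a list of style flags for one multiple-choice item."""
    flags = []
    stem_words = words(stem)

    if NEGATIVE_WORDS & set(stem_words):
        flags.append("negative phrasing in stem")
    if PRONOUNS & set(stem_words):
        flags.append("pronoun in stem")

    # Flag answer options whose word counts are badly out of balance.
    lengths = [len(words(option)) for option in options]
    if lengths and min(lengths) > 0 and max(lengths) / min(lengths) > max_length_ratio:
        flags.append("unbalanced option lengths")

    # Flag possible cueing: a longer stem word (5+ letters) that appears
    # in exactly one answer option.
    long_stem_words = {w for w in stem_words if len(w) >= 5}
    for word in sorted(long_stem_words):
        hits = [option for option in options if word in words(option)]
        if len(hits) == 1:
            flags.append(f"possible cue: '{word}' appears in only one option")
    return flags


# Made-up item used only to exercise the checks.
sample_flags = screen_item(
    "Which of the following is NOT an appropriate intervention "
    "for a patient with shoulder impingement?",
    ["Ice", "Scapular stabilization exercises for the shoulder girdle", "Heat", "Rest"],
)
for flag in sample_flags:
    print(flag)
```

A screening tool of this kind only raises flags for a human editor to review; it does not judge whether the physical therapy content of the item is correct.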

FSBPT piloted the tool during the Basic Item Writer Workshop in August 2023 and again in March 2024. At these workshops, it produced reports on individual test items, provided summary reports on the overall quality of the items, and offered feedback to item writers to help them identify patterns in their errors.

Alpine Intelligence has shown promise in identifying spelling and grammatical errors, as well as inconsistencies with NPTE style. In some cases, the tool has also identified structural issues that aren’t always caught in the early stages of item writing. There is still room for improvement, specifically in reducing erroneous flags and ensuring the tool aligns better with FSBPT's processes and terminology. FSBPT’s long-term preference is for the tool to work in real time during item writing workshops, giving feedback as item writers complete a draft of an item without disrupting the current workflow, which it currently does not do. Despite these issues, the editorial team is effectively using it as a quality assurance tool and to correct items that were edited prior to certain style changes being implemented.

Natural Language Processing (NLP) for Exam Form Creation 

Natural language processing (NLP) is a subfield of computer science and artificial intelligence that automates the processing of text information with minimal manual effort. In the context of the NPTE, NLP can identify items with similar linguistic characteristics that may indicate potential overlap or cueing between items.
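One simple way to illustrate the idea of finding items with similar linguistic characteristics is to compare the sets of content words in each pair of items. The sketch below uses Jaccard similarity (shared words divided by total distinct words). This is a stand-in technique chosen for illustration, not the method used by the NLP system described in this article; the stop-word list, the 0.5 threshold, the function names, and the three sample items are all assumptions.

```python
import re
from itertools import combinations

# Small illustrative stop-word list; a production system would use a fuller one.
STOP_WORDS = {
    "a", "an", "the", "of", "is", "to", "for", "with",
    "which", "what", "in", "and", "most", "by",
}


def content_words(text):
    """Return the set of lowercase words in the text, minus stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {token for token in tokens if token not in STOP_WORDS}


def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def candidate_enemy_pairs(items, threshold=0.5):
    """Return (id_a, id_b, score) for item pairs at or above the threshold.

    `items` maps an item ID to its text. Pairs are sorted from most to
    least similar. The threshold is a hypothetical tuning parameter.
    """
    bags = {item_id: content_words(text) for item_id, text in items.items()}
    pairs = []
    for (id_a, bag_a), (id_b, bag_b) in combinations(bags.items(), 2):
        score = jaccard(bag_a, bag_b)
        if score >= threshold:
            pairs.append((id_a, id_b, round(score, 2)))
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Made-up items used only to exercise the function.
sample_items = {
    "Q1": "Which nerve innervates the deltoid muscle?",
    "Q2": "The deltoid muscle is innervated by which nerve?",
    "Q3": "What is the normal range of knee flexion?",
}
print(candidate_enemy_pairs(sample_items))  # [('Q1', 'Q2', 0.6)]
```

Word overlap catches near-duplicate wording but not items that cue each other using different vocabulary, which is one reason flagged pairs would still go to a content expert for review.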

In January 2024, the NLP system was trained using a practice examination assessment tool to identify enemy items—items that could potentially overlap or cue another item. Using approximately 500 practice test items as a training set, the system successfully identified twenty-five enemy item pairs, twenty-two of which had been previously identified through traditional review methods. Of the three additional pairs that had not been identified, two had been overlooked by human reviewers, and one pair was determined to be erroneously identified. 
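The figures in this paragraph can be tallied as a quick check. The short Python sketch below assumes that the twenty-two pairs already known from traditional review were genuine enemy pairs, which the article implies but does not state outright. It computes only the share of flagged pairs that were genuine; the article does not report how many enemy pairs the system may have missed, so no miss rate can be calculated.

```python
# Counts taken from the paragraph above.
flagged_pairs = 25        # pairs flagged by the NLP system
known_from_review = 22    # already identified through traditional review
missed_by_reviewers = 2   # genuine enemy pairs that human review had overlooked
false_flags = 1           # pair judged to be erroneously identified

# The three categories should account for every flagged pair.
assert known_from_review + missed_by_reviewers + false_flags == flagged_pairs

# Assumption: the 22 previously identified pairs are genuine enemy pairs.
genuine = known_from_review + missed_by_reviewers
precision = genuine / flagged_pairs

print(f"{genuine} of {flagged_pairs} flagged pairs were genuine ({precision:.0%})")
# prints: 24 of 25 flagged pairs were genuine (96%)
```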

These results demonstrated the potential of NLP to enhance the efficiency and accuracy of the exam form creation process by providing an initial review of test forms before a content expert reviews them. The AI system can identify and replace likely enemy items, freeing FSBPT staff content experts and the Examination Development Committee (EDC) to focus their attention on enemies that are harder to detect and other content issues that require human judgment. The ultimate result is higher-quality NPTE forms.

Automated Item Generation: Creating High-Quality Items 

Automated Item Generation (AIG) is an expert system approach that uses structured frameworks to generate assessment items based on specific content parameters. This method has the potential to generate large numbers of items in fundamental areas using a small number of experts to develop valid item models. 

AIG involves creating a parent item with all the necessary information to determine the correct answer. The process then extrapolates from the parent item to define an item model with placeholders for variables, generate a list of possible correct answers, and develop a list of distractors with rationales for why each option is incorrect. These response options can then be mixed and matched in the correct ways to generate a range of unique items. The model-development process also helps ensure that distractors are parallel and that there is only one correct answer. The process also includes adding metadata to ensure the items align with the appropriate standards and can be incorporated into the form-building process. 
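The process described above can be sketched in a few lines of Python. Everything in this example is hypothetical: the item model, the `model_id` and `content_area` metadata fields, and the function names were invented for illustration and are not FSBPT's models or tooling. The example uses a simple gait speed calculation so that the correct answer and each distractor rationale can be computed rather than asserted.

```python
from itertools import product

# Hypothetical item model: a stem with placeholders, the allowed values for
# each variable, and metadata attached to every generated item.
ITEM_MODEL = {
    "model_id": "example-gait-speed",
    "stem": ("A patient walks {distance} meters in {minutes} minutes. "
             "What is the patient's average gait speed in meters per second?"),
    "variables": {"distance": [120, 180, 240], "minutes": [2, 3]},
    "metadata": {"content_area": "example-content-area"},
}


def build_options(distance, minutes):
    """Return (key, distractors), or None if the options are not valid.

    Each distractor maps a value to the rationale for why it is incorrect.
    """
    key = round(distance / (minutes * 60), 2)
    distractors = {
        round(distance / minutes, 2): "divided by minutes instead of seconds",
        round(minutes * 60 / distance, 2): "inverted the ratio",
        round(distance / 60, 2): "ignored the number of minutes",
    }
    # Enforce exactly one correct answer and three distinct distractors.
    if key in distractors or len(distractors) < 3:
        return None
    return key, distractors


def generate_items(model):
    """Fill the stem with every combination of variable values."""
    names = list(model["variables"])
    items = []
    for values in product(*(model["variables"][name] for name in names)):
        params = dict(zip(names, values))
        built = build_options(**params)
        if built is None:
            continue  # skip combinations where a distractor equals the key
        key, distractors = built
        items.append({
            "stem": model["stem"].format(**params),
            "key": key,
            "distractors": distractors,
            "metadata": dict(model["metadata"], model_id=model["model_id"]),
        })
    return items


generated = generate_items(ITEM_MODEL)
print(len(generated))  # 4 of the 6 combinations yield a valid item
print(generated[0]["stem"])
```

In this sketch, two of the six combinations are discarded because the "inverted the ratio" distractor happens to equal the correct answer, which shows the kind of single-correct-answer check the model-development process is meant to enforce. Recording a model identifier in the metadata also makes it possible to tell which generated items came from the same model.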

In a proof of concept completed in August 2024, three volunteer subject matter experts and a staff content analyst generated seven viable models that produced more than 30,000 high-quality items for in-demand content areas. Despite this success, AIG faces significant challenges. Out of an abundance of caution, we are initially considering items from the same model as "enemy items," meaning their inclusion on the same test form could give clues to test takers. As a result, the use of these items may be severely limited. As we review more items from each model, we might relax these rules to allow items from the same model but with very different content to appear on the same forms. Additionally, the approval process for these items needs to be approached with caution. FSBPT is continuing to evaluate the pros and cons of this approach and how best to implement it.

Challenges and Future Prospects 

Despite significant advancements in AI and expert systems, the human element remains crucial in the NPTE development process. These tools are designed to assist, not replace, human expertise. Volunteers and staff play a vital role in ensuring the quality and relevance of exam items, providing insights that AI alone cannot replicate. Volunteer support is essential for maintaining the integrity of the NPTE. Therefore, FSBPT is only looking at leveraging these tools to help more effectively use volunteers’ time, allowing them to focus on content and sharing their expertise.  

While these tools have shown promise to improve the NPTE development process, challenges remain. One primary challenge is ensuring that these tools do not disrupt the workflow of item writers and editors. As noted, the slow performance of Alpine Intelligence impacted the workflow, making it less useful for real-time feedback. Continuous improvement and refinement of AI tools are necessary to address these issues and enhance their effectiveness. 

Another challenge is striking the right balance between relying on these kinds of tools and continuing to invest in the skillsets of our volunteers. These tools are becoming well-suited to dealing with routine aspects of item development, but human experts are still essential to ensure that we’re not perpetuating myths or including outdated content, and that we’re representing a broad range of clinical knowledge from a diverse range of perspectives. If we rely too much on these tools, it could hurt the efficacy of the NPTE for measuring clinically relevant knowledge and skills.

FSBPT's commitment to continuous improvement is evident in its adoption of advanced tools. By enhancing quality assurance, internal usage, and ongoing refinement, FSBPT aims to stay at the forefront of assessment innovation. The future of NPTE assessment looks promising with the integration of Alpine Intelligence, NLP, and AIG, paving the way for more efficient and effective evaluation processes. By leveraging AI to enhance the efficiency, accuracy, and quality of the NPTE, FSBPT can ensure that the examination remains a reliable and effective measure of candidates' competence in physical therapy. Whatever is on the horizon, FSBPT is committed to ensuring the NPTE is a high-quality, accurate exam that helps protect the public.

 

Teresa Briedwell

Teresa Briedwell is currently serving as a Co-Chair of the Exam Development Committee for FSBPT and has been involved in item writing since 2014. Teresa has enjoyed forty-three years of practice as a physical therapist and is currently a professor at the University of Missouri Physical Therapy program.

 

Anissa Davis

Anissa Davis is an Assessment Content Analyst for FSBPT. Anissa provides physical therapy expertise in the development of the NPTE and provides content expertise and training for item development and item review workshops. Anissa was an Associate Clinical Professor in the Doctor of Physical Therapy Program at the University of Lynchburg for ten years and served as a volunteer with FSBPT as Co-Chair of the Exam Development Committee. Anissa holds a doctorate of physical therapy from the University of Tennessee and a bachelor of science in physical therapy from the University of Illinois. She has been an American Board of Physical Therapy Specialties certified Neurologic Clinical Specialist since 2013.

 

Marcia Himes

Marcia Himes is the Program Director and an Associate Professor in the physical therapy program at Missouri State University. She served as the Assistant Director of clinical education from 2016 to 2024 and has practiced in a variety of settings, including critical care, outpatient orthopedics, home health, and long-term care. Marcia began her involvement with FSBPT in 2018 as an item writer and has served as a member of the Advanced Item Writer Task Force and the NPTE Exam Development Committee for physical therapists. She was inducted into the Academy of Advanced Item Writers in 2022 and was awarded the FSBPT Outstanding Service Award in 2023.

 

Sara Maher

Sara Maher serves as the Associate Dean of health sciences and Professor of physical therapy at Wayne State University. Her primary volunteer work has been with the Federation of State Boards of Physical Therapy, where she has served as an Item Writer, Item Writing Coordinator, Chair of the Exam Development Committee, and currently as the Trainer for Item Writers. Her volunteer work has also included the Foreign Credentialing Commission on Physical Therapy (Board of Directors), the American Council of Academic Physical Therapy (Education and Pedagogy Chair), and the APTA Academy of Education Section (Secretary).

 

Lorin Mueller

Lorin Mueller is the Managing Director of Assessment at FSBPT, where his primary role is to oversee the National Physical Therapy Examination program. He has over twenty years of experience in assessment and workforce research. He is a Fellow of the Society for Industrial and Organizational Psychology.

 
