AI-Driven Digital Twin for Visual Defect Inspection in Railway Prognostics and Health Management
Publisher
Université d'Ottawa / University of Ottawa
Abstract
Prognostics and Health Management (PHM) ensures safety and reliability in railways by monitoring assets such as tracks, bogies, and wheels. Visual defects, such as cracks and corrosion, are critical health indicators. Traditional PHM methods that rely on AI-driven Digital Twin (DT) frameworks face challenges such as data scarcity, integration complexity, and resource constraints. Large Language Models (LLMs) offer potential advantages: minimal data requirements, adaptability to unseen samples, and data generation capabilities. However, a well-defined approach to integrating LLMs into DT ecosystems for railway PHM remains unexplored. Therefore, we introduce DefectTwin, a comprehensive LLM-based DT ecosystem designed to address the key challenges in PHM for railway defect inspection. Our approach addresses data scarcity with a customized synthetic data generation pipeline that enables the fine-tuning of a Multimodal and Multimodel (M²) LLM component. The domain-adapted LLMs enhance the AI inference engine of DefectTwin, ensuring high accuracy and reliable performance for railway defect inspection applications. To further strengthen the ecosystem, we propose a pipeline that incorporates a core feature of the DT ecosystem: a Quality of Experience (QoE) feedback loop.
This mechanism improves the performance of the LLMs (e.g., the quality of generated outputs) based on user feedback. Moreover, the synthetic dataset generated by our pipeline reduces the resource-intensive processes typically associated with traditional PHM systems, such as extensive data processing and model training. We conducted a set of experiments to evaluate the proposed methods. The accuracy of the generated output was evaluated using the Canadian Pacific Railway dataset and synthetic data from our proposed pipeline. In testing on 600 image-based cases, DefectTwin achieved a precision of 0.92, outperforming GPT-4 at 0.68 and Gemini-Pro-Vision at 0.88, with an F1-score of 0.92. In zero-shot scenarios, DefectTwin achieved a precision of 0.60, compared to GPT-4 at 0.40 and Gemini-Pro-Vision at 0.48, with an F1-score of 0.62. For video data, zero-shot precision and F1-score both reached 0.55. In text-to-text defect analysis, DefectTwin averaged 150 tokens per response with 1.5 seconds of latency, outperforming GPT-4, which averaged 220 tokens and 2.7 seconds, and Gemini-Pro-Vision, which averaged 180 tokens and 2.3 seconds. DefectTwin obtained an answer relevance of 0.79 and a context relevance of 0.97 in defect detection tasks, outperforming GPT-4 (0.43, 0.52) and Gemini-Pro-Vision (0.41, 0.51), which highlights its precision and contextual understanding in railway PHM. Usability tests of a prototype based on the DefectTwin ecosystem confirmed its practical applicability, with a good System Usability Scale (SUS) score in the acceptable range of 70%. DefectTwin is the first LLM-integrated DT designed specifically for visual defect inspection in railway PHM. This research paves the way for broader integration of LLMs into DT ecosystems, offering researchers and practitioners advanced approaches for tackling PHM challenges in railway maintenance strategies.
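The abstract reports precision and F1-score but not recall. Because F1 is the harmonic mean of precision and recall, the recall implied by each reported pair can be estimated. The sketch below is illustrative only: the function names are hypothetical, the reported figures are rounded to two decimals, and the resulting recall values are approximations rather than results stated in the thesis.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for R, given P and F1.

    Hypothetical helper for illustration; inputs are the rounded
    values quoted in the abstract, so the output is approximate.
    """
    denominator = 2 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall for this precision/F1 pair")
    return f1 * precision / denominator


# (precision, F1) pairs quoted in the abstract for DefectTwin.
reported = {
    "image, 600 cases": (0.92, 0.92),
    "image, zero-shot": (0.60, 0.62),
    "video, zero-shot": (0.55, 0.55),
}

for setting, (p, f1) in reported.items():
    r = implied_recall(p, f1)
    print(f"{setting}: precision={p:.2f}, F1={f1:.2f}, implied recall~{r:.2f}")
```

When precision and F1 are equal, as in the 600-case image test and the video test, the implied recall equals the precision. In the zero-shot image setting, an F1 above the precision implies a recall slightly higher than the precision.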
Keywords
AI, LLM, Digital Twin, Computer Vision, Defect, PHM, Synthetic Data
