Comparison of different pose estimation models for lower-body kinematics: A validation study
Abstract
As pose estimation has garnered considerable attention for kinematic analysis, numerous pose estimation models have been developed in recent years. A pose estimation model is a trained neural network that predicts human body landmarks from an image. Each model has different strengths and weaknesses, which makes it difficult for users to decide which model to use for kinematic analysis. Accuracy is a major factor in model selection, yet few studies have investigated this critical point. This study therefore investigates the accuracy of different models and variants by comparing their measurements against reference measurements. Five male participants were recruited. Each participant performed five exercises: squat, squat jump, countermovement jump, walk, and jog, while being recorded by twelve standard RGB cameras (Contemplas) and ten marker-based tracking cameras (VICON). The Contemplas video recordings were processed by six pose estimation models and variants: Mediapipe, MeTRAbs Small, MeTRAbs X Large, YOLO, MoveNet Lightning, and MoveNet Thunder, to detect joint positions. From the detected joint positions, four joint angles were calculated: left hip, right hip, left knee, and right knee. A three-way repeated-measures ANOVA and Tukey HSD post-hoc analysis were applied to compare the pose estimation models with the VICON measurements. The ANOVA showed that the exercise and model factors had a significant effect on measurement error, whereas the angle factor did not. In the post-hoc analysis, knee joint angle errors from YOLO, MoveNet Lightning, and MoveNet Thunder during jogging and walking were significantly higher than those from Mediapipe, MeTRAbs Small, and MeTRAbs X Large. In conclusion, differentiated recommendations can be given for the optimal choice of model and variant under different conditions in kinematic analysis.
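Joint angles of the kind described above are typically obtained as the angle at a joint keypoint between the two adjacent segment vectors. A minimal sketch, assuming three 3D keypoints per joint (the function name and example coordinates are illustrative, not from the study):

```python
import numpy as np

def joint_angle(proximal, joint, distal):
    """Angle in degrees at `joint`, between the segments
    joint->proximal and joint->distal (e.g. hip-knee-ankle for the knee)."""
    u = np.asarray(proximal, dtype=float) - np.asarray(joint, dtype=float)
    v = np.asarray(distal, dtype=float) - np.asarray(joint, dtype=float)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against tiny numerical excursions outside [-1, 1]
    return np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0)))

# Fully extended leg (collinear keypoints) gives 180 degrees:
print(joint_angle([0, 1, 0], [0, 0, 0], [0, -1, 0]))  # → 180.0
```

Applied per frame to the triangulated hip, knee, and ankle positions of each model, this yields the angle time series that can then be compared against the marker-based reference.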
Article Details

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
References
Abadi, M., Agarwal, A., Barham, P., Brevdo, E., Chen, Z., Citro, C., Corrado, G. S., ... Zheng, X. (2015). TensorFlow: Large-scale machine learning on heterogeneous systems [Computer software]. Retrieved November 20, 2025, from https://www.tensorflow.org
Aleksic, J., Kanevsky, D., Mesaroš, D., Knezevic, O. M., Cabarkapa, D., Bozovic, B., & Mirkov, D. M. (2024). Validation of automated countermovement vertical jump analysis: Markerless pose estimation vs. 3D marker-based motion capture system. Sensors, 24(20). https://doi.org/10.3390/s24206624
Bousigues, S., Naaim, A., Robert, T., Muller, A., & Dumas, R. (2025). The effects of markerless inconsistencies are at least as large as the effects of the marker-based soft tissue artefact. Journal of Biomechanics, 182, 112566. https://doi.org/10.1016/j.jbiomech.2025.112566
Conconi, M., Pompili, A., Sancisi, N., & Parenti-Castelli, V. (2021). Quantification of the errors associated with marker occlusion in stereophotogrammetric systems and implications on gait analysis. Journal of Biomechanics, 114, 110162. https://doi.org/10.1016/j.jbiomech.2020.110162
D'Antonio, E., Taborri, J., Palermo, E., Rossi, S., & Patane, F. (2020). A markerless system for gait analysis based on OpenPose library. In 2020 IEEE International Instrumentation and Measurement Technology Conference (I2MTC) (pp. 1-6). https://doi.org/10.1109/I2MTC43012.2020.9128918
D'Haene, M., Chorin, F., Colson, S. S., Guérin, O., Zory, R., & Piche, E. (2024). Validation of a 3D markerless motion capture tool using multiple pose and depth estimations for quantitative gait analysis. Sensors, 24(22). https://doi.org/10.3390/s24227105
English, D. J., Weerakkody, N., Zacharias, A., Green, R. A., Hocking, C., & Bini, R. R. (2023). The validity of a single inertial sensor to assess cervical active range of motion. Journal of Biomechanics, 159, 111781. https://doi.org/10.1016/j.jbiomech.2023.111781
Fukushima, T., Blauberger, P., Guedes Russomanno, T., & Lames, M. (2024). The potential of human pose estimation for motion capture in sports: A validation study. Sports Engineering, 27(1). https://doi.org/10.1007/s12283-024-00460-w
Full body modeling with Plug-in Gait. (n.d.). VICON Documentation. Retrieved November 20, 2025, from https://docs.vicon.com/display/Nexus212/Full+body+modeling+with+Plug-in+Gait
Grishchenko, I., Bazarevsky, V., Zanfir, A., Bazavan, E. G., Zanfir, M., Yee, R., ... Sminchisescu, C. (2022). BlazePose GHUM Holistic: Real-time 3D human landmarks and pose estimation. https://doi.org/10.48550/arXiv.2206.11678
Hartley, R. I., & Sturm, P. (1997). Triangulation. Computer Vision and Image Understanding, 68(2), 146-157. https://doi.org/10.1006/cviu.1997.0547
He, K., Zhang, X., Ren, S., & Sun, J. (2015). Deep residual learning for image recognition. https://doi.org/10.1109/CVPR.2016.90
Islam, R., Bennasar, M., Nicholas, K., Button, K., Holland, S., Mulholland, P., ... Al-Amri, M. (2020). A nonproprietary movement analysis system (MoJoXlab) based on wearable inertial measurement units: Validation study. JMIR mHealth and uHealth, 8(6), e17872. https://doi.org/10.2196/17872
Jeong, M. G., Kim, J., Lee, Y., & Kim, K. T. (2024). Validation of a newly developed low-cost, high-accuracy, camera-based gait analysis system. Gait & Posture, 114, 8-13. https://doi.org/10.1016/j.gaitpost.2024.08.077
Jocher, G., Qiu, J., & Chaurasia, A. (2023, January 10). Ultralytics YOLO (Version 8.0.0) [Computer software]. Ultralytics. Retrieved November 20, 2025, from https://github.com/ultralytics/ultralytics
Kitamura, T., Teshima, H., Thomas, D., & Kawasaki, H. (2022). Refining OpenPose with a new sports dataset for robust 2D pose estimation. In 2022 IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW). IEEE. https://doi.org/10.1109/WACVW54805.2022.00074
Leung, K. L., Li, Z., Huang, C., Huang, X., & Fu, S. N. (2024). Validity and reliability of gait speed and knee flexion estimated by a vision-based smartphone application. Sensors, 24(23). https://doi.org/10.3390/s24237625
Lin, T.-Y., Maire, M., Belongie, S. J., Bourdev, L. D., Girshick, R. B., Hays, J., ... Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. arXiv:1405.0312. https://doi.org/10.1007/978-3-319-10602-1_48
Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S. (2016). Feature pyramid networks for object detection. https://doi.org/10.1109/CVPR.2017.106
Lima, Y., Collings, T., Hall, M., Bourne, M., & Diamond, L. (2023). Assessing lower-limb kinematics via OpenCap during dynamic tasks: A validity study. Journal of Science and Medicine in Sport, 26(Suppl.), S105. https://doi.org/10.1016/j.jsams.2023.08.123
Loper, M., Mahmood, N., Romero, J., Pons-Moll, G., & Black, M. J. (2015). SMPL: A skinned multi-person linear model. ACM Transactions on Graphics, 34(6), 248. https://doi.org/10.1145/2816795.2818013
Maji, D., Nagori, S., Mathew, M., & Poddar, D. (2022). YOLO-pose: Enhancing YOLO for multi-person pose estimation using object keypoint similarity loss. https://doi.org/10.1109/CVPRW56347.2022.00297
McFadden, C., Daniels, K., & Strike, S. (2021). The effect of simulated marker misplacement on inter-limb differences during a change of direction task. Journal of Biomechanics, 116, 110184. https://doi.org/10.1016/j.jbiomech.2020.110184
Menychtas, D., Petrou, N., Kansizoglou, I., Giannakou, E., Grekidis, A., Gasteratos, A., ... Aggelousis, N. (2023). Gait analysis comparison between manual marking, 2D pose estimation algorithms, and a 3D marker-based system. Frontiers in Rehabilitation Sciences, 4, 1238134. https://doi.org/10.3389/fresc.2023.1238134
Merker, S., Pastel, S., Bürger, D., Schwadtke, A., & Witte, K. (2023). Measurement accuracy of the HTC VIVE Tracker 3.0 compared to Vicon system. Sensors, 23(17), 7371. https://doi.org/10.3390/s23177371
Molnár, B. (2010). Direct linear transformation based photogrammetry software on the web. ISPRS Commission, 38, 5-8. Retrieved November 20, 2025, from http://www.isprs.org/proceedings/XXXVIII/part5/papers/130.pdf
Needham, L., Evans, M., Cosker, D. P., Wade, L., McGuigan, P. M., Bilzon, J. L., & Colyer, S. L. (2021). The accuracy of several pose estimation methods for 3D joint centre localisation. Scientific Reports, 11, 20673. https://doi.org/10.1038/s41598-021-00212-x
Sandler, M., Howard, A., Zhu, M., Zhmoginov, A., & Chen, L.-C. (2018). MobileNetV2: Inverted residuals and linear bottlenecks. https://doi.org/10.1109/CVPR.2018.00474
Sarandi, I., Linder, T., Arras, K. O., & Leibe, B. (2021). MeTRAbs: Metric-scale truncation-robust heatmaps for absolute 3D human pose estimation. IEEE Transactions on Biometrics, Behavior, and Identity Science, 3(1), 16-30. https://doi.org/10.1109/TBIOM.2020.3037257
Tan, M., & Le, Q. V. (2019). EfficientNet: Rethinking model scaling for convolutional neural networks. https://doi.org/10.48550/arXiv.1905.11946
TensorFlow. (n.d.-a). MoveNet: Ultra fast and accurate pose detection model. Retrieved November 20, 2025, from https://www.tensorflow.org/hub/tutorials/movenet
TensorFlow. (n.d.-b). Pose estimation and classification on edge devices with MoveNet and TensorFlow Lite. Retrieved November 20, 2025, from https://blog.tensorflow.org/2021/08/pose-estimationand-classification-on-edge-devices-with-MoveNet-andTensorFlow-Lite.html
Triggs, B., McLauchlan, P. F., Hartley, R. I., & Fitzgibbon, A. W. (2000). Bundle adjustment: A modern synthesis. In Vision Algorithms: Theory and Practice (pp. 298-372). Springer. https://doi.org/10.1007/3-540-44480-7_21
Trowell, D. A., Carruthers Collins, A. G., Hendy, A. M., Drinkwater, E. J., & Kenneally-Dabrowski, C. (2024). Validation of a commercially available mobile application for velocity-based resistance training. PeerJ, 12, e17789. https://doi.org/10.7717/peerj.17789
Turner, J. A., Chaaban, C. R., & Padua, D. A. (2024). Validation of OpenCap: A low-cost markerless motion capture system for lower-extremity kinematics during return-to-sport tasks. Journal of Biomechanics, 171, 112200. https://doi.org/10.1016/j.jbiomech.2024.112200
Washabaugh, E. P., Shanmugam, T. A., Ranganathan, R., & Krishnan, C. (2022). Comparing the accuracy of open-source pose estimation methods for measuring gait kinematics. Gait & Posture, 97, 188-195. https://doi.org/10.1016/j.gaitpost.2022.08.008