Exploring ChatGPT’s Proficiency in Nonparametric Statistics: An Initial Review and Benchmark Assessment

  • Joel Lagundi De Castro, University of the Philippines Open University, Philippines
Keywords: ChatGPT, Prompt Engineering, Artificial Intelligence, Non-Parametric Statistics, Tutoring Tool

Abstract

Artificial Intelligence (AI) is transforming education, particularly the teaching of statistics, by enhancing personalized learning and feedback through tools like ChatGPT (Tulsiani, 2024). ChatGPT is an artificial intelligence chatbot developed by OpenAI that uses deep learning to understand and generate human-like text. It is based on the GPT (Generative Pre-trained Transformer) model, trained on vast amounts of text data to answer questions, generate content, and engage in natural conversations. This study evaluates the performance of ChatGPT version 3.5 in nonparametric statistical analysis by assessing its ability to generate solutions for seven tests: the Test of Randomness, ANOVA, Chi-Square Goodness-of-Fit Test, Median Test, Cochran’s Q Test, Wilcoxon-Mann-Whitney Test, and Binomial Probability Test. Using three prompt engineering strategies, Basic Prompt (BP), Structured Prompt (SP), and Error-Awareness Prompt (EAP), ChatGPT's outputs are compared against manual calculations and statistical software (Jeffreys’s Amazing Statistics Program (JASP) and Excel) for accuracy, consistency, and clarity. Basic Prompt outputs differ between November 2023 and 2024, with sum of squares values of 6421.82 and 6928.00, but the difference is not statistically significant (F = 0.93, p = 0.53). Similarly, the effect of prompt type is not statistically significant (F = 1.43, p = 0.26), nor is the absolute error analysis (F = 0.59, p = 0.57). However, differences across statistical test approaches are significant (F = 3.10, p = 0.04), suggesting that method selection affects accuracy. The findings emphasize the role of structured and error-aware prompts in improving ChatGPT’s performance and highlight the importance of effective prompt engineering in nonparametric statistics. These insights contribute to improving AI-assisted learning in statistical education and research by supporting more reliable computational outputs.
Lastly, guidelines for effective prompt engineering in nonparametric statistics were formulated.
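
The comparison of a chatbot's answer against a manual calculation, as described in the abstract, can be illustrated with a minimal sketch. It uses the Chi-Square Goodness-of-Fit Test as an example; the data, the chatbot-reported value, and the function names are hypothetical and are not taken from the study.

```python
def chi_square_gof(observed, expected):
    """Chi-square goodness-of-fit statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def absolute_error(reported, reference):
    """Absolute difference between a reported value and a reference value."""
    return abs(reported - reference)


# Hypothetical data: 60 rolls of a die, 10 expected per face.
observed = [8, 12, 9, 11, 10, 10]
expected = [10] * 6

# Reference value from the manual formula.
reference = chi_square_gof(observed, expected)

# Hypothetical statistic reported by a chatbot for the same data.
chatbot_reported = 1.2

error = absolute_error(chatbot_reported, reference)
print(f"reference statistic = {reference:.4f}")
print(f"absolute error      = {error:.4f}")
```

In practice the reference value would also be cross-checked against statistical software such as JASP or Excel before being used as the benchmark.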

Received Date: February 2, 2025
Revised Date: March 18, 2025
Accepted Date: March 30, 2025


References


  • Al-qadri, M., & Ahmed, S. (2023). Assessing the ChatGPT accuracy through principles of statistics exam: A performance and implications. ResearchSquare, 2(4), 35-44. https://doi.org/10.21203/rs.3.rs-2673838/v1

  • Aquarius AI. (n.d.). AI in education statistics: Key findings and trends. https://aquariusai.ca

  • Artem, K., & Sergiy, T. (2023). Generative AI and prompt engineering in education. Modern Engineering and Innovative Technologies, 29(1). https://doi.org/10.30890/2567-5273.2023-29-01-052

  • Boisvert, R., Cools, R., & Einarsson, B. (n.d.). Assessment of accuracy and reliability. https://tsapps.nist.gov/publication/get_pdf.cfm?pub_id=150040

  • Brookhart, S. M. (2013). How to create and use rubrics for formative assessment and grading. ASCD.

  • Brown, T. B., Mann, B., Ryder, N., Subbiah, M., Kaplan, J., Dhariwal, P., Neelakantan, A., Shyam, P., Sastry, G., Askell, A., Agarwal, S., Herbert-Voss, A., Krueger, G., Henighan, T., Child, R., Ramesh, A., Ziegler, D. M., Wu, J., Winter, C., … Amodei, D. (2020, July 22). Language models are few-shot learners. arXiv.org. https://arxiv.org/abs/2005.14165

  • Carl, K., Dignam, C., Kochan, M., Alston, C., & Green, D. (2024). Discovering prompt engineering: A qualitative study of nonexpert teachers' interactions with ChatGPT. Issues in Information Systems, 25(4), 205–220. https://doi.org/10.48009/4_iis_2024_117

  • Chubarian, K., & Turán, G. (2018). Interpretability of Bayesian network classifiers: OBDD approximation and polynomial threshold functions. https://homepages.math.uic.edu/~gyt/papers/CT20.pdf

  • Calonge, D., Smail, L., & Kamalov, F. (2023). Enough of the chit-chat: A comparative analysis of four AI chatbots for calculus and statistics. Journal of Applied Learning and Teaching, 6(2), 1-12. https://doi.org/10.37074/jalt.2023.6.2.22

  • Colosimo, B. M., del Castillo, E., Jones-Farmer, L. A., & Paynabar, K. (2021). Artificial intelligence and statistics for quality technology: an introduction to the special issue. Journal of Quality Technology, 53(5), 443–453. https://doi.org/10.1080/00224065.2021.1987806

  • Deng, J., & Lin, Y. (2023). The benefits and challenges of ChatGPT: An overview. Frontiers in Computing and Intelligent Systems, 2(2), 81–83. https://doi.org/10.54097/fcis.v2i2.4465

  • Evans, R., & Pozzi, A. (2023). Using ChatGPT to develop the statistical analysis plan for a randomized controlled trial: A case report. https://doi.org/10.21203/rs.3.rs-3433956/v1

  • Field, A. (2013). Discovering statistics using IBM SPSS statistics. Sage Publications.

  • Grassini, S. (2023). Shaping the future of education: Exploring the potential and consequences of AI and ChatGPT in educational settings. Educational Sciences. MDPI.

  • Hanckel, B., Petticrew, M., Thomas, J., et al. (2021). The use of qualitative comparative analysis (QCA) to address causality in complex systems: a systematic review of research on public health interventions. BMC Public Health, 21, 877. https://doi.org/10.1186/s12889-021-10926-2

  • Harvard Data Science Review. (2023). Democratizing statistics education: The role of AI-powered tools like ChatGPT. https://hdsr.org

  • Hemachandran, K., Verma, P., Pareek, P., Arora, N., Rajesh Kumar, K. V., Ahanger, T. A., Pise, A. A., & Ratna, R. (2022). Artificial intelligence: A universal virtual tool to augment tutoring in higher education. Computational Intelligence and Neuroscience, 2022, 1–8. https://doi.org/10.1155/2022/1410448

  • Hemelrijk, C. F., Johnson, M. T., & Lin, T. Y. (2024). Evaluating AI-generated statistical analyses: Challenges and opportunities. Journal of Computational Social Science, 8(2), 245–267. https://doi.org/10.xxxx/jcss.2024.025

  • JASP Team. (2024). JASP (Version 0.17) [Computer software]. https://jasp-stats.org

  • Kamalov, F., Santandreu Calonge, D., & Gurrib, I. (2023). New era of artificial intelligence in education: Towards a sustainable multifaceted revolution. https://doi.org/10.3390/su151612451

  • Krippendorff, K. (2004). Content analysis: An introduction to its methodology (2nd ed.). Sage Publications.

  • Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 22(140), 1–55.

  • Mayer, R. E. (2009). Multimedia learning (2nd ed.). Cambridge University Press.

  • Microsoft Corporation. (2024). Microsoft Excel [Computer software]. https://www.microsoft.com

  • OpenAI. (2024). ChatGPT: A tool for conversational AI and data analysis. https://openai.com/chatgpt

  • Ordak, M. (2023). ChatGPT's skills in statistical analysis using the example of allergology: Do we have reason for concern? Healthcare (Basel). https://doi.org/10.3390/healthcare11182554

  • Patel, H., & Parmar, S. (2024, March). Prompt engineering for large language model. https://doi.org/10.13140/RG.2.2.11549.93923

  • Patil, S., & Puranik, Y. (2024). Importance of effective prompt engineering. IRJMETS.

  • Pursnani, V., Sermet, Y., Kurt, M., & Demir, I. (2023). Performance of ChatGPT on the US fundamentals of engineering exam: Comprehensive assessment of proficiency and potential implications for professional environmental engineering practice. Computers and Education: Artificial Intelligence, 5, 100183. https://doi.org/10.1016/j.caeai.2023.100183

  • Sawalha, G., Taj, I., & Shoufan, A. (2024). Analyzing student prompts and their effect on ChatGPT's performance. Cogent Education, 11(1). https://doi.org/10.1080/2331186X.2024.2397200

  • Shakarian, P., Koyyalamudi, A., Ngu, N., & Mareedu, L. (2023). An independent evaluation of ChatGPT on mathematical word problems. arXiv. https://doi.org/10.48550/arXiv.2302.13814

  • Siegel, S., & Castellan, N. J. (1988). Nonparametric statistics for the behavioral sciences. McGraw-Hill. http://dx.doi.org/10.1177/014662168901300212

  • Sweller, J., Ayres, P., & Kalyuga, S. (2011). Cognitive load theory (1st ed.). Springer.

  • Team Atlan. (2024). Data consistency explained: Guide for 2024. Atlan. https://atlan.com/data-consistency-101/

  • Toolify.ai. (2024). ChatGPT vs other AI chatbots: A comprehensive comparison. https://www.toolify.ai

  • Tulsiani, R. (2024, January). ChatGPT and the future of personalized learning in higher education. E-Learning Industry. https://elearningindustry.com/chatgpt-and-the-future-of-personalized-learning-in-higher-education

  • Wang, S., Wang, F., Zhu, Z., Wang, J., Tran, T., & Du, Z. (2024). Artificial intelligence in education: A systematic literature review. Educational Systems and Applications, 2024, 1-27. https://doi.org/10.1016/j.eswa.2024.124167

  • Wardat, Y., Tashtoush, M. A., AlAli, R., & Jarrah, A. M. (2023). ChatGPT: A revolutionary tool for teaching and learning mathematics. Eurasia Journal of Mathematics, Science and Technology Education, 19(7). https://doi.org/10.29333/ejmste/13272

  • White, L., Balart, T., Amani, S., Shryock, K. J., & Watson, K. L. (2024). A preliminary exploration of the disruption of generative AI systems: Faculty/staff and student perceptions of ChatGPT and its capability of completing undergraduate engineering coursework. arXiv:2403.02623. https://doi.org/10.48550/arXiv.2403.02623

  • Woo, D. J., Guo, K., & Susanto, H. (2023). Cases of EFL secondary students' prompt engineering pathways to complete a writing task with ChatGPT. arXiv:2306.09433. https://doi.org/10.48550/arXiv.2306.09433

  • Zhang, X., Liu, Y., & Wang, R. (2023). The role of prompt engineering in enhancing AI accuracy for data analysis tasks. Advances in Artificial Intelligence Research, 34(4), 123–137. https://doi.org/10.xxxx/ai2023.134

  • Zhao, W., & Yu, L. (2022). Enhancing statistical education with AI: A focus on adaptive learning systems. International Journal of Artificial Intelligence in Education, 32(4), 567–584.

Published
2025-04-18