Vol. 42 / 2025 / 12 / pp. 105-136
研究人工智慧聊天機器人在香港大學英語必修課中自動進行寫作評估的潛力
Investigating the Potential of a Customized AI Chatbot to Automate Writing Assessment in a Compulsory English Course at a Hong Kong University
Authors
李禕 Yi Li *
(香港城市大學陳馮曼玲陳淑蓮語言中心講師 Instructor at Chan Feng Men-ling Chan Shuk-Lin Language Centre, City University of Hong Kong)
楊潔 Jie Yang
(香港城市大學英文系博士生 PhD candidate of the Department of English, City University of Hong Kong)
何璿子 Xuanzi He
(香港城市大學英文系研究助理 Research Assistant of the Department of English, City University of Hong Kong)
蘭舸 Ge Lan
(專刊客座編輯;香港城市大學英文系助理教授 Special Issue Guest Editor; Assistant Professor of the Department of English, City University of Hong Kong)
Chinese Abstract

自從2022年ChatGPT發佈以來,生成式人工智慧在語言教育的多個方面產生了巨大影響,例如語言教學、習得和評估。以香港某公立大學爲例,本研究旨在探究ChatGPT 是否可以作爲必修英語課程的寫作評估工具。爲了在這一教學情境中實現生態有效性,本研究構建了一個由人工智慧驅動的聊天機器人,以模擬英語教師在評估學生寫作任務前需要經歷的完整培訓過程,例如閱讀作業要求、熟悉評估標準以及審閱學生寫作樣本。該聊天機器人被用於自動評估100 篇由母語爲中文的本科生產出的敘述性作文。研究結果顯示,在三個總體寫作質量等級(A、B和C)之間存在輕微一致性(Kappa值 κ = 0.126),而機器人評分與教師評分之間存在正向中度相關(r = 0.446)。該結果揭示了在英語課堂中使用生成式人工智慧自動評估寫作的機遇與挑戰。最後該研究提出了對未來類似研究和語言教學實踐的啓示。

English Abstract

Since the release of ChatGPT in 2022, Generative AI has had a considerable influence on multiple aspects of language education, such as teaching, learning, and assessment. This study explores whether ChatGPT can be used as an assessment tool in a compulsory English course at a public university in Hong Kong. To achieve high ecological validity in this teaching context, an AI-powered Chatbot was built to replicate the training process that English teachers undergo before assessing students' writing tasks, such as reading assignment prompts, familiarizing themselves with the assessment rubric, and reviewing standardization samples. The Chatbot was used to automatically score 100 narrative essays written by undergraduate students who speak Chinese as their first language. The findings show slight agreement (κ = 0.126) on three general grade levels (A, B, and C) and a positive, moderate correlation (r = 0.446) between Chatbot scores and teacher scores. These findings reveal both the opportunities and challenges of using GenAI to automate writing assessment in English classrooms. The study concludes with implications for future research and language teaching practice.
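
For readers less familiar with the two statistics reported above, the following is a minimal sketch of how an agreement coefficient and a correlation coefficient of this kind can be computed. It assumes unweighted Cohen's kappa for the grade levels and Pearson's r for the numeric scores; the abstract does not state which kappa variant or software the authors used. The function names and all grades and scores below are hypothetical illustration values, not data from the study.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    n = len(rater_a)
    # Proportion of items on which the two raters gave the same label.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (observed - expected) / (1 - expected)


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (NOT from the study): grade levels and numeric scores
# assigned to the same eight essays by a teacher and by a chatbot.
teacher_grades = ["A", "B", "B", "C", "A", "B", "C", "B"]
chatbot_grades = ["A", "B", "C", "C", "B", "B", "B", "B"]
teacher_scores = [78, 65, 70, 55, 82, 68, 58, 66]
chatbot_scores = [75, 70, 62, 60, 72, 69, 67, 71]

kappa = cohen_kappa(teacher_grades, chatbot_grades)
r = pearson_r(teacher_scores, chatbot_scores)
print(f"kappa = {kappa:.3f}, r = {r:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a chatbot can correlate moderately with teacher scores while still agreeing only slightly on the exact grade level.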

Chinese Keywords

寫作評估;生成式人工智能;第二語言寫作

English Keywords

writing assessment; Generative AI; second language writing