Evaluating Current Tools for Pinyin Transcription : Can Customising a Chatbot Lead the Way Forward?
Rovira-Esteva, Sara (Universitat Autònoma de Barcelona. Departament de Traducció i d'Interpretació i d'Estudis de l'Àsia Oriental)

Additional title: 现有拼音转写工具评估:人工智能能否引领下一代技术?
Date: 2025
Description: 26 pp.
Abstract: The Chinese government introduced the "Chinese Phonetic Notation Plan" (known as Pinyin) in 1958 to combat illiteracy, eventually formalizing it as a standardised transcription system in 2012. The correct application of Pinyin orthographic rules is essential for language learning, international communication, and digitization. This research is driven by the belief that accurate transcription of Chinese text into Pinyin is crucial, while acknowledging that the process can be difficult and tedious when done manually. Therefore, this study aims to assess the performance of various Pinyin automatic transcription tools, identify problematic aspects in transcription, and determine whether customised systems can improve results while reducing user effort. The study employs a multi-phase methodology, including the analysis of representative transcription tools, comparison of errors, and the customisation of a chatbot for enhanced performance. The results reveal that most dedicated tools segment transcriptions at the character level rather than by word. General GenAI systems perform better than specific tools, but none followed the rules consistently. Common problems were identified in reduplication, punctuation, neutral tone, and word identification. Although DeepSeek had better initial performance, the customised and trained version of ChatGPT-4 achieved superior results in adherence to Pinyin rules, though perfect accuracy proved unattainable. This research highlights the challenges faced in automated transcription and offers insights into future improvements for systems aimed at assisting users with Pinyin transcription.
Grants: Ministerio de Ciencia e Innovación PID2024156763OB-100
Rights: This document is subject to a Creative Commons licence. Reproduction in whole or in part, distribution, public communication of the work, and the creation of derivative works are permitted, including for commercial purposes, provided that the authorship of the original work is acknowledged.
Language: English
Document: Article ; research ; Published version
Subject: Pinyin ; Chinese transcription ; Pinyin converters ; ChatGPT-4 ; DeepSeek
Published in: International Chinese Language Education Communications (ICLEC), Vol. 3, No. 1 (2025), p. 1-26, ISSN 3078-3348

DOI: 10.46451/iclec.20251101


26 p, 843.7 KB

The record appears in these collections:
Research literature > UAB research groups literature > Research Centres and Groups (research output) > Arts and Humanities > Grup d'estudi de la literacitat en l’ensenyament i l’aprenentatge de segones llengües i traducció (GELEA2LT)
Research literature > UAB research groups literature > Research Centres and Groups (research output) > Arts and Humanities > Research Group in Chinese-Catalan/Spanish Translation and Interpreting (TXICC)
Articles > Research articles
Articles > Published articles

 Record created 2025-11-13, last modified 2025-12-18


