Please use this identifier to cite or link to this item:
/library/oar/handle/123456789/126776

Full metadata record
| DC Field | Value | Language |
|---|---|---|
| dc.contributor.author | Williams, Aiden | - |
| dc.contributor.author | DeMarco, Andrea | - |
| dc.contributor.author | Borg, Claudia | - |
| dc.date.accessioned | 2024-09-19T06:26:48Z | - |
| dc.date.available | 2024-09-19T06:26:48Z | - |
| dc.date.issued | 2023 | - |
| dc.identifier.citation | Williams, A., DeMarco, A., & Borg, C. (2023). The Applicability of Wav2Vec2 and Whisper for Low-Resource Maltese ASR. 2nd Annual Meeting of the ELRA/ISCA SIG on Under-resourced Languages (SIGUL 2023), Dublin. 39-43. | en_GB |
| dc.identifier.uri | https://www.um.edu.mt/library/oar/handle/123456789/126776 | - |
| dc.description.abstract | Maltese is a low-resource language with limited digital tools, including automatic speech recognition. With very limited datasets of Maltese speech available, a recent project, MASRI, developed further speech datasets and produced an initial prototype trained using the Jasper architecture. The best system achieved 55.05% WER on the MASRI test set. Our work builds upon this, producing a further two-and-a-half-hour annotated speech corpus from a domain in which no data was previously available (Parliament of Malta). Moreover, we experiment with existing pre-trained self-supervised models (Wav2Vec2.0 and Whisper) and further fine-tune these models on Maltese annotated data. A total of 30 Maltese ASR models are trained and evaluated using the WER and the CER. The results indicate that the performance of the models scales with the quantity of data, although not linearly. The best model achieves state-of-the-art results of 8.53% WER and 1.93% CER on a test set extracted from the CommonVoice project and 24.98% WER and 8.37% CER on the MASRI test set. | en_GB |
| dc.language.iso | en | en_GB |
| dc.publisher | SIGUL | en_GB |
| dc.rights | info:eu-repo/semantics/openAccess | en_GB |
| dc.subject | Automatic speech recognition | en_GB |
| dc.subject | Low-resource languages | en_GB |
| dc.subject | Natural language processing (Computer science) | en_GB |
| dc.subject | Computational linguistics | en_GB |
| dc.subject | Speech processing systems | en_GB |
| dc.title | The applicability of Wav2Vec2 and Whisper for low-resource Maltese ASR | en_GB |
| dc.type | conferenceObject | en_GB |
| dc.rights.holder | The copyright of this work belongs to the author(s)/publisher. The rights of this work are as defined by the appropriate Copyright Legislation or as modified by any successive legislation. Users may access this work and can make use of the information contained in accordance with the Copyright Legislation provided that the author must be properly acknowledged. Further distribution or reproduction in any format is prohibited without the prior permission of the copyright holder. | en_GB |
| dc.bibliographicCitation.conferencename | 2nd Annual Meeting of the ELRA/ISCA SIG on Under-resourced Languages (SIGUL 2023) | en_GB |
| dc.bibliographicCitation.conferenceplace | Dublin, Ireland. 18-20/08/2023. | en_GB |
| dc.description.reviewed | peer-reviewed | en_GB |
| dc.identifier.doi | 10.21437/SIGUL.2023-9 | - |
| Appears in Collections: | Scholarly Works - InsSSA | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| The_applicability_of_Wav2Vec2_and_whisper_for_low_resource_Maltese_ASR.pdf | | 333.52 kB | Adobe PDF | View/Open |
Items in OAR@UM are protected by copyright, with all rights reserved, unless otherwise indicated.
