OAR@UM Collection: /library/oar/handle/123456789/52654

Title: A diphone-based Maltese speech synthesis system (/library/oar/handle/123456789/74891)
Abstract: While there has been prior work in the area, at the time of writing there were no
available TTS systems for Maltese, so almost the entire system had to be built from scratch.
In light of this, a Diphone-Based Concatenative Speech System was chosen as the type of
synthesiser to implement, due to the minimal amount of data needed: less than 20 minutes
of recorded speech.
A simple `Text Normalisation' component was built, which converts integers between 0
and 9,999 written as numerals to their textual form. While this is far from covering all the
possible forms of Non-Standard Words (NSWs) in Maltese, the modular nature in which it
was built allows for easy upgrading in future work. A `Grapheme-to-Phoneme (G2P)' component
was also created, based on an already existing implementation by Crimsonwing; it converts
the normalised text into the sequence of phonemes (basic sounds) that make up the text.
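As an illustration of the kind of expansion such a normalisation component performs, the sketch below spells out integers from 0 to 9,999 found in running text. It uses English word forms purely for illustration; the thesis targets Maltese word forms, which are not given in the abstract.

```python
import re

# Illustrative number-to-words expansion for 0..9999 (English forms;
# the actual system would substitute Maltese word forms instead).
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven",
        "eight", "nine", "ten", "eleven", "twelve", "thirteen", "fourteen",
        "fifteen", "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]

def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..9999."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0..9999 supported")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        head = ONES[hundreds] + " hundred"
        return head + (" and " + number_to_words(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    head = number_to_words(thousands) + " thousand"
    return head + (" " + number_to_words(rest) if rest else "")

def normalise(text: str) -> str:
    """Replace standalone 1-4 digit numerals in text with their spelt-out form."""
    return re.sub(r"\b\d{1,4}\b", lambda m: number_to_words(int(m.group())), text)
```

The modular shape (a lookup table plus a recursive splitter) is what makes such a component easy to extend to other non-standard word classes later.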
Three separate `Diphone Databases' were made available to the speech synthesiser. One
of these is the professionally recorded English diphone database, FestVox's `CMU US KAL
Diphone'. The second and third were created as part of this work, one with diphones
manually extracted from the recorded carrier phrases in Maltese, the other with diphones
automatically extracted using Dynamic Time Warping (DTW). The Time Domain - Pitch
Synchronous OverLap Add (TD-PSOLA) concatenation algorithm was implemented to
string together the diphones in the sequence specified by the G2P component.
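Dynamic Time Warping aligns two sequences of different lengths by minimising the cumulative frame-to-frame distance, which is what allows boundaries labelled on one recording to be mapped onto another. A minimal sketch over scalar sequences (purely illustrative; real systems align frame-level feature vectors such as MFCCs):

```python
# Minimal Dynamic Time Warping: returns the minimal cumulative cost of
# aligning sequence a with sequence b under absolute scalar distance.
def dtw_cost(a, b):
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # step in a only
                                 cost[i][j - 1],      # step in b only
                                 cost[i - 1][j - 1])  # step in both
    return cost[n][m]
```

Backtracking through the same cost matrix yields the warping path itself, which is the part a diphone extractor would use to transfer boundary positions.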
On a scale of 1 to 5, the speech synthesised using the database of manually extracted
diphones, concatenated by the TD-PSOLA algorithm, was scored by evaluators at 2.57 for
naturalness, 2.72 for clarity, and, most importantly, 3.06 for intelligibility. These scores
were higher than those obtained when using the professionally recorded English diphone set.
Description: B.SC.ICT(HONS)ARTIFICIAL INTELLIGENCE, 2019

Title: Mining drug-drug interactions for healthcare professionals (/library/oar/handle/123456789/74889)
Abstract: Adverse Drug Reactions (ADRs) are the fourth leading cause of death in the US. One
cause of ADRs is Drug-Drug Interactions (DDIs); the positive side of this is that such
reactions can be prevented. DDIs are reported during the pharmacovigilance (PV) process,
PV being the practice of monitoring and detecting ADRs once a drug is launched onto the
market. Information related to DDIs is dispersed across different biomedical articles. We
propose medicX, a system that is able to detect DDIs in biomedical
texts by leveraging different machine learning techniques. The main components of our
system are the Drug Named Entity Recognition (DNER) component and the DDI Identification
component. Different approaches were investigated in line with existing research. The
DNER component is evaluated using the CHEMDNER and DDIExtraction 2013 challenge corpora,
while the DDI Identification component is evaluated using the DDIExtraction 2013 challenge
corpus alone. The DNER component is implemented using an LSTM-CRF-based approach, which
achieves a macro-averaged F1-score of 84.89% when trained and evaluated on the DDI-2013
corpus, 1.43% higher than the system that placed first in the DDIExtraction 2013 challenge.
The DDI Identification component is implemented as a two-stage, rich feature-based,
linear-kernel SVM. This classifier achieves an F1-score of 66.18%, compared with the
71.79% reported by the state-of-the-art SVM-based DDI system.
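The macro-averaged F1-score quoted above averages the per-class F1 over all classes, weighting each class equally regardless of how many examples it has, which matters when interaction types are rare. A minimal sketch:

```python
# Macro-averaged F1: compute precision, recall, and F1 per class, then
# take the unweighted mean, so rare classes count as much as frequent ones.
def macro_f1(y_true, y_pred):
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    return sum(scores) / len(scores)
```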
Description: B.SC.ICT(HONS)ARTIFICIAL INTELLIGENCE, 2019

Title: dOMiNiuM PubliCuM : opinion mining of news portal comments (/library/oar/handle/123456789/74875)
Abstract: Sentiment analysis is a research problem with great potential, given the broad applicability of being able to accurately summarise the opinion a person expresses towards any topic. Product reviews have seen a great deal of research over the years; unfortunately, research into sentiment analysis on more ambiguous data such as news portal comments has been far more limited because of its greater challenges. We propose a rule-based, aspect-level sentiment analysis system that uses an opinion word corpus to detect the sentiment expressed towards entities in user-generated content, noted to be one of the more complex forms of data. The system is designed to use comments extracted from the Times of Malta news portal. After extraction, each sentence in a comment is checked for language, so that only English sentences are processed further; sentences not in English are simply marked and stored. Each English sentence within a comment is then processed for sentiment analysis, and the sentence scores contribute to the mean sentiment value of each entity in that comment. Experimental results indicate that this approach is promising for similar endeavours in the future.
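The per-entity aggregation described above can be sketched as follows. The tiny lexicon and entity list are hypothetical stand-ins; the actual system uses an opinion word corpus and entities detected in the comments.

```python
# Toy lexicon-based, aspect-level sentiment scoring: each sentence gets a
# score from an opinion-word lexicon, every entity mentioned in that
# sentence accumulates the score, and the final per-entity value is the
# mean over sentences. LEXICON here is an illustrative stand-in.
LEXICON = {"good": 1, "great": 2, "bad": -1, "terrible": -2}

def entity_sentiment(sentences, entities):
    totals = {e: [] for e in entities}
    for sentence in sentences:
        words = sentence.lower().split()
        score = sum(LEXICON.get(w, 0) for w in words)
        for e in entities:
            if e.lower() in words:
                totals[e].append(score)
    # Mean score per entity; 0.0 if the entity was never mentioned.
    return {e: (sum(s) / len(s) if s else 0.0) for e, s in totals.items()}
```

For example, an entity praised in one sentence and criticised in another ends up with a mean near neutral, which is exactly the aggregation behaviour the abstract describes.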
Description: B.SC.ICT(HONS)ARTIFICIAL INTELLIGENCE, 2019

Title: A text-independent, multi-lingual and cross-corpus evaluation of emotion recognition in speech (/library/oar/handle/123456789/74844)
Abstract: Ongoing research on Human-Computer Interaction (HCI) is always progressing, and
the need for machines to detect human emotion continues to increase for the purpose of
building more personalised systems that can act intelligently according to user emotion.
Different languages may portray emotions differently, which is an obstacle in the field of
automatic emotion recognition from speech. We propose a system which takes a cross-corpus,
multilingual approach to emotion recognition from speech in order to show how results
behave when compared to testing on a single monolingual corpus. We utilise four different
classifiers: K-Nearest Neighbours (KNN), Support Vector Machines (SVM), Multi-Layer
Perceptrons (MLP), and Gaussian Mixture Models (GMM), along with two different feature
sets, Mel-Frequency Cepstral Coefficients (MFCCs) and our own extracted prosodic feature
set, on three different emotional speech corpora covering several languages.
The aim of the prosodic feature set is to acquire a general feature set that works well
across all languages and corpora. We notice a drop in performance when unseen data is
tested, which improves when merged databases are present in the training data and when
EMOVO is present in either training or testing. MFCCs work very well with GMMs in
single-corpus testing, but our prosodic feature set works better in general with the rest
of the classifiers. We evaluate all the obtained results with a view to identifying any
elements that could form a language-independent system, but for the time being the results
show otherwise.
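The simplest of the four classifiers listed above, K-Nearest Neighbours, can be sketched in a few lines. This uses Euclidean distance over generic feature vectors and is purely illustrative, not the thesis implementation:

```python
from collections import Counter
import math

# Minimal K-Nearest Neighbours classifier: predict the majority label
# among the k training points closest (in Euclidean distance) to a query.
def knn_predict(train_X, train_y, query, k=3):
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    nearest = [label for _, label in dists[:k]]
    return Counter(nearest).most_common(1)[0][0]
```

In an emotion-recognition setting, `train_X` would hold per-utterance feature vectors (MFCC statistics or prosodic features) and `train_y` the emotion labels.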
Description: B.SC.ICT(HONS)ARTIFICIAL INTELLIGENCE, 2019