

An extra comparison was made to EasyAlign, which to our knowledge was the only aligner that supported BP at that moment.
Forced phonetic alignment is the task of aligning a speech recording with its phonetic transcription, which is useful across a myriad of linguistic tasks such as prosody analysis. However, annotating the phonetic boundaries of several hours of speech by hand is very time-consuming, even for experienced phoneticians. Several approaches have been applied to automate this process, some of them brought from the automatic speech recognition (ASR) domain, and the combination of hidden Markov models (HMM) and Gaussian mixture models (GMM) has long been the most widely explored for forced alignment.

Regardless of the technique adopted, phonetic alignment resources for Brazilian Portuguese (BP) are still scarce. With respect to ASR-based frameworks, our research found only three forced aligners that provide pre-trained models for BP: EasyAlign, Montreal Forced Aligner (MFA) and UFPAlign. To the best of our knowledge, EasyAlign is the only HTK-based aligner that ships with a model for BP, but it appears to be no longer maintained; MFA is the only Kaldi-based one; and UFPAlign has evolved over time to work with both HTK and Kaldi as back-ends. As a matter of fact, UFPAlign was initiated in, providing a package with a grapheme-to-phoneme (G2P) converter, a syllabification system and GMM-based acoustic models trained with the HTK toolkit. As usual, tests comparing the automatic versus manual segmentations were performed. Furthermore, complex deep-learning-based approaches still do not improve performance compared to simpler models.

The contributions of this work are then twofold: developing resources to perform forced alignment in BP, including the release of scripts to train acoustic models via Kaldi as well as the resources themselves under open licenses; and bringing forth a comparison to two other phonetic aligners that provide resources for BP, namely EasyAlign and Montreal Forced Aligner (MFA), the latter also being Kaldi-based. Evaluation took place in terms of phone boundary and intersection over union metrics over a dataset of 385 hand-aligned utterances, and results show that Kaldi-based aligners perform better overall, and that UFPAlign models are more accurate than MFA’s.
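To make the phone boundary and intersection over union metrics mentioned above more concrete, here is a minimal Python sketch. It is not the evaluation code used for UFPAlign: the exact metric definitions are not given in this text, so the sketch assumes that the phone boundary error is the absolute time difference between the manual and automatic end boundary of each phone, and that intersection over union is computed per phone on the two time intervals. The names `Segment`, `interval_iou` and `compare_alignments`, as well as the toy timestamps, are hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (phone label, start time in s, end time in s)
Segment = Tuple[str, float, float]


def interval_iou(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Intersection over union of two time intervals (0.0 if they do not overlap)."""
    overlap = max(0.0, min(a_end, b_end) - max(a_start, b_start))
    union = (a_end - a_start) + (b_end - b_start) - overlap
    return overlap / union if union > 0 else 0.0


def compare_alignments(
    manual: List[Segment], automatic: List[Segment]
) -> Tuple[List[float], List[float]]:
    """Return per-phone boundary errors (in seconds) and per-phone IoU values.

    Assumes both alignments contain the same phone sequence, which is
    the usual situation when a forced aligner is given the reference
    transcription.
    """
    if [p for p, _, _ in manual] != [p for p, _, _ in automatic]:
        raise ValueError("alignments must share the same phone sequence")

    boundary_errors = []
    ious = []
    for (_, m_start, m_end), (_, a_start, a_end) in zip(manual, automatic):
        # Assumed definition: distance between the two end boundaries.
        boundary_errors.append(abs(m_end - a_end))
        ious.append(interval_iou(m_start, m_end, a_start, a_end))
    return boundary_errors, ious


# Toy example with made-up timestamps for a three-phone utterance.
manual = [("o", 0.00, 0.10), ("l", 0.10, 0.25), ("a", 0.25, 0.50)]
automatic = [("o", 0.00, 0.12), ("l", 0.12, 0.25), ("a", 0.25, 0.45)]

errors, ious = compare_alignments(manual, automatic)
mean_error = sum(errors) / len(errors)
mean_iou = sum(ious) / len(ious)
print(f"mean boundary error: {mean_error * 1000:.1f} ms, mean IoU: {mean_iou:.3f}")
```

In practice, boundary errors are often summarized as the share of boundaries falling within a tolerance (for example a few tens of milliseconds) rather than as a plain mean; the tolerance to use here is left open because the text does not specify one.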
This paper describes the evolution process toward creating free resources for phonetic alignment in Brazilian Portuguese (BP) within a toolkit we call UFPAlign, built on Kaldi, a toolkit that achieves state-of-the-art results for open-source speech recognition.

Phonetic analysis of speech, in general, requires the alignment of audio samples to their phonetic transcription. This could be done manually for a couple of files, but as the corpus grows large, it becomes infeasibly time-consuming.
