MBROLA Formatted Diphone Database for Sinhala Language

Kumara, K.H.; Dias, N.G.J.; Wickramasinghe, R.l.P.

dc.contributor.author	Kumara, K.H.
dc.contributor.author	Dias, N.G.J.
dc.contributor.author	Wickramasinghe, R.l.P.
dc.date.accessioned	2015-05-15T05:19:36Z
dc.date.available	2015-05-15T05:19:36Z
dc.date.issued	2007
dc.identifier.citation	Kumara, K.H., Dias, N.G.J. and Wickramasinghe, R.l.P., 2007. MBROLA Formatted Diphone Database for Sinhala Language, Proceedings of the Annual Research Symposium 2007, Faculty of Graduate Studies, University of Kelaniya, pp 137.	en_US
dc.identifier.uri
dc.identifier.uri	http://repository.kln.ac.lk/handle/123456789/7461
dc.description.abstract	Diphone synthesis is one of the most popular methods used for creating a synthetic voice from recordings or samples of a particular person. Diphones are speech units that begin in the middle of the stable state of a phone and end in the middle of the following phone. The main interest in diphone synthesis is that they minimize the concatenation problems. The aim of the MBROLA project, recently initiated by the Facult' e Polytechnique de Mons (Belgium), is to obtain a set of speech synthesizers for as many voices, languages and dialects as possible, free of use for non-commercial and non-military applications. Central to the MBROLA project is MBROLA 2.00, a speech synthesizer based on the concatenation of diphones, takes a list of allophones associated with prosodic information as input and outputs 16 bit linear speech samples. Diphone databases tailored to the MBROLA format are necessary to run the synthesizer. Therefore we put forward a Diphone database, tailored to the MBROLA format, to generate synthetic voice for Sinhala language through MBROLA .pho reader. The first step of building the diphone database was the fixing a list of all the phones (acoustic instances of phonemes) of Sinhala language. Creating the diphone database was achieved in three steps: Creating a text corpus, Recording the corpus and Segmenting the speech corpus. For the text corpus, we used few selected chapters of two Sinhala novels. The corpus was then red by two (Male and Female) native Sinhala speakers, digitally recorded and stored. Then all diphones were spotted manually with the help of Speech Viewer ofCSLU toolkit which was developed by the Center for Spoken Language Understanding, Oregon Graduate Institute of Science and Technology, USA. A diphone database was finally created with 1004 diphone segments, which summarizes the results in the form of: the name of diphoncs, the related waveforms, their duration, and internal sub-splittings. Since we did not consider allophone variations in all instances, it may reduce the naturalness of the resulting synthetic speech. It is also possible that the number of diphone segments may higher than the above number (1004). However, most of the common occurrences of diphones were included in the database that we have developed. 137	en_US
dc.language.iso	en	en_US
dc.publisher	University of Kelaniya	en_US
dc.title	MBROLA Formatted Diphone Database for Sinhala Language	en_US
dc.type	Article	en_US