dc.contributor.author |
Kumara, K.H. |
|
dc.contributor.author |
Dias, N.G.J. |
|
dc.contributor.author |
Sirisena, H. |
|
dc.date.accessioned |
2015-05-11T04:09:35Z |
|
dc.date.available |
2015-05-11T04:09:35Z |
|
dc.date.issued |
2006 |
|
dc.identifier.citation |
Kumara, K.H., Dias, N.G.J. and Sirisena, H., 2006. A tool for automatic segmentation of a given Sinhala text into Syllables for Speech synthesis and Speech recognition, Proceedings of the Annual Research Symposium 2006, Faculty of Graduate Studies, University of Kelaniya, pp 67-68. |
en_US |
dc.identifier.uri |
http://repository.kln.ac.lk/handle/123456789/7354 |
|
dc.description.abstract |
In the present era of human-computer interaction, the educationally under-privileged and
the rural communities of Sri Lanka are being deprived of technologies that pervade the
growing interconnected web of computers and communications. One good solution for
this problem would be computers talking to the common man in the language he is
comfortable communicating in. A significant percentage of the Sri Lankan population is
educationally under-privileged. On the one hand, we claim to be building an E-Government
or an E-Society in Sri Lanka; on the other hand, the advances we make are
totally inaccessible to a large number of people in Sri Lanka. Under such circumstances,
we cannot expect rural/educationally under-privileged people to use computers and IT
products unless we remove the need to be literate, which stands as a barrier between
them and computers. However, the interaction between the computer and the user is
largely through keyboard and screen-oriented systems. In the current Sri Lankan context,
this restricts the usage of computers to a minuscule fraction of the population, who are
both computer-literate and conversant with written English. To enable a wider
proportion of the population to benefit from information technology, there is a dire need for
an interface other than the keyboard-and-screen interface that is widely in use at present.
Speech technologies promise to be the next-generation user interface. Software
applications with speech and voice recognition abilities have a better chance of
communicating with a large percentage of the population, including the educationally under-privileged,
the visually challenged and the computer-illiterate, if these applications can speak
and understand the native language. It is well known that the transcription of
orthographic words into syllables is one of the principal steps of syllable-based speech
synthesis and speech recognition. Hence we put forward a dictionary-based automatic
syllabification tool for speech synthesis and automatic speech recognition in the Sinhala
language. It can also provide the frequency distributions of vowels, consonants
and syllables in a given Sinhala text. Although there is no universally agreed definition of the
syllable, in this research a syllable is defined as $C_0^n V_1^n C_0^n$, where
$C_0^n$ signifies 0 to n consonants and $V_1^n$ signifies 1 to n vowels. In this tool, detection of
syllable boundaries for a given Sinhala sentence is achieved in four main phases: (1)
Reformat everything encountered (e.g. digits, abbreviations) into words and punctuation. (2) Derive a phonemic representation for each word. (3) Determine the $C_0^n V_1^n$ units for a
given word. (4) Reformat the above $C_0^n V_1^n$ units according to the $C_0^n V_1^n C_0^n$ definition in
order to obtain the syllable boundaries. The following example gives a better explanation
of the algorithm. |
en_US |
dc.language.iso |
en |
en_US |
dc.publisher |
University of Kelaniya |
en_US |
dc.title |
A tool for automatic segmentation of a given Sinhala text into Syllables for Speech synthesis and Speech recognition |
en_US |
dc.type |
Article |
en_US |
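
Phases (3) and (4) of the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not list the tool's phoneme inventory or say how a consonant cluster between two vowels is divided, so the romanized vowel set `VOWELS`, the function names `cv_units` and `syllabify`, and the rule "the last consonant of a cluster opens the next syllable, the rest close the previous one" are all assumptions made for illustration. Input is assumed to be the phonemic representation produced by phase (2), given as a list of romanized phonemes.

```python
from collections import Counter

# Hypothetical romanized vowel inventory (assumption; not taken from the source).
VOWELS = {"a", "aa", "ae", "aae", "i", "ii", "u", "uu", "e", "ee", "o", "oo"}


def cv_units(phonemes):
    """Phase 3 (sketch): group phonemes into C(0..n) V(1..n) units.

    Returns (units, trailing), where each unit is (consonants, vowels)
    and trailing holds any word-final consonants not followed by a vowel.
    """
    units, onset, nucleus = [], [], []
    for p in phonemes:
        if p in VOWELS:
            nucleus.append(p)
        else:
            if nucleus:  # a consonant after vowels closes the current unit
                units.append((onset, nucleus))
                onset, nucleus = [], []
            onset.append(p)
    if nucleus:
        units.append((onset, nucleus))
        onset = []
    return units, onset


def syllabify(phonemes):
    """Phase 4 (sketch): regroup CV units into C(0..n) V(1..n) C(0..n) syllables.

    Assumed rule: in a cluster between two vowels, the last consonant stays
    with the following vowels and the others join the preceding syllable.
    """
    units, trailing = cv_units(phonemes)
    if not units:
        return [list(trailing)] if trailing else []
    syllables = []
    for i, (onset, nucleus) in enumerate(units):
        if i > 0 and len(onset) > 1:
            syllables[-1].extend(onset[:-1])  # coda of the previous syllable
            onset = onset[-1:]
        syllables.append(list(onset) + list(nucleus))
    syllables[-1].extend(trailing)  # word-final consonants close the last syllable
    return syllables


def syllable_frequencies(words):
    """Count syllables over a list of phonemic words (frequency distribution)."""
    return Counter("".join(s) for w in words for s in syllabify(w))
```

Under these assumptions, `syllabify(["k", "a", "n", "d", "a"])` returns `[['k', 'a', 'n'], ['d', 'a']]`, and `syllable_frequencies` gives the kind of syllable frequency distribution the abstract mentions. A dictionary-based tool, as described in the abstract, would presumably consult its dictionary rather than rely on a single fixed cluster rule.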