Digital Repository

Developing a Dependency Tag Set for Sinhala: Procedure and Issues.

dc.contributor.author Liyanage, C.
dc.contributor.author Wijeratne, W.M.
dc.date.accessioned 2017-12-12T04:24:38Z
dc.date.available 2017-12-12T04:24:38Z
dc.date.issued 2017
dc.identifier.citation Liyanage, C. and Wijeratne, W.M. (2017). Developing a Dependency Tag Set for Sinhala: Procedure and Issues. The Third International Conference on Linguistics in Sri Lanka, ICLSL 2017. Department of Linguistics, University of Kelaniya, Sri Lanka. p94. en_US
dc.identifier.uri http://repository.kln.ac.lk/handle/123456789/18502
dc.description.abstract Dependency Grammar (DG) is considered one of the prominent theories of syntax. To analyze a particular language within DG and to build an annotated dependency treebank, a tagset is needed. The objective of this research is to compile a dependency treebank for Sinhala. As part of compiling the treebank, a tagset was developed. This study explores the procedure and issues of developing a dependency tagset, with a special focus on the Sinhala language. The methodology comprises four steps: (1) identify shared grammatical categories from benchmark tagsets; (2) find syntactico-semantic categories in traditional Sinhala grammar books; (3) analyze sentences extracted from the UCSC Sinhala corpus to identify further grammatical categories; (4) verify the tagset. No DG-based work on Sinhala has been reported in the literature. However, syntactic analyses in other grammatical traditions, Sinhala grammar books, and several tagsets were consulted in this work. Among the tagsets consulted, the Stanford typed dependencies manual (Marneffe and Manning, 2016) and AnnCorra: TreeBanks for Indian Languages-Guidelines for Annotating Hindi TreeBank (Bharati et al., 2012) were selected as benchmarks. To keep the tagsets uniform, many tags for shared grammatical categories were taken from these benchmark tag schemas. The findings introduce syntactico-semantic categories and levels of dependency relations between words in Sinhala. The tagset comprises 42 tags and can be used in related work on DG for Sinhala. en_US
dc.language.iso en en_US
dc.publisher The Third International Conference on Linguistics in Sri Lanka, ICLSL 2017. Department of Linguistics, University of Kelaniya, Sri Lanka. en_US
dc.subject Computational Grammar en_US
dc.subject Dependency Annotation en_US
dc.subject Dependency Tag Set en_US
dc.subject Sinhala Grammar en_US
dc.subject Sinhala Linguistics en_US
dc.title Developing a Dependency Tag Set for Sinhala: Procedure and Issues. en_US
dc.type Article en_US