A part-of-speech (POS) tagged corpus of Classical Tibetan

Hill, Nathan W.; Garrett, Edward


Abstract

This part-of-speech (POS) tagged corpus of Classical Tibetan was prepared in the course of the research project 'Tibetan in Digital Communication' (2012-2015), hosted at SOAS, University of London, and funded by the UK's Arts and Humanities Research Council (grant code: AH/J00152X/1). For a description of the tag set, see Garrett et al. (2014) and Garrett et al. (2015). The corpus includes the Mdzaṅs blun (9th century, canonical), the Bu ston chos ḥbyuṅ (13th century, ecclesiastical history), and the Mi la ras paḥi rnam thar and Mar paḥi rnam thar (15th century, biography).

Citation

Hill, N. W., & Garrett, E. A part-of-speech (POS) tagged corpus of Classical Tibetan. [Data]

Online Publication Date May 11, 2017
Deposit Date Jun 16, 2017
Publicly Available Date Jun 16, 2017
Publisher URL http://doi.org/10.5281/zenodo.574878
Type of Data Part-of-speech tagged texts
Additional Information References:
Garrett, Edward and Hill, Nathan W. and Kilgarriff, Adam and Vadlapudi, Ravikiran and Zadoks, Abel (2015) 'The contribution of corpus linguistics to lexicography and the future of Tibetan dictionaries.' Revue d'Etudes Tibétaines, 32. pp. 51-86.
Garrett, Edward and Hill, Nathan W. and Zadoks, Abel (2014) 'A Rule-based Part-of-speech Tagger for Classical Tibetan.' Himalayan Linguistics, 13 (1). pp. 9-57.