Amazigh Grammatical Labelling using n-gram Properties and Segmentation Pre-treatment
Dublin Core | PKP Metadata Items | Metadata for this Document | |
1. | Title | Title of document | Amazigh Grammatical Labelling using n-gram Properties and Segmentation Pre-treatment |
2. | Creator | Author's name, affiliation, country | Mohamed Outahajala; EMI - IRCAM; Morocco |
2. | Creator | Author's name, affiliation, country | Yassine Benajiba; Philips Research North America; United States |
2. | Creator | Author's name, affiliation, country | Paolo Rosso; NLE Lab - DSIC, Technical University of Valencia; Spain |
2. | Creator | Author's name, affiliation, country | Lahbib Zenkouar; EMI, Mohammed V University - Agdal; Morocco |
3. | Subject | Discipline(s) | |
3. | Subject | Keyword(s) | |
4. | Description | Abstract | This paper presents the first Amazigh POS tagger. Very few linguistic resources have been developed so far for Amazigh, and we believe that developing a POS tagger is the first step needed for automatic text processing. To this end, we trained two sequence classification models, one using Support Vector Machines (SVMs) and one using Conditional Random Fields (CRFs), after a tokenization step. We used 10-fold cross-validation to evaluate our approach. Results show that the performance of SVMs and CRFs is very comparable: SVMs slightly outperformed CRFs at the fold level (92.58% vs. 92.14%), whereas CRFs slightly outperformed SVMs on the 10-fold average (89.48% vs. 89.29%). These results are very promising considering that we used a corpus of only ~20k tokens. |
5. | Publisher | Organizing agency, location | |
6. | Contributor | Sponsor(s) | |
7. | Date | (YYYY-MM-DD) | 2012-03-15 |
8. | Type | Status & genre | Peer-reviewed Article |
8. | Type | Type | |
9. | Format | File format | PDF (French (France)) |
10. | Identifier | Uniform Resource Identifier | https://www.revue-eti.net/index.php/eti/article/view/60 |
11. | Source | Title; vol., no. (year) | Electronic Journal of Information Technology; Issue 6 |
12. | Language | English=en | fr |
13. | Relation | Supp. Files | |
14. | Coverage | Geo-spatial location, chronological period, research sample (gender, age, etc.) | |
15. | Rights | Copyright and permissions | Copyright (c) 2012 Mohamed Outahajala, Yassine Benajiba, Paolo Rosso, Lahbib Zenkouar. This work is licensed under a Creative Commons Attribution 4.0 International License. |