Electronic Journal of Information Technology, Issue 9

Automatic Indexing of Arabic Texts: State of the Art

Mohamed Salim El Bazzi, Taher Zaki, Driss Mammass, Abdelatif Ennaji

Abstract


Document indexing is a crucial step in the text mining process. It represents documents by the descriptors most relevant to their content. Several approaches have been proposed in the literature, particularly for English, but they are unsuitable for Arabic documents because of the language's specific characteristics and the complexity of its morphology, grammar, and vocabulary. In this paper, we review the state of the art in indexing methods and their contribution to improving the processing of Arabic documents. We also propose a categorization of existing work according to the most widely used approaches and methods for indexing textual documents. We selected papers qualitatively, retaining those that make notable contributions to indexing and report significant results.