VỀ MỘT PHƯƠNG PHÁP XÁC ĐỊNH MỤC TIÊU VĂN BẢN TRONG TIẾNG VIỆT

Hùng

A SUITABLE MODEL FOR CLASSIFYING VIETNAMESE DOCUMENTS

Authors

Nguyen Canh Hung (Corresponding Author), Military Information Technology Institute, Academy of Military Science and Technology

Keywords:

Text Classification; Tokenization; Conditional Random Fields (CRFs).

Abstract

In this paper, we propose a text classification model for Vietnamese documents. The model combines two separate components: a tokenization algorithm based on Conditional Random Fields (CRFs) [1] and StarSpace [2], a general-purpose text classification model. Experimental results indicate that the model performs well on the classification task, with accuracy above 90% on the test dataset.
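
The hand-off between the two components can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the CRF tagger marks each syllable as B (begins a word) or I (continues a word), that multi-syllable words are joined with underscores, and that training lines carry a `__label__` prefix, the default label prefix in StarSpace. The function names, the sample sentence, and the `education` label are hypothetical.

```python
def join_syllables(syllables, tags):
    """Merge syllables into word tokens using B/I tags.

    Assumption: "B" starts a new word, "I" continues the previous one.
    """
    words = []
    for syllable, tag in zip(syllables, tags):
        if tag == "I" and words:
            # Attach the syllable to the word currently being built.
            words[-1] += "_" + syllable
        else:
            words.append(syllable)
    return words


def to_training_line(words, label):
    """Format one tokenized document as a labeled line for the classifier."""
    return " ".join(words) + " __label__" + label


# Hypothetical tagger output for the sentence "học sinh đi học".
syllables = ["học", "sinh", "đi", "học"]
tags = ["B", "I", "B", "B"]

words = join_syllables(syllables, tags)
line = to_training_line(words, "education")
print(line)
```

Joining syllables before classification lets the classifier treat a compound such as "học sinh" as a single feature rather than two unrelated syllables, which is the usual motivation for tokenizing Vietnamese text first.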

Published

10-04-2020

How to Cite

Hùng. “A SUITABLE MODEL FOR CLASSIFYING VIETNAMESE DOCUMENTS”. Journal of Military Science and Technology, no. 66, Apr. 2020, pp. 238-42, https://online.jmst.info/index.php/jmst/article/view/243.