Use this URL to cite this publication: https://hdl.handle.net/20.500.12259/49977
Classification of short legal Lithuanian texts
Type of publication
Article in peer-reviewed foreign international conference proceedings (P1d)
Author(s)
Morkevičius, Vaidas (Kauno technologijos universitetas)
Title
Classification of short legal Lithuanian texts
Is part of
RANLP 2015 : 10th international conference on recent advances in natural language processing, BSNLP 2015 : 5th workshop on Balto-Slavic natural language processing, 10–11 September 2015, Hissar, Bulgaria : proceedings. Shoumen, Bulgaria : INCOMA Ltd., 2015
Date Issued
2015
Publisher
Shoumen, Bulgaria : INCOMA Ltd., 2015
Extent
p. 106-111
Field of Science
Abstract
Statistical analysis of parliamentary roll call votes is an important topic in political science because it reveals the ideological positions of members of parliament (MPs) and factions. However, what it reveals depends on the issues debated and voted upon. Therefore, analysis of carefully selected sets of roll call votes provides deeper knowledge about MPs. Because roll call votes number in the thousands, automatic text classifiers have to be employed to classify them by topic. The task can be formulated as classification of short legal texts in Lithuanian (classification is performed using only the headings of roll call votes). We present results of ongoing research on thematic classification of roll call votes of the Lithuanian Parliament. The problem differs significantly from the classification of long texts because the texts are short and formulaic, which makes feature spaces small and sparse. In this paper we investigate the performance of three feature representation techniques (bag-of-words, n-gram and tf-idf) in combination with Support Vector Machines (with different kernels) and Multinomial Logistic Regression. The best results were achieved using tf-idf with SVMs with linear and polynomial kernels.
Type of document
type::text::journal::journal article::research article
Language
English (en)
Coverage Spatial
Bulgaria (BG)
Description
Conference website: http://lml.bas.bg/ranlp2015/cfp2.php ; http://bsnlp-2015.cs.helsinki.fi/index.html