Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12259/49977
Type of publication: Article in peer-reviewed foreign international conference proceedings (P1d)
Field of Science: Informatics (N009)
Author(s): Mickevičius, Vytautas; Krilavičius, Tomas; Morkevičius, Vaidas
Title: Classification of short legal Lithuanian texts
Is part of: RANLP 2015 : 10th international conference on recent advances in natural language processing, BSNLP 2015 : 5th workshop on Balto-Slavic natural language processing, 10–11 September 2015, Hissar, Bulgaria : proceedings. Shoumen, Bulgaria : INCOMA Ltd., 2015
Extent: p. 106-111
Date: 2015
Note: Conference website: http://lml.bas.bg/ranlp2015/cfp2.php ; http://bsnlp-2015.cs.helsinki.fi/index.html
Keywords: Classification of votes; Lithuanian texts; Classification of titles of votes
ISBN: 9789544520335
Abstract: Statistical analysis of parliamentary roll call votes is an important topic in political science because it reveals the ideological positions of members of parliament (MPs) and factions. However, the positions revealed depend on the issues debated and voted upon. Analysis of carefully selected sets of roll call votes therefore provides deeper knowledge about MPs. Because roll call votes number in the thousands, automatic text classifiers have to be employed to classify them by topic. The task can be formulated as classification of short legal texts in Lithuanian, since classification is performed using only the headings of roll call votes. We present results of ongoing research on thematic classification of roll call votes of the Lithuanian Parliament. The problem differs significantly from the classification of long texts: because the texts are short and formulaic, feature spaces are small and sparse. In this paper we investigate the performance of three feature representation techniques (bag-of-words, n-gram and tf-idf) in combination with Support Vector Machines (with different kernels) and Multinomial Logistic Regression. The best results were achieved using tf-idf with SVMs with linear and polynomial kernels.
Internet: http://bsnlp-2015.cs.helsinki.fi/bsnlp2015-book.pdf
Affiliation(s): Baltic Institute of Advanced Technology
Baltic Institute of Advanced Technology, Vilnius
Faculty of Informatics
Kaunas University of Technology
Department of Applied Informatics
Vytautas Magnus University
Appears in Collections: University Research Publications

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.