Please use this identifier to cite or link to this item: https://hdl.handle.net/20.500.12259/50538
Type of publication: Article in peer-reviewed foreign international conference proceedings (P1d)
Field of Science: Informatics (N009)
Author(s): Kapočiūtė-Dzikienė, Jurgita; Nøklestad, Anders; Johannessen, Janne Bondi; Krupavičius, Algis
Title: Exploring features for named entity recognition in Lithuanian text corpus
Is part of: NODALIDA 2013 : proceedings of the 19th Nordic conference of computational linguistics, May 22–24, 2013, Oslo university, Norway / eds. Stephan Oepen, Kristin Hagen, Janne Bondi Johannessen. Linköping : Linköping University Electronic Press, 2013
Extent: p. 73-88
Date: 2013
Series/Report no.: (NEALT Proceedings, Vol. 16; ISSN 1650-3740)
Note: ISSN (print): 1650-3686
Keywords: Named entity recognition; Named entity classification; Supervised machine learning; Lithuanian language
ISBN: 9789175195896
Abstract: Despite the existence of effective methods that solve named entity recognition tasks for widely used languages such as English, there is no clear answer as to which methods are the most suitable for languages that are substantially different. In this paper we attempt to solve a named entity recognition task for Lithuanian, using a supervised machine learning approach and exploring different sets of features in terms of orthographic and grammatical information, different windows, etc. Although the performance is significantly higher when language-dependent features based on gazetteer lookup and automatic grammatical tools (part-of-speech tagger, lemmatizer or stemmer) are taken into account, we demonstrate that the performance does not degrade when features based on grammatical tools are replaced with affix information only. The best results (micro-averaged F-score = 0.895) were obtained using all available features, but the results decreased by only 0.002 when features based on grammatical tools were omitted.
Internet: http://www.ep.liu.se/ecp/085/011/ecp1385011.pdf
Affiliation(s): Kauno technologijos universitetas
Appears in Collections: University Research Publications
