Use this URL to cite this publication: https://hdl.handle.net/20.500.12259/53516
Corpus of contemporary Lithuanian language – the standardised way
Type of publication
Article in conference proceedings in Web of Science and Scopus database (P1a)
Author(s)
Author | Affiliation | |
---|---|---|
LT | ||
LT | ||
LT | ||
LT | ||
LT |
Title [en]
Corpus of contemporary Lithuanian language – the standardised way
Is part of
Human language technologies – the Baltic perspective: proceedings of the 4th international conference Baltic HLT, 2010 / editors Inguna Skadiņa, Andrejs Vasiļjevs. Amsterdam : IOS press, 2010
Date Issued
2010
Publisher
Amsterdam : IOS press, 2010
Extent
p. 154-160
Abstract (en)
The paper presents the development of the 160-million-word Corpus of Contemporary Lithuanian Language (CCLL), with standardisation as the focus of the current development phase. It describes the problems encountered, and the solutions adopted, in converting the CCLL from a proprietary format into a standardised one. Challenges in encoding the corpus according to the Text Encoding Initiative (TEI) P5 Guidelines are addressed, covering the document metadata, text structure and morphological annotation levels already implemented in the CCLL. Future perspectives for corpus development are discussed.
Series/Report no.
(Frontiers in Artificial Intelligence and Applications, vol. 219, ISSN 0922-6389)
Type of document
type::text::journal::journal article::research article
Language
English (en)
Coverage Spatial
Netherlands (NL)
ISBN (of the container)
9781607506409
ISSN (of the container)
0922-6389
WOS
WOS:000321983200021
Other Identifier(s)
VDU02-000008962