Abstract
Danish is a North Germanic (Scandinavian) language spoken primarily in Denmark, a country with a tradition of technological and scientific innovation. From a technological perspective, however, the Danish language has received relatively little attention. As a result, Danish language technology is hard to develop, in part because large or broad-coverage Danish corpora are lacking. This paper describes the Danish Gigaword project, which aims to construct a freely available one-billion-word corpus of Danish text that represents the breadth of the written language.
Original language | English |
---|---|
Publisher | ArXiv |
Number of pages | 6 |
Status | Published - May 2020 |
Keywords
- cs.CL
Projects
- 1 Finished
- The Puzzle of Danish
Christiansen, M. H. (Project coordinator), Tylén, K. (Participant), Fusaroli, R. (Participant), Bleses, D. (Participant), Højen, A. (Participant), Trecca, F. (Participant), Dideriksen, C. (Participant) & Ishkhanyan, B. (Participant)
01/09/2017 → 31/08/2020
Projects: Project › Research