Tools and resources within Danish Sign Language
The editorial staff behind the Dictionary of Danish Sign Language and the Corpus of Danish Sign Language are housed at Copenhagen University College. The corpus is not publicly accessible but functions as an internal tool for the editorial staff.
The corpus consists primarily of one-camera monologues recorded during the first phase of the dictionary project (2003-2008). Recently, a small sub-corpus of Facebook clips from deaf bloggers has been added, as well as short clips provided by a reference group (regarding specific signs).
The corpus is first and foremost used to support the dictionary work, i.e. to supply lemma candidates and evidence of sign use. Therefore, annotations in the corpus are very basic: sign (including variant info) and meaning, but not mouthing, negation, non-manuals, etc.
As both the corpus and the dictionary projects are run by the same project group, the same set of ID-glosses is used in the two projects (i.e. the lexical database of the corpus is the dictionary itself). The dictionary is edited in an MS Access database, while the corpus uses the iLex system.
The project group annotates in two steps: first segmentation, then lemmatization. Currently (March 2021), the group has segmented about 62,000 tokens, of which about 23,000 have also been lemmatized.
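The two-step workflow and the figures above can be illustrated with a minimal sketch. It is not the iLex data model: the `Token` class, its field names, and the `lemmatized_share` helper are hypothetical and serve only to show that a segmented token exists before it receives an ID-gloss, and how the lemmatized share follows from the two reported counts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Token:
    """Hypothetical token record; not the actual iLex schema."""
    start_ms: int                   # set in step 1 (segmentation)
    end_ms: int                     # set in step 1 (segmentation)
    id_gloss: Optional[str] = None  # set in step 2 (lemmatization)

    @property
    def lemmatized(self) -> bool:
        # A token counts as lemmatized once it is linked to an ID-gloss.
        return self.id_gloss is not None


def lemmatized_share(segmented: int, lemmatized: int) -> float:
    """Return the fraction of segmented tokens that are also lemmatized."""
    if segmented <= 0:
        raise ValueError("segmented must be positive")
    if not 0 <= lemmatized <= segmented:
        raise ValueError("lemmatized must be between 0 and segmented")
    return lemmatized / segmented


# Approximate counts reported for March 2021.
share = lemmatized_share(segmented=62_000, lemmatized=23_000)
print(f"{share:.1%}")  # prints "37.1%"
```

With the reported approximate counts, a little over a third of the segmented tokens have been lemmatized.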
Contact information: firstname.lastname@example.org