SIGMORPHON 2023 Shared Task on Interlinear Glossing

Of interest, via the SIGMORPHON list and GitHub:

We are organizing a shared task on automated generation of interlinear glossed text (IGT) at the 2023 workshop of the ACL Special Interest Group on Computational Morphology and Phonology (SIGMORPHON).

Interlinear glossed text is a major type of annotated data produced in the course of linguistic fieldwork. For many low-resource languages, it is the only form of annotated data available for NLP work. Creating glossed text is, however, a laborious endeavor, and this shared task investigates methods to (fully or partially) automate the process.

In this task, participants build systems that generate morpheme-level grammatical descriptions of input sentences following the Leipzig glossing conventions. The input to the glossing system consists of (1) a sentence in the target language and (2) a translation of the target sentence into a language of wider communication, often English. More details are available in our task repo: sigmorphon/2023glossingST on GitHub.
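To make the input/output relationship concrete, here is a toy instance in Python. The field names and the French sentence are purely illustrative; see the task repo for the actual data format.

```python
# One IGT instance: transcription and translation are the inputs,
# the gloss line is what the system must predict.
# (Field names here are illustrative, not the task's real file format.)
example = {
    "transcription": "les chiens dorment",
    "translation": "the dogs are sleeping",
    "gloss": "DEF.PL dog-PL sleep-PRS.3PL",
}

# Following the Leipzig conventions, the gloss line aligns one-to-one
# with the tokens of the transcription.
assert len(example["transcription"].split()) == len(example["gloss"].split())
```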


In the task repo, we provide the code, results, and downloadable trained models for our baseline system. We also provide the evaluation script and other helpful scripts for loading and processing the task data, to facilitate easy building of novel systems.

Training Conditions

The task has two tracks. In the closed track, systems are trained solely on input sentences and their glosses. In the open track, systems may additionally use morphological segmentations at training time, along with any other data and resources (including dictionaries and pretrained language models); the only exception is additional interlinear glossed data, which is not allowed. For the open track, we also provide extra information, such as POS tags, for a subset of the languages.


The main evaluation metric for the competition is token accuracy: systems are evaluated on generating fully glossed tokens (chiens → dog-PL). We will also separately evaluate glossing accuracy on bound morphemes (like PL) and free morphemes, i.e. stems (like dog).
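The evaluation script in the task repo is the authoritative implementation, but the two levels of evaluation can be sketched roughly as follows (a simplified reimplementation, assuming hyphen-separated Leipzig-style glosses and pre-aligned token lists):

```python
def token_accuracy(gold, pred):
    """Fraction of tokens whose full gloss string matches exactly."""
    pairs = list(zip(gold, pred))
    return sum(g == p for g, p in pairs) / len(pairs)

def morpheme_accuracy(gold, pred):
    """Accuracy over individual morpheme glosses within aligned tokens."""
    correct = total = 0
    for g_tok, p_tok in zip(gold, pred):
        g_morphs = g_tok.split("-")
        p_morphs = p_tok.split("-")
        for i, g in enumerate(g_morphs):
            total += 1
            if i < len(p_morphs) and p_morphs[i] == g:
                correct += 1
    return correct / total

gold = ["dog-PL", "sleep-PRS"]
pred = ["dog-PL", "sleep-PST"]
token_accuracy(gold, pred)     # 0.5  (one of two tokens fully correct)
morpheme_accuracy(gold, pred)  # 0.75 (dog, PL, sleep correct; PRS missed)
```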

Important Dates and Deadlines

  • April 1: Release of surprise language training and development data
  • April 24: Release of test data for all languages
  • April 27: Test predictions should be submitted to organizers
  • May 15: System description paper submission deadline
  • May 25: Notification of paper acceptance
  • May 30: Camera ready deadline for system description papers

Organizers

  • Michael Ginn (University of Colorado)
  • Mans Hulden (University of Colorado)
  • Sarah Moeller (University of Florida)
  • Garrett Nicolai (University of British Columbia)
  • Alexis Palmer (University of Colorado)
  • Miikka Silfverberg (University of British Columbia)
  • Anna Stacey (University of British Columbia)

By the way, @SarahRMoeller and @alexispalmer are local heroes and I suspect might be willing to field questions. :slight_smile:

I’d consider myself a “linguistic data guy”, and I’m big on IGT. I know the Leipzig Glossing Rules by heart and am working on a morphological parser for Yawarana. I am fluent in Python and regularly manipulate large amounts of IGT using pandas.
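For instance, with IGT in a “one word per row” pandas frame, counting grammatical labels across a corpus takes only a few lines (the column names and data below are made up, not any standard IGT schema):

```python
import pandas as pd

# A toy IGT table, one word per row (columns are illustrative).
df = pd.DataFrame({
    "sentence_id": [1, 1, 2],
    "word": ["chiens", "dorment", "chat"],
    "gloss": ["dog-PL", "sleep-PRS", "cat"],
})

# Count how often each grammatical label appears across all glosses.
labels = (
    df["gloss"]
    .str.split("-")                  # split tokens into morpheme glosses
    .explode()                       # one morpheme gloss per row
    .loc[lambda s: s.str.isupper()]  # Leipzig convention: labels are uppercase
    .value_counts()
)
labels.to_dict()  # {'PL': 1, 'PRS': 1}
```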

However, I have absolutely zero experience in “actual” NLP, mainly because I am not interested in big languages. How can I participate in / contribute to this, and how would somebody like me get started in NLP, @SarahRMoeller and @alexispalmer?


@fmatter I confess that I was thinking some similar thoughts.

I wonder if there is room for some sort of track about “next-to-no resource” languages — languages where there just isn’t enough data to do much NLP at all, and the primary computational question boils down to questions about user interfaces for producing very early stage (and to some extent uncertain) corpora that are nonetheless as granularly annotated as possible?

Sorry for the late reply, but this task happens every year, so the question is still worth answering :slight_smile: Shared tasks are great for discovering and testing new algorithms for NLP in low-resource settings. In that way, they are a better fit for computer science teams. I worked with a team one year and felt that, as a linguist, I was able to add little. One way to be involved is to recruit a team from, say, your computer science department, and see what they can do.

This year, my contribution was pretty much to provide the endangered language data. So that is another way OWLs (ordinary working linguists) can be involved.

What I loved this year is that we had a track specifically for field data (most years the data is highly curated). Miikka Silfverberg headed that up. The papers should be posted by the end of this month, and hopefully we will gain some good insights about working with documentary data.

From these competitions we learn that so much is possible with low-resource and even next-to-no-resource languages. But unfortunately, without software interfaces, the algorithms remain mostly inaccessible to OWLs.