❓ What could your fieldwork app do for you?

Many of us here are interested in developing new tech to support fieldwork, so I thought it’d be nice to start an open-ended discussion about what an ideal fieldwork app would do. Many people use ELAN and FLEx for their documentation work, and the developers of those apps deserve immense respect for their groundbreaking, long-lived work in this space, but I know plenty of people whose needs aren’t entirely served by either app.

So let’s pose the question: how do ELAN, FLEx, and other tools serve your needs at the moment, and in your ideal world, what would your app do for you? While I haven’t conducted serious fieldwork myself, I’ll get this thread started by sharing some thoughts I’ve heard from fieldworkers I know.

ELAN: nice because it has very configurable tiers and time-alignable annotations, but lacks support for “project-wide” data such as orthographies and morphemes/lemmas.

FLEx: has nice support for “project-wide” data like morphemes and lemmas, enabling bulk edits, but morpheme segmentation doesn’t always behave as expected depending on the orthography (especially for tonal languages), syncing data with collaborators isn’t always straightforward, and it can be difficult to install.

As far as a new app goes, here are some ideas that have been in my head:

  • “Free” dictionary: if an annotation app were implemented as a web application, it would sit on top of a database (such as SQLite). Since dictionary websites/apps are primarily read-only, they could be served from that same database, giving anyone using the app a low-maintenance way to also offer their community a dictionary (a rough sketch follows this list).
  • ML integration: there’s been a lot of work recently in the computational community on automating tasks fieldworkers have to do, such as audio transcription and interlinearization. It’s generally hard to integrate these algorithms into existing apps, but a good one would allow machine learning models to be integrated into the app so that e.g. when an audio file is uploaded it is automatically transcribed, or when a word is entered it is automatically interlinearized.
  • Import/export API: there are some formats that are quite common, such as EAF or FLEx XML, but people will inevitably have to import or export their data using less common formats, and the best solution for a case like this is to provide a simple Python or JavaScript API for people to use to import or export their data, assuming a minimum of programming knowledge.
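
To make the dictionary idea concrete, here’s a rough sketch of what the read-only layer could look like, assuming (purely for illustration) that the app keeps its lexicon in an SQLite table called lexemes and uses a small web framework like Flask. None of these names come from an existing tool.

```python
# Hypothetical sketch: a read-only dictionary endpoint served from the same
# SQLite database the (imagined) annotation app writes to. The table and
# column names ("lexemes", "headword", "gloss", "pos") are made up.
import sqlite3
from flask import Flask, jsonify, request

DB_PATH = "fieldwork.db"  # assumed location of the annotation app's database
app = Flask(__name__)

def query_entries(prefix):
    # Open the database read-only so the dictionary can never touch the data.
    conn = sqlite3.connect(f"file:{DB_PATH}?mode=ro", uri=True)
    conn.row_factory = sqlite3.Row
    rows = conn.execute(
        "SELECT headword, gloss, pos FROM lexemes "
        "WHERE headword LIKE ? ORDER BY headword LIMIT 50",
        (prefix + "%",),
    ).fetchall()
    conn.close()
    return [dict(r) for r in rows]

@app.route("/entries")
def entries():
    # e.g. GET /entries?q=ka -> the first 50 headwords starting with "ka"
    return jsonify(query_entries(request.args.get("q", "")))

if __name__ == "__main__":
    app.run()
```

Because the connection is opened read-only, a community-facing dictionary built this way could never corrupt the fieldwork data, and it stays up to date automatically as the lexicon grows.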

Everything you mentioned sounds good! Just to echo what you said, my wishlist would include: the interlinearization and project-wide features of FLEx, the flexibility and time-alignment features of ELAN, built-in support for creating and using machine learning models, support for non-specialist interfaces, and collaborative editing like Google Docs.

In fact, collaboration functions are increasingly a big need, IMO. If we really want to integrate communities into language documentation then we need tools that aren’t designed for solo researchers.

Also, one aspect of digital fieldwork that has yet to be fully explored and developed is the use of mobile technologies, especially for data collection and transcription/translation tasks. Many endangered language communities live in areas with limited electricity, so laptop use may be restricted or not even possible. A mobile app that works well with a computer app would be really nice. I’ve mostly been using spreadsheet apps with a bluetooth keyboard for linguistic data and Open Data Kit for metadata, and then exporting to CSV for use on the computer. An app like Aikuma that allows community members to create transcriptions and translations for data would be great.

ML integration: there’s been a lot of work recently in the computational community on automating tasks fieldworkers have to do, such as audio transcription and interlinearization. It’s generally hard to integrate these algorithms into existing apps, but a good one would allow machine learning models to be integrated into the app so that e.g. when an audio file is uploaded it is automatically transcribed, or when a word is entered it is automatically interlinearized.

ELAN does have the “recognizer” function which can be used for audio transcription, voice activity detection, etc. I don’t know how widely it is used, though, and I don’t think there is support for external interlinearizers (parsers, etc.). You are right that it would be great to have the flexibility to easily integrate ML models, and then to easily use data opened in the app to re-train models. You should be able to simply choose e.g. Montreal Forced Aligner and select a model, press “run”, make some adjustments to the output of the model, and then press “re-train model”. Rinse and repeat!
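
To sketch what that flexibility might look like under the hood: the app could expose a small plugin interface that every external tool (a forced aligner, a transcriber, an interlinearizer) gets wrapped in, so the run → correct → re-train loop looks the same no matter which model is behind it. All names below are invented for illustration; nothing here is an actual API of ELAN, FLEx, or the Montreal Forced Aligner.

```python
# Illustrative plugin interface for "run, correct, re-train" -- not any
# existing tool's API.
from abc import ABC, abstractmethod
from pathlib import Path

class MLPlugin(ABC):
    @abstractmethod
    def run(self, audio: Path, tiers: dict) -> dict:
        """Produce draft annotations (e.g. time-aligned words) for human review."""

    @abstractmethod
    def retrain(self, corrected: list[dict]) -> None:
        """Update the underlying model from human-corrected output."""

class ForcedAlignerPlugin(MLPlugin):
    """Wraps some external aligner; the actual command line or library call
    is tool- and version-specific, so it is left as a placeholder here."""

    def run(self, audio, tiers):
        # e.g. call the aligner with the audio plus the current transcript,
        # then parse its output back into the app's annotation format.
        return {"words": []}  # draft alignment, to be corrected in the app

    def retrain(self, corrected):
        # Export the corrected segments to the aligner's training format and
        # kick off (re)training -- again tool-specific.
        pass

# The loop from above would then be roughly:
#   draft = plugin.run(audio_file, tiers)   # press "run"
#   edits = user_corrects(draft)            # adjust the output
#   plugin.retrain(edits)                   # press "re-train model"
```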

Import/export API: there are some formats that are quite common, such as EAF or FLEx XML, but people will inevitably have to import or export their data using less common formats, and the best solution for a case like this is to provide a simple Python or JavaScript API for people to use to import or export their data, assuming a minimum of programming knowledge.

Yes, ELAN is pretty good with its export functions (CSV, TextGrid, etc.), but notoriously bad for importing text data. I made a Python script to combine timecode data from a TextGrid and text data from a CSV into an EAF file, because importing the data through ELAN itself took so much clicking. I’m not as familiar with the other, less common text formats linguists may be using; what did you have in mind?
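
Not my exact script, but the EAF-writing half of that kind of glue is pretty short with pympi (pympi-ling), which is just one option among several. For the sake of the example I’m assuming the TextGrid interval times have already been merged into the same CSV as the text, under made-up column names “start”, “end” (in seconds), and “transcription”.

```python
# Minimal sketch: turn a CSV of (start, end, transcription) rows into an EAF
# using pympi. Column names and file paths are placeholders.
import csv
from pympi.Elan import Eaf

eaf = Eaf()                      # a new, empty EAF document
eaf.add_tier("transcription")    # tier to hold the transcribed items

with open("elicitation.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        start_ms = int(float(row["start"]) * 1000)  # EAF times are in milliseconds
        end_ms = int(float(row["end"]) * 1000)
        eaf.add_annotation("transcription", start_ms, end_ms, row["transcription"])

eaf.add_linked_file("session01.wav")  # point ELAN at the original recording
eaf.to_file("session01.eaf")
```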

Another mobile app idea I’ve had in mind is specifically for elicitation. It would basically consist of a spreadsheet for entering transcriptions and translations, paired with a timer that is synced with the audio recorder. You press play on the audio recorder and start the timer simultaneously, then press a button in the app to manually create start and end times for each elicited item (like the “lap” button on a stopwatch). You could easily delete or change any start/end times during or after recording (e.g. if the speaker makes a mistake and you want to repeat an item). This would be an alternative to automated segmentation of the audio, and hopefully one that’s even more time-efficient, accurate, and accessible for non-tech-oriented linguists.
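
The lap logic itself is tiny; here’s a toy sketch just to show the idea (the names are illustrative, and a real mobile app would of course do this inside its own UI framework).

```python
# Toy sketch of the "lap button" logic: keep timestamps relative to when the
# recorder was started, and pair consecutive presses into provisional
# start/end times for each elicited item.
import time

class LapTimer:
    def __init__(self):
        self.t0 = None
        self.laps = []

    def start(self):
        # Call at the same moment the audio recorder is started.
        self.t0 = time.monotonic()
        self.laps = [0.0]

    def lap(self):
        # One press per item boundary.
        self.laps.append(time.monotonic() - self.t0)

    def segments(self):
        # Consecutive laps become (start, end) pairs in seconds; any pair can
        # later be edited or deleted (e.g. if an item had to be repeated).
        return list(zip(self.laps, self.laps[1:]))
```

Each resulting pair could then be fed straight into something like the CSV-to-EAF step above once you’re back at a computer.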
