I have a question that arose from reading this article:
Thieberger, N. 2004. Documentation in practice: Developing a linked media corpus of South Efate. Language Documentation and Description 2. 169–178.
It’s a bit old, and we don’t usually digitize from tapes any more.
Nonetheless, I did not realize that there was such a thing as a “variable bit rate encoding” for MP3s that could affect the accuracy of timestamps:
On my return from fieldwork I began digitising my analogue tapes using the built-in soundcard of a desktop computer. This was a mistake! I ended up with digital audio files that I then used to align with the transcript. However, these were not good quality audio files because the computer’s soundcard is simply not adequate to the task. Thus, when the opportunity arose to have the analogue tapes digitised at a higher, and archival, resolution it resulted in my having two versions of the digital data. These two digital versions of the same tape did not correspond in length due both to stretching of the audio tape, and to being played on different cassette players, with slightly different playback speeds. There was no simple correlation between the timecodes in the old and the archival version. While I linked all subsequent transcripts to the archival version of the audio file, due to the time constraints of dissertation writing I have kept the non-archival versions for presentation of the thesis data. Archival versions have been lodged with PARADISEC.
The crucial lesson from this experience is to digitise field tapes at the best (archival) resolution possible and then use those files (or a down-sampled version, such as MP3 if the original is too large), as the basis for linking to transcripts. To produce the best quality digitisation, it is recommended to use external soundcards that avoid computer noise, that is, if you don’t have a friendly digitisation project at a campus near you. It is also worth keeping up with the technology to find out about new methods and media for doing this kind of work.
[footnote 10] MP3 files can be indexed by timecodes as long as the mp3 files are not encoded with a variable bitrate. Note that MP3 is absolutely not a suitable format for recording or archiving.
Does anyone know if this is still a thing? Is there a way to determine whether a given MP3 file is of this type? We’re always warned not to use MP3s, and I use WAVs, but the idea of corrupted timestamps really compounds the sense of danger!
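For what it's worth, here is a rough sketch of how such a check might work, in case it helps frame the question. An MP3 file is a sequence of frames, and each frame header states its own bitrate, so a file in which that value changes from frame to frame is presumably variable bitrate. The function names (`frame_bitrates`, `looks_vbr`) are my own inventions, the sketch handles Layer III frames only, and it does not inspect the Xing/VBRI header that many encoders write into the first frame. Treat it as an illustration under those assumptions rather than a reliable tool.

```python
# Layer III bitrate tables in kbit/s, indexed by the 4-bit bitrate field.
# Index 0 ("free format") and index 15 (invalid) are treated as unusable.
BITRATES_V1_L3 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320, 0]
BITRATES_V2_L3 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160, 0]

# Sample rates in Hz, keyed by the 2-bit MPEG version field.
SAMPLE_RATES = {
    3: [44100, 48000, 32000],  # MPEG-1
    2: [22050, 24000, 16000],  # MPEG-2
    0: [11025, 12000, 8000],   # MPEG-2.5
}


def skip_id3v2(data):
    """Return the offset just past a leading ID3v2 tag, or 0 if there is none."""
    if len(data) >= 10 and data[:3] == b"ID3":
        # The tag size is stored as four 7-bit ("synchsafe") bytes.
        size = ((data[6] & 0x7F) << 21) | ((data[7] & 0x7F) << 14) \
             | ((data[8] & 0x7F) << 7) | (data[9] & 0x7F)
        footer = 10 if data[5] & 0x10 else 0
        return 10 + size + footer
    return 0


def frame_bitrates(data, max_frames=2000):
    """Collect the bitrate (kbit/s) declared by each Layer III frame header."""
    bitrates = []
    pos = skip_id3v2(data)
    while pos + 4 <= len(data) and len(bitrates) < max_frames:
        b1, b2 = data[pos + 1], data[pos + 2]
        is_sync = data[pos] == 0xFF and (b1 & 0xE0) == 0xE0
        version = (b1 >> 3) & 0x3      # 3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5
        layer = (b1 >> 1) & 0x3        # 1 = Layer III
        br_index = b2 >> 4
        sr_index = (b2 >> 2) & 0x3
        padding = (b2 >> 1) & 0x1
        if not is_sync or version == 1 or layer != 1 or sr_index == 3:
            pos += 1                   # not a usable header; resync byte by byte
            continue
        table = BITRATES_V1_L3 if version == 3 else BITRATES_V2_L3
        kbps = table[br_index]
        if kbps == 0:
            pos += 1
            continue
        sample_rate = SAMPLE_RATES[version][sr_index]
        # Layer III frame length in bytes: 144 (MPEG-1) or 72 (MPEG-2/2.5)
        # times bitrate over sample rate, plus one optional padding byte.
        coeff = 144 if version == 3 else 72
        frame_len = coeff * kbps * 1000 // sample_rate + padding
        bitrates.append(kbps)
        pos += frame_len               # jump to where the next header should be
    return bitrates


def looks_vbr(data):
    """True if the scanned frames declare more than one distinct bitrate."""
    return len(set(frame_bitrates(data))) > 1
```

One would call it as `looks_vbr(open("recording.mp3", "rb").read())`, where `recording.mp3` is a placeholder name. It also hints at why the footnote matters: at a constant bitrate every second of audio takes the same number of bytes, so a timecode maps to a position in the file by simple multiplication, whereas with varying frame sizes that shortcut no longer holds.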