Has anyone tried documenting via phone call?

I’m facing these problems while trying to document my heritage language:

  1. The speakers are scattered across 18 villages. Some are too far for me to visit.
  2. Most of the fluent speakers are baby boomers who are not well equipped with the newest technology. Most of them do have a mobile phone, but they use it only for calls.

I’m thinking of doing remote documentation by phone call and capturing the audio with a call recorder, but I’m a bit unsure about the audio quality. Has anyone tried documenting via phone call or something similar?


I find that WhatsApp works better because speakers can record directly on their own device and send you the recording. Recording the phone call on your end would result in pretty bad audio.

I’d be on the fence about using such data for phonetic analysis, but anything else should be fine with WhatsApp recordings - I’ve used this method myself to confirm lexical items for languages I am already somewhat familiar with.


Nancy Dorian documented speakers of Scottish Gaelic over the phone starting in, I think, the 1980s. I’m not sure if she ever wrote something about how exactly she went about that but I have a suspicion that 1) she was mostly doing elicitation at that point (it was later in her career and she’d already done primary fieldwork in situ) and 2) she wasn’t recording the conversations but rather writing down (i.e. transcribing on paper) the answers.

My student Alexander Rice is currently working with speakers of a language in Ecuador via Zoom and making local recordings on his computer (via Zoom, but also simultaneously via Audacity, because Zoom recordings are always in a compressed format). He has a young local collaborator who manages the technology on the other end, so one possibility would be to find a local young person who can help.

All of this to say, I think what you propose is definitely doable. It depends on what the goals of your project are. For phonetic analysis, probably not the best way to gather data (as @faytak said) but for anything else, I think this might work.


Hi @Daanisy, welcome to the forum and thanks for your question.

I mentioned this topic on Twitter and there are a few comments there you might also be interested in:



@aryaman I think I recall you using WhatsApp for some work you did on Kholosi, right?


@Daanisy, you might also want to take a look at this article by local hero @rgriscom:

It’s not focused exclusively on phone-based approaches, but there is a lot of useful, relevant information in there.

(By the way, I just looked up the languages you list on your profile. Very interesting! I didn’t realize Philippine languages are spoken so far south, in Indonesia. Neat!)


Anyone who is interested in this topic should look at the following resources:



Thank you for the tips! By “phone call” I meant the kind that doesn’t use any internet connection, as the network is still bad in some villages. But I’ll consider using WhatsApp where it’s available 🙂


My understanding is that internet-based calling would generally be classified as VoIP. Traditional phone networks were based on wires and carried only a limited frequency range, and there are several well-known audio corpora recorded over them. As far as I know, they all show the audio signal cut off at the edges of that range because of the limits of the technology. Modern phone networks, i.e. cell phone networks, use wireless technologies. Their speakers and microphones do have lower limits than something like the Zoom H4n, but my understanding is that the dynamic range of these devices is higher than that of the wired networks. I also understand that cell phone calls can be dynamically converted to VoIP by the carrier anywhere in the transmission chain, so it might be impossible to rule VoIP out of the transmission chain.

However, in a methods section or data description, it would be preferable to state that the data was collected over a 3G (or other type of) network, along with the types of devices used in elicitation. It is always nice if the frequency response of the devices used during elicitation is added to any documentation provided with the audio artifacts or analysis reports.
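One way to keep track of that kind of information is a small metadata file saved next to each recording. Here is a minimal sketch in Python, not an established standard: the class name, the field names, and all the example values (file name, devices, sample rate) are made up for illustration and should be replaced with whatever your project actually uses.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional, Tuple


@dataclass
class CallRecordingMetadata:
    """Hypothetical sidecar record; field names are illustrative only."""
    audio_file: str
    network_type: str        # e.g. "3G", "landline", "VoIP", or "unknown"
    speaker_device: str      # handset used by the speaker
    recorder_device: str     # device or app that captured the audio
    sample_rate_hz: Optional[int] = None
    # (low, high) in Hz, only if the manufacturer documents it
    device_frequency_response_hz: Optional[Tuple[int, int]] = None


def to_sidecar_json(meta: CallRecordingMetadata) -> str:
    """Serialize the record as JSON text to store beside the audio file."""
    return json.dumps(asdict(meta), indent=2, sort_keys=True)


# Placeholder values, not real project data.
example = CallRecordingMetadata(
    audio_file="session_001.wav",
    network_type="3G",
    speaker_device="unknown feature phone",
    recorder_device="call recorder app (placeholder)",
    sample_rate_hz=8000,
)
sidecar = to_sidecar_json(example)
print(sidecar)
```

Leaving unknown values as `None` (written out as `null`) rather than guessing keeps it visible later which details were never recorded.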