UCLA Phonetics Lab Archive

I feel like this site is a gem of web documentation which doesn’t get quite enough attention:


There is a ton of stuff here. (See the whole list of languages — 315!!) UCLA has a long tradition of research in phonetics and phonology, and this site is a lovely archive of that.

An aside about UCLA phonetics

I can’t resist sticking in a few photos of the founder of the lab, the late Peter Ladefoged, who surfed to a ripe old age:


Here he is as a phonetics consultant on the set of My Fair Lady:


A fairly unusual aspect of this archive is that the whole thing is under a Creative Commons license, specifically this one:


This means that as long as you’re not making money off the archive, you can create — indeed, you are encouraged to create — derivative works.

Reusing digital documentation

Once data is digital, assuming reuse is permitted, it becomes raw material. It can be reshaped, repurposed, recombined, re-lots-of-other-things.

Given the archive’s size, it is not surprising that certain language families show up more than once. One that caught my attention was the Fula or Fulani family, of which there are 6 representatives:

  1. Fulfulde, Adamawa
  2. Fulfulde, Maasina
  3. Fulfulde, Nigerian dialect
  4. Fulfulde, Western Niger
  5. Pulaar
  6. Pular

Fulani is a fairly major language and I’m sure there are other documentary resources out there somewhere, but even so, it’s not hard to imagine some comparative work with this content.

It would be fun to try doing something like that down the road, but for now let’s just take a tour of what is available for a given language here.

The specifics vary from language to language, but in general the types of data archived include:

Audio file:

At least some metadata:

Recording 1
Filename (WAV): fuv_word-list_1962_01.wav
Filename (MP3): fuv_word-list_1962_01.mp3
Language: Fulfulde (Nigerian)
Recording Contents: Word List
Recording Location: Unknown; speaker is from Gombe, Bauchi Province, Nigeria
Recording Date: 26 February, 1962
Fieldworkers: Peter Ladefoged
Speakers: Y. Abubakar
WAV Digitization Quality: 44.1 K, 16-bit sound depth (bit rate=705 kbps)
MP3 Bit Rate: 56 kbps
Original Recording Medium: reel tape
Unicode Word List: fuv_word-list_1962_01.html
Unicode Word List Entries: 1 - 28
Tiff Image: fuv_word-list_1962_01.tif
Tiff Image 2: fuv_word-list_1962_02.tif
JPG Image: fuv_word-list_1962_01.jpg
JPG Image 2: fuv_word-list_1962_02.jpg
TIFF Image Quality: 300 dpi
JPG Quality: 300 dpi
Rights of Access: This work is licensed under a Creative Commons License.
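
The 705 kbps figure in that metadata can be checked with a little arithmetic: the bit rate of uncompressed PCM audio is sample rate times bit depth times channel count. Here is a minimal sketch, assuming the recording is mono (the metadata doesn't say so, but one channel is what makes the numbers line up). The function name is just for illustration.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, bit_depth: int, channels: int = 1) -> float:
    """Bit rate of uncompressed PCM audio, in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


# Sample rate and bit depth come from the metadata; mono is an assumption.
wav_kbps = pcm_bit_rate_kbps(44_100, 16, channels=1)
print(wav_kbps)  # 705.6

# Rough ratio between the WAV bit rate and the 56 kbps MP3.
print(round(wav_kbps / 56, 1))  # 12.6
```

So the MP3 carries roughly a twelfth of the bits per second of the WAV, which is worth keeping in mind if you plan to do any acoustic measurement on these files.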

A scan of the original fieldnotes:

These have since been entered as a Unicode IPA transcript (nice!):

Entry Fulfulde (Nigerian)  English 
1  to paadˀaa  Where are you off to? 
2  o barkidˀii  he is fortunate 
3  o bˀamtii  he lifted 
4  no mbaaldudˀaa  Have you had a good night? 
5  o tawii  he found 
6  o dawii  he started early 
7  o dˀaanike  he slept 
8    it's smelly 
9  o ndaarii  he looked 
10  o d̠ʑalii  he laughed 
11    they failed 
12  o ɡooŋdˀii  he told the truth 
13  o ŋatii  he bit 
14  o ʔaawii  he planted 
15  o mahii  he built 
16  o naatii  he entered 
17  o ɲaamii  he ate 
18  o fabˀbˀii  he delayed 
19  o warii  he came 
21  o ʃaawii  he wrapped 
22  o haarii  he is full 
23    he stirred 
24    he is white 
25    How tired are you? 
26  o jarii  he drank 
27    he asked 
28  toje n̠d̠ʑaanodˀaa  Where have you been? 

As you can see, a few forms are missing from the transcript (although, listening to the recording, some of them were actually said).
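
One small derivative project would be to flag these holes automatically. Here is a minimal sketch, assuming the word list has been hand-copied into (entry, form, gloss) tuples. Only entries 17 to 28 from the table above are included, and the function names are hypothetical. It reports entries whose transcription is empty, plus any entry numbers that are skipped altogether.

```python
from typing import List, Tuple

Row = Tuple[int, str, str]  # (entry number, Fulfulde form, English gloss)

# Entries 17-28 copied from the transcript; "" marks a missing form.
ROWS: List[Row] = [
    (17, "o ɲaamii", "he ate"),
    (18, "o fabˀbˀii", "he delayed"),
    (19, "o warii", "he came"),
    (21, "o ʃaawii", "he wrapped"),
    (22, "o haarii", "he is full"),
    (23, "", "he stirred"),
    (24, "", "he is white"),
    (25, "", "How tired are you?"),
    (26, "o jarii", "he drank"),
    (27, "", "he asked"),
    (28, "toje n̠d̠ʑaanodˀaa", "Where have you been?"),
]


def missing_forms(rows: List[Row]) -> List[int]:
    """Entry numbers that have a gloss but no transcribed form."""
    return [number for number, form, _ in rows if not form.strip()]


def skipped_entries(rows: List[Row]) -> List[int]:
    """Entry numbers absent from the list between its first and last entry."""
    seen = {number for number, _, _ in rows}
    if not seen:
        return []
    return [n for n in range(min(seen), max(seen) + 1) if n not in seen]


print(missing_forms(ROWS))    # [23, 24, 25, 27]
print(skipped_entries(ROWS))  # [20]
```

Checking the first list against the audio would show which forms were spoken but never transcribed.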

Anyway, you get the idea. There is a ton here. It’s not hard to imagine derivative projects: what do you imagine?

This is so neat!! Thank you for sharing, I’m honestly surprised I hadn’t heard of this specific archive/db just from my chronic hoarding.
The first thing that comes to mind is the noise in these recordings, which is understandable given the time period. It would be neat to see how the noise shows up, and to try learning how to apply filters and noise reduction by hand in some basic software like Audacity (I think Praat has denoising capabilities too, so all the better). I’ve certainly found myself confused about how denoising cuts out certain things.