A Gallery of Parallel Text Formats

Ever since reading @ejk’s post on digitizing Tunen texts I have been thinking about the notion of the “parallel text format”. I thought I would start a “gallery” thread to build up a collection of examples that might be useful as inspiration for creating digital parallel text formats.

First up is probably the minimal case:

| Layout | Content |
| --- | --- |
| Verso page | Lingala prose transcription |
| Recto page | English prose translation |

Woods, David R. & Fulbert Akouala. 2002. Lingala parallel texts. Dunwoody Press.

Worth noting that recreating this layout from a digitization would require annotating which sentences begin a new paragraph, and which are headings (LINDONGE / The Termite Heap).
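To make that concrete, here is a minimal Python sketch of what such annotations might look like. The `Segment` class, its field names, and `render_side` are all made up for illustration, and the bracketed sentences are placeholders rather than text from the book; only the heading pair comes from the page above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One aligned unit: a source sentence and its translation (hypothetical schema)."""
    source: str
    translation: str
    starts_paragraph: bool = False  # layout annotation: opens a new paragraph
    is_heading: bool = False        # layout annotation: set as a heading


def render_side(segments: List[Segment], side: str) -> str:
    """Rebuild one page side ("source" or "translation") as plain text.

    Headings go on their own line; paragraphs are separated by blank lines.
    """
    if side not in ("source", "translation"):
        raise ValueError("side must be 'source' or 'translation'")
    blocks: List[str] = []
    after_heading = True  # the first sentence always opens a paragraph
    for seg in segments:
        text = seg.source if side == "source" else seg.translation
        if seg.is_heading:
            blocks.append(text)
            after_heading = True
        elif seg.starts_paragraph or after_heading:
            blocks.append(text)
            after_heading = False
        else:
            blocks[-1] += " " + text  # continue the current paragraph
    return "\n\n".join(blocks)


# Placeholder content: only the heading pair is taken from the scanned page.
segments = [
    Segment("LINDONGE", "The Termite Heap", is_heading=True),
    Segment("<Lingala sentence 1>", "<English sentence 1>", starts_paragraph=True),
    Segment("<Lingala sentence 2>", "<English sentence 2>"),
    Segment("<Lingala sentence 3>", "<English sentence 3>", starts_paragraph=True),
]

verso = render_side(segments, "source")
recto = render_side(segments, "translation")
```

With two boolean flags per sentence, both the verso and the recto can be regenerated from a single aligned list, which is the point of storing the layout as annotation rather than as two separate prose files.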


A page from Kashaya Texts by Oswalt:

This is essentially the same layout as the Tunen text above, with the distinction that paragraphs are numbered inline.

| Layout | Content |
| --- | --- |
| Verso page | Kashaya prose transcription |
| Recto page | English prose translation |

It takes an awful lot of work to align these parallel texts, and you’re entirely on your own for morphology. Interestingly, Oswalt also produced an unpublished digital version of the book this screenshot is from (using an idiosyncratic orthography), which was aligned:

5. The Deer and the Bear
(Told by Herman James, August, 1958)


1. ma?al ?ama: dic'i':du duweni' bak^he ?aca? yacol dihqaw^.
This story from the old days was given to the Indians.

muli'do mi:li bu7aqa' q'o bih$e q'o' nohp^how- kulu: ?ama': tol^.
Bears and deer were living in the wilderness.

kulu: ?ama': tol- q^ho: no'hp^ho nohp^how^.
Two families lived in the wilderness.

menin hi?baya' c^hot' i'do q^ho: ?iy^.
Neither had men.

k'awada- ?ul- miya':daq^ha?yacol ?ul- cuhma qamu':muc'ba duhk^huy?^.
They were widowed; the husbands had been killed fighting the enemy.


2. mens'i':lido mul- kuma'ci- bah$a c^he'?e: ti bahnati':c'edu ba7aqa' ?em 
bih$e ?el^.
One day the bear asked the deer if she would go leach buckeye nuts with her.

mens'i:li bih$e ?em "hu':?^" nihcedu q^hama:ti' h$iyi?^.

kulu: mul bah$a c^he'?ep^hila q^hama:ti' h$iyi?^.

"hit'e:ti'm ?amadu'we^.

?amhu'l hit'e:ti'm" nihcedu bih$e ?el^.

"hu':?^" nihcedu mens'i:li bih$e ?i'ma:ta ?emu^.


3.  mens'iba ?ul mul ?amadu'we tubi'hciba ?ama: 7'i:- do?q'o'?diwac'ba- ?ul 
tiya':co?k^he c^he?e?k^he bak^he- buhq^ha'l li bawil?ba ?ul^ da:bi'c^hqa:^ 
bi?da ba'h7^hel tolhq^ha?^.

mens'iba ?ul bahci'l idom ma:ca? nohp^ho': to:- p^hila? mi'lhq^ha? ma:ca?^.

mens'iba mi': ?ul- ?ahq^ha ?i':li ?ul p^hima':c'i?- bu7aqa' ?em- la': he: 
bih$e ?e'mu hlaw^.

Okay, incompletely aligned. Anyway, I just point this out because the interlinear format in the Tunen text has some commonalities with the Kashaya one, and in fact it is probable that a tool for expediting the transcription of the Tunen text could have applications for other languages.
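As a rough illustration of where such a tool might start, here is a small Python sketch that reads text laid out like the Kashaya excerpt above. It rests on assumptions drawn only from that excerpt: sentences are separated by blank lines, a paragraph number like `2. ` may lead a block, a transcription may wrap across lines and ends at the first line ending in `^.`, and whatever follows in the block is the translation. The names `Sentence` and `parse_aligned` are invented here, and the title lines are not handled.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Sentence:
    paragraph: Optional[int]     # most recent paragraph number seen, if any
    transcription: str
    translation: Optional[str]   # None where no translation has been entered


PARA_NUM = re.compile(r"^(\d+)\.\s+")


def parse_aligned(text: str) -> List[Sentence]:
    """Split blank-line-separated blocks into transcription/translation pairs."""
    sentences: List[Sentence] = []
    paragraph: Optional[int] = None
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [ln.strip() for ln in block.splitlines() if ln.strip()]
        if not lines:
            continue
        m = PARA_NUM.match(lines[0])
        if m:
            paragraph = int(m.group(1))
            lines[0] = lines[0][m.end():]
        # Assumption: the transcription ends at the first line ending in "^.".
        # If no line does, treat the whole block as transcription.
        end = next(
            (i for i, ln in enumerate(lines) if ln.endswith("^.")),
            len(lines) - 1,
        )
        transcription = " ".join(lines[: end + 1])
        translation = " ".join(lines[end + 1:]) or None
        sentences.append(Sentence(paragraph, transcription, translation))
    return sentences


SAMPLE = """1. ma?al ?ama: dic'i':du duweni' bak^he ?aca? yacol dihqaw^.
This story from the old days was given to the Indians.

muli'do mi:li bu7aqa' q'o bih$e q'o' nohp^how- kulu: ?ama': tol^.
Bears and deer were living in the wilderness.


2. mens'i':lido mul- kuma'ci- bah$a c^he'?e: ti bahnati':c'edu ba7aqa' ?em
bih$e ?el^.
One day the bear asked the deer if she would go leach buckeye nuts with her.

mens'i:li bih$e ?em "hu':?^" nihcedu q^hama:ti' h$iyi?^.
"""

sentences = parse_aligned(SAMPLE)
untranslated = [s for s in sentences if s.translation is None]
```

Listing the sentences with `translation is None` gives a quick to-do list for the unaligned remainder of a text, which is the sort of bookkeeping that should carry over to other languages as long as the end-of-sentence convention is adjusted.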