HTML markup for an Interlinear Glossed Sentence: Richard Ishida

Richard Ishida is the Internationalization Activity Lead at the W3C, one of the organizations that standardizes HTML and CSS and the rest of the web platform (the other organization is the WHATWG… long story).

He has a recent blog post on marking up interlinear glossed text as HTML, and it looks good to me:

Here is his HTML markup for an example in Old Ge’ez (it’s a biblical quote, but he says “not to read into that” :innocent:):

<div class="multilineGlossedText">
  <div class="stack">    
    <span class="legend">Ge'ez:</span>
    <span class="legend">Pronunciation:</span>
    <span class="legend">Gloss:</span></div>
  <div class="stack">
    <span class="base" lang="gez">ወሶበ፡</span>
    <span class="trans">wä-sobä</span>
    <span class="gloss">and-when</span></div>
  <div class="stack">
    <span class="base" lang="gez">ሰማዐ፡</span>
    <span class="trans">sämʾä</span>
    <span class="gloss">heard.he</span></div>
  <div class="stack">
    <span class="base" lang="gez">ኢሳይያሰ፡</span>
    <span class="trans">ʾIsayəyyas</span>
    <span class="gloss">ʾIsayəyyas</span></div>
  <div class="stack">
    <span class="base" lang="gez">ለንጉሥ፡ ...</span>
    <span class="trans">lä-nəguś ...</span>
    <span class="gloss">to-king ...</span></div>
</div>

And here is the CSS:

.multilineGlossedText {
    display: flex;
    flex-direction: row;
    flex-wrap: wrap;
}

.stack {
    display: flex;
    flex-direction: column;
    flex-wrap: nowrap;
    margin-right: .75em;
    margin-top: .5em;
}

.legend {
    font-style: italic;
}

Unfortunately I haven’t figured out how to embed rendered HTML+CSS examples in this forum (yet!), but here is a screenshot (and of course, follow the link to see the real thing):

The Ottoman Turkish example further down is even more amazing, given that it’s a right-to-left language. This really shows the flexibility of the web platform when it comes to rendering text.

There are two key features of this markup. One is the use of the dir=rtl attribute on the <div> tag with the class value multilineGlossedText. (See the MDN documentation for the dir attribute.) This makes the “stack” for each word lay out from right to left, as per the Arabic writing system. That’s because the top line is being used as the baseline in this interlinear.

The other is that, inside each word “stack”, some of the tiers are themselves set to lay out in right-to-left order. The second and third tiers are actually reversed: the third word is aytmş, but it “looks like” şmtya… and yet that’s not “logically” the case! In both of those tiers, the sequence of characters is a, y, t, m, ş. It’s just that markup is being used to change the layout direction from tier to tier.

(Can that even be done in Flex? I have no idea. But the web sure can.)
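One way to find out is to try a stripped-down version. The sketch below is not Ishida's actual markup: it reuses the class names from the Ge'ez example, the words "one", "two", "three" and their glosses are placeholders I made up, and the use of <bdo dir="rtl"> for the reversed tier is my assumption about one standard way to get that effect, not something confirmed from his post. As far as I understand it, a flex row follows the inline direction of its container, so dir="rtl" alone is enough to flip the order of the stacks.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Right-to-left interlinear sketch</title>
<style>
.multilineGlossedText {
    display: flex;
    flex-direction: row;   /* "row" follows the container's text direction */
    flex-wrap: wrap;
}
.stack {
    display: flex;
    flex-direction: column;
    flex-wrap: nowrap;
    margin-inline-end: .75em;  /* logical margin, so the gap follows the direction */
    margin-top: .5em;
}
</style>
</head>
<body>
<!-- dir="rtl" on the flex container: the first stack lands at the far right -->
<div class="multilineGlossedText" dir="rtl">
  <div class="stack">
    <span class="base">one</span>      <!-- placeholder word -->
    <span class="gloss">gloss-1</span> <!-- placeholder gloss -->
  </div>
  <div class="stack">
    <span class="base">two</span>
    <span class="gloss">gloss-2</span>
  </div>
  <div class="stack">
    <!-- Assumption: bdo overrides the bidi algorithm for this tier only.
         The characters are stored as a, y, t, m, ş but painted right to left. -->
    <span class="base"><bdo dir="rtl">aytmş</bdo></span>
    <span class="gloss">gloss-3</span>
  </div>
</div>
</body>
</html>
```

Saved as a file and opened in a browser, this should put "one" at the far right and "three" stack at the far left, with the third word painted as şmtya even though the source still reads aytmş. So the stack ordering comes from flex plus dir, while the per-tier reversal is a separate text-direction matter rather than a flex feature.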

There are also a couple tags here that I confess I’m not familiar with:



Some learnings to do!
