Designing a grammaticality-judgement-enabled search interface with mayhplumb

Over in :question: What could your fieldwork app do for you?, @mayhplumb describes ideas for a database of sentences that are filterable by grammaticality judgements. I have spun off a separate topic to discuss May’s

So here’s a super lo-fi imagining of what I understand you to be describing:

This is just the search part of your workflow, not the annotation step: it assumes that we’re using a Flextext as an input file. This might be a good first step toward an annotation workflow, where you’re actually transcribing into the app and maybe there is a big asterisk button you could click on each sentence to indicate that it’s ungrammatical.

As the absolute simplest representation of this data, we can imagine a table of transcriptions and grammaticality judgements (obviously, in reality, there might be translations, word-level glosses, etc. etc). Here’s our two-sentence corpus:

| number | transcription | grammatical? |
|---|---|---|
| 1 | mipum ma-tanum niti fu pumum kamum pa | false |
| 2 | min pan muna ma nipa nasa fu nuki katam tanum | true |
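
To make the "filterable by grammaticality" part concrete, here's a minimal Python sketch of that table plus a search that can be restricted by judgement. The names (`Sentence`, `CORPUS`, `search`, `filter_by_judgement`) are made up for illustration, and plain substring matching is an assumption; a real app would probably want something smarter.

```python
from dataclasses import dataclass


@dataclass
class Sentence:
    number: int
    transcription: str
    grammatical: bool


# The two-sentence corpus from the table above.
CORPUS = [
    Sentence(1, "mipum ma-tanum niti fu pumum kamum pa", False),
    Sentence(2, "min pan muna ma nipa nasa fu nuki katam tanum", True),
]


def filter_by_judgement(sentences, grammatical):
    """Keep only the sentences whose judgement matches `grammatical`."""
    return [s for s in sentences if s.grammatical == grammatical]


def search(sentences, query, grammatical=None):
    """Substring search over transcriptions.

    Pass grammatical=True or grammatical=False to restrict the hits by
    judgement; leave it as None to search everything.
    """
    hits = [s for s in sentences if query in s.transcription]
    if grammatical is not None:
        hits = filter_by_judgement(hits, grammatical)
    return hits
```

So `search(CORPUS, "tanum")` matches both sentences, while `search(CORPUS, "tanum", grammatical=False)` only returns the starred one.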

So the drawing above, rendered in HTML, would end up as some kind of HTML element, and CSS could then render that element in essentially any way at all. In the mockup, I added a pink background and a ridiculously gigantic asterisk in a red box. No mistaking that!
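
Here's roughly what I mean, sketched in Python: one function that turns a sentence into an HTML list item, plus some CSS that approximates the mockup. The class names (`sentence`, `ungrammatical`, `star`), the `data-number` attribute, and the exact CSS values are all just my guesses at one way to do it.

```python
from html import escape

# Hypothetical stylesheet approximating the mockup: pink background,
# oversized asterisk in a red box.
STYLE = """
.ungrammatical { background: pink; }
.ungrammatical .star { font-size: 3em; border: 2px solid red; }
"""


def render_sentence(number, transcription, grammatical):
    """Return one <li> element for a sentence.

    Ungrammatical sentences get an extra CSS class and a leading
    asterisk, so the stylesheet decides how loud the marking is.
    """
    css_class = "sentence" if grammatical else "sentence ungrammatical"
    star = "" if grammatical else '<span class="star">*</span>'
    return (
        f'<li class="{css_class}" data-number="{number}">'
        f"{star}{escape(transcription)}</li>"
    )
```

The point of keeping the judgement in a class rather than in the text is that the same markup can be restyled without touching the data.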

Of course, this is just a random idea; you could format it any way at all.

Really, any way at all.

I’m kidding!

So I’m talking about all this Flextext business because it would be possible to get something like this going pretty quickly, I reckon. Rendering existing data is generally easier than building an input system for new data. But I’d be happy to do some brainstorming :brain: :cloud_with_lightning_and_rain: with you about what a grammaticality-enabled transcription interface might look like.

About those Flex notes…

Local hero @sunny was just showing me some flextexts that had Note fields, and from what I understand of them I can see why they are frustrating. There’s nothing to distinguish one note from another in the XML:

<item type="note" lang="en">A</item>
<item type="note" lang="en">Phonetic</item>

Apart from their text content, those “items” are identical (same type, same lang), which is… weird. So it might take some doing to figure out how to parse that information out of the flextext. But it should be doable via the content, I guess, if the grammaticality note element can only contain “grammatical” or “ungrammatical”, say.
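
Here's a rough Python sketch of that "go by the content" idea, using the standard library's `xml.etree.ElementTree`. Big caveats: the sample XML is a simplified, hand-written fragment based on my understanding of how flextexts nest phrases and items (I haven't checked it against a schema), and the convention that a note says exactly “grammatical” or “ungrammatical” is the assumption from the paragraph above, not something Flex enforces. The function names are made up.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical flextext fragment: real files carry much more.
SAMPLE = """
<document>
  <interlinear-text>
    <paragraphs>
      <paragraph>
        <phrases>
          <phrase>
            <item type="segnum" lang="en">1</item>
            <item type="note" lang="en">Phonetic</item>
            <item type="note" lang="en">ungrammatical</item>
          </phrase>
          <phrase>
            <item type="segnum" lang="en">2</item>
            <item type="note" lang="en">grammatical</item>
          </phrase>
          <phrase>
            <item type="segnum" lang="en">3</item>
            <item type="note" lang="en">A</item>
          </phrase>
        </phrases>
      </paragraph>
    </paragraphs>
  </interlinear-text>
</document>
"""

# Assumed convention: a judgement note contains exactly one of these words.
JUDGEMENTS = {"grammatical": True, "ungrammatical": False}


def judgement_from_notes(phrase):
    """Return True/False if any note on the phrase is a judgement, else None."""
    for note in phrase.findall("item[@type='note']"):
        text = (note.text or "").strip().lower()
        if text in JUDGEMENTS:
            return JUDGEMENTS[text]
    return None  # no note looked like a judgement


def read_judgements(xml_text):
    """Map each phrase's segment number to its judgement (or None)."""
    root = ET.fromstring(xml_text.strip())
    return {
        phrase.findtext("item[@type='segnum']"): judgement_from_notes(phrase)
        for phrase in root.iter("phrase")
    }
```

Notes like “A” or “Phonetic” simply fall through, so a phrase with no judgement note comes back as `None` rather than being guessed at.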