Yale Documentation Tool Design Day! Hannah Morrison: the Fulani Comparative Lexicon

For the third talk of our Documentation Tool Design Day, we’ll be hearing from Hannah Morrison about her approach to making an app to visualize a Fulani Comparative Lexicon. Let’s use this thread to discuss the project and ideas that come up!

Here are records for books in ADLaM script in the Fulani language. Some of them currently have encoding issues in their metadata; this is due to a glitch in Oracle that will improve when we migrate to a new system: https://search.library.yale.edu/catalog?f[format][]=Books&q=adlam+fula&search_field=all_fields


It seems like converters are really crucial for a lot of this kind of work. Maybe we should have some kind of open repository where people can put the converters they’ve created for the dozens of formats we end up going between? Where could we host that? What kind of maintenance would be necessary? I know I’ve made at least 20 of these things and I don’t want them to be just for me! But there are so many that others have made too, so even if mine are all on my GitHub, I would love for them to sit next to all the others.


We should think about this further, and about where such a collection might live.

I haven’t been able to follow the talk, so I’m just guessing at what you mean by “converter”.

Incidentally, I’m just working on converting the Micronesian Comparative Dictionary to CLDF. Now, having done the same thing for the ACD, you might think this would be a prime example for such a collection of shared converters. Of course I did use my ACD conversion code as a basis - but it needed so much adaptation that just figuring out what the “shared core” between the two code bases is would be tedious.

So I’d rather consider this case an example of why such a collection may not be as useful as it seems prima facie. Among the “dozens of formats”, I’d suspect quite a few fall in the category “too generic” or “too underspecified” to allow for any re-usable conversion code. Maybe my case, with “sort-of computer-generated HTML” as input, is the worst case, but elaborate custom marker hierarchies in Toolbox are close, I think.

What might make a collection of converters work would be some sort of ranking - and I’d guess “number of times the converter was used” would make a pretty good metric here. But I also suspect most entries will have rank 1 :)
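To make the ranking idea a bit more concrete, here is a minimal Python sketch of a registry that counts how often each converter is used. Everything in it is an assumption for illustration: `ConverterRegistry`, `sfm_to_tsv` and the converter names are hypothetical, and the toy converter does not handle real Toolbox marker hierarchies.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ConverterRegistry:
    """Hypothetical shared registry that ranks converters by usage count."""

    converters: Dict[str, Callable[[str], str]] = field(default_factory=dict)
    usage: Counter = field(default_factory=Counter)

    def register(self, name: str, func: Callable[[str], str]) -> None:
        self.converters[name] = func
        self.usage[name] += 0  # list unused converters in the ranking too

    def convert(self, name: str, data: str) -> str:
        result = self.converters[name](data)
        self.usage[name] += 1  # count only conversions that succeeded
        return result

    def ranking(self) -> List[Tuple[str, int]]:
        # Most used first; ties keep registration order.
        return self.usage.most_common()


def sfm_to_tsv(text: str) -> str:
    """Toy converter (assumption): join the values of backslash-marked lines with tabs."""
    values = [
        line.split(" ", 1)[1]
        for line in text.splitlines()
        if line.startswith("\\") and " " in line
    ]
    return "\t".join(values)


registry = ConverterRegistry()
registry.register("sfm_to_tsv", sfm_to_tsv)
registry.register("html_to_cldf", lambda text: text)  # placeholder, never called
registry.convert("sfm_to_tsv", "\\lx word\n\\ge gloss")
registry.convert("sfm_to_tsv", "\\lx other\n\\ge gloss2")
print(registry.ranking())  # [('sfm_to_tsv', 2), ('html_to_cldf', 0)]
```

In a real shared collection the counts would presumably have to come from somewhere like download or citation numbers rather than an in-process counter, so this only shows the shape of the metric.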