LaTeX dictionaries?

I’m curious if anyone here has used LaTeX to typeset a dictionary, especially if you’d be willing to share your template.

I was talking to @ldg about this topic earlier today, and we were digging around for examples. He shared this one:

I also found this quite impressive Icelandic-Czech one:

Looks like this is part of a pretty well-developed project (not that I can read Czech or Icelandic :sweat_smile:):

I have (without pictures so far)! My pipeline is spreadsheet > R > LaTeX (with this dictionary template) > PDF. It sounds more complicated than it is, at least once I'd figured it out. I’d be happy to share my work and hold a little workshop session if someone’s interested, but all of that needs to wait till October. Need to finish my thesis… (you know the drill :woozy_face:)
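
For anyone wondering what a spreadsheet-to-LaTeX step can look like, here is a minimal sketch. It is written in Python rather than R, and it is not the pipeline described above. The column names (`headword`, `pos`, `gloss`), the `\entry` macro, and the sample rows are all invented placeholders; a real template will define its own macro and arguments.

```python
import csv
import io

# Invented placeholder rows standing in for a spreadsheet exported as CSV.
# The column names are assumptions, not taken from any real project.
SAMPLE_CSV = """headword,pos,gloss
tiki,v,to run & jump
abu,n,water
"""

# Characters that LaTeX treats specially in running text.
# Backslash, tilde and caret are left out to keep the sketch short.
LATEX_SPECIALS = {
    "&": r"\&", "%": r"\%", "$": r"\$", "#": r"\#",
    "_": r"\_", "{": r"\{", "}": r"\}",
}


def escape_latex(text):
    """Escape the most common LaTeX special characters in a field."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def rows_to_latex(csv_text):
    """Turn CSV rows into one hypothetical \\entry{...}{...}{...} call per row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Plain case-insensitive sort; real dictionaries often need a custom collation.
    rows = sorted(reader, key=lambda row: row["headword"].casefold())
    lines = []
    for row in rows:
        fields = tuple(escape_latex(row[col]) for col in ("headword", "pos", "gloss"))
        lines.append(r"\entry{%s}{%s}{%s}" % fields)
    return "\n".join(lines)


if __name__ == "__main__":
    print(rows_to_latex(SAMPLE_CSV))
    # \entry{abu}{n}{water}
    # \entry{tiki}{v}{to run \& jump}
```

The generated lines would typically be written to a file and pulled into the main document with `\input`, so the template and the data stay separate.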

I’ve tried this for Kholosi: https://aryamanarora.github.io/kholosi/Kholosi_Dictionary.pdf. Unfortunately not generated from structured data :frowning:

Wow, this is beautiful, @aryaman.

I’m curious what your workflow looks like if it’s not structured data at some point.

Would you be interested in sharing with us what your LaTeX template looks like? I think @Sandra might be interested in this as well.

:open_book:

Man, @aryaman, your whole site is awesome:

https://aryamanarora.github.io/kholosi/

It looks great! Do you mean that you had to type in all the headwords/definitions into the document or that you had to do all the formatting for each entry as well?

Well I had a spreadsheet but wanted to make something a little nicer for my informant + go through all the entries and make sure they’re accurate. So one night I made this, based a bit off of another dictionary template on Overleaf.

The formatting is all macros; here’s the Overleaf: link. It shouldn’t be hard to write a script that generates the text from a structured lexicon, I just haven’t had the time :frowning: I recently got my corpus converted to JSON (using TwistedTongues), so I hope to automate this better soon.
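
A rough sketch of what such a script might look like, assuming a JSON lexicon where each entry has a headword, a part of speech, and a list of senses. The JSON schema, the `\dictentry` macro name, and the sample words are all hypothetical; they are not taken from the Overleaf template or from the Kholosi data. Escaping of LaTeX special characters is skipped here for brevity.

```python
import json
from itertools import groupby

# Invented placeholder lexicon; the schema is an assumption.
SAMPLE_JSON = """
[
  {"headword": "tava", "pos": "n", "senses": ["stone", "rock"]},
  {"headword": "abu", "pos": "n", "senses": ["water"]},
  {"headword": "tiki", "pos": "adj", "senses": ["small"]}
]
"""


def render_entry(entry):
    """Render one entry as a call to a hypothetical \\dictentry macro."""
    senses = entry["senses"]
    if len(senses) == 1:
        gloss = senses[0]
    else:
        # Number the senses only when there is more than one.
        gloss = "; ".join(f"{i}. {s}" for i, s in enumerate(senses, start=1))
    return "\\dictentry{" + entry["headword"] + "}{" + entry["pos"] + "}{" + gloss + "}"


def render_lexicon(json_text):
    """Sort entries and emit a letter heading before each alphabetical group."""
    entries = sorted(json.loads(json_text), key=lambda e: e["headword"].casefold())
    out = []
    # groupby needs sorted input, which the line above guarantees.
    for letter, group in groupby(entries, key=lambda e: e["headword"][0].upper()):
        out.append("\\section*{" + letter + "}")
        out.extend(render_entry(e) for e in group)
    return "\n".join(out)


if __name__ == "__main__":
    print(render_lexicon(SAMPLE_JSON))
    # \section*{A}
    # \dictentry{abu}{n}{water}
    # \section*{T}
    # \dictentry{tava}{n}{1. stone; 2. rock}
    # \dictentry{tiki}{adj}{small}
```

Because the formatting already lives in macros, the script only has to emit macro calls; any change to the look of an entry stays in the LaTeX template.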

Hi all —

I built a pipeline to parse a lexicon in a domain-specific language (DSL), e.g. backslash code, into JSON and then use EJS templates to generate LaTeX/HTML/InDesign XML. It might be overkill for some projects but it worked for cleaning up the 10,000+ word Warlpiri-to-English dictionary, which had been worked on by various folks since the 1950s (we also maintain the main dictionary in a Git repo with automated testing: https://warlpiri-tests.netlify.app/).
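
To give a feel for the first step (backslash code to JSON), here is a minimal Python sketch. It is not the parser from the repository. It assumes MDF-style markers (`\lx` for the headword, `\ps` for part of speech, `\ge` for an English gloss), which may differ from the markers in the Warlpiri source, and the sample records are invented placeholders.

```python
import json
import re

# Invented placeholder records in a backslash-coded format.
SAMPLE = r"""
\lx abu
\ps n
\ge water

\lx tava
\ps n
\ge stone
\ge rock
"""

# A field line is a backslash, a marker name, then an optional value.
FIELD_RE = re.compile(r"^\\(\w+)\s*(.*)$")


def parse_backslash(text, record_marker="lx"):
    """Parse backslash-coded text into a list of dicts, one per record.

    Each record starts at `record_marker`. Every other marker maps to a
    list of values, since fields such as glosses can repeat.
    """
    records = []
    current = None
    for line in text.splitlines():
        match = FIELD_RE.match(line.strip())
        if not match:
            # Blank lines and unmarked continuation lines are skipped,
            # which is a simplification of real-world data.
            continue
        marker, value = match.group(1), match.group(2).strip()
        if marker == record_marker:
            current = {marker: value}
            records.append(current)
        elif current is not None:
            current.setdefault(marker, []).append(value)
    return records


if __name__ == "__main__":
    print(json.dumps(parse_backslash(SAMPLE), indent=2, ensure_ascii=False))
```

Once the lexicon is in JSON, the same data can feed several templates (LaTeX, HTML, and so on) without touching the parser again.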

Here’s the repository with a miniature version of the Warlpiri dictionary (in src/wlp-lexicon_master.txt):

Parser:

Templates:

Outputs:

Pipeline defined in the .gitlab-ci.yml: https://gitlab.com/coedl/mini-wlp/-/blob/master/.gitlab-ci.yml
