Best (printed) dictionary size and layout related practices

Hi! I’m working on a project where I’ll be responsible for the layout of a Ludian-Russian learners’ dictionary. It is a quite standard one, I believe, with lexical entries in Ludian, a Russian translation and possibly a few example sentences, though not for every entry. There seem to be around 12,000 entries, so it is not that small a dictionary. The idea has been that the dictionary would be around 200 pages, but that’s of course a matter of layout and font choices as well. The size in dimensions or pages is not really a constraint here, although a really non-standard size (from a European point of view) should be avoided, as that can get more expensive to print.

It would be really nice if anyone had examples of something that can be considered a learners’ dictionary, in any language, that is particularly well done and appreciated by its users. Getting fresh ideas and reconsidering the choices I’m about to make is the main reason I started this thread.

There are many interesting adjacent topics to discuss, for example the internal data representation and the layout creation process itself, especially if we also want an online version. The questions around digital versions and their functionality are of course at least as exciting, but for now let’s stick to the questions around a physical book. As I mentioned, this is a learners’ dictionary, meaning it contains basic vocabulary, is written for the language pair that most speakers know best, and should be easy to use and carry around.

Which paper size do you think works best in this context? Smaller, larger?

What about the font sizes?

Are two columns the way to go? What are the best ways to make longer entries legible even when they are squeezed into narrow columns?

In this video there is an example of the Russian-Ludian conversation book the same publisher made in 2019. But this dictionary will be larger, and it doesn’t need to be visually similar. The binding will also be more durable, etc.

I also saw there was a related discussion about generating a printable PDF with Python and LaTeX. I’ve been involved with similar work too, but I think in this case I will do the layout in InDesign, although if there are perfect LaTeX templates that already do what is needed, then I could also consider that. But there are complex pros and cons for all these approaches.


Oh boy, I love this topic.



Sounds very fun. I think the first things I myself would think of would be what kind of usage patterns you expect. I’m very fond of my little Collins Gem Portuguese<>English dictionary:

I dragged this thing all over Brazil in my younger years. This little guy has 40,000 entries, and is quite readable. Here’s a page:

Of course, this is mostly sentimental. To be honest, I never open it any more. In my younger years, the internet was barely a thing. Which is, I think, a very relevant point: are people really going to use a pocket dictionary like this nowadays rather than their cell phone? Probably not, to be quite honest. (I’m assuming internet access is widespread in the area where Ludi is spoken.)


Which raises a different question: what kind of print dictionaries do people still like to use? I still like looking things up in larger format books. It’s just pleasant. This book is one of my favorite dictionaries ever:

I’d describe this as pretty close to a paperback novel in size.

The typography is spectacular:

This is an example of squeezing a tremendous number of data types into each entry, but still maintaining readability and avoiding the “wall of text” syndrome. If you count the number of typefaces here, it’s pretty amazing. This book is a bit of an odd comparison since it’s organized around Kanji (Chinese characters used in Japanese), but I feel compelled to mention it because it’s a book I just love to look at and I still use just for fun. (And I barely study Japanese any more!)


A step up in size, and a work that content-wise is more similar to what you’re working on, is this Ilocano<>English dictionary and grammar by my friend Carl Rubino (mine’s signed! :blush:).

Not pocket-sized, certainly, and larger than a novel, but it could be thrown in a backpack pretty easily.


A typical page:

This dictionary very much emphasizes compounding and derived words (unsurprising for a Philippine language). If you look at the word ayát ‘love’ you’ll see that most of the entry is compounds (there are full example sentences as well). Interestingly, there is a longer definition here for the form kaayan-ayat ‘sweetheart, lover’ than there is under that word as a headword:

kaayan-ayát (f. ayát) n. sweetheart

(No mention of lovers!)

Here’s what the English > Ilocano “finder” side looks like:

So yeah, at this scale we’re getting tons of information, and the dictionary part of the book is about 750 pages.

And, why not, let’s look at a beast of a dictionary too:

This dictionary is in the epic magnum-opus once-in-a-lifetime category. I think it’s way beyond what you’re up to, but it makes for an interesting upper bound:

Here’s a typical page:

(@Siri could explain it!)

And here’s a finder page:

Three columns here.

So anyway…

My own intuition is somewhere close to the “medium” category, right? I would think the key criteria would be:

  • How much info you have available for each entry: headwords, grammatical categories, definitions, compounds, and example sentences you want to give.
  • Your target audience’s familiarity with Ludi
  • How you expect the dictionary to be used
  • Cost. Color? Hardcover? Number of copies? Etc.

Anyway, please keep us apprised of your project’s progress!


Thanks, Pat, for the good comments! I’ve also been going through a few dictionaries here and am thinking of something like B5 as the paper size. The dictionary will be Ludian–Russian–Ludian, which doubles the size, so I’m thinking the whole thing will be around 500 pages. We plan to publish it online first as a PDF and some less-than-ideal printout, and then after a few extra revision rounds we’ll get into printing.

What we also see in your examples is that the basic structure is often quite similar. There are example sentences and some grammatical information, but the structure is not so complicated that one would need to come up with a more creative layout. There are certainly scenarios where that can be really beneficial too. Two columns, bold and italic, indentation – these seem to be well-tested tools in our arsenal.

So the actual file is in Word (the most common dictionary editing tool in our field). Relatively rigid formatting conventions are used, which I believe saves the day. What I’m doing is converting the docx into markdown and reading that with Python. This involves quite a bit of restructuring. At this point the content looks like this:

{'entry_lud': 'alav, -an, -ad',
 'compare': 'alahaine, madal',
 'entry_rus': 'низменный, низинный, низкий',
 'example_data': [{'rus': 'болото -- низменное место',
   'lud': 'suo on alav sija'}]}

So we have the Ludian entry and the Russian translation (which are both treated more or less as entries, since it is a Ludian–Russian–Ludian dictionary, but that’s another issue). The compare field refers to related entries. In the PDF I would like to have those as links.
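To make the idea of linking the compare field a bit more concrete, here is a minimal Python sketch. It is not the actual code from this project: the helper names (`headword`, `anchor_id`, `resolve_compare`), the `lud-` anchor scheme, and the second demo entry are all assumptions made up for illustration. It also assumes the citation form is the text before the first comma in `entry_lud`. The point is simply to separate compare targets that exist as headwords (and can become links) from those that don't.

```python
import re

def headword(entry_lud: str) -> str:
    """Return the citation form: the text before the first comma."""
    return entry_lud.split(",")[0].strip()

def anchor_id(word: str) -> str:
    """Build a hypothetical link anchor such as 'lud-alav' from a headword."""
    return "lud-" + re.sub(r"\s+", "-", word.strip().lower())

def resolve_compare(entries):
    """Map each headword to (linked, missing) lists for its compare field."""
    known = {headword(e["entry_lud"]) for e in entries}
    result = {}
    for e in entries:
        targets = [t.strip() for t in e.get("compare", "").split(",") if t.strip()]
        linked = [(t, anchor_id(t)) for t in targets if t in known]
        missing = [t for t in targets if t not in known]
        result[headword(e["entry_lud"])] = (linked, missing)
    return result

entries = [
    {"entry_lud": "alav, -an, -ad",
     "compare": "alahaine, madal",
     "entry_rus": "низменный, низинный, низкий"},
    # Hypothetical second entry, added only so that one target resolves.
    {"entry_lud": "madal", "entry_rus": "(translation omitted)"},
]

links = resolve_compare(entries)
print(links["alav"])
```

The list of missing targets is probably the more useful half in practice, since it doubles as a consistency check on the source file before any layout work starts.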

This is then written back to markdown, which is then converted to ICML format which Adobe InCopy uses, following the instructions here:

pandoc -s -f markdown -t icml -o indesign_import.icml
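As a rough illustration of the write-back and conversion steps, the sketch below renders one entry as a markdown paragraph and builds the pandoc call with the same flags as above. The formatting conventions (bold headword, italic Ludian example, a "cf." prefix for the compare field) and the file names are assumptions for the example, not the conventions actually used in this project.

```python
import subprocess

def entry_to_markdown(entry: dict) -> str:
    """Render one entry as a single markdown paragraph (assumed conventions)."""
    parts = [f"**{entry['entry_lud']}**", entry["entry_rus"]]
    for ex in entry.get("example_data", []):
        # Ludian example in italics, followed by its Russian translation.
        parts.append(f"*{ex['lud']}* {ex['rus']}")
    if entry.get("compare"):
        parts.append(f"cf. {entry['compare']}")
    return " ".join(parts)

def pandoc_command(md_path: str, icml_path: str) -> list:
    """Build the pandoc call with the flags shown above plus an input file."""
    return ["pandoc", "-s", "-f", "markdown", "-t", "icml", "-o", icml_path, md_path]

def convert(md_path: str, icml_path: str) -> None:
    """Run pandoc; requires pandoc to be installed and on the PATH."""
    subprocess.run(pandoc_command(md_path, icml_path), check=True)

entry = {
    "entry_lud": "alav, -an, -ad",
    "compare": "alahaine, madal",
    "entry_rus": "низменный, низинный, низкий",
    "example_data": [{"rus": "болото -- низменное место",
                      "lud": "suo on alav sija"}],
}

paragraph = entry_to_markdown(entry)
print(paragraph)
# 'dictionary.md' is a hypothetical input file name.
print(pandoc_command("dictionary.md", "indesign_import.icml"))
```

Keeping one entry per markdown paragraph means each entry arrives in InDesign as its own paragraph, which makes it straightforward to attach a paragraph style to it.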

This is then imported into InDesign, which allows updating the file from the source (the converted ICML file in this case – there is a warning every time it is changed). Most of the different data pieces shown above are still marked by different formatting conventions that are converted into InDesign character and paragraph styles, so manipulating them is pretty easy. This is what it looks like in my current layout, which is very easy to change since I set it up as described above:

There is a lot to fix still, but in principle this seems to work. Stuff like the header marking is done with quite typical conventions like this. I’m not doing this in LaTeX because the more final versions will have pretty detailed edits all over the place, and at some point I want to be able to apply all kinds of tricks to the file so that it looks pleasing without LaTeX exploding in my face. At that point the Word > Markdown > ICML > InDesign pipeline will also be severed, so the authors will have to give feedback through the PDF. This doesn’t mean I wouldn’t be curious about a similar LaTeX pipeline too, I surely am. But I trust what I’m doing now is more likely to meet my deadline. :slight_smile: Although I’m aware this may all just lead to a bigger mess – we’ll see in a few months.

A few things I’m still planning to do:

  • Adding links to the PDF from compared words to their entries
  • Coming up with something to get the Ludian hyphenation to work correctly (and Russian too)
  • I want to mark where a letter changes, but I’m still thinking of the best way to do this
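For the last point, one possible approach is to detect the letter changes in Python before the markdown is written, so that a marker (for example a heading that later gets its own paragraph style) can be emitted at each change. The sketch below is only an assumption about how that could work: it expects the entries to be already sorted in dictionary order, and it naively takes the first character of `entry_lud` as the letter, which may need adjusting for letters with diacritics or for however the alphabet is meant to be grouped. The headwords alahaine and madal are taken from the compare field above, without their inflection information.

```python
from itertools import groupby

def first_letter(entry: dict) -> str:
    """Naive initial letter: the first character of the headword, uppercased."""
    return entry["entry_lud"][0].upper()

def with_letter_markers(entries):
    """Insert a ('letter', X) item before the first entry of each new letter.

    Assumes the entries are already in dictionary order; groupby only
    groups consecutive items, so unsorted input would repeat letters.
    """
    out = []
    for letter, group in groupby(entries, key=first_letter):
        out.append(("letter", letter))
        out.extend(("entry", e["entry_lud"]) for e in group)
    return out

entries = [
    {"entry_lud": "alahaine"},
    {"entry_lud": "alav, -an, -ad"},
    {"entry_lud": "madal"},
]

marked = with_letter_markers(entries)
for kind, text in marked:
    print(kind, text)
```

The same markers could also feed running headers or thumb tabs later, since they record exactly where each letter section begins.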

A bit of background: the traditional way to do this is that everything is done in Word, and then that file is given to a person who makes the layout and sends back the proofs, and then we spend a long time in correction limbo, in which a great deal of work is done by the layout designer in addition to the authors. This is quite expensive and slow, and the final corrected version is usually not available outside the final PDF that was sent to print (if that was kept). I’m tired of this, and I’m trying to find something that works better for me. And I want to share ideas and experiences – this is a common problem.

I think we also have examples of more database oriented work where the layout is done, for example, through LaTeX, but I’m wondering now how many of these we actually have in print and use. What are FLEx users doing, for example? Can anyone share examples of relatively large printed / print oriented dictionaries in which the layout is done in LaTeX or otherwise without traditional layout editing software?


About going the LaTeX route: Language Science Press of course has excellent examples of beautiful typesetting, and the sources are available too:

And Pangloss has great examples of dictionaries both as PDF and HTML. I didn’t check how that’s done, but it looks very good.

Just to answer my own question, but other examples are more than welcome!

In the larger context of book making these are of course also examples of a publisher having a policy about their templates. My context and position are pretty different, but this is still very interesting. :thinking:


a couple of things (based on dictionary workshops over the years)

  • how familiar are users with dictionaries? Have they used them before? (Doesn’t go without saying and worth thinking about)
  • what language are they mostly using, and what are they using it for? e.g. for Bardi the dictionary is Bardi - English (where most of the detail is, plus finderlist), but most people are using it for English-Bardi, so they don’t see most of the information
  • text size - consider the age and eyesight of the audience; will it be too small?
  • include margins and blank pages so people can write their own entries in, unless you’re totally sure that you wrote down every word

Love your choices. I am a big fan of Macri’s The New Catalog of Maya Hieroglyphs (volumes 1 & 2). I’ll try to post some pics from my phone when I get a chance.


Thank you, Claire! These are very good questions to consider! In our context, too, it can very well be that the users are familiar with the concept of dictionaries but have not necessarily used them extensively. This is something that should be thought about much more, thanks for bringing it up! We want to have both Ludian–Russian and Russian–Ludian in the dictionary, so both directions are included, but I believe Russian–Ludian will be used more in practice, as Russian is usually the stronger language. Maybe this could be somehow accounted for in the references/links between the sides/entries. :thinking:

As I described, I would like to streamline the PDF layout process so that it doesn’t become an enormous bottleneck when we want to update the dictionary. We plan to have the first version out this spring, and making adjustments based on user feedback does not sound impossible. The first version would be available online, and we’ll make the printed version after more extensive feedback rounds. In a way, having an online version already at this point would be very useful, but designing and hosting something like that is also an entirely different task with new unresolved issues.