I’m not sure if this scan is from the collection in question, but it might suffice as an indication of the kinds of texts they’re dealing with:
Y’all. Transkribus is nuts.
The reported “CER” (Character Error Rate) figures are… a little amazing:
|Model name|No. of pages checked|CER% for Training Set|CER% for Validation Set|
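For anyone wondering what a CER number actually measures, here is a minimal sketch in Python. It assumes the usual definition (character-level edit distance between the model's output and a reference transcription, divided by the length of the reference, expressed as a percentage). I don't know exactly how Transkribus computes its figures, so treat the function names `edit_distance` and `cer_percent` and the sample strings as my own illustration, not the project's code.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: the minimum number of single-character
    insertions, deletions, and substitutions turning ref into hyp."""
    # prev[j] holds the distance between the first i-1 chars of ref
    # and the first j chars of hyp.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete r from ref
                curr[j - 1] + 1,     # insert h into ref
                prev[j - 1] + cost,  # substitute (or match)
            ))
        prev = curr
    return prev[-1]


def cer_percent(ref: str, hyp: str) -> float:
    """Character Error Rate as a percentage of the reference length
    (assumed definition, see the note above)."""
    if not ref:
        raise ValueError("reference transcription must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one wrong character in a 20-character line.
reference = "the quick brown fox."
hypothesis = "the quick brovn fox."
print(f"CER: {cer_percent(reference, hypothesis):.1f}%")
```

Under that definition, a CER in the low single digits means only a handful of characters per hundred need correcting by hand.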
There’s a nice video about the project here: