An interesting article describing how the FLEx morphological parsing system was originally designed.
FLEx actually has two parsers now: “Hermit Crab” and … I can’t remember the other one. The original paper @rgriscom links to can be found in the SIL archive: The SIL FieldWorks language explorer approach to morphological parsing | SIL International
Interesting article. I didn’t realize that FLEx referenced the GOLD Ontology:
A feature system built with typed feature structures containing simple features with atomic values and complex features with embedded feature structures of a specified type as values. The system includes a feature catalog based on morphosyntactic properties taken from the online GOLD ontology (which can be found at this URL: http://www.linguistics-ontology.org/gold.html). Users select the features needed to describe the language from this catalog or add new ones when necessary.
The resulting HTML page is displayed via an embedded Internet Explorer browser. Figure 6 shows the opening of a generated sketch for Orizaba Nahuatl.
I’ll just be over here img into the abyss…
@hp3, FLEx can use either the XAmple parser or the Hermit Crab parser. XAmple is easier to set up, but morphological changes need to be defined statically:
bi- -> b / *_i
Hermit Crab takes some time to set up, since you have to enter the phonemes and define non-overlapping distinctive features, but those can then be used in environments and transformations:
bi- -> b / *_[V]
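To make the difference concrete, here is a rough Python sketch of the two styles. It is only a toy model, not FLEx's data format or either parser's API. I'm reading the rules as "the prefix bi- surfaces as b before a stem starting with i (or with any vowel)", which is my assumption about the notation; the phoneme inventory, the `voc` feature, and the function names are all made up for illustration.

```python
# Toy model only: names, inventory, and features are hypothetical, not FLEx's.
FEATURES = {
    "a": {"voc": "+"}, "i": {"voc": "+"}, "u": {"voc": "+"},
    "b": {"voc": "-"}, "t": {"voc": "-"}, "k": {"voc": "-"},
}


def matches(segment: str, spec: dict) -> bool:
    """True if the segment carries every feature value listed in spec."""
    feats = FEATURES.get(segment, {})
    return all(feats.get(name) == value for name, value in spec.items())


def attach_static(prefix: str, stem: str) -> str:
    """Static style: bi- -> b only before a literal stem-initial 'i'."""
    if prefix == "bi" and stem.startswith("i"):
        return "b" + stem
    return prefix + stem


def attach_by_feature(prefix: str, stem: str) -> str:
    """Feature style: bi- -> b before any stem-initial [+voc] segment."""
    if prefix == "bi" and stem and matches(stem[0], {"voc": "+"}):
        return "b" + stem
    return prefix + stem
```

With the static rule, `attach_static("bi", "ata")` leaves the prefix alone, so you would need one environment per vowel. The feature version covers `ata`, `ita`, and `uta` with a single rule once the features are defined.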
Hi @MatthewLee, welcome to the docling forum!
Along the lines of this topic, I thought I’d share this short paper about parsing in FLEx:
Haven’t read it carefully yet, but I believe this would be referring to the XAmple parser as opposed to the newer Hermit Crab.
The most up-to-date documentation for both parsers is actually bundled with FLEx.
This opens a nearly 100-page document explaining the Parsers, last updated in 2018 (but little has changed about the parsers since then).
Chapters 1-5 apply to both parsers (XAmple and Hermit Crab); Appendix B explains the extra bits needed to make the Hermit Crab parser phonologically aware. I’ve had good luck with both parsers, with the understanding that there are some limitations on multi-word/syntactic awareness.
I wonder if a copy could be put on REAP/SIL.org… folks without FLEx might be interested in reading the document without downloading the FLEx application.
I wonder if the help content is in the GitHub repo somewhere?
The parsing guide used to be online when FLEx was hosted at a different URL, and a few papers reference it, but it hadn’t been re-uploaded when they moved to the new URL. I found it in a GitHub repo, but I just asked them to move it to the live server since the old links were broken.
The docs are hosted here:
and the new direct link to the Parsing manual is here:
You’ll also find the Introduction to Lexicography there, but as a new user, I can only post 2 links.
Hi @MatthewLee, I pushed some buttons, you should be able to add more links now. (Let me know if you have any trouble.) Thanks for your contributions!
FWIW, I was the original author of the Hermit Crab parser, back in the mid-90s, when it was embedded in SIL’s LinguaLinks (the predecessor to FLEx); it was later ported to FLEx by someone else. HC was intended to implement a feature-based phonology, along with morphology. It was intended to be sufficiently powerful to achieve observational adequacy, including for things like partial or full reduplication, though not necessarily descriptive (much less explanatory) adequacy.
The original design allowed for phonological rules to refer to words on either side, but I never got around to that.
The use of phonological features makes it difficult for some field workers. One of these days, if/when I retire, I may try implementing a finite state transducer in FLEx. That would allow users to write phonological rules etc. without referencing phonological features. I know how to convert an XML grammar (like FLEx would export) to FST rules, but interfacing with FLEx’s UI would be more of a challenge.
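For anyone wondering what a feature-free rule might look like as a transducer, here is a minimal Python sketch. It is not how FLEx or Hermit Crab represents anything, and it is not the XML-to-FST conversion mentioned above; the rule ("drop i before a morpheme boundary `+` followed by a vowel"), the state names, and the symbol classes are all assumptions for illustration. The vowels are simply listed as symbols rather than defined by features.

```python
# Hypothetical sketch of a small deterministic finite state transducer.
# Assumed rule: i -> 0 / _ + V ; the boundary "+" is dropped on the surface.
OTHER_VOWELS = set("aeou")  # assumed inventory; "i" gets its own class


def symbol_class(c: str) -> str:
    """Map an input symbol to the class used in the transition table."""
    if c in ("i", "+"):
        return c
    return "V" if c in OTHER_VOWELS else "C"


# (state, symbol class) -> (next state, output); "{c}" is the current symbol.
# "held" means an 'i' has been read but not yet written out.
TRANSITIONS = {
    ("q0", "i"): ("held", ""),
    ("q0", "+"): ("q0", ""),
    ("q0", "V"): ("q0", "{c}"),
    ("q0", "C"): ("q0", "{c}"),
    ("held", "i"): ("held", "i"),
    ("held", "+"): ("held_plus", ""),
    ("held", "V"): ("q0", "i{c}"),
    ("held", "C"): ("q0", "i{c}"),
    ("held_plus", "i"): ("held", ""),    # rule applies, new 'i' is held
    ("held_plus", "V"): ("q0", "{c}"),   # rule applies, pending 'i' dropped
    ("held_plus", "C"): ("q0", "i{c}"),  # rule blocked, pending 'i' written
    ("held_plus", "+"): ("q0", "i"),
}

# Output written when the input ends in a given state.
FINAL_OUTPUT = {"q0": "", "held": "i", "held_plus": "i"}


def transduce(underlying: str) -> str:
    """Run the transducer over an underlying form such as 'bi+ata'."""
    state, out = "q0", []
    for c in underlying:
        state, template = TRANSITIONS[(state, symbol_class(c))]
        out.append(template.replace("{c}", c))
    out.append(FINAL_OUTPUT[state])
    return "".join(out)
```

For example, `transduce("bi+ata")` gives `bata` while `transduce("bi+tata")` gives `bitata`. Note that this toy rule fires on any i before a boundary plus vowel, not just on the prefix bi-.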