Writing Out Loud: Abkhaz Converbs and Getting Data to Do Stuff

Hello All out there on DocLing and beyond!

This is the abrupt beginning to my thread about adventures during my PhD project which will describe converbs in Abkhaz (Northwest Caucasian).

For now, the focus of this thread is specifically on me developing my corpus of converb examples, which includes trying to develop a feature spreadsheet for examining the characteristics of Abkhaz converbs (and, really, converbs in a typological sense) and trying to get the most out of what software/coding can help me see about my data. Both are things I began thinking about in 2021, the first year of my PhD; however, I’m renewing focus on them now.

I did a bunch of things in 2021, but what I especially learned is that I need to guard the time I devote each day to reading and writing that is directly related to my project. The first two hours of my workday will now be devoted to this endeavor, and to keep myself accountable, I’ll post my musings here. And perhaps, occasionally, people will be interested in sharing their thoughts and reactions.

Thoughts for today, January 3rd, 2022:

  1. How do I break features down into discrete units? For instance, some of the examples I’ve gotten from the literature show things like “converbs formed with the -ны suffix and the stem of a static verb (and/or the pure stem of a static verb) to indicate additional state with a verbal predicate.” Not only does the example show morphological form (ны suffix, verb stem) and then a more “syntactic” form (converb + predicate verb) but also semantic meaning (indicating additional state of the syntagma). There’s a lot in here, and all of it is relevant.

So far I’ve started experimenting with two breakdowns for features to describe one example (one row below = one column in Excel):

  • feature: verb stem

  • value1: affixation

  • value1_name: affixed vs. unaffixed

  • value1_as_in_source: unaffixed

  • value2: dynamicity

  • value2_name: dynamic vs. static

  • value2_as_in_source: static

  1. I think what I need a program for is to set filters so that I can search for “examples with values X and Y” and then it brings up all the examples in the corpus that fit those filters
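A minimal sketch of that kind of filter, assuming each example is stored as a dictionary of feature values. The field names (`affixation`, `dynamicity`, `coreference`) and the three rows are made up for illustration and are not real corpus data.

```python
# Hypothetical example rows: IDs and values are placeholders, not real corpus data.
examples = [
    {"id": "ex1", "affixation": "unaffixed", "dynamicity": "static", "coreference": "SS"},
    {"id": "ex2", "affixation": "affixed", "dynamicity": "static", "coreference": "SS"},
    {"id": "ex3", "affixation": "affixed", "dynamicity": "dynamic", "coreference": "DS"},
]


def filter_examples(rows, **wanted):
    """Return the rows whose features match every requested value."""
    return [
        row for row in rows
        if all(row.get(feature) == value for feature, value in wanted.items())
    ]


# "Examples with values X and Y": static stem AND same-subject coreference.
matches = filter_examples(examples, dynamicity="static", coreference="SS")
print([row["id"] for row in matches])  # ['ex1', 'ex2']
```

Adding another keyword argument narrows the search further, so the same function covers one filter or several.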

Even if it’s only ever me who reads my own musings, I think it’ll help, so thanks a lot to DocLing for existing and hosting :wink:

*The feature spreadsheet I want to make for this project and for other converb projects in the future (converbs are life) is based on the one developed by the Typological Atlas of the Languages of Daghestan which you can find here: Feature datasets

*Thanks to @pathall for suggesting I start a thread (you don’t have to comment, just wanted to credit you properly).


Thanks for sharing this! I love the idea of taking notes on docling, I hope you keep it up!

We aim to serve. :slight_smile:

I can assure you that at least I and perhaps others will read everything you write with interest!

(I have some thoughts on what you’ve already written but I’m in high-gear-finish-the-dissertation mode so they’ll have to wait!)

Also, I got you this.



Winnie the Pooh… viːni pɔx?



Thanks for the encouragement, Pat! (And for the Abkhaz video :smiley:). I’m determined to keep it up at least until I send in my potential paper (hopefully longer – but smaller goals first). Whenever you finish your dissertation, you can give me notes :spiral_notepad::+1:

I’ll have to see what I think is going on with Pooh’s name as well! xD


Thoughts for January 4, 2022

Today I remembered that every single action in trying to put together a feature spreadsheet brings more questions. During the first year of my PhD (late 2021), developing the feature spreadsheet was something I always had in the back of my mind but hadn’t started yet because, I think, it was exciting but overwhelming. This year, though, it still is a big undertaking, but I’m feeling more excited than overwhelmed. (I’m feeling a little bit behind in my goals though since I feel like I should have plunged right into it last year, but I’m not indulging the negativity).

Anyway, today I experimented with putting a non-NWC language example (of a converb) into my spreadsheet. I also tried to add a few more “features” that I have on a Big List of Stuff to Consider When Analyzing Converbs. This brought a ton of practical and theoretical questions which I have on, basically, a digital piece of paper. (Surprisingly useful though).

Some thoughts for today:

  1. I want the feature spreadsheet to be useful for studying converbs cross-linguistically, but in terms of the scope of my project, it might be more practical to make as detailed a spreadsheet as possible for my own purposes (examining Abkhaz converbs from written material and, eventually, data collected during fieldwork) because there were a bunch of questions that came up trying to put an example outside this scope, like “What do with different transcription orthographies (including IPA but also idiosyncratic ones)?”, “What do with people’s glossing conventions?”, “What do when people use the same terms with slightly different meanings in different works?” Not that I won’t come across this for sources on Abkhaz, but like, maybe one area at a time is good xD

This ties into the overarching question

  1. What is my base – the thing I’m looking at? What data do I want to collect and what do I want to learn from it? This is something I’m still working out but at least I know the question is there.

And then there’s all these practical things that have to do with the spreadsheet design:

  1. If I use more than one “feature” in the spreadsheet (i.e. One spreadsheet – all things), what naming conventions should I use with the values? So far, I’m doing “feature 1” then “value1_1” (so value 1 of feature 1), “value1_2” (value 2 of feature 1). I’ve done this because I’m assuming I want every column to have a unique name because I have a vague sense this is important for the code to be able to distinguish the second value of feature 1 from the second value of feature 4.

  2. If I am trying to break features down into values that can have a unique, discrete answer (like “yes” or “affix” when the two possibilities are “affix” versus “no affix” or something), how do I distinguish between feature values that are possible/impossible versus features that are mandatory/optional? For example, today I worked with the feature “coreference.” In some situations, Same Subject coreference might be optional, while in other situations it may be impossible, which looks similar to optional but is not the same.

  • it would be really neat if I could make a program go through a set of options like on those maps titled “Are You a Cat?” and then it goes to the first question that could be “You purposely knock objects off tables” and then depending on whether the answer is “yes” or “no” it takes you to another question or set of questions. I think this would be a useful way for a program to go through something like “Is SS coreference possible? Yes. Is it mandatory? No.” Or something along those lines.
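The question chain could be sketched roughly like this, assuming the two yes/no questions are asked in a fixed order and collapse into one three-way value. The names `Status` and `classify` are hypothetical, and this is only one way to keep “impossible” and “optional” apart.

```python
from enum import Enum


class Status(Enum):
    IMPOSSIBLE = "impossible"
    OPTIONAL = "optional"
    MANDATORY = "mandatory"


def classify(possible, mandatory=None):
    """Walk the two questions in order and return a Status.

    possible:  answer to "Is it possible?"
    mandatory: answer to "Is it mandatory?", only asked when possible is True
    """
    if not possible:
        # The second question is never reached.
        return Status.IMPOSSIBLE
    if mandatory is None:
        raise ValueError("'mandatory' must be answered when the feature is possible")
    return Status.MANDATORY if mandatory else Status.OPTIONAL


# "Is SS coreference possible? Yes. Is it mandatory? No."
print(classify(possible=True, mandatory=False).value)  # optional
print(classify(possible=False).value)                  # impossible
```

Storing the single resulting value in one column means “optional” and “impossible” can never be confused with each other, which two separate yes/no columns would not guarantee.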

I’ll finish off this already long post with converb specific things:

  1. Certain features, like coreference, assume the converb has its own clause. However, morphological converbs are also found in complex predicates and don’t necessarily have their own clause. How do I deal with the different possible functions of converbs in the dataset?

  2. How am I going to define “converb.” Good question.


Thoughts for January 5, 2022

So, quick post today. Two of the things I was thinking about today include:

  1. Maybe I could use some kind of software to “look through” all the different resources I have to help me locate examples with converbs more quickly. A lot of the resources I have are PDFs (most are probably computer readable?). I could have the program search for particular glosses (or maybe even particular strings of characters – that are not English – but I don’t know how well that would work). I have a whole bunch of texts like grammars and readers and legit text collections to look through. And maybe there could be a way to have it search some of the databases that have Abkhaz texts online as well?

  2. The dataset I think I want to generate will be bunches of example sentences with converbs in them that are analyzed for a bunch of the features that are important for converb analysis (vague, I know xD).

  3. The features I want to look at come at several levels (sentence-level, word-level, morpheme-level). I’m not sure if this is important for how I structure my spreadsheet or not… but I keep bumping up against it.
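One possible way to handle the levels, sketched under the assumption that the data can be stored in a “long” layout: one record per observation, with the level as its own field, instead of one ever-wider row per example. Every ID, feature name, and value below is a hypothetical placeholder.

```python
from collections import defaultdict

# One record per observation: (example ID, level, feature, value).
# All IDs, feature names, and values here are hypothetical placeholders.
observations = [
    ("ex1", "morpheme", "affixation", "unaffixed"),
    ("ex1", "word", "dynamicity", "static"),
    ("ex1", "sentence", "coreference", "SS"),
    ("ex2", "morpheme", "affixation", "affixed"),
    ("ex2", "word", "dynamicity", "dynamic"),
]


def by_level(records):
    """Group records as {level: {example_id: {feature: value}}}."""
    grouped = defaultdict(lambda: defaultdict(dict))
    for example_id, level, feature, value in records:
        grouped[level][example_id][feature] = value
    return grouped


levels = by_level(observations)
print(sorted(levels))         # ['morpheme', 'sentence', 'word']
print(levels["word"]["ex1"])  # {'dynamicity': 'static'}
```

In this layout a new feature is a new record rather than a new column, so the level travels with each observation and no column ever needs a numbered name.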


Hey would you like to share a handful of the kinds of example sentences you’re working with? Let’s talk about the shape of the data — converbs are particularly interesting because they are multi-word constructions, which can be tricky to handle in a spreadsheet.


@mjcarroll’s talk touches on searching through a library of pdfs. It’s well worth a watch if you haven’t yet!


Thanks! These talks actually inspired a lot of my thoughts about data, so I can definitely agree they’re well worth watching :wink:

100% yes about being multi-word constructions. And the other thing that’s interesting is that there are morphological considerations in terms of form but then also aspects related to function in a clause/sentence both syntactically and semantically, plus a bunch of other things. xD

So! Right now I’m focusing more specifically on Abkhaz data (because I want to use the sheet for my fieldwork), but the larger goal is to make it applicable for converb data cross-linguistically. Just to frame where I’m at.

I thought it would be more illustrative to show you what the data looks like in my spreadsheet (not my feature spreadsheet, but the spreadsheet I’m transcribing data out of written texts into)

Column A: written Abkhaz
Column B: Russian translation (as given in monograph)
Column C: English translation (right now with the help of DeepL)
Column D: my eventual (superior) English translation (:P)

Column H: Where I record what the example is given for in the manuscript.*

*The first example is “use of the verb stem as a converb without temporal or other special affixes” which is a bit more straightforward. But then you get reasons like this, “converbs formed with the -ны suffix and the stem of a static verb (and/or the pure stem of a static verb) to indicate additional state with a verbal predicate,” which, as you can see, cover several things at once

Anyway, this is where my data is currently living as I’m transcribing it out of PDFs and into the Excel sheet. (A lot of examples still live in PDFs and computer-generated texts, which is why I’m thinking it might help to use a program to search examples out going forward :thinking:)
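A rough sketch of what such a search could look like, assuming the text has already been pulled out of the PDFs into plain strings (that extraction step is not shown and depends on the tool and on scan quality). The page contents are placeholders, `find_lines` is a hypothetical helper, and `CVB` is the Leipzig Glossing Rules label for converb, which a given source may or may not use.

```python
import re

# Hypothetical stand-in for text already extracted from PDFs, keyed by
# (source name, page number). The lines are placeholders, not real data.
pages = {
    ("grammar_a", 12): "1SG-go-CVB 3SG-see-PST\nsome running prose",
    ("grammar_a", 13): "no glossed examples on this page",
    ("reader_b", 4): "placeholder token: XXXны\n3SG-sit-CVB 3SG-eat-PRS",
}


def find_lines(pages, pattern):
    """Return (source, page, line) for every line that matches the regex."""
    regex = re.compile(pattern)
    hits = []
    for (source, page), text in pages.items():
        for line in text.splitlines():
            if regex.search(line):
                hits.append((source, page, line))
    return hits


# Lines containing the gloss label CVB as a whole word.
gloss_hits = find_lines(pages, r"\bCVB\b")
# Lines containing a word that ends in the Cyrillic string ны.
suffix_hits = find_lines(pages, r"\w+ны\b")

print([(s, p) for s, p, _ in gloss_hits])   # [('grammar_a', 12), ('reader_b', 4)]
print([(s, p) for s, p, _ in suffix_hits])  # [('reader_b', 4)]
```

A suffix search like this only produces candidates: any word that happens to end in the same letters will match, so each hit still needs checking by eye.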

Ultimately, my idea atm is: put examples in searchable spreadsheet with relevant information, analyze examples according to my own spreadsheet, have them be in a database where I can use different filters to search for examples which show, like, “static stem converb with SS coreference” or whatever. xD

Thoughts for January 10, 2022:

Today I was thinking about some potential paper topics and skimmed through at least one sketch grammar of Abkhaz (potentially more after this post) and took note (not for the first time – but in a different light) that “static vs. dynamic” verb stems and “transitive vs. intransitive” verb stems are aspects that come up a lot at least for linguists talking about converbs in Abkhaz (and probably other languages).

I also noted that they do use the term “converb” but they don’t necessarily define what they mean (not unusual). Some of the characteristics they give to describe them are morphological in nature but occasionally also functional (syntactic and semantic). There is at least one linguist that advocates for taking a functional approach to identifying converbs rather than a morphological one.


Thoughts for January 30:

I’ve been away from posting on the thread, but have still been hard at work with converb things!

You may have surmised from my disjointed ramblings that I want to do something vaguely digital with my converb data and my converb feature sheet. Well, it’s happening, thanks very much to our own Dr. Pat – and anyone else on our lovely forum. :slight_smile: You can follow that specific journey here:

What have I been working on otherwise?

Lots of things – but in particular, I got to do a very chill presentation for a student colloquium about my thesis project. It was a good opportunity to think about my project and my goals and how to articulate that. What I also got out of the presentation was a ton of questions to follow up on, which I’m pretty pumped about.

So, apparently, for Abkhaz converbs, when formed with a transitive verb stem, the subject marker (also on the verb) is overall not obligatory (only sometimes). However, when converbs are formed with an intransitive verb stem, the subject marker is generally obligatory. Why is this? Not really sure. Aristava 1960 mentions it briefly and says it has to do with the historical development of something (??) in Abkhaz – and also polypersonalism. Or something??? (It was in Russian; I didn’t pay much attention to it at first, but now my interest is, of course, the most renewed). Currently, I am looking everywhere for this article (cited in Aristava 1960 as a recommendation for a fuller discussion):

К. В. Ломтатидзе. Бессубъектные формы абхазского переходного глагола. Иберийско-кавказское языковедение, том II, Тбилиси, 1948, стр. 1-13.

I haven’t found it yet, but I will.

The other thing that will be followed by a lot of question marks (expressing super serious linguistic interest, naturally) is that intransitive verbs, somehow, all have to mark for a (direct?) object??? Or something??? About objects??? Apparently, you can have an intransitive verb that is also bivalent. This is news to me because in my head valency = transitivity and intransitive = valency of 1. However, my mind was recently blown when a linguist friend who knows more than me said that, actually, transitivity is based on the presence of the direct object. So, I guess this means indirect objects don’t count? Like whatever is happening in this English sentence (which was given to me as an example):

I was given a book.

Looks transitive to me :sweat_smile: But what I’m thinking is perhaps in this structure, “I” is, syntactically, the subject, and “book” is…the indirect object (???) and, therefore, it’s not transitive even though, to me, it looks transitive :woman_shrugging:

So, yes, I was today years old in PhD student-ness when I encountered this phenomenon :trophy:


Transitive and intransitive verbs happened (why is the subject marker obligatory for intransitive verb stem converbs but not necessarily transitive verb stem converbs :face_with_monocle: )

Transitivity is not the same thing as valency (???) also happened


Thoughts for February 11:

The past few weeks I’ve been digitizing the examples of converbs for a whole set of circumstances from Aristava (1960). This has continued to be tedious but important for a.) my understanding of converbs in Abkhaz b.) Russian reading improvement (a little bit with Abkhaz also – still working on this). I’m starting to identify a few gaps (which I probably should have noticed earlier, but when you’re hyperfocused on the text itself, sometimes you miss things xD)

Anyway, so gaps so far (that are listed as not being accounted for but it’s unclear if that means they don’t exist at all or don’t happen to exist in the corpus of texts Aristava draws from):

  • Pure static stem of the verb converb in a negative form
  • -шьҭа–>ҭа converb (dynamic tensed nonfinite) negative form in Abkhaz (it’s apparently in Abaza)
  • Negative forms of pure stem converbs (both static or dynamic) in Tapanta and Ashxara dialects

Of course, another thing that’s interesting about converbs as described by Aristava are the converbs that have no converb morphology and are just a verb stem. I haven’t thought deeply about this yet, but eventually I’ll have to when I decide what I’m going to consider a “converb” or not.

In other news, I did get a hold of this paper:

К. В. Ломтатидзе. Бессубъектные формы абхазского переходного глагола. Иберийско-кавказское языковедение, том II, Тбилиси, 1948, стр. 1-13.

It is, in fact, not in Russian but in Georgian. Now my Georgian professor and I are working on reading it together – super fun and interesting but also intense. I find it’ll be easier to document what we talk about (the analysis, the translation, etc.) if the Georgian text is manipulable by a computer. The only way I know how to do this is to essentially transcribe the text from the PDF into a Google Doc. xD The scan is a bit blurry, and is also Georgian, so this is a time-consuming process. But worth it? ^^


Hi @SarahDopierala - I’m late to this party, but it’s interesting to read through your process. It’s similar to a lot of what I went through reading about serial verb constructions for my dissertation. I’ll attach my SVC review paper, which goes over which features are the most relevant in the SVC literature, in case that’s a useful thing to think about:
Lovestrand 2021 Serial Verb Constructions.pdf (320.5 KB)

I ended up concluding that a similar “multivariate” approach is ultimately the way to think about comparing multiverb constructions across languages but also that a general typology of multiverb constructions is too much for a PhD thesis.


Мшыбзиа, I’m Sammy (they/them), and I’m doing my PhD dissertation on Abkhaz at the moment, it’s great to see more people working on the language! ^^

I work on the phonology, so I don’t necessarily have a lot to say about converbs, but I wanted to let you know about some resources in case you don’t already have them. There is a lot of research on Abkhaz these days and a lot of new material that keeps being published, mainly in Russian. I count at least six book-length grammars from the 21st century alone: Arstaa and Čkadua (2002), Chirikba (2003), Jakovlev (2005), Hewitt (2010), Yanagisawa (2013), and Arstaa et al. (2014). Chirikba, Hewitt, and Yanagisawa are in English, Jakovlev is in Russian, and the other two are in Abkhaz.

There is also the Apsnyteka at http://apsnyteka.org/ with literally thousands of freely available scanned PDFs of materials on the Abkhaz language, culture, history, and anything else you might be interested in. And here is a link to Lomtatidze’s published works, as useful as it is daunting (literally how do you publish this much??).

One interesting thing about the Abkhaz -ны suffix is that it is also often used to form adverbs, as in Ашәа (и)бзианы сҳәоит ‘I sing well’ lit. the.song (it)-good-ны I.say. I wonder if understanding this type of usage is important for understanding converbs in the language generally or if this is something unrelated.

I can’t promise I’ll be on this forum a lot, but I’m always happy to talk about Abkhaz and hear what other people are working on, so feel free to email me if you ever have any questions or thoughts about anything Abkhaz-related: samuel.andersson@yale.edu :slight_smile:


Just poking in to say welcome @samuelandersson, happy to see contacts like this happening!


Thanks a lot for this @joeylovestrand ! You’re definitely not late since I hope to keep the thread going as my project keeps going :wink: Thanks so much for sending me your review paper! I bump up against SVCs and also medial verbs and other sort of nonfinite, complex verb-y things quite a bit, so this will definitely be worth taking a look at. Just be prepared that I might call on you with questions! xD


Thanks a lot for stopping by the thread and for sharing some thoughts and resources! Some of them I’m familiar with, but others I haven’t looked at too deeply, so this is a great motivation to do so. :slight_smile:

RIGHT??? T^T Glad it’s not just me who has this thought :sweat_smile:

It’s good for me that you mentioned the overlap here. It’s for sure interesting that this suffix also forms adverbs – I think, anyway. Honestly, I’m not 100% sure how “important” this is either, so time to look more closely at that relationship. :face_with_monocle:

It’s also cool for me to hear that there’s someone else doing work on Abkhaz (in the English-speaking world, that is, since, as you said, much is happening in the Russian/Abkhaz-speaking world :wink: ) I will definitely drop by your inbox for collegial chatting. ^^
