Typos of a Linguist

This month I’ve been back in Musoma, doing some research to help me complete my Master’s dissertation. (The research is also really helpful for developing the writing system of Simbiti, so it’s great to have this chance to dig into some things in more detail.)

I’ve been having a lot of conversations with Simbiti speakers, recording many hundreds of words and phrases, and then listening carefully to them afterwards.


Recording in the sound studio.

When I listen to the words later, I write them down, paying careful attention to the length of vowels and tone patterns.


However, some of the letters and symbols I need to type are not on a normal English keyboard. I use the IPA – the International Phonetic Alphabet (not India Pale Ale!). I have some settings on my computer that let me switch keyboards and use various key shortcuts to type the different symbols.


For example:

The Simbiti have 7 vowels: i, e, a, o, u … but also ɛ and ɔ. The shortcuts for these are just <e and <o.

They also have a β sound, which is sort of between an English ‘b’ and a ‘w’ (a voiced bilabial fricative if you care to know!), and this is typed with =b.

I try to write high tone where I hear it, which I do with @3 … giving á.


One of my current spreadsheets that I’m using to analyse words.

However, sometimes I forget to switch keyboards before I start typing, or the keyboard gets switched back when I change programme or sleep my computer.

This means that I end up typing long strings of nonsense!

I meant to type:
βɑɾɑɣɑ́mbɑ íʃɔ jɑitéɣeːreːje βúːjɑ   (‘They say that yesterday he listened well’)

But instead, what came out was:
=b=a>r=a@3=g=amb=a i@3=s<o j=aite@3=ge=:re=:je =bu@3=:j=a

It’s a bit frustrating if it takes me a while to notice, because I just have to delete it and type it all again! Whoops!
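
(For the geeks: in principle, nonsense like this could be turned back into IPA rather than retyped. Here is a minimal sketch in Python, purely as an illustration. The first four shortcuts are the ones described above; the rest (=a, >r, =g, =s and =:) are only read off the mistyped line and are assumptions, not a real keyboard definition. The names SHORTCUTS and untangle are made up for this example.)

```python
import unicodedata

# Shortcut -> symbol. The first four are described in the post; the others
# are guesses inferred from the mistyped example, not a real keyboard layout.
SHORTCUTS = {
    "<e": "\u025b",  # ɛ
    "<o": "\u0254",  # ɔ
    "=b": "\u03b2",  # β
    "@3": "\u0301",  # combining acute accent (high tone)
    "=a": "\u0251",  # ɑ  (assumed)
    ">r": "\u027e",  # ɾ  (assumed)
    "=g": "\u0263",  # ɣ  (assumed)
    "=s": "\u0283",  # ʃ  (assumed)
    "=:": "\u02d0",  # ː  (assumed)
}


def untangle(typed: str) -> str:
    """Replace two-character shortcuts with their symbols, reading left to right."""
    out = []
    i = 0
    while i < len(typed):
        pair = typed[i:i + 2]
        if pair in SHORTCUTS:
            out.append(SHORTCUTS[pair])
            i += 2
        else:
            out.append(typed[i])  # an ordinary letter or space: keep it as typed
            i += 1
    # Join accents onto their vowels where a single combined character exists.
    return unicodedata.normalize("NFC", "".join(out))


print(untangle("=bu@3=:j=a"))
```

Run on the last mistyped word, =bu@3=:j=a, this prints βúːjɑ. A real fix would have to use the actual keyboard definition, and would need care with any genuine “=” or “<” in the text.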


Where’s the Small Jerusalem?

During my course in Gloucester, one of the subjects we covered was “Discourse Analysis”. This was essentially a study of how languages say what they mean, and how different languages can say the same thing in different ways (and sometimes mean different things by the same words!).

Let me explain with some examples …


One of my teachers has worked in Mexico.
In one of the languages she works with, there aren’t separate words for village and city. So to describe Jerusalem, they thought they’d try the phrase “big village Jerusalem”, to emphasise that Jerusalem wasn’t a small village.

However, when looking at the translation with mother-tongue translators, the translators genuinely wanted to know, “Oh, where’s the small village Jerusalem?”.

In their language, if you say there is a “big Jerusalem”, then that necessitates that there is also a “small Jerusalem”!!


In English, adjectives can be used either to describe, or to differentiate:

If you say, “Oh, I like those red shoes”, it does not necessarily mean that you don’t like the other, non-red shoes. You could just be describing the shoes in more detail.

But, if you say, “The blue bin gets collected today”, you might mean that today it is the blue bin, and not the black bin, that gets collected.


Back to the language in Mexico:

This discovery had huge implications for Bible Translation!
Consider the following phrases, which in English we might read without thinking:

‘The Good News about Jesus’ – this would mean there was also bad news about Jesus!

‘This is my son, whom I love’ – this would mean God also had a son whom he didn’t love!

‘The one true God’ – is there also a not true God?!

‘He is the God who saves’ – i.e. as opposed to the God who doesn’t save …!


Wow! I was completely amazed at all the examples we were discussing, and the implications if this hadn’t been discovered!

It turned out that the translators had misunderstood parts of the Spanish Bible they had read, because they were not aware that Spanish (like English) could use adjectives and relative clauses just to describe.


I don’t think this is the case for the languages I work with in Tanzania … but it goes to show how careful we need to be to check things are understood in the correct way.

Just one more reason why Linguistics is an essential part of Bible Translation!

The Hat Game.

Sometimes in the Linguistics office we can be rather silly …

There are a number of roles that the linguists here fill, and some of my colleagues have very many “hats”.

Last week we commissioned a colleague (who is very arty and teaches some of the missionary kids here), to represent our many “hats” on the board in our office:


The artist at work.

The only hats I wear are the “Orthography Hat” (note the crazy eyes – a genuine side effect of trying to puzzle out spelling system issues for too long) and the “Linguist Hat” (with the ivory tower representing the peaceful environment of theoretical linguistics).

My colleagues also wear: the “Consultant Hat”; the “EC Hat” (Entity Committee … not sure exactly what they do, but important things for the running of our organisation …); the “Dictionary Hat”; the “Coordinator Hat”; and of course the “Cat in the Hat” for when we are not working!



Most of the linguistics team, together with the artist and our hats. (Our supervisor was not in that day and so is not pictured – perhaps that is why our silly hat discussion occurred!)


Seeing Beauty in the Chaos: Enjoying Imbrication.

(This is a linguistics post, but it has reminded me of the wonderful creativity of God. Skip to the end for this part!)

Lately, one of the problems I have been working on has been figuring out the reasons behind some very strange word endings.

In Kabwa (as in all Bantu languages – the big language family that Kabwa belongs to), pieces can be added onto words to add meaning (this is called “agglutination” – sounds like glue and acts like it too!).

This means that you can have a lot of fun, especially with verbs, adding together different endings to change the meaning slightly.

Sometimes the pieces stick together in a nice row, and it is obvious what pieces have been added.

However, sometimes the sounds in the different pieces like to play around and a totally different ending is created.
This is one of the difficulties I encounter when I’m checking the spelling of some Kabwa words.

One specific ending /-iri/ is used in some past tenses; /-iri/ really likes to play with its neighbours …

(For these examples I used the Kabwa verb /rih/ which means ‘pay’)

The pieces:

The outcome:

  • He has paid
  • He has paid on behalf of someone else
  • He has avenged (or made to pay)
  • He has been paid
  • He has avenged on behalf of someone else
  • He has been paid on behalf of someone else
  • He has been avenged on behalf of someone else


But then it gets more confusing, because sometimes the same ending (or almost the same) can be created from different pieces being added together, without the /-iri/ piece …

The pieces:

The outcome:

  • He should pay on behalf of someone else
  • He should avenge on behalf of someone else
  • He should be paid on behalf of someone else
  • He should be avenged on behalf of someone else


Those of you who have made it this far might be interested to know that the technical term for these pieces of words playing and overlapping is “imbrication”. This term can also be used to talk of actual overlapping in sedimentology, tiling, and surgery – fun!


At first, the mixture of all these different endings seemed like such a mess to me. But now that I can see the pattern, it really does seem sort of beautiful!

It has reminded me that often our lives can seem to be in a bit of a mess; we can’t make sense of everything; we can’t see the reason and the plan behind it all.

But the same God who created the universe, who created the wealth of intricately beautiful languages in the world, he created us. He wove us together and he has woven our lives together. We may not always be able to see the rhyme and reason, but he does, and sometimes that has to be enough.

Psalm 139 talks about this as well (verses 13-16) (NIV):

“For you created my inmost being;
    you knit me together in my mother’s womb.
I praise you because I am fearfully and wonderfully made;
    your works are wonderful, I know that full well.
My frame was not hidden from you when I was made in the secret place,
    when I was woven together in the depths of the earth.
Your eyes saw my unformed body;
    all the days ordained for me were written in your book before one of them came to be.”


You walked or you waaalked? Recording in Kabwa.

“You walked or you waaalked?” – Clearly this doesn’t make much sense in English, but in the Kabwa language the length of a vowel can make the difference between whether something was done earlier today or yesterday.

This month I have been doing some research and recording in the Kabwa language. As well as recording a basic word list (as I did last month in the Simbiti language), I have been asking questions about and recording various verbs.

There are a number of verb forms in Kabwa that can sound very, very similar (or even identical in some cases, even to a mother-tongue speaker!).

The question is, how do we write down these very similar forms?
Does it matter if things that mean something different are written the same?
Won’t the context make it obvious what you mean?
Is it confusing to spell words differently when they sound the same but mean something different? (Or will this cause less confusion in the long run?)

In English we have similar spelling issues all the time:
“I read a book yesterday” vs. “I like to read books every day”
or even, “Did you read that red book about reeds that I said I’d read?”!

But back to Kabwa…

In Kabwa, there is a distinction made between whether an action was done earlier today, yesterday, or before yesterday. It is the first two of these that can sound similar; here are some examples:

(The bha- at the beginning means ‘they’ did it, and the –aa– and –iri parts make it past…)

‘They hid (yesterday)’
Kabwa: bha-aa-bhis-iri
‘They hid (earlier today)’
Kabwa: bha-bhis-iri
(The –bhis– part is the ‘root’ of the verb, here meaning ‘hide’)

Now, with this particular example, it is possible to hear the difference between the two – hooray!
I can even prove it by using some of my favourite sound software (Praat) – I can “see” the word and even “see” the -a- vowel and work out how long it lasts! (Some might call it sad how excited I get about this … I’m not sure what I call it, but it makes me happy!)

D&E12 comparison

The area outlined in red is the -a- vowel; the top image is of the ‘yesterday’ form, the one below is the ‘earlier today’ form – can you see that the top vowel is longer?!

But now to one of the more problematic examples.

In the example above, the main part of the verb was -bhis- ‘hide’.
Problems come up though, when the main part of the verb starts with a vowel, for example –ahur- ‘choose’:

‘They chose (yesterday)’
Kabwa: bha-aa-ahur-iri
‘They chose (earlier today)’
Kabwa: bha-ahur-iri

Hmmm, if you take a look at the ‘earlier today’ form, there are two “a”s next to each other, which makes a long vowel (in the first examples, the ‘earlier today’ form had only one -a-). I was curious whether these two forms would sound different, whether the ‘yesterday’ form would have an extra-long vowel …

D&E35 comparison

Well … here the -a- vowel is almost exactly the same length. Hmmm …
According to Sasi, the Kabwa speaker I was working with, these two forms have “maana moja” – one meaning.

So… does that mean that the ‘yesterday’ and the ‘earlier today’ forms are the same when verbs begin with a vowel?
Can you not make the same distinctions as you can with other verbs that begin with a consonant?

Well, no, they don’t just merge into one.
If you specify what kind of thing they chose, then you can hear the difference!

‘They chose it (yesterday)’
Kabwa: bha-aa-yi-ahur-iri
‘They chose it (earlier today)’
Kabwa: bha-yi-ahur-iri
(The –yi– part is the ‘it’, for example a chicken)

In both examples, the -yi- ‘it’ splits up the -a- vowels. This means that you can tell that the ‘yesterday’ form has a long -a- before the ‘it’, and the ‘earlier today’ form has only a short vowel before the ‘it’!

D&E37 comparison

The two red boxed sections are the -a- vowels either side of the -yi-.

It is clearly visible that the top example has two long -a- vowels either side of the -yi-, whereas the lower example has a very short -a- before the -yi- but a longer -a- afterwards.

Hooray! So it is possible to differentiate between the two, even when the verb starts with a vowel!
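
(One way to see the pattern on paper is simply to count how many written a’s sit in a row at the start of each form. Here is a small sketch in Python, assuming the hyphenated spellings used in this post; the function name count_a_run is made up. Counting letters on the page is of course not the same as measuring sound in Praat, so treat it as an illustration only.)

```python
import re


def count_a_run(pieces: str) -> int:
    """Count the a's in a row straight after the initial 'bh' of a hyphenated form."""
    word = pieces.replace("-", "")     # stick the pieces together into one word
    match = re.match(r"bh(a+)", word)  # the run of a's that follows 'bh'
    return len(match.group(1)) if match else 0


# The six forms quoted in this post, as written with hyphens.
forms = {
    "hid (yesterday)": "bha-aa-bhis-iri",
    "hid (earlier today)": "bha-bhis-iri",
    "chose (yesterday)": "bha-aa-ahur-iri",
    "chose (earlier today)": "bha-ahur-iri",
    "chose it (yesterday)": "bha-aa-yi-ahur-iri",
    "chose it (earlier today)": "bha-yi-ahur-iri",
}

for meaning, form in forms.items():
    print(meaning, form, count_a_run(form))
```

The ‘hide’ forms come out as 3 against 1, and the ‘chose it’ forms also as 3 against 1, which fits the difference that shows up in the recordings. The plain ‘choose’ forms come out as 4 against 2: both are already written with a long run of a’s, and those are the two that sounded the same.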

The problem still remains for the earlier example, the one where you can’t hear the difference.
Do we write it differently so that the difference is obvious when it is written, even though it sounds the same? Or do we write it the same because it sounds the same and risk some confusion? Which will cause Kabwa speakers less confusion when they are learning to read and write Kabwa?

Well, I don’t have all the answers! But fortunately I’m not working on my own. There are other linguists here with whom I can discuss these things, as well as two Kabwa translators and many Kabwa speakers nearby whom I can ask when I have more questions.

Now I should get back to listening to all the sound files at my desk, which is covered with rather a lot of papers as you can see!


(And congratulations if you made it all the way through this post!)


Recording in Simbiti – words, words, words!

Last week I spent a few days working with a man called Julius, to record a basic word-list in his mother tongue, Simbiti.

We used a 51-page list of Simbiti words which had been collected and written down a number of years ago, but never recorded.

First Julius and I checked through 2 pages at a time, making sure he agreed with how the Simbiti word was written down, but also checking that it had been given the correct Swahili translation.

Finding a correct translation was quite difficult at times. One word we discussed was ‘ukusiighiitya’, which means to rub something lightly. However, Julius was not happy with the Swahili words given; one, ‘kuchua’, means to rub roughly or to chafe, and the other, ‘kusugua’, means to clean something by rubbing or to scrub. Julius therefore asked me what the English word given was, and I apologetically explained that it was simply ‘to rub’! We didn’t manage to come up with a suitable Swahili translation; I’m still thinking on it.

We also had fun discussing a number of words to do with blowing:

  • ‘ukuhuuta’ means simply ‘to blow’.
  • ‘ukughwesya’ means ‘to blow something causing it to fall or drop’ (which Julius explained was different to ‘to blow something down’ – ‘ukuhuuta keghwe hanse’).
  • ‘okohaanyora’ describes the wind blowing the thatch of a roof.


After checking through two pages, we then recorded those words; Julius repeated each word 2 or 3 times.

We did the recording in our little recording studio here at the office. There’s a main room with a desk, where I sat with my laptop and recording equipment. Julius sat in the next-door room, which is very well insulated against outside noises. There were some small holes in the connecting wall, through which I threaded the wires from his microphone to my recorder!



So now we have over 1800 words recorded in the Simbiti language! This is really helpful for us when we are analysing the sounds in the language. It is also helpful for other linguists who are interested in the languages here.

(Now all that remains is for me to cut and edit the sound files … so far I’m 6 pages down!)


I thank God for willing, friendly, helpful people like Julius who are passionate about their languages.

Aren’t languages amazing in their variety?! How wonderful of God to create us with some of his creativity and innovation in us!


Words, words, words…

Over the last few weeks the Linguistics department have been running a Rapid Word Collection workshop in a nearby village called Mmazame. We were working with the Kabwa community (one of the language groups that we’re involved with here in Musoma), with the aim of collecting as many Kabwa words as possible!

The long-term aim is to create a dictionary from the resulting data. Making a dictionary is incredibly useful, not only because it helps to standardise the orthography (the way the language is written), aids literacy work and is a wonderful tool in translation, but also because it raises the status and value of the language in the eyes of the wider community! (Read more at: http://rapidwords.net/)


Here’s a summary of how the past few weeks went:

The first week was used to train a selection of the Kabwa participants in how to write Kabwa, and in how the word collection process works.


The training was reviewed at the start of week 2 for all of the participants.

The participants were split into 6 groups, with each group including a person who had attended the training:


We use a questionnaire that is divided up into 9 main sections (“semantic domains”), with each section divided into many gradually more specific sections.


An explanation of the way that the topics are divided up was included in the training.

We had translated the questionnaire into Swahili, which was printed out section by section and put into corresponding (colour-coded and labelled!) folders.
(My urge to colour-code will not come as a surprise to many of you…!)

The questions in the computer database.

Don’t the colours make it look fun?!

Each group was given a folder, and then discussed the enclosed topics and questions, writing down the Kabwa words that related to that topic:

A printed questions sheet to prompt discussions in the word-collection groups.

Intense discussions going on…

After the folder had been handed back, the words were counted.
It was fascinating which topics collected the most words: “Cattle” had 28 words, “Bird” had 53, but “Fish” only 14 … The Kabwa do not live on Lake Victoria; perhaps the language communities on the lake would have many more for “Fish” but fewer for “Cattle”?

The Kabwa word sheets were then passed on to a group of translators – Kabwa speakers with particularly strong Swahili – who wrote down the corresponding Swahili meanings:


On the left, the Kabwa words; on the right, the Swahili definitions.

Next, the Kabwa-Swahili word sheets were passed on to the data-entry group (after the words had been counted, again!). We entered both the Kabwa word and the corresponding Swahili “gloss” (definition) into the computer database, making sure the words went into the correct section of the questionnaire:

Flex word entry

data entry team

The word lists were then counted (again!) in case any alterations had been made.

At the end of each day, the groups would share some of their favourite Kabwa words, usually involving a lot of animated discussion and laughter!

sharing words

Some of my favourite examples were:
okunuura – to take your clothes off in a fit of rage.
ekinokonoko – the little corner part of your eye, next to your nose.


This process continued for two weeks, by the end of which 8880 words had been collected!!

We measured our progress throughout the 2 weeks using this jar-chart! It was exciting to see the total adding up.

On the Friday congratulatory speeches were given, and each participant was presented with a certificate and a few pages of the printed word-list:



The print-out of the word list

It already looks like a dictionary!

In the fourth and final week of the workshop, we were a much smaller group, with only 8 Kabwa speakers.

This week was for checking and “cleaning” the data – removing any duplicates, trying to put words and phrases in the correct grammatical form for a dictionary, checking spelling …

A sheet corrected by the checkers.

At the end of all this we ended up with a final word-count of 6134 words!! Many of the entries include multiple senses of one word, so the number of definitions is even higher!
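
(To picture what the cleaning does to the numbers, here is a toy sketch in Python. The rows are invented for illustration: the real database holds Swahili glosses, and “example-word” is a placeholder, not a Kabwa word. The function name clean is also made up. It drops repeated rows and groups the remaining glosses under one entry per word, which is how the entry count can fall while the number of definitions stays higher.)

```python
def clean(rows):
    """Group (word, gloss) rows into one entry per word, dropping repeated glosses."""
    entries = {}
    for word, gloss in rows:
        senses = entries.setdefault(word, [])
        if gloss not in senses:  # skip a gloss that was already collected
            senses.append(gloss)
    return entries


# Invented toy rows, not real workshop data.
rows = [
    ("okunuura", "to take your clothes off in a fit of rage"),
    ("okunuura", "to take your clothes off in a fit of rage"),  # collected twice
    ("ekinokonoko", "the little corner part of your eye, next to your nose"),
    ("example-word", "first sense"),   # placeholder word with two senses
    ("example-word", "second sense"),
]

entries = clean(rows)
n_entries = len(entries)
n_senses = sum(len(senses) for senses in entries.values())
print(len(rows), "rows ->", n_entries, "entries,", n_senses, "definitions")
```

Here 5 collected rows shrink to 3 entries, but those entries still carry 4 definitions between them. The real checking also involved spelling and grammatical form, which this sketch leaves out.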

There’s still a lot more work to be done with the data before an actual dictionary is published, but this was a pretty good start!


During the workshop we also had a number of other things happening:

One day a member of the literacy team came and sold some books that have already been printed in Kabwa, as well as some of the other local languages. There was such a demand for them that we had to bring back more on the following days!

What a crowd! And these are just the participants from the workshop!

The book stall.

Our vernacular media specialist also came to visit.
He was able to show and distribute various recordings in Kabwa, as well as other languages.



Selling CDs.

A couple of times he and someone from the translation department went to the local market, and were able to sell many books and CDs, often in exchange for fruit rather than money!

Wherever they went, they always drew a crowd:

Michiel and kids

There was also the opportunity for some recording.
Here he is working with the man who voiced the narrator in a recording of the book of Acts in Kabwa.


All in all, a very interesting, exciting and encouraging few weeks!

One day we found an unusual paperweight! Don’t worry, he soon went back outside.

Our usual lunch-time hang-out location.

Phonetics or Phonology?

This is a question I have sometimes been confused by, but having now studied both I thought I’d attempt an explanation of the differences, for myself as much as for you!

(Warning: the following post may contain geekiness)

Phonetics – this is the study of speech sounds.

During the first part of my course we had two hours a day of Phonetics. It involves making a lot of strange sounds, ranging from “uoaei” and “daa ada taa ata”, to practising gargling  (a uvular trill [ʀ], which I’m rather proud of) and sticking your fingers down your throat to stimulate a constriction of the pharynx (yeah, thanks for that one!).

We learnt (or tried to learn…!) to recognise and produce and transcribe the most common sounds in the world’s languages (and some of the not so common) as laid out in the International Phonetic Alphabet.

I found it all very exciting (yes, I know … geek); it’s simply incredible the range of sounds that can be made with the human vocal apparatus!
What I find really interesting though are the tiny adjustments that can be made, which are recognised as distinct sounds in some languages (but heard as the same sound in other languages)…

Say the English words “pin” and “spin” to yourself. Does the “p” sound the same to you?

I’m guessing you’ll say “Yes”? Well it’s not!
In “pin” the “p” has an extra puff of air after it, known as ‘aspiration’, so this sound is written phonetically as [pʰ].
In “spin” the “p” is not aspirated, so it is written simply [p].

Aspirated and unaspirated sounds are heard by most English speakers as the same sound, whereas in many languages (such as Hindi) they are recognised as separate, distinct sounds.

… which leads us onto …


Phonology – this is the study of the relationships between sounds within a language, or between different languages.

Once you have studied a language phonetically, and noted down all the tiny details of the sounds and the differences between them, you then move to phonology and work out how the sounds fit together in the language.

So, for example in English, although a phonetician would notice that English contains both the aspirated [pʰ] and the unaspirated [p], with phonology you’d see that actually [pʰ] occurs only at the beginning of words, and [p] does not occur at the beginning. Therefore (taking into account other things as well), they are just variations of one “phoneme” /p/ (a group of similar sounds) and they are recognised as one sound by native speakers.
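
(That rule is simple enough to write down as a tiny sketch in Python. It follows the simplified rule as stated here, aspirated at the very start of a word and plain elsewhere, and it treats ordinary spelling as a stand-in for sounds, which is a big assumption. The function name pronounce_p is made up for this example.)

```python
def pronounce_p(word: str) -> str:
    """Write each 'p' as aspirated at the start of the word, plain elsewhere."""
    sounds = []
    for position, letter in enumerate(word):
        if letter == "p" and position == 0:
            sounds.append("p\u02b0")  # [pʰ]: p plus the raised h for aspiration
        else:
            sounds.append(letter)     # everything else is left as spelled
    return "[" + "".join(sounds) + "]"


print(pronounce_p("pin"))
print(pronounce_p("spin"))
```

This prints [pʰin] and [spin]: one /p/ underneath, two different sounds on the surface, chosen entirely by position.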

Well, that probably made things clear as mud. But I hope you enjoyed it; I certainly do!
(If you’re wondering why I’m studying this, please click here, or ask me!)