Welcome to my professional website.

About myself and this website

I have a Ph.D. in Computational Linguistics, obtained on June 24th, 2013 at the University of Groningen under Prof. Gertjan van Noord, who acted as co-promotor with Prof. Dr. ir. John Nerbonne. My thesis work (download here) centres on the use of rule-based and statistical methods to improve syntactic tree-to-tree alignment, with a view to its eventual application in syntax-based machine translation.

I have worked on web app development; trained and applied machine translation models; built and trained pipelines for text digitisation; designed, built, extended, filtered, and aligned corpora, parallel corpora, and (parallel) treebanks; and designed, built, or contributed to various databases (lexical, terminology, place names, etc.). I also have experience with various aspects of language practice such as lexicography, translation, editing, and writing. Here you will find more information about me and my work, including publications as well as downloadable data such as code and annotated corpora.

Post-PhD career

In February 2016, after my postdoctoral fellowship at the University of South Africa (Unisa) in Pretoria, South Africa, I was appointed as a researcher in computational linguistics, as part of the core research group at the Academy for African Languages and Science (AALS), under Prof Laurette Pretorius (now retired). AALS was a strategic project within the School of Interdisciplinary Research and Graduate Studies (SIRGS) under the College of Graduate Studies (CGS) on the Muckleneuk Campus of Unisa.

My work at Unisa included multilingual corpus design and implementation in TEI XML, digitisation solutions, statistical and neural machine translation, the collecting, collating and cleaning of (parallel) corpora, application of various natural language processing tools, as well as assisting and presenting at workshops where we helped participants to develop content for South African languages in Wikipedia. I also supervised postgraduate students.

In 2020, I developed a software environment that performs various checks, queries and updates on an offline SQL version of the African Wordnet database at the Department of African Languages, Unisa.

I am currently developing a place name app for the Department of South African Sign Language and Deaf Studies at the University of the Free State, South Africa. The app will contain place name information in both English and South African Sign Language. It is due to be distributed on Google Play by the end of May 2022.

I was also appointed as a research fellow at the University of the Free State, where I am performing research related to the mobile app project.

I am also working on a terminology web app for the Department of Geography at the University of South Africa. This is an adaptation of the Terminator software and will host various glossaries in English, isiZulu, Sesotho sa Leboa, and Afrikaans. I will link the GitHub repository here once it is ready.

Finally, feel free to look around by navigating the links (for example, here is my résumé). If there are any broken links, I would appreciate a message. You can contact me at:

E-mail: kotzegj [at] ufs [dot] ac [dot] za

Note: Given the urgency of the COVID-19 pandemic, .za domain name holders are requested to link here: www.sacoronavirus.co.za