About Gideon Kotzé

These are some projects I have been working on.

South African Place Names App (12/2020-05/2022)

I am developing a mobile app at the University of the Free State's Department of South African Sign Language and Deaf Studies. It will contain a selection of place names from Raper, Möller, and Du Plessis' Dictionary of Southern African Place Names (4th edition, 2014), along with videos of the place names in South African Sign Language. The app is to be distributed on Google Play by the end of May 2022.

Terminology web app at the University of South Africa (2021)

I am also developing a web application that hosts a selection of term glossaries for the Department of Geography at the University of South Africa. It is an adaptation of the Terminator software and will host various glossaries in English, isiZulu, Sesotho sa Leboa, and Afrikaans. I will link the GitHub repository here once it is ready.

Sustainable Multilingualism in the South African Context (2014-2019)

During this period, I worked at the University of South Africa (UNISA) in Pretoria, South Africa. I was part of the core research group at the Academy for African Languages and Science (AALS) under Prof Laurette Pretorius until her retirement in 2018 and under Nozibele Nomdebevana in 2019. AALS was a strategic project within the School of Interdisciplinary Research and Graduate Studies (SIRGS) under the College of Graduate Studies (CGS) on UNISA's Muckleneuk Campus. Project work included machine translation, digitisation, and corpus design and creation.

Ph.D. thesis (2008-2013)

In the context of the PaCo-MT project (see below), I researched various approaches to the problem of tree alignment, focusing on the alignment of constituents (non-terminals) and their interplay with existing word alignments and other important features. I investigated the problem from both a statistical and a rule-based perspective. Most notably, I implemented Eric Brill's transformation-based learning algorithm to construct a classifier that can either align trees from scratch or correct errors in the output of statistical alignment.

Parse and Corpus-Based Machine Translation (PaCo-MT) (2008-2011)

This was a syntax-based machine translation project on which I worked as a Ph.D. candidate at the University of Groningen, in collaboration with the University of Leuven and OneLiner bvba, from 2008 to 2011. It was sponsored by the STEVIN programme (STE07007) of the Dutch Language Union (Nederlandse Taalunie).

I worked on the creation of richly annotated parallel corpora on a large scale for the language pairs Dutch/English and Dutch/French. Important steps were sentence alignment, word alignment, parsing and constituent alignment. This proved to be a valuable testing ground for the subject of my Ph.D. thesis.

ALEXANDER and Afrikaans WordNet (2006-2007)

During my appointment at the CTexT research centre on the Potchefstroom campus of North-West University, South Africa, I was the project leader for the initial construction stages of a wordnet for Afrikaans, my mother tongue. This was also the main topic of my Master's thesis.

Soon after, construction of the Afrikaans lexical database ALEXANDER began, and I was appointed project leader for it as well. The wordnet was later integrated into the database. For more information and to inquire about availability, click here.

Gideon J. Kotzé
