These are some projects I have been working on.
Sustainable Multilingualism in the South African Context (2014-2019)
My work at the University of South Africa (UNISA) in Pretoria, South Africa, is coming to an end. I was part of the core research group at the Academy for African Languages and Science (AALS), under Prof Laurette Pretorius until her retirement in 2018 and under Nozibele Nomdebevana in 2019. AALS was a strategic project within the School of Interdisciplinary Research and Graduate Studies (SIRGS) under the College of Graduate Studies (CGS) on UNISA's Muckleneuk Campus.
Broadly speaking, my vision is to contribute to the development of language technology for lesser-resourced languages in South Africa. I aim to apply my knowledge of corpora, digitisation and machine translation to this context.
Ph.D. thesis (2008-2013)
In the context of the PaCo-MT project (see below), I researched various approaches to the problem of tree alignment, focusing on the alignment of constituents (non-terminals) and their interplay with existing word alignments and other important features. I investigated the problem from both a statistical and a rule-based perspective. Most notably, I implemented Eric Brill's transformation-based learning algorithm to construct a classifier that can both align trees from scratch and correct errors in the output of statistical alignment.
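As a rough illustration of the general idea behind transformation-based learning, and not the thesis implementation, the sketch below greedily learns rules that correct an initial labelling of candidate node pairs. The features `same_category` and `words_linked`, the toy data, and all function names are hypothetical and chosen only for this example.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """If item[feature] == value, relabel the candidate link as new_label."""
    feature: str
    value: bool
    new_label: bool


def apply_rule(rule, items, labels):
    """Return a new label list with the rule applied wherever it fires."""
    return [rule.new_label if item[rule.feature] == rule.value else label
            for item, label in zip(items, labels)]


def count_errors(labels, gold):
    """Number of positions where the current labels disagree with gold."""
    return sum(l != g for l, g in zip(labels, gold))


def learn_rules(items, initial, gold, max_rules=10):
    """Greedy loop: repeatedly keep the rule that leaves the fewest errors."""
    labels = list(initial)
    # Candidate rules: every observed (feature, value) pair with both outcomes.
    # Sorting by repr only makes tie-breaking deterministic.
    candidates = sorted(
        {Rule(f, v, lab) for item in items for f, v in item.items()
         for lab in (True, False)},
        key=repr,
    )
    learned = []
    for _ in range(max_rules):
        base = count_errors(labels, gold)
        best = min(candidates,
                   key=lambda r: count_errors(apply_rule(r, items, labels), gold))
        if count_errors(apply_rule(best, items, labels), gold) >= base:
            break  # no remaining rule reduces the error count
        labels = apply_rule(best, items, labels)
        learned.append(best)
    return learned


# Toy data (hypothetical): one dict per candidate pair of tree nodes.
items = [
    {"same_category": True, "words_linked": True},
    {"same_category": True, "words_linked": False},
    {"same_category": False, "words_linked": True},
    {"same_category": False, "words_linked": False},
]
initial = [True, True, False, False]  # e.g. output of a statistical aligner
gold = [True, False, True, False]     # manually verified alignments

rules = learn_rules(items, initial, gold)
corrected = list(initial)
for rule in rules:
    corrected = apply_rule(rule, items, corrected)
```

Starting from an all-`False` initial labelling corresponds to aligning from scratch, while starting from the output of another aligner corresponds to error correction; the learning loop is the same in both cases.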
Parse and Corpus-Based Machine Translation (PaCo-MT) (2008-2011)
PaCo-MT was a syntax-based machine translation project on which I worked as a Ph.D. candidate at the University of Groningen, in collaboration with the University of Leuven and OneLiner bvba, from 2008 to 2011. It was sponsored by the STEVIN programme (STE07007) of the Dutch Language Union (Nederlandse Taalunie).
I worked on the creation of richly annotated parallel corpora on a large scale for the language pairs Dutch/English and Dutch/French. Important steps were sentence alignment, word alignment, parsing and constituent alignment. This proved to be a valuable testing ground for the subject of my Ph.D. thesis.
ALEXANDER and Afrikaans WordNet (2006-2007)
During my appointment at the CTexT research centre at the North-West University campus in Potchefstroom, South Africa, I was project leader for the initial construction stages of a wordnet for Afrikaans, my mother tongue. This was also the main topic of my Master's thesis.
Soon after, construction of the Afrikaans lexical database ALEXANDER began, and I was appointed project leader for it as well. The wordnet was later integrated into the database. For more information and to inquire about availability, click here.