This project proposes to develop a theoretical framework for compiling a new type of semi-bilingual, corpus-driven reference work for the Bantu language Swahili. Outputs include a PhD dissertation with supporting research articles, a new corpus representative of standard Swahili, a proof-of-concept digital Swahili dictionary, and increased international visibility for the activities of the BantUGent research group.