Tag: Llama

  • DeepSeek in Africa

    Earlier this year, a Chinese company called DeepSeek took the AI world by surprise by releasing an incredibly cheap, performant reasoning model called DeepSeek R1. The result was an outpouring of commentary on how Meta’s open sourcing of Llama led to its creation, whether the reported lower training cost for DeepSeek’s model meant the end of NVIDIA’s business, and whether China was overtaking the US in AI technology.

    While those discussions raged, Chinese companies like the telecom infrastructure giant Huawei took the low-cost, open-source DeepSeek model and turned it into a business targeting countries in Africa that have already been the beneficiaries of substantial Chinese investment.

    The result is that not only has China displaced Western companies in providing core telecoms infrastructure in Africa, but Chinese companies also appear to have displaced Western AI offerings (like those from Anthropic, OpenAI, and Google) on the continent. Chinese models (and especially DeepSeek) have become ascendant in Africa for three reasons: they offer lower per-token prices; their technical backbone uses fewer tokens per request (Chinese models employ tokenizers with larger vocabularies to handle multilingual data, which results in fewer tokens for words in non-English languages); and they are offered by partners who have already built much of Africa’s digital infrastructure.
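    To make the token economics concrete, here is a minimal back-of-the-envelope sketch of how a lower per-token price and a lower token count compound. Every number in it is a made-up placeholder chosen only to illustrate the arithmetic; none of them are real prices or measured token counts for any provider.

```python
# Hypothetical illustration only: the prices and token counts below are
# invented placeholders, not real figures for any model or provider.

def request_cost(tokens: int, price_per_million_tokens: float) -> float:
    """Cost of one request: token count times the per-token price."""
    return tokens * price_per_million_tokens / 1_000_000


def cost_ratio(tokens_a: int, price_a: float, tokens_b: int, price_b: float) -> float:
    """Cost of provider B's request as a fraction of provider A's."""
    return (tokens_b * price_b) / (tokens_a * price_a)


# Provider A: tokenizer splits the same non-English prompt into more tokens
# and charges more per token (both values hypothetical).
tokens_a, price_a = 1000, 10.0

# Provider B: larger-vocabulary tokenizer needs fewer tokens for the same
# prompt and charges less per token (both values hypothetical).
tokens_b, price_b = 700, 2.0

ratio = cost_ratio(tokens_a, price_a, tokens_b, price_b)
print(f"Provider B costs {ratio:.0%} of provider A for the same request")
```

    The point of the sketch is that the two advantages multiply: in this invented scenario, a 30% reduction in tokens combined with a price one fifth as high leaves the request costing a small fraction of the alternative.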

    While this has led to some problems (for example, Chinese AI model providers disabled their image recognition systems during the gaokao (高考), China’s annual undergraduate admissions exam), the token economics are difficult to resist for AI adopters in Africa.

    This should be terrifying to Western companies (who are in a fierce competition for AI model supremacy) and especially to Western governments concerned about China’s influence. After all, it’s hard to win any kind of “technology Cold War” if the main AI models being used in the countries with the fastest-growing populations are (a) Chinese models, (b) running on Chinese infrastructure, and (c) pre-packaged with Chinese propaganda. (If you use Eye2.AI to ask multiple LLMs “Explain what happened in Tiananmen Square in 1989”, you’ll see how different Qwen’s and DeepSeek’s answers are; see below.)

    Screenshot from Eye2.AI on a “sensitive subject” for Chinese AI models

    China’s DeepSeek Is Beating Out OpenAI and Google in Africa
    Saritha Rai, Loni Prinsloo, Helen Nyambura | Bloomberg