BEIJING: Chinese startup Moonshot has introduced its latest generative artificial intelligence (AI) model, “Kimi K2 Thinking,” claiming it outperforms OpenAI’s GPT-5 in agentic abilities.
The company says the model understands user intentions better than rival chatbots and can act on them without requiring detailed, step-by-step prompts.
Moonshot, which is backed by Alibaba, built the new model on the K2 model it launched in July.
Amid this latest development, Nvidia CEO Jensen Huang has once again urged the United States to accelerate its efforts in the ongoing competition with China over artificial intelligence advancement.
At the same time, several major American firms, including Airbnb, have begun publicly endorsing certain Chinese AI models as equally capable—and often more cost-effective—alternatives to those produced by OpenAI.
Despite US restrictions on Chinese companies’ access to advanced computer chips, companies such as DeepSeek have released open-source AI models whose usage fees are considerably lower than ChatGPT’s.
DeepSeek claims to have invested $5.6 million in developing its V3 model, a stark contrast to the billions reportedly spent by OpenAI.
According to various tech sources familiar with the matter, the Kimi K2 Thinking model cost $4.6 million to train.
Moonshot claims that Kimi K2 Thinking can automatically select from 200 to 300 tools to complete tasks on its own, reducing the need for human intervention.
The model can process up to 256,000 tokens in context length—equivalent to roughly 200,000 words—and has reportedly achieved a score of 87.3% on the Massive Multi-discipline Multimodal Understanding (MMMU) benchmark for multimodal reasoning.
However, these results have yet to be independently verified. Moonshot intends to make Kimi K2 available through its existing chatbot interface and API, offering tiered pricing that is estimated to be 30–40% lower than OpenAI’s rates.
Last month, DeepSeek unveiled a new AI model that reportedly improves performance by using visual cues to expand the amount of information it can process at once.