Stanford University team apologises for copying Chinese AI model
Llama 3-V was accused of plagiarising MiniCPM-Llama3-V 2.5, the open-source work of Chinese scientists.
This article was originally published in the South China Morning Post. We are pleased to share it with you, our readers, as it has sparked significant discussion on Chinese social media and highlights the breaking down of stereotypes about innovation in China (see our previous article).
To delve deeper into the latest developments in Chinese AI models, look out for our comprehensive intelligence report, launching soon on our website.
A Stanford University team has apologised after being accused of plagiarising the open-source work of Chinese scientists to create a new artificial intelligence model.
The AI model, called Llama 3-V, drew global attention for its powerful performance when it was launched on Wednesday last week.
But on Sunday, two Stanford students involved in the project admitted that “our architecture is very similar” to another model, MiniCPM-Llama3-V 2.5.
“We want to sincerely apologise to the original authors,” Stanford computer science undergraduates Aksh Garg and Siddharth Sharma said in a statement posted on X on Monday.
They said the original model had been taken down.
Llama 3-V was developed by Garg, Sharma, and another researcher, Mustafa Aljadery, who is not from Stanford. The three researchers did not immediately respond to requests for comment.
Launching Llama 3-V last week, they claimed it could be trained to rival the performance of cutting-edge AI models such as GPT-4V, Gemini Ultra, and Claude Opus at a cost of just under US$500.
Soon after its release, Llama 3-V made it into the top five trending lists on Hugging Face, a popular artificial intelligence platform.
But questions were raised within the AI community over whether a large part of the new model might have been stolen from MiniCPM-Llama3-V 2.5. That model was jointly developed by Tsinghua University’s Natural Language Processing Lab and ModelBest, a Beijing-based AI start-up founded in 2022.
Content posted by one whistleblower on the open-source platform GitHub suggests the model structure and code of the two projects are almost identical.
Liu Zhiyuan, co-founder of ModelBest, said in a WeChat post on Monday that he was “relatively sure” that the new model had stolen from their project.
He said MiniCPM-Llama3-V 2.5 had an embedded feature: it could identify bamboo slips from the Warring States Period (about 475-221 BC).
In 2008, Tsinghua University acquired 2,500 bamboo slips—Chinese texts written on strips of bamboo—from this period.
Liu’s team scanned and annotated the texts verbatim to create a dataset for training. That dataset is not publicly available, but the Llama3-V model showed the same recognition ability, according to Liu.
“Even the wrong cases are the same,” he said.
Liu said rapid development of AI could not be achieved without global open-source sharing of algorithms, data, and models. He noted that their model had used the latest open-source Llama 3 from Meta as a base.
But he said the cornerstones of open-source sharing were adhering to protocols, trusting other contributors, and respecting and acknowledging the work of pioneers, which the Stanford team had “seriously undermined”.
In Monday’s statement, Garg and Sharma, the two Stanford students, said the third team member, Aljadery, had written all the code for the project.
“We apologise to the authors and take full responsibility for not doing the diligence to verify the originality of this work,” they said.
In a post on X on Tuesday, Christopher Manning, a professor of computer science and linguistics at Stanford University and director of the Stanford Artificial Intelligence Laboratory, said he did not have any knowledge of the case. “‘Fake it before you make it’ is an ignoble product of Silicon Valley,” he added.
The case has caused a stir on social media, particularly in China, where it topped the list of hottest topics on Weibo on Tuesday. It has also prompted a broader discussion of China’s progress in artificial intelligence.
“The community keeps ignoring the Chinese ML ecosystem work. They are doing amazing stuff with interesting LLMs, VLMs, audio, and diffusion models.
Qwen, Yi, DeepSeek, Yuan, WizardLM, ChatGLM, CogVLM, Baichuan, InternLM, OpenBMB, Skywork, ChatTTS, Ernie, HunyuanDiT, etc.”
— Omar Sanseviero
Lucas Beyer, a researcher at AI research lab Google DeepMind, commented in a post on X that “such a good model” already existed—MiniCPM-Llama3-V 2.5—but had received a lot less attention because it was not from an Ivy League university but from a Chinese lab.
In his WeChat post, Liu from ModelBest acknowledged the “significant” gap between China’s generative AI models and top-tier Western projects such as Sora and GPT-4. But he said China had rapidly gone “from a nobody more than a decade ago to a key driver of AI technology innovation”.
MiniCPM: https://arxiv.org/abs/2404.06395