Chinese company claims to surpass OpenAI in long text processing

Baichuan2-192k is the latest large language model (LLM) from Baichuan, the startup founded by Wang Xiaochuan, who also founded the popular Chinese search engine Sogou. Wang said the new LLM has a context window that can handle about 350,000 Chinese characters, which the company says makes it the world’s most powerful model for processing long text prompts.

Baichuan founder Wang Xiaochuan. Photo: *Weibo*

A context window is the combined amount of input and output text that a model can process during a conversation with the user. According to Baichuan’s WeChat post, Baichuan2-192k can handle 14 times as much text in its context window as GPT-4, the large language model behind OpenAI’s ChatGPT.
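To make the idea concrete, here is a minimal sketch of how a context window limits a single request: the prompt and the reply must fit in the window together. The function names are hypothetical, and the 192,000-token window is an assumption taken only from the "192k" in the model's name, not a figure stated in the company's post.

```python
def fits_in_context(prompt_tokens: int, max_output_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the requested reply fits in the window."""
    return prompt_tokens + max_output_tokens <= context_window


def remaining_output_budget(prompt_tokens: int, context_window: int) -> int:
    """Return how many tokens are left for the reply once the prompt is counted."""
    return max(0, context_window - prompt_tokens)


# Assumed window size, inferred from the model name only (hypothetical value).
ASSUMED_WINDOW = 192_000

if __name__ == "__main__":
    # A long document leaves less room for the model's answer.
    print(fits_in_context(150_000, 4_000, ASSUMED_WINDOW))
    print(remaining_output_budget(190_000, ASSUMED_WINDOW))
```

The point of the sketch is that a larger window does not only allow longer inputs. It is a shared budget, so a very long document reduces the space left for the answer.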

The company says the LLM now has the world’s largest context window, a title previously held by Claude 2 from Amazon-backed Anthropic, which was introduced in July. Claude 2 can hold about 75,000 English words in its context window, corresponding to hundreds of pages of documents or a book. If Baichuan’s statement is correct, the context window of Baichuan2-192k is nearly five times the size of Claude 2’s.

Baichuan claims its model surpasses Claude 2 in response quality and in its ability to understand and summarize long text. This statement is based on test results from LongEval, a project initiated by the University of California, Berkeley and other US institutions to evaluate how well LLMs handle long text.

According to Wang, Baichuan2-192k is useful for businesses that need to process and create long documents every day, such as those in the legal, media and finance industries. The company is testing the model internally with a number of partners.

However, according to research by scholars from Stanford University and UC Berkeley, processing more information does not necessarily make an AI model better.

Baichuan is not the first Chinese company to claim its LLM surpasses ChatGPT. On October 31, Alibaba said that Tongyi Qianwen, an AI model trained with hundreds of billions of parameters, surpassed OpenAI’s GPT-3.5 and Meta’s Llama2, and “significantly narrowed the gap” with GPT-4. Meanwhile, Zhipu AI, a startup backed by Alibaba and Tencent, last week launched ChatGLM3 with many improvements, including faster inference speed, lower training costs and the addition of a coding assistant.

Bao Lam (according to SCMP)
