Reka Core is a competitive multimodal LLM accepting images, videos, and audio as input

Reka Core is a multimodal LLM that performs comparably to GPT-4 on the MMMU benchmark, outperforms Gemini Ultra on the Perception Test, and ranks above Claude 3 (Opus) and Gemini Pro 1.0 in a third-party blind human preference evaluation. Reka Core offers multimodal (image, video, and audio) understanding, a 128K-token context window, advanced reasoning, coding and agentic capabilities, multilingual skills gained from training on 32 Asian and European languages, and flexible deployment via API, on-premises, or on-device. Its first version is available starting April 15, and all of these features are detailed in the technical report and showcased in the example outputs.

According to Reka, Core is one of only two commercially available models that accept images, videos, and audio as input. The model will reach a wider audience partly through Reka's partners: Snowflake, Oracle, and AI Singapore. Reka Core will be accessible through Snowflake Cortex and Oracle Cloud Infrastructure. Additionally, Reka's partnership with AI Singapore contributes to the development of high-quality open-source models specialized for Southeast Asia.