Microsoft AI announced three new foundational models on Thursday, emphasizing competitive pricing and practical enterprise value as it refines its strategy under CEO Mustafa Suleyman's leadership. Like other AI companies, Microsoft is under increasing pressure to deliver more revenue that can be directly attributed to AI. Microsoft's more recent stance on AI is reminiscent of OpenAI's as it became known that the latter would be cutting back on "side quests" to prioritize coding and enterprise solutions. OpenAI cited this shift in strategy as the cause of its decision to shut down Sora (both the app and model), deprioritize its browser Atlas, and indefinitely pause the previously announced 'adult mode' for ChatGPT.

Microsoft's long-standing partnership with OpenAI has never kept the company from developing in-house models, although none so far have enjoyed a particularly stellar reception. This was partially due to a clause in the Microsoft-OpenAI partnership, which prevented Microsoft from competing with OpenAI in the development of "superintelligence". Now, a renegotiation of the partnership terms has freed Microsoft to pursue more ambitious AI goals in-house, with Suleyman stating that the quest for "superintelligence" has long been his focus.

The models, which can be seen as the first products resulting from this renewed focus, were developed by Microsoft's MAI Superintelligence team, and include MAI-Transcribe-1 for speech-to-text across 25 languages (2.5x faster than Azure Fast), MAI-Voice-1 for audio generation (producing 60 seconds of audio in one second with custom voice options), and MAI-Image-2 for video generation.

Microsoft is positioning these as cost-effective enterprise solutions. MAI-Transcribe-1 starts at $0.36 per hour, MAI-Voice-1 at $22 per million characters, and MAI-Image-2 at $5-$33 per million tokens. All three are available through Microsoft Foundry with built-in guardrails, governance, and enterprise-grade controls for compliant deployment.

Microsoft and Suleyman are now increasingly careful to distance themselves from the discourse surrounding the term "artificial general intelligence" or AGI, which often carries lofty implications that generative AI will have a profound impact on society, from developing cures and treatments for all known diseases to eliminating the need for work.

Suleyman recently told The Verge that, for him (and Microsoft), "Superintelligence is really about, 'Are these models capable of delivering product value for the millions of enterprises that depend on us to deliver world-class language models?'" Similarly, on the blog post announcing the three models, he writes that: "At Microsoft AI, we're building Humanist AI [...] putting humans at the center, optimizing for how people actually communicate, training for practical use."

Several years into the AI craze started by ChatGPT's meteoric rise to fame back in 2022, the shortfalls of generative AI have become increasingly evident, as the pressure to deliver tangible results continues to increase. It is no wonder that, given this unfavorable landscape, some of the most important players in the industry are quietly and subtly shifting their narratives, from fantasies of utopia and existential risk to promises of enhancing the productivity of both businesses and individual consumers.