Generative AI is proving to be an expensive and resource-intensive endeavor. The true costs of creating, training, and deploying large language models aren't often discussed. Even so, judging by the bigger players in the field and the agreements they strike, it is reasonable to conjecture that consumers are getting locked into heavily subsidized services that only work within those same companies' environments. Speculation about the cost of running ChatGPT has led SemiAnalysis' Dylan Patel and Afzal Ahmad to estimate the computing costs of the popular chatbot at around $700,000 daily. Other calculations indicate that ChatGPT may consume as much electricity as 175,000 people, a figure consistent with SemiAnalysis' estimates of the chatbot's monetary costs.
Looking at raw compute, Lemurian Labs' co-founder and CEO, Jay Dawani, notes that the team found AI developers currently train models requiring exaflops of compute, all while planning models that will require zettaflops. To put that number into perspective, consider that only one true exascale (non-distributed) supercomputer appears in the most recent Top500 list. On top of this, an Nvidia GPU shortage is currently putting pressure on startups and cloud service providers.
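To make the jump from exaflops to zettaflops concrete, here is a minimal back-of-the-envelope sketch (the unit prefixes are standard SI; the comparison, not any of Lemurian's internal figures, is the point):

```python
# SI prefixes: exa = 10^18, zetta = 10^21 floating-point ops per second.
EXAFLOP = 10**18
ZETTAFLOP = 10**21

# How many exascale machines would a zettascale workload demand?
ratio = ZETTAFLOP // EXAFLOP
print(ratio)  # 1000
```

In other words, a zettaflop-scale model would need the equivalent of a thousand of today's exascale supercomputers, of which the Top500 list currently counts exactly one.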
Much of this may be behind the recent trend favoring small, open-source, fine-tuned models as an alternative to enormous black-box generalist LLMs. However, another less popular, almost implausible-sounding strategy for achieving more with less is to reinvent the chips that power every AI operation. Reinventing computing is the path Lemurian Labs has chosen: the startup recently secured $9 million in funding to keep advancing its plan to deliver efficient, sustainable, and affordable AI for everyone.
Their plan involves developing new software, a new data type, and new hardware. In a nutshell, their starting point is that while the raw processing power of GPUs is what attracted AI developers to the chips in the first place, GPUs were never designed for this kind of use. The fact that they have the raw throughput to handle LLMs does not mean they perform the task well. Hence the appeal of reimagining computing from the ground up, optimized specifically for AI development. A more detailed explanation of Lemurian's breakthroughs is included in their origin story.
Dawani recently told TechCrunch that the company is releasing the software part of the stack first, which it hopes to have available by late 2024. Although Dawani and the team are aware of the challenges intrinsic to designing new hardware, they remain hopeful they can follow through within a few years. Lemurian Labs is also counting on a Series A round to grow its team. There is no way to predict how this will go, but should Lemurian Labs pull off everything it's planning, it will effectively revolutionize AI computing as we know it and usher in a more sustainable, efficient, and affordable era in generative AI.