The market of foundational generative AI models — those that are powerful and capable enough to serve a broad swath of use cases, from coding to content generation — is getting more crowded by the day.
But Israeli startup Deci is hoping to make a splash in the industry by targeting one very specific and difficult goal: efficiency.
Today, the four-year-old company delivered a flurry of blows to its competitors, launching a duo of open-source foundation models (DeciDiffusion 1.0, a text-to-image generator, and DeciLM 6B, a text-to-text generator) as well as a software development kit (SDK) called Infery LLM, which will allow developers to build applications atop the models. All of them are intended for commercial and research purposes.
Efficiency gains and cost savings
Deci’s entire mission is achieving new standards of efficiency and speed for generative AI inference, the stage at which deployed models actually serve users. The company says DeciDiffusion is three times faster than direct competitor Stable Diffusion 1.5, while DeciLM 6B is 15 times faster than Meta’s LLaMA 2 7B.
“By using Deci’s open-source generative models and Infery LLM, AI teams can reduce their inference compute costs by up to 80% and use widely available and cost-friendly GPUs such as the NVIDIA A10 while also improving the quality of their offering,” reads the company’s press release.
With many in Silicon Valley discussing the apparent shortage of suitable graphics processing units (mostly from market leader Nvidia) for training and deploying AI models, Deci’s move to offer more power- and cost-efficient models (a pair of them) and an SDK appears to be well timed.
Deci highlights cost savings in its blog post on DeciDiffusion, writing that it “boasts an impressive reduction of nearly 200% in production costs,” compared to Stable Diffusion 1.5, as well as “costing 70% less than Stable Diffusion for every 10,000 images generated.”
Attacking the competition by rebuilding it with AutoNAC
Deci says it is able to achieve these awe-inspiring results through its proprietary Automated Neural Architecture Construction (AutoNAC) technology, a form of neural architecture search that essentially analyzes an existing AI model and constructs an entirely new one made up of small models “whose overall functionality closely approximates” the original model, according to a Deci white paper on the tech.
“The AutoNAC pipeline takes as input a user-trained deep neural network, a dataset, and access to an inference platform,” the white paper states. “It then redesigns the user’s neural network to derive an optimized architecture whose latency is typically two to ten times better—without compromising accuracy.”
In other words, Deci’s tech can look at whatever models your business or organization currently has deployed, and then completely redesign them to run far faster and more efficiently, vastly reducing the cloud server costs you would have incurred by running the original, larger model.
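AutoNAC itself is proprietary, so its internals are not public. As a rough illustration only, the general idea of searching for a faster architecture that stays within an accuracy budget can be sketched as below. Everything here is an assumption made for the example: the `Candidate` fields, the two scoring formulas, and the brute-force loop are invented stand-ins, and a real system would measure latency on the target hardware and accuracy on the user’s dataset.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Candidate:
    """A hypothetical architecture described only by depth and width."""
    layers: int
    width: int


def estimated_latency_ms(c: Candidate) -> float:
    # Made-up proxy: cost grows with depth and with the square of the width.
    return c.layers * (c.width / 1024) ** 2 * 1.5


def estimated_accuracy(c: Candidate) -> float:
    # Made-up proxy: more capacity helps, with diminishing returns.
    capacity = c.layers * c.width
    return 1.0 - 1.0 / (1.0 + capacity / 8192)


def search(baseline, layer_options, width_options, max_accuracy_drop=0.02):
    """Return the lowest-latency candidate that stays within the accuracy budget."""
    floor = estimated_accuracy(baseline) - max_accuracy_drop
    best = baseline
    for layers, width in product(layer_options, width_options):
        cand = Candidate(layers, width)
        if estimated_accuracy(cand) < floor:
            continue  # loses too much accuracy, reject
        if estimated_latency_ms(cand) < estimated_latency_ms(best):
            best = cand
    return best


baseline = Candidate(layers=32, width=4096)
best = search(baseline, layer_options=[16, 24, 32], width_options=[2048, 3072, 4096])
print(best, estimated_latency_ms(baseline), estimated_latency_ms(best))
```

With these toy formulas the search keeps the depth and narrows the width, trading a small estimated accuracy drop for lower estimated latency. The numbers say nothing about how Deci’s models actually behave.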
In the case of DeciDiffusion and DeciLM 6B, the models were developed by training on Stable Diffusion 1.5 and Meta’s LLaMA 2 7B, respectively. Deci took advantage of both open source models, applied its own proprietary training architecture to them, and created new, faster, more efficient models that do the same things.
Because Deci’s models are also open source, they are free to use, even for commercial purposes. So how does the company plan to monetize? It’s charging for the SDK, of course.
“Infery-LLM SDK requires a subscription,” wrote a Deci spokesperson to VentureBeat via email. “Teams can use our open source models with any tool they want and enjoy better performance compared to other models. But to maximize the speed and efficiency to the fullest they can get access to Infery-LLM SDK to optimize and run the models in any environment they choose.”
DeciDiffusion 1.0 “was trained from scratch on a 320 million-sample subset of the LAION dataset,” and “fine-tuned on a 2 million sample subset of the LAION-ART dataset,” and achieves quality comparable to Stable Diffusion 1.5 with 40% fewer iterations.
When it comes to DeciLM 6B, the model includes:
- 5.7 billion parameters
- 32 layers
- 32 heads
- 4,096-token sequence length
- Hidden size of 4,096
- Variable Grouped-Query Attention (GQA) mechanism
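To see why grouped-query attention matters for inference cost, here is a back-of-the-envelope sketch of the key-value cache a model must hold in memory for one full-length sequence. It uses the layer, head, hidden size, and sequence length figures listed above. The rest is assumed for illustration: that each head’s dimension is the hidden size divided by the head count, that values are stored in 16-bit precision, that “variable” means the number of key-value heads can differ from layer to layer, and the per-layer split itself, which is hypothetical and not Deci’s published configuration.

```python
def kv_cache_bytes(kv_heads_per_layer, head_dim, seq_len, bytes_per_value=2):
    """Memory needed to cache keys and values for one sequence.

    The leading factor of 2 counts one key tensor plus one value tensor per layer.
    """
    return 2 * sum(kv_heads_per_layer) * head_dim * seq_len * bytes_per_value


LAYERS = 32
HEADS = 32
HIDDEN = 4096
SEQ_LEN = 4096
HEAD_DIM = HIDDEN // HEADS  # assumed: 4096 / 32 = 128

# Standard multi-head attention: every query head has its own key-value head.
multi_head = [HEADS] * LAYERS

# Hypothetical variable GQA layout: fewer key-value heads, varying by layer.
variable_gqa = [4] * 8 + [2] * 16 + [1] * 8

mha_bytes = kv_cache_bytes(multi_head, HEAD_DIM, SEQ_LEN)
gqa_bytes = kv_cache_bytes(variable_gqa, HEAD_DIM, SEQ_LEN)

print(f"multi-head:   {mha_bytes / 2**20:.0f} MiB")
print(f"variable GQA: {gqa_bytes / 2**20:.0f} MiB")
```

Under these assumptions the cache shrinks from 2,048 MiB to 144 MiB per sequence. A smaller cache leaves room for larger batches on the same GPU, which is one way an attention design can translate into throughput gains.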
DeciLM 6B was trained on the SlimPajama dataset using Deci’s AutoNAC methodology, and then “finetuned on a subset of the OpenOrca dataset” to create an even faster, smaller, and more efficient model called DeciLM 6B-Instruct, designed for following short prompts. Both DeciLM 6B and DeciLM 6B-Instruct are available now from Deci.
Both DeciDiffusion 1.0 and DeciLM 6B are “intended for commercial and research use in English and can be fine-tuned for use in other languages,” according to their HuggingFace documentation.
VentureBeat’s initial test of the DeciDiffusion 1.0 demo produced mixed results: like Stable Diffusion 1.5, the model struggled on the first try with more complex prompts containing multiple elements.
Meanwhile, VentureBeat’s brief test of the DeciLM 6B-Instruct model on HuggingFace yielded more impressive results, delivering mostly accurate summaries of history and a legible cover letter, as seen in the screenshots below.
Clearly, Deci hopes to make a compelling offering to enterprises considering open source LLMs and foundation models for their businesses, as well as to the research community, by building upon and advancing from current open source AI models. Whatever happens, it’s an exciting and fiercely competitive time in open source AI, and generative AI more broadly.