BLOOM, a new, fully open-source mega LLM, is one of the most important AI models released to date, and you can now deploy it as an InferenceService on CoreWeave. Click here to see how.
Why This Model Matters: Traditionally, large and powerful LLMs, like OpenAI’s GPT-3, have been restricted from public access, limiting both innovation and AI research. With the launch of LLMs like EleutherAI’s GPT-NeoX-20B and now BigScience’s 176-billion-parameter BLOOM, the AI community has access to extraordinarily powerful, open-source models.
BLOOM is the work of more than 1,000 researchers from around the world, collaborating with institutions like Hugging Face, the French government and the Montreal AI Ethics Institute, with the goal of ensuring that AI research is open, inclusive and conducted for the betterment of humanity.
At 176 billion parameters, BLOOM is larger than OpenAI’s 175-billion-parameter LLM, GPT-3.
BLOOM is an autoregressive LLM, trained to continue text from a prompt on vast amounts of data using industrial-scale computational resources. As a result, it can output coherent text in 46 natural languages and 13 programming languages that is hardly distinguishable from text written by humans. BLOOM can also be instructed to perform text tasks it hasn't been explicitly trained for by casting them as text-generation tasks.
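As a minimal sketch of what that looks like in practice, the snippet below prompts BLOOM through the Hugging Face transformers library. It uses the smaller bigscience/bloom-560m checkpoint so it runs on modest hardware; the full bigscience/bloom model uses the same interface but requires multi-GPU serving infrastructure. The specific prompts are illustrative, not from the original announcement.

```python
# Minimal sketch: prompting BLOOM via the Hugging Face text-generation pipeline.
# bigscience/bloom-560m is a small variant used here for convenience; swap in
# bigscience/bloom for the full 176B model on suitable hardware.
from transformers import pipeline

generator = pipeline("text-generation", model="bigscience/bloom-560m")

# Plain continuation: BLOOM extends the prompt with likely next tokens.
out = generator("Large language models are useful because", max_new_tokens=40)
print(out[0]["generated_text"])

# A task the model was not explicitly trained for, cast as text generation.
prompt = "Translate to French: 'The model is now deployed.' Translation:"
out = generator(prompt, max_new_tokens=20)
print(out[0]["generated_text"])
```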
Serving LLMs, like BLOOM, is complex and computationally intensive. At CoreWeave, we focus on building solutions that reduce infrastructure complexity so you can focus more time on what you do best. Click here to deploy BLOOM as an InferenceService on CoreWeave today.
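Once BLOOM is running as an InferenceService, clients reach it over HTTP. The sketch below shows one way a client might query such an endpoint; the URL, model name, and JSON payload fields are placeholder assumptions for illustration, and the exact endpoint and request format for your deployment are covered in the linked tutorial.

```python
# Hedged sketch: querying a deployed BLOOM InferenceService over HTTP.
# The endpoint URL, model name, and per-request fields below are illustrative
# placeholders; actual values depend on how the InferenceService is configured.
import requests

# Hypothetical public URL assigned to the InferenceService by Knative Serving.
ENDPOINT = "https://bloom.tenant-example.knative.chi.coreweave.com/v1/models/bloom:predict"

payload = {
    "instances": [
        {"text": "A recipe for banana bread:", "max_new_tokens": 64}
    ]
}

response = requests.post(ENDPOINT, json=payload, timeout=120)
response.raise_for_status()
print(response.json())
```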
About CoreWeave: CoreWeave was NVIDIA’s first Elite Cloud Services Provider for Compute. We deliver the highest distributed training performance possible, constructing our A100 training clusters with a rail-optimized design, NVIDIA Quantum InfiniBand networking, and in-network collectives using NVIDIA SHARP.
CoreWeave’s InferenceService is backed by well-supported open-source Kubernetes projects like Knative Serving. Optimized for fast spin-up times and responsive auto-scaling across the industry’s broadest range of GPUs, our infrastructure delivers 50-80% lower performance-adjusted cost compared with legacy cloud providers.