Impressive advances in large language models (LLMs) are showing the early signs of a major shift in the tech industry. AI startups and big tech companies are finding novel ways to put advanced LLMs to use in everything from composing emails to generating software code.
However, the promises of LLMs have also triggered an arms race between tech giants. In their efforts to build up their AI arsenals, big tech companies threaten to push the field toward less openness and more secrecy.
In the midst of this rivalry, Hugging Face is mapping a different strategy that will provide scalable access to open-source AI models. Hugging Face is collaborating with Amazon Web Services (AWS) to facilitate adoption of open-source machine learning (ML) models. In an era when advanced models are becoming increasingly inaccessible or hidden behind walled gardens, an easy-to-use open-source alternative could expand the market for applied machine learning.
Open-source models
While large-scale machine learning models are very useful, setting up and running them requires special expertise that few companies possess. The new partnership between Hugging Face and AWS will try to address these challenges.
Developers can use Amazon’s cloud tools and infrastructure to easily fine-tune and deploy state-of-the-art models from Hugging Face’s ML repository.
The two companies began working together in 2021 with the introduction of Hugging Face deep learning containers (DLCs) on SageMaker, Amazon’s cloud-based machine learning platform. The new partnership will extend the availability of Hugging Face models to other AWS products and to Amazon’s cloud-based AI accelerator hardware to speed up training and inference.
“Since we started offering Hugging Face natively in SageMaker, usage has been growing exponentially, and we now have more than 1,000 customers using our solutions every month,” Jeff Boudier, product director at Hugging Face, told VentureBeat. “Through this new partnership, we are now working hand in hand with the engineering teams that build new efficient hardware for AI, like AWS Trainium and AWS Inferentia, to build solutions that can be used directly on Elastic Compute Cloud (EC2) and Elastic Kubernetes Service (EKS).”
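For illustration, deploying a Hub model through those SageMaker DLCs boils down to a few lines of the `sagemaker` Python SDK. The sketch below is an assumption-laden example, not a detail from the article: the model ID, framework versions and instance type are illustrative choices, and actually running it requires AWS credentials and an IAM execution role.

```python
def hub_env(model_id: str, task: str) -> dict:
    """Environment variables the Hugging Face deep learning container (DLC)
    reads in order to pull a model from the Hub and serve it."""
    return {"HF_MODEL_ID": model_id, "HF_TASK": task}


def deploy_from_hub(model_id: str, task: str, role: str,
                    instance_type: str = "ml.m5.xlarge"):
    """Deploy a Hub model to a real-time SageMaker endpoint (illustrative).

    Needs the `sagemaker` SDK and AWS credentials; the import is done
    lazily so the sketch can be read without AWS set up.
    """
    from sagemaker.huggingface import HuggingFaceModel

    model = HuggingFaceModel(
        env=hub_env(model_id, task),
        role=role,                     # IAM execution role ARN
        transformers_version="4.26",   # version triple selects the DLC image
        pytorch_version="1.13",
        py_version="py39",
    )
    # Spins up a managed endpoint; the returned predictor answers
    # predictor.predict({"inputs": "..."}) calls against it.
    return model.deploy(initial_instance_count=1,
                        instance_type=instance_type)
```

The same pattern is what the partnership extends to new hardware: swapping the instance type is how a workload moves onto accelerator-backed instances, provided the model has been prepared for them.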
The AI arms race
Tech leaders have been talking about the transformative nature of machine learning for several years. But that transformation has never been felt as acutely as in the past few months. The release of OpenAI’s ChatGPT has set the stage for a new chapter in the race for AI dominance.
Microsoft recently poured $10 billion into OpenAI and is working hard to integrate LLMs into its products. Google has invested $300 million into Anthropic, an OpenAI rival, and is scrambling to protect its online search empire against the rise of LLM-powered products.
There are clear benefits to these partnerships. With Microsoft’s financial backing, OpenAI has been able to train very large and expensive machine learning models on specialized hardware and deploy them at scale to millions of people. Anthropic will also receive special access to the Google Cloud Platform through its new partnership.
However, the rivalry between big tech companies also has tradeoffs for the field. For example, since it began its partnership with Microsoft, OpenAI stopped open-sourcing most of its machine learning models and is serving them through a paid application programming interface (API). It has also become locked into Microsoft’s cloud platform, and its models are only available on Azure and Microsoft products.
On the other hand, Hugging Face remains committed to continuing to deliver open-source models. Through the partnership between Hugging Face and Amazon, developers and researchers will be able to deploy open-source models such as BLOOMZ (a GPT-3 alternative) and Stable Diffusion (a rival to DALL-E 2).
“This is an alliance between the leader of open-source machine learning and the leader in cloud services to build together the next generation of open-source models, and solutions to use them. Everything we build together will be open-source and openly accessible,” Boudier said.
Hugging Face also aims to avoid the kind of lock-in that other AI companies are facing. While Amazon will remain its preferred cloud provider, Hugging Face will continue to work with other cloud platforms.
“This new partnership is not exclusive and does not change our relations with other cloud providers,” Boudier said. “Our mission is to democratize good machine learning, and to do that we need to enable users wherever they are using our models and libraries. We’ll keep working with Microsoft and other clouds to serve customers everywhere.”
Openness and transparency
The API model provided by OpenAI is a convenient option for companies that don’t have in-house ML expertise. Hugging Face delivers a similar service through its Inference Endpoints and Inference API products. But APIs prove limiting for organizations that want more flexibility to modify the models and integrate them with other machine learning architectures. They are also inconvenient for research that requires access to model weights, gradients and training data.
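To make the convenience concrete: a call to Hugging Face’s serverless Inference API is just an authenticated HTTP POST. The minimal sketch below assembles such a request; the model ID and token are placeholders, and sending the request (e.g. with `requests.post`) needs a real Hugging Face access token.

```python
import json

API_BASE = "https://api-inference.huggingface.co/models"


def build_request(model_id: str, text: str, token: str) -> dict:
    """Assemble the URL, auth header and JSON payload for an Inference API
    call. The caller supplies a Hugging Face access token."""
    return {
        "url": f"{API_BASE}/{model_id}",
        "headers": {"Authorization": f"Bearer {token}"},
        "payload": json.dumps({"inputs": text}),
    }


# Sending it is one POST, e.g.:
#   req = build_request("bigscience/bloomz", "Hello", token)
#   resp = requests.post(req["url"], headers=req["headers"], data=req["payload"])
```

The simplicity is the point, and also the limitation: the caller only ever sees model outputs, never the weights, gradients or training data behind them.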
Easy-to-deploy, scalable cloud tools such as those provided by Hugging Face will enable these kinds of applications. At the same time, the company is developing tools for detecting and flagging misuse, bias and other problems with ML models.
“Our vision is that openness and transparency [are] the way forward for ML,” Boudier said. “ML is science-driven and science requires reproducibility. Ease of use makes everything accessible to the end users, so people can understand what models can and cannot do, [and] how they should and should not be used.”