When working on a machine learning problem, reusing and improving an existing solution can get you to a high-performing model much faster. Using existing models benefits not only data scientists but also companies, since it saves computational cost and time. Several companies now provide open-source libraries of pre-trained models, and Hugging Face is one of them. Hugging Face is a startup, founded by French entrepreneurs, that became known for the NLP infrastructure it developed; today it is reshaping the fields of machine learning and natural language processing. In this article, we present Hugging Face, detail the basic tasks its libraries can perform, and list its advantages and alternatives.
Hugging Face maintains open-source NLP libraries and an API that give access to thousands of pre-trained models. This makes learning and experimentation much easier, since the models are already trained and ready to use, and the platform also provides tools to manage data and models as well as to develop and train new ones. The company was founded in 2016 in New York by Clément Delangue, Julien Chaumond, and Thomas Wolf, with the mission of making artificial intelligence accessible to everyone. In 2019, it raised $15 million in Series A funding. A pioneer in AI, it was named one of the world's most innovative companies in 2020 by MIT Technology Review. Hugging Face has developed a range of AI-based products, including an open-source deep learning library called Transformers. It also offers an online collaboration platform, the Hugging Face Hub, where users can manage, share, and develop their AI models. The Hugging Face ecosystem has many advantages: models that work out of the box, a simple and consistent API, and savings in training time and compute.
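One of those advantages is how little code it takes to run a pre-trained model. A minimal sketch using the Transformers `pipeline` API; the checkpoint name is an illustrative, commonly used choice, and the model weights are downloaded on first use:

```python
# Sketch: running inference with a pre-trained model via the
# transformers pipeline API -- no training required.
from transformers import pipeline

# The checkpoint below is an illustrative choice; any compatible
# sentiment-analysis model from the Hub would work the same way.
classifier = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
)

result = classifier("Reusing pre-trained models saves time and compute.")[0]
print(result["label"], round(result["score"], 3))
```

The `pipeline` abstraction hides tokenization, model loading, and post-processing behind a single call, which is what makes experimentation so quick.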
In recent years, Hugging Face has launched several products, including:
One of the main early products was a set of chatbot applications, Chatty, Talking Dog, Talking Egg, and Boloss, that let users converse with an artificial intelligence developed by the company. To that end, Hugging Face developed its own natural language processing (NLP) model, called Hierarchical Multi-Task Learning (HMTL), and maintained a library of pre-trained NLP models, pytorch-transformers (the predecessor of today's Transformers library). The apps, available on iOS, were intended as digital companions that could entertain users.
The startup has also developed a set of tools for its developer community to manage, share and develop their own machine learning models:
BLOOM is an open-source large language model (LLM) developed through the BigScience research project, which Hugging Face coordinated. Trained on a large multilingual text corpus, it can produce coherent text in 46 natural languages, including Spanish, French, and Arabic, as well as 13 programming languages. BLOOM can also perform tasks it was never explicitly trained for by casting them as text-generation problems. With 176 billion parameters, it was one of the first open multilingual language models of this scale.
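Using BLOOM follows the same `pipeline` API. A minimal sketch, assuming the scaled-down `bigscience/bloom-560m` checkpoint (chosen here so the example stays runnable on modest hardware), that casts a translation task as plain text generation via prompting:

```python
# Sketch: prompting a small BLOOM variant to do a task it was not
# explicitly trained for (translation), framed as text generation.
from transformers import pipeline

# bigscience/bloom-560m is a scaled-down sibling of the 176B model.
generator = pipeline("text-generation", model="bigscience/bloom-560m")

prompt = "Translate English to French:\ncat => chat\ndog =>"
output = generator(prompt, max_new_tokens=5, do_sample=False)
print(output[0]["generated_text"])
```

Because the task is expressed entirely in the prompt, the same model can be steered toward summarization, question answering, or classification without any retraining.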
In addition to documentation, Hugging Face offers an NLP course built around the libraries of its ecosystem, such as Transformers, Datasets, Tokenizers, and Accelerate, as well as the Hugging Face Hub. The course is completely free and ad-free.
Although Hugging Face offers strong models and rich functionality, it also has some notable shortcomings: