DIGITAL LIFE

Small model approach could be more effective than LLMs
Small language models are more reliable and secure than their large counterparts, primarily because they draw information from a circumscribed dataset. Expect to see more chatbots running on these slimmed-down alternatives in the coming months.
After the widespread rollout of OpenAI's large language model (LLM) in late 2022, many other big tech companies followed suit—at a pace that showed they were not far behind and had actually been working for years to develop their own generative artificial intelligence (GenAI) programs using natural language.
What's striking about the various GenAI programs available today is how similar they are. They all basically work in the same way: a model containing billions of parameters is trained, using deep learning, on huge datasets made up of content available on the internet.
Once trained, the models in turn generate content—in the form of texts, images, sounds and videos—by using statistics to predict which string of words, pixels or sounds is the most probable response to a prompt.
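To make the idea of "the most probable response" concrete, here is a deliberately tiny sketch in Python. It is not how any production model works: real LLMs use neural networks with billions of parameters, whereas this toy only counts which word follows which in a three-sentence corpus invented for the illustration. The function names are likewise made up for this example.

```python
from collections import Counter, defaultdict

def train_bigram_counts(corpus):
    """Count, for every word, which words follow it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, following in zip(words, words[1:]):
            counts[current][following] += 1
    return counts

def most_probable_next(counts, word):
    """Return (next_word, probability) for the likeliest follower, or None."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    total = sum(followers.values())
    best, n = followers.most_common(1)[0]
    return best, n / total

# Toy corpus, invented for this illustration only.
TOY_CORPUS = [
    "the model predicts the next word",
    "the model generates text",
    "a word is chosen by probability",
]

counts = train_bigram_counts(TOY_CORPUS)
prediction = most_probable_next(counts, "the")
print(prediction)
```

In this corpus "the" is followed by "model" twice and by "next" once, so the sketch picks "model" with probability 2/3. A full-scale model does the same kind of thing over far longer contexts and with learned rather than counted probabilities.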
"But this method comes with risks," says Nicolas Flammarion, who runs EPFL's Theory of Machine Learning Laboratory. "A hefty chunk of the content available online is toxic, dangerous or simply incorrect. That's why developers have to supervise and refine their models and add several filters."
How to avoid getting drowned in information
The way things currently stand, LLMs have created a suboptimal situation where machines housed in vast data centers crunch through billions of bytes of data, consuming large amounts of energy in the process, to find the tiny fraction that's relevant to a given prompt. It's as if, to answer a question, you had to flip through all the books in the Library of Congress page by page until you came across the right answer.
Researchers are now exploring ways of leveraging the power of LLMs while making them more efficient, secure and economical to operate. "One method is to limit the sources of data that are fed into the model," says Martin Rajman, an EPFL lecturer and researcher on AI. "The result will be language models that are highly effective for a given application and that don't attempt to have the answers to everything."
This is where small language models (SLMs) come in. Such models can be small in various ways, but, in this context, size usually refers to the dataset they draw from. The technique of having a model answer from such a restricted dataset is known as retrieval-augmented generation (RAG). EPFL's Meditron provides an example of how this can be applied in practice: its models rely exclusively on reliable, verified medical datasets.
The advantage of this approach is that it prevents the spread of incorrect information. The trick is to combine the limited datasets with chatbots trained on large models. That way, the chatbot can read the information and link different bits together in order to produce useful responses.
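The retrieval step described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method used by Meditron or any other EPFL project: the three "policy" documents are invented placeholders, relevance is measured by simple word overlap rather than the embedding search real systems typically use, and the resulting prompt would still have to be passed to a chatbot, which is not shown.

```python
import re

# Hypothetical curated dataset: placeholder texts invented for this sketch.
CURATED_DOCS = {
    "policy-leave": "Staff may request unpaid leave by writing to their unit head.",
    "policy-exams": "Students must register for exams before the published deadline.",
    "policy-travel": "Travel expenses are reimbursed after the trip on submission of receipts.",
}

def tokenize(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))

def retrieve(question, docs, top_k=1):
    """Rank documents by how many words they share with the question."""
    q_words = tokenize(question)
    scored = [(len(q_words & tokenize(text)), doc_id) for doc_id, text in docs.items()]
    scored = [(score, doc_id) for score, doc_id in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]

def build_prompt(question, docs, top_k=1):
    """Build a prompt that restricts the chatbot to the retrieved sources."""
    ids = retrieve(question, docs, top_k)
    if not ids:
        return None  # nothing relevant in the curated dataset
    context = "\n".join(f"[{doc_id}] {docs[doc_id]}" for doc_id in ids)
    return (
        "Answer using only the sources below. "
        "If they do not contain the answer, say so.\n"
        f"Sources:\n{context}\n"
        f"Question: {question}"
    )

question = "When must students register for exams?"
top_ids = retrieve(question, CURATED_DOCS)
prompt = build_prompt(question, CURATED_DOCS)
print(prompt)
```

The point of the design is that the chatbot only ever sees passages drawn from the verified collection, so the quality of its answers is bounded by the quality of that collection rather than by whatever happens to be on the internet.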
Several EPFL research groups are exploring the potential of SLMs. One project is Meditron, while another is a pilot test under way based on Polylex, EPFL's online repository of rules and policies. Two other projects are looking at improving how class recordings are transcribed so that they can be indexed more reliably, and streamlining some of the school's administrative processes.
Cheaper to use
Because SLMs rely on smaller datasets, they don't need huge amounts of processing power to run; some of them can even operate on a smartphone. "Another important advantage of SLMs is that they function in a closed system, meaning the information users enter into a prompt is protected," says Rajman.
"That's unlike ChatGPT, where if you ask it to transcribe a meeting and write up minutes, for example—something the model can do quite well—you don't know how the information will be used. It gets stored on unknown servers, although some of the information could be confidential or include personal data."
SLMs have all the chatbot-running capabilities of large models and come with considerably fewer risks. That's why businesses are getting more and more interested in the technology, whether for their internal needs or for use with their customers. Chatbots designed for specific applications can be both very useful and extremely effective, and this has prompted tech companies worldwide to rush their own versions to market.
2023 may have been the year when LLMs—with all their strengths and weaknesses—made the headlines, but 2025 could very well be the year when their smaller, tailored and fully trustworthy counterparts steal the show.