
Feedback Loops May Result in LLMs Suffering a ‘Model Collapse’ Akin to Mad Cow Disease

by ccadm


A curious issue has emerged in the world of generative artificial intelligence, one potent and complex enough to jeopardize the technology’s trajectory. Before delving deeper into it, let us take a quick look at why a roadblock on AI’s ascending curve is deemed so significant.

How AI is Supposed to Make the World More Efficient

Over the past few years, the benefits that generative AI has offered the world have ramped up its adoption significantly: 92% of Fortune 500 firms have adopted generative AI, and 70% of Gen Z have tried these tools.

Generative AI’s coverage is so widespread that nearly 90% of American jobs could be impacted by it, and 95% of customer interactions may involve it by 2025. While the current market is worth nearly US$45 billion, AI is only expected to get bigger, generating as many as 97 million jobs within the next year.

AI’s facilitating potential has been unmatched, to say the least. Over two million software developers are building on OpenAI’s API, and most of them are from Fortune 500 companies that could have availed themselves of the best resources available for their work. Yet they chose to build on the API.

This affirmative sentiment towards generative AI’s potential is also reflected in a survey carried out by Deloitte, which found that 94% of business executives believed AI was a key to future success.

From all these facts and figures, it is evident that people worldwide are looking to generative AI to drive future growth. However, an issue has arisen that could challenge that growth potential: feedback loops. In the following segments, we will explore this issue in greater detail, examining its possible ramifications and implications.

What Are Feedback Loops?

In their most basic form, feedback loops are neither bad nor harmful. A feedback loop is simply the process of leveraging the output of an AI system, and the corresponding end-user actions, to retrain and improve models over time.

Because the output is fed back into the system, the arrangement is called a closed loop, and the process is known as closed-loop learning. The AI-generated output is continually compared with the end user’s final decision, and that comparison is fed back to the model, allowing it to learn from its mistakes.
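
As a rough, purely illustrative sketch of how such a loop might be wired up in practice (the `model.generate` and `model.fine_tune` calls and the feedback buffer are hypothetical placeholders, not any vendor’s API):

```python
# Minimal, hypothetical sketch of closed-loop learning.
# `model.generate` and `model.fine_tune` are illustrative placeholders.

def closed_loop_step(model, request, user_final_decision, feedback_buffer):
    """Serve one request and log the output/decision pair as feedback."""
    prediction = model.generate(request)          # AI-generated output
    feedback_buffer.append({
        "input": request,
        "model_output": prediction,               # what the model proposed
        "accepted_output": user_final_decision,   # what the user finally accepted
    })
    return prediction

def retrain_if_ready(model, feedback_buffer, min_examples=10_000):
    """Periodically fold accumulated feedback back into the model."""
    if len(feedback_buffer) >= min_examples:
        model.fine_tune(feedback_buffer)          # the model learns from its mistakes
        feedback_buffer.clear()
```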

However, a problem with such feedback-loop learning arises when synthetic data enters the picture. Training generative AI models requires vast amounts of data, and real-world data is limited in supply, scarce to the point that these models risk exhausting their training resources altogether.

That’s why big tech is keen to leverage synthetic data to train future generations of AI models. However, synthetic data has its share of benefits and disadvantages. 

The Advantages and Drawbacks of Synthetic Data

Synthetic or AI-synthesized data has many benefits. For instance, it is cheaper than real-world data, and—more importantly—the supply of such data is limitless.

Since the data is synthesized, it does not come from a real-life individual and poses very few privacy risks, whereas real data, especially in fields such as med-tech and healthcare, carries the risk of privacy breaches. Moreover, there are instances where the use of synthetic data has helped improve AI performance.

However, on the flip side, AI models fed too much synthetic data may become limited in their functionality and potential, negatively affecting future iterations of generative AI models. This risk is not unfounded, as recent work carried out by the Digital Signal Processing group at Rice University goes on to validate.

According to Richard Baraniuk, Rice’s C. Sidney Burrus Professor of Electrical and Computer Engineering:

“The problems arise when this synthetic data training is, inevitably, repeated, forming a kind of a feedback loop ⎯ what we call an autophagous or ‘self-consuming’ loop.”

The Challenge of Synthetic Data-Induced Feedback Loops

Now, to understand the full nature of the problem, we must look at feedback loops and synthetic data together. Professor Baraniuk has aptly summarized it: “Our group has worked extensively on such feedback loops, and the bad news is that even after a few generations of such training, the new models can become irreparably corrupted. This has been termed ‘model collapse’ by some, most recently by colleagues in the field in the context of large language models (LLMs). We, however, find the term ‘Model Autophagy Disorder’ (MAD) more apt, by analogy to mad cow disease.”

To explain the relevance of the comparison: mad cow disease is a fatal neurodegenerative illness that affects cattle, with a human equivalent caused by consuming infected meat. The disease spread because cows were fed the processed leftovers of their slaughtered peers. The analogy holds for generative AI because feeding models data that is itself a byproduct of the generation process is the ‘infection’ that gets passed on to future generations of models.

The study that looked into this issue was titled ‘Self-Consuming Generative Models Go MAD.’ What is also significant about the research is that it is the first peer-reviewed work on AI autophagy, focusing on popular and well-known visual AI models like DALL-E, Midjourney, and Stable Diffusion. Although the team worked with visual AI, as Professor Baraniuk pointed out, “the same mad cow corruption issues occur with LLMs.”

Now that we’ve looked into the issue and the factors that drive it, let’s examine its impact and the ways it might affect the future development of generative AI models.

Lack of Fresh Data Resulting in Unhealthy AI

The core problem with an affected or ‘infected’ model is impoverishment. Starved of fresh real data, its outputs become warped and lose both quality and diversity.

To explain this scenario in a more visual fashion, take the case of generating human faces with AI. As synthetic data is fed through successive training loops, the generated faces converge, looking more and more like the same person.

The potential damage is severe, to say the least, as can be gauged from another comment by Baraniuk:

“Some ramifications are clear: without enough fresh real data, future generative models are doomed to MADness.”

Is there any cure for this doomsday scenario looming over AI development? For the time being, the researchers have introduced a sampling bias parameter that accounts for the tendency of users to favor data quality over diversity, a practice known as ‘cherry picking.’ The upside is that it helps preserve data quality over a greater number of model iterations; the tradeoff is an even steeper decline in diversity.
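
The dynamic is easy to reproduce in miniature. The toy simulation below is not the Rice team’s experiment; it stands in for a generative model using a single Gaussian refitted to its own samples, with a `keep_fraction` parameter playing the role of the sampling bias (cherry-picking) parameter:

```python
# Toy self-consuming ("autophagous") loop: each generation is a Gaussian
# refitted to samples drawn from the previous generation. keep_fraction < 1.0
# mimics cherry picking: only the samples closest to the mode are kept.
import numpy as np

rng = np.random.default_rng(0)
mean, std = 0.0, 1.0          # generation 0: the "real" data distribution
keep_fraction = 0.5           # 1.0 = no cherry picking

for generation in range(1, 11):
    samples = rng.normal(mean, std, size=10_000)    # synthetic data from the current model
    quality = -np.abs(samples - mean)               # closer to the mode = "higher quality"
    n_keep = int(keep_fraction * samples.size)
    kept = samples[np.argsort(quality)[-n_keep:]]   # cherry-pick the best samples
    mean, std = kept.mean(), kept.std()             # "train" the next generation on them
    print(f"gen {generation:2d}: mean={mean:+.3f}  std={std:.3f}")

# The std shrinks every generation: outputs stay close to the mode but lose
# diversity, a numeric analogue of every generated face converging on one person.
```

Raising `keep_fraction` toward 1.0 weakens the bias and slows the shrinkage; lowering it mirrors the tradeoff described above, preserving sample ‘quality’ (closeness to the mode) at the cost of a faster decline in diversity.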

We have already seen the kind of impact AI could have on our future world. With so much potential waiting to be unleashed, this issue could affect a whole host of stakeholders, especially the new and innovative businesses building on the technology.

Google is one of the leading companies powering LLMs for a large and diverse set of stakeholders, and the lack of diversity in AI-generated output is something it will have to deal with in the days to come. Here, we briefly look at Google AI’s capabilities as far as powering LLMs is concerned.

#1. Google AI

In the field of AI, two operational wings help Google accomplish its goals: Google DeepMind and Google Cloud. More specifically, Google Cloud brings innovations developed and tested by Google DeepMind to its enterprise-ready AI platform so that customers can use them to build and deliver generative AI capabilities. Google Cloud offers three LLM services:

  • The first is Generative AI on Vertex AI, which lets users access Google’s large generative AI models to test, tune, and deploy in their own AI-powered applications (a minimal usage sketch follows this list).
  • The second is Vertex AI Agent Builder, which is meant for enterprise search and chatbot applications. It comes with pre-built workflows for common tasks such as onboarding, data ingestion, and customization.
  • The final one is Contact Center AI, an intelligent contact-center solution that includes Dialogflow, Google’s conversational AI platform with both intent-based and LLM capabilities.
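
To illustrate the first of these services, here is a minimal sketch using the Vertex AI Python SDK; the project ID, region, and model name are placeholders rather than recommendations.

```python
# Minimal sketch: calling a Google generative model through Vertex AI.
# Project ID, region, and model name below are placeholders.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="your-gcp-project", location="us-central1")

model = GenerativeModel("gemini-1.5-flash")
response = model.generate_content(
    "In one sentence, what is model collapse in generative AI?"
)
print(response.text)
```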

However, Google has also had its fair share of trouble with the responsible use of AI. In May this year, Google CEO Sundar Pichai acknowledged that the company got it wrong when its models generated racially biased image results in response to user queries.

Google had to pause Gemini’s image-generation features after social media users complained that the tool produced historically inaccurate images, including depictions of people of color when asked to portray Nazis.

Echoing much of the research we cited here, the company acknowledged that an issue arose from “limitations in the training data used to develop Gemini.”

Elaborating further on where the problem lies, Pichai added:

“There are certain times you answer [for questions like] ‘What’s the population of the United States?’ Yes, it’s an answerable question. [But] there are times you want to surface a breadth of opinions out there on the web, which is what search does and does it well.”

According to Pichai, the work that Google has done over the years ensures that – from a Search standpoint – the company offers trustworthy, high-quality information. 


Financially, Alphabet Inc. (Google’s parent company) reported revenues of $307.4 billion in 2023, up 9% from $282.8 billion in 2022. Operating income reached $84.3 billion, a significant rise from $74.8 billion the previous year. Despite operating expenses increasing to $223.1 billion, net income was $73.8 billion, compared to $60 billion in 2022.

#2. Cohere

Another entity that has been doing great work in this field is Toronto-based Cohere. One of its initiatives that calls for special mention is Cohere for AI, a non-profit research lab that seeks to solve complex machine learning problems. Unlike a traditional research group, Cohere for AI works as a hybrid lab, with both a dedicated research staff and support for open-science initiatives. It collaborates openly with independent researchers all over the world to conduct top-tier ML research.

Cohere’s open science research community is an intersectional space comprising researchers, engineers, linguists, social scientists, and lifelong learners, with contributors from more than a hundred countries.

Some of Cohere’s models include Aya 23 – 8B and Aya 23 – 35B, state-of-the-art accessible research LLMs; C4AI Command R – 104B and C4AI Command R – 35B, whose released model weights help democratize research access; and Aya, the massively multilingual research LLM.
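
As a quick illustration, the sketch below calls one of the Command R models through Cohere’s Python SDK; the API key and model name are placeholders, and the exact client interface may differ between SDK versions.

```python
# Minimal sketch: querying a Cohere Command R model via the Python SDK.
# The API key and model name are placeholders.
import cohere

co = cohere.Client("YOUR_COHERE_API_KEY")
response = co.chat(
    model="command-r",
    message="In two sentences, why can training on synthetic data reduce output diversity?",
)
print(response.text)
```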

Recently, Cohere and Fujitsu announced a strategic partnership to develop and provide enterprise AI services with industry-leading Japanese language capabilities. Under the terms of the collaboration, the companies will build powerful enterprise-grade large language models (LLMs) to serve the needs of global businesses.

In March this year, Cohere announced that it was seeking a valuation of $5 billion in its latest fundraising efforts. It was looking to raise approximately $250 million through this funding round, according to a source familiar with the matter.

Concluding Thoughts

LLMs and the generative AI models they power are emerging and evolving rapidly. This is cutting-edge technology that is massively ambitious and intensely aspirational, so one can fairly expect a significant influx of research in the years to come.

Research organizations, big tech, educational institutions: every possible stakeholder is keen to intensify resource allocation in this space. Such a flurry of activity will surface more drawbacks. But it will also mean course corrections, out-of-the-box thinking, and the reimagining of existing solutions in the most optimized way possible.

Click here to learn all about investing in artificial intelligence.


