This month, the Center for Research on Foundation Models (CRFM) at Stanford University published an insightful paper called On the Opportunities and Risks of Foundation Models.
From the abstract (emphasis mine):
- AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks.
- The paper calls these models foundation models to underscore their critically central yet incomplete character.
- The paper provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations).
- Though foundation models are based on conventional deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization.
- Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream.
- Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties.
To expand further, here are my notes and comments from the (long!) paper:
- Foundation models have the potential to accentuate harms, and their characteristics are in some ways poorly understood.
- Foundation models are enabled by transfer learning and scale. Models such as BERT, RoBERTa, BART, GPT and ELMo will drive the next wave of developments in NLP (a minimal transfer-learning sketch follows this list).
- But the impact of foundation models will extend beyond NLP itself.
- The report is divided into four parts: capabilities, applications, technology, and society.
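To make the transfer-learning point concrete, here is a minimal sketch of adapting a pretrained BERT to a downstream classification task. It assumes the Hugging Face transformers and PyTorch libraries; the example text, label, and hyperparameters are purely illustrative, not from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load a pretrained foundation model; the classification head on top
# is newly initialized for the downstream task.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# One illustrative fine-tuning step on a toy labelled example.
batch = tokenizer(["foundation models adapt well"], return_tensors="pt")
labels = torch.tensor([1])
loss = model(**batch, labels=labels).loss
loss.backward()
optimizer.step()
```

The same pattern, pretrain once and adapt cheaply per task, is what makes a single model "foundational" to many downstream systems.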
Capabilities
- The report looks at five potential capabilities of foundation models: processing language, processing vision, affecting the physical world (robotics), performing reasoning, and interacting with humans.
- The paper explores the underlying architecture behind foundation models and identifies five key attributes: the expressivity of the computational model, scalability, multimodality, memory, and compositionality.
- The ecosystem surrounding foundation models requires a multi-faceted approach: more compute-efficient models, hardware, and energy grids may all mitigate the carbon burden of these models. Environmental cost should be a clear factor in how foundation models are evaluated, so that they can be more comprehensively juxtaposed with environment-friendly baselines (see the back-of-envelope sketch after this list).
- The cost-benefit analysis surrounding environmental impact necessitates greater documentation and measurement across the community.
- The study of foundation models has led to many new research directions for the community, including understanding generation as a fundamental aspect of language and studying how to best use and understand foundation models.
- The researchers also examined whether foundation models can satisfactorily encompass linguistic variation and diversity, and how to draw on the dynamics of human language learning.
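Since the bullets above call for measurement, here is a back-of-envelope sketch of how training compute and energy might be estimated. It uses the common approximation of roughly 6 FLOPs per parameter per training token; every number below is an assumption for illustration, not a figure from the paper.

```python
# Rough training-compute estimate: ~6 FLOPs per parameter per token
# (a standard approximation). All numbers are illustrative assumptions.
def train_flops(params: float, tokens: float) -> float:
    return 6 * params * tokens

params = 1.5e9                 # assumed model size: 1.5B parameters
tokens = 300e9                 # assumed training data: 300B tokens

flops = train_flops(params, tokens)

# Translate to energy with an assumed accelerator efficiency.
flops_per_joule = 3e11         # ~300 GFLOPs per joule (assumed)
kwh = flops / flops_per_joule / 3.6e6
print(f"~{flops:.2e} training FLOPs, ~{kwh:,.0f} kWh under these assumptions")
```

Even crude estimates like this make it possible to juxtapose a foundation model against a smaller baseline, which is exactly the kind of documentation the paper calls for.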
Vision
- In the longer term, the potential for foundation models to reduce dependence on explicit annotations may lead to progress on essential cognitive skills that have proven difficult in the current paradigm.
Robotics
- Using strategies based on transfer learning, robots can learn new but similar tasks through foundation models, enabling generalist behaviour (a minimal adaptation sketch follows).
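Here is a minimal sketch of that adaptation pattern, assuming torchvision's resnet18 as a stand-in for a pretrained visual foundation model and an assumed 7-DoF action space; neither is the paper's setup.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

# Pretrained visual backbone stands in for a foundation model.
backbone = resnet18(weights="IMAGENET1K_V1")
backbone.fc = nn.Identity()          # expose 512-d image features
backbone.eval()                      # freeze batch-norm statistics
for p in backbone.parameters():
    p.requires_grad = False         # reuse the representation as-is

action_head = nn.Linear(512, 7)      # assumed 7-DoF robot action output
optimizer = torch.optim.Adam(action_head.parameters(), lr=1e-3)

# One illustrative update on a placeholder observation/action pair.
obs = torch.randn(1, 3, 224, 224)    # camera frame (placeholder)
target = torch.randn(1, 7)           # demonstrated action (placeholder)
loss = nn.functional.mse_loss(action_head(backbone(obs)), target)
loss.backward()
optimizer.step()
```

Only the small head is trained per task, which is what makes the generalist-model-plus-cheap-specialisation pattern attractive for robotics.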
Reasoning and search
- Foundation models should play a central role in general reasoning, as vehicles for tapping into the statistical regularities of unbounded search spaces (generativity) and for exploiting the grounding of knowledge in multi-modal environments (grounding).
- Researchers have applied these language-model-based approaches to various applications, such as predicting protein structures and proving formal theorems. Foundation models offer a generic way of modeling the output space as a sequence (see the toy sketch below).
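To illustrate what "modeling the output space as a sequence" means, here is a toy example of linearizing a structured output (a tiny expression tree) into the flat token stream a sequence model could be trained to emit. The encoding is my own toy, not the paper's.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    tok: str                         # an operator, or a digit leaf
    left: Optional["Node"] = None
    right: Optional["Node"] = None

def linearize(n: Node) -> list:
    """Serialize the tree as a prefix-order token sequence."""
    if n.left is None:
        return [n.tok]
    return [n.tok] + linearize(n.left) + linearize(n.right)

def parse(tokens: list) -> Node:
    """Inverse of linearize: rebuild the tree, consuming the list."""
    tok = tokens.pop(0)
    if tok.isdigit():
        return Node(tok)
    return Node(tok, parse(tokens), parse(tokens))

tree = Node("+", Node("1"), Node("2"))
seq = linearize(tree)                # ['+', '1', '2']
assert linearize(parse(list(seq))) == seq
```

Proof steps, protein residues, and program tokens can all be flattened in a similar spirit, which is why one sequence model can be pointed at such different output spaces.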
Applications
- Foundation models can serve as a central store of medical knowledge, trained on diverse sources and modalities of medical data.
- For example, a model trained on natural language could be adapted for protein fold prediction.
- Also, pretrained models can help lawyers to conduct legal research, draft legal language, or assess how judges evaluate their claims.
Technology
- The emerging paradigm of foundation models has produced impressive achievements in AI over the last few years.
- The paper identifies and discusses five properties, spanning expressivity, scalability, multimodality, memory, and compositionality, that it believes are essential for a foundation model to be successful.
Society
Finally, the impact on society will be most profound.
- The paper asks what fairness-related harms relate to foundation models, what sources are responsible for these harms, and how we can intervene to address them.
- These issues relate to broader questions of algorithmic fairness and AI ethics, but foundation models bring them to the forefront in terms of impact and scale.
- People can be underrepresented or entirely erased, e.g., when LGBTQ+ identity terms are excluded in training data.
- The relationship between the training data and the intrinsic biases acquired by the foundation model remains unclear. Establishing scaling laws for bias, akin to those for accuracy metrics, may enable systematic study at smaller scales to inform data practices at larger scales (a toy sketch follows this list).
- Foundation models will allow for the creation of content that is indistinguishable from human-created content, which poses risks.
- Also, even seemingly minuscule decisions, like reducing the number of layers a model has, may lead to significant environmental cost reductions at scale.
- Even if foundation models increase average productivity or income, there is no economic law that guarantees everyone will benefit because not all tasks will be affected to the same extent.
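On the scaling-laws-for-bias idea above, here is a toy sketch of what such a study could look like: fit a power law bias(N) ≈ a * N^(-b) to bias scores measured at small scales, then extrapolate to a larger one. Every number is a placeholder, not a measurement from the paper.

```python
import numpy as np

sizes = np.array([1e7, 1e8, 1e9])    # model sizes in parameters (assumed)
bias = np.array([0.30, 0.22, 0.17])  # bias-metric scores (assumed)

# A power law is linear in log-log space: log(bias) = log(a) - b*log(N)
slope, intercept = np.polyfit(np.log(sizes), np.log(bias), 1)
a, b = np.exp(intercept), -slope

predicted = a * 1e10 ** (-b)         # extrapolated bias at 10B parameters
print(f"fitted exponent b = {b:.3f}; predicted bias at 10B: {predicted:.3f}")
```

If such fits held up in practice, data practices could be vetted at small scale before paying the cost of a full-scale training run.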
Finally, a view that I very much agree with:
The widespread adoption of foundation models poses ethical, social, and political challenges.
OpenAI’s GPT-3 was at least partly an experiment in scale, showing that major gains could be achieved by scaling up the model size, amount of data, and training time, without major modelling innovations. If scale does turn out to be critical to success, the organizations most capable of producing competitive foundation models will be the most well-resourced: venture-funded start-ups, already-dominant tech giants, and state governments.
To conclude, this is a must-read paper from a number of perspectives: ethics, the future of AI, foundation models, and more.