The Age of Generative Artificial Intelligence
Generative Artificial Intelligence is the rocket ship of the creative mind.
We're entering the Age of Generative AI, a period in which computers amplify human creativity by generating content, code, insights, and actions from our prompts rather than simply scoring, identifying, or extrapolating from a dataset. Self-supervised learning algorithms are unlocking the hidden knowledge in large, unlabeled, unstructured datasets, allowing for the creation of surprisingly novel outcomes. Through multimodal learning, we can combine diverse data such as images, text, and various sequences. There seem to be no limits on where generative AI can be used: from generating artful pictures, video, and audio, to writing code, interacting with software screens, and designing novel chemicals and drugs.
Generating Novel Artwork
For the past several years, AI has created artwork, but tools released this year have enabled AI to create art with far greater realism and complexity. For example, Jason Allen's “Théâtre D'opéra Spatial” was named the digital category champion at the Colorado State Fair, gaining recognition from the New York Times. This year, DALL-E 2, Midjourney, and Stable Diffusion have made it simple for anyone to create intricate, abstract, or photorealistic art by typing a few words in a text box. For example, with a prompt like “Shopping on Amazon Alexa for Christmas Gifts” in Stable Diffusion, I was able to generate the realistic image shown below, which includes an Alexa-like device, a charming subject, and a Christmas theme in which the subject appears to be conversing with the Alexa device. For an exercise that took only two minutes, this is an impressive result.
Prompt: Shopping on Amazon Alexa for Christmas Gifts
Emergent capabilities and in-context learning
Models like Stable Diffusion and DALL-E 2 generate images from textual prompts, but generative algorithms can do far more. Foundation models trained on large and diverse datasets have exhibited surprising emergent capabilities. Shortly after GPT-3 went online in 2020, its creators discovered that not only could GPT-3 generate English sentences and paragraphs in a variety of styles, but it could also generate code. As it turned out, the vast number of Web pages used in its training included many examples of computer programming accompanied by descriptions of what the code was designed to do, thus enabling GPT-3 to teach itself how to program. Now, with in-context learning, large language models such as GPT-3 can be prompted to do math, generate web templates, write Valentine's Day cards, plan road trips, or even generate creative inspirations based on recent trends. Surprisingly, emergent capabilities can appear abruptly and discontinuously as the size of a large language model grows. For example, GPT-3 models with fewer than 6B parameters perform three-digit addition with less than 1% accuracy, but accuracy jumps to 8% on a 13B-parameter model and 80% on a 175B-parameter model.
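What makes in-context learning remarkable is that it involves no retraining at all: the model infers the task from a handful of worked examples embedded directly in the prompt. The plain-Python sketch below is illustrative only (the `few_shot_prompt` helper is hypothetical, and the actual model call is omitted since it would require a hosted API); it shows how a few-shot prompt for three-digit addition might be assembled:

```python
def few_shot_prompt(examples, query):
    """Build a few-shot prompt: worked examples followed by a new query.

    The model is expected to infer the task (here, addition) purely from
    the pattern in the examples -- no fine-tuning or gradient updates.
    """
    lines = [f"Q: What is {a} + {b}?\nA: {a + b}" for a, b in examples]
    lines.append(f"Q: What is {query[0]} + {query[1]}?\nA:")
    return "\n\n".join(lines)

# Three worked examples of three-digit addition, then the unanswered query.
prompt = few_shot_prompt([(123, 456), (701, 250), (333, 444)], (512, 377))
print(prompt)
```

This string would then be sent to a large language model via a completion API; as the scaling results above suggest, a sufficiently large model completes the pattern correctly, while a small one usually does not.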
Astonishing capabilities from media and code generation to semantic automation
Astonishing new capabilities are being developed on a daily basis. CogVideo, a large-scale text-to-video system, generates novel videos from Chinese text. In September 2022, AudioLM demonstrated an impressive language modeling approach that generates audio completions from audio input, ranging from speech to piano notes. GitHub Copilot uses OpenAI's Codex to convert natural language prompts into coding suggestions in dozens of languages, directly in the code editor. Adept.ai recently released ACT-1, a transformer model that can translate natural language prompts into a sequence of actions on websites and software screens. The next generation of AI assistants will most likely be powered by such models, which can translate natural language into browser actions and API calls and orchestrate existing web services to perform highly sophisticated tasks. Generative AI has now crossed over to a new frontier: the semantic automation of computing.
New fields of engineering, science and medicine enabled by Generative AI
New subfields of engineering, science, and medicine are emerging as Generative AI advances. Generative Engineering Design employs AI methods to discover and synthesize novel materials, shapes and structures using generative modeling. Deep Generative Modeling may create novel materials through exploration and manipulation of a latent space that encodes material structure and/or properties. Generative Molecular Discovery enables the development of novel molecules that may be used to solve important challenges in chemistry, therapeutics, and engineering. Generative biology is a revolutionary approach to drug discovery and development that leverages AI to design novel biological molecules and proteins that might be useful as vaccines, cancer treatments, or even tools for extracting carbon pollution from the air.
Accelerating Drug Discovery with Generative AI
Using generative AI, virtual in-silico labs for computational drug discovery are now being developed. I recently joined the board of directors of AbSci, a drug discovery firm that uses generative AI and synthetic biology to develop novel drug targets, discover optimal biotherapeutic candidates, and produce the cell lines needed to manufacture them. If approaches like theirs succeed, computers will discover the protein structures for future drugs. Personalized, highly targeted drugs will be developed orders of magnitude faster (at the speed of computation rather than the speed of traditional laboratory processes), revolutionizing medicine.
Summary
Generative models and self-supervised learning have enabled AI to make more progress in the last five years than in the previous fifty. We've finally achieved a general, although still early, method of learning from vast stores of human knowledge at scale, and of integrating information in a latent space of neural representations. Algorithms utilizing these deep latent representations can now synthesize new, high-quality innovations that amplify humanity’s innovation potential.
Steve Jobs referred to computers as the bicycle of the mind. Generative AI has now become the rocket ship of the creative mind.