Last week I had the pleasure of co-hosting an AI Salon with Fiona Su of Google. Our discussion focused on agentive AI systems and just happened to coincide with the release of Gemini 2.0 and the dawn of the “Agentic Era,” as dubbed by Sundar Pichai in his keynote.
In this post, I’ll share highlights from that discussion, along with our predictions for how agentive AI will change marketing, advertising, and perhaps the world as we know it.
Shameless Plug: Please Vote for Our SXSW Panel With Google
If you’re interested in this topic and want to hear more, please vote for our SXSW London Panel, where Google’s Suzana Apelbaum and I will share more insights.
The Rebirth of AI Agents
The idea of “AI Agents” has been around for decades. As early as the 1950s, AI pioneers speculated about machines capable of solving problems autonomously. AI Agents are defined by their ability to think and take action toward a goal. Thanks to the advances in generative models in recent years, they’ve evolved from a theoretical concept into a practical one.
We began playing with AI agents in 2022, when the concept first took hold in the AI developer community and open-source platforms like LangChain made rapid prototyping possible. Back then, they were largely a novelty. Due to the cognitive limitations of models like GPT-3, the thinking and planning process had to be outsourced to external AI functions. For example, you might explicitly tell the model “you have a notepad” and then program a simple Python function the AI could use to take notes and decide what to do next. These agentive systems could perform simple tasks, like conducting a web search and returning the relevant results, but for more complex actions they quickly hit a combinatorial ceiling.
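To make the “notepad” pattern concrete, here is a minimal, self-contained sketch of that early agent loop. The model itself is stubbed out (`call_model` is a stand-in for an LLM completion API and replays a hard-coded plan so the loop is runnable); the notepad and the dispatch loop are the external Python scaffolding the model could not provide on its own. All names and the `ACTION:`/`FINAL:` protocol are illustrative, not any particular framework’s API.

```python
notepad = []  # external memory the model is told it "has"

def take_note(text):
    """Tool: append an observation to the notepad."""
    notepad.append(text)
    return f"Noted ({len(notepad)} entries)."

def read_notes(_arg=""):
    """Tool: return everything written so far."""
    return "\n".join(notepad) or "(empty)"

TOOLS = {"take_note": take_note, "read_notes": read_notes}

def call_model(prompt):
    # Stand-in for an LLM call. A real model would emit lines like
    # "ACTION: take_note | some text" based on the prompt; here we
    # replay a tiny scripted plan keyed on how many observations
    # have accumulated, purely so the loop runs end to end.
    script = {
        0: "ACTION: take_note | Searched the web for 'agentic AI'.",
        1: "ACTION: read_notes |",
        2: "FINAL: Summary written from 1 note.",
    }
    return script[prompt.count("OBSERVATION")]

def run_agent(goal, max_steps=5):
    prompt = f"GOAL: {goal}\nYou have a notepad (take_note, read_notes)."
    for _ in range(max_steps):
        reply = call_model(prompt)
        if reply.startswith("FINAL:"):
            return reply[len("FINAL:"):].strip()
        # Parse "ACTION: name | argument" and dispatch to the tool.
        name, _, arg = reply[len("ACTION:"):].partition("|")
        observation = TOOLS[name.strip()](arg.strip())
        prompt += f"\nOBSERVATION: {observation}"
    return "(gave up)"
```

The external loop does all the orchestration; the model only picks the next action. That division of labor is exactly what made these early systems brittle at scale.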
Agentive Foundation Models
Google’s release of Gemini 2.0 last week represents a shift towards models designed for agentive capability. These models are increasingly adept at completing complex tasks – like surfing the web or making a dinner reservation – as well as the more familiar task of language completion.
This new generation of agentic models is increasingly adept at thinking, deciding what to do, and taking the appropriate action. When you stream a model like Gemini 2.0, you can actually see it thinking: brief pauses followed by bursts of text or code. Combined with built-in code generation and function calling, this makes models like Gemini 2.0 well suited to performing complex tasks in pursuit of a goal.
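The function-calling pattern mentioned above can be sketched in a few lines: the model emits a structured call (here, JSON) rather than free text, and the host code executes it. This is a hypothetical illustration of the pattern, not the actual Gemini SDK; the function name, the stubbed model reply, and the dinner-reservation example are all placeholders.

```python
import json

def make_reservation(restaurant, party_size, time):
    # Stand-in for a real booking integration.
    return {"status": "confirmed", "restaurant": restaurant,
            "party_size": party_size, "time": time}

# Functions the model is allowed to call.
FUNCTIONS = {"make_reservation": make_reservation}

def model_reply(_prompt):
    # A function-calling model returns which function to invoke and
    # with what arguments; we hard-code one such reply here.
    return json.dumps({"name": "make_reservation",
                       "args": {"restaurant": "Lupa",
                                "party_size": 2,
                                "time": "19:30"}})

def handle(prompt):
    # Parse the structured call and dispatch it to host code.
    call = json.loads(model_reply(prompt))
    return FUNCTIONS[call["name"]](**call["args"])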
The implications of this are profound. While recent AI adoption has centered on ‘human-in-the-loop’ use cases like chatbots, the future lies in ‘human-on-the-loop’ systems, where AI increasingly handles the workload.
And as these AI models become more powerful, they’re no longer limited to operating behind the scenes. New models like Gemini 2.0 are increasingly multi-modal, combining visual, audio, and linguistic inputs enabling them to understand and interact with the world around us.
Agentive Use Cases
Combining multi-modal AI capabilities into an Agentic AI system opens up new use cases, from knowledge work to robotics and beyond.
Here are a few of the use cases we’re exploring:
Desk Research: Systems like those demonstrated in Gemini’s Deep Research capability go from providing quick web-searches to longer, more drawn-out research. This approach enables you to mine massive amounts of data and uncover insights in simple, natural language.
Synthetic Panels: Through customization, this market research approach can be adapted to incorporate Synthetic Panels – fine-tuned AI models that are trained to represent a particular demographic or psychographic persona. By giving an Agentic AI System access to these panels you can conduct and synthesize market research in near real time.
Content Creation: AI Agents can be used to automate complex, step by step, content production tasks with minimal human supervision. Workflows like this pipeline we designed for Google Shoppings Holiday campaign move the human from ‘in-the-loop’ to ‘on-the-loop’, using AI to handle tasks from data synthesis and insight creation, to image generation and copywriting.
Video Versioning: Agentive AI can be taught to use video production software like After Effects, which it can operate through cloud hosted machines and an API interface. By giving the system a human edited motion design system and detailed instructions, Agentive AI Systems can perform a wide range of video versioning tasks (more on this in our next post).
Agentive AI systems are both exciting and terrifying. They open the door to a thousand new use cases, while simultaneously threatening to displace millions of jobs. As the models begin to exceed average human capability at many tasks, Agentive AI Systems could easily go from a mere R&D use case to a competitive advantage that companies would be wise to adopt.
Shameless Plug: Please Vote for Our SXSW Panel With Google
If you’re interested in this topic and want to hear more, please vote for our SXSW London Panel, where Google’s Suzana Apelbaum and I will share more insights.
Addition is an applied AI studio for modern brands.
Visit Addition.ml to learn more.