Announcing the next Betaworks Camp program — AI Camp: Agents

Jordan Crook
Published in Betaworks
Nov 27, 2023

Update: the application to AI Camp: Agents is now open! Apply here.

We think agents — fully developed agents and the tools to enable them — represent the next wave of AI and of innovation. In the past nine months, we’ve seen ebbs and flows of buzz around agentic artificial intelligence, from the flurry of press coverage around Baby AGI and AutoGPT to the short-lived excitement around OpenAI’s ChatGPT plug-in ecosystem, and now the buzz around GPTs.

This is just the tip of the iceberg.

Our last Camp cohort focused on augmentative AI technology — software that was purpose-built to pair with a human in their existing or emergent behaviors, workflows, etc. to create a positive-sum outcome. In the midst of that Camp, we saw twinklings of the gravitational pull toward agentic AI.

Moreover, Betaworks has an extensive portfolio in the AI space, including but not limited to Stability, Nomic, Flower, and HuggingFace. HuggingFace is a particularly important part of the emergent agent ecosystem and has historically provided support to Camp companies in the AI space.

Betaworks Camp is a cohort-based investment program at the pre-seed level. Betaworks invests in 8–12 companies, all of which are focused on an irruption-phase technology (in this case, agents). Details on the investment, application timeline, and more are forthcoming. (You can stay in the loop with us here.)

Our next AI Camp is focused on agents and the technology that both enables their creation and ensures they fulfill your/their goals. What defines an agent? In our view, an AI Agent can:

1. Perceive, synthesize, and remember its context;

2. Independently plan a set of actions toward an abstract goal;

3. Use the tools necessary to execute against that goal without human support; and

4. Evaluate the results of its work against the overarching goal.
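
Put concretely, that definition maps onto a loop: remember, plan, act with tools, evaluate, repeat. Below is a minimal sketch of what such a loop might look like; the llm() call, the tool registry, and the "tool: argument" action format are hypothetical placeholders for illustration, not a reference to any particular framework.

```python
# A rough sketch of the perceive -> plan -> act -> evaluate loop described above.
# llm() and TOOLS are hypothetical stand-ins, not any specific API.

def llm(prompt: str) -> str:
    """Placeholder for a call to a language model."""
    raise NotImplementedError

TOOLS = {
    # name -> callable, e.g. "search": web_search, "email": send_email
}

def parse_action(plan: str) -> tuple[str, str]:
    """Very naive parser expecting 'tool_name: argument'."""
    name, _, arg = plan.partition(":")
    return name.strip(), arg.strip()

def run_agent(goal: str, max_steps: int = 10) -> str:
    memory: list[tuple[str, str]] = []                     # 1. retained context
    result = ""
    for _ in range(max_steps):
        plan = llm(f"Goal: {goal}\nContext so far: {memory}\nNext action (tool: argument)?")  # 2. plan
        tool, arg = parse_action(plan)
        result = TOOLS[tool](arg)                          # 3. use a tool, no human in the loop
        memory.append((plan, result))
        done = llm(f"Goal: {goal}\nLatest result: {result}\nIs the goal met? yes/no")  # 4. evaluate
        if done.strip().lower().startswith("yes"):
            break
    return result
```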

Modern LLMs need to be enhanced along each dimension of the definition above, and those enhancements will probably arise from infrastructural and framework unlocks rather than purely from adding more training data or model parameters. While hype around agents has waxed and waned for close to 25 years, the enabling infrastructure is now being built, and the moment for agentic applications and software is coming.

(1) Improved Perception & Synthesis

LLMs do a fine job of modeling both syntax and semantics, but as John Searle outlines in the “Chinese Room” thought experiment, this is really just a function of sufficiently powerful information processing and pattern recognition, not necessarily indicative of “true knowledge or understanding” itself. We observe this in LLMs’ ability to confidently hallucinate obviously incorrect or nonexistent information. Adding the capacity for a more holistic understanding of the environment will be critical to ‘agency’. Some of this will certainly be solved by larger context windows alone, but better methods will probably have to be stacked on top — already we’re seeing people take extant models and context windows and get them to perform better synthesis with certain metacognitive frameworks.

Memory, Reflection, & Metacognition

Metacognition is the ability to think about how one is thinking. Being able to reflect on one’s own information processing (and to link and compare it with prior versions) allows systems to retain context across multiple actions — core to being able to develop and execute complex tasks. Other solutions will involve not just putting more memories into context, but actually knowing which memories or data are most relevant to include.
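
One simple version of "knowing which memories to include" is relevance-scored retrieval: score stored memories against the current task and only include the top few. The sketch below assumes a hypothetical embed() function standing in for any embedding model.

```python
# A sketch of relevance-based memory selection: rank past memories against the
# current task by embedding similarity and keep the top k, rather than stuffing
# everything into the prompt. embed() is a hypothetical placeholder.
import math

def embed(text: str) -> list[float]:
    """Placeholder for a call to an embedding model."""
    raise NotImplementedError

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def relevant_memories(task: str, memories: list[str], k: int = 5) -> list[str]:
    q = embed(task)
    ranked = sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)
    return ranked[:k]
```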

(2) Improved Goal Setting & Planning

Once models do a better job of synthesizing their immediate environment, they also need to do a better job of actually coming up with a plan of execution. Currently, the problem seems to be that models do not do a good job of scoping out a “well-shaped” decision tree. By that we mean the agent’s decision tree is sometimes too bushy (evaluating too many alternative paths) or too tall (evaluating too far down a particular path).

There is obviously no deterministic way to know ex ante exactly how wide or deep the decision tree needs to be. However, there are a couple of potential methods that can help us reduce this problem.

Adversarial scoping

Generative Adversarial Networks (GANs) are an old concept in deep learning — old but not that old! You might recall the LP deep dive we did on GANs in 2018. The basic premise is that one model (the generator) produces outputs and another model (the discriminator) does some kind of “grading”, until the generator passes a sufficient grading threshold. You can imagine a world where agents are paired with adversarial LLMs whose job is to ‘red-light’ a particular path of decisions as being too narrow/wide/expensive/etc.
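
As a rough illustration of what that pairing could look like: one model proposes a plan, a critic red-lights it with a reason, and the proposal is revised until the critic approves or a retry budget runs out. This is only a sketch; llm() is the same hypothetical placeholder used in the loop sketch above.

```python
# A sketch of adversarial scoping: a "generator" model proposes a plan and a
# "discriminator"-style critic rejects plans that look too wide, too deep,
# or too expensive. llm() is a hypothetical stand-in for a model call.

def llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for a model call

def scoped_plan(goal: str, max_rounds: int = 3) -> str:
    plan = llm(f"Propose a step-by-step plan for: {goal}")
    for _ in range(max_rounds):
        verdict = llm(
            "You are a critic. Reply 'APPROVE' or 'REJECT: <reason>' "
            "(e.g. too many branches, too deep, too costly).\n"
            f"Goal: {goal}\nPlan: {plan}"
        )
        if verdict.strip().upper().startswith("APPROVE"):
            return plan
        plan = llm(f"Revise the plan to address this critique.\nPlan: {plan}\nCritique: {verdict}")
    return plan  # fall back to the last revision
```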

Cooperative scoping

Alternatively, you can get agents to improve their accuracy by giving them the affordances of a ‘toolkit’ or a ‘team’. For more on this, check out the SayCan paper and the Dreamcoder paper. Teams for agents are also relatively straightforward in that you can split an agent’s workflow into the work of many subordinate agents with more limited scopes.
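
A sketch of the ‘team’ version, under the same assumptions: a coordinator decomposes the goal, hands each subtask to a narrower sub-agent (here reusing the hypothetical run_agent loop sketched earlier), and merges the results.

```python
# A sketch of cooperative scoping: split one broad goal into subtasks handled
# by subordinate agents with limited scope, then merge their outputs.
# llm() and run_agent() refer to the hypothetical placeholders sketched above.

def run_team(goal: str) -> str:
    raw = llm(f"Break this goal into 3-5 independent subtasks, one per line:\n{goal}")
    subtasks = [line.strip() for line in raw.splitlines() if line.strip()]
    results = [run_agent(task) for task in subtasks]   # each sub-agent sees only its own subtask
    return llm(f"Combine these partial results into one answer for '{goal}':\n{results}")
```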

Reflecting & Pathfinding

Beyond using some external system to help moderate an agent’s behavior, agents can also use frameworks for self-reflection and memory to improve their performance. Ideas like “Show Your Work” and “Chain of Thought” are already showing up in LLM applications where an LLM has to outline its plan to solve a problem, solve the problem, and then reflect on its solution. Merging reflection with decision trees can result in something akin to “Tree of Thoughts”. The basic intuition here is that we can take approaches to pathfinding through a graph or tree structure from classical computer science and apply them to decision trees, weighing alternative paths against going deeper down a given one.
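
One way to express that intuition is best-first search over partial chains of thought, where an evaluator score decides which branch to expand next; the breadth and depth limits are the knobs that keep the tree from getting too bushy or too tall. The sketch below is illustrative only, with llm() again a hypothetical placeholder.

```python
# A sketch of tree-of-thought-style pathfinding: classical best-first search
# applied to partial reasoning paths, with breadth/depth limits shaping the tree.
import heapq

def llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for a model call

def expand(path: list[str], breadth: int) -> list[str]:
    text = llm(f"Steps so far: {path}\nPropose {breadth} candidate next steps, one per line.")
    return [line.strip() for line in text.splitlines() if line.strip()][:breadth]

def score(path: list[str]) -> float:
    return float(llm(f"Rate from 0 to 1 how promising this partial solution is: {path}"))

def tree_of_thought(problem: str, max_depth: int = 4, breadth: int = 3) -> list[str]:
    frontier = [(-score([problem]), [problem])]        # min-heap on negated score = best path first
    while frontier:
        _, path = heapq.heappop(frontier)
        if len(path) - 1 == max_depth:                 # deep enough: return the best full path
            return path
        for step in expand(path, breadth):
            new_path = path + [step]
            heapq.heappush(frontier, (-score(new_path), new_path))
    return [problem]
```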

(3) Improved Executables and Evaluations

There will probably have to be structured ways for LLMs to actually interface with data, applications, APIs, and other agents to effect change in the “real world” of the user. We saw a version of this with ChatGPT plugins, but that fizzled out (as of this writing) — which likely has more to do with developers’ reluctance to build inside someone else’s application than with the idea of exposing their product to an LLM in some way.

Maybe code synthesis agents will automatically learn to use Stripe APIs, or use LangChain, or maybe there is still a business/product to be built standardizing APIs and making them more accessible to models. We doubt that this is an entirely trivial problem, since modern state-of-the-art LLMs still have trouble properly structuring JSON outputs, but there are some examples of people attempting to solve this in AI-native ways. The Gorilla paper, for example, describes an LLM fine-tuned specifically for making API calls. The same way a classical developer today pulls a structured code module into their software project, we imagine a world where an AI-native developer (or maybe even an agent) pulls in a task-specific model like this to handle deciding on API calls.
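
One concrete slice of that problem is simply getting a model to emit a tool call that software can parse. A minimal sketch, assuming a hypothetical llm() placeholder and a toy tool registry: ask for JSON, validate it, and feed the error back for a retry when parsing fails.

```python
# A sketch of structured tool calling: request a JSON tool call, validate it,
# and return parse errors to the model for a retry. llm() and the registry
# below are hypothetical stand-ins, not any specific product's API.
import json

def llm(prompt: str) -> str:
    raise NotImplementedError  # placeholder for a model call

TOOLS = {
    "create_payment_link": lambda args: f"payment link for {args['amount']} {args['currency']}",
}

def call_tool(task: str, retries: int = 3) -> str:
    prompt = (
        'Respond with ONLY JSON of the form {"tool": "<name>", "args": {...}}.\n'
        f"Available tools: {list(TOOLS)}\nTask: {task}"
    )
    for _ in range(retries):
        raw = llm(prompt)
        try:
            call = json.loads(raw)
            return TOOLS[call["tool"]](call["args"])
        except (json.JSONDecodeError, KeyError) as err:
            prompt += f"\nYour previous output failed ({err}). Return valid JSON only."
    raise RuntimeError("model never produced a valid tool call")
```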

AI Camp: Agents

Drawing on the above synthesis of infrastructural white space, we’ve spent some time identifying the attributes of this category that we’d hope potential Camp companies have a unique POV on:

Autonomous — an agent must be able to perceive and synthesize data and ultimately navigate its own path of decisions via that synthesis

Language — the unlock or “why now” of agents is most certainly the rise of LLMs, which can simulate some manner of cognition, providing a thoroughfare through which agents can process complex information and develop plans. Moreover, language is humans’ native processing medium — it is both incredibly precise and highly flexible, which allows for both fine-grained control and interpretability.

Domain Native Interface — the method by which a human interacts with an agent is still highly TBD. Agents may very well provide the most utility when they infiltrate an already high-touch/high-visibility interface, rather than paving their own path to the user.

Personalization — By definition (at least by our definition), agents are not about a single call and response. That’s search. Because of this, the overarching goal given by a human user likely means something different to human A than it does to human B. Infrastructure that gives an agent some measure of theory of mind (see Plastic Labs from AI Camp: Augment), rather than developing an agent that reaches for the lowest common denominator, is more likely to provide high utility.

Human Guardrails — Given that agentic AI is still in its infancy, and that it’s predominantly built off of non-deterministic LLMs, the notion of agents skittering off and executing against their own plans without any checks and balances isn’t all that attractive. Our position is that the interface for agentic AI must have some concept of choke points, wherein a human can see the pathing of the agent and either approve or disapprove of the work (a minimal sketch of such a choke point follows this list of attributes). Furthermore, agents that can outline and ‘explain’ their work on a step-by-step basis allow businesses, developers, and even legislators to navigate the next generation of AI and lay down policy.

Disposable — We don’t believe that all AI agents will be disposable, but we believe that disposable software is an important layer to consider when approaching agentic AI. Some of the most taxing and tedious tasks in our lives are things that we do only once every so often, meaning that the market has not built any high-value tool for that task. Agents represent the ability to efficiently off-load a task without the high cost of developing human-built software against it.

Tool Using — Humans are only as good as the tools they can build and wield. A great deal of the value agents can provide will come from their ability to use tools, as well (see Unakin from AI Camp: Augment).
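
On the Human Guardrails point above, the simplest possible version of a choke point is an explicit approval gate before any side-effecting step runs. A toy sketch, with a hypothetical callable standing in for the agent's action:

```python
# A toy sketch of a human choke point: surface the agent's proposed action and
# require explicit approval before anything with side effects runs.

def guarded_execute(action_description: str, run_action) -> str:
    answer = input(f"Agent proposes: {action_description}\nApprove? [y/N] ")
    if answer.strip().lower() != "y":
        return "skipped: rejected by human reviewer"
    return run_action()

# e.g. guarded_execute("send the drafted invoice email", send_invoice_email)
```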

There are roadblocks, indeed, sprinkled along the path of AI agents in their march toward prime time. Some are technical — the context capacity of these reasoning models is limited to a finite amount of data, and a model that fails at one step in its chain of reasoning and execution struggles to remember where it left off, instead starting from scratch. Others are political — if the concept of AI is sending a shiver through the public around jobs, safety, and more, then more autonomous AI is certain to attract headwinds.

We believe that the potential for value creation is incredibly high and that the timing is right to build a portfolio in this space.

Are you building agentic AI technology? Apply here.


Jordan Crook is a partner at Betaworks, where she invests in the intersection of AI and gaming. Previously, she was Deputy Editor at TechCrunch.