Most people approach Claygents the wrong way. They open Clay, think about what they want the agent to do, type something out, and ship a 200-word prompt. It kind of works. They tweak it. It still kind of works. They move on.
The problem isn’t the prompt. The problem is everything that didn’t happen before the prompt was written.
This article walks through how to think about building Claygent workflows for tasks that require real judgment, not just data retrieval. The kind of tasks where a human expert would look at a company and say “yes, this is a fit” or “no, pass” based on years of pattern recognition. The kind of tasks where a generic AI prompt will give you generic AI outputs.
The problem with starting in Clay
When you open Clay and start writing a prompt, you’re implicitly making an assumption: that you already understand the decision well enough to describe it to a machine.
Most of the time, that assumption is wrong.
Not because you’re bad at your job as a GTM engineer. Because most decisions that matter in GTM are tacit. The knowledge lives in people’s heads. A good salesperson knows what a qualified lead looks like, but if you ask them to write it down, they’ll give you a list of obvious criteria that misses all the nuance that actually drives the call.
The result is a prompt that captures the surface of the decision but not the texture of it. It gets the easy cases right and fails on everything in the middle, which is usually where most of your leads live.
A 5-step process for building multi-Claygent workflows that get things done
The fix is not to write a better prompt. The fix is to do the work that should have happened before the prompt.
Step 1: Interview the people who make this decision manually
Before writing a single word in Clay, sit down with the people who make this judgment call in real life. For a lead qualification workflow, that might be your CSO, your best SDR, your head of sales. For a content classification workflow, it might be your editorial lead. For a vendor evaluation, it might be your ops or procurement person.
Ask them how they actually think through the decision. Not how they’d explain it to a new hire, but how they really do it when they’re moving fast.
Good questions to ask: What do you look at first? What kills it immediately? What’s the thing that makes you say “definitely yes” even if other signals are weak? What’s the thing that makes you say “definitely no” even if other signals are strong? What cases genuinely give you pause? What mistakes have you seen when this decision was made wrong?
Record everything. Take notes on the messy parts especially. The hedges, the “it depends,” the exceptions to the exceptions. Those are the cases your workflow will eventually fail on if you don’t address them up front.
This step has nothing to do with AI. It’s just process discovery.
Step 2: Write the SOP before you touch the prompt
Take your interview notes and turn them into a written standard operating procedure. Not a prompt. An SOP.
The SOP should answer: what gets checked, in what order, and what does each possible output mean? It should be specific enough that a new hire could follow it to make the same decision a senior person would make. If you can’t write it without help from AI, that’s a signal you don’t fully understand the decision yet. Go back and do more interviews.
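For illustration only, a fragment of a qualification SOP might look something like this (the criteria are hypothetical, not pulled from a real client):

```
STEP 1: Hard gate on service revenue
  Check the company website for service or maintenance offerings.
  - New construction only, no service work -> DISQUALIFIED. Stop.
  - Any service work, even alongside construction -> go to Step 2.

STEP 2: Geography
  - HQ or branch in a covered state -> go to Step 3.
  - Entirely outside coverage -> DISQUALIFIED. Stop.

OUTPUT MEANINGS
  QUALIFIED: passed every hard gate; proceeds to classification.
  DISQUALIFIED: failed a hard gate; record which step killed it.
```

If a new hire could follow that and land on the same answer your best SDR would, it's specific enough.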
This document becomes the source of truth for everything that follows. The agents you build are just automated versions of steps in this SOP. The prompts you write are just structured versions of the logic in this SOP.
The reason the best Claygent prompts are long is not because more words make the AI smarter. It's because the underlying decision is complex, and the prompt has to carry that complexity. If your SOP is thorough, your prompt will naturally be thorough. If your SOP is thin, no amount of prompt engineering will save you.
Step 3: Map the SOP to agents
Once you have a solid SOP, read through it and ask: which parts of this process require research, which require classification, and which require scoring?
That mapping tells you how many agents you need and what each one does.
A common mistake is trying to do everything in one agent. One agent that researches the company, classifies it, tiers it, and scores it. This almost never works well, because each of those tasks requires a different kind of attention and a different kind of output. Agents that are given too many jobs will shortcut on some of them.
One agent, one job. If your SOP has four distinct judgment calls, you probably need four agents.
Think about the sequence too. Which agents gate the ones that follow? If a company isn’t qualified at all, there’s no reason to run the classification, tiering, and scoring agents. The first agent should be a hard gate. Only qualified leads proceed. This saves credits and reduces noise downstream.
For the TradeForce workflow in the video, this maps to four sequential agents: qualify the company, classify which trade division should own it, determine how big the account is, and score whether now is the right time to pursue them. Each agent only runs if the previous one passed. Each one has a single job and produces a single output.
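As a mental model for that sequence, the gating logic looks something like the sketch below. The functions are hypothetical stand-ins for individual Claygent columns, not Clay's API; inside Clay, you'd express each gate as a conditional run on the column.

```python
# Conceptual sketch of a gated four-agent pipeline. Each function stands in
# for one Claygent column; the early return is the hard gate that saves
# credits on disqualified companies.

def qualify(company: str) -> str:
    # Agent 1: hard gate. Returns "QUALIFIED" or "DISQUALIFIED".
    return "QUALIFIED"

def classify_division(company: str) -> str:
    # Agent 2: which trade division should own the account.
    return "HVAC"

def estimate_size(company: str) -> str:
    # Agent 3: how big the account is.
    return "Mid-market"

def score_timing(company: str) -> int:
    # Agent 4: 1-10 score for whether now is the right time to pursue.
    return 7

def run_pipeline(company: str) -> dict:
    result = {"company": company, "qualification": qualify(company)}
    if result["qualification"] != "QUALIFIED":
        return result  # gate closed: the other three agents never run
    result["division"] = classify_division(company)
    result["size"] = estimate_size(company)
    result["timing_score"] = score_timing(company)
    return result

print(run_pipeline("Acme Mechanical"))
```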
Step 4: Write prompts from the SOP, not from your head
Now you can open Clay.
By this point, the hard work is done. You know exactly what each agent needs to decide, what signals it should look for, what the outputs mean, and what the edge cases are. The prompt is just translating that into a format the agent can follow.
A few things that make prompts work better in practice:
State the output format before anything else. The very first lines of a Claygent prompt should describe exactly what the output looks like. Not “here is the context” first. The output rule first, stated aggressively. Claygents will default to verbose explanatory outputs if you don’t constrain them hard from the top.
Name the #1 mistake explicitly. Whatever the most common failure mode is for this type of task, call it out by name in the prompt. If agents tend to disqualify companies that also do service work, write a section called “The #1 Mistake” that explicitly addresses it. Agents make the same kinds of errors humans make when given ambiguous instructions.
Describe both correct and incorrect outputs with examples. Don’t just tell the agent what to do. Show it what the wrong output looks like and why it’s wrong. Show it what the right output looks like. The contrast helps more than description alone.
Use explicit logic over implicit judgment. Don’t write “use your judgment on edge cases.” Write out what the edge cases are and how to handle each one. The goal is to remove ambiguity, not to rely on the model to resolve it.
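Pulling those four rules together, the top of a Claygent prompt might look something like this skeleton (the contents are illustrative, not a production prompt):

```
OUTPUT FORMAT (READ FIRST)
Return exactly one word: QUALIFIED, DISQUALIFIED, or NEEDS_REVIEW.
No explanation. No preamble. No punctuation. One word.

THE #1 MISTAKE
Do NOT disqualify companies that do service work alongside new
construction. If service work appears anywhere on the site, the
company can still be QUALIFIED.

WRONG: "Based on my research, this company appears to be a fit because..."
RIGHT: QUALIFIED

EDGE CASES
- Website unreachable or under construction -> NEEDS_REVIEW. Never guess.
- Holding company with qualified subsidiaries -> NEEDS_REVIEW.
- Franchise location of a qualified brand -> QUALIFIED.
```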
Step 5: QA before you scale
No workflow ships without a quality check.
Run the first hundred companies through the workflow and make the same calls manually in parallel. Check the agent output against what you would have decided. Look for patterns in where it gets things wrong. Being wrong in a consistent direction (too aggressive, too conservative) tells you something different than being wrong randomly.
Once you’re satisfied the workflow is calibrated correctly, don’t stop checking. Sample every 10th or 20th result depending on your batch size. Claygents can drift in unexpected ways when the input data changes character. A workflow that works well on US companies might behave differently on international companies. A workflow tuned for large enterprises might misfire on early-stage firms.
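If your results live in an exported CSV, the sampling pass can be as simple as the sketch below (the file name and column names are hypothetical; match them to your own table):

```python
import csv

# Pull every Nth row from an exported Clay results file for manual review.
SAMPLE_EVERY = 10  # widen to 20 for very large batches

with open("claygent_results.csv", newline="") as f:
    rows = list(csv.DictReader(f))

for row in rows[::SAMPLE_EVERY]:
    # Compare the agent's call against what you would have decided by hand.
    print(row["company"], "->", row["qualification"])
```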
QA is not a one-time task. It’s an ongoing practice.
Why this process produces better Claygent prompts
I was recently talking with a GTM agency founder based in the United Kingdom. He used Claygents for qualification, but the results weren't up to the mark. I shared the Google Doc shown in the video with him, and he realized his prompts were quite short compared to mine.
When I asked him where the SOP was, there wasn’t one. The AI had helped his team member write the prompt based on a description of what the client supposedly wanted, but that description skipped all the hard cases. The prompt was 180 words.
The original prompts my example is based on took over 10 hours of brainstorming, interviews, documentation, reviews, and testing against real data, in collaboration with the CMO, CSO, CSM, SDRs, and members of the data mining team.
The result: my client could start with the most basic input possible, as little as a company name with no context whatsoever, and the workflow would find the website and LinkedIn page, qualify the company, research and map the CRM fields including tier, and prioritize hot leads.
The problem was never the prompt. The prompt was just carrying the weight of a decision process that hadn’t been designed yet.
This is the thing that separates Claygent workflows that are actually in production from the ones that almost work. The almost-working ones were built by starting in Clay. The production ones were built by starting with a conversation, writing it down, mapping it out, and only then opening the tool.
The prompt is the last thing you write. Everything before it is the work.

Neel is the founder of GTM Daily. He has a background in core marketing and is an AI enthusiast. Through GTM Daily, he is building an online space for GTM engineers and business leaders to find and share opportunities, learnings, and feedback on all things GTM engineering.