Building AI Agents just changed forever! Or did it...

72 hours testing OpenAI's agent builder - our brutally honest feedback

Hey builders,

OpenAI just made some of their biggest moves yet. And honestly? A lot of startups should be nervous.

This isn't just about ChatGPT anymore. OpenAI is positioning itself as an entire platform - one that could reshape how we build, deploy, and scale AI systems.

This week they dropped three major releases:
→ AgentKit - their new platform for building agents
→ App SDK - native apps inside ChatGPT (huge deal, we’ll cover this in depth next week)
→ Codex - improvements to their coding agent

A clear signal they're coming for the entire AI stack.

We've had our engineering team put each product through its paces over the past 72 hours to see if it really lives up to the hype.

I'm breaking down what actually matters, what's overhyped, and what you should be paying attention to.

Let's dive in.

OpenAI DevDay Announcements

AgentKit: The Agent Builder Everyone's Talking About

OpenAI just dropped AgentKit and everyone's asking: "Is this the tool that kills n8n, Make, and every workflow builder?"

My take: No. But this release is significant.

AgentKit

Here's what happened:

AgentKit is OpenAI's new framework for building agents with:
→ a drag-and-drop workflow builder UI
→ human-in-the-loop controls and guardrails
→ direct MCP integrations
→ ability to chain agents together visually

This is a big unlock for the industry. It'll mean:
→ non-technical teams building simple agent workflows (that's why the UI is the main unlock here)
→ ChatGPT-native experiences that need agent orchestration
→ quick prototyping without writing code

However... I Personally Felt Let Down by the Launch

After putting AgentKit through its paces, here's the reality check. These are my thoughts and the reflections of our engineering team after testing it for a few days:

What's actually good:

✅ The UI and ChatKit are solid - you can create a lot of variation with minimal effort. The interface is intuitive and the drag-and-drop experience works well.

✅ Easy deployment - getting agents live is straightforward, much faster than some alternatives.

✅ Guardrails are genuinely new - compared to n8n and other workflow builders, the human-in-the-loop controls and safety features are a real differentiator.

✅ Future potential - since it's built by OpenAI, there might be exclusive features down the line that could make this interesting.

But here's what's holding it back:

❌ MCPs barely work - could be early bugs, but the experience was terrible. This is supposed to be a core feature.

❌ No intelligent looping - the agent doesn't loop back to ask for missing information. It's a one-way street. If the user hasn't provided everything needed, it just fails.

❌ Still very buggy - error messages don't give clear details about what went wrong or where. Compared to n8n's debugging experience, this is far behind.

❌ Extremely limited nodes and integrations - the ecosystem is bare bones right now.
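
To make the "no intelligent looping" point concrete, here's a rough sketch of the behavior we were hoping for: check what's missing, ask for it, and only give up after a few tries. This is plain Python, not AgentKit's API. The field names, the `ask` callback, and the `max_rounds` limit are all hypothetical and only there for illustration.

```python
from typing import Callable

# Hypothetical required fields for a lead-intake workflow (not from AgentKit).
REQUIRED_FIELDS = ("name", "email", "company")


def missing_fields(collected: dict[str, str]) -> list[str]:
    """Return the required fields that are absent or blank."""
    return [f for f in REQUIRED_FIELDS if not collected.get(f, "").strip()]


def collect_until_complete(
    collected: dict[str, str],
    ask: Callable[[str], str],
    max_rounds: int = 3,
) -> dict[str, str]:
    """Loop back to the user for missing info instead of failing on the first gap."""
    data = dict(collected)  # copy so the caller's dict is not mutated
    for _ in range(max_rounds):
        gaps = missing_fields(data)
        if not gaps:
            return data
        for field in gaps:
            # `ask` stands in for one chat turn with the user.
            answer = ask(f"Could you share your {field}?").strip()
            if answer:
                data[field] = answer
    gaps = missing_fields(data)
    if gaps:
        raise ValueError(
            f"Still missing after {max_rounds} rounds: {', '.join(gaps)}"
        )
    return data
```

In a real agent, `ask` would wrap a chat turn. For a quick local test, `input` works as a drop-in.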


Beyond these issues, there are some critical limitations that aren't getting enough coverage. Understand them before you abandon your current stack:

1. Complete platform lock-in
You're trapped in OpenAI's ecosystem.

2. Model restrictions
You can only use OpenAI models... no Claude. No Perplexity. No Gemini. No mixing models for different use cases.

At Ghost Team, we use Claude for coding tasks, Perplexity for deep research and Gemini for many use cases. With AgentKit? You're stuck with one provider for everything. That's incredibly limiting.

3. Limited integrations
Compared to n8n's 400+ integrations, AgentKit is bare bones right now. Building complex, multi-tool systems? Not happening (yet).

4. Less flexibility
Want to build sophisticated automation with custom logic, webhooks, and complex data transformations? You'll hit walls fast.
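
For a sense of what multi-model flexibility looks like in practice, here's a minimal routing sketch based on the split described above (Claude for coding, Perplexity for research, Gemini elsewhere). It's an illustration under assumptions, not our production setup: the route table, the Gemini default, and the `clients` dict of callables are hypothetical, and no real provider SDK is called.

```python
from typing import Callable

# A "client" here is any callable that takes a prompt and returns text.
# In practice each would wrap a provider SDK; these are stand-ins.
ModelClient = Callable[[str], str]

# Hypothetical route table mirroring the split described in the text.
ROUTES: dict[str, str] = {
    "coding": "claude",
    "research": "perplexity",
}
DEFAULT_PROVIDER = "gemini"  # assumption: fallback for everything else


def pick_provider(task_type: str) -> str:
    """Map a task type to a provider name, falling back to the default."""
    return ROUTES.get(task_type.strip().lower(), DEFAULT_PROVIDER)


def run_task(task_type: str, prompt: str, clients: dict[str, ModelClient]) -> str:
    """Send the prompt to whichever provider the route table selects."""
    provider = pick_provider(task_type)
    if provider not in clients:
        raise KeyError(f"No client configured for provider '{provider}'")
    return clients[provider](prompt)
```

The point is that the routing decision lives in your own code, so swapping a provider is a one-line change to the table rather than a platform migration.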

The Bottom Line

I don't see AgentKit killing workflow builders from Day 1. It's an interesting option for simple, ChatGPT-centric use cases.

But if you're building serious systems that need:
→ Multi-model flexibility
→ Complex integrations
→ Custom logic and transformations
→ Platform independence

Stick with tools like n8n, Make, or build custom with LangGraph.

My recommendation? Monitor AgentKit's development, but don't rip out your existing stack. The potential is there, especially with OpenAI's resources behind it, but the execution isn't ready for production-grade agent systems yet.

We're continuing to test and build with AgentKit over the coming weeks. I'll share updates on what works, what breaks, and where it might actually fit into your workflow.

App SDK: The Next Major Distribution Play

App SDK Release

But here's the release that actually has my full attention: OpenAI's App SDK.

Apps are now inside ChatGPT. And this isn't just a feature release – this is potentially OpenAI's "App Store moment."

Think about it:

  • ChatGPT has roughly 800 million weekly users (the figure OpenAI shared at DevDay)

  • Your app can appear directly in the chat window

  • Users can sign into your product through ChatGPT and you can monetize

Example: "Figma, turn this sketch into a diagram"
→ First version appears in the ChatGPT chat window
→ You make changes directly in the chat
→ When ready for refinements, you open it in Figma

Right now, you can already access: Booking.com, Coursera, Figma, Spotify, Zillow, Expedia, and more. These companies were chosen to develop the first apps.

But soon, you'll be able to submit your own apps for review.

First Apps

OpenAI just opened up what is potentially a new trillion-dollar opportunity.

While some of the apps looked basic, remember, today is the worst an app inside ChatGPT will ever be. The people who get ahead on this wave will be the ones who can arbitrage the opportunity early.

Two critical insights:

1. If you're a company with existing data or an app, you have no choice
Your customers will eventually demand it. They're already using ChatGPT - if your competitor builds a connector first, you're behind. This isn't optional, it's survival.

2. New distribution channels rarely open up
We've seen this maybe 3-4 times in the last 20 years (Google search, iPhone App Store, Facebook platform, mobile-first). When they do, early movers build unfair advantages that compound for years.

Ghost Team will be experimenting in this space. We're already exploring how to position our tools in ChatGPT's ecosystem.

I'm going deep on this in next week's newsletter - breaking down exactly how ChatGPT's discovery algorithm works, the metadata strategies that determine if your app gets called, and what you need to know before building.

Ghost Team’s Amsterdam Meet Up

By building a company with people across the US, Europe and Asia, I’ve learned that being distributed is a superpower - but nothing beats meeting face-to-face.

Last week, we had some of our team fly in to Amsterdam for a meet up.

It was awesome getting some of us together to reflect on our journey so far, think creatively about the opportunities ahead, and plan what’s next.

We’ve got a lot to share with you all and I couldn’t be more excited about what’s next.

We’ll be sharing more soon - stay tuned!

Become an AI Agent & Automation Expert

Want to stay ahead of the AI Agent & Automation curve? Join our community. We have an insane talent density in the group.

We’ll also be bringing some new and exclusive things for members very soon.

The playbook is being written by those in the room. Join us to be one of them.

Click here :)

Feel free to drop me a reply with feedback or questions about AgentKit or any of the other OpenAI releases.

I read every reply!

Happy building,

Elliot

Learn how AI Agents & Automation can grow your business.
Book a strategy call with our team below 👇

Click here :)