
Devin.ai vs Cursor vs Codex: Who’s Winning the AI Coding Race?

Semi-autonomous AI agents for coding are evolving

👋 Happy Friday! It’s George

For the last month or so I wasn't able to contribute to the AI30 newsletter, as I was fully focused on building Promptrun. We made a couple of pivots there, and the service we're building now is fully focused on developers. It's a prompt execution and management system, think of it like a GitHub for prompts, with a bit more analytics for prompt usage, latency, and quality tracking. I'll keep you updated on that.
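
To make the "GitHub for prompts" idea a bit more concrete, here's a minimal sketch of the kind of data model such a system might track. This is purely illustrative and not Promptrun's actual schema; the PromptVersion and PromptRun names and fields are my own shorthand.

```typescript
// Hypothetical data model for a versioned prompt with execution analytics.
// Names and fields are illustrative only, not Promptrun's real API.

interface PromptVersion {
  promptId: string;  // stable identifier, like a repo
  version: number;   // incremented on every edit, like a commit
  template: string;  // the prompt text, possibly with {{variables}}
  createdAt: Date;
}

interface PromptRun {
  promptId: string;
  version: number;
  model: string;          // e.g. "gpt-4o"
  latencyMs: number;      // how long the call took
  inputTokens: number;
  outputTokens: number;
  qualityScore?: number;  // optional human or automated rating, 0-1
}

// Aggregate average latency per prompt version, the kind of
// usage/latency analytics mentioned above.
function averageLatency(runs: PromptRun[], promptId: string, version: number): number {
  const matching = runs.filter(r => r.promptId === promptId && r.version === version);
  if (matching.length === 0) return 0;
  return matching.reduce((sum, r) => sum + r.latencyMs, 0) / matching.length;
}
```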

The race is heating up 🔥 

We're watching AI coding copilots evolve into something more powerful: semi-autonomous agents. Tools like Devin, Cursor, and OpenAI's Codex aren't just completing lines anymore. They're debugging, building features, and in some cases shipping entire tasks with minimal human input.

But not all agents are built the same.

Devin.ai: The Hype Machine?

Cognition Labs made waves by demoing Devin, a full-stack autonomous engineer. It browses Stack Overflow, sets up environments, fixes bugs, and even ships pull requests. The catch? It's still in closed beta and heavily curated, and real-world generalization remains unproven. Devin is a vision more than a daily tool, for now. I tested it for quite a while and spent a couple of hundred dollars on it, and at some point I realized I was spending more time fixing bugs Devin introduced than shipping features.

Cursor: The Hacker’s Secret Weapon

Cursor is a VSCode fork supercharged with AI context awareness. It's grounded. It understands your codebase, helps navigate large files, and offers inline suggestions based on real project context. While not fully autonomous, it massively boosts dev velocity. Cursor is the most "usable today" of the three. Cursor also just introduced an agent that integrates with Slack: you tag it and tell it exactly what you want done (much the same way Devin works). You can see the demo and documentation here.

By the way, we're using Cursor agents within Promptrun. Every PR gets reviewed by a Cursor agent, which flags potential performance, security, and logic issues we might have missed.

Here's an example from our codebase of how a Cursor agent reviewed one of our PRs and gave feedback.

Codex / OpenAI DevTools: Quietly Powerful

Codex laid the groundwork, and OpenAI's dev tool demos show where it's headed: creating full React components from prompts, refactoring, writing tests. When I first tried Codex, my feeling was that "Codex is the engine, but others are building the car around it." Now, though, Codex gives the best outputs of the three, and at only $20 per month it's extremely good value. We use Codex constantly so we can work on multiple features and tasks in parallel. I strongly recommend giving it a try if you haven't yet.
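
To give a sense of what "a full React component from a prompt" looks like in practice, here's roughly the kind of result you might ask for with a prompt like "write a typed React button with a loading state." This is a hand-written illustration of the pattern, not actual Codex output.

```tsx
// Illustrative example only: the kind of component a prompt like
// "write a typed React button with a loading state" might produce.
// Hand-written for this newsletter, not actual Codex output.
import React from "react";

interface LoadingButtonProps {
  label: string;
  loading?: boolean;
  onClick: () => void;
}

export function LoadingButton({ label, loading = false, onClick }: LoadingButtonProps) {
  return (
    <button onClick={onClick} disabled={loading} aria-busy={loading}>
      {loading ? "Working…" : label}
    </button>
  );
}
```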

💡 My Take

If you’re not already experimenting with these tools, you’re falling behind. AI-native engineers are building faster, testing more, and shipping with leverage. In the next 6–12 months, knowing how to work with AI agents will be the new baseline.

📣 Let’s Stay Connected

Have a great day!

George
