How I Actually Vibecode
Apr 11, 2026
Everyone talks about vibecoding like it’s just opening Cursor and typing “build me an app.” That’s not what it is. After building Ryva mostly solo as a 16-year-old with no team, I’ve developed a real opinion on what actually works, and the gap between using AI to code and actually building fast is much larger than most people let on.

Before I get into the stack, here’s the actual proof. The entire codebases of Ryva, egeuysal.com, my brain repo, and ibx were all generated by AI. I reviewed all of them. I read every file before it shipped. But none of it was written by hand. The Ryva dashboard below, the ibx task manager below that — all of it came out of agents working with the right context. The design is real. The logic is real. The code works in production.
This is my actual setup. Not the cleaned up version. The real one.
Codex over Claude Code, and why
I run two ChatGPT Plus subscriptions. Together they come out to around $40 a month. If I switched to Claude Code for the same workload I’d be looking at the $100 plan. That number alone tells you something.
Codex is smarter for actual logic. It reasons through multi-step problems better, handles filesystem access without losing the thread, and doesn’t burn through limits nearly as fast. Claude Code’s rate limits are genuinely painful when you’re deep in a flow state. You lose context, you lose momentum, and you end up re-explaining yourself to the agent which defeats the entire point.

Where Claude Code wins is on speed for focused work. Fast UI output, quick iterations, stuff where you just need something functional in the next ten minutes. So the split I’ve landed on is simple. Claude Code for fast design passes. Codex for everything that requires actual reasoning and filesystem work.
Pick the right tool. Don’t be loyal to one just because you paid for it.
Automations, not just chat
The real unlock for me was not using agents to write code on demand. It was using them to replace repetitive daily work entirely. I have four automations that run every day at 6pm inside the Codex desktop app.
- Create Daily Tasks for my personal site
- Create Founder Posts pulling from my brain repo
- Find 10 Outreach Posts scanning for targets
- Find X Posts doing the same for X
These aren’t vibecoding in the traditional sense. This is closer to building an operating system on top of AI. I wrote about that separately so I won’t go deep here. But the point is that you should not limit agents to the editor. They can run your entire day if you let them.
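If you’re not in the Codex desktop app, you can approximate the same daily schedule with plain cron and the CLI’s non-interactive mode. This is a sketch, not my actual setup: the prompt file paths are placeholders, and you should verify `codex exec` against the version of the CLI you have installed.

```
# crontab -e — fire the same jobs at 6pm daily
0 18 * * * cd ~/projects/site && codex exec "$(cat prompts/daily-tasks.md)"
0 18 * * * cd ~/brain && codex exec "$(cat prompts/founder-post.md)"
```

The point is the same either way: the agent runs on a schedule, not on demand.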
Make repetitive tasks agent-friendly before you hand them off
Before you throw a complex task at an agent, make it easy for the agent to succeed. If you need it to fetch data from Reddit, write the fetch script first. Hand the agent a working tool, not a vague instruction. Agents fail on friction more than anything else. Remove the friction before you add the agent.
Here’s a simple example. Instead of saying “go scrape the top posts from this subreddit,” I give Codex a script that already handles the Reddit API auth and returns structured JSON. Then I say “use this script, filter posts where engagement is above 100, and write me a list of hooks I can respond to.”
# fetch_reddit.sh - run this first, give codex the output
python3 scripts/fetch_reddit.py \
  --subreddit SaaS \
  --limit 25 \
  --min_score 100 \
  --output posts.json
# scripts/fetch_reddit.py
import argparse
import json

import praw

parser = argparse.ArgumentParser()
parser.add_argument("--subreddit", required=True)
parser.add_argument("--limit", type=int, default=25)
parser.add_argument("--min_score", type=int, default=0)
parser.add_argument("--output", default="posts.json")
args = parser.parse_args()

reddit = praw.Reddit(
    client_id="YOUR_ID",
    client_secret="YOUR_SECRET",
    user_agent="ryva-outreach/1.0",
)

# Oversample hot posts, since some will fall below the score threshold.
posts = []
for post in reddit.subreddit(args.subreddit).hot(limit=args.limit * 3):
    if post.score >= args.min_score:
        posts.append({
            "title": post.title,
            "score": post.score,
            "url": post.url,
            "comments": post.num_comments,
        })
    if len(posts) >= args.limit:
        break

with open(args.output, "w") as f:
    json.dump(posts, f, indent=2)
print(f"Saved {len(posts)} posts to {args.output}")
You write the script once. The agent uses it forever. That’s the compounding return.
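The filtering step in the prompt above is the kind of post-processing that becomes trivial once the data is structured. A minimal sketch, using the field names from the fetch script:

```python
def filter_posts(posts, min_comments=100):
    """Keep posts with enough engagement, sorted most-active first."""
    hot = [p for p in posts if p["comments"] >= min_comments]
    return sorted(hot, key=lambda p: p["comments"], reverse=True)

# Usage against the fetch script's output:
#   import json
#   posts = json.load(open("posts.json"))
#   for p in filter_posts(posts):
#       print(p["comments"], p["title"])
```

The agent can write this step itself in seconds, precisely because the hard part — auth, rate limits, structure — was already solved once in the tool you handed it.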
Prompts matter less than context
Specific prompts help. But context is what actually moves the needle. We’re past the era where prompt engineering is the main skill you need to develop. AI no longer needs perfectly crafted sentences to produce good output. It needs to understand your project deeply, and that understanding has to be built into the environment you give it.
Three things give agents real context.
Memory files live in your repo and tell the agent about your project. Your design system, your conventions, your stack decisions, your naming patterns. Without memory files the agent is starting from zero on every session. With them it already knows you use Convex, you name routes in kebab-case, and your primary color is whatever you set in globals.css.
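Here’s a sketch of the shape a memory file can take. The file name and every rule below are illustrative — yours should encode your own stack and conventions:

```markdown
# AGENTS.md (memory file)

## Stack
- Next.js App Router, Convex for data, shadcn/ui for components.

## Conventions
- Routes are kebab-case: /task-manager, not /taskManager.
- Theme tokens live in app/globals.css; never hardcode colors.

## Decisions
- Auth goes through the existing provider; don't add a second auth path.
```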
Skill files make agents actually competent at specific tasks. The clearest example is shadcn/ui. If you don’t give the agent a shadcn skill, it will try to hardcode components from memory instead of using the CLI. The shadcn skill tells it how shadcn actually works and what commands to run. Here’s what a basic skill file looks like for context.
# shadcn skill
## When to use this skill
Use when the user asks to add any UI component to a Next.js project.
## How to add a component
Always use the CLI. Never hardcode shadcn components manually.
npx shadcn@latest add button
npx shadcn@latest add card
npx shadcn@latest add dialog
## File locations
Components are added to /components/ui automatically.
Never create files in this directory manually.
## Theming
Edit /app/globals.css to change CSS variables.
Never edit component files directly for styling.
I also use the caveman skill which compresses agent context by around 67 percent. When you’re running long sessions this matters a lot. Less context burned means more actual work done per session.
MCP servers are something most people have stopped installing because skills cover a lot of ground now. But I still run a few I’d recommend to everyone. Exa for web search. Resend for sending emails programmatically. Context7 for pulling in the latest documentation when a skill doesn’t cover a new API version. And Composio is worth looking at. I haven’t used it myself but people I trust have said it’s genuinely good for connecting agents to external services without writing your own integration layer.
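For reference, in Claude Code these servers get wired up in a project-level .mcp.json. The package names below are assumptions — check each server’s README for the current install command:

```json
{
  "mcpServers": {
    "exa": {
      "command": "npx",
      "args": ["-y", "exa-mcp-server"],
      "env": { "EXA_API_KEY": "your-key" }
    },
    "context7": {
      "command": "npx",
      "args": ["-y", "@upstash/context7-mcp"]
    }
  }
}
```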
Here’s what my actual skills directory looks like. Each of these is a folder the agent can load when it needs to do that specific thing. The ryva-execution skill is the one I wrote myself. It carries all the product context so I never have to re-explain what Ryva is, what the data model looks like, or how the agent integration works.
.
├── caveman
├── caveman-commit
├── caveman-review
├── compress
├── figma-implement-design
├── find-skills
├── pdf
├── ryva-execution
├── screenshot
├── security-best-practices
│   └── references/
│       ├── golang-general-backend-security.md
│       ├── javascript-typescript-nextjs-web-server-security.md
│       └── javascript-typescript-react-web-frontend-security.md
├── sentry
├── taste
├── vercel-deploy
└── yeet
The taste skill is one people underestimate. It gives the agent an opinion about design. Without it, agents produce output that is technically correct and visually forgettable. With it, the agent pushes toward something that actually looks like it was made by someone who cares. That’s how Ryva’s dashboard ended up looking like the screenshot above instead of looking like a default shadcn template.
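A taste skill is nothing more than written-down design opinion. A sketch of the shape — these specific rules are illustrative, not the actual file:

```markdown
# taste skill (illustrative sketch)

## Defaults to avoid
- No stock shadcn gray-on-white; pick a palette before writing components.
- No more than two font sizes per view without a reason.

## Push toward
- Generous whitespace, tight vertical rhythm.
- One clear primary action per screen.
```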
For UI, show, don’t tell
Don’t describe what you want the UI to look like. Show it.
Give the agent screenshots before you give it any written instructions. Better yet, use the Figma MCP server so the agent can read your design files directly and pull the exact values. Even better than that is paper.design which is a new Figma alternative that can read designs, generate them, and export directly to production-ready code. That pipeline is genuinely fast and cuts out a translation layer that causes a lot of errors.
I also use the Playwright MCP server combined with a screenshot skill. This lets the agent actually control the browser and see what it’s looking at. When something breaks visually, the agent captures it itself instead of relying on your description. You stop saying things like “the button is slightly off to the left” and the agent just sees it.
# openai.yaml for screenshot skill
name: screenshot
description: Takes a screenshot of the current browser state and returns it as context
tools:
  - name: take_screenshot
    description: Captures the current viewport
    parameters:
      url:
        type: string
        description: URL to navigate to before screenshotting
      full_page:
        type: boolean
        default: false
One more thing on the testing side. Use your tools’ CLIs before you start mocking things. If you’re building on Convex, reproduce the database error using the identity flag so the real system surfaces the real error. Don’t mock what you can actually hit.
# reproduce convex error as a specific user instead of mocking auth
npx convex run myFunction --identity '{"subject": "user_123", "issuer": "https://yourapp.com"}'
Review everything except UI
This is the part people skip and then regret publicly. Don’t ship code you haven’t read. UI is the exception because visual output is something you can evaluate just by looking at it. Logic is not. Read the logic. Every time.
I use the security skill on every project. It checks against known vulnerability patterns for your specific stack. There are separate references for Next.js, Go, React, and more. The agent runs through it before anything gets pushed.
skills/
├── security-best-practices/
│   ├── references/
│   │   ├── golang-general-backend-security.md
│   │   ├── javascript-typescript-nextjs-web-server-security.md
│   │   ├── javascript-typescript-react-web-frontend-security.md
│   │   └── ...
│   └── SKILL.md
Don’t be another story about an AI-generated app leaking user data because nobody read what the agent wrote. That’s an embarrassing and completely avoidable problem.
The actual stack
Skills I use constantly: caveman for context compression, caveman-commit and caveman-review for git workflows, security-best-practices on every project, screenshot and figma-implement-design for UI work, vercel-deploy for shipping, sentry for monitoring, and my own ryva-execution skill that carries all the Ryva-specific context so I never have to re-explain the product to an agent.

MCP servers: Exa, Resend, Context7. Playwright for browser work.
Tools: Codex desktop app for heavy lifting and all my automations. Claude Code for fast UI passes when I need something visual in under ten minutes. paper.design for the design-to-code pipeline. The p CLI I built for uploading screenshots to my CDN. Shipr as my SaaS scaffold so I’m not rebuilding auth and billing from scratch on every new project.
That’s the real stack. Not one AI tool, not one workflow. A set of layers that compound on each other. The agent is only as good as the context you give it and the context is only as good as the systems you build around it. Get those systems right once and everything downstream gets faster.
Look at the ibx screenshot again. That’s a task manager built entirely by agents using this exact stack. The design is clean, the data model is solid, the CLI works. I use it every day to run my schedule. That’s the bar you can hit when the context is right and the agent isn’t guessing.
The goal is not to write less code. The goal is to build more, faster, with higher quality than you could alone. That’s what this stack actually does when you put it together properly.