Wild Wild Vibecoding
Vibecoding promised democratization. We got blindcoding instead.
Something has been bothering me lately while working with AI dev agents, and it’s not the usual “AI makes mistakes” stuff.
It’s that I genuinely don’t know what I’m paying for anymore.
The pattern
I’ve been building tools and exploring concepts on the major vibecoding platforms for a few months now. The pattern is consistent enough that I stopped doubting my perception.
Early in a project, the agent is genuinely impressive. It parses complex logic, handles edge cases I hadn’t even thought of, and refactors code across multiple files without breaking things. Progress feels fast, almost effortless.
Then, at some later stage, something shifts.
Same codebase. Same me. But now simple bugs take six attempts to fix—each “fix” introducing new problems. The agent confidently proposes changes that break functionality it had built days earlier. Reasoning that used to be precise becomes vague, generic, sometimes wrong in ways that feel almost lazy.
I’ve seen the same pattern on different platforms. Out of the blue, the behavior just gets worse. No warning. No changelog. No explanation.
It drives me crazy. I’ve built projects close to production, only to abandon them entirely—not because the idea was wrong, but because I ran out of patience watching tokens burn on the simplest fixes.
I can’t prove the model changed. But I can tell you the behavior changed—dramatically, and not because my project got harder.
Vibecoding was supposed to be the democratization of software development. What we got instead is blindcoding—building with tools that won’t tell you what’s actually running, what changed, or why the quality just collapsed.
The business model problem
Let’s be honest: vibecoding platforms don’t run a single model. They mix and route across models depending on task, time, load, or cost. This isn’t a secret—it’s a rational product strategy to optimize inference spend, and task-specific, multi-model routing is openly documented as a cost–quality optimization by AI tooling providers themselves (see, for example, Prompts.ai’s overview of task-specific model routing).
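To make the incentive concrete, here is a minimal sketch of what such a router might look like from the inside. Everything in it is assumed for illustration: the model names, prices, quality scores, and the “cost pressure” dial are my own inventions, not any platform’s real configuration.

```python
# A minimal sketch of cost-based model routing on a hypothetical platform:
# the model names, prices, quality scores, and thresholds below are
# illustrative, not any vendor's actual configuration.
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # what inference costs the platform, in dollars
    quality_score: float       # internal benchmark score, 0..1


CATALOG = [
    ModelOption("frontier-large", cost_per_1k_tokens=0.015, quality_score=0.95),
    ModelOption("mid-tier",       cost_per_1k_tokens=0.004, quality_score=0.80),
    ModelOption("budget-small",   cost_per_1k_tokens=0.001, quality_score=0.60),
]


def route(task_complexity: float, cost_pressure: float) -> ModelOption:
    """Pick the cheapest model whose quality clears the bar for this task.

    task_complexity (0..1) would come from a classifier over the prompt;
    cost_pressure (0..1) is the platform's own dial: current load, margin
    targets, how far into the billing cycle the user is.
    """
    required_quality = task_complexity * (1.0 - 0.3 * cost_pressure)
    eligible = [m for m in CATALOG if m.quality_score >= required_quality]
    # Cheapest eligible model wins; fall back to the best model if none qualify.
    if not eligible:
        return max(CATALOG, key=lambda m: m.quality_score)
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


# Same request, same subscription price; a different model depending on the dial.
print(route(task_complexity=0.9, cost_pressure=0.1).name)  # frontier-large
print(route(task_complexity=0.9, cost_pressure=0.9).name)  # mid-tier
```

There is nothing sinister about that logic in isolation; routing is how you keep inference affordable at scale.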
What is problematic is pretending this is just an internal detail.
If output quality depends on the model, then model switching is not a neutral abstraction. It’s a material condition of the service.
You can’t advertise a certain level of capability, charge for it, and then silently deliver something else once the economics change—while keeping the same subscription and “credits” price. This kind of silent model change isn’t just anecdotal: enterprise-oriented observers have documented the risks when vendors switch underlying AI models without notice, leaving users scrambling to diagnose degraded behavior (see, for example, “The Hidden Switch” on LinkedIn).
Imagine paying a top law firm partner rate and later discovering the work was mostly done by a junior associate. No disclosure. No repricing. “Proprietary staffing strategy.”
That wouldn’t fly anywhere.
Why should this be acceptable in AI?
The opacity tax
What bothers me most is not that models are switched—it’s that I’m given no way to know, accept, or price in that behavior.
If a platform intends to dynamically route across cheaper and more expensive models, fine—but say it upfront and reflect that variability honestly in pricing. Give me predictability. Give me the option to opt in or out.
Otherwise, from the user’s perspective, it looks less like optimization and more like margin expansion at the expense of customers who thought they were paying for something else.
I pay in “credits,” but everything that actually matters is hidden:
which model is running
whether it changes mid-session
whether I’m being throttled to cheaper inference
whether my credits reflect compute or internal margin smoothing
This kind of unexplained variance isn’t unique to vibecoding platforms. Even direct users of foundation models have reported major differences in output quality between interfaces (for example, between OpenAI’s Playground and API), despite using ostensibly the same models and prompts—differences later attributed to hidden defaults, system prompts, or configuration layers. The problem isn’t that variance exists; it’s that users are given no clear signal when or why it happens.
What I do know is when quality drops.
Blindcoding isn’t a bug. It’s a business model.
And like any business model, it has incentives—and side effects.
It is increasingly taken for granted that platforms can mix and route models at will, regardless of the practical impact on users’ projects.
The psychological cost
Here’s what nobody talks about: when you can’t verify what changed, you start blaming yourself.
Did I prompt badly?
Is my codebase too messy?
Am I asking for something unreasonable?
Should I just drop this whole vibecoding thing and applaud the “AGI” discourse from the sidelines, without trying to see what it’s actually about?
Maybe I’m just tired.
The opacity isn’t neutral. It shifts the burden of explanation onto the user. You’re conditioned to doubt your own judgment about quality variance that’s actually on the platform’s side.
That’s not a bug in the system. It’s a feature of the business model—one that works precisely because you can never prove what happened.
The regulatory gap
In the EU, when I buy a service, I’m entitled to know what I’m paying for—and what materially affects its quality.
The AI Act pushes toward responsibility, traceability, and predictability. Article 50 and the general-purpose AI (GPAI) transparency obligations require providers to document model capabilities and make certain information available to downstream providers who integrate those models into their systems.
But here’s the gap: vibecoding platforms are downstream providers. They integrate foundation models into their own products. They may receive transparency information from model providers—and then absorb it, keeping users in the dark about what’s actually running.
That’s not a technical limitation. It’s a choice.
The AI Act’s transparency rules came into force in August 2025, with full enforcement in August 2026. Practices like silent model downgrades create exactly the kind of opacity the EU is increasingly uncomfortable with: users experience degraded behavior without explanation or recourse.
That’s the opposite of where regulation—and enterprise trust—is heading.
What I’m actually asking for
To be clear: I’m not asking for open weights or raw logs.
I’m asking for something much simpler:
tell me which model I’m using
tell me when it changes
tell me what my credits actually buy—and how variability is priced into the plan
Some platforms are starting to move in this direction. OpenAI now labels when routing occurs, though community reports note occasional discrepancies between Playground and API outputs, suggesting backend routing or fallbacks that are not always surfaced upfront. Tools like Cursor let you manually or automatically select models. Others, like LLM Gateway, document their routing approach and give developers explicit control over model selection and routing criteria. That’s the floor, not the ceiling, but it’s a start.
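For comparison, here is a rough sketch of the kind of per-request disclosure I mean. None of this mirrors a real vendor API; the field names, the RoutingEvent structure, and the numbers are hypothetical, just enough to show that the ask is a metadata problem, not a research problem.

```python
# Hypothetical per-request transparency receipt. Every field name here is
# invented for illustration; no platform currently exposes this structure.
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class RoutingEvent:
    timestamp: datetime
    previous_model: str
    new_model: str
    reason: str  # e.g. "load shedding", "task classified as simple"


@dataclass
class RequestReceipt:
    model: str                # which model actually served this request
    routed_mid_session: bool  # did the model change since the session began?
    credits_charged: float    # what this request cost me in plan credits
    routing_history: list[RoutingEvent] = field(default_factory=list)


# What I would want attached to every agent reply:
receipt = RequestReceipt(
    model="mid-tier",
    routed_mid_session=True,
    credits_charged=1.2,
    routing_history=[
        RoutingEvent(
            timestamp=datetime.now(timezone.utc),
            previous_model="frontier-large",
            new_model="mid-tier",
            reason="load shedding",
        )
    ],
)
print(receipt.model, receipt.routed_mid_session, receipt.credits_charged)
```

Nothing in that receipt exposes weights, prompts, or proprietary routing logic. It just names what I’m being billed for.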
I’m not against abstraction. I’m against blindcoding. There’s a difference. And for now, blindcoding seems—at least from my experience and that of other users—the norm rather than the exception.
The bottom line
You can feel when the model behavior changes.
The reasoning gets shallower.
The structure degrades.
The hallucinations creep in.
But you can’t prove it—because the system is designed so you can’t.
Is this really how we want the future of AI software development to look?
Thank you for reading,
vasile




