26 April 2026 · 5 min read
Post #1When Claude gets it wrong (and I push back)
One of the most persistent misconceptions about AI build is that AI is always right. Or almost always. Or often enough that you can trust it.
That's not true. AI is very good, also in my experience. But it's also confident about things it shouldn't be sure about. If you don't push back, you get a polished first impression and a choice that might not be the best one.
This week I had a concrete example I want to share. Not to bash Claude, but to show how the six-eyes principle works in practice.
The situation
For playitsmart I have an architecture where two different workloads each need their own place:
- The website (Next.js, people visiting playitsmart.nl)
- The Python trading job that runs every morning, fetches FMP data, calculates scores, and later places orders via IBKR
I asked Claude earlier what hosting we should use. The advice was clear: Vercel for the website, Render for the trading job. Two platforms, each good in their own role.
All correct. But while setting up, the question came back: does it really need two platforms? Can't I do everything on Vercel, or everything on Render? Simplifies my life. One dashboard, one bill, one thing to learn.
First answer from Claude
I asked. Claude gave a well-reasoned answer. Vercel fundamentally cannot do the Python trading job because:
- Vercel is serverless: functions run for max 60 seconds to 5 minutes, my job takes 10 to 15 minutes
- IBKR API works through a stateful TCP connection, Vercel cannot fundamentally host that
- Python as a primary runtime is limited on Vercel, Node.js is at home there
Strong answer with a table of concrete limits and alternatives. Render was recommended for three reasons: server-based so long jobs are possible, native Python support, scheduled tasks as a first-class concept.
So far convincing.
But then: pushing back
I believed the part about Vercel. But I wanted to know: why Render specifically? There are more platforms in that segment. Railway is popular, Fly.io too. I asked:
"And how do you see Railway? It's also widely used, isn't it? Are you sure that wouldn't fit better?"
The second answer from Claude was different from the first. No repetition of the Render recommendation, no defense. Quite the opposite:
"Good question to push on. You're absolutely right that Railway is very popular, and honestly: for many projects it's the better choice. Let me be honest, not stubbornly hold on to my first proposal."
Then a real comparison. Railway has a nicer UI, faster deploys, usage-based pricing. Render has longer request timeouts, more predictable costs, native cron support.
And then came what I was waiting for: the specific weighing for my use case. Not generic "Render is better", but concrete:
- My cron job takes 10-15 minutes, fits Render without workarounds
- My IBKR Gateway needs to run 24/7 later, Railway's credit-based pricing can suddenly become expensive
- Cron jobs are first-class on Render, on Railway you have to build workarounds
Conclusion: for my use case, Render. For another project, Railway could easily be the first suggestion.
What happened here
Two things I want to call out.
One: Claude's first answer wasn't wrong, but also not complete. It presented Render as if it were the obvious choice, without mentioning that there are perfectly fine alternatives. That's a pattern AI does more often. A motivated recommendation sounds like THE answer, while it's actually one of multiple good answers.
Two: Claude didn't defend his first choice. When pushed back, he immediately admitted it was more taste than objectivity. After that came a well-reasoned final choice, based on my specific context.
This is how I want it to work. Not: AI always gets it right. But: AI gives a thoughtful first answer, and through pushback a better weighing emerges.
Where this can go wrong
Suppose I hadn't pushed back. Then I would have chosen Render based on a recommendation I hadn't fully weighed. Maybe Render works well, maybe Railway would have worked for me too. I wouldn't have known.
Worse: suppose I had felt Railway was nicer, and still chose Render because "the AI said so". Then I would have been frustrated after a week, without understanding why.
Pushing back is not distrust. It's responsible decision-making. My choice, my project, my risks.
What I recommend to people starting out
Three rules I hold for myself:
Rule one: With every recommendation, push back at least once. Not "is this good?", but "what are alternatives and why exactly this one?"
Rule two: Give AI permission to reconsider its first answer. Literally ask: "are you sure this fits best, or is it more taste?" The difference in answer can be significant.
Rule three: Decide for yourself. Not because AI is better or worse, but because you bear the consequences. If the hosting is wrong, you pay. Not AI.
Finally
I use AI every day for this project. For design, architecture, review, content. It speeds me up by a factor of ten.
But that speedup only works if I keep playing the critical role. The six-eyes principle isn't a luxury. It's why the output is reliable. Three perspectives on every decision: me, Claude (thinking and review), and Cursor (building). None of them can be skipped.
When in doubt: push back. When AI is certain: push back twice.