7 May 2026 · 5 min read
23,125 historical scores in one afternoon
From five hours debugging to fifteen minutes per refactor. Not because I'm faster. Because I started writing down what I learned.
In classic software development, what I did recently would have been a two-week sprint. Maybe three. A team of four, daily standups, code reviews, a sprint retrospective. At the end, a shipped feature with a changelog entry.
What I did: one afternoon. Alone. 23,125 historical score points generated, validated, and written to a database. Four different factor implementations, each with its own calculation model, all working.
Not because AI tools type ten times more code per minute. It's something else, and it has to do with learning from mistakes.
What happened
My trading system calculates four factors: Value, Quality, Momentum, and Catalyst. Each is a separate module that calculates a score for 552 stocks on a given date.
For the backtest I needed historical scores. Not just for today, but for every month since June 2022. That's 51 months of dates, across four factors and 552 stocks: 23,125 score calculations in total, plus all the related fundamental data.
In one afternoon everything was calculated and stored in the database. Not by magic. By something more specific.
The first time was hard
My Momentum factor, the first one I built, cost me three iterations. Three painful iterations.
Iteration 1: Cursor wrote code that worked for one date. I ran it on multiple dates. Crash.
Iteration 2: Problem found: I was calculating across different trading days as if they were all one list. Cursor rewrote. Worked for most dates but gave odd scores for a few.
Iteration 3: The bug was in how I fetched the price history. For each stock separately, I had to grab the most recent N trading days, not a fixed number of calendar days. Cursor rewrote. Worked.
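The fix from iteration 3 can be sketched in a few lines of pandas. This is a minimal illustration, not my actual code: the long-format layout and the column names (`ticker`, `date`, `close`) are assumptions.

```python
import pandas as pd

def trailing_window(prices: pd.DataFrame, as_of: str, n_days: int = 252) -> pd.DataFrame:
    """Return the most recent n_days *trading* days per ticker, as of a date.

    prices: long-format frame with columns ticker, date, close
    (illustrative schema, not the real one).
    """
    past = prices[prices["date"] <= as_of]
    # Group per ticker, so a ticker with missing sessions keeps its own
    # last N trading days instead of being cut by someone else's calendar.
    return past.sort_values("date").groupby("ticker", group_keys=False).tail(n_days)
```

The point is the `groupby(...).tail(n)`: a fixed calendar-day slice silently drops rows for tickers that didn't trade every day, which is exactly what produced my odd scores.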
Iterations 1 through 3 took about five hours of my day: two hours letting Cursor code, two hours testing, one hour debugging. A whole afternoon.
And then came the second one
When Momentum worked, I thought: now I have to do this three more times for Value, Quality, and Catalyst. Five hours each. Fifteen extra hours.
But before I started on Value, I did something I'd never done before. I wrote down what I had learned.
Not as a blog, not as documentation for others. As a list of principles that had to apply to my factor implementations. Three lines:
- No fixed windows in scripts. Always parameterizable.
- Use per-ticker grouping for cross-period calculations.
- For fundamental data: 60 day buffer for look-ahead bias.
Three lines. Nothing groundbreaking. Standard knowledge for anyone who has done data analysis. But three lines that had cost me five hours of debugging to discover.
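As a sketch of what those three principles look like in code (illustrative only: the column names, window lengths, and long-format schema are assumptions, not my actual implementation):

```python
import pandas as pd

LOOKBACK_DAYS = 126        # principle 1: no fixed windows; everything is a parameter
FUNDAMENTAL_LAG_DAYS = 60  # principle 3: fundamentals only "known" ~60 days after period end

def momentum_scores(prices: pd.DataFrame, as_of: pd.Timestamp,
                    lookback: int = LOOKBACK_DAYS) -> pd.Series:
    """Price momentum per ticker: return over the last `lookback` trading days."""
    past = prices[prices["date"] <= as_of].sort_values("date")
    # Principle 2: per-ticker grouping, so each ticker uses its own trading days.
    window = past.groupby("ticker", group_keys=False).tail(lookback + 1)
    first = window.groupby("ticker")["close"].first()
    last = window.groupby("ticker")["close"].last()
    return (last / first - 1.0).rename("momentum")

def usable_fundamentals(fundamentals: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Keep only rows whose period ended at least FUNDAMENTAL_LAG_DAYS before as_of,
    so the backtest never sees a filing before it was actually published."""
    cutoff = as_of - pd.Timedelta(days=FUNDAMENTAL_LAG_DAYS)
    return fundamentals[fundamentals["period_end"] <= cutoff]
```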
I took those three lines and pasted them as requirements at the top of my next prompt for Cursor. "Build the Value factor. Take these three principles into account."
Value was done in fifteen minutes.
After that it went fast
- Quality factor: fifteen minutes. Cursor already knew what not to do.
- Catalyst factor: fifteen minutes. Same story.
From Momentum (five hours) to each of the other three (fifteen minutes apiece, 45 minutes total) is a factor of 20 in pace. Not because I had become a better coder. Not because Cursor had become faster at typing. Because I had learned what the pitfalls were and explicitly communicated them to Cursor.
Lessons-driven prompts, I call it now. The idea: after every painful debugging session, write down the three to five principles you learned from it. Reuse that list as requirements in your next prompt. Cursor does what it's supposed to do, with the constraints you discovered with pain last time.
Why this works
Classic software development knows this pattern too, but informally. Senior engineers learn what junior engineers don't. Someone who's been doing Java for ten years knows which patterns work and which don't, and applies them implicitly. A new engineer makes the same mistakes, learns them anew, and slowly becomes senior.
With AI that knowledge isn't built in. Cursor knows a lot, but not what specifically applies to my project: my database structure, my data sources, my conventions. I have to tell it. Every time.
But I don't have to start from scratch. I can keep a list of what I've learned. Comparable to a personal codebook. For each type of task: here are the mistakes I've made in the past, here are the assumptions that turned out to be wrong, here are the patterns that do work.
For my factor refactors: three lines in one note. From five hours down to fifteen minutes. For other tasks the list would be different.
What this means for others
I'm building a trading system because I find it fun, and because I want to see how far you can get with AI when you have a specific goal. But this lessons-driven pattern doesn't only work for trading systems. It works for every complex software project where you let AI code.
Three tips for anyone who wants to try this themselves:
Tip one: after every debugging session, write down what you learned. Not as an essay, but as bullet points. Three to five of them.
Tip two: keep those lessons somewhere central in your project. A LESSONS_LEARNED.md file works well. For me it's a living document I can consult at every prompt.
Tip three: actively use those lessons as requirements in new prompts. Not "read my LESSONS_LEARNED.md," because the AI won't do that on its own. Specifically copy-paste the three lines that are relevant for this task, at the top of the prompt.
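Tip three can be as mechanical as a small helper that prepends the relevant lessons to every prompt. A sketch (the function name and the prompt format are made up for illustration, not part of any tool):

```python
def build_prompt(task: str, lessons: list[str]) -> str:
    """Prepend hard-won principles as explicit requirements to a coding prompt."""
    requirements = "\n".join(f"- {lesson}" for lesson in lessons)
    return (
        "Requirements (learned from earlier debugging):\n"
        f"{requirements}\n\n"
        f"Task: {task}"
    )
```

The only thing that matters is that the lessons land at the top of the prompt, verbatim, instead of sitting unread in a file the AI never opens.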
The thread
What a programmer used to do: come up with the solution and type the code. What I do now: I come up with the solution, formulate the requirements, and let the code be typed by someone else. That someone else, in this case, is Cursor.
The result: more time to think about what the right solution is, less time on syntax. But also: a new kind of work, namely curating my own lessons. What have I learned that I need to pass on next time?
That knowledge can sit in your head, but that's fragile. For me it sits in a file, one I literally copy into Cursor when relevant. The difference between implicit and explicit suddenly matters.
For my 23,125 score points in one afternoon it meant: five hours of making mistakes, then 45 minutes of profiting from those mistakes. Not coding fast, but coding smart. Pace is not the win. The discipline to keep track of what you learn, that's the win.