On AI

This is a brain dump of my thoughts including many assumptions based on my gut feeling. I’m hoping these thoughts stop haunting me at 3am.

AI is not hype

The Claude Code release around December changed things for me. I only noticed in February though: almost all the hate AI gets is rooted in truth, so I had been ignoring it.

Calculators reduced the need to add manually, but didn’t reduce the need to know how percentages work. You can generate “slop” with a calculator if you don’t know what you’re doing.

Before the latest LLMs, software developers were like the architect, civil engineer, and construction worker on a construction site. But now, it’s like we are a high level architect and we have a conveyor belt dropping workers with an incredible memory every 30 minutes. The problem is, they don’t know much about the project. We need to guide them.

If we’re just prototyping, the work is reduced to an extreme. Weeks become hours. But if it’s real work, we should still validate everything, and the more novel the work, the more likely we need to do it ourselves.

Vibe coding is prototyping

I think “vibe coding” and “something something debt” are bad descriptions.

The issue here is, some people assume all code should be maintainable, and some people assume AI can create better software as long as you come up with the right prompt.

Just use vibe coding for prototypes that you throw away. Debt implies paying at a later time, but there is no debt when you throw away something.

Want to keep the code? Verify it, understand it, become familiar with it!

They should be named something like “prototype coding” and “codebase familiarity”.

Text files are having a moment

Some people say it’s filesystems having a moment but to me it looks more like text having a moment on personal PCs.

I live in a terminal, constantly looking at text diffs, I use a tonne of markdown files, and prompts all the time! Everything on my PC, which I control. Instead of online sessions, I connect remotely via my phone to some folder on my PC.

And the extra work is not that extensive.

On the other hand, when some application raises its prices or removes a feature I liked, here I am needing to migrate to another place that might do the same in 2 years. My data, I’m the one controlling it.

Verifying is paramount

Which translates into:

Reading is paramount
Diffing is paramount

The complaints about “too much code to review” sound like nonsense. It’s been a while since people stopped hunting for a living. Back when they did, they didn’t need to go to the gym.

If I don’t go to the gym, I atrophy. And my hunting capabilities are non-existent. And I’m fine with that.

I prefer to work out my brain by reading code, understanding changes (diffing), getting familiar with codebases, learning new high-level concepts.

I’m first and foremost a builder, so I’m fine if the building blocks are put in place by someone (or something) else.

Will I miss fighting with an API to write 5 lines of code that won’t change ever again? I’ll miss it as much as I miss hunting.

Code familiarity is king

Navigating an unknown codebase is like trying to find your way in a house you don’t know while the lights are turned off. And good luck trying to find the light switches!

So you’ll struggle to change, maintain, and debug these mystical codebases.

There are some ways to get to know a codebase:

Writing the code
Reading the code: During code reviews, or while you’re finding your way writing the code
Checking docs, diagrams, tests, talking with people, etc.

With AI, writing code becomes minimal. So that’s somewhat out of the equation.

While code reviewing I want to 1) improve the codebase and 2) learn/explain the codebase. Enter AI again, and I’m basically learning what’s going on from code reviews. And what am I supposed to do with a diff with 2000 new lines? I believe code review itself needs to improve, and I’m experimenting with a little tool I created.

Anyway, how I review AI code:

Use my tool to go around the codebase and mark what I’ve read and what I think is important or not
Check red/green TDD tests
Ask to generate manual tests
Ask to generate test representations (e.g. ask to generate an md file showing a duration in a calendar based off of tests)

Some people started taking a day per week to manually write a feature, or replicate one implemented by AI. This also seems like a nice thing but I haven’t tried it yet.

The moats need to be redefined

By moats I mean the stuff that makes someone choose one app over another.

I believe I can now prototype something in a week or less that would have taken me many months. On top of that, anybody can make a professional-looking landing page / video / etc.

This means it’s easy to oversell prototypes, and at the same time people are losing (even more of) their trust in new software.

Open source projects in particular are in a gray area. If I use an open source license, how legal is it to ask an LLM to rework it in a different architecture/language and release it with another license? Some people started making their tests proprietary to reduce the ease of copying a project.

What looks like the new moats for apps:

Code familiarity. Something breaks? You can fix it. And changes are doable
Tests. Your app is stable
Infrastructure/Security. This is too cryptic for new “vibe” developers
UX. Although AI can create nice UX, people lack the experience and they just mix all the colors until it’s all brown
Easy validation and comparison. With AI integrated into apps, creating and editing might be replaced by AI. So the creation and editing process can’t be opaque, users need to be able to validate what changed.

So basically:

Initial software creation lost value
Stability / Quality are more important
Checking AI changes is more important

Management will be lost

It’s hard for a manager to understand the subtleties of AI. They prompt an AI, make an impressive prototype, and off they go firing all the junior devs. Fast forward 5 years and the supply of senior devs will dry up.

Go strong on checks

It’s extremely easy to become lazy, so the more checks in place the safer we are.

Tests: I love all types of tests! Unit, manual, e2e, fuzz, you name it! Code coverage, on the other hand, I believe might create bad incentives.

e.g. 2 files, one with 80 lines of code, the other with 20. The 80-line file is 100% covered, and the other 0%. So we have 80% code coverage. Now I reduce code duplication and the first file goes from 80 to 40 lines. I just reduced code coverage to about 67% (40 of 60 lines) with an improvement.
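Here is a quick sketch of that arithmetic, assuming coverage is simply covered lines divided by total lines across all files. The function name line_coverage is made up for illustration, and real coverage tools may count things differently.

```python
def line_coverage(files):
    """Return overall line coverage in percent.

    files is a list of (total_lines, covered_lines) pairs, one per file.
    """
    total = sum(total_lines for total_lines, _ in files)
    covered = sum(covered_lines for _, covered_lines in files)
    return 100 * covered / total


# Before the refactor: an 80-line file fully covered, a 20-line file untested.
before = line_coverage([(80, 80), (20, 0)])

# After removing duplication: the covered file shrinks to 40 lines.
after = line_coverage([(40, 40), (20, 0)])

print(f"before: {before:.1f}%")  # before: 80.0%
print(f"after: {after:.1f}%")    # after: 66.7%
```

The number of untested lines never changed (20 in both cases), yet the percentage got worse. That is the bad incentive: the metric punishes deleting well-tested code.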

Static typing: LLMs can easily check if the types work, it’s like having a million little tests for free.
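A tiny sketch of what I mean, in Python with type hints. Duration and total_minutes are hypothetical names, and I’m assuming a static checker such as mypy is run over the code; Python itself does not enforce the annotations at runtime.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Duration:
    minutes: int


def total_minutes(durations: List[Duration]) -> int:
    # Sum the minutes of every Duration in the list.
    return sum(d.minutes for d in durations)


# Correct usage: type checks and runs.
meeting_time = total_minutes([Duration(30), Duration(45)])
print(meeting_time)  # 75

# A static checker would flag the call below before anything runs,
# because a list of int is not a list of Duration:
# total_minutes([30, 45])
```

Every annotated signature works like a small test that each caller has to pass, which an LLM can re-run cheaply after each change.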

Static analysis: Stuff like sonarqube

And of course, BIG code reviews.

Some tech will win

Maybe this will change, but I have the feeling some technologies like React and Golang are going to have an advantage. Why?

Technology with more available code, easier for LLMs to find examples. Like TypeScript/React
If a language is less descriptive, or has fewer idioms, its examples are more likely to be consistent and easier to follow. Like Golang
The advantage of dynamic languages seems less important if the typed code is still readable

This is mega speculative though. And I know, I know, I used a lot of lists, and it annoys me that this now looks AI-generated… At least I never liked em dashes…

Code is still the spec

Some people reason “how can I transform 100 characters input into 1000 characters output? I need to define the code!”

And… sort of. A dictionary transforms a word like “castle” into a paragraph. And I think AI can reliably generate some stuff by approximation. But not everything, and for the rest you need to validate the code.

Examples:

UX approximations work quite well. But sometimes you need to intervene. And good luck trying to write a high level spec to fix it…
Custom algorithms break all over the place. Tests help but they are guardrails. And if you’ve ever played a racing game and taken the turns by riding the guardrails, you know how effective that is.

Conclusion

Hard to know what’s going to happen, and I missed the huge signs for several months. Writing lost a lot of value and people are just confused, excited, scared. I’m very excited, maybe too much. But it’s nice to still find this so interesting after starting to mess with computers in another century.