For two years I fought with AI. For two years I thought of it as something that, whether I liked it or not, I would have to use if I wanted to keep up. I thought of it as this thing that was all the way over there and everything I love about programming was over here. For two years AI didn’t make me smile, didn’t keep my thoughts running while I lay in bed, didn’t make me want to share stories over beers.
Of course, like everyone else present at the Big Bang, I clapped and was excited and tried everything I could think of — from translating phrases to generating poems, to generating code, to asking these LLMs things I would never ask a living being.
The world is changing, this is big, I told myself, keep up. I watched the Karpathy videos, typed myself through Python notebooks, attempted to read a few papers, downloaded resources that promised to teach me linear algebra, watched 3blue1brown videos at the gym.
But it felt like doing homework on a sunny Friday afternoon.
I couldn’t find a sense of fascination to pull me along. These LLMs are slot machines, I thought. You write your prompt, hit submit, and sometimes you win; sometimes you lose. I tried to get better at it, but turned my nose up at the term prompt engineering. Why worry about how you pull the lever of the slot machine? Why put effort in if it comes down to luck anyway?
Programming has always been a source of inspiring stories and mysteries to me. Full of heroes and black boxes, riddles and puzzles, and unbelievable feats of ingenuity that made me get up early, eager to learn more, because I knew the mysteries were all solvable, that the magic isn’t magic but only magical, and that the stories can be replicated if you just put in the effort.
LLMs, on the other hand, had none of that. They’re unknowable oracles and they for sure don’t listen to me.
Then, over the last six months, I changed my view of LLMs completely.
The first thing that happened: I tried Cursor Tab. We were working on Edit Predictions at Zed and I wanted to figure out what the current state of the art was. Until then, I had only tried Copilot-style completions. But Cursor’s Tab offered something different. Not only did it suggest how to complete the text next to your cursor, but it also suggested where your cursor could go next and what you could do there.
I would delete a field in a struct definition and it would suggest “hey, delete it down here too, in the constructor?” and I’d hit tab and it would go “now delete this setter down here too”, tab, “… and this getter”, tab, “… and it’s also mentioned here in this formatting function”, tab. Tab, tab, tab.
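To make that tab, tab, tab concrete, here’s a made-up example (Python here, purely for illustration) where one deletion implies four more edits the model can predict:

```python
# Hypothetical class, just to make the sequence concrete: delete the
# `email` field and four other places have to change with it.
class User:
    def __init__(self, name, email):   # tab: delete the parameter
        self.name = name
        self.email = email             # tab: delete the assignment

    def set_email(self, email):        # tab: delete this setter too
        self.email = email

    def get_email(self):               # tab: ... and this getter
        return self.email

    def describe(self):                # tab: and the mention down here
        return f"{self.name} <{self.email}>"
```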
Now don’t think of me as smug, I’m only trying to give you a frame of reference here, but: I’m pretty good at Vim. I’ve been using it seriously for 15 years and can type 130 words per minute even on a bad day. I’ve pulled off some impressive stunts with Vim macros. But here I sat, watching an LLM predict where my cursor should go and what I should do there next, and couldn’t help but admit to myself that this is faster than I could ever be.
When I delete a debug statement, the model suggests deleting the other five debug statements, one by one, faster than I can spot them. When I add a log statement to one branch of a switch statement, it suggests log statements for the other branches, faster than I can come up with the keystrokes for a repeatable macro. When I add a suffix to one of many strings, it suggests adding it to all of them — no Vim backflip with `vG:norm $F"i<suffix>` necessary.
This, it struck me, this is not… playing the slot machine, is it? This is not me asking the oracle about what code I should write, it’s not me kneeling in front of the AI, hands folded, waiting for it to suggest the implementation of a function and getting frustrated when it gets it wrong again. No, this is using an LLM to take away the toil and chores of programming for me. This is more in the realm of mechanical helpers than oracles. And it’s fast and, actually, surprisingly precise and, hey, even if it sometimes doesn’t get it right? So what, I didn’t even wait for it.
Then we fine-tuned an LLM at Zed.
I had heard of fine-tuning, of course, but my assumption had been that fine-tuning was reserved for people who have ML in their job title, people who know linear algebra, people who don’t type 17*2 into a calculator.
Turns out that’s not necessarily true. You don’t need someone else to call you a scientist to fine-tune an LLM. If you’re able to create, say, a hundred text files, read four to five blog posts, and click through a Python notebook, you can already get some results.
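If you’re curious what “a hundred text files and a Python notebook” roughly looks like, here is a minimal sketch. Everything in it is a placeholder for illustration, not what we did at Zed: the base model, the file layout, the hyperparameters. It assumes the Hugging Face transformers, peft, and datasets packages:

```python
# A rough sketch of a small LoRA fine-tune. Base model, paths, and
# hyperparameters are all placeholders, chosen only to show the shape.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "Qwen/Qwen2.5-Coder-1.5B"  # any small open-weights causal LM works
tokenizer = AutoTokenizer.from_pretrained(base)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA: train a few million adapter weights instead of the whole model.
model = get_peft_model(model, LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"))

# "A hundred text files": one training example per file in ./examples/
dataset = load_dataset("text", data_files={"train": "examples/*.txt"},
                       sample_by="document")["train"]
dataset = dataset.map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
    remove_columns=["text"],
)

Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=3,
                           per_device_train_batch_size=2, learning_rate=2e-4),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

The LoRA part is what makes the small budget realistic: you train a thin adapter on top of the frozen base model instead of touching its billions of weights.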
But the bigger and more impactful surprise was how much fine-tuning changed my perception of LLMs. More than the previous two years of repeated attempts to better understand them did.
When you fine-tune a model and build up the data collection to tune it with, you start to get a real feeling for how the data can nudge a model in different directions. You start to say things like “if we remove these fifty examples here, and instead add more of these other ones, then it will probably more often …”
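In practice that nudging is unglamorous: mostly it means reshaping a list of files before the next training run. A sketch with invented criteria, continuing the setup above:

```python
import glob

# Invented criteria, only to show the shape of the nudge: drop one kind
# of example, oversample another, retrain, and compare what changed.
examples = [open(path).read() for path in glob.glob("examples/*.txt")]
examples = [ex for ex in examples if "TODO" not in ex]        # "remove these fifty"
examples += [ex for ex in examples if "fn main(" in ex] * 2   # "add more of those"
```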
The slot machine becomes translucent. You begin to see its wiring and, hey, actually, this isn’t a slot machine at all. It’s more like an elaborate marble run into which you throw a marble — your prompt, the thing you want it to complete — and sometimes it goes over here and picks up stuff on the way and sometimes it goes over there. You develop an intuition for why sometimes it does this and sometimes it does that, because you see the data that goes into making the marble run and how the data moves its tracks, and turns, and drops. With your mouth open in fascination you begin to sense how much data is required to build such a model from scratch, and you start to theorize about how the data must have differed to make this model do this and that model do that.
To switch and mix metaphors again: you start to get a feel for the grain of the model, for what it “wants” to do and what it really doesn’t want to do, no matter how much fine-tuning you attempt. You begin to see where fine-tuning ends and coarse-tuning would need to start.
Fine-tuning showed me that effort, and care, and patience are rewarded. It’s not a slot machine and it’s not a coin toss when you ask it to complete something.
With a big feeling of Huh and scratching my head, I went back and started to go through it all again: Karpathy, 3blue1brown, everything I could find on the transformer architecture, about how attention works, how these big labs train their models. This time around I had concrete questions I wanted answers to. I could connect what I learned to practice. Learning felt useful, giving me things I could try and see, and it stopped feeling like studying for a test I wouldn’t pass anyway if I didn’t know the secret answers.
At the same time, intrigued by these models now, I started to look around for more people to follow, people who were interested in AI but didn’t have a dream to sell. I started following near, who talked about Claude like a life companion. near used Claude in every possible situation: to research, to program, to weigh life options, to crack jokes. I found it all very confusing at first, but fascinating a few weeks later. How could someone so smart see so much I hadn’t been able to see? What was I missing?
I also followed Pliny the Liberator, and while looking at the strange ASCII art and spell-like phrases in Pliny’s jailbreaks, I couldn’t help but think of hacking and phreaking in the 80s and 90s — something I was too young to participate in.
In Munich I spoke at a meetup that was held in the rooms of the university’s AI group. While talking to some of the young programmers there I came to realize: they couldn’t give less of a shit about the things I had been concerned about. Was this code written with Pure Vim, was it written with Pure Emacs, does it not contain Artificial Intelligence Sweetener? They don’t care. They’ve grown up as programmers with AI already available to them. Of course they use it, why wouldn’t they? Next question. Concerns about “is this still the same programming that I fell in love with?” seemed so silly that I didn’t even dare to say them out loud.
Now, when I sit down to program with the help of AI, I no longer think of slot machines.
I no longer think of myself as placing an offering in front of the oracle’s shrine, hoping that this time it will give me the answers that I seek.
Instead, I smile.
I smile because now, in my toolbox, next to my editor and my terminal and the Internet, I have these models, these giant orbs made out of oceans of human knowledge, laboriously distilled into unimaginably large rows and columns of numbers.
When I write a prompt, it’s no longer a coin toss I have in mind. Instead, the words I wrote are being tossed into a giant orb, and in there they will get turned into numbers, just like everything else in the orb, and my words-turned-numbers will get multiplied, and multiplied, and multiplied yet again. Words I chose, I typed out, multiplied with all that knowledge, all that meaning.
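You can watch the very first of those multiplications happen yourself. A toy sketch, with GPT-2 standing in for the orb only because its weights are small and public; this is not how any production model is actually served:

```python
# Toy sketch: a prompt becomes numbers, and the numbers get multiplied.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

ids = tok("I cast a line into latent space", return_tensors="pt").input_ids
vectors = model.get_input_embeddings()(ids)
print(vectors.shape)  # (1, number_of_tokens, 768): each word is now a vector

# One attention-shaped multiplication: every word weighed against every
# other word, then blended accordingly. Real models do this with learned
# weights, many times per layer, across dozens of layers.
scores = vectors @ vectors.transpose(-1, -2) / vectors.shape[-1] ** 0.5
mixed = torch.softmax(scores, dim=-1) @ vectors
```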
Each prompt I write is a line I cast into a model’s latent space. By changing this word here and this phrase there, I see myself as changing the line’s trajectory and its place amidst the numbers. Words need to be chosen with care, since they all have a specific meaning and end up in a specific place in latent space once they’ve been turned into numbers and multiplied with each other, and what I want, what I aim for when I cast, is for the line to end up in just the right spot, so that when I pull on it out of the model comes text that helps me program machines.
Add this phrase here and that file’s content and I know that the line I cast will end up somewhere else in latent space than if I had mentioned the names of these five programmers and pulled in that other file. Throw in examples of what I want to see and they too will pull on the line.
Tell the orb to think and it spits out thoughts that are themselves lines in latent space and they too — the “thoughts” — are multiplied with what I wrote and with each other and with what everyone else once wrote on the Internet and by spitting out its thoughts, the orb itself moves the line.
Turn the orb into an agent and have it explore my computer on its own and every command it runs, every file it opens, every error it runs into — it will all be turned into numbers too and those numbers will tug and drag and yank the line through latent space.
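Underneath the metaphor, that loop is small enough to sketch. Here `call_model` and `run_tool` are hypothetical placeholders standing in for a real LLM API and real tools:

```python
# A sketch of the agent loop: everything the model does comes back into
# the context as more tokens, tugging the line on the next turn.
# `call_model` and `run_tool` are hypothetical placeholders.
def agent_loop(task):
    context = [{"role": "user", "content": task}]
    while True:
        reply = call_model(context)           # the whole line so far goes in
        context.append({"role": "assistant", "content": reply["text"]})
        if not reply.get("tool_calls"):
            return reply["text"]              # nothing left to run: done
        for call in reply["tool_calls"]:      # a shell command, a file to open
            output = run_tool(call)           # stdout, file contents, errors
            # tool output becomes tokens too, and moves the next completion
            context.append({"role": "tool", "content": output})
```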
And, well, now I can’t help but say it: it’s beautiful, isn’t it?