Joy & Curiosity #78
Interesting & joyful things from the previous week
Imagine working in the oil industry and someone figures out how to turn rainwater into oil. Some in the industry aren’t impressed: “More oil. Pah. That won’t change much, actually. It’s just more oil. We’ve been dealing with oil for decades. Sure, there’s more, but hey: more work for us. The rest is the same old, same old.”
They’d be right to some extent. It is more oil and some things would not change. Oil would still be a physical business. You would still need customers and contracts and sales channels and salespeople. You would still need refineries and storage and transport and distribution. You would still need safety and regulation and all of that.
But, also: everything else would change. Because the oil industry isn’t built around oil. It’s built around hard-to-find, only-in-some-places, hard-to-extract oil.
The price of crude oil would collapse. Reserves would lose their value. Finding oil fields and drilling for oil would not be a thing anymore. Location wouldn’t matter anymore, since it rains nearly everywhere.
And then come the second-order effects: on energy policy and geopolitics, on plastics and chemicals and fertilizers, on the parts of the industry that only refine and move and sell oil. Oil wouldn’t stop being oil, but the bottleneck would move through the industry and bump into and kick over many things along the way.
You know me. I’m not here to provide indirect political commentary on rising petrol prices. No, I’m talking about software, of course, and I want you to again consider: we now have buttons that we can smash and out come hundreds and thousands of lines of working code, in seconds.
Those buttons are not just another type of developer tool and “we’ve had code generators for decades” is not a valid reply.
Code is no longer hard-to-find, only-in-some-places, hard-to-extract. And yes, I am preaching to the choir here, but it’s Sunday and this is my newsletter and, damn it, I have to say this again, because I keep bumping into engineers who still don’t seem to understand what follows from that.
They’ll say something like: yes, someone should rebuild GitHub, because GitHub is dead. And I agree, yes, I’ve been saying that. But what they actually mean is: someone should rebuild GitHub as-is, with the same fundamental assumptions, with the same shape of open source as we know it, and built on the idea that code is scarce.
And I want to shake them and go: man, don’t you see? All of it was built on the assumption that code is expensive! And most of it doesn’t make sense anymore when code is cheap. Yes, some things won’t change. The need to do proper engineering won’t go away. But many, many, many things will, because a single constant in a very fundamental equation has been changed.
Craig Mod built “the accounting software I’ve always craved” (called TaxBot2000) and is now software bonkers: “It’s strange times. Anyway, I’m mad for software right now. Bonkers. I can’t stop thinking about things to make, things to make better. And then I go and make them. There’s an energy around all this that is — truly — epochal. If you’re not playing with models like Claude, you should probably take a peek. It’s the time of building.”
Great page: background-agents.com. There’s obviously (no: it’s very obvious) a bias towards the creators of the page there, but leaving that aside: this is where it’s going.
This tweet by Mitchell might have saved me this week. I read it and while I’m not like the guy in the video, I immediately felt guilty for getting distracted so often. Apparently, I have built up muscle memory to cmd-tab to a different window as soon as I submit a prompt. So, after reading that tweet, I closed the browser window with my private profile, put my phone away, and swore to myself that I’ll now either try to figure out the same thing the agent is trying to figure out or do something else on my own while it’s running. That led to two incredibly productive days that made me feel great.
Karpathy released autoresearch, which is a repository, a tiny bit of code, and a Markdown file to instruct a coding agent to act like an LLM researcher: “The idea: give an AI agent a small but real LLM training setup and let it experiment autonomously overnight. It modifies the code, trains for 5 minutes, checks if the result improved, keeps or discards, and repeats. You wake up in the morning to a log of experiments and (hopefully) a better model.” The idea of running an agent in a loop isn’t new, but what I find fascinating: how small this repo is, how small the codebase is, how direct and clear the instructions and the workflow are, and the meta thing of this being exactly what the non-nano researchers at the big labs are doing, at least kind of. Tobi Lütke then used the same loop, through the pi-autoresearch plugin, but instead of training a model the agent optimized his templating language. Now the question is: what problems are as verifiable as a training run result or performance? Also, if you read this whole paragraph without thinking of the word “Ralph” that means we live in different bubbles.
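To make the shape of that loop concrete, here is a minimal sketch of the keep-or-discard cycle described in the quote. It is not code from the autoresearch repo: the function names (`keep_or_discard_loop`, `toy_propose`, `toy_loss`) are made up for illustration, and a toy one-dimensional loss stands in for "train for 5 minutes and check the result". In the real thing, the proposal step is a coding agent editing the training code, which this sketch does not attempt to model.

```python
import random


def keep_or_discard_loop(start, propose, evaluate, rounds):
    """Propose a change, score it, keep it only if the score improved.

    Lower scores are better here (think: validation loss).
    Returns the best candidate, its score, and a log of every experiment.
    """
    best = start
    best_score = evaluate(best)
    log = []
    for i in range(rounds):
        candidate = propose(best)        # stand-in for "the agent modifies the code"
        score = evaluate(candidate)      # stand-in for "train briefly and measure"
        kept = score < best_score
        if kept:
            best, best_score = candidate, score
        log.append((i, score, kept))     # what you would read in the morning
    return best, best_score, log


# Toy stand-ins (assumptions for illustration only).
rng = random.Random(0)


def toy_loss(x):
    return (x - 3.0) ** 2


def toy_propose(x):
    return x + rng.uniform(-1.0, 1.0)


best, best_loss, log = keep_or_discard_loop(0.0, toy_propose, toy_loss, rounds=200)
print(f"best={best:.3f} loss={best_loss:.5f} kept={sum(k for _, _, k in log)}")
```

The loop only works because `evaluate` returns a number you can trust, which is exactly the question above: which other problems come with a check that cheap and that clear?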
Six Selfish Reasons to Have Kids, by Kevin Kelly.
Florian Brand on LLM benchmarks: “It is hard to see real-world utility being measured here. […] The other issue is the harness: It includes a set of tools to look at the files, revert to a previous step and edit code, but the model has to return a block of reasoning, followed by the tool call in triple-backtick delimited markdown. This is not how models work these days! […] So, what happens when you fix those mistakes?” I guess we all know by now that the benchmarks that are shared on the day of a model release are just pointers in a general direction, but this was still very, very interesting to read.
Why ATMs didn’t kill bank teller jobs, but the iPhone did: “The history of technology, even exceptionally powerful general-purpose technology, tells us that as long as you are trying to fit capital into labor-shaped holes you will find yourself confronted by endless frictions: just as with electricity, the productivity inherent in any technology is unleashed only when you figure out how to organize work around it, rather than slotting it into what already exists.” Good piece. The framing of “automating a job is much harder than making it irrelevant” makes a lot of sense to me and seems like a useful lens.
Amazing: howisFelix.today? Lots of nice little insights. Don’t miss the conclusion at the end.
“What’s your favourite disassembler? Mine’s a font.” Yes, that’s one hard line, and yes, you read it right: “This font converts sequences of hexadecimal lowercase characters into disassembled Z80 instructions, by making extensive use of OpenType’s Glyph Substitution Table (GSUB) and Glyph Positioning Table (GPOS).” Watch the video.
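For a sense of what the font has to encode, here is a rough sketch of the same idea as a plain lookup table: lowercase hex in, Z80 mnemonics out. This is not how the font is implemented (it does the work with GSUB and GPOS rules, not code), and the table covers only a handful of opcodes I picked for illustration. The `disassemble` function and the `db` fallback for bytes not in the table are my own assumptions.

```python
# A tiny slice of the Z80 opcode map: opcode -> (template, operand byte count).
OPCODES = {
    0x00: ("nop", 0),
    0x3C: ("inc a", 0),
    0x3E: ("ld a, ${:02x}", 1),   # 8-bit immediate
    0x76: ("halt", 0),
    0xAF: ("xor a", 0),
    0xC3: ("jp ${:04x}", 2),      # 16-bit address, stored little-endian
    0xC9: ("ret", 0),
}


def disassemble(hex_text):
    """Turn a string of hex digits into a list of mnemonics.

    Bytes missing from the small table above, or opcodes whose operands
    are cut off, are emitted as raw data bytes ("db $xx").
    """
    data = bytes.fromhex(hex_text)
    out = []
    i = 0
    while i < len(data):
        entry = OPCODES.get(data[i])
        operands = data[i + 1 : i + 1 + (entry[1] if entry else 0)]
        if entry is None or len(operands) < entry[1]:
            out.append(f"db ${data[i]:02x}")
            i += 1
            continue
        template, size = entry
        out.append(template.format(int.from_bytes(operands, "little")))
        i += 1 + size
    return out


print(disassemble("3e01c33412c9"))
```

Each table entry here corresponds, loosely, to a substitution the font has to express as glyph rules, which is what makes the font version so impressive.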
Gruber’s review of the MacBook Neo: “The Neo crystallizes the post-Jony Ive Apple. The MacBook “One” was a design statement, and a much-beloved semi-premium product for a relatively small audience. The Neo is a mass-market device that was conceived of, designed, and engineered to expand the Mac user base to a larger audience. It’s a design statement too, but of a different sort — emphasizing practicality above all else. It’s just a goddamn lovely tool, and fun too. I’ll just say it: I think I’m done with iPads. Why bother when Apple is now making a crackerjack Mac laptop that starts at just $600? May the MacBook Neo live so long that its name becomes inapt.” And that first line is the most Gruber line he’s ever published.
But this review of the MacBook Neo I really loved. Not only because of this paragraph: “Downloaded Xcode and dragged buttons and controls around in Interface Builder with no understanding of what I was looking at. I edited SystemVersion.plist to make the ‘About this Mac’ window say it was running Mac OS 69, which is the s*x number, which is very funny. I faked being sick to watch WWDC 2011 — Steve Jobs’ last keynote — and clapped alone in my room when the audience clapped, and rebuilt his slides in Keynote afterward because I wanted to understand how he’d made them feel that way.” But also because of this one: “That is not a bug in how he’s using the computer. That is the entire mechanism by which a kid becomes a developer. Or a designer. Or a filmmaker. Or whatever it is that comes after spending thousands of hours alone in a room with a machine that was never quite right for what you were asking of it.”
Apple Does Fusion: “This is why I think Fusion Architecture is the real story.
Not because of what M5 Pro and M5 Max can do today. Because of what it opens up. Once you’ve proven you can split the chip and keep unified memory working across the pieces, the question changes. It is no longer ‘how big can we make this chip?’ It is ‘how many pieces can we connect, and in how many dimensions?’”
Some Words on WigglyPaint. In the Joy column: this looks so lovely! I want to play with WigglyPaint! In the Curiosity column, the ending: “The most wildly successful project I’ve ever released is no longer mine. In all my years of building things and sharing them online, I have never felt so violated.”
Drew Breunig is asking why is Claude an Electron app. His hypothesis: “For one thing, coding agents are really good at the first 90% of dev. But that last bit – nailing down all the edge cases and continuing support once it meets the real world – remains hard, tedious, and requires plenty of agent hand-holding.” After having worked on Zed and contributed a few things to Ghostty (the first and only two truly native macOS apps I’ve worked on): I think most engineers underestimate how hard it is to build a truly great native application. And the question is: will your users notice, or care? If you’re building the application for a business, will going native make the business more successful? On top of that: once you’ve worked on a native application you realize what an amazing platform the web is and how much developer tooling has been built in the last twenty, thirty years around it.
And here’s Nikita Prokopov’s answer to Drew’s question: Claude is an Electron App because we’ve lost native.
Helen Min: Software isn’t dying, but it is becoming more honest. Fascinating stuff. This line here, for example: “I often hear founders and other hyper-rational types ask why we haven’t always billed for outcomes. The answer usually boils down to technical limitations and risk.” That made me wonder: because now you can kiiinda say that tokens are a substitute for outcomes? If you spend millions of tokens on something, won’t you get outcomes? It might not be dying, but software is changing, man. And the old software we knew: that’s dead, I’m pretty sure. Dead in the sense that rock & roll is dead.
I also found this podcast with Bret Taylor to have some interesting thoughts on outcome-based billing.
Yes: “Willingness to look stupid is a genuine moat in creative work”
The 8 Levels of Agentic Engineering. Interesting, but at this point I’m convinced that in a year that ladder will look very funny and outdated. The models will wash away a lot.
Talking about models washing away stuff, here’s Simon Willison: “Drop a coding agent into any existing codebase that uses libraries and tools that are too private or too new to feature in the training data and my experience is that it works just fine—the agent will consult enough of the existing examples to understand patterns, then iterate and test its own output to fill in the gaps.” Many, many things I believed over the last year have been washed away by these models. If you still think Opus 4.6 is the peak, try deep mode in Amp, which uses GPT-5.3-Codex right now. Stare into its eyes.
Not a short form video guy, but I am a this-is-funny guy and this is funny: Taking my mate ChatGPT to lunch. (But, seriously, will AI cliche phrases disappear in the future or always be a thing?)
Or I guess I should’ve said “trope” instead of “cliche”, because I’m going to ask a model to create a really, really dense version of this and then I’ll put it in my ChatGPT system prompt: tropes.md.
Temporal: The 9-Year Journey to Fix Time in JavaScript. Years ago, back when we had such things, I was in a quarterly planning meeting. I ran the meeting, in fact. I was the manager, and I asked an engineer on my team to give a rough estimate of how long something would take. “Whew, really hard to say,” he said. “Come on,” I pushed. “We need something here, so—gun to your head—how long?” “Gun to my head?” he said. “I’d take the bullet.” So, anyway, that’s what I think of every time date and time libraries come up. Fix Time in JavaScript? I’d take the bullet.
I love Google Maps but I don’t really enjoy using it to find places to eat in a city I don’t know. And “don’t really enjoy using” it is putting it mildly. Now Google Maps is getting Gemini and that seems like one of the most interesting “we put an LLM in it” product changes in a while.
Paula Muldoon is saying staff engineers need to get hands-on again: “This definition of staff engineering, particularly the organisational impact, made a lot of sense before 2025. Staff engineers need to stop being hands-on with the code as the majority of their work and spend time teaching others, making strategy etc. […] AI software tools have changed that.” Yes. And now let’s all consider what other roles and processes in the Big Tech Org Chart 2010-2025 don’t make a lot of sense anymore. This isn’t 2018 anymore.
Boredom Is the Price We Pay for Meaning: “If you try to distract yourself from boredom, if you run from it, all will be lost. Brodsky quoted an imperishable line from Robert Frost: ‘The best way out is always through.’ A note written by the novelist David Foster Wallace makes a similar point: ‘Bliss—a second-by-second joy and gratitude at the gift of being alive, conscious—lies on the other side of crushing, crushing boredom.’”


