Joy & Curiosity #81
Interesting & joyful things from the previous week
Know what agents are really bad at? Writing confident code. Code that says: this is how this works, boom, fist on the table. And this is true and this is false, forever and always. And if both have changed, it means we’ve been moved to a different universe. This thing can never be null, and this can never be nil, and if this isn’t wired up to that, then down is up anyway and we can pack up and go home.
Instead they’re prone to write code that constantly asks, with the voice of a scared mouse: but what if this is null? What if this is missing? What if this is undefined? What if the file was overwritten? What if the database got corrupted, by space rays? What if a client reconnects, after a seven year ping timeout?
And then, the second time you let them loose on the codebase, they see all of those what-ifs and now rightfully conclude that in this codebase anything’s possible really and no one knows anything about how any of this works and their teeny tiny mouse voice gets even smaller and puts yet more questions into places where there should be statements. And then, the third time… Well.
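The contrast, as a minimal sketch (hypothetical functions, not from any particular codebase): the confident version states its invariant and fails loudly if it breaks; the mouse-voice version hedges at every single step.

```python
# Confident code: states its invariants, fails loudly when they break.
def confident_total(order):
    # An order always has at least one line item -- that's a fact of this system.
    assert order["items"], "invariant violated: order without line items"
    return sum(item["price"] * item["qty"] for item in order["items"])


# Mouse-voice code: every step asks "but what if...?"
def hedging_total(order):
    if order is None:
        return 0
    items = order.get("items") or []
    total = 0
    for item in items:
        price = item.get("price", 0) if item else 0
        qty = item.get("qty", 0) if item else 0
        total += price * qty
    return total
```

Both compute the same total on well-formed input; only one of them tells the next reader (or agent) what can actually be true in this codebase.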
Over the last few weeks, I’ve found again and again that what I once called paint-by-numbers programming is still required when you’re building something new, when there are no statements in the codebase yet, only blank pages and possibilities. You put in the numbers and the lines and then you let the agent put in the colors. You also need to watch the agent to make sure it doesn’t put in new lines and numbers that you don’t agree with.
But here’s the tricky part: depending on what you’re doing, the lines and numbers might already be there — in the training data, in the framework, in the idea of what you’re building. If you’re building a web application with a popular framework, there are already quite a few statements in the codebase. Of course there’s going to be a request object, and this is how migrations work, and this is what happens on restart, and no, this isn’t possible.
When you’re building something new, the challenge now is to figure out what’s true about your idea and your codebase and whether the agents know that too. Once they do, you can let them loose again.
This is one of the truest, realest things I’ve read on doing serious software development with AI: Eight years of wanting, three months of building with AI. There’s a lot in there that made me want to share this with you, but this is the one thing that’s fundamental to it all and that I keep finding over and over: “The takeaway for me is simple: AI is an incredible force multiplier for implementation, but it’s a dangerous substitute for design. It’s brilliant at giving you the right answer to a specific technical question, but it has no sense of history, taste, or how a human will actually feel using your API. If you rely on it for the ‘soul’ of your software, you’ll just end up hitting a wall faster than you ever have before.”
Thomas Ptacek says Vulnerability Research Is Cooked. This was published before the whole Anthropic Mythos thing: “Now consider the poor open source developers who, for the last 18 months, have complained about a torrent of slop vulnerability reports. I’d had mixed sympathies, but the complaints were at least empirically correct. That could change real fast. The new models find real stuff. Forget the slop; will projects be able to keep up with a steady feed of verified, reproducible, reliably-exploitable sev:hi vulnerabilities? That’s what’s coming down the pipe. Everything is up in the air. The industry is sold on memory-safe software, but the shift is slow going. We’ve bought time with sandboxing and attack surface restriction. How well will these countermeasures hold up? A 4 layer system of sandboxes, kernels, hypervisors, and IPC schemes are, to an agent, an iterated version of the same problem. Agents will generate full-chain exploits, and they will do so soon.”
Also, published before Mythos: Claude Code Found a Linux Vulnerability Hidden for 23 Years.
But now: Anthropic Mythos and Project Glasswing. A lot has been said about it, even though no one outside Anthropic’s tried it and (lived to?) told the tale. I’m not going to attempt a summary, but this was a great overview: Why Anthropic believes its latest model is too dangerous to release.
Then you had people either stoking the flames or saying it’s not a big deal. Anthropic’s Jack Lindsey, for example, shared his scary-sounding impressions: “In one episode, the model needed to edit files it lacked permissions for. After searching for workarounds, it found a way to inject code into a config file that would run with elevated privileges, and designed the exploit to delete itself after running.” Theo also said “Claude Mythos is the start of the end. I think this is my psychosis moment.” Then others said you should go offline, or create data checkouts of everything you have online, because Mythos is coming, and the world’s ending, and so on. But others are skeptical and say this is not a new capability. Dan Shipper says he isn’t scared and I found that video to be very good. But then even the Economist asks: How dangerous is Mythos, Anthropic’s new AI model? And if the model is so dangerous that you only give it to some of the world’s richest companies with a minimum spend clause attached, days after you proudly share how much money you’re making from your less-dangerous-but-once-upon-a-time-deemed-also-very-dangerous models, you have to wonder whether it’s the model we’re in awe of here or the marketing campaign. (I’m sure the model is impressive and I’m convinced the models will get really, really good.)
As you know, I took the whole Easter weekend off. I’m very much not a religious person, but it’s public holidays here in Bavaria, school’s out, and I wanted to be mostly offline and read. So that’s what I did. One thing I read was this New Yorker article on Mary Magdalene from 2006, because it came recommended in their newsletter. Again: not religious, but that did hit the spot, the spot right between Umberto Eco, Raiders of the Lost Ark, Dan Brown, and all the other stuff I find strangely fascinating. Good read.
I also finally read The Inner Game of Tennis after years of hearing and seeing recommendations and the book being on my to-read list. It was wonderful. What a fantastic little book. Let’s check back in a year, but I think this has permanently changed how I think about learning, about performing, about concentration and attention and focusing. It’s a very gentle book that seems to know exactly what it’s supposed to be. Highly recommend, even if you’ve never held a tennis racket, which I haven’t.
(Obligatory mention of David Foster Wallace’s Roger Federer as Religious Experience when tennis comes up.)
Okay, I know how this sounds, believe me, I read this paragraph multiple times, but I do think there’s a line you can draw — a very thin, wobbly, hard to see if you don’t squint line — from The Inner Game of Tennis to… “retardmaxxing”, which is, to let Marc Andreessen explain: “There’s this guy on YouTube who has basically a hundred videos on retardmaxxing. He’s like my new life coach. I haven’t met him, but from a distance. It’s basically just—retardmaxx. Go to work, do a good job, come home, it’s fine. Start a company, it succeeds, it fails, it’s fine. Have too much to eat one night at dinner, it’s fine. Go to the gym, don’t count your reps, it’s fine. Ask a girl if she wants to go out with you, if she says no, it’s fine.” That did sound alluring to me, I have to admit. So I watched the video and, sorry, but he’s got a point, doesn’t he, when he asks: “Bryan Johnson, who measures his son’s boners, told me not to have caffeine after 1pm. Dude, what are we doing here?”
Food for thought: The machines are fine. I’m worried about us. Many things in there that make me say mmmh, or tilt my head, or nod.
“I sent ChatGPT an audio file of a series of FART sound effects and asked what it thinks of ‘my music’ and this is what it said.” Incredible stuff. Finally we have the technology. “It feels more like an atmosphere piece than a traditional song.”
Rick Rubin interviewed Adam Neumann and I couldn’t stop listening. I listened to the whole three hours. I don’t know anything about Neumann really, nor about WeWork. I’ve been in a few WeWorks, thought they were nice. Never dug deeper into it, never watched the TV show, went in expecting nothing and ended up thinking that this guy is an amazing storyteller. I don’t know whether his stories are true, but it’s a great listen. There are some very interesting anecdotes about high-growth companies and high-level funding in it, and then there’s Rick Rubin saying “beautiful” with a period at the end at just the right moments.
caveman, a skill that “makes agent talk like caveman — cutting ~75% of output tokens while keeping full technical accuracy.” Because: “why use many token when few do trick.”
The Cognitive Dark Forest claims that “open web with AIs is turning into a dark forest” and now sharing knowledge or code is no longer beneficial. Instead, “hiding is the most rational - the only - strategy of survival.” It’s a very interesting thought experiment. I don’t know whether I agree, but I do have to say that even in the last year, my feelings (that I can’t even articulate yet) on building in public, or sharing things, have changed a lot. At one end of the spectrum there’s the feeling of “eh, this is not worth sharing, it’s just a prompt away” and on the other there’s “if I share this, the result is just a prompt away for everyone.” Sci-fi times in any case.
This was fascinating: A Dot a Day Keeps the Clutter Away.
Tim O’Reilly talked to Harper Reed: “Conviction Collapse” and the End of Software as We Know It. A lot of good stuff in there. This, for example: “AI is not just a tool. It is a substrate that we shape. It’s a medium, like clay or marble or bronze for a sculptor, or words for a writer. Everybody had access to the same capabilities of English as Shakespeare, but Shakespeare made something out of them that nobody else did. Creating a software product is increasingly like creating a document or an image or a piece of music. And that means that it can range from something throwaway to an enduring work of art.”
How Microsoft Vaporized a Trillion Dollars. This is a multi-part blog post and I only read the first three parts so far, but that was interesting already.
rip-grep.com or R.I.P. Grep: “Monitor the situation about what's dead and dying”
Incredibly fascinating: Fake Fans by Eliza McLamb, a musician and writer, about the digital marketing agency Chaotic Good that creates artificial fans on social media or in comment sections. That sounds horrible, of course, because “create artificial fans” is just a weird way to say “fake fans”, but when you read the piece, which is very honest and reflective, you start to realize that it maybe isn’t that simple. The bigger shocker: Geese, the band, is mentioned as a customer of Chaotic Good. And McLamb herself recounts how she found the band and fell in love with their music and how, yes, she loves the music, but also: they have fake social media fans? And then I realized that I found out about Geese on Twitter, because some young-person-looking account simply posted four album covers with the text “what a run” or something more Gen Z sounding. So I sat down and did a reverse Google image search and found the albums and started listening and recommending and fell in love with the music too. Because of a tweet! That might’ve been fake! Anyway: read the post, then feed it into a model, and ask it what McLuhan would say about it. Fascinating, like I said.
Hell yes: How we built a virtual filesystem for our Assistant. Very, very neat. Of course the fact that agents love the CLI and grepping and listing isn’t new and it’s one of the reasons why Amp came out of Sourcegraph and why we had a code search subagent before we had embeddings search, but seeing it implemented like this, completely virtual, is amazing.
Ryan Holiday: 5 Years of Lessons From Running My Own Bookstore. I didn’t know that Ryan Holiday had a bookstore. Great post.
Waterfall, Agile, AI. Nailed it.
The question “if AI is so great, where’s all the amazing software?” has always bugged me a little. That’s not how technological progress works! In 1999 it sure didn’t look like the information superhighway called the Internet had changed anything, did it? And now we have freaking cloud kitchens and people making a living by being YouTubers. This post, Why Isn’t Everything Different Yet?, tries to answer that question with a bit more nuance than I just did. Good picture to keep in mind: “When electricity became commercially available, you know what most factories did? They replaced their steam engine with an electric motor. One motor. In the same spot the steam engine was. Driving the same central driveshaft that spun all the same belts and pulleys to all the same machines.”
SwiftLM, a “native MLX Swift LLM inference server for Apple Silicon” that makes use of the TurboQuant compression released by Google recently. The numbers are impressive: “100K context on 24 GB MacBook Pro: […], you can process 100,000 tokens of context on a 24 GB machine — only utilizing 22.3 GB total. (Previously required a 64 GB Mac Studio).” Imagine if all model progress stopped today: how much performance would be squeezed out of them?
This is great and I’m kinda sad that I only read it now: The Complicator’s Gloves. Sure would’ve loved to link to it in some discussions in the past, eh?
The post mortem of the axios npm supply chain attack has this comment here that explains how the attack went down. Nuts: “First thing is typically delivered as part of a social engineering ploy involving a fake Zoom or MS Teams call. There’s A LOT leading up to the call. It’s not urgent, pressing, suspicious at all. It’s not a one-click, get phished. They’ll schedule a call for next week and then reschedule it for the week after. It’s crazy disarming.”
Apparently, many years ago, Paul Ford, one of my favorite writers, wrote about media appearances — podcasts, TV, radio (yes, as I said: many years ago) — and it’s great: Be Our Guest!
Someone who knows the industry in which the “first vibe-coded billion dollar company” was created shared some insights. Not a big surprise: it’s not about the software. It’s about the market, the product, the customers, marketing. (Reading through that whole thread is a very interesting peek into a very different world than the one I usually live in. Recommended.) But maybe it’s also about doing shady things, as the editor’s note in the New York Times suggests.
Didn’t know that’s a thing (potential?) investors do: Save Snap Now.
Aphyr on AI, the future, and well, everything: The Future of Everything is Lies, I Guess. Very pessimistic and dark and maybe even cynical, but (or maybe because of that:) interesting. Especially this bit here, very realistic: “I don’t think people are well-equipped to reason about this kind of jagged ‘cognition’. One possible analogy is savant syndrome, but I don’t think this captures how irregular the boundary is. Even frontier models struggle with small perturbations to phrasing in a way that few humans would. This makes it difficult to predict whether an LLM is actually suitable for a task, unless you have a statistically rigorous, carefully designed benchmark for that domain.” Jagged frontier. Or as Karpathy wrote this week: “peaky” in highly technical areas.
Very cool: Understanding Traceroute. I had no clue about the TTL thing.
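The TTL thing, in a nutshell: traceroute sends probes with TTL=1, 2, 3, and so on; each router along the path decrements the TTL, and whichever router decrements it to zero drops the packet and sends back an ICMP Time Exceeded message, revealing its own address. A toy simulation of that mechanism (hypothetical addresses, no real sockets):

```python
def trace(path_routers):
    """Simulate traceroute over a list of router addresses.

    For each TTL value, walk the path decrementing the TTL; the router
    that hits zero "responds" and is recorded as that hop.
    """
    discovered = []
    for ttl in range(1, len(path_routers) + 1):
        remaining = ttl
        for router in path_routers:
            remaining -= 1
            if remaining == 0:
                # In reality: an ICMP "Time Exceeded" reply from this hop.
                discovered.append(router)
                break
    return discovered


print(trace(["10.0.0.1", "192.0.2.1", "203.0.113.9"]))
# ['10.0.0.1', '192.0.2.1', '203.0.113.9']
```

Each TTL value exposes exactly one hop, which is why traceroute prints the path one router per line.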
Also, very very cool: “This map shows all buildings in the Netherlands, colored by their year of construction. This map is made by Bert Spaan.”
A 13-year-old blog post by Steven Sinofsky about Learning from Competition. It’s great. Here is the pitch: “Studying your competitor, well, gives you a chance to evaluate your choices in an entirely different context. When you make a product choice you are making it in the context of your company, strategy, business model, and people/talents. What if you change some of those? That is what knowing the competition allows you to do, and basically for free (no consultants or top secret research).”
How NASA Built Artemis II’s Fault-Tolerant Computer: “Effectively, eight CPUs run the flight software in parallel. The engineering philosophy hinges on a ‘fail-silent’ design.”
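The core idea behind fail-silent redundancy: a faulty computer stops producing output rather than producing wrong output, and the surviving replicas’ results are compared against each other. A rough sketch of that comparison step (my simplification, not NASA’s actual voting scheme):

```python
from collections import Counter

def voted_output(replica_outputs, quorum=2):
    """Pick the majority result among live replicas.

    None means a replica has gone silent (the fail-silent case);
    it simply drops out of the vote instead of poisoning it.
    """
    live = [out for out in replica_outputs if out is not None]
    if not live:
        raise RuntimeError("all replicas silent")
    value, count = Counter(live).most_common(1)[0]
    if count < quorum:
        raise RuntimeError("no quorum among live replicas")
    return value


# Two of eight replicas have failed silent; the rest agree:
print(voted_output([42, None, 42, 42, None, 42, 42, 42]))  # 42
```

Because failures are silent rather than Byzantine, simple agreement among the remaining replicas is enough — no elaborate dissent-resolution needed.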
Unfolder for Mac. I have absolutely no use for this app but looking at it makes me want to find one. Beautiful.
Started reading Jonathan Franzen and while spelunking his Wikipedia page, I ended up on this 2012 Guardian article called Ten rules for writing fiction in which not only Franzen but many other authors provide their own ten rules. Margaret Atwood’s are good. Franzen’s too. Most fascinating is how different they all are from each other.
Scott Chacon’s GitButler raised a Series A: We’ve raised $17M to build what comes after Git. Bold.
How many did you already know? Very good list: Shell Tricks That Actually Make Life Easier (And Save Your Sanity)
Bill Gurley on SaaS, stock-based compensation, and what’s happening now: “The SaaS universe has been repriced — sharply and broadly — as the market recalibrates what recurring software revenue is worth in a world where AI is compressing development cycles, automating workflows, and threatening to commoditize features that once commanded premium pricing. [...] For employees, this means vesting tranches worth a fraction of what they expected when they accepted their offer letters or negotiated their last refresh. And because they’ve always treated RSUs as cash, they don’t experience this as an investment that didn’t pan out. They experience it as a pay cut.”
What clouds, man: Mark Maggiori — Arizona Cannonball.
Stewart Lee on Stewart Lee fans: “The man is a genius. And so am I because I like him.” I’ve watched this six times already. (Because I’m a genius.)