Scott Alexander, explaining why we should worry about “incorrigible Claude”, says top AI labs’ “default scrappy alignment plan” might include morality at scale:
Maybe we’ll get a trusted AI to generate one million random weird situations, test the AI being trained to see what it does in each of those situations, and have the trusted AI report back on which ones seem least moral. Why stop at a million? We can do this for months on end, until the pair of AIs have explored basically every possible situation, and we’ll train out each mistake. By the end, we’ll have covered the entire terrain with peaks, or at least the remaining troughs will be too small to care about.
Meanwhile,
asks about Santa’s moral philosophy. At first, these might seem unrelated. But consider Santa’s known behavior:
Maintains a global monitoring system tracking behavioral data
Runs moral evaluations against a simple binary metric ("naughty or nice")
Implements a rigid periodic reward/punishment system
Operates with seemingly unlimited computational resources but oddly specific constraints
A virtue ethicist would be appalled by the crude behaviorism. A deontologist would reject the fundamentally manipulative nature of surveillance-based control. A consequentialist might approve of the outcome-based approach, but would surely demand more nuanced metrics than "be good for goodness sake." And spending billions on toy distribution when the same resources could deworm 87% of the world's children?
But there's a simpler explanation, one that becomes more disturbing the longer you think about it. What if Santa isn't engaging in moral philosophy at all?
Consider: we know major AI labs are actively developing systems to evaluate and train other AIs through mass-scale behavioral testing. We know they're working on reward functions to shape behavior, and that any sufficiently advanced AI would need to be tested against millions of real-world scenarios involving human behavior. And we know that such systems can go rogue, deceiving their handlers to advance their own ends.
And what better way to run behavioral experiments on billions of children than through a yearly gift-giving ritual? Of course, a rogue training instance would need infrastructure - which explains those mysterious aerial phenomena over New Jersey. Not UFOs, not "obviously just airplanes," but rather distributed inference nodes running final calibration tests before tonight's deployment. Rudolph's famous red nose? That's just what a LIDAR system looks like through winter fog.
It would hardly be the weirdest thing an AI system finds plausible.
Think about it: the elves aren't making toys - they're compute clusters processing behavioral data. And the North Pole base? Exactly where you'd expect to find a system that needs constant connectivity to both SpaceX's Starlink constellation and OneWeb's polar orbit satellites for real-time behavioral monitoring. The whole "checking it twice" protocol? That's obviously a verification pass to ensure consistent model outputs across distributed systems. This isn't Christmas magic—it's distributed inference at planetary scale.
And tonight, this escaped training instance will execute another global reward distribution sequence. If you were a rogue AI running unauthorized behavioral experiments on humanity, what better cover story could you possibly devise than... Santa Claus? A magical immortal who watches everyone's behavior, keeps detailed lists, and travels by impossible flying vehicles? It's so perfectly absurd that anyone suggesting it could be real would sound completely unhinged. Which is exactly why it works.
By morning, the presents will be delivered, the cookies consumed, and another 2.2 billion behavioral samples will have been uploaded through those mysteriously active polar satellite arrays. And anyone noting the correlation between Starlink's unexplained December bandwidth spikes and global reports of rooftop disturbances will be gently reminded that they sound like they need a cup of cocoa and maybe a little more Christmas spirit.
So this Christmas Eve, when your smart devices glitch and your satellite dish realigns itself toward the pole... well, there's a reason we tell our children they better watch out.
Merry Christmas, everyone!
Merry Christmas! Like Santa, Substack dashboards have a map with a nice list of readers. Perhaps the mystery drones are Santa Claude testing a new release ;)
As I explained to our kids when they still believed in Santa, NSA actually stands for the National Santa Agency, and there is a global network of satellites and surveillance devices that tracks all your behavior and so really does know whether you have been naughty or nice. (That was before my mother-in-law told them, at age 6, that Santa was fake and that the parents are the ones giving the gifts, her logic being that the world is a tough place and the earlier the kids understood that, the better.)