The Balto/Togo theory of scientific development

Tragically I gave up on the Plate Tectonics study before answering my most important question: “Is Alfred Wegener the Balto of plate tectonics?”

Let me back up.

Balto

Balto is a famous sled dog. He got a statue in NYC for leading a team of dogs through a blizzard to deliver antibody serum to Nome, Alaska in 1925, ending a diphtheria outbreak. Later Disney made a movie about how great he was.

Except that run was a relay, and Balto only got famous because he ran the last leg, which had the most press coverage but was also the easiest. The real hero was Togo, the dog who led his team through the hardest terrain and covered by far the most miles. Disney later made a movie about him that makes no mention of Balto for the first 90%, and then goes out of its way to talk about what a shit dog he was: that’s why he wasn’t included in any of the important teams, but Togo had had to do so many hard things that they needed a backup team for the trivial last leg, so Balto would have to do.

Togo’s owner died mad about the US mainland believing Balto was a hero. But since all the breeders knew who did the hard part, Togo enjoyed a post-Nome level of reproductive success that Genghis Khan could only dream about, so I feel like he was happy with his choices.

plus he did eventually get some statues

But it’s not like Togo did this alone either. He led one team in a relay, and there were 20 humans and 150 dogs that contributed to the overall run. Plus someone had to invent the serum, manufacture it, and get it to the start of the dog relay at Nenana, Alaska. So exactly how much credit should Togo get here?

The part with Wegener

I was pretty sure Alfred Wegener, popularly credited as the discoverer/inventor of continental drift and mentioned more prominently than any other scientist in discussions of plate tectonics, is a Balto.

First of all, continental drift is not plate tectonics. Continental drift is an idea that maybe some stuff happened one time. Plate tectonics is a paradigm with a mechanism that makes predictions and explains a lot of data no one knew was related until that moment.

Second, Wegener didn’t discover any of the evidence he cited, he wasn’t the first to have the idea, and it’s not even clear he did much of the synthesis of the evidence. His original paper says: “Concerning South America and Africa, biologists and geologists are in close agreement that a Brazilian–African continent existed in the Mesozoic.”

So he didn’t invent the idea, gather the data, or even really synthesize the evidence. His guess at the mechanism was wrong. But despite spending hours digging into the specific discoverers and synthesizers that contributed to plate tectonics, the only name I remember is Wegener’s. Classic Balto.

On the other hand, some of the people who gathered the data used to discover plate tectonics were motivated by the concept of continental drift, and by Wegener specifically. That seems like it should count for something. My collaborator Jasen Murray thinks it counts for a lot.

Jasen would go so far as to argue that shining a beacon in unknown territory, one that inspires explorers to look for treasure in the right place, makes you the Togo, racing through social ridicule and self-doubt (rather than fractured ice rapids) to do the real work of getting an idea considered at all. Showing up at the finish line to formalize a theory after there’s enough work to know it’s true is Balto work to him. This makes me profoundly uncomfortable because strongly advocating for something unproven terrifies me, but as counterarguments go that’s pretty weak.

One difficulty is it’s hard to distinguish “ahead of their time beacon shining” from “lucky idiot”, and even Jasen admits he doesn’t know enough to claim Wegener in particular is a Togo. But doing work that is harder to credit because it’s less legible is also very Togo-like behavior, so this proves nothing about the category. 

So I guess one of my new research questions is “how important are popularizers?” and I hate it.

Bazant: An alternate covid calculator

Most of what I see people use Microcovid.org for now is estimating risk for large gatherings, which it was not designed for and thus doesn’t handle very well. I spent a few hours going through every covid calculator I could find, and this calculator from the Bazant lab at MIT, while less user-friendly than Microcovid and having some flaws of its own, is tailor-made for calculating risks for groups indoors. I think it is worth a shot.

[Note: I’ll be discussing the advanced version of the calculator here; I found the basic version too limited]

The Bazant calculator comes out of a physics lab with a very detailed model of how covid particles hang and decay in the air, and how this is affected by ventilation and filtration. I haven’t checked their model, but I never checked Microcovid’s model either. The Bazant calculator lets you very finely adjust the parameters of a room: dimensions, mechanical ventilation, air filtration, etc. It combines those with more familiar parameters like vaccination and mask usage and feeds them into the model in this paper to produce an estimate of how long N people can be in a room before they accumulate a per-person level of risk between 0 and 1 (1 = person is definitely getting covid = 1,000,000 microcovids per person; .1 = 10% chance someone gets sick = 100,000 microcovids per person). It also produces an estimate of how much CO2 should accumulate over that time, letting you use a CO2 monitor to check its work and notice if risk is accumulating more rapidly than expected.
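The unit conversion in that parenthetical is simple enough to sketch in code. This is only an illustration of the arithmetic (a risk of 1.0 equals 1,000,000 microcovids); the function names are my own and are not part of either calculator.

```python
# One microcovid is a one-in-a-million chance of getting covid,
# so a per-person risk of 1.0 (certain infection) is 1,000,000 microcovids.
MICROCOVIDS_PER_UNIT_RISK = 1_000_000


def risk_to_microcovids(risk: float) -> float:
    """Convert a per-person risk between 0 and 1 into microcovids."""
    if not 0.0 <= risk <= 1.0:
        raise ValueError("risk must be between 0 and 1")
    return risk * MICROCOVIDS_PER_UNIT_RISK


def microcovids_to_risk(microcovids: float) -> float:
    """Convert microcovids back into a per-person risk between 0 and 1."""
    if not 0.0 <= microcovids <= MICROCOVIDS_PER_UNIT_RISK:
        raise ValueError("microcovids must be between 0 and 1,000,000")
    return microcovids / MICROCOVIDS_PER_UNIT_RISK


if __name__ == "__main__":
    # Print the two reference points from the text plus a lower tolerance.
    for risk in (1.0, 0.1, 0.01):
        print(f"risk {risk} is about {risk_to_microcovids(risk):,.0f} microcovids")
```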

Reasons/scenarios to use the Bazant calculator over Microcovid:

  • You have a large group and want to set % immunized or effective mask usage for the group as a whole, instead of configuring everyone’s vaccinations and masks individually.
  • You want to incorporate the mechanics of the room and ventilation in really excruciating detail. 
  • You want to set your own estimate for prevalence based on beliefs about your subpopulation.
  • You want a live check on your work, in the form of the CO2 estimates.

Reasons to use Microcovid instead:

  • Your scenario is outside – Bazant calculator doesn’t handle this at all.
  • You don’t want to have an opinion on infection prevalence, immunization, or mask usage.
  • Your masks are better than surgical masks (Bazant doesn’t handle N95 or similar. Also, it rates surgical masks as 90% effective, which seems very high to me).
  • Your per-person risk tolerance is < 10,000 microcovids (Bazant calculator can’t be set at a lower risk tolerance, although you can do math on their results to approximate this).
  • You’re still using a bubble model, or tracking accumulated risk rather than planning for an event.
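One possible version of the "do math on their results" workaround for lower risk tolerances is sketched below. It rests on an assumption of mine, not something taken from the Bazant calculator: that at low risk levels, risk accumulates roughly in proportion to time in the room, so the allowed time shrinks in proportion to the tolerance. It ignores anything nonlinear in their model, and the function name and example numbers are hypothetical.

```python
def scale_time_to_tolerance(
    calculator_hours: float,
    calculator_tolerance: float,
    desired_tolerance: float,
) -> float:
    """Rescale the calculator's allowed time to a stricter risk tolerance.

    ASSUMPTION (mine, not the calculator's): risk grows linearly with time,
    so allowed time is proportional to the tolerated per-person risk.
    """
    if calculator_hours < 0:
        raise ValueError("calculator_hours must be non-negative")
    if not 0 < calculator_tolerance <= 1:
        raise ValueError("calculator_tolerance must be in (0, 1]")
    if not 0 < desired_tolerance <= calculator_tolerance:
        raise ValueError("desired_tolerance must be in (0, calculator_tolerance]")
    return calculator_hours * desired_tolerance / calculator_tolerance


if __name__ == "__main__":
    # Hypothetical example: the calculator allows 8 hours at a tolerance of
    # 0.01 (10,000 microcovids); we want 0.001 (1,000 microcovids) instead.
    hours = scale_time_to_tolerance(8.0, 0.01, 0.001)
    print(f"approximate allowed time: {hours:.2f} hours")
```

Treat the result as a rough upper bound to sanity-check, not as something the calculator itself endorses.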

Scenarios neither handle well

  • Correlated risk. You might be fine with 10% of your attendees getting sick, but not a 10% chance of all of the attendees getting sick at once.
  • Differences in risk from low-dose vs. high-dose exposures.
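To make the correlated-risk point concrete, here is a small sketch comparing two extreme cases with the same per-person risk: fully independent infections versus an all-or-nothing event. Both extremes are simplifications I chose for illustration, and the group size is hypothetical; real events sit somewhere in between.

```python
from math import comb


def prob_at_least(n: int, p: float, k: int) -> float:
    """P(at least k of n attendees get sick) if infections are independent."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def independent_case(n: int, p: float) -> tuple[float, float]:
    """Return (expected number sick, P(everyone sick)) for independent risk."""
    return n * p, p**n


def all_or_nothing_case(n: int, p: float) -> tuple[float, float]:
    """Return (expected number sick, P(everyone sick)) when either
    everyone gets sick (probability p) or no one does."""
    return n * p, p


if __name__ == "__main__":
    attendees, risk = 20, 0.1  # hypothetical event
    print("independent:   ", independent_case(attendees, risk))
    print("all-or-nothing:", all_or_nothing_case(attendees, risk))
    print("P(at least half sick, independent):",
          prob_at_least(attendees, risk, attendees // 2))
```

The expected number of sick attendees is identical in both cases, but the chance of the whole group going down at once differs by many orders of magnitude, and a single per-person risk number hides that difference.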

I’m not currently planning any big events, but if someone else is, please give this a try and let us know if it is useful. 

My Tiny Study Finds No Effect For Ketone Esters as a Nootropic

Several months ago I announced an RCT  I was running for a client to test ketone esters as a nootropic. After many more months than planned (pro-tip: do not create a study that depends on people with ADHD following a finicky multi-step process, and if you must, make your ops perfect) I failed to find an effect of ketone esters on productivity. It was a small study and a lot went wrong so this failure isn’t conclusive, but since I expected to find an enormous effect size, I’m still pretty disappointed. 

I originally intended to write up detailed results and a post-mortem on experimental operations. I’m unfortunately pressed for time right now, and can’t work up the motivation to detail exactly how little effect ketone esters had. If you’re considering running your own experiment and want the details, please reach out here or over email; I’m happy to share if it will be useful, but I didn’t want to delay publishing over details that probably don’t matter. However, you can see the experimental protocol in detail here. If you check, the hash of that file matches the hash in the recruitment doc.
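For anyone who wants to check a preregistered file against a published hash, a minimal sketch follows. The hash algorithm used for the protocol isn't stated here, so SHA-256 is an assumption, and the file name and hash in the example are placeholders.

```python
import hashlib


def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hash: str) -> bool:
    """Compare a file's digest to a published one, ignoring case and spaces."""
    return file_sha256(path) == published_hash.strip().lower()


if __name__ == "__main__":
    import sys

    # Usage (placeholder names): python check_hash.py protocol.pdf <hash>
    if len(sys.argv) == 3:
        ok = matches_published_hash(sys.argv[1], sys.argv[2])
        print("match" if ok else "MISMATCH")
```

If the published hash was produced with a different algorithm, swap `hashlib.sha256` for the matching constructor.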

The overall outcome: if you compare the average productivity for control and treatment days per person, and then average the per-person differences together, that average is -0.2 (scale of 1-10), meaning treatment made people slightly worse on average. If you include only people who completed more than 10 of the planned 14 observations (4 out of 7 participants), the average improvement is 0.3 (same 1-10 scale). The largest improvement for any one person was 1.1. I didn’t bother to check for statistical significance because even if it eked out a p-value, the real-world significance is too small.
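The averaging procedure described above can be sketched as follows. The participant labels and scores are made up purely to show the calculation; they are not the study data, and the function names are my own.

```python
from statistics import mean

# Hypothetical daily productivity scores (1-10 scale), NOT the study data.
EXAMPLE_SCORES = {
    "p1": {"control": [5, 6, 5, 6, 5, 6], "treatment": [6, 6, 7, 6, 6, 5]},
    "p2": {"control": [7, 7, 6], "treatment": [6, 6]},
}


def per_person_differences(scores: dict, must_exceed: int = 0) -> dict:
    """Mean treatment score minus mean control score, for each person.

    People are skipped if they lack data in either arm, or if their total
    number of observations is not greater than `must_exceed`.
    """
    differences = {}
    for person, days in scores.items():
        control, treatment = days["control"], days["treatment"]
        if not control or not treatment:
            continue
        if len(control) + len(treatment) <= must_exceed:
            continue
        differences[person] = mean(treatment) - mean(control)
    return differences


def average_effect(scores: dict, must_exceed: int = 0) -> float:
    """Average of the per-person differences (each person weighted equally)."""
    return mean(list(per_person_differences(scores, must_exceed).values()))


if __name__ == "__main__":
    print("all participants:", average_effect(EXAMPLE_SCORES))
    print("more than 10 observations:", average_effect(EXAMPLE_SCORES, must_exceed=10))
```

Averaging within each person first means someone who logged many days does not outweigh someone who logged few, which is one reason the result can shift when low-completion participants are filtered out.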

This result is confusing, because ketone esters obviously work really well for me, and a few other people I know. That could be a placebo, but the effects are so strong and consistent that that seems unlikely. It’s possible there was an experimental error – our operations were really not what they needed to be for a study of anyone, much less people specifically selected for having difficulties with organization – but even if I squint at the data I can’t see the kinds of patterns that might hint at that. Or the design could have been flawed; I was trading off ideal circumstances and ease of use for volume of participants. 

Given these results, I intend to keep taking ketone esters when they feel helpful to me, but can’t recommend them as in any way vindicated for mass use as a nootropic. 

Vavilov Day Starts Tomorrow

Content note: discussion of fasting.

Three weeks ago, I announced a plan to fast from the 25th to the 27th, in honor of Nikolai Vavilov and the staff of his botany institute, several of whom starved to death in the service of ending famine (and were partially successful, although far from the sole contributors). The goal was to test/improve my own ability to do hard things in the service of worthy projects. 

I had wanted to put much more research in the original post than I did, but decided it was more important to get the announcement out quickly and I should save something for the day-of post anyway. Since then, a lot has happened. Over three weeks I had 3 or 4 urgent demands around the size of “my furnace is maybe poison and my landlord is being difficult about it”. Everything is fine now, but it was a lot of effort to get it that way. I also had some emergency work drop in my lap for an extremely worthy project. I’m glad I got the opportunity to contribute and I’d make the same decision again but it ate up all of the slack I had left. And then my cell phone broke.

The immediate impact of this is that I’m not writing the highly researched post on Vavilov I wanted to. The internet is full of articles of the quality I could produce in the time I have available, so there’s no reason to add to them.

But the more important impact is that I said I wanted to test my ability to do hard things, and then I did that, before the fast even started. My capacity was not as high as I wanted but more than I feared, and my capacity to respond to my limits gracefully instead of failing explosively exceeded my hopes.    

So in a lot of ways the purpose of the fast has already been served. I thought about letting myself out of it, but there are a few dimensions this month hasn’t tested and I still want to play with those. However in light of the fact that I am starting from a place of much lower slack and much higher time value than anticipated, I will be removing some of the rules, such as “I have to work a normal workday” and “I have to do at least one physical activity”. Those rules were for someone who didn’t expend all her reserves doing intense cognitive work on no notice while angry people made horrible noises banging on her furnace for three days straight. As of writing this (Monday night) I haven’t made up my mind on relaxing the calorie restriction to allow for ketone esters, which for me are a small source of calories that greatly reduce the cognitive and emotional costs of fasting. 

Tomorrow (the 26th) is the 79th anniversary of Nikolai Vavilov’s death. The day after is the 78th anniversary of the end of the siege of Leningrad, which meant the institute staff no longer needed to starve themselves to protect their seed bank. I will be fasting from 10PM tonight (the 25th) to 10AM on the 27th, but no promises on doing more than that. And if that high-value project needs more no-notice immediate-turnaround work from me and the ketone esters aren’t enough, I don’t even promise to keep fasting. Because this was never about pain for pain’s sake, it was about testing and increasing my ability to follow through on my own principles, and one of those principles is “don’t pointlessly incapacitate yourself when high impact time-sensitive work is waiting”.

“…it was hard to wake up, it was hard to get on your feet and put on your clothes in the morning, but no, it was not hard to protect the seeds once you had your wits about you. Saving those seeds for future generations and helping the world recover after war was more important than a single person’s comfort.”

unknown Vavilov Institute scientist

“Eating Dirt Benefits Kids” is Basically Made Up

Sometimes people imply that epistemic spot checks are a waste of time, that it’s too easy to create false beliefs with statements that are literally true but fundamentally misleading. And sometimes they’re right.

On the other hand, sometimes you spend 4 hours and discover a tenet of modern parenting is based on absolutely nothing.

[EDIT: this definitely was a tenet among my friends, but apparently is less widespread than I thought.]

Sorry, did I say 4 hours? It was more like 90 minutes, but I spent another 2.5 hours checking my work just in case. It was unnecessary.

Intro

You are probably familiar with the notion that eating dirt is good for children’s immune systems, and you probably call that the Hygiene Hypothesis, although that’s technically incorrect.

Hygiene Hypothesis can refer to a few different things:

  1. A very specific hypothesis about the balance between specific kinds of immune cells.
  2. A broader hypothesis that exposure to nominally harmful germs provides the immune system training and challenge that ultimately reduces allergies.
    1. One particular form of this involves exposure to macroparasites, but that seems to have fallen out of favor.
  3. The hypothesis that exposure to things usually considered dirty helps populate a helpful microbiome (most often gut, but plausibly also skin, and occasionally eyeball), and that reduces allergies. This is more properly known as the Old Friends hypothesis, but everyone I know combines them.
  4. Pushback on the idea that everything children touch should be super sanitized.
  5. The idea that eating dirt in particular is beneficial for children for vague allergy-related reasons.

I went into this research project very sold on the Hygiene Hypothesis (broad sense), and figured this would be a quick due diligence to demonstrate it and get some numbers. And it’s true, the backing for Hygiene and Old Friends Hypothesis seems reasonably good, although I didn’t dig into it because even if they’re true, the whole eating dirt thing doesn’t follow automatically. When I dug into that, what I found was spurious at best, and what gains there were had better explanations than dirt consumption.

This post is not exhaustive. Proving a negative is very tiring, and I felt like I did my due diligence checking the major books and articles making the claim, none of which had a leg to stand on. Counterevidence is welcome. 

Evidence

Being born via c-section instead of vaginally impoverishes a newborn’s microbiome, and applying vaginal fluid post-birth mitigates that

This has reasonable pilot studies supporting it, to the point I mentioned it to a pregnant friend.

There are reports that a mother’s previous c-sections lower a newborn’s risks even further, but I suspect that’s caused by the fact below.

Having older siblings reduces allergies

Study. The explanation given is a more germ-rich environment, although that’s not proven.

Daycare reduces later allergies, with a stronger effect the earlier you enter, unless you have older siblings in which case it doesn’t matter

Study. Again, there are other explanations, but contagious diseases sure look promising.

Living with animals when very young reduces allergies

This one is a little more contentious and I didn’t focus on it.  When the animal appears seems to matter a lot.

One very popular study used to bolster Dirt Eating is a comparison of Amish and Hutterite children. Amish children get ~⅙ of the allergies Hutterite children do, which pop articles are quick to attribute to dirt “because Amish children work on farms and Hutterite children don’t.” But there are a lot of differences between the populations: dust in Amish homes has 6x the bacterial toxins of dust in Hutterite homes, and Amish children have much more exposure to animals and drink unpasteurized milk.

Limitations of Farm Studies

Even if Amish children did eat more dirt and that was why they were healthier, there’s no transfer from that to urban parks treated with pesticides and highway exhaust. They might be net positive, the contaminants might not matter that much, your park in particular might be fine, no one has proven this dirt is harmful, etc. But you should not rest your decision on the belief that that dirt has been proven beneficial, because no one has looked.

Mouse Studies

There are several very small mouse studies showing mice had fewer allergies when exposed to Amish dirt, but:

  1. They are very small.
  2. They are in mice.
  3. The studies I found never involve feeding the mice dirt. Instead, they place it in bedding, or directly in their nasal passages, or gently waft it into the cage with a fan.

So eating dirt is bad then?

I don’t know! It could easily be fine or even beneficial, depending on the dirt (but I suspect the source of dirt matters a lot). It could be good on the margin for some children and bad for others. Also, avoiding a constant battle to keep your toddler from doing something they extraordinarily want to do is its own reward. What I am asserting is merely that anyone who confidently tells you eating arbitrary dirt is definitely good is wrong, because we haven’t done the experiments to check.

I think any of [communicable diseases, animals, unpasteurized milk] have more support as anti-allergy interventions than dirt, but I hesitate to recommend them given that a high childhood disease load is already known to have significant downsides and the other two are not without risks either.

Epilogue

The frightening thing about this for me is how this became common knowledge even, perhaps especially, among my highly intelligent, relatively authority-skeptical friends, despite falling apart the moment anyone applied any scrutiny. I already thought the state of medical knowledge and the popular translation of that knowledge was poor, but somehow it still found a way to disappoint me.

My full notes are available in Roam.

This post was commissioned by Sid Sijbrandij. It was preregistered on Twitter. I am releasing it under the Creative Commons Attribution 4.0 license. Our initial agreement was that I would be paid before starting work to avoid the appearance of influence; in practice I had the time free and the paperwork was taking forever so I did the research right away and sat on the results for a week.

Thanks to Miranda Dixon-Luinenburg for copyediting.