Review of Examine.com’s vitamin write-ups

There are a lot of vitamins and other supplements in the world, way more than I have time to investigate. Examine.com has a pretty good reputation for its reports on vitamins and supplements. It would be extremely convenient for me if this reputation was merited. So I asked Martin Bernstoff to spot check some of their reports. 

We originally wanted a fairly thorough review of multiple Examine write-ups. Alas, Martin felt the press of grad school after two shallow reviews and had to step back. This is still enough to be useful so we wanted to share, but please keep in mind its limitations. And if you feel motivated to contribute checks of more articles, please reach out to me (elizabeth@acesounderglass.com).

My (Elizabeth’s) tentative conclusion is that it would take tens of hours to beat an Examine general write-up, but they are complete in neither their list of topics nor their investigation into individual topics. If a particular effect is important to you, you will still need to do your own research.

Photo credit DALL-E

Write-Ups

Vitamin B12

Claim: “The actual rate of deficiency [of B12] is quite variable and it isn’t fully known what it is, but elderly persons (above 65), vegetarians, or those with digestion or intestinal complications are almost always at a higher risk than otherwise healthy and omnivorous youth”

Verdict: True but not well cited. Their citation merely asserts that these groups have shortages rather than providing measurements, but Martin found a meta-analysis making the same claim for vegetarians (the only group he looked for).

Toxicology

Verdict: Very brief. Couldn’t find much on my own. Seems reasonable.

Claim: “Vitamin B12 can be measured in the blood by serum B12 concentrations, which is reproducible and reliable but may not accurately reflect bodily vitamin B12 stores (as low B12 concentrations in plasma or vitamin B12 deficiencies do not always coexist in a reliable manner[19][26][27]) with a predictive value being reported to be as low as 22%”

Verdict: True. The positive predictive value was 22%, with a negative predictive value of 100% at the chosen threshold. But those are only the numbers at one threshold. To know whether this is good or bad, we’d have to get numbers at different thresholds (or, preferably, a ROC-AUC).
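
To make the threshold point concrete, here is a minimal sketch in Python. The serum values and deficiency labels are made up for illustration and are not from the study Examine cites; the function names are also mine. It shows how PPV and NPV move as the cutoff moves, and how a ROC-AUC summarizes the test across all cutoffs at once.

```python
def predictive_values(scores, deficient, threshold):
    """Flag a person as deficient when their serum value is below `threshold`.

    Returns (PPV, NPV): the share of flagged people who are truly deficient,
    and the share of unflagged people who are truly not deficient.
    """
    tp = fp = tn = fn = 0
    for score, is_deficient in zip(scores, deficient):
        flagged = score < threshold
        if flagged and is_deficient:
            tp += 1
        elif flagged and not is_deficient:
            fp += 1
        elif not flagged and is_deficient:
            fn += 1
        else:
            tn += 1
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv


def roc_auc(scores, deficient):
    """Chance that a random deficient person scores lower than a random
    non-deficient person (ties count as half). Threshold-free."""
    pos = [s for s, d in zip(scores, deficient) if d]
    neg = [s for s, d in zip(scores, deficient) if not d]
    wins = sum((p < n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical serum B12 values and true deficiency status (illustration only).
scores = [120, 150, 180, 200, 220, 250, 300, 350, 400, 500]
deficient = [True, True, False, False, True, False, False, False, False, False]

for cutoff in (160, 230):
    ppv, npv = predictive_values(scores, deficient, cutoff)
    print(f"cutoff {cutoff}: PPV={ppv:.2f}, NPV={npv:.2f}")
print(f"ROC-AUC={roc_auc(scores, deficient):.2f}")
```

In this toy data the stricter cutoff gives a perfect PPV but misses one deficient person, while the looser one catches everyone at the cost of false alarms. A single (PPV, NPV) pair can't tell you which regime a real test is in.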

Claim: B12 supplements can improve depression

Examine reviews a handful of observational studies showing a correlation, but includes no RCTs. This is in spite of there actually being RCTs like Koning et al. 2016 and a full meta-analysis, neither of which finds an effect.

The lack of effect in RCTs is less damning than it sounds. I (Elizabeth) haven’t checked all of the studies, but the Koning study didn’t confine itself to subjects with low B12 and only tested serum B12 at baseline, not after treatment. So they have ruled out neither “low B12 can cause depression, but so do a lot of other things” nor “B12 can work but they used the wrong form”.

I still find it concerning that Examine didn’t even mention the RCTs, and I don’t have any reason to believe their correlational studies are any better. 

Interactions with pregnancy

Only one study on acute lymphoblastic leukemia. Seems a weird choice. Large meta-analyses exist for pre-term birth and low birth weight, likely much more important. Rogne et al. 2016.

Overall

They don’t seem to be saying much wrong but the write-up is not nearly as comprehensive as we had hoped. To give Examine its best shot, we decided the next vitamin should be on their best write-up. We tried asking Examine which article they are especially confident in. Unfortunately, whoever handles their public email address didn’t get the point after 3 emails, so Martin made his best guess. 

Vitamin D

Upper respiratory tract infections.

They summarize several studies but miss a very large RCT published in JAMA, the VIDARIS trial. All studies (including the VIDARIS trial) show no effect, so they might’ve considered the matter settled and stopped looking for more trials, which seems reasonable.

Claim: Vitamin D helps premenstrual syndrome

“Most studies have found a decrease in general symptoms when given to women with vitamin D deficiency, some finding notable reductions and some finding small reductions. It’s currently not known why studies differ, and more research is needed.”

This summary seemed optimistic after Martin looked into the studies:

  • Abdollahi 2019:
    • No statistically significant differences between groups.
    • The authors highlight statistically significant decreases for a handful of symptoms in the Vitamin D group, but the decrease is similar in magnitude to placebo. Vitamin D and placebo both have 5 outcomes which were statistically significant.
  • Dadkhah 2016:
    • No statistically significant differences between treatment groups
  • Bahrami 2018:
    • No control group
  • Heidari 2019:
    • Marked differences between groups, but absolutely terrible reporting by the authors – they don’t even mention this difference in the abstract. This makes me (Martin) somewhat worried about the results – if they knew what they were doing, they’d focus the abstract on the difference in differences.
  • Tartagni 2015:
    • Appears to show notable differences between groups, but terrible reporting. Tests change relative to baseline (?!), rather than differences in trends or differences in differences.

In conclusion, only the poorest research finds effects – not a great indicator of a promising intervention. But Examine didn’t miss any obvious studies.
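
The complaint about testing change relative to baseline is easier to see with numbers. Below is a minimal sketch with made-up symptom scores (lower is better); none of these values come from the studies above, and the function names are mine. Both arms improve a lot from baseline, which is what a within-group test picks up, but the difference in differences, which is what tells you whether vitamin D beat placebo, is small.

```python
def mean(values):
    return sum(values) / len(values)


def change_from_baseline(before, after):
    """Within-group change: mean after treatment minus mean before."""
    return mean(after) - mean(before)


def difference_in_differences(treat_before, treat_after, ctrl_before, ctrl_after):
    """Change in the treatment arm minus change in the control arm."""
    return (change_from_baseline(treat_before, treat_after)
            - change_from_baseline(ctrl_before, ctrl_after))


# Hypothetical PMS symptom scores (illustration only, lower is better).
vitd_before = [20, 22, 18, 24]
vitd_after = [14, 16, 12, 18]
placebo_before = [21, 23, 19, 21]
placebo_after = [16, 18, 14, 16]

print("vitamin D change:", change_from_baseline(vitd_before, vitd_after))
print("placebo change:  ", change_from_baseline(placebo_before, placebo_after))
print("diff in diffs:   ", difference_in_differences(
    vitd_before, vitd_after, placebo_before, placebo_after))
```

Here the vitamin D arm drops 6 points and the placebo arm drops 5, so a paper could honestly report a large improvement from baseline in the vitamin D group while the treatment itself accounts for only 1 point.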

Claim: “There is some evidence that vitamin D may improve inflammation and clinical symptoms in COVID-19 patients, but this may not hold true with all dosing regimens. So far, a few studies have shown that high dosages for 8–14 days may work, but a single high dose isn’t likely to have the same benefit.”

The evidence Martin found seems to support their conclusions. They’re missing one relatively large, recent study (De Niet 2022). More importantly, all included studies are about hospital patients given vitamin D after admission, which are useless for determining if Vitamin D is a good preventative, especially because some forms of vitamin D take days to be turned into a useful form in the body. 

  • Murai 2021:
    • The regimen was a single, high dose at admission.
    • No statistically significant differences between groups, all the effect sizes are tiny or non-existent.
  • Sabico 2021:
    • Compares Vitamin D 5000 IU/daily to 1000 IU/daily in hospitalized patients.
    • In the Vitamin D group, they show faster
      • Time to recovery (6.2 ± 0.8 versus 9.1 ± 0.8; p = 0.039)
      • Time to restoration of taste (11.4 ± 1.0 versus 16.9 ± 1.7; p = 0.035)
        • The Kaplan-Meier Plot looks weird here, though. What happens on day 14?!
    • All symptom durations, except sore throat, were lower in the 5000 IU group.

All analyses were adjusted for age, BMI and type of D vitamin – which is a good thing, because it appears the 5000 IU group was healthier at baseline.

  • Castillo 2020:
    • Huge effect – half of the control group had to go to the ICU, whereas only one person in the intervention group did so (OR 0.02).
    • Nothing apparently wrong, but I’m still highly suspicious of the study:
      • An apparently well-done randomized pilot trial, early on, published in “The Journal of Steroid Biochemistry and Molecular Biology”. Very worrying that it isn’t published somewhere more prestigious.
      • They gave hydroxychloroquine as the “best available treatment”, even though there was no evidence of effect at the time of the study.
      • They call the study “double masked” – I hope this means double-blinded, because otherwise the study is close to worthless since their primary outcomes are based on doctors’ behavior.
      • The follow-up study is still recruiting.
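
For a sense of how extreme an odds ratio of 0.02 is, here is a minimal sketch of the arithmetic. The group sizes (50 treated, 26 control) are assumptions chosen to be consistent with the summary above ("half of the control group", "only one person in the intervention group"); check the paper before reusing them.

```python
def odds_ratio(events_treated, n_treated, events_control, n_control):
    """Odds of the event in the treated arm divided by odds in the control arm."""
    odds_treated = events_treated / (n_treated - events_treated)
    odds_control = events_control / (n_control - events_control)
    return odds_treated / odds_control


# Assumed counts (illustration): 1 of 50 treated vs 13 of 26 controls admitted to ICU.
ratio = odds_ratio(events_treated=1, n_treated=50, events_control=13, n_control=26)
print(f"OR = {ratio:.2f}")
```

With only one event in the treated arm, the estimate hinges on a single patient. Two ICU admissions instead of one would roughly double the odds ratio, which is one more reason to treat a pilot trial like this cautiously.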

Conclusion

I don’t know of a better comprehensive resource than Examine.com. It is alas still not comprehensive enough for important use cases, but still a useful shortcut for smaller problems.

Thanks to the FTX Regrant program for funding this post, and Martin for doing most of the work.

“Eating Dirt Benefits Kids” is Basically Made Up

Sometimes people imply that epistemic spot checks are a waste of time, that it’s too easy to create false beliefs with statements that are literally true but fundamentally misleading. And sometimes they’re right.

On the other hand, sometimes you spend 4 hours and discover a tenet of modern parenting is based on absolutely nothing.

[EDIT: this definitely was a tenet among my friends, but apparently is less widespread than I thought.]

Sorry, did I say 4 hours? It was more like 90 minutes, but I spent another 2.5 hours checking my work just in case. It was unnecessary.

Intro

You are probably familiar with the notion that eating dirt is good for children’s immune systems, and you probably call that the Hygiene Hypothesis, although that’s technically incorrect.

Hygiene Hypothesis can refer to a few different things:

  1. A very specific hypothesis about the balance between specific kinds of immune cells.
  2. A broader hypothesis that exposure to nominally harmful germs provides the immune system training and challenge that ultimately reduces allergies.
    1. One particular form of this involves exposure to macroparasites, but that seems to have fallen out of favor.
  3. The hypothesis that exposure to things usually considered dirty helps populate a helpful microbiome (most often gut, but plausibly also skin, and occasionally eyeball), and that reduces allergies. This is more properly known as the Old Friends hypothesis, but everyone I know combines them.
  4. Pushback on the idea that everything children touch should be super sanitized
  5. The idea that eating dirt in particular is beneficial for children for vague allergy-related reasons.

I went into this research project very sold on the Hygiene Hypothesis (broad sense), and figured this would be a quick due diligence to demonstrate it and get some numbers. And it’s true, the backing for Hygiene and Old Friends Hypothesis seems reasonably good, although I didn’t dig into it because even if they’re true, the whole eating dirt thing doesn’t follow automatically. When I dug into that, what I found was spurious at best, and what gains there were had better explanations than dirt consumption.

This post is not exhaustive. Proving a negative is very tiring, and I felt like I did my due diligence checking the major books and articles making the claim, none of which had a leg to stand on. Counterevidence is welcome. 

Evidence

Being born via c-section instead of vaginally impoverishes a newborn’s microbiome, and applying vaginal fluid post-birth mitigates that

This has reasonable pilot studies supporting it, to the point I mentioned it to a pregnant friend.

There are reports that a mother’s previous c-sections lower a newborn’s risks even further, but I suspect that’s caused by the fact below

Having older siblings reduces allergies

Study. The explanation given is a more germ-rich environment, although that’s not proven.

Daycare reduces later allergies, with a stronger effect the earlier you enter, unless you have older siblings in which case it doesn’t matter

Study. Again, there are other explanations, but contagious diseases sure look promising.

Living with animals when very young reduces allergies

This one is a little more contentious and I didn’t focus on it.  When the animal appears seems to matter a lot.

One very popular study used to bolster Dirt Eating is a comparison of Amish and Hutterite children. Amish children get ~⅙ of the allergies Hutterite children do, which pop articles are quick to attribute to dirt “because Amish children work on farms and Hutterite children don’t.” But there are a lot of differences between the populations: dust in Amish homes has 6x the bacterial toxins of dust in Hutterite homes, and the children have much more exposure to animals and drink unpasteurized milk.

Limitations of Farm Studies

Even if Amish children did eat more dirt and that was why they were healthier, there’s no transfer from that to urban parks treated with pesticides and highway exhaust. They might be net positive, the contaminants might not matter that much, your park in particular might be fine, no one has proven this dirt is harmful, etc. But you should not rest your decision on the belief that that dirt has been proven beneficial, because no one has looked.

Mouse Studies

There are several very small mouse studies showing mice had fewer allergies when exposed to Amish dirt, but:

  1. They are very small.
  2. They are in mice.
  3. The studies I found never involve feeding the mice dirt. Instead, they place it in bedding, or directly in their nasal passages, or gently waft it into the cage with a fan.

So eating dirt is bad then?

I don’t know! It could easily be fine or even beneficial, depending on the dirt (but I suspect the source of dirt matters a lot). It could be good on the margin for some children and bad for others. Also, avoiding a constant battle to keep your toddler from doing something they extraordinarily want to do is its own reward. What I am asserting is merely that anyone who confidently tells you eating arbitrary dirt is definitely good is wrong, because we haven’t done the experiments to check.

I think any of [communicable diseases, animals, unpasteurized milk] have more support as anti-allergy interventions than dirt, but I hesitate to recommend them given that a high childhood disease load is already known to have significant downsides and the other two are not without risks either.

Epilogue

The frightening thing about this for me is how this became common knowledge even, perhaps especially, among my highly intelligent, relatively authority-skeptical friends, despite falling apart the moment anyone applied any scrutiny. I already thought the state of medical knowledge and the popular translation of that knowledge was poor, but somehow it still found a way to disappoint me.

My full notes are available in Roam.

This post was commissioned by Sid Sijbrandij. It was preregistered on Twitter. I am releasing it under the Creative Commons Attribution 4.0 license. Our initial agreement was that I would be paid before starting work to avoid the appearance of influence; in practice I had the time free and the paperwork was taking forever so I did the research right away and sat on the results for a week.

Thanks to Miranda Dixon-Luinenburg for copyediting.

Review: Martyr Made Podcast

Update (2022-11-01): I stand by what I said about Darryl Cooper’s long-form history podcasts but his stuff on current events has gotten increasingly deranged, well beyond what even Twitter can justify.

Introduction

Sometimes I consume media that makes factual claims. Sometimes I look up some of these claims to see how much trust I should place in said media, in a series I call epistemic spot checks. Over the years, I’ve gone back and forth on how useful this is. Focusing on evaluating particular works instead of developing a holistic opinion on an entire subject does feel perverse to me. OTOH, sometimes non-fiction is recreational, and I don’t think having some of my attention directed by people I find insightful and trustworthy is a bad thing, as long as I don’t swallow their views unquestioningly. Additionally, there’s a pleasant orderliness to doing ESCs, like the intellectual equivalent of cleaning my house. It’s not enough in and of itself, but it can free up RAM such that there’s room for deeper work.

I started listening to Darryl Cooper’s Martyr Made last year as part of a deep dive on cults, but kept going because I found him incredibly insightful. After listening to the 30+ hours of the God’s Socialist sequence, I Googled around and found a few accusations of racism against Cooper. I didn’t believe the accusations then, and I still don’t. People can go through the motions of saying what other people tell them to, but they can’t fake what Cooper does, which is to approach every human being as someone worthy of respect and compassion, whose actions are probably reasonable given their incentives. I value that a lot more than proper signaling.

Some time later I found an archive of Cooper’s deleted Twitter logs, and, uh, I get where people are coming from on the racism thing. I still absolutely believe in his respect and compassion for everyone except members of the USSR leadership (and even then, he’ll say very nice things about the intentions of early communists).  However, the thing about doing that genuinely instead of choosing a side and signaling allegiance is that it doesn’t compress well to 140 characters, and he said a bunch of things that were extremely easy to round to terrible beliefs. I might also have mistaken him for racist, if all I had was his Twitter. But given the podcasts, I am very sure that he respects-and-has-compassion-for every human being.

[Between when I started listening and when I published this Cooper returned to Twitter, which I have mixed feelings about. Namely “I think this is bad for him intellectually and emotionally” vs. “He’s talking to me! Hurray!”]

I’m not a big fan of emotion in my history podcasts. Martyr Made is an exception. Cooper goes hours out of his way to make sure you understand how something felt, without ever coming across as dishonest or manipulative. Some of that is that he often uses himself as an example and is very upfront about his flaws. Some of that is the aforementioned respect and compassion seeping into everything he does. Some is good writing. 

For example, God’s Socialist is nominally about Jim Jones and the Jonestown massacre, but Cooper doesn’t believe Jonestown makes any sense unless you understand the 60s, hippies, the Civil Rights movement, and the Black Power movement. The prologue consists of a description of various race riots/race wars, the contemporary and just-pre-Civil-Rights-movements, and easily 15 minutes on his interactions with some homeless people in his neighborhood. For the last of these, he observes that though he’s occasionally kind, he mostly just ignores the individuals in question, and that sometimes he thinks that on Judgement Day the only thing that’s going to matter is how he failed to really help those men- whatever he did, it was for the wrong motives and much too little. I wrote a bunch of angry notes about how virtue ethics was bullshit while listening to this part, but by the end it became clear that he wasn’t making a call to any particular action, it was just an honest accounting of suffering in the world. He was walking me through it because he felt it was necessary to understand Jim Jones, whose first acts as an adult were taking care of people most of society was stepping over. 

All of this is to say: Martyr Made is one of my favorite pieces of nonfiction in the world. I’ve learned so much from it both factually and emotionally, but I felt vulnerable talking about that until I was absolutely rock solid on the author’s epistemics. I finally had time to do an epistemic spot check on the start of God’s Socialist (still my favorite sequence in the series), and I’m extremely relieved to announce that he nailed it, although just like my ESC of Acoup, it is not so amazingly perfect that the follow up wasn’t worth doing (and I assume Cooper would agree with that, just like Bret Devereaux did).

A word on ESCs: there’s a range of things it can mean to check someone’s epistemics. Sometimes it means checking their simple concrete facts. You would be amazed how many problems this catches. Another is to check leaps of logic: they can have their facts right but draw wildly incorrect inferences from them. Finding these requires more cognition, but is also fairly easy. Cooper did great on both of these, which was not surprising. My concern was always that his facts were literally true but unrepresentative. Accurate-in-spirit representation is one of the hardest things to judge, especially about really contentious issues like racial violence where second opinions are just another thing to fact check. What I can say is that everything I checked I was either able to concretely verify, or was extremely consistent with what I was able to find but was open to other interpretations, because it’s a contentious area with motivated record keeping.

The God’s Socialist sequence of Martyr Made is 30 hours long. I have ESCed the prologue, which is 90 minutes long, and some especially load-bearing claims I remembered from later in the podcast. I also happen to have already read one of Cooper’s most quoted sources, The Warmth of Other Suns (affiliate link), back in 2014. 2014 is a long time ago and I didn’t ESC Warmth at the time, but what Cooper quoted was generally in accordance with my memory of it, on both a factual and model level.

Without further ado…

The Claims

Claim: A 2007 report from the Southern Poverty Law Center on Latino-on-Black violence in Los Angeles (1:02)

He reads this report very nearly word for word. All the differences I caught were very minor wording issues that didn’t change the meaning. I also checked some of SPLC’s claims:

SPLC: “Since 1990, the African-American population of Los Angeles has dropped by half as blacks relocated to suburbs”, “Now, about 75% of Highland Park residents are Latinos. Only 2% are black. The rest are white and Asian.”  (8:17)

This was shockingly annoying to verify because I could find stats by year for LA county but not LA the city, and the county includes the suburbs. I did verify that:

  • In 2000 (seven years before the SPLC report came out), Highland Park was 72.4% Latino and 2.4% black (source).
    • Note that if you read the Wikipedia article it says 8.4% black, but it cites my source above. This is plausibly an issue of how to assign mixed-race people (since Wikipedia’s percentages add up to >100%), or the ongoing confusion about how Latino is an ethnicity, not a race.
    • However, that particular neighborhood was already 2.2% black in 1990, although it was a little whiter and less Latino (source).
  • An LA Times article also describes South LA shifting from an approximately 1:1 ratio of Latino and Black residents to 2:1 (Highland Park is in northeast LA).

Claim: A number of specific incidents of Latino-on-Black violence in Los Angeles, and some nebulous statistics

I Googled several of these as they came up and they always checked out, although LA’s a big city and Cooper is looking over a long time period, so it would be easy to cherry pick.

Cooper also gave some statistics on hate crime. However, these were always either for a particular neighborhood (too small, data liable to be noisy), or not quite as damning as his tone suggested they were. I found some statistics that came out the same year this episode did that support the general concept that Latino-on-Black violence happens, but I don’t trust the LAPD’s truthseeking on hate crimes. 

Which is to say, Cooper’s claims are well sourced and completely consistent with the available data, but the data is poor and his opinions are more controversial than he acknowledges. I’m sure someone with different motivations could use the same data to make the opposite case, or a different one entirely. Here’s an article published the same year as the SPLC report, calling the claims ridiculous. My tentative take on this is that racial tensions were high and spilling over into violence, but the claims that “all black people in LA were greenlit” (meaning, gang members had the okay from leaders to shoot them) and “all black people in Latino neighborhoods in LA were greenlit” are clearly insane; the murder rate would be much higher if that were true. 

Claim: Quote from Warmth of Other Suns: “In 1950, city aldermen and housing officials proposed restricting 13,000 new public housing units to people who had lived in Chicago for two years. The rule would presumably affect colored migrants and foreign immigrants alike. But it was the colored people who were having the most trouble finding housing and most likely to seek out such an alternative.” (23:00)

This quote is accurate, but my memory of it wasn’t: I had in my notes that this proposal was enacted, and only rechecked the recording when I couldn’t find any such record and wanted to see if he cited a source. His source, Warmth of Other Suns, cites a 1950 newspaper article that I couldn’t find online (it probably exists in ProQuest’s Historical Newspaper archive, but I lack access despite trying ProQuest via multiple libraries).

Claim: Description of the Cicero Riots of 1951 (31:00)

Everything he says is in accordance with the Wikipedia article: it was a horrific multi-day riot and lynching episode triggered by a black family moving into a white neighborhood. 

Cooper doesn’t mention this, but fun fact: according to Wikipedia, the landlord allowed the family to move in not for any noble anti-racism or even free-market motivations, but to punish the neighborhood for fining her for something else. 

Claim: Southern white people did not want black people to leave during the Great Migration, because they needed them as labor (35:00)

Warmth of Other Suns says the same, although that’s not independent confirmation because it’s at least one of Cooper’s sources as well. Wikipedia agrees.

Claim: Northern union leaders were resistant to black migrants because they reduced labor’s power (43:00)

I could not find a smoking gun on this, which makes sense because labor is not going to want to admit it. However I found a number of articles, modern and contemporary, on companies bringing in black workers from the south as strikebreakers, and it would be extremely weird if that didn’t upset union leaders. 

Claim: Jim Jones began as a dynamic and promising civil rights movement leader, branched out into communism (1:05:20)

Yup.

Claim: Jonestown residents were mostly poor and black, and disproportionately children (1:17:00)

Yup and yup.

Note that this was not true of the leadership of Jonestown, which was overwhelmingly white. Cooper gets into this later in the sequence.

Claim: Jim Jones led successful efforts to integrate businesses in Indianapolis (memory)

This claim came later in the sequence. It and the similar claim below were very significant to me and a number of changes in my own models rest on them, so I expanded the scope of the project to include them.

There are many sources repeating this claim, including Wikipedia, some book, and r/HistoryAnecdotes, and none denying it. I am a little suspicious because everyone seems to agree on exactly how many restaurants he integrated, but no one names them. They do name a hospital, but it seems like maybe “integrated” means “he accidentally got assigned to a black ward (because his doctor was black) and refused to leave”. But it’s not surprising that restaurants he integrated either no longer exist or don’t want to be remembered as “the place that excluded minorities until forced to change by the guy who later led America’s largest simultaneous suicide”.

Claim: Jim Jones helped members of his racially-integrated church tremendously (memory)

I found many secondary or tertiary sources saying this and no arguments against, but the only primary sources I could find joined the church in California. I couldn’t find any reports from people who joined while the church was in Indiana. That doesn’t seem damning to me; it’s kinda hard to tell people your lights got turned back on by Jim Jones before he was famous. This interview with a woman who joined in California and narrowly escaped the mass suicide confirms everything it can: she was a true believer in a bunch of good things but also kind of a joiner who ping-ponged between organizations until she found peace with People’s Temple. Another CA joiner talks about joining because her sister needed a rehab program and was recommended to People’s Temple’s program. 

Claim: Jim Jones adopted multiple children of color (memory)

True. The Jones family adopted three Korean children, one part-Native American child, and one black child, who they named James Jones Jr (they also had one biological child and adopted a white child from a People’s Temple member. There are also some People’s Temple kids of unclear paternity).

I recognize that transracial adoption is contentious and actions that were considered progressive and inclusive 60 years ago are now viewed as bad for the children they were supposed to benefit. I also get that lots of adoptive white parents were unprepared to deal with the realities of racism, or harbor it themselves, and that harmed their kids. The whole mass suicide thing casts some doubt on Jim Jones as a parent too. Nonetheless, a white man naming his black son after himself in 1961 was an extraordinarily big deal for which he undoubtedly paid a very high price, and from all this I have to conclude that fighting racism was extremely important to early Jim Jones.

Summary

Overall all of the claims were at least extremely defensible. I wish Cooper acknowledged more of the controversy around his interpretations, but I also appreciate that he comes to actual conclusions with models instead of spewing a bunch of isolated facts. I also wish he provided show notes with citations, because he’s inconsistent about providing sources in the audio.

Doing this check reinforced my belief that having one source for any of your beliefs is malpractice and processing multiple sources is a requirement. However, I will very happily continue to have Cooper as a significant source of information, and if I’m totally honest I’m not even going to check all his work this extensively.

Thanks to Eli Tyre for research assistance, my Patreon Patrons for financial support of this post, and Justis Mills for editing.

Epistemic Spot Check: This Isn’t Sparta

Prologue

Despite not normally being a fan of military history, I’ve really enjoyed Bret Devereaux’s blog Acoup, in which he uses pop culture representations as a starting point to teach the subject. Unfortunately I got stuck in a loop where I didn’t want to read Acoup without fact checking at least one post, and I didn’t want to fact check a post because military history is boring when taught by anyone besides him and Dan Carlin (and I was really disappointed in Carlin’s book. Why, you might ask? Yes, I too wish I’d written that down).  

But I really wanted to start reading Acoup again. My compromise is to:

  1. Use a sequence that involved a lot of non-combat facts (as good military history usually does, and by “good” I mean “enjoyable for me to read”)
  2. Limit myself to some pretty bare bones fact checking. This does not protect me against sophisticated forgeries, but is surprisingly good at catching people with a sincerely believed agenda.

And now, a good old-fashioned epistemic spot check of Bret Devereaux’s This Isn’t Sparta.

Conclusion

Devereaux is not so much sharing historical facts as comprehensive models, informed by both local facts about the specific area of interest and general knowledge of humans and historical trends. This is great and I wish more people would do it. Along the way he sometimes presents facts as more certain/less controversial among historians than they are. He’s not hiding that he does this, but on first read through I did walk away with the impression that certain things were more settled than they are, and unprepared to argue for their truth.

E.g., Acoup describes helots not only as slaves, but as slaves that were especially poorly treated even by the standards of the time. 

First, let us dispense with the argument, sometimes offered, that the helots were more like medieval serfs than slaves as we understand the ideas and thus not really slaves – this is nonsense. Helots seem to have been able to own moveable property (money, clothing etc), but in fact this is true of many ancient slaves, including Roman ones (the Roman’s called this quasi-property peculium, which also applied to the property of children and even many women who were under the legal power (potestas) of another). Owning small amounts of moveable property was not rare among ancient non-free individuals (or, for that matter, other forms of slavery).

As noted below, this is not the consensus view. The writing leaves me with a good sense of why Devereaux believes what he believes, but not prepared to teach the controversy. I think if I started an argument with a helots-are-serfs partisan based purely on this blog post, I would look stupid and unprepared. Which is fine. The goal of the post is not to help me look smart at parties with classicists, it’s to leave me with better models of militant societies as a group. This is a restatement of the truism that if you really care about something you should probably read more than one source.

A broader example is that Devereaux fills in the paucity of written records about helot life with patterns known from other slave societies.  E.g. there isn’t actually a written reference to Spartiate men (the Spartan nobles) raping helot women. But we can take a look around at better documented slave societies, and at the legal code around the existence of children with Spartiate fathers and helot mothers, and make some educated guesses. This is a good and reasonable thing to do if what you want to do is picture ancient Sparta more accurately, but should not be double counted as evidence for the trends and patterns used to fill in the gaps.

I feel dumb ending with the conclusion “this is good but also never rely on one source.” You knew that already, and so did I, and yet doing this spot check made me more cautious in relying on Acoup than I previously was. Humans, man.

The Actual Spot Check

Full notes are available in my Roam graph

All claims are taken from posts 1 and 2 in the Sparta series. Claims were selected for being easy to verify and do not in the slightest constitute a random sample.

Claim: “Sparta was not a city-state for the simple reason that it didn’t have a city – it had five villages instead”
Verdict: Plausible, subject to arguments on definitions. It’s easy to find tertiary sources describing Sparta as a city, but none saying “I have considered the possibility that Sparta was in fact five villages in a trenchcoat and rejected it for these reasons”. I couldn’t actually find any references to Sparta being five villages; sources that described it as an amalgamation always said four. But when I reached out to Mr. Devereaux on Twitter he explained that he was counting a fifth village that had joined later (he also acknowledged there was some controversy in the description, which he didn’t in the original post). I still wish this had come with exact definitions of what constitutes a city vs a village, and what the implications of the difference are.

Claim: “Spartan boys were, at age seven, removed from their families and  instead grouped into herds (agelai) under the supervision of a single adult male Spartan”… “While the heirs of Sparta’s two hereditary kings were exempt from the agoge – perhaps because the state couldn’t afford to risk their lives so callously – Leonidas was a younger brother and thus was not exempt”
Verdict: confirmed in secondary source (although Devereaux goes to pains to point out the biases of the secondary sources)

Claim: “The boys were intentionally underfed. They were thus encouraged to steal in order to make up the difference, but severely beaten if caught”
Verdict: confirmed in secondary source

Claim: “Not even the exemplary boys escaped the violence, since the Spartan youths were annually whipped at the Altar of Artemis Orthia”
Verdict: confirmed in tertiary source

Claim: “Then there is the issue of relationships. At age twelve (Plut. Lyc. 17.1) boys in the agoge would enter a relationship with an older man – Plutarch’s language is quite clear that this is a sexual relationship (note also Aelian VH. 3.10, similarly blunt).”
Verdict: Well supported inference, not actually proven. My literal reading of the sources cited leaves a lot of ambiguity over whether the relationships were sexual or not. This could be a lack of experience reading between the lines of ancient Greek sources, but other historians who have presumably read them also dispute the claim. Given what we know of violent all-male institutions in general I think it’s an extremely reasonable inference that the relationships were sexual (and by our standards, extremely coercive at a minimum), in fact it would be surprising if they weren’t, but this isn’t a smoking gun.

Claim: “These Spartan boys will have to apply to be part of a mess-group (syssitia – a concept we’ll return to later) when they are twenty”
Verdict: confirmed in tertiary source.

Claim: Some boys in the agoge were selected for the krypteia, which patrolled farms at night to murder slaves.
Verdict: confirmed in tertiary source.

Claim: “we actually know that individual Spartans painted their shields with a variety of individual devices.”
Verdict: Multiple tertiary sources report this being true at the time 300 would have taken place, but that the army kit was eventually standardized.

Claim: “While there were supposedly 8,000 male spartiates in 480 there seem to have only been 3,500 by 418, just 2,500 in 394 and just 1,500 in 371.”
Verdict: tertiary sources gave slightly different numbers at slightly different times but the trend was confirmed repeatedly.

Claim: The perioikoi were poor farmers on marginal land on the outskirts of Sparta. They were free save for mandatory participation in the army, but had no say in government.
Verdict: gist confirmed in tertiary sources.

Claim: “The hypomeiones seem to consist of the men (and their descendants) who had been spartiates, but had been stripped of citizen status for some reason, usually poverty (but sometimes cowardice)”
Verdict: gist confirmed in multiple tertiary sources, sometimes using a narrower definition

Claim: “The mothakes (singular: mothax) seem to have been the bastard off-spring of spartiate men and helot women”
Verdict: gist confirmed in tertiary source, although they seemed to consider hypomeiones a subset of mothakes.

Claim: Neodamodes were freed Spartan slaves
Verdict: gist confirmed in tertiary sources, although they refer only to slaves freed for military service

Claim: Sparta’s population distribution was roughly

Verdict: agrees with cited source.

Claim: Helots were owned by the Spartan state, who assigned them to work land owned by the Spartiates (method of assignment unknown)
Verdict: confirmed by tertiary source.

Claim:  Helots were poorly treated slaves, not serfs or a stage between slave and free
Verdict: certainly seems justified, but officially controversial

Thanks to my Patreon patrons for helping to fund this work.

Update 11/25: Bret Devereaux has a few comments about this on Twitter.

Update 11/27: A friend checks the Spartan military record and finds it meh.

Breaking Questions Down

Previously I talked about discovering that my basic unit of inquiry should be questions, not books. But what I didn’t talk about was how to generate those questions, and how to separate good questions from bad. That’s because I don’t know yet; my own process is mysterious and implicit to me. But I can give a few examples.

For any given question, your goal is to disambiguate it into smaller questions that, if an oracle gave you the answers to all of them, would allow you to answer the original question. Best case scenario, you repeat this process and hit bedrock, an empirical question for which you can find accurate data. You feed that answer into the parent question, and eventually it bubbles up to answering your original question.

That does not always happen. Sometimes the question is one of values, not facts. Sometimes sufficient accurate information is not available, and you’re forced to use a range- an uncertainty that will bubble up through parent answers. But just having the questions will clarify your thoughts and allow you to move more of your attention to the most important things.
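The bubbling-up described above can be sketched in code. This is only an illustration, not a tool I actually use: the `Question` class, the `multiply` rule, and the numbers in the example are all hypothetical placeholders. It assumes each bedrock question can be answered with a (low, high) range and that each parent knows how to combine its children's ranges. Questions of values don't fit this shape at all.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple

Range = Tuple[float, float]  # (low, high) bounds on an answer


def multiply(ranges: List[Range]) -> Range:
    """Combine non-negative ranges by multiplying their bounds."""
    low, high = 1.0, 1.0
    for lo, hi in ranges:
        low *= lo
        high *= hi
    return (low, high)


@dataclass
class Question:
    text: str
    # Bedrock questions carry a range found from data.
    estimate: Optional[Range] = None
    # Parent questions are broken into smaller questions...
    children: List["Question"] = field(default_factory=list)
    # ...plus a rule for turning the children's answers into their own.
    combine: Optional[Callable[[List[Range]], Range]] = None

    def answer(self) -> Range:
        if not self.children:
            if self.estimate is None:
                raise ValueError(f"No data yet for: {self.text}")
            return self.estimate
        if self.combine is None:
            raise ValueError(f"No rule for combining answers to: {self.text}")
        # Uncertainty in the children bubbles up into the parent's range.
        return self.combine([child.answer() for child in self.children])


# Placeholder numbers, chosen only to make the arithmetic easy to follow.
root = Question(
    text="How likely is X?",
    combine=multiply,
    children=[
        Question("How likely is step A?", estimate=(0.25, 0.5)),
        Question("Given A, how likely is step B?", estimate=(0.5, 1.0)),
    ],
)
print(root.answer())  # (0.125, 0.5)
```

A leaf with no estimate raises an error instead of guessing, which matches the case where sufficient accurate information isn't available yet: the tree tells you which question is blocking the answer.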

Here are a few examples.  First, a reconstructed mind map of my process that led to several covid+economics posts. In the interests of being as informative as possible, this one is kind of stylized and uses developments I didn’t have at the time I actually did the research.

[Image: mind map, “Vague covid panic”]

If you’re curious about the results of this, the regular recession post is here and the oil crisis post is here.

Second, a map I created but have not yet researched, on the cost/benefit profile of a dental cleaning while covid is present.

[Image: mind map, “Risk model of dental cleanings in particular”]

Aside: Do people prefer the horizontal or vertical displays? Vertical would be my preference, but Whimsical does weird things with spacing so the tree ends up with a huge width either way.

Honestly this post isn’t really done; I have a lot more to figure out when it comes to how to create good questions. But I wanted to have something out before I published v0.1 of my Grand List of Steps, so here we are.

Many thanks to Rosie Campbell for inspiration and discussion on this idea.

How to Find Sources in an Unreliable World

I spent a long time stalling on this post because I was framing the problem as “how to choose a book (or paper. Whatever)?”. The point of my project is to be able to get to correct models even from bad starting places, and part of the reason for that goal is that assessing a work often requires the same skills/knowledge you were hoping to get from said work. You can’t identify a good book in a field until you’ve read several. But improving your starting place does save time, so I should talk about how to choose a starting place.

One difficulty is that this process is heavily adversarial. A lot of people want you to believe a particular thing, and a larger set don’t care what you believe as long as you find your truth via their Amazon affiliate link (full disclosure: I use Amazon affiliate links on this blog). The latter group fills me with anger and sadness; at least the people trying to convert you believe in something (maybe even the thing they’re trying to convince you of). The link farmers are just polluting the commons.

With those difficulties in mind, here are some heuristics for finding good starting places.

  • Search “best book TOPIC” on Google
    • Most of what you find will be useless listicles. If you want to save time, ignore everything on a dedicated recommendation site that isn’t five books.
    • If you want to evaluate a list, look for a list author with deep models on both the problem they are trying to address, and why each book in particular helps educate on that problem.  Examples:
    • A bad list will typically have a topic rather than a question they are trying to answer, and will talk about why books they recommend are generically good, rather than how they address a particular issue. Quoting consumer reviews is an extremely bad sign and I’ve never seen it done without being content farming.
  • Search for your topic on Google Scholar
    • Look at highly cited papers. Even if they’re wrong, they’re probably important for understanding what else you read.
    • Look at what they cite or are cited by
    • Especially keep an eye out for review articles
  • Search for web forums on your topic (easy mode: just check reddit). Sometimes these will have intro guides with recommendations, sometimes they will have where-to-start posts, and sometimes you can ask them directly for recommendations. Examples:
  • Search Amazon for books on your topic. Check related books as well.
  • Ask your followers on social media. Better, announce what you are going to read and wait for people to tell you why you are wrong (appreciate it, Ian). Admittedly there’s a lot of prep work that goes into having friends/a following that makes this work, but it has a lot of other benefits so if it sounds fun to you I do recommend it. Example:
  • Ask an expert. If you already know an expert, great. If you don’t, this won’t necessarily save you any time, because you have to search for and assess the quality of the expert.
  • Follow interesting people on social media and squirrel away their recommendations as they make them, whether they’re relevant to your current projects or not.

Types of Knowledge

This is a system for sorting types of knowledge. There are many like it, but this one is mine.

First, there is knowledge you could regurgitate on a test. In any sane world this wouldn’t be called knowledge, but the school system sure looks enthusiastic about it, so I had to mention it. Examples:

  • Reciting the symptoms of childbed fever on command 
  • Reciting Newton’s first law of motion
  • Reciting a list of medications’ scientific and brand names
  • Reciting historical growth rate of the stock market
  • Reciting that acceleration due to gravity on Earth is 9.807 m/s²

 

Second, there is engineering knowledge- something you can repeat and get reasonably consistent results. It also lets you hill climb to local improvements. Examples:

  • Knowing how to wash your hands to prevent childbed fever and doing so
  • Driving without crashing
  • Making bread from a memorized recipe.
  • Knowing the average benefits and side effects of a given antidepressant
  • Knowing how much a mask will limit covid’s spread
  • Investing in index funds
  • Knowing that if you shoot a cannon ball of a certain weight at a certain speed, it will go X far.
  • Knowing people are nicer to me when I say “please” and “thank you”

 

Third, there is scientific knowledge. This is knowledge that lets you generate predictions for how a new thing will work, or how an old thing will work in a new environment, without any empirical knowledge of that specific case.

Examples: 

  • Understanding germ theory of disease so you can take procedures that prevent gangrene and apply them to childbed fever.
  • Knowing the science of baking so you can create novel edible creations on your first try.
  • Knowing enough about engines and batteries to invent hybrid cars.
  • Actually understanding why any of those antidepressants works, in a mechanistic way, such that you can predict who they will and won’t work for.
  • A model of how covid is spread through aerosols, and how that is affected by properties of covid and the environment.
  • Having a model of economic change that allows you to make money off the stock market in excess of its growth rate, or know when to pull out of stocks and into crypto.
  • A model of gravity that lets you shoot a rocket into orbit on the first try.
  • A deep understanding of why certain people’s “please”s and “thank you”s get better results than others.

 

Engineering knowledge is a lot cheaper to get and maintain than scientific knowledge, and most of the time it works out. Maybe I pay more than I needed to for a car repair; I’ll live (although for some people the difference is very significant). You need scientific knowledge to do new things, which either means you’re trying something genuinely new, or you’re trying to maintain an existing system in a new environment.

I don’t know if you’ve noticed, but our environment was changing pretty rapidly before a highly contagious, somewhat deadly virus was released on the entire world, and while that has made things simpler in certain ways (such as my daily wardrobe), it has ultimately made it harder to maintain existing systems. This requires scientific knowledge to fix; engineering won’t cut it.

And it requires a lot of scientific knowledge at that- far more than I have time to generate. I could trust other people’s answers, but credentials and authority have never looked more useless, and identifying people I trust on any given subject is almost as time consuming as generating the answers myself.  And I don’t know what to do about that.

 

What to write down when you’re reading to learn

One of the hardest questions I’ve had to answer as part of the project formerly known as epistemic spot checks is: “how do I know what to write down?”

This will be kind of meandering, so here’s the take home. 

For shallow research:

  • Determine/discover what you care about before you start reading.
  • Write down anything relevant to that care.

For deep research:

  • Write down anything you find interesting.
  • Write down anything important to the work’s key argument.
  • Write down anything that’s taking up mental RAM, whether it seems related or interesting or not. If you find you’re doing this a lot, consider you might have a secret goal you don’t know about.
  • The less 1:1 the correspondence between your notes and the author’s words the better. Copy/pasting requires little to no engagement; writing out alternate theories for explanations spread over an entire chapter requires a lot.

 

Now back to our regularly scheduled blog post.

Writing down a thing you’ve read (/heard/etc) improves your memory and understanding, at the cost of disrupting the flow of reading. Having written a thing down makes that one thing easier to rediscover, at the cost of making every other thing you have or will ever write down a little harder to find. Oh, and doing the math on this tradeoff while you’re reading is both really costly and requires knowing the future. 

I would like to give you a simple checklist for determining when to save a piece of information. Unfortunately I never developed one. There are obvious things like “is this interesting to me (for any reason)?” and “is this key to the author’s argument?”, but those never got rid of the nagging feeling that I was losing information I might find useful someday, and specifically that I was doing shallow research (which implies taking the author’s word for things) and not deep (which implies making my own models). 

The single most helpful thing in figuring out what to write down was noticing when my reading was slowing down, which typically meant either there was a particular fact that needed to be moved from short to long term storage, or that I needed to think about something. Things in these categories need to be written down and thought about regardless of their actual importance, because their perceived importance is eating up resources, and 30 seconds writing something down to regain those resources is a good trade even if I never use that information again. If I have one piece of advice, it’s “learn to recognize the subtle drag of something requiring your attention.”

An obvious question is “how do I do that though?”. I’m a mediocre person to answer this question because I didn’t set out to learn the skill, I just noticed I was doing it. But for things in this general class, the best thing I have found to do is get yourself in a state where you are very certain you have no drag (by doing a total brain dump), do some research, and pay attention to when drag develops. 

But of course it’s much better if my sense of “this is important, record it” corresponds with what is actually important. The real question here is “Important to what?” When I was doing book-based reviews, the answer at best was “the book’s thesis”, which as previously discussed gives the author a huge amount of power to control the narrative. But this became almost trivial when I switched the frame to answering a specific set of questions. As long as I had a very clear goal in mind, my subconscious would do most of the work. 

This isn’t a total solution though, because of the vast swath of territory labeled “getting oriented with what I don’t know”. For example right now I want to ask some specific questions about the Great Depression and what it can tell us about the upcoming economic crisis, but I don’t feel I know enough. It is very hard to get oriented with patchwork papers: you typically need books with cohesive narratives, and then to find other ways to undo the authors’ framing. Like a lot of things, this is solved by going meta. “I want to learn enough about the Great Depression that I have a framework to ask questions about parallels to the current crisis” was enough to let me evaluate different “Top Books about the Great Depression” lists and identify the one whose author was most in line with my goals (it was the one on fivebooks, which seems to be the case much more often than chance).

I mentioned “losing flow” as a cost of note taking in my opening, but I’m not actually convinced that’s a cost. Breaking flow also means breaking the author’s hold on you and thinking for yourself. I’ve noticed a pretty linear correlation between “how much does this break flow?” and “how much does this make me think for myself and draw novel conclusions?”. Copy/pasting an event that took place on a date doesn’t break flow but doesn’t inspire much thought. Writing down your questions about information that seems to be missing, or alternate interpretations of facts, takes a lot longer.

Which brings me to another point: for deep reading, copy pasting is almost always Doing It Wrong. Even simple paraphrasing requires more engagement than copy/pasting. Don’t cargo cult this though: there’s only so many ways to say simple facts, and grammar exercises don’t actually teach you anything about the subject.

So there is my very unsatisfying list of how to know what to write down when you’re reading to learn. I hope it helps.

Where to Start Research?

When I began what I called the knowledge bootstrapping project, my ultimate goal was “Learn how to learn a subject from scratch, without deference to credentialed authorities”. That was too large and unpredictable for a single grant, so when I applied to LTFF, my stated goal was “learn how to study a single book”, on the theory that books are the natural subcomponents of learning (discounting papers because they’re too small). This turned out to have a flawed assumption baked into it.

As will be described in a forthcoming post, the method I eventually landed upon involves starting with a question, not a book. If I start with a book and investigate the questions it brings up (you know, like I’ve been doing for the last 3-6 years), the book is controlling which questions get brought up. That’s a lot of power to give to something I have explicitly decided not to trust yet. 

Examples:

  • When reading The Unbound Prometheus, I took the book’s word that a lower European birth rate would prove Europeans were more rational than Asians and focused on determining whether Europe’s birth rates were in fact lower (answer: it’s complicated), when on reflection it’s not at all clear to me that lower birth rates are evidence of rationality.
  • “Do humans have exactly 4 hours of work per day in them?” is not actually a very useful question. What I really wanted to know is “when can I stop beating myself up for not working?”, and the answer to the former doesn’t really help me with the latter. Even if humans on average have 4 hours, that doesn’t mean I do, and of course it varies by circumstances and type of work… and even “when can I stop beating myself up?” has some pretty problematic assumptions built into it, such as “beating myself up will produce more work, which is good.” The real question is something like “how can I approach my day to get the most out of it?”, and the research I did on verifying a paper on average daily work capacity didn’t inform the real question one way or the other.

 

What would have been better is if I’d started with the actual question I wanted to answer, and then looked for books that had information bearing on that question (including indirectly, including very indirectly). This is what I’ve started doing.

This can look very different depending on what type of research I’m doing. When I started doing covid research, I generated a long list of fairly shallow questions. Most of these questions were designed to inform specific choices, like “when should I wear what kind of mask?” and “how paranoid should I be about people without current symptoms?”, but some of them were broader and designed to inform multiple more specific questions, such as “what is the basic science of coronavirus?”. These broader, more basic questions helped me judge the information I used to inform the more specific, actionable questions (e.g., I saw a claim that covid lasted forever in your body the same way HIV does, which I could immediately dismiss because I knew HIV inserted itself into your DNA and coronaviruses never enter the nucleus).

 


 

I used to read a lot of nonfiction for leisure. Then I started doing epistemic spot checks (taking selected claims from a book and investigating them for truth value, to assess the book’s overall credibility) and stopped being able to read nonfiction without doing that, unless it was one of a very short list of authors who’d made it onto my trust list. I couldn’t take the risk that I was reading something false and would absorb it as if it were true (or true but unrepresentative, and absorb it as representative). My time spent reading nonfiction went way down.

About 9 months ago I started taking really rigorous notes when I read nonfiction. The gap in quality of learning between rigorous notes and my previous mediocre notes was about the same as the gap between doing an epistemic spot check and not. My time spent reading nonfiction went way up (in part because I was studying the process of doing so), but my volume of words read dropped precipitously.

And then three months ago I shifted from my unit of inquiry being “a book”, to being “a question”. I’m sure you can guess where this is going- I read fewer words, but gained more understanding per word, and especially more core (as opposed to shell or test) understanding. 

The first two shifts happened naturally, and while I missed reading nonfiction for fun and with less effort, I didn’t feel any pull towards the old way after I discovered the new way. Giving up book-centered reading has been hard. Especially after five weeks of frantic covid research, all I wanted to do was to be sat down and told what questions were important, and perhaps be walked through some plausible answers. I labeled this a desire to learn, but when I compared it to question-centered research, it became clear that’s not what it was. Or maybe it was a desire to go through the act of learning something, but it was not a desire to answer a question I had and was not prioritized by the importance of a question. It was best classified as leisure in the form of learning, not resolving a curiosity I had.  And if I wanted leisure, better to consume something easier and less likely to lead me astray, so I started reading more fiction, and the rare non-fiction of a type that did not risk polluting my pool of data. And honestly I’m not sure that’s so safe: humans are built to extract lessons from fiction too.

Put another way: I goal factored (figured out what I actually wanted from) reading a nonfiction book, and the goal was almost never best served by using a nonfiction book as a starting point. Investigating a question I cared about was almost always better for learning (even if it did eventually cash out in reading a book), and fiction was almost always better for leisure, in part because it was less tiring, and thus left more energy for question-centered learning when that was what I wanted.

 

The Purpose of Lectures

How to Take Smart Notes (affiliate link) posits that students who handwrite lecture notes gain as many facts and more conceptual understanding than students who type notes to the same lecture, because the slowness of handwriting forces you to compress ideas down to their core, whereas typing lets you transcribe a lecture without reflection. While I agree that translating things in your own words and compressing ideas is better than rote transcription, I have two problems with this.

One, it preemptively gives up on a practical question of which side of a trade-off is better without examining either the conditions or ways to improve the trade-off. Given the enormous benefits of electronic storage of notes, maybe we should spend 45 seconds thinking about how to port the benefits of handwritten notes over, or under what circumstances the benefits of quick and high-fidelity transcription outweigh the push to engage more deeply with data.

Two, and this is harder to articulate… there is a reason students are defaulting to transcriptions of lectures, and it’s not because they’re bad or lazy or don’t like thinking. If lecturers actually wanted you to think conceptually about a topic, they would, I don’t know, leave any time at all for that in a lecture (my STEM background may be showing here. Movies tell me English class has more of this). As it is, conceptual understanding and translation require that you stop listening to the professor: the dreaded multitasking thing that luddites are always going on about.

This is really a college student issue. On the rare occasion I’m trying to learn something from a live lecture, it’s still a non-mandatory event where the speaker cares about either actually teaching something or being entertaining, which solves a lot of these problems. But I’m angry that blame is being placed on students for acquiescing to what the system very strongly pushes them towards.