How’s that Epistemic Spot Check project coming?

 

Quick context: Epistemic spot checks started as a process in which I did quick investigations of a few of a book’s early claims to see if it was trustworthy before continuing to read it, in order to avoid wasting time on books that would teach me wrong things. Epistemic spot checks worked well enough for catching obvious flaws (*cou*Carol Dweck*ugh*), but have a number of problems. They emphasize a trust/don’t trust binary over model building, and provability over importance. They don’t handle “severely flawed but deeply insightful” well at all. So I started trying to create something better.

Below are some scattered ideas I’m playing with that relate to this project. They’re by no means fully baked, but it seemed like it might be helpful to share them. This kind of assumes you’ve been following my journey with epistemic spot checks at least a little. If you haven’t, that’s fine; a more polished version of these ideas will come out eventually.

 

A parable in Three Books.

I’m currently attempting to write up an investigation of Children and Childhood in Roman Italy (Beryl Rawson) (affiliate link) (Roam notes). This is very slow going, because CaCiRI doesn’t seem to have a thesis. At least, I haven’t found one, and I’ve read almost half of the content. It’s just a bunch of facts. Often not even syntheses, just “Here is one particular statue and some things about it.” I recognize that this is important work, even the kind of work I’d use to verify another book’s claims. But as a focal source, it’s deadly boring to take notes on and very hard to write anything interesting about. What am I supposed to say? “Yes, that 11 year old did do well (without winning) in a poetry competition and it was mentioned on his funeral altar, good job reporting that.” I want to label this sin “weed based publishing” (as in, “lost in the weeds”, although the fact that I have to explain that is a terrible sign for it as a name).

One particular bad sign for Children and Childhood in Roman Italy was that I found myself copying multiple sentences at once into my notes. Direct quoting can sometimes mean “there’s only so many ways to arrange these words and the author did a perfectly good job so why bother”, but when it’s frequent, and long, it often means “I can’t summarize or distill what the author is saying”, which can mean the author is being vague, eliding important points, or letting implications do work that should be made explicit. This was easier to notice when I was taking notes in Roam (a workflowy/wiki hybrid) because Roam pushes me to make my bullet points as self-contained as possible (so when you refer to them in isolation nothing is lost), so it became obvious and unpleasant when I couldn’t split a paragraph into self-contained assertions. Obviously real life is context-dependent and you shouldn’t try to make things more self-contained than they are, but I’m comfortable saying frequent long quotes are a bad sign about a book.

On the other side you have The Unbound Prometheus (David S. Landes) (affiliate link) (Roam notes), which made several big, interesting, important, systemic claims (e.g., “Britain had a legal system more favorable to industrialization than continental Europe’s”, “Europe had a more favorable climate for science than Islamic regions”), none of which it provided support for (in the sections I read- a friend tells me he gets more specific later). I tried to investigate these myself and ended up even more confused- scholars can’t even agree on whether Britain’s patent protections were strong or weak. I want to label this sin “making me make your case for you”.

A Goldilocks book is The Fate of Rome (Kyle Harper) (affiliate link) (Roam notes). Fate of Rome’s thesis is that the peak of the Roman empire corresponds with unusually favorable weather conditions in the Mediterranean. It backs this up with claims about climate archeology, e.g., ice core data (claim 1, 2). This prompted natural and rewarding follow up questions like “What is ice core data capable of proving?” and “What does it actually show?”. My note taking system in Roam was superb at enabling investigations of questions like these (my answer).

Based on claims creation, Against the Grain (James Scott) (affiliate link) (Roam notes) is even better. It has both interesting high level models (“settlement and states are different things that came very far apart”, “states are entangled with grains in particular”) and very specific claims to back them up (“X was permanently settled in year Y but didn’t develop statehood hallmarks A, B, and C until year Z”). It is very easy to see how that claim supports that model, and the claim is about as easy to investigate as it can be. It is still quite possible that the claim is wrong or more controversial than the author is admitting, but it’s something I’ll be able to determine in a reasonable amount of time. As opposed to Unbound Prometheus, where I still worry there’s a trove of data somewhere that answers all of the questions conclusively and I just failed to find it.

[Against the Grain was started as part of the Forecasting project, which is currently being reworked. I can’t research its claims because that would ruin our ability to use it for the next round, should we choose to do so, so evaluation is on hold.]

If you asked me to rate these books purely on ease-of-reading, the ordering (starting with the easiest) would be:

 

  • Against the Grain
  • The Fate of Rome
  • Children and Childhood in Roman Italy
  • The Unbound Prometheus

 

Which is also very nearly the reverse of the order they were published in (Against the Grain came out six weeks before Fate of Rome; the others are separated by decades). It’s possible that the two modern books were no better epistemically but felt so because they were easier to read. It’s also possible it’s a coincidence, or that epistemics have gotten better in the last 50 years.

 

Model Based Reading

As is kind of implied in the parable above, one shift in Epistemic Spot Checks is a new emphasis on extracting and evaluating the author’s models, which includes an emphasis on finding load bearing facts. I feel dumb for not emphasizing this sooner, but better late than never. I think the real trick here is not identifying that knowing a book’s models is good, but creating techniques for actually doing that.

 

How do we Know This?

The other concept I’m playing with is that “what we know” is inextricable from “how we know it”. This is dangerously close to logical positivism, which I disagree with my limited understanding of. And yet it’s really improved my thinking when doing historical research.

This is a pretty strong reversal for me. I remember strongly wanting to just be told what we knew in my science classes in college, not the experiments that revealed it. I’m now pretty sure that’s scientism, not science.

 

How’s it Going with Roam?

When I first started taking notes with Roam (note spelling), I was pretty high on it. Two months later, I’m predictably loving it less than I did (it no longer drives me to do real life chores), but still find it indispensable. The big discovery is that the delight it brings me is somewhat book dependent- it’s great for Against the Grain or The Fate of Rome, but didn’t help nearly so much with Children and Childhood in Roman Italy, because it was mostly very on-the-ground facts that didn’t benefit from my verification system and long paragraphs that couldn’t be disambiguated.

I was running into a ton of problems with Roam’s search not handling non-sequential words, but they seem to have fixed that. Search is still not ideal, but it’s at least usable.

Roam is pretty slow. It’s currently a race between their performance improvements and my increasing hoard of #Claims.

Epistemic Spot Check: The Unbound Prometheus

Introduction

One of the challenging things about learning is knowing what sources you should learn from- if you already knew what was correct, you wouldn’t be trying to learn it in the first place. Epistemic spot checks started as a process in which I did quick investigations of a few of a book’s early claims to see if it was trustworthy before continuing to read it, in order to avoid wasting time on books that would teach me wrong things. Friends indicated they found these useful, so I started sharing them, and even got a small Patreon running.

Epistemic spot checks worked well enough for catching obvious flaws (*cou*Carol Dweck*ugh*), but have a number of problems. They emphasize a trust/don’t trust binary over model building, and provability over importance. They don’t handle “severely flawed but deeply insightful” well at all. So I started trying to create something better. This post is part of that attempt, and as such contains both checks of a book’s claims and introspection on the process of checking those claims.

But before I started that improvement process, there was another. Even the quick versions of epistemic spot checks are time consuming, and I am only one person, who will not be unemployed forever. I started working with Foretold and Parallel on a project to amplify my spot checks by having people predict how I would evaluate claims- the idea being that if the masses got good at it, prediction markets could be a partial substitute for my investigations. This blog post is also a part of that project, which entails some extra steps. If you’re interested in how this has worked in the past, check out The Fate of Rome and The Fall of Rome.

Today’s book is The Unbound Prometheus (affiliate link), which aims to explain why the Industrial Revolution happened when and where it did. Spoiler alert: I didn’t like it.

 

Process

I had three phases of actions: in the first, I read the book, created claims in Foretold, and entered my priors for the claims as predictions. Phase two was much like the only phase of previous checks, in which I had a set period of time (six hours, twice what I had for The Fall of Rome) to investigate randomly selected claims for as long as felt useful, after which I predicted what my credence in the claim would be after 10 hours of research. This prediction was submitted as a resolution in Foretold. In practice there were several claims where I stopped an hour in, even though I still had very high uncertainty, because it seemed like it would take a lot of additional time to shrink my confidence bars.

Answers, here and in the prediction market, are given in Foretold syntax.

In phase three, I had three hours each to answer two questions, randomly selected from those I had answered in phase two. The goal here was to see how good I was at predicting my own answers. The gods of fate were not kind on this one, and I drew the two questions that least benefited from additional time, being fairly strict factual questions with an exhaustible amount of relevant material. These evaluations were not entered in Foretold, but I’ve included them here.

As is my new custom, I took my notes in Roam, a workflowy/wiki hybrid. Until recently I thought Roam was so magic that my raw notes were better formatted there than I could ever hope to make them in a linear document like this, so I could just share my conclusions here, and let people read my notes in Roam if they were especially curious. In between writing most of this post and publishing it I learned that many people find my Roam notes too difficult to read and prefer having them written out linearly. I didn’t have the time or energy to fix this post, but rest assured I’m thinking about how to do this better. In the meantime, Roam notes are formatted as follows:

  • The target source gets its own page
  • On this page I list some details about the book and claims it makes. If the claim is citing another source, I may include a link to the source.
  • If I investigate a claim or have an opinion so strong it doesn’t seem worth verifying (“Parenting is hard”), I’ll mark it with a credence slider. The meaning of each credence will eventually be explained here, although I’m still working out the system.
    • Then I’ll hand-type a number for the credence in a bullet point, because sliders are changeable even by people who otherwise have only read privileges. If the slider and text number disagree, believe the text.
  • You may see a number to the side of a claim. That means it’s been cited by another page. It is likely a synthesis page, where I have drawn a conclusion from a variety of sources. The synthesis pages are also what I’ll be linking to in this post.

Another thing that changed this time around is that I learned to use the multimodal function of Foretold, which lets you combine distribution functions. This is great for hedging your bets but not great for comprehensibility, so I’ll include the graphs generated. Unfortunately Foretold still doesn’t have a graphical export, so I’m using hideous screen shots.
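For readers who haven’t met a multimodal prediction before, here is a minimal sketch of the underlying idea: a weighted mixture of two distributions. This is plain Python, not Foretold syntax, and the function names, weights, and numbers are all hypothetical, chosen only for illustration. It also uses unbounded normal curves, so it ignores the fact that a real credence is capped at 0 and 100.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def mixture_pdf(x, components):
    """Density of a weighted mixture.

    components is a list of (weight, mean, sd) tuples; weights are
    normalized here, so they do not need to sum to 1.
    """
    total = sum(w for w, _, _ in components)
    return sum((w / total) * normal_pdf(x, m, s) for w, m, s in components)

def mixture_mean(components):
    """Mean of the mixture: the weight-averaged component means."""
    total = sum(w for w, _, _ in components)
    return sum((w / total) * m for w, m, _ in components)

# Hypothetical hedged bet on a credence (in percent): 70% of the weight
# says "probably false, around 20", 30% says "probably true, around 80".
belief = [(0.7, 20.0, 5.0), (0.3, 80.0, 5.0)]

if __name__ == "__main__":
    for x in (20, 50, 80):
        print(x, round(mixture_pdf(x, belief), 4))
    print("mean:", mixture_mean(belief))
```

The mean of a mixture like this lands in the valley between the two humps, a value the forecaster considers unlikely, which is part of why a single summary number is hard to interpret and a graph helps.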

Even more unfortunately, WordPress, which has happily accepted these screen shots in the past, will not tolerate them now. So to see my claims and conclusion, please continue to the Google Doc. Please let me know what was useful, useless, or high friction for you. I’m especially interested in how comprehensible/usable my Roam database is.

Many thanks to my Patreon patrons and Parallel Forecast for financial support for this post.

Trip Report: Fertility Drugs

As part of freezing my eggs, I’ve been on an interesting set of hormones. This had some weird effects that seemed worth sharing. Obviously this is an n of one and any particular symptom could have some other cause, but it’s interesting nonetheless.

I was on a variety of hormones but the only one that seemed to change my mood was Menopur, so I’ll only report dosages of that.

Day 1 (4 Vials): 60 minutes after first shot my boyfriend says something that makes me a little sad. I am much sadder and weepier about a mildly sad thing than usual, and it feels similar to but not exactly like normal crying. After crying myself out about boyfriend’s thing, I want to watch a sad movie, a feeling I don’t think I’ve ever had before in my entire life. Dallas Buyers Club proves neither interesting nor sad enough and I quit with 20 minutes left to go (which, to be fair, is when a lot of the sad stuff happens). Boyfriend suggests anime can not be sad and I should stop looking at that part of Netflix. I suggest he has never seen Grave of the Fireflies but drop it there because I don’t want *that* sad.

That response was faster than expected (the doctor confirmed the hormones can work that fast), but otherwise pretty much the emotion I expected reproductive hormones to give me.

Day 2 (4 vials): I wait for the sad to kick in but get distracted talking to my roommate. 70-80 minutes after shot I feel super energized, like I’ve taken caffeine after being off it for a while.

Today I notice how relaxed my muscles are, especially my shoulders. Boyfriend describes me as consistently less cynical, more sincere, and more connected. I wonder how we are dating if cynicism is a negative for him.

Day 3 (4 vials): ~80 minutes after shot I feel really mellow and relaxed, like I’ve taken a small dose of CBD. I attempt to watch Billy Elliot, which Netflix describes as “a tearjerker” and “feel-good”, two sections I’ve never looked in before. I stop watching from boredom an hour in. Boyfriend talks me into watching This is Us, a show designed to make you cry. I enjoy it but don’t cry.

Day 4 (4 vials): I feel sleepy an hour or two after the shot, but who knows what that means. I watch more This is Us and have many feelings about it.

Day 5 (3 vials): 6 hours before shot (so 18 hours after the last one) a close friend and I start discussing a point of repeated contention. I scream at him/the situation then cry while we discuss it. Friend is extremely happy to have my emotions closer to the surface and easier to access, and not at all upset about the screaming. We run out of time to finish the discussion and reschedule for Day 10 (which later gets bumped to Day 11).

Day 6 (3 vials): nothing particular happens. Muscles continue to be very relaxed. Test myself out with my friend’s toddler, perhaps have more attention span for his games but it’s a small difference at most.

Day 7 (2 vials): Attempt to watch Rocky 120 minutes after injection. Underdog pulls through is a thing I like normally but this particular one does nothing for me.

Day 8 (2 vials): Boyfriend pitches idea that we need to pull out the big guns, sad movie wise. He convinces me to watch Return to Me, a movie I assumed was from Hallmark but was in fact theatrically released and stars David Duchovny. Character’s death evokes no emotion from me, except when her widower shares his sadness with his dog.

Day 9 (2 vials): Try a horror movie. It fails to evoke a response, but to be fair it’s mostly an artsy allegory using horror movie tropes.

Day 10 (0 vials): muscles continue to be very relaxed.

Day 11: Wake up with tension back in shoulders and a feeling of cynicism suffused through me. Decide hormones have stopped working. Call friend from day 5. Instead of planned discussion we focus on what changed in me. I end up having a major insight and crying a lot. A few hours later I rehash the conversation with my boyfriend and have more insights and crying. Overall the problem seems to be a severely constraining fear of disappointing people, which went away on the drugs because I felt secure in my relationships.

Day 12: Go under general anesthetic to have eggs removed. Spend the rest of the day groggy and uncomfortable.

Day 13: Someone makes a joke about my pain levels and I realize I can barely feel my trigeminal neuralgia, and maybe haven’t for a week.

Then I spent a few more days uncomfortable from my inflated ovaries, and the physical effects eased off.

Overall, the hormones seemed to make my emotions closer to the surface and gave me less room to intervene between feeling and action. Around people I was already close to, this translated to letting my guard down and being more authentic. This was in a positive feedback loop with the muscle tension, which was both a cause and consequence of social fear and insecurity. I’m assuming it was the reduced muscle tension that led to the improvement in neuralgia, although there exists a paper suggesting it could be more direct.

Being more in touch with my feelings, or having the memory of being so, also made it easier to advocate for myself/my POV. I told Day 5 friend I thought someone he *really* looked up to was bullshit, and it did in fact upset him, but we worked through it and ended up in a better place than we started. I questioned another friend about whether a problem was as solved as he believed it to be.

In the environment I was in, the emotional effects from the fertility hormones were net positive, to the point I’d take them again just for the experience and pain-relief if it were cheap and side-effect free. I worry what would have happened if I’d had a bad annual review while on them, or screamed at a friend who had screaming-related trauma instead of one who welcomed it as a sign of emotional intimacy. There are reasons I developed the ability to control my emotional expression.

This is speculative, but it wouldn’t surprise me if “same emotions more strongly felt” is the common reaction to fertility and pregnancy hormones, and the reason we think they cause weepiness and rage is that most people, especially people fighting infertility, have a lot of sadness and anger stored inside them. Egg freezing is fertility treatment on emotional easy mode, since you haven’t failed at anything yet, but I imagine if I’d been trying for years and failed to conceive, or faced multiple pregnancy losses, there would be a lot of feelings looking for an outlet.

 

Epistemic Spot Check: Fatigue and the Central Governor Module

Epistemic spot checks used to be a series in which I read papers/books and investigated their claims with an eye towards assessing the work’s credibility. I became unhappy with the limitations of this process and am working on creating something better. This post is about both the results of applying the in-development process to a particular work, and observations on the process. As is my new custom, this discussion of the paper will be mostly my conclusions. The actual research is available in my Roam database (a workflowy/wiki hybrid), which I will link to as appropriate.

This post started off as an epistemic spot check of Fatigue is a brain-derived emotion that regulates the exercise behavior to ensure the protection of whole body homeostasis, a scientific article by Timothy David Noakes. I don’t trust myself to summarize it fairly (we’ll get to that in a minute), so here is the abstract:

An influential book written by A. Mosso in the late nineteenth century proposed that fatigue that “at first sight might appear an imperfection of our body, is on the contrary one of its most marvelous perfections. The fatigue increasing more rapidly than the amount of work done saves us from the injury which lesser sensibility would involve for the organism” so that “muscular fatigue also is at bottom an exhaustion of the nervous system.” It has taken more than a century to confirm Mosso’s idea that both the brain and the muscles alter their function during exercise and that fatigue is predominantly an emotion, part of a complex regulation, the goal of which is to protect the body from harm. Mosso’s ideas were supplanted in the English literature by those of A. V. Hill who believed that fatigue was the result of biochemical changes in the exercising limb muscles – “peripheral fatigue” – to which the central nervous system makes no contribution. The past decade has witnessed the growing realization that this brainless model cannot explain exercise performance. This article traces the evolution of our modern understanding of how the CNS regulates exercise specifically to insure that each exercise bout terminates whilst homeostasis is retained in all bodily systems. The brain uses the symptoms of fatigue as key regulators to insure that the exercise is completed before harm develops. These sensations of fatigue are unique to each individual and are illusionary since their generation is largely independent of the real biological state of the athlete at the time they develop. The model predicts that attempts to understand fatigue and to explain superior human athletic performance purely on the basis of the body’s known physiological and metabolic responses to exercise must fail since subconscious and conscious mental decisions made by winners and losers, in both training and competition, are the ultimate determinants of both fatigue and athletic performance.

The easily defensible version of this claim is that fatigue is a feeling in the brain. The most out there version of the claim is that humans are capable of unlimited physical feats, held back only by their own mind, and the results of sporting events are determined beforehand through psychic dominance competitions. That sounds like I’m being unfair, so let me quote the relevant portion:

[A]thletes who finish behind the winner may make the conscious decision not to win, perhaps even before the race begins. Their deceptive symptoms of “fatigue” may then be used to justify that decision. So the winner is the athlete for whom defeat is the least acceptable rationalization

(He doesn’t mention psychic dominance competitions explicitly, but it’s the only way I see to get exactly one person deciding to win each race).

This paper generated a lot of ESC-able claims, which you can see here. These were unusually crisp claims that he provided citations for: absolutely the easiest thing to ESC (having your own citations agree with your summary of them is not sufficient to prove correctness, but the lack of it rules a lot of works out). But I found myself unenthused about doing so. I eventually realized that I wanted to read a competing explanation instead. Luckily Noakes provided a citation to one, and it was even more antagonistic to him than he claimed.

VO2,max: what do we know, and what do we still need to know?, by Benjamin D. Levine takes several direct shots at Noakes, including:

For the purposes of framing the debate, Dr Noakes frequently likes to place investigators into two camps: those who believe the brain plays a role in exercise performance, and those who do not (Noakes et al. 2004b). However this straw man is specious. No one disputes that ‘the brain’ is required to recruit motor units – for example, spinal cord-injured patients can’t run. There is no doubt that motivation is necessary to achieve VO2,max. A subject can elect to simply stop exercising on the treadmill while walking slowly because they don’t want to continue; no mystical ‘central governor’ is required to hypothesize or predict a VO2 below maximal achievable oxygen transport in this case.

Which I would summarize as “of course fatigue is a brain-mediated feeling: you feel it.” 

I stopped reading at this point, because I could no longer tell what the difference between the hypotheses was. What are the actual differences in predictions between “your muscles are physically unable to contract?” and “your brain tells you your muscles are unable to contract”? After thinking about it for a while, I came up with a few:

  1. The former suggests that there’s no intermediate between “safely working” and “incapacitation”.
  2. The latter suggests that you can get physical gains through mental changes alone.
  3. And that this might lead to tissue damage as you push yourself beyond safe limits.

Without looking at any evidence, #1 seems unlikely to be true. Things rarely work that way in general, much less in bodies.

The strongest pieces of evidence for #2 and #3 aren’t addressed by either paper: cases when mental changes have caused/allowed people to inflict serious injuries or even death on themselves.

  1. Hysterical strength (aka mom lifts car off baby)
  2. Involuntary muscle spasms (from e.g., seizures or old-school ECT)
  3. Stiff-man syndrome.

So I checked these out.

Hysterical strength has not been studied much, probably because IRBs are touchy about trapping babies under cars (with an option on “I was unable to find the medical term for it”). There are enough anecdotes that it seems likely to exist, although it may not be common. And it can cause muscle tears, according to several sourceless citations. This is suggestive, but if I was on Levine’s team I’d definitely find it insufficient.

Most injuries from seizures are from falling or hitting something, but it appears possible for injuries to result from overactive muscles themselves. This is complicated by the fact that anti-convulsant medications can cause bone thinning, and by the fact that some unknown percentage of all people are walking around with fractures they don’t know about.

Unmodified electro-convulsive therapy had a small but persistent risk of bone fractures, muscle tears, and joint dislocation. Newer forms of ECT use muscle relaxants specifically to prevent this.

Stiff-man Syndrome: Wikipedia says that 10% of stiff-man syndrome patients die from acidosis or autonomic dysfunction. Acidosis would be really exciting- evidence that overexertion of muscles will actually kill you. Unfortunately when I tried to track down the citation, it went nowhere (with one paper inaccessible). Additionally, one can come up with other explanations for the acidosis than muscle exertion. So that’s not compelling.

Overall it does seem clear that (some) people’s muscles are strong enough to break their bones, but are stopped from doing so under normal circumstances. You could call this vindication for Noakes’s Central Governor Model, but I’m hesitant. It doesn’t prove you can safely get gains by changing your mindset alone. It doesn’t prove all races are determined by psychic dominance fights. Yes, Noakes was speculating when he postulated that, but without it his theory is something like “you notice when your muscles reach their limits”. When you can safely push what feel like physical limits on the margin seems like a question whose answer will vary a lot by individual, and one that neither paper tried to answer.

Overall, Fatigue is a brain-derived emotion that regulates the exercise behavior to ensure the protection of whole body homeostasis neither passed nor failed epistemic spot checks as originally conceived, because I didn’t check its specific claims. Instead I thought through its implications and investigated those, which supported the weak but not strong form of Noakes’s argument.

In terms of process, the key here was feeling and recognizing the feeling that investigating forward (evaluating the implications of Noakes’s arguments) was more important than investigating backwards (the evidence Noakes provided for his hypothesis). I don’t have a good explanation for why that felt right at this time, but I want to track it.

Roam

For the last few epistemic spot check posts I’ve been sharing my evidence via Roam, rather than typing it up in the post itself. Longtime readers: How is that working out for you, relative to the old system?

Epistemic Spot Check: Unconditional Parenting

Epistemic spot checks started as a process in which I investigate a few of a book’s claims to see if it is trustworthy before continuing to read it. This had a number of problems, such as emphasizing a trust/don’t trust binary over model building, and emphasizing provability over importance. I’m in the middle of revamping ESCs to become something better. This post is both a ~ESC of a particular book and a reflection on the process of doing ESCs and what I have and should improve(d).

As is my new custom, I took my notes in Roam, a workflowy/wiki hybrid. Roam is so magic that my raw notes are better formatted there than I could ever hope to make them in a linear document like this, so I’m just going to share my conclusions here, and if you’re interested in the process, follow the links to Roam. Notes are formatted as follows:

  • The target source gets its own page
  • On this page I list some details about the book and claims it makes. If the claim is citing another source, I may include a link to the source.
  • If I investigate a claim or have an opinion so strong it doesn’t seem worth verifying (“Parenting is hard”), I’ll mark it with a credence slider. The meaning of each credence will eventually be explained here, although I’m still working out the system.
    • Then I’ll hand-type a number for the credence in a bullet point, because sliders are changeable even by people who otherwise have only read privileges.
  • You can see my notes on the source for a claim by clicking on the source in the claim
  • You may see a number to the side of a claim. That means it’s been cited by another page. It is likely a synthesis page, where I have drawn a conclusion from a variety of sources.

This post’s topic is Unconditional Parenting (Alfie Kohn) (affiliate link), which has the thesis that even positive reinforcement is treating your kid like a dog and hinders their emotional and moral development.

Unconditional Parenting failed its spot check pretty hard. Of three citations I actually researched (as opposed to agreed with without investigation, such as “Parenting is hard”), two barely mentioned the thing they were cited for as an evidence-free aside, and one reported exactly what UP claimed but was too small and subdivided to prove anything. 

Nonetheless, I thought UP might have good ideas and kept reading it. One of the things Epistemic Spot Checks were designed to detect was “science washing”- the process of taking the thing you already believe and hunting for things to cite that could plausibly support it to make your process look more rigorous. And they do pretty well at that. The problem is that science washing doesn’t prove an idea is wrong, merely that it hasn’t presented a particular form of proof. It could still be true or useful- in fact when I dug into a series of self-help books, rigor didn’t seem to have any correlation with how useful they were. And with something like child-rearing, where I dismiss almost all studies as “too small, too limited”, saying everything needs rigorous peer-reviewed backing is the same as refusing to learn. So I continued with Unconditional Parenting to absorb its models, with the understanding that I would be evaluating its models for myself.

Unconditional Parenting is a principle-based book, and its principles are:

  • It is not enough for you to love your children; they must feel loved unconditionally. 
  • Any punishment or conditionality of rewards endangers that feeling of being loved unconditionally.
  • Children should be respected as autonomous beings.
  • Obedience is often a sign of insecurity.
  • The way kids learn to make good decisions is by making decisions, not by following directions.

These seem like plausible principles to me, especially the first and last ones. They are, however, costly principles to implement. And I’m not even talking about things where you absolutely have to override their autonomy like vaccines. I’m talking about when your two children’s autonomies lead them in opposite directions at the beach, or you will lose your job if you don’t keep them on a certain schedule in the morning and their intrinsic desire is to watch the water drip from the faucet for 10 minutes. 

What I would really have liked is for this book to spend less time on its principles and bullshit scientific citations, and more time going through concrete real world examples where multiple principles are competing. Kohn explicitly declines to do this, saying specifics are too hard and scripts embody the rigid, unresponsive parenting he’s railing against, but I think that’s a cop out. Teaching principles in isolation is easy and pointless: the meaningful part is what you do when they’re difficult and in conflict with other things you value.

So overall, Unconditional Parenting:

  • Should be evaluated as one dude’s opinion, not the outcome of a scientific process
  • Is a useful set of opinions that I find plausible and intend to apply with modifications to my potential kids.
  • Failed to do the hard work of demonstrating implementation of its principles.
  • Is a very light read once you ignore all the science-washing.

 

 

As always, tremendous thanks to my Patreon patrons for their support.

 

Epistemic (Spot Check?): The Fate of Rome Round 2

Introduction

Two months ago I did an epistemic spot check on Kyle Harper’s The Fate of Rome. At the time I found only a minor flaw- stating that Roman ships weren’t surpassed until the 14th century, when China did it in the 13th century. I did not consider this fatal by any means.

Recently I decided to reread The Fate of Rome (affiliate link). This was driven by a few things. Primarily, I found myself resistant to reading more Roman history, which typically means I’m holding things in my short-term memory and will not be allowed to put new things into my brain until the existing things have been put in long term storage. But it did not hurt at all that I had just gotten access to a new exobrain, Roam, a workflowy/wiki hybrid, and yes, for purposes of this post that is an extremely unfortunate name.

This post is going to wear many hats: a second check of The Fate of Rome, a log of my work improving the epistemic spot check process, and a discussion of how Roam has affected my work. These will not be equally interesting to all people but I couldn’t write it any other way. That said, let us begin.

Process

Previously, I’d “taken notes” by highlighting passages and occasionally writing notes in the Kindle file, and then never reading them because Amazon’s anti-consumer choices made them a pain to access. Worse, I used highlights as an excuse not to take information into my brain- it was a pointer to process something later, not a reminder of something I had already processed.

When I took notes in Roam, I took notes. My initial workflow was to create a page for the book I was reading, and on it list claims from the book, each of which got their own page (I would eventually change that and leave them as bullet points on the source page). You can see the eventual result here: typically I recorded multiple claims per source-page, mostly rephrased into my own words, and always thought through instead of saved for thinking about later. (For comparison: notes from Fall of Rome round 1).

A few changes started about this time:

  • I stopped being able to read without taking notes on my laptop, meaning I could no longer use my Kindle. I don’t think I got worse at reading on Kindle, it just became obvious how bad that always was.
  • Despite having to use a multi-purpose device, I was more focused and harder to distract, probably by an order of magnitude.
  • I couldn’t work on the project past ~9PM. I don’t think I was ever doing my best work past 9, it just became obvious in contrast to the better work I could now do.
  • I wanted to put a timestamp on every claim, so I noticed when it was unclear what time period a statement referred to.
  • “How do we know that?” questions moved from something I pushed myself to think about during second read-throughs to popping into my head unbidden. There were just natural “How do we know that?” shaped holes in my notes.
  • It became much more obvious when a bunch of paragraphs said nothing, or said nothing I valued, because even when I tried I couldn’t distill them into my notes.
  • Reading books felt like play in a way it never had before, even though it was always something I enjoyed doing.
  • I got more proactive about housecleaning. No, I wasn’t using Roam as a GTD system, it was purely research notes. And yet, I had more activation energy and more willingness to do multi-step chores. I have logs from Toggl to demonstrate this correlation, if not causation. Even assuming it’s causal I’d be shocked if it were common, so you probably shouldn’t incorporate it into your expected value of trying Roam.

At this stage the workflow is nothing I couldn’t have done in google docs, but I didn’t. I have all kinds of justifications about how knowing what I could do with Roam changed how I approached the work, but when I started that was theoretical so I’m not confident that’s what was going on. Nonetheless, I did it in Roam where I didn’t in Docs. 

So I had a Source page and a bunch of Claim pages. I started to do what I used to do in google docs or even a wordpress draft: select a claim and look for things confirming or denying it. This meant putting evidence on the Claims pages. But that didn’t feel right- why should some sources get their own page when others sat on the pages of claims from other sources? So I let claims motivate my choice of sources to look up, but every source got its own page with its claims listed on it. When I felt I knew enough I would create a Synthesis page representing what I really thought, with links to all the relevant claims (Roam lets you link to bullet points, not just pages) and a slider bar stating how firmly I believed it. This supported something I already wanted conceptually, which was shifting from [evaluating claims for truth and then judging the trustworthiness of the book] to [collating data from multiple sources of unknown reliability to inform my opinion of the world]. When this happened it became obvious Claims didn’t need their own pages and could live happily as bullet points on their associated Source page.

Once I had a Synthesis I would back-propagate a Credence to the claim that inspired the thread. Ideally I would have back propagated to all relevant claims, but that was more effort than it was worth. I put credences right in the claim so they would automatically show up when linked to, giving me a quick visual on how credible the book’s claims were when I investigated. The visual isn’t perfect because claims can have wildly different weights, but it is a start.

[Due to a bug, slider bars can be changed even by people given only read-access, so I also put the Credence in text]
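For readers who think better in data structures, here is a rough sketch of the page model described above. It is not how Roam works internally, just a toy version in Python. The class names, the `back_propagate` method, and the choice to copy the synthesis credence straight onto the inspiring claim are all my assumptions for illustration, as is the "research notes" page. The 0.95 figure is borrowed from the smallpox investigation purely as an example value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Claim:
    """One bullet point on a Source page, rephrased in the reader's words."""
    text: str
    credence: Optional[float] = None  # stays None until a Synthesis reports back


@dataclass
class Source:
    """One page per book or paper; its claims live on it as bullet points."""
    title: str
    claims: List[Claim] = field(default_factory=list)

    def add_claim(self, text: str) -> Claim:
        claim = Claim(text)
        self.claims.append(claim)
        return claim


@dataclass
class Synthesis:
    """A conclusion drawn from claims that may sit on many Source pages."""
    conclusion: str
    credence: float      # stand-in for the slider bar, from 0.0 to 1.0
    inspired_by: Claim   # the claim that started the investigation
    evidence: List[Claim] = field(default_factory=list)

    def back_propagate(self) -> None:
        # ASSUMPTION: the inspiring claim simply inherits the synthesis credence.
        self.inspired_by.credence = self.credence


# Hypothetical example data, loosely based on the smallpox sub-claim.
fate = Source("The Fate of Rome")
smallpox = fate.add_claim("The Antonine Plague was caused by smallpox")
health = fate.add_claim("Roman baseline health was terrible")

notes = Source("Research notes (hypothetical page)")
symptoms = notes.add_claim("The recorded symptoms fit smallpox")

synthesis = Synthesis(
    conclusion="The Antonine Plague was very likely smallpox",
    credence=0.95,
    inspired_by=smallpox,
    evidence=[smallpox, symptoms],
)
synthesis.back_propagate()

# Quick visual on the book: which of its claims have been investigated so far?
investigated = [c for c in fate.claims if c.credence is not None]
```

Only the claim that inspired the thread gets a credence here, which mirrors the shortcut of not back-propagating to every relevant claim.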

Results

It turns out that The Fate of Rome was a near-ideal book about which to start asking “how do we know this?” (or maybe I’ll do more books and find out it’s average, but it definitely rewarded the behavior), because it is working with cutting edge science to prove its points, meaning it’s doing a lot of interpretation.

The Fate of Rome makes two big claims: Rome’s peak coincides with a period of unusually favorable and stable weather in the Mediterranean (from 200 BC to 150 AD), and Rome was a constant disease fest punctuated by peaks of even more illness.  What I would like to do right now is link you to my Fate of Rome Roam page, tell you to look at the links at the bottom, filter for Synthesis, and just browse through my work. It’s better prepared than I could ever do linearly, and lets you choose which parts are important to you. But I suspect there’s a learning curve to Roam so I will write things out the tedious linear way.

The Fate of Rome lists many sources of data on ancient climate. Here is a list of what I consider the 5 strongest, and the time period they supposedly applied to. If you were reading this on Roam, you would have page numbers so you could verify my interpretation:

  • Cosmogenic radionuclides in ice cores say that 360BC – 690 AD had unusually stable solar activity
  • (Source unknown) says no major volcanic eruptions between “late republic” (end of the BCs) and “age of Justinian” (530s)
  • Ratio of Oxygen18 to Oxygen16 in stalagmites points to warmth during “early Imperial Rome”
  • The Tiber River flooded regularly (source unknown) during peak Imperial Rome
  • Radiocarbon-dated sediments say the Dead Sea was at a peak from 200 BC to 200 AD

I have three complaints here: he doesn’t share the resolution of each method, two of the data points are unsourced (although one points to a paper where I could have looked it up), and these time periods don’t match up particularly well. For the first: I tried to find the resolution for ice cores at a depth of 2000 years, and was unable to come to a definitive answer, but I did find a suggestion that they’re extremely sensitive to the assumptions in your model, which makes me nervous. The third thing seems even more concerning: if anything it seems like the good times should have rolled through the collapse of the western empire, not ended at 150AD like Fate suggests. When you add in the innate political nature of any claims about changing climate, I’m inclined to view Fate’s climate claims as speculative, although not impossible. 
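To make the third complaint concrete, here is a small sketch that lines up the proxies that come with explicit dates against the 200 BC to 150 AD window. Years are plain integers with BC as negative (ignoring the lack of a year zero). The `Window` class and `overlap` function are names I made up for this example. The start of the volcanic window is my guess at what "late republic" means, so treat that number as an assumption. The stalagmite and Tiber data are left out because the text gives no years for them.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Window:
    label: str
    start: int  # year, negative means BC
    end: int    # year, positive means AD


def overlap(windows: List[Window]) -> Optional[Tuple[int, int]]:
    """Return the span covered by every window, or None if there is none."""
    start = max(w.start for w in windows)
    end = min(w.end for w in windows)
    return (start, end) if start <= end else None


PROXIES = [
    Window("stable solar activity (ice cores)", -360, 690),
    # ASSUMPTION: "late republic" is taken as roughly 50 BC, "530s" as 530 AD.
    Window("no major volcanic eruptions", -50, 530),
    Window("Dead Sea at a peak (sediments)", -200, 200),
]

CLAIMED = Window("favorable climate per Fate", -200, 150)

# The span on which all dated proxies agree.
common = overlap(PROXIES)

# Proxies whose good conditions continue after the claimed end date.
outlasting = [w.label for w in PROXIES if w.end > CLAIMED.end]

# Proxies that were already favorable when the claimed period begins.
covering_start = [w.label for w in PROXIES if w.start <= CLAIMED.start]

print("common window:", common)
print("outlast the claimed end:", outlasting)
print("cover the claimed start:", covering_start)
```

Under these assumptions every dated proxy runs past 150 AD, and two of them run for centuries past it, which is the mismatch I am complaining about.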

Another question Fate raises is the baseline health of the Romans. I think Fate is correct that it was terrible, and that’s an update for me. Turns out communal baths are not a source of hygiene before chlorine. Harper claims the Romans’ disease and parasite load was worse than that of the people who lived on the same land before or after. I initially thought this seemed reasonable for “before” but unreasonable for “after”- medieval peasants had shockingly terrible diets and disease risks. But if anything the evidence supports the opposite of what I thought– you have to go pretty far back to find people much taller than the Romans, but height jumps just as the (western) empire falls. There are other explanations for this, around exactly which skeletons get found, but basically all the sources I found agreed that the Roman disease load was high.

I’m not without qualms though. A prime piece of evidence he uses to demonstrate a high disease load is dental caries (cavities) versus Linear Enamel Hypoplasia, a defect in the growth of a tooth. Medieval peasants had more caries than Romans but less LEH. Harper’s interpretation is that medieval peasants had worse diets than Romans (because the caries indicate high carb content) but less disease (LEH can be caused by both poor nutrition and disease, and a better diet is indicated by the lack of caries). Martin Bernstorff, a friendly medical student who I met on Roam Slack, helped me out on this one. Based on a half hour of his research, an equally plausible explanation is that medieval peasants had the same disease load but more calcium. This doesn’t mean Rome wasn’t terrible- medieval European peasants had it shockingly bad. But it is not clear cut evidence of Rome being worse.

A sub-claim is that the Antonine Plague (165AD-180AD) was caused by Smallpox. Harper is careful to say that retrospective diagnosis is difficult without biochemical evidence and there’s not actually a lot riding on this conclusion: he’s not doing epidemiological modeling dependent on properties of smallpox in particular, for example. But he does sound very confident, and I wanted to see if that was justified. Martin took a look at this one too, and concluded there was a 95% chance Harper was correct, assuming the Roman doctor’s notes were accurate. The remaining 5% covers the chance of a related pox virus with a lower mortality rate.

Overall I still like The Fate of Rome, but I have much less trust in it than I did after my first spot check, when its only sin was briefly forgetting China existed. In its fight with The Fall of Rome, it has lost ground.

 

More Process

My first try at Fate took an unrecorded number of hours to read, and ~two hours to spot check (this is shorter than usual, because of the amplification experiment). Call it < 10 hours, not counting the time to write it up. This round took 17 hours of combined reading and investigation into claims (plus 1.5 hours of Martin’s time), and so far three hours to write it up. This isn’t an apples-to-apples comparison, but that’s not *that much* additional time, for the increase in depth and understanding I got. I credit Roam with speeding things up enormously.

Since this is partially a love letter to Roam, I want to add a few things: 

  • Over the years I’ve tried workflowy, calculist, and google docs. I did not go looking for other tools in this space and don’t intend to because I am Roam’s exact use case, so even if it’s not the best now I expect it to grow towards me.
  • It’s just into beta and it shows: I probably file a bug or feature request per day. It’s never anything that renders Roam unusable, just things take longer than they should. 
  • Roam’s CEO, Conor White-Sullivan, has encouraged me to share my experience but has not given me anything for this post except a good product and the hope that it will continue to exist if enough people use it. 

 

As always, tremendous thanks to my Patreon patrons for their support. I would additionally like to thank Martin Bernstorff for his research (check out his new blog) and Edo Arad for comments on a draft.